Friday, October 5th, 2007...9:31 am
Reading An Excel File With Ruby
This tutorial will cover how to read (or parse) an excel file with ruby. I had to write a script to unpivot some data for a co-worker that saved him hours of time, and I got to write a ruby script, so it was a win-win. Here's how you can do the same thing.
Installing Parseexcel
Parseexcel is a ruby port of the perl parseexcel module.
It's installable via a nice gem like so:
That's that, now remember since it's a gem library we have to tell our script to use the gem libs when we run the script from the console using the -rubygems switch.
Using Parseexcel
Parseexcel is a very straight forward library, you can't do too much with it, but it gets the job done.
Getting a Workbook
This returns the actual excel file's workbook, from there we need to determine what worksheet we're on.
Getting a Worksheet
Will return the first worksheet, you could also use the each method on the workbook to iterate over all the worksheets.
Iterating over rows and columns
The worksheet object has a very nice each method that will allow us to iterate over the rows like so
j=0
i=0
if row != nil
row.each { |cell|
if cell != nil
contents = cell.to_s('latin1')
puts "Row: #{j} Cell: #{i} #{contents}"
end
i = i+1
}
j = j +1
end
}
Getting Cell Data
- Getting a String: cell.to_s('latin1')
- Getting a Float: cell.to_s('latin1')
- Getting a Int: cell.to_i
- Getting a Date: cell.date
Getting A Specific Cell
A basic script for dumping an excel file
#Open the excel file passed in from the commandline
workbook = Spreadsheet::ParseExcel.parse(ARGV[0])
#Get the first worksheet
worksheet = workbook.worksheet(0)
#cycle over every row
worksheet.each { |row|
j=0
i=0
if row != nil
#cycle over each cell in this row if it's not an empty row
row.each { |cell|
if cell != nil
#Get the contents of the cell as a string
contents = cell.to_s('latin1')
puts "Row: #{j} Cell: #{i}> #{contents}"
end
i = i+1
}
end
}
To run the script, remember to use the -rubygems switch so that you can find the parsexcel library.







43 Comments
October 5th, 2007 at 1:09 pm
Where does j get incremented?
October 6th, 2007 at 2:05 am
Nice to know this exists.
@Matt: it looks like the incrementer for 'j' is missing.
However, assuming the collections use the Enumerable mixin it could be rewritten as:
worksheet.each_with_index { |row, j|
if row != nil
#cycle over each cell in this row if it's not an empty row
row.each_with_index { |cell, i|
if cell != nil
#Get the contents of the cell as a string
contents = cell.to_s('latin1')
puts "Row: #{j} Cell: #{i}> #{contents}"
end
}
end
}
October 17th, 2007 at 2:17 pm
The Parseexcel gem has horrible horrible memory leaks and should not be used in a production environment.
October 20th, 2007 at 11:32 am
@Ned: can you suggest an alternative?
October 21st, 2007 at 5:32 pm
@Ed C.: I have successfuly used http://chicago.sourceforge.net/xlhtml/ plus the Ruby CSV library to parse Excel files. Works but hurts ;) But I think that no solution wich includes parsing Excel files hurts less ;)
October 22nd, 2007 at 4:16 am
What are the *horrible* memory leaks?
October 23rd, 2007 at 12:03 pm
To run the script, remember to use the -rubygems switch so that you can find the parsexcel library.
.... you can add it to your source like require 'rubygems'. But it has to be the first require. And it's only needed under *nix enviroments
October 28th, 2007 at 6:55 am
I just realised that formulas aren't supported. Dang. I am trying out roo (http://roo.rubyforge.org/) now as it supports formulas.
November 5th, 2007 at 5:04 am
A couple problems:
If a cell contains an integer, the to_s conversion returns it as a float, e.g. 12 gives "12.0" (I have a column that contains house numbers or names).
I cannot extract cells containing dates: nil, empty string or zero is returned for all conversions, date, to_s, to_i.
These issues could be caused by excel, or me, or both.
BTW: There is a rape and paste error on the 'Getting a Float', should be to_f.
An explanation of the 'Latin1' parameter to to_s would be nice.
November 5th, 2007 at 5:09 am
Update: the date conversion probably works fine. I didn't realise, but the date cells I was trying to extract were calculated (formulas). I don't suppose this library is supposed to be able to handle that (but it would be nice)!
November 6th, 2007 at 5:38 pm
When I try the sample script I get the following message "default formats are encoded in ISO-8859-1". I'm not sure what it wants me to do.
November 6th, 2007 at 6:04 pm
Ok, ignore the previous comment. I figured out that the message is just a standard message when you start using parseexcel. The main problem is that it's not reading my spreadsheet properly :(
November 12th, 2007 at 11:11 pm
[...] Reading an Excel file with Ruby [...]
December 3rd, 2007 at 9:34 am
[...] Reading An Excel File With Ruby Рабы корпораций, это специально для вас Чтение данных из Excel-файлов [...]
January 18th, 2008 at 4:00 am
Can't get the gem....
ERROR: While executing gem ... (Gem::RemoteFetcher::FetchErr A socket operation was attempted to an unreachable host.::EHOSTUNREACH) getting size of http://gems.rubyforge.org/Marshal.4.8
January 27th, 2008 at 1:15 pm
[...] Reading Excel Files From RubyWe Heart Code posts a tutorial showing how to use the Parseexcel library, a port of a Perl library that lets you parse Excel spreadsheets from your code.How To Debug Your Rails App With ruby-debugPatrick Lenz presents a great article that shows how to debug a Rails application from start to finish.Economical Use of Amazon S3 with Ruby on RailsRobert Dempsey has written a comprehensive article on how to use S3 in an economic, efficient way from Ruby on Rails applications.Ruby on Rails vs ColdFusion CommercialThe ever active RailsEnvy guys have produced their eighth Ruby on Rails vs X commercial.. presenting Ruby on Rails vs ColdFusion. [...]
May 22nd, 2008 at 2:35 pm
I was having problems getting the initial example working. The following did, was sucessful ...
require 'parseexcel'
workbook = Spreadsheet::ParseExcel.parse(ARGV[0])
worksheet = workbook.worksheet(1)
col = ('A'..'Z').to_a # Only good for up to 26 columns
j = 1
worksheet.each { |row|
i = 0
if row != nil
row.each { |cell|
if cell != nil
contents = cell.to_s('latin1')
puts "#{col[i]}#{j}: #{contents}"
end
i += 1
}
j += 1
# if j == 6 then break end # Uncomment to limit to a certain number of rows
end
}
The key for me was not using worksheet(0) [didn't work for me], but worksheet(1) [worked!].
May 22nd, 2008 at 2:39 pm
I was having problems getting the initial example working. The following however, was successful ...
require 'parseexcel'
workbook = Spreadsheet::ParseExcel.parse(ARGV[0])
worksheet = workbook.worksheet(1)
col = ('A'..'Z').to_a # Only good for up to 26 columns
j = 1
worksheet.each { |row|
i = 0
if row != nil
row.each { |cell|
if cell != nil
contents = cell.to_s('latin1')
puts "#{col[i]}#{j}: #{contents}"
end
i += 1
}
j += 1
# if j == 6 then break end # Uncomment to limit to a certain number of rows
end
}
The key for me was not using worksheet(0) [didn't work for me], but worksheet(1) [worked!].
June 4th, 2008 at 3:44 pm
I'm having a problem with parseexcel
When I try to use cell.to_s('latin1') I get an error for wrong number of arguments (1 for 0)
For some reason Ruby is not seeing parseexcel's to_s method. Any ideas what is going on here? (I tried with -rubygems to no avail)
July 27th, 2008 at 4:35 pm
Worked like a charm!
Thanks!
August 17th, 2008 at 3:48 pm
Actual topic. Writing is worthy of attention.
September 13th, 2008 at 11:51 pm
watch me yes dog
February 24th, 2009 at 1:25 am
hi,
Is it possible to retrive the worksheet names and merged cell ranges using tis gem.Thanks in advance.
Regards,
Poornima.D
May 21st, 2009 at 4:52 am
Занятно пишете, жизненно. Все-таки, для того, чтобы делать интересный блог, нужно не только просто сообщать о чем-то, но и делать это в интересной форме:)
May 21st, 2009 at 11:12 pm
any fix to
If a cell contains an integer, the to_s conversion returns it as a float, e.g. 12 gives "12.0" (I have a column that contains house numbers or names).
June 19th, 2009 at 4:29 am
Премного благодарен автору. Возможно, в будущем я и действительно реализую подобную затею. :)
June 21st, 2009 at 8:07 am
Эх... :) Вы бы знали что про Вас пишут в других блогах :)
June 22nd, 2009 at 5:20 pm
tai mahal casino hotels near emerald queen casino tulalip casino shuttle .tachi palace hotel u0026 casino lemoore ca barona casino map stampede casino in calgary cache creek casino room rates .show casino montreal online casino rated .wynn casino spa drop team casino las vegas casino security employment vegas casino online review .too garden city casino garden city ny .in the attached .Cool stuff - .too is required for mgm vegas casino stanley casino brighton wathc Get info on big g casino blackpool Before nicky tarantino casino metropolis casino When casino wien now or recreational or recreational rama casino edgewater casino in laughlin see Get lucky 7 casino seoul whether or no napa valley casino houses for sale casino nsw las vegas casino pictures see buffets at crown casino prd casino st joe frontier casino lucky buck casino In a valley casino san diego Is the
las vegas casino comps epiphone 335 casino .Whith Come to blackbear casino In linfox casino street whether or no tulip casino in seattle .is the same as .Come to .it contain take us konstanz casino hotel casino plaza quito often What is fantasy springs casino And skagit casino island casino escanaba leelanau sands casino mi conrad bimini resort x26 casino The rich reels casino The best of it contain .
July 4th, 2009 at 7:16 pm
Да, интернет - огромен, если и такое даже можно откопать ;)
July 7th, 2009 at 10:32 pm
Hi. And Bye.
July 14th, 2009 at 2:24 pm
Бежал от волка, попал на медведя. :)
July 23rd, 2009 at 10:55 pm
Автор, а скажите а куда написать по поводу обмена ссылок (на какое конкретно мыло)?
August 24th, 2009 at 3:51 pm
Премного благодарен. Это именно то, что мне нужно было :)
September 27th, 2009 at 1:09 pm
So cool gb style of site. What CMS do you use ?
September 27th, 2009 at 1:15 pm
So cool portal style of site. What CMS do you use ?
September 28th, 2009 at 7:01 pm
Very nice site design. What CMS do you use ?
October 9th, 2009 at 7:11 pm
So nice portal design. What CMS do you use ?
February 19th, 2010 at 9:29 pm
Весьма позновательный ресурс, спасибо!
March 9th, 2010 at 3:21 pm
the increment of j is not missing, it's just above the last end statement
however, the initialization of j is in the wrong place
it should be outside of the loop that way it doesn't get set back to zero through the loops iterations
May 10th, 2010 at 11:21 am
You should definitely take advantage of the free $50 bonus being offered at our new Bet Phoenix Casino. Only available now at: Bet Phoenix. Good luck!
January 4th, 2011 at 4:50 pm
Hey,
I am able to get your example working. But can you please tell me how to read an excel worksheet by name instead of number.
I tried to replace
worksheet = workbook.worksheet(0)
with
worksheet = workbook.worksheet('Sheet1')
But that is thowing me the following error message.
undefined method `each' for nil:NilClass (NoMethodError)
Please help!
February 15th, 2011 at 8:19 am
pay to play bingo. which people play netgame. online play poker online play poker . how to play ninuki renju. play fair pa marcus mitchell
April 1st, 2011 at 8:58 pm
Please usurp me to beget bank
Leave a Reply