Friday, October 5th, 2007...9:31 am

Reading An Excel File With Ruby

Jump to Comments

This tutorial will cover how to read (or parse) an excel file with ruby. I had to write a script to unpivot some data for a co-worker that saved him hours of time, and I got to write a ruby script, so it was a win-win. Here's how you can do the same thing.

Installing Parseexcel

Parseexcel is a ruby port of the perl parseexcel module.

It's installable via a nice gem like so:

gem install parseexcel

That's that, now remember since it's a gem library we have to tell our script to use the gem libs when we run the script from the console using the -rubygems switch.

Using Parseexcel

Parseexcel is a very straight forward library, you can't do too much with it, but it gets the job done.

Getting a Workbook

Spreadsheet::ParseExcel.parse(filenameandpath)

This returns the actual excel file's workbook, from there we need to determine what worksheet we're on.

Getting a Worksheet

worksheet = workbook.worksheet(0)

Will return the first worksheet, you could also use the each method on the workbook to iterate over all the worksheets.

Iterating over rows and columns

The worksheet object has a very nice each method that will allow us to iterate over the rows like so

worksheet.each { |row|
  j=0
  i=0
  if row != nil
  row.each { |cell|
    if cell != nil
      contents = cell.to_s('latin1')
      puts "Row: #{j} Cell: #{i} #{contents}"
    end
    i = i+1
  }
  j = j +1
  end
}

Getting Cell Data

  • Getting a String: cell.to_s('latin1')
  • Getting a Float: cell.to_s('latin1')
  • Getting a Int: cell.to_i
  • Getting a Date: cell.date

Getting A Specific Cell

cell = row.at(3) #returns cell at column 3

A basic script for dumping an excel file

require 'parseexcel'

#Open the excel file passed in from the commandline
workbook = Spreadsheet::ParseExcel.parse(ARGV[0])

#Get the first worksheet
worksheet = workbook.worksheet(0)

#cycle over every row
worksheet.each { |row|
  j=0
  i=0
  if row != nil
  #cycle over each cell in this row if it's not an empty row
  row.each { |cell|
    if cell != nil
      #Get the contents of the cell as a string
      contents = cell.to_s('latin1')
      puts "Row: #{j} Cell: #{i}> #{contents}"
    end
    i = i+1
  }
  end
}

To run the script, remember to use the -rubygems switch so that you can find the parsexcel library.

ruby -rubygems excelparse.rb myfile.xls

43 Comments

  • Where does j get incremented?

  • Nice to know this exists.

    @Matt: it looks like the incrementer for 'j' is missing.

    However, assuming the collections use the Enumerable mixin it could be rewritten as:

    worksheet.each_with_index { |row, j|
    if row != nil
    #cycle over each cell in this row if it's not an empty row
    row.each_with_index { |cell, i|
    if cell != nil
    #Get the contents of the cell as a string
    contents = cell.to_s('latin1')
    puts "Row: #{j} Cell: #{i}> #{contents}"
    end
    }
    end
    }

  • The Parseexcel gem has horrible horrible memory leaks and should not be used in a production environment.

  • @Ned: can you suggest an alternative?

  • @Ed C.: I have successfuly used http://chicago.sourceforge.net/xlhtml/ plus the Ruby CSV library to parse Excel files. Works but hurts ;) But I think that no solution wich includes parsing Excel files hurts less ;)

  • What are the *horrible* memory leaks?

  • To run the script, remember to use the -rubygems switch so that you can find the parsexcel library.

    .... you can add it to your source like require 'rubygems'. But it has to be the first require. And it's only needed under *nix enviroments

  • I just realised that formulas aren't supported. Dang. I am trying out roo (http://roo.rubyforge.org/) now as it supports formulas.

  • A couple problems:

    If a cell contains an integer, the to_s conversion returns it as a float, e.g. 12 gives "12.0" (I have a column that contains house numbers or names).

    I cannot extract cells containing dates: nil, empty string or zero is returned for all conversions, date, to_s, to_i.

    These issues could be caused by excel, or me, or both.

    BTW: There is a rape and paste error on the 'Getting a Float', should be to_f.

    An explanation of the 'Latin1' parameter to to_s would be nice.

  • Update: the date conversion probably works fine. I didn't realise, but the date cells I was trying to extract were calculated (formulas). I don't suppose this library is supposed to be able to handle that (but it would be nice)!

  • Christian R. Garner
    November 6th, 2007 at 5:38 pm

    When I try the sample script I get the following message "default formats are encoded in ISO-8859-1". I'm not sure what it wants me to do.

  • Christian R. Garner
    November 6th, 2007 at 6:04 pm

    Ok, ignore the previous comment. I figured out that the message is just a standard message when you start using parseexcel. The main problem is that it's not reading my spreadsheet properly :(

  • [...] Reading an Excel file with Ruby [...]

  • [...] Reading An Excel File With Ruby Рабы корпораций, это специально для вас Чтение данных из Excel-файлов [...]

  • Can't get the gem....

    ERROR: While executing gem ... (Gem::RemoteFetcher::FetchErr A socket operation was attempted to an unreachable host.::EHOSTUNREACH) getting size of http://gems.rubyforge.org/Marshal.4.8

  • [...] Reading Excel Files From RubyWe Heart Code posts a tutorial showing how to use the Parseexcel library, a port of a Perl library that lets you parse Excel spreadsheets from your code.How To Debug Your Rails App With ruby-debugPatrick Lenz presents a great article that shows how to debug a Rails application from start to finish.Economical Use of Amazon S3 with Ruby on RailsRobert Dempsey has written a comprehensive article on how to use S3 in an economic, efficient way from Ruby on Rails applications.Ruby on Rails vs ColdFusion CommercialThe ever active RailsEnvy guys have produced their eighth Ruby on Rails vs X commercial.. presenting Ruby on Rails vs ColdFusion. [...]

  • I was having problems getting the initial example working. The following did, was sucessful ...
    require 'parseexcel'

    workbook = Spreadsheet::ParseExcel.parse(ARGV[0])
    worksheet = workbook.worksheet(1)
    col = ('A'..'Z').to_a # Only good for up to 26 columns

    j = 1

    worksheet.each { |row|
    i = 0
    if row != nil
    row.each { |cell|
    if cell != nil
    contents = cell.to_s('latin1')
    puts "#{col[i]}#{j}: #{contents}"
    end
    i += 1
    }
    j += 1
    # if j == 6 then break end # Uncomment to limit to a certain number of rows
    end
    }

    The key for me was not using worksheet(0) [didn't work for me], but worksheet(1) [worked!].

  • I was having problems getting the initial example working. The following however, was successful ...
    require 'parseexcel'

    workbook = Spreadsheet::ParseExcel.parse(ARGV[0])
    worksheet = workbook.worksheet(1)
    col = ('A'..'Z').to_a # Only good for up to 26 columns

    j = 1

    worksheet.each { |row|
    i = 0
    if row != nil
    row.each { |cell|
    if cell != nil
    contents = cell.to_s('latin1')
    puts "#{col[i]}#{j}: #{contents}"
    end
    i += 1
    }
    j += 1
    # if j == 6 then break end # Uncomment to limit to a certain number of rows
    end
    }

    The key for me was not using worksheet(0) [didn't work for me], but worksheet(1) [worked!].

  • I'm having a problem with parseexcel
    When I try to use cell.to_s('latin1') I get an error for wrong number of arguments (1 for 0)
    For some reason Ruby is not seeing parseexcel's to_s method. Any ideas what is going on here? (I tried with -rubygems to no avail)

  • Worked like a charm!
    Thanks!

  • Actual topic. Writing is worthy of attention.

  • watch me yes dog

  • hi,
    Is it possible to retrive the worksheet names and merged cell ranges using tis gem.Thanks in advance.

    Regards,
    Poornima.D

  • Занятно пишете, жизненно. Все-таки, для того, чтобы делать интересный блог, нужно не только просто сообщать о чем-то, но и делать это в интересной форме:)

  • any fix to

    If a cell contains an integer, the to_s conversion returns it as a float, e.g. 12 gives "12.0" (I have a column that contains house numbers or names).

  • Премного благодарен автору. Возможно, в будущем я и действительно реализую подобную затею. :)

  • Эх... :) Вы бы знали что про Вас пишут в других блогах :)

  • tai mahal casino hotels near emerald queen casino tulalip casino shuttle .tachi palace hotel u0026 casino lemoore ca barona casino map stampede casino in calgary cache creek casino room rates .show casino montreal online casino rated .wynn casino spa drop team casino las vegas casino security employment vegas casino online review .too garden city casino garden city ny .in the attached .Cool stuff - .too is required for mgm vegas casino stanley casino brighton wathc Get info on big g casino blackpool Before nicky tarantino casino metropolis casino When casino wien now or recreational or recreational rama casino edgewater casino in laughlin see Get lucky 7 casino seoul whether or no napa valley casino houses for sale casino nsw las vegas casino pictures see buffets at crown casino prd casino st joe frontier casino lucky buck casino In a valley casino san diego Is the
    las vegas casino comps epiphone 335 casino .Whith Come to blackbear casino In linfox casino street whether or no tulip casino in seattle .is the same as .Come to .it contain take us konstanz casino hotel casino plaza quito often What is fantasy springs casino And skagit casino island casino escanaba leelanau sands casino mi conrad bimini resort x26 casino The rich reels casino The best of it contain .

  • Да, интернет - огромен, если и такое даже можно откопать ;)

  • Hi. And Bye.

  • Бежал от волка, попал на медведя. :)

  • Автор, а скажите а куда написать по поводу обмена ссылок (на какое конкретно мыло)?

  • Премного благодарен. Это именно то, что мне нужно было :)

  • So cool gb style of site. What CMS do you use ?

  • So cool portal style of site. What CMS do you use ?

  • Very nice site design. What CMS do you use ?

  • So nice portal design. What CMS do you use ?

  • Весьма позновательный ресурс, спасибо!

  • the increment of j is not missing, it's just above the last end statement

    however, the initialization of j is in the wrong place

    it should be outside of the loop that way it doesn't get set back to zero through the loops iterations

  • You should definitely take advantage of the free $50 bonus being offered at our new Bet Phoenix Casino. Only available now at: Bet Phoenix. Good luck!

  • Hey,

    I am able to get your example working. But can you please tell me how to read an excel worksheet by name instead of number.

    I tried to replace
    worksheet = workbook.worksheet(0)
    with
    worksheet = workbook.worksheet('Sheet1')

    But that is thowing me the following error message.

    undefined method `each' for nil:NilClass (NoMethodError)

    Please help!

  • pay to play bingo. which people play netgame. online play poker online play poker . how to play ninuki renju. play fair pa marcus mitchell

  • Please usurp me to beget bank

Leave a Reply