Ubuntu 12.04 LTS
Ruby ruby 1.9.3dev (2011-09-23 revision 33323) [i686-linux]
Rails 3.2.9
Following is the content of my received CSV file:
"date/time","settlement id","type","order id","sku","description","quantity","marketplace","fulfillment","order city","order state","order postal","product sales","shipping credits","gift wrap credits","promotional rebates","sales tax collected","selling fees","fba fees","other transaction fees","other","total"
"Mar 1, 2013 12:03:54 AM PST","5481545091","Order","108-0938567-7009852","ALS2GL36LED","Solar Two Directional 36 Bright White LED Security Flood Light with Motion Activated Sensor","1","amazon.com","Amazon","Pasadena","CA","91104-1056","43.00","3.25","0","-3.25","0","-6.45","-3.75","0","0","32.80"
However when I am trying to parse the CSV file I am getting error:
1.9.3dev :016 > options = { col_sep: ",", quote_char:'"' }
=> {:col_sep=>",", :quote_char=>"\""}
1.9.3dev :022 > CSV.foreach("/tmp/my_data.csv", options) { |row| puts row }
CSV::MalformedCSVError: Illegal quoting in line 1.
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `each'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `loop'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1791:in `each'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1208:in `block in foreach'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1354:in `open'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1207:in `foreach'
from (irb):22
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/bin/irb:16:in `<main>'
Then I tried simplifying the data i.e.
"name","age","email"
"jignesh","30","[email protected]"
however still I am getting the same error:
1.9.3dev :023 > CSV.foreach("/tmp/my_data.csv", options) { |row| puts row }
CSV::MalformedCSVError: Illegal quoting in line 1.
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `each'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `loop'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1791:in `each'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1208:in `block in foreach'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1354:in `open'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1207:in `foreach'
from (irb):23
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/bin/irb:16:in `<main>'
Again I tried simplifying the data like this:
name,age,email
jignesh,30,[email protected]
and it works.See the output below:
1.9.3dev :024 > CSV.foreach("/tmp/my_data.csv") { |row| puts row }
name
age
email
jignesh
30
[email protected]
=> nil
But I will be receiving the CSV files having quoted data so removing quotes solution is not actually I am looking for.I am unable to figure out what is causing the error: CSV::MalformedCSVError: Illegal quoting in line 1. in my earlier examples.
I have verified that in the CSV there are no leading/trailing spaces by enabling "Show whitespace characters" and "Show Line Endings" in my text editor.Also I have verified the encoding using following.
1.9.3dev :026 > File.open("/tmp/my_data.csv").read.encoding
=> #<Encoding:UTF-8>
Note: I tried using CSV.read too but same error with that method.
Can anybody please help me getting out of the problem and make me understand where it is going wrong?
=====================
I just found following post at: http://www.ruby-forum.com/topic/448070 and tried following:
file_data = file.read
file_data.gsub!('"', "'")
arr_of_arrs = CSV.parse(file_data)
arr_of_arrs.each do |arr|
Rails.logger.debug "=======#{arr}"
end
and got the following output:
=======["\xEF\xBB\xBF'date/time'", "'settlement id'", "'type'", "'order id'", "'sku'", "'description'", "'quantity'", "'marketplace'", "'fulfillment'", "'order city'", "'order state'", "'order postal'", "'product sales'", "'shipping credits'", "'gift wrap credits'", "'promotional rebates'", "'sales tax collected'", "'selling fees'", "'fba fees'", "'other transaction fees'", "'other'", "'total'"]
=======["'Mar 1", " 2013 12:03:54 AM PST'", "'5481545091'", "'Order'", "'108-0938567-7009852'", "'ALS2GL36LED'", "'Solar Two Directional 36 Bright White LED Security Flood Light with Motion Activated Sensor'", "'1'", "'amazon.com'", "'Amazon'", "'Pasadena'", "'CA'", "'91104-1056'", "'43.00'", "'3.25'", "'0'", "'-3.25'", "'0'", "'-6.45'", "'-3.75'", "'0'", "'0'", "'32.80'"]
which messed up reading the data properly as the default col_sep used is a comma character. However I tried using quote_char option like this:
arr_of_arrs = CSV.parse(file_data, :quote_char => "'")
but it ended up the following error:
CSV::MalformedCSVError (Illegal quoting in line 1.):
Thanks, Jignesh
quote_chars = %w(" | ~ ^ & *)
begin
@report = CSV.read(csv_file, headers: :first_row, quote_char: quote_chars.shift)
rescue CSV::MalformedCSVError
quote_chars.empty? ? raise : retry
end
it's not perfect but it works most of the time.
N.B. CSV.parse
takes the same parameters as CSV.read
, so either a file or data from memory can be used