Luigi

Reputation: 5603

Illegal quoting in line 1 using Ruby CSV

I am getting this error:

Illegal quoting in line 1. (CSV::MalformedCSVError)

Line 1 in my file is as follows:

"Status"    "Internal ID"   "Language"  "Created At"    "Updated At"    "IP Address"    "Location"  "Username"  "GET Variables" "Referrer"  "Number of Saves"   "Weighted Score"    "Completion Time"   "Invite Code"   "Invite Email"  "Invite Name"   "Invite: branchid"  "Invite: lastname"  "Invite: clientname"    "Invite: membershipid"  "Invite: clientid"  "Invite: dateofbirth"   "Invite: membershiptype"    "Invite: branch"    "Invite: unitid"    "Invite: shortname" "Invite: changedatetime"    "Invite: homephone" "Collector" 

My code looks like this:

CSV.foreach(file, :col_sep => "\t", :encoding => 'ISO-8859-1', :headers => true) do |column|
    puts column[0]
end

Since I have no control over the csv file, I would like a solution that doesn't involve me opening the file and saving it in another format.

EDIT:

Binary encoding of my file is below:

"\xFF\xFES\x00t\x00a\x00t\x00u\x00s\x00\t\x00I\x00n\x00t\x00e\x00r\x00n\x00a\x00l\x00 \x00I\x00D\x00\t\x00L\x00a\x00n\x00g\x00u\x00a\x00g\x00e\x00\t\x00C\x00r\x00e\x00a\x00t\x00e\x00d\x00 \x00A\x00t\x00\t\x00U\x00p\x00d\x00a\x00t\x00e\x00d\x00 \x00A\x00t\x00\t\x00I\x00P\x00 \x00A\x00d\x00d\x00r\x00e\x00s\x00s\x00\t\x00L\x00o\x00c\x00a\x00t\x00i\x00o\x00n\x00\t\x00U\x00s\x00e\x00r\x00n\x00a\x00m\x00e\x00\t\x00G\x00E\x00T\x00 \x00V\x00a\x00r\x00i\x00a\x00b\x00l\x00e\x00s\x00\t\x00R\x00e\x00f\x00e\x00r\x00r\x00e\x00r\x00\t\x00N\x00u\x00m\x00b\x00e\x00r\x00 \x00o\x00f\x00 \x00S\x00a\x00v\x00e\x00s\x00\t\x00W\x00e\x00i\x00g\x00h\x00t\x00e\x00d\x00 \x00S\x00c\x00o\x00r\x00e\x00\t\x00C\x00o\x00m\x00p\x00l\x00e\x00t\x00i\x00o\x00n\x00 \x00T\x00i\x00m\x00e\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00 \x00C\x00o\x00d\x00e\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00 \x00E\x00m\x00a\x00i\x00l\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00 \x00N\x00a\x00m\x00e\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00b\x00r\x00a\x00n\x00c\x00h\x00i\x00d\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00l\x00a\x00s\x00t\x00n\x00a\x00m\x00e\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00c\x00l\x00i\x00e\x00n\x00t\x00n\x00a\x00m\x00e\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00m\x00e\x00m\x00b\x00e\x00r\x00s\x00h\x00i\x00p\x00i\x00d\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00c\x00l\x00i\x00e\x00n\x00t\x00i\x00d\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00d\x00a\x00t\x00e\x00o\x00f\x00b\x00i\x00r\x00t\x00h\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00m\x00e\x00m\x00b\x00e\x00r\x00s\x00h\x00i\x00p\x00t\x00y\x00p\x00e\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00b\x00r\x00a\x00n\x00c\x00h\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00u\x00n\x00i\x00t\x00i\x00d\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00s\x00h\x00o\x00r\x00t\x00n\x00a\x00m\x00e\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 
\x00c\x00h\x00a\x00n\x00g\x00e\x00d\x00a\x00t\x00e\x00t\x00i\x00m\x00e\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00h\x00o\x00m\x00e\x00p\x00h\x00o\x00n\x00e\x00\t\x00C\x00o\x00l\x00l\x00e\x00c\x00t\x00o\x00r\x00\t\x00\"\x00 \x00\t\x00A\x00r\x00e\x00 \x00y\x00o\x00u\x00 \x00c\x00u\x00r\x00r\x00e\x00n\x00t\x00l\x00y\x00 \x00a\x00 \x00m\x00e\x00m\x00b\x00e\x00r\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00Y\x00M\x00C\x00A\x00 \x00o\x00f\x00 \x00P\x00i\x00e\x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00b\x00r\x00a\x00n\x00c\x00h\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00c\x00u\x00r\x00r\x00e\x00n\x00t\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00?\x00\"\x00\t\x00\"\x00 \x00\t\x00H\x00o\x00w\x00 \x00d\x00o\x00e\x00s\x00 \x00t\x00h\x00e\x00 \x00Y\x00 \x00a\x00f\x00f\x00e\x00c\x00t\x00 \x00y\x00o\x00u\x00,\x00 \x00y\x00o\x00u\x00r\x00 \x00f\x00a\x00m\x00i\x00l\x00y\x00,\x00 \x00o\x00r\x00 \x00y\x00o\x00.\x00.\x00.\x00\"\x00\t\x00N\x00P\x00S\x00S\x00c\x00o\x00r\x00e\x00\t\x00C\x00o\x00m\x00m\x00e\x00n\x00t\x00\t\x00\"\x00 \x00\t\x00H\x00a\x00s\x00 \x00a\x00 \x00Y\x00 \x00s\x00t\x00a\x00f\x00f\x00 \x00m\x00e\x00m\x00b\x00e\x00r\x00 \x00s\x00u\x00p\x00p\x00o\x00r\x00t\x00e\x00d\x00 \x00y\x00o\x00u\x00 \x00i\x00n\x00 \x00r\x00e\x00a\x00c\x00h\x00i\x00n\x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00H\x00o\x00w\x00 \x00h\x00a\x00v\x00e\x00 \x00y\x00o\x00u\x00 \x00b\x00e\x00e\x00n\x00 \x00s\x00u\x00p\x00p\x00o\x00r\x00t\x00e\x00d\x00?\x00\"\x00\t\x00\"\x00 \x00\t\x00D\x00o\x00 \x00y\x00o\x00u\x00 \x00f\x00e\x00e\x00l\x00 \x00l\x00i\x00k\x00e\x00 \x00y\x00o\x00u\x00 \x00a\x00r\x00e\x00 \x00c\x00o\x00n\x00n\x00e\x00c\x00t\x00e\x00d\x00 \x00t\x00o\x00 \x00t\x00h\x00e\x00 \x00Y\x00,\x00 \x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00H\x00o\x00w\x00 \x00o\x00f\x00t\x00e\x00n\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00e\x00n\x00 \x00w\x00a\x00s\x00 
\x00t\x00h\x00e\x00 \x00l\x00a\x00s\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00y\x00o\x00u\x00 \x00v\x00i\x00s\x00i\x00t\x00e\x00d\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00f\x00o\x00l\x00l\x00o\x00w\x00i\x00n\x00g\x00 \x00a\x00c\x00t\x00i\x00v\x00i\x00t\x00i\x00e\x00s\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00.\x00.\x00.\x00 \x00[\x00U\x00s\x00e\x00d\x00 \x00w\x00e\x00i\x00g\x00h\x00t\x00s\x00,\x00 \x00e\x00x\x00e\x00r\x00c\x00i\x00s\x00e\x00 \x00e\x00q\x00u\x00i\x00p\x00m\x00e\x00n\x00t\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00f\x00o\x00l\x00l\x00o\x00w\x00i\x00n\x00g\x00 \x00a\x00c\x00t\x00i\x00v\x00i\x00t\x00i\x00e\x00s\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00.\x00.\x00.\x00 \x00[\x00U\x00s\x00e\x00d\x00 \x00p\x00o\x00o\x00l\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00f\x00o\x00l\x00l\x00o\x00w\x00i\x00n\x00g\x00 \x00a\x00c\x00t\x00i\x00v\x00i\x00t\x00i\x00e\x00s\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00.\x00.\x00.\x00 \x00[\x00B\x00a\x00s\x00k\x00e\x00t\x00b\x00a\x00l\x00l\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00f\x00o\x00l\x00l\x00o\x00w\x00i\x00n\x00g\x00 \x00a\x00c\x00t\x00i\x00v\x00i\x00t\x00i\x00e\x00s\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00.\x00.\x00.\x00 \x00[\x00W\x00a\x00l\x00k\x00e\x00d\x00 \x00t\x00h\x00e\x00 \x00t\x00r\x00a\x00c\x00k\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00f\x00o\x00l\x00l\x00o\x00w\x00i\x00n\x00g\x00 \x00a\x00c\x00t\x00i\x00v\x00i\x00t\x00i\x00e\x00s\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00.\x00.\x00.\x00 \x00[\x00T\x00o\x00o\x00k\x00 \x00a\x00 \x00c\x00l\x00a\x00s\x00s\x00]\x00\"\x00\t\x00\"\x00 
\x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00f\x00o\x00l\x00l\x00o\x00w\x00i\x00n\x00g\x00 \x00a\x00c\x00t\x00i\x00v\x00i\x00t\x00i\x00e\x00s\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00.\x00.\x00.\x00 \x00[\x00T\x00o\x00o\x00k\x00 \x00a\x00 \x00c\x00h\x00i\x00l\x00d\x00 \x00t\x00o\x00 \x00a\x00 \x00c\x00l\x00a\x00s\x00s\x00 \x00o\x00r\x00 \x00p\x00r\x00o\x00g\x00r\x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00f\x00o\x00l\x00l\x00o\x00w\x00i\x00n\x00g\x00 \x00a\x00c\x00t\x00i\x00v\x00i\x00t\x00i\x00e\x00s\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00.\x00.\x00.\x00 \x00[\x00W\x00h\x00i\x00c\x00h\x00 \x00c\x00l\x00a\x00s\x00s\x00 \x00o\x00r\x00 \x00p\x00r\x00o\x00g\x00r\x00a\x00m\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00t\x00i\x00c\x00i\x00p\x00a\x00t\x00e\x00 \x00i\x00n\x00?\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00f\x00o\x00l\x00l\x00o\x00w\x00i\x00n\x00g\x00 \x00a\x00c\x00t\x00i\x00v\x00i\x00t\x00i\x00e\x00s\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00.\x00.\x00.\x00 \x00[\x00W\x00h\x00i\x00c\x00h\x00 \x00c\x00l\x00a\x00s\x00s\x00 \x00o\x00r\x00 \x00p\x00r\x00o\x00g\x00r\x00a\x00m\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00t\x00i\x00c\x00i\x00p\x00a\x00t\x00e\x00 \x00i\x00n\x00?\x00]\x00 \x00[\x00t\x00e\x00x\x00t\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00H\x00o\x00w\x00 \x00w\x00o\x00u\x00l\x00d\x00 \x00y\x00o\x00u\x00 \x00r\x00a\x00t\x00e\x00 \x00t\x00h\x00e\x00 \x00q\x00u\x00a\x00l\x00i\x00t\x00y\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00p\x00r\x00o\x00g\x00r\x00a\x00m\x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00H\x00o\x00w\x00 \x00w\x00o\x00u\x00l\x00d\x00 \x00y\x00o\x00u\x00 \x00r\x00a\x00t\x00e\x00 \x00t\x00h\x00e\x00 \x00q\x00u\x00a\x00l\x00i\x00t\x00y\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 
\x00c\x00l\x00e\x00a\x00n\x00l\x00i\x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00H\x00o\x00w\x00 \x00c\x00o\x00u\x00r\x00t\x00e\x00o\x00u\x00s\x00 \x00a\x00n\x00d\x00 \x00r\x00e\x00s\x00p\x00o\x00n\x00s\x00i\x00v\x00e\x00 \x00w\x00a\x00s\x00 \x00t\x00h\x00e\x00 \x00Y\x00 \x00s\x00t\x00a\x00f\x00f\x00 \x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00C\x00a\x00n\x00 \x00y\x00o\x00u\x00 \x00g\x00i\x00v\x00e\x00 \x00a\x00n\x00 \x00e\x00x\x00a\x00m\x00p\x00l\x00e\x00 \x00o\x00f\x00 \x00h\x00o\x00w\x00 \x00t\x00h\x00a\x00t\x00 \x00s\x00t\x00a\x00f\x00f\x00 \x00m\x00e\x00m\x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00H\x00o\x00w\x00 \x00c\x00o\x00u\x00l\x00d\x00 \x00t\x00h\x00e\x00 \x00s\x00t\x00a\x00f\x00f\x00 \x00h\x00a\x00v\x00e\x00 \x00b\x00e\x00e\x00n\x00 \x00m\x00o\x00r\x00e\x00 \x00h\x00e\x00l\x00p\x00f\x00u\x00l\x00?\x00<\x00/\x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00D\x00o\x00 \x00y\x00o\x00u\x00 \x00h\x00a\x00v\x00e\x00 \x00a\x00n\x00y\x00 \x00o\x00t\x00h\x00e\x00r\x00 \x00c\x00o\x00m\x00m\x00e\x00n\x00t\x00s\x00 \x00a\x00b\x00o\x00u\x00t\x00 \x00y\x00o\x00u\x00r\x00 \x00Y\x00 \x00e\x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x005\x00 \x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x006\x00 \x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x007\x00 \x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 
\x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x008\x00 \x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x009\x00 \x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x001\x000\x00 \x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x001\x001\x00 \x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x001\x002\x00 \x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x001\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x002\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x003\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 
\x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x004\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x005\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x006\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x007\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x008\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x009\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x001\x000\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00i\x00s\x00 \x00y\x00o\x00u\x00r\x00 \x00g\x00e\x00n\x00d\x00e\x00r\x00?\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00i\x00s\x00 \x00y\x00o\x00u\x00r\x00 \x00a\x00g\x00e\x00 \x00g\x00r\x00o\x00u\x00p\x00?\x00\"\x00\t\x00W\x00h\x00a\x00t\x00 \x00i\x00s\x00 
\x00y\x00o\x00u\x00r\x00 \x00c\x00u\x00r\x00r\x00e\x00n\x00t\x00 \x00m\x00e\x00m\x00b\x00e\x00r\x00s\x00h\x00i\x00p\x00 \x00t\x00y\x00p\x00e\x00?\x00\n"

Upvotes: 1

Views: 3098

Answers (2)

Stefan

Reputation: 114188

Binary encoding of my file is below:

"\xFF\xFES\x00t\x00a\x00t\x00u\x00s\x00...

0xFF 0xFE is the byte order mark for UTF-16LE.
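That check can be sketched in Ruby. The `BOMS` table and the `detect_bom` helper below are hypothetical names made up for this illustration, not part of any library, and the sketch only covers the three most common marks:

```ruby
# Hypothetical lookup table: leading bytes => encoding name.
# Note: UTF-32LE also starts with FF FE (followed by 00 00); this sketch
# ignores UTF-32 and would report such a file as UTF-16LE.
BOMS = {
  "\xEF\xBB\xBF".b => "UTF-8",
  "\xFF\xFE".b     => "UTF-16LE",
  "\xFE\xFF".b     => "UTF-16BE"
}.freeze

# Returns the encoding name suggested by the BOM, or nil if none is found.
def detect_bom(bytes)
  head = bytes.b # compare raw bytes, regardless of the string's encoding tag
  BOMS.each { |bom, name| return name if head.start_with?(bom) }
  nil
end

detect_bom("\xFF\xFES\x00t\x00") # => "UTF-16LE"
```

With a real file, the first few bytes could be read with `File.binread(path, 3)` and passed to the helper.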

You have to specify the encoding when processing this file with CSV.foreach. From its documentation:

This method also understands an additional :encoding parameter that you can use to specify the Encoding of the data in the file to be read. You must provide this unless your data is in Encoding::default_external(). CSV will use this to determine how to parse the data. You may provide a second Encoding to have the data transcoded as it is read. For example, encoding: "UTF-32BE:UTF-8" would read UTF-32BE data from the file but transcode it to UTF-8 before CSV parses it.

Furthermore, you have to specify that a BOM is present. According to the IO.new docs:

If “BOM|UTF-8”, “BOM|UTF-16LE” or “BOM|UTF16-BE” are (...) present, the BOM is stripped

Applied to your file and example:

CSV.foreach(file, col_sep: "\t", encoding: "BOM|UTF-16LE:UTF-8", headers: true) do |row|
  # ...
end
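To try this without the original export, here is a small round trip. The sample data and the `read_utf16_tsv` helper are invented for illustration; only the `CSV.foreach` options come from the snippet above:

```ruby
require "csv"
require "tempfile"

# Hypothetical helper: read a tab-separated UTF-16LE file that starts with a BOM
# and return its rows as hashes with UTF-8 strings.
def read_utf16_tsv(path)
  rows = []
  CSV.foreach(path, col_sep: "\t", encoding: "BOM|UTF-16LE:UTF-8", headers: true) do |row|
    rows << row.to_h
  end
  rows
end

rows = nil
Tempfile.create(["export", ".csv"]) do |f|
  f.close
  # Made-up sample: U+FEFF becomes the bytes FF FE when encoded as UTF-16LE.
  File.binwrite(f.path, "\uFEFFStatus\tInternal ID\nComplete\t42\n".encode("UTF-16LE"))
  rows = read_utf16_tsv(f.path)
end

p rows
```

Because the BOM is stripped before parsing, the first header should come back as plain "Status" rather than with stray leading bytes.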

Upvotes: 4

the Tin Man

Reputation: 160551

On *nix systems, the file command gives a reasonable hint as to what a file contains:

file /usr/share/dict/words
/usr/share/dict/words: ASCII text

file /usr/bin/ruby
/usr/bin/ruby: Mach-O universal binary with 2 architectures
/usr/bin/ruby (for architecture i386):  Mach-O executable i386
/usr/bin/ruby (for architecture x86_64):  Mach-O 64-bit executable x86_64

If you're on *nix, try running that against your CSV file and see what it says. It's not foolproof, but it's reasonably accurate.

As something to get you started, here's how to convert space-delimited fields to tab-delimited:

row = '"Status"    "Internal ID"   "Language"  "Created At"    "Updated At"    "IP Address"    "Location"  "Username"  "GET Variables" "Referrer"  "Number of Saves"   "Weighted Score"    "Completion Time"   "Invite Code"   "Invite Email"  "Invite Name"   "Invite: branchid"  "Invite: lastname"  "Invite: clientname"    "Invite: membershipid"  "Invite: clientid"  "Invite: dateofbirth"   "Invite: membershiptype"    "Invite: branch"    "Invite: unitid"    "Invite: shortname" "Invite: changedatetime"    "Invite: homephone" "Collector" '
row.gsub!(/"\s+"/, %Q["\t"]) # => "\"Status\"\t\"Internal ID\"\t\"Language\"\t\"Created At\"\t\"Updated At\"\t\"IP Address\"\t\"Location\"\t\"Username\"\t\"GET Variables\"\t\"Referrer\"\t\"Number of Saves\"\t\"Weighted Score\"\t\"Completion Time\"\t\"Invite Code\"\t\"Invite Email\"\t\"Invite Name\"\t\"Invite: branchid\"\t\"Invite: lastname\"\t\"Invite: clientname\"\t\"Invite: membershipid\"\t\"Invite: clientid\"\t\"Invite: dateofbirth\"\t\"Invite: membershiptype\"\t\"Invite: branch\"\t\"I...
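Once a line is tab-delimited like that, it can be handed to the CSV parser. This is a minimal sketch using a shortened, made-up header line; `split_quoted_fields` is a hypothetical helper name. It assumes the line is already a UTF-8 string and strips trailing whitespace first, since a space after the last closing quote would otherwise trip the parser:

```ruby
require "csv"

# Hypothetical helper: turn a whitespace-separated line of quoted fields
# into an array of field values.
def split_quoted_fields(line)
  tabbed = line.strip.gsub(/"\s+"/, %Q["\t"]) # same substitution as above
  CSV.parse_line(tabbed, col_sep: "\t")
end

fields = split_quoted_fields(%q("Status"    "Internal ID"   "Language" ))
p fields # => ["Status", "Internal ID", "Language"]
```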

Upvotes: 0
