Reputation: 5603
I am getting this error:
Illegal quoting in line 1. (CSV::MalformedCSVError)
Line 1 in my file is as follows:
"Status" "Internal ID" "Language" "Created At" "Updated At" "IP Address" "Location" "Username" "GET Variables" "Referrer" "Number of Saves" "Weighted Score" "Completion Time" "Invite Code" "Invite Email" "Invite Name" "Invite: branchid" "Invite: lastname" "Invite: clientname" "Invite: membershipid" "Invite: clientid" "Invite: dateofbirth" "Invite: membershiptype" "Invite: branch" "Invite: unitid" "Invite: shortname" "Invite: changedatetime" "Invite: homephone" "Collector"
My code looks like this:
CSV.foreach(file, :col_sep => "\t", :encoding => 'ISO-8859-1', :headers => true) do |column|
  puts column[0]
end
Since I have no control over the csv file, I would like a solution that doesn't involve me opening the file and saving it in another format.
EDIT:
Binary encoding of my file is below:
"\xFF\xFES\x00t\x00a\x00t\x00u\x00s\x00\t\x00I\x00n\x00t\x00e\x00r\x00n\x00a\x00l\x00 \x00I\x00D\x00\t\x00L\x00a\x00n\x00g\x00u\x00a\x00g\x00e\x00\t\x00C\x00r\x00e\x00a\x00t\x00e\x00d\x00 \x00A\x00t\x00\t\x00U\x00p\x00d\x00a\x00t\x00e\x00d\x00 \x00A\x00t\x00\t\x00I\x00P\x00 \x00A\x00d\x00d\x00r\x00e\x00s\x00s\x00\t\x00L\x00o\x00c\x00a\x00t\x00i\x00o\x00n\x00\t\x00U\x00s\x00e\x00r\x00n\x00a\x00m\x00e\x00\t\x00G\x00E\x00T\x00 \x00V\x00a\x00r\x00i\x00a\x00b\x00l\x00e\x00s\x00\t\x00R\x00e\x00f\x00e\x00r\x00r\x00e\x00r\x00\t\x00N\x00u\x00m\x00b\x00e\x00r\x00 \x00o\x00f\x00 \x00S\x00a\x00v\x00e\x00s\x00\t\x00W\x00e\x00i\x00g\x00h\x00t\x00e\x00d\x00 \x00S\x00c\x00o\x00r\x00e\x00\t\x00C\x00o\x00m\x00p\x00l\x00e\x00t\x00i\x00o\x00n\x00 \x00T\x00i\x00m\x00e\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00 \x00C\x00o\x00d\x00e\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00 \x00E\x00m\x00a\x00i\x00l\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00 \x00N\x00a\x00m\x00e\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00b\x00r\x00a\x00n\x00c\x00h\x00i\x00d\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00l\x00a\x00s\x00t\x00n\x00a\x00m\x00e\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00c\x00l\x00i\x00e\x00n\x00t\x00n\x00a\x00m\x00e\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00m\x00e\x00m\x00b\x00e\x00r\x00s\x00h\x00i\x00p\x00i\x00d\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00c\x00l\x00i\x00e\x00n\x00t\x00i\x00d\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00d\x00a\x00t\x00e\x00o\x00f\x00b\x00i\x00r\x00t\x00h\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00m\x00e\x00m\x00b\x00e\x00r\x00s\x00h\x00i\x00p\x00t\x00y\x00p\x00e\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00b\x00r\x00a\x00n\x00c\x00h\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00u\x00n\x00i\x00t\x00i\x00d\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00s\x00h\x00o\x00r\x00t\x00n\x00a\x00m\x00e\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 
\x00c\x00h\x00a\x00n\x00g\x00e\x00d\x00a\x00t\x00e\x00t\x00i\x00m\x00e\x00\t\x00I\x00n\x00v\x00i\x00t\x00e\x00:\x00 \x00h\x00o\x00m\x00e\x00p\x00h\x00o\x00n\x00e\x00\t\x00C\x00o\x00l\x00l\x00e\x00c\x00t\x00o\x00r\x00\t\x00\"\x00 \x00\t\x00A\x00r\x00e\x00 \x00y\x00o\x00u\x00 \x00c\x00u\x00r\x00r\x00e\x00n\x00t\x00l\x00y\x00 \x00a\x00 \x00m\x00e\x00m\x00b\x00e\x00r\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00Y\x00M\x00C\x00A\x00 \x00o\x00f\x00 \x00P\x00i\x00e\x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00b\x00r\x00a\x00n\x00c\x00h\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00c\x00u\x00r\x00r\x00e\x00n\x00t\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00?\x00\"\x00\t\x00\"\x00 \x00\t\x00H\x00o\x00w\x00 \x00d\x00o\x00e\x00s\x00 \x00t\x00h\x00e\x00 \x00Y\x00 \x00a\x00f\x00f\x00e\x00c\x00t\x00 \x00y\x00o\x00u\x00,\x00 \x00y\x00o\x00u\x00r\x00 \x00f\x00a\x00m\x00i\x00l\x00y\x00,\x00 \x00o\x00r\x00 \x00y\x00o\x00.\x00.\x00.\x00\"\x00\t\x00N\x00P\x00S\x00S\x00c\x00o\x00r\x00e\x00\t\x00C\x00o\x00m\x00m\x00e\x00n\x00t\x00\t\x00\"\x00 \x00\t\x00H\x00a\x00s\x00 \x00a\x00 \x00Y\x00 \x00s\x00t\x00a\x00f\x00f\x00 \x00m\x00e\x00m\x00b\x00e\x00r\x00 \x00s\x00u\x00p\x00p\x00o\x00r\x00t\x00e\x00d\x00 \x00y\x00o\x00u\x00 \x00i\x00n\x00 \x00r\x00e\x00a\x00c\x00h\x00i\x00n\x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00H\x00o\x00w\x00 \x00h\x00a\x00v\x00e\x00 \x00y\x00o\x00u\x00 \x00b\x00e\x00e\x00n\x00 \x00s\x00u\x00p\x00p\x00o\x00r\x00t\x00e\x00d\x00?\x00\"\x00\t\x00\"\x00 \x00\t\x00D\x00o\x00 \x00y\x00o\x00u\x00 \x00f\x00e\x00e\x00l\x00 \x00l\x00i\x00k\x00e\x00 \x00y\x00o\x00u\x00 \x00a\x00r\x00e\x00 \x00c\x00o\x00n\x00n\x00e\x00c\x00t\x00e\x00d\x00 \x00t\x00o\x00 \x00t\x00h\x00e\x00 \x00Y\x00,\x00 \x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00H\x00o\x00w\x00 \x00o\x00f\x00t\x00e\x00n\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00e\x00n\x00 \x00w\x00a\x00s\x00 
\x00t\x00h\x00e\x00 \x00l\x00a\x00s\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00y\x00o\x00u\x00 \x00v\x00i\x00s\x00i\x00t\x00e\x00d\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00f\x00o\x00l\x00l\x00o\x00w\x00i\x00n\x00g\x00 \x00a\x00c\x00t\x00i\x00v\x00i\x00t\x00i\x00e\x00s\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00.\x00.\x00.\x00 \x00[\x00U\x00s\x00e\x00d\x00 \x00w\x00e\x00i\x00g\x00h\x00t\x00s\x00,\x00 \x00e\x00x\x00e\x00r\x00c\x00i\x00s\x00e\x00 \x00e\x00q\x00u\x00i\x00p\x00m\x00e\x00n\x00t\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00f\x00o\x00l\x00l\x00o\x00w\x00i\x00n\x00g\x00 \x00a\x00c\x00t\x00i\x00v\x00i\x00t\x00i\x00e\x00s\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00.\x00.\x00.\x00 \x00[\x00U\x00s\x00e\x00d\x00 \x00p\x00o\x00o\x00l\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00f\x00o\x00l\x00l\x00o\x00w\x00i\x00n\x00g\x00 \x00a\x00c\x00t\x00i\x00v\x00i\x00t\x00i\x00e\x00s\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00.\x00.\x00.\x00 \x00[\x00B\x00a\x00s\x00k\x00e\x00t\x00b\x00a\x00l\x00l\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00f\x00o\x00l\x00l\x00o\x00w\x00i\x00n\x00g\x00 \x00a\x00c\x00t\x00i\x00v\x00i\x00t\x00i\x00e\x00s\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00.\x00.\x00.\x00 \x00[\x00W\x00a\x00l\x00k\x00e\x00d\x00 \x00t\x00h\x00e\x00 \x00t\x00r\x00a\x00c\x00k\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00f\x00o\x00l\x00l\x00o\x00w\x00i\x00n\x00g\x00 \x00a\x00c\x00t\x00i\x00v\x00i\x00t\x00i\x00e\x00s\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00.\x00.\x00.\x00 \x00[\x00T\x00o\x00o\x00k\x00 \x00a\x00 \x00c\x00l\x00a\x00s\x00s\x00]\x00\"\x00\t\x00\"\x00 
\x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00f\x00o\x00l\x00l\x00o\x00w\x00i\x00n\x00g\x00 \x00a\x00c\x00t\x00i\x00v\x00i\x00t\x00i\x00e\x00s\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00.\x00.\x00.\x00 \x00[\x00T\x00o\x00o\x00k\x00 \x00a\x00 \x00c\x00h\x00i\x00l\x00d\x00 \x00t\x00o\x00 \x00a\x00 \x00c\x00l\x00a\x00s\x00s\x00 \x00o\x00r\x00 \x00p\x00r\x00o\x00g\x00r\x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00f\x00o\x00l\x00l\x00o\x00w\x00i\x00n\x00g\x00 \x00a\x00c\x00t\x00i\x00v\x00i\x00t\x00i\x00e\x00s\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00.\x00.\x00.\x00 \x00[\x00W\x00h\x00i\x00c\x00h\x00 \x00c\x00l\x00a\x00s\x00s\x00 \x00o\x00r\x00 \x00p\x00r\x00o\x00g\x00r\x00a\x00m\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00t\x00i\x00c\x00i\x00p\x00a\x00t\x00e\x00 \x00i\x00n\x00?\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00i\x00c\x00h\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00f\x00o\x00l\x00l\x00o\x00w\x00i\x00n\x00g\x00 \x00a\x00c\x00t\x00i\x00v\x00i\x00t\x00i\x00e\x00s\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00.\x00.\x00.\x00 \x00[\x00W\x00h\x00i\x00c\x00h\x00 \x00c\x00l\x00a\x00s\x00s\x00 \x00o\x00r\x00 \x00p\x00r\x00o\x00g\x00r\x00a\x00m\x00 \x00d\x00i\x00d\x00 \x00y\x00o\x00u\x00 \x00p\x00a\x00r\x00t\x00i\x00c\x00i\x00p\x00a\x00t\x00e\x00 \x00i\x00n\x00?\x00]\x00 \x00[\x00t\x00e\x00x\x00t\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00H\x00o\x00w\x00 \x00w\x00o\x00u\x00l\x00d\x00 \x00y\x00o\x00u\x00 \x00r\x00a\x00t\x00e\x00 \x00t\x00h\x00e\x00 \x00q\x00u\x00a\x00l\x00i\x00t\x00y\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 \x00p\x00r\x00o\x00g\x00r\x00a\x00m\x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00H\x00o\x00w\x00 \x00w\x00o\x00u\x00l\x00d\x00 \x00y\x00o\x00u\x00 \x00r\x00a\x00t\x00e\x00 \x00t\x00h\x00e\x00 \x00q\x00u\x00a\x00l\x00i\x00t\x00y\x00 \x00o\x00f\x00 \x00t\x00h\x00e\x00 
\x00c\x00l\x00e\x00a\x00n\x00l\x00i\x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00H\x00o\x00w\x00 \x00c\x00o\x00u\x00r\x00t\x00e\x00o\x00u\x00s\x00 \x00a\x00n\x00d\x00 \x00r\x00e\x00s\x00p\x00o\x00n\x00s\x00i\x00v\x00e\x00 \x00w\x00a\x00s\x00 \x00t\x00h\x00e\x00 \x00Y\x00 \x00s\x00t\x00a\x00f\x00f\x00 \x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00C\x00a\x00n\x00 \x00y\x00o\x00u\x00 \x00g\x00i\x00v\x00e\x00 \x00a\x00n\x00 \x00e\x00x\x00a\x00m\x00p\x00l\x00e\x00 \x00o\x00f\x00 \x00h\x00o\x00w\x00 \x00t\x00h\x00a\x00t\x00 \x00s\x00t\x00a\x00f\x00f\x00 \x00m\x00e\x00m\x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00H\x00o\x00w\x00 \x00c\x00o\x00u\x00l\x00d\x00 \x00t\x00h\x00e\x00 \x00s\x00t\x00a\x00f\x00f\x00 \x00h\x00a\x00v\x00e\x00 \x00b\x00e\x00e\x00n\x00 \x00m\x00o\x00r\x00e\x00 \x00h\x00e\x00l\x00p\x00f\x00u\x00l\x00?\x00<\x00/\x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00D\x00o\x00 \x00y\x00o\x00u\x00 \x00h\x00a\x00v\x00e\x00 \x00a\x00n\x00y\x00 \x00o\x00t\x00h\x00e\x00r\x00 \x00c\x00o\x00m\x00m\x00e\x00n\x00t\x00s\x00 \x00a\x00b\x00o\x00u\x00t\x00 \x00y\x00o\x00u\x00r\x00 \x00Y\x00 \x00e\x00.\x00.\x00.\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x005\x00 \x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x006\x00 \x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x007\x00 \x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 
\x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x008\x00 \x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x009\x00 \x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x001\x000\x00 \x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x001\x001\x00 \x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x001\x002\x00 \x00a\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x001\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x002\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x003\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 
\x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x004\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x005\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x006\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x007\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x008\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x009\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00t\x00i\x00m\x00e\x00 \x00d\x00o\x00 \x00y\x00o\x00u\x00 \x00u\x00s\x00u\x00a\x00l\x00l\x00y\x00 \x00v\x00i\x00s\x00i\x00t\x00 \x00t\x00h\x00e\x00 \x00Y\x00?\x00 \x00[\x001\x000\x00 \x00p\x00m\x00]\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00i\x00s\x00 \x00y\x00o\x00u\x00r\x00 \x00g\x00e\x00n\x00d\x00e\x00r\x00?\x00\"\x00\t\x00\"\x00 \x00\t\x00W\x00h\x00a\x00t\x00 \x00i\x00s\x00 \x00y\x00o\x00u\x00r\x00 \x00a\x00g\x00e\x00 \x00g\x00r\x00o\x00u\x00p\x00?\x00\"\x00\t\x00W\x00h\x00a\x00t\x00 \x00i\x00s\x00 
\x00y\x00o\x00u\x00r\x00 \x00c\x00u\x00r\x00r\x00e\x00n\x00t\x00 \x00m\x00e\x00m\x00b\x00e\x00r\x00s\x00h\x00i\x00p\x00 \x00t\x00y\x00p\x00e\x00?\x00\n"
Upvotes: 1
Views: 3098
Reputation: 114188
Binary encoding of my file is below:
"\xFF\xFES\x00t\x00a\x00t\x00u\x00s\x00...
0xFF 0xFE is the byte order mark for UTF-16LE.
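As a rough illustration of that check, the sketch below looks at the first few bytes of a string or file and reports which byte order mark, if any, is present. The names BOMS, sniff_bom and sniff_bom_in_file are made up for this example and are not part of Ruby or the CSV library.

```ruby
# Known byte order marks. The 4-byte marks come first so that UTF-32LE
# (FF FE 00 00) is not mistaken for UTF-16LE (FF FE).
BOMS = {
  "\x00\x00\xFE\xFF".b => "UTF-32BE",
  "\xFF\xFE\x00\x00".b => "UTF-32LE",
  "\xEF\xBB\xBF".b     => "UTF-8",
  "\xFE\xFF".b         => "UTF-16BE",
  "\xFF\xFE".b         => "UTF-16LE",
}.freeze

# Returns the encoding name implied by a leading BOM, or nil if there is none.
def sniff_bom(bytes)
  head = bytes.b[0, 4]
  BOMS.each { |bom, name| return name if head.start_with?(bom) }
  nil
end

# Same check, but reads only the first four bytes of a file (hypothetical helper).
def sniff_bom_in_file(path)
  sniff_bom(File.binread(path, 4) || "")
end

p sniff_bom("\xFF\xFES\x00t\x00".b) # => "UTF-16LE"
```

A file without a BOM returns nil here, so this only confirms an encoding when a mark is actually present.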
You have to specify the encoding when processing this file with CSV.foreach:
This method also understands an additional :encoding parameter that you can use to specify the Encoding of the data in the file to be read. You must provide this unless your data is in Encoding::default_external(). CSV will use this to determine how to parse the data. You may provide a second Encoding to have the data transcoded as it is read. For example, encoding: "UTF-32BE:UTF-8" would read UTF-32BE data from the file but transcode it to UTF-8 before CSV parses it.
Furthermore, you have to specify that a BOM is present. According to the IO.new docs:
If “BOM|UTF-8”, “BOM|UTF-16LE” or “BOM|UTF16-BE” are (...) present, the BOM is stripped
Applied to your file and example:
CSV.foreach(file, col_sep: "\t", encoding: "BOM|UTF-16LE:UTF-8", headers: true) do |row|
# ...
end
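To try this without the original export, here is a self-contained sketch that writes a small tab-separated UTF-16LE file with a BOM and reads it back using the same options. The sample data and the helper names write_sample and first_column are invented for illustration.

```ruby
require "csv"
require "tempfile"

# Writes hypothetical sample data as UTF-16LE, preceded by a BOM (U+FEFF).
def write_sample(path)
  text = "\uFEFFStatus\tInternal ID\nComplete\t42\nPartial\t43\n"
  File.binwrite(path, text.encode("UTF-16LE"))
end

# Reads the file with the options from the answer and collects the first
# column of every data row. The BOM is stripped and the data is transcoded
# to UTF-8 before CSV parses it.
def first_column(path)
  values = []
  CSV.foreach(path, col_sep: "\t", encoding: "BOM|UTF-16LE:UTF-8", headers: true) do |row|
    values << row[0]
  end
  values
end

Tempfile.create(["sample", ".csv"]) do |f|
  f.close
  write_sample(f.path)
  p first_column(f.path)
end
```

Because the BOM is stripped, the first header should come through as plain "Status", so row["Status"] can be used in place of row[0].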
Upvotes: 4
Reputation: 160551
On *nix systems, the file command gives a reasonable hint as to what a file contains:
file /usr/share/dict/words
/usr/share/dict/words: ASCII text
file /usr/bin/ruby
/usr/bin/ruby: Mach-O universal binary with 2 architectures
/usr/bin/ruby (for architecture i386): Mach-O executable i386
/usr/bin/ruby (for architecture x86_64): Mach-O 64-bit executable x86_64
If you're on *nix, try running that against your CSV file and see what it says. It's not fool-proof, but it's reasonably accurate.
As something to get you started, here's how to convert space-delimited fields to tab-delimited:
row = '"Status" "Internal ID" "Language" "Created At" "Updated At" "IP Address" "Location" "Username" "GET Variables" "Referrer" "Number of Saves" "Weighted Score" "Completion Time" "Invite Code" "Invite Email" "Invite Name" "Invite: branchid" "Invite: lastname" "Invite: clientname" "Invite: membershipid" "Invite: clientid" "Invite: dateofbirth" "Invite: membershiptype" "Invite: branch" "Invite: unitid" "Invite: shortname" "Invite: changedatetime" "Invite: homephone" "Collector" '
row.gsub!(/"\s+"/, %Q["\t"]) # => "\"Status\"\t\"Internal ID\"\t\"Language\"\t\"Created At\"\t\"Updated At\"\t\"IP Address\"\t\"Location\"\t\"Username\"\t\"GET Variables\"\t\"Referrer\"\t\"Number of Saves\"\t\"Weighted Score\"\t\"Completion Time\"\t\"Invite Code\"\t\"Invite Email\"\t\"Invite Name\"\t\"Invite: branchid\"\t\"Invite: lastname\"\t\"Invite: clientname\"\t\"Invite: membershipid\"\t\"Invite: clientid\"\t\"Invite: dateofbirth\"\t\"Invite: membershiptype\"\t\"Invite: branch\"\t\"I...
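As a possible follow-up to that substitution, the sketch below feeds the converted line to CSV.parse_line to get the individual fields. It uses a shortened, made-up header line, and the helper name split_quoted_fields is hypothetical. The strip call is there because a trailing space after the last closing quote would otherwise be treated as malformed input.

```ruby
require "csv"

# Converts whitespace between quoted fields to tabs, then parses the line
# as tab-separated CSV. Whitespace inside the quotes is left untouched.
def split_quoted_fields(line)
  tabbed = line.strip.gsub(/"\s+"/, %Q["\t"])
  CSV.parse_line(tabbed, col_sep: "\t")
end

p split_quoted_fields('"Status" "Internal ID" "Language" ')
```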
Upvotes: 0