Richlewis
Richlewis

Reputation: 15374

Formatting a csv file to correct format

I have downloaded a csv file using HTTParty and saved the file locally so I can inspect it at a later date, however it seems as if the data is not formatted correctly

[["Team Name", "User Name", "Dataset Name", "No of Searches", "Credits Remaining"], ["", "",
"DRI", "129", "99085"], ["", "", "Property Register Search (G)", "124", "99414"], ["", "",
"Landline Verification", "1", "99783"], ["", "",
"Equifax (G)", "372", "97798"], ["", "", "Director Register", "135", "98499"], ["", "",
"Mobile Verification", "2", "99845"], ["", "",
"BT OSIS", "428", "91588"], ["", "", 
"Experian (G)", "97", "99913"], ["", "", "Standard (G)",
"873", "82151"], ["", "", "CCJ", "120", "98367"]]

So that I can use the CSV class provided by ruby i need the data to be in the following format don't I ?

Team Name, User Name, Dataset Name, No of Searches, Credits Remaining
"", "", DRI, 129, 99085
"", "", Property Register Search (G), 124, 99414]
"", "", Landline Verification, 1, 99783 
"", "", Equifax (G), 372, 97798
"", "", Director Register, 135, 98499
"", "", Mobile Verification, 2, 99845 
"", "", BT OSIS, 428, 91588]
"", "", Experian (G), 97, 99913
"", "", Standard (G), 873, 82151
"", "", CCJ, 120, 98367

What i am looking to achieve is to get to a point where i can hash this out so i could access Credits Remaining for the Dataset Name Standard

Hope that makes sense

Thanks

UPDATE

Thanks to @mudasobwa for your answer, i now have my csv file contents in an array of hashes (I think :) )

{"TeamName"=>[nil, nil, nil, nil, nil, nil, nil, nil, nil, nil],
 "UserName"=>[nil, nil, nil, nil, nil, nil, nil, nil, nil, nil],
 "DatasetName"=> ["DRI", "PropertyRegisterSearch(G)", "LandlineVerification","Equifax(G)", "DirectorRegister", "MobileVerification", "BTOSIS", "Experian(G)", "Standard(G)","CCJ"],
 "NoofSearches"=>["129", "124", "1", "372", "135", "2", "428", "97", "873", "120"],
 "CreditsRemaining"=>["99085", "99414", "99783", "97798", "98499", "99845", "91588", "99913", "82151", "98367"]
}

How would i get the NoofSearches that DatasetName DRI corresponds to, so I would expect to get 129 returned

Upvotes: 0

Views: 661

Answers (3)

hallucinations
hallucinations

Reputation: 3454

Another solution using Array#zip.

Apparently, the file that you downloaded is not in CSV format. But, it seems like the string that you have in the file can be directly evaluated into a Ruby Array even though it's hacky.

#!/usr/bin/env ruby

file = File.open("test.data", "r")
#NOTE: eval is evil!
csv_arrs = eval(file.read.gsub("\n", "")) 
file.close

headers = csv_arrs.shift
query = {
  :select => "No of Searches",
  :key => "Dataset Name",
  :value => "DRI"
}

r = csv_arrs.find {|a| Hash[ headers.zip(a) ][ query[:key] ] == query[:value]}
puts r[headers.index(query[:select])]

Upvotes: 0

Aleksei Matiushkin
Aleksei Matiushkin

Reputation: 120990

▶ csv = [["Team Name", "User Name", "Dataset Name", "No of Searches", "Credits Remaining"], ["", "",
▷   "DRI", "129", "99085"], ["", "", "Property Register Search (G)", "124", "99414"], ["", "",  
▷   "Landline Verification", "1", "99783"], ["", "",  
▷   "Equifax (G)", "372", "97798"], ["", "", "Director Register", "135", "98499"], ["", "",  
▷   "Mobile Verification", "2", "99845"], ["", "",  
▷   "BT OSIS", "428", "91588"], ["", "",   
▷   "Experian (G)", "97", "99913"], ["", "", "Standard (G)",  
▷ "873", "82151"], ["", "", "CCJ", "120", "98367"]]

Then the following will give you what you want:

▶ csv.transpose.map { |e| [e.shift, e] }.to_h

or:

▶ csv.transpose.group_by(&:shift).map { |k, v| [k, v.first] }.to_h

To access the NoofSearches that DatasetName DRI corresponds to:

▶ hash = csv.transpose.map { |e| [e.shift, e] }.to_h
# ⇓ lookup array of noofs
#                        ⇓ by index of 'DRI' in 'Dataset Name' 
▶ hash['No of Searches'][hash['Dataset Name'].index('DRI')]

Upvotes: 1

Oliver Zeyen
Oliver Zeyen

Reputation: 853

This example should turn your csv into a array of hashes with the data accessible by former colum names.

data = []

CSV.foreach('test.csv', headers: true) { |row| data << row.to_hash }

data.inspect

=> [{:col1=>'value1', :col2=>'value2', :col3=> 'value3'}, 
    {:col1=>'value4', :col2=>"value5", :col3=>'value6'}]

Contents of data.csv looked like this:

col1,col2,col3
value1,value2,value3
value4,value5,value6

Upvotes: 1

Related Questions