JP Silvashy

Reputation: 48525

Difficulty determining the file type of text database file

The USDA publishes a somewhat odd database of general nutrition facts about food, and naturally we want to use it in our app. The lines are formatted like this:

~01001~^~0100~^~Butter, salted~^~BUTTER,WITH SALT~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
~01002~^~0100~^~Butter, whipped, with salt~^~BUTTER,WHIPPED,WITH SALT~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
~01003~^~0100~^~Butter oil, anhydrous~^~BUTTER OIL,ANHYDROUS~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
~01004~^~0100~^~Cheese, blue~^~CHEESE,BLUE~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87

The values are separated by those odd ~ and ^ characters. The file also lacks a header row, but that's OK; I can figure that out from the other documentation on their site: http://www.ars.usda.gov/Services/docs.htm?docid=8964

Any help identifying this format and how to parse it would be great! If it matters, we're making an open/free API with Ruby to query this data.

Additionally, I'm having a tough time posing this question, so I've made it a community wiki so we can all pitch in!

Upvotes: 0

Views: 84

Answers (2)

DVK

Reputation: 129403

This looks like a very standard CSV (comma-separated values) file, except that the field separator character was changed from , to ^ and the quote character from " to ~.

Unfortunately, I'm not familiar enough with Ruby to recommend a library, but in Perl there's a boatload of standard CPAN modules, the best of which let you configure both the field separator and the quote character of a CSV reader. I would expect Ruby to have something similar; if so, you're in luck!
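For the Ruby side, the standard `csv` library accepts `col_sep` and `quote_char` options, which should be enough to read this file. Below is a minimal sketch using two lines copied from the question; the `parse_usda` helper and the `SAMPLE_LINES` constant are hypothetical names for illustration, and it assumes the rest of the file follows the same pattern as the sample.

```ruby
require "csv"

# Two sample lines copied from the question.
SAMPLE_LINES = <<~TEXT
  ~01001~^~0100~^~Butter, salted~^~BUTTER,WITH SALT~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
  ~01004~^~0100~^~Cheese, blue~^~CHEESE,BLUE~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
TEXT

# Hypothetical helper: parse text that uses ^ as the field separator
# and ~ as the quote character, returning an array of rows.
def parse_usda(text)
  CSV.parse(text, col_sep: "^", quote_char: "~")
end

parse_usda(SAMPLE_LINES).each do |row|
  # row[0] is the first field, row[2] the long description.
  puts "#{row[0]}: #{row[2]} (#{row.length} fields)"
end
```

Every value comes back as a string, so numeric columns such as the trailing `6.38` would still need converting (for example with `Float(...)`) before use.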

Upvotes: 3

Bob Kaufman

Reputation: 12815

^ appears to be a field delimiter and ~ a string delimiter. Normally I'd expect to see , and " in those roles, but the choice of the very uncommon characters means that a string like

Cheese, Bleu

won't trip up the parser, even though it contains a comma.
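To make the point concrete, here is a small Ruby sketch contrasting a naive split on commas with a parse that treats ^ as the field delimiter and ~ as the string delimiter. The line is a shortened version of the cheese row from the question, and `naive_comma_split` and `caret_parse` are hypothetical helper names, not part of any library.

```ruby
require "csv"

# Shortened version of the "Cheese, blue" row from the question.
CHEESE_LINE = "~01004~^~0100~^~Cheese, blue~^~CHEESE,BLUE~"

# Naive approach: split on commas, which cuts through the descriptions.
def naive_comma_split(line)
  line.split(",")
end

# Delimiter-aware approach: ^ separates fields, ~ wraps strings.
def caret_parse(line)
  CSV.parse_line(line, col_sep: "^", quote_char: "~")
end

p naive_comma_split(CHEESE_LINE)
# → ["~01004~^~0100~^~Cheese", " blue~^~CHEESE", "BLUE~"]

p caret_parse(CHEESE_LINE)
# → ["01004", "0100", "Cheese, blue", "CHEESE,BLUE"]
```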

Upvotes: 1
