user2224121

Reputation: 11

Filter large json file with Ruby

As a total beginner at programming, I am trying to filter a JSON file for my master's thesis at university. The file contains approximately 500 hashes, of which 115 are the ones I am interested in.

What I want to do:

(1) Filter the file and select the hashes I am interested in

(2) For each selected hash, return only some specific keys

The format of the array with the hashes ("loans") included:

{"header": {
   "total":546188,
   "page":868,
   "date":"2013-04-11T10:21:24Z",
   "page_size":500},
 "loans": [{
   "id":427853,
   "name":"Peter Pan",
   ...,
   "status":"expired",
   "paid_amount":525,
   ...,
   "activity":"Construction Supplies",
   "sector":"Construction" },
    ... ]
 }

Being specific, I would like to have the following:

(1) Select the "loans" hashes with "status":"expired"

(2) Return for each such "expired" loan certain keys only: "id", "name", "activity", ...

(3) Eventually, export all that into one file that I can analyse in Excel or with some stats software (SPSS or Stata)

What I have come up with myself so far is this:

require 'rubygems'
require 'json'

toberead = File.read('loans_868.json')
another = JSON.parse(toberead)

read = another.select {|hash| hash['status'] == 'expired'}

puts hash

This is obviously incomplete, and I feel totally lost. Despite having googled and read through tons of articles on how to filter JSON, I don't know where or how to continue.

Is there anyone who can help me with this?

Upvotes: 1

Views: 1852

Answers (1)

mattwise

Reputation: 1506

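The JSON will be parsed as a hash, not an array: 'header' is one key and 'loans' is another. That means select has to be called on the array stored under 'loans', not on the top-level hash.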

So after your JSON.parse line, you can do

loans = another['loans']

Now loans is an array of hashes, each hash representing one of your loans. You can then do

expired_loans = loans.select {|loan| loan['status'] == 'expired'}
puts expired_loans

to get at your desired output.
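
For parts (2) and (3) of the question, here is a minimal sketch of one way to keep only certain keys and write the result as CSV, which Excel, SPSS and Stata can all import. The names WANTED_KEYS, pick_loans and loans_to_csv are made up for this example, the second loan in the inline sample is invented test data, and the output filename in the last comment is only a placeholder. It assumes a reasonably recent Ruby with the standard csv library available.

```ruby
require 'json'
require 'csv'

# Hypothetical list of columns to keep; add whatever other keys you need.
WANTED_KEYS = %w[id name activity].freeze

# Return the loans with the given status, each reduced to the wanted keys.
def pick_loans(data, status, keys = WANTED_KEYS)
  data['loans']
    .select { |loan| loan['status'] == status }
    .map { |loan| keys.map { |key| [key, loan[key]] }.to_h }
end

# Render the rows as CSV text: one header line, then one line per loan.
def loans_to_csv(rows, keys = WANTED_KEYS)
  CSV.generate do |csv|
    csv << keys
    rows.each { |row| csv << row.values_at(*keys) }
  end
end

# Small inline sample standing in for JSON.parse(File.read('loans_868.json')).
sample = JSON.parse(<<~JSON)
  {"header": {"page_size": 500},
   "loans": [
     {"id": 427853, "name": "Peter Pan", "status": "expired",
      "activity": "Construction Supplies", "sector": "Construction"},
     {"id": 1, "name": "Wendy", "status": "paid",
      "activity": "Farming", "sector": "Agriculture"}
   ]}
JSON

expired = pick_loans(sample, 'expired')
puts loans_to_csv(expired)

# To save instead of print (placeholder filename):
# File.write('expired_loans.csv', loans_to_csv(expired))
```

With your real file, replace the inline sample with the hash you already get from JSON.parse and pass it to pick_loans the same way.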

Upvotes: 2
