Shamith c
Shamith c

Reputation: 3739

Convert .json to .csv in ruby

I want to convert .json file into .csv file using ruby. Pleases help me to do this.

Also propose any tool to achieve this.

Upvotes: 21

Views: 39738

Answers (7)

korCZis
korCZis

Reputation: 580

I think the easies way to convert the JSON to CSV in ruby is using json2csv ruby gem.

PS: I may be biased as I am author of this.

json2csv github repo

Running the command

json2csv convert data/sample.json

Input File

In this particular case it will convert following json:

cat data/sample.json

{
    "12345": {
        "Firstname": "Joe",
        "Lastname": "Doe",
        "Address": {
            "Street": "#2140 Taylor Street, 94133",
            "City": "San Francisco",
            "Details": {
                "note": "Pool available"
            }
        }
    },

    "45678": {
        "Firstname": "Jack",
        "Lastname": "Plumber",
        "Address": {
            "Street": "#111 Sutter St, 94104",
            "City": "San Francisco",
            "Details": {
                "note": "Korean Deli near to main entrance"
            }
        }
    }
}

Output

cat data/sample.json.csv

id,Firstname,Lastname,Address.Street,Address.City,Address.Details.note
12345,Joe,Doe,"#2140 Taylor Street, 94133",San Francisco,Pool available
45678,Jack,Plumber,"#111 Sutter St, 94104",San Francisco,Korean Deli near to main entrance

Upvotes: 5

Ishwaryaa Balaji
Ishwaryaa Balaji

Reputation: 1

This is handle headers & nested json.

require 'csv'
require 'json'

@headers = []
file = File.open('file.json')
JSON.parse(file.read).each do |h|
  h.keys.each do |key|
    @headers << key
  end
end

uniq_headers = @headers.uniq
file = File.open('file.json')
finalrow = []

JSON.parse(file.read).each do |h|
  final = {}
  @headers.each do |key2|
    final[key2] = h[key2]
  end

  finalrow << final
end

CSV.open('output.csv' , 'w') do |csv|
  csv << uniq_headers
  finalrow.each do |deal|
    csv << deal.values
  end
end

Upvotes: 0

Sculper
Sculper

Reputation: 755

Edit:

This functionality described below is now available as a gem. After installing with gem install json_converter, the following snippet can be used to generate a CSV from a valid JSON string or object:

require 'json_converter'
json_converter= JsonConverter.new

# Assume json is a valid JSON string or object
csv = json_converter.generate_csv json

Original Answer:

If your JSON data is relatively simple (no nesting or arrays), Alex's answer is probably the cleanest way of handling this problem.

However, if you do need to take arrays and nested objects into account, I've attempted to port a web version of such a converter to ruby. It can be found here. The methods that handle the actual restructuring of data are array_from and flatten.

The array_from method attempts to identify what a "row" of data looks like for a given dataset. It is not perfect, and you may want to tweak this part for different datasets.

# Attempt to identify what a "row" should look like
def array_from(json)
  queue, next_item = [], json
  while !next_item.nil?

    return next_item if next_item.is_a? Array

    if next_item.is_a? Hash
      next_item.each do |k, v|
        queue.push next_item[k]
      end
    end

    next_item = queue.shift
  end

  return [json]
end

The flatten method recursively iterates over the JSON object(s), and generates an object that represents headers and values. If an object is nested, the header for its column will be prefixed with its parent key(s), delimited by the / character.

# The path argument is used to construct header columns for nested elements
def flatten(object, path='')
  scalars = [String, Integer, Fixnum, FalseClass, TrueClass]
  columns = {}

  if [Hash, Array].include? object.class
    object.each do |k, v|
      new_columns = flatten(v, "#{path}#{k}/") if object.class == Hash
      new_columns = flatten(k, "#{path}#{k}/") if object.class == Array
      columns = columns.merge new_columns
    end

    return columns
  elsif scalars.include? object.class
      # Remove trailing slash from path
      end_path = path[0, path.length - 1]
      columns[end_path] = object
      return columns
  else
    return {}
  end
end

If there are any null values in the original JSON, you'll need to convert these to something other than nil before attempting the conversion - you'll generally end up with uneven rows if you don't. The nils_to_strings method handles that:

# Recursively convert all nil values of a hash to empty strings
def nils_to_strings(hash)
  hash.each_with_object({}) do |(k,v), object|
    case v
    when Hash
      object[k] = nils_to_strings v
    when nil
      object[k] = ''
    else
      object[k] = v
    end
  end
end

Here's a brief example of how this would be used:

json = JSON.parse(File.open('in.json').read)
in_array = array_from json
in_array.map! { |x| nils_to_strings x }

out_array = []
in_array.each do |row|
  out_array[out_array.length] = flatten row
end

headers_written = false
CSV.open('out.csv', 'w') do |csv|
  out_array.each do |row|
    csv << row.keys && headers_written = true if headers_written === false
    csv << row.values
  end
end

And finally, here's some example input/output:

Input:

{
  "Forms": [
    {
      "Form": {
        "id": "x",
        "version_id": "x",
        "name": "x",
        "category": "",
        "subcategory": null,
        "is_template": null,
        "moderation_status": "x",
        "display_status": "x",
        "use_ssl": "x",
        "modified": "x",
        "Aggregate_metadata": {
          "id": "x",
          "response_count": "x",
          "submitted_count": "x",
          "saved_count": "x",
          "unread_count": "x",
          "dropout_rate": "x",
          "average_completion_time": null,
          "is_uptodate": "x"
        }
      },
      "User": {
        "username": "[email protected]"
      }
    },
    {
      "Form": {
        "id": "x",
        "version_id": "x",
        "name": "x",
        "category": "",
        "subcategory": null,
        "is_template": null,
        "moderation_status": "x",
        "display_status": "x",
        "use_ssl": "x",
        "modified": "x",
        "Aggregate_metadata": {
          "id": "x",
          "response_count": "x",
          "submitted_count": "x",
          "saved_count": "x",
          "unread_count": "x",
          "dropout_rate": "x",
          "average_completion_time": null,
          "is_uptodate": "x"
        }
      },
      "User": {
        "username": "[email protected]"
      }
    }
  ]
}

Output:

Form/id,Form/version_id,Form/name,Form/category,Form/subcategory,Form/is_template,Form/moderation_status,Form/display_status,Form/use_ssl,Form/modified,Form/Aggregate_metadata/id,Form/Aggregate_metadata/response_count,Form/Aggregate_metadata/submitted_count,Form/Aggregate_metadata/saved_count,Form/Aggregate_metadata/unread_count,Form/Aggregate_metadata/dropout_rate,Form/Aggregate_metadata/average_completion_time,Form/Aggregate_metadata/is_uptodate,User/username
x,x,x,"","","",x,x,x,x,x,x,x,x,x,x,"",x,[email protected]
x,x,x,"","","",x,x,x,x,x,x,x,x,x,x,"",x,[email protected]

Upvotes: 3

David Lio
David Lio

Reputation: 327

Ruby code to convert from json to csv.

Handles arrays by converting them to double quoted strings.

#json-to-csv.rb
require 'json'
require 'csv'

file = File.read('./input.json')
hash = JSON.parse(file)

CSV.open('./output.csv', 'w') do |csv|
  headers = hash.first.keys
  csv << headers

  hash.each do |item|
    values = item.values
    printable_values = Array.new
    values.each do |value|

      printable_values << value.to_s.gsub(/\[|\]/,'').gsub(/"/,'\'')

    end

    csv << printable_values

  end

end

Upvotes: 3

fguillen
fguillen

Reputation: 38772

Based on @Alex's answer but adding csv headers and example test.

# utils.rb
require "csv"
module Utils
  def self.array_of_hashes_to_csv(array_of_hashes)
    CSV.generate do |csv|
      csv << array_of_hashes.first.keys
      array_of_hashes.each { |hash| csv << hash.values }
    end
  end
end

# utils_test.rb
class UtilsTest < MiniTest::Unit::TestCase
  def test_array_of_hashes_to_csv
    array_of_hashes = [
      { :key1 => "value1", :key2 => "value2" },
      { :key1 => "value3", :key2 => "value4" }
    ]
    expected_result = "key1,key2\nvalue1,value2\nvalue3,value4\n"
    assert_equal(expected_result, Utils.array_of_hashes_to_csv(array_of_hashes))
  end
end

Upvotes: 4

Mark Locklear
Mark Locklear

Reputation: 5325

To actually write to file...

require 'csv'
require 'json'

CSV.open("your_csv.csv", "w") do |csv| #open new file for write
  JSON.parse(File.open("your_json.json").read).each do |hash| #open json to parse
    csv << hash.values #write value to file
  end
end

Upvotes: 13

Alex Peattie
Alex Peattie

Reputation: 27647

Try something like this:

require 'csv'
require 'json'

csv_string = CSV.generate do |csv|
  JSON.parse(File.open("foo.json").read).each do |hash|
    csv << hash.values
  end
end

puts csv_string

Upvotes: 39

Related Questions