Reputation: 611
I have a Rails API using rails 5.2.1 and I'm developing a Web Scraping API that extracts web data using the Mechanize Gem.
What I'm going to do is, I'm going to pass all of this web scraping data data to React.js but first I need to get this data on my database which I'm having hard time how to figure this out.
So when I extract my data and I receive it all in a JSON file (called data.json), which I'm stunned I got until here haha.
{
"Brazil":[
{"Jungle Plants":[bla bla bla ]},
{"Desert Plants":[ bla bla bla ]}],
"Egypt":[
{"Jungle Plants":[bla bla bla ]},
{"Desert Plants":[ bla bla bla ]}]
and so on...
}
So the next thing I need to do is separate my JSON data into my database which I already have a migration setup like this:
class CreatePlants < ActiveRecord::Migration[5.2]
def change
create_table :plants do |t|
t.string :country_name
t.string :plant_categories
t.timestamps
end
end
end
So up in the table what I want to is, I want that the country name like 'Brazil' or 'Egypt' or whatever country name is goes to the country_name column and so for the plant_categories.
My question is, how can I organize/separate my data.json
file in to my database? I will appreciate so much your help! <3
Upvotes: 2
Views: 3918
Reputation: 2130
This would be my approach, I would use a JSON or JSONB column to save each array as they come from the body, so I would change the migration to this:
class CreatePlants < ActiveRecord::Migration[5.2]
def change
create_table :plants do |t|
t.string :country_name
t.jsonb :data
t.timestamps
end
end
end
So let's say you have your parsed response (after you applied JSON.parse
on the body), resulting in the following hash:
response =
{
"Brazil": [
{ "Jungle Plants": [ bla bla bla ] },
{ "Desert Plants": [ bla bla bla ] }
],
"Egypt": [
{ "Jungle Plants": [ bla bla bla ] },
{ "Desert Plants": [ bla bla bla ] }
]
and so on...
}
I'd do:
response.each { |key, value| Plant.create!(country_name: key, data: value) }
That way, if you execute the following, after the previous statement:
Plant.find_by(country_name: 'Brazil').data
You would get your array:
[{ "Jungle Plants": [ bla bla bla ] }, { "Desert Plants": [ bla bla bla ] }]
But it is your decision, you can perfectly tackle this "problem" in different ways, for example, you could save the arrays as an string
and then retrieve them from the DB and apply a JSON.parse
or apply the serialize method in the model definition to convert them to an array
again. Another option is to filter the response even further and save them in many more columns.
I would say that the best way depends on what you want to do with the data afterwards, but for flexibility reasons my choice would be the one I stated before, to save the array as they come from the API to a JSONB column and then do whatever I want with them (another advantage is that saving the data as JSONB allows you to perfom queries on them in ways you couldn't do with just a string)
Upvotes: 2