iimah9

Reputation: 1

Extract specific data from JSON file

I have a JSON file of recipes, available at https://raw.githubusercontent.com/Cyral/Bakeoof/master/full_format_recipes.json

I used pandas to open it:

import pandas as pd
df = pd.read_json('full_format_recipes.json', lines=True)
print(df)

This is the output I get:

                                               0      \
0  {'directions': ['1. Place the stock, lentils, ...   

                                               1      \
0  {'directions': ['Combine first 9 ingredients i...   

                                               2      \
0  {'directions': ['In a large heavy saucepan coo...   

                                               3      \
0  {'directions': ['Heat oil in heavy large skill...   

                                               4      \
0  {'directions': ['Preheat oven to 350°F. Lightl...   

                                               5      \
0  {'directions': ['Mix basil, mayonnaise and but...   

                                               6      \
0  {'directions': ['Cook potatoes and carrots in ...   

                                               7      \
0  {'directions': ['Stir together sugar and chili...   

                                               8      \
0  {'directions': ['Stir together soy sauce, suga...   

                                               9      ...  \
0  {'directions': ['Chop enough parsley leaves to...  ...   

                                               20120  \
0  {'directions': ['Bring all ingredients to a si...   

                                               20121  \
0  {'directions': ['1. Preheat the oven to 400°F....   

                                               20122  \
0  {'directions': ['Mix first 4 ingredients in la...   

                                               20123  \
0  {'directions': ['Stir water, sugar and juice i...   

                                               20124  \
0  {'directions': ['Wash spareribs. Remove excess...   

                                               20125  \
0  {'directions': ['Beat whites in a bowl with an...   

                                               20126  \
0  {'directions': ['Bring broth to simmer in sauc...   

                                               20127  \
0  {'directions': ['Using a sharp knife, cut a sh...   

                                               20128  \
0  {'directions': ['Heat 2 tablespoons oil in hea...   

                                               20129  
0  {'directions': ['Position rack in bottom third...  

[1 rows x 20130 columns]

I want to extract only the directions and the title for each recipe. How can I do that?

Upvotes: 0

Views: 165

Answers (2)

umarf786

Reputation: 1

If you remove lines=True, pandas should parse the file so that you can access your JSON as you normally would, with the recipe properties as columns. You don't need that option for a file with this structure.

Upvotes: 0

jrd1

Reputation: 10716

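Remove lines=True. That option is meant for JSON files in which one object exists on each line, or for cases where you want to read in each object individually. With lines=True, each line is treated as its own record, so although every object in your file has the same properties, they are not collated into shared columns. That is why you end up with a single row and one column per recipe.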
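To illustrate the difference, here is a minimal sketch using small in-memory strings instead of the real file. The sample recipes and the variable names are made up for illustration; only pd.read_json and its lines parameter come from pandas.

```python
import io
import pandas as pd

# Hypothetical miniature of the recipes file: one JSON array on a single line.
array_json = (
    '[{"title": "Soup", "directions": ["Boil."]}, '
    '{"title": "Cake", "directions": ["Bake."]}]'
)

# With lines=True the single line is treated as one record, so the result
# has one row and one column per recipe (the shape seen in the question).
as_lines = pd.read_json(io.StringIO(array_json), lines=True)

# Without lines=True the array is parsed as a table:
# one row per recipe, one column per property.
as_array = pd.read_json(io.StringIO(array_json))

# lines=True is the right choice for JSON Lines input,
# where every line holds exactly one object.
jsonl = '{"title": "Soup"}\n{"title": "Cake"}\n'
from_jsonl = pd.read_json(io.StringIO(jsonl), lines=True)

print(as_lines.shape)
print(as_array.shape)
print(from_jsonl.shape)
```

The strings are wrapped in io.StringIO because recent pandas versions discourage passing literal JSON text directly to read_json.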

With that change, you can access the properties you want:

import pandas as pd
url = 'https://raw.githubusercontent.com/Cyral/Bakeoof/master/full_format_recipes.json'
df = pd.read_json(url)
print(df['directions'])
print(df['title'])
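If you want the two fields side by side rather than printed separately, you can select both columns at once. The following is a rough sketch on a small hand-made DataFrame standing in for the result of pd.read_json(url); the sample rows and the directions_text column name are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for df = pd.read_json(url); the real file has more fields.
df = pd.DataFrame(
    [
        {"title": "Lentil Soup", "directions": ["Boil water.", "Add lentils."], "rating": 4.0},
        {"title": "Plain Cake", "directions": ["Mix.", "Bake."], "rating": 3.5},
    ]
)

# Keep only the two columns asked about in the question.
subset = df[["title", "directions"]]

# Each directions cell is a list of steps; join them into one string per recipe.
subset = subset.assign(directions_text=subset["directions"].str.join(" "))

print(subset)
```

In the real data, any recipe without a directions list would get NaN in the joined column, so you may want to drop or fill those rows first.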

Ref: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_json.html

Upvotes: 1
