king majesty

Reputation: 69

Scraping data from JSON after using requests

I am trying to extract specific data from a JSON response.

After passing an Authorization header and calling requests.get, I got my response. I think it is called a dictionary by Python coders and JSON by JavaScript coders. It contains far more information than I need, and I would like to extract only one or two fields, for example {"bio" : " hello world " }. The JSON contains more than one "bio": for example, I am scraping 100 accounts and I would like to extract every "bio" with a single piece of code.

So I tried this:

from bs4 import BeautifulSoup
import requests

headers = {"Authorization" : "xxxx"}
req = requests.get('website', headers = headers)
data = req.text
soup = BeautifulSoup(data,'html.parser')

titles = soup.find_all('span',{'class':'bio'})
for title in titles :
    print(title.text)

It didn't work. I tried multiple ideas with no success. If possible, please write code that I can understand, since I am trying to learn from my mistakes.

Thanks

Upvotes: 0

Views: 594

Answers (2)

Robert Kearns

Reputation: 1706

The Aphid library I created is perfect for this.

From the command prompt:

py -m pip install Aphid

Then it's just as easy as loading your JSON data and searching it with Aphid.

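import json

import requests
import Aphid

# yoururl is a placeholder for the endpoint you are requesting
resp = requests.get(yoururl)
data = json.loads(resp.text)  # parse the JSON body into Python dicts and lists

results = Aphid.findall(data, 'bio')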

results is now a list of (key, value) tuples, one for every occurrence of the 'bio' key.
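
For readers who want to see the idea without installing anything, this kind of key search can be sketched in plain Python. This is not Aphid's actual implementation; `find_all` and the `sample` data are hypothetical names made up for illustration, and the sketch only assumes the parsed JSON is built from nested dicts and lists.

```python
def find_all(obj, key):
    """Recursively collect (key, value) pairs for every occurrence of `key`."""
    found = []
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key:
                found.append((k, v))
            # Keep descending: the value may itself contain the key.
            found.extend(find_all(v, key))
    elif isinstance(obj, list):
        for item in obj:
            found.extend(find_all(item, key))
    return found


# Hypothetical stand-in for the parsed response (e.g. the result of resp.json()).
sample = {
    "users": [
        {"name": "a", "bio": "hello world"},
        {"name": "b", "profile": {"bio": "second bio"}},
    ]
}

results = find_all(sample, "bio")
bios = [value for _, value in results]  # keep only the values
```

If only the bio strings are needed, the `bios` list at the end drops the repeated key from each tuple.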

Upvotes: 1

ilias iliadis

Reputation: 631

After you get your response, either:

  • you get a plain JSON document (in which case you load it into Python using the json module), or

  • you get an HTML page from which you can extract the JSON (using BeautifulSoup), which you then parse with the json library.
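
The two cases above could be handled roughly as follows. This is a sketch under assumptions: `load_payload` and `ScriptJSONExtractor` are hypothetical names, the embedded JSON is assumed to sit in a <script type="application/json"> tag, and the standard library's html.parser stands in for BeautifulSoup here so the example runs without extra installs.

```python
import json
from html.parser import HTMLParser


class ScriptJSONExtractor(HTMLParser):
    """Collects the text inside <script type="application/json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_json_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/json":
            self._in_json_script = True
            self.chunks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_json_script = False

    def handle_data(self, data):
        # Script text may arrive in several pieces, so accumulate it.
        if self._in_json_script:
            self.chunks[-1] += data


def load_payload(text):
    """Return parsed JSON from a body that is either JSON or HTML wrapping JSON."""
    try:
        return json.loads(text)  # case 1: the body is already JSON
    except json.JSONDecodeError:
        pass
    parser = ScriptJSONExtractor()  # case 2: look for JSON inside the HTML
    parser.feed(text)
    parser.close()
    if not parser.chunks:
        raise ValueError("no JSON found in response body")
    return json.loads(parser.chunks[0])
```

With requests, the text to pass in would be resp.text; when the body is known to be JSON, resp.json() does the first case directly.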

Upvotes: 0
