Reputation: 43
I am trying to understand how to return several dictionaries from a function. If I print data_dict inside the function, I get five dictionaries. If I return data_dict from the function, store it in a variable and print that variable, only the last dictionary is shown. How can I return all five dictionaries?
import requests
from bs4 import BeautifulSoup
import re
import json
source = requests.get('https://www.tripadvisor.ch/Hotel_Review-g188113-d228146-Reviews-Coronado_Hotel-Zurich.html#REVIEWS').text
soup = BeautifulSoup(source, 'lxml')
pattern = re.compile(r'window.__WEB_CONTEXT__={pageManifest:(\{.*\})};')
script = soup.find("script", text=pattern)
dictData = pattern.search(script.text).group(1)
jsonData = json.loads(dictData)
def get_reviews():
    data_dict = {}
    for locations in jsonData['urqlCache']['669061039']['data']['locations']:
        for data in locations['reviewListPage']['reviews']:
            data_dict['reviewid'] = data['id']
            data_dict['authoridtripadvisor'] = data['userId']
            userProfile = data['userProfile']
            data_dict['author'] = userProfile['displayName']
            print(data_dict)
    #return data_dict
reviews = get_reviews()
print(reviews)
Thank you for all suggestions!
Upvotes: 0
Views: 648
Reputation: 142641
Your problem is that data_dict can hold only one dictionary at a time.
You have to create a list for all the dictionaries
    all_dictionaries = []
then append() every dictionary to this list
    all_dictionaries.append(data_dict)
and return the list
    return all_dictionaries
Also, inside the for-loop you have to create a new dictionary for each review. You can't reuse one data_dict and overwrite its elements, because every entry in the list would then refer to the same dictionary and show only the last values.
def get_reviews():
    all_dictionaries = []
    for locations in jsonData['urqlCache']['669061039']['data']['locations']:
        for data in locations['reviewListPage']['reviews']:
            data_dict = {}
            data_dict['reviewid'] = data['id']
            data_dict['authoridtripadvisor'] = data['userId']
            userProfile = data['userProfile']
            data_dict['author'] = userProfile['displayName']
            print(data_dict)
            all_dictionaries.append(data_dict)
    return all_dictionaries
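To see why the new dictionary has to be created inside the loop, here is a small self-contained sketch. The sample_reviews list and the two function names are made up for illustration; they only mimic the shape of the reviews in the question, so treat this as a demonstration rather than the real scraper.

```python
# Hypothetical sample data that mimics the shape of the reviews in the question.
sample_reviews = [
    {'id': 1, 'userId': 'u1', 'userProfile': {'displayName': 'Anna'}},
    {'id': 2, 'userId': 'u2', 'userProfile': {'displayName': 'Ben'}},
]

def collect_shared(reviews):
    # One dictionary created outside the loop: every append() stores
    # a reference to the same object, which keeps getting overwritten.
    result = []
    data_dict = {}
    for data in reviews:
        data_dict['reviewid'] = data['id']
        data_dict['author'] = data['userProfile']['displayName']
        result.append(data_dict)
    return result

def collect_fresh(reviews):
    # A new dictionary per iteration: every append() stores a distinct object.
    result = []
    for data in reviews:
        data_dict = {}
        data_dict['reviewid'] = data['id']
        data_dict['author'] = data['userProfile']['displayName']
        result.append(data_dict)
    return result

shared = collect_shared(sample_reviews)
fresh = collect_fresh(sample_reviews)

print(shared)  # both entries show the last review (id 2, Ben)
print(fresh)   # one entry per review (Anna, then Ben)
```

In the first version both list entries are the very same dictionary, so you only ever see the values written last.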
Upvotes: 3
Reputation: 3494
EDIT: See @Furas's answer, which says pretty much the same thing (I didn't see it before hitting submit).
If you know the number of dictionaries you want to return ahead of time, you can return them as a tuple:
def get_reviews():
    # ...
    return dict1, dict2, dict3
and then unpack the result like this:
d1, d2, d3 = get_reviews()
But if you want to return an arbitrary number of results, you should return a list containing all of your dictionaries:
import requests
from bs4 import BeautifulSoup
import re
import json
source = requests.get('https://www.tripadvisor.ch/Hotel_Review-g188113-d228146-Reviews-Coronado_Hotel-Zurich.html#REVIEWS').text
soup = BeautifulSoup(source, 'lxml')
pattern = re.compile(r'window.__WEB_CONTEXT__={pageManifest:(\{.*\})};')
script = soup.find("script", text=pattern)
dictData = pattern.search(script.text).group(1)
jsonData = json.loads(dictData)
def get_reviews():
    # Use a separate name for the result list so the loop variable `data` does not overwrite it
    results = []
    for locations in jsonData['urqlCache']['669061039']['data']['locations']:
        for data in locations['reviewListPage']['reviews']:
            # Build a new dictionary for every review and add it to the list
            results.append({
                'reviewid': data['id'],
                'authoridtripadvisor': data['userId'],
                'author': data['userProfile']['displayName']
            })
    return results
reviews = get_reviews()
print(reviews)
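If you want to try the list approach without hitting the website, a sketch like the one below runs on a hand-made dictionary. Note the assumptions: fake_json and all of its values are invented, only the key layout is copied from the question, and passing the parsed data and the cache key as parameters (json_data, cache_key) is my own choice here, not something the original code does.

```python
# Invented stand-in for the parsed page data; only the key layout follows the question.
fake_json = {
    'urqlCache': {
        '669061039': {
            'data': {
                'locations': [
                    {'reviewListPage': {'reviews': [
                        {'id': 101, 'userId': 'A1', 'userProfile': {'displayName': 'Anna'}},
                        {'id': 102, 'userId': 'B2', 'userProfile': {'displayName': 'Ben'}},
                    ]}},
                ]
            }
        }
    }
}

def get_reviews(json_data, cache_key='669061039'):
    # Hypothetical signature: the data is passed in instead of read from a global.
    locations = json_data['urqlCache'][cache_key]['data']['locations']
    # The comprehension builds one new dictionary per review and returns them all in a list.
    return [
        {
            'reviewid': review['id'],
            'authoridtripadvisor': review['userId'],
            'author': review['userProfile']['displayName'],
        }
        for location in locations
        for review in location['reviewListPage']['reviews']
    ]

reviews = get_reviews(fake_json)
for review in reviews:
    print(review)
```

Because the function returns a list, the caller can loop over it, take its len(), or index into it, no matter how many reviews the page contains.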
Upvotes: 2