Reputation: 23
There is an API that returns at most one hundred results per page. I am trying to write a while loop that goes through every page and collects the results from all of them, but it does not work. I would be grateful if you could help me figure it out.
params = dict(
    order_by='salary_desc',
    text=keyword,
    area=area,
    period=30,  # days
    per_page=100,
    page=0,
    no_magic='false',  # disable magic
    search_field='name'  # available: name, description, company_name
)
response = requests.get(
    BASE_URL + '/vacancies',
    headers={'User-Agent': generate_user_agent()},
    params=params,
)
response
items = response.json()['items']
vacancies = []
for item in items:
    vacancies.append(dict(
        id=item['id'],
        name=item['name'],
        salary_from=item['salary']['from'] if item['salary'] else None,
        salary_to=item['salary']['to'] if item['salary'] else None,
        currency=item['salary']['currency'] if item['salary'] else None,
        created=item['published_at'],
        company=item['employer']['name'],
        area=item['area']['name'],
        url=item['alternate_url']
    ))
Then I try to loop over the results: while the list contains anything, I add 1 to the page parameter, using it as a counter:
while vacancies == True:
    params['page'] += 1
As a result, params['page'] stays at zero (pages in the API start at zero).
When calling params after starting the loop, the result is:
{'area': 1,
'no_magic': 'false',
'order_by': 'salary_desc',
'page': 0,
'per_page': 100,
'period': 30,
'search_field': 'name',
'text': '"python"'}
Perhaps I am doing the loop incorrectly, starting from the logic that while there is a result in the dictionary, the loop must be executed.
Upvotes: 1
Views: 834
Reputation: 2665
while vacancies == True:
    params['page'] += 1
The condition vacancies == True will never evaluate to True, regardless of the list's contents. A non-empty Python list (or dict) is truthy, but it is not equal to the literal True. You need to loosen the condition.
if vacancies:  # truthy if its len > 0, falsy otherwise
    # Do something
Or you can explicitly check that it has content
if len(vacancies) > 0:
    # Do something
This solves the problem of how to evaluate based on an object but doesn't solve the overall logic problem.
for _ in vacancies:
    params["page"] += 1
    # Does something for every item in vacancies
What you do each loop will depend on the problem and will require another question!
Fixed version below:
params = dict(
    order_by='salary_desc',
    text=keyword,
    area=area,
    period=30,  # days
    per_page=100,
    page=0,
    no_magic='false',  # disable magic
    search_field='name'  # available: name, description, company_name
)
pages = []
while True:
    response = requests.get(BASE_URL + '/vacancies', headers={'User-Agent': generate_user_agent()}, params=params)
    items = response.json()['items']
    if not items:
        break
    pages.append(items)  # Do it for each page
    params["page"] += 1  # advance only after storing the current page, so page 0 is not skipped
Make vacancies for each page
results = []
for page in pages:
    vacancies = []
    for item in page:
        vacancies.append(dict(
            id=item['id'],
            name=item['name'],
            salary_from=item['salary']['from'] if item['salary'] else None,
            salary_to=item['salary']['to'] if item['salary'] else None,
            currency=item['salary']['currency'] if item['salary'] else None,
            created=item['published_at'],
            company=item['employer']['name'],
            area=item['area']['name'],
            url=item['alternate_url']
        ))
    results.extend(vacancies)  # extend keeps results a flat list across pages
results will then be the final list of all items.
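The paging loop above can be tried without touching the network. The following is only a minimal sketch: fetch_all_items, fake_fetch_page, FAKE_DATA and the max_pages cap are hypothetical names introduced for illustration, and the sketch assumes that a page past the end comes back as an empty list.

```python
from typing import Callable, Dict, List


def fetch_all_items(fetch_page: Callable[[int], List[Dict]],
                    max_pages: int = 1000) -> List[Dict]:
    """Collect items from consecutive pages, starting at page 0,
    and stop at the first empty page."""
    all_items: List[Dict] = []
    for page in range(max_pages):  # hard cap guards against an endless loop
        items = fetch_page(page)
        if not items:  # an empty list is falsy: nothing left to fetch
            break
        all_items.extend(items)  # keep one flat list across pages
    return all_items


# Hypothetical stand-in for the real request: 250 fake records, 100 per page.
FAKE_DATA = [{"id": i} for i in range(250)]


def fake_fetch_page(page: int, per_page: int = 100) -> List[Dict]:
    start = page * per_page
    return FAKE_DATA[start:start + per_page]


all_items = fetch_all_items(fake_fetch_page)
print(len(all_items))  # 250
```

With the real API, fetch_page would wrap the requests.get call and return response.json()['items'] for the given page number. Whether the server answers an out-of-range page with an empty list or with an error is an assumption worth checking against the API documentation.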
Upvotes: 2
Reputation: 171
vacancies == True is never True, because vacancies is a list.
If you want to test the boolean value of vacancies, you could use bool(vacancies).
But with Python, you can simply write
while vacancies:
    # some code logic
This way, Python automatically converts your list to bool.
If your list has something inside (len(your_list) > 0), bool(your_list) evaluates to True; otherwise it is False.
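A short sketch of the difference between comparing with True and relying on truthiness. The variable names (non_empty, queue, processed and so on) are made up for this illustration.

```python
non_empty = [{"id": 1}]
empty = []

equals_true = (non_empty == True)   # a list is never equal to True
truthy_non_empty = bool(non_empty)  # non-empty list -> True
truthy_empty = bool(empty)          # empty list -> False

print(equals_true, truthy_non_empty, truthy_empty)  # False True False

# This mirrors the loop from the question: the body never runs,
# so the counter stays at 0.
page = 0
while non_empty == True:
    page += 1
print(page)  # 0

# A truthiness-based loop runs until the list is empty.
queue = [1, 2, 3]
processed = 0
while queue:
    queue.pop()
    processed += 1
print(processed)  # 3
```

Note that a while loop on a truthy list only ends if the body eventually empties the list or breaks out of the loop.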
Also, instead of using dict(), you could write your dict this way:
params = {
    'order_by': 'salary_desc',
    'text': keyword,
    'area': area,
    'period': 30,  # days
    'per_page': 100,
    'page': 0,
    'no_magic': 'false',  # disable magic
    'search_field': 'name'  # available: name, description, company_name
}
which is more pythonic.
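Both spellings build the same dictionary, so the choice is mostly a matter of style. A small sketch, using the keyword and area values shown in the question's output and a made-up User-Agent string:

```python
keyword = '"python"'  # value taken from the params output in the question
area = 1

from_call = dict(text=keyword, area=area, page=0)
from_literal = {'text': keyword, 'area': area, 'page': 0}

same = (from_call == from_literal)
print(same)  # True

# The literal form also accepts keys that are not valid identifiers,
# which keyword arguments to dict() cannot express directly.
headers = {'User-Agent': 'example-agent'}  # 'example-agent' is a placeholder
print(headers['User-Agent'])  # example-agent
```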
Upvotes: 0