Reputation: 1815
I am using the Google image search API from Python to download 500 images. After downloading some of the images, it returns a bad request error. Below is the code:
from google_images_search import GoogleImagesSearch
import os
from tqdm import tqdm
import time

# Set up Google Images Search
gis = GoogleImagesSearch(API_KEY, API_SECRET, validate_images=False)

def download_images_in_batches(location, output_folder):
    search_params = {
        'q': location,
        'num': 50,  # Number of images to download per batch
        'fileType': 'jpg|png'  # File types to include in the search
    }
    # Create the folder for the location if it doesn't exist
    location_folder = os.path.join(output_folder, location)
    os.makedirs(location_folder, exist_ok=True)
    # Download images in batches
    total_images = 500  # Total number of images to download
    images_per_batch = 50  # Number of images to download per batch
    batches = total_images // images_per_batch
    for batch in tqdm(range(batches)):
        start_index = batch * images_per_batch + 1
        search_params['start'] = start_index
        search_params['num'] = 50
        time.sleep(1)
        # Perform the search and download the images
        gis.search(search_params=search_params)
        for index, image in enumerate(gis.results()):
            if index >= images_per_batch:
                break
            image.download(location_folder)
        gis.next_page()
    print(f"Downloaded {total_images} images for {location} in folder '{location_folder}'")

download_images_in_batches('query_to_search', 'destination_path')
I would like to download 500 images for a given location, and I am doing that in batches. After downloading 200 images, I get the error below:
googleapiclient.errors.HttpError: <HttpError 400 when requesting https://customsearch.googleapis.com/customsearch/v1?cx=822fca83c44f645bb&q=US+Embassy+Baghdad&searchType=image&num=10&start=201&fileType=jpg%7Cpng&safe=off&key=AIzaSyD0pMnbiJmUnFactRxZvChEqY0i2G7gkFs&alt=json returned "Request contains an invalid argument.". Details: "[{'message': 'Request contains an invalid argument.', 'domain': 'global', 'reason': 'badRequest'}]">
My API quota is greater than 500 requests per day. Can anyone tell me what I am doing wrong?
Upvotes: 0
Views: 603
Reputation: 353
@chethan is correct: each programmatic request to the Custom Search API has an upper limit on the number of results it returns, which is 10 images. If more results are requested, the API returns the following: "Request contains an invalid argument.". Details: "[{'message': 'Request contains an invalid argument.', 'domain': 'global', 'reason': 'badRequest'}]"
The overall limit is 100 results per query (which doesn't make sense to me, but as I understand it you can only get up to 100 images on one topic).
This is the default behavior that I found. Hope it helps. Cheers!
Reference: Note: The JSON API will never return more than 100 results, even if more than 100 documents match the query
Reference: Also note that the maximum value for num is 10
reference link
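To make the two limits concrete, here is a minimal sketch (not taken from the library or the docs) of which start/num pairs stay inside them. The helper name page_requests and its arguments are made up for illustration; the only facts it relies on are the ones quoted above, i.e. num is at most 10 and no more than 100 results are returned per query.

```python
def page_requests(wanted, page_size=10, max_results=100):
    """Yield (start, num) pairs that respect the quoted limits.

    Hypothetical helper: page_size models the "num <= 10" rule and
    max_results models the "never more than 100 results" rule.
    """
    if not 1 <= page_size <= 10:
        raise ValueError("page_size must be between 1 and 10")
    # Anything beyond max_results cannot be reached for a single query.
    reachable = min(wanted, max_results)
    start = 1  # the API's start index is 1-based
    while start <= reachable:
        num = min(page_size, reachable - start + 1)
        yield start, num
        start += num


pages = list(page_requests(500))
print(pages)
```

Asking for 500 results for one query therefore yields only ten pages, the last one starting at 91, so a request with start=201 as in the question falls outside that range.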
Upvotes: 0
Reputation: 16
Check how many images the programmable search engine you are using (https://programmablesearchengine.google.com) actually returns. It probably returns fewer images than you expect.
The number of results differs between a Google search in the browser and a programmable search engine.
I would encourage you to query Google CSE directly rather than through a wrapper like google_images_search, because the Google CSE documentation states explicitly that you should query with different search strings and download 10 images per query.
In your case I would use 50 different queries, each returning 10 images, for a total of 500. You can formulate the queries by following the Google CSE API documentation (https://developers.google.com/custom-search/v1/reference/rest/v1/cse/list).
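A rough sketch of what "many queries, 10 images each" could look like when building the requests yourself. The endpoint and the parameter names (key, cx, q, searchType, num) are the ones visible in the error URL from the question; the function name build_query_urls, the placeholder credentials, and the query suffixes are assumptions for illustration only.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the error message from the question.
ENDPOINT = "https://customsearch.googleapis.com/customsearch/v1"


def build_query_urls(queries, api_key, cx, num=10):
    """Build one image-search request URL per query string (hypothetical helper)."""
    urls = []
    for q in queries:
        params = {
            "key": api_key,         # your API key (placeholder below)
            "cx": cx,               # your search engine ID (placeholder below)
            "q": q,
            "searchType": "image",
            "num": num,             # at most 10 per request
        }
        urls.append(f"{ENDPOINT}?{urlencode(params)}")
    return urls


# Example query variations; the suffixes are invented, pick ones that suit your topic.
queries = [f"US Embassy Baghdad {suffix}" for suffix in ("exterior", "entrance", "aerial view")]
urls = build_query_urls(queries, "YOUR_API_KEY", "YOUR_CX")
for url in urls:
    print(url)
```

Each URL can then be fetched with any HTTP client. As far as I know the image URLs are in the link field of each entry under items in the JSON response, but check that against the documentation linked above. Different queries can return the same image, so you may want to de-duplicate before downloading.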
Upvotes: 0