nux

Reputation: 45

Displaying Google reverse image search result URLs in Python

I wrote Python code that searches Google for an image, combined with some Google dork keywords. Here is the code:

def showD(self):

    self.text, ok = QInputDialog.getText(self, 'Write A Keyword', 'Example:"twitter.com"')

    if ok:
        self.google()

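def google(self):
    filePath = self.imagePath
    domain = self.text
    searchUrl = 'http://www.google.com/searchbyimage/upload'
    # Open the image in a context manager so the file handle is closed after the upload.
    with open(filePath, 'rb') as imageFile:
        multipart = {'encoded_image': (filePath, imageFile), 'image_content': '', 'q': f'site:{domain}'}
        response = requests.post(searchUrl, files=multipart, allow_redirects=False)
    # The redirect target in the Location header is the URL of the results page.
    fetchUrl = response.headers['Location']
    webbrowser.open(fetchUrl)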


App = QApplication(sys.argv)
window = Window()
sys.exit(App.exec())

I couldn't figure out how to display the URLs of the search results in my program. I tried this code:

import requests
from bs4 import BeautifulSoup
import re

query = "twitter"
search = query.replace(' ', '+')
results = 15
url = (f"https://www.google.com/search?q={search}&num={results}")

requests_results = requests.get(url)
soup_link = BeautifulSoup(requests_results.content, "html.parser")
links = soup_link.find_all("a")

for link in links:
    link_href = link.get('href')
    # Skip anchors without an href; link.get() returns None for them.
    if link_href and "url?q=" in link_href and "webcache" not in link_href:
        title = link.find_all('h3')

        if len(title) > 0:
            print(link.get('href').split("?q=")[1].split("&sa=U")[0])
            # print(title[0].getText())
            print("------")

But it only works for a normal Google keyword search and fails when I try to adapt it to the results of a Google image search: it doesn't display any results.

Upvotes: 2

Views: 859

Answers (1)

Yevhenii Kosmak

Reputation: 3860

Currently there is no simple way to scrape Google's "Search by image" using plain HTTPS requests. Before responding to this type of request, Google presumably checks whether the user is real using several sophisticated techniques. Even your working code example does not work for long: Google blocks it after 20-100 requests.
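For the plain keyword search that does work, the string splitting on "?q=" and "&sa=U" in the question can be replaced with proper URL parsing. The following is only a minimal sketch: `extract_target` is a hypothetical helper name, and it assumes Google keeps wrapping result links as `/url?q=<target>&...`, which may change at any time.

```python
from urllib.parse import urlparse, parse_qs


def extract_target(href):
    """Return the destination URL from a Google '/url?q=...' href, or None.

    Assumes the redirect format seen in the question's code; this is not a
    documented or stable interface.
    """
    if not href:
        return None
    parsed = urlparse(href)
    # Only Google's redirect links carry the real destination in 'q'.
    if parsed.path != "/url":
        return None
    targets = parse_qs(parsed.query).get("q")
    return targets[0] if targets else None
```

Inside the question's loop this would be called as `extract_target(link.get('href'))`, skipping any link for which it returns `None`. Unlike the manual split, `parse_qs` also percent-decodes the target URL.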

All public Python solutions that actually scrape Google image results use Selenium and imitate real user behaviour, so you can go this way yourself. The interfaces of the Python Selenium bindings are not hard to get used to, except perhaps the setup process.

The best of them, in my opinion, is hardikvasa/google-images-download (7.8K stars on GitHub). Unfortunately, this library has no input interface for an image path or an image in binary format. It only has the similar_images parameter, which expects a URL. Nevertheless, you can try to use it with a http://localhost:1234/... URL (you can easily set one up this way).
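One possible way to expose a local image under such a localhost URL is Python's built-in `http.server` module. This is a rough sketch only: `serve_directory` is a hypothetical helper, the port 1234 simply mirrors the URL above, and whether similar_images (or Google itself) can actually use a localhost URL is not verified here.

```python
import threading
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def serve_directory(directory, port=1234):
    """Serve the files in `directory` over HTTP on localhost.

    The server runs in a daemon thread, so the caller is not blocked.
    Returns the server object; call .shutdown() on it when done.
    """
    # Bind the handler to the chosen directory (Python 3.7+).
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


# Example usage (paths are placeholders):
#   server = serve_directory("/path/to/images")
#   # the file photo.jpg is now reachable at http://localhost:1234/photo.jpg
#   server.shutdown()
```

The server binds to 127.0.0.1 only, so the files are reachable from the same machine and not from the network.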


You can check all these questions and see that all the solutions use Selenium for this task.

Upvotes: 2
