Mohamed Elgendy

Reputation: 21

How can I store the links found for each item searched into a list?

This code runs a simple Google search for each shortcode and prints up to 10 of the links found. How can I store the links found for each shortcode in a list or a dictionary keyed by the searched shortcode?

try:
    from googlesearch import search
except ImportError:
    print("No module named 'google' found")

with open('Unknown.xlsx', "rb") as f:
    df = pd.read_excel(f)  # can also index sheet by name or fetch all sheets
    shortcode_list = df['Short Code'].tolist()

def stopwatch(sec):
    sec = int(sec)
    while sec > 0:
        # Keep the remaining total in `sec`; use `rem` for the display only
        minn, rem = divmod(sec, 60)
        timeformat = '{:02d}:{:02d}'.format(minn, rem)
        print(timeformat, end='\r')
        time.sleep(1)
        sec -= 1

pauses = np.arange(2, 8, 1).tolist()
pause = np.random.choice(pauses)

delays = np.arange(1, 60, 1).tolist()
delay = np.random.choice(delays)

for i in tqdm(range(len(shortcode_list))):
    try:
        shortcode = shortcode_list[i]
        delays = np.arange(1, 60, 1).tolist()
        delay = np.random.choice(delays)
        pauses = np.arange(2, 8, 1).tolist()
        pause = np.random.choice(pauses)
        stopwatch(delay)
        string = "text * to " + '"' + str(shortcode) + '"'
        query = string
        url = ('https://www.google.com?q=' + query)
        res = requests.get(url, headers=headers)
        print("\nThe query will be " + query + " " + str(res))
        for k in search(query, tld="co.in", num=10, stop=10, pause=pause, country='US',
                        user_agent=googlesearch.get_random_user_agent(), verify_ssl=True):
            print(k)
    except HTTPError as exception:
        if exception.code == 429:
            print(exception)
            print("Waiting for 8 minutes and Continue")
            stopwatch(480)
            continue

Upvotes: 0

Views: 95

Answers (1)

emmunaf

Reputation: 426

You can use a dictionary that has the shortcode as key and a list as value.

With this approach, your code would look something like this:

import numpy as np
from tqdm import tqdm
import time
import requests

try:
    from googlesearch import search
except ImportError:
    print("No module named 'google' found")

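shortcode_list = ["abc", "SO"]

def stopwatch(sec):
    """Print an MM:SS countdown lasting `sec` seconds."""
    sec = int(sec)
    while sec > 0:
        # Keep the remaining total in `sec`; use `rem` for the display only,
        # otherwise a call like stopwatch(480) ends after about one minute
        minn, rem = divmod(sec, 60)
        timeformat = '{:02d}:{:02d}'.format(minn, rem)
        print(timeformat, end='\r')
        time.sleep(1)
        sec -= 1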

pauses = np.arange(2, 8, 1).tolist()
pause = np.random.choice(pauses)

delays = np.arange(1, 60, 1).tolist()
delay = np.random.choice(delays)

results = {}  # Create an empty dict 
for i in tqdm(range(len(shortcode_list))):
    try:
        shortcode = shortcode_list[i]
        delays = np.arange(1, 5, 1).tolist()
        delay = np.random.choice(delays)
        pauses = np.arange(2, 8, 1).tolist()
        pause = np.random.choice(pauses)
        stopwatch(delay)
        string = "text * to " + '"' + str(shortcode) + '"'
        query = string
        url = ('https://www.google.com?q=' + query)
        res = requests.get(url)
        print("\nThe query will be " + query + " " + str(res))
        cur_res = []  # Create a list to store the results for that shortcode
        for k in search(query,  num_results=10):
            print(k)
            cur_res.append(k)  # Add a single res to the list
        results[shortcode] = cur_res  # Store this shortcode's links in the dict

    except Exception as exception:
        print(exception)
        print("Waiting for 8 minutes and Continue")
        stopwatch(480)
        continue
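
Stripped of the delays and error handling, the pattern is just a dictionary that is filled inside the loop. Here is a minimal sketch of that idea. It assumes a hypothetical `fake_search` generator in place of `search`, so it runs without any network access. `collect_links` and its `limit` parameter are illustrative names, not part of the googlesearch package:

```python
from itertools import islice


def collect_links(shortcodes, search_fn, limit=10):
    """Return a dict mapping each shortcode to the list of links found for it."""
    results = {}
    for shortcode in shortcodes:
        query = 'text * to "{}"'.format(shortcode)
        # islice caps the number of links even if search_fn yields more
        results[shortcode] = list(islice(search_fn(query), limit))
    return results


def fake_search(query):
    """Hypothetical stand-in for a real search function; makes no requests."""
    for n in range(3):
        yield "https://example.com/{}?q={}".format(n, query.replace(" ", "+"))


links = collect_links(["abc", "SO"], fake_search)
print(links["abc"])
```

In real use you would pass a function that wraps your actual `search` call instead of `fake_search`. Afterwards `links["abc"]` holds the links for that shortcode, and `links.items()` lets you iterate over every shortcode together with its links.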

As a suggestion for your next question: remove or replace code that is not reproducible (i.e. that unknown .xlsx file), and make sure your snippet is ready to run, including all the imports, so that the people who want to help you can debug it.

Upvotes: 1
