Multithreading Scrape Html and Safely Save to One File

Question

I want scrape the title from given url in multiple thread (example in 5 thread) and save them to one text file. how to do it and how to make sure I safely save the output to one file?

this is my code:

import csv
import requests
requests.packages.urllib3.disable_warnings()

urls = []

with open('Input.csv') as csvDataFile:
    csvReader = csv.reader(csvDataFile)
    for row in csvReader:
        urls.append(row[1])

def find_between( s, first, last ):
    try:
        start = s.index( first ) + len( first )
        end = s.index( last, start )
        return s[start:end]
    except ValueError:
        return ""

def get_title( url ):
    try:
        r = requests.get(url)
        html_content = r.text.encode('UTF-8')
        title = find_between(html_content , "", "")
        return title
    except:
        return ""

for url in urls:
    f = open('myfile.txt', 'a')
    f.write(get_title(url) + '
')
    f.close()

Multithreading Scrape Html and Safely Save to One File

Answers (1)

Related Questions