In Python, search for strings using a regular expression and replace each with another string

Question

I have a db.sql file that includes lots of URLs, like the following.

....25460A Panini Press Gourmet Sandwich Maker



As you can see, there is http://geni.us/4Lk5\ in the file.

I have another file, product.csv, that contains an ID (like 4Lk5 above) and an Amazon product URL, as follows.

4Lk5    8738    8/16/2016 0:20  https://www.amazon.com/gp/product/B00IWOJRSM/ref=as_li_qf_sp_asin_il_tl?ie=UTF8
Jx9Aj2  8738    8/22/2016 20:16 https://www.amazon.com/gp/product/B007EUSL5U/ref=as_li_qf_sp_asin_il_tl?ie=UTF8
9sl2    8738    8/22/2016 20:18 https://www.amazon.com/gp/product/B00C3GQGVG/ref=as_li_qf_sp_asin_il_tl?ie=UTF8


As you can see, the ID 4Lk5 is paired with an Amazon product URL.

I have already read the CSV file and picked out only the ID and the Amazon product URL with Python.

import csv

def openFile(filename, mode):
    # Collect one dict per CSV row: the geni.us ID and its Amazon URL
    result = []
    with open(filename, mode) as csvfile:
        spamreader = csv.reader(csvfile, delimiter=',', quotechar='"')
        for row in spamreader:
            result.append({
                "genu_id": row[0],
                "amazon_url": row[3]
            })
    return result


I need to add some code that searches db.sql for the URL matching each genu_id and replaces it with the corresponding amazon_url from the code above.

Please help me.

zwer · Accepted Answer

There is no need for regex if you have such a predefined structure. If all links are in the form of http://geni.us/ followed by an ID, you can do it with a simple str.replace() by reading each row of your CSV and replacing the matches in your SQL file. Something like:

import csv

with open("product.csv", newline="") as source, open("db.sql", "r+") as target:  # open the files (text mode, as Python 3's csv expects)
    sql_contents = target.read()  # read the SQL file contents
    reader = csv.reader(source, delimiter="\t")  # build a CSV reader, tab as a delimiter
    for row in reader:  # read the CSV line by line
        # replace any match of http://geni.us/ plus the ID with the fourth column's value
        sql_contents = sql_contents.replace("http://geni.us/{}".format(row[0]), row[3])
    target.seek(0)  # seek back to the start of your SQL file
    target.truncate()  # truncate the rest
    target.write(sql_contents)  # write back the changed content
    # ...
    # Profit? :D

Of course, if your original CSV file is comma-delimited, replace the delimiter in the csv.reader() call - the one you presented here seems tab-delimited.
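
One caveat with plain str.replace(): if one ID happens to be a prefix of another (say 4Lk5 and a hypothetical 4Lk5x), the shorter ID would also be rewritten inside the longer link. If that matters for your data, a single-pass regex substitution with a dictionary lookup is one way around it. What follows is only a minimal sketch. It assumes the IDs are purely alphanumeric and that the CSV rows look like the sample in the question. The helper names load_mapping and replace_links are made up for illustration.

```python
import csv
import io
import re


def load_mapping(csv_file, delimiter="\t"):
    """Build {id: amazon_url} from rows shaped like the sample in the question."""
    mapping = {}
    for row in csv.reader(csv_file, delimiter=delimiter):
        if len(row) >= 4:  # skip blank or short rows instead of raising IndexError
            mapping[row[0]] = row[3]
    return mapping


def replace_links(sql_text, mapping, prefix="http://geni.us/"):
    """Swap each prefix+ID for its mapped URL in one pass over the text."""
    # Assumption: an ID is a run of ASCII letters and digits, so the match
    # stops at the first quote, backslash, or space that follows the link.
    pattern = re.compile(re.escape(prefix) + r"([A-Za-z0-9]+)")

    def swap(match):
        # IDs missing from the CSV are left exactly as they were found.
        return mapping.get(match.group(1), match.group(0))

    return pattern.sub(swap, sql_text)


if __name__ == "__main__":
    # In-memory stand-ins for product.csv and db.sql, using a row from the question.
    sample_csv = io.StringIO(
        "4Lk5\t8738\t8/16/2016 0:20\t"
        "https://www.amazon.com/gp/product/B00IWOJRSM/ref=as_li_qf_sp_asin_il_tl?ie=UTF8\n"
    )
    mapping = load_mapping(sample_csv)
    print(replace_links('<a href=\\"http://geni.us/4Lk5\\">Panini Press</a>', mapping))
```

To apply this to the real files, read db.sql into a string, pass it through replace_links(), and write the result back with seek(0) and truncate() as in the code above.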
