Kevin Johnson

Reputation: 47

csv.writer: append only new data to a CSV file

I have a script that scrapes data from a website and stores it in a spreadsheet:

import csv
import requests
from bs4 import BeautifulSoup

# Raw strings keep the backslashes in Windows paths literal
with open(r"c:\source\list.csv", newline='') as src:
  for row in csv.reader(src):
    for url in row:
      r = requests.get(url)
      soup = BeautifulSoup(r.content, 'lxml')
      table = soup.find('table', attrs={"class": "hpui-standardHrGrid-table"})
      items = []
      for tr in table.find_all('tr', {'releasetype': 'Current_Releases'}):
        # Collect one list of cell texts per table row
        items.append([td.text.strip() for td in tr.find_all('td')])
      with open(r'c:\output_file.csv', 'a', newline='') as out:
        writer = csv.writer(out)
        writer.writerow([url])  # the URL as a single-column row
        writer.writerows(items)

Right now, each time this script runs, about 50 new lines are added to the bottom of the CSV file (totally expected with append mode). What I would like it to do instead is check whether an entry already exists in the CSV file, skip it if it is a duplicate, and update it if its contents have changed.

I feel like this should be possible, but I can't think of a way to do it.

Any thoughts?

Upvotes: 0

Views: 1206

Answers (2)

Antimony

Reputation: 2240

You cannot do that without first reading the data from the CSV file. Also, to "change the mismatches", you will just have to overwrite them.

import csv

# Mode 'w' replaces the file's contents; the with block closes the file
with open(r'c:\output_file.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for item in list_to_write_from:
        writer.writerow(item)

This assumes that list_to_write_from contains the most current form of the data you need.
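One way to build such a list is to read the existing rows first and merge the freshly scraped rows into them. The sketch below is only an illustration under stated assumptions: merge_rows is a hypothetical helper, and it assumes that one column (the first, by default) uniquely identifies an entry, which may not match the layout of the real output file.

```python
import csv
import os


def merge_rows(path, new_rows, key_index=0):
    """Merge new_rows into the CSV at path, keyed on one column.

    Hypothetical helper: a row whose key already exists replaces the
    stored row (a no-op for exact duplicates); unseen keys are appended.
    """
    merged = {}  # dicts keep insertion order, so existing rows stay in place
    if os.path.isfile(path):
        with open(path, newline='') as f:
            for row in csv.reader(f):
                if row:  # skip blank lines
                    merged[row[key_index]] = row
    for row in new_rows:
        merged[row[key_index]] = list(row)
    # Rewrite the whole file with the merged contents
    with open(path, 'w', newline='') as f:
        csv.writer(f).writerows(merged.values())
    return list(merged.values())
```

Calling merge_rows(r'c:\output_file.csv', scraped_rows) after scraping would then leave unchanged entries alone, overwrite changed ones, and add new ones; scraped_rows is likewise an assumed name for the collected data.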

Upvotes: 1

Kevin Johnson

Reputation: 47

I found a workaround for this problem, as the answer provided did not work for me.

I added:

import os

# Raw strings keep the backslashes in the Windows path literal
if os.path.isfile(r"c:\source\output_file.csv"):
    os.remove(r"c:\source\output_file.csv")

I put this at the top of my code: it checks whether the file exists and deletes it, so that the script recreates it later with the most up-to-date information. This is a duct-tape way of doing things, but it works.
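For comparison, opening a file in 'w' mode already truncates it, so the explicit delete is only needed when the script keeps appending inside a loop. A minimal sketch, assuming all rows are collected before writing; write_fresh and its parameters are hypothetical names, not part of the original script:

```python
import csv


def write_fresh(path, rows):
    """Write rows to path, replacing whatever the file held before."""
    # Mode 'w' truncates an existing file or creates a new one,
    # so no prior os.remove call is needed
    with open(path, 'w', newline='') as f:
        csv.writer(f).writerows(rows)
```

The trade-off is that the file must be opened once for the whole run rather than once per URL; otherwise each 'w' open would wipe the rows written for the previous URL.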

Upvotes: 0
