Baobab1988
Baobab1988

Reputation: 715

Adding specific rows from one CSV file to another and applying specific conditions

I have a CSV file with a column 'Flag' that has 0 and 1 values. My goal is to move all rows with 0 values to another CSV file. This script will be scheduled to run every hour and move rows with '0' values to another file.

So far I wrote the below code:

with open("path/to/my/input/file.csv", "rt", encoding="utf8") as f:
reader = csv.DictReader(f, delimiter=',')
with open("/path/to/my/output/file.csv", "a+", encoding="utf8") as f_out:
    writer = csv.DictWriter(f_out, fieldnames=reader.fieldnames, delimiter=",")
    writer.writeheader()
    for row in reader:
        if row['flag'] == '0':
            writer.writerow(row)

With @Raghvendra help below by adding 'a+' to my code I'm able to add rows to my output.csv file. However, it adds header row to my output file each time the script runs. Also, how to prevent adding rows with matching ID? Would it be possible to replace rows in my output.csv file where the ID match with ID in input.csv file instead of adding rows with duplicated ID to output.csv?

Would someone be able to help me with this? Thanks in advance!

input file.csv:

id       date          data1     data2    flag
1     2020-03-01      mydata    mydata1    0
2     2020-03-02      mydata     mydata    1
3     2020-03-03      mydata    mydata1    0

Upvotes: 1

Views: 1182

Answers (2)

Armali
Armali

Reputation: 19375

Now my problem is to prevent adding records with duplicated IDs to my output.csv. I would need to overwrite records with matching IDs instead if possible.

In order to match IDs, we cannot avoid to read the output file.

import csv

data = dict()
# first read the output file in (if one exists already)
try:
    with open("output file.csv", encoding="utf8") as f_out:
        for row in csv.DictReader(f_out): data[row['id']] = row
except OSError: pass

# now add the new rows from the input file; rows with existing id are replaced
with open("input file.csv", encoding="utf8") as f:
    reader = csv.DictReader(f)
    for row in reader:
        if row['MyColumn'] == '0': data[row['id']] = row

with open("output file.csv", "w", encoding="utf8") as f_out:
    writer = csv.DictWriter(f_out, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in data: writer.writerow(data[row])

Upvotes: 1

Raghvendra Rao
Raghvendra Rao

Reputation: 156

To append new rows to the file rather than overriding the values try using append (a) permission on the file instead of write (w).

with open("/path/to/my/output/file.csv", "a+", encoding="utf8") as f_out:

No need of writting t as it refers to the text mode which is the default.

Documented here:

Character   Meaning
    'r'     open for reading (default)
    'w'     open for writing, truncating the file first
    'x'     open for exclusive creation, failing if the file already exists
    'a'     open for writing, appending to the end of the file if it exists
    'b'     binary mode
    't'     text mode (default)
    '+'     open a disk file for updating (reading and writing)
    'U'     universal newlines mode (deprecated)

The second part of your question is not so clear. Can you elaborate a little more?

Upvotes: 1

Related Questions