devesh marwah

Reputation: 51

Search for and delete a column using csv.reader in Python

I want to search for a column and delete it from a CSV file using Python. I cannot use DataFrames because I need to work with large files and can't load them into RAM. How can I do it? Example CSV file:

Home,Contact,Adress
abc,123,xyz 

For example, I need to find and delete the Contact column. I thought of using csv.reader but cannot figure out how to do it.

Upvotes: 3

Views: 215

Answers (2)

ThangTD

Reputation: 1684

If your application still prefers to work with pandas, I'd suggest trying pandas chunking: pass chunksize to read_csv and process the file one piece at a time. See the example below:

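import pandas

# read the file lazily, 100,000 rows at a time
iterator = pandas.read_csv('/tmp/abc.csv', chunksize=10**5)
# placeholder: list the columns you are keeping
df_new = pandas.DataFrame(columns=['your_remaining_columns'])

for df in iterator:
    # drop the unwanted column from this chunk
    del df['col_b']
    # note: df_new accumulates every processed chunk in memory
    df_new = pandas.concat([df_new, df])

print(df_new.shape[0])
print(df_new.columns)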

I was able to process a 50GB CSV file with complex data (non-UTF-8 encoding, cells containing commas, deduplication, and filtering out bad rows) with this approach before.
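If even the result without the column is too large for RAM, a variation is to write each chunk straight to disk instead of concatenating. This is only a sketch: the function name drop_column_chunked and its parameters are made up for illustration, and it assumes the column exists in the file.

```python
import pandas as pd


def drop_column_chunked(src, dst, column, chunksize=100_000):
    """Copy the CSV at src to dst without `column`, one chunk at a time.

    Hypothetical helper: only one chunk is held in memory at any moment.
    """
    first = True
    for chunk in pd.read_csv(src, chunksize=chunksize):
        # drop the unwanted column from this chunk only
        chunk = chunk.drop(columns=[column])
        # the first chunk creates the file and writes the header,
        # later chunks are appended without a header
        chunk.to_csv(dst, mode="w" if first else "a", header=first, index=False)
        first = False
```

Usage would look like drop_column_chunked('/tmp/abc.csv', '/tmp/abc_new.csv', 'col_b'), where the output path is again just an example.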

Upvotes: 0

S.B

Reputation: 16526

Check this. It reads the file row by row, so the whole file is never loaded into memory:

import csv

col = 'Contact'

with open('your_csv.csv', newline='') as f:
    with open('new_csv.csv', 'w', newline='') as g:

        # creating csv reader and writer (once, outside the loop)
        reader = csv.reader(f)
        writer = csv.writer(g)

        # getting the 'col' index in the header, we want to delete it in the next lines
        header = next(reader)
        col_index = header.index(col)

        # writing the header without the deleted column
        del header[col_index]
        writer.writerow(header)

        for line in reader:
            del line[col_index]

            # writing to new csv file
            writer.writerow(line)

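newline='' lets the csv module handle line endings itself. Without it, extra blank lines can appear between rows on platforms that use \r\n line endings, and newlines embedded in quoted fields may be misread.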
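If you want the original file itself to end up without the column, rather than keeping a second file, one option is to stream into a temporary file and then swap it in. This is a rough sketch, not the only way to do it: the function name delete_column is made up, and it assumes the first row is the header.

```python
import csv
import os
import tempfile


def delete_column(path, column):
    """Remove `column` from the CSV at `path`, streaming row by row.

    Hypothetical helper: writes to a temporary file in the same directory,
    then replaces the original so the file is never loaded into memory.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".csv")
    try:
        with os.fdopen(fd, "w", newline="") as dst, open(path, newline="") as src:
            reader = csv.reader(src)
            writer = csv.writer(dst)

            header = next(reader)
            idx = header.index(column)  # raises ValueError if the column is missing

            # write the header and every row without the column at idx
            writer.writerow(header[:idx] + header[idx + 1:])
            for row in reader:
                writer.writerow(row[:idx] + row[idx + 1:])

        # swap the finished file in place of the original
        os.replace(tmp_path, path)
    except BaseException:
        # clean up the temporary file if anything went wrong
        os.remove(tmp_path)
        raise
```

Calling delete_column('your_csv.csv', 'Contact') would then rewrite your_csv.csv without the Contact column. Keeping the temporary file in the same directory means the final swap is a rename rather than a copy.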

Upvotes: 1
