messier101

Reputation: 101

Python: can't delete unwanted commas in a text file

I have a CSV file that contains information about people:

name,age,height

Maria,25,172

George,45,180,

Peter,23,179,

The problem is that some lines have an extra comma at the end and some don't (this happens because the data was fetched from the internet with urlopen by another Python script that processes the raw data).

I tried to write some code to fix this, but I couldn't get it to work. Here is what I've written:

import re


data = open('file.csv').read()

new_data = re.sub('\W$', '', data)
print(new_data)

But this code removes only the last comma in the whole document. I also tried to write a loop that goes through the file and processes each line, but I couldn't get that to work either. Please tell me what I'm doing wrong.

Upvotes: 2

Views: 1331

Answers (2)

wnnmaw

Reputation: 5524

This is simple enough that you don't really need a regex (and it's probably faster not to use one).

Here's what I would do:

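with open("file.csv", 'r') as f:
    # Each line read from the file still ends with "\n", so test for ",\n" rather than ","
    # (a final line that has no trailing newline is left unchanged)
    newLines = [line[:-2] + "\n" if line.endswith(",\n") else line for line in f]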

Then all you need to do is write it back to the file
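
A minimal sketch of the whole round trip, including the write-back step, assuming the file is small enough to read into memory. The helper names `strip_trailing_commas` and `fix_file` are hypothetical, and the cleaning step works on a plain list of lines so it can be tried without touching a file:

```python
def strip_trailing_commas(lines):
    """Return the lines with one trailing comma (if any) removed from each."""
    cleaned = []
    for line in lines:
        body = line.rstrip("\r\n")   # set the line terminator aside first
        ending = line[len(body):]    # remember the original terminator
        if body.endswith(","):
            body = body[:-1]
        cleaned.append(body + ending)
    return cleaned


def fix_file(path):
    """Read the file at `path`, clean each line, and write it back in place."""
    with open(path, "r", newline="") as f:
        lines = f.readlines()
    with open(path, "w", newline="") as f:
        f.writelines(strip_trailing_commas(lines))
```

Calling `fix_file("file.csv")` would overwrite the file in place, so it may be worth trying it on a copy first.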

Upvotes: 0

Valentin Lorentz

Reputation: 9753

The problem is that the whole file is handled as a single string, and by default $ matches only at the end of the string. You'd be better off using re.sub(r'\W\n', '\n', data).

You can also do that without a regex: new_data = data.replace(',\n', '\n'), which is probably faster.
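
The regex can also stay close to the original attempt: with the re.MULTILINE flag, $ matches at the end of every line rather than only at the end of the string. A small sketch, assuming inline sample data in place of the contents of file.csv:

```python
import re

# Sample data standing in for the contents of file.csv (an assumption)
data = "name,age,height\nMaria,25,172\nGeorge,45,180,\nPeter,23,179,\n"

# With re.MULTILINE, $ matches just before each newline and at the end of the string
new_data = re.sub(r",$", "", data, flags=re.MULTILINE)
print(new_data)
```

This also covers a last line that ends with a comma but has no trailing newline, which the ',\n' replacement would miss.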

Upvotes: 4
