kal

Reputation: 29371

Replacing a line in a csv file?

I have a set of 10 CSV files, which normally have entries of this kind:

a,b,c,d
d,e,f,g

Now, due to some error, entries in these files have become of this kind:

a,b,c,d
d,e,f,g
,,,
h,i,j,k

Now I want to remove the line with only commas in all the files. These files are on a Linux filesystem.

Is there any command you would recommend that can replace the erroneous lines in all the files?

Upvotes: 1

Views: 1360

Answers (7)

user56950

Reputation: 21

Yes, awk and grep are both very good options if you are working on a Linux platform. On other platforms you can use Perl regular expressions, together with join and split.

Upvotes: 0

bdowling

Reputation: 425

Most simply:

$ grep -v '^,,,$' oldfile > newfile
$ mv newfile oldfile
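
Since the question mentions several files, here is a minimal sketch of running the same filter over every CSV in a directory. The scratch directory and the file names one.csv and two.csv are made up for illustration; on real data you would run only the loop, in the directory holding your files, after taking a backup.

```shell
# Work in a scratch directory so the sketch is safe to run as-is.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Two hypothetical sample files, each containing one comma-only line.
printf 'a,b,c,d\nd,e,f,g\n,,,\nh,i,j,k\n' > one.csv
printf 'a,b,c,d\n,,,\n' > two.csv

for f in *.csv; do
    # grep -v keeps every line that does NOT match;
    # ^,,,$ matches a line made of exactly three commas.
    grep -v '^,,,$' "$f" > "$f.tmp"
    # Replace the original with the filtered copy.
    mv "$f.tmp" "$f"
done
```

The temporary file is needed because redirecting grep's output straight back onto its input file would truncate the file before grep reads it.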

Upvotes: 1

Jonathan Leffler

Reputation: 753845

It depends on what you mean by replace. If you mean 'remove', then a trivial variant on @wnoise's solution is:

grep -v '^,,,$' old-file.csv > new-file.csv

Note that this deletes just those lines that consist of exactly three commas. If you want to delete malformed lines with any number of commas (including zero) and no other characters on the line, then:

grep -v '^,*$' ...
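
To see the difference between the two patterns, here is a small sketch on made-up input that contains a three-comma line, a two-comma line and an empty line. The variable names are only for illustration.

```shell
# Hypothetical sample: one good row, ",,,", ",,", an empty line, one good row.
sample=$(printf 'a,b,c,d\n,,,\n,,\n\nh,i,j,k\n')

# Removes only the line consisting of exactly three commas.
exact=$(printf '%s\n' "$sample" | grep -v '^,,,$')

# Removes every line made up solely of commas, including the empty line.
any=$(printf '%s\n' "$sample" | grep -v '^,*$')

printf 'exact:\n%s\nany:\n%s\n' "$exact" "$any"
```

With the first pattern the two-comma line and the empty line survive; with the second only the two data rows remain.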

There are endless other variations on the regex that would deal with other scenarios. Dealing with full CSV data with commas inside quotes starts to need something other than a regex machine. It can be done, within broad limits, especially in more complex regex systems such as PCRE or Perl. But it requires more work.

Check out Mastering Regular Expressions.

Upvotes: 5

David Z

Reputation: 131600

Do you want to replace them with something, or delete them entirely? Either way, it can be done with sed. To delete:

sed -i -e '/^,\+$/ D' yourfile1.csv yourfile2.csv ...

To replace: well, see wnoise's answer, or if you don't want to create new files with the output,

sed -i -e '/^,\+$/ s//replacement/' yourfile1.csv yourfile2.csv ...

or

sed -i -e '/^,\+$/ c\
replacement' yourfile1.csv yourfile2.csv ...

(that should be entered exactly as is, including the line break). Of course, you can also do this with awk or perl or, if you're only deleting lines, even grep:

egrep -v '^,+$' < oldfile.csv > newfile.csv

I tested these to make sure they work, but I'd advise you to do the same before using them (just in case). You can omit the -i option from sed, in which case it'll print out the results (rather than writing them back to the file), or omit the output redirection >newfile.csv from grep.

EDIT: It was pointed out in a comment that some features of these sed commands only work on GNU sed. As far as I can tell, these are the -i option (which can be replaced with shell redirection, sed ... <infile >outfile ) and the \+ modifier (which can be replaced with \{1,\} ).
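
Putting those two substitutions together, a portable version of the delete command might look like the sketch below. The scratch directory and file names are assumptions for the demo, and the lowercase d command is used here as the usual way to delete a single matching line.

```shell
# Scratch directory and sample file, made up for this demo.
tmpdir=$(mktemp -d)
printf 'a,b,c,d\nd,e,f,g\n,,,\nh,i,j,k\n' > "$tmpdir/old.csv"

# \{1,\} is the POSIX basic-regex way to say "one or more";
# d deletes each line that consists only of commas.
sed '/^,\{1,\}$/d' < "$tmpdir/old.csv" > "$tmpdir/new.csv"

# Without -i, the result is moved over the original by hand.
mv "$tmpdir/new.csv" "$tmpdir/old.csv"
```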

Upvotes: 1

MatthieuP

Reputation: 1126

What about keeping only the lines that match the desired format, instead of handling one exception?

If the provided input is what you really want to match:

grep -E '[a-z],[a-z],[a-z],[a-z]' < oldfile.csv > newfile.csv

If the input is different, provide it; the regular expression should not be too hard to write.
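
One possible refinement, assuming every field really is a single lowercase letter as in the sample: anchoring the pattern with ^ and $ makes grep accept only lines that match the format in full, so rows with too few fields or trailing junk are dropped as well. The input below is made up for illustration.

```shell
# Hypothetical input: two good rows, a comma-only row, and a short row.
kept=$(printf 'a,b,c,d\n,,,\nh,i,j,k\nx,y,z\n' |
    grep -E '^[a-z],[a-z],[a-z],[a-z]$')

# Only the two four-field rows are printed.
printf '%s\n' "$kept"
```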

Upvotes: 1

Keltia

Reputation: 14743

Replace or remove? Your post is not clear. For replacement, see wnoise's answer. For removal, you could use:

awk '$0 !~ /,,,/ {print}' <old-file.csv > new-file.csv
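
Note that /,,,/ matches any line that contains three commas in a row, so a row such as a,,,b would be removed as well. If that matters for your data, a sketch of a stricter variant is to split on commas and print a line only when at least one field is non-empty. The sample input is made up.

```shell
# Hypothetical input: a,,,b is legitimate data with empty middle fields.
filtered=$(printf 'a,b,c,d\n,,,\na,,,b\nh,i,j,k\n' |
    awk -F, '{ for (i = 1; i <= NF; i++) if ($i != "") { print; next } }')

# The comma-only line is gone; a,,,b is kept.
printf '%s\n' "$filtered"
```

This also drops completely empty lines, since they have no fields at all.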

Upvotes: 1

wnoise

Reputation: 9922

sed 's/,,,/replacement/' < old-file.csv > new-file.csv

optionally followed by mv new-file.csv old-file.csv

Upvotes: 2
