
python perl csv

user1078518

Reputation: 21

Are there a set of simple scripts to manipulate csv files available somewhere?

I am looking for a few scripts that would let me manipulate generic CSV files...

typically something like:

add-row FILENAME INSERT_ROW
get-row FILENAME GREP_ROW
replace-row FILENAME GREP_ROW INSERT_ROW
delete-row FILENAME GREP_ROW

where

FILENAME is the name of a CSV file whose first row contains the headers; double quotes ("") delimit strings that might contain ','
GREP_ROW is a string of pairs field1=value1[,fieldN=valueN,...] used to identify a row in the CSV file by its field values
INSERT_ROW is a string of pairs field1=value1[,fieldN=valueN,...] used to replace (or add) the fields of a row.

Preferably in Python using the csv package... ideally leveraging Python to associate each field with a variable and allowing more advanced GREP rules like fieldN > XYZ...

Upvotes: 2

Answers (4)

Raymond Hettinger

Reputation: 226336

The usual way in Python is to use the csv.reader to load the data into a list of tuples, then do your add/replace/get/delete operations on that native python object, and then use csv.writer to write the file back out.
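The load, modify, write-back cycle described above could be sketched roughly as follows. This is only an illustration: the helper names (load_rows, save_rows, row_matches, and so on) are hypothetical, and criteria and updates are assumed to be plain dicts mapping header names to string values.

```python
import csv

def load_rows(path):
    """Read a CSV file into (header, rows); each row is a tuple of strings."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)              # first row holds the field names
        rows = [tuple(r) for r in reader]
    return header, rows

def save_rows(path, header, rows):
    """Write the header and all rows back out, replacing the file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)

def row_matches(header, row, criteria):
    """True if every field named in criteria has exactly the given value."""
    return all(row[header.index(name)] == value
               for name, value in criteria.items())

def get_rows(header, rows, criteria):
    return [r for r in rows if row_matches(header, r, criteria)]

def delete_rows(header, rows, criteria):
    return [r for r in rows if not row_matches(header, r, criteria)]

def replace_rows(header, rows, criteria, updates):
    """Overwrite the fields named in updates on every matching row."""
    result = []
    for row in rows:
        if row_matches(header, row, criteria):
            row = tuple(updates.get(name, old)
                        for name, old in zip(header, row))
        result.append(row)
    return result

def add_row(header, rows, values):
    """Append a row; fields missing from values are left empty."""
    return rows + [tuple(values.get(name, "") for name in header)]
```

Each command-line script would then amount to load_rows, one of the list operations, and save_rows. Because the csv module handles the quoting, fields containing ',' survive the round trip.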

In-place operations on CSV files wouldn't make much sense anyway. Since the records are not typically of fixed length, there is no easy way to insert, delete, or modify a record without moving all the other records at the same time.

That being said, Python's fileinput module has a mode for in-place file updates.
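As a minimal sketch of that mode, the function below drops lines from a file in place. The function name and the substring test are assumptions made for illustration; note that it works on raw text lines rather than parsed CSV records, so it would mishandle quoted fields that span several lines.

```python
import fileinput

def delete_matching_lines(path, needle):
    """Rewrite path in place, dropping every line that contains needle."""
    # With inplace=True, fileinput moves the original aside and redirects
    # standard output into the new file, so print() writes to the file.
    with fileinput.input(files=[path], inplace=True) as lines:
        for line in lines:
            if needle not in line:
                print(line, end="")   # line already ends with a newline
```

Under the hood the whole file is still rewritten, which is consistent with the point above that records cannot really be edited in place.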

Upvotes: 0

daxim

Reputation: 39158

App::CCSV does some of that.

Upvotes: 0

Jonathan Hall

Reputation: 79604

Perl has the DBD::CSV driver, which lets you access a CSV file as if it were an SQL database. I've played with it before, but haven't used it extensively, so I can't give a thorough review of it. If your needs are simple enough, this may work well for you.

Upvotes: 4

Jeff Burdges

Reputation: 4261

Perl has a tradition of in-place editing derived from the unix philosophy.

We could, for example, write a simple add-row-by-num.pl command as follows:

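#!/usr/bin/perl -pi
BEGIN { $ln=shift; $line=shift; }   # consume line number and new row before -p reads the files
print "$line\n" if $ln==$.;         # emit the new row just before line number $ln
close ARGV if eof;                  # reset $. so line numbers restart for each file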

Replace the third line by $_="$line\n" if $ln==$.; to replace lines. Eliminate the $line=shift; and replace the third line by $_ = "" if $ln==$.; to delete lines.

We could write a simple add-row-by-regex.pl command as follows:

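#!/usr/bin/perl -pi
BEGIN { $regex=shift; $line=shift; }   # consume the pattern and new row before -p reads the files
print "$line\n" if /$regex/;           # emit the new row just before each matching line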

Or simply the perl command perl -pi -e 'print "LINE\n" if /REGEX/' FILES. Again, we may replace the print "$line\n" by $_="$line\n" or $_ = "" to replace or delete, respectively.

We do not need the close ARGV if eof; line anymore because we need not reset the $. counter after each file is processed.

Is there some reason the ordinary Unix grep utility does not suffice? Recall that the regular expression (PATTERN){n} matches PATTERN exactly n times, so ^(\s*[^,\s]+\s*,){6}(\s*777\s*,) demands a 777 in the 7th column, provided no field contains a quoted comma.

There is even a perl regular expression to transform your fieldN=value pairs into this regular expression, although I'd use split, map, and join myself.
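The split-and-join approach could look something like the sketch below, written in Python since that is what the question prefers. The function name pairs_to_regex is hypothetical, and the sketch assumes that no field or value contains a comma (quoted or otherwise), that values contain no '=' before the first separator, and that the header list is already known.

```python
import re

def pairs_to_regex(header, grep_row):
    """Turn 'field1=value1,fieldN=valueN' into a line-matching regex.

    Assumes plain comma-separated lines with no quoted commas.
    """
    # Split the GREP_ROW string into {field: value} pairs.
    wanted = dict(pair.split("=", 1) for pair in grep_row.split(","))
    unknown = set(wanted) - set(header)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    # One sub-pattern per column: the required value (with optional
    # surrounding whitespace) or "anything without a comma".
    cells = [
        r"\s*" + re.escape(wanted[name]) + r"\s*" if name in wanted
        else r"[^,]*"
        for name in header
    ]
    return re.compile("^" + ",".join(cells) + "$")
```

For example, pairs_to_regex(["id", "name", "age"], "age=777") yields a pattern that matches the line "1,bob, 777" but not "777,bob,1". For files with quoted fields, parsing with a real CSV reader would be safer than any regular expression.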

Btw, File::Inplace provides inplace editing for file handles.

Upvotes: 4
