Reputation: 509
My sed is pretty shaky, so I'm not sure how to take a row like this:
1,2,"12,345",x,y,"a,b"
and turn it into
1,2,12345,x,y,"a,b"
So the number "12,345" becomes 12345, but "a,b" remains untouched.
I need to preserve the digits on either side of the comma when the value is numeric. I have an idea of what the regex would look like to match only digits, but I'm not sure how to remove just the comma, as opposed to removing the whole column.
Upvotes: 4
Views: 3106
Reputation: 7948
Use this pattern: (\d),(\d)(?!(([^"]*"){2})*[^"]*$)
and replace with $1$2
Demo
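To see what the pattern does, here is a minimal sketch using Python's `re` module (an assumption on my part; the answer does not name an engine), where the replacement `$1$2` is written `\1\2`. The helper name `strip_quoted_digit_commas` is made up for illustration. Note that the pattern only removes the comma; the quotes around the number stay in place.

```python
import re

# A comma between two digits that is NOT followed by an even number of
# double quotes up to the end of the line, i.e. a comma inside a quoted field.
PATTERN = re.compile(r'(\d),(\d)(?!(([^"]*"){2})*[^"]*$)')

def strip_quoted_digit_commas(line: str) -> str:
    """Hypothetical helper: apply the pattern above to a single CSV line."""
    return PATTERN.sub(r'\1\2', line)

print(strip_quoted_digit_commas('1,2,"12,345",x,y,"a,b"'))
# prints: 1,2,"12345",x,y,"a,b"
```

The unquoted `1,2` at the start is left alone because an even number of quotes follows it, so the negative lookahead rejects the match.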
Upvotes: 0
Reputation: 2621
In one regex substitution you could do something as nasty as this:
/\G(?|(")(\d+)(?:,(\d+))*(")|()([^,]+)()())(,|$)/g
replace with
\1\2\3\4\5
This should work fine with Perl.
demo: http://regex101.com/r/kQ5fU1
Upvotes: 1
Reputation: 77085
Parsing CSV should be done with a proper CSV parser. I would recommend Perl as well.
perl -MText::ParseWords -ne '
@line = parse_line(",", 1, $_);
print join "," , map { s/,//g if $_ =~ /^[0-9,"]+$/; $_ } @line
' text.csv
$ cat text.csv
1,2,"12,345",x,y,"a,b"
"a,c","12,345",x,y,"a,b"
$ perl -MText::ParseWords -ne '
@line = parse_line(",", 1, $_);
print join "," , map { s/,//g if $_ =~ /^[0-9,"]+$/; $_ } @line
' text.csv
1,2,"12345",x,y,"a,b"
"a,c","12345",x,y,"a,b"
To make in-place changes you can use the -i option, or redirect the output to another file.
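For comparison, the same "parse first, then clean each field" idea can be sketched with Python's standard `csv` module. This is only a rough equivalent, not a translation of the Perl one-liner: the helper name `strip_numeric_commas` is hypothetical, and unlike the Perl version the writer drops quotes that are no longer needed.

```python
import csv
import io
import re

# A cell made up only of digits and commas, e.g. "12,345".
NUMERIC = re.compile(r'[0-9,]+')

def strip_numeric_commas(text: str) -> str:
    """Hypothetical helper: parse CSV text and drop commas from numeric cells."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator='\n')
    for row in csv.reader(io.StringIO(text)):
        writer.writerow(
            [cell.replace(',', '') if NUMERIC.fullmatch(cell) else cell
             for cell in row]
        )
    return out.getvalue()

sample = '1,2,"12,345",x,y,"a,b"\n"a,c","12,345",x,y,"a,b"\n'
print(strip_numeric_commas(sample), end='')
# prints:
# 1,2,12345,x,y,"a,b"
# "a,c",12345,x,y,"a,b"
```

Fields that still contain a comma, such as `a,b`, are re-quoted by the writer automatically.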
Upvotes: 2
Reputation: 444
You can use:
echo '1,2,"12,345",x,y,"a,b"' | sed 's/"\([0-9][0-9]*\),\([0-9][0-9]*\)"/\1\2/g'
EDIT: Actually, my solution only works if there is one comma enclosed between double quotes.
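One way around the single-comma limitation is to match the whole quoted number and strip its commas in a callback. Below is a minimal sketch in Python rather than sed; the name `unquote_numbers` is made up for illustration, and it assumes fields contain no escaped quotes (`""`).

```python
import re

# An opening quote, a digit, any run of digits and commas, a closing quote.
# Requiring a digit right after the quote avoids matching the `","` that
# separates two adjacent quoted fields.
QUOTED_NUMBER = re.compile(r'"(\d[\d,]*)"')

def unquote_numbers(line: str) -> str:
    """Hypothetical helper: drop the quotes and every comma from quoted numbers."""
    return QUOTED_NUMBER.sub(lambda m: m.group(1).replace(',', ''), line)

print(unquote_numbers('1,2,"12,345",x,y,"a,b"'))
# prints: 1,2,12345,x,y,"a,b"
print(unquote_numbers('"1,234,567","a,b"'))
# prints: 1234567,"a,b"
```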
Upvotes: 0
Reputation: 241768
Perl solution, using Text::CSV:
#!/usr/bin/perl
use warnings;
use strict;
use Text::CSV;

my @rows;
my $csv = 'Text::CSV'->new({ binary => 1 }) or die 'Text::CSV'->error_diag;
open my $IN, '<', 'file.csv' or die $!;
while (my $row = $csv->getline($IN)) {
    for my $cell (@$row) {
        # Strip every comma from cells made up only of digits and commas.
        $cell =~ s/,//g if $cell =~ /^[0-9,]+$/;
    }
    push @rows, $row;
}
$csv->eof or $csv->error_diag;
close $IN;

# Without an end-of-line string, print() would emit all rows on one line.
$csv->eol("\n");
open my $OUT, '>', 'new.csv' or die $!;
$csv->print($OUT, $_) for @rows;
close $OUT or die $!;
Upvotes: 1