Reputation: 1545
I am parsing a CSV file that looks like this :
999,"FOO",0.00249,0.00249,0.00249,0.0000,1,1
888,"BAR",0.00249,0.00249,0.00249,0.0000,1,1
777,"FOOBAR",0.00999,0.00999,0.00999,0.0000,1,1
666,"ABC",0.00999,0.00999,0.00999,0.0000,1,1
555,"DEF","-0.00100","-0.00100","-0.00100",0.0000,1,1
444,"EFG","-0.00100","-0.00100","-0.00100",0.0000,1,1
The only column that should be quote-delimited is the second column (e.g. "FOO", "BAR", etc.). The other columns are always to be interpreted as numeric.
As you can see in the above example, the lines starting with 555 and 444 contain a quoted numeric value, "-0.00100".
I am therefore looking to strip quotes from around numeric values.
I have done a little research here on Stack Overflow and identified the following: https://stackoverflow.com/a/18624948/4474629
I have tried to adapt it thus :
column -t -s, | sed s/"(-?\d[\d.]*)"/\1/g | more
But the output still prints:
555 "DEF" "-0.00100" "-0.00100" "-0.00100" 0.0000 1 1
444 "EFG" "-0.00100" "-0.00100" "-0.00100" 0.0000 1 1
The expected result is:
555 "DEF" -0.00100 -0.00100 -0.00100 0.0000 1 1
444 "EFG" -0.00100 -0.00100 -0.00100 0.0000 1 1
Upvotes: 1
Views: 62
Reputation: 58371
This might work for you (GNU sed):
sed 's/"//3g' file
Only the second column should be quoted, so each line should contain exactly two double quotes. The 3g flag leaves the first two alone and removes the third and every later one.
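To see what the 3g flag does, here is a minimal sketch run against two of the sample lines from the question. It assumes GNU sed, since combining a number with g in the s command is a GNU extension, and the function name strip_extra_quotes is only an illustration.

```shell
# Hypothetical helper: delete the third and every later double quote on each line.
# Assumes GNU sed, and that the first two quotes on a line always belong to column 2.
strip_extra_quotes() {
  sed 's/"//3g'
}

# Feed one already-clean line and one line with quoted numbers through the helper.
printf '%s\n' \
  '999,"FOO",0.00249,0.00249,0.00249,0.0000,1,1' \
  '555,"DEF","-0.00100","-0.00100","-0.00100",0.0000,1,1' |
  strip_extra_quotes
```

The first line should pass through unchanged, while the second should keep the quotes around "DEF" only. This approach relies on the second column never containing an embedded double quote.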
Upvotes: 0
Reputation: 85767
If you look at the sed manual, you will see that none of (, ?, \d, or ) is supported with that meaning in sed's default (basic) regular expression syntax. The question you linked to used perl, where these constructs work.
You can adapt your script thus:
sed 's/"\(-\?[0-9][0-9.]*\)"/\1/g'
(using single quotes to prevent interpretation of special characters by the shell).
Even \? is a GNU extension; if your sed doesn't support it, you may have to use -\{0,1\} instead.
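As a rough sketch of the portable variant, the following wraps the \{0,1\} form in a shell function and runs it on two of the sample lines from the question. The function name unquote_numbers is hypothetical, and the pattern assumes a numeric field is an optional minus sign followed by digits and dots.

```shell
# Hypothetical helper: strip the quotes from fields that look numeric, e.g. "-0.00100".
# Uses only basic regular expression syntax, so it should not need GNU extensions.
unquote_numbers() {
  sed 's/"\(-\{0,1\}[0-9][0-9.]*\)"/\1/g'
}

# "FOOBAR" and "EFG" do not start with a digit or minus sign, so they stay quoted.
printf '%s\n' \
  '777,"FOOBAR",0.00999,0.00999,0.00999,0.0000,1,1' \
  '444,"EFG","-0.00100","-0.00100","-0.00100",0.0000,1,1' |
  unquote_numbers
```

If your sed accepts the -E option for extended regular expressions, the same substitution can probably be written closer to the original attempt as sed -E 's/"(-?[0-9][0-9.]*)"/\1/g'. Either form can be combined with column -t -s, for the aligned output.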
Upvotes: 2