RSFalcon7

Reputation: 2311

How to remove quotation marks and spaces around numbers in a CSV file using sed?

I have some numbers in a CSV file, and I'm trying to remove the quotation marks and spaces around them.

Input: 1," 23","45","67 ",89

Expected output: 1,23,45,67,89

I'm trying to remove them with:

sed -r -e 's#\"[ ]*\([0-9]+\)[ ]*\"#\1#g' file.csv

But I'm getting the error "sed: -e expression #1, char 38: invalid reference \1 on `s' command's RHS". If I remove the -r option, I don't get the error, but it doesn't work either.

Upvotes: 0

Views: 61

Answers (2)

mklement0

Reputation: 437111

Tom Fenech provided the crucial pointer in a comment:

The only problem with the OP's command is a minor syntax problem:

Since sed is used with -r in order to activate extended regular expressions, ( and ) - for defining capture groups - must NOT be \-escaped.
(By contrast, when sed is used without -r, basic regular expressions must be used, where such escaping is needed.)

The correct form is therefore (\ before ( and ) removed):

sed -r 's#\"[ ]*([0-9]+)[ ]*\"#\1#g' file.csv

If you want the command to work on OS X as well, use -E instead of -r (GNU sed accepts -E too).

Alternatively, for maximum portability (POSIX compliance) you could just use \{1,\} instead of + and do away with the -r switch entirely:

sed 's#\"[ ]*\([0-9]\{1,\}\)[ ]*\"#\1#g' file.csv
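As a quick check, the two styles can be run side by side against the sample line from the question. This is only a sketch: the variable names are made up for illustration, and -E is used in place of -r on the assumption that the local sed accepts it (GNU and BSD sed both do).

```shell
# Sample line taken from the question.
line='1," 23","45","67 ",89'

# Extended regex (-E): unescaped ( ) form the capture group, + means "one or more".
ere_out=$(printf '%s\n' "$line" | sed -E 's#" *([0-9]+) *"#\1#g')

# Basic regex (no flag): the group is written \( \), and + becomes \{1,\}.
bre_out=$(printf '%s\n' "$line" | sed 's#" *\([0-9]\{1,\}\) *"#\1#g')

# Both lines printed here should be: 1,23,45,67,89
printf '%s\n' "$ere_out" "$bre_out"
```

Unquoted fields such as 1 and 89 are left untouched because the pattern requires a quote on both sides of the digits.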

Upvotes: 2

Avinash Raj

Reputation: 174696

You could try the Perl command below:

$ echo '1," 23","45","67 ",89, "foo" , "bar" ' | perl -pe 's/[" ]+(\d+)[ "]+/\1/g'
1,23,45,67,89, "foo" , "bar" 

Upvotes: 1
