Another.Chemist

Reputation: 2559

Modify content inside quotation marks, BASH

Good day to all,

I was wondering how to modify the content inside quotation marks while leaving everything outside them unmodified.

Input line:

,,,"Investigacion,,, desarrollo",,,

Output line:

,,,"Investigacion, desarrollo",,,

Initial try:

sed 's/\"",,,""*/,/g' 

But nothing happens. Thanks in advance for any clue.

Upvotes: 2

Views: 95

Answers (4)

jaypal singh

Reputation: 77155

Using a language that has built-in CSV parsing capabilities like perl will help.

perl -MText::ParseWords -ne '
    print join ",", map { $_ =~ s/,,,/,/; $_ } parse_line(",", 1, $_)
' file
,,,"Investigacion, desarrollo",,,

Text::ParseWords is a core module, so you don't need to download it from CPAN. With the parse_line function we set the delimiter and a flag to keep the quotes. Then we just do a simple substitution on each field and join the fields back together to rebuild the CSV line.

Upvotes: 3

Ed Morton

Reputation: 204259

The idiomatic awk way to do this is simply:

$ awk 'BEGIN{FS=OFS="\""} {sub(/,+/,",",$2)} 1' file
,,,"Investigacion, desarrollo",,,

or if you can have more than one set of quoted strings on each line:

$ cat file
,,,"Investigacion,,, desarrollo",,,"foo,,,,bar",,,

$ awk 'BEGIN{FS=OFS="\""} {for (i=2;i<=NF;i+=2) sub(/,+/,",",$i)} 1' file
,,,"Investigacion, desarrollo",,,"foo,bar",,,

This approach works because everything up to the first " is field 1, everything from there to the second " is field 2, and so on, so the text between quotes always lands in the even-numbered fields. It can only fail if you have newlines or escaped double quotes inside your fields, but that would affect every other possible solution too, so you'd need to add cases like that to your sample input if you want a solution that handles them.
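To make the field numbering concrete, here is a small sketch that prints each field with its index when " is the separator. Both function names are invented for illustration. The second function is a variation on the command above, not the original: it uses gsub() instead of sub(), on the assumption that a quoted string might contain more than one run of commas.

```shell
# Hypothetical helper: print every "-delimited field of a line with its index.
show_fields() {
    printf '%s\n' "$1" |
        awk -F'"' '{ for (i = 1; i <= NF; i++) printf "%d: [%s]\n", i, $i }'
}

# Variation on the answer's command: gsub() instead of sub(), so that every
# run of commas in a quoted field is collapsed, not only the first one.
squeeze_quoted() {
    printf '%s\n' "$1" |
        awk 'BEGIN { FS = OFS = "\"" } { for (i = 2; i <= NF; i += 2) gsub(/,+/, ",", $i) } 1'
}

show_fields ',,,"Investigacion,,, desarrollo",,,"foo,,,,bar",,,'
squeeze_quoted ',,"a,,b,,,c",,'
```

The first call prints five fields, with the quoted text at indices 2 and 4; the second prints `,,"a,b,c",,`.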

Upvotes: 3

konsolebox

Reputation: 75568

Using awk:

awk '{
    p = ""
    while (match($0, /"[^"]*,{2,}[^"]*"/)) {
        t = substr($0, RSTART, RLENGTH)
        gsub(/,+/, ",", t)
        p = p substr($0, 1, RSTART - 1) t
        $0 = substr($0, RSTART + RLENGTH)
    }
    $0 = p $0
} 1'

Test:

$ echo ',,,"Investigacion,,, desarrollo",,,' | awk ...
,,,"Investigacion, desarrollo",,,
$ echo ',,,"Investigacion,,, desarrollo",,,",,, "' | awk ...
,,,"Investigacion, desarrollo",,,", "

Upvotes: 1

anubhava

Reputation: 785721

Using grep, sed and tr:

s=',,,"Investigacion,,, desarrollo",,,'
r=$(grep -Eo '"[^"]*"|,' <<< "$s" | sed '/^"/s/,\{2,\}/,/g' | tr -d "\n")

echo "$r"
,,,"Investigacion, desarrollo",,,
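
For anyone wondering what each stage does, here is a sketch that splits the pipeline into two functions. The names `tokenize` and `collapse` are made up for illustration, `grep -E` stands in for egrep, and a printf pipe replaces the here-string so the sketch does not depend on bash. It assumes a grep that supports -o (GNU and BSD grep do).

```shell
# Stage 1 (hypothetical name): emit each quoted string or bare comma on its
# own line. Commas inside a quoted string stay on that string's line.
tokenize() {
    printf '%s\n' "$1" | grep -Eo '"[^"]*"|,'
}

# Stages 2 and 3: on lines that start with a quote, squeeze runs of two or
# more commas into one, then delete the newlines to rebuild a single line.
collapse() {
    tokenize "$1" | sed '/^"/s/,\{2,\}/,/g' | tr -d '\n'
}

tokenize ',,,"Investigacion,,, desarrollo",,,'
collapse ',,,"Investigacion,,, desarrollo",,,'
```

Because stage 1 keeps only quoted strings and commas, any unquoted text between the commas would be dropped, so this only suits input shaped like the sample.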

Upvotes: 2
