Reputation: 109
I'm essentially trying to "tidy" a lot of data in a CSV. I don't need any of the information that's in "quotes".
I tried sed 's/".*"/""/'
but it also removes the commas when there is more than one quoted section on a line.
I would like to get from this:
1,2,"a",4,"b","c",5
To this:
1,2,,4,,,5
Is there a sed wizard who can help? :)
Upvotes: 3
Views: 165
Reputation: 88829
With Perl:
perl -p -e 's/".*?"//g' file
The ? after * makes the match non-greedy, so each match stops at the nearest closing quote instead of the last one on the line.
Output:
1,2,,4,,,5
Upvotes: 2
Reputation: 133680
Could you please try the following:
awk -v s1="\"" 'BEGIN{FS=OFS=","} {for(i=1;i<=NF;i++){if($i~s1){$i=""}}} 1' Input_file
The non-one-liner form of the solution is:
awk -v s1="\"" '
BEGIN{
FS=OFS=","
}
{
for(i=1;i<=NF;i++){
if($i~s1){
$i=""
}
}
}
1
' Input_file
Detailed explanation:
awk -v s1="\"" ' ##Start the awk program and set variable s1 to a double quote (").
BEGIN{ ##Start the BEGIN section, which runs before any input is read.
FS=OFS="," ##Set both the input and output field separators to a comma.
}
{
for(i=1;i<=NF;i++){ ##Loop over every field of the current line.
if($i~s1){ ##If the current field contains a double quote,
$i="" ##replace that field with an empty string.
}
}
}
1 ##1 is an always-true condition, so awk prints the line, edited or not.
' Input_file ##Name of the input file.
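One thing to watch with the field-by-field approach: it splits on every comma, including commas inside quotes. The sketch below uses a made-up input line (not from the question) with a comma inside a quoted value and compares the awk loop with the sed substitution posted elsewhere in this thread. The variable names are hypothetical.

```shell
# Hypothetical input: the quoted value itself contains a comma.
line='1,"a,b",2'

# Splitting on "," breaks "a,b" into two fields ("a and b"); both still
# contain a quote, so both are emptied and an extra comma is left behind.
by_field=$(printf '%s\n' "$line" |
  awk 'BEGIN{FS=OFS=","} {for(i=1;i<=NF;i++) if($i ~ /"/) $i=""} 1')
echo "$by_field"     # 1,,,2

# Matching quote-to-quote treats the whole quoted value as one unit.
by_quotes=$(printf '%s\n' "$line" | sed 's/"[^"]*"//g')
echo "$by_quotes"    # 1,,2
```

If the quoted values in your data never contain commas, as in the sample from the question, both approaches give the same result.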
Upvotes: 2
Reputation: 627262
You may use
sed 's/"[^"]*"//g' file > newfile
See the online sed demo:
s='1,2,"a",4,"b","c",5'
sed 's/"[^"]*"//g' <<< "$s"
# => 1,2,,4,,,5
Details
The "[^"]*" pattern matches a ", then 0 or more characters other than ", and then a closing ". The matches are removed since the RHS is empty. The g flag makes it match all occurrences on each line.
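As a quick check of the "0 or more" and "each line" parts, here is a sketch on a two-line input. The second line is invented for illustration and includes an empty quoted value ("") alongside a field that was already empty.

```shell
# Two lines of input; the second is a made-up example with "" in it.
input='1,2,"a",4,"b","c",5
"x",,"",7'

result=$(printf '%s\n' "$input" | sed 's/"[^"]*"//g')
echo "$result"
# 1,2,,4,,,5
# ,,,7
```

Because [^"]* can match zero characters, the empty "" is removed as well, and the substitution is applied to every line independently.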
Upvotes: 3