Reputation: 7
Let's take
"apple" , "tomato,potato", 100, 250, "banana"
is my data I want replace the outer delimiter i.e. , which is same as inner delimiter. Ultimately, want my data like :
"apple" ; "tomato,potato"; 100; 250; "banana"
and the position of the string with delimeter can be anywhere in data.
I have tried many chatGPT sed commands all are not working
Upvotes: -5
Views: 317
Reputation: 478
Here is one solution with DuckDB on a larger example (2 lines)
$ cat /tmp/test.csv
"apple" , "tomato,potato", 100, 250, "banana"
"grapes","peas,beans",150,300,kiwi
the result from the DuckDB prompt
D COPY (from '/tmp/test.csv') TO '/dev/stdout' (HEADER 0, delimiter ';');
apple;tomato,potato;100;250;banana
grapes;peas,beans;150;300;kiwi
You can also use read_csv function for enhanced CSV parsing (RFC 4180 compliant)
$ duckdb << EoS
COPY (from read_csv('/tmp/test.csv')) TO '/tmp/test_semicolumn.csv' (HEADER 0, delimiter ';');
EoS
will return the same result on a new file
Upvotes: 0
Reputation: 167
The problem here is probably that you are missing the "inplace" parameter -i.
sed -i -E 's/( *\, *)((([^,"\,])+)|("[^"]+"))/; \2/g' nameofthefile.csv
this will substitute all occurencies of "," with ";" in the file without change them if they are inside quotes. Please change directory to the path of the file.
If you want to do this replacement for multiple files csv in the same path use this command:
sed -i -E 's/( *\, *)((([^,"\,])+)|("[^"]+"))/; \2/g' *.csv
Upvotes: -2
Reputation: 58518
This might work for you (GNU sed):
sed -E ':a;s/^(("[^,"]*"[^"]*)*"[^,"]*),/\1\n/;ta;y/\n,/,;/' file
First convert any commas inside double quotes to newlines.
Then transliterate newlines to commas and commas to semi-colons.
An alternate solution (kudos to Walter A):
sed -E 's/(("[^"]*"|)[^",]*),/\1;/g' file
Upvotes: 2
Reputation: 20032
When the input has an even number of double quotes, you can match a quoted field with ("[^"]*")
and the following substring until the next ,
with ([^",]*)
.
When you also want to replace delimiters between fields without quotes, add an *
after the regex for a quoted field.
echo '"apple" , "tomato,potato", 100, 250, "banana" ' | sed -E 's/("[^"]*")*([^",]*),/\1\2;/g'
# or with a file
sed -E 's/("[^"]*")*([^",]*),/\1\2;/g' file
Result
"apple" ; "tomato,potato"; 100; 250; "banana"
Upvotes: 2
Reputation: 204381
I wouldn't normally answer a question that shows no attempt by the OP but since there are multiple answers posted already....
Your input isn't valid CSV and so YMMV trying to use a tool that understands CSV on it.
Using any awk in any shell on every Unix box you could treat the input as "
-separated fields instead of ,
-separated fields and replace the ,
s in every odd-numbered "field":
$ awk 'BEGIN{FS=OFS="\""} {for (i=1; i<=NF; i+=2) gsub(/,/,";",$i)} 1' file
"apple" ; "tomato,potato"; 100; 250; "banana"
If that's not all you need then see whats-the-most-robust-way-to-efficiently-parse-csv-using-awk for ideas if you want to use awk for this.
Upvotes: 1
Reputation: 247092
Some implementations of CSV parsers may have trouble with that "sloppy" CSV. While the spaces around commas are OK, the fact that the quotes are not immediately next to the comma isn't OK. For example,
the ruby CSV module throws a MalformedCSVError "Any value after quoted field isn't allowed"
$ echo '"apple" , "tomato,potato", 100, 250, "banana"' | ruby -rcsv -e 'p CSV.parse(STDIN.readline)'
/home/linuxbrew/.linuxbrew/Cellar/ruby/3.3.2/lib/ruby/3.3.0/csv/parser.rb:1066:in `parse_quotable_robust': Any value after quoted field isn't allowed in line 1. (CSV::MalformedCSVError)
python:
$ python3
Python 3.12.3 (main, Apr 9 2024, 08:09:14) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import csv
>>> line = '"apple" , "tomato,potato", 100, 250, "banana"'
>>> for fields in csv.reader([line]):
... print(fields)
...
['apple ', ' "tomato', 'potato"', ' 100', ' 250', ' "banana"']
the bash5.2 dsv
module does the same thing as python: the "tomato,potato"
value is not considered one field because there is a space before the opening quote, so the quotes don't have special meaning.
Upvotes: 1
Reputation: 17
You can simply replace ,
with ;
using this sed command sed 's/, /; /g'
.
for example:
echo '"apple" , "tomato,potato", 100, 250, "banana"' | sed 's/, /; /g'
output:
"apple" ; "tomato,potato"; 100; 250; "banana"
Upvotes: -1