Reputation: 319
How do I place commas between fields?
Input data
12123 'QA test case 1' 'QA environment'
12234 'UAT test case 1' 'UAT environment'
Expected output
12123, 'QA test case 1', 'QA environment'
12234, 'UAT test case 1', 'UAT environment'
Upvotes: 1
Views: 2127
Reputation: 170
This solution (I believe) is not very nice, but standard at least:
awk 'BEGIN{SP="[:space:]"}{gsub("(["SP"]*('\''[^'\'']*'\''|[^'\''"SP"])+)","&,");if(match($0, (",["SP"]+$")))$0=substr($0,1,RSTART-1)substr($0,RSTART+1)}1'
Though some "broken" awk implementations don't support character classes with the [[:foo:]] style, in that case you can use:
awk 'BEGIN{SP=" \t\f\v\r\n"}{gsub("(["SP"]*('\''[^'\'']*'\''|[^'\''"SP"])+)","&,");if(match($0, (",["SP"]+$")))$0=substr($0,1,RSTART-1)substr($0,RSTART+1)}1'
Note: I used '\'' to place each single quote character because that's a simple and standard way to do it. If you want to use this line in a ".awk" file, just replace every occurrence with a single quote.
Upvotes: 0
Reputation: 41446
Here is how I handle csv
with awk
cat file
12123 'QA test case 1' 'QA environment' some more
12234 'UAT test case 1' 'UAT environment'
awk '{for (i=1;i<NF;i++) {if ($i~t) c++;printf "%s"(c%2?FS:", "),$i}print $NF}' t=\' file
12123, 'QA test case 1', 'QA environment', some, more
12234, 'UAT test case 1', 'UAT environment'
This keep track of how many '
it finds.
If its 0 2 4 6 etc
you are outside a group, split using ,
If its 1 3 5 7 etc
you are inside a group, split by (a space)
Since you now have a good separator, you can get rid of the '
awk '{for (i=1;i<NF;i++) {if ($i~t) c++;sub(t,"",$i);printf "%s"(c%2?FS:","),$i}sub(t,"",$NF);print $NF}' t=\' file
12123,QA test case 1,QA environment,some,more
12234,UAT test case 1,UAT environment
You could also use FPAT
that is used to define fields, opposed to FS
that define separators, but then you need gnu awk
4.x, and it would not be portable.
awk '{for (i=1;i<NF;i++) printf "%s, ",$i;print $NF}' FPAT="[^' ]+|'[^']+'" file
12123, 'QA test case 1', 'QA environment', some, more
12234, 'UAT test case 1', 'UAT environment'
How does the FPAT="[^' ]+|'[^']+'"
works?
1. A field should not contain one or more '
or space. [^' ]+
eks some
and more
2. A field starts with '
then one or more not '
and then ends with '
. '[^']+'
eks 'test data'
Upvotes: 1
Reputation: 203219
$ sed "s/ '/,&/g" file
12123, 'QA test case 1', 'QA environment'
12234, 'UAT test case 1', 'UAT environment'
Upvotes: 2
Reputation: 2504
Try this awk:
awk -F" '" '{ print $1, $2, $3 }' OFS=", '" data
or using a BEGIN
block:
awk -F" '" 'BEGIN {OFS="," FS} { print $1, $2, $3 }' data
In either case, the FS
is being set to '
(space + "'") and OFS
is being set to "," + '
. It's based on the assumption '
is a validly unique field separator and all input data is formatted/arranged as in the question.
Upvotes: 1
Reputation: 437176
A naïve bash
implementation that assumes that no (escaped) '
instances ever appear inside a field:
Input is assumed to come from file file
:
# Read all whitespace-separated tokens (potentially across quoted field boundaries).
while read -ra tkns; do
# Initialize per-line variables.
numTkns=${#tkns[@]} i=0 inField=0
# Loop over all tokens.
for tkn in "${tkns[@]}"; do
# Determine if we're inside a quoted field.
[[ $tkn == \'* ]] && inField=1
[[ $tkn == *\' ]] && inField=0
# Determine the output separator:
if (( ++i == numTkns )); then
sep=$'\n' # last token, terminate output line with \n
else
# inside a field: use just a space; between fields: use ', '
(( inField )) && sep=' ' || sep=', '
fi
# Output token and separator.
printf '%s%s' "$tkn" "$sep"
done
done < file
Upvotes: 2
Reputation: 246754
Another option is to use a CSV parser:
ruby -rcsv -ne '
puts CSV.generate_line(
CSV.parse_line($_.strip, {:col_sep => " ", :quote_char => "'\''"}
), {:force_quotes => 1})
' file
"12123","QA test case 1","QA environment"
"12234","UAT test case 1","UAT environment"
Upvotes: 3
Reputation: 38195
Your input data looks very much like an argument list. Therefore, one convenient approach would be to define a bash function that simply returns its argument list as comma separated tokens and invoke that for each line in your file.
However, the simple implementation below will lose the quotes around the multi-word tokens (but it will place the commas properly). If you need those quotes exactly as they were, it would get a bit more complicated (it's very easy to output every token quoted though):
#!/bin/bash
function csv_args() {
while [ -n "$1" ]; do
echo -n "$1"
shift
[ -n "$1" ] && echo -n ', '
done
echo
}
while read line; do
eval csv_args $line
done < /path/to/your/file
Upvotes: 2