Reputation: 89
I have a .txt file (tab-separated) with two columns. The first column holds multiple comma-separated values. Input format:
a,b,c x
a y
c,d,e z
The desired output is:
a x
b x
c x
a y
c z
d z
e z
The number of comma-separated entries varies between 1 and 40. I tried the following awk command, but it only works when there are exactly two comma-separated entries in the first column:
awk 'BEGIN{FS="[ ,]"} {print $1, $NF; print $2, $NF}' input
How can I modify the above command (or use something new) to do the job? Thanks.
Upvotes: 2
Views: 119
Reputation: 67507
Another awk:
$ awk -F',|[ \t]+' '{for(i=1;i<NF;i++) print $i,$NF}' file
Set the field delimiter to a comma or whitespace, then print each field (up to, but not including, the last) paired with the last field.
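A quick check on the sample data from the question (the file name input.txt and the printf line are only there to reproduce the sample):

```shell
# Recreate the sample input: tab-separated, first column comma-separated.
printf 'a,b,c\tx\na\ty\nc,d,e\tz\n' > input.txt

# Split on a comma or on runs of blanks/tabs, pair every field with the last one.
awk -F',|[ \t]+' '{for (i = 1; i < NF; i++) print $i, $NF}' input.txt
# a x
# b x
# c x
# a y
# c z
# d z
# e z
```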
Upvotes: 3
Reputation: 2471
You can use this sed too:
sed -E ':A;s/([^,]*),(.*)(\t.*$)/\1\3\n\2\3/;tA' infile
The output is tab-separated.
If you don't want the tab:
sed -E ':A;s/([^,]*),(.*)(\t.*$)/\1\3\n\2\3/;tA;s/\t/ /g' infile
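A quick check of the first variant, assuming GNU sed (the \t escape in the regex is a GNU extension; input.txt just holds the sample data from the question):

```shell
# Sample input: tab-separated, first column comma-separated.
printf 'a,b,c\tx\na\ty\nc,d,e\tz\n' > input.txt

# Each pass of the :A loop peels off one comma-separated item,
# re-attaching the tab and the last field to both halves.
sed -E ':A;s/([^,]*),(.*)(\t.*$)/\1\3\n\2\3/;tA' input.txt
```

This prints each value paired with the last field, one pair per line, tab-separated.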
Upvotes: 0
Reputation: 58420
This might work for you (GNU sed):
sed -r 's/\s+(\S+)$/ \1/;s/,(.*( \S+))$/\2\n\1/;P;D' file
Normalize the whitespace before the last field to a single space. Then replace the first comma with the last field followed by a newline, turning the first comma-separated item into a complete output line. Print up to the first newline, delete that part, and repeat until no comma remains.
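The loop can be checked against the sample data, again assuming GNU sed (-r, \s, and \S are GNU extensions; input.txt just recreates the question's input):

```shell
# Sample input: tab-separated, first column comma-separated.
printf 'a,b,c\tx\na\ty\nc,d,e\tz\n' > input.txt

# P prints the pattern space up to the first newline; D deletes that part
# and restarts the cycle, so each comma-separated item is emitted in turn.
sed -r 's/\s+(\S+)$/ \1/;s/,(.*( \S+))$/\2\n\1/;P;D' input.txt
# a x
# b x
# c x
# a y
# c z
# d z
# e z
```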
Upvotes: 0
Reputation: 117
BEGIN {
    FS = "\t"
}
NF == 2 {
    n = split($1, a, ",")
    for (i = 1; i <= n; i++) {
        print a[i], $2
    }
}
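To run this answer, save the program to a file (split.awk is just an illustrative name) and pass it to awk with -f:

```shell
# Save the awk program above under an illustrative name.
cat > split.awk <<'EOF'
BEGIN {
    FS = "\t"
}
NF == 2 {
    n = split($1, a, ",")
    for (i = 1; i <= n; i++) {
        print a[i], $2
    }
}
EOF

# Sample input from the question.
printf 'a,b,c\tx\na\ty\nc,d,e\tz\n' > input.txt

awk -f split.awk input.txt
# a x
# b x
# c x
# a y
# c z
# d z
# e z
```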
Upvotes: 1
Reputation: 241868
Perl to the rescue!
perl -lane 'print "$F[0] $_ $F[2]" for glob "{$F[1]}"' file
-n reads the input line by line
-l removes newlines from the input and adds them to print
-a splits each line on whitespace into the @F array

Update: You changed the specification (without marking any update). The solution remains similar; just use
perl -lane 'print "$_ $F[1]" for glob "{$F[0]}"' file
Upvotes: 2
Reputation: 589
awk's split function to the rescue:
awk '{n=split($1,arr,","); for (i = 1; i <= n; i++) print arr[i], $2}' input
Upvotes: 1