Reputation: 4499
I have two tab delimited files
file1.txt
field1
field2
field3
file2.txt
field1 value f11
field1 value f12
field1 value f13
field2 value f21
field2 value f22
field2 value f23
field3 value f31
I want the output
field1 value f11, value f12 , valuef13
field2 value f21, value f22 , valuef23
field3 value f31
I am trying this
awk -F"\t" 'NR==FNR{a[$1] = $1;next} { print a[$1] }' file1.txt file2.txt
and it is coming as empty
If filds are in file1 and not in file2 then that should comas blank
UPDATE:
This is my output from my terminal
ubuntu@ubuntu:/mnt/coding/awk$ cat f1.txt
field1
field2
field3
field9
ubuntu@ubuntu:/mnt/coding/awk$ cat f2.txt
field1 value f11
field1 value f12
field1 value f13
field2 value f21
field2 value f22
field2 value f23
field3 value f31
ubuntu@ubuntu:/mnt/coding/awk$ awk -F"\t" 'NR==FNR{a[$1] = a[$1]", "$2;next}{gsub("^,","",a[$1]);print $1"\t"a[$1]}' f2.txt f1.txt
field1
field2
field3
field9
ubuntu@ubuntu:/mnt/coding/awk$ awk --version
GNU Awk 3.1.8
Copyright (C) 1989, 1991-2010 Free Software Foundation.
UPDATE 2:
ubuntu@ubuntu:/mnt/coding/awk$ od -xcb f1.txt
0000000 6966 6c65 3164 660a 6569 646c 0a32 6966
f i e l d 1 \n f i e l d 2 \n f i
146 151 145 154 144 061 012 146 151 145 154 144 062 012 146 151
0000020 6c65 3364 660a 6569 646c 0a39 000a
e l d 3 \n f i e l d 9 \n \n
145 154 144 063 012 146 151 145 154 144 071 012 012
0000035
ubuntu@ubuntu:/mnt/coding/awk$ od -xcb f2.txt
0000000 6966 6c65 3164 2020 6176 756c 2065 3166
f i e l d 1 v a l u e f 1
146 151 145 154 144 061 040 040 166 141 154 165 145 040 146 061
0000020 0a31 6966 6c65 3164 2020 6176 756c 2065
1 \n f i e l d 1 v a l u e
061 012 146 151 145 154 144 061 040 040 166 141 154 165 145 040
0000040 3166 0a32 6966 6c65 3164 2020 6176 756c
f 1 2 \n f i e l d 1 v a l u
146 061 062 012 146 151 145 154 144 061 040 040 166 141 154 165
0000060 2065 3166 0a33 6966 6c65 3264 2020 6176
e f 1 3 \n f i e l d 2 v a
145 040 146 061 063 012 146 151 145 154 144 062 040 040 166 141
0000100 756c 2065 3266 0a31 6966 6c65 3264 2020
l u e f 2 1 \n f i e l d 2
154 165 145 040 146 062 061 012 146 151 145 154 144 062 040 040
0000120 6176 756c 2065 3266 0a32 6966 6c65 3264
v a l u e f 2 2 \n f i e l d 2
166 141 154 165 145 040 146 062 062 012 146 151 145 154 144 062
0000140 2020 6176 756c 2065 3266 0a33 6966 6c65
v a l u e f 2 3 \n f i e l
040 040 166 141 154 165 145 040 146 062 063 012 146 151 145 154
0000160 3364 2020 6176 756c 2065 3366 0a31 000a
d 3 v a l u e f 3 1 \n \n
144 063 040 040 166 141 154 165 145 040 146 063 061 012 012
0000177
ubuntu@ubuntu:/mnt/coding/awk$
Other:
awk -F"\t" 'NR==FNR{a[$1] = a[$1]", "$2; print "["$1"/"$2"]"; next}{gsub("^,","",a[$1]);print $1"\t"a[$1]}' f2.txt f1.txt
[field1 value f11/]
[field1 value f12/]
[field1 value f13/]
[field2 value f21/]
[field2 value f22/]
[field2 value f23/]
[field3 value f31/]
[/]
field1
field2
field3
field9
Upvotes: 0
Views: 236
Reputation: 882806
Here's one possible solution:
NR==FNR{a[$1]=a[$1]", "$2;next}{gsub("^,","",a[$1]);print $1"\t"a[$1]}
This works by storing the keys and values in file2
first, then processing file1 to output them. The following (slightly formatted) transcript shows this in action:
pax> cat file1.txt
field1
field2
field3
field9
pax> cat file2.txt
field1 value f11
field1 value f12
field1 value f13
field2 value f21
field2 value f22
field2 value f23
field3 value f31
pax> awk -F"\t" 'NR == FNR{
...> a[$1] = a[$1]", "$2;
...> next
...> }
...> {
...> gsub ("^,", "", a[$1]);
...> print $1"\t"a[$1]
...> }' file2.txt file1.txt
field1 value f11, value f12, value f13
field2 value f21, value f22, value f23
field3 value f31
field9
Based on your update where I suggested you use the sed
command print "["$1"/"$2"]"
and you got:
[field1 value f11/]
[field1 value f12/]
[field1 value f13/]
[field2 value f21/]
[field2 value f22/]
[field2 value f23/]
[field3 value f31/]
that indicates that those fields are not tab-separated. This is also confirmed by the output of the od
command where you can clearly see th e 2020
two-space sequence.
You need to sort that out or the commands won't work with -F"\t"
. For example, if I replace each tab in my file2.txt
with a space, I see what you're seeing:
pax> awk -F"\t" 'NR==FNR{a[$1] = a[$1]", "$2;next}
...> {gsub("^, ","",a[$1]);print $1"\t"a[$1]}' file2.txt file1.txt
field1
field2
field3
field9
So, you need to make a decision, either:
awk
options to work with the actual file format.Given you have spaces elsewhere on the line, the first option is probably the easiest. To modify the file to replace the first group of spaces on each line with a tab, you can use:
sed 's/ */\t/' file2.txt >file2a.txt
mv file2a.txt file2.txt # once you're happy.
(or use the -i
inplace editing option of GNU sed
).
Upvotes: 4
Reputation: 55369
Using other tools instead of awk (and assuming there's a tab to separate fields from values in file2):
for f in `cat /tmp/file1.txt`; do grep $f /tmp/file2.txt | cut -f2- | paste -s -d, | sed "s/^/$f\t/g" ; done
(I'd have posted this as a comment but it doesn't seem to be possible to escape backticks in comments)
Upvotes: 0