Reputation: 39
I have two text files, file1 and file2. The first one, file1, looks like this:
ID22, abc0o, 1011, h232a, 78m, 928aaa
ID2344, oklabc, 12as2, 987, 7f82, sas28aas
ID092, ac, 12, haha, 782oee, gsd839
and the second one, file2:
ID1, 1, 2, 3, 4, 5
ID22, 6, 7
ID097222, 8, 9, 10
ID67, 11, 12, 13, 14, 1
ID2344, 8, 17, 23, 7, 555
ID2328999, 642, 43, 34, 34, 121
ID2344, 2111, 12
ID22, 1212, 9999, 23, 232, 96564
ID092, 1010, 1111, 1213, 1415, 18718
ID2328999, 9999, 333, 222, 7f82, 28
ID22, 8888, 777, 4444
ID2344, 220020, 666, 555, 782m, 839
What I would like to do is look up the first column of each file1 line in file2, append the remaining fields of every matching file2 line to that file1 line while preserving the order, and save the result to another file. The values in the first column of file1 are unique. The result should be as below.
ID22, abc0o, 1011, h232a, 78m, 928aaa, 6, 7, 1212, 9999, 23, 232, 96564, 8888, 777, 4444
ID2344, oklabc, 12as2, 987, 7f82, sas28aas, 8, 17, 23, 7, 555, 2111, 12, 220020, 666, 555, 782m, 839
ID092, ac, 12, haha, 782oee, gsd839, 1010, 1111, 1213, 1415, 18718
Upvotes: 0
Views: 133
Reputation: 133428
Could you please try the following, written and tested with the shown samples in GNU awk.
awk '
BEGIN{
  FS=OFS=", "
}
{
  first=$1
  $1=""
  sub(/^, +/,"")
}
FNR==NR{
  arr[first]=$0
  order[++count]=first
  next
}
(first in arr){
  arr[first]=(arr[first]?arr[first] OFS:"")$0
}
END{
  for(i=1;i<=count;i++){
    print order[i],arr[order[i]]
  }
}
' file1 file2
Explanation: Adding a detailed explanation for the above solution. The order array keeps the output in file1 order, because for(key in arr) visits the keys in an unspecified order.
awk '                          ##Starting awk program from here.
BEGIN{                         ##Starting BEGIN section of this program from here.
  FS=OFS=", "                  ##Setting field separator and output field separator as comma space.
}
{
  first=$1                     ##Creating first with value of 1st field here.
  $1=""                        ##Nullifying first field here.
  sub(/^, +/,"")               ##Removing the leading comma and space left behind by the emptied 1st field.
}
FNR==NR{                       ##Checking condition which will be TRUE when file1 is being read.
  arr[first]=$0                ##Creating arr with index of first and value of the rest of current line.
  order[++count]=first         ##Remembering the order in which the IDs appear in file1.
  next                         ##next will skip all further statements from here.
}
(first in arr){                ##Checking condition if first is present in arr then do following.
  arr[first]=(arr[first]?arr[first] OFS:"")$0  ##Keep adding current line value into arr[first] value.
}
END{                           ##Starting END block of this program from here.
  for(i=1;i<=count;i++){       ##Traversing through the IDs in file1 order here.
    print order[i],arr[order[i]]  ##Printing the ID and its collected value here.
  }
}
' file1 file2                  ##Mentioning Input_file names here.
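Since the question also asks to save the result to another file, the output can simply be redirected. A minimal sketch, assuming the sample data from the question; the scratch directory `workdir` and the output name `merged.txt` are placeholders, and the awk program is a compact order-preserving variant written for this illustration, not the exact program above:

```shell
workdir=$(mktemp -d)   # hypothetical scratch directory for the demo

cat > "$workdir/file1" <<'EOF'
ID22, abc0o, 1011, h232a, 78m, 928aaa
ID2344, oklabc, 12as2, 987, 7f82, sas28aas
ID092, ac, 12, haha, 782oee, gsd839
EOF

cat > "$workdir/file2" <<'EOF'
ID1, 1, 2, 3, 4, 5
ID22, 6, 7
ID097222, 8, 9, 10
ID67, 11, 12, 13, 14, 1
ID2344, 8, 17, 23, 7, 555
ID2328999, 642, 43, 34, 34, 121
ID2344, 2111, 12
ID22, 1212, 9999, 23, 232, 96564
ID092, 1010, 1111, 1213, 1415, 18718
ID2328999, 9999, 333, 222, 7f82, 28
ID22, 8888, 777, 4444
ID2344, 220020, 666, 555, 782m, 839
EOF

awk 'BEGIN { FS = OFS = ", " }
FNR == NR { line[$1] = $0; order[++n] = $1; next }   # file1: keep line and position
$1 in line {                                         # file2: only IDs known from file1
    rest = $0
    sub(/^[^,]*, /, "", rest)                        # drop the leading "ID, "
    line[$1] = line[$1] OFS rest
}
END { for (i = 1; i <= n; i++) print line[order[i]] }
' "$workdir/file1" "$workdir/file2" > "$workdir/merged.txt"   # redirect into the new file

cat "$workdir/merged.txt"
```

The redirection `> merged.txt` overwrites the target on every run, so pointing it at file1 itself would destroy the input; writing to a separate file and renaming it afterwards is the safer habit.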
Upvotes: 1
Reputation: 37394
Another awk, the straightforward way:
$ awk '
NR==FNR {
    k=$1                  # store $1 as key k
    $1=""                 # null $1
    a[k]=a[k] "," $0      # append records excluding the $1 to a[k]
    next
}
$1 in a {
    print $0 a[$1]        # output
}' file2 file1            # mind the order
Output:
ID22, abc0o, 1011, h232a, 78m, 928aaa, 6, 7, 1212, 9999, 23, 232, 96564, 8888, 777, 4444
ID2344, oklabc, 12as2, 987, 7f82, sas28aas, 8, 17, 23, 7, 555, 2111, 12, 220020, 666, 555, 782m, 839
ID092, ac, 12, haha, 782oee, gsd839, 1010, 1111, 1213, 1415, 18718
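Why this works without setting a field separator: with awk's default whitespace splitting, $1 keeps its trailing comma (for example `ID22,`), so the keys built from file2 match $1 of the file1 lines exactly, and every stored value starts with a comma that joins cleanly onto the file1 line. A small sketch on two lines taken from the sample file2; the variable `out` and the bracketed print format exist only for this demonstration:

```shell
out=$(printf '%s\n' 'ID22, 6, 7' 'ID22, 1212, 9999' |
awk '{
    k = $1                 # default splitting: k is "ID22," including the comma
    $1 = ""                # rebuilds $0 with a leading space, e.g. " 6, 7"
    a[k] = a[k] "," $0     # accumulate ", 6, 7" then ", 1212, 9999"
}
END {
    for (k in a) printf "[%s] -> [%s]\n", k, a[k]
}')
printf '%s\n' "$out"
# prints: [ID22,] -> [, 6, 7, 1212, 9999]
```

This relies on every field in the data being followed by a comma and a space; input with a different delimiter layout would need an explicit field separator.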
Upvotes: 2
Reputation: 12867
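awk -F, '
NR==FNR {
    map[$1]=$0                                     # file1: store the whole line under its ID
    order[++n]=$1                                  # remember the order of file1
    next
}
$1 in map {
    map[$1]=map[$1] "," substr($0, length($1)+2)   # file2: append everything after "ID,"
}
END {
    for (i=1; i<=n; i++)
        print map[order[i]]
}' file1 file2
As a one-liner:
awk -F, 'NR==FNR { map[$1]=$0; order[++n]=$1; next } $1 in map { map[$1]=map[$1] "," substr($0, length($1)+2) } END { for (i=1; i<=n; i++) print map[order[i]] }' file1 file2
Process the first file with awk (NR==FNR): store each line in the array map, indexed by the first comma-separated field, and record the IDs in the order they appear so the output keeps the order of file1. Then, for the second file, where the first comma-delimited field already has an entry in map, append the rest of the line (everything after the ID and its comma) to that entry. The test $1 in map checks membership without creating an entry for IDs that occur only in file2. At the end, loop over the recorded IDs and print each entry.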
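A note on the membership test: in awk, merely referencing `map[key]` in a comparison creates an empty element for that key, whereas `key in map` does not, which is why IDs that occur only in file2 can end up as empty entries. A small sketch to illustrate; the array name `seen` and the keys `x` and `y` are made up for the example:

```shell
out=$(awk 'BEGIN {
    if (seen["x"] != "") print "unreachable"   # referencing seen["x"] creates it
    if ("y" in seen)     print "unreachable"   # "in" does not create seen["y"]
    n = 0
    for (k in seen) n++                        # count the elements that now exist
    hasx = ("x" in seen)
    hasy = ("y" in seen)
    print n, hasx, hasy
}')
printf '%s\n' "$out"
# prints: 1 1 0
```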
Upvotes: 1