Reputation: 3070
I have a list like this:
s1 d2
s1 d4
s3 d2
s4 d1
s1 d3
s4 d1
s5 d6
s3 d5
s1 d2
s1 d3
I need to obtain, for every element in the first column (s_), the list of elements in the second column (d_), in the same order of appearance. In this case:
s1 d2 d4 d3 d2 d3
s3 d2 d5
s4 d1 d1
s5 d6
The order of the s_ is not important; the order of the d_ is.
Can you suggest a simple and fast approach to do it (because the list is large), maybe in awk?
Upvotes: 1
Views: 209
Reputation: 28000
This would guarantee the order of both keys and values:
awk 'END {
  for (i = 0; ++i <= c;)
    print idx[i], s[idx[i]]
}
{
  s[$1] = s[$1] ? s[$1] OFS $2 : $2
  t[$1]++ || (idx[++c] = $1)
}' infile
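For readers who find the compact form hard to follow, here is a minimal sketch of the same idea written out step by step. The array names `vals` and `order`, the counter `n`, and the shell variables `input` and `out` are my own choices for illustration, and the input is the sample from the question:

```shell
# Sample data copied from the question.
input='s1 d2
s1 d4
s3 d2
s4 d1
s1 d3
s4 d1
s5 d6
s3 d5
s1 d2
s1 d3'

out=$(printf '%s\n' "$input" | awk '
{
    if (!($1 in vals)) {
        # First time this key appears: record its position and start its list.
        order[++n] = $1
        vals[$1] = $2
    } else {
        # Key already known: append the new value after the output separator.
        vals[$1] = vals[$1] OFS $2
    }
}
END {
    # Walk the keys in first-seen order instead of using for (key in vals).
    for (i = 1; i <= n; i++)
        print order[i], vals[order[i]]
}')

printf '%s\n' "$out"
```

It makes a single pass over the input and keeps one string per key, so it should cope reasonably with a large list, although I have not benchmarked it.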
Upvotes: 1
Reputation: 51653
Here you go:
awk '{ ss[$1]++ ; ds[$1, NR] = $2 }   # (key, line number) subscript: "s1" on line 12 cannot collide with "s11" on line 2
END { for ( e in ss )
  { a = e
    for (i = 1; i <= NR; i++)
      if ((e, i) in ds) a = a " " ds[e, i]
    print a
  }
}' INPUTFILE
HTH
Upvotes: 1
Reputation: 14014
Something like this, perhaps (for the command line):
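awk '{ vals[$1] = vals[$1] " " $2 }; END { for (key in vals) { print key vals[key] }}' list
Formatted more readably as an awk script:
{ vals[$1] = vals[$1] " " $2 }
END {
  for (key in vals) {
    print key vals[key]
  }
}
This stores, indexed by the first column, a string that accumulates the second-column values in the order they appear: each time a key is seen, its value is appended to the end of that key's string. At the end, it prints each key followed by its accumulated string. Because each stored string already starts with a space, the key and the string are printed without a comma between them, which avoids a double space. Note that for (key in vals) visits the keys in an unspecified order, which is fine here since the order of the s_ does not matter.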
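As a quick check, here is a sketch that runs this concatenation approach on the sample data from the question. The shell variables `input` and `out` are just names I picked, and the trailing `sort` is only there to make the output order repeatable, since awk does not define the order of `for (key in vals)`:

```shell
# Sample data copied from the question.
input='s1 d2
s1 d4
s3 d2
s4 d1
s1 d3
s4 d1
s5 d6
s3 d5
s1 d2
s1 d3'

# Append each second-column value to the string stored under its key,
# then print key and string; the string already begins with a space.
out=$(printf '%s\n' "$input" |
  awk '{ vals[$1] = vals[$1] " " $2 }
       END { for (key in vals) print key vals[key] }' |
  sort)

printf '%s\n' "$out"
```

On this input it should print the four lines from the question: s1 d2 d4 d3 d2 d3, s3 d2 d5, s4 d1 d1 and s5 d6.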
Upvotes: 5
Reputation: 2720
I would use an associative array to remember each "sX" and then do string concatenation on its value, with a space between the values.
BEGIN {
  print "ID\tList\n";
}
{
  id[$1] = (id[$1] == "" ? $2 : id[$1] " " $2);
}
END {
  for (var in id)
    print var "\t" id[var];
}
Upvotes: 2