Reputation: 67211
I want to filter the data in a text file on Unix. The file looks like this:
A 200
B 300
C 400
A 100
B 600
B 700
How can I use awk to produce the following output from the data above?
A 200 100
B 300 600 700
C 400
I am not very good at awk, but I believe awk or Perl is the best tool for this.
Upvotes: 2
Views: 133
Reputation: 58361
This might work for you:
sort -sk1,1 file | sed ':a;$!N;s/^\([^ ]*\)\( .*\)\n\1/\1\2/;ta;P;D'
A 200 100
B 300 600 700
C 400
Upvotes: 0
Reputation: 36252
Using sed:
Content of script.sed:
## First line. Newline will separate data, so add it after the content.
## Save it in 'hold space' and read next one.
1 {
  s/$/\n/
  h
  b
}
## Append content of 'hold space' to current line.
G
## Search if first char (\1) in line was saved in 'hold space' (\4) and add
## the number (\2) after it.
s/^\(.\)\( *[0-9]\+\)\n\(.*\)\(\1[^\n]*\)/\3\4\2/
## If the last substitution succeeded, go to label 'a'.
ta
## Here last substitution failed, so it is the first appearance of the
## letter, add it at the end of the content.
s/^\([^\n]*\n\)\(.*\)$/\2\1/
## Label 'a'.
:a
## Save content to 'hold space'.
h
## In last line, get content of 'hold space', remove last newline and print.
$ {
  x
  s/\n*$//
  p
}
Run it like:
sed -nf script.sed infile
And result:
A 200 100
B 300 600 700
C 400
Upvotes: 0
Reputation: 27990
awk 'END {
  for (R in r)
    print R, r[R]
}
{
  r[$1] = $1 in r ? r[$1] OFS $2 : $2
}' infile
If the order of the keys in the first field matters, more code is needed, because for (R in r) visits the array elements in an unspecified order. The solution depends on your awk implementation and version.
Explanation:
r[$1] = $1 in r ? r[$1] OFS $2 : $2
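Set the value of the array r element $1 to:
- r[$1] OFS $2 (the existing value, the output field separator, then the new value) if the key $1 is already in r
- $2 otherwise
expression ? if true : if false is the ternary operator. See ternary operation for more.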
Upvotes: 3
Reputation: 36252
Using awk, sorting the output inside it (asort is a GNU awk extension):
awk '
  { data[$1] = (data[$1] ? data[$1] " " : "") $2 }
  END {
    for (i in data) {
      idx[++j] = i
    }
    n = asort(idx);
    for ( i=1; i<=n; i++ ) {
      print idx[i] " " data[idx[i]]
    }
  }
' infile
Using the external program sort:
awk '
  { data[$1] = (data[$1] ? data[$1] " " : "") $2 }
  END {
    for (i in data) {
      print i " " data[i]
    }
  }
' infile | sort
For both commands output is:
A 200 100
B 300 600 700
C 400
Upvotes: 0
Reputation: 2791
You could do it like this, but with Perl there's always more than one way to do it:
my %hash;
while(<>) {
    my($letter, $int) = split(" ");
    push @{ $hash{$letter} }, $int;
}
for my $key (sort keys %hash) {
    print "$key " . join(" ", @{ $hash{$key} }) . "\n";
}
It should work like this:
$ cat data.txt | perl script.pl
A 200 100
B 300 600 700
C 400
Upvotes: 2
Reputation: 22810
This is not language-specific, more like pseudocode, but here's the idea:
- Get all lines in an array
- Set a target dictionary of arrays
- Go through the array:
  - Split the string using ' ' (space) as the delimiter, into array parts
  - Check whether there is already a dictionary entry for `parts[0]` (e.g. 'A'); if not, create it
  - Add `parts[1]` (e.g. 100) to `dictionary(parts[0])`
And that's it! :-)
I'd do it, probably in Python, but that's rather a matter of taste.
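For what it's worth, here is a minimal Python sketch of the steps above. The function name `group_values` and the inline sample list are just illustrative choices, and I'm assuming that blank or one-field lines should simply be skipped:

```python
def group_values(lines):
    """Group the second field of each line under its first field."""
    groups = {}  # the "dictionary of arrays" from the steps above
    for line in lines:
        parts = line.split()  # split on whitespace
        if len(parts) < 2:
            continue  # assumption: ignore blank or malformed lines
        # Create the entry for parts[0] if it is missing, then add parts[1]
        groups.setdefault(parts[0], []).append(parts[1])
    return groups


# Sample data copied from the question
sample = ["A 200", "B 300", "C 400", "A 100", "B 600", "B 700"]

for key, values in group_values(sample).items():
    print(key, *values)
```

With the sample data this prints the three lines asked for in the question. Since Python 3.7 a dict keeps insertion order, so the keys come out in the order they first appear in the input; wrap the items in `sorted(...)` if you want them sorted instead. To read a real file, pass an open file object instead of the list.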
Upvotes: 1