Reputation: 330
Currently, I have a command that outputs data in the following format:
apple: banana
apple: cantaloupe
apple: durian
apple: eggplant
banana: cantaloupe
banana: durian
durian: eggplant
eggplant:
In other words, it's a tree-like structure in which apple is the root, which has children banana and eggplant, and banana also has sub-children cantaloupe and durian. eggplant has no children, yet still has a trailing colon.
I want to concatenate the output into this format:
apple: banana eggplant
banana: cantaloupe durian
durian: eggplant
eggplant:
Some objects may show up more than once in the output (in this case, cantaloupe, durian, and eggplant have multiple parent nodes). While this example doesn't have it, there may also be multiple root nodes (i.e. nodes at the same level as apple).
How would I go about modifying this output? I'm using bash/shell scripting in general right now, so I was thinking awk would probably be the best way to handle this, but if this is better handled in Python, Ruby, Perl, or some other scripting language, I'm also open to suggestions.
Upvotes: 2
Views: 414
Reputation: 753990
awk -F: '{ list[$1] = list[$1] $2 } END { for (i in list) printf "%s:%s\n", i, list[i] }'
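Accumulate entries using awk's associative arrays, building up the list for each key. awk has no concatenation operator; it joins strings by juxtaposition, so list[$1] $2 appends $2 to whatever is already stored (an unset entry is the empty string). At the end, print out each key and the entries for that key. The for (i in list) loop visits keys in an unspecified order; if ordering is required, you need to say so.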
Assuming that the keys on the left should be output in the order of first appearance on the LHS of the input, then you can use this slightly more complex script:
awk -F: '{ if (!($1 in list)) keys[++n] = $1; list[$1] = list[$1] $2 }
END { for (j = 1; j <= n; j++) printf "%s:%s\n", keys[j], list[keys[j]] }'
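To try the order-preserving version without the original command, one option is to wrap it in a shell function and feed it the sample data from the question through a here-document. The function name group_children is an assumption made for this sketch; the awk program is the same one shown above.

```shell
# Hypothetical wrapper around the order-preserving awk program.
group_children() {
    awk -F: '{ if (!($1 in list)) keys[++n] = $1; list[$1] = list[$1] $2 }
             END { for (j = 1; j <= n; j++) printf "%s:%s\n", keys[j], list[keys[j]] }'
}

# Sample input copied from the question.
group_children <<'EOF'
apple: banana
apple: cantaloupe
apple: durian
apple: eggplant
banana: cantaloupe
banana: durian
durian: eggplant
eggplant:
EOF
# Prints:
# apple: banana cantaloupe durian eggplant
# banana: cantaloupe durian
# durian: eggplant
# eggplant:
```

In real use the here-document would be replaced by a pipe from the command that produces the data.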
Upvotes: 2
Reputation: 785246
You can use awk:
awk -F ': *' '{a[$1] = (a[$1]? a[$1] OFS $2 : $2)}
END { for (i in a) print i ": " a[i] }' file
Output (the order of a for-in loop is unspecified in awk, so the lines may come out in a different order):
eggplant:
apple: banana cantaloupe durian eggplant
banana: cantaloupe durian
durian: eggplant
To maintain the original order:
awk -F ': *' '!($1 in a){b[++n]=$1} {a[$1] = (a[$1]? a[$1] OFS $2 : $2)}
END{for (i=1; i<=n; i++) print b[i] ": " a[b[i]]}' file
Output:
apple: banana cantaloupe durian eggplant
banana: cantaloupe durian
durian: eggplant
eggplant:
Upvotes: 2