How to print all rows from column1 and only certain rows from other columns

Question

I have a file containing 3 columns and thousands of rows. Below is an example.

File.txt
Column1 column2 column3
173     banana   red
896     banana   red
567     apple    green
742     apple    green
893     apple    green
567     avocado  black
345     avocado  black

I need to print every value from column1, but print each column2/column3 combination only once.

I want this output:
Column1 column2 column3
173     banana   red
896              
567     apple    green
742     
893     
567     avocado  black
345

Even better would be output in the format below:

Banana-red: 173 896              
Apple-green: 567 742 893  
Avocado-black: 567 345

Ed Morton · Accepted Answer

$ awk 'NR>1{k=$2"-"$3; a[k]=a[k]" "$1} END{for (k in a) print k ":" a[k]}' file
apple-green: 567 742 893
banana-red: 173 896
avocado-black: 567 345

The rows will be output in random order courtesy of the "in" operator (the iteration order of "for (k in a)" is unspecified); the column1 values for each key will be in the order they occur in your input. If you really want the first letter of each key capitalized as in the expected output in your question:

$ awk 'NR>1{k=$2"-"$3; a[k]=a[k]" "$1} END{for (k in a) print toupper(substr(k,1,1)) substr(k,2) ":" a[k]}' file
Apple-green: 567 742 893
Banana-red: 173 896
Avocado-black: 567 345

and if you want the rows output in the order they occurred in the input:

$ awk 'NR>1{k=$2"-"$3; a[k]=a[k]" "$1; if (!seen[k]++) keys[++numKeys]=k} END{for (keyNr=1; keyNr<=numKeys; keyNr++) {k=keys[keyNr]; print toupper(substr(k,1,1)) substr(k,2) ":" a[k]} }' file
Banana-red: 173 896
Apple-green: 567 742 893
Avocado-black: 567 345
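
For readers who find the one-liner dense, here is the same order-preserving approach spread over several lines with comments. It is only a sketch: the sample data is copied from the question, while the temporary file and the shell variable names "input" and "result" are assumptions added for illustration and are not part of the original answer.

```shell
# Recreate the sample input from the question in a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
Column1 column2 column3
173     banana   red
896     banana   red
567     apple    green
742     apple    green
893     apple    green
567     avocado  black
345     avocado  black
EOF

result=$(awk '
NR > 1 {                        # skip the header line
    k = $2 "-" $3               # group key, e.g. "banana-red"
    a[k] = a[k] " " $1          # append the column1 value for this row
    if (!seen[k]++)             # true only the first time a key appears
        keys[++numKeys] = k     # record keys in order of first appearance
}
END {
    # Walk the keys by numeric index instead of "for (k in a)",
    # so the groups come out in input order.
    for (keyNr = 1; keyNr <= numKeys; keyNr++) {
        k = keys[keyNr]
        # Capitalize the first letter of the key, then print its values.
        print toupper(substr(k, 1, 1)) substr(k, 2) ":" a[k]
    }
}' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

The seen[] array is what keeps each key from being stored in keys[] more than once, and iterating over keys[] by number is what avoids the unspecified ordering of the "in" operator.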
