everestial

Reputation: 7255

How to print the length of the longest item in a field using awk?

I have this input:

1 happy,t,c,d
2 t,c
3 e,fgh,k
4 yk,j,f
5 leti,j,f,g

I want to print the length of each item in the 2nd column (with comma as the delimiter), which should yield:

 1 5,1,1,1
 2 1,1
 3 1,3,1
 4 2,1,1
 5 4,1,1,1

Then I want to select the maximum value in the 2nd column, finally producing:

 1 5
 2 1
 3 3
 4 2
 5 4

How can I do this in awk?

1) For the first task I have tried:

awk -v col=$2 -F',' '{OFS=","; for(i=1; i<=NF; i++) print length($i);}' test.data.txt

This doesn't produce the correct output:

7
1
1
1
3
1
3
3
1
4
1
1
6
1
1
1

The problem is that I can't work out how to use the -v option to read only that column. As a result, all the values end up in a single column, and the first length on each line also counts the characters of column 1 and the space between column 1 and column 2.

2) To select the max value, I am doing:

awk -F',' '{OFS="\t"; m=length($1); for(i=1; i<=NF; i++) if (length($i) > m) m=length($i); print m}' test.data.txt

This mostly works, but because the 1st column is still present, its characters (and the following space) are counted in the first item's length, giving me:

7
3
3
4
6

instead of:

5
1
3
2
4

Lastly, I want to merge these two processes in one go. Any suggestions on improvements?

Upvotes: 0

Views: 1123

Answers (4)

silentstorm29

Reputation: 1

Let's assume I have the following file:

abc             14            10     lsjhmehrofer
adlcwd          23            124    cerklfelkfv
sjxhkj          34            868    tguyjggt 
vergrtbhretshrt 23            24335  gdrvhtyfjrbhvdgthter

You can use: awk '{ print length(), NR, $0 | "sort -rn | head -1 " }' abc.txt

priyankauser ~ % awk '{ print length(), NR, $0 | "sort -rn | head -2 " }' abc.txt
57 4 vergrtbhretshrt 23            24335  gdrvhtyfjrbhvdgthter
49 1 abc             14            10     lsjhmehrofer
priyankauser ~ % awk '{ print length(), NR, $0 | "sort -rn | head -1 " }' abc.txt
57 4 vergrtbhretshrt 23            24335  gdrvhtyfjrbhvdgthter

Here 4 is the number of the line that has the maximum length:

57 4 vergrtbhretshrt 23 24335 gdrvhtyfjrbhvdgthter
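
If only the single longest line is needed, the external sort can probably be avoided by tracking the maximum inside awk. This is a minimal sketch, not a drop-in replacement: the sample input is made up for illustration, and ties are assumed to go to the first line with the maximum length.

```shell
# Hypothetical sample input (not the file from this answer).
# Remember the longest line seen so far and report it at the end.
longest=$(printf '%s\n' 'abc 14' 'vergrtbhretshrt 23 24335' 'sjxhkj 34' |
    awk 'length($0) > max { max = length($0); n = NR; line = $0 }
         END { print max, n, line }')
printf '%s\n' "$longest"
```

The output has the same three parts as above: the length, the line number, and the line itself.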

Upvotes: 0

George Vasiliou

Reputation: 6345

awk -F'[, ]' -v OFS="," '{m=length($2);for (i=3;i<=NF;i++) if (length($i) > m) m=length($i)}{print $1,m}' file
1,5
2,1
3,3
4,2
5,4

For the first case:

awk -F'[, ]' -v OFS="," '{printf "%s",$1;for (i=2;i<=NF;i++) printf "%s%s",(i==2?" ":OFS),length($i)}{print ""}' file
1 5,1,1,1
2 1,1
3 1,3,1
4 2,1,1
5 4,1,1,1

Shorter alternative:

awk -F'[, ]' -v OFS="," '{printf "%s ",$1;for (i=2;i<=NF;i++) printf "%s%s",length($i),(i==NF?ORS:OFS)}' file

While print in awk outputs its arguments and then ends the line by appending a newline, printf does not add a newline on its own.
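
A minimal illustration of that difference, using made-up two-line input rather than the question's data. The variable names are only for the demo:

```shell
# print ends each record with ORS (a newline by default).
with_print=$(printf 'a\nb\n' | awk '{ print $1 }')

# printf emits only what the format string says, so both records
# land on one line until END prints the newline explicitly.
with_printf=$(printf 'a\nb\n' | awk '{ printf "%s", $1 } END { print "" }')

printf '%s\n' "$with_print" "$with_printf"
```

The first command yields a and b on separate lines; the second yields ab on a single line.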

PS: Thanks Ed Morton for the valuable comment.

Upvotes: 4

glenn jackman

Reputation: 247210

Trying to golf this down:

gawk -F'[ ,]' '{m=0;for(i=2;i<=NF;i++){l=length($i);if(l>m)m=l}print$1,m}'

perl -MList::Util=max -F'\s+|,' -lne'$,=" ";print shift(@F),max map{length}@F'
perl -MList::Util=max -F'\s+|,' -lne'print"@{[shift(@F),max map{length}@F]}"'
perl -MList::Util=max -F'\s+|,' -lpe'$_="@{[shift(@F),max map{length}@F]}"'    

ruby -F'[ ,]' -lape'$_="#{$F[0]} #{$F[1..-1].map{|e|e.size}.max}"'

Upvotes: 2

John1024

Reputation: 113994

We start with this data file:

$ cat data
1 happy,t,c,d
2 t,c
3 e,fgh,k
4 yk,j,f
5 leti,j,f,g

For the first task:

$ awk '{n=split($2,a,/,/); printf "%2s %s",$1,length(a[1]); for(i=2; i<=n; i++) printf ",%s",length(a[i]); print""}' data
 1 5,1,1,1
 2 1,1
 3 1,3,1
 4 2,1,1
 5 4,1,1,1

For the second task:

$ awk '{n=split($2,a,/,/); max=length(a[1]); for(i=2; i<=n; i++) if (length(a[i])>max)max=length(a[i]); print $1,max}' data
1 5
2 1
3 3
4 2
5 4

How it works

For the second task:

  • n=split($2,a,/,/)

    We split the contents of field 2 on commas into array a; n is the number of elements.

  • max=length(a[1])

    We assign the length of the first element of array a to the awk variable max.

  • for(i=2; i<=n; i++) if (length(a[i])>max)max=length(a[i])

    If any subsequent element of array a is longer than max, we update max.

  • print $1,max

    We print the first field and the value of max.
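
To merge the two tasks in one go, as the question asks, the same split-based approach can be extended. This is a minimal sketch, assuming the layout from the question (an ID, a space, then a comma-separated list). The three-column output format and the variable names lens, len and combined are my own choices, not something the question specifies:

```shell
# Sample input from the question, piped straight into awk.
combined=$(printf '%s\n' '1 happy,t,c,d' '2 t,c' '3 e,fgh,k' '4 yk,j,f' '5 leti,j,f,g' |
    awk '{
        n = split($2, a, /,/)          # items of field 2
        lens = ""; max = 0
        for (i = 1; i <= n; i++) {
            len = length(a[i])
            lens = lens (i > 1 ? "," : "") len   # build "5,1,1,1"
            if (len > max) max = len             # track the maximum
        }
        print $1, lens, max
    }')
printf '%s\n' "$combined"
```

For the first input line this prints 1 5,1,1,1 5: the ID, the list of lengths, and the maximum length.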

Upvotes: 3
