Bash sort tab delimited rows based on specific column with most values delimited by comma

Question

I have tab-delimited rows like this:

rs6605071   chr1:962943 XM_017002478.2  stuff1,stuff2                           morestuff
rs6605071   chr1:962943 XM_017002479.1  stuff1,stuff2,stuff3,stuff4,stuff5      morestuff
rs6605071   chr1:962943 XR_001737138.1  stuff1,stuff2,stuff3                    morestuff
rs6605071   chr1:962943 XR_001737478.1  stuff1,stuff2,stuff3,stuff4             morestuff
rs6605071   chr1:962943 NC_426604.3     stuff1                                  morestuff
rs6605071   chr1:962943 NC_426605.3     stuff1                                  morestuff

I would like to sort the rows by the 4th column, from the most values to the fewest, to get this output:

rs6605071   chr1:962943 XM_017002479.1  stuff1,stuff2,stuff3,stuff4,stuff5      morestuff
rs6605071   chr1:962943 XR_001737478.1  stuff1,stuff2,stuff3,stuff4             morestuff
rs6605071   chr1:962943 XM_017002478.2  stuff1,stuff2                           morestuff
rs6605071   chr1:962943 NC_426604.3     stuff1                                  morestuff
rs6605071   chr1:962943 NC_426605.3     stuff1                                  morestuff

What is the best approach to achieve this in bash?

Edit 1: Column 4 shouldn't be sorted alphabetically. It has to be sorted by the number of comma-delimited values it contains.

Thank you in advance
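
One possible way to get the ordering described in Edit 1 is a decorate-sort-undecorate pipeline: prepend the number of comma-separated values in column 4 to each row, sort numerically on that count in descending order, then strip the helper columns again. This is only a sketch under assumptions: the function name sort_by_col4_count and the temporary sample file are made up for illustration, the input is assumed to be truly tab-delimited, and rows with the same count are kept in their original order by also sorting on the input line number.

```shell
#!/usr/bin/env bash
# Sketch: sort tab-delimited rows by how many comma-separated values column 4 holds.

sort_by_col4_count() {   # hypothetical helper name, not an existing tool
  awk -F'\t' -v OFS='\t' '{
        n = split($4, parts, ",")   # n = number of comma-separated values in column 4
        print n, NR, $0             # decorate: count, input line number, original row
      }' "$@" |
    sort -t "$(printf '\t')" -k1,1nr -k2,2n |  # count descending, ties in input order
    cut -f3-                                    # undecorate: drop the two helper columns
}

# Sample rows from the question, rewritten with real tab characters
# (printf reuses the format string for every group of five arguments).
sample=$(mktemp)
printf '%s\t%s\t%s\t%s\t%s\n' \
  rs6605071 chr1:962943 XM_017002478.2 stuff1,stuff2 morestuff \
  rs6605071 chr1:962943 XM_017002479.1 stuff1,stuff2,stuff3,stuff4,stuff5 morestuff \
  rs6605071 chr1:962943 XR_001737138.1 stuff1,stuff2,stuff3 morestuff \
  rs6605071 chr1:962943 XR_001737478.1 stuff1,stuff2,stuff3,stuff4 morestuff \
  rs6605071 chr1:962943 NC_426604.3 stuff1 morestuff \
  rs6605071 chr1:962943 NC_426605.3 stuff1 morestuff \
  > "$sample"

sorted=$(sort_by_col4_count "$sample")
printf '%s\n' "$sorted"
rm -f "$sample"
```

With the six sample rows, column 3 comes out in the order XM_017002479.1, XR_001737478.1, XR_001737138.1, XM_017002478.2, NC_426604.3, NC_426605.3. The sketch keeps every input row, so the three-value XR_001737138.1 row appears in third place even though the desired output listed in the question does not show it. On a real file the call would be sort_by_col4_count input.tsv, where input.tsv is a placeholder name.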

