Daolin

Reputation: 634

R function to sort column by string length then by alphabet?

I would like to sort one column in my data frame by string length first and then alphabetically. I tried the code below:

#sort column by string length then alphabet
GSN[order(nchar(GSN[,3]),GSN[,3]),]

But I got this error:

Error in nchar(GSN[, 3]) : 'nchar()' requires a character vector

My data looks like:

    Flowcell Lane    barcode         sample         plate row column
314       NA   NA AACAGACATT   LD06_7620SDS GSN1_Hind384D   B      4
307       NA   NA  AACAGCACT   LG10_2688SDS GSN1_Hind384D   C      3
289       NA   NA     AACCTC  U09_105007SDS GSN1_Hind384D   A      1
232       NA   NA AACGACCACC         13_232 GSN1_Hind384C   H      5
10        NA   NA AACGCACATT          13_10 GSN1_Hind384A   B      2
165       NA   NA      AACGG         13_165 GSN1_Hind384B   E      9

I would like to sort the "barcode" column. Thanks for your time.

Upvotes: 4

Views: 3570

Answers (2)

dmt

Reputation: 2183

I wish to add a tidyverse solution:

library(tidyverse)

GSN_sorted =  GSN %>%
    mutate(barcode = as.character(barcode)) %>%
    arrange(str_length(barcode), barcode)

Note the factor-to-character conversion, originally pointed out by Alex A.

Upvotes: 3

Alex A.

Reputation: 5586

You can add another column to your data frame that contains the number of characters in the barcode, then sort in the usual way.

GSN <- transform(GSN, n=nchar(as.character(barcode)))

GSN[with(GSN, order(n, barcode)), ]

It appears that the error occurs because R stores barcode as a factor rather than a character vector, and nchar() does not accept factors. Converting it to character via as.character() solves this.
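
If you would rather not add a helper column, the same idea can be sketched in one step. This is only a minimal illustration: the small data frame below is a stand-in built from the barcodes in the question, and stringsAsFactors = TRUE is my assumption about how the factor column came about (for example, the pre-4.0 default of read.table or data.frame).

```r
# Stand-in for the question's data; stringsAsFactors = TRUE is an assumption
# used here to reproduce a factor barcode column.
GSN <- data.frame(
  barcode = c("AACAGACATT", "AACAGCACT", "AACCTC",
              "AACGACCACC", "AACGCACATT", "AACGG"),
  sample  = c("LD06_7620SDS", "LG10_2688SDS", "U09_105007SDS",
              "13_232", "13_10", "13_165"),
  stringsAsFactors = TRUE
)

# Convert once, then order by length first and by the string itself second.
bc <- as.character(GSN$barcode)
GSN_sorted <- GSN[order(nchar(bc), bc), ]

GSN_sorted$barcode
```

Here bc is just a temporary character copy of the column, so GSN itself keeps its original column types and gains no extra column.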

Upvotes: 3
