Dinho
Dinho

Reputation: 724

How to reorder column values ascending order that are seperated by "," and only keep first value in R

I have a column in a df that consists of values like so:

ID
2
NA
1
3
4
5,7
9,6,10
12
15
16
17
NA
19
22,23

I would like to reorder every row based on ascending order. Note - this column is a "character" based field and some rows are already in the correct order.

From there, I only want to keep the first value and remove the others.

Desired output:

ID
2
NA
1
3
4
5
6
12
15
16
17
NA
19
22

Upvotes: 1

Views: 177

Answers (4)

TarJae
TarJae

Reputation: 78937

Here is another tidyverse solution: Making use of (dyplr, purrr, stringr and readr

library(tidyverse)
df %>% 
    mutate(ID = map_chr(str_split(ID, ","), ~ 
                           toString(sort(as.numeric(.x)))),
           ID = parse_number(ID))
)

output:

   ID
1   2
2  NA
3   1
4   3
5   4
6   5
7   6
8  12
9  15
10 16
11 17
12 NA
13 19
14 22

Upvotes: 4

akrun
akrun

Reputation: 887173

We could use str_extract

library(stringr)
library(dplyr)
df1 %>%
    mutate(ID = as.numeric(str_extract(ID, '\\d+')))

-output

   ID
1   2
2  NA
3   1
4   3
5   4
6   5
7   9
8  12
9  15
10 16
11 17
12 NA
13 19
14 22

data

df1 <- structure(list(ID = c("2", NA, "1", "3", "4", "5,7", "9,6,10", 
"12", "15", "16", "17", NA, "19", "22,23")), class = "data.frame", row.names = c(NA, 
-14L))

Upvotes: 1

jay.sf
jay.sf

Reputation: 72944

We may use the minimum instead of sorting / extracting:

DF <- transform(DF, ID=sapply(strsplit(ID, ','), \(x) min(as.double(x))))
DF
#    ID
# 1   2
# 2  NA
# 3   1
# 4   3
# 5   4
# 6   5
# 7   6
# 8  12
# 9  15
# 10 16
# 11 17
# 12 NA
# 13 19
# 14 22

Upvotes: 3

Ronak Shah
Ronak Shah

Reputation: 388982

You can split the data on comma, sort them and extract the 1st value.

df$ID <- sapply(strsplit(df$ID, ','), function(x) sort(as.numeric(x))[1])

#   ID
#1   2
#2  NA
#3   1
#4   3
#5   4
#6   5
#7   6
#8  12
#9  15
#10 16
#11 17
#12 NA
#13 19
#14 22

A couple of tidyverse alternatives.

library(tidyverse)

#1.
#Same as base R but in tidyverse
df %>% mutate(ID = map_dbl(str_split(ID, ','), ~sort(as.numeric(.x))[1]))

#2.
df %>%
  mutate(row = row_number()) %>%
  separate_rows(ID, sep = ',', convert = TRUE) %>%
  group_by(row) %>%
  summarise(ID = min(ID)) %>%
  select(-row)

Upvotes: 4

Related Questions