Reputation: 839

How to remove duplicate values in an individual column for multiple columns at once in R

Sample data

           sessionid             qf      Office
                12                3       LON1,LON2,LON1,SEA2,SEA3,SEA3,SEA3
                12                4       DEL2,DEL1,LON1,DEL1
                13                5       MAn1,LON1,DEL1,LON1

Here I want to remove the duplicate values in the "Office" column within each row.

Expected Output

            sessionid             qf      Office
                12                3       LON1,LON2,SEA2,SEA3
                12                4       DEL2,DEL1,LON1
                13                5       MAn1,LON1,DEL1

Upvotes: 0

Answers (2)

Michael Bird

Reputation: 783

Here is a base R way of doing it. It works as you'd expect: first split Office on the comma, remove the duplicates, then paste the pieces back together again.

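# Split each string on commas, keep the unique items, then paste them back together.
# as.character() guards against Office being stored as a factor.
df$Office <- sapply(lapply(strsplit(as.character(df$Office), ","), unique),
                    paste, collapse = ",")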

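Or with the %>% pipe from magrittr:

library(magrittr)  # provides %>%

df$Office <- df$Office %>%
  as.character() %>%             # in case Office is a factor
  strsplit(",") %>%              # one character vector per row
  lapply(unique) %>%             # drop repeated items within each row
  sapply(paste, collapse = ",")  # back to one string per row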
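
Since the title asks about several columns at once, here is a minimal sketch of how the same split / unique / paste idea could be applied to more than one column. The data frame `df` is rebuilt from the sample data in the question, and `dedupe_items` and `cols` are hypothetical names introduced only for this illustration.

```r
# Sample data from the question, rebuilt as a data frame (the name `df` is assumed)
df <- data.frame(
  sessionid = c(12L, 12L, 13L),
  qf        = c(3L, 4L, 5L),
  Office    = c("LON1,LON2,LON1,SEA2,SEA3,SEA3,SEA3",
                "DEL2,DEL1,LON1,DEL1",
                "MAn1,LON1,DEL1,LON1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: de-duplicate the separated items of a single string,
# keeping the first occurrence of each item
dedupe_items <- function(x, sep = ",") {
  paste(unique(strsplit(x, sep, fixed = TRUE)[[1]]), collapse = sep)
}

# Columns to clean; add further column names here if needed
cols <- c("Office")

# Apply the helper to every value of every listed column
df[cols] <- lapply(df[cols], function(col) {
  vapply(as.character(col), dedupe_items, character(1), USE.NAMES = FALSE)
})

df$Office
```

`vapply` is used instead of `sapply` here so the result is always a character vector with one element per row. Missing values are not handled specially in this sketch.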

Upvotes: 3

akrun

Reputation: 887501

We could use the tidyverse. Split 'Office' by the delimiter and expand to 'long' format, get the distinct rows, then group by 'sessionid' and 'qf' and paste the contents of 'Office' back together.

library(tidyverse)
separate_rows(df1, Office) %>%
  distinct() %>%
  group_by(sessionid, qf) %>%
  summarise(Office = toString(Office))
# A tibble: 3 x 3
# Groups:   sessionid [?]
#  sessionid    qf                 Office
#      <int> <int>                  <chr>
#1        12     3 LON1, LON2, SEA2, SEA3
#2        12     4       DEL2, DEL1, LON1
#3        13     5       MAn1, LON1, DEL1
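
Note that `toString()` joins the items with ", " (comma plus space), while the expected output in the question uses a bare comma. A possible variant, sketched here under the assumption that `df1` holds the sample data from the question and that dplyr 1.0.0 or later is installed (for the `.groups` argument), swaps in `paste(collapse = ",")`:

```r
library(dplyr)
library(tidyr)

# Sample data from the question (the name `df1` follows the code above)
df1 <- data.frame(
  sessionid = c(12L, 12L, 13L),
  qf        = c(3L, 4L, 5L),
  Office    = c("LON1,LON2,LON1,SEA2,SEA3,SEA3,SEA3",
                "DEL2,DEL1,LON1,DEL1",
                "MAn1,LON1,DEL1,LON1"),
  stringsAsFactors = FALSE
)

result <- df1 %>%
  separate_rows(Office, sep = ",") %>%   # one row per office code
  distinct() %>%                         # drop repeated (sessionid, qf, Office) rows
  group_by(sessionid, qf) %>%
  summarise(Office = paste(Office, collapse = ","),  # no space after the comma
            .groups = "drop")                        # return an ungrouped tibble

result$Office
```

This relies on `sessionid` and `qf` together identifying a row. If two original rows shared both values, they would be merged into one.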

Upvotes: 2
