R: Merge data conditional on cells

Question

I have a data frame df1 that has a main variable main_v and a variable that sometimes has an additional comment in it, additional_v:

df1:
    main_v additional_v
    city1
    city2 200 sq mi
    city3 100 inhabitants
    city2 10 mio inhabitants
    city4
    city1
    city4
    city1 10 sq mi

I want to do the following: merge the same main_v entries whenever additional_v is empty and keep one entry. When additional_v is not empty, keep each instance of main_v. The individual additional_v entries should not be combined for every main_v but kept as separate entries.

The resulting df2 should look something like this:

df2:
    main_v additional_v
    city1
    city1 10 sq mi
    city2 200 sq mi
    city2 10 mio inhabitants
    city3 100 inhabitants
    city4

I don't know how to approach this problem. Any help would be appreciated. I don't have a preference for particular packages.

akrun · Accepted Answer

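We could use distinct from dplyr, which drops rows that are duplicated across all columns. Rows with an empty additional_v then collapse to one per main_v, while rows with different comments stay separate. Note that identical non-empty comments for the same main_v would be collapsed as well.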

library(dplyr)
distinct(df1) %>%
    arrange(main_v)

Or with unique from base R (this keeps the original row order, so sort by main_v afterwards if the order matters)

unique(df1)

data

df1 <- structure(list(main_v = c("city1", "city2", "city3", "city2", 
"city4", "city1", "city4", "city1"), additional_v = c("", "200 sq mi", 
"100 inhabitants", "10 mio inhabitants", "", "", "", "10 sq mi"
)), class = "data.frame", row.names = c(NA, -8L))
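
If repeated non-empty comments should be kept as separate rows (one reading of "keep each instance"), a minimal base R sketch could deduplicate only the rows with an empty additional_v. The data frame df_demo and the names is_empty, keep and df2 below are made up for illustration, and treating NA or whitespace-only strings as empty is an assumption, not something stated in the question.

```r
# Hypothetical example data: "city2" has the same comment twice
df_demo <- data.frame(
  main_v       = c("city1", "city2", "city2", "city1", "city2"),
  additional_v = c("", "200 sq mi", "200 sq mi", "", ""),
  stringsAsFactors = FALSE
)

# Assumption: NA and whitespace-only strings count as "empty"
is_empty <- is.na(df_demo$additional_v) | trimws(df_demo$additional_v) == ""

# Keep every row that has a comment; for empty rows keep only the first
# occurrence of each (main_v, additional_v) combination
keep <- !is_empty | !duplicated(df_demo)

df2 <- df_demo[keep, ]
# order() leaves ties in their original order, so rows within a city stay put
df2 <- df2[order(df2$main_v), ]
rownames(df2) <- NULL
df2
```

On this data, unique(df_demo) would merge the two "200 sq mi" rows, while the sketch above keeps both. For the df1 in the question, both approaches return the same rows, because no non-empty comment repeats.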
