Reputation: 345
I have a data frame df1 that has a main variable, main_v, and a variable that sometimes holds an additional comment, additional_v:
df1:
main_v  additional_v
city1
city2   200 sq mi
city3   100 inhabitants
city2   10 mio inhabitants
city4
city1
city4
city1   10 sq mi
I want to do the following: collapse repeated main_v entries into a single row whenever additional_v is empty. When additional_v is not empty, keep each instance of main_v. The individual additional_v entries should not be combined per main_v but kept as separate rows.
The resulting df2 should look something like this:
df2:
main_v  additional_v
city1
city1   10 sq mi
city2   200 sq mi
city2   10 mio inhabitants
city3   100 inhabitants
city4
I don't know how to approach this problem. Any help would be appreciated. I don't have a preference for particular packages.
Upvotes: 1
Views: 22
Reputation: 887891
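The desired output is just the set of unique rows (both columns considered together), sorted by main_v, so we could use distinct from dplyr: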
library(dplyr)
# keep one copy of each (main_v, additional_v) row, then sort by main_v
distinct(df1) %>%
  arrange(main_v)
Or with unique from base R (rows stay in order of first appearance rather than sorted):
unique(df1)
df1 <- structure(list(main_v = c("city1", "city2", "city3", "city2",
"city4", "city1", "city4", "city1"), additional_v = c("", "200 sq mi",
"100 inhabitants", "10 mio inhabitants", "", "", "", "10 sq mi"
)), class = "data.frame", row.names = c(NA, -8L))
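As a quick check, here is a minimal base R sketch that combines unique with order to reproduce the row order of the requested df2. It assumes, as in the data above, that missing comments are stored as empty strings; resetting the row names is an extra step added here for tidiness and is not part of the original answer.

```r
# Sample data, same values as in the question (empty string = no comment)
df1 <- data.frame(
  main_v = c("city1", "city2", "city3", "city2",
             "city4", "city1", "city4", "city1"),
  additional_v = c("", "200 sq mi", "100 inhabitants", "10 mio inhabitants",
                   "", "", "", "10 sq mi"),
  stringsAsFactors = FALSE
)

# Drop rows that are exact duplicates across both columns
df2 <- unique(df1)

# Sort by main_v only; order() leaves ties in their original order,
# so the comments within each city keep their relative position
df2 <- df2[order(df2$main_v), ]

# Assumption: the original row indices are not needed, so reset them
rownames(df2) <- NULL

df2
```

Because duplicates are detected on whole rows, two rows with the same main_v but different additional_v values are both kept, which is the behaviour asked for in the question.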
Upvotes: 1