Reputation: 1251
I have a data set with columns containing first names and last names. I want to keep only those rows whose combination of first name and last name occurs more than once in the data.
For example, if the first name Peter and the last name Parker appear together in several rows, I want to keep those rows.
So far, I have tried:
library(dplyr)
dat %>%
filter(duplicated(as.numeric(`First name`)) & duplicated(as.numeric(`Last name`)))
However, the rows returned do not share the same first name and last name.
@arg0naut
dat %>%
filter(duplicated(paste0(`First name`, `Last name`)))
# A tibble: 5 x 2
`First name` `Last name`
<chr> <chr>
1 Frank Seehaus
2 Nadine Urseanu
3 Rudolf Schicker
4 Renate Kaymer
5 Brigitte Reibenspies
I want to see:
# A tibble: 5 x 2
`First name` `Last name`
<chr> <chr>
1 Peter Parker
2 Peter Parker
3 Peter Parker
...
Upvotes: 0
Views: 515
Reputation: 14764
You could try:
library(dplyr)
dat %>%
  filter(duplicated(paste0(`First name`, `Last name`)))
Output, based on the example data below:
First name Last name
1 Peter Parker
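One caveat with `paste0()`: because it joins the two columns without a separator, two different name pairs can collapse into the same string. The sketch below is only an illustration on hypothetical data (`dat2` and its names are made up for this purpose); it compares the pasted key with a row-wise check that passes both columns to `duplicated()` as a data frame.

```r
# Hypothetical data: "Ann" + "Abel" and "Anna" + "bel" both paste to "AnnAbel"
dat2 <- data.frame(
  `First name` = c("Ann", "Anna", "Peter", "Peter"),
  `Last name` = c("Abel", "bel", "Parker", "Parker"),
  check.names = FALSE
)

# Pasted key: row 2 is flagged although its names differ from row 1
dup_paste <- duplicated(paste0(dat2$`First name`, dat2$`Last name`))
dup_paste
# FALSE TRUE FALSE TRUE

# Row-wise comparison of both columns: only the repeated Peter Parker is flagged
dup_cols <- duplicated(dat2[, c("First name", "Last name")])
dup_cols
# FALSE FALSE FALSE TRUE

dat2[dup_cols, ]
```

If such collisions cannot occur in your data, the `paste0()` version is fine as it is.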
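The code above returns only the second and later occurrences of each name. If you'd like every occurrence of a duplicated name returned, including the first, you could do: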
dat %>%
  group_by(`First name`, `Last name`) %>%
  filter(n() > 1)
Output, based on the example data below:
# A tibble: 2 x 2
# Groups: First name, Last name [1]
`First name` `Last name`
<fct> <fct>
1 Peter Parker
2 Peter Parker
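For comparison, here is a rough base R sketch of the same "keep every copy" idea without grouping. It rebuilds the example data so it runs on its own; the helper names `key`, `is_dup` and `all_dups` are just illustrative.

```r
# Same example data as in this answer
dat <- data.frame(
  `First name` = c("Peter", "Peter", "John", "John"),
  `Last name` = c("Parker", "Parker", "Biscuit", "Chocolate"),
  check.names = FALSE
)

# Columns that together identify a person
key <- dat[, c("First name", "Last name")]

# duplicated() flags the 2nd, 3rd, ... copy of a row;
# with fromLast = TRUE it flags every copy except the last one.
# Combining both with | flags all copies of any repeated row.
is_dup <- duplicated(key) | duplicated(key, fromLast = TRUE)
is_dup
# TRUE TRUE FALSE FALSE

all_dups <- dat[is_dup, ]
all_dups
```

Unlike the `group_by()` version, this returns a plain data frame with no grouping attached, so there is no need to call `ungroup()` afterwards.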
Example data:
dat <-
  data.frame(
    `First name` = c("Peter", "Peter", "John", "John"),
    `Last name` = c("Parker", "Parker", "Biscuit", "Chocolate"),
    check.names = FALSE
  )
dat
First name Last name
1 Peter Parker
2 Peter Parker
3 John Biscuit
4 John Chocolate
Upvotes: 2