Reputation: 87
I have animal tracking data where each animal was encountered over time and its sex was recorded at each encounter. There are three types of encounters (type1, type2, and type3). Each row represents an animal, and each encounter is classified as M (male) or F (female). Each character in a type string represents one encounter (e.g. MMMM is an animal seen four times and recorded as male each time).
Sample data:
animal.ID  type1      type2  type3
1          MMMMMMM    M      M
2          MFMM       M      M
3          FFM        F      F
4          FFFFFFFFF  F      F
5          MM         M      M
I want to know if the sex (male or female) was recorded consistently for each animal.
I want to produce something like this, where a column indicates whether sex was recorded consistently (1) or not (0).
animal.ID  type1      type2  type3  consistent
1          MMMMMMM    M      M      1
2          MFMM       M      M      0
3          FFM        F      F      0
4          FFFFFFFFF  F      F      1
5          MM         M      M      1
I can use if_else to get the 'consistent' column for the type2 and type3 data:
df %>%
  mutate(consistent = if_else(type2 == type3, 1, 0))
But I can't include the type1 data, since each string has multiple characters and the number of characters differs between strings.
One approach could be to use str_split to split type1 into multiple columns, but I don't know how to do that given the different number of characters in each string.
Upvotes: 1
Views: 110
Reputation: 8880
Another solution, using the same logic as @Ronak Shah's answer:
library(tidyverse)
df %>%
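  # Combine all type columns into one string per animal, keeping the originals
  unite("all_type", starts_with("type"), sep = "", remove = FALSE) %>%
  # map_int returns an integer vector (plain map would create a list column)
  mutate(consistent = map_int(strsplit(all_type, ""), ~ +(n_distinct(.x) == 1)))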
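For reference, the same idea can be sketched in base R on the sample data from the question. This is only an illustration: the data frame is rebuilt here so the snippet runs on its own, and the names all_type and consistent simply mirror the ones used above.

```r
# Sample data from the question, rebuilt so the snippet is self-contained
df <- data.frame(
  animal.ID = 1:5,
  type1 = c("MMMMMMM", "MFMM", "FFM", "FFFFFFFFF", "MM"),
  type2 = c("M", "M", "F", "F", "M"),
  type3 = c("M", "M", "F", "F", "M"),
  stringsAsFactors = FALSE
)

# Concatenate the three encounter columns into one string per animal
all_type <- paste0(df$type1, df$type2, df$type3)

# Split each string into single characters and flag rows that
# contain exactly one distinct character
df$consistent <- vapply(
  strsplit(all_type, ""),
  function(chars) as.integer(length(unique(chars)) == 1L),
  integer(1)
)

df$consistent
# [1] 1 0 0 1 1
```

Because type2 and type3 are pasted onto type1 before splitting, a disagreement between any of the three columns marks the animal as inconsistent.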
Upvotes: 0
Reputation: 388982
We can use charToRaw to get the "raw" representation of type1 and assign 1 if all the bytes are the same.
df$consistent <- +(sapply(df$type1, function(x) length(unique(charToRaw(x)))) == 1)
Using dplyr, we can apply the same logic:
library(dplyr)
df %>%
rowwise() %>%
mutate(consistent = +(n_distinct(charToRaw(type1)) == 1))
# animal.ID type1 type2 type3 consistent
# <int> <chr> <chr> <chr> <int>
#1 1 MMMMMMM M M 1
#2 2 MFMM M M 0
#3 3 FFM F F 0
#4 4 FFFFFFFFF F F 1
#5 5 MM M M 1
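Note that the snippets above inspect type1 only; with the sample data this happens to give the desired result because type2 and type3 agree with type1 whenever type1 is uniform. If that cannot be assumed, one possible adjustment is to paste the three columns together before calling charToRaw. The values below are a hypothetical animal that is not in the question's data, used only to show the difference.

```r
# Hypothetical animal: type1 is internally uniform but disagrees with type2/type3
type1 <- "MM"
type2 <- "F"
type3 <- "F"

# Check on type1 alone: one distinct byte, so this evaluates to 1
type1_only <- +(length(unique(charToRaw(type1))) == 1)

# Check on all three columns pasted together: two distinct bytes, so this evaluates to 0
all_types <- +(length(unique(charToRaw(paste0(type1, type2, type3)))) == 1)

c(type1_only = type1_only, all_types = all_types)
```

In the dplyr version, the equivalent change would be to pass paste0(type1, type2, type3) to charToRaw instead of type1.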
data
df <- structure(list(animal.ID = 1:5, type1 = c("MMMMMMM", "MFMM",
"FFM", "FFFFFFFFF", "MM"), type2 = c("M", "M", "F", "F", "M"),
type3 = c("M", "M", "F", "F", "M")), class = "data.frame", row.names = c(NA, -5L))
Upvotes: 1
Reputation: 30474
One approach may be to use strsplit and unlist, checking that all characters are equal to type2 (in addition to checking that type2 equals type3).
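library(dplyr)
df %>%
  rowwise() %>%
  # A row is consistent only if type2 matches type3 and every character of type1 matches type2
  mutate(consistent = ifelse(type2 == type3 & all(unlist(strsplit(type1, "")) == type2), 1, 0))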
Output
# A tibble: 5 x 5
animal.ID type1 type2 type3 consistent
<int> <chr> <chr> <chr> <dbl>
1 1 MMMMMMM M M 1
2 2 MFMM M M 0
3 3 FFM F F 0
4 4 FFFFFFFFF F F 1
5 5 MM M M 1
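As a side note, the same check can also be written without rowwise. The sketch below swaps in a different technique, a regular expression with a backreference, rather than the strsplit approach above. It assumes the three columns contain no missing values, and the helper name all_type is made up for this example.

```r
# Sample data from the question, rebuilt so the snippet is self-contained
df <- data.frame(
  animal.ID = 1:5,
  type1 = c("MMMMMMM", "MFMM", "FFM", "FFFFFFFFF", "MM"),
  type2 = c("M", "M", "F", "F", "M"),
  type3 = c("M", "M", "F", "F", "M"),
  stringsAsFactors = FALSE
)

# One string per animal holding every recorded sex across the three types
all_type <- paste0(df$type1, df$type2, df$type3)

# "^(.)\\1*$" matches a string made of a single character repeated;
# perl = TRUE selects the PCRE engine, which supports the \\1 backreference
df$consistent <- as.integer(grepl("^(.)\\1*$", all_type, perl = TRUE))

df$consistent
# [1] 1 0 0 1 1
```

Since grepl is vectorized over all_type, no per-row grouping is needed.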
Upvotes: 3