Reputation: 866
The table below has two columns, A and B, which I want to compare. If the value in A doesn't match the value in B, I keep the unique ID binding the two so I can track mismatches.
However, the problem with this approach is that R is case sensitive by default. Is there a way to make this particular code ignore case?
Input Data
data <- read.table(header = TRUE, text = "A ID B
mA 100 MA
ab 101 ab
Ca 102 Ca
KaK 103 KAK")
A ID B
mA 100 MA
ab 101 ab
Ca 102 Ca
KaK 103 KAK
Code To Compare
output <- as.data.frame(data$ID[as.character(data$A) != as.character(data$B)])
Output
ID
100
103
With a case-insensitive comparison, the output would be an empty data frame, since every row would match.
Upvotes: 1
Views: 1774
Reputation: 3639
Two other approaches
library(tidyverse)
library(stringr)
my_data <- tribble(~A, ~ID, ~B,
'mA', 100, 'MA',
'ab', 101, 'ab',
'Ca', 102, 'Ca',
'KaK', 103, 'KAK',
'AA', 104, 'BB',
'cd', 105, 'cd',
'aa', 106, 'bb')
# returns a vector of the IDs where A matches B, ignoring case
my_data$ID[str_detect(my_data$A, regex(my_data$B, ignore_case = TRUE))]
#[1] 100 101 102 103 105
# Processing and returning a tibble
my_data %>%
filter(str_detect(A, regex(B, ignore_case = TRUE))) %>%
select(ID)
## A tibble: 5 x 1
# ID
# <dbl>
# 1 100
# 2 101
# 3 102
# 4 103
# 5 105
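Note that str_detect() with regex(B, ignore_case = TRUE) checks whether the pattern in B occurs anywhere inside A, and the calls above return the matching IDs rather than the mismatches. If whole-string equality is what is needed, a minimal base R sketch could look like the following. It assumes the same rows as my_data, rebuilt here as a plain data frame named my_df (a name made up for this example) so the snippet runs on its own.

```r
# Same rows as my_data above, rebuilt as a base data frame (hypothetical name my_df)
my_df <- data.frame(
  A  = c("mA", "ab", "Ca", "KaK", "AA", "cd", "aa"),
  ID = c(100, 101, 102, 103, 104, 105, 106),
  B  = c("MA", "ab", "Ca", "KAK", "BB", "cd", "bb"),
  stringsAsFactors = FALSE
)

# Pattern matching is a containment test: "ab" is found inside "CRAB"
contains <- grepl("ab", "CRAB", ignore.case = TRUE)

# Whole-string, case-insensitive comparison: TRUE where A and B differ
is_mismatch <- tolower(my_df$A) != tolower(my_df$B)

# IDs of the rows that differ even when case is ignored
mismatch_ids <- my_df$ID[is_mismatch]
mismatch_ids
```

Comparing lower-cased copies also avoids surprises when B contains characters that have a special meaning in a regular expression.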
Upvotes: 1
Reputation:
Sorry, I cannot comment, but there are a couple of ways: use grep with ignore.case = TRUE, or wrap both columns in toupper() or tolower().
Ok, got a laptop:
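dat <- as.data.frame(data)     # assumes `data` is the data frame from the question
dat[] <- lapply(dat, toupper)  # upper-case every column (all become character)
# Row-wise comparison; use != instead of == to return the mismatches
data.frame(ID = dat$ID[dat$A == dat$B])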
ID
1 100
2 101
3 102
4 103
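The grep idea mentioned above could be sketched roughly as follows. This is only an illustration: same_ignoring_case is a made-up helper name, the row with ID 104 is borrowed from the other answer so that there is one real mismatch, and the pattern is built from B on the assumption that B contains no regex metacharacters.

```r
# Question data plus one mismatching row (104), added for illustration
data <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "A ID B
mA 100 MA
ab 101 ab
Ca 102 Ca
KaK 103 KAK
AA 104 BB")

# Hypothetical helper: TRUE when `a` equals `b` ignoring case.
# Anchoring with ^ and $ makes grepl() test the whole string, not a substring.
same_ignoring_case <- function(a, b) {
  grepl(paste0("^", b, "$"), a, ignore.case = TRUE)
}

# grepl() is not vectorised over patterns, so apply the helper row by row
is_match <- mapply(same_ignoring_case, data$A, data$B, USE.NAMES = FALSE)

# Keep the IDs of the rows that do not match
mismatches <- data.frame(ID = data$ID[!is_match])
mismatches
```

For plain equality checks, the toupper()/tolower() route is usually simpler, since it needs neither anchoring nor escaping.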
Upvotes: 1
Reputation: 11140
Here's one way: change the case of both columns to either upper (toupper) or lower (tolower). Also note the correct way to subset below. You also need to add drop = FALSE when subsetting a single column to keep the data frame structure.
data[tolower(data$A) != tolower(data$B), "ID", drop = FALSE]
[1] ID
<0 rows> (or 0-length row.names)
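To see what drop = FALSE changes, here is a small sketch on the question's data. The names keep, as_vector, and as_frame are made up for this example, and the matching rows are selected instead of the mismatches so that there is something to look at.

```r
data <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "A ID B
mA 100 MA
ab 101 ab
Ca 102 Ca
KaK 103 KAK")

# TRUE for rows where A and B agree once case is ignored (all four rows here)
keep <- tolower(data$A) == tolower(data$B)

as_vector <- data[keep, "ID"]                # single column collapses to a vector
as_frame  <- data[keep, "ID", drop = FALSE]  # stays a one-column data frame

class(as_vector)
class(as_frame)
```

Without drop = FALSE, the column name ID is lost and the result is a bare vector, which is why the empty result above still prints with an ID header.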
Upvotes: 3