Reputation: 395
I have the following df
df <-
a b c
20 10 20€
20€ 10 20 Euro
I want to test whether the number 20 is part of each field. The result should therefore look like this:
[1]true [2]false [3]true
[4]true [5]false [6]true
I tried
grepl(df[3,3], 20)
grepl(df[3,3], "20")
Both return false.
Upvotes: 1
Views: 75
Reputation: 11128
You can use Vectorize here, or alternatively purrr::map_dfr:
Vectorize(grepl, vectorize.args = 'x')(pattern='20', df)
purrr::map_dfr(df, ~grepl('20', .x))
My solution is no better than the other answer (@r2evans has a more elegant one). If you want to match strictly 20, you can also use word boundaries, \\b20\\b, instead of just 20.
data:
df <- structure(list(a = c("20", "20€"), b = c("10", "10"), c = c("20€",
"20 Euro")), class = "data.frame", row.names = c(NA, -2L))
Output:
Vectorize(grepl, vectorize.args = 'x')(pattern='20', df)
        a     b    c
[1,] TRUE FALSE TRUE
[2,] TRUE FALSE TRUE
Upvotes: 0
Reputation: 160407
You said you wanted a matrix-like view of logicals. Brian's comment is correct: the pattern comes first. But you also need to account for the structure: grepl(ptn, some_data_frame) returns a vector with one value per column (an "all-or-nothing" result for each column), while grepl(ptn, some_matrix) returns a logical for every element of the matrix, albeit as a plain vector without the matrix's dimensions, which is easy to correct.
`dim<-`(grepl("20", as.matrix(df)), dim(df))
#      [,1]  [,2] [,3]
# [1,] TRUE FALSE TRUE
# [2,] TRUE FALSE TRUE
### or, more eye-friendly
out <- grepl("20", as.matrix(df))
dim(out) <- dim(df)
out
#      [,1]  [,2] [,3]
# [1,] TRUE FALSE TRUE
# [2,] TRUE FALSE TRUE
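If you also want to keep the column names, one possible variation is to rebuild the result with array(), copying both the dimensions and the dimnames from the character matrix. This is only a sketch: the data frame is retyped here from the question's example (an assumption about the exact values), and the names m and out are just illustrative.

```r
# Example data retyped from the question (assumed values)
df <- data.frame(a = c("20", "20€"),
                 b = c("10", "10"),
                 c = c("20€", "20 Euro"))

# grepl() on a matrix returns a plain logical vector in column-major order,
# so reshape it with the matrix's own dim and dimnames
m <- as.matrix(df)
out <- array(grepl("20", m), dim = dim(m), dimnames = dimnames(m))
out
```

Because the result carries column names, it can be indexed by name, for example out[, "a"].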
BTW: if you are looking for any number including "20", to include 120 and 200, then this is fine. If you want fields where the only number component is "20" (neither 120 nor 200 count), then you need "\\b20\\b"
as your pattern. (Thanks Andrew.)
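To see the difference between the two patterns, here is a small sketch on a made-up vector (the values in x are illustrative, not from the question). Note that \\b treats letters as word characters too, so a value written without a space, such as 20Euro, does not match "\\b20\\b". If letters directly next to the number should be allowed, a digit-only lookaround with perl = TRUE is one alternative; that third pattern is my own suggestion, not part of the answer.

```r
# Made-up test values (assumption, for illustration only)
x <- c("20", "120", "200", "20 Euro", "20Euro", "EUR 20")

# Plain substring match: every element contains "20"
plain <- grepl("20", x)
# TRUE TRUE TRUE TRUE TRUE TRUE

# Word boundaries: "20" must not touch another letter, digit, or underscore
bounded <- grepl("\\b20\\b", x)
# TRUE FALSE FALSE TRUE FALSE TRUE

# Digit-only guard: "20" must not touch another digit, letters are fine
# (lookarounds require perl = TRUE)
digits_only <- grepl("(?<!\\d)20(?!\\d)", x, perl = TRUE)
# TRUE FALSE FALSE TRUE TRUE TRUE
```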
Data:
df <- read.table(header=T, text="
a b c
20 10 20€
20€ 10 20Euro")
BTW: the reason that grepl("20", df)
returns a vector of length 3 (one for each column) is that internally it is converting the object to character. This explains why you only get three:
as.character(df)
# [1] "c(20, 20)" "c(10, 10)" "1:2"
Upvotes: 4