Reputation: 280
Columns should contain only numeric or alphanumeric values (i.e. an acronym followed by numbers). However, several of these columns contain symbols and dates that shouldn't be there. I want an output in which every value that is not numeric or alphanumeric is flagged as '1'.
For example:
1234 0
ABC1234 0
# 1
12/13/17 1
$ 1
ABC 1
I am looking for code that isn't specific to the example data above, but is general enough to apply to several columns.
Edited: Clarification
Upvotes: 0
Views: 44
Reputation: 887851
We can use grepl to create a logical vector, which the unary + coerces to binary (0/1):
+(grepl("[^[:alnum:]]", v1))
#[1] 0 0 1 1 1
If a value must be zero or more letters followed by digits (this also flags letters-only values such as ABC):
+(!grepl("^[A-Za-z]*\\d+$", v1))
#[1] 0 0 1 1 1
If this is to check every column (note that this overwrites the values in df1 with the flags):
df1[] <- lapply(df1, function(x) +(grepl("[^[:alnum:]]", x)))
If the intention is to find whether any value in a column is not alphanumeric:
v2 <- sapply(df1, function(x) any(grepl("[^[:alnum:]]", x)))
The example vector used above:
v1 <- c(1234, "ABC1234", "#", "12/13/17", "$")
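As a self-contained illustration of the column-wise approach, here is a minimal sketch that applies the stricter "letters followed by digits" pattern to every column and stores the result in new flag columns instead of overwriting the data. The data frame df1, its column names (id, code), its values, and the _flag suffix are all made up for this example and are not from the question.

```r
# Hypothetical example data: column names and values are assumptions for illustration
df1 <- data.frame(
  id   = c("1234", "ABC1234", "#", "12/13/17", "$", "ABC"),
  code = c("XY99", "77", "A-1", "B2", "2017", "??"),
  stringsAsFactors = FALSE
)

# A value is treated as valid if it is zero or more letters followed by at least one digit
pattern <- "^[A-Za-z]*[0-9]+$"

# One 0/1 flag vector per column: 1 = value does not match the pattern
flags <- lapply(df1, function(x) +(!grepl(pattern, x)))
names(flags) <- paste0(names(df1), "_flag")

# Keep the original columns and append the flag columns
out <- cbind(df1, as.data.frame(flags))
out
```

Because grepl returns FALSE for NA, missing values would also be flagged as 1 with this negated pattern; if that is not wanted, they need to be handled separately.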
Upvotes: 2