Reputation: 6776
I have a data.frame as shown below. I would like to list the cells that do not contain a single digit or letter (a-z), along with their frequencies. How can I do that? For the data below, I want a table whose first column contains * and . and whose second column shows the frequency of each value (1 and 2, respectively). "a*" and "21.9" should not appear because they contain at least one digit or letter.
sm <- matrix(c(51,".",22,"*","a*","21.9",".",22,9),ncol=3,byrow=TRUE)
smdf<-as.data.frame(sm)
Upvotes: 0
Views: 1224
Reputation: 19960
Does this provide what you are looking for?
library(plyr)
sm <- matrix(c(51,".",22,"*","a*","21.9",".",22,9),ncol=3,byrow=TRUE)
count(sm[!grepl("[[:alnum:]]", sm)])
x freq
1 * 1
2 . 2
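The question stores the data in a data.frame (smdf) rather than a matrix, so here is a minimal sketch of the same filter applied to it, using only base R. The names cells, symbols and freq are my own, and converting with as.matrix assumes you are happy to treat every cell as text.

```r
# Data from the question, stored as a data.frame
sm <- matrix(c(51, ".", 22, "*", "a*", "21.9", ".", 22, 9), ncol = 3, byrow = TRUE)
smdf <- as.data.frame(sm, stringsAsFactors = FALSE)

# Flatten every cell into a single character vector
cells <- as.character(as.matrix(smdf))

# Keep the cells that contain no letter or digit
symbols <- cells[!grepl("[[:alnum:]]", cells)]

# Tabulate with base R and name the columns like the plyr output
freq <- as.data.frame(table(symbols), stringsAsFactors = FALSE)
names(freq) <- c("x", "freq")
freq
```

The row order of freq follows the locale's sort order, so it may differ from the plyr output above.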
If you also want to exclude NA values and cells containing only a space, add the corresponding conditions to the filter. As a side note, I am fairly certain a more elegant regex could do this without the extra conditions, but my regex skills are a work in progress. I will update this if I figure one out.
sm <- matrix(c(51,".",22,"*","a*","21.9",".",22,9, " ", NA, 13),ncol=3,byrow=TRUE)
count(sm[!grepl("[[:alnum:]]", sm) & !is.na(sm) & sm != " "])
x freq
1 * 1
2 . 2
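One possible way to fold the extra conditions into a single pattern is to match positively on punctuation instead of negating [[:alnum:]]. This is only a sketch: it assumes that "symbol" means a cell made up entirely of POSIX punctuation characters, which is slightly narrower than "contains no letter or digit" (a cell such as "* ." with an embedded space would be dropped). The name is_symbol is my own.

```r
sm <- matrix(c(51, ".", 22, "*", "a*", "21.9", ".", 22, 9, " ", NA, 13), ncol = 3, byrow = TRUE)

# TRUE only for cells consisting entirely of punctuation characters.
# grepl() returns FALSE for NA, and a lone space is not punctuation,
# so neither needs a separate condition.
is_symbol <- grepl("^[[:punct:]]+$", sm)

# Base R frequency table of the matching cells
counts <- table(sm[is_symbol])
counts
```

Here table() stands in for plyr's count(); passing sm[is_symbol] to count() should work just as well if you prefer the data.frame output.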
However, if there is a specific set of characters you wish to count, you can build a vector of those characters and count only the cells that exactly match one of them. This does not require the extra space and NA conditions.
sm <- matrix(c(51,".",22,"*","a*","21.9",".",22,9, " ", NA, 13),ncol=3,byrow=TRUE)
x <- unlist(strsplit("*~!@#$%^&(){}_+:\"<>?,./;'[]-=", split=""))
count(sm[sm %in% x])
x freq
1 * 1
2 . 2
Upvotes: 2