Aida

Reputation: 89

How to identify rows that are not starting with a certain pattern in R

I have a data table with a column of character strings, and I want to check whether every row starts with "99", "15", "16", or "17" and has 4 or 5 characters.

Any idea how to do this? I thought about using grep, but it doesn't really work.

Thanks

Aida

Upvotes: 3

Views: 763

Answers (1)

eipi10

Reputation: 93841

Assuming your data frame is called dat and x is the column of interest, the following returns all rows whose value starts with one of the desired prefixes and has 4 or 5 characters:

dat[grepl("^(15|16|17|99)", dat$x) & nchar(dat$x) %in% 4:5, ]

To return rows that do not meet the criteria:

dat[!(grepl("^(15|16|17|99)", dat$x) & nchar(dat$x) %in% 4:5), ]

In response to your comment:

1) To check whether all rows meet the criteria:

all(grepl("^(15|16|17|99)", dat$x) & nchar(dat$x) %in% 4:5)

2) To identify rows that do not meet the criteria:

which(!(grepl("^(15|16|17|99)", dat$x) & nchar(dat$x) %in% 4:5))
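To try the expressions above, here is a minimal sketch on made-up data. The values in dat are hypothetical and not taken from the question; the logical test is stored once in a helper vector, here called ok, and reused. It assumes x is a character column; if x is a factor, wrapping it in as.character() first should be needed, since nchar() expects character input.

```r
# Hypothetical sample data; `dat` and `x` follow the names used in the answer
dat <- data.frame(x = c("9912", "15123", "1234", "161", "170000", "17abc"),
                  stringsAsFactors = FALSE)

# TRUE where the value starts with 15, 16, 17 or 99 AND has 4 or 5 characters
ok <- grepl("^(15|16|17|99)", dat$x) & nchar(dat$x) %in% 4:5

all(ok)                    # do all rows meet the criteria?
which(!ok)                 # positions of the rows that do not
dat[!ok, , drop = FALSE]   # the offending rows themselves, kept as a data frame
```

With this sample, "1234" fails on the prefix, while "161" and "170000" fail on the length, so those three rows are the ones flagged.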

Upvotes: 2
