Reputation: 4006
From a dataframe I want to subset all rows that contain some pattern like "A" or "36" or "1?2". I don't care which column matches the pattern, as long as there is a match somewhere in that row.
Dataframe:
aName bName pName call alleles logRatio strength
AX-11086564 F08_ADN103 2011-02-10_R10 AB CG 0.363371 10.184215
AX-11086564 A01_CD1919 2011-02-24_R11 BB GG -1.352707 9.54909
AX-11086564 B05_CD2920 2011-01-27_R6 AB CG -0.183802 9.766334
AX-11086564 D04_CD5950 2011-02-09_R9 AB CG 0.162586 10.165051
AX-11086564 D07_CD6025 2011-02-10_R10 AB CG -0.397097 9.940238
AX-11086564 B05_CD3630 2011-02-02_R7 AA CC 2.349906 9.153076
AX-11086564 D04_ADN103 2011-02-10_R2 BB GG -1.898088 9.872966
AX-11086564 A01_CD2588 2011-01-27_R5 BB GG -1.208094 9.239801
My actual data frame contains many columns, and I don't want to hard code their names. The patterns can be more complicated, so I want to use regular expressions.
Code to read in this dataframe in R:
data <- read.table(textConnection("
aName bName pName call alleles logRatio strength
AX-11086564 F08_ADN103 2011-02-10_R10 AB CG 0.363371 10.184215
AX-11086564 A01_CD1919 2011-02-24_R11 BB GG -1.352707 9.54909
AX-11086564 B05_CD2920 2011-01-27_R6 AB CG -0.183802 9.766334
AX-11086564 D04_CD5950 2011-02-09_R9 AB CG 0.162586 10.165051
AX-11086564 D07_CD6025 2011-02-10_R10 AB CG -0.397097 9.940238
AX-11086564 B05_CD3630 2011-02-02_R7 AA CC 2.349906 9.153076
AX-11086564 D04_ADN103 2011-02-10_R2 BB GG -1.898088 9.872966
AX-11086564 A01_CD2588 2011-01-27_R5 BB GG -1.208094 9.239801
"), header = TRUE)
Upvotes: 0
Views: 98
Reputation: 121568
Here I define a wrapper around grepl to search in a data.frame:
search_data_frame <- function(patt, data) {
  # indices of the rows in which at least one cell matches patt
  which(vapply(seq_len(nrow(data)),
               function(i) any(grepl(patt, data[i, ])), logical(1)))
}
Then you use it:
data[search_data_frame('36',data),]
aName bName pName call alleles logRatio strength
1 AX-11086564 F08_ADN103 2011-02-10_R10 AB CG 0.363371 10.184215
6 AX-11086564 B05_CD3630 2011-02-02_R7 AA CC 2.349906 9.153076
Note that I read your data using stringsAsFactors=FALSE;
otherwise you should coerce your factors to characters first.
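If the data was read with factors, one way to do that coercion is to convert every factor column up front. This is only a minimal sketch: the data frame `df` and its columns are made up for illustration and are not the data from the question.

```r
# Hypothetical example data with one factor column
df <- data.frame(id = factor(c("a36", "b12")), value = c(1.5, 2.5))

# Replace each factor column with its character labels; leave other columns alone.
# Assigning into df[] keeps the result a data.frame.
df[] <- lapply(df, function(col) if (is.factor(col)) as.character(col) else col)

str(df)
```

After this step the search function sees the labels themselves rather than the underlying integer codes of the factor.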
Upvotes: 2
Reputation: 30425
You can use grepl, apply and rowSums:
> rowSums(apply(data, 2, grepl, pattern = "A")) > 0
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> rowSums(apply(data, 2, grepl, pattern = "1?2")) > 0
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> rowSums(apply(data, 2, grepl, pattern = "36")) > 0
[1] TRUE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
> out <- rowSums(apply(data, 2, grepl, pattern = "36")) > 0
> data[out,]
aName bName pName call alleles logRatio strength
1 AX-11086564 F08_ADN103 2011-02-10_R10 AB CG 0.363371 10.184215
6 AX-11086564 B05_CD3630 2011-02-02_R7 AA CC 2.349906 9.153076
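In the session above, "1?2" matches every row because, as a regular expression, ? makes the preceding 1 optional, so the pattern matches any 2. If the literal text 1?2 is meant, a small sketch of two ways to ask for it (the vector `x` is made up for illustration):

```r
# Hypothetical test strings
x <- c("1?2", "0.12", "CG")

# As a regular expression, "?" makes the preceding "1" optional,
# so this matches any string that contains "2"
regex_hits <- grepl("1?2", x)

# fixed = TRUE treats the pattern as a literal string
literal_hits <- grepl("1?2", x, fixed = TRUE)

# Escaping the "?" keeps regex matching but makes that character literal
escaped_hits <- grepl("1\\?2", x)

print(regex_hits)
print(literal_hits)
print(escaped_hits)
```

The same pattern and arguments can be passed through apply, since any extra arguments are handed on to grepl.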
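Note that apply coerces the data frame to a character matrix (via as.matrix) before grepl sees it, so numeric columns are matched against their formatted text.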
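Because apply works on a matrix, one alternative is to run grepl on each column separately and combine the results. The following is a sketch under assumptions: `match_any_column` is a hypothetical helper name, and the small data frame `df` only borrows a few values from the question.

```r
# Hypothetical helper: TRUE for each row in which any column matches pattern
match_any_column <- function(df, pattern, ...) {
  # grepl() runs on each column separately, so no matrix coercion takes place;
  # each column is converted with as.character() first
  hits <- lapply(df, function(col) grepl(pattern, as.character(col), ...))
  # Combine the per-column logical vectors with element-wise OR
  Reduce("|", hits)
}

df <- data.frame(
  bName = c("F08_ADN103", "A01_CD1919", "B05_CD3630"),
  logRatio = c(0.363371, -1.352707, 2.349906),
  stringsAsFactors = FALSE
)

keep <- match_any_column(df, "36")
print(df[keep, ])
```

Here `keep` is a logical vector with one entry per row, so it can be used directly for subsetting, just like the rowSums result.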
Upvotes: 2