Srikanth Kadithota

Reputation: 11

Extract the rows where a column contains a given string in R

My Excel file contains many columns with numeric, alphabetic, and alphanumeric values.

  Column1 Column2 column2
1       1    abcd     fm1
2       2    bcde     fm2
3       3    cdef     fm3
4       4    aced     fm4
5       5    cadf     fm5

I have imported the file into R:

data1 <- read.csv("Test1.csv")

Now I want to extract the rows where Column2 contains "cd". Reproducible data:

df <- structure(list(Column1 = 1:5, Column2 = c("abcd", "bcde", "cdef", 
    "aced", "cadf"), column2 = c("fm1", "fm2", "fm3", "fm4", "fm5"
    )), .Names = c("Column1", "Column2", "column2"), class = "data.frame", row.names = c(NA, 
    -5L))

Upvotes: 1

Views: 433

Answers (3)

Steven

Reputation: 3292

Before seeing the answer provided by @akrun above, I put this together:

    #Data
    dF <- structure(list(Column1 = 1:5, Column2 = c("abcd", "bcde", "cdef", 
                    "aced", "cadf"), column2 = c("fm1", "fm2", "fm3", "fm4", "fm5"
                    )), .Names = c("Column1", "Column2", "column2"), class = "data.frame",
                    row.names = c(NA, -5L))

    #Find the indices of rows with the string "cd" in Column2 of the data frame 'dF'
    rows <- grep("cd", dF$Column2, ignore.case = FALSE)
    #Display those rows
    dF[rows, ]

The grep() function and its cousins, such as grepl() and sub(), are simple to use and, once you get the hang of regular expressions, very powerful.
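As a hedged variation on the snippet above: grepl() returns a logical vector instead of row indices, and fixed = TRUE treats the pattern as literal text rather than a regular expression. The names `hits` and `result` are only illustrative.

```r
# Same sample data as in the question
dF <- data.frame(
  Column1 = 1:5,
  Column2 = c("abcd", "bcde", "cdef", "aced", "cadf"),
  column2 = c("fm1", "fm2", "fm3", "fm4", "fm5"),
  stringsAsFactors = FALSE
)

# TRUE for every row whose Column2 contains the literal text "cd"
hits <- grepl("cd", dF$Column2, fixed = TRUE)

# Keep only the matching rows
result <- dF[hits, ]
```

A logical vector is also safer to negate: `dF[!hits, ]` returns the non-matching rows even when nothing matches, whereas `dF[-grep(...), ]` returns zero rows in that case.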

Upvotes: 0

Tyler Rinker

Reputation: 109864

The Search function in qdap, a package I maintain, makes this task pretty easy:

library(qdap)
Search(df, "cd", 2, 0)

##   Column1 Column2 column2
## 1       1    abcd     fm1
## 2       2    bcde     fm2
## 3       3    cdef     fm3

The first argument is the data frame and the second is the search term. The optional third argument is the column name or number, and the fourth is the string distance. The function defaults to fuzzy matching; using 0 makes it match exactly.
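To illustrate what a string-distance cutoff does, here is a minimal sketch using base R's agrepl(). This is an assumption chosen for illustration only; it is not a statement about how Search is implemented inside qdap.

```r
x <- c("abcd", "bcde", "cdef", "aced", "cadf")

# A distance of 0 allows no edits, so only strings containing "cd" match
exact <- agrepl("cd", x, max.distance = 0)

# A distance of 1 tolerates one insertion, deletion, or substitution,
# so near misses can match as well
fuzzy <- agrepl("cd", x, max.distance = 1L)

# Rows an exact search would keep
x[exact]
```

Comparing `x[exact]` with `x[fuzzy]` shows how much a non-zero distance loosens the search.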

Upvotes: 1

akrun

Reputation: 887108

You could use grep to get the indices of the matching rows and subset with them:

df[grep('cd', df$Column2),]

Data:

df <- structure(list(Column1 = 1:5, Column2 = c("abcd", "bcde", "cdef", 
"aced", "cadf"), column2 = c("fm1", "fm2", "fm3", "fm4", "fm5"
 )), .Names = c("Column1", "Column2", "column2"), class = "data.frame",
 row.names = c(NA, -5L))

Upvotes: 1
