silent_hunter

Reputation: 2508

Filtering data frame in R

Below you can see my data frame.

df<-data.frame( 
                items=c("1 Food Item 1",
                "1.1 Food Item 2",
                "01.1.1 Food Item 3",
                "01.1.2 Food Item 4",
                "01.1.3 Food Item 5",
                "2 Food Item 6",
                "2.1 Food Item 7",
                "02.1.1 Food Item 8",
                "10 Food Item 9",
                "10.1 Food Item 10",
                "10.1.1 Food Item 11",
                "10.1.2 Food Item 12")
    )

df

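Each item in this df begins with a numeric code of varying length: one or two digits (e.g. 1 or 10), those digits plus one more level (e.g. 1.1 or 10.1), or a four-digit code of the form dd.d.d (e.g. 01.1.1). I want to filter this df so that the final output contains only the items with a four-digit code: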

"01.1.1 Food Item 3",
"01.1.2 Food Item 4",
"01.1.3 Food Item 5",
"02.1.1 Food Item 8",
"10.1.1 Food Item 11",
"10.1.2 Food Item 12"

Can anybody help me solve this problem?

Upvotes: 2

Views: 45

Answers (2)

br00t

Reputation: 1614

library(stringr)
df<-data.frame( 
  items=c("1 Food Item 1",
          "1.1 Food Item 2",
          "01.1.1 Food Item 3",
          "01.1.2 Food Item 4",
          "01.1.3 Food Item 5",
          "2 Food Item 6",
          "2.1 Food Item 7",
          "02.1.1 Food Item 8",
          "10 Food Item 9",
          "10.1 Food Item 10",
          "10.1.1 Food Item 11",
          "10.1.2 Food Item 12")
)

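# Row indices of items whose code is two digits, a dot, one digit, a dot, one digit
idx <- df$items |> str_detect('^\\d{2}\\.\\d\\.\\d') |> which()
# drop = FALSE keeps the result a data frame even though df has a single column
df[idx, , drop = FALSE] |> print()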

Upvotes: 2

akrun

Reputation: 887118

Use subset with grepl in base R. The pattern matches two digits (\\d{2}) at the start of the string (^), followed by a dot, one digit, another dot, one more digit, and then one or more whitespace characters (\\s+).

subset(df, grepl("^\\d{2}\\.\\d\\.\\d\\s+", items))

Output:

                 items
3   01.1.1 Food Item 3
4   01.1.2 Food Item 4
5   01.1.3 Food Item 5
8   02.1.1 Food Item 8
11 10.1.1 Food Item 11
12 10.1.2 Food Item 12
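
As a small sketch of why the trailing \\s+ can matter, the snippet below compares the pattern with and without it. The string "10.1.10 Hypothetical Item" is an assumption made up for illustration and does not appear in the original data; the names x, loose, and strict are likewise hypothetical.

```r
# Test strings; the second one is hypothetical and not part of the original df
x <- c("10.1.1 Food Item 11",
       "10.1.10 Hypothetical Item",
       "10.1 Food Item 10")

# Pattern without the whitespace requirement after the last digit
loose <- grepl("^\\d{2}\\.\\d\\.\\d", x)

# Pattern that requires whitespace right after the last digit
strict <- grepl("^\\d{2}\\.\\d\\.\\d\\s+", x)

# Side-by-side comparison of the two patterns
print(data.frame(x, loose, strict))
```

Without \\s+, a code with a longer last level such as 10.1.10 would still match, because the pattern only checks the first digit after the second dot. With \\s+, only codes whose last level is exactly one digit pass. For the data in the question both patterns select the same six rows.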

Upvotes: 2
