Stataq

Reputation: 2297

How to remove all variables whose names end in .x or .y?

I have a list of data frames (lst1). Each data frame in lst1 contains some variables with names like test.x, test.y, try.x, try.y, etc.

These variables were created by merging datasets without first removing the duplicated variables (try, test, etc.). How can I remove them from every data frame in the list now?

Thanks.

Upvotes: 0

Views: 472

Answers (3)

akrun

Reputation: 887108

In base R, we can use endsWith to drop the columns whose names end in .x or .y:

lapply(List, function(x) x[!(endsWith(names(x), 
        '.x')|endsWith(names(x), '.y'))])

Output:

#$A
#  a b
#1 1 5

#$B
#  a b
#1 5 6

Data:

List <- list(A = structure(list(a = 1, b = 5, test.x = NA, test.y = 5), class = "data.frame", row.names = c(NA, 
-1L)), B = structure(list(a = 5, b = 6, test.x = NA, try.x = 7), class = "data.frame", row.names = c(NA, 
-1L)))
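
As a small illustration of why endsWith suits this task, the sketch below applies it to a made-up vector of column names (nms is hypothetical, not part of the data above). endsWith compares literal suffixes, so the dot is not treated as a wildcard and a name such as "relax" is kept.

```r
# Hypothetical column names, for illustration only
nms <- c("a", "b", "test.x", "test.y", "relax")

# endsWith() compares literal suffixes, so "." is not a regex wildcard here
is_suffixed <- endsWith(nms, ".x") | endsWith(nms, ".y")

# Names that would survive the filter
keep <- nms[!is_suffixed]
print(keep)
```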

Upvotes: 0

Ian Campbell

Reputation: 24790

Here's a tidyverse approach:

We can use the dplyr::select function to keep only the columns we want. matches() selects columns using a regular expression, and the leading minus sign drops the matched columns instead of keeping them. \\.[xy]$ matches column names that end with a period followed by x or y: \\. is a literal period, [xy] is either letter, and $ anchors the match to the end of the string.

The purrr::map function allows us to apply the selection to each list element. ~ defines a formula which is automatically converted to a function.

library(tidyverse)
lst2 <- lst1 %>%
  map(~dplyr::select(.,-matches("\\.[xy]$")))

map(lst2, head, 2)
#[[1]]
#  ID name
#1  1    A
#2  2    B
#[[2]]
#  ID name
#1  1    A
#2  2    B
#[[3]]
#  ID name
#1  1    A
#2  2    B
#[[4]]
#  ID name
#1  1    A
#2  2    B
#[[5]]
#  ID name
#1  1    A
#2  2    B
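
To make the formula shorthand concrete, here is a minimal sketch comparing the ~ form with an explicit anonymous function. The list lst_demo is a hypothetical stand-in for lst1, and the sketch assumes dplyr and purrr are installed.

```r
library(dplyr)
library(purrr)

# Hypothetical sample list, shaped like the data in this answer
lst_demo <- list(
  data.frame(ID = 1:3, name = c("A", "B", "C"), test.x = 1:3, test.y = 4:6),
  data.frame(ID = 1:3, name = c("A", "B", "C"), try.x = 7:9)
)

# Formula shorthand: `.` stands for the current list element
out_formula <- map(lst_demo, ~ select(., -matches("\\.[xy]$")))

# The same call written as an explicit anonymous function
out_function <- map(lst_demo, function(df) select(df, -matches("\\.[xy]$")))

# Both forms should leave only the ID and name columns
lapply(out_formula, names)
```

Both forms should give the same result; the formula version is simply shorter to type.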

Sample Data:

lst1 <- replicate(5,data.frame(ID = 1:15, name = LETTERS[1:15], test.x = runif(15), test.y = runif(15)),simplify = FALSE)
map(lst1, head, 2)
#[[1]]
#  ID name     test.x    test.y
#1  1    A 0.03772391 0.2630905
#2  2    B 0.11844048 0.2929392
#[[2]]
#  ID name   test.x    test.y
#1  1    A 0.398029 0.5151159
#2  2    B 0.348489 0.9534869
#[[3]]
#  ID name    test.x    test.y
#1  1    A 0.7447383 0.6862136
#2  2    B 0.3623562 0.7542699
#
#[[4]]
#  ID name    test.x    test.y
#1  1    A 0.9341495 0.8660333
#2  2    B 0.8383039 0.6299427
#[[5]]
#  ID name     test.x     test.y
#1  1    A 0.02662444 0.04502225
#2  2    B 0.29855214 0.46189116

Upvotes: 2

Duck

Reputation: 39595

You can also try this:

#Data
List <- list(A=data.frame(a=1,b=5,test.x=NA,test.y=5),
             B=data.frame(a=5,b=6,test.x=NA,try.x=7))
#Remove
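myfun <- function(x)
{
  # Escape the dot and anchor at the end so only names ending in .x or .y match
  drop <- grepl('\\.(x|y)$', names(x))
  # Logical indexing is safe when nothing matches and always returns a data.frame
  x <- x[, !drop, drop = FALSE]
  return(x)
}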
#Apply
List2 <- lapply(List,myfun)

Output:

List2
$A
  a b
1 1 5

$B
  a b
1 5 6
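
When matching suffixes with grepl, it helps to escape the dot and anchor the pattern, and to index with a logical vector rather than a negated which(). The sketch below uses made-up names (nms and df are hypothetical) to show what can go wrong otherwise: an unescaped dot matches any character, and negating an empty index selects no columns at all.

```r
# Hypothetical column names, for illustration only
nms <- c("a", "b", "taxes", "test.x", "try.y")

# Unescaped, unanchored: "." matches any character, so "taxes" is caught too
loose <- grepl(".x|.y", nms)

# Escaped and anchored: only names ending in ".x" or ".y" match
strict <- grepl("\\.[xy]$", nms)

print(nms[loose])
print(nms[strict])

# Negating an empty index drops every column instead of none
df <- data.frame(a = 1, b = 2)
no_match <- which(grepl("\\.[xy]$", names(df)))  # integer(0)
ncol(df[, -no_match])                            # zero columns remain

# Logical indexing keeps all columns when nothing matches
ncol(df[, !grepl("\\.[xy]$", names(df)), drop = FALSE])
```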

Upvotes: 3
