Reputation: 628
I have 4 data frames for each of 4 different data groups (16 data frames in total). Within a data group, the data frames share the same column structure, with columns a, b, c, d, etc. (several hundred columns), but the values are different in each data frame. The only things that are the same within each data group are the number of variables and, to some degree, the column names. (The real column names are item names with no pattern to them, not a, b, c, etc.)
For example:
dat1 = data.frame(x = c(0.1,0.2,0.3,0.4,0.5),
                  y = c(0.6,0.7,0.8,0.9,0.10),
                  z = c(0.12,0.13,0.14,0.15,0.16))
which produces
x y z
1 0.1 0.6 0.12
2 0.2 0.7 0.13
3 0.3 0.8 0.14
4 0.4 0.9 0.15
5 0.5 0.1 0.16
and a second data frame
dat2 = data.frame(x = c(1,2,3,4,5), y = c(6,7,8,9,10), z = c(12,13,14,15,16))
x y z
1 1 6 12
2 2 7 13
3 3 8 14
4 4 9 15
5 5 10 16
I want to do my data cleaning in dat1 based on certain criteria, such that if I remove column x in dat1, then column x will also be removed in dat2. These specific criteria could be:
dat1[,tail(dat1, n = 1) < 0.2]
y z
1 0.6 0.12
2 0.7 0.13
3 0.8 0.14
4 0.9 0.15
5 0.1 0.16
such that dat2 also automatically deletes column x.
y z
1 6 12
2 7 13
3 8 14
4 9 15
5 10 16
Is there a way to do this? I have been trying to search for it on StackOverflow, but I couldn't find anything useful. Thanks.
Upvotes: 1
Views: 65
Reputation: 76402
Something like this?
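With the data you posted, it works as expected, provided the cleaned result has first been assigned back to dat1 (for example dat1 <- dat1[, tail(dat1, n = 1) < 0.2]).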
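cols.to.remove <- function(DF1, DF2) {
  # Columns present in DF1 but no longer present in DF2
  d <- setdiff(names(DF1), names(DF2))
  # Positions of those columns within DF1 (not within d)
  which(names(DF1) %in% d)
}
# Drop only when something was removed; dat2[-integer(0)] would select no columns
idx <- cols.to.remove(dat2, dat1)
if (length(idx) > 0) dat2 <- dat2[-idx]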
dat2
# y z
#1 6 12
#2 7 13
#3 8 14
#4 9 15
#5 10 16
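An alternative that avoids comparing names after the fact is to compute the criterion once on dat1 and then apply the resulting column selection to every data frame in the group. The following is only a sketch under assumptions: the rule is the one from the question (last-row value below 0.2), and `group`, `last_row` and `keep` are hypothetical names introduced here for illustration.

```r
# Example data from the question
dat1 <- data.frame(x = c(0.1, 0.2, 0.3, 0.4, 0.5),
                   y = c(0.6, 0.7, 0.8, 0.9, 0.10),
                   z = c(0.12, 0.13, 0.14, 0.15, 0.16))
dat2 <- data.frame(x = c(1, 2, 3, 4, 5),
                   y = c(6, 7, 8, 9, 10),
                   z = c(12, 13, 14, 15, 16))

# Evaluate the criterion once, on dat1 only, and record the names to keep.
# Assumed rule (from the question): keep columns whose last-row value is < 0.2.
last_row <- unlist(dat1[nrow(dat1), ])
keep <- names(dat1)[last_row < 0.2]

# Apply the same selection, by name, to every data frame in the group.
# `group` is a hypothetical list holding the data frames of one data group.
group <- list(dat1 = dat1, dat2 = dat2)
group <- lapply(group, function(df) df[, keep, drop = FALSE])

names(group$dat2)
# [1] "y" "z"
```

Selecting by name rather than by position means the data frames do not need their columns in the same order, and `drop = FALSE` keeps the result a data frame even if only one column survives.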
Upvotes: 1