Reputation: 331
I have the following dataframe (DF_A):
PARTY_ID O1 O2 O3 O4
P1 0 0 1 0
P2 2 1 0 1
P3 0 0 0 0
P4 2 1 1 1
P5 1 0 0 1
I also have another dataframe (DF_B) with the position of the columns that I need in DF_A. This is DF_B:
PARTY_ID POS_1 POS_2
P1 1 2
P2 2 1
P3 3 1
P4 2 1
P5 1 4
I need to give the position of the columns (DF_B) for getting the values of DF_A. The desired result is something like this:
PARTY_ID V1 V2
P1 0 0
P2 1 2
P3 0 0
P4 1 2
P5 1 1
I'm trying to use which function, but it seems not to work. Can anyone please help me?
SIDE NOTE: I need to do this the fastest way possible because my real data have more than 100K rows.
Upvotes: 2
Views: 1062
Reputation: 2218
The one with a simple for loop:
DF_C <- DF_B # creating dataframe with same dimension and column/row identifiers
for(i in 1:nrow(DF_C)) { DF_C[i,] <- DF_A[i,as.numeric(DF_B[i,])] } #over rows
Upvotes: 1
Reputation: 61214
> ind <- as.matrix(DF_B[,-1])
> t(sapply(1:nrow(ind), function(i) DF_A[, -1][, ind[i,]][i,] ))
O1 O2
[1,] 0 0
[2,] 1 2
[3,] 0 0
[4,] 1 2
[5,] 1 1
If you want to get a data.frame:
> DF <- t(sapply(1:nrow(ind), function(i) DF_A[, -1][, ind[i,]][i,] ))
> data.frame(PARTY_ID=DF_A[,1], DF)
PARTY_ID O1 O2
1 P1 0 0
2 P2 1 2
3 P3 0 0
4 P4 1 2
5 P5 1 1
Upvotes: 1
Reputation: 2056
A quick and dirty way to do this using apply
from base R:
DF_C <- apply(DF_A, 1, function(x) {
cols_to_use <- as.numeric(unlist(DF_B[DF_B$"PARTY_ID"==x["PARTY_ID"],2:3]))
x[-1][cols_to_use]
})
DF_C <- cbind(DF_A$PARTY_ID,t(DF_C))
colnames(DF_C) <- c("PARTY_ID", "V1","V2")
Upvotes: 1