Reputation: 103
I have two data frames, both with dim() returning [1] 54 210. One (let's call it dfx) contains 1s and 0s that mark correct and incorrect answers on a test. dfy contains the response time for each of these questions. I'd like to subset (or perhaps merge()) all items from dfy that are == 1 in dfx. The data are in wide format: the row names are the IDs and each column represents a question.
Example:
dfx
Q1 Q2 Q3 Q4 Q5 …
1 1 1 1 1
1 1 1 1 1
1 1 0 1 1
1 1 0 1 1
dfy
Q1_3 Q2_3 Q3_3 Q4_3 Q5_3 ...
16.01 8.23 18.13 11.14 18.03
17.25 7.50 11.72 10.84 7.24
I would need a dfz that is a subset of dfy, in which, if dfx[Q1] == 1, dfy[Q1_3] is returned as dfz[Q1_3], and otherwise NA or dfx[Q1] (which is 0).
I can do it for a single column if I name the columns explicitly:
dfz <- cbind(ifelse(dfx$Q1 == 1, dfy$Q1_3, dfx$Q1))
However, I don't know how to apply it to the whole data frame.
Any ideas?
Upvotes: 0
Views: 139
Reputation: 21047
If both data frames have the same dimensions, and dfx contains only ones and zeros, you can multiply them element-wise to get what you need (entries where dfx is 0 become 0):
dfz <- dfy * dfx
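To make the multiplication concrete, here is a minimal sketch on toy data. The values and the 2 x 3 shape are invented for illustration, and the names dfz_zero and dfz_na are hypothetical. It also shows one possible way to get NA instead of 0, which the question mentions as an acceptable alternative. Both variants match cells by position, so they assume the rows and columns of dfx and dfy are in the same order.

```r
# Toy data invented for illustration (2 respondents, 3 questions).
dfx <- data.frame(Q1 = c(1, 1), Q2 = c(1, 0), Q3 = c(0, 1))
dfy <- data.frame(Q1_3 = c(16.01, 17.25),
                  Q2_3 = c(8.23, 7.50),
                  Q3_3 = c(18.13, 11.72))

# Guard: element-wise operations need identical dimensions.
stopifnot(identical(dim(dfx), dim(dfy)))

# Variant 1: multiply, so times for incorrect answers become 0.
dfz_zero <- dfy * dfx

# Variant 2: keep the times and replace those for incorrect answers with NA.
dfz_na <- dfy
dfz_na[dfx == 0] <- NA

print(dfz_zero)
print(dfz_na)
```

In this sketch the result keeps the column names of dfy (Q1_3, Q2_3, ...), because dfy is the left operand of the multiplication. The NA variant may be preferable if the times will be averaged later, since mean(..., na.rm = TRUE) would skip the missing cells rather than count them as zero.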
In your follow-up comment, you ask how you can manipulate columns of one data frame based on the values in another. I frequently use the sqldf package for this kind of thing. It lets you manipulate data frames using SQL statements. You'll need an id column that lets you relate your data frames to each other.
A simple example:
library(sqldf)
# df_a holds the values to keep; df_b holds the 0/1 flags
sqldf("select df_a.id
            , case
                when df_b.q1 = 1 then df_a.q1
                else 0
              end as value
       from df_a
       inner join df_b on df_a.id = df_b.id")
As you can see, you can join dataframes as if they were tables in a database.
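For readers who would rather not install sqldf, roughly the same join can be sketched in base R with merge() and ifelse(). This is a swapped-in technique, not the sqldf approach itself. The data frames df_a and df_b below are toy data invented for illustration, and the "_a"/"_b" suffixes are an arbitrary choice.

```r
# Toy data invented for illustration: df_a has response times, df_b has 0/1 flags.
df_a <- data.frame(id = c(1, 2, 3), q1 = c(16.01, 17.25, 9.80))
df_b <- data.frame(id = c(1, 2, 3), q1 = c(1, 0, 1))

# Inner join on id; the suffixes disambiguate the two q1 columns.
joined <- merge(df_a, df_b, by = "id", suffixes = c("_a", "_b"))

# Same logic as the SQL CASE expression: keep df_a's value where the flag is 1.
joined$value <- ifelse(joined$q1_b == 1, joined$q1_a, 0)

result <- joined[, c("id", "value")]
print(result)
```

Unlike the element-wise multiplication, this matches rows by id rather than by position, so it would still work if the two data frames were sorted differently.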
Hope this helps.
Upvotes: 1