Reputation: 23
I want to extract values from different columns, depending on the value of column x. This column is in dataframe 1 and contains different factor levels (e.g. 1,2,3,4,5,6). The columns I want to extract the values from are in dataframe 2. Examples of both dataframes:
Dataframe 1 is called istrata (173 rows):
> istrata[1:5,]
  POSCODN  Geslacht Agegrp
1    2651 0.4761905      1
2    2651 0.4761905      5
3    2652 0.5785124      1
4    2652 0.5785124      1
5    2661 0.5270758      3
Dataframe 2 is called strata (1721 rows):
> strata[1:5,]
  POSCODN  Geslacht   agegrp_1   agegrp_2   agegrp_3   agegrp_4   agegrp_5    agegrp_6
1    2651 0.4761905 0.34085213 0.10025063 0.13784461 0.27318296 0.13784461 0.010025063
2    2652 0.5785124 0.34710744 0.23966942 0.11570248 0.19008264 0.10743802 0.000000000
3    2661 0.5270758 0.36462094 0.13357401 0.15162455 0.25270758 0.09747292 0.000000000
4    2662 0.6229508 0.39344262 0.26229508 0.11475410 0.21311475 0.01639344 0.000000000
5    2665 0.5387931 0.28448276 0.08189655 0.17241379 0.31465517 0.13362069 0.012931034
So in the end, when Agegrp is 1 in the first row of dataframe 1, I want to impute the value from row 1 of dataframe 2 (same POSCODN), column 3 (agegrp_1). Another example: row 5 in dataframe 1 has Agegrp 3 and POSCODN 2661, so it should look at row 3 in dataframe 2, column 5 (agegrp_3). See the desired result below (an addition to istrata):
   Geslacht   I_Agegrp
1 0.4761905 0.34085213
2 0.4761905 0.13784461
3 0.5785124 0.34710744
4 0.5785124 0.34710744
5 0.5270758 0.15162455
Is there a way to do this?
Help is much appreciated!
Upvotes: 1
Views: 1131
Reputation: 28441
From the help for ?'[':
When indexing arrays by [ a single argument i can be a matrix with as many columns as there are dimensions of x; the result is then a vector with elements corresponding to the sets of indices in each row of i.
So the index can be a two-column matrix: column 1 gives the row numbers and column 2 gives the column numbers.
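A minimal sketch of that rule on a made-up 3 x 3 matrix (the names m, idx and picked are only for illustration and are not part of the question's data):

```r
# Hypothetical 3 x 3 matrix, filled column by column with 1..9
m <- matrix(1:9, nrow = 3)

# Two-column index matrix: each row is one (row, column) pair
idx <- cbind(c(1, 2, 3), c(3, 1, 2))

# Equivalent to c(m[1, 3], m[2, 1], m[3, 2])
picked <- m[idx]
print(picked)
```

Each row of idx selects exactly one cell, so the result has as many elements as idx has rows.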
If x is x <- c(3,2,1,1,1), we can combine that with the rows 1:nrow(df). The matrix will look like:
cbind(1:nrow(df), x)
       x
[1,] 1 3
[2,] 2 2
[3,] 3 1
[4,] 4 1
[5,] 5 1
If we used this matrix to subset, df[cbind(1:nrow(df), x)], the first extraction would be df[1,3], the second df[2,2], and so on. But we need to account for the Gender column and add 1 to the second column, x + 1.
df[cbind(1:nrow(df), x+1)]
[1] 0.1378446 0.2396694 0.3646209 0.3934426 0.2844828
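Edit
With the new names, the rows have to be matched on POSCODN (strata has 1721 rows, istrata 173), and the offset becomes 2 because both POSCODN and Geslacht precede the agegrp columns. If Agegrp is stored as a factor, convert it with as.integer(as.character(istrata$Agegrp)) first.
strata[cbind(match(istrata$POSCODN, strata$POSCODN), istrata$Agegrp + 2L)]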
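As a rough, self-contained check against the five sample rows from the question, the lookup could be sketched as below. This is one possible approach, not necessarily what was run on the full data. It assumes POSCODN is unique in strata, and it finds the column by name rather than by a fixed offset, which also works if Agegrp is a factor. The helper names rows and cols are made up for the example.

```r
# First rows of both data frames, copied from the question
istrata <- data.frame(
  POSCODN  = c(2651, 2651, 2652, 2652, 2661),
  Geslacht = c(0.4761905, 0.4761905, 0.5785124, 0.5785124, 0.5270758),
  Agegrp   = c(1, 5, 1, 1, 3)
)
strata <- data.frame(
  POSCODN  = c(2651, 2652, 2661, 2662, 2665),
  Geslacht = c(0.4761905, 0.5785124, 0.5270758, 0.6229508, 0.5387931),
  agegrp_1 = c(0.34085213, 0.34710744, 0.36462094, 0.39344262, 0.28448276),
  agegrp_2 = c(0.10025063, 0.23966942, 0.13357401, 0.26229508, 0.08189655),
  agegrp_3 = c(0.13784461, 0.11570248, 0.15162455, 0.11475410, 0.17241379),
  agegrp_4 = c(0.27318296, 0.19008264, 0.25270758, 0.21311475, 0.31465517),
  agegrp_5 = c(0.13784461, 0.10743802, 0.09747292, 0.01639344, 0.13362069),
  agegrp_6 = c(0.010025063, 0, 0, 0, 0.012931034)
)

# Row of strata that shares each istrata row's POSCODN
# (assumes POSCODN is unique in strata; match() returns the first hit)
rows <- match(istrata$POSCODN, strata$POSCODN)

# Column of strata named after each row's age group
cols <- match(paste0("agegrp_", istrata$Agegrp), names(strata))

# as.matrix() is safe here because every column of strata is numeric
istrata$I_Agegrp <- as.matrix(strata)[cbind(rows, cols)]
print(istrata)
```

On these sample rows the new I_Agegrp column reproduces the values listed in the question's desired output. Any POSCODN missing from strata would produce NA in rows and therefore NA in the result.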
Upvotes: 2