Keizer

Reputation: 23

Extract values of different columns

I want to extract values from different columns, depending on the value of a column x (Agegrp in the example below). This column is in dataframe 1 and is a factor with levels 1 to 6. The columns I want to extract the values from are in dataframe 2. Examples of both dataframes:

Dataframe 1 is called istrata (173 rows):

    > istrata[1:5,]
       POSCODN  Geslacht    Agegrp
    1    2651   0.4761905      1
    2    2651   0.4761905      5
    3    2652   0.5785124      1
    4    2652   0.5785124      1
    5    2661   0.5270758      3

Dataframe 2 is called strata (1721 rows):

    > strata[1:5,]
         POSCODN   Geslacht   agegrp_1   agegrp_2   agegrp_3   agegrp_4   agegrp_5    agegrp_6
    1      2651 0.4761905 0.34085213 0.10025063 0.13784461 0.27318296 0.13784461 0.010025063
    2      2652 0.5785124 0.34710744 0.23966942 0.11570248 0.19008264 0.10743802 0.000000000
    3      2661 0.5270758 0.36462094 0.13357401 0.15162455 0.25270758 0.09747292 0.000000000
    4      2662 0.6229508 0.39344262 0.26229508 0.11475410 0.21311475 0.01639344 0.000000000
    5      2665 0.5387931 0.28448276 0.08189655 0.17241379 0.31465517 0.13362069 0.012931034

In the end I want this: in the first row of dataframe 1, Agegrp is 1, so it should fill in the value from dataframe 2 in row 1 (same POSCODN), column 3 (agegrp_1). Another example: row 5 in dataframe 1 has Agegrp 3 and POSCODN 2661, so it should look at row 3 in dataframe 2, column 5 (agegrp_3). See the example below, where I_Agegrp is the addition to istrata:

          Geslacht     I_Agegrp
    1     0.4761905   0.34085213
    2     0.4761905   0.13784461
    3     0.5785124   0.34710744
    4     0.5785124   0.34710744
    5     0.5270758   0.15162455

Is there a way to do this?

Help is much appreciated!

Upvotes: 1

Views: 1131

Answers (1)

Pierre L

Reputation: 28441

From the help for ?'[':

When indexing arrays by [ a single argument i can be a matrix with as many columns as there are dimensions of x; the result is then a vector with elements corresponding to the sets of indices in each row of i.

So the index can be a two-column matrix: column 1 gives the row numbers and column 2 gives the column numbers.

If x is defined as x <- c(3,2,1,1,1), we can combine it with the row numbers 1:nrow(df). The matrix will look like:

    cbind(1:nrow(df), x)
           x
    [1,] 1 3
    [2,] 2 2
    [3,] 3 1
    [4,] 4 1
    [5,] 5 1

If we use this matrix to subset, as in df[cbind(1:nrow(df), x)], the first extraction will be df[1,3], the second df[2,2], and so on. But the first column of df is Gender, so the age-group columns start at column 2. To account for that we add 1 to the second column, x + 1.

    df[cbind(1:nrow(df), x+1)]
    [1] 0.1378446 0.2396694 0.3646209 0.3934426 0.2844828
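
To make the matrix indexing reproducible, here is a minimal sketch. The `df` below is a hypothetical reconstruction from the sample rows in the question (Gender first, then the first three age-group columns); the real `df` is not shown above, so treat its layout as an assumption.

```r
# Hypothetical reconstruction of `df` (assumption: Gender is column 1,
# followed by the age-group columns; only three are needed for this demo).
df <- data.frame(
  Gender   = c(0.4761905, 0.5785124, 0.5270758, 0.6229508, 0.5387931),
  agegrp_1 = c(0.34085213, 0.34710744, 0.36462094, 0.39344262, 0.28448276),
  agegrp_2 = c(0.10025063, 0.23966942, 0.13357401, 0.26229508, 0.08189655),
  agegrp_3 = c(0.13784461, 0.11570248, 0.15162455, 0.11475410, 0.17241379)
)

x <- c(3, 2, 1, 1, 1)

# Two-column index matrix: row number in column 1, column number in column 2.
# x + 1 skips the Gender column.
idx <- cbind(seq_len(nrow(df)), x + 1)

# Indexing with a matrix picks exactly one cell per row of idx.
picked <- as.matrix(df)[idx]
print(picked)
```

`as.matrix(df)[idx]` is what `df[idx]` does internally for a data frame, so both forms should give the same vector here.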

Edit

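With the new names, the rows also have to be matched by POSCODN (istrata has 173 rows, strata has 1721), and the offset becomes 2 because strata has two leading columns, POSCODN and Geslacht. Agegrp is a factor, so it is converted to integer first:

    strata[cbind(match(istrata$POSCODN, strata$POSCODN), as.integer(as.character(istrata$Agegrp)) + 2L)]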
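
For the two data frames in the question, the lookup can be sketched as follows. This is a sketch under assumptions, not a tested solution for the full data: it uses only the five sample rows shown, assumes every POSCODN in istrata occurs exactly once in strata, and assumes Agegrp is a factor whose labels are the numbers 1 to 6. The helper names `rows`, `grp` and `cols` are made up for illustration.

```r
# Sample rows copied from the question.
istrata <- data.frame(
  POSCODN  = c(2651, 2651, 2652, 2652, 2661),
  Geslacht = c(0.4761905, 0.4761905, 0.5785124, 0.5785124, 0.5270758),
  Agegrp   = factor(c(1, 5, 1, 1, 3), levels = 1:6)
)
strata <- data.frame(
  POSCODN  = c(2651, 2652, 2661, 2662, 2665),
  Geslacht = c(0.4761905, 0.5785124, 0.5270758, 0.6229508, 0.5387931),
  agegrp_1 = c(0.34085213, 0.34710744, 0.36462094, 0.39344262, 0.28448276),
  agegrp_2 = c(0.10025063, 0.23966942, 0.13357401, 0.26229508, 0.08189655),
  agegrp_3 = c(0.13784461, 0.11570248, 0.15162455, 0.11475410, 0.17241379),
  agegrp_4 = c(0.27318296, 0.19008264, 0.25270758, 0.21311475, 0.31465517),
  agegrp_5 = c(0.13784461, 0.10743802, 0.09747292, 0.01639344, 0.13362069),
  agegrp_6 = c(0.010025063, 0, 0, 0, 0.012931034)
)

# Row of strata that has the same POSCODN as each row of istrata.
rows <- match(istrata$POSCODN, strata$POSCODN)

# Factor -> integer via character, so the labels (not the internal codes) are used.
grp <- as.integer(as.character(istrata$Agegrp))

# Look the column up by name instead of hard-coding an offset.
cols <- match(paste0("agegrp_", grp), names(strata))

# One value per row of istrata, picked by (row, column) pairs.
istrata$I_Agegrp <- as.matrix(strata)[cbind(rows, cols)]
print(istrata)
```

Looking the column up by name means the code does not depend on how many columns come before agegrp_1. If a POSCODN is missing from strata, `match()` returns NA for that row, so it is worth checking `anyNA(rows)` before relying on the result.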

Upvotes: 2
