Álvaro Carrasco

Reputation: 23

Obtaining data from an array to a dataframe

I have two datasets. The first one is a data frame:

df1 <- data.frame(user=c(1:10), h01=c(3,3,6,8,9,10,4,1,2,5), h12=c(5,5,3,4,1,2,8,8,9,10),a=numeric(10))

The first column is the user id. h01 is the id of the cell phone antenna the user is connected to during the first period (00:00 to 1:00 AM), and h12 is the same for the second period (1:00 AM to 2:00 AM).

Then I have an array:

array1 <- array(c(23,12,63,11,5,6,9,41,23,73,26,83,41,51,29,10,1,5,30,2), dim=c(10,2))

The rows represent the antenna ids, the columns represent the time periods, and the values in array1 are the number of people connected to that antenna during that period. So array1[1,1] is the number of people connected to antenna 1 between 00:00 and 1:00 AM, array1[2,2] is the number connected to antenna 2 between 1:00 AM and 2:00 AM, and so on.

For each user in df1, I want to look up in array1 how many people in total are connected to the same antennas during the same periods, and store that value in column a.

For example, the first user is connected to antenna 3 between 00:00 and 1:00 AM and to antenna 5 between 1:00 AM and 2:00 AM, so the value in a should be array1[3,1] plus array1[5,2], that is 63 + 29 = 92.

I used a for loop to do this:

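aux1 <- df1[,2]  # antenna ids for 00:00 - 1:00AM
aux2 <- df1[,3]  # antenna ids for 1:00AM - 2:00AM
for(i in seq_len(nrow(df1))){
  # count on the user's first-period antenna plus count on the second-period antenna
  df1[i,4] <- sum(array1[aux1[i],1],array1[aux2[i],2])
}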

which gives

   user h01 h12   a
1     1   3   5  92
2     2   3   5  92
3     3   6   3  47
4     4   8   4  92
5     5   9   1  49
6     6  10   2 156
7     7   4   8  16
8     8   1   8  28
9     9   2   9  42
10   10   5  10   7

This loop works and gives the correct values. The problem is that the two datasets (df1 and array1) are really big: df1 has over 20,000 users and 24 time periods, and array1 has over 1300 antennas. On top of that, this data covers users from only one socioeconomic level, and I have five levels in total, so simplifying the code is a must.

I would love it if someone could show me a different approach, especially one without a for loop.

Upvotes: 0

Views: 75

Answers (1)

Martin Seehafer

Reputation: 156

Indexing in R is vectorized, so you can pass the whole antenna column as the row index and drop the loop:

df1$a <- array1[df1$h01,1] + array1[df1$h12,2]
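
The question mentions 24 periods, and writing out 24 terms by hand gets tedious. One possible generalization is to index array1 with a two-column matrix of (antenna id, period number) pairs. This is only a sketch: it assumes the period columns of df1 are listed in the same order as the columns of array1, and the names period_cols, antenna, idx and counts are made up for illustration.

```r
# Example data from the question
df1 <- data.frame(user = 1:10,
                  h01 = c(3, 3, 6, 8, 9, 10, 4, 1, 2, 5),
                  h12 = c(5, 5, 3, 4, 1, 2, 8, 8, 9, 10))
array1 <- array(c(23, 12, 63, 11, 5, 6, 9, 41, 23, 73,
                  26, 83, 41, 51, 29, 10, 1, 5, 30, 2), dim = c(10, 2))

# Assumption: period columns are in the same order as the columns of array1
period_cols <- c("h01", "h12")
antenna <- as.matrix(df1[period_cols])  # users x periods matrix of antenna ids

# One (antenna id, period number) row per user/period pair, in column-major order
idx <- cbind(as.vector(antenna),
             rep(seq_along(period_cols), each = nrow(antenna)))

# Look up every count in one step, reshape to users x periods, then sum per user
counts <- matrix(array1[idx], nrow = nrow(antenna))
df1$a <- rowSums(counts)
```

With the real data, period_cols would hold all 24 column names and the rest should stay the same.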

Upvotes: 2
