NRM

Reputation: 21

How to populate a matrix from a for loop in R

I keep getting a 'subscript out of bounds' error when I try to populate a matrix using the for loop below. My data are in a large CSV file that looks similar to the following dummy dataset:

      Sample k3 Year
1  B92028UUU  1 1990
2  B93001UUU  1 1993
3  B93005UUU  1 1993
4  B93006UUU  1 1993
5  B93010UUU  1 1993
6  B93011UUU  1 1994
7  B93022UUU  1 1994
8  B93035UUU  1 2014
9  B93036UUU  1 2014
10 B95015UUU  2 2013
11 B95016UUU  2 2013
12 B98027UUU  2 1990
13 B05005FUS  2 1990
14 B06006FIS  2 2001
15 B06010MUS  2 2001
16 B05023FUN  2 2001
17 B05024FUN  3 2001
18 B05025FIN  3 2001
19 B05034MMN  3 2002
20 B05037MMS  3 1996
21 B05041MUN  3 1996
22 B06047FUS  3 2007
23 B05048MUS  3 2000
24 B06059FUS  3 2000
25 B05063MUN  3 2000

My script is as follows:

Year.Matrix = matrix(1:75,nrow=25,byrow=T)
colnames(Year.Matrix)=c("Group 1","Group 2","Group 3")
rownames(Year.Matrix)=1990:2014

for(i in 1:3){
  x=subset(data2,k3==i)
for(j in 1990:2014){
  y=subset(x,Year==j)
  z=nrow(y)
  Year.Matrix[j,i]=z
    }
}

Not sure why I am getting the error message but from other posts I gather that the issue arises when I try to populate my matrix, and perhaps because I do not have an entry for each year from each of my k3 levels?

Any commentary would be helpful!

Upvotes: 2

Views: 90

Answers (3)

Dhawal Kapil

Reputation: 2757

I'm not sure exactly what you are trying to do, but as Hubert L said, the row index j must be a position inside the matrix (1, 2, 3, ... up to 25) when you assign into Year.Matrix. Because you loop with (j in 1990:2014), j takes the values 1990, 1991, 1992, ..., 2014, which are far beyond the 25 rows the matrix has. To fix this, offset the row index as shown further down. First, here is your for loop with print statements added:

for(i in 1:3){
    print(i)
    x=subset(data2,k3==i)
    for(j in 1990:2014){
        print(j)
        y=subset(x,Year==j)
        z=nrow(y)
        Year.Matrix[j,i]=z
    }
}

Keep using print statements to debug your code. Running this loop shows immediately that you are about to index Year.Matrix[1990,1], which throws a 'subscript out of bounds' error because the matrix has only 25 rows.

Fix the for loop by offsetting the row index:

for(i in 1:3){
    print(i)
    x=subset(data2,k3==i)
    for(j in 1990:2014){
        print(j)
        y=subset(x,Year==j)
        z=nrow(y)
        Year.Matrix[j-1990+1,i]=z
    }
}
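
As an alternative to offsetting, the rows can be selected by name rather than by position. The following is only a sketch, assuming the data frame has the k3 and Year columns from the question; the data2 built here is a stand-in that reproduces just those two columns of the dummy dataset, and the matrix is initialised with zeros (an assumption, the question fills it with 1:75) so that years without samples stay at 0.

```r
# Stand-in for the question's data: only the k3 and Year columns
data2 <- data.frame(
  k3   = rep(1:3, times = c(9, 7, 9)),
  Year = c(1990, 1993, 1993, 1993, 1993, 1994, 1994, 2014, 2014,
           2013, 2013, 1990, 1990, 2001, 2001, 2001,
           2001, 2001, 2002, 1996, 1996, 2007, 2000, 2000, 2000)
)

years <- 1990:2014

# Zero-filled matrix with the years as (character) row names
Year.Matrix <- matrix(0L, nrow = length(years), ncol = 3,
                      dimnames = list(as.character(years),
                                      paste("Group", 1:3)))

for (i in 1:3) {
  for (j in years) {
    # as.character(j) picks the row named "1990", "1991", ...
    # instead of asking for row number 1990, which does not exist
    Year.Matrix[as.character(j), i] <- sum(data2$k3 == i & data2$Year == j)
  }
}

Year.Matrix["1993", "Group 1"]  # count for group 1 in 1993
```

Indexing by name avoids having to remember the 1990 offset, at the cost of a character conversion on every iteration.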

Upvotes: 0

Tad Dallas

Reputation: 1189

You can also use dplyr to do this by counting the rows in each Year and k3 group:

library(dplyr)

dat %>% 
   group_by(Year, k3) %>%
   summarize(N=n())

Upvotes: 1

agstudy

Reputation: 121568

No need to use a loop here. You are just counting the rows for each combination of Year and k3:

library(data.table)
setDT(dat)[,.N,"Year,k3"]
    Year k3 N
 1: 1990  1 1
 2: 1993  1 4
 3: 1994  1 2
 4: 2014  1 2
 5: 2013  2 2
 6: 1990  2 2
 7: 2001  2 3
 8: 2001  3 2
 9: 2002  3 1
10: 1996  3 2
11: 2007  3 1
12: 2000  3 3
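
If the goal is the full year-by-group matrix from the question, including years with no samples, base R's table() may be enough on its own. This is a sketch rather than part of the data.table approach above; data2 is a stand-in reproducing only the k3 and Year columns of the dummy dataset, and wrapping Year in a factor with levels 1990:2014 is what forces the empty years to appear as rows of zeros.

```r
# Stand-in for the question's data: only the k3 and Year columns
data2 <- data.frame(
  k3   = rep(1:3, times = c(9, 7, 9)),
  Year = c(1990, 1993, 1993, 1993, 1993, 1994, 1994, 2014, 2014,
           2013, 2013, 1990, 1990, 2001, 2001, 2001,
           2001, 2001, 2002, 1996, 1996, 2007, 2000, 2000, 2000)
)

# Cross-tabulate Year against k3; the factor levels keep every
# year from 1990 to 2014, even those that never occur in the data
counts <- table(Year  = factor(data2$Year, levels = 1990:2014),
                Group = data2$k3)

# Drop the "table" class to get a plain integer matrix
Year.Matrix <- unclass(counts)

Year.Matrix["2000", "3"]  # count for group 3 in 2000
```

The result has 25 rows named "1990" to "2014" and one column per k3 level, so no explicit loop or row offset is needed.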

Upvotes: 2
