Reputation: 21
I keep getting a 'subscript out of bounds' error when I try to populate a matrix using the for loop scripted below. My data come from a large CSV file and look similar to the following dummy dataset:
Sample k3 Year
1 B92028UUU 1 1990
2 B93001UUU 1 1993
3 B93005UUU 1 1993
4 B93006UUU 1 1993
5 B93010UUU 1 1993
6 B93011UUU 1 1994
7 B93022UUU 1 1994
8 B93035UUU 1 2014
9 B93036UUU 1 2014
10 B95015UUU 2 2013
11 B95016UUU 2 2013
12 B98027UUU 2 1990
13 B05005FUS 2 1990
14 B06006FIS 2 2001
15 B06010MUS 2 2001
16 B05023FUN 2 2001
17 B05024FUN 3 2001
18 B05025FIN 3 2001
19 B05034MMN 3 2002
20 B05037MMS 3 1996
21 B05041MUN 3 1996
22 B06047FUS 3 2007
23 B05048MUS 3 2000
24 B06059FUS 3 2000
25 B05063MUN 3 2000
My script is as follows:
Year.Matrix = matrix(1:75,nrow=25,byrow=T)
colnames(Year.Matrix)=c("Group 1","Group 2","Group 3")
rownames(Year.Matrix)=1990:2014
for(i in 1:3){
x=subset(data2,k3==i)
for(j in 1990:2014){
y=subset(x,Year==j)
z=nrow(y)
Year.Matrix[j,i]=z
}
}
Not sure why I am getting the error message but from other posts I gather that the issue arises when I try to populate my matrix, and perhaps because I do not have an entry for each year from each of my k3 levels?
Any commentary would be helpful!
Upvotes: 2
Views: 90
Reputation: 2757
I'm not sure what you are trying to do, but as Hubert L said, the row index j you use when populating Year.Matrix should be a position such as 1, 2, 3, ....
Because you loop with (j in 1990:2014), j takes the values 1990, 1991, 1992, ..., 2014, while Year.Matrix only has 25 rows.
To fix this, offset your row index. For example, loop over the row positions instead of the years:
years=1990:2014
for(i in 1:3){
print(i)
x=subset(data2,k3==i)
for(j in seq_along(years)){
print(j)
# j is the row position (1..25); years[j] is the matching calendar year
y=subset(x,Year==years[j])
z=nrow(y)
Year.Matrix[j,i]=z
}
}
Keep using print statements to debug your loop. Printing j inside your original loop shows immediately that you are about to index Year.Matrix[1990,1], which throws the subscript out of bounds error.
Alternatively, keep looping over the years and offset the row index:
for(i in 1:3){
print(i)
x=subset(data2,k3==i)
for(j in 1990:2014){
print(j)
y=subset(x,Year==j)
z=nrow(y)
Year.Matrix[j-1990+1,i]=z
}
}
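Another option, not part of the loops above, is to index the rows by name rather than by position. This is only a minimal sketch: the small data frame is made up to stand in for data2, and the matrix is rebuilt with zeros for illustration. Since the row names of Year.Matrix are the years, a character index such as "1990" looks the row up by name, while the number 1990 is treated as a row position.

```r
# Made-up stand-in for data2 (assumption: columns k3 and Year as in the question)
data2 <- data.frame(
  k3   = c(1, 1, 1, 2, 2, 3),
  Year = c(1990, 1993, 1993, 1990, 2013, 2001)
)

years <- 1990:2014
Year.Matrix <- matrix(0, nrow = length(years), ncol = 3)
rownames(Year.Matrix) <- as.character(years)
colnames(Year.Matrix) <- c("Group 1", "Group 2", "Group 3")

# A numeric index is a row position, so 1990 is far beyond the 25 rows
res <- try(Year.Matrix[1990, 1], silent = TRUE)
inherits(res, "try-error")   # TRUE: this is the subscript out of bounds error

for (i in 1:3) {
  x <- subset(data2, k3 == i)
  for (j in years) {
    # as.character(j) selects the row by its name, e.g. "1990"
    Year.Matrix[as.character(j), i] <- nrow(subset(x, Year == j))
  }
}

Year.Matrix["1993", "Group 1"]   # 2 in this made-up data
```

Indexing by name keeps the loop over actual years and avoids the offset arithmetic, at the cost of a character conversion on every iteration.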
Upvotes: 0
Reputation: 1189
You can also use dplyr to do this. A dplyr solution would be the following:
library(dplyr)
dat %>%
group_by(Year, k3) %>%
summarize(N=n())
Upvotes: 1
Reputation: 121568
No need to use a loop here. You are just counting rows by the Year and k3 columns:
library(data.table)
setDT(dat)[,.N,"Year,k3"]
Year k3 N
1: 1990 1 1
2: 1993 1 4
3: 1994 1 2
4: 2014 1 2
5: 2013 2 2
6: 1990 2 2
7: 2001 2 3
8: 2001 3 2
9: 2002 3 1
10: 1996 3 2
11: 2007 3 1
12: 2000 3 3
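If the goal is the 25 x 3 layout from the question rather than a long table, base R's table() can produce it without a loop. This is a minimal sketch with a made-up data frame standing in for dat. Converting Year to a factor with levels 1990:2014 is my assumption about the desired shape, so that years with no samples still appear as rows of zeros.

```r
# Made-up stand-in for dat (assumption: columns k3 and Year as in the question)
dat <- data.frame(
  k3   = c(1, 1, 1, 2, 2, 3),
  Year = c(1990, 1993, 1993, 1990, 2013, 2001)
)

# Fixing the factor levels keeps every year from 1990 to 2014 as a row
counts <- table(Year = factor(dat$Year, levels = 1990:2014), k3 = dat$k3)

dim(counts)            # 25 3
counts["1993", "1"]    # 2
counts["1991", "2"]    # 0
```

The result is a contingency table whose rows are named by year and whose columns are named by k3 level, so it can be indexed by name in the same way as a matrix.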
Upvotes: 2