Reputation: 391
I am sure that the index I am trying to access exists, but the program still raises an error. I am trying to write a program that sums the values in one column for each distinct value in another column.
For example:
28400 4
28400 34
28400 9
65478 2
65478 5
65478 3
My program should add up 4, 34 and 9, then add up 2, 5 and 3, and produce the following output:
47
47
47
10
10
10
I am importing the data from a CSV file. Here is the code:
import pandas as pd
import numpy as np
assessment = pd.read_csv('/home/user/Documents/MOOC dataset original/studentVle2.csv')
assessment = assessment.values
count=0
stucount=28400
sumc=[]
i=0
for stu in assessment[:,2:3]:
    if(stucount==stu):
        count = count + assessment[i,5]
        i=i+1
    else:
        sumc.append(count)
        count = 0
        count = count + assessment[i,5]
        i=i+1
        stucount=stu
#print(sumc)
stucount=28400
i=0
a=[]
for stu in assessment[:,2:3]:
    if(stucount==stu):
        a.append(sumc[i])
        stucount = stu
    else:
        i=i+1
        a.append(sumc[i])
        stucount = stu
print(a)
Error:
File "/home/user/Documents/final project files/test.py", line 36, in <module>
a.append(sumc[i])
IndexError: list index out of range
By the way, this error did not appear before I added lines such as i=i+1 and stucount=stu, but now it does, even though the program is doing the same thing.
Upvotes: 2
Views: 3110
Reputation: 776
The error occurs because the running total for the last student is never appended to the sumc list after the loop ends. So, for n unique student IDs, the list has only n-1 entries. After the first for loop, add sumc.append(count). See below.
assessment = assessment.values
count=0
stucount=28400
sumc=[]
i=0
for stu in assessment[:,2:3]:
    if(stucount==stu):
        count = count + assessment[i,5]
        i=i+1
    else:
        sumc.append(count)
        count = 0
        count = count + assessment[i,5]
        i=i+1
        stucount=stu
sumc.append(count)
print(sumc)
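stucount=28400
i=0
a=[]
for stu in assessment[:,2:3]:
    if(stucount==stu):
        a.append(sumc[i])
        stucount = stu
    else:
        # new student: advance to that student's total before reading it,
        # otherwise this row would receive the previous student's sum
        i=i+1
        a.append(sumc[i])
        stucount = stu
print(a)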
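As a self-contained check of this fix, here is a minimal sketch that runs the same two-pass logic on the six sample rows from the question instead of the CSV. The function name group_sums_per_row and the plain-list inputs are assumptions made for illustration. Like the original loop, it assumes that rows with the same ID are adjacent.

```python
def group_sums_per_row(ids, values):
    """Return, for every row, the sum of values over that row's run of equal IDs."""
    if not ids:
        return []

    # First pass: one total per run of identical IDs.
    sums = []
    current_id = ids[0]
    running = 0
    for row_id, value in zip(ids, values):
        if row_id != current_id:
            sums.append(running)  # close the previous run
            running = 0
            current_id = row_id
        running += value
    sums.append(running)  # the final run is only closed here, after the loop

    # Second pass: repeat each total once per row of its run.
    result = []
    current_id = ids[0]
    i = 0
    for row_id in ids:
        if row_id != current_id:
            i += 1  # move to the next total before reading it
            current_id = row_id
        result.append(sums[i])
    return result


ids = [28400, 28400, 28400, 65478, 65478, 65478]
values = [4, 34, 9, 2, 5, 3]
print(group_sums_per_row(ids, values))  # [47, 47, 47, 10, 10, 10]
```

Without the sums.append(running) after the first loop, sums would hold only [47] for this sample, and the second pass would fail on the first 65478 row.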
Upvotes: 4
Reputation: 1921
Here, I'm just going by your initial problem statement of what you have and what you want to get.
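import pandas as pd

df = pd.DataFrame([[28400, 4],
                   [28400, 34],
                   [28400, 9],
                   [65478, 2],
                   [65478, 5],
                   [65478, 3]], columns=list('AB'))
# Total of column B for each distinct value of column A
sums = df.groupby('A').B.sum()
# Look up each row's A value in those totals
df.A.map(sums)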
And you get
0 47
1 47
2 47
3 10
4 10
5 10
Name: A, dtype: int64
Was this what you were looking for?
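The same result can also be obtained in one step with groupby(...).transform('sum'), which returns a Series aligned with the original rows. This is a minimal sketch on the same sample frame. The column names A and B are the illustrative ones used in this answer, not the names in the CSV.

```python
import pandas as pd

df = pd.DataFrame(
    [[28400, 4], [28400, 34], [28400, 9],
     [65478, 2], [65478, 5], [65478, 3]],
    columns=["A", "B"],  # illustrative names, not taken from the CSV
)

# transform('sum') broadcasts each group's total back onto every row of that group.
per_row_total = df.groupby("A")["B"].transform("sum")
print(per_row_total.tolist())  # [47, 47, 47, 10, 10, 10]
```

Unlike the loop in the question, this does not require rows with the same ID to be adjacent.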
Upvotes: 3
Reputation: 205
Place i=i+1 below stucount = stu and then try:
import pandas as pd
import numpy as np
assessment = pd.read_csv('/home/user/Documents/MOOC dataset original/studentVle2.csv')
assessment = assessment.values
count=0
stucount=28400
sumc=[]
i=0
for stu in assessment[:,2:3]:
    if(stucount==stu):
        count = count + assessment[i,5]
        i=i+1
    else:
        sumc.append(count)
        count = 0
        count = count + assessment[i,5]
        i=i+1
        stucount=stu
#print(sumc)
stucount=28400
i=0
a=[]
for stu in assessment[:,2:3]:
    if(stucount==stu):
        a.append(sumc[i])
        stucount = stu
    else:
        a.append(sumc[i])
        stucount = stu
        i=i+1
print(a)
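This changes the output accordingly and gets past the error on that line. Note, however, that sumc still lacks the last student's total, so sumc[i] can still go out of range on the rows that follow unless sumc.append(count) is added after the first loop.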
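To see how the position of i=i+1 affects the result, here is a small sketch on the sample rows from the question. The helper expand and its increment_first flag are hypothetical names introduced only for this comparison, and sums is assumed to already hold one total per student.

```python
def expand(ids, sums, increment_first):
    """Repeat one total per row; increment_first picks where the index advances."""
    out = []
    current_id = ids[0]
    i = 0
    for row_id in ids:
        if row_id != current_id:
            if increment_first:
                i += 1               # advance, then read the new student's total
                out.append(sums[i])
            else:
                out.append(sums[i])  # read first: still the previous student's total
                i += 1
            current_id = row_id
        else:
            out.append(sums[i])
    return out


ids = [28400, 28400, 28400, 65478, 65478, 65478]
sums = [47, 10]  # assumed complete: one total per student

print(expand(ids, sums, increment_first=True))   # [47, 47, 47, 10, 10, 10]
print(expand(ids, sums, increment_first=False))  # [47, 47, 47, 47, 10, 10]
```

With the incomplete list [47] that the question's first loop produces for this sample, both variants raise IndexError, only on different rows.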
Upvotes: 3
Reputation: 116
I think you should move i=i+1 to after the line that raises the error, a.append(sumc[i]), because in your code the index may run past the end of the list at the last group.
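If the index bookkeeping keeps going out of range, one way to avoid it altogether is to let NumPy do the grouping. The following is only a sketch on the sample rows from the question, assuming integer values. The array names ids and values are placeholders for the two columns that the question reads from the CSV.

```python
import numpy as np

ids = np.array([28400, 28400, 28400, 65478, 65478, 65478])
values = np.array([4, 34, 9, 2, 5, 3])

# inverse[k] is the position of ids[k] within the sorted unique IDs.
unique_ids, inverse = np.unique(ids, return_inverse=True)

# bincount with weights adds up the values that share the same position.
# The result is float, so it is cast back to int (assumes integer values).
totals = np.bincount(inverse, weights=values).astype(int)

# Index the per-ID totals by each row's position to get one total per row.
per_row = totals[inverse]
print(per_row.tolist())  # [47, 47, 47, 10, 10, 10]
```

There is no manually maintained counter here, so there is nothing that can run off the end of a list.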
Upvotes: 2