Reputation: 368
The following function counts the number of points within different segments of a circle. It works as intended when exporting counts for a single point in time. However, when I try to export the counts at different points in time using a groupby call, it still combines all counts into a single output.
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Time' : ['19:50:10.1','19:50:10.1','19:50:10.1','19:50:10.1','19:50:10.2','19:50:10.2','19:50:10.2','19:50:10.2'],
    'id' : ['A','B','C','D','A','B','C','D'],
    'x' : [1,8,0,-5,1,-1,-6,0],
    'y' : [-5,2,-5,2,5,-5,-2,2],
    'X2' : [0,0,0,0,0,0,0,0],
    'Y2' : [0,0,0,0,0,0,0,0],
    'Angle' : [0,0,0,0,0,0,0,0],
    })
def checkPoint(x, y, rotation_angle, refX, refY, radius = 10):
    # boundaries of the four 90-degree segments, offset by the rotation
    section_angle_start = [(i + rotation_angle - 45) for i in [0, 90, 180, 270, 360]]

    # angle of the point relative to the reference point, in degrees
    Angle = np.arctan2(x-refX, y-refY) * 180 / np.pi
    Angle = Angle % 360

    # adjust range
    if Angle > section_angle_start[-1]:
        Angle -= 360
    elif Angle < section_angle_start[0]:
        Angle += 360

    for i in range(4):
        if section_angle_start[i] < Angle < section_angle_start[i+1]:
            break
    else:
        # for-else: no segment matched (point lies on a boundary)
        i = 0

    return i+1

tmp = []
result = []
The following is my attempt to pass the checkPoint function to each unique group in Time.
for group in df.groupby('Time'):
    for i, row in df.iterrows():
        seg = checkPoint(row.x, row.y, row.Angle, row.X2, row.Y2)
        tmp.append(seg)
    result.append([tmp.count(i) for i in [1,2,3,4]])

df = pd.DataFrame(result, columns = ['1','2','3','4'])
Out:
   1  2  3  4
0  2  1  3  2
1  4  2  6  4
Intended Out:
   1  2  3  4
0  0  1  2  1
1  2  0  1  1
Upvotes: 0
Views: 68
Reputation: 8269
Your inner loop iterates over the entire DataFrame on every pass of the outer loop, which produces the double-counting you are observing.
As @Kenan suggested, you can limit the inner loop to the group:
tmp = []
result = []
for group in df.groupby('Time'):
    # group is a (key, sub-DataFrame) tuple, so group[1] holds the rows for this Time
    for i, row in group[1].iterrows():
        # column names as in the question's DataFrame
        seg = checkPoint(row.x, row.y, row.Angle, row.X2, row.Y2)
        tmp.append(seg)
    result.append([tmp.count(i) for i in [1,2,3,4]])

df_result = pd.DataFrame(result, columns = ['1','2','3','4'])
print(df_result)
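Resulting in the output below. The first row is now correct, but tmp is never reset between groups, so the second row still includes the first group's counts: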
   1  2  3  4
0  0  1  2  1
1  2  1  3  2
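One way to get per-group counts while keeping the explicit loop is to start a fresh tmp list for each group. This is only a sketch: it assumes the question's DataFrame, its column names (x, y, X2, Y2) and its checkPoint function, all repeated here so the snippet runs on its own. The loop variables time and group are just illustrative names.

```python
import numpy as np
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'Time': ['19:50:10.1'] * 4 + ['19:50:10.2'] * 4,
    'id': list('ABCD') * 2,
    'x': [1, 8, 0, -5, 1, -1, -6, 0],
    'y': [-5, 2, -5, 2, 5, -5, -2, 2],
    'X2': [0] * 8,
    'Y2': [0] * 8,
    'Angle': [0] * 8,
})

def checkPoint(x, y, rotation_angle, refX, refY, radius=10):
    # Same logic as the function in the question.
    section_angle_start = [(i + rotation_angle - 45) for i in [0, 90, 180, 270, 360]]
    Angle = np.arctan2(x - refX, y - refY) * 180 / np.pi
    Angle = Angle % 360
    if Angle > section_angle_start[-1]:
        Angle -= 360
    elif Angle < section_angle_start[0]:
        Angle += 360
    for i in range(4):
        if section_angle_start[i] < Angle < section_angle_start[i + 1]:
            break
    else:
        i = 0
    return i + 1

result = []
# Iterating over a groupby yields (key, sub-DataFrame) pairs.
for time, group in df.groupby('Time'):
    tmp = []  # fresh list per group, so counts do not carry over
    for _, row in group.iterrows():
        tmp.append(checkPoint(row.x, row.y, row.Angle, row.X2, row.Y2))
    result.append([tmp.count(s) for s in [1, 2, 3, 4]])

df_result = pd.DataFrame(result, columns=['1', '2', '3', '4'])
print(df_result)
```

With the sample data this should give one row per Time value that matches the intended output in the question.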
Or you can use a groupby-apply construct to avoid the explicit loop:
def result(g):
    # count the points of one Time group per segment
    tmp = []
    for i, row in g.iterrows():
        # column names as in the question's DataFrame
        seg = checkPoint(row.x, row.y, row.Angle, row.X2, row.Y2)
        tmp.append(seg)
    return pd.Series([tmp.count(i) for i in [1,2,3,4]], index=[1,2,3,4])

print(df.groupby('Time').apply(result))
Which gets you:
            1  2  3  4
Time
19:50:10.1  0  1  2  1
19:50:10.2  2  0  1  1
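If the row-by-row loop becomes slow on larger data, the segment lookup can also be computed on whole columns at once. The following is a rough sketch rather than a drop-in replacement for checkPoint: it assumes the question's column names, and the names angle, shifted, segment and counts are introduced here for illustration. It also treats boundaries differently. A point lying exactly on a segment boundary is assigned to the segment that starts there, whereas checkPoint sends such points to segment 1.

```python
import numpy as np
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'Time': ['19:50:10.1'] * 4 + ['19:50:10.2'] * 4,
    'id': list('ABCD') * 2,
    'x': [1, 8, 0, -5, 1, -1, -6, 0],
    'y': [-5, 2, -5, 2, 5, -5, -2, 2],
    'X2': [0] * 8,
    'Y2': [0] * 8,
    'Angle': [0] * 8,
})

# Angle of every point relative to its reference point, in degrees in [0, 360).
# The (x, y) argument order mirrors checkPoint in the question.
angle = np.degrees(np.arctan2(df['x'] - df['X2'], df['y'] - df['Y2'])) % 360

# Shift so that segment 1 starts at 0 degrees; each segment then spans 90 degrees.
shifted = (angle - df['Angle'] + 45) % 360

# Integer segment number 1..4; the extra "% 4" guards against a float
# rounding up to exactly 360.
segment = ((shifted // 90).astype(int) % 4 + 1).rename('segment')

# Count points per Time and segment, keeping all four segment columns.
counts = pd.crosstab(df['Time'], segment).reindex(columns=[1, 2, 3, 4], fill_value=0)
print(counts)
```

For the sample data, none of the points sit on a boundary, so the counts should agree with the groupby-apply result above.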
Upvotes: 2