ju.

Reputation: 344

Pandas groupby: Nested loop fails with key error

I have a CSV file with the following test content:

Name;N_D;mu;set
A;10;20;0
B;20;30;0
C;30;40;0
x;5;15;1
y;15;25;1
z;25;35;1

I read the file with pandas, group the data, and then iterate over the groups. Within each group, I want to iterate over the rows:

import pandas as pd
df = pd.read_csv("samples_test.csv", delimiter=";", header=0)

groups = df.groupby("set")
for name, group in groups:
    somestuff = [group["N_D"], group["mu"], name]

    for i, txt in enumerate(group["Name"]):
        print(txt, group["Name"][i])

The code fails on the line print(txt, group["Name"][i]) at the first element of the second group with a KeyError. I don't understand why.

Upvotes: 1

Views: 238

Answers (1)

anky

Reputation: 75080

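Your code fails because group["Name"][i] looks up i as an index label, not as a position. Each group keeps the row labels of the original DataFrame, so the second group is labelled 3, 4, 5, while the enumerate counter restarts at 0 for every group. Label 0 does not exist in the second group, hence the KeyError. (Note: prefer .loc[] or .iloc[] over chained indexing such as group["Name"][i].) Printing the counter alongside the series shows the labels: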

groups = df.groupby("set")
for name, group in groups:
    somestuff = [group["N_D"], group["mu"], name]

    for i, txt in enumerate(group["Name"]):
        print(i,group["Name"])

0 0    A
  1    B
  2    C
Name: Name, dtype: object
1 0    A
  1    B
  2    C
.......
....
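To make the mismatch explicit, here is a minimal sketch that collects the index labels of each group and reproduces the failing lookup. It assumes the sample data from the question, read from an in-memory string instead of samples_test.csv; the names csv_text, index_by_group, second and lookup_failed are only illustrative.

```python
import io

import pandas as pd

# Sample data from the question, kept in memory instead of a file (assumption).
csv_text = """Name;N_D;mu;set
A;10;20;0
B;20;30;0
C;30;40;0
x;5;15;1
y;15;25;1
z;25;35;1
"""
df = pd.read_csv(io.StringIO(csv_text), delimiter=";", header=0)

# Each group keeps the row labels it had in the original DataFrame.
index_by_group = {
    int(name): group.index.tolist() for name, group in df.groupby("set")
}
print(index_by_group)

# In the second group the labels are 3, 4, 5, so label 0 cannot be found.
second = df.groupby("set").get_group(1)["Name"]
try:
    second[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print("label 0 missing in second group:", lookup_failed)
```

The first group happens to work only because its labels 0, 1, 2 coincide with the enumerate counter.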

Change your code as follows, using .iloc[] for positional access and get_loc to find the position of the column:

groups = df.groupby("set")
for name, group in groups:
    somestuff = [group["N_D"], group["mu"], name]
    for i, txt in enumerate(group["Name"]):
        print(txt, group.iloc[i, group.columns.get_loc("Name")])

A A
B B
C C
x x
y y
z z
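As a possible alternative (not part of the fix above), the labels can be reset inside each group so that they match the enumerate counter, or the rows can be iterated directly with itertuples, which avoids the second lookup altogether. This is only a sketch under the same assumption as before: the question's sample data read from an in-memory string. The names pairs and names_per_row are illustrative.

```python
import io

import pandas as pd

# Sample data from the question, kept in memory instead of a file (assumption).
csv_text = """Name;N_D;mu;set
A;10;20;0
B;20;30;0
C;30;40;0
x;5;15;1
y;15;25;1
z;25;35;1
"""
df = pd.read_csv(io.StringIO(csv_text), delimiter=";", header=0)

# Option 1: renumber the labels per group so they start at 0 again.
pairs = []
for name, group in df.groupby("set"):
    names = group["Name"].reset_index(drop=True)
    for i, txt in enumerate(names):
        pairs.append((txt, names.loc[i]))
print(pairs)

# Option 2: iterate over the rows directly; no index lookup is needed.
names_per_row = []
for name, group in df.groupby("set"):
    for row in group.itertuples(index=False):
        names_per_row.append((int(name), row.Name, int(row.N_D), int(row.mu)))
print(names_per_row)
```

Option 2 is usually the simpler choice when every column of a row is needed inside the inner loop.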

Upvotes: 1
