Edward

Reputation: 4623

Loop for pandas columns

I want to apply the Kruskal test to several columns. I set things up as below:

import pandas as pd
import scipy.stats
df = pd.DataFrame({'a':range(9), 'b':[1,2,3,1,2,3,1,2,3], 'group':['a', 'b', 'c']*3})

and then the loop:

groups = {}
res = []
for grp in df['group'].unique():
    for column in df[['a', 'b']]:
        groups[grp] = df[column][df['group']==grp].values
    args = groups.values()
g = scipy.stats.kruskal(*args)
res.append(g)
print(res)

I get

[KruskalResult(statistic=8.0000000000000036, pvalue=0.018315638888734137)]

But I want

[KruskalResult(statistic=0.80000000000000071, pvalue=0.67032004603563911)]
[KruskalResult(statistic=8.0000000000000036, pvalue=0.018315638888734137)]

Where is my mistake?

For a single column I do the following:

import pandas as pd
import scipy.stats
df = pd.DataFrame({'numbers':range(9), 'group':['a', 'b', 'c']*3})
groups = {}
for grp in df['group'].unique():
    groups[grp] = df['numbers'][df['group']==grp].values
print(groups)
args = groups.values()
scipy.stats.kruskal(*args)

Upvotes: 0

Views: 284

Answers (2)

Edward

Reputation: 4623

Before, I had written it like this:

groups = {}
res = []
for column in df[['a', 'b']]:
    for grp in df['group'].unique():
        groups[grp] = df[column][df['group']==grp].values
    args = groups.values()
g = scipy.stats.kruskal(*args)
res.append(g)
print(res)

and I got

[KruskalResult(statistic=8.0000000000000036, pvalue=0.018315638888734137)]

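The problem was the indentation: the kruskal call and res.append sit outside the column loop, so they run only once, for the last column.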
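To make the indentation effect concrete, here is a minimal sketch in plain Python, independent of pandas and scipy. The names `collected_after` and `collected_inside` are made up for illustration; the point is only that a statement dedented out of a loop runs once, after the loop has finished.

```python
columns = ['a', 'b']

# Statement placed AFTER the loop: it runs once and only sees
# whatever the loop variable held on the final iteration.
collected_after = []
for column in columns:
    last_seen = column
collected_after.append(last_seen)

# Statement placed INSIDE the loop body: it runs once per column.
collected_inside = []
for column in columns:
    collected_inside.append(column)

print(collected_after)   # ['b']
print(collected_inside)  # ['a', 'b']
```

This mirrors the snippet above: with the kruskal call dedented, only the last column is ever tested.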

Upvotes: 0

Zeugma

Reputation: 32085

Your for loops are nested the wrong way round. The single-column algorithm is the part that stays the same whichever column you choose, so the loop over columns must be the outer loop. In plain English: "for each column, apply the Kruskal test, which consists of the loop over df['group'].unique()":

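groups = {}
res = []
for column in df[['a', 'b']]:  # outer loop: one test per value column
    for grp in df['group'].unique():  # inner loop: this column's values for each group
        groups[grp] = df[column][df['group']==grp].values
    args = groups.values()
    # still inside the column loop, so the test runs once per column
    g = scipy.stats.kruskal(*args)
    res.append(g)
print(res)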
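As an alternative, the inner loop over groups could be handed to `groupby`. This is only a sketch, not part of the answer above: the helper name `kruskal_by_column` and its parameters are made up here, and it assumes the same `df` as in the question and that the value columns are selected by label.

```python
import pandas as pd
from scipy import stats

df = pd.DataFrame({'a': range(9),
                   'b': [1, 2, 3, 1, 2, 3, 1, 2, 3],
                   'group': ['a', 'b', 'c'] * 3})

def kruskal_by_column(frame, value_columns, group_column):
    """Run one Kruskal-Wallis test per value column (hypothetical helper)."""
    results = {}
    for column in value_columns:
        # groupby yields (group label, Series) pairs; keep only the values
        samples = [values.to_numpy()
                   for _, values in frame.groupby(group_column)[column]]
        results[column] = stats.kruskal(*samples)
    return results

results = kruskal_by_column(df, ['a', 'b'], 'group')
for column, result in results.items():
    print(column, result.statistic, result.pvalue)
```

Keying the results by column name makes it easier to see which test belongs to which column than a plain list does. On the question's data this should give statistics of about 0.8 for 'a' and 8.0 for 'b'.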

Upvotes: 1
