Reputation: 660
My Python code returns the following error message:
File "/Users/christianmagelssen/Desktop/Koding/analyse/moduler/resultater.py", line 64, in allokereGrupper
group1['GRUPPE'] = velger
ValueError: Length of values (1) does not match length of index (3)
I have tried many different things to solve this issue.
I know that my code worked 3 months ago, but on a different dataset. Can someone help me understand what I am doing wrong here?
Here is all my code:
results.py
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import random

class Resultat:
    def lastInnOgRydd(path, LagreCsv = False):
        df = pd.read_csv(path, skiprows=2, decimal=".")
        filt = df['FINISH'] == 'DNF'
        dnf = df[filt]
        dnf = dnf.replace('DNF', 1)
        if LagreCsv == True:
            dnf.to_csv('DNF.csv')
        df.replace('DNF', np.NaN, inplace=True)
        df.replace('GARBAGE GARBAGE', np.NaN, inplace=True)  # There is probably a better solution for this
        df.dropna(subset=['FINISH'], inplace=True)
        df.dropna(subset=['NAME'], inplace=True)
        return df

    def endreDataType(df):
        df["FINISH"] = df["FINISH"].str.replace(',', '.').astype(float)
        df["INTER 1"] = df["INTER 1"].str.replace(',', '.').astype(float)
        df["SECTION IM4-FINISH"] = df["SECTION IM4-FINISH"].str.replace(',', '.').astype(float)
        df["COMMENT"] = df['COMMENT'].astype(int)
        df["COMMENT"] = df['COMMENT'].astype(str)
        df["COMMENT"] = df['COMMENT'].str.replace('11', 'COURSE 1')
        df["COMMENT"] = df['COMMENT'].str.replace('22', 'COURSE 2')
        df["COMMENT"] = df['COMMENT'].str.replace('33', 'COURSE 3')
        df["COMMENT"] = df['COMMENT'].str.replace('55', 'UTKJORING')
        df["COMMENT"] = df['COMMENT'].str.replace('99', 'STRAIGHT-GLIDING')
        pd.to_numeric(df['FINISH'], downcast='float', errors='raise')
        pd.to_numeric(df['INTER 1'], downcast='float', errors='raise')
        pd.to_numeric(df['SECTION IM4-FINISH'], downcast='float', errors='raise')
        return df

    def navnendringCommentTilCourse(df):
        df.rename(columns={'COMMENT': 'COURSE'}, inplace=True)
        return df

    def finnBesteRunder(df):
        grupper = df.groupby(['BIB#', 'COURSE'])
        bestruns = grupper['FINISH'].apply(lambda x: x.nsmallest(2).mean()).reset_index()
        df1 = bestruns.pivot('BIB#', 'COURSE', 'FINISH').reset_index()
        df1['GJENNOMSNITT'] = df1['COURSE 1'].add(df1['COURSE 2']).add(df1['COURSE 3']).div(3)
        #df1['PRESTASJON'] = df1['MEAN'].div(df1['STRAIGHT-GLIDING'])  # removing this for now, but it must be included in the proper analysis
        return df1

    def allokereGrupper(df1):
        df1 = df1.sort_values(by='GJENNOMSNITT', ascending=True)
        mask = np.arange(len(df1)) % 2
        group1 = df1.loc[mask == 0]
        group1 = group1.drop_duplicates(subset=['BIB#'])
        print(group1)
        group2 = df1.loc[mask == 1]
        group2 = group2.drop_duplicates(subset=['BIB#'])
        print(group2)
        grupper = ['RANDOM', 'BLOCKED']
        for i in group1['BIB#']:
            velger = random.sample(grupper, k=1)
            group1['GRUPPE'] = velger
main.py
from moduler import Resultat
path = "http://www.cmagelssen.no/pilot2.csv"
df = Resultat.lastInnOgRydd(path)
df = Resultat.endreDataType(df)
df = Resultat.navnendringCommentTilCourse(df)
df = Resultat.finnBesteRunder(df)
df = Resultat.allokereGrupper(df)
Upvotes: 1
Views: 251
Reputation: 2498
The problem is that velger is a list: random.sample(grupper, k=1) returns either ['RANDOM'] or ['BLOCKED']. To fill the 'GRUPPE' column with a single value, you must assign a scalar, such as a string or an int.
If you assign a list instead, pandas expects it to have the same length as the dataframe and fills each row with the corresponding element (the 3rd row gets the 3rd list element, for example). Your list has length one, while group1 has three rows, hence the error. Maybe group1 had only one row in your previous dataset.
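To make the difference concrete, here is a minimal sketch that should reproduce the error. The three-row frame is a made-up stand-in for group1, not your real data, and the exact wording of the message may vary between pandas versions.

```python
import pandas as pd

# Hypothetical three-row frame standing in for group1 (not the real data)
group1 = pd.DataFrame({"BIB#": [1, 2, 3]})

# A scalar is broadcast to every row, so this works
group1["GRUPPE"] = "RANDOM"

# A list must have one element per row; a one-element list fails on three rows
try:
    group1["GRUPPE"] = ["RANDOM"]
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

With a one-row frame the second assignment would go through, which may be why the same code ran on the earlier dataset.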
It's not entirely clear to me what the goal of your code is, but if your intention is to fill every cell in the 'GRUPPE' column with the same value (either 'RANDOM' or 'BLOCKED'), then change:
group1['GRUPPE'] = velger
to
group1['GRUPPE'] = velger[0]
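Note that with this change the loop still overwrites the whole column on every iteration, so only the last draw survives. If what you actually want is a separate random draw for each row (this is a guess at the intent, the question does not say so), one possible sketch is to draw all labels at once with random.choices and skip the loop. The frame below is a hypothetical stand-in for group1.

```python
import random

import pandas as pd

# Hypothetical stand-in for group1; only the BIB# column matters here
group1 = pd.DataFrame({"BIB#": [11, 12, 13, 14]})
grupper = ["RANDOM", "BLOCKED"]

# Draw one label per row (with replacement) so the list length matches the index
group1["GRUPPE"] = random.choices(grupper, k=len(group1))

print(group1)
```

Because the list has exactly len(group1) elements, pandas can align it row by row and no length error occurs. The draws are independent, so the two labels will not necessarily be balanced.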
Upvotes: 1