Leendert

Reputation: 11

pingouin mixed anova: ValueError: cannot convert float NaN to integer

I ran an experiment in which participants had to label a tree using a desktop application (CC) and a Virtual Reality application (VR). For each application they had to label a different tree. There were 3 trees: easy, medium and hard. I want to investigate whether demographic groups interact with tree type in their effect on performance.

The example below is for trunk accuracy.

I have the following dataframe: [image of the dataframe]

I am using pingouin.mixed_anova to perform some tests. At first I tried to use treetype as a between-subject variable (which is correct) and that didn't work. If I use sensegroup, VRexpgroup or deskexpgroup as between variables there is no problem, but when I use treetype it returns: ValueError: cannot convert float NaN to integer. I then thought I had realised my mistake and tried to use treetype as a within-subject variable (which is incorrect), but it still returns the same error.

See my code snippet below. I make sure treetype is lowercase and categorical (which I did not have to do for the other within variables). I also made sure that there are no non-numeric or NaN values in the value column. I then print the unique values, which are correct numbers. I loop over the dependent variables, between variables and within variables, which gives no errors as long as I don't include 'treetype'.

code snippet:

import pandas as pd
import pingouin as pg
from IPython.display import display  # only needed outside a notebook

# Assuming melted_pers_df is your DataFrame containing the data
for dev in dependent_variables:
    for group_var in group_variables:
        for within_var in within_variables:
            # Select relevant data for the current dependent variable
            rel_df = melted_pers_df.loc[melted_pers_df['measurement'] == dev].copy()

            # Normalise 'treetype' to lowercase and make it categorical
            rel_df['treetype'] = rel_df['treetype'].str.lower()
            rel_df['treetype'] = rel_df['treetype'].astype('category')

            # Convert 'value' column to numeric type; non-numeric entries become NaN
            rel_df['value'] = pd.to_numeric(rel_df['value'], errors='coerce')

            # Drop rows with NaN values in the 'value' column
            rel_df = rel_df.dropna(subset=['value'])

            # Print unique values of the 'value' column
            print(f"Unique values of 'value' for {within_var} + {group_var} + {dev}: {rel_df['value'].unique()}")

            # Print the DataFrame just before performing mixed ANOVA
            print(f"DataFrame just before mixed ANOVA for {within_var} + {group_var} + {dev}:")
            display(rel_df)

            # Perform the mixed ANOVA for this combination of variables
            mixed_anova_result = pg.mixed_anova(dv='value', between=group_var, within=within_var, subject='Participant ID', data=rel_df)

            # Print a more readable version of the mixed ANOVA result
            print(f"\nMixed ANOVA test result for {within_var} + {group_var} + {dev}:")
            print(mixed_anova_result.round(3))
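
A quick sanity check on the design may help narrow this down. The sketch below is only an illustration, not a confirmed diagnosis of the ValueError: it assumes that a mixed design expects the between factor to be constant within each participant and every participant to have a value for every level of the within factor. The helper `check_mixed_design`, the toy data and the `application` column are hypothetical; only `Participant ID`, `treetype`, `sensegroup` and `value` come from the question.

```python
import pandas as pd

def check_mixed_design(df, subject, within, between):
    """Report subjects that break two assumed conditions of a mixed design."""
    # Distinct between-factor levels per subject (expected: exactly 1).
    between_levels = df.groupby(subject)[between].nunique()
    # Observation counts per subject x within-level cell (expected: no zeros).
    cells = pd.crosstab(df[subject], df[within])
    return {
        "subjects_with_varying_between": between_levels[between_levels > 1].index.tolist(),
        "subjects_missing_within_levels": cells[(cells == 0).any(axis=1)].index.tolist(),
    }

# Hypothetical toy data: each participant labels two of the three trees,
# one per application, mirroring the setup described in the question.
toy = pd.DataFrame({
    "Participant ID": [1, 1, 2, 2, 3, 3],
    "application": ["CC", "VR"] * 3,
    "treetype": ["easy", "medium", "medium", "hard", "hard", "easy"],
    "sensegroup": ["a", "a", "b", "b", "a", "a"],
    "value": [0.9, 0.8, 0.7, 0.6, 0.5, 0.95],
})

# treetype used as the between factor: it changes within every participant.
as_between = check_mixed_design(toy, "Participant ID", within="application", between="treetype")
# treetype used as the within factor: every participant lacks one tree type.
as_within = check_mixed_design(toy, "Participant ID", within="treetype", between="sensegroup")

print(as_between)
print(as_within)
```

If either list is non-empty for the real data, treetype is neither a clean between-subject factor nor a fully crossed within-subject factor, which could be related to the error; whether it is the actual cause would still need to be verified against pingouin itself.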

I looked for alternative packages that offer a mixed ANOVA but couldn't find any.

Upvotes: 0

Views: 147

Answers (0)
