chris chan

Reputation: 17

"TypeError: cannot do positional indexing on Index with these indexers [gen9ou] of type str" when running Dash app

I am getting the aforementioned error whenever my Dash app runs my callback function. I am trying to create a line graph that filters data by a given tier and by the top n most-used Pokemon in that tier. The traceback points to the line I marked below. Why am I getting a TypeError there?

def update_graph(top_n, given_tier):
    # Filter df_final by the selected tier
    filtered_df = df_final[df_final['Tier'] == given_tier]

    sorted_df = df_final.sort_values(by=['Month', 'Usage Rate'], ascending=[False, False])
    #Get the first top_n rows in order to extract the pokemon names
    first_n_rows = sorted_df.head(top_n)  # <--- traced error
    top_n_array = first_n_rows['Name'].values
    # print(sorted_df)
    # print(top_5_array)

    # Make an empty df with columns 'Name' and 'Usage Rate'
    concat_df = pd.DataFrame(columns=['Name', 'Usage Rate', 'Month', 'Tier'])
    pd.set_option('display.max_rows', 100)
    # i = specific name
    for i in top_n_array:
        specific_name = sorted_df.loc[sorted_df['Name'] == i, ['Name', 'Usage Rate', 'Month', 'Tier']]
        concat_df = pd.concat([concat_df, specific_name], ignore_index=True)

    print(concat_df)

    fig = px.line(concat_df, x='Month', y='Usage Rate', color="Name", title=f'Top {top_n} Results')
    fig.update_layout(yaxis=dict(autorange=True))
    return fig

My sample dataset comes from an Excel file with the column names: Rank, Name, Usage Rate, Raw Usage, Raw %, Tier, Month. I am trying to create a line graph in Plotly Dash that I can filter by tier (gen9ou, gen9uu, gen9ru, etc.) and by the top n Pokemon used in that tier.

Upvotes: 0

Views: 119

Answers (1)

First of all, your code sorts df_final instead of filtered_df, so the top N Pokémon are selected from the entire dataset rather than from the filtered tier. Your function also does not account for the different months when selecting the top N Pokémon. Finally, you take the top N rows directly from the sorted dataframe, which is not correct, since those rows come mainly from whichever month sorts first rather than from every month. You can try this approach:

import pandas as pd
import plotly.express as px

# Larger sample DataFrame to test
data = {
    'Rank': [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6],
    'Name': ['Pikachu', 'Charizard', 'Bulbasaur', 'Squirtle', 'Jigglypuff', 'Gengar',
             'Pikachu', 'Charizard', 'Bulbasaur', 'Squirtle', 'Jigglypuff', 'Gengar',
             'Pikachu', 'Charizard', 'Bulbasaur', 'Squirtle', 'Jigglypuff', 'Gengar'],
    'Usage Rate': [20, 15, 10, 5, 25, 30, 18, 12, 8, 4, 22, 28, 19, 14, 9, 6, 23, 29],
    'Raw Usage': [2000, 1500, 1000, 500, 2500, 3000, 1800, 1200, 800, 400, 2200, 2800, 1900, 1400, 900, 600, 2300, 2900],
    'Raw %': [2.0, 1.5, 1.0, 0.5, 2.5, 3.0, 1.8, 1.2, 0.8, 0.4, 2.2, 2.8, 1.9, 1.4, 0.9, 0.6, 2.3, 2.9],
    'Tier': ['gen9ou', 'gen9ou', 'gen9ou', 'gen9uu', 'gen9uu', 'gen9ou',
             'gen9ou', 'gen9ou', 'gen9ou', 'gen9uu', 'gen9uu', 'gen9ou',
             'gen9ou', 'gen9ou', 'gen9ou', 'gen9uu', 'gen9uu', 'gen9ou'],
    'Month': ['January', 'January', 'January', 'January', 'January', 'January',
              'February', 'February', 'February', 'February', 'February', 'February',
              'March', 'March', 'March', 'March', 'March', 'March']
}
df_final = pd.DataFrame(data)

def update_graph(top_n, given_tier):
    filtered_df = df_final[df_final['Tier'] == given_tier]

    if filtered_df.empty:
        print(f"No data available for tier: {given_tier}")
        return px.line(title=f'Top {top_n} Results')

    sorted_df = filtered_df.sort_values(by=['Month', 'Usage Rate'], ascending=[False, False])

    print("Filtered and sorted DataFrame:")
    print(sorted_df)

    top_n_names = sorted_df.groupby('Month').apply(lambda x: x.nlargest(top_n, 'Usage Rate')).reset_index(drop=True)
    top_n_array = top_n_names['Name'].unique()

    print(f"Top {top_n} names: {top_n_array}")

    concat_df = pd.DataFrame(columns=['Name', 'Usage Rate', 'Month', 'Tier'])
    pd.set_option('display.max_rows', 100)

    for i in top_n_array:
        specific_name = filtered_df.loc[filtered_df['Name'] == i, ['Name', 'Usage Rate', 'Month', 'Tier']]
        concat_df = pd.concat([concat_df, specific_name], ignore_index=True)

    print("Concatenated DataFrame for plotting:")
    print(concat_df)

    if concat_df.empty:
        print(f"No data available for the top {top_n} Pokémon in tier: {given_tier}")
        return px.line(title=f'Top {top_n} Results')

    fig = px.line(concat_df, x='Month', y='Usage Rate', color="Name", title=f'Top {top_n} Results')
    fig.update_layout(yaxis=dict(autorange=True))
    return fig

fig = update_graph(3, 'gen9ou')
fig.show()

which gives you the resulting line chart (image not included).
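As for the TypeError itself: the message names gen9ou as the indexer, which suggests that the tier string is arriving in the top_n parameter. This is an assumption, since the callback decorator is not shown in the question. Dash passes callback arguments in the order the Input objects are declared, so if the tier dropdown is listed before the top-n input, the two values would be swapped relative to update_graph(top_n, given_tier). The sketch below only reproduces the error with a tiny stand-in dataframe; top_rows is a hypothetical helper, not part of your app.

```python
import pandas as pd

# Tiny stand-in frame; the column names follow the question.
df = pd.DataFrame({
    "Name": ["Gengar", "Pikachu", "Charizard"],
    "Usage Rate": [30, 20, 15],
    "Tier": ["gen9ou", "gen9ou", "gen9ou"],
})

def top_rows(frame, top_n):
    """Hypothetical helper: return the first top_n rows.

    int() turns a numeric string such as "2" into 2 and raises a
    ValueError right away if a tier name is passed by mistake.
    """
    return frame.head(int(top_n))

# Passing the tier string where a row count is expected raises a TypeError,
# because head() slices positionally and needs an integer.
try:
    df.head("gen9ou")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(top_rows(df, "2"))
```

If this matches what you see, check that the order of the Input entries in your @app.callback decorator matches the order of the parameters in update_graph.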

Upvotes: 1
