hahahahahaha

Reputation: 1

"TypeError: 'Categorical' with dtype category does not support reduction 'mean'" when using Fastai to predict a column value

Code:

The following code reads a CSV file and processes it with Fastai.

from pathlib import Path

import pandas as pd
from fastai.tabular.all import *

path = Path("D:\\workdir\\req4_IndustryControl_0\\IndustryControl\\data")
df = pd.read_csv(path/"tech_datas_his.csv")

def is_empty(value):
    return pd.isnull(value) or value == ''

print(df.shape)
df = df[df.map(is_empty).any(axis=1) == False]
print(df.shape)

col_list = df.columns.tolist()
y_names=['KP_D74']
category_names = ['databrand','KP_STA','tag1','tag2']
constant_names = list(set(col_list) - set(category_names)- set(y_names))
procs = [Categorify, FillMissing, Normalize]
df.info()

dls = TabularDataLoaders.from_df(df,path, procs=procs, cat_names=category_names, cont_names=constant_names,
                                 y_names=y_names, valid_idx=list(range(1,10000)), bs=64)
dls.show_batch(10)
learn = tabular_learner(dls,y_range=(0.0,4.0))
learn.fit_one_cycle(5)
learn.save("proportion_predict")
test_pf = pd.read_csv(path/"test.csv")
row, clas, probs = learn.predict(test_pf.iloc[0])
print(row)
print(clas,probs)

CSV structure:

 #   Column     Non-Null Count   Dtype
---  ------     --------------   -----
 0   dataid     170717 non-null  int64
 1   dataprod   170717 non-null  object
 2   databatch  170717 non-null  object
 3   databrand  170717 non-null  object
 4   datatime   170717 non-null  object
 5   KP_STA     170717 non-null  object
 79  KP_D74     170717 non-null  float64
 89  tag1       170717 non-null  int64
 90  tag2       170717 non-null  int64
dtypes: float64(74), int64(12), object(5)

Error Info

TypeError: 'Categorical' with dtype category does not support reduction 'mean'

The error is raised by the 'dls = TabularDataLoaders.from_df(...)' call.

Traceback: (posted as a screenshot, not reproduced here)

Upvotes: 0

Views: 159

Answers (1)

Taylor England

Reputation: 11

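This likely means that one of your columns has a type that does not match how it is used: a column passed as continuous (cont_names) has to be numeric, because procs such as Normalize take its mean. In your code, constant_names is every column that is not in category_names or y_names, so the object columns dataprod, databatch and datatime end up being treated as continuous. You can check each column's type with dtype.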

# Print each column's name, its dtype, and whether that dtype is categorical.
for col in df.columns:
    print(col)
    print(df[col].dtype)
    print(isinstance(df[col].dtype, pd.CategoricalDtype))
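
If the check turns up non-numeric columns in the continuous list, one possible fix is to build that list from the numeric columns only. The sketch below uses a small made-up DataFrame in place of the real CSV (the values and the reduced column set are assumptions for illustration). It first reproduces the error message with plain pandas, then filters with select_dtypes. It also assumes that fastai converts object columns to category before applying the procs, which would explain why the message mentions 'Categorical'; I have not confirmed that from the traceback.

```python
import pandas as pd

# Toy frame standing in for the question's CSV (values are made up).
df = pd.DataFrame({
    "dataid": [1, 2, 3],
    "databatch": ["b1", "b2", "b3"],   # object column, as in the question
    "databrand": ["x", "y", "x"],
    "KP_D74": [0.5, 1.5, 2.5],
    "tag1": [0, 1, 0],
})

y_names = ["KP_D74"]
category_names = ["databrand", "tag1"]

# Reproduce the message: a category-typed column cannot be averaged.
try:
    df["databatch"].astype("category").mean()
except TypeError as err:
    print(err)

# Keep only numeric columns as continuous features,
# excluding the categorical columns and the target.
numeric_cols = df.select_dtypes(include="number").columns
excluded = set(category_names) | set(y_names)
constant_names = [c for c in numeric_cols if c not in excluded]
print(constant_names)
```

With the real data, the resulting constant_names would then be passed as cont_names to TabularDataLoaders.from_df. Whether the dropped object columns should instead be added to category_names, and whether an ID column such as dataid belongs among the features at all, depends on the data.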

Upvotes: 1
