hartono -

Reputation: 41

Can't convert column to category dtype in pandas with read_csv

I have data in a CSV file and load it with read_csv in pandas. I convert 6 columns to float32 and that works, but the category column is not converted.

I have checked my 'div' column and there is no problem with it:

df_concat['div'].unique()

array(['L', 'J', 'K', 'U', 'E', 'B', 'A', 'C', 'N', 'X', 'M', 'O', 'D',
       'I', 'P', 'Q', 'S', 'R', 'T'], dtype=object)

When I limit the data with nrows=4000000, the column is successfully converted to the category dtype. What is wrong?

This is my code:

import pandas as pd

names = ['bdate', 'nama_site', 'kode_store', 'div', 'merdivdesc', 'cat', 'catdesc', 'subcat', 'subcatdesc', 'brand', 'sku', 'sku_desc', 'tillcode', 'netsales', 'profit', 'margin', 'qty']

dtype = {
    'netsales' : 'float32', 'profit' : 'float32', 'margin' : 'float32', 'qty' : 'float32',
    'div' : 'category'
}

data = pd.read_csv('clean_jan20_minified.csv', sep='|', dtype=dtype, chunksize=20000, names=names, skiprows=[0], nrows=4000000)

chunk_list = []
for chunk in data:
    chunk_list.append(chunk)

df_concat = pd.concat(chunk_list, ignore_index=True)

When I convert manually with df_concat['div'] = df_concat['div'].astype('category'), it works, but I need the conversion to happen in read_csv.

Upvotes: 2

Views: 418

Answers (1)

David Erickson

Reputation: 16673

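It looks like the category dtype is lost in pd.concat rather than in read_csv. With chunksize, each chunk infers its own categories, and when the chunks do not all share the same categories, pd.concat falls back to object.

See the section just above "General guidelines" at the end of this article: https://pbpython.com/pandas_dtypes_cat.html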

"In this case, the data is still there but the type has been converted to an object. Once again, this is pandas attempt to combine the data without throwing errors but not making assumptions. If you want to convert to a category data type now, you can use astype('category') ."
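To illustrate the behavior the article describes, here is a minimal sketch with two small made-up chunks (the frames and their values are hypothetical, not taken from the question's file). It suggests that concatenating categorical columns with different category sets drops the categorical dtype, while chunks sharing the same categories keep it:

```python
import pandas as pd

# Hypothetical chunks whose 'div' categories were inferred separately,
# so each chunk ends up with a different set of categories.
chunk_a = pd.DataFrame({"div": pd.Series(["L", "J", "K"], dtype="category")})
chunk_b = pd.DataFrame({"div": pd.Series(["L", "U"], dtype="category")})

# Different category sets: the result is no longer categorical
# (it falls back to a plain, non-categorical dtype).
mismatched = pd.concat([chunk_a, chunk_b], ignore_index=True)
mismatched_is_categorical = isinstance(mismatched["div"].dtype, pd.CategoricalDtype)

# Same category set (order of appearance does not matter here):
# the categorical dtype survives the concat.
chunk_c = pd.DataFrame({"div": pd.Series(["K", "J", "L"], dtype="category")})
matched = pd.concat([chunk_a, chunk_c], ignore_index=True)
matched_is_categorical = isinstance(matched["div"].dtype, pd.CategoricalDtype)

print(mismatched_is_categorical, matched_is_categorical)
```

This may also be why the nrows=4000000 run appeared to work: if every chunk in that range happened to contain the same set of 'div' values, the dtypes would match. That is a guess, since the data is not available.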

Also, you might want to try .reorder_categories per this post: pandas - concat with columns of same categories turns to object
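Since the question asks for the conversion to happen at read time, one option that may work is to pass an explicit pd.CategoricalDtype in the dtype mapping, so that every chunk gets identical categories. The sketch below assumes the full set of 'div' values is known in advance, and it uses a tiny in-memory CSV (csv_text, a hypothetical stand-in) instead of the real file:

```python
import io

import pandas as pd

# Hypothetical stand-in for the pipe-separated file in the question.
csv_text = "div|netsales\nL|1.5\nJ|2.0\nK|3.25\nU|4.0\n"

# Assumption: all possible 'div' values are known up front.
div_dtype = pd.CategoricalDtype(categories=["J", "K", "L", "U"])
dtype = {"netsales": "float32", "div": div_dtype}

# Every chunk is parsed with the same categorical dtype.
reader = pd.read_csv(io.StringIO(csv_text), sep="|", dtype=dtype, chunksize=2)
chunk_list = [chunk for chunk in reader]

# Identical dtypes across chunks, so concat keeps the categorical column.
df_concat = pd.concat(chunk_list, ignore_index=True)
print(df_concat.dtypes)
```

With the real data, the categories list would be the values shown by df_concat['div'].unique() in the question. Values not listed in the categories would be read as missing, so the list needs to be complete.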

Without sample data, I cannot troubleshoot further.

Upvotes: 1
