Reputation: 17
I moved a pandas script I wrote from one computer to another. When I run it on the new computer I get the error below, and I am unsure what is causing it.
dfm = master_df
dfa = pd.read_csv(path)
dfa["Size"] = pd.cut(dfa["NOMSIZE_IN_MM_U"],bins=[0,300,600,float('inf')])
dfa["Depth"] = pd.cut(dfa["DEPTH_U"],bins=[0,2,4,6,float('inf')])
dfm['Size'] = pd.cut(dfm['NOMSIZE_IN_MM'], bins = [0,300,600,float('inf')])
dfm['Depth'] = pd.cut(dfm['AVE_DEPTH'], bins = [0,2,4,6,float('inf')])
master_df = dfm.join(dfa.set_index(['Size', 'Depth'])['REPAIR_DURATION'],on=['Size', 'Depth'])
Returns:
Traceback (most recent call last):
File "s:/!AMD Share/Julian D - Student/LARM Gravity/Python Scripts/LARM3_GS.py", line 442, in <module>
master_df = dfm.join(dfa.set_index(['Size', 'Depth'])['REPAIR_DURATION'],on=['Size', 'Depth'])
File "C:\Users\DITTHAJ0\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4767, in join
rsuffix=rsuffix, sort=sort)
File "C:\Users\DITTHAJ0\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4782, in _join_compat
suffixes=(lsuffix, rsuffix), sort=sort)
File "C:\Users\DITTHAJ0\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py", line 54, in merge
return op.get_result()
File "C:\Users\DITTHAJ0\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py", line 569, in get_result
join_index, left_indexer, right_indexer = self._get_join_info()
File "C:\Users\DITTHAJ0\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py", line 726, in _get_join_info
sort=self.sort)
File "C:\Users\DITTHAJ0\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py", line 1353, in _left_join_on_index
_get_multiindex_indexer(join_keys, right_ax, sort=sort)
File "C:\Users\DITTHAJ0\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py", line 1304, in _get_multiindex_indexer
rlab, llab, shape = map(list, zip(* map(fkeys, index.levels, join_keys)))
File "C:\Users\DITTHAJ0\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\reshape\merge.py", line 1390, in _factorize_keys
lk.is_dtype_equal(rk)):
AttributeError: 'CategoricalIndex' object has no attribute 'is_dtype_equal'
Where dfa:
NOMSIZE_IN_MM_U DEPTH_U REPAIR_DURATION
0 300 2 1
1 300 4 1
2 300 6 2
3 300 8 3
4 600 2 2
5 600 4 2
6 600 6 2
7 600 8 5
8 900 2 4
9 900 4 4
10 900 6 5
11 900 8 10
Master Data:
ID AVE_DEPTH NOMSIZE_IN_MM
1 0 3.985 915
2 1 2.655 915
3 2 4.200 915
Upvotes: 1
Views: 394
Reputation: 62403
The code below was run with pandas 1.2.1. Update pandas with pip or conda, depending on your environment.
import pandas as pd
# test dataframes
dfm = pd.DataFrame({'ID': [0, 1, 2], 'AVE_DEPTH': [3.985, 2.655, 4.200], 'NOMSIZE_IN_MM': [915, 915, 915]})
dfa = pd.DataFrame({'NOMSIZE_IN_MM_U': [300, 300, 300, 300, 600, 600, 600, 600, 900, 900, 900, 900], 'DEPTH_U': [2, 4, 6, 8, 2, 4, 6, 8, 2, 4, 6, 8], 'REPAIR_DURATION': [1, 1, 2, 3, 2, 2, 2, 5, 4, 4, 5, 10]})
# add bins
dfa["Size"] = pd.cut(dfa["NOMSIZE_IN_MM_U"],bins=[0,300,600,float('inf')])
dfa["Depth"] = pd.cut(dfa["DEPTH_U"],bins=[0,2,4,6,float('inf')])
dfm['Size'] = pd.cut(dfm['NOMSIZE_IN_MM'], bins = [0,300,600,float('inf')])
dfm['Depth'] = pd.cut(dfm['AVE_DEPTH'], bins = [0,2,4,6,float('inf')])
# join or merge the dataframes
.join
# set index - it's better to be explicit
dfm.set_index(['Size', 'Depth'], inplace=True)
dfa.set_index(['Size', 'Depth'], inplace=True)
# join dataframes
df = dfm.join(dfa.REPAIR_DURATION)
# display(df)
ID AVE_DEPTH NOMSIZE_IN_MM REPAIR_DURATION
Size Depth
(600.0, inf] (2.0, 4.0] 0 3.985 915 4
(2.0, 4.0] 1 2.655 915 4
(4.0, 6.0] 2 4.200 915 5
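If Size and Depth are wanted back as ordinary columns after the .join approach, calling reset_index() on the result should do it. A minimal sketch, reusing the test data above; the names `size_bins`, `depth_bins`, `joined` and `out` are introduced here only for illustration, and the non-inplace set_index is a stylistic choice, not part of the original answer.

```python
import pandas as pd

# same bin edges as above, named only for readability
size_bins = [0, 300, 600, float('inf')]
depth_bins = [0, 2, 4, 6, float('inf')]

dfm = pd.DataFrame({'ID': [0, 1, 2],
                    'AVE_DEPTH': [3.985, 2.655, 4.200],
                    'NOMSIZE_IN_MM': [915, 915, 915]})
dfa = pd.DataFrame({'NOMSIZE_IN_MM_U': [300] * 4 + [600] * 4 + [900] * 4,
                    'DEPTH_U': [2, 4, 6, 8] * 3,
                    'REPAIR_DURATION': [1, 1, 2, 3, 2, 2, 2, 5, 4, 4, 5, 10]})

dfa['Size'] = pd.cut(dfa['NOMSIZE_IN_MM_U'], bins=size_bins)
dfa['Depth'] = pd.cut(dfa['DEPTH_U'], bins=depth_bins)
dfm['Size'] = pd.cut(dfm['NOMSIZE_IN_MM'], bins=size_bins)
dfm['Depth'] = pd.cut(dfm['AVE_DEPTH'], bins=depth_bins)

# join on the (Size, Depth) index without modifying dfm or dfa in place
keys = ['Size', 'Depth']
joined = dfm.set_index(keys).join(dfa.set_index(keys)['REPAIR_DURATION'])

# move Size and Depth from the index back into regular columns
out = joined.reset_index()
print(out)
```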
.merge
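# merge dataframes (an alternative to .join: use dfm and dfa as they are after
# "add bins", before set_index, so that Size and Depth are still columns)
# note: merge defaults to how='inner'; pass how='left' to keep unmatched dfm rows, as .join does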
df = dfm.merge(dfa[['Size', 'Depth', 'REPAIR_DURATION']], on=['Size', 'Depth'])
# display(df)
ID AVE_DEPTH NOMSIZE_IN_MM Size Depth REPAIR_DURATION
0 0 3.985 915 (600.0, inf] (2.0, 4.0] 4
1 1 2.655 915 (600.0, inf] (2.0, 4.0] 4
2 2 4.200 915 (600.0, inf] (4.0, 6.0] 5
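The traceback ends inside pandas' own merge.py rather than in the script, which suggests the two computers may have different pandas versions installed. Printing the version on each machine is a quick way to check. If upgrading is not an option, one possible workaround is to convert the interval bins to plain strings before joining, so the join keys are ordinary text rather than a CategoricalIndex. This is a sketch under that assumption and is not part of the original script; `size_bins` and `depth_bins` are names introduced here for illustration.

```python
import pandas as pd

# compare this value on the old and the new computer
print(pd.__version__)

size_bins = [0, 300, 600, float('inf')]
depth_bins = [0, 2, 4, 6, float('inf')]

dfm = pd.DataFrame({'ID': [0, 1, 2],
                    'AVE_DEPTH': [3.985, 2.655, 4.200],
                    'NOMSIZE_IN_MM': [915, 915, 915]})
dfa = pd.DataFrame({'NOMSIZE_IN_MM_U': [300] * 4 + [600] * 4 + [900] * 4,
                    'DEPTH_U': [2, 4, 6, 8] * 3,
                    'REPAIR_DURATION': [1, 1, 2, 3, 2, 2, 2, 5, 4, 4, 5, 10]})

# astype(str) turns each interval into a text label such as '(600.0, inf]'
dfa['Size'] = pd.cut(dfa['NOMSIZE_IN_MM_U'], bins=size_bins).astype(str)
dfa['Depth'] = pd.cut(dfa['DEPTH_U'], bins=depth_bins).astype(str)
dfm['Size'] = pd.cut(dfm['NOMSIZE_IN_MM'], bins=size_bins).astype(str)
dfm['Depth'] = pd.cut(dfm['AVE_DEPTH'], bins=depth_bins).astype(str)

# same join as in the question, now on string keys
master_df = dfm.join(dfa.set_index(['Size', 'Depth'])['REPAIR_DURATION'],
                     on=['Size', 'Depth'])
print(master_df)
```

The trade-off is that the bins become plain text and lose their ordering, so any sorting or comparison of bins is better done before the conversion.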
Upvotes: 2