Reputation: 321
I have a series of SKUs in a DataFrame: [35641, 265689494123, 36492, 56526246546, 26412...].
The problem is that the long barcodes (like 56526246546) in the DataFrame need to be truncated. Any value longer than 5 digits should be cut down to a slice of itself, the way [7:12] slices a list.
I tried the following code, to no avail:
if df.loc[len(df['SKU']) > 5]:
df.loc[df['SKU'].df.slice(start=7,stop=12)]
I get the following error messages:
KeyError Traceback (most recent call last)
c:\users\User\appdata\local\programs\python\python37\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2656 try:
-> 2657 return self._engine.get_loc(key)
2658 except KeyError:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()
KeyError: True
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-64-cea7b4ca2640> in <module>
1 #g[:] = (elem[:12] for elem in g)
----> 2 if df.loc[len(df['SKU']) > 5]:
3 df.loc[df['SKU'].df.slice(start=7,stop=12)]
c:\users\User\appdata\local\programs\python\python37\lib\site-packages\pandas\core\indexing.py in __getitem__(self, key)
1498
1499 maybe_callable = com.apply_if_callable(key, self.obj)
-> 1500 return self._getitem_axis(maybe_callable, axis=axis)
1501
1502 def _is_scalar_access(self, key):
c:\users\User\appdata\local\programs\python\python37\lib\site-packages\pandas\core\indexing.py in _getitem_axis(self, key, axis)
1911 # fall thru to straight lookup
1912 self._validate_key(key, axis)
-> 1913 return self._get_label(key, axis=axis)
1914
1915
c:\users\User\appdata\local\programs\python\python37\lib\site-packages\pandas\core\indexing.py in _get_label(self, label, axis)
139 raise IndexingError('no slices here, handle elsewhere')
140
--> 141 return self.obj._xs(label, axis=axis)
142
143 def _get_loc(self, key, axis=None):
c:\users\User\appdata\local\programs\python\python37\lib\site-packages\pandas\core\generic.py in xs(self, key, axis, level, drop_level)
3583 drop_level=drop_level)
3584 else:
-> 3585 loc = self.index.get_loc(key)
3586
3587 if isinstance(loc, np.ndarray):
c:\users\User\appdata\local\programs\python\python37\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2657 return self._engine.get_loc(key)
2658 except KeyError:
-> 2659 return self._engine.get_loc(self._maybe_cast_indexer(key))
2660 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
2661 if indexer.ndim > 1 or indexer.size > 1:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()
KeyError: True
How do I fix this code?
P.S. Some of the error messages seem to appear because I added the code BEFORE converting the dict into a DataFrame.
Upvotes: 2
Views: 56
Reputation: 12417
Based on the output you want, I think you can use:
df['SKU'] = df['SKU'].apply(lambda x: int(str(x)[6:11]) if len(str(x)) > 5 else x)
Output:
SKU
0 35641
1 49412
2 36492
3 46546
4 26412
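As for the KeyError: True in the question: len(df['SKU']) is the number of rows in the column, so len(df['SKU']) > 5 is a single True/False value, and df.loc[True] then looks for a row labelled True. A per-value length check needs the .str accessor on a string version of the column. Below is a rough vectorized sketch of the same [6:11] truncation, assuming the SKUs are stored as integers; the sample frame is rebuilt from the five values shown in the question, and the names s and is_long are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the values listed in the question (assumption:
# the real column holds integers like these).
df = pd.DataFrame({"SKU": [35641, 265689494123, 36492, 56526246546, 26412]})

# Work on a string view so that length and slicing apply per value.
s = df["SKU"].astype(str)

# Boolean mask: True for every SKU with more than 5 digits.
is_long = s.str.len() > 5

# Keep short SKUs unchanged, replace long ones with characters 6..10,
# then convert back to integers.
df["SKU"] = s.where(~is_long, s.str.slice(6, 11)).astype(int)

print(df)
```

This should give the same values as the apply version above, without calling a Python lambda per row.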
Upvotes: 1
Reputation: 1216
Here is my suggestion:
df.loc[:, 'SKU'] = df.loc[:, 'SKU'].astype(str).apply(lambda x: x[7:12] if len(x) > 5 else x)
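Note that [7:12] and [6:11] pick different digits, and that this version leaves the column as strings rather than integers. A small sketch comparing the two slices on the sample values from the question may help decide which one is wanted; the helper name truncate is made up for illustration.

```python
import pandas as pd

# Sample values taken from the question.
skus = pd.Series([35641, 265689494123, 36492, 56526246546, 26412], name="SKU")


def truncate(sku, start, stop):
    """Return str(sku)[start:stop] for SKUs longer than 5 digits, else str(sku)."""
    text = str(sku)
    return text[start:stop] if len(text) > 5 else text


# The slice used in the question and in this answer.
from_7 = skus.apply(truncate, args=(7, 12)).tolist()
# The slice used in the other answer.
from_6 = skus.apply(truncate, args=(6, 11)).tolist()

print(from_7)  # ['35641', '94123', '36492', '6546', '26412']
print(from_6)  # ['35641', '49412', '36492', '46546', '26412']
```

With [7:12] the 11-digit barcode 56526246546 only has four characters left after position 7, so it comes out as '6546'; with [6:11] every long barcode in this sample yields five digits.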
Upvotes: 0