Reputation: 15
I am trying to convert a series of heights given in feet and inches into centimeters. Below is the method I am using, but I am running into an error, which is also posted below. I tried using regex, but that did not work for me.
Calling head() on the series:
fighter_details.Height.head()
What the data looks like:
INDEX Data
0 NaN
1 5' 11"
2 6' 3"
3 5' 11"
4 5' 6"
Method I created to convert to cm
def inch_to_cm(x):
    if x is np.NaN:
        return x
    else:
        # format: '7\' 11"'
        ht_ = x.split("' ")
        ft_ = float(ht_[0])
        in_ = float(ht_[1].replace("\"",""))
        return ((12*ft_) + in_) * 2.54
Execution of method
fighter_details['Height'] = fighter_details['Height'].apply(inch_to_cm)
Error
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Input In [240], in <cell line: 1>()
----> 1 fighter_details['Height'] = fighter_details['Height'].apply(inch_to_cm)
File ~/opt/anaconda3/envs/book_env/lib/python3.8/site-packages/pandas/core/series.py:4108, in Series.apply(self, func, convert_dtype, args, **kwds)
4106 else:
4107 values = self.astype(object)._values
-> 4108 mapped = lib.map_infer(values, f, convert=convert_dtype)
4110 if len(mapped) and isinstance(mapped[0], Series):
4111 # GH 25959 use pd.array instead of tolist
4112 # so extension arrays can be used
4113 return self._constructor_expanddim(pd_array(mapped), index=self.index)
File pandas/_libs/lib.pyx:2467, in pandas._libs.lib.map_infer()
Input In [239], in inch_to_cm(x)
3 return x
4 else:
5 # format: '7\' 11"'
----> 6 ht_ = x.split("' ")
7 ft_ = float(ht_[0])
8 in_ = float(ht_[1].replace("\"",""))
AttributeError: 'float' object has no attribute 'split'
Upvotes: 1
Views: 1029
Reputation: 262149
It looks like you're using the wrong column.
That said, it is better to use a vectorized method for efficiency.
You can extract the ft/in components, convert each to cm and sum:
df['Data_cm'] = (df['Data']
                 .str.extract(r'(\d+)\'\s*(\d+)"')
                 .astype(float)
                 .mul([12*2.54, 2.54])
                 .sum(axis=1)
                )
Output:
   INDEX    Data  Data_cm
0      0     NaN     0.00
1      1  5' 11"   180.34
2      2   6' 3"   190.50
3      3  5' 11"   180.34
4      4   5' 6"   167.64
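Note that the missing height comes out as 0.00 above, because `sum` skips NaN by default. If you would rather keep it as NaN, one option is the `min_count` argument of `sum`. A minimal sketch, assuming the column is named `Data` as in the example (the small sample frame below is made up for illustration):

```python
import pandas as pd

# Hypothetical sample data mirroring the question's format.
df = pd.DataFrame({"Data": [None, '5\' 11"', '6\' 3"']})

# Extract feet and inches into two columns and convert them to floats.
parts = df["Data"].str.extract(r"(\d+)'\s*(\d+)\"").astype(float)

# Convert each part to cm, then sum per row.
# min_count=2 requires both parts to be present, otherwise the result is NaN.
df["Data_cm"] = parts.mul([12 * 2.54, 2.54]).sum(axis=1, min_count=2)

print(df)
```

With `min_count=2`, a row where the pattern did not match stays NaN instead of silently becoming a height of 0 cm.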
Upvotes: 2
Reputation: 142919
It seems the problem is that you are using the wrong column name, 'Height', when it has to be 'Data'.
Minimal working code:
import pandas as pd
import numpy as np

def inch_to_cm(x):
    # np.nan instead of np.NaN: the np.NaN alias was removed in NumPy 2.0
    if x is np.nan:
        return x
    else:
        # format: '7\' 11"'
        ht_ = x.split("' ")
        ft_ = float(ht_[0])
        in_ = float(ht_[1].replace("\"",""))
        return ((12*ft_) + in_) * 2.54

fighter_details = pd.DataFrame({
    "Data": [np.nan, '5\' 11"', '6\' 3"', '5\' 11"', '5\' 6"']
})

print('\n--- before ---\n')
print(fighter_details)

fighter_details['Data'] = fighter_details['Data'].apply(inch_to_cm)

print('\n--- after ---\n')
print(fighter_details)
Result:
--- before ---

     Data
0     NaN
1  5' 11"
2   6' 3"
3  5' 11"
4   5' 6"

--- after ---

     Data
0     NaN
1  180.34
2  190.50
3  180.34
4  167.64
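If the column name turns out to be correct and the error persists, another possible cause is the `is` check itself. `x is np.nan` compares object identity, so it can miss a NaN that is a different float object, and it also lets through values that are already floats (for example, if the cell was run a second time after the column had been converted). A more defensive sketch, where `height_to_cm` is a hypothetical name and the sample series is made up for illustration:

```python
import pandas as pd

def height_to_cm(x):
    # Pass through anything that is not a string:
    # NaN of any origin, or values that were already converted to floats.
    if not isinstance(x, str):
        return x
    # Expected format: 5' 11"
    ft_, in_ = x.split("' ")
    return (12 * float(ft_) + float(in_.replace('"', ""))) * 2.54

# float("nan") is a NaN, but it is not the same object as np.nan,
# so an identity check against np.nan would not catch it.
heights = pd.Series([float("nan"), '5\' 11"', '6\' 3"'])
result = heights.apply(height_to_cm)

print(result)
```

Running the conversion again on `result` leaves the values unchanged rather than raising, since floats are passed through.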
Upvotes: 0