epg32

Reputation: 15

Converting inches to CM on series

I am trying to convert a Series of heights given in feet and inches (e.g. 5' 11") to centimeters. Below is the function I am using, along with the error I run into. I also tried a regex-based approach, but that did not work for me.

Inspecting the head of the Series:

fighter_details.Height.head()

What the data looks like:

INDEX   Data
0       NaN
1       5' 11"
2       6' 3"
3       5' 11"
4       5' 6"

Function I wrote to convert to cm:

def inch_to_cm(x):
    if x is np.NaN:
        return x
    else:
        # format: '7\' 11"'
        ht_ = x.split("' ")
        ft_ = float(ht_[0])
        in_ = float(ht_[1].replace("\"",""))
        return ((12*ft_) + in_) * 2.54

Execution of method

fighter_details['Height'] = fighter_details['Height'].apply(inch_to_cm)

Error

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Input In [240], in <cell line: 1>()
----> 1 fighter_details['Height'] = fighter_details['Height'].apply(inch_to_cm)

File ~/opt/anaconda3/envs/book_env/lib/python3.8/site-packages/pandas/core/series.py:4108, in Series.apply(self, func, convert_dtype, args, **kwds)
   4106     else:
   4107         values = self.astype(object)._values
-> 4108         mapped = lib.map_infer(values, f, convert=convert_dtype)
   4110 if len(mapped) and isinstance(mapped[0], Series):
   4111     # GH 25959 use pd.array instead of tolist
   4112     # so extension arrays can be used
   4113     return self._constructor_expanddim(pd_array(mapped), index=self.index)

File pandas/_libs/lib.pyx:2467, in pandas._libs.lib.map_infer()

Input In [239], in inch_to_cm(x)
      3     return x
      4 else:
      5     # format: '7\' 11"'
----> 6     ht_ = x.split("' ")
      7     ft_ = float(ht_[0])
      8     in_ = float(ht_[1].replace("\"",""))

AttributeError: 'float' object has no attribute 'split'

Upvotes: 1

Views: 1029

Answers (2)

mozway

Reputation: 262149

It looks like you're using the wrong column.

That said, a vectorized method would be more efficient.

You can extract the ft/in components, convert each to cm and sum:

df['Data_cm'] = (df['Data']
 .str.extract(r'(\d+)\'\s*(\d+)"')
 .astype(float)
 .mul([12*2.54, 2.54])
 .sum(axis=1)
 )

Output:

   INDEX    Data  Data_cm
0      0     NaN     0.00
1      1  5' 11"   180.34
2      2   6' 3"   190.50
3      3  5' 11"   180.34
4      4   5' 6"   167.64
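Note that row 0 comes out as 0.00 rather than NaN, because `sum(axis=1)` skips missing values by default. If you would rather keep missing heights missing, one option is the `min_count` argument of `DataFrame.sum`. A minimal sketch, assuming a frame shaped like the sample above (the `df` built here is only illustrative data, not the asker's real dataset):

```python
import numpy as np
import pandas as pd

# Illustrative sample mirroring the data shown in the question
df = pd.DataFrame({"Data": [np.nan, '5\' 11"', '6\' 3"', '5\' 11"', '5\' 6"']})

# Extract feet and inches as two columns and convert them to floats
parts = df["Data"].str.extract(r'(\d+)\'\s*(\d+)"').astype(float)

# Scale each column to cm; min_count=2 requires both parts to be present,
# so rows with a missing height stay NaN instead of becoming 0
df["Data_cm"] = parts.mul([12 * 2.54, 2.54]).sum(axis=1, min_count=2)

print(df)
```

With `min_count=2`, a row only gets a total when both the feet and the inches were extracted, so strings that do not match the pattern also end up as NaN.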

Upvotes: 2


furas

Reputation: 142919

It seems the problem is that you are using the wrong column name: you use 'Height', but it has to be 'Data'.


Minimal working code:

import pandas as pd
import numpy as np

def inch_to_cm(x):
    # Missing heights are passed through unchanged
    if x is np.nan:
        return x
    else:
        # format: '7\' 11"'
        ht_ = x.split("' ")
        ft_ = float(ht_[0])
        in_ = float(ht_[1].replace("\"",""))
        # total inches, then inches -> cm
        return ((12*ft_) + in_) * 2.54

fighter_details = pd.DataFrame({
    "Data": [np.nan, '5\' 11"', '6\' 3"', '5\' 11"', '5\' 6"']
})

print('\n--- before ---\n')
print(fighter_details)

fighter_details['Data'] = fighter_details['Data'].apply(inch_to_cm)

print('\n--- after ---\n')
print(fighter_details)

Result:

--- before ---

     Data
0     NaN
1  5' 11"
2   6' 3"
3  5' 11"
4   5' 6"

--- after ---

     Data
0     NaN
1  180.34
2  190.50
3  180.34
4  167.64
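One more thing that may be worth checking: the NaN test in `inch_to_cm` uses `is`, which is an identity comparison. It only catches the one `np.nan` object itself. Other NaN floats (for example `float('nan')`) and values that are already numbers (for example if the conversion cell is run a second time on the same column) fall through to `.split` and would raise the same `AttributeError: 'float' object has no attribute 'split'`. A hedged sketch of a more tolerant variant, assuming that anything that is not a string should simply be returned unchanged:

```python
import numpy as np
import pandas as pd

def inch_to_cm(x):
    # Assumption: non-strings (NaN, None, already-converted floats)
    # are returned as-is instead of being parsed
    if not isinstance(x, str):
        return x
    # format: '7\' 11"'
    ft_, in_ = x.split("' ")
    return (12 * float(ft_) + float(in_.replace('"', ''))) * 2.54

# Illustrative data; float("nan") is a NaN that is not the np.nan object
heights = pd.Series([float("nan"), '5\' 11"', '6\' 3"'], dtype=object)

converted = heights.apply(inch_to_cm)
again = converted.apply(inch_to_cm)  # applying it a second time changes nothing

print(converted)
```

This does not replace the column-name check above; it only makes the function safe to apply to a column that contains floats.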

Upvotes: 0
