Archit

Reputation: 588

Getting error in pandas: TypeError: object of type 'float' has no len()

I have a pandas DataFrame df:

import numpy as np
import pandas as pd

df = pd.DataFrame({"ID": [2,3,4,5,6,7,8,9,10],
      "type" :["A", "B", "B", "A", "A", "B", "A", "A", "A"],
      "F_ID" :["0", "[7 8 9]", "[10]", "0", "[2]", "0", "0", "0", "0"]})

# convert the string representations of list structures to actual lists
F_ID_as_series_of_lists = df["F_ID"].str.replace("[","").str.replace("]","").str.split(" ")

#type(F_ID_as_series_of_lists) is pd.Series, make it a list for pd.DataFrame.from_records
F_ID_as_records = list(F_ID_as_series_of_lists)

f_id_df = pd.DataFrame.from_records(list(F_ID_as_records)).fillna(np.nan)

I am getting an error in the line:

f_id_df = pd.DataFrame.from_records(list(F_ID_as_records)).fillna(np.nan)

The error is: TypeError: object of type 'float' has no len()

How can I solve this?

Upvotes: 3

Views: 5304

Answers (2)

Sean O'Malley

Reputation: 216

There is another way using list comprehensions and utilizing what we've learned from the type error itself.

Say that you have a pandas Series of strings and you want to split each value into two parts at the '/' symbol, but not all rows are populated.

df = pd.DataFrame({'TEXT_COLUMN' : ['12/4', '54/19', np.nan, '89/33']})

We want to divide that column into two different columns, but we know pandas will trip over the missing value when we put the result back into a DataFrame, so let's put it in a list first:

split_list = list(df.TEXT_COLUMN.str.split('/'))

split_list contains the following, and we can see why we get a float error when attempting to parse it: the missing entry is a float NaN, not a list.

>> [['12', '4'], ['54', '19'], nan, ['89', '33']]
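To make the cause concrete, here is a minimal sketch (the variable names `missing` and `message` are only for illustration) showing that the missing entry is a plain float, and that calling len() on it produces the same message as in the question:

```python
import numpy as np

# A missing value in an object column is represented as a float NaN.
missing = np.nan
print(type(missing))

# Anything that expects a list-like record and calls len() on it will fail.
try:
    len(missing)
except TypeError as exc:
    message = str(exc)
    print(message)
```

So any code that treats every element as a sequence will break on the NaN entries unless they are replaced first.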

Now that we have that list, we want to then place it in a comprehension that corrects for the null value issue. We can do so by creating a conditional on type within the comprehension:

# np.float was removed in NumPy 1.24, so test for list instead of testing for float
better_split_list = [x if isinstance(x, list) else [None, None] for x in split_list]

The better_split_list returns:

>> [['12','4'],['54','19'], [None,None], ['89','33']]

This puts us in a good position to place the list of lists into its own pandas DataFrame, with the columns separated in a more robust way:

pd.DataFrame(better_split_list, columns = ['VALUE_1','VALUE_2'])
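Putting the steps together, a self-contained sketch could look like the following. It assumes the same TEXT_COLUMN example as above and that every populated value has exactly one '/'; the name `result` is just for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"TEXT_COLUMN": ["12/4", "54/19", np.nan, "89/33"]})

# Split each string; rows with missing values stay as float NaN.
split_list = list(df["TEXT_COLUMN"].str.split("/"))

# Replace anything that is not a list (i.e. the NaN entries) with a
# placeholder of the same width, so every record has two fields.
better_split_list = [
    x if isinstance(x, list) else [None, None] for x in split_list
]

result = pd.DataFrame(better_split_list, columns=["VALUE_1", "VALUE_2"])
print(result)
```

The placeholder width (two fields here) has to match the number of columns you pass, so this approach fits best when the number of parts per row is known in advance.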

Upvotes: 1

jezrael

Reputation: 862481

The problem is that the column contains some None or NaN values. If you use str.split with the parameter expand=True to create the new DataFrame, those missing values are handled correctly.

Also, instead of replace it is possible to use str.strip:

df = pd.DataFrame({"ID": [2,3,4,5,6,7,8,9,10],
      "type" :["A", "B", "B", "A", "A", "B", "A", "A", "A"],
      "F_ID" :[None, "[7 8 9]", "[10]", np.nan, "[2]", "0", "0", "0", "0"]})

print (df)
   ID type     F_ID
0   2    A     None
1   3    B  [7 8 9]
2   4    B     [10]
3   5    A      NaN
4   6    A      [2]
5   7    B        0
6   8    A        0
7   9    A        0
8  10    A        0

f_id_df = df["F_ID"].str.strip("[]").str.split(expand=True)
print (f_id_df)
      0     1     2
0  None  None  None
1     7     8     9
2    10  None  None
3   NaN   NaN   NaN
4     2  None  None
5     0  None  None
6     0  None  None
7     0  None  None
8     0  None  None

Finally, if you need to convert the values to numeric:

f_id_df = df["F_ID"].str.strip("[]").str.split(expand=True).astype(float)
print (f_id_df)
      0    1    2
0   NaN  NaN  NaN
1   7.0  8.0  9.0
2  10.0  NaN  NaN
3   NaN  NaN  NaN
4   2.0  NaN  NaN
5   0.0  NaN  NaN
6   0.0  NaN  NaN
7   0.0  NaN  NaN
8   0.0  NaN  NaN
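If you also want the new columns next to the original ones, one possible sketch is below. The shortened sample data, the "F_ID_" prefix and the name `combined` are assumptions for illustration, not part of the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID": [2, 3, 4, 5],
    "type": ["A", "B", "B", "A"],
    "F_ID": [None, "[7 8 9]", "[10]", np.nan],
})

# Strip the brackets, split on whitespace into separate columns,
# convert to float, and give the new columns readable names.
f_id_df = (
    df["F_ID"]
    .str.strip("[]")
    .str.split(expand=True)
    .astype(float)
    .add_prefix("F_ID_")
)

# join aligns on the index, so each row keeps its original ID and type.
combined = df.drop(columns="F_ID").join(f_id_df)
print(combined)
```

Because expand=True keeps the original index, the join needs no extra key column.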

Upvotes: 1
