user9270170

Reputation:

TypeError: object of type 'float' has no len() when getting the length of DataFrame values

The test.csv looks like this:

device_id,upload_time
12345678901,2020-06-01 07:40:20+00:00
123456,2020-06-01 07:40:40+00:00
123456,2020-06-01 07:41:00+00:00
123456,2020-06-01 07:41:02+00:00
123456,2020-06-01 07:41:04+00:00
123456,2020-06-01 07:41:08+00:00
12345678901,2020-06-01 07:41:10+00:00
12345678901,2020-06-01 07:41:18+00:00
12345678901,2020-06-01 07:41:20+00:00
,2020-06-01 07:41:24+00:00
,2020-06-01 07:41:40+00:00
12345678901,2020-06-01 07:42:00+00:00
12345678901,2020-06-01 07:42:20+00:00
12345678901,2020-06-01 07:42:22+00:00
12345678901,2020-06-01 07:42:24+00:00
12345678901,2020-06-01 07:42:26+00:00
12345678901,2020-06-01 07:42:28+00:00
12345678901,2020-06-01 07:42:40+00:00
1234,2020-06-01 07:43:00+00:00
1234,2020-06-01 07:43:12+00:00


device_id can be converted to int or str; either is fine. I use this code to build the new DataFrames:

import pandas as pd

df = pd.read_csv(r'test.csv', encoding='utf-8', parse_dates=[1])
df = df[pd.notnull(df['device_id'])] #Delete rows where device_id is null.
a = df[df['device_id'].map(len)!=11] #Get data whose device_id length is not 11.
b = df[df['device_id'].map(len)==11] #Get data whose device_id length is 11.

But the error message is:

TypeError: object of type 'float' has no len()

What is wrong?

Upvotes: 1

Views: 1125

Answers (2)

Karthick Mohanraj

Reputation: 1658

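For the input file you have shown, pandas reads the device_id column as float64 even though every value present is an integer. This happens because the column contains empty cells: they are read as NaN, which a plain integer column cannot hold. Dropping the null rows afterwards does not change the dtype, so map(len) is called on floats and raises the TypeError. Converting the floats straight to strings would not help either, because the length would be miscounted: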

Example:

len('12345') 
#will give you len = 5, which is the correct length

whereas,

len('12345.0') 
#will give you len = 7, which is wrong since it considers the decimal point too
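As a minimal sketch of where the error comes from, the snippet below reads a shortened, hypothetical stand-in for test.csv from an in-memory string (the csv_text name and the reduced row set are assumptions made for illustration) and checks the dtype before and after the null rows are dropped:

```python
import io
import pandas as pd

# Shortened, hypothetical stand-in for test.csv with one missing device_id.
csv_text = """device_id,upload_time
12345678901,2020-06-01 07:40:20+00:00
123456,2020-06-01 07:40:40+00:00
,2020-06-01 07:41:24+00:00
1234,2020-06-01 07:43:00+00:00
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=[1])

# The empty cell is read as NaN, so the whole column becomes float64.
dtype_on_read = str(df['device_id'].dtype)

# Dropping the null rows keeps the float64 dtype.
df = df[df['device_id'].notnull()]
dtype_after_drop = str(df['device_id'].dtype)

# len() is therefore applied to floats and fails.
try:
    df['device_id'].map(len)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(dtype_on_read, dtype_after_drop, error_message)
```

Both dtype checks report float64, which is why the call to map(len) fails even after the null rows are gone.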

So it is better to convert the column to int and then perform the length check on the string version of the int column.

Reference:

  1. The argument to len may be a sequence (string, tuple or list) or a mapping (dictionary). https://docs.python.org/2/library/functions.html#len

  2. Before calling len, you should verify that the argument is one of these types. You can call isinstance() to check it. Take a look at how to use it. https://docs.python.org/2/library/functions.html#isinstance
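The isinstance() check from the second reference could be sketched as follows. The helper name safe_len is hypothetical and only serves as an illustration; it is not part of pandas or the standard library:

```python
def safe_len(value):
    """Return len(value) for sequences and mappings, otherwise None.

    Hypothetical helper: it only accepts the types named in the len()
    documentation referenced above.
    """
    if isinstance(value, (str, tuple, list, dict)):
        return len(value)
    return None


length_of_str = safe_len('12345')    # a string has a length
length_of_float = safe_len(12345.0)  # a float does not, so None is returned
print(length_of_str, length_of_float)
```

Passed to map, a helper like this would avoid the TypeError, but it would not give the digit count for float values, so the dtype conversion is still needed.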

So try this,

import pandas as pd

df = pd.read_csv(r'sample.csv', parse_dates=[1])
df = df[pd.notnull(df['device_id'])] #Delete rows where device_id is null.

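#Convert to int ('int64' is spelled out because plain int can be 32-bit on some platforms, too small for an 11-digit id)
df['device_id'] = df['device_id'].astype(float).astype('int64')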

#len function cannot be computed on an int column directly. You should convert to str and then compute len
a = df[df['device_id'].astype(str).map(len)!=11]
b = df[df['device_id'].astype(str).map(len)==11]
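Since the question says device_id may also be a str, one possible alternative is to ask read_csv for strings up front so the ids never pass through float at all. This is a sketch under that assumption, not part of the answer above; csv_text is a shortened, hypothetical stand-in for the file:

```python
import io
import pandas as pd

# Shortened, hypothetical stand-in for the CSV file.
csv_text = """device_id,upload_time
12345678901,2020-06-01 07:40:20+00:00
123456,2020-06-01 07:40:40+00:00
,2020-06-01 07:41:24+00:00
1234,2020-06-01 07:43:00+00:00
"""

# dtype={'device_id': str} keeps the ids exactly as written in the file.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={'device_id': str},
    parse_dates=['upload_time'],
)

# Empty cells are still read as missing values, so drop them first.
df = df.dropna(subset=['device_id'])

id_length = df['device_id'].str.len()
a = df[id_length != 11]  # ids whose length is not 11
b = df[id_length == 11]  # ids whose length is 11
print(a['device_id'].tolist(), b['device_id'].tolist())
```

Reading the column as text also preserves leading zeros, if ids of that kind can occur.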

Upvotes: 1

Subbu VidyaSekar

Reputation: 2615

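The code below should help.

Convert the float values to integers and then to strings to count the digits. Converting a float directly to a string keeps the trailing ".0", so an 11-digit id would have length 13.

import pandas as pd
df = pd.read_csv(r'test.csv', encoding='utf-8', parse_dates=[1])

# Remove the rows with null (NaN) values; any one of these works
df = df.dropna()
# or
# df = df[df['device_id'].isnull()==False]
# or
# df = df[df['device_id'].isna()==False]

# device_id was read as float64; cast it to int64 before the str conversion
df = df.astype({'device_id': 'int64'})

a = df[df['device_id'].astype(str).map(len)!=11]
b = df[df['device_id'].astype(str).map(len)==11]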

Another approach, using the .str accessor:

a = df[df['device_id'].astype(str).str.len()!=11]
b = df[df['device_id'].astype(str).str.len()==11]

Another approach, using apply:

a = df[df['device_id'].astype(str).apply(len)!=11]
b = df[df['device_id'].astype(str).apply(len)==11]
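The three variants compute the same lengths. A small sketch on a hypothetical series of ids, assuming the values are already integers rather than floats:

```python
import pandas as pd

# Hypothetical ids, stored as integers and then converted to strings.
ids = pd.Series([12345678901, 123456, 1234], dtype='int64').astype(str)

lengths_map = ids.map(len).tolist()      # element-wise built-in len
lengths_str = ids.str.len().tolist()     # vectorised string accessor
lengths_apply = ids.apply(len).tolist()  # same as map for a plain function

print(lengths_map, lengths_str, lengths_apply)
```

The .str.len() form also tolerates missing values, returning NaN for them instead of raising, which can make it the more forgiving choice when null rows have not been dropped yet.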

Upvotes: 0
