Gufran Pathan

Reputation: 331

How to identify whether a pandas column contains lists

I want to identify whether a column in a pandas DataFrame holds a list in each row.

import pandas as pd

df=pd.DataFrame({'X': [1, 2, 3], 'Y': [[34],[37,45],[48,50,57]],'Z':['A','B','C']})

df
Out[160]: 
   X             Y  Z
0  1          [34]  A
1  2      [37, 45]  B
2  3  [48, 50, 57]  C

df.dtypes
Out[161]: 
X     int64
Y    object
Z    object
dtype: object

Since the dtype of strings is "object", I'm unable to distinguish between columns of strings and columns of lists (of integers or strings).

How do I identify that column "Y" is a list of int?

Upvotes: 15

Views: 10460

Answers (2)

jezrael

Reputation: 862741

Use DataFrame.map (DataFrame.applymap in pandas versions prior to v2.1.0) to get the type of each value, compare the result to the desired type, and then use all to check whether every value in a column matches:

print (df.map(type))
               X               Y              Z
0  <class 'int'>  <class 'list'>  <class 'str'>
1  <class 'int'>  <class 'list'>  <class 'str'>
2  <class 'int'>  <class 'list'>  <class 'str'>

a = (df.map(type) == list).all()
print (a)
X    False
Y     True
Z    False
dtype: bool

Or:

a = df.map(lambda x: isinstance(x, list)).all()
print (a)
X    False
Y     True
Z    False
dtype: bool

If you need a list of the matching columns:

L = a.index[a].tolist()
print (L)
['Y']
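
The steps above can be wrapped into one function that does not depend on DataFrame.map versus DataFrame.applymap, because it goes through Series.map instead. This is only a sketch: the name list_columns is made up for illustration, and it assumes that only object-dtype columns need to be inspected.

```python
import pandas as pd

def list_columns(df):
    """Return the names of columns whose values are all lists (hypothetical helper)."""
    found = []
    for name, col in df.items():
        # Python lists can only live in object-dtype columns, so skip the rest.
        if col.dtype != object:
            continue
        # Series.map is available on old and new pandas versions alike.
        # Note: an empty column passes this check vacuously.
        if col.map(lambda v: isinstance(v, list)).all():
            found.append(name)
    return found

df = pd.DataFrame({"X": [1, 2, 3],
                   "Y": [[34], [37, 45], [48, 50, 57]],
                   "Z": ["A", "B", "C"]})
print(list_columns(df))  # ['Y']
```

A column containing NaN or None next to lists is not reported, since those values are not lists.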

If you want to check dtypes instead (note that strings, lists, and dicts are all stored as object):

print (df.dtypes)
X     int64
Y    object
Z    object
dtype: object

a = df.dtypes == 'int64'
print (a)
X     True
Y    False
Z    False
dtype: bool
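
Since dtypes cannot see inside an object column, the "list of int" part of the question needs a look at the items of each list. A minimal sketch, assuming plain Python iteration is fast enough for your data; is_list_of is a hypothetical name, not a pandas function.

```python
import pandas as pd

def is_list_of(col, item_type):
    """True if every value in `col` is a list whose items are all `item_type`."""
    return all(
        isinstance(value, list)
        and all(isinstance(item, item_type) for item in value)
        for value in col
    )

df = pd.DataFrame({"X": [1, 2, 3],
                   "Y": [[34], [37, 45], [48, 50, 57]],
                   "Z": ["A", "B", "C"]})
print(is_list_of(df["Y"], int))  # True
print(is_list_of(df["Y"], str))  # False
print(is_list_of(df["Z"], int))  # False
```

Keep in mind that bool is a subclass of int in Python, so a list such as [True, False] also counts as a list of int here, and empty lists match any item type.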

Upvotes: 17

amorim-ds

Reputation: 106

If your dataset is big, take a sample before applying the type function. Then you can check:

If the most common type is list:

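# Parentheses replace the backslashes: a comment after "\" is a syntax error.
(df
 .sample(min(100, len(df)))  # sample() raises if asked for more rows than exist
 .map(type)                  # use .applymap(type) prior to v2.1.0
 .mode(0)
 .astype(str) == "<class 'list'>")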

If all values are list:

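# Parentheses replace the backslashes: a comment after "\" is a syntax error.
(df
 .sample(min(100, len(df)))  # sample() raises if asked for more rows than exist
 .map(type)                  # use .applymap(type) prior to v2.1.0
 .astype(str) == "<class 'list'>"
).all(0)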

If any values are list:

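# Parentheses replace the backslashes: a comment after "\" is a syntax error.
(df
 .sample(min(100, len(df)))  # sample() raises if asked for more rows than exist
 .map(type)                  # use .applymap(type) prior to v2.1.0
 .astype(str) == "<class 'list'>"
).any(0)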
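
The three checks can also be read off a single number per column: the share of sampled values that are lists. The sketch below is one way to do that; sampled_list_share and its n and seed parameters are hypothetical names chosen for illustration. A share of 1.0 corresponds to "all", anything above 0 to "any", and above 0.5 to "most common".

```python
import pandas as pd

def sampled_list_share(df, n=100, seed=0):
    """Fraction of sampled values per column that are lists (hypothetical helper)."""
    # Cap the sample size so small frames do not raise, and fix the seed
    # so repeated runs inspect the same rows.
    sample = df.sample(n=min(n, len(df)), random_state=seed)
    # Series.map works on pandas versions before and after v2.1.0.
    return sample.apply(lambda col: col.map(lambda v: isinstance(v, list)).mean())

df = pd.DataFrame({"X": [1, 2, 3],
                   "Y": [[34], [37, 45], [48, 50, 57]],
                   "Z": ["A", "B", "C"]})
share = sampled_list_share(df)
print(share)
```

Because only a sample is inspected, a share of 1.0 suggests, but does not prove, that every value in the full column is a list.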

Upvotes: 4
