How to identify whether a pandas column is a list

Question

I want to identify whether a pandas column holds a list in each row.

import pandas as pd
df=pd.DataFrame({'X': [1, 2, 3], 'Y': [[34],[37,45],[48,50,57]],'Z':['A','B','C']})

df
Out[160]: 
   X             Y  Z
0  1          [34]  A
1  2      [37, 45]  B
2  3  [48, 50, 57]  C

df.dtypes
Out[161]: 
X     int64
Y    object
Z    object
dtype: object

Since the dtype of strings is "object", I'm unable to distinguish between columns that hold strings and columns that hold lists (of integers or strings).

How do I identify that column "Y" is a list of int?

amorim-ds · Accepted Answer

If your dataset is big, you should take a sample before applying the type function. Then you can check:

If the most common type is list:

# use .applymap(type) prior to pandas v2.1.0
# sample(100) needs at least 100 rows; use min(len(df), 100) on smaller frames
(df
 .sample(100)
 .map(type)
 .mode(0)
 .astype(str) == "<class 'list'>")

If all values are list:

# use .applymap(type) prior to pandas v2.1.0
(df
 .sample(100)
 .map(type)
 .astype(str) == "<class 'list'>").all(0)

If any values are list:

# use .applymap(type) prior to pandas v2.1.0
(df
 .sample(100)
 .map(type)
 .astype(str) == "<class 'list'>").any(0)
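
To get at the "list of int" part of the question, one possible sketch is to test values with isinstance instead of comparing type strings, and then check the elements inside each list. The helper name list_columns and its parameters n and element_type are hypothetical, made up for this illustration; they are not part of pandas. The sample size is capped at the number of rows so the small frame from the question works too.

```python
import pandas as pd


def list_columns(df, n=100, element_type=None):
    """Return the names of columns whose sampled values are all lists.

    Hypothetical helper: if element_type is given, every element of
    every sampled list must also be an instance of that type.
    """
    # Cap the sample size so frames with fewer than n rows do not raise.
    sample = df.sample(min(len(df), n), random_state=0)
    found = []
    for name in sample.columns:
        values = sample[name]
        # Skip the column unless every sampled value is a list.
        if not values.map(lambda v: isinstance(v, list)).all():
            continue
        # Optionally require every list element to have the given type.
        if element_type is not None and not values.map(
            lambda v: all(isinstance(x, element_type) for x in v)
        ).all():
            continue
        found.append(name)
    return found


df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [[34], [37, 45], [48, 50, 57]],
                   'Z': ['A', 'B', 'C']})

print(list_columns(df))                    # ['Y']
print(list_columns(df, element_type=int))  # ['Y']
print(list_columns(df, element_type=str))  # []
```

Because this works on a sample, a True result for a large frame only says that the sampled rows are lists. Note also that isinstance(True, int) is True in Python, so a list of booleans would pass the int check.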
