Kewl

Reputation: 3417

Checking if a data series is strings

I want to check if a column in a dataframe contains strings. I would have thought this could be done just by checking dtype, but that isn't the case. A pandas series that contains strings just has dtype 'object', which is also used for other data structures (like lists):

import pandas as pd

df = pd.DataFrame({'a': [1,2,3], 'b': ['Hello', '1', '2'], 'c': [[1],[2],[3]]})
print(df['a'].dtype)
print(df['b'].dtype)
print(df['c'].dtype)

Produces:

int64
object
object

Is there some way of checking if a column contains only strings?

Upvotes: 9

Views: 14188

Answers (2)

David Bern

Reputation: 776

You could map each element to True or False depending on whether its type is str, then check whether the result contains any False values.

The example below tests a list that contains elements other than str. It returns True if any element of another type is present.

test = [1, 2, '3']
False in map((lambda x: type(x) == str), test)

Output: True
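
As a rough sketch, the same idea can be applied to the DataFrame from the question. The helper name `all_strings` is made up for illustration, and `isinstance` is used in place of `type(x) == str`, so subclasses of str also count as strings. Note that an empty column would report True, because `all()` of an empty sequence is True.

```python
import pandas as pd

def all_strings(values):
    """Return True if every element of `values` is a str instance."""
    # Hypothetical helper, not part of pandas.
    return all(isinstance(x, str) for x in values)

df = pd.DataFrame({'a': [1, 2, 3], 'b': ['Hello', '1', '2'], 'c': [[1], [2], [3]]})

# Check each column separately.
result = {col: all_strings(df[col]) for col in df.columns}
print(result)
```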

Upvotes: 1

piRSquared

Reputation: 294218

You can use this to see if all elements in a column are strings

df.applymap(type).eq(str).all()

a    False
b     True
c    False
dtype: bool

To just check if any are strings

df.applymap(type).eq(str).any()
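
If only a single column needs checking, a per-column variant might look like the sketch below. The helper name `column_is_all_str` is an assumption for illustration. `pandas.api.types.infer_dtype` is shown as an alternative; it skips missing values by default, whereas the `isinstance` check treats NaN as a non-string. Also note that newer pandas releases (2.1 and later) deprecate `DataFrame.applymap` in favor of `DataFrame.map`.

```python
import pandas as pd
from pandas.api.types import infer_dtype

df = pd.DataFrame({'a': [1, 2, 3], 'b': ['Hello', '1', '2'], 'c': [[1], [2], [3]]})

def column_is_all_str(series):
    """Return True if every value in the Series is a str (hypothetical helper)."""
    return bool(series.map(lambda x: isinstance(x, str)).all())

only_b = column_is_all_str(df['b'])

# infer_dtype inspects the values rather than the dtype label.
inferred = infer_dtype(df['b'])
print(only_b, inferred)
```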

Upvotes: 20
