Reputation: 3417
I want to check if a column in a dataframe contains strings. I would have thought this could be done just by checking dtype, but that isn't the case. A pandas series that contains strings just has dtype 'object', which is also used for other data structures (like lists):
import pandas as pd
df = pd.DataFrame({'a': [1,2,3], 'b': ['Hello', '1', '2'], 'c': [[1],[2],[3]]})
print(df['a'].dtype)
print(df['b'].dtype)
print(df['c'].dtype)
Produces:
int64
object
object
Is there some way of checking if a column contains only strings?
Upvotes: 9
Views: 14188
Reputation: 776
You could map each element to True or False depending on whether its type is str, then check whether the result contains any False values.
The example below tests a list that contains elements other than str. It evaluates to True if data of any other type is present:
test = [1, 2, '3']
False in map((lambda x: type(x) == str), test)
Output: True
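The same idea can be applied to each column of the DataFrame from the question. This is only a minimal sketch: `has_non_str` is a hypothetical helper name, and it uses `isinstance` instead of `type(x) == str`, so subclasses of str also count as strings.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': ['Hello', '1', '2'], 'c': [[1], [2], [3]]})

def has_non_str(values):
    """Return True if any element of `values` is not a str instance."""
    return not all(isinstance(x, str) for x in values)

# Apply the element-wise check to every column of the frame.
result = {col: has_non_str(df[col]) for col in df.columns}
print(result)
```

Only column 'b' should come out as False here, since it is the only column whose elements are all strings.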
Upvotes: 1
Reputation: 294218
You can use this to check whether all elements in each column are strings:
df.applymap(type).eq(str).all()
a    False
b     True
c    False
dtype: bool
To check whether any elements are strings instead:
df.applymap(type).eq(str).any()
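As far as I know, `DataFrame.applymap` is deprecated in recent pandas releases (2.1 and later) in favor of `DataFrame.map`. A version-independent sketch of the same check goes through `Series.map` column by column. `pandas.api.types.infer_dtype` is shown as an alternative that returns a label describing what a column holds. The names `all_str` and `inferred` are only illustrative.

```python
import pandas as pd
from pandas.api.types import infer_dtype

df = pd.DataFrame({'a': [1, 2, 3], 'b': ['Hello', '1', '2'], 'c': [[1], [2], [3]]})

# Element-wise type check per column, using Series.map instead of applymap.
all_str = df.apply(lambda col: col.map(type).eq(str).all())
print(all_str)

# infer_dtype returns a label such as 'string', 'integer' or 'mixed'.
inferred = {name: infer_dtype(df[name]) for name in df.columns}
print(inferred['b'])
```

Note that `infer_dtype` skips missing values by default, so it may report 'string' for a column that also contains NaN, whereas the element-wise type check would not.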
Upvotes: 20