Reputation: 5498
How can I search for the string 'data1'
in the following Pandas dataframe?
This is where the string can be found:
df.test[0][0]['term']
'data1'
Further information regarding dataframe structure:
df.test[0]
[{'term': 'data1', 'a': "foo", 'b': "bar"},
{'term': 'data2' ,'a': "foo", 'b': "bar"}]
type(df.test)
pandas.core.series.Series
type(df.test[0])
list
type(df.test[0][0])
dict
What have I tried?
I appreciate that something like df.test.str.contains('data1')
is required, but I'm not sure how to do this with the nested list/dict data structure.
Upvotes: 1
Views: 66
Reputation: 863226
The easiest way is to convert to string and test against the string representation of the list of dicts:
df.test.astype(str).str.contains('data1')
If you need to test by the term key:
df['test'].apply(lambda x: any(y.get('term') == 'data1' for y in x))
Or test against all values of the dicts:
df['test'].apply(lambda x: any('data1' in y.values() for y in x))
Sample:
import pandas as pd

a = [{'term': 'data1', 'a': "foo", 'b': "bar"},
{'term': 'data2' ,'a': "foo", 'b': "bar"}]
b = [{'term': 'data4', 'a': "foo", 'b': "bar"},
{'term': 'data2' ,'a': "foo", 'b': "bar"}]
df = pd.DataFrame({"test": [a, b]})
print (df)
                                                test
0  [{'term': 'data1', 'a': 'foo', 'b': 'bar'}, {'...
1  [{'term': 'data4', 'a': 'foo', 'b': 'bar'}, {'...
print (df.test.astype(str).str.contains('data1'))
0 True
1 False
Name: test, dtype: bool
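One thing to keep in mind with the string approach: it matches 'data1' anywhere in the text of the whole list, so it can also hit dict keys or longer values such as 'data10'. A small sketch of the difference, using made-up rows (the `demo` frame and its contents are my own illustration, not from the question):

```python
import pandas as pd

# Hypothetical data: no dict here has the exact value 'data1' under 'term'
rows = [
    [{'term': 'data10', 'a': 'foo'}],   # 'data1' is only a prefix of 'data10'
    [{'data1': 'x', 'term': 'data2'}],  # 'data1' appears only as a key
    [{'term': 'data3', 'a': 'foo'}],
]
demo = pd.DataFrame({'test': rows})

# regex=False treats the pattern as a literal substring
by_string = demo['test'].astype(str).str.contains('data1', regex=False)

# Exact comparison against the 'term' key only
by_term = demo['test'].apply(lambda x: any(y.get('term') == 'data1' for y in x))

print(by_string.tolist())  # [True, True, False]
print(by_term.tolist())    # [False, False, False]
```

So if an exact match on one key matters, the apply-based checks are probably the safer choice.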
print (df['test'].apply(lambda x: any(y.get('term') == 'data1' for y in x)))
0 True
1 False
Name: test, dtype: bool
print (df['test'].apply(lambda x: any('data1' in y.values() for y in x)))
0 True
1 False
Name: test, dtype: bool
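Each of these returns a boolean Series, so it can be used as a mask to select the matching rows. A minimal sketch, assuming some cells might not be lists at all (for example None); the helper name `has_term` and the extra None row are hypothetical additions for illustration:

```python
import pandas as pd

def has_term(cell, value, key='term'):
    """Return True if any dict in the list `cell` has `value` under `key`."""
    if not isinstance(cell, list):
        return False  # treat missing or non-list cells as "no match"
    return any(isinstance(d, dict) and d.get(key) == value for d in cell)

a = [{'term': 'data1', 'a': "foo", 'b': "bar"},
     {'term': 'data2', 'a': "foo", 'b': "bar"}]
b = [{'term': 'data4', 'a': "foo", 'b': "bar"},
     {'term': 'data2', 'a': "foo", 'b': "bar"}]

# Third row is None to show the guard in has_term
demo = pd.DataFrame({"test": [a, b, None]})

mask = demo['test'].apply(lambda cell: has_term(cell, 'data1'))
matches = demo[mask]  # boolean indexing keeps only rows where mask is True

print(mask.tolist())           # [True, False, False]
print(matches.index.tolist())  # [0]
```

The isinstance guards are optional; without them a plain lambda would raise on a None cell.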
Upvotes: 2