nipy

Reputation: 5498

Pandas string search in list of dicts

How can I search for the string 'data1' in the following Pandas dataframe?

This is where the string can be found:

df.test[0][0]['term']
'data1'

Further information regarding dataframe structure:

df.test[0]

[{'term': 'data1', 'a': "foo", 'b': "bar"},
 {'term': 'data2' ,'a': "foo", 'b': "bar"}]

type(df.test)
pandas.core.series.Series

type(df.test[0])
list

type(df.test[0][0])
dict

What have I tried?

I appreciate that something like df.test.str.contains('data1') is required, but I'm not sure how to do this with the nested list/dict data structure.

Upvotes: 1

Views: 66

Answers (1)

jezrael

Reputation: 863226

The easiest way is to convert each value to a string and test against the string representation of the list of dicts:

df.test.astype(str).str.contains('data1')

If you need to test only the term key:

df['test'].apply(lambda x: any(y.get('term') == 'data1' for y in x))

Or to test all values of the dicts:

df['test'].apply(lambda x: any('data1' in y.values() for y in x))
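The three approaches are not interchangeable: the string conversion matches 'data1' as a substring anywhere in the text, while the other two require an exact match on a value. A minimal sketch of the difference, assuming a small made-up frame (`df_demo` and its rows are hypothetical, not from the question):

```python
import pandas as pd

# Hypothetical rows chosen to show where the approaches disagree.
rows = [
    [{'term': 'data10', 'a': 'foo'}],   # 'data1' appears only as a substring
    [{'term': 'other', 'a': 'data1'}],  # 'data1' is stored under a different key
    [{'term': 'data1', 'a': 'foo'}],    # exact match on the 'term' key
]
df_demo = pd.DataFrame({'test': rows})

# Substring match against the string representation of each list.
by_string = df_demo['test'].astype(str).str.contains('data1', regex=False)

# Exact match on the 'term' key only.
by_term = df_demo['test'].apply(lambda x: any(y.get('term') == 'data1' for y in x))

# Exact match on any value of any dict.
by_values = df_demo['test'].apply(lambda x: any('data1' in y.values() for y in x))

print(by_string.tolist())  # [True, True, True]
print(by_term.tolist())    # [False, False, True]
print(by_values.tolist())  # [False, True, True]
```

If only the term key matters, the second form is probably the safest choice, since it ignores both partial matches and matches in other keys.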

Sample:

import pandas as pd

a = [{'term': 'data1', 'a': "foo", 'b': "bar"},
     {'term': 'data2', 'a': "foo", 'b': "bar"}]
b = [{'term': 'data4', 'a': "foo", 'b': "bar"},
     {'term': 'data2', 'a': "foo", 'b': "bar"}]
df = pd.DataFrame({"test": [a, b]})
print (df)
                                                test
0  [{'term': 'data1', 'a': 'foo', 'b': 'bar'}, {'...
1  [{'term': 'data4', 'a': 'foo', 'b': 'bar'}, {'...

print (df.test.astype(str).str.contains('data1'))
0     True
1    False
Name: test, dtype: bool

print (df['test'].apply(lambda x: any(y.get('term') == 'data1' for y in x)))
0     True
1    False
Name: test, dtype: bool

print (df['test'].apply(lambda x: any('data1' in y.values() for y in x)))
0     True
1    False
Name: test, dtype: bool
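Each of these expressions returns a boolean Series, so it can be used directly as a mask to select the matching rows. As a possible alternative to the lambda over nested lists, the list column can also be flattened first. The following is only a sketch, assuming every list is non-empty and every element is a dict (an empty list would become NaN after `explode` and break the `.get` call):

```python
import pandas as pd

a = [{'term': 'data1', 'a': "foo", 'b': "bar"},
     {'term': 'data2', 'a': "foo", 'b': "bar"}]
b = [{'term': 'data4', 'a': "foo", 'b': "bar"},
     {'term': 'data2', 'a': "foo", 'b': "bar"}]
df = pd.DataFrame({"test": [a, b]})

# One dict per row; the original row index is repeated for each dict.
exploded = df['test'].explode()

# Pull out the 'term' value of each dict.
terms = exploded.map(lambda d: d.get('term'))

# Collapse back to one boolean per original row.
mask = terms.eq('data1').groupby(level=0).any()

# Keep only the rows whose list contains a dict with term == 'data1'.
matches = df[mask]
print(mask.tolist())           # [True, False]
print(matches.index.tolist())  # [0]
```

The flattened `terms` Series may be handy if several different terms need to be checked, since the extraction is done once rather than inside each lambda.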

Upvotes: 2
