Reputation: 2414
I have a pandas.DataFrame:
index question_id tag
0 1858 [pset3, game-of-fifteen]
1 2409 [pset4]
2 4346 [pset6, cs50submit]
3 9139 [pset8, pset5, gradebook]
4 9631 [pset4, recover]
I need to remove every string from the lists in the tag column except the pset* strings.
So I need to end with something like this:
index question_id tag
0 1858 [pset3]
1 2409 [pset4]
2 4346 [pset6]
3 9139 [pset8, pset5]
4 9631 [pset4]
How can I do that please?
Upvotes: 1
Views: 1590
Reputation: 38415
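You can also use Python's in operator. Note that this is a substring test, so it keeps any tag that contains 'pset' anywhere, not only tags that start with it: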
df.tag = df.tag.apply(lambda x: [elem for elem in x if 'pset' in elem])
0 [pset3]
1 [pset4]
2 [pset6]
3 [pset8, pset5]
4 [pset4]
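To illustrate how the substring test differs from a prefix test, here is a small sketch. The tag 'old-pset3' is hypothetical (it does not appear in the question's data) and is included only to show a case where the two checks disagree. It also assumes the tag column holds real Python lists of strings.

```python
import pandas as pd

# Hypothetical data: 'old-pset3' is NOT from the question; it is added
# only to show where `in` and `startswith` give different results.
df = pd.DataFrame({
    "question_id": [1858, 2409],
    "tag": [["pset3", "game-of-fifteen"], ["pset4", "old-pset3"]],
})

# Substring test: keeps any tag containing 'pset' anywhere.
contains = df["tag"].apply(lambda tags: [t for t in tags if "pset" in t])

# Prefix test: keeps only tags that begin with 'pset'.
prefix = df["tag"].apply(lambda tags: [t for t in tags if t.startswith("pset")])

print(contains.tolist())  # [['pset3'], ['pset4', 'old-pset3']]
print(prefix.tolist())    # [['pset3'], ['pset4']]
```

For the data shown in the question both checks give the same result, so the choice only matters if tags like the hypothetical one can occur.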
Upvotes: 2
Reputation: 36608
You can apply a function to the tag series that constructs a list using only the elements that start with 'pset':
df.tag.apply(lambda x: [xx for xx in x if xx.startswith('pset')])
# returns:
0 [pset3]
1 [pset4]
2 [pset6]
3 [pset8, pset5]
4 [pset4]
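One thing worth noting: apply returns a new Series and leaves the DataFrame untouched, so the result has to be assigned back if the column should change. A minimal sketch, assuming the tag column holds Python lists (the two-row frame here is a shortened stand-in for the question's data):

```python
import pandas as pd

# Shortened stand-in for the question's DataFrame.
df = pd.DataFrame({"tag": [["pset3", "game-of-fifteen"], ["pset4"]]})

# apply builds a new Series; df itself is not modified here.
filtered = df["tag"].apply(lambda tags: [t for t in tags if t.startswith("pset")])

print(df["tag"].tolist())   # [['pset3', 'game-of-fifteen'], ['pset4']]
print(filtered.tolist())    # [['pset3'], ['pset4']]

# Assign the result back to actually replace the column.
df["tag"] = filtered
```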
Upvotes: 2
Reputation: 214957
One option: use the apply method to loop through the items in the tag column; for each item, use a list comprehension to filter strings by prefix with the startswith method:
df['tag'] = df.tag.apply(lambda lst: [x for x in lst if x.startswith("pset")])
df
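For anyone who wants to reproduce this end to end, here is a self-contained sketch that rebuilds the question's data and applies the same prefix filter. It assumes each tag cell is a Python list of strings (not a string such as "[pset3, game-of-fifteen]"); the helper name keep_pset is made up for this example.

```python
import pandas as pd

# Rebuild the DataFrame from the question (assumes real lists in 'tag').
df = pd.DataFrame({
    "question_id": [1858, 2409, 4346, 9139, 9631],
    "tag": [
        ["pset3", "game-of-fifteen"],
        ["pset4"],
        ["pset6", "cs50submit"],
        ["pset8", "pset5", "gradebook"],
        ["pset4", "recover"],
    ],
})

def keep_pset(tags):
    """Hypothetical helper: keep only tags that start with 'pset'."""
    return [t for t in tags if t.startswith("pset")]

# Replace the column with the filtered lists.
df["tag"] = df["tag"].apply(keep_pset)
print(df)
```

A named function does the same thing as the lambda; it is just easier to test and reuse.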
Upvotes: 2