Reputation: 581
I have the following DF:
print(test_df)
Out:
                           Action_flat      Resource
0                        autoscaling:*           [*]
1                         cloudwatch:*           [*]
2           cloudformation:CreateStack           [*]
3   cloudformation:DescribeStackEvents           [*]
4               datapipeline:Describe*           [*]
5               datapipeline:Describe*
6               datapipeline:Describe*  ['---', '---']
..                                 ...           ...
I want to filter this DF so that it only keeps rows where the Resource is [*].
This code, however, returns an empty DF:
test_df = test_df[test_df['Resource'] == '[*]']
print(test_df)
Out:
Empty DataFrame
Columns: [Action_flat, Resource]
Index: []
What is the proper way to filter by an array value?
Upvotes: 1
Views: 58
Reputation: 34086
Use apply on the Resource column. The check '*' in str(x) keeps every row whose Resource contains a '*':
In [661]: df = df[df['Resource'].apply(lambda x: '*' in str(x))]
In [662]: df
Out[662]:
                          Action_flat  Resource
0                       autoscaling:*       [*]
1                        cloudwatch:*       [*]
2          cloudformation:CreateStack       [*]
3  cloudformation:DescribeStackEvents       [*]
4              datapipeline:Describe*       [*]
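A self-contained sketch of this approach may help show why the original comparison came back empty. It assumes the Resource column holds Python lists rather than strings; the sample frame, the empty list in row 2, and the ARN in row 4 are made up for illustration. Because '*' in str(x) is a substring test, it also keeps multi-element lists that contain a '*', whereas comparing against ['*'] keeps only exact matches.

```python
import pandas as pd

# Hypothetical data modelled on the question; Resource is assumed to hold lists.
df = pd.DataFrame({
    "Action_flat": [
        "autoscaling:*",
        "cloudwatch:*",
        "datapipeline:Describe*",
        "datapipeline:Describe*",
        "s3:GetObject",
    ],
    "Resource": [["*"], ["*"], [], ["---", "---"], ["*", "arn:aws:s3:::example"]],
})

# A list never equals the string '[*]', so this mask is False everywhere.
string_mask = df["Resource"] == "[*]"

# Substring test on the printed form of each list (the approach in this answer).
contains_star = df[df["Resource"].apply(lambda x: "*" in str(x))]

# Stricter variant: keep only rows whose list is exactly ['*'].
exact_star = df[df["Resource"].apply(lambda x: x == ["*"])]

print(contains_star)
print(exact_star)
```

Which variant fits depends on whether a row such as ['*', 'arn:aws:s3:::example'] should count as a match.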
Upvotes: 2
Reputation: 75100
You can also use the str accessor when you have a list as a value:
output_df = df[df['Resource'].str[0].eq("*")]
If a list with more than one element can occur and you only want rows where '*' is the sole element, add a length check:
output_df = df[df['Resource'].str[0].eq("*") & df['Resource'].str.len().eq(1)]
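As a runnable sketch of both conditions, again assuming Resource holds Python lists (the sample rows are invented for illustration): on list values, .str[0] picks the first element and yields NaN for an empty list, and .str.len() returns the number of elements.

```python
import pandas as pd

# Hypothetical data; Resource is assumed to hold Python lists.
df = pd.DataFrame({
    "Action_flat": [
        "autoscaling:*",
        "cloudwatch:*",
        "datapipeline:Describe*",
        "datapipeline:Describe*",
        "s3:GetObject",
    ],
    "Resource": [["*"], ["*"], [], ["---", "---"], ["*", "arn:aws:s3:::example"]],
})

# First element of each list equals '*'; an empty list gives NaN, which compares False.
first_is_star = df["Resource"].str[0].eq("*")

# The list has exactly one element.
has_one_item = df["Resource"].str.len().eq(1)

# Rows whose Resource is exactly ['*'].
output_df = df[first_is_star & has_one_item]
print(output_df)
```

Without the length check, row 4 would also be kept because its first element is '*'.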
Upvotes: 2