ire

Reputation: 581

Filtering by a value that's in an array

I have the following DF:

print(test_df)

Out:
                           Action_flat Resource
0                        autoscaling:*      [*]
1                         cloudwatch:*      [*]
2           cloudformation:CreateStack      [*]
3   cloudformation:DescribeStackEvents      [*]
4               datapipeline:Describe*      [*]
5               datapipeline:Describe*      
6               datapipeline:Describe*      ['---', '---']
..                                 ...      ...

I want to filter this DF to keep only the rows where Resource is [*].

However, this code returns an empty DF:

test_df = test_df[test_df['Resource'] == '[*]']
print(test_df)

Out:
Empty DataFrame
Columns: [Action_flat, Resource]
Index: []

What is the proper way to filter by an array value?

Upvotes: 1

Views: 58

Answers (2)

Mayank Porwal

Reputation: 34086

Use Series.apply on the Resource column to test each value:

In [661]: df = df[df['Resource'].apply(lambda x: '*' in str(x))]

In [662]: df
Out[662]: 
                          Action_flat Resource
0                       autoscaling:*      [*]
1                        cloudwatch:*      [*]
2          cloudformation:CreateStack      [*]
3  cloudformation:DescribeStackEvents      [*]
4              datapipeline:Describe*      [*]
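
Note that '*' in str(x) keeps any row whose value contains an asterisk anywhere, so a list such as ['*', 'other'] would also pass. If an exact match is needed, one option is to compare each cell to the list ['*'] directly. A minimal sketch, assuming the Resource column holds Python lists; the sample frame below is hypothetical and only modelled on the question's output:

```python
import pandas as pd

# Hypothetical sample data; assumes each Resource cell is a Python list.
df = pd.DataFrame({
    "Action_flat": ["autoscaling:*", "cloudwatch:*", "datapipeline:Describe*"],
    "Resource": [["*"], ["*"], ["---", "---"]],
})

# The original attempt compares lists to the string '[*]', so nothing matches.
string_match = df["Resource"] == "[*]"

# Compare each cell to the list ['*'] instead.
mask = df["Resource"].apply(lambda x: x == ["*"])
filtered = df[mask]
print(filtered)
```

This also illustrates why the comparison in the question returns an empty DataFrame: a list is never equal to a string, even if the two print the same way.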

Upvotes: 2

anky

Reputation: 75100

When the values are lists, you can also use the str accessor, which indexes into each list:

output_df = df[df['Resource'].str[0].eq("*")]

If a list can contain more than one element and you only want the rows whose list is exactly ['*'], add a length check:

output_df = df[df['Resource'].str[0].eq("*") & df['Resource'].str.len().eq(1)]
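
To see what the two conditions do, here is a small sketch on made-up data (the rows and the 'arn:example' value are hypothetical, and the column is assumed to hold Python lists):

```python
import pandas as pd

# Hypothetical sample data; assumes each Resource cell is a Python list.
df = pd.DataFrame({
    "Action_flat": ["autoscaling:*", "cloudwatch:*", "datapipeline:Describe*", "s3:Get*"],
    "Resource": [["*"], ["*", "arn:example"], ["---", "---"], []],
})

# .str[0] takes the first element of each list (NaN for an empty list).
first_is_star = df["Resource"].str[0].eq("*")

# .str.len() returns the length of each list.
single_item = df["Resource"].str.len().eq(1)

# First condition alone keeps rows 0 and 1; both together keep only row 0.
loose_df = df[first_is_star]
output_df = df[first_is_star & single_item]
print(output_df)
```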

Upvotes: 2
