H K

Reputation: 83

How to take a list of items and create a condition using them all

I want to write a function that takes a list of strings, checks whether a particular column contains each of them, and returns a boolean mask. I can easily do this for a single string, but I'm stumped on how to do it for a list of strings.

# Single String Example
def mask(x, df):
    return df.description.str.contains(x)
df[mask('sql')]

# Roughly what I want (pseudocode: one condition per string, all ANDed together)
def mask(x, df):
    return df.description.str.contains(x[0]) & df.description.str.contains(x[1]) & df.description.str.contains(x[2]) & ...
df[mask(['sql', 'python', ...], df)]

Any help would be appreciated :)

Edit: I figured out a way to do it. It's a little unorthodox, but it seems to work. Solution below:

import numpy as np

def mask(x, df):
    # Multiply the per-string boolean masks element-wise; the product is 1 only where every string matches
    X = np.prod([df.description.str.contains(i) for i in x], axis=0)
    return X == 1
my_selection = df[mask(['sql', 'python'], df)]

Upvotes: 2

Views: 84

Answers (2)

H K

Reputation: 83

I managed to work out a solution:

import numpy as np

def mask(x, df):
    # Multiply the per-string boolean masks element-wise; the product is 1 only where every string matches
    X = np.prod([df.description.str.contains(i) for i in x], axis=0)
    return X == 1
mine = df[mask(['sql', 'python'], df)]

It's a little unorthodox, so if anyone has something better, it would be appreciated.
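
One possible alternative, as a minimal sketch rather than a definitive answer: build one boolean mask per string and AND them together with np.logical_and.reduce, which avoids the multiply-then-compare step. The sample DataFrame is made up for illustration, and regex=False and na=False are assumptions about the desired behavior (literal matching, and missing descriptions counted as non-matches).

```python
import numpy as np
import pandas as pd

def mask(x, df):
    # One boolean Series per search string. regex=False treats each string
    # literally; na=False counts missing descriptions as non-matches.
    conditions = [df["description"].str.contains(s, regex=False, na=False) for s in x]
    # AND all conditions together element-wise into a single boolean array.
    return np.logical_and.reduce(conditions)

# Hypothetical sample data, for illustration only
df = pd.DataFrame({"description": ["sql and python", "python only", "sql only", None]})
selection = df[mask(["sql", "python"], df)]
```

This sketch assumes x holds at least one string; with an empty list the reduction returns a single scalar instead of a per-row mask.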

Upvotes: 0

U13-Forward

Reputation: 71610

Try using:

def mask(x, df):
    return df.description.str.contains(''.join(map('(?=.*%s)'.__mod__, x)))
df[mask(['a', 'b'], df)]

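Chaining (?=.*<word>) lookaheads one after another acts as an AND operator: every lookahead has to succeed from the same position, so the pattern matches only when all of the words appear in the string.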
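
If the search strings can contain regex metacharacters (for example "c++"), a variation on the same idea is to escape each word first. This is a sketch under assumptions, not part of the original answer: the sample DataFrame is invented for illustration, and na=False is an assumed choice for handling missing descriptions.

```python
import re
import pandas as pd

def mask(x, df):
    # Build one lookahead per word, e.g. (?=.*sql)(?=.*c\+\+). re.escape keeps
    # characters such as "+" or "." from being interpreted as regex syntax.
    pattern = "".join(f"(?=.*{re.escape(word)})" for word in x)
    # na=False counts missing descriptions as non-matches.
    return df["description"].str.contains(pattern, na=False)

# Hypothetical sample data, for illustration only
df = pd.DataFrame({"description": ["sql and c++", "c++ only", "sql only"]})
selection = df[mask(["sql", "c++"], df)]
```

Note that "." does not match newlines by default, so descriptions that span several lines may need flags=re.DOTALL passed to str.contains.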

Upvotes: 1
