aartist

Reputation: 3236

Select values from available list in pandas

The following code works and displays the desired result. For the SOURCE column, I'd like to keep only a value from the available list; if an entry contains several values, they should be checked in reverse order.

import pandas as pd

available = ['a', 'b']
df = pd.DataFrame.from_dict({'SOURCE': ['x-a', 'b-y-z', 'c']})
for entry in df['SOURCE']:
    if '-' not in entry:
        continue
    # Check the '-'-separated parts from last to first.
    for col in entry.split('-')[::-1]:
        if col in available:
            df.loc[df['SOURCE'] == entry, 'SOURCE'] = col
            break
print(df)

Output:
  SOURCE
0      a
1      b
2      c

Is there a more Pythonic way to do it?

Update: The characters are just placeholders for strings in the actual problem. If no match is found in the available list, the original value should be returned.

Upvotes: 1

Views: 71

Answers (1)

Quang Hoang

Reputation: 150735

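You can use str.extract, passing expand=False so that it returns a Series and fillna aligns on the row index:

pat = '|'.join(available[::-1])
df['SOURCE'] = df['SOURCE'].str.extract(f'({pat})', expand=False).fillna(df['SOURCE'])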

Output:

  SOURCE
0      a
1      b
2      c
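
One caveat: str.extract returns the leftmost regex match and also matches inside longer strings, so it can differ from the loop in the question when an entry holds several available values (for example 'a-b') or when one value is a substring of another token. If the real strings behave like that, a token-based version may be safer. The following is only a minimal sketch, assuming the values are always separated by '-'; pick_last_available is a hypothetical helper name, not a pandas function, and the two extra sample rows are added purely for illustration.

```python
import pandas as pd

available = ['a', 'b']
# The last two rows are illustrative additions, not part of the question's data.
df = pd.DataFrame({'SOURCE': ['x-a', 'b-y-z', 'c', 'a-b', 'xa-y']})


def pick_last_available(entry, allowed):
    """Hypothetical helper: return the right-most '-'-separated token of
    `entry` that is in `allowed`, or `entry` unchanged if none is."""
    for token in reversed(entry.split('-')):
        if token in allowed:
            return token
    return entry


allowed = set(available)  # set membership avoids rescanning the list
df['SOURCE'] = df['SOURCE'].map(lambda s: pick_last_available(s, allowed))
print(df)
```

Here 'a-b' becomes 'b' (the last available token, as in the original loop), and 'xa-y' is left unchanged because 'xa' is not an exact match for 'a'.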

Upvotes: 1
