P A N

Reputation: 5922

Python str.contains from two or more dictionaries

I want to check if a string contains one or more values from two dictionaries.

company = {"AXP": "American Express", "BIDU": "Baidu"}
stock_index = {"GOOG": "Google"}

for c, i in zip(company, stock_index):
    df.loc[df.name.str.contains(c, i), "instrumentclass"] = "Equity"

For some reason, it only writes "Equity" for the first match in the dictionaries, i.e. "AXP": "American Express". For "Baidu" and "Google", nothing happens.

I know that I can combine the dictionaries into one, as seen below, but I would prefer not to.

benchmarks = company.copy()
benchmarks.update(stock_index)

The data is written and retrieved with the help of a pandas DataFrame.

import pandas as pd
df = pd.DataFrame(["LONG AXP", "SHORT AXP", "LONG BIDU", "LONG GOOG"], columns=["name"])

The code copies the column name to the column instrumentclass and is then supposed to replace each cell with "Equity" if it contains "AXP", "BIDU" or "GOOG".

Upvotes: 1

Views: 463

Answers (1)

pawroman

Reputation: 1300

Why don't you start by breaking down this data, like this:

df = pd.DataFrame(["LONG AXP", "SHORT AXP", "LONG BIDU", "LONG GOOG"],
                  columns=["name"])

# split on spaces and get the last part
df["company_name"] = df.name.str.split().str.get(-1)

>>> print df
        name company_name
0   LONG AXP          AXP
1  SHORT AXP          AXP
2  LONG BIDU         BIDU
3  LONG GOOG         GOOG

Now, it's much easier to work with these strings. Given this is a sample of your dictionaries:

company = {"AXP": "American Express", "BIDU": "Baidu"}
stock_index = {"GOOG": "Google"}

You can exploit "dictionary views", which behave like sets in Python:

# this is Python 2; in Python 3, use .keys() instead, which already returns a view
all_companies = company.viewkeys() | stock_index.viewkeys()

>>> print all_companies
{'AXP', 'BIDU', 'GOOG'}
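
For readers on Python 3, where viewkeys() no longer exists, the same union can be sketched with .keys(), which returns a set-like view. The sorted() call is only there to make the printed order predictable:

```python
company = {"AXP": "American Express", "BIDU": "Baidu"}
stock_index = {"GOOG": "Google"}

# In Python 3, .keys() returns a view that supports set operations such as |
all_companies = company.keys() | stock_index.keys()

# The union is a plain set, so sort it for a stable display order
print(sorted(all_companies))  # ['AXP', 'BIDU', 'GOOG']
```

Neither dictionary is modified or merged; only their keys are combined into a new set.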

So now, we have a set-like object we can use to filter on the data and set "Equity":

df.loc[df.company_name.isin(all_companies), "instrumentclass"] = "Equity"
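
If the ticker is not always the last word and you would rather keep str.contains, here is a rough sketch of an alternative. It also hints at why the loop in the question only handled one company: zip stops at the shorter dictionary, so it yields a single pair ("AXP", "GOOG"), and the second positional argument of str.contains is case, not another pattern. The variable names pattern and mask are mine, and this is written for Python 3:

```python
import re
from itertools import chain

import pandas as pd

company = {"AXP": "American Express", "BIDU": "Baidu"}
stock_index = {"GOOG": "Google"}

df = pd.DataFrame(["LONG AXP", "SHORT AXP", "LONG BIDU", "LONG GOOG"],
                  columns=["name"])

# chain() walks the keys of both dictionaries without merging them;
# re.escape guards against keys that contain regex metacharacters
pattern = "|".join(re.escape(key) for key in chain(company, stock_index))

# str.contains treats the pattern as a regular expression by default
mask = df["name"].str.contains(pattern)
df.loc[mask, "instrumentclass"] = "Equity"
```

Note that this matches substrings, so a key like "AXP" would also match a name containing "AXPX". Splitting out the ticker and using isin, as above, avoids that.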

If you would rather not merge the dictionaries like that, you might want to consider using something like a ChainMap: https://docs.python.org/3/library/collections.html#collections.ChainMap. It is part of the Python 3 standard library, but backports to Python 2 should exist.
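
A minimal sketch of the ChainMap idea, assuming Python 3 and the sample data from the question. The name benchmarks is borrowed from the question; the rest is only an illustration:

```python
from collections import ChainMap

import pandas as pd

company = {"AXP": "American Express", "BIDU": "Baidu"}
stock_index = {"GOOG": "Google"}

# ChainMap presents both dictionaries as one mapping without copying them
benchmarks = ChainMap(company, stock_index)

df = pd.DataFrame(["LONG AXP", "SHORT AXP", "LONG BIDU", "LONG GOOG"],
                  columns=["name"])

# split on spaces and take the last part, as in the answer above
df["company_name"] = df["name"].str.split().str.get(-1)

# iterating a ChainMap yields the keys of all underlying dictionaries
df.loc[df["company_name"].isin(list(benchmarks)), "instrumentclass"] = "Equity"

# lookups fall through the dictionaries in order
print(benchmarks["GOOG"])  # Google
```

The two original dictionaries stay separate and unchanged, and later changes to them are visible through benchmarks.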

Upvotes: 2
