Reputation: 5922
I am currently mapping and renaming various string values to columns in pandas
via this function:
df["fundbenchmark"] = df["name"].map(lambda x: "American Express" if "AXP" in x else "Apple" if "AAPL" in x else "Google" if "GOOG" in x else "")
I will however do this for several columns and for many different renamings. Here is another column with identical content, so it would be repetitive to add all companies there also.
df["subclass"] = df["name"].map(lambda x: "American Express" if "AXP" in x else "Apple" if "AAPL" in x else "Google" if "GOOG" in x else "")
As such I would like to maintain a dictionary like:
companies = {"AXP": "American Express", "AAPL": "Apple", "GOOG": "Google"}
and call that for all instances when I need to map company names. How can I make it map
for matches in companies
instead of lambda x
?
Current DataFrame:
Name
"BULL AXP UN X3 VON"
"BEAR AXP UN X3 VON"
"BULL GOOG UN X5 VON"
"BEAR GOOG UN X5 VON"
"BEAR ABC123 X2 CBZ"
Desired output:
Name Fundbenchmark Subclass
"BULL AXP UN X3 VON" "American Express" "American Express"
"BEAR AXP UN X3 VON" "American Express" "American Express"
"BULL GOOG UN X5 VON" "Google" "Google"
"BEAR GOOG UN X5 VON" "Google" "Google"
"BEAR ABC123 X2 CBZ" "BEAR ABC123 X2 CBZ" "BEAR ABC123 X2 CBZ" #Not in Dictionary
Dictionary:
companies = {"AXP": "American Express", "GOOG": "Google"}
So if abbreviation exists in dictionary, then write that name to other columns.
Or, if the abbreviation is not in the dictionary, duplicate the whole cell.
Upvotes: 1
Views: 7558
Reputation: 4511
You can use map
Current DataFrame:
Name
"BULL AXP UN X3 VON"
"BEAR AXP UN X3 VON"
"BULL GOOG UN X5 VON"
"BEAR GOOG UN X5 VON"
"BEAR ABC123 X2 CBZ"
companies = {"AXP": "American Express", "GOOG": "Google"}
we create a new column that extract the tickers from your column.
df['Tickers'] = df.Name.str.split(' ').apply(lambda x: x[1])
We then use map your dictionary companies
to the tickers to create a column with the names of the ticker symbols:
df['Ticker_Name'] = df['Tickers'].map(companies)
Upvotes: 2
Reputation: 30404
I doubt this is the most elegant way, but it should do the trick:
df['fbm'] = df['name']
for i in companies:
df.loc[ df.name.str.contains(i), 'fbm' ] = companies[i]
name fbm
0 BULL AXP UN X3 VON American Express
1 BEAR AXP UN X3 VON American Express
2 BULL GOOG UN X5 VON Google
3 BEAR GOOG UN X5 VON Google
4 BEAR ABC123 X2 CBZ BEAR ABC123 X2 CBZ
One thing to keep in mind here is that because this is not a dictionary lookup, you could have more than one match. For example, 'ABC' and 'UN' are both valid tickers and 'BEAR' is or was. With this method, the last match will be kept and any prior matches discarded.
Upvotes: 1
Reputation: 7822
Use Replace:
http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.replace.html
df = pd.DataFrame({'A':['string1','string2','string3'],
'B':['StringA','StringB','StringC']})
Creates:
A B
0 string1 StringA
1 string2 StringB
2 string3 StringC
Then map your replacements in a dictionary:
to_replace = {'string1':'replace1','StringC':'replaceC'}
Then replace:
df.replace(to_replace)
A B
0 replace1 StringA
1 string2 StringB
2 string3 replaceC
Upvotes: 1