Reputation: 25
I need to replace the values in a column. The column values do not have to match exactly, so I use str.find() to search for a substring. When a dictionary key is found in the string, the row should get the corresponding dictionary value.
I achieved the desired result for a single key, but I need to do it for several keys.
I tried putting it in a loop, but it didn't work: only the last dictionary value was applied.
dictionary = {"AA": "111", "BB": "222", "CC": "333,444"}
#result = []
for k, v in dictionary.items():
    df["renamed"] = np.nan
    df.loc[(df["combined_topic"].str.find(k) != -1), "renamed"] = v
    #result.extend(df["renamed"].to_dict(orient="records"))
How should I fix my code? Or can you suggest a more efficient way to replace multiple values?
Expected output:
combined_topic             renamed
AA, harvard                111
Diliman, Technology, BB    222
Cat, Dog, CC, Bull         333, 444
Upvotes: 1
Views: 215
Reputation: 863301
Use Series.str.extract to get the first dictionary key matched in each string, then Series.map to translate it with the dictionary:
pat = '|'.join(dictionary)
df['renamed'] = df['combined_topic'].str.extract('('+ pat + ')', expand=False).map(dictionary)
print(df)
            combined_topic  renamed
0              AA, harvard      111
1  Diliman, Technology, BB      222
2       Cat, Dog, CC, Bull  333,444
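For reference, here is a self-contained sketch of the extract-and-map approach. The sample DataFrame is rebuilt from the expected output in the question, and the fourth row ("no match here") is a made-up addition to show what happens when no key is found. Wrapping the keys in re.escape is an extra precaution not in the code above; it only matters if a key contains regex metacharacters such as "." or "+".

```python
import re

import pandas as pd

# Sample data rebuilt from the question; the last row is a hypothetical
# non-matching value added for illustration.
df = pd.DataFrame({
    "combined_topic": [
        "AA, harvard",
        "Diliman, Technology, BB",
        "Cat, Dog, CC, Bull",
        "no match here",
    ]
})
dictionary = {"AA": "111", "BB": "222", "CC": "333,444"}

# Build one alternation pattern from the keys. re.escape is an added
# safeguard (an assumption, not part of the original answer).
pat = "|".join(re.escape(k) for k in dictionary)

# str.extract returns the first key found in each string (NaN if none),
# and map translates that key into its dictionary value.
df["renamed"] = (
    df["combined_topic"].str.extract("(" + pat + ")", expand=False).map(dictionary)
)
print(df)
```

Rows without any key keep a missing value in "renamed", which can be filled afterwards with fillna if a default is wanted.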
Your solution can use Series.str.contains instead of str.find, but the main fix is to remove df["renamed"] = np.nan from the loop. That line resets the whole column on every iteration, so only the matches for the last key survive:
for k, v in dictionary.items():
    df.loc[df["combined_topic"].str.contains(k), "renamed"] = v
Or:
for k, v in dictionary.items():
    df.loc[(df["combined_topic"].str.find(k) != -1), "renamed"] = v
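A runnable sketch of the loop version, again on sample data rebuilt from the question plus one made-up non-matching row. Two details here are assumptions added for illustration: the column is created once before the loop (with None, so it holds objects rather than floats), and regex=False is passed so each key is treated as a literal substring, which mirrors what str.find does.

```python
import pandas as pd

# Sample data rebuilt from the question; the last row is a hypothetical
# non-matching value added for illustration.
df = pd.DataFrame({
    "combined_topic": [
        "AA, harvard",
        "Diliman, Technology, BB",
        "Cat, Dog, CC, Bull",
        "no match here",
    ]
})
dictionary = {"AA": "111", "BB": "222", "CC": "333,444"}

# Create the target column once, outside the loop, so earlier matches
# are not wiped out on later iterations.
df["renamed"] = None

for k, v in dictionary.items():
    # regex=False treats the key as a plain substring, like str.find.
    mask = df["combined_topic"].str.contains(k, regex=False)
    df.loc[mask, "renamed"] = v

print(df)
```

Note that if a row contains several keys, the loop keeps the value of the last key in the dictionary that matches, whereas the str.extract approach keeps the key that appears first in the string.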
Upvotes: 2