Reputation: 4928
MRE:
import pandas as pd

dictionary = {'2018-10': 50, '2018-11': 76}
df = pd.DataFrame({
    "date": ["2018-10", "2018-10", "2018-10", "2018-11", "2018-11"]
})
which looks like this (I have millions of rows and multiple columns):
date
0 2018-10
1 2018-10
2 2018-10
3 2018-11
4 2018-11
Each date has a number associated with it in the dictionary. I want to append that number to the value in the date column (using vectorization).
So my desired dataframe would look like:
date
0 2018-10 (50)
1 2018-10 (50)
2 2018-10 (50)
3 2018-11 (76)
4 2018-11 (76)
My date column has the string datatype.
Current solution: I could use apply with a lambda:
df["date"] = df["date"].apply(lambda row:row + f" ({dictionary[row]})")
However, I am wondering whether there is a vectorized way to do this, since I have millions of rows and do not want to go row by row.
EDIT: Now that I think of it, I don't think there can be a vectorized way, since I need to concatenate a different number depending on the date.
Upvotes: 0
Views: 119
Reputation: 13750
pd.Series.map can take a dict as the mapping, and strings and string columns can be added, so it's actually as easy as
df['date'] = df['date'] + ' (' + df['date'].map(dictionary).astype(str) + ')'
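For reference, here is a minimal, self-contained sketch of the line above applied to the question's data. It also shows a variant that builds each final string once per dictionary key and then does a single lookup per row, which may help when there are few distinct dates. The names result, result_alt and labels are made up for illustration.

```python
import pandas as pd

dictionary = {"2018-10": 50, "2018-11": 76}
df = pd.DataFrame({"date": ["2018-10", "2018-10", "2018-10", "2018-11", "2018-11"]})

# Look up each date's number with map, then concatenate column-wise.
result = df["date"] + " (" + df["date"].map(dictionary).astype(str) + ")"

# Variant (hypothetical name `labels`): build each final string once per key,
# then map every row straight to its finished label.
labels = {key: f"{key} ({value})" for key, value in dictionary.items()}
result_alt = df["date"].map(labels)

print(result.tolist())
```

One caveat: if a date is missing from the dictionary, map returns NaN for that row and the looked-up integers are upcast to floats, so astype(str) yields strings such as "50.0" and "nan". Checking df["date"].isin(list(dictionary)).all() first guards against that.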
Upvotes: 1
Reputation: 11100
I'm not 100% sure this is the fastest way to do it, but it is fairly simple.
data = {'2018-10': 50, '2018-11': 76}
df = pd.DataFrame({
"date":["2018-10", "2018-10", "2018-10", "2018-11","2018-11"]
})
df["data"] = df.date.apply(lambda x: data[x])
Which yields:
date data
0 2018-10 50
1 2018-10 50
2 2018-10 50
3 2018-11 76
4 2018-11 76
As an alternative to df.date.apply(lambda x: data[x]), you could use
df.apply(lambda x: data[x['date']], axis=1)
which I believe would perform similarly, but it's less readable IMO.
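Whether the two apply forms really perform alike is easier to measure than to guess. The sketch below is one way to time them, together with Series.map for comparison. The frame size (the question's five rows repeated 2,000 times) and the names candidates and results are assumptions chosen for illustration. No timings are quoted here because they depend on the machine and the pandas version.

```python
import timeit

import pandas as pd

data = {"2018-10": 50, "2018-11": 76}
# Hypothetical larger frame: the question's five rows repeated 2,000 times.
df = pd.DataFrame(
    {"date": ["2018-10", "2018-10", "2018-10", "2018-11", "2018-11"] * 2000}
)

# Three ways to look up the number for each date.
candidates = {
    "Series.apply": lambda: df["date"].apply(lambda x: data[x]),
    "DataFrame.apply(axis=1)": lambda: df.apply(lambda x: data[x["date"]], axis=1),
    "Series.map": lambda: df["date"].map(data),
}

# Run each once to confirm they all produce the same values.
results = {name: func() for name, func in candidates.items()}

# Time three runs of each candidate.
for name, func in candidates.items():
    seconds = timeit.timeit(func, number=3)
    print(f"{name}: {seconds:.4f} s for 3 runs")
```

Note that all three calls only produce the lookup column; building the "2018-10 (50)" strings from the question still needs a concatenation step afterwards.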
Upvotes: 0