user9092346
user9092346

Reputation:

Text from Pandas Dataframe

I have a Pandas Dataframe that contains single incidents of sold amounts of gummibears, chocolate and mints. They are summed up and sorted by weeknumber. I now translate that into text, which is then send out by email, unsing the following method:

pd['text'] = 'In calendar week (' + pd['weeknumber'].map(str) + '), customers have bought ' + pd['gummibears'].map(str) + 'kg of gummibears, ' + pd['chocolate'].map(str) + 'kg of chocolate, as well as ' + pd['mint'].map(str) + 'kg of mints.'

Ideally, the result will give a nice text outlining the sales. However, it is possible that 0kg have been sold which then of course appears as well, looking something like this:

>>> "In calendar week 25, customers have bought 0kg of gummibears, 25kg of chocolate, as well as 0kg of mints."
>>> "In calendar week 26, customers have bought 6kg of gummibears, 0kg of chocolate, as well as 2kg of mints."

This works but is confusing for the reader. Is there an elegant way to filter out all instances with 0kg without nesting several loops? Preferably, the result from above would then look like this:

>>> "In calendar week 25, customers have bought 25kg of chocolate."
>>> "In calendar week 26, customers have bought 6kg of gummibears, as well as 2kg of mints."

Upvotes: 1

Views: 88

Answers (1)

jezrael
jezrael

Reputation: 862611

You can use custom function with numpy.where and boolean mask created by eq (==), but text has to be normalize for general solution:

df = pd.DataFrame({
         'weeknumber':[1,2,3,4,5,6],
         'gummibears':[7,8,9,4,0,0],
         'chocolate': [0,3,5,0,1,0],
         'mint':      [5,3,0,9,2,0]
})


def kg_to_string(col):
    return np.where(df[col].eq(0), '', ' ' + df[col].astype(str) + 'kg of '+ col +',')

start = 'In calendar week (' + df['weeknumber'].astype(str) + '), customers have bought'

#boolean mask if all columns are 0
mask = df[['gummibears','gummibears','mint']].eq(0).all(axis=1)
df['text'] =  start +  np.where(mask, ' nothing', kg_to_string('gummibears') + 
                                                  kg_to_string('chocolate') + 
                                                  kg_to_string('mint'))
#remove last ,
df['text'] = df['text'].str.rstrip(',')
print (df['text'].tolist())
['In calendar week (1), customers have bought 7kg of gummibears, 5kg of mint', 
 'In calendar week (2), customers have bought 8kg of gummibears, 3kg of chocolate,
                                              3kg of mint', 
 'In calendar week (3), customers have bought 9kg of gummibears, 5kg of chocolate',
 'In calendar week (4), customers have bought 4kg of gummibears, 9kg of mint', 
 'In calendar week (5), customers have bought 1kg of chocolate, 2kg of mint', 
 'In calendar week (6), customers have bought nothing']

Upvotes: 0

Related Questions