Reputation: 445
I have a MySQL table as shown below:
ID | article | price | promo_price | delivery_days | stock | received_on
17591 03D/6H 3082.00 1716.21 30 0 2019-03-20
29315 03D/6H 3082.00 1716.21 26 0 2019-03-24
47796 03D/6H 3082.00 1716.21 24 0 2019-03-25
22016 L1620S 685.00 384.81 0 3 2019-03-20
35043 L1620S 685.00 384.81 0 2 2019-03-24
53731 L1620S 685.00 384.81 0 2 2019-03-25
I created a pivot table to monitor the stock data.
md = df.pivot_table(
values='stock',
index=['article','price', 'promo_price','delivery_days'],
columns='received_on',
aggfunc=np.sum)
dates = md.columns.tolist()
dates.sort(reverse=True)
md = md[dates]
This is the resuslt
+---------------------------------+--------------+--------------+--------------+
| | 2019-03-25 | 2019-03-24 | 2019-03-20 |
|---------------------------------+--------------+--------------+--------------|
| ('03D/6H', 3082.0, 1716.21, 24) | 0 | nan | nan |
| ('03D/6H', 3082.0, 1716.21, 26) | nan | 0 | nan |
| ('03D/6H', 3082.0, 1716.21, 30) | nan | nan | 0 |
| ('L1620S-KD', 685.0, 384.81, 0) | 2 | 2 | 3 |
+---------------------------------+--------------+--------------+--------------+
How do I filter the rows and get the price, promo_price and delivery days of an article based on the recent stock received date?
For ex: I want the stock info for all the days but price, promo_price and delivery days of only 2019-03-25 as shown below
+---------------------------------+--------------+--------------+--------------+
| | 2019-03-25 | 2019-03-24 | 2019-03-20 |
|---------------------------------+--------------+--------------+--------------|
| ('03D/6H', 3082.0, 1716.21, 24) | 0 | nan | nan |
| ('L1620S', 685.0, 384.81, 0) | 2 | 2 | 3 |
+---------------------------------+--------------+--------------+--------------+
EDIT:
If there is no change in price, promo_price and delivery days, I am getting the result as expected. But if there is any change in the values, then I am getting multiple rows for the same article.
Article L1620S data is as expected. But article 03D/6H resulted in three rows.
Upvotes: 1
Views: 1034
Reputation: 862481
You can use:
df['received_on'] = pd.to_datetime(df['received_on'])
md = df.pivot_table(
values='stock',
index=['article','price', 'promo_price','delivery_days'],
columns='received_on',
aggfunc=np.sum)
#sorting columns in descending order
md = md.sort_index(axis=1, ascending=False)
#remove missing rows in first column
md = md.dropna(subset=[md.columns[0]])
#another solution
#md = md[md.iloc[:, 0].notna()]
print (md)
received_on 2019-03-25 2019-03-24 2019-03-20
article price promo_price delivery_days
03D/6H 3082.0 1716.21 24 0.0 NaN NaN
L1620S 685.0 384.81 0 2.0 2.0 3.0
EDIT: First filter by first level and then by position - first row:
md = md.sort_index(axis=1, ascending=False)
idx = pd.IndexSlice
md1 = md.loc[idx['03D/6H',:,:],:].iloc[[0]]
print (md1)
received_on 2019-03-25 2019-03-24 2019-03-20
article price promo_price delivery_days
03D/6H 3082.0 1716.21 24 0.0 NaN NaN
Upvotes: 1