Reputation: 1945
Good morning, I have the following df:
print(df)
Date cod_id Sales Initial_stock
01/01/2017 1 5 5
01/01/2017 2 4 8
02/01/2017 1 1 5
...
Since the real dataset contains a few mistakes in "Initial_stock", I would like to create a new column, computed separately for each cod_id (= product), as:
value of "new" in the previous row of that cod_id (0 if there is no previous row) + current value of Initial_stock - Sales; so:
print(df_final)
Date cod_id Sales Initial_stock new
01/01/2017 1 5 5 0
01/01/2017 2 4 8 4
02/01/2017 1 1 5 4
...
Here the last value of "new" for cod_id 1, which equals 4, is computed as 0 + 5 - 1 = 4, where 0 is the "new" value from the previous row of cod_id 1.
Upvotes: 0
Views: 64
Reputation: 1477
import pandas as pd

# my initial data
d = {'cod_id': [1, 2, 1], 'Sales': [5, 4, 1], 'Initial_stock': [5, 8, 5]}
####### show purpose #######
df = pd.DataFrame(data=d)  # build a DataFrame from my initial data and print it
print(df)
############################
new = []  # declare a new list where I'll put all the new values
i = 0
# Loop over every row of my initial data and calculate the new value for each one
while i < len(d['cod_id']):
    new_value = d['Initial_stock'][i] - d['Sales'][i]  # calculation: new = Initial_stock - Sales
    new.append(new_value)  # append my new value to the new list
    i += 1
####### show purpose #######
print(new)  # print my new list to show that the calculation is correct
############################
d['new'] = new  # add my new data to the original dict
####### show purpose #######
df = pd.DataFrame(data=d)  # create the DataFrame with my new values and print it again
print(df)
############################
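Note that the loop above works row by row, so it only computes Initial_stock - Sales for each row. If the value should also carry over from the previous row of the same cod_id, as the "0 + 5 - 1 = 4" example in the question suggests, one possible approach is a cumulative sum per group. This is only a minimal sketch: it assumes the rows are already sorted chronologically within each cod_id and that the running value starts at 0 for every product; the sample rows are copied from the question.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "Date": ["01/01/2017", "01/01/2017", "02/01/2017"],
    "cod_id": [1, 2, 1],
    "Sales": [5, 4, 1],
    "Initial_stock": [5, 8, 5],
})

# Assumption: rows are already in chronological order within each cod_id.
# Take the per-row change (Initial_stock - Sales), then keep a running
# total of that change separately for each product.
df["new"] = (df["Initial_stock"] - df["Sales"]).groupby(df["cod_id"]).cumsum()

print(df)
```

On these three rows the result is the same as the loop (0, 4, 4); the two approaches only differ once a product has a non-zero balance carried over from an earlier row.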
Upvotes: 1