Reputation: 3502
I have a dataframe with 52 columns with sales data (weekly) and 52 columns with the sold quantity. I want to create another 52 colums with the price per soldunit.
The division does not work for some reason. I get
ValueError: Wrong number of items passed 2, placement implies 1
Also, can this be achieved in a more elegant (pandas way) by removing the for loop?
This is what I have...
#calculate cost per unit
for i in range(0,52):
cw = i + 1
i = i + 5
print cw
data['Cost' + str(cw)] = data.iloc[:,i:i+1] / sap_data.iloc[:,i+52:i+1+52]
Upvotes: 2
Views: 3705
Reputation: 6585
You can use Pandas pandas.DataFrame.divide function. Is this what you're looking for?
df = pd.DataFrame({'sales 1': [100,200,300], 'sales 2': [400,500,600], 'quantity 1': [10,20,30], 'quantity 2': [40,50,60]})
print(df)
quantity 1 quantity 2 sales 1 sales 2
0 10 40 100 400
1 20 50 200 500
2 30 60 300 600
Creating 2 new data frames, one for the sales
columns df1
and the other one of the quantity
columns df2
. In your case, you would have to set it to 52 instead of 2.
df1 = df.iloc[:, :2]
df2 = df.iloc[:,2:]
print(df1)
quantity 1 quantity 2
0 10 40
1 20 50
2 30 60
print(df2)
sales 1 sales 2
0 100 400
1 200 500
2 300 600
Renaming the columns in df2
with the names from df1
.
df2.columns = df1.columns
Then, creating df3
, a new data frame.
df3 = df2.divide(df1, axis= 1)
df3
is the result of the division you need. If you want to have all of this data in one data frame, all you have to do is to rename the columns and concatenate the 3 data frames.
print(df3)
quantity 1 quantity 2
0 10.0 10.0
1 10.0 10.0
2 10.0 10.0
Upvotes: 1
Reputation: 210832
Demo of vectorized Pandas solution:
In [23]: df = pd.DataFrame(np.random.randint(1, 8, (5, 6)),
columns=['amt01','amt02','amt03','qty01','qty02','qty03'])
In [24]: df
Out[24]:
amt01 amt02 amt03 qty01 qty02 qty03
0 2 2 6 2 3 6
1 4 3 7 3 7 5
2 3 2 2 7 7 5
3 3 7 4 4 3 1
4 4 6 2 1 7 3
In [25]: cols = ['new{:02d}'.format(i) for i in np.arange(1, 3+1)]
In [26]: cols
Out[26]: ['new01', 'new02', 'new03']
In [27]: df[cols] = df.filter(regex='^amt').div(df.filter(regex='^qty').values)
In [28]: df
Out[28]:
amt01 amt02 amt03 qty01 qty02 qty03 new01 new02 new03
0 2 2 6 2 3 6 1.000000 0.666667 1.000000
1 4 3 7 3 7 5 1.333333 0.428571 1.400000
2 3 2 2 7 7 5 0.428571 0.285714 0.400000
3 3 7 4 4 3 1 0.750000 2.333333 4.000000
4 4 6 2 1 7 3 4.000000 0.857143 0.666667
Upvotes: 4
Reputation: 3897
Do you mean just
data['sales']/data['quantity']
Edit: It might be a notation matter. Can you try:
data.iloc[:,i] / sap_data.iloc[:,i+52]
which I believe is what you have in mind
Upvotes: 2