Neil

Reputation: 8247

How to return a pandas series from a for loop in python

I have a pandas series which looks like this:

dish_name
Chiken Biryani    3
Mutton Biryani    1
Paneer Biryani    4
Paneer Pulav      2
sandwitch         2

I am calculating 3/(3+1+4+2+2) for the first element, then 1/(3+1+4+2+2) for the second, and so on to the end of the series. I am doing this with the following Python code:

def dish_push(dish_data):
    dish_number = len(dish_data)
    for i in range(dish_number):
        dish = ((dish_data[i])/(dish_data[0:dish_number].sum()))*100
    return dish

But when I pass a series to this function, it outputs only the last value:

dish_push(dish_quantity_sold)
Out[291]: 16.666666666666664

Whereas I am expecting output like this:

25.0
8.33333333333
33.3333333333
16.6666666667
16.6666666667

Am I making a mistake in the return statement? Why does it give only the last value? Please help.

Upvotes: 1

Views: 4713

Answers (3)

TheBlackCat

Reputation: 10308

jonchar already showed the best way to do your specific task. Regarding your question, though: the problem is that each time through the loop you overwrite the dish variable with the value computed in that iteration, so at the end you return only the last dish value from the loop.

What you would need to do is something like this:

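import numpy as np

def dish_push(dish_data):
    dish_number = len(dish_data)
    # Float array, so the percentages are not truncated to integers.
    new_data = np.zeros(dish_number, dtype=float)
    for i in range(dish_number):
        # .iloc selects by position, whatever labels the index uses.
        new_data[i] = (dish_data.iloc[i] / dish_data.sum()) * 100
    return new_data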

This would create an array of zeros, put each value in that array, and return the new array after the values have been added.

However, it can be simplified further by using enumerate and looping over the data directly. In each cycle of the loop, this will give you each data point and the index of that data point. Also, you can compute the sum once rather than every time. This also allows you to change the original data in-place, since the sum is already calculated and thus won't change when you change a value. And since the values are changed in-place, you don't have to return anything, since you can just use the array you passed to dish_push (although I will leave the return in just in case):

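def dish_push(dish_data):
    # Compute the scaling factor once; dividing by sum/100 yields percentages.
    dish_sum = dish_data.sum() / 100
    for i, idata in enumerate(dish_data):
        # Position-based write; dish_data should have a float dtype so the result fits.
        dish_data.iloc[i] = idata / dish_sum
    return dish_data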
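
To make the in-place behaviour concrete, here is a small sketch of how the version above acts on the caller's data. The names dish_push_inplace and counts are made up for illustration, and the series is assumed to hold floats so the percentages can be stored without changing its dtype:

```python
import pandas as pd

def dish_push_inplace(dish_data):
    # Scaling factor computed once, before any value is overwritten.
    dish_sum = dish_data.sum() / 100
    for i, idata in enumerate(dish_data):
        # Overwrite position i with its percentage of the original total.
        dish_data.iloc[i] = idata / dish_sum
    return dish_data

# Hypothetical float-valued copy of the counts from the question.
counts = pd.Series([3.0, 1.0, 4.0, 2.0, 2.0])
returned = dish_push_inplace(counts)

# The function hands back the same object it was given,
# so `counts` itself now holds the percentages.
print(returned is counts)
print(counts)
```

Note that the original counts are gone after the call. If you still need them, pass in dish_data.copy() instead.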

Upvotes: 3

sobek

Reputation: 1426

I realize this is a rather ugly solution, but the following does what you expected your code to do:

def dish_push(dish_data):
    dish_number = len(dish_data)
    dish = []
    for i in range(dish_number):
        # .iloc selects by position, whatever labels the index uses.
        dish.append((dish_data.iloc[i] / dish_data.iloc[0:dish_number].sum()) * 100)
    return dish

That is, you don't overwrite the result in every iteration but append it to the list.
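
Since the title asks for a pandas series rather than a list, one possible variation is to wrap the collected values in pd.Series before returning. This is only a sketch: reusing the input's index and rebuilding dish_quantity_sold from the numbers in the question are my assumptions, not something the question specifies.

```python
import pandas as pd

def dish_push(dish_data):
    """Return each value's share of the total, in percent, as a Series."""
    total = dish_data.sum()      # compute the total once, outside the loop
    shares = []
    for value in dish_data:      # iterating a Series yields its values
        shares.append(value / total * 100)
    # Reuse the input's index so each share stays labelled with its dish.
    return pd.Series(shares, index=dish_data.index)

# Series rebuilt from the values shown in the question (spelling kept as posted).
dish_quantity_sold = pd.Series(
    [3, 1, 4, 2, 2],
    index=pd.Index(
        ["Chiken Biryani", "Mutton Biryani", "Paneer Biryani",
         "Paneer Pulav", "sandwitch"],
        name="dish_name",
    ),
)

percentages = dish_push(dish_quantity_sold)
print(percentages)
```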

Upvotes: 2

jonchar

Reputation: 6472

If dish is your series with the values [3, 1, 4, 2, 2], you can get the result you're looking for without iteration by doing the following:

result = dish / dish.sum() * 100
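
For completeness, a self-contained sketch of the same one-liner. The series is reconstructed from the numbers in the question, so how it is built here is an assumption:

```python
import pandas as pd

# Series rebuilt from the values shown in the question (spelling kept as posted).
dish = pd.Series(
    [3, 1, 4, 2, 2],
    index=pd.Index(
        ["Chiken Biryani", "Mutton Biryani", "Paneer Biryani",
         "Paneer Pulav", "sandwitch"],
        name="dish_name",
    ),
)

# Element-wise division by the scalar total, then scaling to percent.
# No explicit loop is needed, and the dish labels are preserved.
result = dish / dish.sum() * 100
print(result)
```

The division is applied to every element at once, so the result is already a series with the same index as the input.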

Upvotes: 3
