Pandas: how to incrementally add one to column while sum is less than corresponding column?

Question

I am trying to increment a column by 1 while the sum of that column is less than or equal to a total supply figure. I also need that column to be less than the corresponding value in the 'allocation' column. The supply variable will be dynamic from 1-400 based on user input. Below is the desired output (Allocation Final column).

supply = 14

| rank | allocation | Allocation Final |
| ---- | ---------- | ---------------- |
| 1    | 12         | 9                |
| 2    | 3          | 3                |
| 3    | 1          | 1                |
| 4    | 1          | 1                |

Below is the code I have so far:

data = [[1.05493,12],[.94248,3],[.82317,1],[.75317,1]]

df = pd.DataFrame(data,columns=['score','allocation'])

df['rank'] = df['score'].rank()

df['allocation_new'] = 0

#static for testing
supply = 14

for index in df.index:

    while df.loc[index, 'allocation_new'] < df.loc[index, 'allocation'] and df.loc[index, 'allocation_new'].sum() < supply:
        df.loc[index, 'allocation_new'] += 1

print(df)

Cimbali · Accepted Answer

This should do:

def allocate(df, supply):
    if supply > df['allocation'].sum():
        raise ValueError(f'Unacheivable supply {supply}, maximal {df["allocation"].sum()}')

    under_alloc = pd.Series(True, index=df.index)
    df['allocation final'] = 0

    while (missing := supply - df['allocation final'].sum()) >= 0:
        assert under_alloc.any()

        if missing <= under_alloc.sum():
            df.loc[df.index[under_alloc][:missing], 'allocation final'] += 1
            return df

        df.loc[under_alloc, 'allocation final'] = (
            df.loc[under_alloc, 'allocation final'] + missing // under_alloc.sum()
        ).clip(upper=df.loc[under_alloc, 'allocation'])

        under_alloc = df['allocation final'] < df['allocation']

    return df

At every iteration, we add the missing quotas to any rows that did not reach the allocation yet (rounded down, that’s missing // under_alloc.sum()), then using pd.Series.clip() to ensure we stay below the allocation.

If there’s less missing quotas than available ranks to which to allocate (e.g. run the same dataframe with supply=5 or 6), we allocate to the first missing ranks.

>>> df = pd.DataFrame( {'allocation': {0: 12, 1: 3, 2: 1, 3: 1}, 'rank': {0: 1, 1: 2, 2: 3, 3: 4}})
>>> print(allocate(df, 14))
   allocation  rank  allocation final
0          12     1                 9
1           3     2                 3
2           1     3                 1
3           1     4                 1
>>> print(allocate(df, 5))
   allocation  rank  allocation final
0          12     1                 2
1           3     2                 1
2           1     3                 1
3           1     4                 1

Pandas: how to incrementally add one to column while sum is less than corresponding column?

Answers (2)

Related Questions