mpat
mpat

Reputation: 13

Assign a value between index values / ranges in Pandas

If I have a dataframe such as:

data = [[10,"Apple"],[4,"Banana"],[3,"Strawberry"],[15,"Chocolate"],[5,"Kiwi"],[75,"Apple"],[4,"Potato"],[6,"Apple"],[45,"Banana"],[10,"Strawberry"],[10,"Apple"],]

df = pd.DataFrame(data, columns=('Cost', 'Fruit'))

Produces:

   Cost Fruit
0   10  Apple
1   4   Banana
2   3   Strawberry
3   15  Chocolate
4   5   Kiwi
5   75  Apple
6   4   Potato
7   6   Apple
8   45  Banana
9   10  Strawberry
10  10  Apple

I also have a dictionary with shop names assigned certain values:

 shop_names = {
    0 : "Shop 1",
    4 : "Shop 2",
    7 : "Shop 3",
}

Produces:

{0: 'Shop 1', 4: 'Shop 2', 7: 'Shop 3'}

I want to create a "Shop" column and assign values for each shop in the dictionary, based on the keys. So Rows 0-3 = Shop 1, Rows 4-6 = Shop 2, Rows 7-10 = Shop 3. The dictionary only has the first index as the key for each shop.

Desired output:

    Cost    Fruit   Shop
0   10  Apple       Shop 1
1   4   Banana      Shop 1
2   3   Strawberry  Shop 1
3   15  Chocolate   Shop 1
4   5   Kiwi        Shop 2
5   75  Apple       Shop 2
6   4   Potato      Shop 2
7   6   Apple       Shop 3
8   45  Banana      Shop 3
9   10  Strawberry  Shop 3
10  10  Apple       Shop 3 

This is an example dataset - the real dataset has 10,000 rows - but I have a dictionary with the shop intervals based on their equivalent index positions the original dataframe.

I am struggling to assign the values based on the index ranges from my dictionary. Any help would be appreciated.

Many thanks

Upvotes: 1

Views: 1425

Answers (2)

Bill Huang
Bill Huang

Reputation: 4648

First map the shop names onto locations using Index.map(). And then fill the column with the last non-nan value (inspired by this answer)

df["Shop"] = df.index.map(shop_names).values
df["Shop"] = df["Shop"].fillna(method="ffill")

df
Out[36]: 
    Cost       Fruit    Shop
0     10       Apple  Shop 1
1      4      Banana  Shop 1
2      3  Strawberry  Shop 1
3     15   Chocolate  Shop 1
4      5        Kiwi  Shop 2
5     75       Apple  Shop 2
6      4      Potato  Shop 2
7      6       Apple  Shop 3
8     45      Banana  Shop 3
9     10  Strawberry  Shop 3
10    10       Apple  Shop 3

Upvotes: 2

jezrael
jezrael

Reputation: 862406

Create Series by Index.map with ffill for forward filling missing values:

df['Shop'] = pd.Series(df.index.map(shop_names), index=df.index).ffill()
print (df)
    Cost       Fruit    Shop
0     10       Apple  Shop 1
1      4      Banana  Shop 1
2      3  Strawberry  Shop 1
3     15   Chocolate  Shop 1
4      5        Kiwi  Shop 2
5     75       Apple  Shop 2
6      4      Potato  Shop 2
7      6       Apple  Shop 3
8     45      Banana  Shop 3
9     10  Strawberry  Shop 3
10    10       Apple  Shop 3

Upvotes: 5

Related Questions