Reputation: 13
If I have a dataframe such as:
data = [[10,"Apple"],[4,"Banana"],[3,"Strawberry"],[15,"Chocolate"],[5,"Kiwi"],[75,"Apple"],[4,"Potato"],[6,"Apple"],[45,"Banana"],[10,"Strawberry"],[10,"Apple"],]
df = pd.DataFrame(data, columns=('Cost', 'Fruit'))
Produces:
Cost Fruit
0 10 Apple
1 4 Banana
2 3 Strawberry
3 15 Chocolate
4 5 Kiwi
5 75 Apple
6 4 Potato
7 6 Apple
8 45 Banana
9 10 Strawberry
10 10 Apple
I also have a dictionary with shop names assigned certain values:
shop_names = {
0 : "Shop 1",
4 : "Shop 2",
7 : "Shop 3",
}
Produces:
{0: 'Shop 1', 4: 'Shop 2', 7: 'Shop 3'}
I want to create a "Shop" column and assign values for each shop in the dictionary, based on the keys. So Rows 0-3 = Shop 1, Rows 4-6 = Shop 2, Rows 7-10 = Shop 3. The dictionary only has the first index as the key for each shop.
Desired output:
Cost Fruit Shop
0 10 Apple Shop 1
1 4 Banana Shop 1
2 3 Strawberry Shop 1
3 15 Chocolate Shop 1
4 5 Kiwi Shop 2
5 75 Apple Shop 2
6 4 Potato Shop 2
7 6 Apple Shop 3
8 45 Banana Shop 3
9 10 Strawberry Shop 3
10 10 Apple Shop 3
This is an example dataset - the real dataset has 10,000 rows - but I have a dictionary with the shop intervals based on their equivalent index positions the original dataframe.
I am struggling to assign the values based on the index ranges from my dictionary. Any help would be appreciated.
Many thanks
Upvotes: 1
Views: 1425
Reputation: 4648
First map the shop names onto locations using Index.map()
. And then fill the column with the last non-nan value (inspired by this answer)
df["Shop"] = df.index.map(shop_names).values
df["Shop"] = df["Shop"].fillna(method="ffill")
df
Out[36]:
Cost Fruit Shop
0 10 Apple Shop 1
1 4 Banana Shop 1
2 3 Strawberry Shop 1
3 15 Chocolate Shop 1
4 5 Kiwi Shop 2
5 75 Apple Shop 2
6 4 Potato Shop 2
7 6 Apple Shop 3
8 45 Banana Shop 3
9 10 Strawberry Shop 3
10 10 Apple Shop 3
Upvotes: 2
Reputation: 862406
Create Series
by Index.map
with ffill
for forward filling missing values:
df['Shop'] = pd.Series(df.index.map(shop_names), index=df.index).ffill()
print (df)
Cost Fruit Shop
0 10 Apple Shop 1
1 4 Banana Shop 1
2 3 Strawberry Shop 1
3 15 Chocolate Shop 1
4 5 Kiwi Shop 2
5 75 Apple Shop 2
6 4 Potato Shop 2
7 6 Apple Shop 3
8 45 Banana Shop 3
9 10 Strawberry Shop 3
10 10 Apple Shop 3
Upvotes: 5