Reputation: 121
Let's say I own a grocery store, and like a good store owner I want to see how much influence my marketing tactics have on what people buy. To do this, when they enter my store I always gift them a random fruit from this list, of which I happen to have an overstock:
Fruit
1. Apple
2. Orange
3. Avocado
4. Lemon
5. Banana
Now I wonder how my gift influences their purchases. I give each client a Client Number, so I can track every purchase each of my clients makes in my store during the whole year. My Client Number list looks something like this:
Client
1. 12345
2. 23456
3. 66666
4. 55555
5. 91919
Being a very observant store owner, I take notes of everything each of my clients buys, in the exact order. So I construct a data frame where I record which visit it was (Visit #), what they bought (Item), in what order (Purchase #), and how much they spent (Cost). I put my gift as Purchase #0. My data frame then looks something like this:
Client Visit Purchase Item Cost
1. 12345 1 0 Lemon 0
2. 12345 1 1 Salami 1
3. 12345 1 2 Cow 100
4. 12345 1 3 Chickens 20
(...)
51. 12345 4 0 Avocado 0
52. 12345 4 1 Onions 2
53. 12345 4 2 Bananas 5
54. 12345 4 3 Bread 4
(...)
94. 66666 1 0 Apple 0
95. 66666 1 1 Burger 2
96. 66666 1 2 Ketchup 1
97. 66666 1 3 Alpaca 50
(...)
241. 66666 4 0 Banana 0
242. 66666 4 1 Lemon 2
243. 66666 4 2 Olive Oil 3
244. 66666 4 3 Noodles 5
I have some rules to make it easier to gather data. My clients must make exactly 4 shopping visits a month; otherwise I don't allow them to buy my products and they get a permanent restraining order to never come near my store again. Once they get in, they must buy exactly 10 of my products. If they pick fewer, I demand they buy something else and only allow them to leave when they have exactly 10; if they pick more, I throw the least expensive items in the shopping cart out the window until 10 products remain.
To maximize my profits I want to know how my gift influences their purchases. Say, I want the average amount my clients spend per visit for each of the five fruits in my first list. This means I must get the average $ of all the visits in which Purchase #0 is an Apple, the average $ of all the visits in which Purchase #0 is an Orange, and so on.
Then I want to track the average they spend on each of the 10 items they buy, based on the gift I give them. For example, for visits in which the gift is Apple, I want to know the average $ my clients spend on the first item, the average they spend on the second, and so on. Then do this for each of the five fruits. This will give me the information needed to decide where to place my products within the store so clients don't spend much time idling around.
Now this is a mess, so bear with me: I'm a store owner and don't know how to code. If the gift is Apple, I can try to do
giftOne <- toString(gifts[1, ])
giftOneRows <- data %>%
  select("Client", "Visit", "Purchase", "Item", "Cost") %>%
  filter(`Item` == giftOne, `Purchase` == 0)
which returns all the rows in which the gift (item 0) is Apple. Then I tried to get the Cost values of all the items bought in the visits in which item 0 is Apple.
for (i in 1:nrow(giftOneRows)) {
  costGiftOne <- data %>%
    select("Client", "Visit", "Cost") %>%
    filter(`Client` == toString(giftOneRows[i, 1]), `Visit` == toString(giftOneRows[i, 2]))
  print(costGiftOne)
}
The loop returns the data I need, but in separate data frames (a different costGiftOne for each visit in which the gift is Apple), and I want to combine them so I can get the average cost. Also, I did the above manually for Apple, but I want to know how to do it for all five fruits without having to write five for loops by hand. I imagine it can be done with a loop within a loop, but I don't know how to implement a working solution.
Upvotes: 0
Views: 39
Reputation: 125697
You can simplify your analysis by moving the gift row to a new column:
library(dplyr, warn.conflicts = FALSE)

data |>
  # Move the gift into a new column
  mutate(
    Gift = Item[Purchase == 0], .by = c(Client, Visit)
  ) |>
  # Get rid of the gift row
  filter(Purchase != 0) |>
  # Compute total expenditures per client and visit
  summarise(
    sum_exp = sum(Cost), .by = c(Client, Visit, Gift)
  ) |>
  # Compute mean expenditures by gift
  summarise(
    mean_exp = mean(sum_exp), .by = c(Gift)
  )
#> Gift mean_exp
#> 1 Lemon 121
#> 2 Avocado 11
#> 3 Apple 53
#> 4 Banana 10
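The question also asks for the average spent on each purchase position (first item, second item, and so on) for each gift. A minimal sketch of that, assuming the same column layout and reusing the toy data from this answer so it runs on its own; `mean_by_position` and `mean_cost` are names I made up for illustration:

```r
library(dplyr, warn.conflicts = FALSE)

# Toy data: two clients, two visits each, gift (Purchase 0) plus three purchases
data <- data.frame(
  Client   = rep(c(12345, 66666), each = 8),
  Visit    = rep(c(1, 4, 1, 4), each = 4),
  Purchase = rep(0:3, times = 4),
  Item     = c(
    "Lemon", "Salami", "Cow", "Chickens", "Avocado", "Onions", "Bananas", "Bread",
    "Apple", "Burger", "Ketchup", "Alpaca", "Banana", "Lemon", "Olive Oil", "Noodles"
  ),
  Cost     = c(0, 1, 100, 20, 0, 2, 5, 4, 0, 2, 1, 50, 0, 2, 3, 5)
)

mean_by_position <- data |>
  # Copy the gift of each visit into a new column
  mutate(Gift = Item[Purchase == 0], .by = c(Client, Visit)) |>
  # Drop the gift rows so only real purchases remain
  filter(Purchase != 0) |>
  # Average cost for each gift and purchase position
  summarise(mean_cost = mean(Cost), .by = c(Gift, Purchase)) |>
  arrange(Gift, Purchase)

print(mean_by_position)
```

In the toy data every gift appears in only one visit, so each mean is just that visit's cost; with the full year of data each row would average over all visits that started with the same gift.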
DATA
data <- data.frame(
  Client = c(
    12345, 12345, 12345, 12345, 12345, 12345, 12345, 12345,
    66666, 66666, 66666, 66666, 66666, 66666, 66666, 66666
  ),
  Visit = c(1, 1, 1, 1, 4, 4, 4, 4, 1, 1, 1, 1, 4, 4, 4, 4),
  Purchase = c(0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3),
  Item = c(
    "Lemon", "Salami", "Cow", "Chickens", "Avocado", "Onions", "Bananas", "Bread",
    "Apple", "Burger", "Ketchup", "Alpaca", "Banana", "Lemon", "Olive Oil", "Noodles"
  ),
  Cost = c(0, 1, 100, 20, 0, 2, 5, 4, 0, 2, 1, 50, 0, 2, 3, 5)
)
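Note that `.by` needs dplyr 1.1.0 or later. If that is not available, roughly the same result can be had in base R with `merge()` and `aggregate()`. This is only a sketch under the same assumptions about the columns; `gifts`, `purchases`, `per_visit` and `mean_exp` are illustrative names, and the data is repeated so the block runs on its own:

```r
# Toy data: two clients, two visits each, gift (Purchase 0) plus three purchases
data <- data.frame(
  Client   = rep(c(12345, 66666), each = 8),
  Visit    = rep(c(1, 4, 1, 4), each = 4),
  Purchase = rep(0:3, times = 4),
  Item     = c(
    "Lemon", "Salami", "Cow", "Chickens", "Avocado", "Onions", "Bananas", "Bread",
    "Apple", "Burger", "Ketchup", "Alpaca", "Banana", "Lemon", "Olive Oil", "Noodles"
  ),
  Cost     = c(0, 1, 100, 20, 0, 2, 5, 4, 0, 2, 1, 50, 0, 2, 3, 5)
)

# One row per visit holding the gift that was handed out
gifts <- data[data$Purchase == 0, c("Client", "Visit", "Item")]
names(gifts)[3] <- "Gift"

# Attach the gift to every real purchase of the same client and visit
purchases <- merge(data[data$Purchase != 0, ], gifts, by = c("Client", "Visit"))

# Total spent per visit, then the mean of those totals per gift
per_visit <- aggregate(Cost ~ Client + Visit + Gift, data = purchases, FUN = sum)
mean_exp  <- aggregate(Cost ~ Gift, data = per_visit, FUN = mean)
names(mean_exp)[2] <- "mean_exp"

print(mean_exp)
```

The rows come back sorted by gift name rather than in order of first appearance, but the values should match the dplyr result.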
Upvotes: 1