Reputation: 41
I have a dataset that contains the information on the bought and sold prices of different products. However, rather than storing the bought and sold for price in the same row, it is stored in two separate rows, which are identified by a bought and sold variable, as shown below.
Product|Product Type|Price|Bought|Sold
---------------------------------------
Apples | Green | 1 | 0 | 1
---------------------------------------
Apples | Green | 2 | 1 | 0
---------------------------------------
Apples | Red | 3 | 0 | 1
---------------------------------------
Apples | Red | 4 | 1 | 0
---------------------------------------
I want to join the bought and sold price into one row, so it looks a little something like this:
Product|Product Type|Bought Price|Sold Price
---------------------------------------------
Apples | Green | 1 | 2
---------------------------------------------
Apples | Red | 4 | 3
Here is the code to create my example dataset. Thanks in advance for any help.
Product <- c("Apples", "Apples", "Apples", "Apples", "Apples", "Apples",
"Oranges", "Oranges", "Oranges", "Oranges", "Oranges", "Oranges",
"Buscuits", "Buscuits", "Buscuits", "Buscuits", "Buscuits", "Buscuits")
ProductType <- c("Green", "Green", "Red", "Red", "Pink", "Pink",
"Big", "Big", "Medium", "Medium", "Small", "Small",
"Chocolate", "Chocolate", "Oat", "Oat", "Digestive", "Digestive")
Price <- c(2, 1, 3, 4, 1, 2,
5, 3, 2, 1, 2, 3,
6, 4, 1, 8, 6, 2)
Bought <- c(0, 1, 0, 1, 0, 1,
0, 1, 0, 1, 0, 1,
0, 1, 0, 1, 0, 1)
Sold <- c(1, 0, 1, 0, 1, 0,
1, 0, 1, 0, 1, 0,
1, 0, 1, 0, 1, 0)
sales <- data.frame(Product, ProductType, Price, Bought, Sold)
Upvotes: 2
Views: 57
Reputation: 887118
With dplyr
, we group by 'Product', 'ProductType', and summarise
to create the 'BoughtPrice' and 'SoldPrice' by subsetting 'Price' where 'Bought' or 'Sold' is 1
library(dplyr)
sales %>%
group_by(Product, ProductType) %>%
summarise(BoughtPrice = Price[Bought==1], SoldPrice = Price[Sold ==1])
A similar approach with data.table
would be
library(data.table)
setDT(sales)[, lapply(.SD, function(x) Price[x==1]),
.(Product, ProductType), .SDcols = Bought:Sold]
Upvotes: 2
Reputation: 6222
library(dplyr)
df <- data.frame(Product, ProductType, Price, Bought, Sold)
df %>% group_by(Product, ProductType) %>%
summarise(Bought_Price = sum(Price * Bought),
Sold_Price = sum(Sold * Price))
# A tibble: 9 x 4
# Groups: Product [?]
# Product ProductType Bought_Price Sold_Price
# <fctr> <fctr> <dbl> <dbl>
# 1 Apples Green 1 2
# 2 Apples Pink 2 1
# 3 Apples Red 4 3
# 4 Buscuits Chocolate 4 6
# 5 Buscuits Digestive 2 6
# 6 Buscuits Oat 8 1
# 7 Oranges Big 3 5
# 8 Oranges Medium 1 2
# 9 Oranges Small 3 2
Upvotes: 3
Reputation: 56159
Using dplyr:
library(dplyr)
sales %>%
group_by(Product, ProductType) %>%
summarise(BoughtPrice = Price[ Bought == 1 ],
SoldPrice = Price[ Sold == 1 ]) %>%
ungroup()
Upvotes: 4