Reputation: 564
I have a data.table with three columns: seller, product and price.
Example data:
seller product price
1: A banana 56
2: A lemon 94
3: A orange 84
4: A banana 11
5: A lemon 86
---
166: C orange 162
167: C banana 109
168: C orange 61
169: C banana 141
170: C orange 22
Code for the data
library(data.table)
DT <- data.table(seller = c(rep(c("A"),60),rep(c("B"),62),rep(c("C"),48)), product = c(rep(c("banana", "lemon", "orange"), 20), rep(c("banana", "lemon"), 31), rep(c("banana", "orange"), 24)),
price = c(56, 94, 84, 11, 86, 103, 151, 51, 117, 71, 63, 101, 45, 147, 135, 93, 26, 164, 90, 67, 12, 34, 14, 131, 92, 145, 48, 74, 62, 57, 20, 80, 113, 46, 88, 102, 134, 98, 137, 123, 169, 133, 146,
160, 58, 42, 52, 158, 170, 2, 152, 10, 130, 30, 33, 144, 73, 41, 139, 107, 163, 9, 66, 81, 79, 127, 40, 165, 106, 161, 16, 1, 112, 70, 115, 138, 76, 105, 17, 118, 114, 121, 25, 39, 15, 155, 50, 166,
100, 159, 5, 19, 29, 24, 64, 149, 120, 35, 119, 53, 21, 7, 72, 132, 154, 168, 156, 38, 3, 148, 69, 44, 6, 28, 140, 77, 104, 153, 59, 142, 116, 150, 97, 31, 91, 43, 47, 27, 143, 99, 37, 54, 49, 4, 111,
32, 23, 85, 167, 136, 78, 129, 83, 124, 36, 96, 110, 13, 65, 108, 8, 18, 157, 87, 82, 60, 122, 89, 125, 68, 75, 126, 128, 55, 95, 162, 109, 61, 141, 22))
I would like to run a pairwise t-test between every pair of sellers that sell the same product.
I would like the output to look like this (the p-values shown are hypothetical).
Desired output:
seller.x seller.y product p.value
A B banana 0.45
A B lemon 0.87
B C banana 0.03
A C banana 0.23
A C orange 0.01
Upvotes: 1
Views: 925
Reputation: 662
You first need to group by product. Then, in your j parameter, you need to compute the combinations of seller for this product and get the p.value for the t.test of price between seller.x and seller.y:
DT[
  , {
    # All unordered pairs of sellers offering this product
    sellercomb <- data.table(t(combn(unique(seller), 2)))
    names(sellercomb) <- c("seller.x", "seller.y")
    # One t.test per pair, comparing the two sellers' prices
    sellercomb[
      , {
        data.table(p.value = t.test(price[seller == seller.x], price[seller == seller.y])$p.value)
      }
      , by = .(seller.x, seller.y)
    ]
  }
  , by = .(product)
]
The result for your data above looks like this:
product seller.x seller.y p.value
1: banana A B 0.9384329
2: banana A C 0.2413946
3: banana B C 0.2154216
4: lemon A B 0.7282811
5: orange A C 0.0354320
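To see what the combn step inside j produces on its own, here is a minimal base R sketch. The vector sellers and the matrix pairs are hypothetical names used only for illustration; they are not part of the code above.

```r
# Hypothetical vector of the distinct sellers offering one product
sellers <- c("A", "B", "C")

# combn() returns one column per pair; t() turns that into one row per pair
pairs <- t(combn(sellers, 2))
colnames(pairs) <- c("seller.x", "seller.y")

print(pairs)
#      seller.x seller.y
# [1,] "A"      "B"
# [2,] "A"      "C"
# [3,] "B"      "C"
```

Each row is one unordered pair, so a product sold by only two sellers (such as lemon or orange in the data above) gives a single row and therefore a single t.test.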
Upvotes: 2