Reputation: 897
I have a dataframe in a long format with companies and their estimates by regions. I want to build a wide table where I could see how many positive and negative estimates were given to a company by region. When I try pivot_wider
I recieve a dataframe with vectors in its cells. That is okay, however I am unable to count the number of positive and negative feedbacks by regions. Also tried using unnest
and unnest_longer
functions. The latter though seems to resolve my problem, but it takes only one argument for column to unnest.
How can I, may be, modify my pivot_wider
to get the desirable result?
My dataframe:
set.seed(1407)
test_df <- data.frame(code = rep(c("positive", "negative"), 9),
company = c("Google", "Amazon", "SpaceX", "BlueOrigin",
"Google", "Western Digital", "Aliexpress",
"Tencent", "Aliexpress"),
n = rbinom(18, size = 9, prob = 0.5),
region = c("Asia", "Europe", "Middle East"))
What I get using a function to wide the table:
test_df %>%
pivot_wider(id_cols = region,
names_from = code,
values_from = n)
# A tibble: 3 x 3
region positive negative
<chr> <list> <list>
1 Asia <int [3]> <int [3]>
2 Europe <int [3]> <int [3]>
3 Middle East <int [3]> <int [3]>
My desirable output:
region positive negative
Asia 4 2
Asia 3 5
Asia 5 2
Europe 3 5
Europe 6 4
Europe 5 1
Middle East 8 5
Middle East 6 5
Middle East 6 2
Upvotes: 3
Views: 182
Reputation: 21938
You can modify your solution in the followin way. When the combination of id_cols
and value
does not uniquely identify an observation the result will be a named list. You may notice that you have more than one combination of Asia
and n == 5
for example so I decided to use all remaining columns instead of those specified in names_from
and values_from
in id_cols
as is the default choice than just region
.
library(tidyr)
test_df %>%
pivot_wider(names_from = code,
values_from = n) %>%
arrange(region)
# A tibble: 9 x 4
company region positive negative
<chr> <chr> <int> <int>
1 Google Asia 4 5
2 BlueOrigin Asia 5 2
3 Aliexpress Asia 3 2
4 Amazon Europe 6 5
5 Google Europe 3 1
6 Tencent Europe 5 4
7 SpaceX Middle East 8 5
8 Western Digital Middle East 6 5
9 Aliexpress Middle East 6 2
Upvotes: 3
Reputation: 102241
A base R option using reshape
reshape(
test_df,
direction = "wide",
idvar = c("company", "region"),
timevar = "code"
)
gives
company region n.positive n.negative
1 Google Asia 5 6
2 Amazon Europe 4 6
3 SpaceX Middle East 2 2
4 BlueOrigin Asia 5 6
5 Google Europe 5 3
6 Western Digital Middle East 2 4
7 Aliexpress Asia 6 3
8 Tencent Europe 5 3
9 Aliexpress Middle East 4 4
Upvotes: 2