Reputation: 391
I have this data table :
library(data.table)
dt = data.table(customer = '123',
year = 2008:2012,
sales_revenues_by_year = c(1966.4,180.2,903,1138.34,1593.66),
height_sales_by_year = c(260,10,100,120,146),
nb_orders_by_year = c(2,1,2,1,4),
nb_item_ordered_by_year = c(4,4,2,2,26),
customer_age = c(rep(12.85,5)),
growth_by_year = c(NA, -0.91, 4.01, 0.26, 0.4))
customer year sales_revenues_by_year height_sales_by_year nb_orders_by_year nb_item_ordered_by_year customer_age growth_by_year
1: 123 2008 1966.40 260 2 4 12.85 NA
2: 123 2009 180.20 10 1 4 12.85 -0.91
3: 123 2010 903.00 100 2 2 12.85 4.01
4: 123 2011 1138.34 120 1 2 12.85 0.26
5: 123 2012 1593.66 146 4 26 12.85 0.40
I want to create an output data where each columns from sales_revenues_by_year to growth_by_year will become sales_revenues_by_year_2008, sales_revenues_by_year_2009, ..., ..., growth_by_year_2008, ..., growth_by_year_2012.
The year column is my pivot table. At the end, for this customer (123) I only have 1 row with a lot of columns (5 columns(2008,2009,2010,2011,2012) by new variables)
Thanks a lot for any answer !!
Upvotes: 0
Views: 25
Reputation: 39595
One option with dplyr
and tidyr
:
library(dplyr)
library(tidyr)
#Code
newdt <- dt %>% pivot_longer(-c(customer,year)) %>%
mutate(var=paste0(name,'_',year)) %>%
select(-c(year,name)) %>%
pivot_wider(names_from = var,values_from=value,values_fill = 0)
Output:
# A tibble: 1 x 31
customer sales_revenues_~ height_sales_by~ nb_orders_by_ye~ nb_item_ordered~ customer_age_20~
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 123 1966. 260 2 4 12.8
# ... with 25 more variables: growth_by_year_2008 <dbl>, sales_revenues_by_year_2009 <dbl>,
# height_sales_by_year_2009 <dbl>, nb_orders_by_year_2009 <dbl>,
# nb_item_ordered_by_year_2009 <dbl>, customer_age_2009 <dbl>, growth_by_year_2009 <dbl>,
# sales_revenues_by_year_2010 <dbl>, height_sales_by_year_2010 <dbl>,
# nb_orders_by_year_2010 <dbl>, nb_item_ordered_by_year_2010 <dbl>, customer_age_2010 <dbl>,
# growth_by_year_2010 <dbl>, sales_revenues_by_year_2011 <dbl>, height_sales_by_year_2011 <dbl>,
# nb_orders_by_year_2011 <dbl>, nb_item_ordered_by_year_2011 <dbl>, customer_age_2011 <dbl>,
# growth_by_year_2011 <dbl>, sales_revenues_by_year_2012 <dbl>, height_sales_by_year_2012 <dbl>,
# nb_orders_by_year_2012 <dbl>, nb_item_ordered_by_year_2012 <dbl>, customer_age_2012 <dbl>,
# growth_by_year_2012 <dbl>
Or more directly:
#Code 2
newdt <- dt %>% pivot_wider(names_from=year,
values_from=c(sales_revenues_by_year:growth_by_year),
values_fill = 0)
Output:
# A tibble: 1 x 31
customer sales_revenues_~ sales_revenues_~ sales_revenues_~ sales_revenues_~ sales_revenues_~
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 123 1966. 180. 903 1138. 1594.
# ... with 25 more variables: height_sales_by_year_2008 <dbl>, height_sales_by_year_2009 <dbl>,
# height_sales_by_year_2010 <dbl>, height_sales_by_year_2011 <dbl>,
# height_sales_by_year_2012 <dbl>, nb_orders_by_year_2008 <dbl>, nb_orders_by_year_2009 <dbl>,
# nb_orders_by_year_2010 <dbl>, nb_orders_by_year_2011 <dbl>, nb_orders_by_year_2012 <dbl>,
# nb_item_ordered_by_year_2008 <dbl>, nb_item_ordered_by_year_2009 <dbl>,
# nb_item_ordered_by_year_2010 <dbl>, nb_item_ordered_by_year_2011 <dbl>,
# nb_item_ordered_by_year_2012 <dbl>, customer_age_2008 <dbl>, customer_age_2009 <dbl>,
# customer_age_2010 <dbl>, customer_age_2011 <dbl>, customer_age_2012 <dbl>,
# growth_by_year_2008 <dbl>, growth_by_year_2009 <dbl>, growth_by_year_2010 <dbl>,
# growth_by_year_2011 <dbl>, growth_by_year_2012 <dbl>
Upvotes: 1