Reputation: 204
I am working on a new shiny project and trying to re-use some of my collegues work that he has done in SQL to speed up the time it takes to build the data for this app.
I don't exactly know how to describe this problem so I will do so by showing what I have and explaining what I want to get.
Essentially we have an SQL script which spits out a bunch of data into two columns.
It would look like the following.
lookup output
1: dataAU20161 142
2: dataAU20171 246
3: dataAU20181 17
4: dataAU20191 3
5: dataAU20162 193
6: dataAU20172 203
7: dataAU20182 11
8: dataAU20192 9
So ideally I would like to transform this data into the following format where the 'data'
string identifies that they will go into the same data frame. The years in the string will be implemented into columns and the number following the years (1 or 2) will be implemented as a column as a factor variable.
x 2016 2017 2018 2019
--------------------------------
1 142 246 17 3
2 193 203 11 9
Any help with this would be greatly appreciated!
Upvotes: 1
Views: 131
Reputation: 887193
An option would to be separate thecolumn 'lookup'into two
library(dplyr)
library(tidyr)
df1 %>%
extract(lookup, into = c('lookup', 'rn'), 'dataAU(\\d{4})(\\d{1})') %>%
pivot_wider(names_from= lookup, values_from =output) %>%
dplyr::select(-rn)
# A tibble: 2 x 4
# `2016` `2017` `2018` `2019`
# <int> <int> <int> <int>
#1 142 246 17 3
#2 193 203 11 9
df1 <- structure(list(lookup = c("dataAU20161", "dataAU20171", "dataAU20181",
"dataAU20191", "dataAU20162", "dataAU20172", "dataAU20182", "dataAU20192"
), output = c(142L, 246L, 17L, 3L, 193L, 203L, 11L, 9L)), class = "data.frame",
row.names = c("1:",
"2:", "3:", "4:", "5:", "6:", "7:", "8:"))
Upvotes: 2