Kilian Murphy
Kilian Murphy

Reputation: 335

Data frame restructure in R from wide format to two columns

I am working on a data set that has 4 columns that represent when a certain forest patch was cut.

Three of these columns represent a year with each row representing a month. e.g. Column: 2015, Row: 1 = January, 2015

The last column is a unique ID for each site.

For my analysis, I need to restructure this data so that it is 2 rows, the site, and a date format for the cut. As we do not know the exact day we have chosen the 15th of each month as a proxy.

I have attached photos of the data and a summary.

enter image description here

enter image description here

The final outcome I would like is like so

uniqueid <- c(21000, 23400, 26800)
cut <- as.Date(c('2015-1-15','2016-3-15','2017-3-15'))
stack_example <- data.frame(uniqueid, cut)

Could somebody please help me to restructure the data so it is like the example output?

enter image description here

Thank you very much

<<<< EDIT FOR AKRUN >>>>>

 dput(head(clf))

structure(list(X2017 = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_), X2016 = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_ ), X2015 = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_), idunique = 1:6), row.names = c(NA, 6L), class = "data.frame")

Upvotes: 1

Views: 51

Answers (1)

akrun
akrun

Reputation: 887691

We can pivot to long format with pivot_longer and then use paste or str_c or sprintf

library(dplyr)
library(tidyr)
library(stringr)
clf %>%
      pivot_longer(cols = -idunique, values_drop_na = TRUE,
           names_to = 'date', values_to = 'value') %>%
      transmute(idunique, cut = sprintf('%s-%02d-15', 
               str_remove(date, '^X'), value))
# A tibble: 1 x 2
#  idunique cut       
#     <int> <chr>     
#1        3 2017-03-15

data

# OP's dput showed all rows NA.  So, added a value for test
clf$X2017[3] <- 3

Upvotes: 2

Related Questions