Reputation:
My dataframe df
looks like this:
Value
X.Y.Z 10
X.Y.K 20
X.Y.W 30
X.Y.Z.1 20
X.Y.K.1 5
X.Y.W.1 30
X.Y.Z.2 3
X.Y.K.2 23
X.Y.W.2 44
I'm trying to unpivot using the 3rd character of the row names to name the columns, like:
So, the row names are now the last character of the rows (after the dot).
I know this is possible to do with dplyr
, I've tried gather
, and spread
, but no luck, can anyone help?
EDIT: Here's the data above in text, I:
structure(list(..1 = c("X.Y.Z", "X.Y.K", "X.Y.W", "X.Y.Z.1",
"X.Y.K.1", "X.Y.W.1", "X.Y.Z.2", "X.Y.K.2", "X.Y.W.2"), Value = c(10,
20, 30, 20, 5, 30, 3, 23, 44)), class = "data.frame", row.names = c(NA,
-9L))
Upvotes: 1
Views: 287
Reputation: 29203
library(dplyr)
library(tibble)
library(tidyr)
df1 %>%
rownames_to_column %>%
transmute(mycols = gsub('^.*\\.', '', gsub('.[[:digit:]]+', '', rowname)),
myrows = regmatches(rowname, gregexpr('[0-9]+',rowname)),
value = Value) %>%
spread(key=mycols, value=value)
# myrows K W Z
# 1 20 30 10
# 2 1 5 30 20
# 3 2 23 44 3
library(dplyr)
library(tidyr)
df1 %>%
mutate(mycols = substr(gsub('.[[:digit:]]+', '', rownames(.)), 5, 5),
myrows = as.integer(as.factor(substr(rownames(.),7,7)))-1) %>%
spread(key=mycols, value=Value)
#> myrows K W Z
#> 1 0 20 30 10
#> 2 1 5 30 20
#> 3 2 23 44 3
Data:
df1 <- structure(list(Value = c(10, 20, 30, 20, 5, 30, 3, 23, 44)),
row.names = c("X.Y.Z", "X.Y.K", "X.Y.W", "X.Y.Z.1",
"X.Y.K.1", "X.Y.W.1", "X.Y.Z.2", "X.Y.K.2", "X.Y.W.2"),
class = "data.frame")
As I said in my comment, we need to clean the data since $..1
column causes problems for dplyr
. Here's a solution using the exact data provided in the question:
df1 <- structure(list(..1 = c("X.Y.Z", "X.Y.K", "X.Y.W", "X.Y.Z.1",
"X.Y.K.1", "X.Y.W.1", "X.Y.Z.2", "X.Y.K.2", "X.Y.W.2"),
Value = c(10, 20, 30, 20, 5, 30, 3, 23, 44)),
class = "data.frame", row.names = c(NA, -9L))
library(dplyr)
library(janitor)
library(tidyr)
clean_names(df1) %>%
mutate(mycols = substr(gsub('.[[:digit:]]+', '', x1), 5, 5),
myrows = as.integer(as.factor(substr(x1,7,7)))-1) %>%
select(-x1) %>%
spread(key=mycols, value=value)
#> myrows K W Z
#> 1 0 20 30 10
#> 2 1 5 30 20
#> 3 2 23 44 3
Created on 2019-07-29 by the reprex package (v0.3.0)
Couple more approaches to see if they work for OP's dataset. (Without a reproducible example, it's hard, if not impossible, to resolve this; so, these are my final efforts).
library(dplyr)
library(tibble)
library(tidyr)
df1 %>%
rownames_to_column %>%
mutate(mycols = gsub('.[[:digit:]]+', '', rowname),
myrows = regmatches(rowname, gregexpr('[0-9]+',rowname))) %>%
select(-rowname) %>%
spread(key=mycols, value=Value)
or
df1 %>%
rownames_to_column %>%
separate(rowname,sep = "\\.", into = c("A1","B2","C3", "D4")) %>%
select(-A1,-B2) %>%
spread(key=C3, value=Value)
Upvotes: 4
Reputation: 160792
Try this:
library(dplyr) # and tibble is needed, too
library(tidyr)
df %>%
tibble::rownames_to_column() %>%
mutate(
type = gsub("(^X\\.Y\\.|\\.[0-9]$)", "", rowname),
num = gsub("\\D", "", rowname)
) %>%
select(-rowname) %>%
spread(type, Value)
# num K W Z
# 1 20 30 10
# 2 1 5 30 20
# 3 2 23 44 3
BTW, from your question it looks like what you have are proper rownames
, but your structure
includes them as a column named ..1
. I thought this might have been an artifact of trying to get the data into your question, so I removed it.
df <- structure(list(..1 = c("X.Y.Z", "X.Y.K", "X.Y.W", "X.Y.Z.1",
"X.Y.K.1", "X.Y.W.1", "X.Y.Z.2", "X.Y.K.2", "X.Y.W.2"), Value = c(10,
20, 30, 20, 5, 30, 3, 23, 44)), class = "data.frame", row.names = c(NA, -9L))
rownames(df) <- x$..1
df$..1 <- NULL
If this is not quite the case, then you can remove the call to tibble::rownames_to_column()
. However, having ..1
in a dplyr chain isn't good (Error: Column 1 must not have names of the form ... or ..j.
), so you'll need to rename it anyway.
Upvotes: 1