Reputation: 1

Transforming a data frame with one column to many columns in R

Currently my data frame consists of 1 column with 2326 rows. I want to transform my data frame to consist of 11 rows. More specifically, my current data frame look like

John Doe
7
45
42
978
3
6
8
9
0
11
Sally Jenkins
2

etc.

I want my data frame to look like:

John Doe        7  45  42  978  3  6  8  9  0  11 
Sally Jenkins   2

Each person is a row and then their statistics are each a separate column in the row. Some people are missing a statistic so I can not separate based on the number of rows between each person's name.

I have tried using t() as well as reshape(transform()). Do you have any suggestions on where to go next?

Upvotes: 0

Answers (2)

akrun

Reputation: 887851

1) One option is to create a grouping variable based on the presence of alphabets, summarise by creating a list output and use unnest_wider to change that list to a set of new columns

library(dplyr)
library(tidyr)
library(stringr)
df1  %>%
      group_by(grp = cumsum(str_detect(col1, "[A-Za-z]"))) %>%
      group_by(coln = first(col1), .add = TRUE) %>%
      slice(-1) %>%
      summarise(out = list(as.list(as.numeric(col1))))  %>%
      unnest_wider(c(out)) %>%
      ungroup %>%
      select(-grp) %>%
      rename_at(-1, ~ str_c('new_col', seq_along(.)))
# A tibble: 2 x 11
#  coln          new_col1 new_col2 new_col3 new_col4 new_col5 new_col6 new_col7 new_col8 new_col9 new_col10
#  <chr>            <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>     <dbl>
#1 John Doe             7       45       42      978        3        6        8        9        0        11
#2 Sally Jenkins        2       NA       NA       NA       NA       NA       NA       NA       NA        NA

2) Or using base R (No packages used)

grp <- with(df1, ave(col1, cumsum(grepl('[A-Za-z]', col1)), 
     FUN = function(x) x[1]))
aggregate(values ~ ind, stack(split(as.numeric(df1$col1[duplicated(grp)]), 
           grp[duplicated(grp)])),  FUN = I)
#            ind                            values
#1      John Doe 7, 45, 42, 978, 3, 6, 8, 9, 0, 11
#2 Sally Jenkins                                 2

data

df1 <- structure(list(col1 = c("John Doe", "7", "45", "42", "978", "3", 
"6", "8", "9", "0", "11", "Sally Jenkins", "2")), class = "data.frame", 
row.names = c(NA, 
-13L))

Upvotes: 1

G. Grothendieck

Reputation: 270195

Using DF defined reproducibly in the Note at the end, ensure that DF[[1]] is character. Then use grep to create a logical variable g with one element per input row which is TRUE for each heading and FALSE otherwise. Then remove the headings and and split it up using the setNames to set the names of the resulting list. Finally convert each list component to a numeric ts series as such series can be cbind'ed even if they have different lengths. Now cbind those together and transpose. No packages are used.

DF[[1]] <- as.character(DF[[1]]) 
g <- grepl("\\D", DF[[1]])
s <- setNames(split(DF[[1]][!g], cumsum(g)[!g]), DF[[1]][g])
t(do.call("cbind", lapply(s, function(x) ts(as.numeric(x)))))

giving:

              [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
John Doe         7   45   42  978    3    6    8    9    0    11
Sally Jenkins    2   NA   NA   NA   NA   NA   NA   NA   NA    NA

Note

The input in reproducible form is assumed to be:

Lines <- "John Doe
7
45
42
978
3
6
8
9
0
11
Sally Jenkins
2"
DF <- read.table(text = Lines, as.is = TRUE, sep = ";")

Upvotes: 0

Transforming a data frame with one column to many columns in R

Answers (2)

data

Note

Related Questions