user10072460

How to code multiple columns in a sequence approach

I have multiple columns and I want to code them sequentially. Here is a sample of the columns:

df <- read.table(text = "A       M       Z   X
124321  33333   123 1309
234543  12121   33  1308
130991  200EE   123 1308
130911  200EE   123 1309
124321  12121   33  1309
234543  33333   232 1309", header = TRUE)

I want to get this table:

df1 <- read.table(text = "Group1  Group2  Group3  Group4
1   6   9   12
4   5   8   11
3   7   9   11
2   7   9   12
1   5   8   12
4   6   10  12
", header = TRUE)

I have used the following basic code, but in my experience it is not reliable, especially as the number of columns grows.

  df$Group1 <- as.integer(as.factor(df$A))
  df$Group2 <- as.integer(as.factor(df$M)) + max(df$Group1)
  df$Group3 <- as.integer(as.factor(df$Z)) + max(df$Group2)
  df$Group4 <- as.integer(as.factor(df$X)) + max(df$Group3)

Is there a better and more reliable solution to get my table?

Upvotes: 1

Views: 42

Answers (2)

IceCreamToucan

Reputation: 28705

You can use accumulate from purrr (loaded as part of the tidyverse):

library(tidyverse)

df %>% 
  mutate_all(~ as.integer(as.factor(.))) %>% 
  accumulate(~ .y + max(.x)) %>% 
  bind_cols %>% 
  rename_all(~ paste0('Group', seq_along(.)))

# # A tibble: 6 x 4
#   Group1 Group2 Group3 Group4
#    <int>  <int>  <int>  <int>
# 1      1      7      9     12
# 2      4      5      8     11
# 3      3      6      9     11
# 4      2      6      9     12
# 5      1      5      8     12
# 6      4      7     10     12
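For readers without the tidyverse, the same running-offset idea can be sketched in base R with Reduce(accumulate = TRUE). This is only an illustration of the technique, not part of the original answer; the names codes and shifted are made up for the example.

```r
# Sample data from the question
df <- read.table(text = "A M Z X
124321 33333 123 1309
234543 12121 33 1308
130991 200EE 123 1308
130911 200EE 123 1309
124321 12121 33 1309
234543 33333 232 1309", header = TRUE)

# Per-column integer codes, each starting at 1
codes <- lapply(df, function(x) as.integer(as.factor(x)))

# Shift each column by the maximum of the previous (already shifted) column
shifted <- Reduce(function(prev, cur) cur + max(prev), codes, accumulate = TRUE)

# Reduce() returns an unnamed list, so name the columns before converting
names(shifted) <- paste0("Group", seq_along(shifted))
as.data.frame(shifted)
```

Like accumulate, Reduce passes the previous result as the first argument and the next column as the second, so each column continues the numbering where the one before it ended.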

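The second column is different from the one you show because as.factor orders its levels alphabetically, so 200EE sorts before 33333. Based on the per-column codes in the output below, it looks like it's working as expected.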

df %>% 
  mutate_all(~ as.integer(as.factor(.)))
#   A M Z X
# 1 1 3 2 2
# 2 4 1 1 1
# 3 3 2 2 1
# 4 2 2 2 2
# 5 1 1 1 2
# 6 4 3 3 2
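To check the ordering for column M directly, one can inspect the factor levels. This is a small sketch using the values from the question; the vector name m is just for illustration.

```r
# Column M from the question, as character values
m <- c("33333", "12121", "200EE", "200EE", "12121", "33333")

# as.factor() sorts the unique values, so "200EE" comes before "33333"
levels(as.factor(m))
# [1] "12121" "200EE" "33333"

# The integer codes follow that sorted order
as.integer(as.factor(m))
# [1] 3 1 2 2 1 3
```

If a different ordering is wanted, the levels can be given explicitly with factor(m, levels = ...), though the question does not say which rule its expected numbering follows.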

Or, borrowing d.b's cumsum/sapply idea (you should accept d.b's answer if you think this method is better):

df %>% 
  mutate_all(~ as.integer(as.factor(.))) %>% 
  map2_dfc(c(0, cumsum(sapply(., max))[-ncol(.)]), `+`)
# # A tibble: 6 x 4
#       A     M     Z     X
#   <dbl> <dbl> <dbl> <dbl>
# 1     1     7     9    12
# 2     4     5     8    11
# 3     3     6     9    11
# 4     2     6     9    12
# 5     1     5     8    12
# 6     4     7    10    12

Upvotes: 1

d.b

Reputation: 32558

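# Code each column as integers 1..k, where k is its number of distinct values
df2 = lapply(df, function(x) as.integer(as.factor(x)))
# Add to each column the total number of codes used by the columns before it
data.frame(Map("+", df2, cumsum(c(0, head(sapply(df2, max), -1)))))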
#  A M  Z  X
#1 1 7  9 12
#2 4 5  8 11
#3 3 6  9 11
#4 2 6  9 12
#5 1 5  8 12
#6 4 7 10 12
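Since the question asks for something that keeps working as columns are added, the two lines above could be wrapped in a small function. This is a sketch only: sequential_codes is a hypothetical name, the Group column naming is taken from the question, and the code assumes there are no missing values (max would return NA otherwise).

```r
# Hypothetical helper: code every column as consecutive integers,
# continuing the numbering from one column to the next.
# Assumes no NA values in the data.
sequential_codes <- function(data) {
  # Per-column integer codes, each starting at 1
  codes <- lapply(data, function(x) as.integer(as.factor(x)))
  # Offset for a column = number of codes used by all columns before it
  offsets <- cumsum(c(0, head(sapply(codes, max), -1)))
  out <- data.frame(Map(`+`, codes, offsets))
  names(out) <- paste0("Group", seq_along(out))
  out
}

# Sample data from the question
df <- read.table(text = "A M Z X
124321 33333 123 1309
234543 12121 33 1308
130991 200EE 123 1308
130911 200EE 123 1309
124321 12121 33 1309
234543 33333 232 1309", header = TRUE)

result <- sequential_codes(df)
result
```

The function works for any number of columns because the offsets are computed from the data rather than written out one column at a time.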

Upvotes: 1
