Joshua Rosenberg
Joshua Rosenberg

Reputation: 4236

Mutate multiple / consecutive columns (with dplyr or base R)

I'm trying to create "waves" of variables that represent repeated measures. Specifically, I'm trying to create consecutive variables that represent the mean values for variables 1 - 10, 11 - 20 ... 91-100. Note that the "..." symbolizes the variables for waves 3 through 9, as avoiding typing these is my goal!

Here is an example data frame, df, with 10 rows and 100 columns:

mat <- matrix(runif(1000, 1, 10), ncol = 100)
df <- data.frame(mat)
dim(df)
> 10 100

I've used the dplyr::mutate which works once all the variables are typed, but is time-intensive and prone to mistakes. I have not been able to find a way to do so without resorting to manually typing the names of the columns, as I started doing below (note that "..." symbolizes waves 3 through 9):

df <- df %>% 
      mutate(wave_1 = (X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10) / 10,
             wave_2 = (X11 + X12 + X13 + X14 + X15 + X16 + X17 + X18 + X19 + X20) / 10,
             ...
             wave_10 = (X91 + X92 + X93 + X94 + X95 + X96 + X97 + X98 + X99 + X100) / 10)

Can you mutate multiple / consecutive columns with 'dplyr'? Other approaches are also welcome.

Upvotes: 10

Views: 6731

Answers (5)

Anoushiravan R
Anoushiravan R

Reputation: 21938

We can also do it with purrr::map2:

library(purrr)

set.seed(123)
mat <- matrix(runif(1000, 1, 10), ncol = 100)
df <- data.frame(mat)

data.frame(from = seq(10, 100, 10) - 9, 
           to = seq(10, 100, 10)) %>%
  {map2_dfc(.$from, .$to, ~ rowMeans(df[, .x:.y]))} %>%
  set_names(paste0("Wave_", seq_along(.)))

# A tibble: 10 x 10
   Wave_1 Wave_2 Wave_3 Wave_4 Wave_5 Wave_6 Wave_7 Wave_8 Wave_9 Wave_10
    <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>   <dbl>
 1   5.57   6.59   5.75   5.79   6.28   5.67   5.96   6.58   4.37    6.39
 2   6.16   4.53   5.15   4.36   5.73   5.06   7.20   5.66   5.95    3.63
 3   5.94   5.62   4.66   6.65   5.21   6.84   4.98   4.39   6.32    7.42
 4   5.91   5.77   5.55   5.53   5.43   4.25   5.85   6.84   5.65    5.06
 5   4.74   6.04   4.34   4.65   5.65   5.95   4.26   5.31   6.88    6.19
 6   4.39   5.86   6.77   6.46   5.77   6.14   5.90   4.93   5.59    6.61
 7   5.85   5.53   4.88   6.19   5.39   5.13   6.72   4.85   4.96    6.00
 8   5.84   5.24   5.15   3.11   4.23   5.81   4.55   4.65   3.64    4.54
 9   5.63   5.65   4.18   3.94   3.89   7.15   5.78   5.49   4.59    5.55
10   4.83   5.46   7.32   7.62   6.10   4.71   4.00   4.91   5.75    4.03

And also in base R we could do this:

tmp <- split.default(df, ceiling(seq_along(df) / 10))  

as.data.frame(Reduce(function(x, y) {
  cbind(x, rowMeans(tmp[[y]]))
}, 2:length(tmp), init = rowMeans(tmp[[1]]))) |>
  setNames(paste0("wave_", 1:length(tmp)))

     wave_1   wave_2   wave_3   wave_4   wave_5   wave_6   wave_7   wave_8   wave_9  wave_10
1  5.571560 6.587021 5.750900 5.791688 6.279064 5.674091 5.963896 6.583146 4.365665 6.394814
2  6.164837 4.531540 5.153556 4.362311 5.727500 5.056182 7.197980 5.664657 5.947038 3.626209
3  5.937152 5.622390 4.660815 6.652186 5.209772 6.838960 4.977723 4.390910 6.320762 7.420729
4  5.910486 5.766074 5.549177 5.526093 5.434328 4.246362 5.853391 6.841727 5.652275 5.059860
5  4.735269 6.043003 4.335277 4.646320 5.650717 5.954178 4.260801 5.311500 6.884402 6.185179
6  4.391438 5.857648 6.766768 6.460810 5.773092 6.142438 5.902048 4.932962 5.590644 6.614198
7  5.854975 5.531683 4.882462 6.188551 5.390633 5.129840 6.715329 4.850971 4.957175 5.999634
8  5.838495 5.237646 5.145763 3.105511 4.234151 5.813252 4.553210 4.652256 3.637094 4.540835
9  5.632393 5.645221 4.176820 3.942658 3.885425 7.146875 5.778416 5.492009 4.589817 5.545992
10 4.833706 5.458013 7.323117 7.621194 6.100454 4.712570 4.003229 4.914826 5.752216 4.032089

Upvotes: 3

AnilGoyal
AnilGoyal

Reputation: 26238

One more way to do it.

set.seed(123)
mat <- matrix(runif(1000, 1, 10), ncol = 100)
df <- data.frame(mat)

library(tidyverse)

imap_dfc(split.default(df, (((as.numeric(gsub('X', '', names(df)))) -1) %/% 10) +1), 
         ~ .x %>% mutate(!!paste0('wave_', .y) := rowMeans(.)))

#>          X1       X2       X3       X4       X5       X6       X7       X8
#> 1  3.588198 9.611500 9.005854 9.667218 2.285200 1.412481 6.986037 7.790276
#> 2  8.094746 5.080007 7.235231 9.120691 4.730917 4.979801 1.853566 6.662990
#> 3  4.680792 7.098136 6.764561 7.216348 4.723519 8.190324 4.455727 7.391642
#> 4  8.947157 6.153701 9.948428 8.159207 4.319609 2.097093 3.469453 1.005623
#> 5  9.464206 1.926322 6.901352 1.221523 2.372003 6.048532 8.331760 5.277849
#> 6  1.410008 9.098425 7.376774 5.300164 2.249255 2.858783 5.036647 2.981070
#> 7  5.752949 3.214790 5.896594 7.826136 3.097307 2.147785 8.290579 4.418349
#> 8  9.031771 1.378536 6.347278 2.947671 5.193662 7.779771 8.311506 6.514939
#> 9  5.962915 3.951286 3.602438 3.863629 3.393754 9.055408 8.149081 4.166181
#> 10 5.109533 9.590533 2.324023 3.084632 8.720449 4.370165 4.958485 2.000219
#>          X9      X10   wave_1      X11      X12      X13      X14      X15
#> 1  3.192575 2.176261 5.571560 6.399901 9.417698 6.831041 6.573308 3.805320
#> 2  7.012500 6.877917 6.164837 3.995412 3.711060 3.878386 9.022547 4.685275
#> 3  4.758821 4.091648 5.937152 5.397517 1.546485 3.769480 7.056992 1.094204
#> 4  8.093763 6.910823 5.910486 9.590264 9.529542 2.977909 7.633700 2.654646
#> 5  1.925782 3.883359 4.735269 5.346122 7.485366 4.325400 5.690222 8.584564
#> 6  4.914035 2.689220 4.391438 9.013152 2.280649 9.857973 6.938546 3.080456
#> 7  9.864613 8.040649 5.854975 9.229944 5.943562 2.387821 8.396249 3.151900
#> 8  9.037460 1.842355 5.838495 6.478615 9.586821 1.819396 8.076534 1.690220
#> 9  8.978222 5.201011 5.632393 4.696208 6.269350 2.277162 9.818397 3.211513
#> 10 2.575474 5.603549 4.833706 2.323852 4.640593 7.210064 4.954884 7.589217
#>         X16      X17      X18      X19      X20   wave_2      X21      X22
#> 1  8.627078 5.520696 6.669757 8.557910 3.467500 6.587021 3.148534 5.338383
#> 2  5.477745 4.185141 2.654456 3.812033 3.893345 4.531540 9.661230 3.276684
#> 3  4.491181 6.849866 8.772797 7.374613 9.870768 5.622390 6.412292 2.946293
#> 4  3.218041 4.372426 7.719112 3.385160 6.579940 5.766074 5.635268 7.069387
#> 5  1.999868 4.199008 7.014562 6.349089 9.435827 6.043003 4.623160 1.428973
#> 6  4.509950 5.803192 6.562161 5.331608 5.198794 5.857648 8.922219 7.307678
#> 7  6.147418 7.663009 4.350143 3.385295 4.661493 5.531683 4.276827 4.166998
#> 8  2.952035 2.989926 5.768521 6.081314 6.933073 5.237646 3.594154 4.680496
#> 9  5.002912 4.714715 8.872141 9.218694 2.371120 5.645221 2.535807 8.388562
#> 10 2.961916 3.391180 6.235751 9.116870 6.155804 5.458013 2.549546 9.269716
#>         X23      X24      X25      X26      X27      X28      X29      X30
#> 1  3.542755 6.565161 5.921435 3.314951 9.331372 8.839391 7.827339 3.679676
#> 2  9.649943 4.855794 6.960859 2.996091 8.937798 1.056707 2.233955 1.906494
#> 3  7.555550 5.878723 2.545286 6.337411 7.067682 1.648514 4.569261 1.647137
#> 4  7.177376 1.526306 6.697498 3.407693 9.551503 2.477901 3.024868 8.923965
#> 5  1.475595 3.347712 3.806828 5.779634 5.648004 7.933007 1.521627 7.788227
#> 6  4.556981 4.574368 7.520989 8.067625 6.188671 7.616659 4.563034 8.349453
#> 7  5.300608 2.779703 4.590458 2.512547 4.026981 9.746881 1.584355 9.839263
#> 8  6.042279 8.487348 9.724208 4.639593 4.125922 5.198251 3.032978 1.932397
#> 9  7.284354 2.375985 9.706585 5.244186 1.180219 1.669461 1.491662 1.891376
#> 10 9.241152 8.230767 7.540323 8.812961 5.525317 6.839363 7.032538 8.189485
#>      wave_3      X31      X32      X33      X34      X35      X36      X37
#> 1  5.750900 8.061177 4.506284 8.683870 6.555117 4.732618 2.228862 2.294353
#> 2  5.153556 1.084869 5.181993 3.663059 3.576568 2.074643 9.707225 2.735344
#> 3  4.660815 8.011593 7.419511 2.323435 7.640177 5.734267 5.635646 9.070648
#> 4  5.549177 7.564516 1.497717 7.335929 8.506489 3.025660 2.467633 3.773076
#> 5  4.335277 6.671187 4.193048 1.934260 3.828437 5.377706 6.597121 4.269705
#> 6  6.766768 5.328197 8.225310 1.303550 5.433099 4.331933 9.873587 8.055518
#> 7  4.882462 2.409732 8.521380 9.994641 7.276364 9.850152 7.018944 2.740408
#> 8  5.145763 1.073940 3.139745 1.313873 6.773161 4.494872 4.770243 1.159892
#> 9  4.176820 5.072126 4.185875 4.045522 6.795306 3.063204 3.910105 4.659471
#> 10 7.323117 5.430640 8.711969 9.235574 9.800681 6.609678 8.517298 5.348509
#>         X38      X39      X40   wave_4      X41      X42      X43      X44
#> 1  4.796605 6.264101 9.793891 5.791688 9.874489 3.852543 5.927135 9.577911
#> 2  4.085279 7.847413 3.666720 4.362311 2.233607 8.484117 6.798162 6.072802
#> 3  8.798350 4.354385 7.533847 6.652186 9.147786 2.936549 6.366372 3.965927
#> 4  5.095972 7.922745 8.071191 5.526093 6.186717 5.481540 3.897436 9.969555
#> 5  5.803884 5.839095 1.948760 4.646320 4.559040 3.484447 9.020029 3.113377
#> 6  9.674590 9.225959 3.156352 6.460810 5.048222 2.728210 6.636313 6.514048
#> 7  7.971324 2.667668 3.434904 6.188551 7.358517 9.555591 3.726144 1.973607
#> 8  2.879887 3.539966 1.909526 3.105511 1.742525 3.895530 4.493842 5.383293
#> 9  3.779081 1.854662 2.061225 3.942658 4.053813 5.306107 2.444276 1.895012
#> 10 9.742082 2.894384 9.921129 7.621194 7.127088 1.251933 8.762967 2.450492
#>         X45      X46      X47      X48      X49      X50   wave_5      X51
#> 1  3.546936 5.714403 9.772532 1.845249 2.936498 9.742943 6.279064 4.182455
#> 2  6.254851 4.148216 4.925170 8.398271 8.325403 1.634401 5.727500 4.297973
#> 3  7.585369 3.164776 5.176150 4.846855 3.769875 5.138067 5.209772 3.583901
#> 4  2.489688 1.523726 2.487683 7.802985 7.189684 7.314266 5.434328 1.719756
#> 5  8.798210 3.129578 6.264429 6.961470 9.394130 1.782462 5.650717 4.289088
#> 6  7.377167 9.010701 3.437002 5.000747 2.042017 9.936496 5.773092 2.602124
#> 7  7.843596 8.306447 3.070872 6.644316 2.149351 3.277891 5.390633 5.824483
#> 8  2.323757 7.727647 7.220870 1.004188 7.104015 1.445846 4.234151 5.535538
#> 9  4.222513 2.394206 3.545672 2.955192 4.860537 7.176925 3.885425 9.505316
#> 10 7.059992 2.122679 8.293583 7.343850 8.509609 8.082346 6.100454 4.071892
#>         X52      X53      X54      X55      X56      X57      X58      X59
#> 1  5.182424 3.710283 9.452246 8.208534 3.948377 1.429390 5.979826 8.908066
#> 2  1.742781 4.282039 7.188468 1.824802 2.844448 9.364156 9.154433 8.315674
#> 3  8.740962 3.809015 5.023333 8.488970 6.124444 7.923570 6.287152 8.686900
#> 4  4.560946 1.336330 8.348303 3.491696 8.992497 2.809726 4.811173 4.311064
#> 5  7.623094 5.669244 1.355493 7.777991 5.767427 6.852354 9.546268 8.865549
#> 6  2.545691 7.111121 7.651285 9.677373 6.282628 6.883900 7.381341 2.362044
#> 7  5.092855 9.129102 4.138475 1.733199 6.991616 4.557284 4.719749 3.536305
#> 8  7.931843 1.229740 8.463257 8.689283 5.769032 8.310743 1.165277 7.000346
#> 9  1.563850 9.901704 5.819873 8.220144 5.588586 5.923211 6.100607 9.796452
#> 10 8.335733 3.725989 3.470908 4.466562 1.145444 8.966194 5.410571 6.244658
#>         X60   wave_6      X61      X62      X63      X64      X65      X66
#> 1  5.739311 5.674091 3.135067 6.667709 9.130608 7.832218 6.344546 7.776389
#> 2  1.547040 5.056182 7.178413 1.869704 8.789507 8.603512 8.189412 9.253514
#> 3  9.721350 6.838960 3.032366 1.197943 7.978668 5.121294 8.995678 5.281755
#> 4  2.082134 4.246362 3.866451 9.937403 4.391348 7.566685 4.466724 6.104075
#> 5  1.795270 5.954178 2.565854 6.255450 1.378972 1.936708 1.812899 7.629583
#> 6  8.926877 6.142438 8.212866 8.036408 4.279700 2.979849 6.631693 8.717027
#> 7  5.575336 5.129840 2.316539 9.027065 3.463761 9.585561 7.710726 9.182319
#> 8  4.037455 5.813252 8.404457 7.793789 8.654207 7.768338 1.768406 1.507440
#> 9  9.049012 7.146875 3.978980 9.812834 4.261615 8.370598 3.704659 5.526175
#> 10 1.287745 4.712570 4.367524 1.397324 3.740300 4.759994 6.531059 4.154903
#>         X67      X68      X69      X70   wave_7      X71      X72      X73
#> 1  8.610005 1.180666 7.603730 1.358026 5.963896 8.604402 6.676962 4.768022
#> 2  8.257916 5.691825 7.432793 6.713205 7.197980 3.341192 5.687581 7.460961
#> 3  2.055981 1.776501 8.480998 5.856048 4.977723 1.208300 6.936592 7.682852
#> 4  7.414179 3.547021 8.979095 2.260932 5.853391 8.761596 7.564248 8.847989
#> 5  3.117420 4.783914 9.573801 3.553406 4.260801 4.011292 5.381406 6.470812
#> 6  1.674611 6.285503 5.955550 6.247275 5.902048 6.686100 4.460110 7.805830
#> 7  9.420815 8.260250 5.701022 2.485233 6.715329 5.917836 1.061502 8.625172
#> 8  2.414510 2.817957 2.536284 1.866710 4.553210 4.388000 1.033158 6.515017
#> 9  6.823516 5.134822 5.313444 4.857518 5.778416 2.672857 9.954428 8.139018
#> 10 2.561660 5.033288 3.284387 4.201854 4.003229 4.860465 1.970982 1.205692
#>         X74      X75      X76      X77      X78      X79      X80   wave_8
#> 1  4.752231 9.929536 3.499924 5.624030 9.243877 3.329211 9.403269 6.583146
#> 2  8.873941 2.335172 3.244659 3.127059 8.318018 8.375204 5.882785 5.664657
#> 3  6.826123 1.444279 2.434919 6.178921 7.083010 1.139299 2.974803 4.390910
#> 4  9.320311 6.376181 1.644671 5.340321 8.267492 6.889607 5.404850 6.841727
#> 5  2.550026 3.891956 6.016938 6.124174 2.152580 8.305476 8.210339 5.311500
#> 6  3.842315 5.751361 5.408533 2.297371 3.257007 5.152320 4.668676 4.932962
#> 7  8.238307 8.158798 5.448631 2.311379 3.983923 2.830754 1.933404 4.850971
#> 8  9.900671 1.620393 8.995344 4.710548 4.665424 1.168914 3.525087 4.652256
#> 9  3.901470 7.255186 1.330339 7.147093 6.718308 3.548194 4.253196 5.492009
#> 10 1.010725 9.537684 2.833694 6.880850 8.277508 9.237454 3.333204 4.914826
#>         X81      X82      X83      X84      X85      X86      X87      X88
#> 1  5.236137 1.444740 7.198490 1.036962 9.661792 3.936098 4.820069 2.265816
#> 2  4.292609 7.255947 3.801437 7.947185 7.627006 7.537590 1.381246 2.527921
#> 3  2.091448 4.269354 6.450282 5.166777 6.514511 9.925707 6.826979 7.857868
#> 4  1.422943 8.957202 9.919309 7.487563 2.079360 7.436202 5.217572 5.746554
#> 5  3.365167 7.977675 7.688844 6.998551 5.952331 5.539958 6.561335 8.748904
#> 6  9.717771 2.252833 1.682714 6.148663 3.364807 4.924429 3.437339 7.061995
#> 7  5.396459 3.655083 5.060520 7.334317 9.085247 9.539427 2.415658 1.117369
#> 8  5.300398 2.134745 1.481832 6.914990 1.082620 2.081633 2.028288 7.238790
#> 9  7.739136 6.309115 4.056000 3.604169 3.126114 1.676307 5.569146 9.025424
#> 10 7.008762 6.055080 7.605569 1.875155 2.170401 9.001193 5.932291 6.686652
#>         X89      X90   wave_9      X91      X92      X93      X94      X95
#> 1  1.965652 6.090889 4.365665 9.313293 1.035067 8.275476 8.981763 9.783007
#> 2  9.289587 7.809851 5.947038 5.883385 1.469788 1.377275 2.836863 5.412018
#> 3  7.078261 7.026432 6.320762 8.671281 8.799042 4.273684 7.935607 4.502533
#> 4  2.337343 5.918705 5.652275 6.252066 6.186207 8.709373 6.367267 4.757995
#> 5  7.708077 8.303173 6.884402 7.014913 3.824583 7.281519 9.619028 1.836332
#> 6  9.483390 7.832501 5.590644 5.601831 9.635192 7.160378 2.428196 2.456283
#> 7  4.787050 1.180616 4.957175 7.864755 6.320744 4.132135 5.733768 4.648748
#> 8  3.679508 4.428133 3.637094 9.130261 5.782684 5.992136 8.858362 4.076330
#> 9  3.334840 1.457921 4.589817 8.384271 4.455430 2.235193 8.827355 4.737317
#> 10 3.005932 8.181126 5.752216 1.642887 3.875979 8.064384 1.213198 3.736472
#>         X96      X97      X98      X99     X100  wave_10
#> 1  6.042525 5.573423 8.037720 5.120891 1.784974 6.394814
#> 2  2.402867 5.046399 1.414239 7.405233 3.014020 3.626209
#> 3  9.609218 6.609352 8.378602 9.278633 6.149338 7.420729
#> 4  1.395700 2.259801 3.424676 6.643996 4.601523 5.059860
#> 5  4.349420 9.171518 3.545654 9.119636 6.089188 6.185179
#> 6  9.663538 6.124989 6.788994 7.815962 8.466615 6.614198
#> 7  6.808848 5.934525 9.533070 2.240725 6.779024 5.999634
#> 8  1.551364 2.051449 1.062986 2.379294 4.523489 4.540835
#> 9  4.689513 7.858255 4.164559 2.721810 7.386219 5.545992
#> 10 4.833146 5.305325 4.771412 4.898666 1.979417 4.032089

Created on 2021-07-24 by the reprex package (v2.0.0)

Upvotes: 1

jackinovik
jackinovik

Reputation: 869

Another approach (and IMO the recommended approach) using dplyr would be to first reshape or melt your data into a tidy data format before summarizing the values from each wave.

In detail, this process would involve:

  1. Reshape your data to long format (tidyr::gather)
  2. Identify which variables belong to each "wave"
  3. Summarize values for each wave
  4. Reshape your data back to wide format (tidyr::spread)

In your example, this would look like the following:

library(tidyverse)

mat <- matrix(runif(1000, 1, 10), ncol = 100)
df <- data.frame(mat)
dim(df)

df %>%
  dplyr::mutate(id = dplyr::row_number()) %>%
  # reshape to "tidy data" or long format
  tidyr::gather(varname, value, -id) %>%
  # identify which variables belong to which "wave"
  dplyr::mutate(varnum = as.integer(stringr::str_extract(varname, pattern = '\\d+')),
                wave = floor((varnum-1)/10)+1) %>%
  # summarize your value for each wave
  dplyr::group_by(id, wave) %>%
  dplyr::summarise(avg = sum(value)/n()) %>%
  # reshape back to "wide" format
  tidyr::spread(wave, avg, sep='_') %>%
  dplyr::ungroup()

With the following output:

# A tibble: 10 x 11
      id wave_1 wave_2 wave_3 wave_4 wave_5 wave_6 wave_7 wave_8 wave_9 wave_10
   <int>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>   <dbl>
 1     1   6.24   4.49   5.85   5.43   5.98   6.04   4.83   6.92   5.43    5.52
 2     2   5.16   6.82   5.76   6.66   6.21   5.41   4.58   5.06   5.81    6.93
 3     3   7.23   6.28   5.40   5.70   5.13   6.27   5.55   5.84   6.74    5.94
 4     4   5.27   4.79   4.39   6.85   5.31   6.01   6.15   3.31   5.73    5.63
 5     5   6.48   5.16   5.20   4.71   5.87   4.44   6.40   5.00   5.90    3.78
 6     6   4.18   4.64   5.49   5.47   5.75   6.35   4.34   5.66   5.34    6.57
 7     7   4.97   4.09   6.17   5.78   5.87   6.47   4.96   4.39   5.99    5.35
 8     8   5.50   7.21   5.43   5.15   4.56   5.00   4.86   5.72   6.41    5.65
 9     9   5.27   5.71   5.23   5.44   5.12   5.40   5.38   6.05   5.41    5.30
10    10   5.95   4.58   6.52   5.46   7.63   5.56   5.82   7.03   5.68    5.38

This could be joined back to your original data to match the example you gave (which used mutate) as follows:

df %>%
  dplyr::mutate(id = dplyr::row_number()) %>%
  tidyr::gather(varname, value, -id) %>%
  dplyr::mutate(varnum = as.integer(stringr::str_extract(varname, pattern = '\\d+')),
                wave = floor((varnum-1)/10)+1) %>%
  dplyr::group_by(id, wave) %>%
  dplyr::summarise(avg = sum(value)/n()) %>%
  tidyr::spread(wave, avg, sep='_') %>%
  dplyr::ungroup() %>%
  dplyr::right_join(df %>%    # <-- join back to original data
                     dplyr::mutate(id = dplyr::row_number()),
                   by = 'id')

One nice aspect to this approach is that you can inspect your data to confirm that you are correctly assigning variables to "wave"s.

df %>%
  dplyr::mutate(id = dplyr::row_number()) %>%
  tidyr::gather(varname, value, -id) %>%
  dplyr::mutate(varnum = as.integer(stringr::str_extract(varname, pattern = '\\d+')),
                wave = floor((varnum-1)/10)+1) %>%
  dplyr::distinct(varname, varnum, wave) %>%
  head()

which produces:

  varname varnum wave
1      X1      1    1
2      X2      2    1
3      X3      3    1
4      X4      4    1
5      X5      5    1
6      X6      6    1

Upvotes: 1

Sebastian M&#252;ller
Sebastian M&#252;ller

Reputation: 581

Another dplyr solution which is a bit closer to syntax indicated by the OP and doesn't require recasting the data-frame.

The 4 wave calculations do basically the same thing in slightly different but vectorized (i.e. rowSums and rowMeans) ways:

df <- df %>% 
      mutate(wave_1 = rowSums(select(., num_range("X", 1:10)))/10,
             wave_2 = rowSums(select(., c(11:20)))/10,
             wave_3 = rowMeans(select(., X21:X30)),
             wave_4 = rowMeans(.[, 31:40]))

Edit: . can be used as placeholder for the current dataframe df (code was changed accordingly). Also wave_4 added to demonstrate it can be used like a dataframe.

In case to operating function is not vectorized (that is, it can't be used on the whole dataframe such as rowSums), it is also possible to make use of the rowwise and do function using a non-vectorized functions (e.g. myfun)

myfun <- function (x) {
  sum(x)/10
}

tmp=df %>%
  rowwise() %>%
  do(data.frame(., wave_1 = myfun(unlist(.)[1:10]))) %>%
  do(data.frame(., wave_2 = myfun(unlist(.)[11:20])))

Note: . changes seems to change it's meaning, referring to the whole dataframe for mutate but only the current row for do.

Upvotes: 2

Carlos Cinelli
Carlos Cinelli

Reputation: 11617

Here is one way with the package zoo:

library(zoo)
t(rollapply(t(df), width = 10, by = 10, function(x) sum(x)/10))

Here is one way to do it with base R:

splits <- 1:100
dim(splits) <- c(10, 10)
splits <- split(splits, col(splits))
results <- do.call("cbind", lapply(splits, function(x) data.frame(rowSums(df[,x] / 10))))
names(results) <- paste0("wave_", 1:10)
results

Another very succinct way with base R (courtesy of G.Grothendieck):

t(apply(df, 1, tapply, gl(10, 10), mean))

And here is a solution with dplyr and tidyr:

library(dplyr)
library(tidyr)
df$row <- 1:nrow(df)
df2 <- df %>% gather(column, value, -row)
df2$column <- cut(as.numeric(gsub("X", "", df2$column)),breaks = c(0:10*10))
df2 <- df2 %>% group_by(row, column) %>% summarise(value = sum(value)/10)
df2 %>% spread(column, value) %>% select(-row)

Upvotes: 7

Related Questions