Reputation: 328
I have a dataset comparing 15 hybrids, each with 5 separate measurements. I am trying to spread the data into a wider dataset using pivot_wider
for a regression analysis, since spread()
would not work (probably because of the repeated observations).
The dataset I am working with is below:
data <- structure(list(hybrid = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7,
7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8,
8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11,
11, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12, 12, 12,
12, 12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 13, 13,
13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14,
14, 14, 14, 14, 14, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
15, 15, 15), measurement = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4,
4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1,
1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3,
3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5,
5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2,
2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4,
4, 5, 5, 5, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 1, 1, 1, 2, 2, 2, 3,
3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5,
5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2,
2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4,
4, 4, 5, 5, 5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5), value = c(245,
889, 450, 45, 515, 318, 956, 434, 29, 740, 156, 516, 767, 292,
753, 573, 636, 611, 777, 557, 408, 95, 482, 227, 495, 360, 55,
76, 393, 37, 667, 802, 724, 900, 885, 191, 79, 143, 531, 398,
324, 129, 172, 467, 25, 101, 476, 629, 915, 122, 498, 649, 354,
527, 920, 788, 565, 552, 586, 127, 461, 307, 77, 552, 198, 240,
816, 144, 136, 781, 593, 421, 233, 264, 812, 407, 492, 932, 940,
139, 764, 200, 352, 754, 271, 506, 381, 973, 678, 848, 432, 358,
218, 736, 287, 411, 220, 264, 531, 669, 666, 727, 841, 792, 79,
460, 159, 426, 90, 395, 793, 507, 262, 814, 157, 641, 230, 870,
304, 591, 636, 277, 534, 783, 562, 938, 889, 68, 557, 892, 809,
157, 71, 54, 256, 246, 301, 823, 622, 953, 6, 66, 556, 902, 207,
832, 248, 540, 192, 65, 381, 712, 15, 323, 1, 193, 146, 637,
488, 158, 289, 839, 229, 237, 273, 978, 560, 969, 898, 204, 335,
930, 444, 968, 920, 398, 303, 318, 975, 182, 630, 4, 624, 271,
272, 438, 661, 728, 32, 106, 473, 465, 498, 33, 189, 918, 704,
605, 867, 240, 833, 497, 514, 241, 860, 228, 643, 791, 4, 898,
574, 225, 339, 365, 387, 548, 88, 604, 283)), class = "data.frame", row.names = c(NA,
-219L))
I'm new to the pivot_wider function, so when I run my code, I get an error:
data%>%
pivot_wider(cols = -hybrid, names_to = c("1","2","3","4","5"))
Error in pivot_wider(., cols = -hybrid, names_to = c("1", "2", "3", "4", :
unused arguments (cols = -hybrid, names_to = c("1", "2", "3", "4", "5"))
How can I spread this data so that I have 5 columns? Hybrid, 1, 2, 3, 4, 5 (with the values under the columns entitled 1:5).
Upvotes: 0
Views: 91
Reputation: 886938
Using dcast
from data.table
library(data.table)
dcast(setDT(data), hybrid ~ measurement, sum)
# hybrid 1 2 3 4 5
# 1: 1 1584 878 1419 1412 1812
# 2: 2 1820 1742 804 910 506
# 3: 3 2193 1976 753 851 664
# 4: 4 1206 1535 1530 2273 1265
# 5: 5 845 990 1096 1795 1309
# 6: 6 1831 1843 1306 1158 2499
# 7: 7 1008 1434 1015 2062 1712
# 8: 8 1045 1278 1583 1028 1765
# 9: 9 913 1317 1500 957 1449
#10: 10 1037 556 1746 1025 1665
#11: 11 1620 638 1050 340 1283
#12: 12 1357 1488 2427 1469 2332
#13: 13 1019 1787 899 1371 866
#14: 14 1436 1140 2176 1570 1615
#15: 15 1662 1476 929 1023 887
Upvotes: 2
Reputation: 33498
My guess is that you are you looking for this:
library(tidyr)
pivot_wider(data, id_cols = hybrid, names_from = measurement, values_from = "value", values_fn = sum)
# # A tibble: 15 x 6
# hybrid `1` `2` `3` `4` `5`
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 1 1584 878 1419 1412 1812
# 2 2 1820 1742 804 910 506
# 3 3 2193 1976 753 851 664
# 4 4 1206 1535 1530 2273 1265
# 5 5 845 990 1096 1795 1309
# 6 6 1831 1843 1306 1158 2499
# 7 7 1008 1434 1015 2062 1712
# 8 8 1045 1278 1583 1028 1765
# 9 9 913 1317 1500 957 1449
# 10 10 1037 556 1746 1025 1665
# 11 11 1620 638 1050 340 1283
# 12 12 1357 1488 2427 1469 2332
# 13 13 1019 1787 899 1371 866
# 14 14 1436 1140 2176 1570 1615
# 15 15 1662 1476 929 1023 887
Upvotes: 2