Reputation: 495
I have a dataframe test and want to normalize value
based on several groups, however, my code normalizes values based on the whole sample. It returns only ones number 1 in value
column, however number 1 should be more than 1 time as I have different groups.
test<-structure(list(group = c("Total", "Total", "Total", "Total",
"BMW", "BMW", "Audi", "Audi", "Skoda", "Skoda", "Skoda", "Skoda",
"Skoda", "Skoda", "Total", "Total", "Total", "Total", "BMW",
"BMW", "Audi", "Audi", "Skoda", "Skoda", "Skoda", "Skoda", "Skoda",
"Skoda", "Total", "Total", "Total", "Total", "BMW", "BMW", "Audi",
"Audi", "Skoda", "Skoda", "Skoda", "Skoda", "Skoda", "Skoda",
"Total", "Total", "Total", "Total", "BMW", "BMW", "Audi", "Audi"
), day = c("MONDAY", "MONDAY", "MONDAY", "MONDAY", "MONDAY",
"MONDAY", "MONDAY", "MONDAY", "MONDAY", "MONDAY", "MONDAY", "MONDAY",
"MONDAY", "MONDAY", "TUESDAY", "TUESDAY", "TUESDAY", "TUESDAY",
"TUESDAY", "TUESDAY", "TUESDAY", "TUESDAY", "TUESDAY", "TUESDAY",
"TUESDAY", "TUESDAY", "TUESDAY", "TUESDAY", "WEDNESDAY", "WEDNESDAY",
"WEDNESDAY", "WEDNESDAY", "WEDNESDAY", "WEDNESDAY", "WEDNESDAY",
"WEDNESDAY", "WEDNESDAY", "WEDNESDAY", "WEDNESDAY", "WEDNESDAY",
"WEDNESDAY", "WEDNESDAY", "THURSDAY", "THURSDAY", "THURSDAY",
"THURSDAY", "THURSDAY", "THURSDAY", "THURSDAY", "THURSDAY"),
variable = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("ALL",
"DE", "BE", "NL", "AUS", "ES", "IT", "FR", "PO"), class = "factor"),
value = c(2400000, 384000, 43826, 340174, 960000, 230400,
1440000, 153600, 1000, 700, 23500, 11000, 12500, 10000, 3750000,
185625, 85450, 100175, 2437500, 146250, 1312500, 39375, 1650,
1240, 35700, 29000, 6700, 6250, 3400000, 183600, 23868, 159732,
2210000, 88400, 1190000, 95200, 1400, 1100, 33000, 27500,
5500, 4500, 3100000, 186000, 18600, 167400, 1860000, 148800,
1240000, 37200)), row.names = c(NA, 50L), class = "data.frame")
normalit<-function(m){
(m - min(m))/(max(m)-min(m))
}
df_scaled <- test %>% group_by(group,variable) %>% mutate(value = normalit(value))
Upvotes: 0
Views: 114
Reputation: 5429
Your code works fine from what you say. I have multiple 1's in there. Perhaps you are not seeing all the rows?
print.data.frame( df_scaled )
group day variable value
1 Total MONDAY ALL 0.638205499
2 Total MONDAY ALL 0.097925712
3 Total MONDAY ALL 0.006760465
4 Total MONDAY ALL 0.086180522
5 BMW MONDAY ALL 0.371035716
6 BMW MONDAY ALL 0.060448682
7 Audi MONDAY ALL 1.000000000 <!-- here
8 Audi MONDAY ALL 0.082976903
9 Skoda MONDAY ALL 0.008571429
10 Skoda MONDAY ALL 0.000000000
11 Skoda MONDAY ALL 0.651428571
12 Skoda MONDAY ALL 0.294285714
13 Skoda MONDAY ALL 0.337142857
14 Skoda MONDAY ALL 0.265714286
15 Total TUESDAY ALL 1.000000000 <!-- here
16 Total TUESDAY ALL 0.044762020
17 Total TUESDAY ALL 0.017915528
18 Total TUESDAY ALL 0.021861768
19 BMW TUESDAY ALL 1.000000000 <!-- here
20 BMW TUESDAY ALL 0.024626453
21 Audi TUESDAY ALL 0.909110351
22 Audi TUESDAY ALL 0.001550470
23 Skoda TUESDAY ALL 0.027142857
24 Skoda TUESDAY ALL 0.015428571
25 Skoda TUESDAY ALL 1.000000000 <!-- here
26 Skoda TUESDAY ALL 0.808571429
27 Skoda TUESDAY ALL 0.171428571
28 Skoda TUESDAY ALL 0.158571429
29 Total WEDNESDAY ALL 0.906201426
30 Total WEDNESDAY ALL 0.044219328
31 Total WEDNESDAY ALL 0.001411803
32 Total WEDNESDAY ALL 0.037822801
33 BMW WEDNESDAY ALL 0.903154400
34 BMW WEDNESDAY ALL 0.000000000
35 Audi WEDNESDAY ALL 0.821785001
36 Audi WEDNESDAY ALL 0.041345880
37 Skoda WEDNESDAY ALL 0.020000000
38 Skoda WEDNESDAY ALL 0.011428571
39 Skoda WEDNESDAY ALL 0.922857143
40 Skoda WEDNESDAY ALL 0.765714286
41 Skoda WEDNESDAY ALL 0.137142857
42 Skoda WEDNESDAY ALL 0.108571429
43 Total THURSDAY ALL 0.825802648
44 Total THURSDAY ALL 0.044862518
45 Total THURSDAY ALL 0.000000000
46 Total THURSDAY ALL 0.039877794
47 BMW THURSDAY ALL 0.754161168
48 BMW THURSDAY ALL 0.025711975
49 Audi THURSDAY ALL 0.857428001
50 Audi THURSDAY ALL 0.000000000
Upvotes: 1