Reputation: 9437
I have an int array that represents a bell-shaped distribution: if I plot the data using the indexes as the x values and the array values as the y values, I obtain a graph like this:
index value
0 46
1 659
2 541
3 519
4 431
5 480
6 441
7 448
8 530
9 557
10 625
11 670
12 818
13 994
14 953
15 1139
16 1221
17 1226
18 1394
19 1772
20 2006
21 2351
22 2590
23 2785
24 3164
25 3639
26 4304
27 4860
28 5539
29 6340
30 7799
31 9364
32 10912
33 13017
34 15571
35 18633
36 22181
37 26567
38 31027
39 36643
40 42486
41 49997
42 57444
43 65501
44 74261
45 83820
46 93841
47 104361
48 114911
49 125867
50 136503
51 148606
52 158489
53 168585
54 177544
55 185554
56 192791
57 200219
58 203626
59 206432
60 208801
61 207941
62 207363
63 205734
64 201727
65 197152
66 190431
67 182139
68 174938
69 165990
70 155895
71 146229
72 136247
73 126603
74 116665
75 106734
76 97147
77 87350
78 78454
79 70097
80 62644
81 55134
82 48509
83 42327
84 36758
85 32089
86 27850
87 23787
88 20226
89 17071
90 14624
91 12542
92 10511
93 8669
94 7150
95 6054
96 5069
97 4178
98 3390
99 2894
100 2291
101 1963
102 1711
103 1394
104 1191
105 969
106 924
107 802
108 711
109 604
110 562
111 608
112 613
113 633
114 639
115 591
116 662
117 594
118 580
119 626
120 610
121 633
122 605
123 617
124 608
125 558
126 564
127 573
128 521
129 474
130 487
131 475
132 477
133 459
134 439
135 428
136 391
137 355
138 345
139 342
140 353
141 347
142 304
143 302
144 291
145 247
146 234
147 217
148 219
149 187
150 178
151 166
152 147
153 115
154 139
155 118
156 125
157 131
158 108
159 103
160 86
161 99
162 85
163 77
164 68
165 66
166 70
167 57
168 35
169 42
170 45
171 41
172 37
173 37
174 32
175 46
176 37
177 34
178 23
179 40
180 27
181 30
182 33
183 39
184 41
185 51
186 50
187 36
188 31
189 32
190 31
191 24
192 33
193 24
194 30
195 34
196 35
197 32
198 39
199 46
200 6821
In the picture above, the green line shows the index with the max value and the green blocks represent the standard deviation (SD) from the max (I'm not sure if SD is the proper name; I've heard some people call it the sigma value). I want to write a Java function that takes this int array and outputs the lower and upper boundaries given a desired SD and the max value. What I have so far is not much:
public static void getIntervalMinMax(int [] input){
int max = 0;
for(int i=0; i<input.length; i++){
if(input[i]>max){
max = input[i];
}
}
int deviation = ??;
System.out.println("MIN: "+(max-deviation));
System.out.println("MAX: "+(max+deviation));
}
I have checked this post, but I have not been able to find a function in this library that gives the SD of a distribution. Can you please help me figure out how to calculate deviation? Thanks
Upvotes: 0
Views: 2244
Reputation: 38751
Standard deviation is represented by the Greek letter sigma, so "sigma value" is just another name for the same thing. Standard deviation is the square root of the variance (also called sigma squared). The variance is defined in terms of the average of the distribution:
variance = sum( x ^ 2 for each x in data ) / n - average( data ) ^ 2
I'd create a class called Distribution like so:
public class Distribution {
    private final double[] data;
    private double max = Double.NaN;
    private double min = Double.NaN;
    private double variance = Double.NaN;
    private double average = Double.NaN;

    public Distribution( double[] data ) {
        this.data = data;
    }

    public double getMax() {
        // NaN never compares equal to anything (not even itself), so use Double.isNaN().
        if( Double.isNaN( max ) ) {
            calculateStats();
        }
        return max;
    }
    // each method getMin, getAverage, getVariance, etc would be written the same way as getMax().

    private void calculateStats() {
        min = Double.POSITIVE_INFINITY;
        // Double.MIN_VALUE is the smallest positive double, so it cannot seed a maximum search.
        max = Double.NEGATIVE_INFINITY;
        average = 0;
        variance = 0;
        for( int i = 0; i < data.length; i++ ) {
            double sample = data[i];
            if( sample > max ) max = sample;
            if( sample < min ) min = sample;
            average += sample;            // running sum of the samples
            variance += sample * sample;  // running sum of the squared samples
        }
        average = average / data.length;
        variance = variance / data.length - average * average;
    }

    public double getStandardDeviation() {
        if( Double.isNaN( variance ) ) {
            calculateStats();
        }
        return Math.sqrt( variance );
    }
}
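The class above computes the SD of the array values themselves. If the array in the question is instead read as a histogram (index = x position, value = count at that position), which is an assumption about the intent, the SD along the x axis would be weighted by the counts. A minimal sketch under that assumption follows; `HistogramStats`, `meanAndStdDev` and `indexOfMax` are hypothetical names used only for illustration.

```java
final class HistogramStats {

    /** Returns {mean, standardDeviation} of the index positions, weighted by the counts. */
    static double[] meanAndStdDev(int[] counts) {
        long total = 0;
        double weightedSum = 0;
        for (int i = 0; i < counts.length; i++) {
            total += counts[i];
            weightedSum += (double) i * counts[i];
        }
        if (total == 0) {
            throw new IllegalArgumentException("histogram contains no samples");
        }
        double mean = weightedSum / total;

        // Second pass: count-weighted squared distance of each index from the mean.
        double squaredDeviations = 0;
        for (int i = 0; i < counts.length; i++) {
            double d = i - mean;
            squaredDeviations += counts[i] * d * d;
        }
        // Population form (divide by the total count), matching the formula above.
        return new double[] { mean, Math.sqrt(squaredDeviations / total) };
    }

    /** Returns the index holding the largest count (the first one if there are ties). */
    static int indexOfMax(int[] counts) {
        int best = 0;
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] > counts[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

With these, an interval of k standard deviations around the peak could be taken as `indexOfMax(input) - k * sd` to `indexOfMax(input) + k * sd`. Note that the SD is measured around the mean, not the peak, so for a skewed histogram the two centers will differ.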
Upvotes: 3
Reputation:
You can calculate the SD as the square root of the empirical sample variance, which is:
s² = 1/(n-1) * sum_{i=1}^n (y_i - mean(y))^2
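As a rough illustration of that formula, here is a small Java sketch that computes the mean first and then divides the summed squared deviations by n - 1. The class and method names (`SampleStdDev`, `sampleStandardDeviation`) are made up for this example, and it assumes the samples are passed in as a `double[]`.

```java
final class SampleStdDev {

    /** Returns the sample standard deviation: sqrt( 1/(n-1) * sum (y_i - mean(y))^2 ). */
    static double sampleStandardDeviation(double[] y) {
        int n = y.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }

        // First pass: the mean of the samples.
        double sum = 0;
        for (double v : y) {
            sum += v;
        }
        double mean = sum / n;

        // Second pass: the sum of squared deviations from the mean.
        double squaredDeviations = 0;
        for (double v : y) {
            double d = v - mean;
            squaredDeviations += d * d;
        }

        // Dividing by n - 1 rather than n gives the empirical sample variance.
        return Math.sqrt(squaredDeviations / (n - 1));
    }
}
```

Calling `SampleStdDev.sampleStandardDeviation(values)` on a `double[]` returns s. For large n the result is very close to the divide-by-n version.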
EDIT: This density is actually NOT bell-shaped, if by that you mean a Gaussian distribution!
Upvotes: 1