R Programming issue intervals

Question

I'm trying to find a formula that averages the lower and upper bounds of each interval.

x <- sample(10:40, 100, replace = TRUE)
factorx <- factor(cut(x, breaks = nclass.Sturges(x)))
xout <- as.data.frame(table(factorx))
xout <- transform(xout, cumFreq = cumsum(Freq), relative = prop.table(Freq))

Running the above code in R, I get the following:

xout
      factorx Freq cumFreq relative
1 (9.97,13.8]   14      14     0.14
2 (13.8,17.5]   13      27     0.13
3 (17.5,21.2]   16      43     0.16
4   (21.2,25]    5      48     0.05
5   (25,28.8]   11      59     0.11
6 (28.8,32.5]    8      67     0.08
7 (32.5,36.2]   16      83     0.16
8   (36.2,40]   17     100     0.17

What I want to know is whether there is a way to calculate the midpoint of each interval. For the first interval, for example, it would be:

(13.8 + 9.97)/2

I believe this is called the class midpoint in statistics.

Accepted Answer

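# One possible solution is to split each label on "(", "," and "]" (xout is your data frame)

x1 <- strsplit(as.character(xout$factorx), ",|\\(|]")  # each label becomes c("", lower, upper)
x2 <- do.call(rbind, x1)                                # stack the pieces into a character matrix
xout$higher <- as.numeric(x2[, 3])                      # upper bound of each interval
xout$lower  <- as.numeric(x2[, 2])                      # lower bound of each interval
xout$aver   <- rowMeans(xout[, c("lower", "higher")])   # class midpoint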

> head(xout,3)
      factorx Freq cumFreq relative higher lower   aver
1 (9.97,13.7]   15      15     0.15   13.7  9.97 11.835
2 (13.7,17.5]   14      29     0.14   17.5 13.70 15.600
3 (17.5,21.2]   12      41     0.12   21.2 17.50 19.350
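
Note that the bounds recovered this way are the rounded values printed in the labels, so the midpoints are approximate. If that matters, one alternative (not the approach above, just a sketch) is to build the break points yourself, pass them to cut(), and take the midpoints from the numeric breaks. The seed, the 0.5 padding, and the names brks, fx, mids and xout2 are assumptions for illustration only.

```r
set.seed(42)                                  # arbitrary seed, only for reproducibility
x <- sample(10:40, 100, replace = TRUE)

k <- nclass.Sturges(x)                        # number of classes, as in the question

# Assumed explicit, equal-width break points; the 0.5 padding keeps min(x)
# inside the first (a, b] interval.
brks <- seq(min(x) - 0.5, max(x) + 0.5, length.out = k + 1)
fx   <- cut(x, breaks = brks)

# Midpoint of each class, computed from the unrounded breaks
mids <- (head(brks, -1) + tail(brks, -1)) / 2

xout2 <- data.frame(factorx = levels(fx),
                    Freq    = as.vector(table(fx)),
                    mid     = mids)
xout2 <- transform(xout2, cumFreq = cumsum(Freq), relative = prop.table(Freq))
```

Because the breaks are chosen by hand here, the intervals will not be identical to the ones cut() picks when given only a number of classes.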
