Akshay Jangra

Reputation: 31

most occurring value in a vector

I have a vector with 1000 values, all generated with a random-number function between 0 and 1. The example below uses 100 values.

x <- runif(100,min=0,max=1)
x
  [1] 0.84620011 0.82525410 0.31622827 0.08040362 0.12894525 0.23997187 0.57177296 0.91691368 0.65751720
 [10] 0.39810175 0.60632205 0.26339035 0.93543618 0.09662383 0.35147739 0.51731042 0.29151612 0.54411769
 [19] 0.73688309 0.26086586 0.37808273 0.19163366 0.62776847 0.70973345 0.31802726 0.69101574 0.50042561
 [28] 0.20768256 0.23555818 0.21015820 0.18221151 0.85593725 0.12916935 0.52222127 0.62269135 0.51267707
 [37] 0.60164023 0.30723904 0.81990231 0.61771762 0.02502631 0.47427724 0.21250040 0.88611710 0.88648546
 [46] 0.92586513 0.57015942 0.33454379 0.03572245 0.68120369 0.48692522 0.76587764 0.55214917 0.31137200
 [55] 0.47170307 0.48639510 0.68922858 0.73506033 0.23541740 0.81793240 0.17184666 0.06670039 0.55664270
 [64] 0.10030533 0.94620061 0.58572228 0.53333567 0.80887841 0.55015406 0.82491114 0.81251132 0.06038019
 [73] 0.10918904 0.84011824 0.33169617 0.03568364 0.07703029 0.15601158 0.31623253 0.25021777 0.77024833
 [82] 0.88588620 0.49044305 0.10165930 0.55494697 0.17455070 0.94458467 0.43135868 0.99313733 0.04482747
 [91] 0.53453604 0.52500493 0.35496966 0.06994880 0.11377845 0.71307042 0.35086237 0.04032254 0.23744845
[100] 0.81131033

Out of all these values in the vector, I need to find the most frequently occurring value (or something close to it). I'm new to R and have no idea how to do this. Please help.

One approach I have in mind: divide the values into ranges and compute the frequency distribution. Would that be helpful?

Upvotes: 2

Views: 105

Answers (3)

MS Berends

Reputation: 5209

To get just the most frequent value, or when the input is discrete data, you can create a table, sort it, and return the name of the entry with the highest count:

values <- c("a", "a", "c", "c", "c")

names(sort(table(values), decreasing = TRUE)[1])
#> [1] "c"

Breaking it down:

# create a table of the values
table(values)
#> a c 
#> 2 3

# sort the table descending on number of occurrences
sort(table(values), decreasing = TRUE)
#> c a 
#> 3 2

# now only keep the first value
sort(table(values), decreasing = TRUE)[1]
#> c 
#> 3

# so the final line:
names(sort(table(values), decreasing = TRUE)[1])
#> [1] "c"

If you want to get fancy, wrap this in a function:

get_mode <- function(x) {
  # use the argument x, not the global variable `values`
  names(sort(table(x), decreasing = TRUE)[1])
}

get_mode(values)
#> [1] "c"
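For continuous data such as the runif() output in the question, almost every value is unique, so a plain table has no clear winner. One possible workaround, sketched here as an assumption and not as part of the approach above, is to round the values first and then tabulate. The helper name rounded_mode and its digits default are hypothetical:

```r
# Hypothetical helper: round continuous values, then return the most frequent one.
set.seed(1)            # arbitrary seed, only for reproducibility
x <- runif(100)

rounded_mode <- function(x, digits = 1) {
  counts <- table(round(x, digits))
  # which.max() returns the first maximum, so ties go to the smallest value
  as.numeric(names(counts)[which.max(counts)])
}

rounded_mode(x)
```

The result depends on digits: coarser rounding gives larger groups, finer rounding gives groups that are mostly of size one.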

Upvotes: 0

RHertel

Reputation: 23788

One way to analyze the distribution of the numbers is to plot a histogram and add an approximate probability density curve. This can be done with the ggplot2 library:

set.seed(123) # used here for reproducibility
x <- runif(100) # pseudo-random numbers between 0 and 1
library(ggplot2)
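# after_stat(density) replaces the deprecated ..density.. notation (ggplot2 >= 3.3.0)
p <- ggplot(as.data.frame(x), aes(x = x, y = after_stat(density))) +
  geom_histogram(fill = "lightblue", colour = "grey60", bins = 50) +
  geom_density()
p # print the plot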


The value of bins specified in geom_histogram() is the number of bars in the histogram. Try changing this value to obtain a different representation of the distribution.
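The location of the density curve's peak can also be computed without a plot. This is a minimal sketch using base R's density() with its default bandwidth; restricting the grid to 0..1 via from and to is an assumption based on the known range of runif(), and the variable name density_peak is only illustrative:

```r
set.seed(123)          # same seed as above, for reproducibility
x <- runif(100)

# Kernel density estimate, evaluated only on the known range of the data
d <- density(x, from = 0, to = 1)

# x position at which the estimated density is highest
density_peak <- d$x[which.max(d$y)]
density_peak
```

Like the number of bins, the bandwidth (argument bw) changes how smooth the curve is and can therefore move the peak.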

OR

You could use base R and plot a simple histogram:

hist(x)


There you can also change the bin width (see the breaks argument), but the default might be sufficient to show the concept.

You can identify which bin in this histogram has the most entries with

h <- hist(x) # store the histogram so it is computed (and drawn) only once
h$mids[which.max(h$counts)]
#[1] 0.45

In this case, most values occur near 0.45 (the midpoint of the bin covering the range from 0.4 to 0.5).

Hope this helps.

Upvotes: 1

Verena Praher

Reputation: 1272

You can do this:

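set.seed(12) # for reproducibility
x <- runif(100, min = 0, max = 1)
n <- length(x)
# split the range of x into n/4 = 25 equal-width intervals
x_cut <- cut(x, breaks = n / 4)
# interval(s) that contain the most values
counts <- table(x_cut)
which(counts == max(counts))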

The result depends on the breaks value you set. This is an alternative to using a histogram if you don't need one.
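To see how much the answer moves with the breaks value, you could compare a few settings. This is only a sketch: the helper name fullest_bin and the break counts 5, 10 and 25 are arbitrary choices for illustration:

```r
set.seed(12)
x <- runif(100, min = 0, max = 1)

# Hypothetical helper: label(s) of the interval(s) holding the most values
fullest_bin <- function(x, breaks) {
  counts <- table(cut(x, breaks = breaks))
  names(counts)[counts == max(counts)]
}

# Print the fullest interval(s) for several numbers of intervals
for (b in c(5, 10, 25)) {
  cat(b, "intervals:", fullest_bin(x, b), "\n")
}
```

If several intervals tie for the highest count, all of their labels are returned.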

Upvotes: 0
