Daniela Rodriguez
Daniela Rodriguez

Reputation: 21

Ruby array arranged in ascending order

Hi I'm making a simple function to generate a float array in ascending order but i want to know if there is a simpler way to do this. I'll be using this to create a table so i can generate random math problems to solve them by the Lagrange numeric method.

The lagrange method uses X, F(X) and X0 where X0 has to be in range of X, if not you can't solve the problem

So far this is my code

xn = []
#i is a random number selected from a list
i.times do
    value=rand(0.1..15.1)
    xn<<value 
end
#ordena el arreglo de manera ascendente
xn.sort!

Edit: updated the code

Upvotes: 0

Views: 974

Answers (2)

pjs
pjs

Reputation: 19855

As Sid pointed out, when you ask "if there is a simpler way to do this" a great deal depends on context. If by "simpler" you're looking for efficiency, the answer isn't just how many lines of code there are.

If you really want to see all of the order statistics as a set, converting your approach to a one-liner using Ruby's block constructor capability is arguably more readable, and timings from the implementation provided below show it's slightly faster, but only by a few percent.

However, if you have a very large N and are processing the values sequentially, you might prefer a generator approach. The one implemented below as ordered_random_generator is O(1) computation and storage per element, making it O(N) to generate the entire set but O(1) storage if you discard each element after using it. If you store the elements, in practice this is slower than the sort-based approach due to the high cost of evaluating kth roots in the calculation.

Another possibility is that you aren't actually interested in the whole set of values, but are using this to get particular quantiles or order statistics, e.g., the 10th out of 100 ordered elements. In that case you can directly generate a random value with the correct distribution of the kth order statistic out of N in O(1) time with O(1) storage.

Here's an implementation of the various options, showing timings for the three approaches which generate an entire array and showing the distributional validity of the fourth approach.

# Your original.
# O(N log N) time due to sort, O(N) storage.
def ordered_random1(n, range_spec = 1.0..n)
  ary = []
  n.times do
    value = rand(range_spec)
    ary << value
  end
  ary.sort!
end

# Same sort-based algorithm using Ruby's Array constructor with a block.
# O(N log N) time due to sort, O(N) storage.
def ordered_random2(n, range_spec = 1.0..n)
  Array.new(n) { rand(range_spec) }.sort!
end

# Generator which uses distributional properties to generate the minimum of
# k uniforms from k = N down to 1, scaling the result down to the remaining
# sub-range of the original range.
# O(1) time per element, O(1) storage.  However, computation time has a very
# large constant due to the transcendental function evaluation for kth root.
def ordered_random_generator(n, range_spec = 1.0..n)
  x = range_spec.first
  upper = range_spec.last
  Enumerator.new do |yielder|
    n.times do |i|
      range = upper - x
      u_min = 1.0 - rand ** (1.0 / (n - i))
      x += range * u_min
      yielder.yield x
    end
  end
end

# Use generator to fill array of size N.
# O(N) time and storage.
def ordered_random3(n, range_spec = 1.0..n)
  gen = ordered_random_generator(n, range_spec)
  Array.new(n) { gen.next }
end

require 'random_variates'   # 'gem install random_variates' to get from rubygems

# Use distributional properties of uniform order statistics to directly
# generate instances of the kth of N values.
# O(1) time, O(1) storage.
def kth_of_n_generator(k:, n:, range_spec: 0.0..1.0)
  # Uniform order stats have a beta distribution. Beta is a ratio of Gammas.
  x = Gamma.new(alpha: k).next
  y = Gamma.new(alpha: n - k + 1).next
  beta = x / (x + y)
  (range_spec.last - range_spec.first) * beta + range_spec.first
end

# Time for Demos!
my_range = 0.1..15.1
puts "SAMPLE OUTPUT FOR RANGE = #{my_range}:"
puts " original: #{ordered_random1(5, my_range)}"
puts "one-liner: #{ordered_random2(5, my_range)}"
puts "generator: #{ordered_random3(5, my_range)}"
puts "direct generation of min & max using kth_of_n_generator: #{
  kth_of_n_generator(k: 1, n: 5, range_spec: my_range)
}, #{
  kth_of_n_generator(k: 5, n: 5, range_spec: my_range)
}"

REPS = 10_000
n = 9
puts "\nDEMO DISTRIBUTIONAL CORRECTNESS OF SINGLETON GENERATOR (range = 0.0..1.0)"
(1..n).each do |k|
  total = Array.new(REPS) { kth_of_n_generator(k: k, n: n) }.inject(:+)
  quantile = k.to_f / (n + 1)
  suffix = case k
    when 1
      "st"
    when 2
      "nd"
    when 3
      "rd"
    else
      "th"
  end
  print "Average of #{REPS} values of #{k}#{suffix} of #{n}: #{total / REPS} "
  puts "[Expected value is #{quantile}]"
end

require 'benchmark/ips'
[100, 10_000].each do |n|
  puts "\nBENCHMARKING ARRAYS OF SIZE #{n}"
  Benchmark.ips do |b|
    b.report(' original:') { ordered_random1(n, my_range) }
    b.report('one-liner:') { ordered_random2(n, my_range) }
    b.report('generator:') { ordered_random3(n, my_range) }
    b.compare!
  end
end

Here's an example of the output on my machine. Your timings will vary based on your hardware, OS, and the version of Ruby you're running. The specific values will vary from run to run due to randomness, but are consistent.

SAMPLE OUTPUT FOR RANGE = 0.1..15.1:
 original: [3.2143763318277223, 3.424117583339602, 4.98763316107166, 7.67915049946293, 13.002051529711663]
one-liner: [3.698584735327408, 3.7940473868424713, 8.133265097991108, 10.797493427133121, 13.519291528088747]
generator: [1.379949057529254, 3.330310564043854, 14.175279996588, 14.187770450655005, 14.747374304212487]
direct generation of min & max using kth_of_n_generator: 2.3844682728553956, 14.093371351681753

DEMO DISTRIBUTIONAL CORRECTNESS OF SINGLETON GENERATOR (range = 0.0..1.0)
Average of 10000 values of 1st of 9: 0.10061353514079374 [Expected value is 0.1]
Average of 10000 values of 2nd of 9: 0.19841217568287062 [Expected value is 0.2]
Average of 10000 values of 3rd of 9: 0.3018753486695847 [Expected value is 0.3]
Average of 10000 values of 4th of 9: 0.40002514960574265 [Expected value is 0.4]
Average of 10000 values of 5th of 9: 0.5003591617651723 [Expected value is 0.5]
Average of 10000 values of 6th of 9: 0.5974291957317844 [Expected value is 0.6]
Average of 10000 values of 7th of 9: 0.6980418879340753 [Expected value is 0.7]
Average of 10000 values of 8th of 9: 0.8012294219961899 [Expected value is 0.8]
Average of 10000 values of 9th of 9: 0.9002379495094114 [Expected value is 0.9]

BENCHMARKING ARRAYS OF SIZE 100
Warming up --------------------------------------
           original:     4.037k i/100ms
          one-liner:     4.242k i/100ms
          generator:   773.000  i/100ms
Calculating -------------------------------------
           original:     40.412k (± 2.0%) i/s -    205.887k in   5.096825s
          one-liner:     41.852k (± 2.3%) i/s -    212.100k in   5.070662s
          generator:      7.676k (± 4.2%) i/s -     38.650k in   5.045488s

Comparison:
          one-liner::    41852.1 i/s
           original::    40412.3 i/s - same-ish: difference falls within error
          generator::     7675.6 i/s - 5.45x  slower


BENCHMARKING ARRAYS OF SIZE 10000
Warming up --------------------------------------
           original:    29.000  i/100ms
          one-liner:    30.000  i/100ms
          generator:     7.000  i/100ms
Calculating -------------------------------------
           original:    295.387  (± 2.0%) i/s -      1.479k in   5.009243s
          one-liner:    304.406  (± 2.0%) i/s -      1.530k in   5.028485s
          generator:     78.104  (± 2.6%) i/s -    392.000  in   5.020934s

Comparison:
          one-liner::      304.4 i/s
           original::      295.4 i/s - same-ish: difference falls within error
          generator::       78.1 i/s - 3.90x  slower

Note that the generator approach is slower than the two sort-based approaches for both array sizes tested here. The gap is closing for larger array sizes due to the asymptotic behavior of O(N) vs O(N log N), but probably not enough to be of interest if your primary focus is speed.

Upvotes: 1

Sid
Sid

Reputation: 4997

The code itself looks fine and serves your purpose. Is there a context around where will this be used? Because 'simplicity' may be driven by the context of the code.

For example,

  1. You can make the quantity of random numbers needed and the range of random numbers configurable. You can configure externally the order of sorting.

  2. You can encapsulate this in a utility class and expose it like an API to other classes.

  3. If there are a million random sorted numbers required, do you want the numbers streaming? If so there are Ruby libraries.

... and many more. The context will be useful. Hope this helps.

Upvotes: 0

Related Questions