Anton Degterev

Reputation: 611

Most elegant way to get an array with all combinations of given elements

What is currently the most elegant way to get an array with named columns containing all combinations of given elements, i.e. the closest analogue of R's expand.grid() function? All the discussions I have found on this issue either address a slightly different formulation of the problem or date from very old versions of Julia. The most reasonable solution I have found so far looks like this:

using DataFrames
xy = [[x,y] for x in 0:0.25:1 for y in 0:0.25:1]
xy_array = permutedims(reshape(hcat(xy...), (length(xy[1]), length(xy))))
df = DataFrame(x = xy_array[:,1], y = xy_array[:,2])

A similar expression in R can be written much more compactly:

xy_comb <- expand.grid(x=seq(0, 1, 0.25), y=seq(0, 1, 0.25))

Is there a more concise way to write the same expression in Julia?

Upvotes: 4

Views: 817

Answers (1)

Bogumił Kamiński

Reputation: 69939

Let me show several options, which also illustrate how DataFrames.jl integrates with Tables.jl and how that integration helps in this case.

For example, you can create an empty data frame and push the rows into it one at a time:

julia> using DataFrames

julia> df = DataFrame(x=Float64[], y=Float64[])
0×2 DataFrame

julia> foreach(x -> push!(df, x), Iterators.product(0:0.25:1, 0:0.25:1))

julia> df
25×2 DataFrame
 Row │ x        y
     │ Float64  Float64
─────┼──────────────────
   1 │    0.0      0.0
   2 │    0.25     0.0
   3 │    0.5      0.0
  ⋮  │    ⋮        ⋮
  23 │    0.5      1.0
  24 │    0.75     1.0
  25 │    1.0      1.0
         19 rows omitted

Or you can do this, which is faster (although the former pattern is useful in general when you generate the rows of a data frame dynamically):

julia> rename!(DataFrame(Iterators.product(0:0.25:1, 0:0.25:1)), [:x, :y])
25×2 DataFrame
 Row │ x        y
     │ Float64  Float64
─────┼──────────────────
   1 │    0.0      0.0
   2 │    0.25     0.0
   3 │    0.5      0.0
  ⋮  │    ⋮        ⋮
  23 │    0.5      1.0
  24 │    0.75     1.0
  25 │    1.0      1.0
         19 rows omitted

The problem with DataFrame(Iterators.product(0:0.25:1, 0:0.25:1)) on its own is that the default column names are "1" and "2", which you probably want to change; that is why the call above is wrapped in rename!.

Therefore you could also generate NamedTuples instead of Tuples like this:

julia> DataFrame((x=x,y=y) for x in 0:0.25:1 for y in 0:0.25:1)
25×2 DataFrame
 Row │ x        y
     │ Float64  Float64
─────┼──────────────────
   1 │    0.0      0.0
   2 │    0.0      0.25
   3 │    0.0      0.5
  ⋮  │    ⋮        ⋮
  23 │    1.0      0.5
  24 │    1.0      0.75
  25 │    1.0      1.0
         19 rows omitted

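Note that the rows come in a different order: here y varies fastest because it is the inner loop, whereas with Iterators.product the first argument (x) varies fastest. Swapping the two for clauses gives the same order as before.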
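
If you need this often, the NamedTuple idea can be wrapped in a small helper that takes the columns as keyword arguments, much like expand.grid() in R. The following is only a minimal sketch in plain Julia: the name expand_grid is hypothetical (it is not provided by Base or by DataFrames.jl), and it assumes every keyword argument is an iterable of values.

```julia
# Hypothetical helper; `expand_grid` is not part of Base Julia or DataFrames.jl.
function expand_grid(; kwargs...)
    cols = values(kwargs)     # NamedTuple holding the keyword arguments
    colnames = keys(cols)     # column names as a tuple of Symbols
    # Iterators.product yields plain Tuples; wrap each one in a NamedTuple
    # so that the column names travel with the values.
    rows = [NamedTuple{colnames}(t) for t in Iterators.product(values(cols)...)]
    return vec(rows)          # flatten column-major: the first argument varies fastest
end

grid = expand_grid(x = 0:0.25:1, y = 0:0.25:1)
```

The result is a vector of NamedTuples, which is a Tables.jl row table, so DataFrame(grid) should give a data frame with columns x and y without any renaming. The row order matches the Iterators.product examples above.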

Upvotes: 5
