Qwerty

Reputation: 909

How to extract particular rows from a data frame in Julia?

I want to extract the 3rd and 7th row of a data frame in Julia. The MWE is:

using DataFrames
my_data = DataFrame(A = 1:10, B = 16:25);
my_data

10×2 DataFrame
│ Row │ A     │ B     │
│     │ Int64 │ Int64 │
├─────┼───────┼───────┤
│ 1   │ 1     │ 16    │
│ 2   │ 2     │ 17    │
│ 3   │ 3     │ 18    │
│ 4   │ 4     │ 19    │
│ 5   │ 5     │ 20    │
│ 6   │ 6     │ 21    │
│ 7   │ 7     │ 22    │
│ 8   │ 8     │ 23    │
│ 9   │ 9     │ 24    │
│ 10  │ 10    │ 25    │

Upvotes: 4

Views: 1160

Answers (2)

Przemyslaw Szufel

Reputation: 42204

A nice feature of Julia is that you do not need to materialize the result, which saves the memory and time spent copying the data. So if you need a subrange of any array-like structure, it is usually better to use @view than to materialize it directly:

julia> @view my_data[[3, 7], :]
2×2 SubDataFrame
│ Row │ A     │ B     │
│     │ Int64 │ Int64 │
├─────┼───────┼───────┤
│ 1   │ 3     │ 18    │
│ 2   │ 7     │ 22    │

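Now for a performance test:

using Statistics  # provides mean

# Materializes rows 3 and 7 into a new DataFrame before averaging.
function submean1(df)
    d = df[[3, 7], :]
    mean(d.A)
end

# Averages over a view of rows 3 and 7 without copying them.
function submean2(df)
    d = @view df[[3, 7], :]
    mean(d.A)
end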

And the timings:

julia> using BenchmarkTools

julia> @btime submean1($my_data)
  689.262 ns (19 allocations: 1.38 KiB)
5.0

julia> @btime submean2($my_data)
  582.315 ns (9 allocations: 288 bytes)
5.0

Even in this simplistic example, @view is about 15% faster (582 ns vs. 689 ns) and allocates nearly five times less memory (288 bytes vs. 1.38 KiB). Of course, sometimes you do want a copy of the data, but the rule of thumb is not to materialize.
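To make the copy-versus-view distinction concrete, here is a minimal sketch using the same my_data as in the question. The variable names copied, viewed and materialized are only illustrative. It assumes a DataFrames.jl version in which a SubDataFrame supports element assignment and copy on a view returns a DataFrame.

```julia
using DataFrames

my_data = DataFrame(A = 1:10, B = 16:25)

# Plain indexing copies rows 3 and 7 into a new DataFrame.
copied = my_data[[3, 7], :]

# @view builds a SubDataFrame that refers to the parent's rows.
viewed = @view my_data[[3, 7], :]

# Writing to the copy leaves the parent untouched.
copied[1, :A] = -1
println(my_data[3, :A])   # still 3

# Writing through the view changes the parent.
viewed[1, :A] = 100
println(my_data[3, :A])   # now 100

# copy materializes the view when an independent DataFrame is needed.
materialized = copy(viewed)
println(typeof(materialized))
```

So a view is cheap, but it is also tied to its parent: if you plan to modify the selected rows independently, take a copy instead.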

Upvotes: 2

Qwerty

Reputation: 909

This should give you the expected output:

using DataFrames
my_data = DataFrame(A = 1:10, B = 16:25);
my_data;
my_data[[3, 7], :]

2×2 DataFrame
│ Row │ A     │ B     │
│     │ Int64 │ Int64 │
├─────┼───────┼───────┤
│ 1   │ 3     │ 18    │
│ 2   │ 7     │ 22    │

Upvotes: 4
