BuddhiLW

Reputation: 627

How to rewrite this deprecated "by ... do" expression using "groupby" (Julia)

The goal is to generate fake data.

First, we define a set of parameters:

using DataFrames, Distributions

## Simulated data
df_3 = DataFrame(y = [0, 1], size = [250, 250], x1 = [2.0, 0.0], x2 = [-1.0, -2.0])

Now I want to generate the fake data itself:

df_knn = by(df_3, :y) do df
  DataFrame(x_1 = rand(Normal(df[1, :x1], 1), df[1, :size]),
            x_2 = rand(Normal(df[1, :x2], 1), df[1, :size]))
end

How can I replace by with groupby here?

SOURCE: This excerpt is from the book, Data Science with Julia (2019).

Upvotes: 2

Views: 57

Answers (1)

Przemyslaw Szufel

Reputation: 42194

I think this is what you want: pass the result of groupby to combine, which applies the do block to each group and stacks the results.

julia> combine(groupby(df_3, :y)) do df
         DataFrame(x_1 = rand(Normal(df[1,:x1],1), df[1,:size]), 
                   x_2 = rand(Normal(df[1,:x2],1), df[1,:size]))
       end
500×3 DataFrame
 Row │ y      x_1        x_2
     │ Int64  Float64    Float64
─────┼─────────────────────────────
   1 │     0   1.88483    0.890807
   2 │     0   2.50124   -0.280708
   3 │     0   1.1857     0.823002
  ⋮  │   ⋮        ⋮          ⋮
 498 │     1  -0.611168  -0.856527
 499 │     1   0.491412  -3.09562
 500 │     1   0.242016  -1.42652
                   494 rows omitted
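
For reference, here is a self-contained sketch of the same pattern that can be pasted into a fresh session. The seed value, the argument name sdf, and the use of first(...) in place of df[1, :col] are illustrative choices of mine, not something taken from the book.

```julia
using DataFrames, Distributions, Random

Random.seed!(1234)  # arbitrary seed (my assumption), only so repeated runs give the same draws

# One row of parameters per class: label, sample size, and the two means
df_3 = DataFrame(y = [0, 1], size = [250, 250], x1 = [2.0, 0.0], x2 = [-1.0, -2.0])

# groupby splits df_3 by :y; combine runs the do block on each sub-data-frame
# and stacks the returned data frames, keeping the :y key column
df_knn = combine(groupby(df_3, :y)) do sdf
    n = first(sdf.size)  # each group has a single parameter row
    DataFrame(x_1 = rand(Normal(first(sdf.x1), 1), n),
              x_2 = rand(Normal(first(sdf.x2), 1), n))
end
```

Each group contributes as many rows as its size entry, so df_knn has 500 rows and the columns y, x_1 and x_2.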

Upvotes: 2
