Wang

Reputation: 1460

What is the algorithm that R uses in data frame filtering?

For example, I have a data frame, and I want to subset it according to specific conditions:

df[df$gender == "woman" & df$age > 40, ]

What is the algorithm behind this filtering in R?

Upvotes: 0

Views: 61

Answers (1)

Artem

Reputation: 3414

When R evaluates df[df$gender == "woman" & df$age > 40, ], the following happens:

  1. df$gender is extracted.
  2. df$gender == "woman" is evaluated and a logical vector is returned.
  3. df$age is extracted.
  4. df$age > 40 is evaluated and a logical vector is returned.
  5. An element-wise logical AND is applied to the vectors from Step 2 and Step 4.
  6. The rows of df whose flag from Step 5 is TRUE are extracted.
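The steps above can be reproduced one at a time at the R console. This is only a minimal sketch: the sample data frame and the intermediate variable names (g, is_woman, a, over_40, keep, result) are made up for illustration, and only the gender and age columns come from the question.

```r
# Hypothetical sample data; only the column names come from the question.
df <- data.frame(
  gender = c("woman", "man", "woman", "woman"),
  age    = c(45, 50, 30, 62),
  stringsAsFactors = FALSE
)

g        <- df$gender           # Step 1: extract the gender column
is_woman <- g == "woman"        # Step 2: logical vector
a        <- df$age              # Step 3: extract the age column
over_40  <- a > 40              # Step 4: logical vector
keep     <- is_woman & over_40  # Step 5: element-wise AND
result   <- df[keep, ]          # Step 6: keep the rows flagged TRUE

# The step-by-step result matches the one-line form from the question.
identical(result, df[df$gender == "woman" & df$age > 40, ])
```

Note that if either column contains NA, the logical vector can contain NA as well, and indexing with it yields rows filled with NA rather than dropping them.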

In all of the above steps, back-end C/C++ functions are called. For example, the [ subset operator calls do_subset in subset.c.

You can study the mapping between R functions and their C back ends in names.c.
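As a quick check from the console, the following sketch shows that the base [ operator is a primitive (implemented in C), while the method used for data frames is ordinary R code that in turn subsets the underlying columns. The variable names subset_op and df_method are only illustrative.

```r
# The base subset operator is a primitive, so its body lives in C.
subset_op <- `[`
is.primitive(subset_op)   # TRUE
typeof(subset_op)         # "special"

# The data-frame method is a closure written in R; printing it shows its source.
df_method <- `[.data.frame`
is.primitive(df_method)   # FALSE
typeof(df_method)         # "closure"
```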

For further details, see Accessing R Source.

Upvotes: 3
