loard

Reputation: 113

Mutate/replace in one go

MAJOR EDIT

Consider a simple data frame:

    df = data.frame(obs.no = 1:10, conc = rnorm(10))
    discard.obs.no = 1:5

I want this:

    df[df$obs.no %in% discard.obs.no,"conc"] = 2^df[df$obs.no %in% discard.obs.no,"conc"]

to be done using a helper function like this:

    change(df[df$obs.no %in% discard.obs.no,"conc"], function(x) 2^x)

Essentially, I want to avoid retyping the LHS on the RHS of the assignment operator. Why? Because the whole thing becomes unwieldy with complicated filtering.

As the example suggests, the function should change only the filtered data, not return the subset. It should also happen in the background i.e. without reassignment to the original data.frame.

Mutate/transform/within etc. do not do the job, since they return a modified copy (which just prints to the console), necessitating reassignment. Assign does not take parts of data.frames as an argument. The whole thing is a bit of a vanity project, but I'm sure there's a wiz out there who can do it (:

BONUS: try writing a parser that would shorten it even further to:

    change(2^df[df$obs.no %in% 1:5,"conc"])

I.e. figure out which part is the object to be reassigned - left/right of $ or left of [ and between [].

Upvotes: 1

Views: 655

Answers (2)

Thomas

Reputation: 44525

What you're asking for is not directly supported in base R. Or rather, it could be done, but you're asking for pass-by-reference semantics, which go against R's core "functional" programming style. Achieving it will require some hackery.
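As a rough illustration of the kind of hackery involved, here is a minimal base R sketch. The name change comes from the question; the implementation is only an assumption (non-standard evaluation with substitute() and eval() in the caller's frame), not an established idiom, and it only works when the first argument is an expression that could legally appear on the left of an assignment.

```r
# Hypothetical helper (name taken from the question, implementation assumed):
# applies `f` to the expression passed as `target` and writes the result
# back in the caller's environment.
change <- function(target, f) {
  target_expr <- substitute(target)        # unevaluated, e.g. df[rows, "conc"]
  env <- parent.frame()                    # where df actually lives
  new_value <- f(eval(target_expr, env))   # current values, transformed
  # Build the call `target_expr <- new_value` and run it in the caller's frame
  eval(call("<-", target_expr, new_value), env)
  invisible(NULL)
}

set.seed(1)
df <- data.frame(obs.no = 1:10, conc = rnorm(10))
discard.obs.no <- 1:5
before <- df$conc                          # kept only to compare afterwards

change(df[df$obs.no %in% discard.obs.no, "conc"], function(x) 2^x)
```

After the call, only rows 1 to 5 of df$conc have changed and no reassignment was typed. The cost is that the function now has a side effect on its caller, which is exactly what R's usual semantics try to prevent.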

One way to get this is with data.table, whose := operator updates a table by reference:

    set.seed(1)
    library("data.table")
    discard.obs.no <- 1:5  # as defined in the question
    dt <- data.table(obs.no = 1:10, conc = rnorm(10))
    dt[obs.no %in% discard.obs.no, conc2 := 2^conc]
    dt
        obs.no       conc     conc2
     1:      1 -0.6264538 0.6477667
     2:      2  0.1836433 1.1357484
     3:      3 -0.8356286 0.5603388
     4:      4  1.5952808 3.0215332
     5:      5  0.3295078 1.2565846
     6:      6 -0.8204684        NA
     7:      7  0.4874291        NA
     8:      8  0.7383247        NA
     9:      9  0.5757814        NA
    10:     10 -0.3053884        NA

I show conc2 := 2^conc here as an example; you could also store the result back into the conc variable itself using analogous notation (conc := 2^conc).
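For completeness, here is a small sketch of that in-place variant, assuming the same discard.obs.no as in the question. The variable original exists only for the comparison afterwards; copy() is used because := modifies the column by reference, so a plain dt$conc could end up pointing at the modified values.

```r
library(data.table)

set.seed(1)
discard.obs.no <- 1:5
dt <- data.table(obs.no = 1:10, conc = rnorm(10))

# Take a real copy of the column before it is modified by reference
original <- copy(dt$conc)

# Overwrite conc for the selected rows only; no `dt <- ...` is needed
dt[obs.no %in% discard.obs.no, conc := 2^conc]
```

Rows outside discard.obs.no keep their old conc values, and the filter expression is typed only once.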

Upvotes: 2

Jase_

Reputation: 1196

I'm not entirely sure what you are after, but I think the dplyr package will do what you want. In the example below the select command is not needed, but you mention the column corr in your question, so I thought it might help give you an idea of what you could do.

    # Load the dplyr package
    library(dplyr)
    # create an index of values to discard
    discard.obs.no <- 1:5
    df <- data.frame(conc = rnorm(10), obs.no = 1:10)
    modified <- df %>%
        # Select the columns you want to use by names
        select(obs.no, conc) %>%
        # use a logical statement to subset the rows you want to use
        filter(!(obs.no %in% discard.obs.no)) %>%
        # Provide a function to manipulate the data
        mutate(changed_conc = 2^conc)
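Note that filter() in the pipeline above drops the rows in discard.obs.no rather than changing them. If the aim is to transform conc only for those rows while keeping all ten, one possible variant uses if_else() inside mutate(). This is a sketch that assumes reassigning df is acceptable, which the question wanted to avoid; the variable before is only there for comparison.

```r
library(dplyr)

set.seed(1)
discard.obs.no <- 1:5
df <- data.frame(obs.no = 1:10, conc = rnorm(10))
before <- df$conc

# Transform conc where the condition holds, leave it unchanged elsewhere
df <- df %>%
    mutate(conc = if_else(obs.no %in% discard.obs.no, 2^conc, conc))
```

The condition is written once, but the result still has to be assigned back to df because mutate() returns a new data frame.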

Upvotes: 0
