R: Using function arguments to update elements in a data frame

Question

I want the elements referenced in my data frame to be replaced with the name of the argument I pass to the function. At the moment, however, they are replaced with the parameter name I used when defining the function. (I'm finding it hard to explain; hopefully my code and picture will clarify this a bit!)

Project_assign <- function(prjct) {
  Truth_vector <- is.element((giraffe[,1]),(prjct[,1]))
  giraffe[which(Truth_vector),5] <- 'prjct'
  assign('giraffe' , giraffe , envir= .GlobalEnv)
}
Project_assign(spine_hlfs)

This mostly works; however, the elements get replaced with prjct instead of spine_hlfs: https://i.sstatic.net/uuPnv.png

If I can get this to work as intended, I will next create a vector with all the project names and use lapply with this function, saving me a lot of manual work every few months. I am relatively new to R, so any explanations are much appreciated.

Maurits Evers · Accepted Answer

Sounds like a simple replace based on matching entries between a (list of) query dataframes and a subject dataframe.

Here is an example based on some simulated data.

I first simulate data for the subject dataframe:

# Sample data
giraffe <- data.frame(
    runkeys = 1:500,
    col1 = runif(500),
    col2 = runif(500),
    col3 = runif(500),
    col4 = runif(500))

I then simulate runkeys data for 2 query dataframes:

spine_hlfs <- data.frame(
    runkeys = c(44, 260, 478))
ir_dia <- data.frame(
    runkeys = c(10, 20, 30))

The query dataframes are stored in a list:

lst.runkeys <- list(
    spine_hlfs = spine_hlfs,
    ir_dia = ir_dia)
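
If the project names are kept in a character vector, as the question plans to do, the same named list can probably be built with mget() instead of typing each entry by hand. A minimal sketch, assuming the query dataframes already exist in the calling environment; project_names is a name introduced here for illustration only:

```r
# Query dataframes, as defined above
spine_hlfs <- data.frame(runkeys = c(44, 260, 478))
ir_dia <- data.frame(runkeys = c(10, 20, 30))

# Hypothetical vector of project names (not part of the original answer)
project_names <- c("spine_hlfs", "ir_dia")

# mget() looks up each name and returns a list named after the vector
lst.runkeys <- mget(project_names)
names(lst.runkeys)
```

Because the list is named, names(lst.runkeys) can later supply the label for each query dataframe.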

To flag runkeys entries present in any of the query dataframes, we can use a for loop to match runkeys entries from every query dataframe:

# This is the critical step: loop through the list of query dataframes
# and flag matching runkeys in giraffe with the name of each query dataframe
# (column 5, i.e. col4, is overwritten and becomes a character column)
for (i in seq_along(lst.runkeys)) {
    giraffe[match(lst.runkeys[[i]]$runkeys, giraffe$runkeys), 5] <- names(lst.runkeys)[i]
}
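
Note that match() returns NA for any query runkeys missing from giraffe, and a data frame does not accept NA in a subscripted assignment. One variant that sidesteps this is sketched below. It assumes the labels should go into a separate column (project, a hypothetical name) rather than overwrite col4, and it wraps the loop in a function (flag_projects, also hypothetical) that returns the updated dataframe instead of modifying a global:

```r
# Hypothetical helper: label rows of `subject` whose runkeys occur in each query
flag_projects <- function(subject, queries) {
    subject$project <- NA_character_   # new column; existing columns are untouched
    for (nm in names(queries)) {
        # Logical index, so query keys absent from `subject` are simply ignored
        hit <- subject$runkeys %in% queries[[nm]]$runkeys
        subject$project[hit] <- nm
    }
    subject   # return a modified copy instead of assigning into .GlobalEnv
}

# Small stand-in data so the block runs on its own
giraffe <- data.frame(runkeys = 1:500, col1 = runif(500))
lst.runkeys <- list(
    spine_hlfs = data.frame(runkeys = c(44, 260, 478)),
    ir_dia = data.frame(runkeys = c(10, 20, 999)))   # 999 is absent from giraffe

flagged <- flag_projects(giraffe, lst.runkeys)
flagged[!is.na(flagged$project), c("runkeys", "project")]
```

If a runkey appears in more than one query dataframe, the later list entry wins, just as in the loop above.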

This is the output of the subject dataframe after matching runkeys entries. I'm only showing rows where entries in column 5 were replaced.

giraffe[grep("(spine_hlfs|ir_dia)", giraffe[, 5]), ]
    runkeys      col1        col2      col3       col4
10       10 0.7401977 0.005703928 0.6778921     ir_dia
20       20 0.7954076 0.331462567 0.7637870     ir_dia
30       30 0.5772808 0.183716142 0.6984193     ir_dia
44       44 0.9701355 0.655736489 0.4917452 spine_hlfs
260     260 0.1893012 0.600140166 0.0390346 spine_hlfs
478     478 0.7655976 0.910946623 0.9779205 spine_hlfs
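
As for why the original function wrote prjct everywhere: 'prjct' in quotes is a literal string, not the argument. One common idiom for recovering the name of the object passed in is deparse(substitute(prjct)). Below is a minimal sketch of the question's function rewritten that way. It returns the modified dataframe rather than assigning into the global environment, and it takes the subject dataframe as an explicit second argument (subject); both are choices made here for illustration and not part of the original:

```r
Project_assign <- function(prjct, subject) {
    # Capture the name of the object supplied as `prjct`, e.g. "spine_hlfs"
    prjct_name <- deparse(substitute(prjct))
    # Rows of `subject` whose first column occurs in the first column of `prjct`
    hit <- subject[, 1] %in% prjct[, 1]
    subject[hit, 5] <- prjct_name
    subject
}

# Stand-in data; col4 starts as an empty character column to hold the labels
giraffe <- data.frame(
    runkeys = 1:500,
    col1 = runif(500),
    col2 = runif(500),
    col3 = runif(500),
    col4 = NA_character_)
spine_hlfs <- data.frame(runkeys = c(44, 260, 478))

giraffe <- Project_assign(spine_hlfs, giraffe)
giraffe[!is.na(giraffe[, 5]), ]
```

This idiom only sees the expression written at the call site, so it does not combine well with lapply over a list (the captured name would come out as something like X[[i]]). For that case, looping over a named list as in the answer is the more robust route.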
