Jessica Hall

Reputation: 1

creating functions with arguments as column names

I'm trying to create a function that will do a pairwise comparison between the values of one column and another, and create a new vector depending on those values. I cannot work out how to make two of the arguments column names, so that they can be changed and the function reused on another set of columns.

The specific situation is that there are four columns of coloured band labels for a parent bird (pbc1...pbc4) and another four for its chick (obc1...obc4). The band columns are columns of characters such as 'G', 'PG', 'B', etc. This is the code for the first part of my function, which I will extend to include all pairwise comparisons once I get this running:

colourdistance1 <- function(df, refcoldistdf, pbc, obc){
    n <- length(pbc)
    coldist1 <- rep(NA,n)
    for(i in 1:n){
        if(pbc[i]==obc[i]){
            coldist1[i] <- 0
         } else if(pbc[i]=='M'|obc[i]=='M'){
             coldist1[i] <- NA
         } else if(pbc[i]=='G'& obc[i]=='PG'| obc[i]=='G'& pbc[i]=='PG'){
             coldist1[i] <- refcoldistdf[2,2]
         } else {
             coldist1[i] <- NA
         }
    }
}

p1o1 <- colourdistance1(bd_df, refcoldistdf,pbc = pbc1, obc = obc1)

This call just returns the object p1o1 as NULL. I have also tried:

colourdistance1 <- function(df, refcoldistdf, pbc, obc){
    n <- length(pbc)
    coldist1 <- rep(NA,n)
    for(i in 1:n){
        if(df$pbc[i]==df$obc[i]){
            coldist1[i] <- 0
        } else if(df$pbc[i]=='M'|df$obc[i]=='M'){
            coldist1[i] <- NA
        } else if(df$pbc[i]=='G'& df$obc[i]=='PG'| df$obc[i]=='G'& df$pbc[i]=='PG') { 
            coldist1[i] <- refcoldistdf[2,2]
        } else {
            coldist1[i] <- NA
        }
    }
}

But that just gives this error:

Error in if (df$pbc[i] == df$obc[i]) { : argument is of length zero

I have tried all the code outside the function, inserting the column names and index number and df name and it all works. This makes me think I have an issue with the function arguments not connecting to the function code as I intended. Any help will be appreciated!!

Reproducible test data:

pbc1 <- c('B','W','G','R')
obc1 <- c('Y','W','PG','FP')
pbc2 <- c('W','W','W','M')
obc2 <- c('M','W','R','R')
pbc3 <- c('W','K','FP','K')
obc3 <- c('G','PG','B','PB')
pbc4 <- c('K','K','B','M')
obc4 <- c('K','PG','W','M')
testbanddf <- cbind(pbc1,obc1,pbc2,obc2,pbc3,obc3,pbc4,obc4)
testrefcoldist <- diag(11)

Upvotes: 0

Views: 50

Answers (1)

joran

Reputation: 173517

So there are quite a few comments to make, but first, you might try this:

pbc1 <- c('B','W','G','R')
obc1 <- c('Y','W','PG','FP')
pbc2 <- c('W','W','W','M')
obc2 <- c('M','W','R','R')
pbc3 <- c('W','K','FP','K')
obc3 <- c('G','PG','B','PB')
pbc4 <- c('K','K','B','M')
obc4 <- c('K','PG','W','M')
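# stringsAsFactors = FALSE keeps the bands as character vectors on R < 4.0,
# where factors with different level sets cannot be compared with ==
testbanddf <- data.frame(pbc1,obc1,pbc2,obc2,pbc3,obc3,pbc4,obc4,
                         stringsAsFactors = FALSE)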
testrefcoldist <- diag(11)

colourdistance1 <- function(df, refcoldistdf, pbc, obc){
    n <- nrow(df)
    coldist1 <- rep(NA,n)

    pbc <- df[[pbc]]
    obc <- df[[obc]]

    for(i in 1:n){
        if(pbc[i]==obc[i]){
            coldist1[i] <- 0
        } else if(pbc[i]=='M'|obc[i]=='M'){
            coldist1[i] <- NA
        } else if(pbc[i]=='G'& obc[i]=='PG'| obc[i]=='G'& pbc[i]=='PG'){
            coldist1[i] <- refcoldistdf[2,2]
        } else {
            coldist1[i] <- NA
        }
    }
    coldist1
}

colourdistance1(testbanddf, testrefcoldist,pbc = "pbc1", obc = "obc1")
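
  1. cbind() creates a matrix, not a data frame. You create data frames with the function data.frame().
  2. The simplest way forward is to make the arguments pbc and obc character strings holding the column names.
  3. Referring to data frame columns using $ is useful when working interactively, but isn't so useful (as you discovered) when you are writing functions and don't know the column names in advance. Inside your function, df$pbc looks for a column literally named pbc rather than using the value of the argument pbc; it finds none and returns NULL, which is what triggers the "argument is of length zero" error. In that case, use [[, which can select columns by name or position.
  4. Your function as written didn't return coldist1. The last expression it evaluated was the for loop, whose value is NULL, so NULL is what the function returned.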
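
To make points 3 and 4 concrete, here is a minimal sketch. The two-column data frame reuses column names from the question's test data, while the variable col and the helper functions no_return() and with_return() are made up purely for illustration.

```r
# Toy data frame; column names mirror the question's test data.
df <- data.frame(pbc1 = c("B", "W", "G", "R"),
                 obc1 = c("Y", "W", "PG", "FP"),
                 stringsAsFactors = FALSE)

col <- "pbc1"            # column name held in a variable (hypothetical name)

by_dollar  <- df$col     # looks for a column literally named "col": NULL
by_bracket <- df[[col]]  # uses the value of `col`: the pbc1 column

# Indexing and comparing NULL gives logical(0), which is why if() then
# complains that the "argument is of length zero".
is.null(by_dollar)
by_bracket

# When the last expression in a function is a for loop, the function
# returns NULL, because NULL is the value of a for loop.
no_return <- function(x) {
  out <- rep(NA, length(x))
  for (i in seq_along(x)) out[i] <- x[i] == "W"
}

with_return <- function(x) {
  out <- rep(NA, length(x))
  for (i in seq_along(x)) out[i] <- x[i] == "W"
  out  # the last evaluated expression becomes the return value
}

res_none <- no_return(by_bracket)    # NULL
res_some <- with_return(by_bracket)  # FALSE TRUE FALSE FALSE
```

The same pattern is what the rewritten colourdistance1() relies on: the column names arrive as strings, are looked up once with [[, and the result vector is named on the final line so that it is returned.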

Upvotes: 1
