statistics_learning

Reputation: 437

How to get the column name when using "apply" to loop over the columns of a data frame

I have the following data frame:

  SNP1 <- c("AA","GG","AG")
  SNP2 <- c("AA","CC","AC")
  SNP3 <- c("GG","AA","AG")
  df<- data.frame(SNP1, SNP2, SNP3)
  colnames(df)<- c('rs10000438', 'rs10000500','rs1000055')

I want to apply the function dominant_dummy to each column of df. I tried apply, but when apply loops over the columns of a data frame it passes only a vector of the column's values to the function, without the column's name. dominant_dummy needs that name for the call NCBI_snp_query(names(x)). How can I use apply and still get the name of the column the function is currently processing?

  library(rsnps)
  dominant_dummy <- function(x) {
    # Look up the SNP by column name; this is the call that needs names(x)
    SNP_lib <- NCBI_snp_query(names(x))

    SNP_min <- SNP_lib$Minor
    SNP_name <- SNP_lib$Query

    SNPs <- as.factor(as.character(x))

    # Does the second genotype level start with the minor allele?
    check <- substr(levels(SNPs)[2], 1, 1) == SNP_min
    if (!check) {
      levels(SNPs) <- c(0, 1, 1)
    } else {
      levels(SNPs) <- c(1, 1, 0)
    }
    # Convert the recoded factor levels to numbers and return them
    as.numeric(as.character(SNPs))
  }
  df_3levels <- apply(df, 2, dominant_dummy)

Upvotes: 1

Views: 37

Answers (1)

MrFlick

Reputation: 206253

This function just isn't going to work with apply if it requires names: apply passes each column to the function as a plain vector of values, so the column name is not available through names(x). Since dominant_dummy effectively requires a data.frame to be passed in, you will have to do the slicing a bit more manually (assuming you don't want to change dominant_dummy):

df_3levels<-sapply(1:ncol(df), function(i) dominant_dummy(df[,i, drop=FALSE]))
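
If changing the function is an option, another possibility is to loop over the column names instead of the columns, and pass both the values and the name to the function. The following is only a minimal sketch of that pattern: describe_column is a hypothetical stand-in for dominant_dummy (it does not call NCBI_snp_query), and the two-argument signature is an assumption, not something from the question.

```r
# Toy data from the question
df <- data.frame(
  rs10000438 = c("AA", "GG", "AG"),
  rs10000500 = c("AA", "CC", "AC"),
  rs1000055  = c("GG", "AA", "AG"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for dominant_dummy: it receives the column
# values and the column name as two separate arguments.
describe_column <- function(values, name) {
  paste0(name, ": ", paste(values, collapse = ","))
}

# Loop over the names; df[[nm]] extracts the matching column as a vector.
# Because the input is a character vector, sapply names the result by it.
res <- sapply(names(df), function(nm) describe_column(df[[nm]], nm))

# Map pairs each column with its name and returns a named list.
res_list <- Map(describe_column, df, names(df))
```

Looping over names(df) also keeps the column names on the result, whereas looping over 1:ncol(df) returns an unnamed result.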

Upvotes: 4
