user249018

Reputation: 555

How to embed one function into another one in R

I would like to build a function that works similarly to the built-in pairs() function. Given a data set df, I would like to create a matrix of scatter plots between its numeric columns, with a histogram of each column, overlaid with a density curve, on the diagonal. My problem is specifically with the diagonal part of the plots: without that part of the code, everything works fine (one can test this on the iris data set).

new_pairs<-function(df, x){
par(mar=c(1,1,1,1))
n_col<-sum(sapply(df, is.numeric))
par(mfrow=c(n_col,n_col))
n<-ncol(df)
for (i in 1:n){

for (j in 1:n){
  
  if ((class(df[,i])!="factor" ) & (class(df[,j])!="factor") & i!=j)
  {plot(df[,i], df[,j], col = df[,x])}
  
  else if ((class(df[,i])!="factor") & (class(df[,j])!="factor") & i==j)
  {hist(df[,i], breaks=10, probability=T, main=NULL)} {lines(density(df[,i]))}
      }
    }
  }
 new_pairs(df,2)

It seems that including the line {lines(density(df[,i]))} is not permitted: I get an error message. I therefore tried to write a separate function that draws the histogram with a density curve and accounts for missing values. It works fine on a given numeric column, but I do not know how to integrate it into the new_pairs() function. Here is the function:

hist_density = function (df[,3]) {
N = length(df[,3])
df[,3] <- na.omit(df[,3])
hist( df[,3], col = "light blue",
    probability = TRUE, main=NULL)
lines(density(df[,3]), col = "red", lwd = 3)
}

Upvotes: 1

Views: 56

Answers (2)

akrun

Reputation: 886988

Maybe we can call lines() inside each of the branches, within the same pair of braces as the plotting call:

new_pairs <- function(df, x) {
  par(mar = c(1, 1, 1, 1))
  n_col <- sum(sapply(df, is.numeric))
  par(mfrow = c(n_col, n_col))
  n <- ncol(df)
  for (i in 1:n) {
    for (j in 1:n) {
      if ((class(df[, i]) != "factor") & (class(df[, j]) != "factor") & i != j) {
        plot(df[, i], df[, j], col = df[, x])
        lines(density(na.omit(df[, i])))
      } else if ((class(df[, i]) != "factor") & (class(df[, j]) != "factor") & i == j) {
        hist(df[, i], breaks = 10, probability = TRUE, main = NULL)
        lines(density(na.omit(df[, i])))
      } else {
        NA
      }
    }
  }
}
  
  
new_pairs(iris, 3)
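
As a side note on why the original version fails: as far as I can tell, the error comes from the two brace blocks written back to back on one line, {hist(...)} {lines(...)}. R treats the first block as a complete expression and then does not know what to do with the second opening brace. A minimal sketch of this, using a small hypothetical helper parses() (not part of the question) that just checks whether a string is valid R:

```r
# Hypothetical helper: TRUE if `code` is syntactically valid R, FALSE otherwise.
parses <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

# Same shape as the line in the question: two brace blocks in a row.
broken <- "if (TRUE) {a <- 1} {b <- 2}"

# Both statements inside a single pair of braces.
fixed <- "if (TRUE) {a <- 1; b <- 2}"

parses(broken)  # FALSE
parses(fixed)   # TRUE
```

So the fix is only about where the braces go; lines() itself is fine to call right after hist().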

The histogram part can also be moved into a separate helper function, which new_pairs() then calls:

hist_density <- function(df, i) {
  # drop missing values before plotting
  tmp <- na.omit(df[, i])
  hist(tmp, col = "light blue", probability = TRUE, main = NULL)
  lines(density(tmp), col = "red", lwd = 3)
}



new_pairs <- function(df, x) {
  par(mar = c(1, 1, 1, 1))
  n_col <- sum(sapply(df, is.numeric))
  par(mfrow = c(n_col, n_col))
  n <- ncol(df)
  for (i in 1:n) {
    for (j in 1:n) {
      if ((class(df[, i]) != "factor") & (class(df[, j]) != "factor") & i != j) {
        plot(df[, i], df[, j], col = df[, x])
        lines(density(na.omit(df[, i])))
      } else if ((class(df[, i]) != "factor") & (class(df[, j]) != "factor") & i == j) {
        hist_density(df, i)
      } else {
        NA
      }
    }
  }
}
     
 new_pairs(iris, 3)
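
A possible variation, only as a sketch: instead of looping over every column and testing for factors, one could loop over the numeric columns only and restore the graphics settings when the function exits. The name new_pairs_numeric and the argument colour_col are my own, not from the question, and I am assuming the colouring column is a grouping variable such as Species.

```r
# Sketch: scatter-plot matrix over numeric columns only, with a
# histogram + density curve on the diagonal.
# `colour_col` (hypothetical argument) names or indexes a grouping column.
new_pairs_numeric <- function(df, colour_col = NULL) {
  num_idx <- which(sapply(df, is.numeric))   # positions of numeric columns
  k <- length(num_idx)
  stopifnot(k > 0)

  # par() returns the previous settings, so they can be restored on exit
  old_par <- par(mfrow = c(k, k), mar = c(1, 1, 1, 1))
  on.exit(par(old_par))

  # convert the grouping column to integer colour codes
  point_col <- if (is.null(colour_col)) {
    "black"
  } else {
    as.integer(as.factor(df[[colour_col]]))
  }

  for (i in num_idx) {
    for (j in num_idx) {
      if (i == j) {
        vals <- na.omit(df[[i]])             # density() fails on NA values
        hist(vals, probability = TRUE, main = NULL, col = "lightblue")
        lines(density(vals), col = "red", lwd = 2)
      } else {
        plot(df[[i]], df[[j]], col = point_col)
      }
    }
  }
  invisible(k * k)                           # number of panels drawn
}
```

Calling new_pairs_numeric(iris, "Species") should then give a 4 x 4 grid, since iris has four numeric columns.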

Upvotes: 2

Oliver

Reputation: 8572

At the risk of being superfluous: rather than inventing a new method, you could use pairs() itself and simply use its diag.panel argument to add the histogram and density yourself. The code below is taken from an example in help(pairs), to which I've added the density.

This might be a cleaner solution:

# Add histograms + density (taken from help("pairs"))
panel.hist <- function(x, ...)
{
  usr <- par("usr"); on.exit(par(usr = usr))
  par(usr = c(usr[1:2], 0, 1.5) )
  h <- hist(x, plot = FALSE)
  breaks <- h$breaks; nB <- length(breaks)
  y <- h$counts; y <- y/max(y)
  rect(breaks[-nB], 0, breaks[-1], y, col = "cyan", ...)
  # density
  dens <- density(x); dens$y <- dens$y / max(dens$y)
  lines(dens)
}
pairs(iris[, 1:4], diag.panel = panel.hist)
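
If the data may contain missing values (as mentioned in the question), note that density() stops with an error on NA unless they are removed first. Below is a sketch of a diagonal panel that drops non-finite values and draws the histogram on the density scale, so that the bars and the curve share one axis. The name panel_hist_density is hypothetical, and the colours are simply borrowed from the question.

```r
# Sketch of a diag.panel function for pairs(): histogram on the density
# scale with a kernel density curve on top, ignoring NA/Inf values.
panel_hist_density <- function(x, ...) {
  x <- x[is.finite(x)]                 # drop NA, NaN and infinite values

  usr <- par("usr")
  on.exit(par(usr = usr))              # restore the user coordinates afterwards

  h <- hist(x, plot = FALSE)
  d <- density(x)

  # rescale the y axis so both the bars and the curve fit in the panel
  top <- max(h$density, d$y)
  par(usr = c(usr[1:2], 0, 1.1 * top))

  nB <- length(h$breaks)
  rect(h$breaks[-nB], 0, h$breaks[-1], h$density, col = "light blue")
  lines(d, col = "red", lwd = 2)
}
```

It would be used in the same way as panel.hist, for example pairs(iris[, 1:4], diag.panel = panel_hist_density, col = iris$Species).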


Upvotes: 2
