Christine Blume
Christine Blume

Reputation: 527

Plot from package "lomb" in ggplot2

I am using the package "lomb" to calculate Lomb-Scargle Periodograms, a method for analysing biological time series data. The package does create a plot if you tell it to do so. However, the plots are not too nice (compared to ggplot2 plots). Therefore, I would like to plot the results with ggplot. However, I do not know how to access the function for the curve plotted...

This is a sample code for a plot:

TempDiff <- runif(4033, 3.0, 18) % just generate random numbers
Time2 <- seq(1,4033) % Time vector
Rand.LombScargle <- randlsp(repeats=10, TempDiff, times = Time2, from = 12, to = 36, 
                        type = c("period"), ofac = 10, alpha = 0.01, plot = T, 
                        trace = T, xlab="period", main = "Lomb-Scargle Periodogram")

I have also tried to find out something about the function looking into the function randlsp itself, but could not really find anything that seemed useful to me there...

 getAnywhere(randlsp)
 A single object matching ‘randlsp’ was found
 It was found in the following places
   package:lomb
   namespace:lomb
 with value

 function (repeats = 1000, x, times = NULL, from = NULL, to = NULL, 
 type = c("frequency", "period"), ofac = 1, alpha = 0.01, 
 plot = TRUE, trace = TRUE, ...) 
  {
    if (is.ts(x)) {
    x = as.vector(x)
  }
 if (!is.vector(x)) {
    times <- x[, 1]
    x <- x[, 2]
}
 if (plot == TRUE) {
    op <- par(mfrow = c(2, 1))
}
 realres <- lsp(x, times, from, to, type, ofac, alpha, plot = plot, 
    ...)
 realpeak <- realres$peak
 pks <- NULL
 if (trace == TRUE) 
    cat("Repeats: ")
 for (i in 1:repeats) {
    randx <- sample(x, length(x))
    randres <- lsp(randx, times, from, to, type, ofac, alpha, 
        plot = F)
    pks <- c(pks, randres$peak)
    if (trace == TRUE) {
        if (i/10 == floor(i/10)) 
            cat(i, " ")
    }
}
if (trace == TRUE) 
    cat("\n")
prop <- length(which(pks >= realpeak))
p.value <- prop/repeats
if (plot == TRUE) {
    mx = max(c(pks, realpeak)) * 1.25
    hist(pks, xlab = "Peak Amplitude", xlim = c(0, mx), main = paste("P-value: ", 
        p.value))
    abline(v = realpeak)
    par(op)
}
res = realres[-(8:9)]
res = res[-length(res)]
res$random.peaks = pks
res$repeats = repeats
res$p.value = p.value
class(res) = "randlsp"
return(invisible(res))

Any idea will be appreciated!

Best, Christine

PS: Here an example of the plot with real data. Plot with real data

Upvotes: 3

Views: 1122

Answers (1)

MattLBeck
MattLBeck

Reputation: 5831

The key to getting ggplot graphs out of any returned object is to convert the data that you need in to some sort of data.frame. To do this, you can look at what kind of object your returned value is and see what sort of data you can immediately extract into a data.frame

str(Rand.LombScargle) # get the data type and structure of the returned value

    List of 12
 $ scanned     : num [1:2241] 12 12 12 12 12 ...
 $ power       : num [1:2241] 0.759 0.645 0.498 0.341 0.198 ...
 $ data        : chr [1:2] "times" "x"
 $ n           : int 4033
 $ type        : chr "period"
 $ ofac        : num 10
 $ n.out       : int 2241
 $ peak        : num 7.25
 $ peak.at     : num [1:2] 24.6908 0.0405
 $ random.peaks: num [1:10] 4.99 9.82 7.03 7.41 5.91 ...
 $ repeats     : num 10
 $ p.value     : num 0.3
 - attr(*, "class")= chr "randlsp"

in the case of randlsp, its a list, which is usually what is returned from statistical functions. Most of this information can also be obtained from ?randlsp too.

It looks as if Rand.LombScargle$scanned and Rand.LombScargle$power contains most of what is needed for the first graph:

There is also a horizontal line on the Periodogram, but it doesn't correspond to anything that was returned by randlsp. Looking at the source code that you provided, it looks as if the Periodogram is actually generated by lsp().

LombScargle <- lsp( TempDiff, times = Time2, from = 12, to = 36, 
    type = c("period"), ofac = 10, alpha = 0.01, plot = F)

str(LombScargle)

List of 12
 $ scanned  : num [1:2241] 12 12 12 12 12 ...
 $ power    : num [1:2241] 0.759 0.645 0.498 0.341 0.198 ...
 $ data     : chr [1:2] "Time2" "TempDiff"
 $ n        : int 4033
 $ type     : chr "period"
 $ ofac     : num 10
 $ n.out    : int 2241
 $ alpha    : num 0.01
 $ sig.level: num 10.7
 $ peak     : num 7.25
 $ peak.at  : num [1:2] 24.6908 0.0405
 $ p.value  : num 0.274
 - attr(*, "class")= chr "lsp"

I am guessing that, based on this data, the line is indicating the significance level LombScargle$sig.level

Putting this together, we can create our data to pass to ggplot from lsp:

lomb.df <- data.frame(period=LombScargle$scanned, power=LombScargle$power)

# use the data frame to set up the line plot
g <- ggplot(lomb.df, aes(period, power)) + geom_line() + 
       labs(y="normalised power", title="Lomb-Scargle Periodogram")

# add the sig.level horizontal line
g + geom_hline(yintercept=LombScargle$sig.level, linetype="dashed")

Periodogram

For the histogram, it looks like this is based on the vector Rand.LombScargle$random.peaks from randlsp:

rpeaks.df <- data.frame(peaks=Rand.LombScargle$random.peaks)

ggplot(rpeaks.df, aes(peaks)) + 
       geom_histogram(binwidth=1, fill="white", colour="black") + 
       geom_vline(xintercept=Rand.LombScargle$peak, linetype="dashed") +
       xlim(c(0,12)) +  
       labs(title=paste0("P-value: ", Rand.LombScargle$p.value), 
           x="Peak Amplitude", 
           y="Frequency")

Histogram

Play around with these graphs to get them looking to your taste.

Upvotes: 4

Related Questions