jack kelly

Reputation: 330

An error when using log10_trans() in ggplot

I have a data frame, testlist, with three columns: "Time Point", "G1", and "G2".

I run this ggplot call on it, with "Time Point" on the y-axis.

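library(ggplot2)
library(ggridges)  # geom_density_ridges_gradient()
library(scales)    # log10_trans(), trans_breaks()

testplot <- ggplot(testlist, aes_string(x = "G1", y = "`Time Point`")) +
  geom_density_ridges_gradient() +
  scale_x_continuous(trans = log10_trans(),
                     limits = c(10^1, 10^4),
                     breaks = trans_breaks("log10", function(x) 10^x, n = 2))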

With x = "G1", it runs and plots my data. With x = "G2", however, I get this error:

Error in if (!(lo <- min(hi, IQR(x)/1.34))) (lo <- hi) || (lo <- abs(x[1L])) ||  : 
  missing value where TRUE/FALSE needed

Both G1 and G2 are continuous variables with about 70,500 points each. The only difference I can see is their ranges:

G1: -16.34 to 54454
G2: -131.25 to 4675

I think this has to do with applying the transformation to negative numbers, but both sets have negative numbers. I also get these warning messages whether or not it plots:

In addition: Warning messages:
1: In self$trans$transform(x) : NaNs produced
2: Transformation introduced infinite values in continuous x-axis 

Any assistance is appreciated.

EDIT1: I am having trouble making a minimal example. I will attempt another edit with an example.

EDIT2: Here is a link to the CSV file: CSV file

EDIT3: The log function produces -Inf values in the data frame, and ggplot will not use those values. I will update the post with what I end up doing about it.

Upvotes: 0

Views: 974

Answers (1)

kwes

Reputation: 433

The two columns fail differently because one contains zeros and the other doesn't. The NaNs produced by taking the log of a negative number are removed by the density plot function, but the -Inf produced by the log of zero is not. The ggridges layer then fails because it uses the standard deviation to set the bandwidth, and the standard deviation of a vector containing -Inf is NaN. If you want a log transformation, there's no reason not to remove the zeros and negatives first, but you could keep the data as is if you pass the from and bandwidth arguments to the ggridges geom.
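
To see the mechanism without the plot, here is a small base R sketch. The vector x is made up for illustration and is not the asker's data. It assumes the bandwidth comes from bw.nrd0(), the default selector in stats::density(), which matches the if (!(lo <- min(hi, IQR(x)/1.34))) line in the error message.

```r
# Made-up values standing in for a column like G2 (one negative, one zero).
x <- c(-131.25, 0, 12, 340, 4675)

# log10 of a negative number is NaN; log10 of zero is -Inf.
lx <- suppressWarnings(log10(x))

# Dropping the NaNs (as described above) still leaves the -Inf behind.
lx_no_nan <- lx[!is.nan(lx)]

# The mean is -Inf, so the deviation -Inf - (-Inf) is NaN, and so is sd().
sd_with_inf <- sd(lx_no_nan)

# bw.nrd0() starts from sd(x), so its if() condition sees a missing value.
res <- try(bw.nrd0(lx_no_nan), silent = TRUE)
failed <- inherits(res, "try-error")

# Keeping only strictly positive values gives a finite bandwidth.
x_pos <- x[x > 0]
bw_ok <- bw.nrd0(log10(x_pos))
```

Applied to the data frame in the question, the same filter would be something like testlist[testlist$G2 > 0, ] before the ggplot call.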

Upvotes: 3
