Reputation: 330
I have a dataframe testlist that has three columns: "Time Point", "G1", and "G2".
I run this ggplot line on it, with "Time Point" on the y-axis.
library(ggplot2)
library(ggridges)
library(scales)

testplot <- ggplot(testlist, aes_string(x = "G1", y = "`Time Point`")) +
  geom_density_ridges_gradient() +
  scale_x_continuous(trans = log10_trans(),
                     limits = c(10^1, 10^4),
                     breaks = trans_breaks("log10", function(x) 10^x, n = 2))
With G1, it runs and plots my data. With G2 however, I get this error:
Error in if (!(lo <- min(hi, IQR(x)/1.34))) (lo <- hi) || (lo <- abs(x[1L])) || :
missing value where TRUE/FALSE needed
Both G1 and G2 are continuous variables with about 70,500 points each. The only thing that differs is their ranges:
G1: -16.34 to 54454
G2: -131.25 to 4675
I think this has to do with applying the transformation to negative numbers, but both sets have negative numbers. I also get these warning messages regardless of whether it plots or not:
In addition: Warning messages:
1: In self$trans$transform(x) : NaNs produced
2: Transformation introduced infinite values in continuous x-axis
Any assistance is appreciated.
EDIT1: I am having trouble making a minimal example. I will attempt another edit with an example.
EDIT2: Here is a link to the CSV file: CSV file
EDIT3: The log function produces -Inf values in the dataframe, and ggplot will not use those values. I will update the post with what I end up doing about it.
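A quick way to check this is to count how many values turn into NaN or -Inf after the transformation. The sketch below uses a few made-up numbers in place of the real column; `x` and `logged` are illustrative names, and the real check would use testlist$G1 or testlist$G2 instead.

```r
# Hypothetical sample values; substitute testlist$G1 or testlist$G2.
x <- c(-16.34, 0, 12.5, 54454)

# log10 of a negative number warns "NaNs produced"; silence it here.
logged <- suppressWarnings(log10(x))

sum(is.nan(logged))       # number of values that were negative
sum(is.infinite(logged))  # number of values that were exactly zero
```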
Upvotes: 0
Views: 974
Reputation: 433
The two columns fail differently because one contains zeros and the other doesn't. NaNs from taking the log of a negative number are removed in the density plot function, but -Inf from the log of zero isn't. The ggridges layer fails because it uses the standard deviation to set the bandwidth, and the standard deviation of a vector containing -Inf is NaN. There's no reason not to remove negatives and zeros if you want to use a log transformation, but you could keep the data as is if you pass the arguments from and bandwidth to the ggridges geom.
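The mechanism can be reproduced in base R without any plotting. This is only a sketch: the small data frame is a hypothetical stand-in for testlist, and it calls bw.nrd0() (the default bandwidth rule in the stats package, and the function shown in the error message) directly rather than going through ggridges.

```r
# Hypothetical stand-in for testlist; the real data has about 70,500 rows.
testlist <- data.frame(
  `Time Point` = c("t1", "t1", "t1", "t2", "t2", "t2"),
  G2 = c(-131.25, 0, 250, 40, 900, 4675),
  check.names = FALSE
)

# NaN for the negative value, -Inf for the zero.
logged <- suppressWarnings(log10(testlist$G2))

# Dropping NaN alone still leaves -Inf, so the standard deviation is NaN.
without_nan <- logged[!is.nan(logged)]
sd(without_nan)

# The default bandwidth rule stops with an error on such input.
bw_result <- tryCatch(bw.nrd0(without_nan), error = function(e) e)
inherits(bw_result, "error")

# Keeping only strictly positive values avoids both NaN and -Inf.
positive_only <- testlist[testlist$G2 > 0, ]
bw.nrd0(log10(positive_only$G2))
```

A filtered data frame like positive_only could then be passed to the original ggplot() call in place of testlist. The alternative of keeping all rows and supplying from and bandwidth explicitly to the geom is not shown here.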
Upvotes: 3