Dolan

Reputation: 41

Gnuplot smoothing data in loglog plot

I would like to plot a smoothed curve through a dataset that spans 13 orders of magnitude in x [1E-9:1E4] and 4 orders of magnitude in y [1E-6:1E-2].

MWE:

set log x
set log y
set xrange [1E-9:1E4]
set yrange [1E-6:1e-2]
set samples 1000

plot 'data.txt'   u 1:3:(1) smooth csplines not

The smooth curve looks nice above x=10. Below, it is just a straight line down to the point at x=1e-9.

With samples increased to 1e4, smoothing works well above x=1; with samples at 1e5, it works well above x=0.1; and so on.

Any idea how to apply smoothing to the lower data points without setting samples to 1e10 (which does not work anyway...)?

Thanks and best regards! JP

Upvotes: 2

Views: 646

Answers (2)

theozh

Reputation: 25734

As far as I understand, sampling in gnuplot is linear. There may be a logarithmic sampling option in gnuplot, but I have not found one yet.

Here is a suggestion for a workaround, which is not yet perfect but may serve as a starting point. The idea is to split your data into ranges, for example decades, and to smooth them separately. The drawback is that there might be some overlaps between the ranges. You can minimize or hide these by playing with set samples and every ::n, or maybe there is another way to eliminate the overlaps.

Code:

### smoothing over several orders of magnitude
reset session

# create some random test data
set print $Data
    do for [p=-9:3] {
        do for [m=1:9:3] {
            print sprintf("%g %g", m*10**p, (1+rand(0))*10**(p/12.*3.-2))
        }
    }
set print

set logscale x
set logscale y
set format x "%g"
set format y "%g"

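set samples 100
pMin = -9
pMax =  3
# one cspline pass per exponent p, all written to the same table
set table $Smoothed
    # keep x values below 11*10**p; larger x values become NaN and are skipped
    myFilter(col,p) = (column(col)/10**p-1) < 10 ? column(col) : NaN
    plot for [i=pMin:pMax] $Data u (myFilter(1,i)):2 smooth cspline
unset table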

plot $Data u 1:2 w p pt 7 ti "Data", \
     $Smoothed u 1:2 every ::3 w l ti "cspline"
### end of code


Addition:

Thanks to @maij, who pointed out that this can be simplified by mapping the whole range into linear space, i.e. by smoothing the logarithms of the values. In contrast to @maij's solution, I would let gnuplot handle the logarithmic axes and keep the actual plot command as simple as possible, at the cost of some extra table plots.

Code:

### smoothing in loglog plot
reset session

# create some random test data
set print $Data
    do for [p=-9:3] {
        do for [m=1:9:3] {
            print sprintf("%g %g", m*10**p, (1+rand(0))*10**(p/12.*3.-2))
        }
    }
set print

set samples 500
set table $SmoothedLog
    plot $Data u (log10($1)):(log10($2)) smooth csplines
set table $Smoothed
    plot $SmoothedLog u (10**$1):(10**$2) w table
unset table

set logscale x
set logscale y
set format x "%g"
set format y "%g"
set key top left

plot $Data     u 1:2 w p pt 7 ti "Data", \
     $Smoothed u 1:2 w l lc "red" ti "csplines"
### end of code

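Applied to the file from the question, the same approach might look like the sketch below. It assumes that data.txt holds strictly positive values in columns 1 and 3 (the columns used in the question's plot command); the datablock names are simply reused from the code above. The axis ranges are set only after the table plots, because the first table plot works on log10 values, which would fall outside [1E-9:1E4].

```gnuplot
### sketch: smoothing the question's data.txt in log space
### assumption: columns 1 and 3 of data.txt are strictly positive
reset session

set samples 500
set table $SmoothedLog
    # spline is computed on the logarithms, so the sampling is even per decade
    plot 'data.txt' u (log10($1)):(log10($3)) smooth csplines
set table $Smoothed
    # map the smoothed curve back to the original scale
    plot $SmoothedLog u (10**$1):(10**$2) w table
unset table

# ranges and log axes only now, after the log10 data have been processed
set logscale x
set logscale y
set xrange [1E-9:1E4]
set yrange [1E-6:1e-2]

plot 'data.txt' u 1:3 w p pt 7 ti "Data", \
     $Smoothed  u 1:2 w l lc "red" ti "csplines"
### end of sketch
```

With this ordering, set samples no longer has to grow with the number of decades, since the 500 samples are spread evenly over the exponents rather than over the linear x values.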

Upvotes: 1

maij

Reputation: 4218

Using a logarithmic scale basically means plotting the logarithm of a value instead of the value itself. The set logscale command tells gnuplot to do this automatically, in the following order:

  1. read the data, still linear world, no logarithm yet
  2. calculate the splines on an equidistant grid (smooth csplines), still linear world
  3. calculate and plot the logarithms (set logscale)

The key point is the equidistant grid. Suppose one chooses set xrange [1E-9:10000] and set samples 101. In the linear world, 1E-9 is approximately 0 compared to 10000, so the resulting grid is 1E-9 ≈ 0, 100, 200, 300, ..., 9800, 9900, 10000. The first grid point is at about 0, the second one at 100, and gnuplot draws a straight line between them. This does not change when the logarithms of these numbers are plotted afterwards.
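To make the grid concrete, here is a minimal sketch that prints the first few sample positions. It assumes that the samples are spaced evenly over the full xrange, as described above; the variable names xmin, xmax, N and dx are only for illustration.

```gnuplot
# Assumption: N samples spaced evenly over [xmin:xmax] (illustrative names)
xmin = 1e-9
xmax = 1e4
N    = 101
dx   = (xmax - xmin) / (N - 1)    # linear grid spacing, about 100

# print the first four grid points
do for [i=0:3] {
    print sprintf("grid point %d: x = %g", i, xmin + i*dx)
}
```

Only the first grid point lies below x = 100, so the eleven decades between 1e-9 and 100 are covered by a single straight segment.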

This is what you already noted in your question: each additional decade toward smaller x needs 10 times more sample points to get a smooth curve.

As a solution, I suggest swapping the order of the two calculations: take the logarithms first, then compute the splines.
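A quick calculation illustrates this scaling. The sketch assumes an evenly spaced grid over [1E-9:1E4] and prints the grid spacing for the sample counts mentioned in the question; the names xmin, xmax, k and N are only for illustration.

```gnuplot
# Assumption: N samples spaced evenly over [xmin:xmax] (illustrative names)
xmin = 1e-9
xmax = 1e4

# grid spacing for samples = 1e3, 1e4, 1e5
do for [k=3:5] {
    N = 10**k
    print sprintf("samples = %d -> grid spacing = %g", N, (xmax - xmin)/(N - 1))
}
```

The spacing comes out at roughly 10, 1 and 0.1, which is in line with the thresholds reported in the question: below one grid spacing there is only a single straight segment.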

# create some random test data, code "stolen" from @theozh (https://stackoverflow.com/a/66690491)
set print $Data
    do for [p=-9:3] { 
        do for [m=1:9:3] { 
            print sprintf("%g %g", m*10**p, (1+rand(0))*10**(p/12.*3.-2))
        } 
    } 
set print


# this makes the splines smoother
set samples 1000

# manually account for the logarithms in the tic labels
set format x "10^{%.0f}"     # for example this format
set format y "1e{%+03.0f}"   # or this one
set xtics 2   # logarithmic world, tic distance in orders of magnitude
set ytics 1 

# just "read logarithm of values" from file, before calculating splines
plot $Data u (log10($1)):(log10($2)) w p pt 7 ti "Data" ,\
     $Data u (log10($1)):(log10($2)) ti "cspline" smooth cspline


Upvotes: 1
