Jakub Peschel

Reputation: 135

Gnuplot candlesticks next to each other in line graph

I am comparing several algorithms by execution time as a function of dataset properties. Because one property value can have multiple observations, I drew a line graph whose lines show the average execution time. I also want to see the extremes and quartiles, so my first idea was to add candlesticks at the corresponding x positions.

I expected that it should look something like this:

[image: example of result]

My data is in a CSV file with the relevant values:

size, GSP_min, GSP_firstQuartile, GSP_median, GSP_avg, GSP_thirdQuartile, GSP_max, SPAM_min, SPAM_firstQuartile, SPAM_median, SPAM_avg, SPAM_thirdQuartile, SPAM_max, PREFIX_SPAN_min, PREFIX_SPAN_firstQuartile, PREFIX_SPAN_median, PREFIX_SPAN_avg, PREFIX_SPAN_thirdQuartile, PREFIX_SPAN_max
498101.0, 101.0, 101.0, 385.6666666666667, 340.0, 716.0, 11.0, 11.0, 11.0, 33.666666666666664, 29.0, 61.0, 49.0, 49.0, 49.0, 60.333333333333336, 56.0, 76.0, 
730189.0, 189.0, 189.0, 3489.0, 3740.0, 6538.0, 19.0, 19.0, 19.0, 106.66666666666667, 114.0, 187.0, 32.0, 32.0, 32.0, 69.66666666666667, 81.0, 96.0, 

Here is my code and how I planned to achieve it:

set terminal png size 1024,1024
set bmargin 5
set key autotitle columnhead

set datafile separator ","

set style line 1 \
    linecolor rgb '#00ff00' \
    linetype 1 linewidth 2 \
    pointtype 7 pointsize 1.5

set style line 2 \
    linecolor rgb '#0000ff' \
    linetype 1 linewidth 2 \
    pointtype 7 pointsize 1.5

set style line 3 \
    linecolor rgb '#ff0000' \
    linetype 1 linewidth 2 \
    pointtype 7 pointsize 1.5

set boxwidth 0.1 relative
set style fill empty

set output 'sizeExp.png'
plot 'size.csv' using 1:4 with lp ls 1, \
         '' using 1:9 with lp ls 2, \
         '' using 1:14 with lp ls 3, \
         '' using ($1-1):3:2:6:5 with candlesticks whiskerbars, \
         '' using ($1):8:7:11:10 with candlesticks whiskerbars, \
         '' using ($1+1):13:12:16:15 with candlesticks whiskerbars

This is the generated result:

[image: generated result]

There are two problems:

  1. Because the values differ a lot, I cannot find a usable box width. I thought the "relative" keyword would handle this sensibly, but the boxes came out with very odd widths.
  2. I cannot place the bars next to each other; they overlap instead. I tried different offsets in the x expression, such as "($1+1)", but none gave a good result.

Is there a way to set these values relative to the image size?

A third problem, if someone has advice: I expected the lines to be labeled "GSP_avg", "SPAM_avg", and "PREFIX_SPAN_avg", but I got that mess instead.

Upvotes: 0

Views: 151

Answers (2)

Ethan

Reputation: 15093

I suggest looking into the "with boxplot" plot style, which calculates quartiles and constructs appropriate candlestick-like plots directly from the raw data.

Here is an online demo for gnuplot boxplots.

See also the answer provided for this earlier question: How to plot grouped boxplot by gnuplot

Unlike the with candlesticks plot style, you can provide individual widths for the boxplots. There is also control over clustering and spacing between members of the cluster. From the documentation:

 By default only one boxplot is produced that represents all y values from the
 second column of the using specification. However, an additional (fourth)
 column can be added to the specification. If present, the values of that
 column will be interpreted as the discrete levels of a factor variable.
 As many boxplots will be drawn as there are levels in the factor variable.
 The separation between these boxplots is 1.0 by default, but it can be changed
 by `set style boxplot separation`. By default, the value of the factor variable
 is shown as a tic label below (or above) each boxplot.

Example

 # Suppose that column 2 of 'data' contains either "control" or "treatment"
 # The following example produces two boxplots, one for each level of the
 # factor
 plot 'data' using (1.0):5:(0):2

 The default width of the box can be set via `set boxwidth <width>` or may be
 specified as an optional 3rd column in the `using` clause of the plot command.
 The first and third columns (x coordinate and width) are normally provided as
 constants rather than as data columns.
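
As a rough sketch of the factor-column usage described above: the script below assumes that raw, per-run timings are available (one row per run), which is not the case for the aggregated CSV in the question. The $Raw datablock, its column layout, and its values are illustrative assumptions, not the asker's data.

```gnuplot
# Sketch only: $Raw is a hypothetical datablock with one row per run.
# Columns: dataset size, algorithm name (the factor), execution time.
$Raw <<EOD
498 GSP 101
498 GSP 340
498 GSP 716
498 SPAM 11
498 SPAM 29
498 SPAM 61
498 PREFIX_SPAN 49
498 PREFIX_SPAN 56
498 PREFIX_SPAN 76
EOD

set style fill empty
set style boxplot separation 1.5   # distance between neighbouring boxplots (default 1.0)
set boxwidth 0.5                   # default box width, in x-axis units
unset key

# x position 1.0, y values from column 3, width 0 (= use 'set boxwidth'),
# factor levels taken from column 2: one boxplot per algorithm
plot $Raw using (1.0):3:(0):2 with boxplot
```

Here the quartiles are computed by gnuplot itself, so the pre-computed min/quartile/max columns would not be needed.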

Upvotes: 1

theozh

Reputation: 25724

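  1. Your boxwidth: relative to what? A relative boxwidth is a fraction of gnuplot's default width, which is derived from the spacing of neighbouring x values, not from the image size. Your x-coordinates (column 1) are on the order of 1e5 to 1e6, so you should set an absolute boxwidth on the order of 1e4 to 1e5 (the script below uses 1e4). Check help boxwidth.

  2. The same applies to the offsets: they are in x-axis units, so an offset like ($1+50000) seems reasonable.

  3. Switch the key to noenhanced mode so that the underscores in the column headers are not interpreted as subscripts. Check help key.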
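
One way to avoid hand-picking the numbers in points 1 and 2 is to derive them from the x-range of the data with the stats command. This is only a sketch: it assumes the file size.csv from the question and a fresh session in which the header line has to be skipped explicitly, and the divisors 20 and 10 are arbitrary choices to be tuned by eye.

```gnuplot
# Sketch: scale box width and offset to the x-range of the data.
# Assumes 'size.csv' from the question; the divisors below are arbitrary.
set datafile separator ","

# Skip the header line and collect statistics for column 1;
# this defines STATS_min and STATS_max.
stats 'size.csv' skip 1 using 1 nooutput

span = STATS_max - STATS_min   # zero if the file has only one data row
w    = span / 20.0             # box width in x-axis units
dx   = span / 10.0             # shift between neighbouring boxes (dx > w, so no overlap)

set boxwidth w absolute
set style fill empty
set key autotitle columnhead noenhanced

# Same column layout as in the question: x:box_min:whisker_min:whisker_high:box_high
plot 'size.csv' using ($1-dx):3:2:6:5     with candlesticks whiskerbars, \
     ''         using 1:8:7:11:10         with candlesticks whiskerbars, \
     ''         using ($1+dx):13:12:16:15 with candlesticks whiskerbars
```

Because the width and shift follow the data range rather than the image size, the boxes keep the same proportions when the terminal size changes, as long as the x-range is autoscaled.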

I see another challenge: your y-values span almost 3 orders of magnitude (from 11 to 6538), so it will be difficult to see them all at once. In the example below, I tried set logscale y, but candlesticks on a log scale look unusual and confusing to me. Maybe there is another way to display or group your data.

Script:

### candlesticks grouped/with offset
reset session

$Data <<EOD
size, GSP_min, GSP_firstQuartile, GSP_median, GSP_avg, GSP_thirdQuartile, GSP_max, SPAM_min, SPAM_firstQuartile, SPAM_median, SPAM_avg, SPAM_thirdQuartile, SPAM_max, PREFIX_SPAN_min, PREFIX_SPAN_firstQuartile, PREFIX_SPAN_median, PREFIX_SPAN_avg, PREFIX_SPAN_thirdQuartile, PREFIX_SPAN_max
498101.0, 101.0, 101.0, 385.6666666666667, 340.0, 716.0, 11.0, 11.0, 11.0, 33.666666666666664, 29.0, 61.0, 49.0, 49.0, 49.0, 60.333333333333336, 56.0, 76.0, 
730189.0, 189.0, 189.0, 3489.0, 3740.0, 6538.0, 19.0, 19.0, 19.0, 106.66666666666667, 114.0, 187.0, 32.0, 32.0, 32.0, 69.66666666666667, 81.0, 96.0, 
EOD

set datafile separator ","
set style line 1 lc rgb '#00ff00' lw 2 pt 7 ps 1.5
set style line 2 lc rgb '#0000ff' lw 2 pt 7 ps 1.5
set style line 3 lc rgb '#ff0000' lw 2 pt 7 ps 1.5

set key autotitle columnhead noenhanced top left

set style fill empty
set boxwidth 1e4
set offsets graph 0.15, graph 0.15, graph 0.1, graph 0.1
set xtics 1e5
set logscale y

plot $Data u 1:4 w lp ls 1, \
         '' u 1:9 w lp ls 2, \
         '' u 1:14 w lp ls 3, \
         '' u ($1-5e4):3:2:6:5     w candlesticks whiskerbars, \
         '' u 1:8:7:11:10          w candlesticks whiskerbars, \
         '' u ($1+5e4):13:12:16:15 w candlesticks whiskerbars
### end of script

Result:

[image: resulting plot]

Upvotes: 1
