Reputation: 7563
My data files have the time values in the first column, and I do not want to change them. I formatted the x axis in gnuplot to show only the hour and minute, but it looks a bit ugly for the axis to start at 8:00. I would like it to start at 0 while keeping the spacing of the values in the data file. I tried subtracting a constant, as shown in "How do I make a plot in gnuplot with the lowest value automatically subtracted from the y data?", but it is not working.
Here are my source and the plot.
#!/usr/bin/gnuplot
# set grid
set key outside bottom center horizontal
set key font ",19"
set style line 1 lc rgb '#E02F44' lt 1 lw 1 ps 0.5 pt 7 # input throughput
set style line 2 lc rgb '#FF780A' lt 1 lw 1 ps 0.5 pt 1 # output throughput
set style line 3 lc rgb '#56A64B' lt 1 lw 1 ps 0.5 pt 2 # average processing latency
set style line 4 lc rgb '#000000' lt 1 lw 1 ps 0.5 pt 3 # 99th percentile processing latency
set style arrow 1 heads ls 4
set style arrow 2 head ls 4
set terminal pdf
set pointintervalbox 0
set datafile separator ','
set output "Cost-20K-ThroughputVsLatency.pdf"
#set title ""
set xlabel "time (minutes)" font ",17" offset 0,1,0
set xtics font ",8" offset 0,0.5,0
set xdata time # tells gnuplot the x axis is time data
set timefmt "%Y-%m-%d %H:%M:%S" # specify our time string format
set format x "%H:%M" # otherwise it will show only MM:SS
set xrange ["2020-05-07 08:05:00":"2020-05-07 09:50:00"]
set ylabel "Throughput (K rec/sec)" font ",18" offset 0,0,0
set yrange [0:7]
set ytics font ",20"
#set y2label "processing latency (seconds)" font ",18" offset -1.5,0,0
set y2range [0:25]
set ytics nomirror
set y2tics 0, 5 font ",17"
plot "throughput-vs-latency-20K.csv" using 1:(column(2)/1000) title "IN throughput" with linespoints ls 1 axis x1y1 \
, "throughput-vs-latency-20K.csv" using 1:(column(10)/1000) title "OUT throughput" with linespoints ls 2 axis x1y1 \
, "throughput-vs-latency-20K.csv" using 1:(column(18)/1000) title "avg. latency" with linespoints ls 3 axis x1y2 \
, "throughput-vs-latency-20K.csv" using 1:(column(26)/1000) title "99th latency" with linespoints ls 4 axis x1y2
UPDATE: I changed my script as you suggested, @theozh, but the x axis still does not start from 0.
set key bottom right
set key font ",11"
set style line 1 lc rgb '#E02F44' lt 1 lw 1 ps 0.5 pt 7 # input throughput
set style line 2 lc rgb '#FF780A' lt 1 lw 1 ps 0.5 pt 1 # output throughput
set style line 3 lc rgb '#56A64B' lt 1 lw 1 ps 0.5 pt 2 # average processing latency
set style line 4 lc rgb '#000000' lt 1 lw 1 ps 0.5 pt 3 # 99th percentile processing latency
set style arrow 1 heads ls 4
set term pdfcairo size 5.0in,2.5in
set pointintervalbox 0
set datafile separator ','
set tmargin 1.5
set border 1+2+8
set xtics nomirror
set output "throughput-latency-increasingK-TaxiRideNYC-50Kpersec.pdf"
myTimeFmt = "%Y-%m-%d %H:%M:%S"
set xlabel "time (minutes)" font ",9" offset 0,1.5,0
set xtics font ",8" #rotate by 45 right
set ylabel "Throughput (K rec/sec)" font ",10" offset 2,0,0
set yrange [0:3.5]
set y2label "processing latency (seconds)" font ",10" offset -2,0,0
set y2range [0:14]
set ytics nomirror
set y2tics 0, 2
set xdata time # tells gnuplot the x axis is time data
set format x "%M" time
plot t=0 "throughput-latency-increasing.csv" u (t==0?(t0=timecolumn(1,myTimeFmt),t=1):NaN, timecolumn(1,myTimeFmt)-t0):(column(2)/1000) title "IN throughput" with linespoints ls 1 axis x1y1 \
, t=0 "throughput-latency-increasing.csv" u (t==0?(t0=timecolumn(1,myTimeFmt),t=1):NaN, timecolumn(1,myTimeFmt)-t0):(column(18)/1000) title "avg. latency" with linespoints ls 3 axis x1y2 \
, 4/0 t "# of tuples pre-aggregating" with vectors arrowstyle 1
The first lines of the data file are here:
"Time","pre_aggregate-outPool[0]-avg","pre_aggregate-outPool[1]-avg","pre_aggregate-outPool[2]-avg","pre_aggregate-outPool[3]-avg","pre_aggregate-outPool[4]-avg","pre_aggregate-outPool[5]-avg","pre_aggregate-outPool[6]-avg","pre_aggregate-outPool[7]-avg","pre_aggregate-outPool[0]-99","pre_aggregate-outPool[1]-99","pre_aggregate-outPool[2]-99","pre_aggregate-outPool[3]-99","pre_aggregate-outPool[4]-99","pre_aggregate-outPool[5]-99","pre_aggregate-outPool[6]-99","pre_aggregate-outPool[7]-99","pre_aggregate[0]-param","pre_aggregate[1]-param","pre_aggregate[2]-param","pre_aggregate[3]-param","pre_aggregate[4]-param","pre_aggregate[5]-param","pre_aggregate[6]-param","pre_aggregate[7]-param"
"2020-04-27 10:22:45",33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33,70,75,79,33,41,62,75,50000,50000,50000,50000,50000,50000,50000,50000
"2020-04-27 10:23:00",33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33,33,75,79,33,33,33,37,50000,50000,50000,50000,50000,50000,50000,50000
"2020-04-27 10:23:15",33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33,33,33,33,33,33,33,33,50000,50000,50000,50000,50000,50000,50000,50000
"2020-04-27 10:23:30",33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,62,66,50,62,66,45,50,66,50000,50000,50000,50000,50000,50000,50000,50000
"2020-04-27 10:23:45",33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,62,66,50,62,66,45,50,66,50000,50000,50000,50000,50000,50000,50000,50000
"2020-04-27 10:24:00",33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33.33333432674408,33,33,33,33,33,33,33,33,50000,50000,50000,50000,50000,50000,50000,50000
Upvotes: 1
Views: 423
Reputation: 25714
The following example uses the newer gnuplot date/time syntax (see help timecolumn), e.g. timecolumn(1,myTimeFmt) and set format x "%H:%M" time.
In order to normalize your time series to the first data point, you have to store this time in a variable, e.g. t0, which you can "re-use" in successive plot commands from the same datafile.
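As a minimal sketch of this idea for a comma-separated file with one header line, similar in layout to the file in the question: the datablock $Demo and its values are made up for illustration, and the timestamps are left unquoted to keep the example simple.

```gnuplot
reset session
myTimeFmt = "%Y-%m-%d %H:%M:%S"
set datafile separator ','

# hypothetical sample data (not taken from the question's file)
$Demo <<EOD
Time,value
2020-04-27 10:22:45,33
2020-04-27 10:23:00,35
2020-04-27 10:23:15,31
EOD

# relative time on the x axis, shown as elapsed minutes:seconds
set format x "%tM:%tS" time

# t=0 resets the flag before the data is read; the first data row
# stores its time in t0, and every row is plotted as seconds since t0.
# skip 1 jumps over the header line so it cannot end up in t0.
plot t=0 $Demo u (t==0 ? (t0=timecolumn(1,myTimeFmt), t=1) : NaN, timecolumn(1,myTimeFmt)-t0):2 skip 1 w lp notitle
```

For a real file, replace $Demo with the quoted file name and adjust the column number and the skip count to match the file.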
Note the different time formats for the x axis: "%H:%M" for time of day and "%tH:%tM" for hours exceeding 24 or minutes exceeding 60; see help time_specifiers.
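The difference can be checked quickly on the gnuplot command line. The following sketch assumes that strftime() accepts the relative specifiers in the same way as set format x ... time does (this should hold for gnuplot 5, but verify with help time_specifiers); the variable dt is only an illustrative value.

```gnuplot
# dt is 25 hours expressed in seconds (illustrative value)
dt = 25*3600

# clock format: interprets dt as a date/time and prints the time of day,
# so it wraps around after 24 hours
print strftime("%H:%M", dt)

# relative format: prints elapsed hours and minutes and keeps counting
# past 24 hours
print strftime("%tH:%tM", dt)
```

For an axis that should start at 0 and run past one hour, the relative form is the one to use in set format x.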
Edit:
The code uses a function Normalize(). But note that t=0 is still required at the beginning of the plot command.
Uncommented header lines can be skipped with skip <number of header lines>.
Code:
### normalize time data relative to start time
reset session
myTimeFmt = "%Y-%m-%d %H:%M:%S"
# create some test data (the timestamp contains a space, so with
# whitespace separation the y value ends up in column 3)
set table $Data
plot '+' u (strftime(myTimeFmt,time(0) + $1*3600*2)):(cos($1)) w table
unset table
# function to normalize time column to first value
Normalize(c) = (t==0?(t0=timecolumn(c,myTimeFmt),t=1):NaN, timecolumn(c,myTimeFmt)-t0)
# in case there are uncommented header lines skip them
SkipHeaderLines = 0
set multiplot layout 2,1
set format x "%Y\n%m-%d\n%H:%M" time
plot $Data u (timecolumn(1,myTimeFmt)):3 skip SkipHeaderLines w l ti "absolute time"
set format x "%tH:%tM" time
plot t=0 $Data u (Normalize(1)):3 skip SkipHeaderLines w l ti "relative time"
unset multiplot
### end of code
Result:
Upvotes: 3