Reputation: 233
My data file has this content
# data file for use with gnuplot
# Report 001
# Data as of Tuesday 03-Sep-2013
total 1976
case1 522 278 146 65 26 7
case2 120 105 15 0 0 0
case3 660 288 202 106 63 1
I am making a histogram from the case... lines using the script below - and that works. My question is: how can I load the grand total value 1976 (next to the word 'total') from the data file and either (a) store it into a variable or (b) use it directly in the title of the plot?
This is my gnuplot script:
reset
set term png truecolor
set terminal pngcairo size 1024,768 enhanced font 'Segoe UI,10'
set output "output.png"
set style fill solid 1.00
set style histogram rowstacked
set style data histograms
set xlabel "Case"
set ylabel "Frequency"
set boxwidth 0.8
plot for [i=3:7] 'mydata.dat' every ::1 using i:xticlabels(1) with histogram \
notitle, '' every ::1 using 0:2:2 \
with labels \
title "My Title"
For the benefit of others trying to label histograms, in my data file, the column after the case label represents the total of the rest of the values on that row. Those total numbers are displayed at the top of each histogram bar. For example for case1, 522 is the total of (278 + 146 + 65 + 26 + 7).
I want to display the grand total somewhere on my chart, say as the second line of the title or in a label. I can get a variable into sprintf into the title, but I have not figured out syntax to load a "cell" value ("cell" meaning row column intersection) into a variable.
Alternatively, if someone can tell me how to use the sum function to total up 522+120+660 (read from the data file, not as constants!) and store that total in a variable, that would obviate the need to have the grand total in the data file, and that would also make me very happy.
Many thanks.
Upvotes: 7
Views: 15133
Reputation: 25749
It's not clear to me where your "grand total" of 1976
comes from. If I calculate 522+120+660
I get 1302
not 1976
.
Anyway, here is a solution which works even without stats
and sum
which were not available in gnuplot 4.4.0.
In the data you don't necessarily need the "grand total" or the sum of each row, because gnuplot can calculate this for you. This is done by (not) plotting the file as a matrix, and at the same time summing up the rows in the string variable S0
and the total sum in variable Total
. There will be a warning warning: matrix contains missing or undefined values
which you can ignore. The labels are added by plotting '+' ... with labels
extracting the desired values from the S0
string.
Data: SO18583180.dat
So, the reduced input data looks like this:
# data file for use with gnuplot
# Report 001
# Data as of Tuesday 03-Sep-2013
case1 278 146 65 26 7
case2 105 15 0 0 0
case3 288 202 106 63 1
Script: (works for gnuplot>=4.4.0, March 2010 and gnuplot 5.x)
### histogram with sums and total sum
reset
FILE = "SO18583180.dat"
set style histogram rowstacked
set style data histograms
set style fill solid 0.8
set xlabel "Case"
set ylabel "Frequency"
set boxwidth 0.8
set key top left noautotitle
set grid y
set xrange [0:2]
set offsets 0.5,0.5,0,0
Total = 0
S0 = ''
addSums(v) = S0.sprintf(" %g",(M=$2,(N=$1+1)==1?S1=0:0,S1=S1+v))
plot for [i=2:6] FILE u i:xtic(1) notitle, \
'' matrix u (S0=addSums($3),Total=Total+$3,NaN) w p, \
'+' u 0:(real(S2=word(S0,int($0*N+N)))):(S2) every ::::M w labels offset 0,0.7 title sprintf("Total: %g",Total)
### end of script
Result: (created with gnuplot 4.4.0, Windows terminal)
Upvotes: 0
Reputation: 698
For linux and similar.
If you don't know the row number where your data is located, but you know it is in the n-th column of a row where the value of the m-th column is x, you can define a function
get_data(m,x,n,filename)=system('awk "\$'.m.'==\"'.x.'\"{print \$'.n.'}" '.filename)
and then use it, for example, as
y = get_data(1,"case2",4,"datafile.txt")
using data provided by user424855
print y
should return 15
Upvotes: 2
Reputation: 48390
Lets start with extracting a single cell at (row,col). If it is a single values, you can use the stats
command to extract the values. The row
and col
are specified with every
and using
, like in a plot command. In your case, to extract the total value, use:
# extract the 'total' cell
stats 'mydata.dat' every ::::0 using 2 nooutput
total = int(STATS_min)
To sum up all values in the second column, use:
stats 'mydata.dat' every ::1 using 2 nooutput
total2 = int(STATS_sum)
And finally, to sum up all values in columns 3:7
in all rows (i.e. the same like the previous command, but without using the saved totals) use:
# sum all values from columns 3:7 from all rows
stats 'mydata.dat' every ::1 using (sum[i=3:7] column(i)) nooutput
total3 = int(STATS_sum)
These commands require gnuplot 4.6 to work.
So, your plotting script could look like the following:
reset
set terminal pngcairo size 1024,768 enhanced
set output "output.png"
set style fill solid 1.00
set style histogram rowstacked
set style data histograms
set xlabel "Case"
set ylabel "Frequency"
set boxwidth 0.8
# extract the 'total' cell
stats 'mydata.dat' every ::::0 using 2 nooutput
total = int(STATS_min)
plot for [i=3:7] 'mydata.dat' every ::1 using i:xtic(1) notitle, \
'' every ::1 using 0:(s = sum [i=3:7] column(i), s):(sprintf('%d', s)) \
with labels offset 0,1 title sprintf('total %d', total)
which gives the following output:
Upvotes: 14