Txiki PerPar

Reputation: 35

gnuplot double chart from csv

I can't work out how to make a grouped (double) bar chart by age and gender. This is my data:

"20 to 30","Man",7
"20 to 30","Woman",5
"30 to 40","Man",3
"30 to 40","Woman",6
"40 to 50","Man",9
"40 to 50","Woman",2

I'm trying to get something like this:

[image: desired chart]

I've tried several options, like:

plot 'data.csv' using 3:xtic(2) with boxes ls 1,\
   'data.csv' using 3:xtic(2) with boxes ls 2

But the result looks like this:

[image: resulting plot]

Upvotes: 0

Views: 329

Answers (2)

theozh

Reputation: 26200

Of course, you can always preprocess your data with external tools so that it can be plotted easily with one of gnuplot's plotting styles. But in your case (with several string columns), I'm not sure whether gnuplot offers a suitable plotting style. At least, I haven't yet found a comparable example on www.gnuplot.info or anywhere else.
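As an illustration of the preprocessing route, here is a minimal Python sketch that pivots the question's long-format rows into one row per age group with one column per gender. The function name pivot and the header label "age" are made up for this example, and the data is inlined rather than read from data.csv.

```python
import csv
import io

# The question's data, inlined so the sketch is self-contained.
RAW = '''"20 to 30","Man",7
"20 to 30","Woman",5
"30 to 40","Man",3
"30 to 40","Woman",6
"40 to 50","Man",9
"40 to 50","Woman",2
'''

def pivot(rows):
    """Turn (age, gender, count) rows into one row per age, one column per gender."""
    ages, genders, table = [], [], {}
    for age, gender, count in rows:
        # Remember each label the first time it appears, so input order is kept.
        if age not in ages:
            ages.append(age)
        if gender not in genders:
            genders.append(gender)
        table[(age, gender)] = count
    header = ["age"] + genders
    # Combinations that never occur in the input are left as empty strings.
    body = [[age] + [table.get((age, g), "") for g in genders] for age in ages]
    return [header] + body

wide = pivot(csv.reader(io.StringIO(RAW)))
for row in wide:
    print(",".join(row))
```

Written to a file, such a wide table has one data column per gender, which is the layout that gnuplot's clustered histogram style expects (one "using" entry per column, with xtic(1) for the age labels).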

Ethan's script is shorter, but only works nicely because the categories "male" and "female" are strictly alternating in the data.

In my more general approach, the rows can be in any order, but then you need to create lists of the unique values (in your case for age and gender). In Python, for example, this would simply be set(list). In gnuplot you have to implement this yourself.
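For comparison, a small Python sketch of the "unique list" step. One caveat: set() does not guarantee any particular order, whereas the gnuplot script keeps entries in order of first occurrence, so the sketch uses dict.fromkeys instead. The helper name unique_in_order is made up for this example.

```python
def unique_in_order(values):
    """Return the distinct values, keeping the order of first occurrence."""
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(values))

# Same rows as in the script's $Data block, deliberately not sorted.
rows = [
    ("20 to 30", "male", 7),
    ("30 to 40", "male", 3),
    ("40 to 50", "female", 2),
    ("40 to 50", "male", 9),
    ("30 to 40", "female", 6),
    ("20 to 30", "female", 5),
]

ages = unique_in_order(r[0] for r in rows)
genders = unique_in_order(r[1] for r in rows)
print(ages)     # ['20 to 30', '30 to 40', '40 to 50']
print(genders)  # ['male', 'female']
```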

Then plot the data in loops, using the ternary operator (check "help ternary") to "filter" the data. Note that I'm not using any histogram style, just "with boxes". With the additional parameters gap and boxWidth and a list of colors, you can easily fine-tune your graph. I hope you can adapt the script below to your needs.

Script: (works for gnuplot>=5.0.0, except for 5.0.4 for some unknown reason)

### grouped bar chart with several string columns
reset session

# data can be random, xtics of bars will be in order of first occurrence
$Data <<EOD
"20 to 30", "male",   7
"30 to 40", "male",   3
"40 to 50", "female", 2
"40 to 50", "male",   9
"30 to 40", "female", 6
"20 to 30", "female", 5
EOD

set datafile separator comma
colX    = 1     # here: age
colSubX = 2     # here: gender
colY    = 3

# create unique lists of entries
addUniq(list,col) = list.(_s=' "'.strcol(col).'"', strstrt(list,_s)>0 ? '' : _s)
Xs = SubXs = ''
stats $Data u (   Xs=addUniq(Xs,   colX))    nooutput
stats $Data u (SubXs=addUniq(SubXs,colSubX)) nooutput
N        = words(Xs)
M        = words(SubXs)
X(i)     = word(Xs,i)
SubX(i)  = word(SubXs,i)

# bar chart settings
gap         = 0.3    # relative gap between bar groups
boxWidth    = 0.8    # relative boxwidth
boxGrid     = (1.-gap)/M
xPos(n,m)   = n - 0.5 + gap/2. + boxGrid*(m-0.5)
yPos(n,m,c) = strcol(colX) eq X(n) && strcol(colSubX) eq SubX(m) ? column(c) : NaN
colors      = "#0000ff #ff0000 #00ff00"
color(i)    = word(colors,i)

set style fill solid 1.0
set key top left noautotitle
set xtics out nomirror
set xrange[0.5:N+0.5]
set yrange[0:]
set offset 0,0,graph 0.1, 0

plot for [n=1:N] for [m=1:M] \
     $Data u (xPos(n,m)):(yPos(n,m,colY)):(boxWidth*boxGrid) w boxes lc rgb color(m), \
     for [m=1:M] NaN ti SubX(m) w boxes lc rgb color(m), \
     for [n=1:N] $Data u (n):(NaN):xtic(X(n))
### end of script

Result:

[image: grouped bar chart with age groups on the x-axis and one bar per gender]

By using the same code but simply swapping the column numbers, i.e. colX = 2 and colSubX = 1, you will get the following:

[image: grouped bar chart with genders on the x-axis and one bar per age group]
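To see where the bars end up, the xPos formula from the script can be evaluated by hand. The Python sketch below just re-implements that arithmetic with the script's defaults (gap = 0.3, boxWidth = 0.8) for N = 3 groups and M = 2 bars per group. The Python names are for illustration only.

```python
def x_pos(n, m, M, gap=0.3):
    """Center of bar m (1..M) within group n, mirroring xPos(n,m) in the script."""
    box_grid = (1.0 - gap) / M          # horizontal slot reserved for each bar
    return n - 0.5 + gap / 2.0 + box_grid * (m - 0.5)

N, M = 3, 2
box_width = 0.8 * (1.0 - 0.3) / M       # boxWidth * boxGrid, i.e. 0.28 here

for n in range(1, N + 1):
    centers = [round(x_pos(n, m, M), 3) for m in range(1, M + 1)]
    print(n, centers)
# 1 [0.825, 1.175]
# 2 [1.825, 2.175]
# 3 [2.825, 3.175]
```

Each group is centered on its integer n, which is why the script can place the xtic labels at (n) and set xrange to [0.5:N+0.5].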

Upvotes: 1

Ethan

Reputation: 15143

SUDHANSHU SHEKHAR CHAURASIA was correct to point to an earlier answer. Your case is very similar and so is the solution.

$Data <<EOD
"20 to 30", "male",   7
"20 to 30", "female", 5
"30 to 40", "male",   3
"30 to 40", "female", 6
"40 to 50", "male",   9
"40 to 50", "female", 2
EOD
set datafile sep comma

set style data histogram
set style histogram cluster
set style fill solid

plot $Data every 2::0 using 3:xtic(1) title "Man", \
     $Data every 2::1 using 3 title "Woman"

[image: clustered histogram produced by the script above]

Pulling the title from column 2 might be possible, but I think it would depend on the exact version of gnuplot you are using.

Upvotes: 1
