Hani Goc

Reputation: 2441

How to plot a large dataset with more than 800,000 samples using gnuplot

I have a dataset of 818,741 samples. The values range between 0 and 7276. I am using the following gnuplot script to plot the data.

#+begin_src gnuplot :var data=xtics :exports code :file file.png
  reset
  set term png 
  set output "data.png" 
  set title "Variations/entity"
  
  set xlabel "entity"
  set xtics rotate by -45

  set yrange [0:7276]
  set ylabel "# of variations"

  plot 'sort_1.txt' u 2:xticlabels(1) w lp lw 2 notitle 
#+end_src

Problem

The problem is that the curve becomes a straight line when I use the dataset with the 818,741 samples. I cannot see the distribution of the data anymore. What kind of plot do you suggest?


Sample data

entity   # of variations
E0669803 7276
E0726485 496
E0679687 459
E0159288 395
E0018102 337
E0498282 333
E0349508 322
E0566375 315
E0096588 314
E0182788 313
E0595006 312
E0550909 291
E0338738 290
E0031352 290
E0409686 284
E0576457 279
E0277375 275
E0277379 0



Update

The following script plots the whole dataset, using "every 100000" to keep only every 100,000th sample. I don't think I can do any better.

#+begin_src gnuplot :var data=xtics :exports code :file file.png
  reset
  set term png 
  set output "data.png" 
  set title "Variations/entity"
  
  set xlabel "entity"
  set xtics rotate by -90

  set yrange [0:7276]
  set ylabel "# of variations"

  plot 'data.txt'  u 2:xticlabels(1) every 100000 w lp lw 2 notitle
 
#+end_src


Upvotes: 0

Views: 2061

Answers (1)

Christoph

Reputation: 48430

If you want to extract statistical data from your data sample, try boxplots, one for each entity:

set yrange [0:7276]
set style fill solid 0.25 border -1
set style boxplot nooutliers pointtype 7 separation 2
set boxwidth 1

plot "data.txt" using (1.0):2:(0):1 with boxplot notitle

The fourth entry in the using specification, column 1, is treated as a grouping factor: all samples with the same string value in the first column (your "entity") go into the same boxplot, so one boxplot is generated for each unique entity.
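
If each entity appears only once in the file, as in the sample data shown in the question, a rank plot with logarithmic axes might show the distribution more clearly. The following is only a sketch and is not part of the approach above. It assumes that data.txt is already sorted by descending count and that a header line, if present, is skipped by gnuplot as invalid data. The output name data_log.png is a hypothetical choice.

```gnuplot
reset
set terminal png
set output "data_log.png"      # hypothetical output file name
set title "Variations/entity"

set xlabel "entity rank"
set ylabel "# of variations"

# Log scale on both axes so the single large value (7276) does not
# flatten the rest of the curve.
set logscale xy

# $0 is the zero-based record number; add 1 so the log x axis starts at 1.
# Counts of 0 cannot be drawn on a log axis, so they are mapped to 1/0
# (undefined) and skipped by gnuplot.
plot 'data.txt' using ($0+1):($2 > 0 ? $2 : 1/0) with lines lw 2 notitle
```

The entity labels are dropped here on purpose, because more than 800,000 tick labels cannot be read anyway. The rank on the x axis stands in for them.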

Upvotes: 2
