Reputation: 2441
I have a dataset of 818,741 samples. Values range between 0 and 7276. I am using the following gnuplot script to plot the data:
#+begin_src gnuplot :var data=xtics :exports code :file file.png
reset
set term png
set output "data.png"
set title "Variations/entity"
set xlabel "entity"
set xtics rotate by -45
set yrange [0:7276]
set ylabel "# of variations"
plot 'sort_1.txt' u 2:xticlabels(1) w lp lw 2 notitle
#+end_src
The problem is that the curve becomes a straight line when I use the full dataset of 818,741 samples. I can no longer see the distribution of the data. What kind of plot do you suggest?
entity # of variations
E0669803 7276
E0726485 496
E0679687 459
E0159288 395
E0018102 337
E0498282 333
E0349508 322
E0566375 315
E0096588 314
E0182788 313
E0595006 312
E0550909 291
E0338738 290
E0031352 290
E0409686 284
E0576457 279
E0277375 275
E0277379 0
The following script is for the whole dataset; it plots only every 100,000th point. I don't think I can do any better.
#+begin_src gnuplot :var data=xtics :exports code :file file.png
reset
set term png
set output "data.png"
set title "Variations/entity"
set xlabel "entity"
set xtics rotate by -90
set yrange [0:7276]
set ylabel "# of variations"
plot 'data.txt' u 2:xticlabels(1) every 100000 w lp lw 2 notitle
#+end_src
Upvotes: 0
Views: 2061
Reputation: 48430
If you want to extract statistical information from your data sample, try boxplots, one for each entity:
set yrange [0:7276]
set style fill solid 0.25 border -1
set style boxplot nooutliers pointtype 7 separation 2
set boxwidth 1
plot "data.txt" using (1.0):2:(0):1 with boxplot notitle
This draws one boxplot for each unique string value in the first column (your "entity"), grouping all samples that share that value.
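For reference, the boxplot commands can be combined with the terminal and label settings from the question into one complete script. This is only a sketch: the output name boxplot.png is a placeholder, and it assumes data.txt holds the entity in column 1 and the number of variations in column 2, as in the sample shown in the question.

```gnuplot
reset
set terminal png
set output "boxplot.png"            # placeholder output name (assumption)
set title "Variations/entity"
set xlabel "entity"
set ylabel "# of variations"
set yrange [0:7276]
set xtics rotate by -45

# Box appearance: lightly filled boxes, outliers hidden
set style fill solid 0.25 border -1
set style boxplot nooutliers pointtype 7 separation 2
set boxwidth 1

# using x:data:width:factor
#   (1.0) = x position of the first box
#   2     = column with the values to summarise
#   (0)   = use the default box width
#   1     = column whose distinct strings define the groups
plot "data.txt" using (1.0):2:(0):1 with boxplot notitle

# Variant, assuming each entity appears on only one row: drop the
# factor column to get a single box summarising the whole column.
# plot "data.txt" using (1.0):2 with boxplot notitle
```

If every entity occurs only once in the file, each per-entity box collapses to a single value, so the commented-out variant may be the more informative of the two.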
Upvotes: 2