How to group boxplot outliers in gnuplot


I have a large set of data points. I am trying to plot them as a boxplot, but some of the outliers have exactly the same value, and they are drawn on a line beside each other. I found "How to set the horizontal distance between outliers in gnuplot boxplot", but it doesn't help much, as that is apparently not possible.

Is it possible to group the outliers together, print one point, and then print a number in brackets beside it to indicate how many points there are? I think this would make the graph more readable.

For information, I have 3 boxplots for each x value, and that times 6 in one graph. I am using gnuplot 5 and have already played around with the pointsize, but that doesn't reduce the distance any further. I hope you can help!

Edit:

    set terminal pdf
    set output 'dat.pdf'
    file0 = 'dat1.dat'
    file1 = 'dat2.dat'
    file2 = 'dat3.dat'
    set pointsize 0.2
    unset title
    set xlabel 'x'
    set ylabel 'y'
    header = system('head -1 '.file0)
    n = words(header)

    set xtics ('' 1)
    set for [i=1:n] xtics add (word(header, i) i)

    set style data boxplot
    plot file0 using (1-0.25):1:(0.2) with boxplot lw 2 lc rgb '#8b0000' fs pattern 16 title 'a', \
         file1 using (1):1:(0.2) with boxplot lw 2 lc rgb '#00008b' fs pattern 4 title 'b', \
         file2 using (1+0.25):1:(0.2) with boxplot lw 2 lc rgb '#006400' fs pattern 5 title 'c', \
         for [i=2:n] file0 using (i-0.25):i:(0.2) with boxplot lw 2 lc rgb '#8b0000' fs pattern 16 notitle, \
         for [i=2:n] file1 using (i):i:(0.2) with boxplot lw 2 lc rgb '#00008b' fs pattern 4 notitle, \
         for [i=2:n] file2 using (i+0.25):i:(0.2) with boxplot lw 2 lc rgb '#006400' fs pattern 5 notitle

What is the best way to implement this with the code above in place?

There is no option to have this done automatically. The steps required to do it manually in gnuplot are:

(In the following I assume that the data file data.dat has only a single column.)

  1. Analyze your data with stats to determine the boundaries for the outliers:

    stats 'data.dat' using 1
    range = 1.5 # (this is the default value of `set style boxplot range`)
    lower_limit = STATS_lo_quartile - range*(STATS_up_quartile - STATS_lo_quartile)
    upper_limit = STATS_up_quartile + range*(STATS_up_quartile - STATS_lo_quartile)

  2. Count the outliers and write them to a temporary file:

    set table 'tmp.dat'
    plot 'data.dat' using 1:($1 > upper_limit || $1 < lower_limit ? 1 : 0) smooth frequency
    unset table

  3. Plot the boxplot without the outliers, and plot the outliers with the labels plotting style:

    set style boxplot nooutliers
    plot 'data.dat' using (1):1 with boxplot, \
         'tmp.dat' using (1):($2 > 0 ? $1 : 1/0):(sprintf('(%d)', int($2))) with labels offset 1,0 left point pt 7

And this needs to be done for every single boxplot.
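Repeating the three steps by hand gets tedious with several columns, so here is a minimal, untested sketch of how they might be wrapped in a loop. It assumes a single file `data.dat` with `n` purely numeric columns and no header line. The names `file`, `n`, `iqr` and the `tmp_<i>.dat` file names are placeholders chosen for illustration and do not come from the question.

```gnuplot
# Sketch only: assumes 'data.dat' holds n numeric columns and no header line.
file  = 'data.dat'   # hypothetical input file
n     = 3            # number of columns (assumed)
range = 1.5          # default value of `set style boxplot range`

# Steps 1 and 2 for every column: compute the limits, then count the outliers
do for [i=1:n] {
    stats file using i nooutput
    iqr = STATS_up_quartile - STATS_lo_quartile
    lower_limit = STATS_lo_quartile - range*iqr
    upper_limit = STATS_up_quartile + range*iqr

    # One temporary file per column: value, number of outliers with that value
    set table sprintf('tmp_%d.dat', i)
    plot file using i:(column(i) > upper_limit || column(i) < lower_limit ? 1 : 0) smooth frequency
    unset table
}

# Step 3: boxplots without outliers, plus one labelled point per outlier value
set style boxplot nooutliers
plot for [i=1:n] file using (i):i with boxplot notitle, \
     for [i=1:n] sprintf('tmp_%d.dat', i) using (i):($2 > 0 ? $1 : 1/0):(sprintf('(%d)', int($2))) with labels offset 1,0 left point pt 7 notitle
```

For the layout in the question (three files side by side at each x position), the same loop would presumably have to run once per file, with the x expression shifted by -0.25, 0 or +0.25 as in the original plot command and a separate set of temporary files for each input file.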

Disclaimer: this procedure should basically work, but having no example data I couldn't test it.

