How to group boxplot outliers in gnuplot
I have a large set of data points. When I try to plot them as a boxplot, some of the outliers have the exact same value, and they are drawn on a line beside each other. I found "How to set the horizontal distance between outliers in gnuplot boxplot", but it doesn't help much, as that is apparently not possible.
Is it possible to group the outliers together, print one point, and print a number in brackets beside it to indicate how many points there are? I think this would make the graph more readable.
For your information, I have three boxplots for one x value, and that times six in one graph. I am using gnuplot 5 and have already played around with the pointsize, but that doesn't reduce the distance any further. I hope you can help!
Edit:

    set terminal pdf
    set output 'dat.pdf'
    file0 = 'dat1.dat'
    file1 = 'dat2.dat'
    file2 = 'dat3.dat'
    set pointsize 0.2
    set notitle
    set xlabel 'x'
    set ylabel 'y'
    header = system('head -1 '.file0); n = words(header)
    set xtics ('' 1)
    set for [i=1:n] xtics add (word(header, i) i)
    set style data boxplot
    plot file0 using (1-0.25):1:(0.2) with boxplot lw 2 lc rgb '#8b0000' fs pattern 16 title 'a', \
         file1 using (1):1:(0.2) with boxplot lw 2 lc rgb '#00008b' fs pattern 4 title 'b', \
         file2 using (1+0.25):1:(0.2) with boxplot lw 2 lc rgb '#006400' fs pattern 5 title 'c', \
         for [i=2:n] file0 using (i-0.25):i:(0.2) with boxplot lw 2 lc rgb '#8b0000' fs pattern 16 notitle, \
         for [i=2:n] file1 using (i):i:(0.2) with boxplot lw 2 lc rgb '#00008b' fs pattern 4 notitle, \
         for [i=2:n] file2 using (i+0.25):i:(0.2) with boxplot lw 2 lc rgb '#006400' fs pattern 5 notitle
What is the best way to implement this in the code I already have in place?
There is no option to have this done automatically. The required steps to do this manually in gnuplot are:
(In the following I assume that the data file data.dat has a single column.)
Analyze your data with stats to determine the boundaries for the outliers:

    stats 'data.dat' using 1
    range = 1.5 # (this is the default value of the `set style boxplot range` setting)
    lower_limit = STATS_lo_quartile - range*(STATS_up_quartile - STATS_lo_quartile)
    upper_limit = STATS_up_quartile + range*(STATS_up_quartile - STATS_lo_quartile)
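As a quick sanity check, the limits can be printed and compared against the extremes that stats also reports. This is only a small sketch: has_outliers is a name made up here, and it assumes lower_limit and upper_limit have just been computed as described above.

```gnuplot
# Run right after the stats command and the two limit definitions.
print sprintf('outlier limits: [%g, %g]', lower_limit, upper_limit)

# STATS_min and STATS_max are set by the same stats call.
# has_outliers is a hypothetical helper variable (1 = true, 0 = false).
has_outliers = (STATS_min < lower_limit || STATS_max > upper_limit)
print has_outliers ? 'outliers present' : 'no outliers in this column'
```

If no value lies outside the limits, the counting and label steps have nothing to draw for that column.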
Count only the outliers and write them to a temporary file:

    set table 'tmp.dat'
    plot 'data.dat' using 1:($1 > upper_limit || $1 < lower_limit ? 1 : 0) smooth frequency
    unset table
Plot the boxplot without the outliers, and the outliers with the labels plotting style:

    set style boxplot nooutliers
    plot 'data.dat' using (1):1 with boxplot,\
         'tmp.dat' using (1):($2 > 0 ? $1 : 1/0):(sprintf('(%d)', int($2))) with labels offset 1,0 left point pt 7
And this needs to be done for every single boxplot.
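One way this repetition might be scripted for the three files and n columns of the question is sketched below. It is untested, like the procedure itself. The names files, file, iqr, tmp and the tmp_<f>_<i>.dat naming scheme are made up for illustration. The sketch assumes that file0, file1, file2 and n are defined as in the question and that each file starts with exactly one header line (hence skip 1). Per-file colors, fill patterns and key titles are left out for brevity.

```gnuplot
# Sketch only: repeat the three steps for each file f and each column i.
files = file0.' '.file1.' '.file2
range = 1.5                          # default of `set style boxplot range`

do for [f=1:3] {
    file = word(files, f)
    do for [i=1:n] {
        # Step 1: quartiles of column i, then the outlier limits
        stats file skip 1 using i nooutput
        iqr = STATS_up_quartile - STATS_lo_quartile
        lower_limit = STATS_lo_quartile - range*iqr
        upper_limit = STATS_up_quartile + range*iqr

        # Step 2: count equal-valued outliers into one temporary file per boxplot.
        # `with points` is given explicitly so that a global
        # `set style data boxplot` does not apply to this helper plot.
        tmp = sprintf('tmp_%d_%d.dat', f, i)
        set table tmp
        plot file skip 1 using i:(column(i) > upper_limit || column(i) < lower_limit ? 1 : 0) \
             smooth frequency with points
        unset table
    }
}

# Step 3: boxplots without outliers, plus one labelled point per outlier value.
# (f-2)*0.25 reproduces the -0.25, 0, +0.25 offsets used in the question.
set style boxplot nooutliers
plot for [f=1:3] for [i=1:n] word(files, f) skip 1 \
         using (i + (f-2)*0.25):i:(0.2) with boxplot notitle, \
     for [f=1:3] for [i=1:n] sprintf('tmp_%d_%d.dat', f, i) \
         using (i + (f-2)*0.25):($2 > 0 ? $1 : 1/0):(sprintf('(%d)', int($2))) \
         with labels offset 1,0 left point pt 7 notitle
```

A column without any outliers produces a temporary file in which every point is undefined. gnuplot is expected to only warn about such a file and carry on, but that is worth checking with the real data.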
Disclaimer: This procedure should basically work, but having no example data I couldn't test it.