Box Plot
What is a box and whisker plot?
A box plot is a graphical representation of a dataset's distribution based on five summary statistics. The "box" spans the lower (1st) quartile (Q1) to the upper (3rd) quartile (Q3), with a center line at the median (Q2). The "whiskers" extend out from the box and typically represent one of three things: the minimum and maximum values (i.e. the range), Tukey's fences, which lie 1.5 times the interquartile range (IQR) beyond Q1 and Q3, or set percentiles.
Any dots that fall beyond the whiskers of a box plot may indicate potential outliers in the dataset. This outlier test is most commonly applied with Tukey's fences (i.e. 1.5 * IQR).
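As a rough illustration of the outlier test described above, here is a minimal Python sketch that computes the quartiles, the IQR, and Tukey's fences for a small made-up sample. The function name `tukey_summary` and the sample values are hypothetical, and the "inclusive" quartile method from Python's standard library is an assumption: quartile conventions differ between tools, and this page does not say which one Graphmatik uses.

```python
from statistics import quantiles


def tukey_summary(values, k=1.5):
    """Return quartiles, Tukey's fences, and the values outside the fences."""
    data = sorted(values)
    # Assumption: "inclusive" quartiles; other conventions can give
    # slightly different Q1 and Q3 for the same data.
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Values beyond either fence are flagged as potential outliers.
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "iqr": iqr,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


sample = [2, 4, 4, 5, 6, 7, 8, 9, 25]  # made-up data
summary = tukey_summary(sample)
print(summary)
```

For this sample, Q1 is 4, the median is 6, and Q3 is 8, so the IQR is 4 and the fences sit at -2 and 14. The value 25 lies above the upper fence and would be drawn as a potential outlier.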
To dot plot or NOT to dot plot?
The box plot provides quite a bit of information about the underlying distribution of a dataset, including whether the data is symmetrical, tightly grouped, or skewed, and whether it has outliers. That said, the shape of the distribution is generally not as simple to interpret as with a histogram, nor does the box plot provide any information about the number of observations.
This is why Graphmatik overlays a distribution dot plot on top of the generated box plot by default. You can of course toggle off these individual data points if you'd like.
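To see why the individual points add information, consider this small Python sketch. It builds two made-up datasets with different sizes and shapes that nonetheless share the same five-number summary, so their boxes and range whiskers would look identical. The helper `five_number_summary` is a hypothetical name, and the "inclusive" quartile method is an assumption rather than a documented Graphmatik behavior.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    data = sorted(values)
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return (data[0], q1, q2, q3, data[-1])


# Made-up data: 5 evenly spread values vs. 9 values clustered at 2 and 4.
evenly_spread = [1, 2, 3, 4, 5]
two_clusters = [1, 2, 2, 2, 3, 4, 4, 4, 5]

summary_a = five_number_summary(evenly_spread)
summary_b = five_number_summary(two_clusters)
print(summary_a)
print(summary_b)
print(len(evenly_spread), len(two_clusters))
```

Both summaries come out as minimum 1, Q1 2, median 3, Q3 4, and maximum 5, even though one dataset has 5 observations and the other has 9 with visible clustering. Only the overlaid points reveal that difference.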
If you toggle off points, outliers will still be shown, as is standard with box-and-whisker plots.
Chart properties
Prop | Default | Description |
---|---|---|
central tendency | median | median: The middle value of a sorted set of numbers. |
whiskers | range | range: The whiskers extend to the lowest and highest values in the set. 1.5 * Interquartile range (1.5*IQR): The whiskers mark Q1 - 1.5 * IQR and Q3 + 1.5 * IQR. 2.5 percentile - 97.5 percentile (2.5-97.5 %tile): The whiskers span from the 2.5th percentile to the 97.5th percentile, covering the middle 95% of the set. |
sort | none | none: The dataset is arranged in insertion order. ascending: The dataset is arranged from smallest to largest value. descending: The dataset is arranged from largest to smallest value. |
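
The three whisker options in the table can be sketched in Python as follows. This is only an illustration under stated assumptions: `whisker_bounds` and its mode strings are hypothetical names chosen to mirror the table labels, not Graphmatik's actual API, and the "inclusive" quantile method is assumed.

```python
from statistics import quantiles


def whisker_bounds(values, mode="range"):
    """Return (lower, upper) whisker positions for a hypothetical mode name."""
    data = sorted(values)
    if mode == "range":
        # Whiskers reach the lowest and highest values.
        return (data[0], data[-1])
    if mode == "1.5*IQR":
        q1, _, q3 = quantiles(data, n=4, method="inclusive")
        iqr = q3 - q1
        return (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    if mode == "2.5-97.5 %tile":
        # 40 equal groups give cut points every 2.5%; the first and last
        # cut points are the 2.5th and 97.5th percentiles.
        cuts = quantiles(data, n=40, method="inclusive")
        return (cuts[0], cuts[-1])
    raise ValueError(f"unknown whisker mode: {mode!r}")


values = list(range(1, 42))  # made-up data: the integers 1 through 41
range_bounds = whisker_bounds(values, "range")
iqr_bounds = whisker_bounds(values, "1.5*IQR")
pct_bounds = whisker_bounds(values, "2.5-97.5 %tile")
print(range_bounds, iqr_bounds, pct_bounds)
```

For the integers 1 through 41, the range option gives 1 and 41, the 1.5 * IQR option gives -19 and 61 (Q1 is 11, Q3 is 31, IQR is 20), and the percentile option gives 2 and 40.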