Charts

Box Plot

A classic plot for displaying a data distribution with quartiles.

What is a box and whisker plot?

A box plot is a graphical representation of a dataset's distribution based on five summary statistics. The "box" spans the lower 1st quartile (Q1) to the upper 3rd quartile (Q3), with a center line at the median (Q2). The "whiskers" extend out from the box and typically represent one of three things: the minimum and maximum values (i.e. the range), Tukey's fences (Q1 - 1.5 * IQR and Q3 + 1.5 * IQR, where IQR is the interquartile range, Q3 - Q1), or set percentiles.
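As a rough illustration of the five statistics described above, here is a minimal Python sketch. The function name `five_number_summary` and the sample data are made up for this example, and it is not a description of how Graphmatik computes its values. Quartile conventions differ between tools; this sketch assumes the "inclusive" interpolation method from Python's standard `statistics` module.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    # Assumption: "inclusive" interpolation, which treats the sample as the
    # whole population. Other tools may use a different quartile rule.
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, q2, q3, ordered[-1]

# Hypothetical sample data
sample = [2, 4, 4, 5, 7, 8, 9, 11, 12]
minimum, q1, median, q3, maximum = five_number_summary(sample)
print(f"min={minimum} Q1={q1} median={median} Q3={q3} max={maximum}")
```

The box would be drawn from Q1 to Q3 with a line at the median, and in the simplest (range) case the whiskers would reach the minimum and maximum.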

Any dots that fall beyond the whiskers of a box plot may indicate potential outliers in the dataset. This outlier test is most commonly applied with Tukey's fences (i.e. 1.5 * IQR).
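The Tukey's fences outlier test can be sketched in a few lines of Python. The helper name `tukey_outliers`, the parameter `k`, and the sample data are hypothetical, and the quartile method ("inclusive" interpolation) is an assumption, so results may differ slightly from other tools.

```python
import statistics

def tukey_outliers(values, k=1.5):
    """Return the values lying outside Tukey's fences (Q1 - k*IQR, Q3 + k*IQR)."""
    # Assumption: quartiles use the "inclusive" interpolation method.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Keep only the points that fall strictly outside either fence.
    return [v for v in values if v < lower_fence or v > upper_fence]

# Hypothetical sample with one unusually large value
sample = [2, 4, 4, 5, 7, 8, 9, 11, 30]
print(tukey_outliers(sample))
```

Here Q1 is 4 and Q3 is 9, so the IQR is 5 and the fences sit at -3.5 and 16.5. Only the value 30 falls outside them.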

Statistician John Tukey is widely credited with creating the box plot for exploratory data analysis.
Box plot summaries are a convenient way to quickly compare differences between groups.
[Figure: Default vertical box plot with Graphmatik's default styling applied. Includes an overlay of the individual points representing the distribution of the data.]

To dot plot or NOT to dot plot?

The box plot provides quite a bit of information about the underlying distribution of a dataset, including whether the data is symmetrical, tightly grouped, skewed, or contains outliers. That said, the shape of the distribution is generally not as easy to interpret as with a histogram, and the box plot alone gives no information about the number of observations.

This is why Graphmatik overlays a distribution dot plot on top of the generated box plot by default. You can of course toggle off these individual data points if you'd like.

Even if you toggle off points, outliers will still be shown, as is standard with box-and-whisker plots.
Easily swap directions
Alternate between vertical or horizontal box plots by switching types within the data workspace.
[Figure: A horizontal box plot showing four groups in descending order. The largest group (on top) is highlighted in purple.]

Chart properties

| Prop | Default | Option | Description |
| --- | --- | --- | --- |
| central tendency | median | median | The middlemost value of a sorted set of numbers. |
| whiskers | range | range | The difference between the highest and lowest values within a set. |
| | | 1.5 * Interquartile range (1.5*IQR) | A range representing Q1 - 1.5 * IQR and Q3 + 1.5 * IQR. |
| | | 2.5 percentile - 97.5 percentile (2.5-97.5 %tile) | The difference between the 2.5 percentile and the 97.5 percentile, representing the middle 95% of a set. |
| sort | none | none | The dataset is arranged in insertion order. |
| | | ascending | The dataset is arranged from smallest to largest value. |
| | | descending | The dataset is arranged from largest to smallest value. |
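
To make the three whisker options more concrete, the following Python sketch computes whisker endpoints for each one. Everything here is illustrative: the function name `whisker_bounds` and the mode strings are hypothetical and are not Graphmatik's API. Two assumptions are worth flagging: quantiles use "inclusive" interpolation, and in the 1.5 * IQR mode the whiskers are drawn to the most extreme data points that still lie inside the fences, which is a common convention but may not match every tool.

```python
import statistics

def whisker_bounds(values, mode="range"):
    """Return (lower, upper) whisker endpoints for a hypothetical whisker mode."""
    ordered = sorted(values)
    if mode == "range":
        # Whiskers reach the minimum and maximum values.
        return ordered[0], ordered[-1]
    if mode == "iqr":
        q1, _, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
        iqr = q3 - q1
        low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        # Assumption: whiskers stop at the farthest points inside the fences.
        inside = [v for v in ordered if low_fence <= v <= high_fence]
        return inside[0], inside[-1]
    if mode == "percentile":
        # n=40 yields cut points every 2.5%, so the first and last cut
        # points are the 2.5th and 97.5th percentiles.
        cuts = statistics.quantiles(ordered, n=40, method="inclusive")
        return cuts[0], cuts[-1]
    raise ValueError(f"unknown whisker mode: {mode}")

# Hypothetical sample data
sample = [2, 4, 4, 5, 7, 8, 9, 11, 30]
for mode in ("range", "iqr", "percentile"):
    print(mode, whisker_bounds(sample, mode))
```

With this sample, the range whiskers reach the value 30, while the 1.5 * IQR whiskers stop at 11 and leave 30 to be drawn as an outlier.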