Grouped Box Plot

A grouped visualization that uses quartiles to highlight the distribution.

What is a grouped box plot?

Grouped box plots display clustered data using five key sample statistics. The "box" spans the interquartile range (IQR): its edges mark the first quartile (Q1) and the third quartile (Q3), and a central line marks the median (Q2). The "whiskers", which extend outward from the box, commonly represent the full data range, Tukey's fences (Q1 - 1.5 * IQR and Q3 + 1.5 * IQR), or specific percentiles. Clustering the data this way allows clear side-by-side comparisons both between groups and across their subcategories.
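As a rough illustration of the statistics described above, the sketch below computes the minimum, Q1, median, Q3, and maximum for every cluster and subgroup. The data, the names `five_number_summary` and `summaries`, and the choice of the "inclusive" quartile method are assumptions made for this example; the quartile method Graphmatik uses is not stated here, and other methods can give slightly different values.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # Assumption: "inclusive" quartiles; other tools may interpolate differently.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

# Hypothetical data: cluster -> subgroup -> observations.
data = {
    "Control": {"A": [1, 2, 3, 4, 5], "B": [2, 4, 6, 8, 10]},
    "Treated": {"A": [3, 5, 7, 9, 11], "B": [10, 20, 30, 40, 50]},
}

# One summary per (cluster, subgroup) pair, i.e. one per box in the chart.
summaries = {
    (cluster, subgroup): five_number_summary(values)
    for cluster, subgroups in data.items()
    for subgroup, values in subgroups.items()
}

for key, summary in summaries.items():
    print(key, summary)
```

Each entry in `summaries` corresponds to one box: the middle three numbers define the box and its median line, while the first and last give the whisker ends when whiskers show the full range.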

Dots beyond the whiskers of a box plot are potential outliers. This outlier test is most commonly applied with Tukey's fences (i.e., points below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR).
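The outlier test can be sketched in a few lines. This is a minimal example, not Graphmatik's implementation: the function names `tukey_fences` and `split_outliers`, the sample values, and the "inclusive" quartile method are all assumptions for illustration.

```python
from statistics import quantiles

def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k * IQR and Q3 + k * IQR."""
    # Assumption: "inclusive" quartiles; the fences shift slightly with other methods.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def split_outliers(values, k=1.5):
    """Split values into those inside the fences and potential outliers."""
    lower, upper = tukey_fences(values, k)
    inside = [v for v in values if lower <= v <= upper]
    outliers = [v for v in values if v < lower or v > upper]
    return inside, outliers

# Hypothetical sample with one value far from the rest.
sample = [2, 3, 4, 5, 6, 7, 8, 30]
inside, outliers = split_outliers(sample)
print(tukey_fences(sample))  # fences for this sample
print(outliers)              # values flagged as potential outliers
```

A flagged point is only a candidate: the test marks values that are unusually far from the box, and whether they are errors or genuine extremes still needs judgment.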
Default grouped box plot
Includes an overlay of individual points showing the spread of the data by default.
Grouped vertical box plot showing two clusters and three subgroups with Graphmatik's default styling applied.

To dot plot or NOT to dot plot?

Box plots offer valuable insights into a dataset's underlying distribution, revealing symmetry, data concentration, skewness, and potential outliers. However, the precise shape of the distribution is often harder to discern from a box plot than from a histogram, and a box plot gives no information about the total number of observations.

This is why Graphmatik overlays a distribution dot plot on top of the generated box plot by default. You can of course toggle off these individual data points if you'd like.

Even if you toggle off points, outliers will still be shown, as is standard with box-and-whisker plots.
Make sure to check out these tips on how to create beautiful grouped plots
Easily swap directions
Alternate between vertical and horizontal box plots by toggling between columns or rows.
A horizontal grouped box plot showing two clusters, each containing three subgroups arranged from smallest to largest. The overall largest subgroup across clusters is highlighted in purple.

Chart properties

Prop: central tendency
Default: median
Description:
    median: The middle value of a sorted set of numbers.

Prop: whiskers
Default: range
Description:
    range: The span between the lowest and highest values in a set.
    1.5 * Interquartile range (1.5*IQR): A range from Q1 - 1.5 * IQR to Q3 + 1.5 * IQR.
    2.5 percentile - 97.5 percentile (2.5-97.5 %tile): The span between the 2.5th and 97.5th percentiles, representing the middle 95% of a set.

Prop: sort
Default: none
Description:
    none: Clusters are arranged in insertion order.
    ascending: Clusters are arranged from smallest to largest.
    descending: Clusters are arranged from largest to smallest.

Prop: group by
Default: factor
Description:
    factor: Boxes will be grouped by the selected factor, with the other factor defining the subgroups.
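
To make the three whiskers settings concrete, here is a small sketch of how the whisker endpoints could be computed for each one. The function `whisker_bounds` and its mode strings are hypothetical (the strings simply reuse the option labels listed above), and the "inclusive" quantile method is an assumption; none of this is Graphmatik's actual API.

```python
from statistics import quantiles

def whisker_bounds(values, mode="range"):
    """Return (low, high) whisker endpoints for one box under a given setting."""
    if mode == "range":
        # Whiskers reach the lowest and highest observations.
        return min(values), max(values)
    if mode == "1.5*IQR":
        # Whiskers sit at the fences Q1 - 1.5 * IQR and Q3 + 1.5 * IQR.
        # Note: many tools instead stop at the farthest observation inside them.
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        return q1 - 1.5 * iqr, q3 + 1.5 * iqr
    if mode == "2.5-97.5 %tile":
        # n=40 yields cut points every 2.5%, so the first and last
        # are the 2.5th and 97.5th percentiles.
        cuts = quantiles(values, n=40, method="inclusive")
        return cuts[0], cuts[-1]
    raise ValueError(f"unknown whiskers mode: {mode!r}")

# Hypothetical observations for a single subgroup: the integers 1 through 41.
values = list(range(1, 42))
for mode in ("range", "1.5*IQR", "2.5-97.5 %tile"):
    print(mode, whisker_bounds(values, mode))
```

With the range setting no point can fall outside the whiskers, whereas the other two settings leave room for points beyond them, which is where potential outliers appear.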