
Scatter Plot

An essential tool for revealing patterns, correlations, outliers, and clusters within datasets.

What is a scatter plot?

A scatter plot uses individual data points to display the relationship between two separate numerical variables. Each point on the plot represents a single observation, with its position determined by its values on the horizontal and vertical axes. Scatter plots are primarily used to identify patterns, data clusters, or relationships between variables.

By observing patterns in the plot, you can discern the type of correlation present: positive (points trend upward), negative (points trend downward), or none (points show no clear trend). Scatter plots also show the dispersion and clustering of the data, and they help you spot outliers that deviate from the general trend.
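
If you want a number to go with the visual impression, one common choice is the Pearson correlation coefficient. The sketch below is only an illustration: the function names, the sample values, and the 0.3 cutoff are assumptions made for this example, not part of Graphmatik.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two paired samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


def describe_trend(r, threshold=0.3):
    """Label a trend from r. The 0.3 cutoff is an arbitrary illustrative choice."""
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "no clear trend"


# Hypothetical observations, invented for this example.
xs = [1, 2, 3, 4, 5]
rising = [2.1, 3.9, 6.2, 7.8, 10.1]
falling = [9.8, 8.1, 6.0, 4.2, 1.9]
scattered = [3, 1, 4, 1, 3]

for label, ys in (("rising", rising), ("falling", falling), ("scattered", scattered)):
    print(label, describe_trend(pearson_r(xs, ys)))
```

Pearson's r only captures straight-line relationships, so a plot can show a clear curved pattern even when r is close to zero.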

Due to their versatility, scatter plots are an excellent starting point for data exploration, helping to quickly identify trends and clusters within a dataset.

Default scatter plot: a chart with an exceptional data-ink ratio
Figure: Scatter plot showing two data clusters highlighted in gray and red.
Data-ink ratio: a principle for improving charts by maximizing the share of ink that shows data and minimizing clutter.

First-class support for visualizing error

Data transparency is important, which is why Graphmatik scatter plots display the uncertainty in your data by default. Graphmatik automatically calculates the necessary summary statistics from replicate values and draws the error bars for you.

Error bars can easily be toggled on/off to suit your preferences.
Figure: Scatter plot displaying individual data points and their variability via error bars. A linear trendline suggests a positive correlation.
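
To make "summary statistics from replicate values" concrete, here is a minimal sketch that computes the mean and the standard error of the mean (SEM) for each X value. The data and the function name are hypothetical, and this is not a description of Graphmatik's internal code.

```python
import math


def mean_and_sem(replicates):
    """Return (mean, SEM) for one group of replicate measurements."""
    n = len(replicates)
    if n < 2:
        raise ValueError("need at least 2 replicates to estimate spread")
    mean = sum(replicates) / n
    # Sample standard deviation (n - 1 in the denominator).
    sd = math.sqrt(sum((v - mean) ** 2 for v in replicates) / (n - 1))
    return mean, sd / math.sqrt(n)


# Hypothetical data: each X value maps to its replicate Y measurements.
data = {
    1.0: [2.9, 3.1, 3.0],
    2.0: [4.8, 5.2, 5.0],
}

# One point (the mean) and one error bar (mean +/- SEM) per X value.
error_bars = {x: mean_and_sem(ys) for x, ys in data.items()}
for x, (mean, sem) in error_bars.items():
    print(f"x={x}: {mean:.3f} +/- {sem:.3f}")
```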

Simple regression analysis

Easily add a line of best fit to scatter plots. Graphmatik can fit linear, polynomial, exponential, logarithmic, and power trendlines via ordinary least squares (OLS) regression. The estimated equation parameters and the R-squared value (the coefficient of determination) can be found in the stats workspace.

R-squared measures the proportion of variance in the dependent variable that can be explained by the independent variable in a regression model.
Linear: scatter plot fit with a linear trendline.
Exponential: scatter plot fit with an exponential trendline.
Logarithmic: scatter plot fit with a logarithmic trendline.
Power: scatter plot fit with a power trendline.
Polynomial: scatter plot fit with a third-order polynomial trendline.
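
For the linear case, the OLS estimates and R-squared have simple closed forms. The sketch below shows the textbook calculation on hypothetical data. The function names are made up for this example, and it is not claimed to be how Graphmatik computes its results.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Proportion of the variance in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("y is constant; R-squared is undefined")
    return 1 - ss_res / ss_tot


# Hypothetical observations, invented for this example.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 4]
slope, intercept = fit_line(xs, ys)
print(slope, intercept, r_squared(xs, ys, slope, intercept))
```

An R-squared of 1 means every point lies exactly on the line. Values near 0 mean the line explains little of the variation in Y.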

Interpolate from a curve

It is easy to interpolate from a regression curve in Graphmatik. After running an analysis, switch to the stats workspace and select the interpolate tab. A table will appear where you can enter values in either the X or the Y column.

Provide a value for X, and Graphmatik will interpolate a value for Y. Enter a value for Y, and Graphmatik will estimate the corresponding value of X.

Figure: Standard curve plotting individual data points with error bars, demonstrating a linear relationship for predicting unknown values.

X         Y
1         3.5265
2         4.6003
5         7.8215
11.685    15
20.998    25
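
For a straight-line fit, interpolation in both directions is a one-line calculation: Y = slope * X + intercept, and X = (Y - intercept) / slope. The sketch below assumes the example values above read as the pairs (1, 3.5265), (2, 4.6003), and (5, 7.8215), all taken from one straight line. It recovers that line and then solves for the X values at Y = 15 and Y = 25. The function names are hypothetical, and the recovered line is only an approximation of whatever model produced the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_y(x, slope, intercept):
    """Interpolate Y from a known X."""
    return slope * x + intercept


def predict_x(y, slope, intercept):
    """Interpolate X from a known Y by inverting the line."""
    if slope == 0:
        raise ValueError("a flat line cannot be inverted")
    return (y - intercept) / slope


# Assumed reading of the example rows: X entered, Y interpolated.
xs = [1, 2, 5]
ys = [3.5265, 4.6003, 7.8215]
slope, intercept = fit_line(xs, ys)

x_for_15 = predict_x(15, slope, intercept)
x_for_25 = predict_x(25, slope, intercept)
print(round(x_for_15, 2), round(x_for_25, 2))
```

Only a straight line can be inverted this simply. For a curve that is not monotonic, such as many polynomials, one Y value can correspond to several X values.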

Tips for creating beautiful scatter plots

Avoid overplotting
Overplotting happens when overlapping data points make a scatter plot unreadable. To fix it, add strokes, lower the opacity, and/or use smaller dots.
High-density scatter plot with large, opaque data points that obscure the underlying data due to heavy overlap.
High-density scatter plot where overlapping data points are shown with reduced opacity, revealing areas of higher data concentration.
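
Why does lowering the opacity reveal density? Under standard "over" alpha compositing, n overlapping points of the same color, each drawn with opacity a, combine to an opacity of 1 - (1 - a)^n. The short sketch below tabulates that formula. It assumes the renderer uses this compositing rule, which is typical but is not stated for Graphmatik, and the function name is made up for this example.

```python
def stacked_opacity(alpha, n_points):
    """Combined opacity of n same-colored points stacked on top of each other."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if n_points < 0:
        raise ValueError("n_points must not be negative")
    return 1.0 - (1.0 - alpha) ** n_points


# Fully opaque points look the same whether 1 or 20 overlap;
# translucent points get visibly darker as more of them stack up.
for alpha in (1.0, 0.2):
    print(alpha, [round(stacked_opacity(alpha, n), 3) for n in (1, 5, 20)])
```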
Filters & thresholds
Use filters and thresholds to highlight the data points that meet criteria you define, so they stand out from the rest of the plot.
Volcano plot illustrating changes in expression. Data points showing a significant increase are purple, while those with a significant decrease are sky blue.
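
The thresholding behind a volcano plot can be sketched in a few lines. A volcano plot conventionally puts log2 fold change on the X axis and -log10 of the p-value on the Y axis. In the example below, the cutoffs (a log2 fold change of 1 and a p-value of 0.05), the gray color for non-significant points, and the function names are all assumptions chosen for illustration, not Graphmatik defaults.

```python
import math

# Purple and sky blue follow the figure description; gray is an assumption.
COLORS = {"increase": "purple", "decrease": "skyblue", "not significant": "gray"}


def classify(log2_fold_change, p_value, fc_threshold=1.0, p_threshold=0.05):
    """Label a point by applying a fold-change and a p-value threshold."""
    if p_value < p_threshold and log2_fold_change >= fc_threshold:
        return "increase"
    if p_value < p_threshold and log2_fold_change <= -fc_threshold:
        return "decrease"
    return "not significant"


def volcano_point(log2_fold_change, p_value):
    """Return (x, y, color) for one point on a volcano plot."""
    label = classify(log2_fold_change, p_value)
    return log2_fold_change, -math.log10(p_value), COLORS[label]


# Hypothetical measurements: (log2 fold change, p-value).
measurements = [(2.0, 0.001), (-1.5, 0.01), (2.0, 0.2), (0.3, 0.0001)]
for l2fc, p in measurements:
    print(volcano_point(l2fc, p))
```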
Highlight groups
Grouping categories by color is an effective way to draw insights from clusters.
A single large cluster of data points is visible.
Three visually distinct clusters of data points are present, colored sky blue, magenta, and purple.

Chart properties

Prop: central tendency
Default: mean
Options:
  mean: The sum of a set of values divided by the number of values in the set.
  median: The middle value of a sorted set of numbers.

Prop: error
Default: SEM
Options (each is available with the central tendency shown in parentheses):
  standard error of the mean, SEM (mean): An estimate of how far the sample mean is likely to be from the population mean.
  standard deviation, SD (mean): A measure of the variation of a set of values around their mean.
  95% confidence interval, 95% CI (mean or median): A range computed so that, over repeated sampling, 95% of such intervals contain the population parameter.
  range (mean or median): The difference between the highest and lowest values within a set.
  interquartile range, IQR (median): The spread of the middle 50% of a set of values (3rd quartile minus 1st quartile).

Prop: radius
Default: med
Control: slider
Description: Change the size of the points between small, medium, and large.

Prop: opacity
Default: 100%
Control: slider
Description: Change the opacity of the points.

Prop: regression
Default: linear
Options:
  linear: Fit a straight line to the data.
  polynomial: Fit a curved polynomial trendline to the data. Graphmatik supports up to 10th-order polynomials.
  exponential: Fit a curved exponential trendline to the data. Best when the data increase or decrease at an ever faster rate.
  logarithmic: Fit a curved logarithmic trendline to the data. Best when the data increase or decrease quickly and then level out.
  power: Fit a curved power trendline to the data. Best when Y changes in proportion to a fixed power of X.
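
The median-based options in the properties above (median, range, and IQR) can be reproduced with Python's standard library. This is a minimal sketch on hypothetical values. Quartile conventions differ between tools, and the "inclusive" method used here is an assumption, so the IQR may not match Graphmatik's output exactly.

```python
import statistics


def summarize(values):
    """Median, range, and interquartile range for one set of values."""
    ordered = sorted(values)
    # quantiles(n=4) returns the three cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "median": statistics.median(ordered),
        "range": ordered[-1] - ordered[0],
        "iqr": q3 - q1,
    }


# Hypothetical replicate values, invented for this example.
print(summarize([5, 1, 4, 2, 3]))
```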