Scatter Plot

An essential tool for revealing patterns, correlations, outliers, and clusters within datasets.

What is a scatter plot?

A scatter plot uses individual data points to display the relationship between two separate numerical variables. Each point on the plot represents a single observation, with its position determined by its values on the horizontal and vertical axes. Scatter plots are primarily used to identify patterns, data clusters, or relationships between variables.

By observing patterns in the plot, you can discern the type of correlation present: positive (points trend upward from left to right), negative (points trend downward), or none (points show no clear trend). Scatter plots also show the dispersion and clustering of the data, and can help you spot outliers that deviate from the general trend.
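The direction of a trend can also be summarized numerically. The sketch below computes the Pearson correlation coefficient with the Python standard library; the function names and the 0.3 cutoff are illustrative assumptions and are not part of Graphmatik.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

def describe_trend(r, cutoff=0.3):
    """Label a coefficient; the cutoff is an arbitrary illustrative choice."""
    if r >= cutoff:
        return "positive"
    if r <= -cutoff:
        return "negative"
    return "no clear trend"
```

A coefficient near +1 or -1 corresponds to points that fall close to a rising or falling straight line; a value near 0 means there is no clear linear trend, although a curved relationship may still be visible in the plot.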

Due to their versatility, scatter plots are an excellent starting point for data exploration, helping to quickly identify trends and clusters within a dataset.

Default scatter plot

A chart with an exceptional data-to-ink ratio

Scatter plot showing two data clusters highlighted in gray and red.

Data-ink ratio – A principle for improving charts by maximizing the data and minimizing clutter.

First-class support for visualizing error

Data transparency is important, which is why Graphmatik scatter plots display the uncertainty in your data by default. Graphmatik automatically calculates the necessary summary statistics from replicate values and draws the corresponding error bars.

Error bars can easily be toggled on/off to suit your preferences.
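To make the summary statistics concrete, here is a minimal sketch of how a mean and its standard error (SEM) could be computed from replicate values. The replicate data and the `mean_and_sem` helper are hypothetical and only illustrate the arithmetic, not Graphmatik's internals.

```python
import math
import statistics

def mean_and_sem(replicates):
    """Return (mean, standard error of the mean) for one group of replicates."""
    n = len(replicates)
    if n < 2:
        raise ValueError("at least two replicates are needed to estimate error")
    mean = statistics.fmean(replicates)
    sd = statistics.stdev(replicates)  # sample standard deviation (n - 1)
    return mean, sd / math.sqrt(n)

# Hypothetical replicate Y values measured at each X value
replicates_by_x = {
    1.0: [3.1, 3.6, 3.8],
    2.0: [4.2, 4.9, 4.7],
}

for x, ys in replicates_by_x.items():
    mean, sem = mean_and_sem(ys)
    print(f"x={x}: point at y={mean:.3f}, error bar of +/-{sem:.3f}")
```

Each X value yields one plotted point (the mean) and one error bar (here the SEM).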

Scatter plot showing individual data points with error bars.

Scatter plot displaying data points and their variability via error bars. A linear trendline suggests a positive correlation.

Simple regression analysis

Easily add a line of best fit to scatter plots. Graphmatik can fit linear, polynomial, exponential, logarithmic, and power trendlines via ordinary least squares (OLS) regression. The estimated equation parameters and the R-squared value (the coefficient of determination) can be found in the stats workspace.

R-squared measures the proportion of variance in the dependent variable that can be explained by the independent variable in a regression model.
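As a rough illustration of what a least-squares line and its R-squared value represent, the sketch below fits a straight line by hand. The function names are made up for this example, and the code is not Graphmatik's implementation.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Proportion of the variance in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("all y values are identical; R-squared is undefined")
    return 1 - ss_res / ss_tot
```

An R-squared of 1 means every point lies exactly on the line; values closer to 0 mean the line explains little of the variation in Y.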

Linear

Exponential

Logarithmic

Power

Polynomial

Interpolate from a curve

It is easy to interpolate from a regression curve in Graphmatik. After running an analysis, switch to the stats workspace and select the interpolate tab. A table will appear where you can enter values into X or Y columns.

Provide a value for X, and Graphmatik will interpolate a value for Y. Enter a value for Y, and Graphmatik will estimate the corresponding value of X.

Standard curve plotting individual data points with error bars, demonstrating a linear relationship for predicting unknown values.

X	Y
1	3.5265
2	4.6003
5	7.8215
11.685	15
20.998	25
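For a linear standard curve, interpolating in either direction comes down to evaluating or inverting the fitted equation. The sketch below uses a slope and intercept that were inferred approximately from the example table above; they are an assumption for illustration, not parameters reported by Graphmatik.

```python
def predict_y(x, slope, intercept):
    """Estimate Y for a given X on the line y = slope * x + intercept."""
    return slope * x + intercept

def predict_x(y, slope, intercept):
    """Estimate X for a given Y by inverting the same line."""
    if slope == 0:
        raise ValueError("a flat line cannot be inverted")
    return (y - intercept) / slope

# Approximate parameters inferred from the example table (assumption)
SLOPE = 1.0738
INTERCEPT = 2.4527

for x in (1, 2, 5):
    print(f"X={x} gives Y of about {predict_y(x, SLOPE, INTERCEPT):.4f}")
for y in (15, 25):
    print(f"Y={y} gives X of about {predict_x(y, SLOPE, INTERCEPT):.3f}")
```

Curved fits (exponential, logarithmic, power, polynomial) follow the same idea, but solving for X requires inverting the corresponding equation.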

Tips for creating beautiful scatter plots

Avoid overplotting

Overplotting occurs when overlapping data points make a scatter plot unreadable. To fix it, add strokes (outlines) to the points, lower their opacity, and/or use smaller dots.

High-density scatter plot with large, opaque data points that obscure the underlying data due to heavy overlap.

High-density scatter plot with semi-transparent data points. Overlapping points appear darker, revealing areas of higher data concentration.

Filters & Thresholds

Use filters and thresholds to highlight the data points that matter, for example the significantly increased or decreased values in a volcano plot.

Volcano plot illustrating changes in expression. Data points showing a significant increase are purple, while those with a significant decrease are sky blue.
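The logic behind a thresholded volcano plot can be sketched as a simple classification rule. The cutoffs below (a log2 fold change of 1 and a p-value of 0.05) and the gray fallback color are assumptions chosen for illustration; the purple and sky blue colors follow the caption above.

```python
def classify_point(log2_fold_change, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Return a color name for one point based on two thresholds."""
    significant = p_value < p_cutoff
    if significant and log2_fold_change >= fc_cutoff:
        return "purple"    # significant increase
    if significant and log2_fold_change <= -fc_cutoff:
        return "sky blue"  # significant decrease
    return "gray"          # below one or both thresholds (assumed color)

# Hypothetical (log2 fold change, p-value) pairs
points = [(2.3, 0.001), (-1.8, 0.01), (0.4, 0.0005), (1.5, 0.2)]
colors = [classify_point(fc, p) for fc, p in points]
print(colors)
```

Only points that pass both thresholds are colored, which keeps attention on the changes that are both large and statistically significant.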

Highlight groups

Grouping categories by color is an effective way to draw insights from clusters.

A single large cluster of data points is visible.

Three visually distinct clusters of data points are present, colored sky blue, magenta, and purple.

Chart properties

Prop	Default	Description
`central tendency`	mean	mean: the sum of a set of values divided by the number of values in the set. median: the middle value of a sorted set of numbers.
`error`	SEM	standard error of the mean (SEM; used with `mean`): how much sample means are expected to vary around the population mean. standard deviation (SD; used with `mean`): a measure of the variation of a set of values around their mean. 95% confidence interval (95% CI; used with `mean` or `median`): a range expected to contain the population parameter in 95% of repeated samples. range (used with `mean` or `median`): the difference between the highest and lowest values in a set. interquartile range (IQR; used with `median`): the spread of the middle 50% of a set of values (i.e. 3rd quartile - 1st quartile).
`radius`	med	Slider. Changes the size of the points between small, medium, and large.
`opacity`	100%	Slider. Changes the opacity of the points.
`regression`	linear	linear: fit a straight line to the data. polynomial: fit a curved polynomial trendline to the data; Graphmatik supports up to 10th-order polynomials. exponential: fit a curved exponential trendline to the data; best when the data increases or decreases at an increasingly higher rate. logarithmic: fit a curved logarithmic trendline to the data; best when the data increases or decreases quickly and then levels out. power: fit a curved power trendline to the data; best when the data increases at a specific rate.
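The median-based options in the `error` row can be illustrated with a short standard-library sketch. The `median_summary` helper is hypothetical, and quartile conventions differ between tools, so the IQR computed here (Python's default "exclusive" method) may not match Graphmatik's value exactly.

```python
import statistics

def median_summary(values):
    """Return the median, range, and interquartile range of a set of values."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    # quantiles(..., n=4) returns the three quartile cut points
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    return {
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
    }

summary = median_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(summary)
```

The range is sensitive to a single extreme value, whereas the IQR describes only the middle half of the data, which is why it pairs naturally with the median.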
