Andrew Reid - Ph.D.

Learning about correlation and partial correlation
Published on 2021-02-04 by Andrew Reid	#14

Note: Matlab code for the stuff below can be downloaded here.

Correlation: an overview

Correlation is, simply put, the degree to which you can predict one variable by knowing another variable. This relationship is often interchangeably referred to as an association or a dependency. Two variables that are independent of one another are always uncorrelated, although the reverse does not strictly hold: a Pearson coefficient only captures linear relationships, so two variables can be uncorrelated and still be related in a nonlinear way.

A Pearson correlation coefficient is the most common way to quantify a correlation between two continuous variables (i.e., variables expressed on an interval or ratio scale). It is usually denoted \(r\), or \(r_{xy}\) for variables \(X\) and \(Y\), and takes on values from -1 to 1, where 0 indicates no linear association, 1 indicates a perfect positive linear relationship (the two variables are identical once standardized), and -1 indicates a perfect negative one (they are exactly inverse once standardized).

Correlations can best be visualised using a scatterplot, which plots the individual data points of the two variables on opposing axes. A scatterplot shows both the univariate distributions (also shown as histograms below) and the bivariate distribution (how the two variables are related to one another).

As you can probably see from the above figures, a perfect positive correlation (\(r=1.0\)) or negative correlation (\(r=-1.0\)) has all the data points lying on a straight diagonal line, slanting upwards or downwards, respectively. In these two extreme situations (which you will never see in naturally observed phenomena), knowledge of \(X\) allows us to predict \(Y\) with complete certainty. Put another way, \(X\) shares 100% of its variance with \(Y\). Indeed, if you square \(r_{xy}\), this will give you the proportion of variance shared between the two variables (multiply by 100 for a percentage).

We can get a more intuitive understanding of Pearson coefficients if we convert our variables to Z-scores (or, equivalently, standardize them). A Z-score is obtained when we subtract the mean from all the values, and divide by the standard deviation:

\[ X_Z = \frac{X - \bar{X}}{s_x} \]

where \(\bar{X}\) is the mean of \(X\), and \(s_x\) is its standard deviation. When we transform our variables in this way, we now have a variable for which 0 is always the mean, and ±1 is always one standard deviation above or below the mean.
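To make this concrete, here is a minimal Python sketch of the standardization step. The function name `z_scores` and the grades are invented purely for illustration. It assumes the sample standard deviation, i.e., the one with \(N-1\) in the denominator.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values: subtract the mean, then divide by the sample SD."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (N - 1 in the denominator)
    return [(v - m) / s for v in values]

# Hypothetical grades, made up for this illustration
grades = [62.0, 71.0, 55.0, 80.0, 68.0, 74.0]
grades_z = z_scores(grades)

# After standardizing, the mean should be (numerically) 0 and the SD 1
print(f"mean = {mean(grades_z):.6f}, sd = {stdev(grades_z):.6f}")
```

Whatever the original units were (grades, incomes, millimetres), the standardized variable is always expressed in units of standard deviations from its own mean.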

Pearson coefficients make use of an interesting property of Z-scores, which is that if you square them, sum the squares, and divide by the number of data points (\(N\)) less one, you get exactly 1:

\[ \frac{\sum{x_Z^2}}{N-1}=1.0 \]

(\(x_Z\) here is a single data point in \(X_Z\)). This also is what you get when you multiply two identical variables together (\(Y_Z=X_Z\)):

\[ \frac{\sum{x_Z y_Z}}{N-1}=\frac{\sum{x_Z x_Z}}{N-1}=\frac{\sum{x_Z^2}}{N-1}=1.0 \]

When your variables are inverse (\(Y_Z=-X_Z\)), on the other hand, this sum becomes:

\[ \frac{\sum{x_Z y_Z}}{N-1}=\frac{\sum{-x_Z x_Z}}{N-1}=-\frac{\sum{x_Z^2}}{N-1}=-1.0 \]

You hopefully recognize these as the extreme cases in the scatterplots above. Indeed, for any two variables that are not identical or inverse, this scaled sum of products will lie somewhere in between -1 and 1, and the closer it is to 0, the weaker the linear association between the two variables. This scaled sum of products is actually the equation for a Pearson coefficient:

\[ r_{xy} = \frac{1}{N-1} \sum{x_Z y_Z} \]
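The equation above translates almost line for line into code. The following is a small sketch, not the downloadable Matlab code; the helper names `z_scores` and `pearson_r` and the data are assumptions made up for illustration. It also checks the two extreme cases described earlier.

```python
from statistics import mean, stdev

def z_scores(values):
    """Subtract the mean and divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def pearson_r(x, y):
    """Pearson coefficient: sum of Z-score products divided by N - 1."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    xz, yz = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(xz, yz)) / (len(x) - 1)

# Made-up data for illustration
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.0, 1.0, 5.0, 6.0, 9.0]

r_same = pearson_r(x, x)                    # identical variables
r_inverse = pearson_r(x, [-v for v in x])   # exactly inverse variables
r_mixed = pearson_r(x, y)                   # somewhere in between

print(f"{r_same:.2f} {r_inverse:.2f} {r_mixed:.2f}")
```

The first two values should come out as 1 and -1 (up to floating-point error), and the third should fall strictly between them.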

For those of you who are less than fond of equations, we can instead look at the same plots as above, but this time after converting our variables to Z-scores:

Here, I've also drawn in some squares and rectangles, corresponding to three data points. The (signed) areas of these shapes are the products of the corresponding \(x_Z\) and \(y_Z\) values; green areas are positive and yellow areas are negative. Our Pearson coefficient is proportional to the sum of these areas. Notably, for \(r_{xy}=1.0\), these areas are always positive, because the two variables always have the same sign; conversely, for \(r_{xy}=-1.0\) they are always negative. For \(r_{xy}=0.5\), some of these areas will be negative and some positive.

Intuitively, this shows us that the more we spread our points into all four quadrants, the lower our absolute Pearson coefficient will be. As the positive and negative areas balance out, our correlation will approach zero. This coefficient thus does a good job of capturing the degree to which our two variables are associated with one another.

The spectre of confounding

While a correlation coefficient can inform us how strongly two variables are associated, the interpretation of this association is often tricky, due to the problem of confounding. A confounder is a third variable that is correlated with both your variables of interest. The existence of such a variable (let's call it \(C\)) makes it very difficult to say anything conclusive about the relationship between \(X\) and \(Y\).

Let's consider a less abstract example. Say we were interested in understanding whether performance in school leads to better incomes later in life. We collect data via a questionnaire that records people's household incomes (\(Y\)) along with their average grades (\(X\)) from secondary school. We then compute a Pearson correlation from these values: \(r_{xy} = 0.43\).

If we square this value, we get \(r_{xy}^2 = 0.185\), indicating that school performance explains 18.5% of the variance in later household income. Interesting, right? We might be tempted to conclude that performance in school leads (on average) to better incomes.

The problem with this is that there are likely many factors we haven't considered. Low socioeconomic status (SES), for instance, can have a strong detrimental impact on the ability of students to develop their academic skills. This means that, for someone from a low-SES community, it is simply much more difficult to perform better in school, and subsequently to attain higher incomes. Compounding this problem, a low-SES background can itself impede one's ability to attain a higher income, for numerous reasons.

Failing to consider SES means that it is difficult for us to determine how much academic performance contributes to income, since much or even all of this association might be attributable to differing SES backgrounds. In other words, SES is a confounder.
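A quick simulation can illustrate how a confounder manufactures a correlation. In this sketch (simulated data under assumptions I have chosen for simplicity, not the questionnaire example), \(X\) and \(Y\) have no influence on each other at all; each is just the confounder \(C\) plus its own independent noise.

```python
import random
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson coefficient computed from deviations around the means."""
    mx, my = mean(x), mean(y)
    sx, sy = stdev(x), stdev(y)
    products = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return products / ((len(x) - 1) * sx * sy)

rng = random.Random(42)  # fixed seed so the sketch is repeatable
n = 5000

# Assumed model: C drives both X and Y; X and Y never interact directly
c = [rng.gauss(0, 1) for _ in range(n)]
x = [ci + rng.gauss(0, 1) for ci in c]
y = [ci + rng.gauss(0, 1) for ci in c]

r_xy = pearson_r(x, y)
print(f"r_xy = {r_xy:.2f}")
```

With the unit variances assumed here, the expected correlation between \(X\) and \(Y\) is 0.5, even though neither variable has any effect on the other.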

So how do I deal with confounding?

To deal with this sort of confounding, we need a way to assess the association between education and income after first accounting for the relationships between SES and education, and SES and income. We can use so-called partial correlations to do this.

Partial correlations can be summed up in the following Venn diagram:

In this diagram, circles represent a variable's variance, and overlapping circles represent variance that is shared between variables (also called covariance). Our objective with partial correlations is to determine how much shared variance (if any) is left between education and income (cyan), after we remove the shared variance with SES (white).

There are several ways to compute partial correlations, which are detailed in the Wikipedia entry linked above. One straightforward way is to use a so-called correlation matrix. This is simply a square matrix (let's call it \(\textbf{K}\)) with one row/column for each variable we are interested in, where each element \(k_{ij}\) is the Pearson coefficient for variables \(i\) and \(j\).

To get our partial correlations, we perform what's called matrix inversion on \(\textbf{K}\), to get a "precision matrix" \(\textbf{P}\):

\[ \textbf{P} = \textbf{K}^{-1} \]

Each element \(p_{ij}\) of \(\textbf{P}\) can be used to compute our partial correlation \(r_{ij}^\prime\) between variables \(i\) and \(j\), which accounts (or "adjusts") for the shared variance of the other variables:

\[ r_{ij}^\prime = -p_{ij} / \sqrt{p_{ii} p_{jj}} \]
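For the special case of just three variables, inverting the correlation matrix by hand and simplifying gives a well-known closed form, which may make the adjustment easier to see. Writing \(C\) for the third variable:

\[ r_{xy}^\prime = \frac{r_{xy} - r_{xc} r_{yc}}{\sqrt{(1 - r_{xc}^2)(1 - r_{yc}^2)}} \]

The numerator removes the part of \(r_{xy}\) that can be routed through \(C\), and the denominator rescales by the variance in \(X\) and \(Y\) that \(C\) leaves unexplained. If \(C\) is uncorrelated with both \(X\) and \(Y\), the expression reduces to \(r_{xy}\) itself.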

An example

So let's try this out with our example. Say that we also recorded SES (which we'll call \(C\)) in our questionnaire, and we compute the correlations \(r_{xc} = 0.5\) and \(r_{yc} = 0.39\). (Disclaimer: these are completely fabricated values!).

We can plug these values into a correlation matrix (note that \(k_{ij}\) is the same as \(k_{ji}\)), where the rows/columns are ordered \(X\), \(Y\), \(C\):

\[ \begin{bmatrix} 1.00 & 0.43 & 0.50\\ 0.43 & 1.00 & 0.39\\ 0.50 & 0.39 & 1.00 \end{bmatrix} \]

If we invert this matrix, we get the precision matrix \(\textbf{P}\):

\[ \begin{bmatrix} 1.46 & -0.40 & -0.57\\ -0.40 & 1.29 & -0.30\\ -0.57 & -0.30 & 1.40 \end{bmatrix} \]

And using our equation above, we get partial correlations \(r_{xy}^\prime=0.29\), \(r_{xc}^\prime=0.40\), and \(r_{yc}^\prime=0.22\). This tells us that, after accounting for SES, school performance now explains about 8.7% of the variance in later income (\((r_{xy}^\prime)^2\), computed from the unrounded value \(r_{xy}^\prime \approx 0.295\)), compared to the 18.5% we estimated earlier.
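If you would like to check these numbers yourself, the steps can be sketched in a few lines of Python. This is an illustrative sketch rather than the Matlab code linked at the top of the post; the helper names `invert_3x3` and `partial_correlations` are my own, and the inversion is hard-coded for the three-variable case to keep the example dependency-free.

```python
from math import sqrt

def invert_3x3(m):
    """Invert a 3x3 matrix using its adjugate (transposed cofactors)."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular and cannot be inverted")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]

def partial_correlations(k):
    """Partial correlation matrix from a 3x3 correlation matrix k."""
    p = invert_3x3(k)  # the precision matrix P
    size = len(p)
    return [
        [1.0 if row == col
         else -p[row][col] / sqrt(p[row][row] * p[col][col])
         for col in range(size)]
        for row in range(size)
    ]

# Correlation matrix from the example; rows/columns ordered X, Y, C
K = [
    [1.00, 0.43, 0.50],
    [0.43, 1.00, 0.39],
    [0.50, 0.39, 1.00],
]

partial = partial_correlations(K)
r_xy_p, r_xc_p, r_yc_p = partial[0][1], partial[0][2], partial[1][2]

print(f"{r_xy_p:.2f} {r_xc_p:.2f} {r_yc_p:.2f}")  # prints "0.29 0.40 0.22"
print(f"{r_xy_p ** 2:.3f}")                       # prints "0.087"
```

For more than three variables, a general-purpose matrix inversion routine from a numerical library would be the more sensible choice; the formula applied to each element of \(\textbf{P}\) stays the same.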

Summing up

Correlations are a way for us to quantify how strongly two variables are associated with one another, but our ability to infer from this relationship is hampered by the existence of numerous confounders. If we can determine what these confounders are, we can measure them and use partial correlations to get a more accurate interpretation of the relationship between our two variables of interest.

Another way to deal with this problem is through multiple linear regression. In this framework, a single variable of interest (called a criterion or outcome variable) is predicted via a linear model by one or more other variables (called predictor variables). In our example above, the linear regression model would look like:

\[ Y = \beta_0 + \beta_1 X + \beta_2 C + \epsilon \]

Because multiple linear regression fits all \(\beta\) coefficients simultaneously, it has partial correlations built in. But that is a subject for a future blog post :-)

This is the first of a line of teaching-oriented posts aimed at explaining fundamental concepts in statistics, neuroscience, and psychology. In this post, I will try to provide an intuitive explanation of (1) the Pearson correlation coefficient, (2) confounding, and (3) how partial correlations can be used to address confounding.

Tags: Stats · Linear regression · Correlation · Partial correlation · Teaching

