Table of Contents
What is r-squared?
How is r-squared used in investing?
How to use r-squared
R-squared and investing style
R-squared and other statistics
Assessing goodness of fit in a regression model
The bottom line
R-Squared for Investing: What It Is & How to Calculate It
Jun 10, 2022
Learn all about how R-squared can be a good yardstick for investors to decide if they want investments that closely track an index, such as index funds.
Many investors, especially professional money managers and analysts, want to make comparisons. Are the stocks and funds in their portfolios keeping up with or lagging behind the stock market or a market index? And is there any connection between their investments and some benchmark like the Standard & Poor’s 500 Index? Do they move in tandem or in different directions? Is one driven by the other?
Statistics are used to help investors make these comparisons and see the tie between their investments and benchmarks. One of the most-cited statistics is known as r-squared, or the coefficient of determination.
In statistics, the term r-squared is a measure of the relationship between two things, called variables. R-squared is used to assess how much a change in one variable (call it Y, the investment) is determined by the change in the other variable (call it X, the benchmark or index). A statistical model is created to test the degree of relationship between the variables by comparing the actual values of Y (the investment’s returns) on a chart against the predicted returns represented by a line on the chart.
R-squared is derived from r, the symbol used to denote correlation; in a simple regression with one explanatory variable, it’s simply correlation squared. In finance and investing, correlation measures how strongly, and in which direction, the returns of two investments move together in a linear fashion, with statistical values known as coefficients between -1 and 1. Two assets in perfect positive correlation (their returns move up and down together in exact proportion) have a correlation coefficient of 1; two in perfect negative or inverse correlation (one falls in exact proportion as the other rises, or vice versa) have a coefficient of -1. A coefficient of zero means the two investments have no linear relationship. Almost all correlations fall somewhere between these perfect extremes.
R-squared is often used to assess the degree to which an investment, typically a fund or portfolio, generates returns in line with the benchmark. Said another way, the r-squared statistic sizes up how much the investment’s returns are determined by the benchmark’s returns.
So an investor with a portfolio of stocks or stock funds might ask, “How much do my returns depend on the broad market’s returns?” A common example of this r-squared evaluation is a fund or stock portfolio in relation to the S&P 500, the most widely used proxy for the U.S. stock market.
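For example, suppose the XYZ Large Cap Fund has a correlation coefficient of 0.70 with the S&P 500. That indicates a fairly strong positive relationship: the fund’s returns tend to rise and fall with the index’s returns, though not in lockstep. It does not mean the two rise together 70% of the time.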
R-squared, or correlation squared, for the XYZ Large Cap Fund then is:
0.7 × 0.7 = 0.49
For any correlation strictly between 0 and 1, r-squared is smaller than r because squaring a fraction produces a smaller number. For investors it’s expressed more intuitively as a percentage, so 0.49 means 49% of the variation in XYZ’s returns is explained by the returns of its benchmark, the S&P 500.
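The arithmetic above starts from a correlation coefficient that is already known. As a rough sketch of where that number comes from, the following computes r and r-squared from two return series. The return figures and the function name `pearson_r` are hypothetical, chosen only for illustration; they are not real fund data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-movement of the two series around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Spread of each series around its own mean.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical monthly returns in percent (illustrative, not real data).
benchmark_returns = [1.0, -2.0, 3.0, 0.5, -1.0, 2.0]
fund_returns = [1.5, -1.0, 2.0, 1.0, -2.5, 3.0]

r = pearson_r(benchmark_returns, fund_returns)
r_squared = r ** 2
print(f"r = {r:.2f}, r-squared = {r_squared:.2f} ({r_squared:.0%})")
```

Squaring discards the sign of r, so a correlation of -0.7 and one of 0.7 both give an r-squared of 0.49.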
The main value of r-squared in statistics is in quickly assessing whether the statistical model is a good fit for the data set—does the data support the hypothesized relationship between X and Y? In other words, how well did the model predict the investment’s results?
For investors, r-squared shows how much of an investment’s performance is explained by the performance of a benchmark such as an index. A higher value of r-squared, closer to 1.0 or 100%, suggests the model has greater power as a forecasting tool for the performance of a fund or portfolio. A low r-squared, all other things equal, usually indicates the model is not good for forecasting.
Beyond this simple explanation of the relationship between correlation and determination, the value of r-squared can also be found through statistical analysis of variables on a chart, called regression analysis. A regression model is meant to help forecast returns on an investment by using a data sample, such as the daily price changes for the investment and the benchmark for a certain period (three months, six months, one year, etc.). Each of the daily changes would be a data point on the regression chart.
Regression analysis involves creating a model hypothesis, or equation, of the relationship between the variables. The regression is depicted on a chart with a straight line and a number of dots on or around the line.
The line, typically upward sloping, represents the equation meant to quantify the relationship between the variables. A basic model equation might look like this:
Y = bX + a
In the equation, a and b are constants—their value doesn’t change. For equations plotted on a chart, the constant a represents the intercept—the value where the sloping regression line intersects the Y axis. And b represents the slope, or beta, of the line, whether it’s steep or flat.
Let’s assume the constant a has a value of 1, and the constant b has a value of 2. The equation then is:
Y = 2X + 1
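In plain English, the model’s equation is hypothesizing that the rate of return on Y (the investment) will be two times the rate of return on X (the benchmark/index), plus 1 percentage point. The intercept is not a floor: it is the return the model predicts for the investment when the benchmark’s return is zero.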
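To make the equation Y = 2X + 1 concrete, here is a minimal sketch that plugs a few benchmark returns into it. The function name `predicted_return` and the sample inputs are assumptions made for illustration only.

```python
def predicted_return(benchmark_return, slope=2.0, intercept=1.0):
    """Predicted investment return (in %) from Y = slope * X + intercept."""
    return slope * benchmark_return + intercept


# Hypothetical benchmark returns in percent.
for x in (-2.0, 0.0, 3.0):
    print(f"benchmark {x:+.1f}% -> predicted {predicted_return(x):+.1f}%")
```

With these inputs the model predicts -3.0%, +1.0%, and +7.0%. The intercept of 1 is the prediction when the benchmark is flat, and the prediction falls below it whenever the benchmark’s return is negative.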
A regression chart shows the straight line of the model’s equation and the individual observations of the fund/portfolio returns as dots scattered around the line. Because the model’s simple equation produces a straight line, this is called linear regression.
One purpose of regression analysis is to place the line through the scattered dots in a way that minimizes the overall distance of the dots from the line, that is, to discern a linear pattern through the scatter. This is called finding the best fit for the model to the data. It’s achieved through least-squares regression, which chooses the line that minimizes the sum of the squared vertical distances between the dots and the line.
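The least-squares fit described above can be sketched in a few lines of plain Python. This is an illustrative version for a single explanatory variable, not a production implementation; the function names `fit_line` and `r_squared` and the return data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx                      # b, the beta of the line
    intercept = mean_y - slope * mean_x    # a, where the line crosses the Y axis
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """1 minus (sum of squared residuals / total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical monthly returns in percent (illustrative, not real data).
benchmark_returns = [1.0, -2.0, 3.0, 0.5, -1.0, 2.0]
fund_returns = [1.5, -1.0, 2.0, 1.0, -2.5, 3.0]

slope, intercept = fit_line(benchmark_returns, fund_returns)
fit = r_squared(benchmark_returns, fund_returns, slope, intercept)
print(f"Y = {slope:.2f}X + {intercept:.2f}, r-squared = {fit:.2f}")
```

For a regression with one explanatory variable, the r-squared produced this way equals the square of the correlation coefficient between the two series.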
Consider a mutual fund with an r-squared of 15%, which means that only 15% of the variation in its returns is attributable to the returns of the index. On a regression chart, the data points (the returns of the fund) are scattered widely away from the regression line, so an investor can intuitively see the weak relationship between the fund’s returns and the benchmark’s.
By contrast, a fund with an r-squared of 85% shows a much stronger relationship between the two variables: the plotted observations of the fund returns are clustered close to the regression line. Here 85% of the variation in the fund’s returns is attributable to the index’s performance, a better fit for the model’s proposed relationship between the two variables.
Investors can look at r-squared values in relation to their investing style:
Passive investing: This usually involves index funds or exchange-traded funds (ETFs) that seek to match a broad market benchmark. Investors want a high r-squared. For example, the Vanguard 500 Index Admiral Fund and the Fidelity 500 Index Fund have r-squared values at or close to 100%, or 1. Passive investments tend to cost less for investors because they only need to mimic the benchmark, and less effort is needed to construct and maintain the portfolio.
Active investing: The goal is to find investments that will beat the market. Investors expect a lower r-squared because active portfolio managers seek stocks that don’t just match the index. A hedge fund would presumably have a lower r-squared. Otherwise, an actively managed fund with higher costs but an r-squared of 97%, for example, might make investors question why they’re paying higher fees for a fund whose returns are mostly the result of changes in an index, when a lower-cost index fund produces about the same returns.
Some other statistics related to r-squared include:
Adjusted r-squared: This is used for linear regressions with more than one independent variable (for example, the benchmark return and the price of gold) that try to explain the dependent variable’s return. In statistics, adding another independent variable to a regression model will never lower the r-squared reading and will usually raise it, even if the new variable has no real explanatory power.
R-squared only works as intended in a simple linear regression model with one explanatory variable. With a multiple regression made up of several independent variables, the r-squared must be adjusted lower to compensate for the possibility that the extra variables add no explanatory power to the model.
Beta: The slope of the regression line measures how sensitive a portfolio’s returns are to moves in the benchmark. A beta of 1, for example, means that if the benchmark rises or falls 1%, the portfolio tends to rise or fall 1%. A beta of less than 1, for example 0.8, means that the portfolio is less volatile than the index: it tends to rise or fall 0.8% when the benchmark rises or falls 1%. A beta of 1.2 means the portfolio is more volatile: it tends to change 1.2% when the benchmark changes 1%.
Investors can look at r-squared together with beta for a fuller understanding of the performance of their funds or portfolios. A fund with a high r-squared closely tracks the benchmark’s return. If it also has a high beta, above 1, that could mean outperforming the benchmark in a rising stock market—or doing worse than the benchmark when markets are declining.
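Returning to the adjustment for multiple regression described above: the text does not give a formula, so the sketch below uses the standard textbook definition of adjusted r-squared, which penalizes each extra explanatory variable. The function name `adjusted_r_squared` and the sample inputs are hypothetical.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Standard adjusted r-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    degrees_of_freedom = n_observations - n_predictors - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_observations - 1) / degrees_of_freedom


# Hypothetical case: r-squared of 0.85 from 60 monthly observations
# and 2 explanatory variables (say, a benchmark return and the gold price).
print(round(adjusted_r_squared(0.85, 60, 2), 4))  # prints 0.8447
```

The adjusted figure is never higher than the unadjusted one, and the gap widens as more variables are added relative to the number of observations.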
Goodness of fit refers to how closely the scattered dots on the regression graph crowd around the regression line.
The differences between the observed data points and the regression line, known as residuals, should be free of what is called systematic bias: the data points should be randomly scattered around the regression line. Bias is indicated by another pattern in the scatterplot of data points, meaning another independent variable besides the benchmark, possibly a different benchmark, may explain the investment’s returns. Detecting bias requires a visual inspection of the scatterplot in the regression chart.
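As a small illustration of what a non-random pattern can look like, the sketch below fits a least-squares line to deliberately curved, made-up data and lists the residuals in order of the benchmark return. The function name `residuals` and the data are hypothetical; in practice the check is usually done by eye on a plot.

```python
def residuals(xs, ys):
    """Residuals (actual minus predicted) from a least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Made-up data in which the "fund" return curves with the benchmark (y = x squared),
# so a straight line is the wrong model.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [4.0, 1.0, 0.0, 1.0, 4.0]

for x, e in sorted(zip(xs, residuals(xs, ys))):
    print(f"benchmark {x:+.1f}: residual {e:+.1f}")
```

The residuals come out positive at both ends and negative in the middle, a U-shaped pattern rather than random scatter, which hints that a straight line driven by this one benchmark is missing something.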
A low r-squared value can still provide some information about the general direction of investment returns, even with the wider dispersion of returns from the benchmark. But it can be a problem if the investor wants a forecast to be more precise, with a smaller range around the forecast. Higher r-squared values generally provide more precise forecasts.
An r-squared value of 0.9 would mean that 90% of the variation in the return for a fund or portfolio is attributable to the return on a benchmark. In the stock market, that would mean 90% of the variation in an equity fund’s performance stems from the performance of an index such as the S&P 500.
R-squared can be a good yardstick for investors to decide if they want investments that closely track an index, such as index funds, or investments that correlate less with an index, such as actively managed funds and hedge funds. Although some familiarity with the concept of r-squared can be useful for the average investor, it primarily is a tool used by professionals in managing and constructing investment portfolios.
Certain information contained in here has been obtained from third-party sources. While taken from sources believed to be reliable, Titan has not independently verified such information and makes no representations about the accuracy of the information or its appropriateness for a given situation. In addition, this content may include third-party advertisements; Titan has not reviewed such advertisements and does not endorse any advertising content contained therein.
This content is provided for informational purposes only, and should not be relied upon as legal, business, investment, or tax advice. You should consult your own advisers as to those matters. References to any securities or digital assets are for illustrative purposes only and do not constitute an investment recommendation or offer to provide investment advisory services. Furthermore, this content is not directed at nor intended for use by any investors or prospective investors, and may not under any circumstances be relied upon when making a decision to invest in any strategy managed by Titan. Any investments referred to, or described are not representative of all investments in strategies managed by Titan, and there can be no assurance that the investments will be profitable or that other investments made in the future will have similar characteristics or results.
Charts and graphs provided within are for informational purposes solely and should not be relied upon when making any investment decision. Past performance is not indicative of future results. The content speaks only as of the date indicated. Any projections, estimates, forecasts, targets, prospects, and/or opinions expressed in these materials are subject to change without notice and may differ or be contrary to opinions expressed by others. Please see Titan’s Legal Page for additional important information.