In this section, we will take a closer look at one of the most popular long-term investment portfolios: the 60/40 portfolio. The name derives from the two allocations in the portfolio, namely 60% in stocks (via a broad stock index ETF like SPY) and 40% in fixed income instruments (via a long-term government bond ETF like TLT). We will learn how to compute the combined returns of a portfolio containing more than one asset, and we will see that for a long time, fixed income returns provided a way of mitigating the risk of stock market crashes, but only until 2022, when this risk mitigation mechanism broke down. This example will teach us that the statistical properties of historical data should not be taken for granted, and that regime changes in market dynamics force investors to continuously update their view on markets.
To illustrate long-term relations between different asset classes, we first download suitable data. For this first example, we will again use the ETFs SPY (as a tradable asset that replicates the S&P 500 index) and TLT (as a tradable asset that invests in long-term US government bonds), even though data for both is only available starting in mid-2002, giving us a lookback period of only about 20 years. In Chapter 3, we will learn to use proxy data from different but closely related assets to extend the lookback period while still mimicking the fees and returns of the asset that we actually trade today. For example, for stock index data, we could instead use the ticker symbol ^GSPC (which denotes the S&P 500 index itself, not the tradable ETF). This means that we would ignore the tracking error of the ETF and would have to correct the data for any fees that the ETF charges. For long-term bond price data, we could use the ticker VUSTX, a tradable fund by Vanguard that invests in long-term US government bonds. The advantage over TLT is that VUSTX has been trading since 1986, providing us with a much larger lookback period, though not with exactly the same returns. But for now, we are happy with downloading the SPY and TLT data, keeping only timestamps for which we have data on both assets:
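In code, the download could look roughly like this. This is a minimal sketch: we assume here that the yfinance package is used (the original data source is not specified) and that the result is stored in a dataframe called prices.

```python
import yfinance as yf

# Download daily prices for both ETFs, adjusted for dividends and splits.
# (Assumption: data comes from yfinance; any other source with adjusted
# close prices would work just as well.)
prices = yf.download(["SPY", "TLT"], auto_adjust=True)["Close"]

# Keep only timestamps for which we have data on both assets.
prices = prices.dropna()
print(prices)
```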
                   SPY        TLT
Date
2002-07-30   60.080936  39.072857
2002-07-31   60.226238  39.556957
2002-08-01   58.653877  39.782238
2002-08-02   57.339134  40.189655
2002-08-05   55.343956  40.366993
...                ...        ...
2024-07-25  538.409973  92.269997
2024-07-26  544.440002  92.989998
2024-07-29  544.760010  93.489998
2024-07-30  542.000000  93.849998
2024-07-31  550.799988  94.690002

[5539 rows x 2 columns]
Let's plot the cumulative return of both assets on a log-scale to get a better overview of how the value of these two assets has fluctuated over the years. To do this, we divide our price dataframe by the first row, to let both price series start at a value of one:
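A minimal sketch of this, reusing the prices dataframe from above:

```python
import matplotlib.pyplot as plt

# Normalize both price series to start at 1, then plot on a log scale.
(prices / prices.iloc[0]).plot(logy=True)
plt.ylabel("Cumulative return (start = 1)")
plt.show()
```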
We notice multiple things here:
Now that we have an overview of the individual assets, let us simulate our first combined portfolio of multiple assets in the next section!
The 60/40 portfolio is a popular long-term investment strategy that is simple to describe: you invest 60% of your money in a broad stock index fund and the remaining 40% in a long-term bond fund. At first sight, this portfolio seems to be a fire-and-forget investment that you don't have to touch after setting it up initially. But wait: the prices of the two assets will fluctuate differently over time, so the 60/40 ratio will also change over time! For example, if stocks appreciate in value by 10% in one year, but bonds do not gain or lose anything, your portfolio will contain 62.3% (=(60*1.1)/(60*1.1 + 40)) stocks and only 37.7% bonds at the end of the year. If the stock market crashes next year, you will be more exposed to this risk than the average 60/40 investor. This example shows that we need to rebalance our portfolio regularly so as not to let one asset accumulate and leave us over-exposed to certain kinds of risks. This process of rebalancing includes selling shares of assets that have outperformed others in the portfolio (i.e. taking profit) and buying additional shares of underperforming assets (i.e. hoping for a reversion to the long-term performance).
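The drift arithmetic from this example, spelled out in a few lines (the numbers are purely illustrative):

```python
# Allocation drift after a hypothetical year: stocks +10%, bonds flat.
stock_value = 0.60 * 1.10  # 60% initial allocation grows by 10%
bond_value = 0.40 * 1.00   # 40% initial allocation stays flat
total = stock_value + bond_value

print(stock_value / total)  # ~0.623 -> stocks are now over-weighted
print(bond_value / total)   # ~0.377 -> bonds are now under-weighted
```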
If rebalancing is so important, then we should do it often, right? Maybe every day? Well, choosing a suitable rebalancing interval also depends on the fees that your broker charges you for trading, and on the bid/ask spread (the difference in price depending on whether you buy or sell an asset) of the asset. Too much buying/selling to adjust small allocation deviations may be more expensive than the additional return you get by doing it. In addition, daily rebalancing requires you to be at your laptop sending out trades every day, and your time also has value (which may very well exceed the additional return you will get from your portfolio). Long-term investors often only adjust monthly, quarterly, once a year, or only if the actual weights of the assets in the portfolio deviate enough from their target values. In Chapter 3, we will devise simulation methods that will allow you to test different rebalancing methods to find the right one for your investment needs, but for now, we simply go with monthly rebalancing.
The easiest way to simulate a portfolio with fixed allocation weights (60% stocks and 40% bonds) and monthly rebalancing is to first compute monthly returns, i.e. by how much the prices of the assets change relative to last month's prices:
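In code, this could look as follows (a sketch, assuming the prices dataframe from above):

```python
# Take the last price of each business month, then compute the
# relative change from one month to the next.
monthly_returns = prices.resample("BM").last().pct_change()
print(monthly_returns)
```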
                 SPY       TLT
Date
2002-07-31       NaN       NaN
2002-08-30  0.006802  0.055132
2002-09-30 -0.104853  0.042592
2002-10-31  0.082284 -0.036943
2002-11-29  0.061681 -0.009162
...              ...       ...
2024-03-29  0.032702  0.007829
2024-04-30 -0.040320 -0.064555
2024-05-31  0.050580  0.028870
2024-06-28  0.035280  0.018171
2024-07-31  0.012091  0.034988

[265 rows x 2 columns]
The .resample('BM').last() method selects the last price of each business month (in case the last day of a month falls on a weekend), and the .pct_change() method computes relative price changes. Note that the first row of the resulting dataframe contains NaN values, as there is no row prior to the first one to compute a relative change from. We can get rid of that empty first row by dropping NaN values:
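A one-liner, continuing with the monthly_returns variable from the previous step:

```python
# Drop the empty first row (and any other rows with missing values).
monthly_returns = monthly_returns.dropna()
print(monthly_returns)
```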
                 SPY       TLT
Date
2002-08-30  0.006802  0.055132
2002-09-30 -0.104853  0.042592
2002-10-31  0.082284 -0.036943
2002-11-29  0.061681 -0.009162
2002-12-31 -0.056570  0.045255
...              ...       ...
2024-03-29  0.032702  0.007829
2024-04-30 -0.040320 -0.064555
2024-05-31  0.050580  0.028870
2024-06-28  0.035280  0.018171
2024-07-31  0.012091  0.034988

[264 rows x 2 columns]
Looking at these values, we know, for example, that stocks lost over 10% in September 2002, whereas bonds gained over 4% in value in the same period (second row from the top). But how do we combine these returns to obtain the returns of our combined portfolio? It's fairly simple at this point: for each row of our dataframe, we compute the weighted average return, where the weights are our allocation weights. With our static weights of 60% for stocks and 40% for bonds, this calculation is carried out as follows:
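A sketch of this calculation, assuming the monthly_returns dataframe from above (storing the result in a series called portfolio_returns is our choice of name):

```python
# Weighted average of the two return columns with static 60/40 weights.
portfolio_returns = 0.6 * monthly_returns["SPY"] + 0.4 * monthly_returns["TLT"]
print(portfolio_returns)
```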
Date
2002-08-30 0.026134
2002-09-30 -0.045875
2002-10-31 0.034593
2002-11-29 0.033344
2002-12-31 -0.015840
...
2024-03-29 0.022753
2024-04-30 -0.050014
2024-05-31 0.041896
2024-06-28 0.028437
2024-07-31 0.021250
Freq: BM, Length: 264, dtype: float64

Still, we're not quite happy with these portfolio returns, as we are interested in the long-term return of our investments, not really in the monthly fluctuations. To obtain the equity curve, i.e. the portfolio value relative to its starting value over time, we need to accumulate returns and account for the compounding of returns. After all, we aim to simulate the case where we do not withdraw profits, but let them run over the whole investment period. To accomplish that, we do the following calculation:
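A sketch of that calculation, using the np.cumprod function mentioned below (the variable name equity_curve is our choice):

```python
import numpy as np

# Turn each monthly return into a growth factor (1 + r), then compound
# all factors cumulatively to get the portfolio value over time,
# starting from $1.
equity_curve = np.cumprod(1 + portfolio_returns)
print(equity_curve)
```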
Date
2002-08-30 1.026134
2002-09-30 0.979060
2002-10-31 1.012929
2002-11-29 1.046704
2002-12-31 1.030124
...
2024-03-29 5.807165
2024-04-30 5.516728
2024-05-31 5.747855
2024-06-28 5.911304
2024-07-31 6.036917
Freq: BM, Length: 264, dtype: float64

Let's look at the individual steps of this calculation in a bit more detail:
- We compute 1+portfolio_returns to obtain a factor that tells us how much money we will have at the end of a month, assuming that we have $1 at the beginning of the month. For example, if the portfolio gains 5% in a month, we will have (1+0.05)=1.05 dollars at the end of the month.
- np.cumprod does all the multiplication steps for all the individual monthly returns for us, and it keeps all the intermediate results for all timestamps (that is why it's called cumprod, short for cumulative product). This way, we get a series of values that tell us how much our portfolio is worth in each month, assuming that we started with $1 at the very beginning.

Let's plot the resulting portfolio equity curve:
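A minimal plotting sketch; whether to use a log scale like in the earlier price plot is a matter of taste:

```python
import matplotlib.pyplot as plt

equity_curve.plot(logy=True)  # logy=True is optional here
plt.ylabel("Portfolio value (start = $1)")
plt.show()
```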
As we can see, the 60/40 portfolio combines benefits of both investments: you get the stability of the bonds, but also some of the outperformance of the stocks. Plus, large stock crashes like the one during the US financial crisis of 2008 are dampened by the counter-movement of bond prices. Still, since the Corona crash, the 60/40 portfolio has lost some popularity, mostly because many investors did not anticipate the risk of holding long-term bonds when interest rates rise quickly, which resulted in unusually large drawdowns in bond funds.
We have seen that the 60/40 portfolio really generates a benefit for investors during times when stocks and bonds move contrary to each other (effectively hedging each other), but can be a problem when this nice relationship between the assets breaks down (like after the Corona crash, when central banks worldwide raised interest rates to fight inflation). So how can we quantify the relationship between assets, i.e. whether they move together or contrary to each other?
We can do this by computing the correlation coefficient between them. A correlation coefficient takes the value of 1 if the returns of the assets are perfectly positively correlated, meaning that every time SPY exhibits a return that is larger than its average return, TLT will also exhibit a return that is larger than its average return, and vice versa. If the correlation coefficient is positive but smaller than one, there is still a tendency that above-average returns and below-average returns coincide for both assets, but not every time. If the correlation coefficient is zero, we detect no relation between the returns of the two assets (which does not necessarily mean that there is no relation, just that our correlation coefficient cannot detect it). If the correlation coefficient is negative, we see "anti-correlation", i.e. there is a tendency that above-average returns of SPY are accompanied by below-average returns of TLT, and vice versa. This negative, or anti-correlation between assets is what investors are seeking, as they can then combine the long-term growth of multiple assets while reducing the overall risk of large price fluctuations of their portfolio.
The most commonly used correlation coefficient is the Pearson correlation coefficient, which assumes a linear relation between two variables. In our case, this implies that the returns of SPY and the returns of TLT should fluctuate around a straight line when plotted against each other. Let's have a look at this:
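A minimal scatter-plot sketch, assuming the monthly_returns dataframe from above:

```python
import matplotlib.pyplot as plt

# Plot each month as one point: SPY return vs. TLT return.
plt.scatter(monthly_returns["SPY"], monthly_returns["TLT"], s=10)
plt.xlabel("SPY monthly return")
plt.ylabel("TLT monthly return")
plt.show()
```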
Well, it does not quite look like the points lie on a straight line! Of course, we would only expect the points to form a narrow straight line if the correlation were close to -1 or 1. So what this diffuse, flattened point cloud already tells us is that the correlation between SPY and TLT is rather weak when measured over the whole historical period. Still, we may see that the point cloud is tilted slightly downwards towards the right, indicating a negative correlation, if any. To quantify this further, we use the scipy.stats Python package to calculate the Pearson correlation coefficient:
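A sketch using scipy.stats.pearsonr, assuming the monthly_returns dataframe from above:

```python
from scipy.stats import pearsonr

# Pearson correlation between the monthly returns of SPY and TLT.
print(pearsonr(monthly_returns["SPY"], monthly_returns["TLT"]))
```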
PearsonRResult(statistic=-0.11968226633040632, pvalue=0.05209278993279599)
As we can see, we get back not only the estimated value of the correlation coefficient, but also a second value, the p-value. The p-value tells us how trustworthy the estimated value of the correlation coefficient really is. Importantly, the smaller the p-value, the more we can trust the result! It takes values between 0 (absolutely trustworthy) and 1 (do not trust at all). Think about it this way: if we only had 1 year of historical data on SPY and TLT, so only 12 monthly returns, and they aligned perfectly, i.e. the greater the returns of SPY, the smaller the returns of TLT, would you believe the resulting correlation coefficient of -1 and bet money on that perfect correlation? Probably not; it feels too uncertain, as the perfect negative correlation could be a product of pure chance, and the next few months could easily negate our result. Let's try this example and compute the correlation coefficient from just the first 12 monthly returns in our data set:
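A sketch, selecting only the first 12 rows of our monthly returns:

```python
# Pearson correlation based on the first 12 monthly returns only.
print(pearsonr(monthly_returns["SPY"].iloc[:12],
               monthly_returns["TLT"].iloc[:12]))
```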
PearsonRResult(statistic=-0.3082014294118599, pvalue=0.32974817350594693)
As you can see, using only 12 months of data, we get a much larger p-value of 0.33, compared to the value of 0.05 when we use all available data. So the p-value roughly indicates how likely it is that the true correlation is zero and that we only see some spurious correlation because we have a finite amount of data. (Note: statisticians often insist that the p-value is not equal to the probability that an effect is created by random chance, and they are certainly right, but we do not want to lose ourselves in mathematical details here.) But what p-value is a good p-value? At which point do we start to believe that there is in fact a non-zero correlation between SPY and TLT?
That is not an easy question to answer; in scientific projects, a common cutoff is p < 0.05, or the stricter p < 0.01. Our estimate of the Pearson correlation coefficient misses both of those thresholds (even if just by a bit). If we added proxy data to extend our historical data by more years, however, we would see that the negative correlation between SPY and TLT becomes statistically significant. It is worth noting, though, that in finance you can make a profit by exploiting non-significant correlations, but the opposite is also true: in Chapter 2, we will see that extreme values in financial time series can lead us to believe in certain statistical properties that are in fact not true and will be invalidated when the next crash or extreme event happens!
To complicate things a bit more, there is not just the Pearson correlation coefficient; other coefficients compute correlation in slightly different ways! One example is Spearman's rank correlation coefficient, implemented in scipy.stats as spearmanr. In contrast to the Pearson correlation coefficient, Spearman does not compare whether individual values in both series are above or below their respective averages, but whether their ranks are above or below the average rank. So it's about the position of a value in the list of sorted values, rather than the value compared to the average value. This makes Spearman's rank correlation coefficient more robust against outlier values (which we definitely have in financial time series; wait for Chapter 2 to see just how extreme these values can get) and allows it to capture (at least some types of) non-linear correlations. Non-linear correlations are any kind of correlation where the points of both series form a certain pattern, but this pattern does not approximate a straight line; see the illustrative examples on Wikipedia.
Let's compute Spearman's rank correlation coefficient for the monthly returns of SPY and TLT to see how the result differs from the Pearson correlation coefficient:
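A sketch, analogous to the Pearson calculation above:

```python
from scipy.stats import spearmanr

# Spearman rank correlation between the monthly returns of SPY and TLT.
print(spearmanr(monthly_returns["SPY"], monthly_returns["TLT"]))
```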
SignificanceResult(statistic=-0.1294543106653014, pvalue=0.03553081400917159)
We can see that the value of the correlation coefficient is approximately the same (a tiny bit stronger compared to Pearson), and the p-value is a bit smaller, this time below the common threshold of 0.05 for statistical significance! But how can one coefficient reach significance while the other one does not? Well, Pearson tests exclusively for linear correlation, whereas Spearman may also capture non-linear correlation and is more robust to outliers. So Spearman is a more general test, and Pearson is a more specific one. Thus, we expect that if our data were indeed linearly correlated, then Pearson, being the more specific test, should yield a more decisive result, whereas if the correlation is of a non-linear nature or if outlier values are present (as is the case here), then Spearman may reach significance when Pearson does not. In the end, the most trustworthy results should always be confirmed using multiple different approaches. If one has to explicitly search for a single test that turns out positive, one is probably hunting a ghost in the machine and not a real signal!
Ok, so there are different types of correlation metrics, and correlation values based on too few data points can be spurious. But there is still another effect that can keep us from estimating the correlation between two assets correctly: regime shifts (also called regime changes, regime switches, or break points)! When macroeconomic policy changes, the relation between assets may also change, including the correlation of their returns. Essentially, this means that we have to assume that the correlation between assets changes over time, sometimes gradually, sometimes abruptly. In later chapters, we will learn how to properly handle time-varying parameters in our models while accounting for statistical significance and all the subtle details, but for a first demonstration, we may simply compute the Pearson correlation coefficient based on a rolling window of the trailing 12 months. We can do this using the Pandas methods rolling and corr:
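A sketch of the rolling correlation, assuming the monthly_returns dataframe from above:

```python
import matplotlib.pyplot as plt

# Pearson correlation over a rolling window of the trailing 12 months.
rolling_corr = monthly_returns["SPY"].rolling(12).corr(monthly_returns["TLT"])
rolling_corr.plot()
plt.ylabel("Rolling 12-month correlation")
plt.show()
```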
Of course, the rolling window estimate of the correlation coefficient fluctuates quite a bit over time, but we expected that, as we base the correlation estimate on only 12 data points at each point in time. Still, a bigger picture emerges from the fluctuations: prior to 2021, the correlation estimate fluctuated mostly within the negative range, from -1 to slightly above 0. After 2021, however, the correlation increases quickly and reaches high positive values. This regime change is due to drastic changes in interest rate policies across the globe, and it eradicates the nice negative correlation on which many 60/40 investors relied for risk mitigation. To see the effect of this regime change on portfolio performance, we will plot the portfolio volatility, also within a rolling window of 12 months:
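A sketch of the rolling volatility, assuming the portfolio_returns series from above; annualizing monthly volatility by multiplying with the square root of 12 is our assumption here:

```python
import numpy as np
import matplotlib.pyplot as plt

# Standard deviation of monthly returns over a rolling 12-month window,
# annualized by multiplying with sqrt(12).
rolling_vol = portfolio_returns.rolling(12).std() * np.sqrt(12)
rolling_vol.plot()
plt.ylabel("Rolling 12-month annualized volatility")
plt.show()
```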
As we can see, the rolling portfolio volatility (and thus the magnitude of the fluctuations we see every day in our brokerage account) usually hovers at or below 10% annualized volatility, except on two occasions: first, during the financial crisis of 2008/09, when the correlation was briefly positive and stock market volatility skyrocketed, and second, since 2021, when the correlation between SPY and TLT turned positive again.
Positive correlation magnifies portfolio fluctuations! This is why most portfolio optimization approaches select allocation weights not only based on past performance, but also based on the correlation between different assets. Ideally, one would find many different assets with zero or even negative correlation to each other, such that one can profit from uncorrelated returns while the random fluctuations (partially) counteract each other, resulting in a smooth, upward-pointing equity curve. Finding different, uncorrelated sources of returns is the main goal of many hedge funds, which invest not only in stocks or bonds, but also in other (more or less uncorrelated) markets such as commodities (gold, aluminium, cattle, soybeans, ...), currencies, volatility indices (betting on whether stock market fluctuations will become stronger or weaker), exotic options markets (with very non-linear payouts), or even sports betting.
For now, we are content with our insight that correlations between different assets matter a lot when it comes to overall portfolio performance, and that picking stocks or other assets based solely on their past performance is not a good idea: one has to keep the whole portfolio in mind, a holistic picture of portfolio optimization. In Chapter 3, we will revisit this idea and study the traditional mean-variance approach of Markowitz, but also a more modern, simulation-based approach that can account for non-linear relations between assets. In Chapter 4, we will then build dynamic allocation techniques that can react to market conditions (such as changing correlations), thus mitigating at least some of the risk that regime shifts like the one discussed here pose to an investor.