Categories
All Articles

Early-Retirement, Post-Pandemic Portfolio Risk Review

I start this article with no idea as to its conclusion.  I want to revisit the question: “Am I taking the appropriate amount of investing risk in early retirement?”

The chart above shows my portfolio returns since I retired in 2018.  For comparison, I also show the performance of three other portfolios:  all US stocks; all US bonds; and a 50/50 mixture.  My portfolio for the period shown is around 30% stocks. Much less than almost every financial podcast I listen to recommends.  Based on the end of 2022 results shown, perhaps I should have followed their advice?

Of course, there is more to the story. For example, annual returns miss the initial effects of COVID19 on the market.  This shows up in March 2020 when the data is plotted monthly.

Even this monthly view is incomplete.  The low-point for stocks occurred on March 15th.

On that particular day, the all-US stock portfolio was down 33% from the previous high.  My portfolio was down 7%. The market had several large down days prior to the 15th. I was busy trying to stay safe from the virus and paid little attention to stocks at the time. But many investors were paying attention and many of them were selling (trading volume was at its highest since the 2008 financial crisis).  What happened to their portfolios?

The chart above shows the 5-year period of returns we started with, but this time the 100% and 50% stock portfolios switched to all bond (or cash) portfolios near the COVID pandemic low-point.  These portfolios move back to their target stock allocations at the end of 2020.  Compared to people reacting this way, I fared quite well. My guess is there were lots of these people.

This illustrates the core of risk tolerance in investing.  If you ever sell (or feel an urge to sell) to “cut your losses” during a market downturn, you took too much risk.

At the start of my retirement (a couple of years before COVID), I decided my risk tolerance was no more than a -10% portfolio loss after 1 or more years.  There was no real science behind this figure; I just felt uncomfortable with the idea of losing more.   However, I did use some science to help me figure out how to achieve my target. Based on historical annual returns of the S&P 500 and US Bonds, I determined a 30% stock portfolio should come close.

The last 5 years saw several large market drops including the COVID selloff.  For me, the worst period was from December 2021 to September 2022.  My portfolio allocation lost around 15%.  That is more than the -10% predicted by my spreadsheets (bonds behaved atypically), but not enough to make me feel nervous or uneasy about my retirement finances.  I was actually eager to buy stocks “on sale”, re-balancing often during that period.

So why revisit my risk tolerance?

  • I clearly would have more money now (mid-2023) if I selected (and maintained) a higher stock allocation.
  • I did not give market downturns a second thought with 30% equities (-10% worst downturn predicted, -15% experienced).  Can I stretch this and still keep my emotions in check?

There are many ways to think about retirement financial risk.  We, of course, have the luxury of turning to some previous spreadsheets!

A good resource is the Safe Withdrawal Rate (SWR) chart from the 4% Rule article/spreadsheet.

This chart helps answer perhaps the most important risk tolerance question: “How much risk do I NEED to take in retirement?”

For example, if you require 4% of your portfolio per year for essential expenses at the start of a 30-year retirement, history says you need between 50% and 75% stocks in your portfolio.  This range succeeds for any 30-year period starting in 1871, though sometimes just barely. The 4% rule article/spreadsheet also shows us how the two points just off the 50%-75% plateau differ.

Both points have a single failure year, but the lower stock allocation has many more nervous cycles (failures after 30 but before 45 years) towards the end of retirement.

The choice for a 4% rule retiree seems easy.  Pick the right side of the plateau and the higher stock percentage.

However, that choice, once made, requires a full commitment for several years into a retirement.  If the portfolio grows significantly over time, revisiting your required risk (less) and stock allocation (lower) is appropriate.  If not, the initial target stock allocation must be maintained by re-balancing no matter what. Otherwise, the historical SWR simulation no longer applies.  The retired investor falls off the plateau and into the abyss…

Here are three retirement starting years that were tough at first, but ultimately successful* using the 4% rule (*although the 2000 retirement is short of its 30-year period, it’s on track for success).

A 75% stock allocation is quite scary during the early years of these retirements.  The worst is a -65% portfolio loss before recovering.  This is a little atypical, but downturns near -40% are not.

No matter how frightening, a retired investor’s reaction to large market drops must be: BUY MORE STOCKS.  The 4% rule and other SWR strategies depend on regular re-balancing (at least annually) back to the initial target stock allocation.

Put yourself in this scenario.  You recently retired with a $1M portfolio consisting of $750,000 in stocks and $250,000 in bonds.  After 2 years of ~$40K withdrawals for living expenses and a poor market, your portfolio drops to $800K.   You re-balance each month, which gives you $600K in stocks and $200K in bonds.

A banking crisis occurs and the market suffers a -50% dip over a 3-day period with no end in sight. You are very close to a monthly re-balance date and your portfolio now sits at $300K in stocks and $200K in bonds.

  • Will you feel OK about a portfolio that suffers large losses like this in early retirement?
  • Will you re-balance as usual to $375K stocks and $125K bonds?  Into a market that is still falling?

You must know the answers to these questions to use a strategy like the 4% rule.  You must also commit to the answers before retirement.  Finally, your answers must be YES.
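The re-balancing arithmetic in this scenario can be sketched in a few lines of Python. This is only an illustration of the numbers quoted above; the function name `rebalance` is made up for the example.

```python
def rebalance(stocks, bonds, target_stock_fraction):
    """Return (stocks, bonds) after moving money to hit the target mix."""
    total = stocks + bonds
    new_stocks = total * target_stock_fraction
    new_bonds = total - new_stocks
    return new_stocks, new_bonds


# Scenario from the text: $600K in stocks falls 50%, bonds stay at $200K.
stocks_after_crash = 600_000 * 0.5
new_stocks, new_bonds = rebalance(stocks_after_crash, 200_000, 0.75)

print(new_stocks, new_bonds)            # 375000.0 125000.0
print(new_stocks - stocks_after_crash)  # 75000.0 moved from bonds into stocks
```

In other words, sticking to the plan means selling $75K of bonds to buy stocks while the market is still falling.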

My situation is different.

I am in the fortunate position of not needing very much from my portfolio.  I have a spreadsheet (of course!) that looks at my expenses, pension income, projected social security, inflation estimates and many other factors.  Currently, my estimate of average annual portfolio withdrawals required for essential expenses (adjusted each year for projected inflation) throughout retirement is under 2%.

SWR curves are still very useful to my situation, but provide different insights.

For a 30-year retirement, I don’t NEED ANY stocks, but I CAN HAVE 100% stocks.  This is true up to a 2.4% withdrawal rate.

Because I retired early, I like to think the 45-year SWR curve applies to me.  In this case, I need at least a 10% stock allocation, but any amount higher is OK.

As mentioned before, I actually started retirement with 30% stocks per my risk tolerance assessment at the time.

At that equity percentage, the first thing the SWR curves tell me is I am not spending enough!

For a 45-year retirement and 30% stock allocation, the SWR is 3%.  50% more than my current 2% spending rate!  For a 30-year retirement, my SWR is 3.7%.

What about nervous cycles?  For a 45-year retirement, let’s define nervous cycles as failures between 45 and 60 years. The distance between these two curves at a given stock allocation is an indicator of nervous cycles.  Wider equals more.

At 30% stocks and a 3% SWR, I’m at the worst possible spot as far as nervous cycles.  From here, I have two paths to get to the 60-year curve, where nervous cycles go away.  The first is to increase stocks from 30% to 40%, keeping SWR at 3%.  The second is to reduce SWR to 2.6%, leaving stocks at 30%.

Both of these choices essentially turn my 45-year retirement into a 60-year retirement, which I think is too conservative. What I really want is a point on the 45-year line that minimizes nervous cycles without excessive downside risk.

The right-most point on the first plateau is an intriguing choice (60% stocks, 3.5% SWR). Other than the spot where the 45 and 60-year curves touch, this point is one with the shortest gap.  However, I don’t think I’m quite ready for the downside risk of a 60% stock allocation.

A much better point for me, I feel, is 45% stocks and a SWR of 3.4%.  This has the shortest gap in the range of 30% to 55% stocks.  Here is what it looks like on our simulations of early and late retirement nervous periods:

I’m still not thrilled with the possibility of a -40% portfolio downturn, so I added worst case downturn figures to some points on the SWR chart to explore that concern.

What I find really interesting is that increasing my current spending rate by 50% only increases my downturn risk by 4%.  Now I am convinced I need to move up to the 45-year curve spending line.  Once there, it becomes a trade-off between increasing downturn risk versus reducing nervous cycles (by lowering the gap size to the 60-year line).  Taking one step up on the 45-year curve to 35% stocks lowers the gap. Taking another step does not lower the gap and going farther, I feel, is not worth the increased downside risk for now.  This will likely change over time because my risk tolerance should increase.  I will explain why in a future article (relating to social security and/or annuities).

Now I have the answer to my earlier question: “Am I taking the appropriate amount of investing risk in early retirement?”

The answer is no.  I need to increase my stock allocation from 30% to around 35%. I also need to increase my current portfolio spending rate by about 50%.  This does little to investing risk but should greatly reduce “not having enough fun” risk. 

Emotionally, I need to accept a downside risk of -30%.  It turns out my original estimate of -10% (for a 30% stock allocation) is really -20% given my 2% spending rate. Since I made it through the pandemic market dip without worry, I feel fairly confident I can take on -10% more risk. 

April 2023

Jan 2024 Update: I went ahead with this risk reassessment and my current stock allocation is 37%.

Two HSA Spreadsheet Secrets

I briefly discuss the eligibility rules for Health Savings Accounts (HSAs) in another article and will not cover them here.  The bottom line is: if you are eligible for an HSA, you should have one.

HSAs are triple-tax-advantaged.  Money from wages goes in tax-free*, grows tax-free and then comes out tax-free if used for medical expenses.  This is the only investment vehicle like that.  Because of this tax treatment, I came across two subtle (dare I say secret) characteristics of an HSA which I describe via spreadsheets below.

*Note: in addition to Federal income tax avoidance, HSA contributions made through employer payroll deductions also avoid FICA taxes if you make less than the annual Social Security Tax salary cap.  However, this may also lower your social security benefit.  In some states, state income tax is also avoided.  I do not consider either of these in this article (but you can… in your own spreadsheet!).

HSA Spreadsheet Secret #1: Spending HSA Money Can Generate More Money

I consume a lot of personal finance media and often hear (and might even repeat): “Don’t spend HSA dollars, invest them.  Pay current medical bills with cash reserves and save the receipts for much later HSA reimbursement in retirement.”  This advice is not bad, but it comes with an important caveat: it depends on how tight your finances are.

Let’s look at a yearly balance sheet of a married couple with an income that exactly covers expenses (including medical) after taxes and credits.

Here is what their balance sheet looks like if they fund an HSA with $1000 and use it to pay medical expenses.

Now there is an extra $220 in after-tax dollars.  Since they don’t need this money for expenses, what should they do with it?  They could just leave it as after-tax cash and invest it that way.  But let’s look at two HSA-related options.

Option 1:  Only reimburse part of their medical expenses from the HSA.

Rather than the full $1000, they might only reimburse themselves for $780 of their medical expenses.  This leaves $220 in their HSA, but doesn’t change their net worth.

Option 2:  Reimburse the full $1000 in medical expenses, but overfund the HSA.

Another, much better, option is to reimburse the full $1000 in medical expenses while contributing more than $1000 to the HSA such that they can still balance their budget (net take-home of $0).  The initial HSA funding for this is $1282. After reimbursement, $282 is in their HSA. 28% more than $220.
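The $1282 figure can be reproduced with a short calculation. The sketch below assumes a flat 22% marginal tax rate, which is not stated in the text but is implied by the $220 of extra cash on a $1000 contribution; the real balance sheet (with credits and brackets) is more detailed. The function names are hypothetical.

```python
TAX_RATE = 0.22  # ASSUMPTION: implied by $220 saved on a $1000 contribution


def budget_surplus(contribution, reimbursed, tax_rate):
    """Cash left over after a pre-tax HSA contribution and a reimbursement.

    The contribution lowers take-home pay by contribution * (1 - tax_rate);
    the reimbursement frees up cash that would have paid medical bills.
    """
    return reimbursed - contribution * (1 - tax_rate)


def break_even_contribution(reimbursed, tax_rate):
    """HSA contribution that brings the budget surplus back to exactly $0."""
    return reimbursed / (1 - tax_rate)


print(round(budget_surplus(1000, 1000, TAX_RATE)))  # 220
overfund = break_even_contribution(1000, TAX_RATE)
print(round(overfund), round(overfund - 1000))      # 1282 282
```

Dividing by (1 - tax rate) is what turns $220 of after-tax cash into $282 of pre-tax HSA money.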

So, in this tight-budget example, spending HSA dollars creates more net worth!

Where does the effect of money creation from spending HSA dollars end?  When you reach the HSA yearly funding limit.  For this couple’s finances, they run up against the HSA funding limit (for a family) of $7200 at around $6000 in reimbursed medical expenses.

This leaves them with $1200 in their HSA and $49 in cash.

What if they have additional medical expenses?  This leads us to our second secret.

HSA Spreadsheet Secret #2:  The Roth HSA

Once the HSA limit is hit, medical expense reimbursement can no longer magnify money.  However, submitted expenses can still turn pre-tax HSA dollars directly into tax-free cash.  Then you can invest those dollars, if eligible, in a Roth IRA!  So instead of the usual after-tax dollars, untaxed dollars fund the Roth IRA! I call this the Roth HSA!

Strangely, I have not seen or heard my Roth HSA concept discussed anywhere else.  I find this a little surprising. Perhaps the financial gurus think it too complicated. Or that it doesn’t provide much financial benefit.  But the only difficulty I see is meeting the eligibility rules for both HSAs and Roth IRAs.  If you meet them, I think the actual implementation process is fairly simple and worth the effort.

In another article on this site, I prioritized some common investment choices in this order:

  • Put enough pre-tax money in a company 401k to get the full company match
  • If eligible, fund an HSA. To the limit if possible.
  • If eligible, contribute to a Roth IRA…I will now add, if you can, make this a Roth HSA!

Let’s try to quantify these choices with a modified version of our married couple’s budget example.

In this iteration, their finances are not as tight, with only $80,000 in expenses.   This initially leaves them with an annual net worth increase of $11,010.  If they choose to contribute 8% to their 401k, which gets them the full company match, they can add an additional $3201.

They still have a budget surplus.  What if they add more (unmatched dollars) to their 401k? In their case, they can contribute an extra $5000 and still balance the budget.  Unfortunately, based on a predicted tax rate in retirement that is higher than their current top bracket, they lose value.

What if they moved that $5000 to an HSA instead?  Given the expectation that HSA money will eventually come out tax free, their -$400 loss is now a +$600 gain (a $1k difference).
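A rough way to see where a -$400 versus +$600 comparison can come from is sketched below. The two tax rates are assumptions, back-solved only because they reproduce the figures quoted above (12% today, 20% in retirement); the article does not state them. Investment growth is ignored, and the function names are hypothetical.

```python
RATE_NOW = 0.12         # ASSUMPTION: current top bracket, back-solved from +$600
RATE_RETIREMENT = 0.20  # ASSUMPTION: predicted retirement rate, from -$400


def net_gain_401k(contribution, rate_now, rate_retirement):
    """Tax avoided today minus tax owed when the money is withdrawn later."""
    return contribution * (rate_now - rate_retirement)


def net_gain_hsa(contribution, rate_now):
    """Tax avoided today; nothing owed later if spent on medical expenses."""
    return contribution * rate_now


print(round(net_gain_401k(5000, RATE_NOW, RATE_RETIREMENT)))  # -400
print(round(net_gain_hsa(5000, RATE_NOW)))                    # 600
```

Under these assumptions the $1k difference is simply the retirement tax that the HSA never pays.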

But they are not at their HSA max yet, so they have not fully magnified their money! 

A quick side note:  The $5000 figure I use approximates what an average family spends on out-of-pocket medical expenses each year. 

So now let’s say our married couple maxes out their HSA, then reimburses themselves $5k of that.  Now they have increased their annual net worth by an additional +$264.  They also have a decent budget surplus, which they can use to fund a Roth IRA.

Since their budget surplus comes from tax-free, reimbursed HSA money, their Roth IRA is now a Roth HSA! More medical expenses submitted this year means more for the Roth HSA.

Here is another very crucial consideration regarding my Roth HSA concept.  The medical expenses don’t have to be from the current year.  They can be from any prior year back to when the HSA was first created.  This is important because sometimes the various Government annual limits on retirement accounts can get in the way of my Roth HSA technique.  That’s usually a good thing, because it often means lots of income and savings those years.  But since medical expense receipts can be saved up, there is nothing wrong with waiting for a better time to use them.

Here is one example of how that might play out.  Let’s give our married couple more income. Each year with finances like this, they can max out both their HSA and their Roth IRA without submitting medical expenses.

However, some of their income is in the 22% tax bracket, which is higher than the tax rate they anticipate in retirement.  They could decide to put another $8900 into their 401k to get to the $19.5k annual limit (if they both have 401k access, this limit is doubled). This improves their net worth by a small amount ($143).  More significantly, it gives them the opportunity to transfer around $6500 of untaxed HSA money into Roths!  They will need enough saved-up medical expenses, so perhaps they develop a plan to implement this strategy every few years.

You might have noticed that transferring money from an HSA to my Roth HSA does not increase overall net worth.  Then why do it?  The reason is that money in an HSA is subject to many more restrictions than money in a Roth.  And it’s the rules regarding earnings that are most relevant, because over time earnings can compound and become quite large compared to contributions:

  • HSA earnings are only potentially tax-free (because they must be used for medical expenses). Roth earnings are always tax-free when the age limit and 5-year Roth ownership rules are met.
  • HSA funds (including earnings) can be treated like income from a Traditional IRA and used for non-medical expenses (after paying taxes) starting at age 65.  Roth earnings are available without restriction after age 59.5 and are not treated as income.

October 2022

Should the 4% Rule be FIRE’d?

Ever hear of the FIRE (Financial Independence, Retire Early) movement?  Then you will know about the 4% rule.  Many interpret this rule as follows:  Once you can cover yearly expenses with 4% of an investment portfolio, then you are financially independent and don’t need a job anymore.  This interpretation is wrong.

The “4% rule” is not a guarantee, it is a statistical finding based on (some) historical records for US stocks, bonds and inflation.  It is quite easy to reproduce this finding, and get an understanding of its limitations, using a spreadsheet.

To start, you need annual returns for US stocks and bonds, as well as annual inflation rates.  I was able to find data going back to 1871 fairly easily.  Two sources for this data are Yale Professor Robert Shiller’s website and the “Simba” spreadsheet from the Bogleheads forum.  There are many others.

The concept for the spreadsheet is pretty simple.  Calculate the end of year (EOY) balance of a portfolio by adding (or subtracting) the yearly investment return. Then take out planned living expenses for the following year, after adjusting for inflation.  The mechanization is a little tricky because a new long-term investing scenario begins each year in the data range. I started as shown below, with 152 rows and 152 columns for the portfolio balance calculations.
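For readers who prefer code to cells, the calculation for a single starting year can be sketched as below. This is not the spreadsheet itself, just one reading of the description above. Two assumptions are made: the first year's expenses come out up front, and the stock/bond mix is reset to its target every year. The function names and the sample numbers are made up.

```python
def blended_return(stock_return, bond_return, stock_fraction):
    """Yearly return of a portfolio re-balanced to stock_fraction (assumption)."""
    return stock_fraction * stock_return + (1 - stock_fraction) * bond_return


def simulate_cycle(returns, inflation, start_balance, withdrawal_rate):
    """Return the end-of-year balances for one retirement starting year.

    returns[i] and inflation[i] are decimals (0.05 means 5%) for year i.
    """
    expenses = start_balance * withdrawal_rate
    balance = start_balance - expenses  # ASSUMPTION: year 1 expenses up front
    balances = []
    for yearly_return, yearly_inflation in zip(returns, inflation):
        balance *= 1 + yearly_return      # add (or subtract) the year's return
        expenses *= 1 + yearly_inflation  # adjust expenses for inflation
        balance -= expenses               # take out next year's expenses
        balances.append(balance)
    return balances


# Made-up three-year example, 50% stocks, 4% initial withdrawal on $1M.
stock = [0.10, -0.20, 0.05]
bond = [0.03, 0.04, 0.02]
mixed = [blended_return(s, b, 0.5) for s, b in zip(stock, bond)]
print(simulate_cycle(mixed, [0.02, 0.03, 0.02], 1_000_000, 0.04))
```

The spreadsheet repeats this calculation once per starting year, which is why the formulas fill a triangle of cells rather than a single column.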

EOY balance formulas go into around 11,500 cells (152*152*1/2). Fortunately, there are lots of shortcuts and auto-population features in Excel.  I don’t know about Google Sheets, but it probably has similar tools. 

Here is what I did in Excel.

After poking around on the internet, I learned how to auto-populate cells along a diagonal using the Identity Matrix function MUNIT.  On a blank worksheet, generate both a horizontal and vertical sequence of numbers out to 152.  Select that 152 by 152 area (using the numbers as a measuring tool). With the area highlighted, type “=MUNIT(152)” into the formula bar.  Instead of hitting “enter” after inputting the formula, hit “CTRL+Shift+Enter”.

This generates an identity matrix with “1” on the diagonal and “0” everywhere else.  The contents of the cells are actually formulas instead of numbers.  To turn them into numbers, make a copy of that area and paste over it with the Paste/Values option. Then get rid of all of the “0s” using the Find/Replace feature.

Next, replace all of the “1s” on the diagonal with some arbitrary number (I used -56789).  Now cut and paste the identity matrix with -56789 on the diagonal and blanks everywhere else into the simulation spreadsheet.  Use the Find/Replace feature again to substitute the desired formula for the arbitrary numbers as shown below.

At this point, I started to think about an approach for the EOY balance formulas.  I decided it would be easier if I generated yearly expenses separately.  So, I made a copy of my worksheet, called it “expenses” and linked it to the main spreadsheet by copying the starting withdrawal amount from there. Then I modified the diagonal entries with the Find/Replace feature as before.  I also cleared out unneeded content and generated the inflation-adjusted expenses formula with appropriate relative and absolute cell references.

Back on the main worksheet, I replaced the previous column of inflation data with the return rates of the mixed stock/bonds portfolio. Then I generated the EOY balance formula.

To complete the simulation, I first filled in the formulas on the expenses worksheet. Then the EOY balance formulas on the main worksheet, since they needed those expenses.  Both of these are manual click and drag operations, but go pretty quick.

The original research (from the 1990s) that postulated the 4% rule looked at 30-year retirement periods.  I decided to add a row along the top of the spreadsheet with a copy of EOY balances after 30 years, which are along a diagonal starting at cell F39. I thought I could put the next diagonal entry into the adjacent cell and click and drag to the right to auto-populate the rest of the data.  This did not work in Excel.

Instead I had to use the “INDIRECT(ADDRESS(row#,column#))” command. I leave it to you to deduce its workings.  Here is a hint: Excel equates the column heading letter “F” to the number “6”.  I also added a formula to count the negative 30-year EOY balances.
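Without spoiling the exercise, the idea of walking along a diagonal by row and column number can be illustrated outside Excel. The sketch below only lists the cell addresses on a diagonal that starts at F39 and moves one row down and one column right per step (an assumption about the layout); the helper names are made up.

```python
def column_letter(number):
    """Convert a 1-based column number to a column heading (6 -> 'F')."""
    label = ""
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        label = chr(ord("A") + remainder) + label
    return label


def diagonal_addresses(start_row, start_column, count):
    """Addresses of `count` cells along a down-and-right diagonal."""
    return [
        f"{column_letter(start_column + step)}{start_row + step}"
        for step in range(count)
    ]


print(diagonal_addresses(39, 6, 3))  # ['F39', 'G40', 'H41']
print(column_letter(27))             # AA
```

Because the row and column numbers both grow by one per step, a single counter is enough to address every cell on the diagonal.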

The spreadsheet is finished, now it is time for some research!

There are two highlighted input cells, “Withdrawal Rate” and “Percent Stocks”, initially 4% and 50%.  If you slightly increase the Withdrawal Rate to 4.1%, “Negative Balances” will change from 0 to 2.  So, given a 50% allocation to stocks, the highest safe withdrawal rate (no negative 30-year balances) is 4%.  Repeating this iterative technique at various stock allocation percentages generates the following data and chart.
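The same trial-and-error search can be automated. The sketch below is a hypothetical stand-in for the spreadsheet: it uses the same assumed withdrawal timing as a simple yearly model (first year's expenses up front), steps the withdrawal rate up 0.1% at a time, and stops as soon as any cycle ends negative. The cycle data shown is invented; real use would need the historical series.

```python
def ending_balance(returns, inflation, withdrawal_rate, start_balance=1.0):
    """Balance at the end of one cycle (negative means the money ran out)."""
    expenses = start_balance * withdrawal_rate
    balance = start_balance - expenses  # ASSUMPTION: year 1 expenses up front
    for yearly_return, yearly_inflation in zip(returns, inflation):
        expenses *= 1 + yearly_inflation
        balance = balance * (1 + yearly_return) - expenses
    return balance


def count_negative(cycles, withdrawal_rate):
    """Number of cycles whose ending balance is below zero."""
    return sum(
        1
        for returns, inflation in cycles
        if ending_balance(returns, inflation, withdrawal_rate) < 0
    )


def highest_safe_rate(cycles, max_tenths=100):
    """Largest rate, in 0.1% steps, with no negative ending balances."""
    safe = 0.0
    for tenths in range(1, max_tenths + 1):
        rate = tenths / 1000
        if count_negative(cycles, rate) > 0:
            break
        safe = rate
    return safe


# Invented data: each cycle is (portfolio returns, inflation rates).
cycles = [
    ([0.08, -0.15, 0.12, 0.05], [0.02, 0.03, 0.02, 0.02]),
    ([-0.10, -0.05, 0.20, 0.07], [0.04, 0.05, 0.03, 0.02]),
]
print(highest_safe_rate(cycles))
```

Running the search once per stock allocation would produce one point per allocation, which is how a curve of safe withdrawal rates is built up.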

Hence the 4% rule! …

Given a stock allocation between 50% and 75%, and 30-year retirement periods.

Which leads us to the main limitation of the 4% rule as it applies to the FIRE-movement community.  They have much longer timelines, which we can also simulate with our spreadsheet.

Let’s say you want your money to last until age 95.  At age 65, you need 30 years of coverage, so the chart implies a 4% maximum Safe Withdrawal Rate (SWR).  If you want to retire at age 50 with 45 years of coverage, your SWR is 3.6% (10 percent less).  At age 35 and 60 years of coverage, your SWR is 3.5%.

All these curves have plateaus which might lead to the conclusion that once you get to a certain % stock allocation, you don’t get any additional benefit with more stocks.  That depends on the interpretation of the underlying data, to which we have access and therefore can draw our own conclusions. In the previous chart, I highlighted two points on the 30-year curve.  These are the first % stock allocations on either side of the plateau where the 4% rule fails.

The first point is a 45% stock allocation.  Here is what the underlying year-by-year data looks like (this is a view of the spreadsheet at 24% magnification).

Red columns represent where the money runs out.  I superimposed the 30-year line as well as a 45-year line.

The single failure cycle is a retirement that begins in 1966, which represents a failure rate of less than 1%.  However, I decided to create an additional “nervous cycles” metric, for time-periods with failures before or close to the 45-year line.  Almost a third of the cycles are in that category. In contrast, here is what happens on the other side of the 30-year SWR plateau, at an 80% stock allocation.

Same failure rate but substantially fewer “nervous cycles”.
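The two metrics can be expressed as a simple count. The sketch below assumes each cycle has already been reduced to the year in which its balance first goes negative (or `None` if it never does), and it treats "nervous" as running out after year 30 but by year 45. The exact boundaries are an assumption, and the sample list is invented.

```python
def first_negative_year(balances):
    """1-based year of the first negative balance, or None if none occurs."""
    for year, balance in enumerate(balances, start=1):
        if balance < 0:
            return year
    return None


def classify_cycles(depletion_years, retirement_years=30, nervous_until=45):
    """Return (failures, nervous_cycles) for a list of depletion years."""
    failures = sum(
        1 for year in depletion_years
        if year is not None and year <= retirement_years
    )
    nervous = sum(
        1 for year in depletion_years
        if year is not None and retirement_years < year <= nervous_until
    )
    return failures, nervous


# Invented depletion years for five cycles.
print(classify_cycles([None, 29, 31, 45, 46]))  # (1, 2)
```

Two allocations can therefore share the same failure count while differing widely in how many cycles come uncomfortably close.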

The most conservative SWR curve possible from our simulation is where no portfolio withdrawal failures occur (no red anywhere in the 24% magnification spreadsheet view). The yellow line below depicts this.

Managers of charitable endowments would probably want to be somewhere on or below this bottom line. More stocks mean more funded programs each year. They also mean more risk of a large yearly loss, which might lead the trustees to question your strategy (and your position).

What about the FIRE community?  I think anyone who wants to retire (or work less) before age 55 should stick to this lower line.  As far as stock allocation, it’s the changing slope of this line (and how much $ they need) that might help them decide.  Definitely at least 40% stocks, which is where the highest slope ends, providing a 2.8% SWR.  Probably not more than 65% stocks, providing a 3.3% SWR.

Does this mean the 4% rule is incorrect? Or that it was once correct and is now outdated?  Not at all.  The charts created by our simulation are mostly based on cycles starting in the mid-1960s and earlier.  This means the findings of the 1990s analyses were basically the same.  It has always just been about interpretation. That said, the underlying data still has many limitations that should temper any conclusions.  For example, 123 annual cycles are not a lot.

Another concern with the 4% rule I often hear brought up is home-country bias.  The US stock market is often portrayed as an over-performing outlier compared to other countries.  It’s not easy to find data from other countries that goes back very far.  I tried but didn’t get enough to make good spreadsheet comparisons (very few 30-year cycles).

I did find some articles on the internet from authors (e.g. Wade Pfau) summarizing what they say was reasonably good data.  From these sources I approximated 30-year SWR curves for a few other countries.

Canada, the UK and the US are quite similar, with Canada’s SWR curve slightly higher and the UK’s slightly lower.

There are several countries with really low SWRs.  My 30,000-foot interpretation of this data is:

  • Countries with wars fought on their own soil (including civil wars) do poorly
  • Countries ruled by fascists or dictators, even for a short time, do poorly

Hopefully the US can avoid these things for the next few decades.

December 2022

Bayes Theorem: A Mathematical Magic 8-Ball

Bayes Theorem Part 1 — Interpreting Unlikely Test Results

Bayes theorem calculates the probability of something that is conditional upon the probability of something else.  Maybe that doesn’t sound too interesting, but think instead about these very common questions that come up in life:

  • What is the chance an unlikely result is actually true?
  • What is the chance a future prediction based on the past will actually happen?

Now imagine you have a tool, that few other people know how to use, which can answer these questions.  That’s Bayes!

Most explanations of Bayes theorem start with its mathematical expression.  I’ll save that for later and instead begin by illustrating a classic application using a spreadsheet!

This example is not related to finances, but I’ll get there in Part 4 by showing how I successfully reduced the downside risk of my portfolio from 2020 to 2022 using Bayes.

Imagine the world is in the early stages of a pandemic (if only this idea seemed far-fetched ☹).

The current rate of asymptomatic cases in the general population is 5%.  You have no symptoms, but must take a rapid test to travel on business. The rapid test, carefully evaluated via clinical trials, has a positive result accuracy (sensitivity) of 70% and a negative result accuracy (specificity) of 90%.  You test positive. What is your chance of actually being sick?  Hint: it’s not 70%.

The correct answer from the spreadsheet above is around 27%.  The actual math behind the spreadsheet is quite simple (shaded cells are inputs, other values are calculated as shown in red), but applying this math correctly is a little tricky.

The first step is to make up a number to represent a random population. In this case, I chose a sample size of 1000 (you can make this number anything you want; the answer will be the same).  For the chosen sample size and community infection rate, 50 people (5%) are positive. The remaining 950 (95%) are negative.  Of the 50 people who are actually positive, only 35 of them (70%) will test positive.  Of the 950 people who are actually negative, 95 of them (100%-90% = 10%; 10% of 950 = 95) will falsely test positive. So, the chance of actually being positive if you test positive is (35/(35+95)), or around 27%.

Let’s say you test negative.  What is the chance of actually being negative?

The spreadsheet now includes actual negatives that test negative (90% of 950 =855).  Also shown are actual positives that test negative (30% of 50 =15).  So, the chance of actually being negative if you test negative is (855/(855+15)), or around 98%.
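The same counting approach translates directly into a few lines of Python. This is a sketch of the arithmetic described above, not the spreadsheet itself; the function name `bayes_update` is made up.

```python
def bayes_update(prior, sensitivity, specificity, population=1000):
    """Return (chance positive given a positive test,
               chance negative given a negative test)."""
    actual_positive = population * prior
    actual_negative = population - actual_positive

    true_positive = actual_positive * sensitivity
    false_negative = actual_positive - true_positive
    true_negative = actual_negative * specificity
    false_positive = actual_negative - true_negative

    positive_if_test_positive = true_positive / (true_positive + false_positive)
    negative_if_test_negative = true_negative / (true_negative + false_negative)
    return positive_if_test_positive, negative_if_test_negative


pos, neg = bayes_update(prior=0.05, sensitivity=0.70, specificity=0.90)
print(round(pos, 3), round(neg, 3))  # 0.269 0.983
```

Changing `population` to any other positive number gives the same two results, since it cancels out of both ratios.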

To get a good feel for the basic math behind Bayes, you should take the time to duplicate everything you see and make sure your results match exactly.

Now let’s tweak a few things and see what happens.

What if the current rate of asymptomatic infection in the general population is higher?

With 25% of the community infected, a positive test result indicates a 70% chance of infection.  Much higher!

What if the positive accuracy of the test (sensitivity) is 100%?

Because of the low general infection rate (I moved it back to 5%), a positive result is still not very conclusive. A negative result, however, is conclusive. It really and truly means negative given that no positive cases are falsely identified as negative.

What if the sensitivity and specificity are both 50%?

This result is quite important to make note of.  A 50% test is by definition random.  Our result confirms that this test does nothing to our original probability estimate.  There is still a 5% chance of being positive and a 95% chance of being negative.

But the really, really important thing to notice is that 50% test sensitivity and 50% test specificity add up to 100%.  Of course, you know that, but I want you to notice it!  The math of Bayes theorem works such that any combination of sensitivity and specificity that adds up to 100% will not change the original probability estimate.   That test is random. Try it for yourself and see.  Make sensitivity/specificity = 60/40.  Or 70/30, or 30/70.  In each case in our spreadsheet, the new Positive result will still be 5% and the new Negative result will still be 95%. The math of Bayes goes on to conclude that if sensitivity plus specificity is less than 100%, the test is worse than random and therefore misleading. If the sum is 200%, the test is perfect and provides the exact answer regardless of prior probability.
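The tweaks in this section are easy to replay in code. The sketch below uses the formula form of the same calculation (the function names are made up) and checks the higher-prevalence case, the perfect-sensitivity case, and several tests whose sensitivity and specificity add up to 100%.

```python
def chance_positive(prior, sensitivity, specificity):
    """Chance of being positive after a positive test result."""
    true_positive = prior * sensitivity
    false_positive = (1 - prior) * (1 - specificity)
    return true_positive / (true_positive + false_positive)


def chance_negative(prior, sensitivity, specificity):
    """Chance of being negative after a negative test result."""
    true_negative = (1 - prior) * specificity
    false_negative = prior * (1 - sensitivity)
    return true_negative / (true_negative + false_negative)


# Higher community infection rate: 25% instead of 5%.
print(round(chance_positive(0.25, 0.70, 0.90), 3))  # 0.7

# Perfect sensitivity: a negative result becomes conclusive.
print(round(chance_negative(0.05, 1.00, 0.90), 3))  # 1.0

# Sensitivity + specificity = 100%: the 5% / 95% estimates do not move.
# If s + c = 1 then 1 - c = s, so p*s / (p*s + (1-p)*s) reduces to p.
for sensitivity, specificity in [(0.5, 0.5), (0.6, 0.4), (0.7, 0.3), (0.3, 0.7)]:
    print(
        round(chance_positive(0.05, sensitivity, specificity), 3),
        round(chance_negative(0.05, sensitivity, specificity), 3),
    )  # 0.05 0.95 each time
```

The comment in the loop shows why: when the two accuracies sum to 100%, the test is equally likely to come back positive for everyone, so it carries no information.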

Bayes Theorem Part 2 — Developing a Feel for Bayes Math (Without a Spreadsheet!)

A really excellent way to think about Bayes theorem is with a “grains of sand” approach.   I learned about this technique from a book by science author Sean Carroll.  Here is how it applies to the medical test example above. Imagine starting with 1000 grains of sand. In your mind’s eye, separate these grains into two buckets based on the original positive likelihood estimate of 5%.  That means 50 grains in one bucket and 950 grains in the other.

We obtain new information relevant to our concern and estimate its sensitivity (accuracy of a positive result, or chance of confirming the concern) at 70%.  This tells us that 35 of the 50 grains of sand in our positive bucket are True Positives.  We also estimate a specificity (accuracy of a negative result, or chance of ruling out the concern) of 90%.  This tells us that 855 grains (90% of 950) of the sand in our negative bucket are True Negatives.  That means 950 – 855, or 95, are False Positives.  Now we can calculate a New Positive Estimate from the True and False Positives.
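The bucket arithmetic can be written out step by step. The sketch below only restates the numbers in the paragraph above; the variable names are mine.

```python
grains = 1000
prior = 0.05          # original positive likelihood estimate
sensitivity = 0.70    # accuracy of a positive result
specificity = 0.90    # accuracy of a negative result

positive_bucket = grains * prior                # 50 grains
negative_bucket = grains - positive_bucket      # 950 grains

true_positives = positive_bucket * sensitivity       # 35 grains
true_negatives = negative_bucket * specificity       # 855 grains
false_positives = negative_bucket - true_negatives   # 95 grains

# New Positive Estimate = True Positives / (True Positives + False Positives)
new_positive_estimate = true_positives / (true_positives + false_positives)
print(f"New Positive Estimate: {new_positive_estimate:.0%}")  # 35 / 130, about 27%
```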

This is the same positive result obtained from the first medical test example in this article.  Note that I did not show a new negative estimate.  The data is available (True and False Negatives), but if I wanted to use this technique for that calculation, I would do it separately with a rearranged “grains of sand” model.

You can use this technique iteratively when new information is obtained to make math-based educated estimates in many areas.  The next time you make a sports bet (or perhaps the first time, given what you are about to learn), consider this 100 “grains of sand” approach:

  • In the last couple of years, my favorite team has won only 33% of their games.
  • Their next game is at night.  They don’t play at night very often, but when they do, they seem to win around half of the time.  I don’t think this is a fluke. There is something about my team and night games!  In my mind, this doesn’t change their chance of losing a day (non-night) game much.  It might go up slightly to make up for fewer losses occurring at night.  In Bayes terms, that means a “night game” test sensitivity of 50% and a test specificity (losing a non-night game) of 68%.
    • Note: This is where many people get Bayes reasoning wrong. I have seen several Bayes sports-betting examples where specificity is set equal to sensitivity by default, which is often incorrect. Each needs separate reasoning. In this case, for example, we know that setting both to 50% would be a random premise providing no added insight.

Our observation about past night games leads us to predict a 44% chance of winning the upcoming night game. But there is more to consider.

  • My team recently acquired a new star player.  He is battling injuries and hasn’t played much, but he is playing in the next game.  So far, they won 45% of their games with him in the lineup.  Even when he is out, the team seems to be playing better (slightly less likely to lose than before).  I attribute this to his overall positive effect on the team’s morale.  For this Bayes iteration, sensitivity is 45% and specificity is 65%.
  • This 2-iteration Bayes assessment predicts your team has a 50/50 chance of winning their next game (even-money).  If you find a sports-gambling facility offering more than even-up odds (on either a win or a loss), and you have a little fun-money, take the odds and bet!
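The two iterations above can be chained in a short script. This is a sketch under my own naming (`grains_update` is hypothetical), and it carries the unrounded result of the first step into the second, so the intermediate figure differs slightly from the rounded one quoted above.

```python
def grains_update(prior, sensitivity, specificity, grains=100):
    """One "grains of sand" step: return the new chance of the positive outcome."""
    positive_bucket = grains * prior
    negative_bucket = grains - positive_bucket
    true_positives = positive_bucket * sensitivity
    false_positives = negative_bucket * (1 - specificity)
    return true_positives / (true_positives + false_positives)


win_chance = 0.33                                     # recent win rate
after_night = grains_update(win_chance, 0.50, 0.68)   # night-game premise
after_star = grains_update(after_night, 0.45, 0.65)   # star-player premise

print(f"After night-game premise: {after_night:.1%}")
print(f"After star-player premise: {after_star:.1%}")
```

Starting from 0.33, the first step comes out a little under 44% (starting from exactly one third, it rounds to 44%), and the second step lands at roughly 50% either way.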

Some things to note:

  • The order of these two iterative steps does not matter.  The result is the same either way. 
  • You need to consider the sum of sensitivity and specificity related to any premise.  If it is 100%, the premise is random and does not add any insight.  If less than 100%, it is worse than random and generates meaningless results. 
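Both notes can be checked numerically. The sketch below uses the sports-bet figures from above. The likelihood-ratio helper is a standard way of restating a Bayes update that I am adding here; it was not part of the original spreadsheet.

```python
def update(prior, sensitivity, specificity):
    """Chance of the positive outcome after one premise is applied."""
    true_pos = prior * sensitivity
    false_pos = (1 - prior) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


def likelihood_ratio(sensitivity, specificity):
    """Factor by which a premise multiplies the odds of the positive outcome."""
    return sensitivity / (1 - specificity)


prior = 0.33
night = (0.50, 0.68)   # (sensitivity, specificity) of the night-game premise
star = (0.45, 0.65)    # (sensitivity, specificity) of the star-player premise

# Apply the two premises in both orders.
night_then_star = update(update(prior, *night), *star)
star_then_night = update(update(prior, *star), *night)
print(f"{night_then_star:.4f} vs {star_then_night:.4f}")

# A premise whose sensitivity and specificity sum to 100% multiplies the
# odds by 1, so it cannot move the estimate.
print(likelihood_ratio(0.60, 0.40))
```

Each premise multiplies the odds by its own ratio, and multiplication does not care about order, which is why the two orderings agree.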

Bayes Theorem Part 3 — Crazy Claims and Bayes Thinking.

“Extraordinary claims require extraordinary evidence”

This quote is attributed to the famous scientist Carl Sagan, though other scientists before him said similar things.  This is actually a great way to summarize an important conclusion of Bayes Theorem. We’ve seen how Bayes works.  If something is really unlikely, like a rare disease, a test for that condition needs to be extraordinarily accurate to indicate it is more likely than not. Here are some examples:

The last column shows that a positive result from a test that is 99.9% accurate (both sensitivity and specificity) predicts only a 50/50 chance of something that is nominally a 1 in 1000 event.
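The 50/50 figure is easy to verify. In the sketch below, the 90% and 99% accuracy levels are my own picks for comparison and are not necessarily the ones in the table; only the 99.9% case is taken from the text.

```python
def positive_given_positive(prior, accuracy):
    """Chance the claim is true after a positive result from a test whose
    sensitivity and specificity both equal `accuracy`."""
    true_pos = prior * accuracy
    false_pos = (1 - prior) * (1 - accuracy)
    return true_pos / (true_pos + false_pos)


prior = 1 / 1000   # a nominally 1 in 1000 event
for accuracy in (0.90, 0.99, 0.999):
    result = positive_given_positive(prior, accuracy)
    print(f"{accuracy:.1%} accurate test: {result:.1%} chance after a positive result")
```

Even a test that is right 99 times out of 100 leaves the claim at under a 10% chance; it takes 99.9% to reach a coin-flip.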

Let’s say someone claims something extremely unlikely has happened.  For example, “The election in our State was stolen!” You are of course skeptical, but he has evidence!?  He directs you to a copy of an affidavit signed by a guy in Italy. This document describes secret CIA software that years ago was used to flip electronic votes in Venezuela.  More recently, according to the Italian, this software was modified by a Spanish company to target American elections!!!

You do some research on election procedures in your State. These include paper ballots, multiple levels of independent audits during the counting of those ballots, and automatic hand recounts if the results are close.  In addition, vote counting software has been used in your country thousands of times with no known cases of electronically flipping cast votes. Given this knowledge, you estimate there is less than a 1 in 1000 chance of this happening.

Still, the person making the claim seems truly convinced.  He has probably not applied Bayes thinking, but you can.  Imagine a test for “electronically flipped ballots” that is 99.9% accurate.   Think of this test as some sort of evidence presented to 1000 truly unbiased (we are dealing with politics) experts on computer software used in elections. To get a positive test result, 999 of the 1000 experts need to agree the evidence indicates votes were flipped.  This same number of experts also has to agree that without the evidence, or prior to being shown the evidence, votes were not flipped. Even then, given the original low likelihood estimate, agreement from those 999 experts only gets you to a coin-flip.

Bayes Theorem Part 4 — Designing a Useful Test

In actual medical or other scientific applications, tests designed using Bayes techniques are evaluated for usefulness based on their sensitivity and specificity. This can be an extremely complex task, especially if one of these values is less than 50%. Nonetheless, I came up with something that approximates the material I reviewed.  So, for entertainment purposes only, here is my highly subjective, unscientific ranking scale for Bayes test usefulness:

  • Note that this scale does not apply to Bayes thought (e.g. “grains of sand”) experiments. It only applies to tests developed from collected data.  In thought experiments, anything over 100% for the sum of sensitivity and specificity is a good goal to help inform odds estimates.

Of course, to apply this scale, we need to come up with a Bayes test to evaluate.  One method is to collect some data you think is conditionally related in a spreadsheet and postulate a true/false premise (a proposed test) for that data.  Then you can calculate both the sensitivity and specificity of your proposed test from the results.  Here is an example using some historical financial data (I said I would get here).

This test looks at the relationship between the annual returns of the S&P 500 and inflation starting in 1928.   Two formulas compare inflation to a test value, separating the S&P returns into different columns based on whether 12-month inflation at the end of that year is above or below the test value. The bottom of the spreadsheet contains Bayes-related statistics derived from the data above.  Many of these calculations involve the “COUNT” and “COUNTIF” Excel formulas.  The grey box below shows an example.  The blue boxes show the equations for calculating sensitivity and specificity.  With this knowledge, you can (and should) recreate the results you see.
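For readers who prefer code to COUNTIF, the same counting can be sketched in Python. Everything here is an assumption for illustration: the eight data pairs are made-up placeholders (not the historical series), and the framing (test positive means inflation above the test value, the condition is a negative market return) uses the standard textbook definitions of sensitivity and specificity, which may not match the spreadsheet's layout exactly.

```python
# Placeholder (inflation, S&P 500 return) pairs, NOT the historical data.
years = [
    (0.02, 0.12), (0.03, -0.05), (0.07, -0.10), (0.01, 0.20),
    (0.06, 0.04), (0.08, -0.15), (0.02, 0.08), (0.04, -0.02),
]
test_value = 0.05  # inflation threshold being tested

# Assumed framing: the test is "positive" when inflation is above the test
# value, and the condition being tested for is a negative market return.
true_pos = sum(1 for infl, ret in years if infl > test_value and ret < 0)
false_neg = sum(1 for infl, ret in years if infl <= test_value and ret < 0)
true_neg = sum(1 for infl, ret in years if infl <= test_value and ret >= 0)
false_pos = sum(1 for infl, ret in years if infl > test_value and ret >= 0)

sensitivity = true_pos / (true_pos + false_neg)   # share of down years flagged
specificity = true_neg / (true_neg + false_pos)   # share of up years not flagged

print(f"sensitivity {sensitivity:.0%}, specificity {specificity:.0%}, "
      f"sum {sensitivity + specificity:.0%}")
```

Swapping in real annual data and changing `test_value` would give the kind of sweep over test values discussed here.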

Changing the Test Value yields different results for these parameters.  Unfortunately, for this particular test, no reasonable Test Value provides a combination of sensitivity and specificity deemed “Useful” per my ranking scale.

However, an interesting thing happens by changing the polarity of the test.

Now we have a value for sensitivity which is just below my scale’s threshold of being useful.  If your scale is a little more lenient, you might interpret this result as follows: if inflation is likely to be higher than 5% this year, then the chance of a negative stock market return increases from 27% to 44%.

It might seem odd that the sum of sensitivity and specificity is different after changing the polarity of the test, but not the value of the test variable.  However, in order to make the sum the same, we also need to change the polarity of positive/negative results in the appropriate formulas.  In other words, “negative market returns” become “positive conditions” of the test.  It sounds more confusing than it is, and in the end any useful conclusions from the test are the same.  It’s worth the effort to generate this modified spreadsheet if you feel so inclined. I won’t do it here. The takeaway, however, is to make sure to use both polarities, like I did, to evaluate Bayes test usefulness. For the comparison of annual S&P 500 returns to inflation, the table below shows the results at different inflation levels.
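The polarity effect shows up clearly with a small set of counts. The four counts below are hypothetical, chosen only to illustrate the point, and the definitions are the standard ones (sensitivity = true positives over all actual positives, specificity = true negatives over all actual negatives).

```python
def sens_spec(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from the four outcome counts."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts, for illustration only.
tp, fn, tn, fp = 8, 12, 60, 20

original = sens_spec(tp, fn, tn, fp)

# Flip only the test: old "test positive" years become "test negative" ones,
# so true positives swap with false negatives and true negatives with false positives.
test_flipped = sens_spec(fn, tp, fp, tn)

# Flip the test AND the condition: true positives swap with true negatives,
# false negatives with false positives.
both_flipped = sens_spec(tn, fp, tp, fn)

for name, (sens, spec) in [("original", original),
                           ("test flipped", test_flipped),
                           ("both flipped", both_flipped)]:
    print(f"{name}: sensitivity {sens:.0%}, specificity {spec:.0%}, sum {sens + spec:.0%}")
```

Flipping only the test changes the sum (it becomes 200% minus the original sum), while flipping both simply swaps sensitivity and specificity and leaves the sum alone.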

The red-shaded cells indicate information that is not useful per my rating scale.  So, this tells me I can’t draw any conclusions from this test about stock market performance (compared to the baseline sample results) given projected inflation levels less than 6%.  Above 6% however, I can reasonably estimate that the chance of a negative stock market return increases from 27% to 40%.  Above 7%, the increased chance of a negative return goes up slightly less, from 27% to 36%.

Of course, there are unlimited ways to test (or analyze) data in a spreadsheet.  But with a Bayesian approach, there is also an indication of test quality.

Here is another analysis similar to the one for S&P returns, but instead comparing inflation to real (after inflation) returns from US Intermediate Term Bonds.

This test produces much more useful data per my scale.  Though if you understand how bonds work, you probably don’t need a test to tell you that their real returns are more likely to do well when inflation is low and poorly when inflation is high.  But this test helps quantify that conclusion at specific inflation levels.

In these examples, I purposely chose a range of returns that ended in 2020.  That year, in response to the COVID pandemic, the US and other governments printed a lot of money.  It did not take a genius to figure out that big inflation would result relatively soon.  So, what was a Bayesian investor (like me!) to do?

Let’s look at bonds first.  The 3% inflation level is an interesting result.  Below that level, the chance of a positive return increases from 66% to 76% (this 76% figure implies that the chance of a negative return is 24%).  But this result also finds that above 3%, the chance of a negative return is 49%.  So, what happens if inflation is predicted to be exactly 3%?  In that case, since the sensitivity of the test is higher than the specificity, we would put more credence in the 49% negative return prediction.

But the real usefulness of the bond test results is at the higher inflation levels.  Starting with inflation above 5%, the chance of a negative real return from bonds increases from 34% to 78%!  As inflation goes higher, the chance of negative bond returns increases further.  This test is pretty clear that bonds are not a great place to invest if you see inflation above 5% on the horizon!

Do you move money from bonds to stocks? 

As seen earlier, we don’t start getting useful Bayesian results about stocks until inflation is above 6%.  At that level, the chance of a negative annual return from stocks increases from 27% to 40%.   At 7% inflation the chance of a negative return from stocks is 36%, slightly less than it was at 6% inflation.  This pattern indicates that high inflation makes stocks riskier, but unlike the situation for bonds, this effect lessens as inflation goes even higher. Since you don’t know how high above 5% inflation could go, leaving the stock percentage of your portfolio alone seems prudent.

If not bonds to stocks, what then?

Given a decision against increased equity exposure, is there somewhere better than Intermediate Term Bonds to put the “safe” part of our portfolio?  I looked back at three possibilities: 1-year Treasuries, 3-month Treasuries and my 401k Stable Value fund. Interestingly, my table of Bayes data for these alternatives looked almost identical to the original bond analysis.  So, I went a step further.  I averaged the negative real returns in the negative test results column at a 6% inflation level.  In the original bond analysis, the average was -7%. For 1-year Treasuries, the average was -6.5%.  For both 3-month Treasuries and Stable Value, the average was -5%.  These last two choices are definitely better for the safe part of a portfolio when a big inflation threat is looming.

I chose Stable Value.  Over the next two years, inflation exceeded 6% per year. Bonds cumulatively lost around 13%. Stable Value gained around 4%. In real terms, Stable Value lost money, but did much better than bonds.

Bayes Theorem Part 5 — The Formula

I said I would get to a mathematical expression for Bayes Theorem and here it is:

If we define the “Probability of a Positive Test Result” as P(B), then the “Chance of Actually Being Positive Given a Positive Test Result” is P(A|B).   But learning the specific symbology of this formula is not all that important for us.  If you understand the spreadsheet and thought experiments we went through, you know Bayes Theorem!
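For reference, the standard textbook statement of the theorem in that notation is written out below. The mapping onto the spreadsheet terms is my own reading of the examples above: P(A) is the original probability estimate, P(B|A) is the sensitivity, and P(B|not A) is 100% minus the specificity.

```latex
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)},
\qquad
P(B) = P(B \mid A)\,P(A) + P(B \mid \neg A)\,\bigl(1 - P(A)\bigr)
```

With the first medical test numbers (5% prior, 70% sensitivity, 90% specificity) this gives 0.035 / (0.035 + 0.095), or about 27%, matching the “grains of sand” result.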

September 2022
