Modus Tollens

Mark Spitznagel recently wrote a great book called Safe Haven. It got me thinking about logic and finance in a way I hadn’t before, and it really helps cut through the noise in the investing world. The book combines ideas of probability and non-ergodicity with a passion for investing.

Spitz contends that if the goal of the investor is to increase wealth, then they should focus not on arithmetic returns but on the compound annual growth rate (CAGR). Doing so requires not only “setting sail” and going after returns but also mitigating risk through cost-effective insurance.

How does insurance help? If you can find a cost-effective way to protect wealth, you are better off in the long run. Losses have an asymmetric effect on portfolio growth owing to the concavity of geometric averaging (think of the graph of a logarithm). As a simple example, a 50% loss requires a 100% gain to recover from.
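A minimal numeric sketch of that asymmetry (illustrative numbers, nothing taken from the book):

```python
# A 50% loss followed by a 50% gain does NOT get you back to even.
wealth = 100.0
wealth *= 0.5   # a 50% loss leaves 50.0
wealth *= 1.5   # a 50% gain only brings it back to 75.0

def required_gain(loss: float) -> float:
    """Gain needed to fully recover from a fractional loss (0.5 = 50%)."""
    return loss / (1.0 - loss)

print(required_gain(0.5))   # 1.0, i.e. a 100% gain to undo a 50% loss

# The arithmetic average return can be positive while the growth rate
# (what compounds your wealth) is negative:
returns = [0.5, -0.4]                      # +50%, then -40%
arith = sum(returns) / len(returns)        # +0.05 arithmetic mean
growth = 1.0
for r in returns:
    growth *= 1.0 + r                      # 1.5 * 0.6 = 0.9
cagr = growth ** (1.0 / len(returns)) - 1  # about -5.1% per period
print(arith, cagr)
```

The arithmetic mean says +5% per period; the geometric (compounded) rate is negative, which is exactly why Spitz points at CAGR rather than arithmetic returns.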

The book has a central hypothesis:

If a strategy cost-effectively mitigates a portfolio’s risk, then adding that strategy raises the portfolio’s CAGR over time.

Spitz applies Popperian falsification to rigorously test the cost-effectiveness of risk mitigation. The essence of Popperian falsification comes down to a syllogism called modus tollens or “denying the consequent.”

The structure of modus tollens looks like “If hypothesis, then outcome. Not outcome. Therefore, not hypothesis.” Symbolically, it can be represented as the following:
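In standard notation (premises above the line, conclusion below):

```latex
\frac{P \to Q, \qquad \lnot Q}{\therefore\ \lnot P}
% Equivalently, as a tautology:
\big((P \to Q) \land \lnot Q\big) \to \lnot P
```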

The logic here can be a bit tricky, particularly when you consider the cases where the hypothesis is false (vacuous truth). Here is the truth table for the proof of modus tollens as a refresher:
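As a quick check, a few lines of Python (my own sketch, not from the book) can enumerate the truth table and confirm the argument form is a tautology, vacuous-truth rows included:

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Material implication: a -> b is false only when a is true and b is false."""
    return (not a) or b

# Modus tollens as a single formula: ((P -> Q) and not Q) -> not P
print(f"{'P':<6}{'Q':<6}{'P->Q':<7}{'~Q':<6}{'~P':<6}{'MT':<5}")
for p, q in product([True, False], repeat=2):
    mt = implies(implies(p, q) and (not q), not p)
    print(f"{p!s:<6}{q!s:<6}{implies(p, q)!s:<7}{(not q)!s:<6}{(not p)!s:<6}{mt!s:<5}")

# Every row evaluates to True, so modus tollens is a valid inference.
assert all(
    implies(implies(p, q) and (not q), not p)
    for p, q in product([True, False], repeat=2)
)
```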

In short, what modus tollens tells us is that if we have an implication and we have a negation of the consequent, then we can conclude the negation of the antecedent. So we can conclude that our hypothesis is false.

What we cannot do, however, is prove that a hypothesis is true (though the logical argument may be valid). This is the basis of the scientific method: disproving hypotheses to get closer to the truth.

If we deny the outcome by finding that adding the risk mitigation strategy to a portfolio does not raise the CAGR of the portfolio, then we must also deny that the strategy cost-effectively mitigates the risk of the portfolio.

Spitz makes a good point about starting with a hypothesis a priori, rather than building an ad hoc hypothesis that fits experimental data.

“We need logical explanations that are independent and formed prior to those observations.”

This statement hits at the idea of formulating causal relationships for hypotheses — something Judea Pearl talks about in The Book of Why. Nutrition is an example of a field with lots of data to which you can fit an ad hoc hypothesis while having no understanding of the causal relationships that may (or may not) exist underneath, and then have a hard time falsifying with existing data.

Spitz also brings up logical fallacies to be avoided such as “affirming the consequent” and “denying the antecedent”. Again, I refer the reader to Sean McClure’s excellent podcast on logical fallacies for further investigation.

I’m currently doing a deep dive on cost-effective risk mitigation for portfolio management. In particular I’m looking at the feasibility of using convex products to protect either a single stock position, or a basket of stocks like the SPX.

Thinking About Probability #4


Continuing the detour here before I return to Papoulis & Pillai.

I recently came across a great explainer video on ergodicity by Alex Adamou.

Ergodicity is an incredibly important concept to understand when dealing with multiple trajectories of a stochastic process — which is simply a random process through time. Common examples of stochastic processes include games of chance (e.g. rolling a die) and random walks (e.g. the path a molecule takes as it travels through a liquid or gas).

Taking some examples from Alex, below is a graph representing a stochastic process X(t) with a set of four trajectories {xn(t)}.

To determine the finite time average we fix the trajectory and average horizontally over the time range

Finite time average

When time diverges the time average is given by

Time average

In the finite ensemble average we fix time and average over the trajectories

Finite ensemble average

When the number of systems diverge the ensemble average is given by

Ensemble average (imagine there are many more trajectories)

A stochastic process is ergodic if the time average of a single trajectory is equal to the ensemble average.
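In symbols, the standard definitions (my notation, consistent with the description above, with x_n(t) the n-th trajectory):

```latex
% Finite time average of trajectory n over the window [0, T]:
\bar{x}_n(T) = \frac{1}{T}\int_0^T x_n(t)\,dt
% Time average (T diverges):
\bar{x}_n = \lim_{T\to\infty}\frac{1}{T}\int_0^T x_n(t)\,dt
% Finite ensemble average over N trajectories at fixed time t:
\langle X(t)\rangle_N = \frac{1}{N}\sum_{n=1}^{N} x_n(t)
% Ensemble average (N diverges):
\langle X(t)\rangle = \lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} x_n(t)
% Ergodicity: \bar{x}_n = \langle X \rangle for the trajectories of the process.
```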

Shown graphically, here is what we are examining:

Ergodic perspective

And that’s really all there is to ergodicity! Ergodicity is a special case where the time average is equal to the ensemble average.

Said differently, the time average and the ensemble average are only interchangeable when the process under consideration is ergodic.

This is a key point because the ensemble average is often easier to determine than the time average. Alex points to economics as an example where people often utilize expectation values to represent temporal phenomena without knowing if the system is ergodic.

In many real world scenarios, things exhibit strong path dependence meaning that they are very likely non-ergodic. The classic example is Polya’s urn where a ball is selected from an urn with balls of two different colors. The ball is then replaced and another ball of the same color that was selected is added to the urn. The first ball selected has a large impact on the subsequent makeup of the urn over multiple trials and breaks ergodicity.
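A quick simulation (my own sketch; the urn size and seeds are illustrative) makes the path dependence visible:

```python
import random

def polya_urn(steps: int, seed: int) -> float:
    """Polya's urn: draw a ball at random, return it, and add one more
    ball of the same color. Returns the final fraction of red balls."""
    rng = random.Random(seed)
    red, blue = 1, 1  # start with one ball of each color
    for _ in range(steps):
        if rng.random() < red / (red + blue):
            red += 1
        else:
            blue += 1
    return red / (red + blue)

# Each seeded run settles near a *different* red fraction: the early
# draws lock in the urn's long-run makeup, which is what breaks ergodicity.
fractions = [round(polya_urn(10_000, seed), 3) for seed in range(5)]
print(fractions)
```

Averaging over many urns (the ensemble) tells you very little about the fate of any single urn (the time view), which is the whole point.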

I highly recommend watching Alex’s video for additional examples. I think he did a really good job of distilling the essence of ergodicity for the layman.

Now I am thinking about how you can tell if a process is stochastic or the result of deterministic chaos…

Thinking About Probability #3

At long last, I am finally returning to the blog. I want to address my earlier point about Hume and review Uwe Saint-Mont’s recent paper Induction: A Logical Analysis to provide some clarification on the application of probability and statistics.

If it seems like I am belaboring the point it’s because I am. When learning probability the inductive approach comes up early and often. Having a sense of the guide posts early on will hopefully lead to better understanding of when and where different methods are applicable.


I mentioned in a previous post that Hume attacked induction with force. He disputed “that inductive conclusions, i.e., the very method of generalizing, can be justified at all.” Since his work induction has been looked at with skepticism even to this day. Popper and Miller went so far as to provide a proof of the impossibility of inductive probability.

As Pillai would point out, we use induction in probability and statistics because in certain – but broad-ranging – cases, it works. Supporting Pillai, Jaynes comments (p. 699) on Popper and Miller that their proof is “written for scientists… like trying to prove the impossibility of heavier-than-air flight to an assembly of professional airline pilots.” Understanding where induction is applicable is key.

I’d like to go into Uwe Saint-Mont’s paper in more detail later, but here I’ll just summarize. Saint-Mont seeks to deal with Hume’s objections to induction in a constructive way. The key concepts are boundedness and information.

For bounded problems, induction is a reasonable approach to logic. You can use a deck of cards or a game of dice as examples of bounded problems. The rules of the game are well understood at the outset and outcomes are bounded.

With regard to information, Saint-Mont considers a basic model with tiers going from more general to less general. This introduces the concept of layers of information and the distance between those layers. If the distance is bounded, well-structured inductive leaps from the less general to the more general can be made in “small” steps – the less general tier is effectively a subset of the more general tier. In such instances (the realm of “Mediocristan,” as Taleb would write), the law of large numbers holds and statistics provides a valid response to Hume’s arguments. If the distance is unbounded, “a leap of faith from [less general] to [more general] never seems to be well-grounded.”

As Saint-Mont writes “without any assumptions, boundary conditions or restrictions – beyond any bounds – there are no grounds for induction.” Thus, when using inductive logic it is important that your boundaries be well defined in order for the “logic funnel” to be applicable.

I can easily imagine constructs where you can create consistent models for induction (games of chance, certain closed financial systems, isolated computer algorithms). The question remains, however, how applicable can these constructs be to real life? Understanding that is key to useful application of probability and statistics since in unbounded systems uncertainty will dominate and induction can rapidly lead you astray.

Thinking About Probability #2

I left off my previous post with Hume’s problem of induction and a way forward coming from Uwe Saint-Mont.

For this post, I was originally planning to do a deep dive on the structure of logic and epistemic uncertainty to help frame future discussions. Fortunately Sean McClure has already done a better job than I could ever do in a recent series on his podcast, NonTrivial.

I highly recommend the listen as a primer for logical analysis to help understand where logic can be useful and where risk management becomes a better proxy, particularly in complex situations.

He spends a lot of time talking about inductive logic and Popperian falsification, which I think will become foundational as this probability series progresses.

Here is the link to the first part of the Facts and Logic Series on NonTrivial:

Here is a link to Sean’s comprehensive list of logical fallacies:

Next post I’ll get back to probability starting with the Uwe Saint-Mont paper.

Thinking About Probability #1

This is my first attempt at adding some of the Notability maths into the blog – we’ll see how it goes.

As I mentioned in my previous post I am starting to chronicle my re-learnings in probability and statistics, starting with the textbook “Probability, Random Variables and Stochastic Processes” by Papoulis and Pillai. The writing is pretty engaging for a math book and I’m hoping to make it at least a chapter before getting distracted. Joe Norman incorporated parts of the book in his Applied Complexity Science course I took recently, and I grabbed it off Amazon, where I was pleasantly surprised to find a positive review from Taleb.

On to the learning. It’s late here so I’ll start from the beginning and maybe get through some thoughts on a few ideas/equations.

We start with an intro to probability and I think one of the key takeaways is that probability is dealing specifically with mass phenomena. How many events? Well, it depends. More on this later I suppose.

A physical interpretation of probability is provided by the following equation (Notability works great btw):

P(A) ≈ n(A) / n

Where P means probability, A stands for an event, n(A) is the number of times the event occurred and n is the number of experiments run.

In a classic example of flipping a coin, if you flip the coin 10 times and you get heads 4 times out of the 10, then your probability P(A) of getting heads (funny undergrad story on this I might tell later) is 4/10 or 0.4.

Easy enough… but is it?

This equation is an approximation of the probability and has a couple of key assumptions baked in (assumptions to me are like kernels or rules of an automaton; perhaps more on this later). The first assumption is that the relative frequency of the occurrence of A is close to P(A) provided that n is sufficiently large. Obvious question: how large is large enough? The second assumption is that, provided you have met the first demand, the equation is valid only with a “high degree of certainty”.

Any use of this equation for prediction in the real world takes us down the path of induction. You’ll never be able to run an experiment an infinite number of times, so an estimate of the probability of an event occurring on the (n+1)th experiment has to be based on a priori knowledge. Say you flip a coin 100 times and see heads half of the time; you would then induce that the next 100 flips will yield heads half of the time.
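A small simulation (my own illustration) shows both the appeal and the catch: the relative frequency n(A)/n settles down, but only “for n sufficiently large”:

```python
import random

# Sketch of P(A) ≈ n(A)/n for a fair coin: estimate the probability of
# heads from n flips and watch the relative frequency settle (slowly).
def relative_frequency(n_flips: int, seed: int = 42) -> float:
    rng = random.Random(seed)
    heads = sum(1 for _ in range(n_flips) if rng.random() < 0.5)
    return heads / n_flips

for n in (10, 100, 10_000):
    print(n, relative_frequency(n))
# The estimate drifts toward 0.5 as n grows -- but "how large is large
# enough" is exactly the inductive question raised above.
```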

David Hume had a problem with inductive logic. Gauche had the following summary of his argument:

“(i) Any verdict on the legitimacy of induction must result from deductive or inductive arguments, because those are the only kinds of reasoning.

(ii) A verdict on induction cannot be reached deductively. No inference from the observed to the unobserved is deductive, specifically because nothing in deductive logic can ensure that the course of nature will not change.

(iii) A verdict cannot be reached inductively. Any appeal to the past successes of inductive logic, such as that bread has continued to be nutritious and that the sun has continued to rise day after day, is but worthless circular reasoning when applied to induction’s future fortunes.

Therefore, because deduction and induction are the only options, and because neither can reach a verdict on induction, the conclusion follows that there is no rational justification for induction.”

Whoops. Now what?

It’s fucking late now and I’m just one page into Papoulis. I’m having fun though so I’ll keep doing this (daily?) and hopefully have time to go back and properly reference things.

Next post, I want to pick this up with a way-out of the inductive reasoning trap which I found in a recent paper by Uwe Saint-Mont. Then maybe I’ll move on to deductive (math-fun) reasoning.

Returning to the Blog

I have had this nagging sensation that I need to write again. Had everything set up to go on my iPad for some late night writing sessions and then my keyboard died. I wasn’t sure what the issue was but finally just decided to replace it.

Lots of things going on since my last post. Top of mind is $GME and the recent hedge fund blow-ups. I like GME a lot and think they have some long-term potential to turn around into a major gaming media company. Also, if we turn the corner on coronavirus and can go back to small public gatherings, I can see some huge potential for board gaming as a turnaround strategy for retail gaming. Nowhere near justifying the $300+ price point, but a good direction for the company nonetheless, I think.

I have also been spending a lot of time circling around complex systems. I think I was originally introduced to the topic through a Coursera class from Scott Page years ago. Some of the recent work from Yaneer Bar-Yam, Nassim Taleb, Joe Norman and Ole Peters has kindled the old fire and I have been reading everything I can get my hands on. I am finding my main limitation so far has been my lack of a deep working knowledge of probability and statistics so I am starting from scratch with a goal of eventually understanding Nassim’s recent writings in his Technical Incerto.

I recently learned that Notability now has an automated handwriting to equation tool so I’ll be posting some of the learning as I go along. It’s mostly for my sake to help express my understanding but feel free to follow along.

I’ll start with Papoulis and see where it goes from there. I have a tendency to bounce around books a bit as I find textbook statistics incredibly boring. I figure by the end of it I’ll have a self-made understanding of probability and stats and hopefully some interesting insights along the way.

Lastly, just before I pop off, I have been digging into Christopher Alexander’s Nature of Order recently. I am currently on Book 1 and it’s pretty damn interesting already. I’m sensing a deep dive into David Bohm’s physics work on implicate order in the near future.

Lastly, lastly, I had the pleasure of running across a recent blog post from Jerry Neumann on LinkedIn. The first piece I read was called Strategy Under Uncertainty and it is a wonderful combination of Christensen, Taleb and his personal experiences in angel and VC startup investing. His blog is called the Reaction Wheel (after some kind of satellite part) and I slogged through all 270 posts over the Christmas holidays. It was time well spent.

Let’s Talk About Your Quantifiable Value Proposition

If you’re developing a product or service you’ve probably come across the term Quantifiable Value Proposition (QVP). What is QVP and why is it important?

Your QVP is a measure of the value that you provide to your customers and clients through your products and services. As the name implies, the key concept of QVP is that the value you provide is quantifiable up front – and it’s best if your QVP is also measurable after the fact.

Your QVP should be aligned with your customer’s priorities and aspirations. If your customer’s #1 priority is increasing their sales conversions and your QVP is increasing sales conversions then your offering benefits are aligned with your customer priorities. If, on the other hand, your QVP was focused on increasing the number of leads, the conversation with the customer will be an uphill struggle. In this case, there’s misalignment between the customer’s priorities (sales conversion) and the QVP that you are promoting (lead gen). 

Quantifying your value proposition helps you paint a vision of the future for your client. You are saying “through the use of our offering, you can expect XYZ benefits.” Measuring is important because it allows the customer to reflect on the purchase to determine if it met expectations.

Sometimes quantifying and measuring your value proposition is easy. As an example, let’s say your customer is a product development department in a medical device company. Your offering has the ability to help them save 3 weeks of time in the product development process through increased efficiencies. The QVP is that through the use of your product the customer will be saving 3 weeks in product development time. The QVP is also easily measurable by the client. At the end of their product development they compare to a previous estimate or similar project and see if they indeed saved time using your offering.

Sometimes your value proposition is not so easy to quantify – particularly when dealing with subjective experiences like taste, feeling, or satisfaction. In these cases, it is helpful to consider how your offering improves the customer’s condition. Are there simple ways to show they are better off with your product or service than without it? Does your product or service help your customer deal or cope with a problem and reduce its negative impacts?

Your QVP begins with understanding your client’s top priorities and aspirations. Once you have a clear understanding of what matters most to your client, you can then craft a QVP that helps them paint a vision of the future with your product or service. Your QVP will help you in determining your pricing, and once you have settled on your business model you can calculate your Customer Lifetime Value.

How to Determine your Customer Lifetime Value (LTV)

Customer lifetime value (LTV) is an important measure of unit economics for MedTech and HealthTech companies. LTV is a measure of your product or service offering’s profitability over a horizon of 3 to 5 years. Since the LTV is calculated over a time period, the profits are discounted to come to a net present value for each additional customer. Note that LTV does not take into account the costs associated with acquiring a customer (CAC), which is an important consideration that helps determine your sales and marketing expense.

The first step to determining your LTV is to decide on your business model. Do you plan to sell your product or service offering as a one-time fee? Do you have a razor-blade model where you sell a device with ongoing consumables? Will you have a subscription and ongoing service? Or maybe you have some combination of all of the above? There are many business models to choose from and your choice will depend on your product or service offering and the core strengths of your team. Your choice of business model will also have an important effect on your LTV calculations.

Once you have decided on your business model, calculating the LTV requires a few key pieces of information:

1. Your product pricing and revenue streams: What is the price of your product? How much product do you expect your average customer to buy? Is there a one-time revenue stream, a recurring revenue stream or both? Do you have any upselling opportunities once you have landed a customer that do not require much work on behalf of the sales and marketing team?

2. Product repurchase rate: For one-time revenue streams, how frequently do you think your customers will repurchase your product over a 3 to 5 year window?

3. Your customer retention rate: For recurring revenue streams, what percentage of customers will continue paying a recurring fee for use of the product or service?

4. Your cost of goods sold (COGS): How much does it take to produce or make each of your individual products? Note that this does not include costs associated with sales & marketing, R&D or administration.

5. Weighted average cost of capital (WACC): The WACC is an estimate of how much of a premium investors place on investing today in your solution. The number is highly variable, but generally somewhere between 35 and 75 percent depending on the offering, the financial markets and the experience of the team. A high WACC means that revenues in later years become less important, and this is one of the reasons you typically don’t see LTV calculations go beyond 5 years.

A simple spreadsheet can help you easily calculate your LTV once you have all of the above information. I have included a simple template based on a widget that you can use for LTV calculations here. The widget has a one-time revenue stream upon purchase, a recurring revenue stream for consumables, and a service upsell included in the calculations.
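For readers who prefer code to spreadsheets, here is a rough Python sketch of the same calculation. The structure mirrors the widget example (a one-time sale plus a recurring stream covering consumables and the service upsell), but every number below is hypothetical and not taken from the template:

```python
def ltv(one_time_profit: float,
        annual_recurring_profit: float,
        retention: float,
        wacc: float,
        years: int = 5) -> float:
    """Net present value of gross profit from one customer.

    Hypothetical structure: a one-time sale in year 0, plus a recurring
    profit stream that decays with the customer retention rate and is
    discounted at the WACC over a 3-to-5-year horizon.
    """
    value = one_time_profit
    for year in range(1, years + 1):
        surviving = retention ** year          # chance the customer is still paying
        value += annual_recurring_profit * surviving / (1 + wacc) ** year
    return value

# Illustrative inputs only: $2,000 gross profit on the initial widget sale,
# $500/yr from consumables + service upsell, 80% retention, 40% WACC.
print(round(ltv(2000, 500, 0.80, 0.40, 5), 2))
```

Note how quickly the high WACC shrinks the later years: with a 40% discount rate, year-5 profits are worth less than a fifth of their face value, which is why the horizon rarely extends past 5 years.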

A positive LTV means that you are contributing to your gross profit with each additional customer. A negative LTV means that your unit economics aren’t sustainable and you may need to reconsider your offering’s value proposition, pricing, or business model. It could also mean that your offering is not profitable at lower volumes but becomes profitable at higher volumes.

Whatever the case, LTV is an important metric to help understand your business’ unit economics and will be a key consideration for potential investors and savvy board members.

Determining your product pricing

OK, so you’ve got the idea and maybe you even have a prototype of the product or service you are planning to offer. How do you determine what the price of your offering should be?

This question is often flummoxing for early stage entrepreneurs. After all, you are running an innovation-driven enterprise — your idea is new and meant to change the way people are currently doing things! How do you price something that has no direct comparison?

The first method of pricing is cost-plus whereby you determine the cost inputs that go into the delivery of your offering and charge a flat markup, say 20%, to come up with the final price. Cost-plus pricing is commonly requested from contractors by military and government agencies. 

While cost-plus is the simplest way to determine your pricing — and often a good exercise as it forces the entrepreneur to understand the cost structure of their offering and the lowest price possible to remain profitable — it often leaves a lot of value on the table. In the early stages of your product or service lifecycle there are early adopters who are willing to pay a premium to have access to the product or service before anybody else. Also remember: your costs are irrelevant to your customer. They just care that your product is providing value at an amount they can afford!

Understanding pricing of your offering begins with understanding the Quantifiable Value Proposition (QVP) to your customer. In a B2B setting, the QVP can often be determined as the revenue gains or cost savings the business will obtain through the delivery of your offering. A good rule of thumb is to target your pricing to be about 20% of your customer’s QVP.

Let’s assume that you have done your research and are confident that through the use of your offering a company can reasonably expect to earn an additional $100K in revenues. As a first approximation, an offering price of $20K would be a good place to start — the client obtains plenty of value from having your offering and you extract a modest amount of the benefits. You have set up a win-win for both sides provided your cost structure allows for profitability at that price point.

The 20% figure is just a goal post in determining the pricing of your offering. There are other important factors to consider when you are looking to decide your pricing. 

Does your company have a monopoly? If so, you may be able to command higher pricing. 

Does the price come within your decision-maker’s budget? If not, your offering may require additional approvals and lengthen your sales cycle. 

Does the price exceed the reimbursement the customer receives? Even if your customer receives benefits from your offering outside of reimbursement, it can be difficult to craft an appeasing story to the acquisition committee who may have a few metrics they make decisions on.

There are several other methods for determining pricing that can help you triangulate on your target customer’s willingness to pay. One method is looking at comparables: what are people already paying for similar products and services? If a customer has previously paid for a similar offering then it’s probably already in their budget. The friction to replace an existing offering is often a lot lower than bringing in something completely new. Another method is substitution: if a customer isn’t already using something similar to your offering what are they using instead? Understanding how much a customer is spending on substitutes can often lead to interesting dialogues with the customer about how your offering can better serve their needs and position them for future success.

Different customers will derive different QVPs from your offering so be sure to segment appropriately! Segmenting in this way can help to determine beachhead markets and branding opportunities within your offering. As you scale and are able to lower your cost structure, opportunities may open at the lower ends of the market where you can sell to customers and maintain similar profitability.

Finally, at the early stages of your company it is often necessary to offer a discount to early customers in order to build the customer base and win referrals. While discounts are a great way to drive sales, take care not to erode the perceived value of your offering and make sure there is a timeline on them so that the customer knows they are receiving a discount because they are taking an early risk. Hardware discounts are often much easier to remove than software discounts as people more readily recognize the tangible value of hardware.

Pricing your startup’s offering depends on several levers and requires an in-depth understanding of your customers’ needs and their willingness to pay. Cost-plus pricing should be avoided if possible as it tends to leave a lot of value on the table; entrepreneurs should focus on delivering and extracting value through their quantifiable value proposition. Once you have settled on your business model and pricing you can now look at calculating your customer lifetime value (LTV)!