Finance, Decoded: 99% Confidence, Six Different Answers: Why Risk Isn’t One Number

It’s Monday morning.

You open your portfolio, glance at the market, and ask a simple question:

“How bad could today get?”

Somewhere on Wall Street, that exact question is being answered — not once, but many times.

One model says the risk is moderate.
Another says it’s higher.
A third quietly suggests something worse might be lurking.

Same market. Same data. Different answers.

This isn’t a mistake.

It’s how risk actually works.

What Does “99% Risk” Actually Mean?

At the heart of modern risk management is something called Value‑at‑Risk (VaR).

A 99% VaR of 2.5% means:

On 99 out of 100 days, losses should be smaller than 2.5%

But on 1 day out of 100, losses could be worse

It’s not predicting the worst case — it’s a statistical threshold for a “typical bad day.”
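To make the definition concrete, here is a minimal Python sketch that counts how often a 2.5% threshold is breached. The return series is simulated, not market data: it assumes normally distributed daily returns with a volatility of 1.075%, a value chosen only so that the 99% threshold lands near 2.5%. The variable names are illustrative.

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

# Hypothetical data: 10,000 simulated daily returns (normal, 1.075% volatility).
# Real returns are not normal; this only illustrates the definition of VaR.
returns = [random.gauss(0.0, 0.01075) for _ in range(10_000)]

var_99 = 0.025  # a 99% VaR of 2.5%, expressed as a positive loss

# A breach is a day whose loss (the negative of the return) exceeds the VaR.
breaches = sum(1 for r in returns if -r > var_99)
breach_rate = breaches / len(returns)

print(f"Days with losses beyond VaR: {breaches} of {len(returns)} ({breach_rate:.2%})")
```

If the VaR is well calibrated, the breach rate should sit close to 1%. A threshold that is breached far more often is understating risk.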

This single number is widely used across banks, hedge funds, and regulators.

But as it turns out, how you calculate that number matters a lot.

One Market, Multiple Ways to Measure Risk

To explore this, I estimated risk for the S&P 500 using six standard models:

Historical Simulation (HS)

Moving Average (MA)

EWMA

GARCH

t‑GARCH

Extreme Value Theory (EVT)

Each model uses the same underlying data.

What differs is the lens.

Some rely purely on history.
Some assume markets behave smoothly.
Others focus explicitly on extreme events.

These differences may seem technical — but they have real consequences.

How Risk Evolves Over Time

The chart below tracks daily risk estimates over the first few months of 2026.

It reveals something important: even when models measure the same thing, they do not behave the same way.

Figure: Daily 99% Value‑at‑Risk estimates for the S&P 500 using multiple models. Differences across models highlight model risk.

Looking at the chart, a clear pattern emerges.

· Historical Simulation (HS) moves in steps, remaining unchanged until extreme observations enter or leave the estimation window

· EWMA reacts immediately to new market conditions, rising and falling with recent volatility

· GARCH adjusts more gradually, capturing longer-term volatility patterns

· t‑GARCH tends to sit higher, reflecting the possibility of extreme losses

· Moving Average (MA) remains relatively smooth and stable, showing gradual changes as overall market volatility evolves

· EVT appears more irregular, with occasional sharper movements reflecting its focus on extreme losses rather than typical market conditions

Risk is not just a number — it’s a dynamic process.

Why the Models Don’t Agree

If risk were easy to measure, all models would converge to the same estimate.

They don’t.

Each model is built on a different view of how markets behave:

Some assume the future will resemble the past

Some assume returns follow a fixed, known distribution, such as the normal

Some are designed specifically to capture rare, extreme events

Financial markets exhibit all of these behaviors — at different times.

As a result, no single model can fully capture reality.

This Difference Has a Name: Model Risk

The variation across models is not noise — it is model risk.

Model risk arises when different methods produce different answers to the same question. Following Danielsson et al. [1], I measure it as the ratio of the highest to the lowest risk forecast across models. If all models agree, the ratio equals one.

At times, the gap between models is small, suggesting stability.
At other times, the gap widens — signaling uncertainty, particularly in extreme scenarios.
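As a rough illustration of the ratio, the sketch below computes it for a single day. The six forecasts are made-up numbers, not results from this analysis, and `model_risk_ratio` is a hypothetical helper name.

```python
def model_risk_ratio(var_forecasts):
    """Ratio of the highest to the lowest VaR forecast for one day.

    `var_forecasts` maps a model name to its VaR, given as a positive loss.
    """
    values = list(var_forecasts.values())
    if not values or min(values) <= 0:
        raise ValueError("need at least one strictly positive VaR forecast")
    return max(values) / min(values)


# Hypothetical one-day 99% VaR forecasts (illustrative values only).
forecasts = {
    "HS": 0.026,
    "MA": 0.022,
    "EWMA": 0.024,
    "GARCH": 0.025,
    "t-GARCH": 0.029,
    "EVT": 0.028,
}

ratio = model_risk_ratio(forecasts)
print(f"Model risk ratio: {ratio:.2f}")
```

A ratio of exactly 1 means every model gives the same forecast. The further the ratio rises above 1, the more the answer depends on which model you pick.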

Rather than being ignored, this variation should be interpreted.

It contains information.

How Model Risk Evolved Over Time

Looking across the sample period, model risk itself is not constant — it changes over time.

At some points, all models produce very similar estimates. This suggests a relatively stable market environment where different assumptions lead to similar conclusions.

At other times, the gap between models widens.

Figure: Model risk over time, measured as the ratio of the highest to lowest VaR across models. Larger values indicate greater disagreement between models.

In this analysis, the ratio between the highest and lowest risk estimates (a simple measure of model risk) typically ranged between roughly 1.05 and 1.45.

This means that, depending on the model used, the highest risk estimate could exceed the lowest by as much as 45% on the same day.

Importantly, these differences are not random.

They tend to increase when:

market conditions are changing

volatility is shifting

or extreme events become more relevant

In other words, model disagreement is often highest when it matters most.

Understanding the Patterns Beneath the Numbers

One of the most revealing insights comes from how each model responds to change:

   HS appears stable — until it suddenly isn’t
   MA changes gradually, reflecting broader shifts in average volatility
   EWMA is highly sensitive to recent shocks
   GARCH smooths volatility into persistent trends
   t‑GARCH quietly signals elevated tail risk
   EVT reacts to extremes, making tail events more visible than ordinary fluctuations

These behaviors are not quirks.

They are direct consequences of the underlying mathematics.
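To see where two of these behaviors come from, here is a small Python sketch of the EWMA and GARCH(1,1) variance updates. Everything numeric in it is an assumption for illustration: the decay factor of 0.94, the GARCH parameters, and the toy return series are hand-picked, not the values estimated for the charts above.

```python
def ewma_variance_path(returns, start_var, lam=0.94):
    """EWMA update: var_next = lam * var + (1 - lam) * r^2."""
    var, path = start_var, []
    for r in returns:
        var = lam * var + (1.0 - lam) * r * r
        path.append(var)
    return path


def garch_variance_path(returns, start_var, omega, alpha, beta):
    """GARCH(1,1) update: var_next = omega + alpha * r^2 + beta * var."""
    var, path = start_var, []
    for r in returns:
        var = omega + alpha * r * r + beta * var
        path.append(var)
    return path


# Assumed parameters (not fitted): long-run daily volatility of 1%.
long_run_var = 0.01 ** 2
alpha, beta = 0.05, 0.93
omega = (1.0 - alpha - beta) * long_run_var

# Hypothetical returns: calm days of +/-1%, with a single 5% loss on day 6.
returns = [0.01, -0.01, 0.01, -0.01, 0.01, -0.05, 0.01, -0.01, 0.01, -0.01]

ewma = ewma_variance_path(returns, long_run_var)
garch = garch_variance_path(returns, long_run_var, omega, alpha, beta)

for day, (e, g) in enumerate(zip(ewma, garch), start=1):
    print(f"day {day:2d}  EWMA vol {e ** 0.5:.2%}  GARCH vol {g ** 0.5:.2%}")
```

The two updates have the same shape. EWMA is the special case of GARCH(1,1) with `omega = 0`, `alpha = 1 - lam`, and `beta = lam`, so it has no long-run level to return to, while the `omega` term in GARCH pulls the forecast back toward one. How quickly each reacts to a shock depends on the parameter values, which are fixed by convention for EWMA and estimated from data for GARCH.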

Understanding these patterns allows us to move beyond simply reading numbers — and toward interpreting what they mean.

So What Is “The” Risk?

There isn’t one.

There are multiple, equally valid estimates — each shaped by its assumptions.

Reducing risk to a single number creates a false sense of precision.

A range of estimates provides a more honest picture.

Methodology

This analysis uses daily S&P 500 data and standard risk models commonly used in finance.

Risk Measure: 99% Value‑at‑Risk (VaR)

Time Horizon: 1 day

Estimation Window: 1000 trading days

Six models were applied:

- Historical Simulation (HS): estimates risk directly from past returns by taking the empirical 1% quantile of losses in the estimation window (with 1000 observations, roughly the 10th‑worst loss).
- Moving Average (MA): assumes returns are normally distributed and estimates risk using average historical volatility.
- EWMA: gives more weight to recent data, allowing risk estimates to respond quickly to changing market conditions.
- GARCH: models volatility as time‑varying, capturing volatility clustering, the tendency of calm and turbulent periods to each persist.
- t‑GARCH: extends GARCH by replacing normally distributed shocks with fat‑tailed Student‑t shocks, producing higher risk estimates when tail risk is significant.
- EVT: focuses specifically on rare, extreme losses by modeling the tail of the distribution.

All models were applied consistently to the same dataset to isolate the effect of methodology.
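The three simplest models can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the code behind the charts: it assumes a zero mean return, an EWMA decay factor of 0.94 (a common convention; the value used in this analysis is not stated), and the convention that HS picks the k-th worst loss with k = p × n. The window is simulated data rather than S&P 500 returns. GARCH, t‑GARCH, and EVT are left out because they require numerical estimation.

```python
import math
import random
from statistics import NormalDist

Z_99 = NormalDist().inv_cdf(0.99)  # standard normal 99% quantile, about 2.326


def hs_var(returns, p=0.01):
    """Historical Simulation: the k-th worst loss in the window, k = p * n."""
    losses = sorted((-r for r in returns), reverse=True)
    k = max(1, round(len(losses) * p))
    return losses[k - 1]


def ma_var(returns):
    """Moving Average: normal quantile times equally weighted volatility."""
    variance = sum(r * r for r in returns) / len(returns)  # zero mean assumed
    return Z_99 * math.sqrt(variance)


def ewma_var(returns, lam=0.94):
    """EWMA: normal quantile times exponentially weighted volatility."""
    variance = sum(r * r for r in returns) / len(returns)  # starting value
    for r in returns:  # oldest to newest, so recent days carry more weight
        variance = lam * variance + (1.0 - lam) * r * r
    return Z_99 * math.sqrt(variance)


random.seed(7)
# Hypothetical window: 1000 simulated daily returns, not S&P 500 data.
window = [random.gauss(0.0, 0.01) for _ in range(1000)]

for name, model in [("HS", hs_var), ("MA", ma_var), ("EWMA", ewma_var)]:
    print(f"{name:5s} 99% VaR: {model(window):.2%}")
```

Even on the same simulated window, the three functions generally return different numbers, which is the model risk discussed above in miniature.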

The Real Lesson for Risk Management

The most important insight isn’t which model is “correct.”

It’s that none of them are — at least not on their own.

In practice, risk management is not about finding a single perfect estimate.
It is about understanding the range of possible outcomes and the assumptions behind them.

When models agree, confidence increases.
When they diverge, attention is required.

The danger lies not in volatility, but in overconfidence — in believing that risk has been fully captured by a single number.

The professionals who manage risk most effectively are not those who trust one model.

They are the ones who understand why the models disagree — and what that disagreement is trying to tell them.

In a follow‑up post, I’ll explore Expected Shortfall (ES) — a risk measure gaining prominence under Basel III — and how it compares to VaR in practice.

References

[1] Danielsson, J., James, K., Valenzuela, M., & Zer, I. (2016). Model risk of risk models. Journal of Financial Stability, 23, 79–91. https://doi.org/10.1016/j.jfs.2016.02.002

[2] Danielsson, J. (n.d.). Technical details. Extreme Risk. https://extremerisk.org/technical-details/

[3] Danielsson, J. (2011). Financial Risk Forecasting. John Wiley & Sons.

[4] Danielsson, J. (2022). The Illusion of Control. Yale University Press. https://yalebooks.yale.edu/book/9780300234817/the-illusion-of-control/