01 Foundations: what a market regime is
A regime is a persistent state of the market in which returns share stable statistical properties, and which lasts long enough that conditioning your strategy on it pays for the lag in detecting it.
Level: Foundational
Four regime axes matter for a discretionary trader's rules. This handbook focuses on the first two, because they decide the trend-versus-reversion question directly; the other two are overlays you bolt on later.
| Axis | Low / "A" state | High / "B" state | What it changes |
|---|---|---|---|
| Directional structure | Mean-reverting / range | Trending | Whether to fade or follow |
| Volatility | Quiet, low realized vol | Stressed, vol-clustered | Position size, stop distance |
| Correlation / risk | Risk-on, dispersed | Risk-off, everything-correlates | Cross-asset hedges, FX carry |
| Liquidity / session | Deep (cash hours) | Thin (off-hours, holidays) | Spread, slippage, gap risk |
The autocorrelation duality
Strip the indicators away and the entire trend/reversion question is one number: the autocorrelation of returns.
ρ(k) = corr(r_t, r_{t−k}) # lag-k autocorrelation
ρ(1) > 0 → returns persist → TREND (follow)
ρ(1) ≈ 0 → independent → RANDOM WALK (no directional edge)
ρ(1) < 0 → returns reverse → MEAN REVERSION (fade)
The null hypothesis is the random walk: ρ(k) = 0 for all k, no exploitable structure. Trend-following bets you can reject the null upward; mean-reversion bets you can reject it downward. Almost every detector here — variance ratio, Hurst, ADX, HMM — is a different lens on this same quantity, trading off responsiveness, robustness, and lag.
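As a rough illustration of the duality above, the sketch below estimates a trailing (causal) lag-1 autocorrelation on two simulated AR(1) return series. The function name `rolling_autocorr`, the 250-bar window and the AR coefficients of ±0.3 are assumptions chosen for the example, not values from this handbook.

```python
import numpy as np

def rolling_autocorr(returns, window, lag=1):
    """Lag-`lag` autocorrelation of returns over a trailing window (causal)."""
    r = np.asarray(returns, dtype=float)
    out = np.full(r.shape, np.nan)
    for t in range(window + lag - 1, len(r)):
        x = r[t - window + 1 : t + 1]              # r_t over the window
        y = r[t - window + 1 - lag : t + 1 - lag]  # r_{t-k}, aligned with x
        if x.std() > 0 and y.std() > 0:
            out[t] = np.corrcoef(x, y)[0, 1]
    return out

# Simulated returns: one persistent series, one reversing series (hypothetical data)
rng = np.random.default_rng(0)
eps = rng.standard_normal(5000)
trend = np.empty_like(eps)
revert = np.empty_like(eps)
trend[0] = revert[0] = eps[0]
for t in range(1, len(eps)):
    trend[t] = 0.3 * trend[t - 1] + eps[t]     # rho(1) > 0: follow
    revert[t] = -0.3 * revert[t - 1] + eps[t]  # rho(1) < 0: fade

rho_trend = np.nanmean(rolling_autocorr(trend, 250))
rho_revert = np.nanmean(rolling_autocorr(revert, 250))
```

On real returns the rolling estimate is noisy at short windows, which is why the later sections treat it as one soft input among several rather than a switch.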
The equity-vs-FX asymmetry
The single most important framing for your two instruments: a CFD equity index and an FX major are not the same kind of time series, and they do not get the same regime toolkit.
| Property | S&P 500 CFD | GBPUSD / EURUSD |
|---|---|---|
| Structural drift | Positive — equity risk premium, earnings growth, inflation, index survivorship | ~Zero — relative value of two fiat currencies |
| Long-run return | Upward with fat left tail (crashes) | Roughly driftless random walk, regime-dependent autocorrelation |
| Dominant regime question | "Bull/quiet (stay long) vs bear/stressed (flat)?" — asymmetric | "Trending (follow) vs ranging (fade)?" — symmetric |
| Where the edge lives | Drawdown avoidance, not direction prediction | Regime classification → strategy selection |
| Natural strategy | Long-biased trend-following + crash filter | Trend-following and mean-reversion, switched by regime |
| Verdict | Don't short the drift; filter for when to be out | Don't marry one style; detect the regime and switch |
This is why a long-only S&P 500 hypothesis is rational — you're harvesting a real risk premium — while a symmetric mean-reversion-short on the same index fights structural drift. On FX majors there is no drift to harvest, so the entire game is classifying trend versus range and deploying the matching engine. Internalize this and most of Part 6 writes itself.
Stylized facts that break naive models
Returns violate textbook assumptions in consistent ways. A regime model that ignores these is fitting noise.
| Fact | Meaning | Consequence for regime work |
|---|---|---|
| Fat tails | Returns are leptokurtic; 5σ moves happen | Gaussian models under-price crash regimes |
| Volatility clustering | Large moves follow large moves | Vol regimes are real and persistent — the easiest regime to detect |
| Leverage effect | Vol rises more on down moves (equities) | Asymmetric GARCH for the S&P; weaker on FX |
| Non-stationarity | Parameters drift over time | A single static fit is wrong; refit on a rolling window |
| Regime persistence | States last weeks-to-months, then switch | Justifies the approach — but switches are few, so per-regime samples are small |
02 Data & preprocessing
Regime detection is exquisitely sensitive to data quality and to look-ahead leaking in through preprocessing. Get this layer wrong and every downstream model is measuring artefacts.
Level: Foundational → Intermediate
What your data actually is
| Source | Instrument fit | Notes |
|---|---|---|
| OANDA v20 API | GBPUSD, EURUSD, S&P 500 CFD (SPX500_USD) | Single production feed — historical + live + execution from one vendor avoids feed-divergence artefacts. Bid/ask/mid candles; granularities S5 to month. |
| Dukascopy | FX tick data | Free historical tick / L1 for FX; useful for non-time bars and realistic spread modelling in research. Not your live feed. |
S&P 500 CFD mechanics
A cash-index CFD is not the index. Three mechanics directly affect a long-only hold:
- Overnight financing. Holding a CFD long incurs a daily funding charge ≈ (rate + markup) × notional / 360. In a positive-rate environment the long pays to carry. A filter that flattens during bear/high-vol stretches saves financing and avoids drawdown — a hidden second benefit.
- Dividend adjustments. On index ex-dividend days, long CFD holders receive an adjustment (shorts pay). It partly offsets financing; your backtest must credit it or you'll understate long returns.
- Sessions & gaps. The CFD trades nearly 24/5 with a maintenance break and thin off-hours liquidity; spreads widen outside US cash hours. Model spread by session, not as a constant.
For GBPUSD/EURUSD: no dividends, but swap/rollover applies overnight, the spread is tight in the London/NY overlap and wide in the Asian session, and there are weekend gaps. A volatility estimator that ignores the gap will misread Monday.
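The overnight financing arithmetic for the index CFD can be sketched in a few lines. The rates, the notional and the 360-day count below are hypothetical inputs; the actual benchmark rate, markup and day-count convention come from your broker's terms.

```python
def daily_financing(notional, base_rate, markup, day_count=360):
    """Daily funding charge on a long CFD: (rate + markup) x notional / day_count."""
    return (base_rate + markup) * notional / day_count

# Hypothetical numbers: 100k notional, 5% benchmark rate, 2.5% broker markup
cost_per_day = daily_financing(notional=100_000, base_rate=0.05, markup=0.025)
cost_per_year = cost_per_day * 365  # charged every calendar day the position is held
```

With these inputs the long pays roughly 20.83 per day, which is the carry a flat period avoids.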
Bars: time is the worst sampling clock
| Bar type | Built on | Statistical property | Use |
|---|---|---|---|
| Time bars (M5, H1, D1) | Fixed clock | Heteroskedastic, autocorrelated; oversample quiet periods | Default, convenient |
| Tick bars | N transactions | Closer to i.i.d. returns | Microstructure |
| Volume bars | N units traded | Better-behaved; sync to activity | When volume is reliable |
| Dollar bars | N notional traded | Most robust across price-level shifts | Long backtests across regimes |
Spot FX has no true volume, so dollar/volume bars are approximate (tick count proxies activity). For a first build, time bars are acceptable — just know they inflate quiet-period sample counts and bias any test that assumes i.i.d. returns.
Returns, stationarity, and the memory trade-off
- Prefer log returns, r_t = ln(P_t / P_{t−1}), over simple returns: they are additive across time, symmetric, and better behaved. Model on these; convert to simple returns for P&L.
- Stationarity. Test with ADF (null: unit root) and KPSS (null: stationary) — complementary, run both. Returns are usually stationary; prices are not.
- The differencing dilemma. Differencing prices to returns makes them stationary but destroys memory. Fractional differentiation differences by a fractional order d ∈ (0,1), the minimum needed to pass ADF while keeping maximum memory. Worth knowing when a model needs both; overkill for a first build.
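To make the fractional-differencing idea concrete, here is a minimal sketch of the fixed-width-window variant. The weights come from the binomial expansion of (1 − B)^d; the function names and the default window of 100 terms are assumptions for illustration.

```python
import numpy as np

def fracdiff_weights(d, size):
    """First `size` weights of (1 - B)^d: w_0 = 1, w_k = -w_{k-1} * (d - k + 1) / k."""
    w = [1.0]
    for k in range(1, size):
        w.append(-w[-1] * (d - k + 1) / k)
    return np.array(w)

def fracdiff(series, d, size=100):
    """Fixed-width fractional difference; the first size-1 values are NaN (warmup)."""
    x = np.asarray(series, dtype=float)
    w = fracdiff_weights(d, size)
    out = np.full(x.shape, np.nan)
    for t in range(size - 1, len(x)):
        # reverse the window so that w[0] multiplies x[t], w[1] multiplies x[t-1], ...
        out[t] = np.dot(w, x[t - size + 1 : t + 1][::-1])
    return out
```

With d = 1 the weights collapse to [1, −1, 0, …], the ordinary first difference; with d = 0.5 they decay slowly (1, −0.5, −0.125, …), which is how the series keeps memory. In practice d is swept upward until ADF rejects the unit root.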
Volatility estimators
Vol is the input to vol-regimes, sizing, and stop distance. Pick by what data you have and whether gaps matter.
| Estimator | Inputs | Strength | Weakness |
|---|---|---|---|
| Close-to-close | Closes | Simple, unbiased | Noisy; ignores intrabar range |
| Parkinson | High, Low | ~5× more efficient than CC | Ignores gaps & drift |
| Garman-Klass | OHLC | More efficient still | Assumes no gaps, no drift |
| Rogers-Satchell | OHLC | Drift-independent | Ignores gaps |
| Yang-Zhang | OHLC + prev close | Handles gaps and drift; most efficient | Most complex |
| ATR (Wilder) | OHLC | Robust, trader-native | Not an annualizable variance |
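Two of the estimators in the table can be sketched directly from OHLC arrays. The formulas follow the usual textbook statements of Parkinson and Yang-Zhang; the function names and the 252-period annualization are assumptions, and on intraday bars you would substitute the right number of bars per year.

```python
import numpy as np

def parkinson_vol(h, l, periods_per_year=252):
    """Parkinson: high-low range estimator (ignores gaps and drift)."""
    h, l = np.asarray(h, dtype=float), np.asarray(l, dtype=float)
    var = (np.log(h / l) ** 2).mean() / (4 * np.log(2))
    return np.sqrt(var * periods_per_year)

def yang_zhang_vol(o, h, l, c, periods_per_year=252):
    """Yang-Zhang: overnight variance + weighted open-to-close and Rogers-Satchell terms."""
    o, h, l, c = (np.asarray(a, dtype=float) for a in (o, h, l, c))
    overnight = np.log(o[1:] / c[:-1])   # previous close to open (the gap)
    open_close = np.log(c[1:] / o[1:])   # open to close
    rs = (np.log(h[1:] / c[1:]) * np.log(h[1:] / o[1:])
          + np.log(l[1:] / c[1:]) * np.log(l[1:] / o[1:]))  # Rogers-Satchell term
    n = len(overnight)
    k = 0.34 / (1.34 + (n + 1) / (n - 1))
    var = overnight.var(ddof=1) + k * open_close.var(ddof=1) + (1 - k) * rs.mean()
    return np.sqrt(var * periods_per_year)
```

For a rolling regime feature, call the function on a trailing slice of bars (for example the last 20) at each step so the estimate stays causal.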
Features for regime detection
Keep the set small and interpretable:
- Returns (log) and a rolling mean (drift proxy).
- Realized vol (Yang-Zhang) and vol-of-vol.
- Trend strength — ADX, efficiency ratio, R² of a rolling linear regression on price.
- Autocorrelation features — rolling lag-1 ρ, variance ratio.
- Distance from a slow MA (z-scored) — stretch / reversion pressure.
03 Classical regime detection: statistical tests
The "is this series trending or reverting?" test battery: model-light, interpretable, and the right first answer before reaching for HMMs.
Level: Intermediate
Variance Ratio test (Lo–MacKinlay)
The canonical random-walk test. If returns are i.i.d., the variance of a q-period return is exactly q × the variance of a 1-period return.
VR(q) = 1 → random walk
VR(q) > 1 → positive autocorrelation → TREND / momentum
VR(q) < 1 → negative autocorrelation → MEAN REVERSION
Use the heteroskedasticity-robust statistic (returns aren't homoskedastic). Sweep q (2, 4, 8, 16, 32) and plot — the shape tells you the horizon at which structure appears.
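import numpy as np

def variance_ratio(logp, q):
    # logp: 1-D array of log prices; q: aggregation horizon in bars
    r = np.diff(logp); n = len(r); mu = r.mean()
    var1 = ((r - mu)**2).sum() / (n - 1)
    rq = logp[q:] - logp[:-q] # overlapping q-period returns
    m = q * (n - q + 1) * (1 - q / n) # unbiasing factor; already carries the 1/q scaling
    return (((rq - q*mu)**2).sum() / m) / var1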
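The heteroskedasticity-robust statistic mentioned above can be sketched as follows. This follows the usual statement of the Lo-MacKinlay z*(q) statistic; the function name `vr_robust_z` is an assumption, and the result is worth cross-checking against the `arch` package's variance-ratio test before relying on it.

```python
import numpy as np

def vr_robust_z(logp, q):
    """Return (VR(q), z*) where z* is the heteroskedasticity-robust statistic."""
    logp = np.asarray(logp, dtype=float)
    r = np.diff(logp)
    n, mu = len(r), r.mean()
    d2 = (r - mu) ** 2
    var1 = d2.sum() / (n - 1)
    rq = logp[q:] - logp[:-q]                  # overlapping q-period returns
    m = q * (n - q + 1) * (1 - q / n)
    vr = ((rq - q * mu) ** 2).sum() / m / var1
    theta = 0.0
    for j in range(1, q):
        # delta_j: robust variance of the lag-j autocorrelation estimate
        delta_j = n * (d2[j:] * d2[:-j]).sum() / d2.sum() ** 2
        theta += (2 * (q - j) / q) ** 2 * delta_j
    return vr, np.sqrt(n) * (vr - 1) / np.sqrt(theta)

def sweep(logp, horizons=(2, 4, 8, 16, 32)):
    """VR and z* at each horizon, for plotting the shape across q."""
    return {q: vr_robust_z(logp, q) for q in horizons}
```

Under the random-walk null z* is approximately standard normal, so values beyond about ±2 at a given q are the ones worth a second look. Because the same data feeds every q, the sweep is a shape diagnostic, not five independent tests.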
Hurst exponent
Summarizes persistence in one scalar via how the range of the series scales with horizon.
H < 0.5 → anti-persistent → MEAN REVERTING
H = 0.5 → random walk
H > 0.5 → persistent → TRENDING
Two estimators: classic R/S analysis (rescaled range) and DFA (detrended fluctuation analysis), which is more robust on non-stationary series.
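For intuition, the sketch below uses a third, simpler estimator than R/S or DFA: it fits the slope of log std(x[t+lag] − x[t]) against log lag. It is a quick screening approximation under the assumption of roughly power-law scaling; the function name and lag range are illustrative choices.

```python
import numpy as np

def hurst_scaling(series, lags=range(2, 50)):
    """Hurst exponent from std(x[t+lag] - x[t]) ~ lag**H (slope of the log-log fit)."""
    x = np.asarray(series, dtype=float)
    lags = np.asarray(list(lags))
    tau = [np.std(x[lag:] - x[:-lag]) for lag in lags]
    slope, _ = np.polyfit(np.log(lags), np.log(tau), 1)
    return slope
```

Pass the level series (log price or a spread), not returns. A random walk should read near 0.5, and a stationary series near 0. The estimate is noisy on a few hundred bars, so compare it with DFA from a library such as `nolds` before acting on it.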
Stationarity / mean-reversion: ADF
The Augmented Dickey-Fuller test doubles as a mean-reversion detector: rejecting the unit-root null on a price (or a spread) says the series is stationary, i.e. mean-reverting. This is the engine behind cointegration testing below.
Ornstein–Uhlenbeck: how fast does it revert?
If a series is mean-reverting, the actionable question is the half-life — how long to revert halfway to the mean. Model it as an OU process:
ΔX_t = λ · X_{t−1} + c + ε_t # OLS: regress change on lagged level
half_life = − ln(2) / λ # λ < 0 for a reverting series
A half-life of 3 bars → fast reversion, scalp it. A half-life of 200 bars → barely reverting, don't. This number sizes your mean-reversion lookbacks and tells you whether a pair is tradeably reverting at all.
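import numpy as np
from statsmodels.api import OLS, add_constant
def half_life(series): # series: 1-D NumPy array of levels (price or spread)
    lag = series[:-1]; delta = np.diff(series)
    beta = OLS(delta, add_constant(lag)).fit().params[1] # slope λ on the lagged level
    return -np.log(2) / beta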
Cointegration (for pairs / stat-arb regimes)
A single FX major rarely mean-reverts cleanly, but a combination of correlated instruments can. Cointegration means a linear combination of non-stationary series is stationary.
| Method | What it does | When |
|---|---|---|
| Engle–Granger | Regress y on x, ADF-test the residual | Two series, quick screen |
| Johansen | VECM; trace & max-eigenvalue tests; gives the cointegrating vector | ≥2 series, or you need the hedge ratio estimated properly |
04 Indicator-based regime filters
The practitioner toolkit: fast, transparent, no model-fitting, and what a discretionary trader actually puts in a rule. These are filters (they gate a strategy on or off) more than predictors.
Level: Foundational → Intermediate
| Indicator | Measures | Regime read | Typical threshold | Watch-out |
|---|---|---|---|---|
| ADX / DMI (Wilder) | Trend strength, not direction | High = trending, low = ranging | > 25 trend, < 20 range | Lags hard; rises after the move |
| Efficiency Ratio (Kaufman) | Signal-to-noise of price travel | Near 1 = clean trend, near 0 = chop | > 0.3–0.4 trend | Lookback-sensitive |
| Choppiness Index | Range vs directional travel | High = consolidating, low = trending | > 61.8 chop, < 38.2 trend | Mean-reverting itself; use as a band |
| MA slope / ribbon | Direction + persistence | Stacked & sloping = trend | Slope sign + stack order | Whipsaws in transitions |
| Lin-regression R² | How line-like recent price is | High R² = trend | > 0.5 | Window-sensitive |
| Bollinger / Keltner width | Vol compression/expansion | Squeeze → low vol, often pre-breakout | Width percentile | Direction-blind |
The two worth implementing first
ADX is the trader-native trend-strength gate, built from directional movement (DI+, DI−): DX = 100·|DI+ − DI−|/(DI+ + DI−), then Wilder-smoothed into ADX. Above ~25 the market trends with strength (direction from DI+/DI−); below ~20 it ranges. Its weakness is lag — ADX confirms a trend already underway.
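The construction just described can be sketched in NumPy. This version uses Wilder's smoothing in its averaged form (an EMA with alpha = 1/n, seeded with the first n-bar mean), which gives the same DI ratios as Wilder's running sums. Seeding conventions differ between platforms, so expect small differences from TA-Lib during warmup; the function names are assumptions.

```python
import numpy as np

def wilder_smooth(x, n):
    """Wilder smoothing: seed with the mean of the first n values, then alpha = 1/n."""
    out = np.full(len(x), np.nan)
    out[n - 1] = np.mean(x[:n])
    for t in range(n, len(x)):
        out[t] = out[t - 1] + (x[t] - out[t - 1]) / n
    return out

def adx(high, low, close, n=14):
    """Return (ADX, DI+, DI-) aligned to the input bars; needs at least 2n bars."""
    high, low, close = (np.asarray(a, dtype=float) for a in (high, low, close))
    up, down = np.diff(high), -np.diff(low)
    plus_dm = np.where((up > down) & (up > 0), up, 0.0)
    minus_dm = np.where((down > up) & (down > 0), down, 0.0)
    tr = np.maximum.reduce([high[1:] - low[1:],
                            np.abs(high[1:] - close[:-1]),
                            np.abs(low[1:] - close[:-1])])
    with np.errstate(divide="ignore", invalid="ignore"):
        atr = wilder_smooth(tr, n)
        plus_di = 100 * wilder_smooth(plus_dm, n) / atr
        minus_di = 100 * wilder_smooth(minus_dm, n) / atr
        dx = 100 * np.abs(plus_di - minus_di) / (plus_di + minus_di)
    valid = dx[n - 1:]                                  # DX is defined after DI warmup
    valid = np.where(np.isfinite(valid), valid, 0.0)    # no directional movement: DX = 0
    adx_vals = np.concatenate([np.full(n - 1, np.nan), wilder_smooth(valid, n)])
    pad = lambda v: np.concatenate([[np.nan], v])       # realign to the original bars
    return pad(adx_vals), pad(plus_di), pad(minus_di)
```

A gate would then read something like `trending = adx_values > 25`, with direction taken from whichever DI is larger.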
Kaufman's Efficiency Ratio is underrated for regime gating and computationally trivial:
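ER = |P_t − P_{t−n}| / Σ |P_i − P_{i−1}| (sum over the last n bars)
# numerator = net directional travel over n bars
# denominator = total path length
ER → 1 : price went straight (trending)
ER → 0 : price wandered (ranging)
import numpy as np
def efficiency_ratio(price, n): # price: 1-D NumPy array
    change = np.abs(price[n:] - price[:-n])
    path = np.array([np.abs(np.diff(price[i-n:i+1])).sum()
                     for i in range(n, len(price))])
    return change / path # NaN where price was flat for all n bars (0/0)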
Volatility-regime filters
A high/low vol classifier can be built from a realized-vol (Yang-Zhang) percentile, an ATR percentile, or a Bollinger/Keltner squeeze (Bollinger bands inside Keltner channels = compression). On equities the low-vol regime overlaps the uptrend, so a vol filter often doubles as a usable bull/bear filter. On FX the link is weaker: vol regime and trend regime are more independent, so you need both.
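A percentile-based vol classifier might look like the sketch below. To stay short it uses close-to-close realized vol rather than Yang-Zhang, and the 20/250-bar windows and 80/20 percentile cut-offs are assumptions to tune, not recommendations from this handbook.

```python
import numpy as np

def vol_regime(returns, vol_window=20, rank_window=250, hi=0.8, lo=0.2):
    """Label each bar 'high', 'low' or 'mid' by the trailing percentile of realized vol."""
    r = np.asarray(returns, dtype=float)
    vol = np.full(len(r), np.nan)
    for t in range(vol_window - 1, len(r)):
        vol[t] = r[t - vol_window + 1 : t + 1].std(ddof=1)
    pct = np.full(len(r), np.nan)
    for t in range(vol_window + rank_window - 2, len(r)):
        hist = vol[t - rank_window + 1 : t + 1]   # trailing window only, so causal
        pct[t] = (hist <= vol[t]).mean()          # where today's vol ranks in recent history
    labels = np.where(np.isnan(pct), "n/a",
                      np.where(pct >= hi, "high", np.where(pct <= lo, "low", "mid")))
    return vol, pct, labels
```

Ranking against a trailing window, not the full sample, is what keeps the label free of look-ahead. The cost is that a long stressed period eventually redefines what counts as "high".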
05 Probabilistic & ML regime models
Where you move from "estimate the property" to "infer a hidden state with a probability and a transition model." More power, more ways to fool yourself.
Level: Advanced
Markov regime-switching (Hamilton)
The econometric workhorse. An autoregressive model whose parameters (mean, variance) switch according to an unobserved Markov chain, estimated by maximum likelihood (Hamilton filter / EM). Output: the filtered/smoothed probability of each regime at each time, plus a transition matrix (persistence). A 2-state switching-variance model on S&P returns separates "quiet drift" from "stressed" almost out of the box.
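import statsmodels.api as sm
# r: S&P log returns as a pandas Series (name assumed; the original call was truncated)
mod = sm.tsa.MarkovRegression(r, k_regimes=2, switching_variance=True)
res = mod.fit()
p_stress = res.smoothed_marginal_probabilities[1] # P(regime 1); confirm regime 1 is the high-variance state before calling it "stressed"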
Hidden Markov Models (HMM)
The same idea in HMM language, usually fit on features (returns + realized vol). Gaussian emissions per state; Baum-Welch (EM) to fit; Viterbi for the single most-likely state path; forward-backward for per-bar state probabilities.
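from hmmlearn.hmm import GaussianHMM
# X: array of shape (n_bars, n_features), e.g. columns for returns and realized vol
hmm = GaussianHMM(n_components=3, covariance_type="full", n_iter=200)
hmm.fit(X)
states = hmm.predict(X) # Viterbi path
probs = hmm.predict_proba(X) # per-bar state probs (forward-backward, i.e. smoothed: not causal)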
Gaussian Mixture Models
A "static" cousin of the HMM — soft-clusters return/vol observations into regimes without a transition model. Cheaper, no temporal structure; good for labelling the current observation, weak on persistence.
GARCH family — volatility regimes specifically
The right tool when the regime you care about is volatility.
For equities use an asymmetric variant (EGARCH, GJR-GARCH) to capture the leverage effect — vol rises more on down days. Markov-switching GARCH combines discrete vol regimes with GARCH dynamics. Library: arch (Kevin Sheppard) — mature and correct.
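The leverage effect is easiest to see in the GJR-GARCH(1,1) variance recursion itself. The sketch below only runs the recursion for given parameters; it does not fit them (that is what `arch` is for), and the parameter values in the usage note are hypothetical.

```python
import numpy as np

def gjr_garch_variance(eps, omega, alpha, gamma, beta):
    """s2[t] = omega + (alpha + gamma * 1[eps[t-1] < 0]) * eps[t-1]**2 + beta * s2[t-1]."""
    eps = np.asarray(eps, dtype=float)
    s2 = np.empty(len(eps))
    # start at the unconditional variance, assuming symmetric shocks
    s2[0] = omega / (1 - alpha - 0.5 * gamma - beta)
    for t in range(1, len(eps)):
        neg = eps[t - 1] < 0                       # indicator for a down move
        s2[t] = omega + (alpha + gamma * neg) * eps[t - 1] ** 2 + beta * s2[t - 1]
    return s2

# Hypothetical parameters: same-size shock, opposite signs
params = dict(omega=1e-6, alpha=0.03, gamma=0.10, beta=0.90)
after_down = gjr_garch_variance([0.0, -0.02, 0.0], **params)[2]
after_up = gjr_garch_variance([0.0, 0.02, 0.0], **params)[2]
```

With gamma > 0 a 2% down day raises next-bar variance more than a 2% up day; with gamma = 0 the recursion reduces to plain GARCH(1,1).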
Unsupervised clustering
K-means, hierarchical, or DBSCAN on engineered features (return, vol, trend strength) to discover regimes you didn't pre-specify. Useful for exploration and hypothesis generation; weak for live switching because clusters lack a transition model and are unstable to refits. A research lens, not a production switch.
Change-point detection
Instead of classifying every bar, detect when the regime breaks.
| Method | Mode | Library |
|---|---|---|
| Bayesian online CPD (Adams–MacKay) | Online, probabilistic run-length | custom / community |
| PELT | Offline, exact, linear-time, penalized | ruptures |
| Binary segmentation | Offline, greedy, fast | ruptures |
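algo = rpt.Pelt(model="rbf").fit(signal) # assumes `import ruptures as rpt`; the "rbf" cost and the `signal` array are illustrative
breakpoints = algo.predict(pen=10) # indices where the regime shifts; the last entry is len(signal)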
BOCPD is the one to know for live use — it updates a posterior over "bars since last change" each new bar, fully causal.
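Since BOCPD usually ends up as custom code, here is a minimal sketch of the run-length recursion for the simplest case: a Gaussian mean with known observation variance and a constant hazard rate. The prior (`mu0`, `var0`), the hazard of 1/100 and the function name are assumptions; real returns would need at least an unknown-variance model.

```python
import numpy as np

def bocpd_gaussian(x, hazard=1 / 100, mu0=0.0, var0=10.0, var_x=1.0):
    """Adams-MacKay recursion; returns the MAP run length after each observation."""
    mu, var = np.array([mu0]), np.array([var0])   # posterior on the mean, per run length
    run_prob = np.array([1.0])                    # P(run length = 0) before any data
    map_run = np.empty(len(x), dtype=int)
    for t, xt in enumerate(x):
        # predictive density of x_t under each candidate run length
        pred_var = var + var_x
        pred = np.exp(-0.5 * (xt - mu) ** 2 / pred_var) / np.sqrt(2 * np.pi * pred_var)
        growth = run_prob * pred * (1 - hazard)    # the run continues
        change = (run_prob * pred * hazard).sum()  # the run resets to length 0
        run_prob = np.concatenate([[change], growth])
        run_prob /= run_prob.sum()
        # conjugate update of each run's posterior; a new run starts from the prior
        post_var = 1.0 / (1.0 / var + 1.0 / var_x)
        post_mu = post_var * (mu / var + xt / var_x)
        mu = np.concatenate([[mu0], post_mu])
        var = np.concatenate([[var0], post_var])
        map_run[t] = run_prob.argmax()
    return map_run
```

A sharp drop in the MAP run length is the change signal. Everything at bar t uses data up to t only, which is the property that makes the method usable live.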
Kalman filters & trend filtering
Extract a smooth, time-varying trend and slope from noisy price — the slope's sign and magnitude is a continuous regime read.
- Kalman / local-linear-trend DLM — estimates level + slope online; the slope is a causal trend signal. Also gives a dynamic hedge ratio for cointegrated pairs. Libraries: pykalman (works, lightly maintained), filterpy.
- L1 trend filtering (Kim–Koh–Boyd) — fits a piecewise-linear trend by penalizing the second difference; the kinks are regime changes. Offline.
- HP filter (Hodrick-Prescott) — smooth trend, but two-sided → look-ahead. Fine for describing history, unsafe as a live signal unless one-sided.
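The local-linear-trend filter from the first bullet fits in a few lines of NumPy. This is a sketch with hand-set noise variances (`q_level`, `q_slope`, `r_obs` are assumptions); in practice they are estimated or tuned, and they control how quickly the slope reacts.

```python
import numpy as np

def local_linear_trend(y, q_level=1e-4, q_slope=1e-6, r_obs=1.0):
    """Causal Kalman filter for a [level, slope] state; returns filtered (level, slope)."""
    F = np.array([[1.0, 1.0], [0.0, 1.0]])   # level_t = level_{t-1} + slope_{t-1}
    H = np.array([[1.0, 0.0]])               # only the level is observed
    Q = np.diag([q_level, q_slope])          # process noise
    x = np.array([y[0], 0.0])                # initial state: first price, zero slope
    P = np.eye(2) * 10.0                     # vague initial uncertainty
    out = np.empty((len(y), 2))
    for t, obs in enumerate(y):
        x = F @ x                            # predict
        P = F @ P @ F.T + Q
        S = (H @ P @ H.T)[0, 0] + r_obs      # innovation variance
        K = (P @ H.T)[:, 0] / S              # Kalman gain
        x = x + K * (obs - x[0])             # update with the new bar
        P = P - np.outer(K, H @ P)
        out[t] = x
    return out[:, 0], out[:, 1]
```

The filtered slope at bar t depends only on bars up to t, so its sign and size can feed a live regime score. A larger `q_slope` reacts faster at the price of more noise.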
Supervised ML for regime
Turn regime into a labelled prediction problem.
- Label. The hard part. Triple-barrier (López de Prado): label each observation by which barrier — profit-take, stop, or time — price hits first. Meta-labeling: a secondary model predicts whether to act on a primary signal, sizing rather than directing.
- Model. Gradient-boosted trees (LightGBM/XGBoost) on the Part 2 feature set predict regime or next-period return sign. Inspect feature importance.
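The triple-barrier label can be sketched as below. For brevity this version uses fixed percentage barriers for a long position; López de Prado scales the barriers by a volatility estimate, which this sketch does not do. The function name and default barrier sizes are assumptions.

```python
import numpy as np

def triple_barrier_labels(close, pt=0.02, sl=0.01, max_hold=10):
    """+1 if profit-take is hit first, -1 if the stop is hit first, 0 at the time barrier."""
    close = np.asarray(close, dtype=float)
    labels = np.zeros(len(close), dtype=int)
    for t in range(len(close)):
        path = close[t + 1 : t + 1 + max_hold] / close[t] - 1.0   # forward returns from t
        up = np.flatnonzero(path >= pt)
        dn = np.flatnonzero(path <= -sl)
        first_up = up[0] if up.size else np.inf
        first_dn = dn[0] if dn.size else np.inf
        if first_up < first_dn:
            labels[t] = 1
        elif first_dn < first_up:
            labels[t] = -1
    return labels
```

Labels look forward by construction, so they are training targets only. Each label spans up to `max_hold` bars, which is why overlapping labels need purging in cross-validation. Labels near the end of the sample see a truncated horizon and are best dropped.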
Deep learning — know it, default against it
LSTM/Temporal-CNN/Transformer classifiers and autoencoders (reconstruction error spikes flag new regimes) can model regimes. For FX/single-index regime detection they are usually the wrong call: data-hungry, opaque, prone to overfitting on the few hundred regime transitions a decade contains, and rarely beating a well-built HMM or vol-target. Reach for them only with strong out-of-sample evidence and a reason simpler models failed.
Method selection
| Method | Interpretable | Causal / online | Data need | Stability | Best for |
|---|---|---|---|---|---|
| Variance ratio / Hurst | High | Yes (windowed) | Low | Medium | First answer, screening |
| ADX / Efficiency Ratio | High | Yes | Low | High | Trader-facing gate |
| Markov-switching / HMM | Medium | Yes (if filtered) | Medium | Low–Med | Probabilistic state + transitions |
| GARCH family | Medium | Yes | Medium | High | Volatility regime |
| Change-point (BOCPD/PELT) | Medium | BOCPD yes | Medium | Medium | Detecting the break |
| Kalman / trend filter | Medium | Yes | Low | High | Causal slope, dynamic hedge ratio |
| Supervised ML | Low–Med | Yes | High | Low | Rich features + clean labels |
| Deep learning | Low | Yes | Very high | Low | Rarely justified here |
| Verdict | Start with ADX/ER + variance ratio; add a 2–3 state HMM or MS-variance model; get the vol regime from GARCH or Yang-Zhang. Treat ML/DL as last resorts. | ||||
06 From detection to strategy
Detection is worthless until it changes a position. Three ways to wire a regime signal into a strategy.
Level: Intermediate → Advanced
| Wiring | What it does | Risk |
|---|---|---|
| Filter | Turns one strategy on/off (trade only in-regime) | Simplest; binary flip-flop cost |
| Switch | Selects which strategy runs (trend vs reversion) | Two engines to maintain; transition gaps |
| Sizing input | Scales exposure by regime confidence / vol | Smoothest; needs a calibrated probability |
Worked example A — S&P 500 CFD, long-only, bull-regime filter
Hypothesis: the S&P 500 carries a persistent positive drift (equity risk premium), so a long-only exposure that steps aside during bear/high-vol regimes should keep most of the upside with far less drawdown.
Step 1 — validate the drift before building anything.
- Mean log return significantly > 0 over the full sample (t-test on returns)?
- Variance ratio > 1 at multi-week horizons (momentum, not noise)?
- Regime persistence: how long do bull vs bear states last (transition matrix from a 2-state fit)? Long persistence → a filter has time to pay off.
- Conditional drift: is the mean return in the quiet regime materially higher — and drawdown lower — than in the stressed regime? If not, there is no regime edge to capture.
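The first and last checks in this list reduce to a few lines. The sketch below computes a plain t-statistic for the mean log return and per-regime annualized drift and vol. The plain t-test assumes roughly independent returns, so with vol clustering treat it as optimistic. Function names and the 252-day annualization are assumptions.

```python
import numpy as np

def drift_t_stat(log_returns):
    """t-statistic for H0: mean log return = 0."""
    r = np.asarray(log_returns, dtype=float)
    return r.mean() / (r.std(ddof=1) / np.sqrt(len(r)))

def conditional_drift(log_returns, regime, periods=252):
    """Annualized (mean, vol) of returns within each regime label."""
    r, regime = np.asarray(log_returns, dtype=float), np.asarray(regime)
    return {s: (r[regime == s].mean() * periods,
                r[regime == s].std(ddof=1) * np.sqrt(periods))
            for s in np.unique(regime)}
```

If the quiet-regime mean is not clearly above the stressed-regime mean in `conditional_drift`, the filter has nothing to harvest. The regime labels passed in must themselves be causal, or the comparison flatters the filter.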
Step 2 — candidate regime filters (simplest first):
| Filter | Logic | Strength | Weakness |
|---|---|---|---|
| 200-day SMA | Long only when price > SMA200 | Dead simple, robust, well-documented | Late re-entries; whipsaws in choppy bear rallies |
| 2-state HMM / MS-variance | Long when P(quiet) > τ | Probabilistic, earlier vol detection | Refit/label discipline required |
| Vol-target overlay | Scale to target_vol / realized_vol | Crash-protective; vol spikes precede drawdowns | Can de-risk into a sharp V-recovery |
| Combo | price > SMA200 and vol-target sizing | Direction gate + magnitude control | More moving parts |
Step 3 — the CFD economics tilt the verdict. Because a long CFD pays overnight financing, a filter that flattens during the worst (bear/high-vol) stretches saves carry and avoids drawdown — two benefits. But every flip costs spread, and off-hours spread is wide, so don't evaluate on signals only: subtract financing, credit dividend adjustments, and charge realistic session-dependent spread.
Step 4 — what "success" looks like. On a drifting index, a good regime filter usually delivers similar CAGR with materially lower max drawdown → a higher Calmar/Sortino, not a higher raw return. If your filtered version also beats buy-and-hold on CAGR, be suspicious of look-ahead before you celebrate.
- Universe: S&P 500 CFD (SPX500_USD), daily.
- Regime gate: long-enabled when close > SMA(200) and P(quiet) > 0.6 from a 2-state MS-variance model (filtered, expanding-window refit).
- Sizing: position = clip(target_vol / YZ_vol(20), 0, max_lev).
- Exit-to-flat: gate off — below SMA200 or P(stressed) > 0.6.
- Costs: session-dependent spread + daily financing − dividend adjustments.
- Benchmark: unfiltered buy-and-hold CFD, same costs. Judge on Calmar and max-DD first.
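The gate and sizing rules in this spec can be sketched as one position function. Two simplifications are assumptions of the sketch: realized vol is close-to-close rather than Yang-Zhang, and the MS-variance probability is passed in as an optional precomputed array `p_quiet` (it must be a filtered probability). The 10% vol target and leverage cap are placeholders.

```python
import numpy as np

def example_a_position(close, p_quiet=None, sma_n=200, vol_n=20,
                       target_vol=0.10, max_lev=1.0, tau=0.6, periods=252):
    """Long-only position: SMA gate (and optional P(quiet) gate) times vol-target size."""
    close = np.asarray(close, dtype=float)
    n = len(close)
    r = np.diff(np.log(close), prepend=np.nan)
    sma = np.full(n, np.nan)
    vol = np.full(n, np.nan)
    for t in range(n):
        if t >= sma_n - 1:
            sma[t] = close[t - sma_n + 1 : t + 1].mean()
        if t >= vol_n:
            vol[t] = r[t - vol_n + 1 : t + 1].std(ddof=1) * np.sqrt(periods)
    gate = close > sma                      # NaN compares False, so flat during warmup
    if p_quiet is not None:
        gate &= (np.asarray(p_quiet, dtype=float) > tau)
    size = np.clip(target_vol / vol, 0.0, max_lev)
    pos = np.where(gate & np.isfinite(size), size, 0.0)
    return np.concatenate([[0.0], pos[:-1]])  # trade on the next bar: no same-bar look-ahead
```

Strategy returns are then `position * bar_return` minus costs; the one-bar shift at the end is what keeps the signal tradable.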
Worked example B — GBPUSD / EURUSD, trend-vs-reversion switch
Step 1 — classify the regime with an ensemble (they fail differently):
- ADX > 25 and ER > 0.35 → trending.
- Choppiness > 61.8 / ADX < 20 / Hurst < 0.5 → ranging.
- Optionally arbitrate with a 2-state HMM on (returns, vol) and require agreement before switching.
Step 2 — deploy per regime:
| Regime | Engine (illustrative) | Entry idea | Exit anchor |
|---|---|---|---|
| Trending | Trend-following, both directions | Donchian/MA breakout in DI direction | ATR-multiple trailing stop |
| Ranging | Mean-reversion | Fade extremes: |z| > 2, gated by a short OU half-life | Reversion to mean / opposite band; ATR stop |
Step 3 — handle the switch. Don't flip instantly on a one-bar threshold cross — that's whipsaw. Require N-bar persistence or an HMM probability > τ, and scale down the outgoing engine while scaling up the incoming one rather than a hard cut. Sizing by regime confidence beats a binary switch here too.
- Regime score: trend_score = 0.5·norm(ADX) + 0.5·ER ; range_score = 1 − trend_score.
- Engine weights: w_trend = sigmoid(k·(trend_score − 0.5)) ; w_range = 1 − w_trend.
- Trend engine: Donchian(20) breakout, ATR(14)×2 trailing stop, both directions.
- Range engine: enter on |z(price, 20)| > 2 only if half_life < 30 bars; exit at mean.
- Net position: w_trend · pos_trend + w_range · pos_range, capped by a vol-target.
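The scoring and blending lines of this spec might be wired up as follows. The spec leaves `norm(ADX)` and `k` undefined, so this sketch assumes `norm(ADX) = clip(ADX / 50, 0, 1)` and `k = 10`, and it stands in a constant cap for the vol-target; all three are illustrative choices.

```python
import numpy as np

def engine_weights(adx, er, k=10.0):
    """Blend weights from trend_score = 0.5 * norm(ADX) + 0.5 * ER."""
    norm_adx = np.clip(np.asarray(adx, dtype=float) / 50.0, 0.0, 1.0)  # assumed normalization
    trend_score = 0.5 * norm_adx + 0.5 * np.asarray(er, dtype=float)
    w_trend = 1.0 / (1.0 + np.exp(-k * (trend_score - 0.5)))           # sigmoid
    return w_trend, 1.0 - w_trend

def net_position(adx, er, pos_trend, pos_range, cap=1.0, k=10.0):
    """Weighted blend of the two engines' positions, capped at +/- cap."""
    w_trend, w_range = engine_weights(adx, er, k)
    return np.clip(w_trend * pos_trend + w_range * pos_range, -cap, cap)
```

With `k` large the blend approaches a hard switch; with `k` small both engines stay partly on. That single parameter is the knob between whipsaw cost and responsiveness.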
The lag/whipsaw problem (applies to both)
Every detector lags; switching costs spread. Mitigations: a persistence requirement before acting; continuous sizing over binary flips; ensemble agreement to cut false switches; and explicitly budgeting the switching cost in the backtest — if turnover eats the regime edge, the filter is net-negative no matter how clever the detector.
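The persistence requirement can be implemented as a small debounce on the raw regime label. The sketch below switches the acted-on state only after the new state has held for `n_confirm` consecutive bars; the function name and the default of 3 bars are assumptions.

```python
import numpy as np

def debounce(raw_state, n_confirm=3):
    """Acted-on state changes only after n_confirm consecutive bars of a new raw state."""
    raw_state = np.asarray(raw_state)
    out = np.empty_like(raw_state)
    current = candidate = raw_state[0]
    streak = 0
    for t, s in enumerate(raw_state):
        if s == current:
            candidate, streak = current, 0    # challenger failed; reset
        elif s == candidate:
            streak += 1                       # challenger persists
        else:
            candidate, streak = s, 1          # new challenger
        if streak >= n_confirm:
            current, streak = candidate, 0    # confirmed: switch
        out[t] = current
    return out
```

On the raw sequence `[0, 0, 1, 0, 0, 1, 1, 1, 0]` the raw label flips four times while the debounced state flips once. The price is an extra `n_confirm − 1` bars of lag on every genuine switch, which belongs in the backtest.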
07 Backtesting & validation
Regime strategies are unusually easy to overfit: the regime label is itself fitted, and the number of true regime transitions in any sample is small. This is the discipline that keeps you honest.
Level: Advanced
The cardinal sin: look-ahead in the regime label
The label must be computable using only past data at every point.
- Expanding/rolling refit, not full-sample fit. Fitting an HMM/Markov model on the entire history then "backtesting" over the same history uses future data to set the present regime. At bar t, the model may only have seen data up to t.
- Filtered, not smoothed, probabilities. Smoothed probabilities at t use observations after t. Causal use requires filtered probabilities.
- Indicator warmup & no peeking. ADX/ER/MA all need warmup; the regime at t uses values computed at t with no centred windows.
Out-of-sample protocol
| Technique | What it adds | Note |
|---|---|---|
| Train/test split | Baseline OOS check | Necessary, not sufficient |
| Walk-forward (anchored or rolling) | Repeated OOS across time; mirrors live refitting | The default for regime strategies |
| Purged k-fold CV + embargo | Removes train samples whose labels overlap the test set (purge) + a gap after (embargo) to kill serial-correlation leakage | Essential when labels span multiple bars |
| Combinatorial purged CV | Many backtest paths → a distribution of performance, not one lucky path | Best defence against path-dependence |
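A walk-forward splitter is short enough to write by hand. The sketch below yields index ranges for rolling or anchored training windows, with an optional `gap` of skipped bars between train and test; the gap plays the purging role when labels span several bars. Names and parameters are assumptions for illustration.

```python
def walk_forward_splits(n_samples, train_size, test_size, gap=0, anchored=False):
    """Yield (train, test) index ranges marching forward through time."""
    test_start = train_size + gap
    while test_start + test_size <= n_samples:
        train_end = test_start - gap                       # exclusive
        train_start = 0 if anchored else train_end - train_size
        yield range(train_start, train_end), range(test_start, test_start + test_size)
        test_start += test_size                            # next out-of-sample block

splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2, gap=1))
```

Here `splits` is `[(range(0, 4), range(5, 7)), (range(2, 6), range(7, 9))]`: every training index precedes its test block, and the refit cadence equals `test_size`. Set `gap` to at least the label horizon (for example `max_hold` in a triple-barrier setup).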
Did the edge survive multiple testing?
You tried many thresholds, lookbacks, and state counts. Some "worked" by chance.
- Deflated Sharpe Ratio (Bailey–López de Prado) — discounts the observed Sharpe for the number of trials, non-normal returns, and sample length. A Sharpe of 1.5 after 200 configurations is not a Sharpe of 1.5.
- Probability of Backtest Overfitting (PBO) — via combinatorially symmetric cross-validation, estimates the chance your "best" config is in-sample luck.
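The Deflated Sharpe calculation can be sketched with the standard library alone. The formulas below follow the commonly quoted form of Bailey and López de Prado's result (expected maximum Sharpe across trials, then a probabilistic Sharpe against that benchmark); verify them against the paper before relying on the numbers. All Sharpe inputs are per-period, not annualized, and `sr_std` (the spread of Sharpe ratios across your trials) is something you must measure from your own sweep.

```python
import math
from statistics import NormalDist

def expected_max_sharpe(n_trials, sr_std):
    """Expected best Sharpe among n_trials (>= 2) configurations with no true skill."""
    g = 0.5772156649  # Euler-Mascheroni constant
    z = NormalDist()
    return sr_std * ((1 - g) * z.inv_cdf(1 - 1 / n_trials)
                     + g * z.inv_cdf(1 - 1 / (n_trials * math.e)))

def deflated_sharpe(sr, n_obs, skew, kurt, n_trials, sr_std):
    """P(true Sharpe > multiple-testing benchmark); kurt is raw kurtosis (3 if Gaussian)."""
    sr0 = expected_max_sharpe(n_trials, sr_std)
    denom = math.sqrt(1 - skew * sr + (kurt - 1) / 4 * sr ** 2)
    return NormalDist().cdf((sr - sr0) * math.sqrt(n_obs - 1) / denom)

# Hypothetical backtest: daily Sharpe 0.10 over 1250 bars, fat-tailed, left-skewed
dsr_few = deflated_sharpe(0.10, 1250, skew=-0.5, kurt=5.0, n_trials=2, sr_std=0.03)
dsr_many = deflated_sharpe(0.10, 1250, skew=-0.5, kurt=5.0, n_trials=200, sr_std=0.03)
```

The same observed Sharpe earns a much lower confidence after 200 trials than after 2, which is the whole point: log every configuration you try, because the count is an input.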
Metrics — measure per regime, not just overall
| Metric | Why it matters here |
|---|---|
| Sharpe / Sortino by regime | A strategy can be brilliant in-regime and disastrous out — the blend hides it |
| Calmar (CAGR / max-DD) | The right headline for the S&P long-only filter — drawdown is the product |
| Max drawdown & duration | The thing the filter exists to reduce |
| Hit rate & payoff by regime | Trend engines: low hit rate, high payoff; reversion: the reverse |
| Turnover & switching count | Directly tied to the whipsaw cost that kills regime strategies |
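The headline metrics are simple to compute directly, which makes it easy to slice them by regime. A minimal sketch, assuming simple per-bar returns and 252 bars per year; the function names are illustrative.

```python
import numpy as np

def max_drawdown(simple_returns):
    """Largest peak-to-trough loss of the equity curve, as a positive fraction."""
    equity = np.concatenate([[1.0], np.cumprod(1.0 + np.asarray(simple_returns, dtype=float))])
    peak = np.maximum.accumulate(equity)
    return (1.0 - equity / peak).max()

def calmar(simple_returns, periods=252):
    """CAGR divided by max drawdown."""
    r = np.asarray(simple_returns, dtype=float)
    cagr = np.prod(1.0 + r) ** (periods / len(r)) - 1.0
    return cagr / max_drawdown(r)

def sharpe_by_regime(simple_returns, regime, periods=252):
    """Annualized Sharpe of the strategy's returns within each regime label."""
    r, regime = np.asarray(simple_returns, dtype=float), np.asarray(regime)
    return {s: r[regime == s].mean() / r[regime == s].std(ddof=1) * np.sqrt(periods)
            for s in np.unique(regime)}

def switch_count(position):
    """Number of bars on which the position changed (a turnover proxy)."""
    return int(np.count_nonzero(np.diff(np.asarray(position, dtype=float))))
```

For returns of +10%, −50%, +20% the max drawdown is 0.5, measured from the 1.10 peak. Per-regime Sharpe on a handful of bars is very noisy, so report the bar count next to each figure.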
Costs are not optional
Charge spread (session-dependent), slippage, CFD overnight financing and dividend adjustments (S&P) or swap (FX). Then charge the switching turnover explicitly — the cost of every regime flip. A regime filter that's profitable gross and unprofitable net of switching costs is a negative-edge strategy with extra steps.
Robustness
- Parameter sensitivity — vary thresholds/lookbacks ±20%; a real edge degrades gracefully, an overfit one falls off a cliff.
- Regime stability over time — does the detector identify the same kind of regime across decades, or has its meaning drifted?
- Structural breaks — check performance across known macro breaks (2008, 2020, 2022) separately.
08 Tools, libraries & stack
Python-first, matched to a typical quant backend. Maturity/maintenance is current as of draft time; re-check at publish, since several of these move.
Level: Reference
| Category | Library | Use | Maturity / maintenance | License |
|---|---|---|---|---|
| Data | pandas | Frames, time series | Mature, active | BSD |
| | polars | Fast columnar; large-history backtests | Mature, very active | MIT |
| | OANDA v20 | Production feed (hist + live) | Vendor SDK | Vendor |
| | dukascopy tools | Free FX tick history (research) | Community, varies | varies |
| Indicators | TA-Lib | Indicator / oracle engine (C-fast, stable) | Mature, stable; Polars-compatible | BSD |
| | pandas-ta | (removed from runtime) | Effectively unmaintained; avoid | MIT |
| Stats / econometrics | statsmodels | Markov-switching, ADF/KPSS, cointegration | Mature, active | BSD |
| | arch | GARCH/EGARCH/GJR, variance ratio, bootstraps | Mature, active (K. Sheppard) | NCSA |
| | scipy / numpy | Numerics, distributions | Mature, active | BSD |
| Regime-specific | hmmlearn | Gaussian HMM (Viterbi, forward-backward) | Active, lean | BSD |
| | ruptures | Change-point (PELT, BinSeg) | Active | BSD |
| | pykalman / filterpy | Kalman / DLM (slope, dynamic hedge ratio) | pykalman light, filterpy active | BSD/MIT |
| | hurst / nolds | Hurst / DFA estimators | Community | MIT |
| ML | scikit-learn | Clustering, GMM, pipelines | Mature, active | BSD |
| | lightgbm / xgboost | Gradient boosting on features | Mature, active | MIT/Apache |
| | river | Online/streaming ML (causal by design) | Active | BSD |
| Backtesting | vectorbt | Vectorized backtests, fast param sweeps | Active (OSS + Pro) | Apache (OSS) |
| | backtesting.py | Simple event-driven backtests | Light, stable | AGPL (check) |
| | nautilus_trader | Production-grade event engine | Active, heavy | LGPL |
| Viz | matplotlib / plotly | Figures, regime overlays | Mature, active | BSD/MIT |
09 Pitfalls & anti-patterns
The recurring ways regime work goes wrong, each with its fix.
Level: All levels
- Look-ahead in the regime fit. The #1 killer. Full-sample HMM fits, smoothed probabilities, centred indicator windows. → Expanding-window refit, filtered probabilities, causal indicators.
- Overfitting regime count / thresholds. Adding states or tuning a threshold until the backtest sings. → BIC/AIC for state count; deflated Sharpe and PBO for thresholds; out-of-sample everything.
- HMM label switching. "State 0" silently changing meaning across refits. → Map states to characteristics (sort by mean/variance) every fit.
- Regimes that exist only in-sample. A beautiful structure that vanishes out-of-sample. → CPCV; check the regime means the same thing across decades.
- Ignoring switching cost. A filter that's gross-profitable and net-negative once spread/turnover is charged. → Budget switching cost explicitly; prefer continuous sizing to binary flips.
- Confusing a vol filter for a trend filter. They coincide on equities, not on FX. → Test the directional regime separately from the vol regime on FX majors.
- Trusting a single Hurst/VR number. Noisy estimators read as precise switches. → Treat them as slow, soft indicators; cross-check estimators; ensemble.
- Fancy beats simple — usually backwards. A deep net that underperforms a 200-SMA filter. → Earn complexity with out-of-sample evidence; a simple ADX/vol filter is a serious baseline, not a strawman.
- The regime-is-itself-a-prediction trap. Treating the detected regime as ground truth rather than a lagging, uncertain estimate. → Size by confidence; never bet the account on a coin-flip probability.
- Non-stationary regimes. Assuming the regimes themselves are stable. → Re-validate periodically; markets change, and so do their states.
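The fix for HMM label switching named in the list above (map states to characteristics on every fit) can be sketched as a relabelling step run after each refit. This version orders states by the variance of the returns assigned to them; the function name is an assumption, and sorting by mean works the same way.

```python
import numpy as np

def relabel_by_variance(states, returns):
    """Renumber states so 0 is the lowest-variance state and K-1 the highest."""
    states = np.asarray(states)
    returns = np.asarray(returns, dtype=float)
    ids = np.unique(states)
    order = ids[np.argsort([returns[states == s].var() for s in ids])]
    mapping = {old: new for new, old in enumerate(order)}   # raw label -> stable label
    return np.array([mapping[s] for s in states]), mapping
```

After relabelling, "state 0" always means the quietest regime regardless of how the optimizer numbered it, so a rule such as "long when state == 0" keeps its meaning across refits. Apply the same mapping to the columns of any state-probability matrix.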
10 References & further reading
The foundational sources behind every method in this handbook.
Level: Reference
Foundational texts
- Hamilton, Time Series Analysis — regime-switching, the canonical treatment.
- Lo & MacKinlay, A Non-Random Walk Down Wall Street — variance ratio, the random-walk tests.
- López de Prado, Advances in Financial Machine Learning — fractional differentiation, triple-barrier / meta-labeling, purged CV, deflated Sharpe, PBO.
- Chan, Algorithmic Trading: Winning Strategies and Their Rationale — mean reversion, cointegration, half-life in practice.
- Kaufman, Trading Systems and Methods — efficiency ratio, adaptive indicators, the practitioner toolkit.
- Ang, Asset Management — economic interpretation of regimes.
Key methods (search terms)
Hamilton (1989) Markov regime-switching · Adams & MacKay (2007) Bayesian online change-point · Kim, Koh & Boyd (2009) L1 trend filtering · Killick et al. PELT · Bailey & López de Prado deflated Sharpe / PBO · Yang & Zhang volatility estimator.
Glossary
- Autocorrelation — correlation of a return with its own lagged value; its sign decides trend vs reversion.
- Variance ratio — variance of a q-period return ÷ q × the 1-period variance; > 1 trend, < 1 reversion.
- Hurst exponent — persistence scalar; < 0.5 reverting, > 0.5 trending.
- Half-life — bars for a mean-reverting series to close half the gap to its mean.
- Filtered vs smoothed probability — filtered uses data up to t (causal); smoothed uses the whole sample (look-ahead).
- Calmar ratio — CAGR ÷ max drawdown; the headline metric for a drawdown-focused filter.
- Deflated Sharpe — Sharpe adjusted for multiple testing, non-normality, and sample length.
- Walk-forward — repeated train-then-test marching through time, mirroring live refits.
Keep reading
The Technical Analysis Handbook covers the indicator and structure foundations this handbook leans on in Part 4 — candlesticks, market structure, trend/momentum/volatility indicators, and confluence, each with exact rules.