markowitz.backtest

markowitz.backtest: walk-forward rebalancing, turnover, and performance attribution.

AlignmentError

Bases: BacktestError

Raised when the universe of assets is inconsistent across inputs.

BacktestError

Bases: Exception

Base class for all errors raised inside markowitz.backtest.

BacktestResult(returns_gross: pd.DataFrame, returns_net: pd.DataFrame, weights: dict[str, pd.DataFrame], turnover: pd.DataFrame, rebalance_dates: list[pd.Timestamp] = list()) dataclass

Aggregated output of a WalkForward run.

Attributes:

returns_gross (DataFrame): Per-period gross returns, one column per strategy.
returns_net (DataFrame): Per-period net-of-cost returns.
weights (dict[str, DataFrame]): Mapping strategy_name -> DataFrame of post-rebalance weights.
turnover (DataFrame): Per-period turnover, one column per strategy.
rebalance_dates (list[Timestamp]): Sorted list of dates on which weights were recomputed.

summary(*, ann: int = 12, gamma: float = 5.0) -> pd.DataFrame

Standard performance grid (Sharpe, Sortino, MDD, Calmar, CEQ, TO).

Source code in src/markowitz/backtest/result.py
def summary(self, *, ann: int = 12, gamma: float = 5.0) -> pd.DataFrame:
    """Standard performance grid (Sharpe, Sortino, MDD, Calmar, CEQ, TO)."""
    rows: list[dict[str, float]] = []
    for col in self.returns_net.columns:
        r = self.returns_net[col].dropna()
        rows.append(
            {
                "Sharpe": _stats.sharpe_ratio(r, ann=ann),
                "Sortino": _stats.sortino_ratio(r, ann=ann),
                "MaxDD": _stats.max_drawdown(r),
                "Calmar": _stats.calmar(r, ann=ann),
                "CEQ": _stats.ceq(r, gamma=gamma),
                "Turnover": float(self.turnover[col].mean()) if col in self.turnover else 0.0,
            }
        )
    return pd.DataFrame(rows, index=list(self.returns_net.columns))

DegenerateWindowError

Bases: BacktestError

Raised when a rolling window is rank-deficient or non-finite.

GMVSample

Global Minimum Variance with the sample covariance, closed form.
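
As a rough illustration of the closed form, the GMV weights are Sigma^{-1} 1 normalized to sum to one. The sketch below is standalone pure Python, not the library's implementation; the helper names `solve` and `gmv_weights` and the covariance numbers are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def gmv_weights(cov):
    """w = Sigma^{-1} 1 / (1' Sigma^{-1} 1)."""
    z = solve(cov, [1.0] * len(cov))
    total = sum(z)
    return [v / total for v in z]


# Hypothetical two-asset sample covariance (variances 0.04 and 0.09).
cov = [[0.04, 0.01], [0.01, 0.09]]
w = gmv_weights(cov)
print(w)  # weights are 8/11 and 3/11 for this covariance
```

The lower-variance asset receives the larger weight, as expected for a minimum-variance allocation.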

InsufficientHistoryError

Bases: BacktestError

Raised when fewer observations are available than the rolling window requires.

MaxSharpeNaive

Tangency portfolio with sample mean and sample covariance.

Delegates to markowitz.optimizer.MeanVariance when available; otherwise falls back to the closed-form Sigma^{-1} (mu - rf) solution.
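
The closed-form tangency solution can be sketched for two assets, where the 2x2 inverse is explicit. This is an illustration only, assuming the weights are normalized to sum to one; `tangency_weights_2x2` and the input numbers are hypothetical and not part of the package.

```python
def tangency_weights_2x2(mu, cov, rf=0.0):
    """Two-asset tangency weights: Sigma^{-1} (mu - rf), normalized to sum to one."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    ex = [mu[0] - rf, mu[1] - rf]  # excess mean returns
    # z = Sigma^{-1} (mu - rf), using the explicit 2x2 inverse.
    z = [(d * ex[0] - b * ex[1]) / det, (a * ex[1] - c * ex[0]) / det]
    total = sum(z)
    return [v / total for v in z]


# Hypothetical per-period means, covariance and risk-free rate.
w = tangency_weights_2x2([0.08, 0.12], [[0.04, 0.01], [0.01, 0.09]], rf=0.02)
print(w)  # weights are 44/78 and 34/78 for these inputs
```

The normalization is only meaningful when the entries of Sigma^{-1} (mu - rf) sum to a positive number; with noisy sample means that condition can fail, which is one reason the estimator is labelled naive.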

MissingMarketCapsError

Bases: BacktestError

Raised when a strategy that requires market caps cannot find them.

OneOverN

Equal-weight benchmark of DeMiguel, Garlappi, and Uppal (2009).

RiskParity(*, max_iter: int = 500, tol: float = 1e-08)

Equal-risk-contribution portfolio via cyclical coordinate descent.

Solves the Spinu (2013) / Griveau-Billion-Richard-Roncalli (2013) convex programme min_x 0.5 x'Sigma x - sum_i b_i log(x_i) with b_i = 1/n (equal risk targets), then renormalizes to sum to one.

Source code in src/markowitz/backtest/strategies.py
def __init__(self, *, max_iter: int = 500, tol: float = 1e-8) -> None:
    self.max_iter = int(max_iter)
    self.tol = float(tol)
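
For orientation, here is a minimal sketch of cyclical coordinate descent on the programme above. Setting the partial derivative with respect to x_i to zero gives the quadratic Sigma_ii x_i^2 + c_i x_i - b_i = 0 with c_i = sum_{j != i} Sigma_ij x_j, whose positive root is the coordinate update. The function name `erc_weights`, the inverse-volatility starting point and the stopping rule are assumptions for illustration; the package's internals may differ.

```python
import math


def erc_weights(cov, max_iter=500, tol=1e-8):
    """Equal-risk-contribution weights via cyclical coordinate descent."""
    n = len(cov)
    b = 1.0 / n  # equal risk targets
    x = [1.0 / math.sqrt(cov[i][i]) for i in range(n)]  # inverse-vol start
    for _ in range(max_iter):
        x_old = x[:]
        for i in range(n):
            c = sum(cov[i][j] * x[j] for j in range(n) if j != i)
            # Positive root of cov_ii * x_i^2 + c * x_i - b = 0.
            x[i] = (-c + math.sqrt(c * c + 4.0 * cov[i][i] * b)) / (2.0 * cov[i][i])
        if max(abs(u - v) for u, v in zip(x, x_old)) < tol:
            break
    total = sum(x)
    return [v / total for v in x]  # renormalize to sum to one


# Hypothetical three-asset covariance with mild correlations.
cov = [[0.04, 0.006, 0.0], [0.006, 0.09, 0.012], [0.0, 0.012, 0.16]]
w = erc_weights(cov)
marginal = [sum(cov[i][j] * w[j] for j in range(3)) for i in range(3)]
risk_contrib = [w[i] * marginal[i] for i in range(3)]
print(w, risk_contrib)
```

At convergence every entry of `risk_contrib` is (numerically) the same, which is the defining property of the equal-risk-contribution portfolio.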

Strategy

Bases: Protocol

Static portfolio-construction protocol used by WalkForward.
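
The protocol's members are not listed on this page. Judging from the call `strat.fit(window, rf=rf_t)` in WalkForward.run, a conforming object needs a `fit` method that returns one weight per asset. The sketch below is an assumption built on that call: `StrategyLike` and `EqualWeight` are hypothetical names, and a list of rows stands in for the DataFrame window.

```python
from typing import Any, Protocol, Sequence, runtime_checkable


@runtime_checkable
class StrategyLike(Protocol):
    """Hypothetical stand-in for the Strategy protocol (signature inferred)."""

    def fit(self, window: Any, *, rf: float = 0.0) -> Sequence[float]: ...


class EqualWeight:
    """Toy strategy: ignore the data and return 1/n for each asset."""

    def fit(self, window: Any, *, rf: float = 0.0) -> Sequence[float]:
        n_assets = len(window[0])
        return [1.0 / n_assets] * n_assets


strategy = EqualWeight()
weights = strategy.fit([[0.01, -0.02, 0.03, 0.00]], rf=0.0)
print(isinstance(strategy, StrategyLike), weights)
```

Because the check is structural, `EqualWeight` satisfies the protocol without inheriting from it.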

WalkForward(returns: pd.DataFrame, strategies: Mapping[str, Strategy], *, rebalance: str = 'M', lookback: int = 120, rf: pd.Series | float | None = None, cost_bps: float = 10.0, debug_no_lookahead: bool = False)

Rolling-window walk-forward backtester.

The engine slides a fixed-length lookback window across returns, fits every supplied strategy on the strictly in-sample slice, and holds the resulting weights until the next rebalance date.

Parameters:

returns (DataFrame, required): Wide DataFrame of per-period simple returns indexed by date.
strategies (Mapping[str, Strategy], required): Mapping label -> Strategy (any object implementing the markowitz.backtest.strategies.Strategy protocol).
rebalance (str, default 'M'): Pandas offset alias for the rebalance frequency. "M" (month-end, the default) rebalances every period when returns is monthly.
lookback (int, default 120): Length of the rolling estimation window, in periods. Must be at least 2 and strictly less than len(returns).
rf (Series | float | None, default None): Risk-free rate; scalar or Series aligned to returns. None is treated as 0.0, and dates missing from a Series are filled with 0.0.
cost_bps (float, default 10.0): Proportional transaction cost in basis points per unit turnover.
debug_no_lookahead (bool, default False): If True, raise AssertionError whenever an estimation window contains an observation dated at or after the trading date it is used for. Used by the regression test suite.

Source code in src/markowitz/backtest/walk_forward.py
def __init__(
    self,
    returns: pd.DataFrame,
    strategies: Mapping[str, Strategy],
    *,
    rebalance: str = "M",
    lookback: int = 120,
    rf: pd.Series | float | None = None,
    cost_bps: float = 10.0,
    debug_no_lookahead: bool = False,
) -> None:
    if returns.empty:
        raise AlignmentError("returns DataFrame is empty")
    if returns.isna().any().any():
        raise AlignmentError("returns DataFrame contains NaNs; please pre-clean")
    if lookback < 2:
        raise InsufficientHistoryError("lookback must be >= 2")
    if lookback >= len(returns):
        raise InsufficientHistoryError(f"lookback={lookback} >= len(returns)={len(returns)}")
    if not strategies:
        raise AlignmentError("at least one strategy is required")

    self.returns = returns.sort_index()
    self.strategies = dict(strategies)
    self.rebalance = rebalance
    self.lookback = int(lookback)
    self.cost_bps = float(cost_bps)
    self.debug_no_lookahead = bool(debug_no_lookahead)

    if rf is None:
        self.rf: pd.Series = pd.Series(0.0, index=self.returns.index)
    elif isinstance(rf, pd.Series):
        self.rf = rf.reindex(self.returns.index).fillna(0.0)
    else:
        self.rf = pd.Series(float(rf), index=self.returns.index)

run() -> BacktestResult

Execute the walk-forward loop and return a BacktestResult.

Source code in src/markowitz/backtest/walk_forward.py
def run(self) -> BacktestResult:
    """Execute the walk-forward loop and return a :class:`BacktestResult`."""
    idx = self.returns.index
    assets = list(self.returns.columns)
    n_assets = len(assets)

    # Trading dates start strictly after the first feasible window end.
    trading_dates = idx[self.lookback :]
    # pragma: defensive guard; ctor already enforces lookback < len(returns)
    if len(trading_dates) == 0:  # pragma: no cover
        raise InsufficientHistoryError("no trading periods after lookback")

    rebal_mask = self._build_rebalance_mask(trading_dates)
    rebalance_dates: list[pd.Timestamp] = [
        t for t, flag in zip(trading_dates, rebal_mask, strict=True) if flag
    ]

    gross = pd.DataFrame(
        0.0, index=trading_dates, columns=list(self.strategies.keys()), dtype=float
    )
    turnover_df = pd.DataFrame(
        0.0, index=trading_dates, columns=list(self.strategies.keys()), dtype=float
    )
    weights_records: dict[str, dict[pd.Timestamp, np.ndarray]] = {
        name: {} for name in self.strategies
    }
    current_w: dict[str, np.ndarray] = {
        name: np.full(n_assets, 1.0 / n_assets) for name in self.strategies
    }

    for i, t in enumerate(trading_dates):
        pos = self.lookback + i
        window = self.returns.iloc[pos - self.lookback : pos]
        if self.debug_no_lookahead and len(window) > 0 and window.index.max() >= t:
            raise AssertionError(
                f"look-ahead detected: window ends at {window.index.max()} >= {t}"
            )
        r_t = self.returns.iloc[pos].to_numpy(dtype=float)
        rf_t = float(self.rf.iloc[pos])

        for name, strat in self.strategies.items():
            w_prev = current_w[name]
            if rebal_mask[i]:
                w_new = np.asarray(strat.fit(window, rf=rf_t), dtype=float).ravel()
                if w_new.shape[0] != n_assets:
                    raise AlignmentError(
                        f"strategy {name} returned {w_new.shape[0]} weights, "
                        f"expected {n_assets}"
                    )
                to_t = compute_turnover(w_new, w_prev, r_t)
                weights_records[name][t] = w_new
            else:
                w_new = w_prev
                to_t = 0.0
            gross.at[t, name] = float(np.dot(w_prev, r_t))
            turnover_df.at[t, name] = to_t

            # Drift weights forward for the next period.
            denom = 1.0 + float(np.dot(w_prev, r_t))
            if denom != 0.0 and np.isfinite(denom):
                drifted = w_prev * (1.0 + r_t) / denom
            else:
                drifted = w_prev.copy()
            # Apply this period's rebalance *after* booking the return.
            current_w[name] = w_new if rebal_mask[i] else drifted

    net = pd.DataFrame(index=trading_dates, columns=list(self.strategies.keys()), dtype=float)
    for name in self.strategies:
        net[name] = apply_transaction_costs(gross[name], turnover_df[name], bps=self.cost_bps)

    weights_frames: dict[str, pd.DataFrame] = {}
    for name, recs in weights_records.items():
        if recs:
            weights_frames[name] = pd.DataFrame.from_dict(
                recs, orient="index", columns=assets
            ).sort_index()
        else:  # pragma: no cover  # defensive: rebal_mask always sets index 0
            weights_frames[name] = pd.DataFrame(columns=assets)

    return BacktestResult(
        returns_gross=gross,
        returns_net=net,
        weights=weights_frames,
        turnover=turnover_df,
        rebalance_dates=rebalance_dates,
    )
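
The timing convention in the loop above can be seen in a stripped-down sketch: the window for position `pos` excludes `pos` itself, the period's return is booked with the weights held coming into it, and newly fitted weights only earn returns from the next period on. This toy version rebalances every period and omits drift, turnover and costs; `walk_forward`, `top_mean_asset` and the data are hypothetical.

```python
def walk_forward(returns, lookback, fit):
    """returns: list of per-period rows, one list of asset returns per period."""
    n_assets = len(returns[0])
    w = [1.0 / n_assets] * n_assets  # initial holding, as in run()
    booked = []
    for pos in range(lookback, len(returns)):
        window = returns[pos - lookback:pos]  # strictly before period pos
        r_t = returns[pos]
        # Book the return with the weights decided earlier.
        booked.append(sum(wi * ri for wi, ri in zip(w, r_t)))
        # Weights fitted now are applied from the next period on.
        w = fit(window)
    return booked


def top_mean_asset(window):
    """Toy strategy: all weight on the asset with the highest in-window mean."""
    n_assets = len(window[0])
    means = [sum(row[j] for row in window) / len(window) for j in range(n_assets)]
    best = max(range(n_assets), key=lambda j: means[j])
    return [1.0 if j == best else 0.0 for j in range(n_assets)]


data = [[0.01, 0.00], [0.02, -0.01], [-0.05, 0.03], [0.04, 0.01], [0.00, 0.02]]
booked = walk_forward(data, lookback=2, fit=top_mean_asset)
print(booked)  # approximately [-0.01, 0.04, 0.02]
```

The first booked return comes from the equal-weight starting portfolio, because no fitted weights exist before the first rebalance.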

apply_transaction_costs(gross_returns: pd.Series, turnover: pd.Series, *, bps: float = 10.0) -> pd.Series

Subtract proportional transaction costs from gross_returns.

Parameters:

gross_returns (Series, required): Per-period simple returns before costs.
turnover (Series, required): Per-period one-sided turnover, aligned to gross_returns.
bps (float, default 10.0): Round-trip-equivalent cost in basis points. The deduction applied to period t is (bps / 10_000) * turnover_t.

Source code in src/markowitz/backtest/costs.py
def apply_transaction_costs(
    gross_returns: pd.Series,
    turnover: pd.Series,
    *,
    bps: float = 10.0,
) -> pd.Series:
    """Subtract proportional transaction costs from ``gross_returns``.

    Parameters
    ----------
    gross_returns:
        Per-period simple returns *before* costs.
    turnover:
        Per-period one-sided turnover, aligned to ``gross_returns``.
    bps:
        Round-trip-equivalent cost in basis points. The deduction applied
        to period ``t`` is ``(bps / 10_000) * turnover_t``.
    """
    rate = float(bps) / 10_000.0
    aligned_to = turnover.reindex(gross_returns.index).fillna(0.0)
    return gross_returns.sub(aligned_to.mul(rate))
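
A worked example of the deduction, using plain lists in place of pandas Series (the helper name `net_returns` and the numbers are hypothetical): at 10 bps, a one-sided turnover of 0.50 costs 0.0005, or 5 basis points of return.

```python
def net_returns(gross, turnover, bps=10.0):
    """Subtract (bps / 10_000) * turnover from each period's gross return."""
    rate = bps / 10_000.0
    return [g - rate * to for g, to in zip(gross, turnover)]


gross = [0.010, -0.004, 0.007]
turnover = [0.50, 0.00, 0.20]
net = net_returns(gross, turnover, bps=10.0)
print(net)  # approximately [0.0095, -0.004, 0.0068]
```

Periods with zero turnover pass through unchanged, so a buy-and-hold stretch incurs no cost.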

calmar(r: pd.Series, ann: int = 12) -> float

Calmar ratio: annualized arithmetic mean / |max drawdown|.

Returns 0.0 when the drawdown is zero.

Source code in src/markowitz/backtest/stats.py
def calmar(r: pd.Series, ann: int = 12) -> float:
    """Calmar ratio: annualized arithmetic mean / |max drawdown|.

    Returns 0.0 when the drawdown is zero.
    """
    mdd = max_drawdown(r)
    if mdd == 0.0:
        return 0.0
    ann_ret = float(r.mean()) * ann
    return float(ann_ret / abs(mdd))

ceq(r: pd.Series, gamma: float) -> float

Per-period certainty-equivalent return under quadratic utility.

CEQ = mu - 0.5 * gamma * sigma^2.

Source code in src/markowitz/backtest/stats.py
def ceq(r: pd.Series, gamma: float) -> float:
    """Per-period certainty-equivalent return under quadratic utility.

    ``CEQ = mu - 0.5 * gamma * sigma^2``.
    """
    mu = float(r.mean())
    var = float(r.var(ddof=1))
    return float(mu - 0.5 * gamma * var)
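
A small worked example of the formula in pure Python (the return series is made up). `statistics.variance` is the sample variance, matching `ddof=1` in the listing.

```python
import statistics


def ceq(returns, gamma):
    """CEQ = mean - 0.5 * gamma * sample variance."""
    mu = statistics.fmean(returns)
    var = statistics.variance(returns)  # divides by n - 1
    return mu - 0.5 * gamma * var


r = [0.02, -0.01, 0.03, 0.00]
value = ceq(r, gamma=5.0)
print(value)  # mean 0.01, variance 0.001 / 3, so roughly 0.00917
```

Raising `gamma` penalizes variance more heavily, so the same series yields a lower certainty equivalent for a more risk-averse investor.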

compute_turnover(weights_new: np.ndarray, weights_prev: np.ndarray, returns_between: np.ndarray) -> float

Return one-sided drift-adjusted turnover.

Let w_prev be the weights at the start of the period, r the realized asset returns over that period and w_new the target weights set at the rebalance date. The portfolio drifts to

w_drift = w_prev * (1 + r) / (1 + w_prev . r)

and the turnover charged on the rebalance is

TO = 0.5 * sum(|w_new - w_drift|).

Source code in src/markowitz/backtest/turnover.py
def compute_turnover(
    weights_new: np.ndarray,
    weights_prev: np.ndarray,
    returns_between: np.ndarray,
) -> float:
    """Return one-sided drift-adjusted turnover.

    Let ``w_prev`` be the weights at the start of the period, ``r`` the
    realized asset returns over that period and ``w_new`` the target
    weights set at the rebalance date. The portfolio drifts to

    ``w_drift = w_prev * (1 + r) / (1 + w_prev . r)``

    and the turnover charged on the rebalance is

    ``TO = 0.5 * sum(|w_new - w_drift|)``.
    """
    w_prev = np.asarray(weights_prev, dtype=float).ravel()
    w_new = np.asarray(weights_new, dtype=float).ravel()
    r = np.asarray(returns_between, dtype=float).ravel()
    if w_prev.shape != w_new.shape or w_prev.shape != r.shape:
        raise ValueError("weights_prev, weights_new and returns_between must align")
    denom = 1.0 + float(np.dot(w_prev, r))
    if denom == 0.0 or not np.isfinite(denom):
        # Catastrophic drawdown: charge full rebalance.
        return 0.5 * float(np.sum(np.abs(w_new - w_prev)))
    w_drift = w_prev * (1.0 + r) / denom
    return 0.5 * float(np.sum(np.abs(w_new - w_drift)))
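
To see why drift matters, consider a 50/50 portfolio rebalanced back to 50/50 after one asset gains 10% and the other loses 10%. The sketch below is a pure-Python restatement of the two formulas without the zero-denominator guard; the name `drift_adjusted_turnover` is hypothetical.

```python
def drift_adjusted_turnover(w_new, w_prev, r):
    """One-sided turnover between drifted holdings and new target weights."""
    growth = 1.0 + sum(w * x for w, x in zip(w_prev, r))
    w_drift = [w * (1.0 + x) / growth for w, x in zip(w_prev, r)]
    return 0.5 * sum(abs(a - b) for a, b in zip(w_new, w_drift))


to = drift_adjusted_turnover([0.5, 0.5], [0.5, 0.5], [0.10, -0.10])
print(to)  # drifted weights are 0.55 / 0.45, so turnover is about 0.05
```

Although the target weights did not change, restoring them requires trading 5% of the portfolio, which a naive `|w_new - w_prev|` measure would report as zero.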

jobson_korkie_memmel(r_i: pd.Series, r_n: pd.Series) -> tuple[float, float]

Jobson-Korkie test with the Memmel (2003) correction.

Tests H_0: Sharpe(r_i) = Sharpe(r_n) against a two-sided alternative. Returns (z, p_value).

Identical input series yield z == 0 and p_value == 1.

Source code in src/markowitz/backtest/stats.py
def jobson_korkie_memmel(r_i: pd.Series, r_n: pd.Series) -> tuple[float, float]:
    """Jobson-Korkie test with the Memmel (2003) correction.

    Tests :math:`H_0`: Sharpe(:math:`r_i`) = Sharpe(:math:`r_n`) against a
    two-sided alternative. Returns ``(z, p_value)``.

    Identical input series yield ``z == 0`` and ``p_value == 1``.
    """
    joined = pd.concat([r_i, r_n], axis=1, join="inner").dropna()
    if joined.shape[0] < 3:
        return 0.0, 1.0
    a = joined.iloc[:, 0].to_numpy(dtype=float)
    b = joined.iloc[:, 1].to_numpy(dtype=float)
    t_obs = a.shape[0]

    mu_i = float(np.mean(a))
    mu_n = float(np.mean(b))
    sig_i = float(np.std(a, ddof=1))
    sig_n = float(np.std(b, ddof=1))
    cov_in = float(np.cov(a, b, ddof=1)[0, 1])

    if sig_i == 0.0 or sig_n == 0.0:
        return 0.0, 1.0

    delta = sig_n * mu_i - sig_i * mu_n
    theta = (1.0 / t_obs) * (
        2.0 * (sig_i**2) * (sig_n**2)
        - 2.0 * sig_i * sig_n * cov_in
        + 0.5 * (mu_i**2) * (sig_n**2)
        + 0.5 * (mu_n**2) * (sig_i**2)
        - (mu_i * mu_n / (sig_i * sig_n)) * (cov_in**2)
    )
    if not np.isfinite(theta) or theta <= 0.0:
        return 0.0, 1.0
    z = delta / float(np.sqrt(theta))
    p = 2.0 * (1.0 - float(_sps.norm.cdf(abs(z))))
    p = max(0.0, min(1.0, p))
    return float(z), float(p)
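
Written out, the statistic computed in the listing above is as follows, where T is the number of overlapping observations, hats denote sample estimates with `ddof=1`, and \hat\sigma_{in} is the sample covariance of the two series. The notation is ours, transcribed from the code rather than quoted from the package documentation.

```latex
\hat{z} = \frac{\hat\sigma_n \hat\mu_i - \hat\sigma_i \hat\mu_n}{\sqrt{\hat\theta}},
\qquad
\hat\theta = \frac{1}{T}\left(
    2\hat\sigma_i^2 \hat\sigma_n^2
  - 2\hat\sigma_i \hat\sigma_n \hat\sigma_{in}
  + \tfrac{1}{2}\hat\mu_i^2 \hat\sigma_n^2
  + \tfrac{1}{2}\hat\mu_n^2 \hat\sigma_i^2
  - \frac{\hat\mu_i \hat\mu_n}{\hat\sigma_i \hat\sigma_n}\,\hat\sigma_{in}^2
\right),
\qquad
p = 2\left(1 - \Phi(|\hat{z}|)\right)
```

For identical series the numerator is zero and \hat\theta collapses to zero as well, which is why the function returns (0.0, 1.0) through its `theta <= 0` guard.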

ledoit_wolf_bootstrap_pvalue(r_i: pd.Series, r_n: pd.Series, *, B: int = 999, block_length: int = 5, seed: int = 0) -> float

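Circular-block-bootstrap p-value for the Sharpe-difference test.

Resamples fixed-length blocks with wrap-around (the Politis-Romano circular block bootstrap, not the stationary bootstrap with random block lengths) under the null that the two Sharpe ratios are equal. Returns a two-sided p-value in [0, 1].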

Source code in src/markowitz/backtest/stats.py
def ledoit_wolf_bootstrap_pvalue(
    r_i: pd.Series,
    r_n: pd.Series,
    *,
    B: int = 999,
    block_length: int = 5,
    seed: int = 0,
) -> float:
    """Circular-block-bootstrap p-value for the Sharpe-difference test.

    Resamples fixed-length blocks with wrap-around (Politis-Romano circular
    block bootstrap) under the null that the two Sharpe ratios are equal.
    Returns a two-sided p-value in ``[0, 1]``.
    """
    joined = pd.concat([r_i, r_n], axis=1, join="inner").dropna()
    if joined.shape[0] < block_length + 1:
        return 1.0
    a = joined.iloc[:, 0].to_numpy(dtype=float)
    b = joined.iloc[:, 1].to_numpy(dtype=float)
    n = a.shape[0]

    def _sharpe_diff(x: np.ndarray, y: np.ndarray) -> float:
        sx = float(np.std(x, ddof=1))
        sy = float(np.std(y, ddof=1))
        if sx == 0.0 or sy == 0.0:
            return 0.0
        return float(np.mean(x) / sx - np.mean(y) / sy)

    obs = _sharpe_diff(a, b)
    rng = np.random.default_rng(seed)
    n_blocks = int(np.ceil(n / block_length))

    count = 0
    for _ in range(B):
        starts = rng.integers(0, n, size=n_blocks)
        idx = np.concatenate([(s + np.arange(block_length)) % n for s in starts])[:n]
        a_b = a[idx]
        b_b = b[idx]
        # Recenter under H0: equal Sharpes -> compare against observed.
        diff_boot = _sharpe_diff(a_b, b_b) - obs
        if abs(diff_boot) >= abs(obs):
            count += 1
    p = (count + 1) / (B + 1)
    return float(max(0.0, min(1.0, p)))
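
The resampling step in the loop above can be isolated as follows. This sketch uses the standard-library `random` module instead of NumPy's generator, so the draws will differ from the library's for the same seed; `circular_block_indices` is a hypothetical name.

```python
import random


def circular_block_indices(n, block_length, rng):
    """Draw ceil(n / block_length) fixed-length blocks that wrap around at n."""
    n_blocks = -(-n // block_length)  # integer ceiling division
    idx = []
    for _ in range(n_blocks):
        start = rng.randrange(n)  # uniform start in [0, n)
        idx.extend((start + k) % n for k in range(block_length))
    return idx[:n]  # trim the final block to the sample length


rng = random.Random(0)
idx = circular_block_indices(12, 5, rng)
print(idx)
```

Applying the same `idx` to both return series, as the listing does, preserves their cross-sectional dependence while the blocks preserve short-range serial dependence.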

max_drawdown(r: pd.Series) -> float

Maximum drawdown of the wealth path implied by r.

Returns a non-positive number (0.0 if the path is monotonically non-decreasing).

Source code in src/markowitz/backtest/stats.py
def max_drawdown(r: pd.Series) -> float:
    """Maximum drawdown of the wealth path implied by ``r``.

    Returns a non-positive number (``0.0`` if the path is monotonically
    non-decreasing).
    """
    if len(r) == 0:
        return 0.0
    wealth = (1.0 + r.astype(float)).cumprod()
    peak = wealth.cummax()
    dd = wealth / peak - 1.0
    return float(dd.min())
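
A pure-Python restatement with a worked series (the numbers are made up). Note one detail carried over from the listing: the running peak starts at the first wealth value, as `cummax` does, so a loss in the very first period is not measured against the initial capital of 1.

```python
def max_drawdown(returns):
    """Most negative value of wealth / running peak - 1 along the path."""
    wealth = 1.0
    peak = None
    worst = 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = wealth if peak is None else max(peak, wealth)
        worst = min(worst, wealth / peak - 1.0)
    return worst


mdd = max_drawdown([0.10, -0.20, 0.05, -0.10, 0.30])
print(mdd)  # peak 1.1, trough 0.8316, so about -0.244
```

The partial recovery of 5% in the third period does not reset the peak, so the drawdown keeps deepening until the trough in the fourth period.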

sharpe_ratio(r: pd.Series, rf: pd.Series | float = 0.0, ann: int = 12) -> float

Annualized Sharpe ratio of an excess-return series.

Returns 0.0 if the standard deviation is non-finite or numerically zero (at most 1e-12 * max(|mean|, 1)).

Source code in src/markowitz/backtest/stats.py
def sharpe_ratio(r: pd.Series, rf: pd.Series | float = 0.0, ann: int = 12) -> float:
    """Annualized Sharpe ratio of an excess-return series.

    Returns 0.0 if the standard deviation is non-finite or numerically zero
    (at most ``1e-12 * max(|mean|, 1)``).
    """
    excess = _to_excess(r, rf)
    sd = float(excess.std(ddof=1))
    mu = float(excess.mean())
    scale = max(abs(mu), 1.0)
    if not np.isfinite(sd) or sd <= 1e-12 * scale:
        return 0.0
    return float(mu / sd * np.sqrt(ann))
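
A worked example in pure Python, assuming a scalar risk-free rate that is simply subtracted from each return (the helper `_to_excess` is not shown on this page, so that step is an assumption). The return series is made up.

```python
import math
import statistics


def sharpe_ratio(returns, rf=0.0, ann=12):
    """Annualized mean / sample standard deviation of excess returns."""
    excess = [r - rf for r in returns]
    mu = statistics.fmean(excess)
    sd = statistics.stdev(excess)  # divides by n - 1
    if sd <= 1e-12 * max(abs(mu), 1.0):
        return 0.0  # numerically constant series
    return mu / sd * math.sqrt(ann)


r = [0.02, -0.01, 0.03, 0.00]
sr = sharpe_ratio(r, rf=0.0, ann=12)
print(sr)  # per-period ratio about 0.548, annualized about 1.897
```

With `ann=12` the per-period ratio is scaled by the square root of 12, which presumes monthly observations with no serial correlation.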

sortino_ratio(r: pd.Series, rf: pd.Series | float = 0.0, ann: int = 12) -> float

Annualized Sortino ratio using downside semi-deviation (target = 0).

Source code in src/markowitz/backtest/stats.py
def sortino_ratio(r: pd.Series, rf: pd.Series | float = 0.0, ann: int = 12) -> float:
    """Annualized Sortino ratio using downside semi-deviation (target = 0)."""
    excess = _to_excess(r, rf)
    downside = excess.clip(upper=0.0)
    dd_sd = float(np.sqrt((downside**2).mean()))
    if not np.isfinite(dd_sd) or dd_sd == 0.0:
        return 0.0
    mu = float(excess.mean())
    return float(mu / dd_sd * np.sqrt(ann))
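
A worked example in pure Python, again assuming a scalar risk-free rate. As in the listing, the downside semi-deviation averages squared shortfalls over all observations, not only over the negative ones.

```python
import math


def sortino_ratio(returns, rf=0.0, ann=12):
    """Annualized mean excess return / downside semi-deviation (target 0)."""
    excess = [r - rf for r in returns]
    downside = [min(x, 0.0) for x in excess]  # keep only shortfalls
    dd_sd = math.sqrt(sum(d * d for d in downside) / len(downside))
    if dd_sd == 0.0:
        return 0.0  # no negative excess returns at all
    mu = sum(excess) / len(excess)
    return mu / dd_sd * math.sqrt(ann)


r = [0.02, -0.01, 0.03, 0.00]
s = sortino_ratio(r, rf=0.0, ann=12)
print(s)  # semi-deviation 0.005, mean 0.01, so 2 * sqrt(12), about 6.93
```

Only one of the four observations is negative, so the Sortino ratio comes out well above the Sharpe ratio of the same series.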