Backtester

BacktestReport dataclass

Container for backtest results and performance metrics.

This class encapsulates the complete results of a backtest run, including P&L records, orders executed, and calculated performance metrics such as Sharpe ratio and maximum drawdown.

Attributes:

Name Type Description
starting_cash np.float64

Initial cash amount at start of backtest.

final_cash np.float64

Final cash amount at end of backtest.

PnlRecord pd.Series

Time series of P&L values throughout the backtest.

orders list[Order]

List of all orders executed during the backtest.

tradeRecord list[np.float64]

List of individual trade P&L values.

margin_call_events list[dict]

Margin call events triggered during the run.

data pd.DataFrame

OHLC price data used in the backtest.

periods_per_year property

Calculate the number of trading periods per year.

This property infers the appropriate number of periods per year from the P&L record index, useful for annualized calculations.

Returns:

Name Type Description
int

Number of trading periods per year (e.g., 252 for daily data).
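As a sketch of how such an inference might work, here is an assumed heuristic based on the median spacing of the index (the real property may instead map daily bars to 252 trading days, as the example above suggests):

```python
import pandas as pd

def infer_periods_per_year(index: pd.DatetimeIndex) -> int:
    """Infer periods per year from the median spacing of a DatetimeIndex.

    Hypothetical helper illustrating the idea; not the library's code.
    """
    spacing = index.to_series().diff().median()
    seconds_per_year = 365.25 * 24 * 3600
    return round(seconds_per_year / spacing.total_seconds())

# Calendar-daily bars come out near 365; hourly bars near 8766
daily = pd.date_range("2024-01-01", periods=100, freq="D")
print(infer_periods_per_year(daily))  # 365
```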

__str__()

Generate a formatted string summary of backtest results.

Returns a human-readable string containing key performance metrics including total return, Sharpe ratio with confidence intervals, maximum drawdown, and total number of trades.

Returns:

Name Type Description
str str

Formatted string with backtest summary statistics.

plot(figsize=(10, 5))

Plot the equity curve and drawdown charts.

Creates a two-panel plot showing:
  1. The equity curve over time
  2. The drawdown curve as a percentage

Parameters:

Name Type Description Default
figsize tuple

Figure size as (width, height) in inches. Defaults to (10, 5).

(10, 5)
Note

This method uses matplotlib to display the plots and requires an interactive environment to show the figures.
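The drawdown panel can be derived directly from the equity series; a minimal sketch with a hypothetical helper (not the library's own plotting code):

```python
import pandas as pd

def drawdown_pct(equity: pd.Series) -> pd.Series:
    """Drawdown at each point, as a percentage below the running peak."""
    peak = equity.cummax()
    return (equity - peak) / peak * 100.0

equity = pd.Series([100.0, 110.0, 99.0, 121.0])
print(drawdown_pct(equity).round(6).tolist())  # [0.0, 0.0, -10.0, 0.0]
```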

plot_trades(figsize=(16, 8), style='yahoo', title=None, start_date=None, end_date=None, volume=True)

Plot the price chart with trade entry and exit markers.

Creates a candlestick chart showing all trades with:
  • Green triangles (^) for buy entries
  • Red triangles (v) for sell exits
  • A position is closed when an order in the opposite direction is executed

Parameters:

Name Type Description Default
figsize tuple

Figure size as (width, height) in inches. Defaults to (16, 8).

(16, 8)
style str

mplfinance style name. Defaults to "yahoo".

'yahoo'
title str

Chart title. Defaults to None (uses default title).

None
start_date str

Start date filter (e.g., '2026-03-01'). Defaults to None (show all).

None
end_date str

End date filter (e.g., '2026-03-31'). Defaults to None (show all).

None
volume bool

Whether to show volume subplot. Defaults to True.

True

Raises:

Type Description
ValueError

If no OHLC data is available in the backtest report.

ValueError

If no orders are available.

CommissionType

Bases: Enum

Enumeration for commission calculation types.

This enum defines how commissions are calculated for trades in the backtesting framework.

Attributes:

Name Type Description
PERCENTAGE int

Commission calculated as percentage of trade value (0). Commission = trade_value * commission_rate

CASH int

Commission calculated as fixed cash amount per lot (1). Commission = quantity * commission_rate / lot_size
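Both formulas can be exercised in isolation; the sketch below mirrors the enum and the two formulas stated above, but is not the library's internals:

```python
from enum import Enum

class CommissionType(Enum):
    PERCENTAGE = 0
    CASH = 1

def commission(trade_value: float, quantity: float, rate: float,
               ctype: CommissionType, lot_size: int = 1) -> float:
    """Apply the two commission formulas documented above."""
    if ctype is CommissionType.PERCENTAGE:
        return trade_value * rate        # percentage of trade value
    return quantity * rate / lot_size    # fixed cash amount per lot

# 100 shares at $50 with a 0.2% rate -> $10 commission
print(commission(5000, 100, 0.002, CommissionType.PERCENTAGE))  # 10.0
# 100 shares, $1.50 per lot of 100 -> $1.50 commission
print(commission(5000, 100, 1.5, CommissionType.CASH, lot_size=100))  # 1.5
```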

DataSplitMode

Bases: Enum

Enumeration for data split modes in optimization.

MonteCarloMode

Bases: Enum

Monte Carlo simulation modes.

Attributes:

Name Type Description
TRADE_ORDER

Randomize the sequence of trade execution

PRICE_PATH

Resample price returns to create synthetic paths

BOTH

Run both analyses and combine results

MonteCarloResult dataclass

Container for Monte Carlo simulation results.

This class holds the complete results of a Monte Carlo simulation, including equity curves from all simulations, summary statistics, and visualization methods.

Attributes:

Name Type Description
mode MonteCarloMode

The simulation mode used.

equity_curves list[Series]

List of equity curves from each simulation.

summary_stats dict

Summary statistics including mean, std, min, max of final returns.

percentile_results dict

Percentile values for final returns (5th, 25th, 50th, 75th, 95th).

original_equity Series

The original backtest equity curve for comparison.

simulations int

Number of simulations run.

__str__()

Generate formatted string summary of Monte Carlo results.

plot(figsize=(12, 8), show_original=True, show_percentiles=True, max_curves=None)

Plot all Monte Carlo simulation equity curves.

Creates a plot showing equity curves from all simulations with transparency. Where many curves overlap, the plot appears darkest, highlighting the central tendency of outcomes.

Parameters:

Name Type Description Default
figsize tuple

Figure size as (width, height) in inches. Defaults to (12, 8).

(12, 8)
show_original bool

Whether to overlay the original equity curve. Defaults to True.

True
show_percentiles bool

Whether to show percentile bands. Defaults to True.

True
max_curves int | None

Maximum number of simulation curves to render. Defaults to self.plot_max_curves.

None
Note

This method uses matplotlib to display the plots and requires an interactive environment to show the figures.

probabilities(target_return, drawdown_threshold, horizon=None, as_percent=True)

Calculate the probability of reaching a target return and exceeding a drawdown threshold within a given time horizon.

Parameters:

Name Type Description Default
target_return float

Target return threshold. If as_percent is True, this is treated as a decimal return (e.g. 0.05 for 5%).

required
drawdown_threshold float

Drawdown threshold. If as_percent is True, this is treated as a decimal drawdown (e.g. 0.05 for 5%).

required
horizon int | str | Timedelta | None

Evaluation horizon. If an integer is provided, it is treated as a number of steps. If a string or Timedelta is provided, it is treated as a time span relative to the first timestamp in each equity curve.

None
as_percent bool

Whether thresholds are provided as decimal percentages. Defaults to True.

True

Returns:

Name Type Description
dict dict

Probability summary containing return and drawdown metrics.
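As a conceptual sketch of what these two probabilities measure, the hypothetical helper below operates on a list of equity curves; the real method additionally applies the horizon and as_percent options:

```python
import pandas as pd

def hit_probabilities(curves, target_return, drawdown_threshold):
    """Fraction of curves that (a) reach the target return and
    (b) ever breach the drawdown threshold. Illustration only."""
    hit_target = hit_dd = 0
    for eq in curves:
        ret = eq.iloc[-1] / eq.iloc[0] - 1.0
        dd = ((eq - eq.cummax()) / eq.cummax()).min()  # most negative
        hit_target += ret >= target_return
        hit_dd += -dd >= drawdown_threshold
    n = len(curves)
    return {"p_target": hit_target / n, "p_drawdown": hit_dd / n}

curves = [pd.Series([100.0, 108.0]), pd.Series([100.0, 90.0, 103.0])]
print(hit_probabilities(curves, target_return=0.05, drawdown_threshold=0.05))
# {'p_target': 0.5, 'p_drawdown': 0.5}
```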

OptimizationResult dataclass

Container for optimization results with train/validate/test splits.

This class holds the complete results of an optimization run that includes evaluation on all three data splits, enabling proper model selection and generalization assessment.

Attributes:

Name Type Description
best_params dict

Best parameter values found.

train_report Any

Backtest report for training data.

validate_report Any

Backtest report for validation data.

test_report Any

Backtest report for test data.

train_metrics dict

Computed metrics for training performance.

validate_metrics dict

Computed metrics for validation performance.

test_metrics dict

Computed metrics for test performance.

all_results DataFrame

DataFrame with all parameter combinations and their metrics for each split.

SimpleBacktester

Simple backtester for executing trading strategies on historical data.

This class provides functionality to run backtests on trading strategies, calculate performance metrics, and perform parameter optimization through grid search (both sequential and parallel).

The backtester simulates realistic trading conditions including:
  • Order execution with market and limit orders
  • Commission calculations
  • Position management
  • Margin calls
  • P&L tracking
  • Leverage for amplified position sizing

Example

from quantex import SimpleBacktester, CSVDataSource

# Create strategy and data source
source = CSVDataSource("data.csv")
strategy = MyStrategy()  # Your custom strategy

bt = SimpleBacktester(strategy, cash=10000)
report = bt.run()
print(report)

__init__(strategy, cash=10000, commission=0.002, commission_type=CommissionType.PERCENTAGE, lot_size=1, margin_call=0.5, leverage=1.0)

Initialize the backtester with strategy and configuration parameters.

Parameters:

Name Type Description Default
strategy Strategy

Trading strategy to backtest. Must implement the Strategy interface with init() and next() methods.

required
cash float

Initial cash amount. Defaults to 10,000.

10000
commission float

Commission rate per trade. Defaults to 0.002 (0.2%).

0.002
commission_type CommissionType

Type of commission calculation. Can be CommissionType.PERCENTAGE or CommissionType.CASH. Defaults to CommissionType.PERCENTAGE.

PERCENTAGE
lot_size int

Size of trading lots. Defaults to 1.

1
margin_call float

Margin call threshold as fraction of cash value. Defaults to 0.5 (50%).

0.5
leverage float

Leverage multiplier for position sizing. Defaults to 1.0 (no leverage). For example:
  • 2.0 = 2x leverage (control 2x the position with same cash)
  • 0.5 = half leverage (control half the position)

1.0

Raises:

Type Description
ValueError

If strategy is None or commission rate is negative.

monte_carlo(simulations=100, mode=MonteCarloMode.BOTH, seed=None, progress_bar=False)

Run Monte Carlo simulation on the strategy.

This method runs multiple simulations to test strategy robustness using either trade order randomization, price path resampling, or both.

Parameters:

Name Type Description Default
simulations int

Number of simulations to run. Defaults to 100.

100
mode MonteCarloMode | str

Simulation mode. Defaults to "both". Options:
  • "trade_order": Randomize trade execution order
  • "price_path": Resample price returns to create synthetic paths
  • "both": Run both analyses and combine results

BOTH
seed int | None

Random seed for reproducibility. Defaults to None.

None
progress_bar bool

Whether to show progress bar during simulation. Defaults to False.

False

Returns:

Name Type Description
MonteCarloResult MonteCarloResult

Object containing:
  • equity_curves: List of equity curves from each simulation
  • summary_stats: Mean, std, min, max of final returns
  • percentile_results: 5th, 25th, 50th, 75th, 95th percentiles
  • plot(): Visualization method for the "spaghetti plot"

Example

from quantex import SimpleBacktester, CSVDataSource

source = CSVDataSource("data.csv")

# Create and configure strategy
bt = SimpleBacktester(strategy, cash=10000)
result = bt.monte_carlo(simulations=500, mode="both")
print(result)  # Print summary statistics
result.plot()  # Show spaghetti plot

Note
  • Trade order randomization shuffles when trades execute while keeping the same trades
  • Price path resampling creates synthetic market scenarios from historical returns
  • Both methods help identify if strategy performance is robust or dependent on specific conditions
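The two randomization ideas above can be illustrated with NumPy; this is a conceptual sketch, not the library's internals:

```python
import numpy as np

rng = np.random.default_rng(seed=7)

# trade_order: shuffle per-trade P&L -- the total is unchanged, but the
# equity path (and hence drawdowns) differs in each simulation
trade_pnls = np.array([50.0, -20.0, 30.0, -10.0])
shuffled = rng.permutation(trade_pnls)
print(shuffled.sum() == trade_pnls.sum())  # True: same trades, new order

# price_path: bootstrap historical returns into a synthetic price series
returns = np.array([0.01, -0.005, 0.02, -0.01])
sampled = rng.choice(returns, size=returns.size, replace=True)
synthetic_prices = 100.0 * np.cumprod(1.0 + sampled)
```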

optimize(params, constraint=None, objective='sharpe', risk_tolerance=None)

Perform a grid search over the provided parameter ranges.

This method systematically tests all combinations of parameter values to find the optimal configuration for the trading strategy. Each parameter combination is backtested individually to evaluate performance.

Parameters:

Name Type Description Default
params dict[str, range]

Dictionary mapping strategy attribute names to iterables of candidate values. For example:

{
    'fast_period': range(5, 21, 5),    # [5, 10, 15, 20]
    'slow_period': range(20, 51, 10),  # [20, 30, 40, 50]
    'threshold': np.linspace(0.01, 0.1, 10)
}
required
constraint Callable[[dict[str, Any]], bool] | None

Optional callable that takes a candidate parameter dict and returns True to evaluate the combo or False to skip it. Useful for enforcing logical constraints like ensuring fast_period < slow_period. Defaults to None (no constraints).

None
objective str

BacktestReport attribute or computed metric to optimize. Defaults to "sharpe". Supports any attribute exposed by BacktestReport and the computed metrics "final_cash", "total_return", "sharpe", "max_drawdown", and "trades".

'sharpe'
risk_tolerance dict[str, float] | None

Optional maximum allowed values for candidate metrics. Any candidate that exceeds a threshold is discarded before scoring. For example, {"max_drawdown": 0.05} rejects strategies with drawdown above 5%. Defaults to None.

None

Returns:

Name Type Description
OptimizationResult OptimizationResult

Object containing:
  • best_params: Best parameter values found
  • train_report: BacktestReport for the best parameters
  • validate_report: None for single-split optimization
  • test_report: None for single-split optimization
  • train_metrics: Metrics computed for best parameters
  • validate_metrics: Empty dict for single-split optimization
  • test_metrics: Empty dict for single-split optimization
  • all_results: DataFrame with all parameter combinations

Raises:

Type Description
ValueError

If params is empty or contains parameters with no values.

TypeError

If any parameter values are not iterable.

Note

The optimization uses the selected objective as the primary selection criterion. If the objective is invalid (NaN), the candidate is skipped.

Example

bt = SimpleBacktester(strategy)
result = bt.optimize(
    {'fast_period': [5, 10, 20], 'slow_period': [20, 50, 100]},
    constraint=lambda p: p['fast_period'] < p['slow_period'],
)
print(f"Best parameters: {result.best_params}")
print(f"Best Sharpe ratio: {result.train_metrics['sharpe']}")

optimize_gradient_descent(param_init, param_bounds, objective='sharpe', learning_rate=0.01, max_iterations=100, tolerance=1e-06, momentum=0.9, train_ratio=0.7, validate_ratio=0.15, test_ratio=0.15, selection_criterion='validate', progress_bar=True, integer_params=None)

Optimize strategy parameters using gradient descent.

This method performs gradient-based optimization on continuous strategy parameters, similar to machine learning workflows. It uses train/validate/test splits to prevent overfitting and select the best model based on validation performance.

The optimization computes numerical gradients by evaluating small perturbations around the current parameter values.
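The perturbation scheme can be sketched as a central-difference gradient; this is an illustration, and the actual step size and differencing scheme may differ:

```python
def numerical_gradient(objective, params, eps=1e-4):
    """Central-difference gradient of a scalar objective w.r.t. each
    parameter. Hypothetical helper, not the library's implementation."""
    grad = {}
    for name, value in params.items():
        up = dict(params, **{name: value + eps})
        down = dict(params, **{name: value - eps})
        grad[name] = (objective(up) - objective(down)) / (2 * eps)
    return grad

# Objective with a known gradient: f(x) = -(x - 3)^2, so df/dx = -2(x - 3)
g = numerical_gradient(lambda p: -(p["x"] - 3.0) ** 2, {"x": 1.0})
print(round(g["x"], 6))  # 4.0
```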

Parameters:

Name Type Description Default
param_init dict[str, float]

Initial parameter values.

required
param_bounds dict[str, tuple[float, float]]

Bounds for each parameter as (min, max) tuples.

required
objective str

Metric to optimize. Defaults to "sharpe". Supports same metrics as optimize().

'sharpe'
learning_rate float

Step size for gradient descent. Defaults to 0.01.

0.01
max_iterations int

Maximum number of iterations. Defaults to 100.

100
tolerance float

Convergence tolerance. Optimization stops when gradient magnitude falls below this threshold. Defaults to 1e-6.

1e-06
momentum float

Momentum factor for accelerated descent. Defaults to 0.9.

0.9
train_ratio float

Fraction of data for training. Defaults to 0.7 (70%).

0.7
validate_ratio float

Fraction of data for validation. Defaults to 0.15 (15%).

0.15
test_ratio float

Fraction of data for testing. Defaults to 0.15 (15%).

0.15
selection_criterion str

Which split to use for final parameter selection. Options: "train", "validate", "test". Defaults to "validate".

'validate'
progress_bar bool

Whether to show progress bar. Defaults to True.

True
integer_params set[str] | None

Set of parameter names that should be treated as integers. These parameters will be rounded to the nearest integer after each gradient update. Defaults to None (all parameters are continuous).

None

Returns:

Name Type Description
OptimizationResult OptimizationResult

Object containing:
  • best_params: Optimized parameter values
  • train_report: BacktestReport for training data
  • validate_report: BacktestReport for validation data
  • test_report: BacktestReport for test data
  • train_metrics: Metrics computed on training data
  • validate_metrics: Metrics computed on validation data
  • test_metrics: Metrics computed on test data
  • all_results: DataFrame with iteration history

Example

# Optimize with integer parameters
result = bt.optimize_gradient_descent(
    param_init={'fast_period': 10.0, 'slow_period': 30.0},
    param_bounds={
        'fast_period': (2.0, 50.0),
        'slow_period': (10.0, 100.0),
    },
    integer_params={'fast_period', 'slow_period'},
    learning_rate=0.05,
    max_iterations=50,
)
print(f"Optimized params: {result.best_params}")
print(f"Final validation Sharpe: {result.validate_metrics['sharpe']}")

optimize_optuna(param_space, n_trials=100, objective='sharpe', risk_tolerance=None, constraint=None, timeout=None, random_seed=None, workers=None, progress_bar=True, verbose=False)

Optimize strategy parameters using Optuna (Bayesian optimization).

This method uses Optuna's optimization framework with TPE (Tree-structured Parzen Estimator) sampler for intelligent parameter search. It typically finds better solutions than grid search with fewer evaluations.

The method supports:
  • Continuous parameter ranges (sampled uniformly)
  • Discrete/categorical parameter lists
  • Early pruning of unpromising trials
  • Parallel execution for faster optimization

Parameters:

Name Type Description Default
param_space dict[str, tuple[Any, Any] | list[Any]]

Parameter search space. Each entry can be:
  • Continuous range: a (min, max) tuple for uniform sampling
  • Discrete list: [val1, val2, ...] for categorical sampling
Example: {'period': (5, 50), 'threshold': [0.01, 0.02, 0.05]}

required
n_trials int

Maximum number of optimization trials. Defaults to 100.

100
objective str

Metric to optimize. Defaults to "sharpe". Supports: "final_cash", "total_return", "sharpe", "max_drawdown", "trades".

'sharpe'
risk_tolerance dict[str, float] | None

Maximum allowed values for risk metrics. Trials exceeding thresholds are pruned. Defaults to None.

None
constraint Callable[[dict[str, Any]], bool] | None

Optional callable to enforce parameter constraints. Defaults to None.

None
timeout int | None

Maximum time in seconds for optimization. Defaults to None (no limit).

None
random_seed int | None

Random seed for reproducibility. Defaults to None.

None
workers int | None

Number of parallel workers for Optuna study. Defaults to None (sequential).

None
progress_bar bool

Whether to show progress bar. Defaults to True.

True
verbose bool

Whether to show Optuna trial logs. Defaults to False (suppresses verbose output).

False

Returns:

Name Type Description
OptimizationResult OptimizationResult

Object containing:
  • best_params: Best parameter values found
  • train_report: BacktestReport for best parameters (None for Optuna)
  • validate_report: None
  • test_report: None
  • train_metrics: Metrics for best parameters
  • validate_metrics: Empty dict
  • test_metrics: Empty dict
  • all_results: DataFrame with all trial results

Performance Notes
  • Optuna typically finds good solutions in 50-200 trials
  • For 10,000+ grid combos, Optuna can be 50-100x faster
  • Use workers > 1 for parallel trial evaluation
  • Pruning callbacks significantly speed up optimization
Example

# Optimize with continuous and discrete parameters
result = bt.optimize_optuna({
    'fast_period': (5, 50),       # Continuous: 5-50
    'slow_period': [20, 30, 50],  # Discrete: pick one
    'threshold': (0.01, 0.1),     # Continuous: 1%-10%
}, n_trials=100)
print(f"Best params: {result.best_params}")
print(f"Best Sharpe: {result.train_metrics['sharpe']}")

Note

Requires optuna package: pip install optuna

optimize_parallel(params, constraint=None, objective='sharpe', risk_tolerance=None, workers=None, chunksize='auto')

Perform parallel grid search over parameter ranges for optimization.

This method is identical to optimize() but uses multiprocessing to distribute parameter combinations across multiple worker processes, significantly reducing computation time for large parameter spaces.

Parameters:

Name Type Description Default
params dict[str, range]

Dictionary mapping strategy attribute names to iterables of candidate values (same format as optimize()).

required
constraint Callable[[dict[str, Any]], bool] | None

Optional callable for parameter constraints (same as optimize()). Defaults to None.

None
objective str

BacktestReport attribute or computed metric to optimize. Defaults to "sharpe".

'sharpe'
risk_tolerance dict[str, float] | None

Optional maximum allowed metric values for candidate rejection before scoring. Defaults to None.

None
workers int | None

Maximum number of worker processes to use. If None, defaults to min(os.cpu_count()-1, 4) to avoid overwhelming the system. Defaults to None.

None
chunksize int | str

Chunk size for ProcessPoolExecutor.map. Can be an integer or "auto" for adaptive sizing based on total combinations and worker count. Smaller values provide better load balancing for many small tasks. Larger values reduce IPC overhead. Defaults to "auto" (previously 1). Auto-calculation: max(16, total_combos // (workers * 4))

'auto'
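The "auto" rule quoted above is straightforward to restate; this sketch reproduces the formula, not the library's code:

```python
def auto_chunksize(total_combos: int, workers: int) -> int:
    # The stated rule: max(16, total_combos // (workers * 4))
    return max(16, total_combos // (workers * 4))

print(auto_chunksize(10_000, workers=4))  # 625 -- large grids get big chunks
print(auto_chunksize(200, workers=4))     # 16  -- the floor caps IPC overhead
```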

Returns:

Name Type Description
OptimizationResult OptimizationResult

Object containing:
  • best_params: Best parameter values found
  • train_report: BacktestReport for the best parameters
  • validate_report: None for single-split optimization
  • test_report: None for single-split optimization
  • train_metrics: Metrics computed for best parameters
  • validate_metrics: Empty dict for single-split optimization
  • test_metrics: Empty dict for single-split optimization
  • all_results: DataFrame with all parameter combinations

Raises:

Type Description
ValueError

If params is empty or contains parameters with no values.

TypeError

If any parameter values are not iterable.

Note
  • This method creates separate processes, so the strategy must be picklable for multiprocessing to work.
  • The main process re-runs the best configuration to get the full BacktestReport (parallel workers only return summary metrics).
  • Uses ProcessPoolExecutor for true parallelism across CPU cores.
  • Memory usage scales with the number of workers as each worker maintains a copy of the strategy.
Performance Tips
  • For parameter spaces with many combinations (>1000), prefer optimize_parallel over optimize for better performance.
  • For small parameter spaces, optimize() may be faster due to lower multiprocessing overhead.
  • Monitor system memory usage as each worker maintains a full copy of the strategy and data.
  • Auto chunksize provides better throughput for large parameter spaces.
Example

bt = SimpleBacktester(strategy)

# Use 4 workers for parallel optimization
result = bt.optimize_parallel(
    {'period1': range(5, 50, 5), 'period2': range(20, 100, 10)},
    workers=4,
)

optimize_with_split(params, constraint=None, objective='sharpe', risk_tolerance=None, train_ratio=0.6, validate_ratio=0.2, test_ratio=0.2, selection_criterion='validate')

Optimize strategy parameters using train/validate/test splits.

This method implements ML-style optimization with three data splits:
  • Training set: Used to fit strategy parameters
  • Validation set: Used to select the best parameters
  • Test set: Used for final out-of-sample evaluation

This approach helps prevent overfitting by evaluating generalization performance on held-out data before final selection.

Parameters:

Name Type Description Default
params dict[str, range]

Dictionary mapping strategy attribute names to iterables of candidate values (same format as optimize()).

required
constraint Callable[[dict[str, Any]], bool] | None

Optional callable for parameter constraints. Defaults to None.

None
objective str

Metric to optimize. Defaults to "sharpe". Supports same metrics as optimize().

'sharpe'
risk_tolerance dict[str, float] | None

Optional maximum allowed metric values. Defaults to None.

None
train_ratio float

Fraction of data for training. Defaults to 0.6 (60%).

0.6
validate_ratio float

Fraction of data for validation. Defaults to 0.2 (20%).

0.2
test_ratio float

Fraction of data for testing. Defaults to 0.2 (20%).

0.2
selection_criterion str

Which split to use for final parameter selection. Options: "train", "validate", "test". Defaults to "validate".

'validate'

Returns:

Name Type Description
OptimizationResult OptimizationResult

Object containing:
  • best_params: Best parameters found
  • train_report: BacktestReport for training data
  • validate_report: BacktestReport for validation data
  • test_report: BacktestReport for test data
  • train_metrics: Metrics computed on training data
  • validate_metrics: Metrics computed on validation data
  • test_metrics: Metrics computed on test data
  • all_results: DataFrame with all results

Raises:

Type Description
ValueError

If split ratios don't sum to 1.0 or selection_criterion is invalid.

Example

bt = SimpleBacktester(strategy)
result = bt.optimize_with_split(
    {'fast_period': [5, 10, 15], 'slow_period': [20, 30, 50]},
    selection_criterion='validate',
)
print(f"Best params: {result.best_params}")
print(f"Train Sharpe: {result.train_metrics['sharpe']}")
print(f"Validate Sharpe: {result.validate_metrics['sharpe']}")
print(f"Test Sharpe: {result.test_metrics['sharpe']}")

run(progress_bar=False)

Execute the backtest for the configured strategy.

This method runs the complete backtest simulation, iterating through all data points in the strategy's data sources, executing strategy logic, processing orders, and tracking performance metrics.

Parameters:

Name Type Description Default
progress_bar bool

Whether to show a progress bar during backtest execution. Useful for long-running backtests. Defaults to False.

False

Returns:

Name Type Description
BacktestReport BacktestReport

Object containing complete backtest results including:
  • Starting and final cash amounts
  • P&L record over time
  • List of all executed orders
  • Calculated performance metrics

Note

This method modifies the internal state of the strategy and should not be called multiple times on the same instance without resetting.

walk_forward_analyze(params, train_periods, test_periods, step_periods=None, constraint=None, objective='sharpe', risk_tolerance=None, min_train_periods=30, min_test_periods=10, progress_bar=True, **optimizer_kwargs)

Perform walk-forward analysis on the strategy.

Walk-forward analysis is a rigorous method for evaluating trading strategies that simulates real-world deployment conditions. It uses rolling windows to:
  1. Train: Optimize parameters on historical data
  2. Test: Evaluate best parameters on unseen future data

This method provides a convenient way to run walk-forward analysis directly on the backtester instance without needing to create a separate WalkForwardAnalyzer.

Parameters:

Name Type Description Default
params dict[str, range]

Dictionary mapping strategy attribute names to iterables of candidate values. For example:

{
    'fast_period': [5, 10, 15, 20],
    'slow_period': [20, 30, 40, 50],
    'threshold': [0.01, 0.02, 0.05]
}
required
train_periods int

Number of periods for each training window. This is the lookback window used for parameter optimization.

required
test_periods int

Number of periods for each test window. This is the forward-looking window for out-of-sample evaluation.

required
step_periods int | None

Number of periods to step forward between windows. If None, uses test_periods (non-overlapping windows). Defaults to None.

None
constraint Callable[[dict[str, Any]], bool] | None

Optional callable that takes a parameter dict and returns True to evaluate or False to skip. Useful for enforcing logical constraints. Defaults to None.

None
objective str

Metric to optimize. Defaults to "sharpe". Supports: "final_cash", "total_return", "sharpe", "max_drawdown", "trades".

'sharpe'
risk_tolerance dict[str, float] | None

Optional maximum allowed values for risk metrics. Defaults to None.

None
min_train_periods int

Minimum required training periods. Defaults to 30.

30
min_test_periods int

Minimum required test periods. Defaults to 10.

10
progress_bar bool

Whether to show progress bar. Defaults to True.

True
**optimizer_kwargs Any

Additional keyword arguments passed to the optimizer. Can include:
  • workers (int): For parallel optimization
  • n_trials (int): For Optuna optimization
  • timeout (int): For Optuna timeout
  • random_seed (int): For reproducibility

{}

Returns:

Name Type Description
WalkForwardResult WalkForwardResult

Object containing:
  • n_windows: Total number of walk-forward windows
  • train_periods: Periods in each training window
  • test_periods: Periods in each test window
  • window_results: List of WalkForwardWindow objects
  • aggregated_metrics: Aggregated statistics across windows
  • all_windows_results_df: DataFrame with all results
  • plot(): Visualization method for results

Example

result = bt.walk_forward_analyze(
    params={'fast': [5, 10, 20], 'slow': [20, 50, 100]},
    train_periods=252,  # 1 year training
    test_periods=63,    # 3 months testing
    objective='sharpe',
)
print(result)
print(f"Average OOS Sharpe: {result.aggregated_metrics['out_of_sample_sharpe_mean']:.2f}")
result.plot()

Run with parallel optimization

result = bt.walk_forward_analyze(
    params={'period': range(5, 50, 5)},
    train_periods=252,
    test_periods=63,
    workers=4,  # Use parallel optimization
)

Run with Optuna optimization

result = bt.walk_forward_analyze(
    params={'period': (5, 50)},
    train_periods=252,
    test_periods=63,
    n_trials=100,  # Optuna-specific
)

Note
  • Walk-forward analysis helps detect overfitting by evaluating parameter stability and out-of-sample performance over multiple time windows
  • A stability ratio (OOS/IS Sharpe) close to 1.0 indicates robust parameters
  • Use step_periods < test_periods for overlapping windows with more samples
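The stability ratio mentioned above is a simple quotient; the helper name below is hypothetical:

```python
def stability_ratio(oos_sharpe: float, is_sharpe: float) -> float:
    """Out-of-sample Sharpe divided by in-sample Sharpe.

    Values near 1.0 suggest the optimized parameters generalize;
    assumes a positive in-sample Sharpe.
    """
    return oos_sharpe / is_sharpe

print(stability_ratio(0.9, 1.0))            # 0.9 -> parameters look robust
print(round(stability_ratio(0.3, 1.5), 2))  # 0.2 -> likely overfit
```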

TrainValidateTestSplit dataclass

Container for train/validate/test data splits.

This class holds the split configuration and indices for dividing historical data into training, validation, and test sets for machine learning-style optimization workflows.

Attributes:

Name Type Description
train_start int

Starting index for training data.

train_end int

Ending index for training data.

validate_start int

Starting index for validation data.

validate_end int

Ending index for validation data.

test_start int

Starting index for test data.

test_end int

Ending index for test data.

train_ratio float

Ratio of data used for training.

validate_ratio float

Ratio of data used for validation.

test_ratio float

Ratio of data used for testing.

__post_init__()

Validate split ratios sum to 1.0.
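A sketch of the kind of check such a __post_init__ performs; the helper name, tolerance, and message are assumptions:

```python
import math

def validate_ratios(train_ratio: float, validate_ratio: float,
                    test_ratio: float) -> None:
    """Raise ValueError if the three split ratios do not sum to 1.0."""
    total = train_ratio + validate_ratio + test_ratio
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"Split ratios must sum to 1.0, got {total}")

validate_ratios(0.6, 0.2, 0.2)  # passes silently
```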

WalkForwardAnalyzer

Analyzer for walk-forward optimization.

This class performs walk-forward analysis on a trading strategy, evaluating parameter optimization over rolling windows. It can work with any optimizer function that follows the OptimizerProtocol.

Walk-forward analysis is a rigorous method for evaluating trading strategies that simulates real-world deployment conditions. Each window represents a complete cycle of:
  1. Training: Optimize parameters on historical data
  2. Testing: Evaluate best parameters on unseen future data

Attributes:

Name Type Description
backtester SimpleBacktester

The backtester instance to use.

train_periods PeriodSpec

Number of periods or timedelta for each training window. Can be an int (number of periods) or a timedelta (e.g., pd.Timedelta('252 days') or timedelta(days=365)).

test_periods PeriodSpec

Number of periods or timedelta for each test window. Can be an int or timedelta.

step_periods PeriodSpec

Number of periods or timedelta to step forward between windows. If None, uses test_periods (non-overlapping windows).

min_train_periods int

Minimum required training periods.

min_test_periods int

Minimum required test periods.

selection_criterion str

Metric to use for selecting best parameters.

Example

from quantex.backtester.walk_forward import WalkForwardAnalyzer
from quantex import SimpleBacktester
import pandas as pd

Create analyzer with grid search optimizer (using periods)

analyzer = WalkForwardAnalyzer(
    backtester=bt,
    train_periods=252,  # 1 year training (252 trading days)
    test_periods=63,    # 3 months testing
    step_periods=63,    # Move forward 3 months each window
)

Or using timedelta (data frequency is inferred from the data)

>>> analyzer = WalkForwardAnalyzer(
...     backtester=bt,
...     train_periods=pd.Timedelta('252 days'),  # 1 year training
...     test_periods=pd.Timedelta('90 days'),    # 3 months testing
...     step_periods=pd.Timedelta('90 days')     # Move forward 3 months
... )

>>> result = analyzer.analyze(
...     optimizer=lambda bt, params: bt.optimize(params),
...     params={'fast': [5, 10, 20], 'slow': [20, 50, 100]}
... )

View aggregated results

>>> print(result)
>>> print(f"Average out-of-sample Sharpe: {result.aggregated_metrics['out_of_sample_sharpe_mean']:.2f}")

step_periods property

Resolved step_periods as a number of periods (int).

step_periods_spec property

Original step_periods specification (int or timedelta).

test_periods property

Resolved test_periods as a number of periods (int).

test_periods_spec property

Original test_periods specification (int or timedelta).

train_periods property

Resolved train_periods as a number of periods (int).

train_periods_spec property

Original train_periods specification (int or timedelta).

__init__(backtester, train_periods, test_periods, step_periods=None, min_train_periods=30, min_test_periods=10, selection_criterion='sharpe')

Initialize the WalkForwardAnalyzer.

analyze(optimizer, params, constraint=None, objective='sharpe', risk_tolerance=None, progress_bar=True, **optimizer_kwargs)

Perform walk-forward analysis using the specified optimizer.

This method runs walk-forward optimization over multiple rolling windows. For each window:

1. Train: Optimize parameters using the specified optimizer on training data
2. Test: Evaluate the best parameters on out-of-sample test data

Parameters:

Name Type Description Default
optimizer OptimizerProtocol

Optimizer function to use. Should accept a SimpleBacktester and params dict, and return an OptimizationResult. Can be any of: bt.optimize, bt.optimize_parallel, bt.optimize_optuna, or a custom optimizer function.

required
params dict[str, Any]

Parameter space for optimization.

required
constraint Callable[[dict], bool] | None

Optional constraint function for parameter validation. Defaults to None.

None
objective str

Metric to optimize. Defaults to "sharpe".

'sharpe'
risk_tolerance dict[str, float] | None

Risk tolerance constraints. Defaults to None.

None
progress_bar bool

Whether to show progress bar. Defaults to True.

True
**optimizer_kwargs Any

Additional keyword arguments passed to optimizer.

{}

Returns:

Name Type Description
WalkForwardResult WalkForwardResult

Object containing:

- window_results: List of WalkForwardWindow objects for each window
- aggregated_metrics: Aggregated statistics across all windows
- all_windows_results_df: DataFrame with all results

Example

Using grid search:

>>> result = analyzer.analyze(
...     optimizer=lambda bt, params: bt.optimize(params),
...     params={'fast': [5, 10, 20], 'slow': [20, 50, 100]},
...     objective='sharpe'
... )

Using Optuna:

>>> result = analyzer.analyze(
...     optimizer=lambda bt, params: bt.optimize_optuna(
...         {k: v if isinstance(v[0], (int, float)) else v
...          for k, v in params.items()}
...     ),
...     params={'fast': (5, 50), 'slow': (20, 100)},
...     n_trials=50
... )

WalkForwardResult dataclass

Container for walk-forward optimization results.

This class holds the complete results of a walk-forward optimization run, including per-window results and aggregated statistics.

Attributes:

Name Type Description
n_windows int

Total number of walk-forward windows.

train_periods int

Number of periods in each training window (computed value).

test_periods int

Number of periods in each test window (computed value).

train_periods_spec PeriodSpec

Original specification for train periods (can be int or timedelta).

test_periods_spec PeriodSpec

Original specification for test periods (can be int or timedelta).

window_results list[WalkForwardWindow]

Results for each window.

aggregated_metrics dict

Aggregated statistics across all windows.

all_windows_results_df DataFrame

DataFrame with results from all windows.

in_sample_returns property

Total returns from training (in-sample) for each window.

in_sample_sharpe property

Sharpe ratios from training (in-sample) for each window.

out_of_sample_returns property

Total returns from testing (out-of-sample) for each window.

out_of_sample_sharpe property

Sharpe ratios from testing (out-of-sample) for each window.

__str__()

Generate a formatted string summary of walk-forward results.

get_param_stability(param_name)

Analyze the stability of a parameter across windows.

Parameters:

Name Type Description Default
param_name str

Name of the parameter to analyze.

required

Returns:

Name Type Description
dict dict

Dictionary with 'mean', 'std', 'min', 'max', 'cv' (coefficient of variation) for the parameter across windows.
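The statistics themselves are standard descriptive measures over the best value found for the parameter in each window. A sketch of the arithmetic (function name and plain-list input are assumptions; the real method pulls the values from `window_results`):

```python
import statistics

def param_stability(values):
    """Descriptive stats for one parameter's best value across windows."""
    mean = statistics.mean(values)
    std = statistics.stdev(values) if len(values) > 1 else 0.0
    return {
        "mean": mean,
        "std": std,
        "min": min(values),
        "max": max(values),
        # coefficient of variation: std relative to the mean; lower
        # values suggest the optimizer picks similar parameters per window
        "cv": std / mean if mean else float("inf"),
    }

stats = param_stability([10, 12, 10, 14, 10])
print(round(stats["cv"], 3))
```

A low `cv` across windows is a useful robustness signal: it suggests the chosen parameter is not an artifact of one particular training segment.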

plot(figsize=(14, 8))

Plot walk-forward analysis results.

Creates a multi-panel figure showing:

1. In-sample vs out-of-sample Sharpe ratios
2. In-sample vs out-of-sample returns
3. Parameter stability over time

Parameters:

Name Type Description Default
figsize tuple

Figure size as (width, height) in inches. Defaults to (14, 8).

(14, 8)

WalkForwardWindow dataclass

Container for a single walk-forward window.

Attributes:

Name Type Description
window_index int

Index of this window (0-based).

train_start int

Starting index for training data.

train_end int

Ending index for training data.

test_start int

Starting index for test data.

test_end int

Ending index for test data.

train_periods int

Number of periods in training.

test_periods int

Number of periods in testing.

train_periods_spec PeriodSpec

Original specification for train periods (can be int or timedelta).

test_periods_spec PeriodSpec

Original specification for test periods (can be int or timedelta).

best_params dict

Best parameters found during training.

train_metrics dict

Metrics computed on training data.

test_metrics dict

Metrics computed on test (out-of-sample) data.

train_report Any

BacktestReport for training period.

test_report Any

BacktestReport for test period.

create_train_validate_test_split(data_length, train_ratio=0.6, validate_ratio=0.2, test_ratio=0.2)

Create indices for train/validate/test split.

This function divides the data indices into three sets for ML-style optimization: training (parameter fitting), validation (hyperparameter selection), and testing (final evaluation).

Parameters:

Name Type Description Default
data_length int

Total number of data points.

required
train_ratio float

Fraction of data for training. Defaults to 0.6 (60%).

0.6
validate_ratio float

Fraction of data for validation. Defaults to 0.2 (20%).

0.2
test_ratio float

Fraction of data for testing. Defaults to 0.2 (20%).

0.2

Returns:

Name Type Description
TrainValidateTestSplit TrainValidateTestSplit

Object containing start/end indices for each split.

Raises:

Type Description
ValueError

If ratios don't sum to 1.0 or are invalid.

Example

>>> split = create_train_validate_test_split(1000, 0.6, 0.2, 0.2)
>>> print(f"Train: {split.train_start}-{split.train_end}")
>>> print(f"Validate: {split.validate_start}-{split.validate_end}")
>>> print(f"Test: {split.test_start}-{split.test_end}")
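Under the hood the split is plain index arithmetic. A minimal sketch under that assumption (function name and tuple return are hypothetical; the real function returns a TrainValidateTestSplit dataclass):

```python
def train_validate_test_indices(data_length, train_ratio=0.6,
                                validate_ratio=0.2, test_ratio=0.2):
    """Return (start, end) index pairs for the three contiguous splits."""
    if abs(train_ratio + validate_ratio + test_ratio - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1.0")
    train_end = int(data_length * train_ratio)
    validate_end = train_end + int(data_length * validate_ratio)
    # The test split runs to the end so no rows are lost to rounding.
    return (0, train_end), (train_end, validate_end), (validate_end, data_length)

print(train_validate_test_indices(1000))  # → ((0, 600), (600, 800), (800, 1000))
```

Keeping the splits contiguous and in chronological order matters for time series: shuffling rows across the boundary would leak future information into training.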

max_drawdown(equity)

Calculate the maximum drawdown of an equity curve.

The maximum drawdown represents the largest peak-to-trough decline in the equity curve, expressed as a positive percentage.

Parameters:

Name Type Description Default
equity Series

Time series of equity values.

required

Returns:

Name Type Description
float float

Maximum drawdown as a positive percentage (e.g., 0.15 for 15%).

Example

>>> equity = pd.Series([100, 110, 95, 105, 90])
>>> max_drawdown(equity)
0.18181818181818182  # ~18.18% drawdown
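The calculation tracks the running peak and the worst fractional decline from it. A plain-Python sketch of the same arithmetic (the library's max_drawdown takes a pd.Series, but the logic is identical):

```python
def max_drawdown_sketch(equity):
    """Largest peak-to-trough decline as a positive fraction."""
    peak = float("-inf")
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                 # highest equity seen so far
        worst = max(worst, (peak - value) / peak)
    return worst

print(max_drawdown_sketch([100, 110, 95, 105, 90]))  # → 0.18181818181818182
```

The example reproduces the doc's value: the peak of 110 followed by the trough of 90 gives (110 - 90) / 110 ≈ 18.18%.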

walk_forward_analyze(backtester, optimizer, params, train_periods, test_periods, step_periods=None, constraint=None, objective='sharpe', risk_tolerance=None, min_train_periods=30, min_test_periods=10, progress_bar=True, **optimizer_kwargs)

Convenience function for walk-forward analysis.

This is a convenience wrapper around WalkForwardAnalyzer.analyze() that creates the analyzer and runs the analysis in one call.

Parameters:

Name Type Description Default
backtester SimpleBacktester

The backtester instance to use.

required
optimizer OptimizerProtocol

Optimizer function to use.

required
params dict[str, Any]

Parameter space for optimization.

required
train_periods PeriodSpec

Number of periods or timedelta for each training window. Can be an int or timedelta.

required
test_periods PeriodSpec

Number of periods or timedelta for each test window. Can be an int or timedelta.

required
step_periods PeriodSpec | None

Periods or timedelta to step between windows. If None, uses test_periods. Defaults to None.

None
constraint Callable[[dict], bool] | None

Constraint function.

None
objective str

Metric to optimize. Defaults to "sharpe".

'sharpe'
risk_tolerance dict[str, float] | None

Risk tolerance.

None
min_train_periods int

Minimum training periods. Defaults to 30.

30
min_test_periods int

Minimum test periods. Defaults to 10.

10
progress_bar bool

Show progress bar. Defaults to True.

True
**optimizer_kwargs Any

Additional arguments passed to optimizer.

{}

Returns:

Name Type Description
WalkForwardResult WalkForwardResult

Walk-forward analysis results.

Example

Using periods:

>>> from quantex.backtester.walk_forward import walk_forward_analyze
>>> result = walk_forward_analyze(
...     backtester=bt,
...     optimizer=lambda bt, params: bt.optimize(params),
...     params={'fast': [5, 10, 20], 'slow': [20, 50, 100]},
...     train_periods=252,
...     test_periods=63,
...     objective='sharpe'
... )

Using timedelta:

>>> import pandas as pd
>>> result = walk_forward_analyze(
...     backtester=bt,
...     optimizer=lambda bt, params: bt.optimize(params),
...     params={'fast': [5, 10, 20], 'slow': [20, 50, 100]},
...     train_periods=pd.Timedelta('365 days'),
...     test_periods=pd.Timedelta('90 days'),
...     objective='sharpe'
... )

>>> print(f"Average OOS Sharpe: {result.aggregated_metrics['out_of_sample_sharpe_mean']:.2f}")

Source code

The backtester is a package with the following structure: