Draft:Meta-Labeling

From Wikipedia, the free encyclopedia

Meta-labeling, also known as corrective AI, is a technique in machine learning (ML) developed for use in quantitative finance. It serves as a secondary decision-making layer that evaluates the signals generated by a primary predictive model. By assessing the confidence and likely profitability of those signals, meta-labeling allows investors and algorithms to dynamically size positions and suppress false positives.[1]

Overview


Meta-labeling decouples two core components of systematic trading strategies: directional prediction and position sizing. The process involves training a primary model to generate trade signals (e.g., buy, sell, or hold) and then training a secondary model to determine whether each signal is likely to lead to a profitable trade. The second model outputs a probability that is interpreted as the confidence in the forecast, which can be used to adjust the position size or to filter out unreliable trades.[1][2]

Architecture


Meta-labeling is typically implemented as a three-stage process:[2][3]

  • Primary model (M1): Predicts the direction or label of a financial outcome using features such as market prices, returns, or volatility indicators. A typical output is directional, e.g., a signal in {−1, 0, 1}, representing short, neutral, or long positions.
  • Secondary model (M2): A binary classifier trained to predict whether the primary model's prediction will be profitable. The target variable is a binary meta-label in {0, 1}, equal to 1 when acting on the primary signal is profitable and 0 otherwise. Inputs can include features used in the primary model, performance diagnostics, or market regime data.
  • Position sizing algorithm (M3): Translates the output probability of the secondary model into a position size. Higher confidence scores result in larger allocations, while lower confidence leads to reduced or zero exposure.
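The three stages above can be sketched end-to-end. In this minimal sketch the logistic-regression models, the synthetic data, and the 0.55 confidence threshold are illustrative assumptions, not part of any published specification:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic features and next-period returns (weak signal plus noise).
X = rng.normal(size=(1000, 4))
returns = 0.1 * X[:, 0] + rng.normal(scale=0.5, size=1000)

# M1: primary model predicts direction (long = +1, short = -1).
primary = LogisticRegression().fit(X, (returns > 0).astype(int))
side = np.where(primary.predict(X) == 1, 1, -1)

# Meta-label: 1 if acting on the primary signal would have been
# profitable (side * realized return > 0), else 0.
meta_label = (side * returns > 0).astype(int)

# M2: secondary model estimates the probability each signal succeeds.
secondary = LogisticRegression().fit(X, meta_label)
confidence = secondary.predict_proba(X)[:, 1]

# M3: position sizing -- trade only when confidence clears a threshold,
# scaling exposure by the confidence score; otherwise stay flat.
threshold = 0.55
position = np.where(confidence > threshold, side * confidence, 0.0)
```

Here the models are trained and evaluated in-sample purely for brevity; in practice the meta-labels and both models would be built on separate training data.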

Motivation


Meta-labeling is designed to improve precision without sacrificing recall. As noted by López de Prado, attempting to model both the direction and the magnitude of a trade using a single algorithm can result in poor generalization. By separating these tasks, meta-labeling enables greater flexibility and robustness:

  • Enhances control over capital allocation.
  • Reduces overfitting by limiting model complexity.
  • Allows the use of interpretability tools and tailored thresholds to manage risk.
  • Enables dynamic trade suppression in unfavorable regimes.[1][2]

Position sizing methods


Various algorithms have been proposed for transforming predicted probabilities into trade sizes:[3]

  • All-or-nothing: Allocate 100% of capital if the probability exceeds a predefined threshold (e.g., 0.5); otherwise, do not trade.
  • Model confidence: Use the probability score directly as the fraction of capital allocated.
  • Linear scaling: Rescale the model's probabilities using min-max normalization based on the training data.
  • Normal CDF (NCDF): Use a normal cumulative distribution function applied to a z-statistic derived from the predicted probability.[1]
  • Empirical CDF (ECDF): Rank probabilities based on their percentile in the training data to ensure relative allocation.
  • Sigmoid Optimal Position Sizing (SOPS): Applies a smooth non-linear sigmoid transformation optimized for expected return.
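Three of the sizing rules above can be written as short functions. The function names and the 0.5 threshold are assumptions for this sketch, not terminology from the cited papers:

```python
import numpy as np

def all_or_nothing(p, threshold=0.5):
    """Full allocation when the probability clears the threshold, else none."""
    return np.where(np.asarray(p) > threshold, 1.0, 0.0)

def model_confidence(p):
    """Use the predicted probability directly as the capital fraction."""
    return np.asarray(p, dtype=float)

def empirical_cdf(p, p_train):
    """Size by the probability's percentile rank in the training data."""
    p_train = np.sort(np.asarray(p_train))
    return np.searchsorted(p_train, p, side="right") / len(p_train)

p_train = np.array([0.2, 0.4, 0.5, 0.6, 0.8])  # probabilities seen in training
p_new = np.array([0.3, 0.55, 0.9])             # probabilities for new signals

print(all_or_nothing(p_new))           # -> [0. 1. 1.]
print(empirical_cdf(p_new, p_train))
```

The empirical CDF rule makes allocations relative: a probability that outranks most training-set probabilities receives a large size even if its absolute value is modest.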

Model calibration


Many ML models, including support vector machines (SVMs) and naïve Bayes classifiers, do not output calibrated probabilities by default. Calibration improves the interpretability and reliability of probability scores, which is especially important for meta-labeling.

Common calibration methods include:

  • Platt scaling: Fits a logistic regression model to the classifier outputs.[4]
  • Isotonic regression: A non-parametric calibration method that fits a piecewise-constant, non-decreasing function to predicted scores.[5]
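Both methods are available in scikit-learn through CalibratedClassifierCV, where Platt scaling corresponds to method="sigmoid" and isotonic regression to method="isotonic". The SVM base model and synthetic data below are illustrative:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.calibration import CalibratedClassifierCV

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = (X[:, 0] + 0.3 * rng.normal(size=400) > 0).astype(int)

# An SVM exposes decision scores, not calibrated probabilities;
# the calibration wrapper maps those scores onto [0, 1].
platt = CalibratedClassifierCV(SVC(), method="sigmoid", cv=3).fit(X, y)
iso = CalibratedClassifierCV(SVC(), method="isotonic", cv=3).fit(X, y)

probs = platt.predict_proba(X)[:, 1]  # calibrated success probabilities
```

Isotonic regression is more flexible but tends to overfit on small samples, so Platt scaling is the usual choice when calibration data are scarce.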

Applications


Meta-labeling has been applied in a variety of financial ML contexts, including:

  • Algorithmic trading: Filtering and sizing trades to reduce false positives.
  • Portfolio optimization: Scaling exposure across multiple signals with differing confidence levels.
  • Risk management: Dynamically disabling strategies in adverse market conditions.
  • Model validation: Interpreting when and why a model may be underperforming due to regime shifts.

Performance


Empirical studies using synthetic data and simulated trading environments have found that meta-labeling can improve strategy performance. Specifically, it increased the Sharpe ratio, reduced maximum drawdown, and produced more stable returns over time.[2][3]

Limitations

  • It requires relatively large datasets to train the secondary model reliably, compared with a single-model approach.
  • If misused, it can introduce unnecessary model complexity and tuning requirements.
  • Performance gains may be small when forecast confidence is not a stable predictor of payoff magnitude.

References

  1. ^ a b c d López de Prado, Marcos (2018). Advances in Financial Machine Learning. Wiley. ISBN 978-1-119-48208-6.
  2. ^ a b c d Joubert, Jacques Francois (Summer 2022). "Meta-Labeling: Theory and Framework". Journal of Financial Data Science. 4 (3): 31–44. doi:10.3905/jfds.2022.1.043 (inactive 14 April 2025).
  3. ^ a b c Meyer, Michael; Barziy, Illya; Joubert, Jacques Francois (Spring 2023). "Meta-Labeling: Calibration and Position Sizing". Journal of Financial Data Science. 5 (2): 23–40. doi:10.3905/jfds.2023.1.062 (inactive 14 April 2025).
  4. ^ Platt, John (1999). "Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods". CiteSeerX 10.1.1.41.1631.
  5. ^ Zadrozny, Bianca; Elkan, Charles (2002). "Transforming classifier scores into accurate multiclass probability estimates". KDD '02: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining: 694–699. doi:10.1145/775047.775151. ISBN 1-58113-567-X.

Further reading

  • López de Prado, M. (2020). Machine Learning for Asset Managers. Cambridge University Press. ISBN 9781108883658.
  • Joubert, J.F. (2022). "Meta-Labeling: Theory and Framework". Journal of Financial Data Science. 4(3): 31–44.
  • Meyer, M., Barziy, I., & Joubert, J.F. (2023). "Meta-Labeling: Calibration and Position Sizing". Journal of Financial Data Science. 5(2): 23–40.