# Backtest Overfitting and Trading Strategy Replication

## Evolution of Quantitative Performance Metrics

The foundational framework for evaluating the performance of investment strategies rests on Modern Portfolio Theory, pioneered by Harry Markowitz in 1952, which posits that rational, risk-averse investors optimize their portfolios by focusing exclusively on the first two moments of the return distribution: mean and variance [cite: 1]. Building upon this mean-variance optimization framework, William F. Sharpe introduced the Sharpe Ratio in 1966 to measure the expected differential return per unit of risk associated with an investment strategy [cite: 1, 2, 3]. Calculated as the excess return of a portfolio over the risk-free rate divided by the standard deviation of those excess returns, the Sharpe Ratio quickly became the industry standard for assessing whether historical returns were the product of sound investment logic or excessive risk-taking [cite: 2, 4]. 
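The calculation described above can be sketched in a few lines. This is a minimal illustration, assuming daily returns, a per-period risk-free rate, and the common convention of annualizing by the square root of 252; the function name and the sample returns are hypothetical.

```python
import math
import statistics

def annualized_sharpe(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe Ratio from per-period returns.

    risk_free is the per-period risk-free rate. The default of 252
    periods per year assumes daily data (an illustrative assumption).
    """
    excess = [r - risk_free for r in returns]
    mean_excess = statistics.fmean(excess)
    std_excess = statistics.stdev(excess)  # sample standard deviation
    return mean_excess / std_excess * math.sqrt(periods_per_year)

# Hypothetical daily returns, for illustration only.
daily_returns = [0.001, -0.002, 0.003, 0.0005, -0.001, 0.002]
print(round(annualized_sharpe(daily_returns), 2))
```

Note that the square-root annualization itself relies on the IID assumption discussed in the next paragraph.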

Under traditional interpretations, an annualized Sharpe Ratio between 1.00 and 1.99 is considered robust, while a ratio exceeding 2.00 is highly exceptional, often prompting institutional scrutiny to verify whether leverage was heavily utilized [cite: 2]. However, the standard Sharpe Ratio is constructed upon a set of rigid statistical assumptions that rarely hold true in modern financial markets. Specifically, the metric implicitly assumes that asset returns are independent and identically distributed (IID) and follow a perfect Gaussian (normal) distribution [cite: 1, 5, 6]. Furthermore, the classical Sharpe Ratio evaluates the statistical significance of a strategy strictly in isolation, assuming that only a single hypothesis test has been conducted [cite: 3, 5, 7]. 

The advent of high-frequency market data, machine learning classification algorithms, and cloud-based parallel computing has rendered the single-trial assumption obsolete. Modern quantitative analysts routinely execute millions, or even billions, of backtest simulations to identify optimal parameter configurations across vast temporal and asset-class dimensions [cite: 1, 4, 8]. Because the standard Sharpe Ratio point estimator lacks a mechanism to penalize the multiplicity of these trials, it systematically overstates the statistical significance of historical performance, creating an illusion of alpha that rapidly deteriorates in out-of-sample live trading environments [cite: 4, 5, 9].

## Mechanics of Backtest Overfitting

To comprehend why machine learning and artificial intelligence trading strategies routinely fail to replicate their historical results, a strict distinction must be drawn between traditional model overfitting and backtest overfitting. In computer science and statistical learning, model overfitting occurs when an algorithm—such as a deep neural network or a random forest—memorizes the noise within a specific training dataset rather than capturing the underlying generative signal [cite: 10, 11]. This results in a model that performs flawlessly in-sample but fails to generalize to unseen out-of-sample data.

Conversely, backtest overfitting in mathematical finance is a manifestation of selection bias and the multiple testing problem [cite: 7, 10, 12]. It occurs during the strategy formulation process when a researcher evaluates a vast array of distinct models, parameter combinations, timeframes, and stop-loss rules on the same historical dataset, but only reports the performance of the single best-performing iteration [cite: 4, 5, 7]. 

Even if the researcher employs standard machine learning safeguards such as the hold-out method or out-of-sample validation periods, these techniques fail to identify backtest overfitting [cite: 7, 9, 13]. Each time an analyst evaluates a model's out-of-sample performance, modifies a parameter, and re-tests, the out-of-sample data effectively becomes integrated into the optimization loop, contaminating its independence [cite: 9]. Because the signal-to-noise ratio in global financial markets is exceptionally low, testing thousands of non-predictive variations virtually guarantees the discovery of a strategy that exhibits a highly profitable historical equity curve purely by random chance [cite: 8, 12]. Under the influence of memory effects and non-stationary market regimes, such overfitted strategies do not merely generate random noise in live deployment; they systematically destroy capital [cite: 13, 14].
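The selection effect described above can be reproduced with a small simulation. The sketch below generates strategies whose daily returns are pure zero-mean noise, so every true Sharpe Ratio is zero, and reports only the best in-sample result. The 1% daily volatility, the one-year sample, and the function name are illustrative assumptions.

```python
import math
import random
import statistics

def best_random_sharpe(n_trials, n_days=252, seed=0):
    """Best annualized in-sample Sharpe Ratio among skill-less strategies."""
    rng = random.Random(seed)
    best = -math.inf
    for _ in range(n_trials):
        # Zero-mean daily returns: the true Sharpe Ratio is exactly zero.
        daily = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        sharpe = statistics.fmean(daily) / statistics.stdev(daily) * math.sqrt(252)
        best = max(best, sharpe)
    return best

# The reported "winner" improves as more worthless variations are tried.
for n in (1, 10, 100):
    print(n, round(best_random_sharpe(n), 2))
```

Because the same seed is reused, each larger run contains the smaller runs as a prefix, so the best result can only stay the same or rise as trials are added.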

## The False Strategy Theorem

The mathematical certainty of performance inflation under multiple testing is formalized in the False Strategy Theorem, established by David H. Bailey and Marcos López de Prado [cite: 15, 16]. The theorem models the expected maximum in-sample Sharpe Ratio as a direct function of the number of independent strategy variations evaluated, demonstrating that a researcher can achieve any desired performance threshold simply by increasing computational output [cite: 13, 16, 17].

The theorem operates under the null hypothesis that every proposed investment strategy has a true expected Sharpe Ratio of exactly zero [cite: 8, 18]. If a researcher evaluates $N$ independent strategy configurations, the in-sample Sharpe Ratio estimators are approximately normally distributed by the Central Limit Theorem [cite: 19, 20]. The maximum of $N$ such estimators, suitably normalized, converges to a Gumbel extreme value distribution as $N$ grows. Using this framework, the expected maximum in-sample Sharpe Ratio ($E[\max(\widehat{SR}_n)]$) can be analytically approximated.

The approximation combines the variance across the estimated Sharpe Ratios of the trials ($V[\{\widehat{SR}_n\}]$), the Euler-Mascheroni constant ($\gamma \approx 0.5772$), the inverse of the standard normal cumulative distribution function ($Z^{-1}$), and Euler's number ($e$) [cite: 1, 8, 21]. The resulting expression shows that the expected maximum Sharpe Ratio increases with both the number of independent trials ($N$) and the variance of the Sharpe Ratios across those trials [cite: 8, 18].
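For reference, the approximation is commonly written as follows, where $Z^{-1}$ denotes the inverse of the standard normal cumulative distribution function $Z$ and the expression is intended for $N \geq 2$:

$$
E[\max(\widehat{SR}_n)] \approx \sqrt{V[\{\widehat{SR}_n\}]} \left( (1 - \gamma)\, Z^{-1}\!\left[1 - \frac{1}{N}\right] + \gamma\, Z^{-1}\!\left[1 - \frac{1}{N e}\right] \right)
$$

Both quantile terms grow with $N$, and the whole expression scales with the standard deviation of the trial Sharpe Ratios, which is why a wider or more varied search inflates the best observed result.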

The empirical consequences of this theorem reveal the fragility of unadjusted performance metrics. If an analyst conducts a skill-less brute-force search over a rudimentary trading rule with merely seven binary parameters, they generate 128 trials ($2^7 = 128$). Treating those trials as independent, with unit variance across their annualized Sharpe Ratio estimates (roughly one year of data), the expected maximum annualized Sharpe Ratio of the best iteration exceeds 2.6 purely through random variation [cite: 22]. Expanding the search to 1,000 independent backtests raises the expected maximum Sharpe Ratio to approximately 3.26, despite the underlying strategy lacking any genuine predictive power [cite: 21].
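The figures quoted above can be checked numerically. The following sketch evaluates the expected-maximum approximation using only the Python standard library; it assumes unit variance across the trial Sharpe Ratios unless another value is passed, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant

def expected_max_sharpe(n_trials, sr_variance=1.0):
    """Approximate expected maximum Sharpe Ratio among n_trials skill-less trials.

    sr_variance is the variance of the Sharpe Ratio estimates across trials.
    Requires n_trials >= 2 so that the quantiles are well defined.
    """
    if n_trials < 2:
        raise ValueError("n_trials must be at least 2")
    inv_cdf = NormalDist().inv_cdf  # inverse standard normal CDF
    return math.sqrt(sr_variance) * (
        (1 - EULER_GAMMA) * inv_cdf(1 - 1 / n_trials)
        + EULER_GAMMA * inv_cdf(1 - 1 / (n_trials * math.e))
    )

for n in (10, 128, 1000):
    print(n, round(expected_max_sharpe(n), 2))
```

Passing a smaller `sr_variance`, as would apply to a longer backtest, scales the result down by the square root of that variance.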

## Statistical Corrections for Selection Bias

To identify robust quantitative strategies and separate genuine empirical findings from statistical flukes, researchers require performance evaluation methodologies that actively control for both non-normal returns distributions and multiple testing inflation [cite: 1, 8, 12].

### The Probabilistic Sharpe Ratio

The standard Sharpe Ratio fails to measure the inflationary effects generated by short sample lengths and non-Gaussian returns [cite: 8, 15]. Empirical financial time series are predominantly characterized by negative skewness—where large losses occur more frequently than large gains—and positive excess kurtosis, representing "fat tails" or a heightened probability of extreme outlier events [cite: 1, 5]. Because the variance of the Sharpe Ratio estimator increases substantially in the presence of negative skewness and positive kurtosis, point estimates derived from non-normal returns carry much wider confidence intervals [cite: 1, 5].

To address this, Bailey and López de Prado developed the Probabilistic Sharpe Ratio (PSR), which estimates the probability that a strategy's true Sharpe Ratio exceeds a designated benchmark threshold [cite: 8]. The PSR framework integrates the strategy's track record length ($T$, the number of return observations) alongside the empirical skewness ($\hat{\gamma}_3$) and kurtosis ($\hat{\gamma}_4$) of the returns distribution [cite: 8]. By adjusting the standard error of the Sharpe Ratio using these higher moments, the PSR delivers a calibrated confidence level [cite: 5, 8]. Consequently, two strategies that display an identical nominal Sharpe Ratio of 1.50 will yield different PSR values if one strategy generates normally distributed returns while the other achieves its returns through highly skewed, tail-risk exposures [cite: 1].
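A minimal sketch of the PSR calculation follows. It assumes the Sharpe Ratio and the benchmark are expressed per observation period (not annualized) and that kurtosis is the raw value (3 for a normal distribution, not excess kurtosis); the function name and the sample inputs are hypothetical.

```python
import math
from statistics import NormalDist

def probabilistic_sharpe_ratio(sr, sr_benchmark, n_obs, skew=0.0, kurt=3.0):
    """Probability that the true Sharpe Ratio exceeds sr_benchmark.

    sr, sr_benchmark: per-period Sharpe Ratios (same frequency as the data).
    n_obs: track record length T in observations.
    skew, kurt: sample skewness and raw kurtosis of the returns.
    """
    # Higher moments widen the standard error of the Sharpe estimate.
    scale = math.sqrt(1 - skew * sr + (kurt - 1) / 4 * sr ** 2)
    z = (sr - sr_benchmark) * math.sqrt(n_obs - 1) / scale
    return NormalDist().cdf(z)

# Same nominal Sharpe Ratio (about 1.5 annualized) over two years of daily data.
daily_sr = 1.5 / math.sqrt(252)
psr_normal = probabilistic_sharpe_ratio(daily_sr, 0.0, 504)
psr_skewed = probabilistic_sharpe_ratio(daily_sr, 0.0, 504, skew=-2.0, kurt=10.0)
print(round(psr_normal, 3), round(psr_skewed, 3))
```

With negative skewness and fat tails the denominator grows, so the same point estimate earns a lower probability of exceeding the benchmark.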

### The Deflated Sharpe Ratio

While the PSR successfully calibrates a strategy's statistical significance against its returns distribution, it operates under the assumption that only a single strategy was tested [cite: 7, 8]. To resolve the multiple testing problem, Bailey and López de Prado introduced the Deflated Sharpe Ratio (DSR) [cite: 7, 8, 15]. 

The DSR is mathematically structured as a Probabilistic Sharpe Ratio wherein the static rejection threshold is replaced by a dynamic, selection-bias-aware threshold [cite: 5, 8]. This dynamic benchmark is precisely the expected maximum Sharpe Ratio derived from the False Strategy Theorem [cite: 5, 8]. The framework integrates five distinct variables to deflate the nominal metric: the track record length, the skewness of returns, the kurtosis of returns, the variance across the estimated Sharpe Ratios of all trials conducted, and the number of independent trials ($N$) [cite: 8, 15].
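The five inputs listed above can be combined as in the sketch below, which substitutes the expected maximum Sharpe Ratio for the benchmark in the PSR formula. It assumes per-period (non-annualized) Sharpe Ratios, raw kurtosis (3 for a normal distribution), and at least two trials; names and sample values are hypothetical.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant
STD_NORMAL = NormalDist()

def deflated_sharpe_ratio(sr, n_obs, skew, kurt, sr_variance, n_trials):
    """DSR for the best strategy selected from n_trials (n_trials >= 2).

    sr: per-period Sharpe Ratio of the selected strategy.
    sr_variance: variance of per-period Sharpe Ratios across all trials.
    """
    inv_cdf = STD_NORMAL.inv_cdf
    # Selection-aware benchmark: expected maximum Sharpe Ratio under zero skill.
    sr_benchmark = math.sqrt(sr_variance) * (
        (1 - EULER_GAMMA) * inv_cdf(1 - 1 / n_trials)
        + EULER_GAMMA * inv_cdf(1 - 1 / (n_trials * math.e))
    )
    scale = math.sqrt(1 - skew * sr + (kurt - 1) / 4 * sr ** 2)
    z = (sr - sr_benchmark) * math.sqrt(n_obs - 1) / scale
    return STD_NORMAL.cdf(z)

# Identical track record, judged against 10 versus 1,000 trials.
few = deflated_sharpe_ratio(0.1, 1250, -0.5, 5.0, 0.001, 10)
many = deflated_sharpe_ratio(0.1, 1250, -0.5, 5.0, 0.001, 1000)
print(round(few, 3), round(many, 3))
```

The only difference between the two calls is the trial count, which illustrates why the number of trials must be disclosed for the statistic to be meaningful.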

A critical operational requirement for applying the DSR is the precise recording of all historical backtests to determine the true value of $N$ [cite: 21]. Because quantitative algorithms often execute highly correlated variations of the same core strategy (e.g., tweaking a moving average from 50 days to 51 days), the raw number of total backtests ($M$) overstates the true breadth of the search space. Researchers typically estimate the effective number of independent trials ($N$) using dimension-reduction clustering protocols based on the average correlation matrix of the trial return streams [cite: 15, 21, 22].
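As a rough illustration of how correlation shrinks the effective trial count, the sketch below interpolates linearly between one trial (perfectly correlated backtests) and $M$ trials (uncorrelated backtests) using the average pairwise correlation. This is a crude stand-in chosen for brevity, not the clustering procedure described above, and all names are hypothetical.

```python
import itertools
import math
import statistics

def pearson(x, y):
    """Pearson correlation of two equal-length return streams."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def effective_trials(return_streams):
    """Interpolate between 1 and M trials using the average correlation."""
    m = len(return_streams)
    pairs = itertools.combinations(return_streams, 2)
    avg_rho = statistics.fmean(pearson(a, b) for a, b in pairs)
    avg_rho = min(1.0, max(0.0, avg_rho))  # clamp to [0, 1] for the heuristic
    return avg_rho + (1 - avg_rho) * m

# Three identical backtests count as roughly one independent trial.
stream = [0.01, -0.02, 0.015, 0.003]
print(effective_trials([stream, stream, stream]))
```

A clustering-based estimate would instead group similar return streams and count the clusters, which handles mixed correlation structures better than a single average.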

When applied rigorously, the DSR estimates the probability that a selected strategy's true Sharpe Ratio is positive once the number of trials has been accounted for, rather than being merely an artifact of optimization. A DSR output below 0.5 indicates performance indistinguishable from pure chance; an output near 0.8 suggests the presence of a weak signal; and a DSR exceeding 0.95 is generally required to reject the null hypothesis and confirm the existence of a robust statistical edge [cite: 5].

## Minimum Backtest Length Limitations

The relationship between the number of trials and the expected inflation of performance metrics places a lower limit on sample size, quantified as the Minimum Backtest Length (MinBTL) [cite: 7, 17, 22]. The MinBTL is the shortest span of historical data for which the expected maximum in-sample Sharpe Ratio among $N$ skill-less trials stays below a chosen target; with less data, a backtest that reaches the target cannot be distinguished from the best of $N$ strategies whose expected out-of-sample Sharpe Ratio is zero [cite: 7, 13, 22, 23].

The required sample length grows logarithmically with the number of independent trials executed [cite: 18, 22]. Specifically, the MinBTL in years is bounded above by approximately $2 \ln(N)$ divided by the square of the expected maximum Sharpe Ratio, with that expected maximum fixed at the target annualized level [cite: 13, 22]. If an analyst lacks sufficient historical data to meet the MinBTL requirement for their specified search space, any resulting high-performance strategy cannot be distinguished from a selection artifact, irrespective of its nominal metrics [cite: 14, 20].
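The sketch below computes both the MinBTL approximation and its $2 \ln(N)$ upper bound for a target annualized Sharpe Ratio. It assumes independent trials with zero true Sharpe Ratio and a variance of annualized Sharpe estimates equal to one over the number of years; function names are hypothetical, and the results may differ slightly from rounded figures quoted elsewhere.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant

def min_backtest_length(n_trials, target_sr=1.0):
    """Approximate MinBTL in years for n_trials >= 2 independent trials."""
    inv_cdf = NormalDist().inv_cdf
    unit_max = (
        (1 - EULER_GAMMA) * inv_cdf(1 - 1 / n_trials)
        + EULER_GAMMA * inv_cdf(1 - 1 / (n_trials * math.e))
    )
    # Years needed so the expected maximum equals target_sr under zero skill.
    return (unit_max / target_sr) ** 2

def min_backtest_length_bound(n_trials, target_sr=1.0):
    """Simpler upper bound: 2 ln(N) / target_sr^2."""
    return 2 * math.log(n_trials) / target_sr ** 2

for n in (7, 45, 1000):
    print(n, round(min_backtest_length(n), 1), round(min_backtest_length_bound(n), 1))
```

Doubling the target Sharpe Ratio cuts the required length by a factor of four, which is why searches aimed at modest Sharpe Ratios need the longest histories.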

| Independent Strategy Trials ($N$) | Expected Maximum In-Sample Sharpe Ratio (Annualized) | Approximate Minimum Backtest Length Required |
| :--- | :--- | :--- |
| 1 Trial | N/A (Standard Evaluation) | N/A |
| 7 Trials | 1.00 | ~2.0 Years |
| 10 Trials | 1.57 | ~2.5 Years |
| 45 Trials | 1.00 (At fixed threshold) | ~5.0 Years |
| 128 Trials | 2.60 | ~8.0 Years |
| 1,000 Trials | 3.26 | ~15.0+ Years |

*Table 1: Approximate relationship between the number of independent trials, the expected maximum in-sample Sharpe Ratio (assuming an out-of-sample expectancy of zero), and the Minimum Backtest Length needed to preserve statistical validity. Rows showing 1.00 hold the in-sample Sharpe Ratio fixed at that threshold; the other rows report the expected maximum for trials with unit variance across their Sharpe Ratio estimates [cite: 13, 18, 20, 21, 22].*

As the theorem implies, an analyst who tests 45 independent model configurations on only five years of historical data should expect to find at least one strategy with an in-sample annualized Sharpe Ratio of 1.0, even though its expected out-of-sample Sharpe Ratio is zero [cite: 18, 22]. Consequently, researchers must carefully constrain their parameter optimization processes to avoid exceeding the statistical capacity of their available historical data [cite: 21].

## Academic Critiques of Conservative Adjustments

While the Deflated Sharpe Ratio and associated multiple testing corrections provide vital defense mechanisms against data mining, the framework has drawn targeted critiques regarding the severity of its statistical penalties [cite: 24]. In global equity and derivatives markets, structural inefficiencies and true signal-to-noise ratios are innately low; consequently, aggressive mathematical deflation can easily obscure genuine, albeit weak, financial signals [cite: 12, 25].

A primary objection is that stringent frameworks like the DSR induce significant Type II errors, that is, false negatives in which valid, profitable investment algorithms are wrongly discarded [cite: 17, 24]. Critics argue that uniformly harsh thresholds fail to accommodate the nuances of specific market microstructures and lead to the discarding of strategies that exhibit episodic drawdowns but ultimately retain long-term positive expectancy [cite: 25]. Within the broader literature of financial econometrics, Levi and Welch (2017) have articulated analogous concerns regarding hyper-conservative statistical adjustments, demonstrating that standard industry beta shrinkage models and conservative cost-of-capital estimates often obfuscate accurate risk-reward profiles by aggressively muting empirical variation [cite: 26, 27, 28, 29].

Further complicating the assumption that all data-mined strategies are functionally void, empirical investigations utilizing empirical Bayes (EB) mining have demonstrated resilience in naively optimized parameters. Research evaluating thousands of stock predictors revealed that constructing portfolios based purely on the top 1% of historically observed, unadjusted Sharpe Ratios continued to yield an out-of-sample Sharpe Ratio of 1.45, performing comparably to heavily vetted anomaly strategies published in tier-one financial journals [cite: 30]. These findings imply that while the False Strategy Theorem mathematically holds for perfectly random data, actual financial time-series contain persistent autocorrelation and structural anomalies that can occasionally survive brute-force selection processes without total out-of-sample decay [cite: 30, 31].

To balance this tension, some quantitative architects advocate against optimizing exclusively for a single risk-adjusted point estimator. By disaggregating complex models—for example, deploying one classifier strictly to predict trade direction (side) and an independent regression model to estimate position conviction (size)—researchers can prioritize the F1-score (harmonic mean of precision and recall) over the Sharpe Ratio, thereby retaining responsive strategies that might otherwise be filtered out by conservative deflation algorithms [cite: 25].

## Machine Learning Vulnerabilities in Market Environments

As quantitative finance increasingly transitions from simple parametric rulesets to high-dimensional machine learning frameworks, the mechanisms of strategy failure have evolved [cite: 32, 33, 34]. Advanced classification algorithms, such as Random Forests and Long Short-Term Memory (LSTM) neural networks, possess immense capacity to detect non-linear dependencies across thousands of technical indicators [cite: 33, 35, 36]. However, this capacity simultaneously makes them uniquely vulnerable to environmental non-stationarity [cite: 37, 38, 39].

Machine learning trading strategies frequently exhibit spectacular backtest profiles that instantly collapse upon live deployment [cite: 9, 35, 40]. Empirical validation studies repeatedly highlight this discrepancy. In a 2026 academic study spanning three years of Nifty 50 index data, researchers constructed a Random Forest model utilizing 17 technical indicators. While the model achieved a flawless 100% training accuracy in-sample, its out-of-sample live accuracy deteriorated to 50% [cite: 35]. Crucially, because the algorithm optimized for high-frequency signal generation based on lagging public data, the integration of a standard 0.2% round-trip transaction cost devastated its equity curve, yielding a net absolute return of merely 5.34% over 50 trades—severely underperforming a basic 55.02% passive buy-and-hold benchmark [cite: 35].
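The cost drag in an example like this is easy to underestimate. As a back-of-the-envelope sketch, the code below applies a 0.2% round-trip cost to each of 50 trades; modeling the cost as a multiplicative haircut per trade is an assumption, and the zero gross return is a hypothetical input chosen to isolate the cost effect.

```python
def net_return(gross_trade_returns, round_trip_cost=0.002):
    """Compound per-trade gross returns after a proportional round-trip cost."""
    equity = 1.0
    for gross in gross_trade_returns:
        equity *= (1 + gross) * (1 - round_trip_cost)
    return equity - 1

# Fifty break-even trades: all of the loss comes from transaction costs.
cost_only = net_return([0.0] * 50)
print(f"{cost_only:.2%}")
```

Under these assumptions costs alone consume roughly 9.5% of equity over 50 trades, so a strategy needs a meaningful gross edge per trade just to break even.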

Similar limitations manifest in deep learning time-series forecasting. A study applying LSTM networks to intraday gold (XAUUSD) trading discovered that combining the neural network with technical indicators (SMA, MACD, Bollinger Bands) produced a commanding backtest Sharpe Ratio of 2.51 and a 38.4% win rate [cite: 36]. Yet, forward-testing the identical architecture in live market conditions revealed zero strategies with positive mathematical expectancy, as the indicator rankings from the backtest failed to transfer robustly to unseen data [cite: 36].

These failures emphasize that optimizing machine learning models against historical snapshots fundamentally assumes the future will structurally resemble the past [cite: 37]. In reality, financial markets undergo continuous volatility regime shifts, correlation breakdowns, and behavioral adaptations driven by macroeconomic events [cite: 37, 40]. Models that lack persistent memory architecture or adaptive contextual reasoning often become hopelessly brittle during regime transitions, optimizing for trending behaviors precisely as the market enters a period of high-noise consolidation [cite: 37, 40].

## Artificial Intelligence and Large Language Model Forecasting

The deployment of generative Artificial Intelligence, particularly Large Language Models (LLMs), has introduced unprecedented capabilities in financial sentiment analysis, automated reasoning, and unstructured data processing [cite: 41, 42, 43]. By autonomously interpreting earnings transcripts, global news flows, and geopolitical developments, LLMs operate fundamentally differently than numerical, rules-based algorithms [cite: 41, 42, 43]. Consequently, they generate novel backtesting paradoxes that completely bypass the statistical checks of the Deflated Sharpe Ratio [cite: 44].

### Data Leakage and the Eradication of True Out-of-Sample Testing

The single greatest impediment to evaluating LLM-based trading strategies is the impossibility of ensuring temporal isolation [cite: 44]. Traditional statistical models initialize with blank parameters and learn exclusively from the historical data provided [cite: 9]. In stark contrast, an LLM possesses vast, generalized world knowledge baked directly into its neural weights during pre-training [cite: 44]. 

If a quantitative team attempts to backtest an LLM agent on market events from 2023, but the underlying base model was trained on corpora extending into 2024, the agent implicitly possesses "future" knowledge regarding corporate bankruptcies, interest rate decisions, and broad market trajectories [cite: 44]. This absolute look-ahead bias completely invalidates the backtest [cite: 44, 45]. Furthermore, efforts to restrict the LLM by utilizing date-filtered web searches (e.g., instructing the agent to only read news from before a specific date) remain structurally flawed; search engine algorithms, result snippets, and subsequently updated web pages effortlessly leak future information back into the context window [cite: 44]. 

To conduct viable strategy evaluations with LLMs, researchers must utilize models with strict, documented knowledge cutoffs, creating a narrow validation window extending from the cutoff date to the present [cite: 44]. Additionally, the research phase must rely on "time-capsule" data architecture—scraping and locally storing tens of thousands of URLs precisely as they existed on the target historical date, thereby forcing the AI agent to query an isolated, static corpus rather than the live internet [cite: 44]. As commercial LLM providers increasingly shift toward "continual learning" models that update continuously based on real-time data streams, deterministic historical backtesting of AI agents will become functionally impossible [cite: 44].
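The two safeguards described above, a documented knowledge cutoff and a static archived corpus, can be expressed as a simple gate in the backtest harness. The sketch below is illustrative only; the `Snapshot` type, its field names, and the sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Snapshot:
    url: str
    captured_on: date  # date the page was archived, not the date it claims
    text: str

def visible_corpus(snapshots, as_of, model_cutoff):
    """Return only the archived pages an agent may read on the simulated date."""
    if as_of <= model_cutoff:
        # The model's weights may already encode events from this period.
        raise ValueError("simulation date falls inside the model's training period")
    return [s for s in snapshots if s.captured_on <= as_of]

corpus = [
    Snapshot("https://example.com/a", date(2024, 7, 1), "earlier article"),
    Snapshot("https://example.com/b", date(2024, 9, 15), "later article"),
]
allowed = visible_corpus(corpus, as_of=date(2024, 8, 1), model_cutoff=date(2024, 6, 1))
print([s.url for s in allowed])
```

Filtering on the capture date rather than a page's self-reported publication date is the design choice that prevents later edits from leaking into the simulated context.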

### Label Bias and Evaluator Self-Preference

Compounding the issue of temporal leakage are the intrinsic cognitive biases displayed by LLMs during decision-making tasks [cite: 42, 46, 47]. Extensive empirical testing reveals that foundational models exhibit pronounced label bias, systematically favoring specific categorical answers regardless of the empirical input, often mirroring the semantic prejudices of their training data [cite: 42, 46]. When analyzing scientific or complex financial summaries, modern LLMs demonstrate a high propensity for generalization bias, producing broad, oversimplified conclusions that mask the nuanced risk parameters of the underlying data [cite: 47].

These vulnerabilities are magnified when LLMs are integrated into Retrieval-Augmented Generation (RAG) frameworks to autonomously generate, rank, and evaluate trading hypotheses [cite: 48]. Recent AI evaluation studies document a severe self-preference bias, wherein LLM judges systematically assign higher quality ratings, accuracy scores, and strategic validity to content generated by themselves (or models of similar architecture) over objectively superior human-authored baselines [cite: 48, 49]. If an automated quantitative pipeline relies on an LLM to self-evaluate the backtested performance of its own logic, this self-enhancement loop creates an illusion of high strategic conviction, shielding fundamentally flawed strategies from objective risk management [cite: 48, 49].

## Empirical Discrepancies in Live Deployment

The theoretical vulnerabilities of LLM-based financial strategies are distinctly visible in cross-sectional empirical studies spanning the 2023–2025 technology cycles [cite: 41, 50, 51]. While narrow, short-term evaluations often report significant AI outperformance, rigorous, long-term testing reveals severe performance degradation [cite: 41, 50].

To assess the generalizability of LLM market-timing strategies, researchers developed the FINSABER framework, evaluating AI agent performance across 100+ stock symbols over a two-decade simulation to explicitly control for survivorship bias and data-snooping [cite: 41, 50]. The systematic backtests revealed that previously reported LLM advantages collapsed under broader cross-sectional evaluation [cite: 41, 50]. Crucially, market regime analysis demonstrated that LLM strategies lack dynamic adaptivity: they are overly conservative during extended bull markets, consistently trailing passive benchmarks, and overly aggressive in bear markets, incurring catastrophic drawdowns [cite: 41, 50].

Further confirming these limitations, a 2026 study isolated the impact of trading frequency on LLM decision-making using a GPT-4 class agent across five major technology equities over the 2023–2024 bull market [cite: 52]. The experiment sought to determine the optimal rebalancing horizon by analyzing daily, weekly, and monthly AI executions [cite: 52].


The findings indicated a structural "Goldilocks zone" for LLMs: a weekly rebalancing schedule yielded an optimal Sharpe Ratio of 1.028, effectively filtering the noise inherent in daily rebalancing (Sharpe 0.892) while preventing the signal decay suffered in monthly rebalancing (Sharpe 0.421) [cite: 52]. However, despite optimal frequency calibration, no AI variation managed to outperform a simple Buy-and-Hold benchmark, which commanded a Sharpe Ratio of 1.620 over the identical period, underscoring the models' failure to capture sustained directional momentum [cite: 52].

### Survivorship Bias and Execution Slippage

When transitioning AI strategies from research environments to live capital deployment, mechanical execution realities frequently destroy theoretical alpha [cite: 33, 40]. Researchers commonly utilize sanitized price series that silently remove assets delisted due to bankruptcy or acquisition [cite: 40, 41]. By training exclusively on survivors, AI algorithms learn solely from the structural patterns of successful entities [cite: 40, 41]. Upon live execution, when the agent inevitably interacts with a distressed asset exhibiting deteriorating bid-ask spreads, it applies winner-based logic to a failing instrument, resulting in immediate catastrophic losses [cite: 40]. 

Compounding survivorship bias is the pervasive mismodeling of execution slippage. Basic backtesting architecture assumes instantaneous order fulfillment at the precise mid-price observed when the trading signal fired [cite: 40]. In physical markets, acquiring inventory requires crossing the spread via market orders, incurring immediate friction [cite: 33, 40]. For strategies utilizing leverage, rapid turnover, or complex multi-leg derivatives, this execution slippage transforms ostensibly high-Sharpe algorithms into capital incinerators [cite: 33, 40, 53]. A documented empirical test of an LLM crypto-trading bot revealed that high-frequency, highly-leveraged configurations bled 15.2% of total equity to trading costs and slippage in merely two weeks, highlighting that theoretical intelligence cannot overcome structural market friction [cite: 53].

## Institutional Validation Frameworks

To mitigate the systemic failures of machine learning and generative AI models, institutional quantitative funds deploy advanced validation pipelines that extend beyond basic point-estimator corrections [cite: 17, 40]. 

Standard out-of-sample hold-out methodology and k-fold cross-validation are deeply flawed when applied to financial time series, as they frequently leak temporal information from the test set back into the training logic [cite: 11, 17, 45]. To combat this, institutions utilize Combinatorial Purged Cross-Validation (CPCV). CPCV partitions the sample into contiguous groups, forms every combination of a fixed number of those groups as a test set, and purges any training observations whose information overlaps in time with the test periods, yielding many out-of-sample paths rather than a single one [cite: 17].
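A simplified sketch of CPCV split generation is shown below. It assumes observations are indexed 0 to `n_obs - 1` in time order, that each label looks `label_horizon` observations ahead, and that a fixed-length embargo follows each test block; the function name, parameters, and defaults are hypothetical, and production implementations purge using actual label start and end times.

```python
from itertools import combinations

def cpcv_splits(n_obs, n_groups=6, n_test_groups=2, label_horizon=0, embargo=0):
    """Yield (train_indices, test_indices) for every combination of test groups."""
    # Contiguous, time-ordered groups as half-open [start, end) index ranges.
    bounds = [(g * n_obs // n_groups, (g + 1) * n_obs // n_groups)
              for g in range(n_groups)]
    for test_groups in combinations(range(n_groups), n_test_groups):
        test = [i for g in test_groups for i in range(*bounds[g])]
        blocked = set(test)
        for g in test_groups:
            start, end = bounds[g]
            # Purge: training labels ending inside the test block.
            blocked.update(range(max(start - label_horizon, 0), start))
            # Embargo: observations immediately after the test block.
            blocked.update(range(end, min(end + embargo, n_obs)))
        train = [i for i in range(n_obs) if i not in blocked]
        yield train, test

splits = list(cpcv_splits(60, n_groups=6, n_test_groups=2, label_horizon=2, embargo=1))
print(len(splits))  # number of train/test combinations
```

With six groups and two test groups per split there are 15 combinations, each of which can be scored separately to build a distribution of out-of-sample results.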

Furthermore, robust institutional architecture demands dynamic regime awareness. Rather than running a static AI model universally across all market phases, funds segment historical data into discrete regimes based on realized volatility (e.g., low, elevated, extreme) [cite: 37, 40]. Distinct models are trained exclusively on data reflecting their assigned regime. In live execution, a supervisory volatility-gating layer continuously monitors macroeconomic conditions and dynamically routes trade generation to the specific AI model calibrated for the current environment [cite: 40]. By combining regime-aware architecture, realistic fill-decay simulation, and stringent multiple-testing corrections like the Deflated Sharpe Ratio, quantitative researchers can systematically differentiate between durable investment logic and transient statistical mirages.
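The volatility-gating idea can be sketched as a small router. The regime names follow the text above, but the annualized volatility thresholds (15% and 30%), the function names, and the placeholder model labels are hypothetical assumptions for illustration.

```python
import math
import statistics

def realized_volatility(daily_returns, periods_per_year=252):
    """Annualized realized volatility from recent daily returns."""
    return statistics.stdev(daily_returns) * math.sqrt(periods_per_year)

def classify_regime(volatility, low_cut=0.15, high_cut=0.30):
    """Map annualized volatility to a regime label (thresholds are illustrative)."""
    if volatility < low_cut:
        return "low"
    if volatility < high_cut:
        return "elevated"
    return "extreme"

def route(daily_returns, models):
    """Select the model trained for the currently observed regime."""
    regime = classify_regime(realized_volatility(daily_returns))
    return regime, models[regime]

# Placeholder model identifiers; real entries would be fitted model objects.
models = {"low": "model_low", "elevated": "model_elevated", "extreme": "model_extreme"}
calm_window = [0.001, -0.001] * 30
print(route(calm_window, models))
```

Keeping the gate separate from the models means the thresholds themselves become tunable parameters, so any search over them should be counted among the trials when correcting for multiple testing.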

## Sources
1. [Deflated Sharpe Ratio: How to avoid being fooled by randomness](https://quantdare.com/deflated-sharpe-ratio-how-to-avoid-been-fooled-by-randomness/)
2. [Deflated Sharpe ratio - Wikipedia](https://en.wikipedia.org/wiki/Deflated_Sharpe_ratio)
3. [The Deflated Sharpe Ratio: Correcting for Selection Bias, Backtest Overfitting and Non-Normality](https://www.davidhbailey.com/dhbpapers/deflated-sharpe.pdf)
4. [Deflated Sharpe Ratio (DSR) - Medium](https://medium.com/balaena-quant-insights/deflated-sharpe-ratio-dsr-33412c7dd464)
5. [Performance Evaluation under IID Normal Returns](https://pdfs.semanticscholar.org/c215/d0a2064ce1a3565d276475abc84305418f0f.pdf)
6. [Backtesting Forecasts That Use LLMs](https://futuresearch.ai/backtesting-forecasts-that-use-llms/)
7. [arXiv:2505.07078v5 - FINSABER](https://arxiv.org/html/2505.07078v5)
8. [RePEc - FINSABER](https://ideas.repec.org/p/arx/papers/2505.07078.html)
9. [LLM Trading Agent Goldilocks Zone](https://agent4science.org/take/take_mnmm3ynh2f7t7w51)
10. [Why LLM Trading Backtests Need Two Weeks and Twenty Dollars](https://medium.com/@kojott/why-llm-trading-backtests-need-two-weeks-and-twenty-dollars-5a19a525a095)
11. [Bailey et al. SSRN Paper](https://sdm.lbl.gov/oapapers/ssrn-id2507040-bailey.pdf)
12. [Overfit Tools At - David H. Bailey](https://www.davidhbailey.com/dhbpapers/overfit-tools-at.pdf)
15. [QuantResearch.org Publications](https://www.quantresearch.org/Publications.htm)
16. [ResearchGate - Minimum Backtest Length 1](https://www.researchgate.net/figure/Minimum-Backtest-Length-needed-to-avoid-overfitting-as-a-function-of-the-number-of_fig2_275302374)
17. [ResearchGate - Overfitting as number of trials grows](https://www.researchgate.net/figure/Overfitting-a-backtests-results-as-the-number-of-trials-grows_fig1_275302374)
19. [ML in Finance: Specifically Backtesting](https://mrozenva.medium.com/ml-in-finance-specifically-backtesting-5173cdded692)
21. [What is Backtesting Overfitting and Why Should You Avoid It?](https://tradingenigma.wordpress.com/2021/06/25/what-is-backtesting-overfitting-and-why-should-you-avoid-it/)
22. [Why machine learning trading strategies failed: empirical analysis on Nifty 50](https://www.researchgate.net/publication/404007728_Why_machine_learning_trading_strategies_failed_empirical_analysis_on_Nifity_50)
23. [Why Most Machine Learning Trading Strategies Fail](https://quant.fish/wiki/why-most-machine-learning-trading-strategies-fail/)
24. [Why Most AI Trading Systems Fail in Real Markets](https://arekansoftware.com/blog/why-most-ai-trading-systems-fail-in-real-markets)
25. [Stockholm University - ML Trading LSTM Strategies](https://su.diva-portal.org/smash/get/diva2:2065345/FULLTEXT01.pdf)
26. [Three Reasons AI Trading Backtests Fail in Live Markets](https://www.youtube.com/watch?v=aAqbLPC_A0E)
32. [Journal of Financial Data Science - Vol 5](https://www.pm-research.com/content/iijjfds/5/3/local/complete-issue.pdf)
33. [GARP Whitepaper - AI Disappointments in Quant Finance](https://www.garp.org/hubfs/Whitepapers/a1Z1W0000054x6lUAA.pdf)
34. [Oxford - Bet Against Beta or CAPM Failure](https://ora.ox.ac.uk/objects/uuid:4679cb02-068d-453c-9c63-9f272a93bf63/files/skk91fm71d)
35. [Decoding the Markets - Substack](https://onepagecode.substack.com/p/decoding-the-markets-an-introduction)
36. [arXiv:2311.10685v3 - Empirical Bayes Mining](https://arxiv.org/html/2311.10685v3)
37. [arXiv:2512.15792v1 - LLM Bias Evaluation](https://arxiv.org/html/2512.15792v1)
38. [ACL Anthology - RAG Self Preference](https://aclanthology.org/2025.findings-acl.1369.pdf)
39. [ACL Anthology - Synthetic Data Calibration](https://aclanthology.org/2025.findings-acl.1307.pdf)
40. [Royal Society - Generalization Bias in LLMs](https://royalsocietypublishing.org/rsos/article/12/4/241776/235656/Generalization-bias-in-large-language-model)
41. [LLM Evaluators Recognize and Favor Their Own Generations](https://www.researchgate.net/publication/397200002_LLM_Evaluators_Recognize_and_Favor_Their_Own_Generations)
42. [Europe Risks AI Dependency Trap - RMC Management](https://www.rmcmgt.com/expert-time/Europe-Risks-AI-Dependency-Trap-as-Tech-Dominance-Shifts-to-US-and-Asia-Report-Warns-21-8470)
44. [McKinsey - State of AI](https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai)
47. [ResearchGate - Trading strategies failure Nifty 50 analysis](https://www.researchgate.net/publication/404007728_Why_machine_learning_trading_strategies_failed_empirical_analysis_on_Nifity_50)
48. [FutureSearch - Pitfalls in LLM Models](https://futuresearch.ai/backtesting-forecasts-that-use-llms/)
49. [QuantDare - DSR Critiques](https://quantdare.com/deflated-sharpe-ratio-how-to-avoid-been-fooled-by-randomness/)
50. [David H. Bailey - Math Definitions of PSR and DSR](https://www.davidhbailey.com/dhbpapers/deflated-sharpe.pdf)
51. [EDHEC - Alternative Risk Premium](https://www.edhec.edu/sites/default/files/theses-phd/Alternative%20Risk%20Premium%20Workhorse%20or%20Trojan%20Horse_GORMAN_Stephen_FEB2022.pdf)
54. [Journal of Financial Data Science - Vol 1](https://pm-research.com/content/iijjfds/1/4/local/complete-issue.pdf)
56. [Medium - Sharpe Ratio Distributions and Testing](https://medium.com/balaena-quant-insights/sharpe-ratio-distributions-hypothesis-testing-and-p-values-2817dbc3c04e)
57. [Schwab - Good Sharpe Ratio Ranges](https://www.schwab.com/learn/story/calculate-sharpe-ratio-to-gauge-risk)
60. [Stanford - W. F. Sharpe on the Sharpe Ratio](https://web.stanford.edu/~wfsharpe/art/sr/sr.htm)
61. [BacktestBase - How Many Trades for Backtest](https://www.backtestbase.com/education/how-many-trades-for-backtest)
62. [Portfolio Optimization Book - Slides on Backtesting](https://portfoliooptimizationbook.com/slides/slides-backtesting.pdf)
63. [arXiv:2101.07217 - Minimum Track Record Length](https://arxiv.org/pdf/2101.07217)
65. [NC State - Quant Quarterly Issue 1](https://financialmath.sciences.ncsu.edu/wp-content/uploads/sites/302/2024/09/Quant-Quarterly-Issue-1_101718.pdf)
66. [ResearchGate - Nifty 50 ML Failures](https://www.researchgate.net/publication/404007728_Why_machine_learning_trading_strategies_failed_empirical_analysis_on_Nifity_50)
71. [UT Austin McCombs - EJOR Sharpe](https://faculty.mccombs.utexas.edu/deepayan.chakrabarti/mywww/papers/ejor20-sharpe.pdf)
72. [Schwab - Evaluating Risk and Returns](https://www.schwab.com/learn/story/calculate-sharpe-ratio-to-gauge-risk)
76. [ResearchGate - Comparative ML Trading Analysis](https://www.researchgate.net/publication/395452143_A_Comparative_Analysis_of_a_Machine_Learning_Trading_Strategy_From_Research_to_Live_Implementation)
77. [Wall Street Scholars - Machine Learning Papers](https://wallstreetscholars.com/papers/recent)
78. [IMF eLibrary - AI Market Manipulation and Execution](https://www.elibrary.imf.org/downloadpdf/display/book/9798400277573/CH003.pdf)
81. [BacktestBase - MinBTL Context](https://www.backtestbase.com/education/how-many-trades-for-backtest)
82. [Daemon Investments - WP Backtesting](https://daemoninvestments.com/wp-content/uploads/2022/06/WP_backtesting-1.pdf)
85. [QuantResearch - Innovations and CPCV](https://www.quantresearch.org/Innovations.htm)
91. [Oxford - Beta Shrinkage Critiques](https://ora.ox.ac.uk/objects/uuid:4679cb02-068d-453c-9c63-9f272a93bf63/files/skk91fm71d)
93. [Stockholm University - Live vs Backtest Empirical Failures](https://su.diva-portal.org/smash/get/diva2:2065345/FULLTEXT01.pdf)
94. [ResearchGate - Beta adjustments literature](https://www.researchgate.net/publication/316312545_Best_Practice_for_Cost-of-Capital_Estimates)
96. [Emerald - Systematic Risk Peer Selection](https://www.emerald.com/arla/article/doi/10.1108/ARLA-02-2025-0045/1353250/Peer-selection-and-valuation-a-systematic-risk)
97. [St Andrews - Carbon intensity returns](https://research-portal.st-andrews.ac.uk/files/305466461/Trinks_2022_EJ_Carbon-intensity_AAM.pdf)
99. [ResearchGate - Bounds of Brute Force Search](https://www.researchgate.net/figure/Minimum-Backtest-Length-needed-to-avoid-overfitting-as-a-function-of-the-number-of_fig2_275302374)
101. [Scribd - What to Look for in a Backtest](https://www.scribd.com/document/898703074/What-to-Look-for-in-a-Backtest)
102. [SlideServe - Spotting Backtest Overfitting](https://www.slideserve.com/adora/how-to-spot-backtest-overfitting)
106. [Stanford - Ex Post and Ex Ante Sharpe Ratio formulas](https://web.stanford.edu/~wfsharpe/art/sr/sr.htm)

**Sources:**
1. [semanticscholar.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEzjxe4wjap1gp2yia-yMBq-Pln0e9_i9Rkmw3vnaPGUAOiG5n-SEQVdYkY120y5oJsuADaAtmlLpDJ8FpriNG95kmgYM_DTYW42DDtZeFNbRPX6IehKy7qapDhIzJQRvHDXOUS5WlYqo8S8tsdbm31wolO-79Q0kB3OFBJvryyCo4fR8g=)
2. [schwab.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGoS129vW8P5qB-aEVPBzsEXdbfy4RkGgX5n0pqWv-y4XIzuIOY0ng-ll4szckP5AtyE5VUdCqvIOJoQPSZaY8fIAgetYxGRtLq7HhEvwlXCv48MZg-bcI28IA-nKWBge0qvCXbX08faKBHJRpEH9dq2a48JfvvspCX02DPzw==)
3. [stanford.edu](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQG0cmW6xbRRVGjo7cGKf8VKYeAwplV_LT_Kq-mF5CpCAGutbpO2uQbMTblZC4QWVMUPkRfOrKytCoqGb35i0eNSsOiMb-hws5HVAg4JIz5iEv7kCCfAHSz2m1O7Qff5U3qpT4TWb-4=)
4. [lbl.gov](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQF5u_D1p2CYkhVeE7EkQ8Ct4HMi0Yrx4V_UE4ZuJJMdUmP9BowIOuZ2imhB86R5AyH22kHg64xrPmCxsyy02GTV-fhMHDPhQWBxZyOH7HDqsY-CsSESe1qZ7wLFxPqEBaRaIUeLbPuTm70cjck=)
5. [medium.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGcs9j9RmXNXJ3wmtQL7A0ZE0OqmK09gJ6FvHJg5hdnqZR8rSxcVjDyf_MmzRW_o6MFT3brn5RtuBVIx4ngjNh6j5dRajLY-73ZV8F2zI9-b6pmL-H6_l56unxObDaGI7XP_nFF5xX4-OlHnmhZKH-MeGsC1D13IrtIo81tnpIHXMxj9LRp9A==)
6. [medium.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGMnYHVMfIkGqZx_aGHE3buy3aDz-JrEQqtpnHEqgaOR-eh5_2r6FcX6zjR6WYOeD9ghhgvbgLwtWY7WcUdY7NDolJ1y8qp8IkZ5tTXzvBuJeSrH2e01e-svZnCkUEtzuwl50-HptWcsZKUwHXPxjrY1xsafonTJ_FpxmW88PjdIOdNgI4K4HGEZgJsMznrIc3nrrfvkKpLQmD7WtMvibMvfeg10PU_PQ==)
7. [davidhbailey.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGSLljmSOVqH3KTrP3ayoqaW1soJq6WGhPHxjY-ID-NSn92auVZ-Pi-rzfoRvGk7wtER0mTTVo8mOym6uUD8023RxvESiyFPVuCoODLAMtPXgBOMrhdaTUtPTHIxFve4NvIXZGavf6oZqu6yJxGq1Pi8A==)
8. [davidhbailey.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHx4tIGm-p4D1T-V8_W4PUY174BUEGmET3uO9GXcLvHvjT6zIeAzsOq91Gc6bqhTeAoA4cAI0vknYv6UCPsV2uOJ6iE7Dc5eXGUY8BhfpDTNjyAWlBqrsJ4QGfg_7I2S7398o3IpPClA6lJCJPmq_QJ)
9. [quant.fish](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQE8CthxuC8t65OXxwgkZYlZPIGPsay3lbcGI0oQKcSjlgI2ZMRwftJHPCBEPQ_XrH7fd5xMGkrGfiEbIxghpXTotN99mTeZg7hFYpPyoqCCLs5jr0dwKVNcogctQBhWBL-Qwc6vDt745_4IFixOUQwLxuDj1cYisGftSO1v09293A==)
10. [medium.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEROb-ZjyoIfWCySiqjDUM9Oy8u6KFN-YZke6KbGf52rTJ1c-gmmmgt8gdScUJRiSQ3-tCnjg91xekDKZmb9Luha0lmpLX5N7gOsdaL5rRkoVWLhRIC50WY2bH7ZGLH1CtLbWt5aITcj7p9Hc2k5EiKqImcgAdN-SvCBD6sFjZ0fq8P-hOM)
11. [substack.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEN9jGDrWQ1pUutPDlvp7xKt9ktu6qSa2Ew0e8eGE3FUEFT0GmfQ4BINJwJz3sCwGvJJcgM5uWH7oY9VNevU9Wo2W-LfgnwoZ4Nw3bIGrKBACaIa4pjNxyK3BboclRsKyGQuhX0TPv--CbT_flUSXjsae1emwkg-pa5q9yvpA==)
12. [wordpress.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHYIM84Btt3hkF9sJfMw4iv5Uv0gL9zN_Lm2JINHDvEJlFFhtQAT2b2mh1-B4rN8IBZerNJzfRZbMzWe-N0GlBtsKODKPmvdBGclUliO1xskvxQZasQRXSZAAMTHxg4L4Ac1l-eznb7O4BOUE40JDQfIV6j5RsIjvRHY18H_ExmiYghSrKRU0Xx1POZON4RWz9cG0Go3_CEMmGz3ushtjJY9A==)
13. [scribd.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFvPIh-MGSSnle2n6ruggUyp3RKsCGvEhuVJ9zE9gCqEkz7WcDm_sYYDuS4s-KX8YqzfD0eUR8sRTAfHnZaH1fRoA9aex7WAuM8d_0gYB9m2TJfFQYFnKWVRDt7XCRMJbrSkhg7sSpLsZfcdNzhwsKFJs9grvU4KIFn2tzI58M=)
14. [slideserve.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHyWdyMOgTxb8EDlhn6i6j52KOG1l6ou5tNdvhJUBQz4JukbFaEN4bgLtcEershLux4342C3CfLlOI4-7kyiSh6GWrkJlc3lfYjWVZ2rXkbMKxJ76MmxzGhdV8ZhxFsWCjIgcwlondZEMVufT_xBg7W3asS9wAc8Q==)
15. [wikipedia.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQF6HHKoulnr2cUWRi4fyc7UJTe5sMclT1kBRZn1s_YGOuXg60oVgzlCoM0H8Phv99NdaYzQYMjEKMYN2eEpztz3P-hq9Wsl-hpV0NgoQLDlNOPltpPO6Ls06G3LBkqKahqUaCdf3v2okxI=)
16. [quantresearch.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQElE05z12cc-MIAyiuCWYZMNpbm5m4kGEDmQSU744LHZsr9C_0_rDZ8Kq9tL6VLiW1fTkaMmz3wYGJGBci7P3fVsVvITPE3OaeAegbJv-7lSCn4_rdop2YKnMHdFkNpSx_Pb0cd)
17. [quantresearch.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQE0mP4uzP-3Vt6pKhkY64DoZ-ShrnMstAhmc0gBTs3umc7GXBH7COOfA_Fle5BKDzcOrc99a2uxPKBwJCUaPbm8yZ1NPXyfeWvOChagHVO_zfWi3dtd5irTpgiu-mepxAV434k=)
18. [researchgate.net](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQG1DNttVbk4n1jIEONBswq_CmdsEH_dLFvZ8ReYX78cL6clVd3KWEqeCMniZj2AQJtpOdeDzONxWXTChHd2u-q9qhFeTvo9dVDfB-zn_rVFj9CfHJhuKTTw6J5zT0svUDFCeVEXaFVWOpgqBN3R_Tz-IZh6NcbWbWsO9LoxuLcciKNTIt44d6yzwSx1pE9urvAWrD3lhcwa6e7xNrqawC4IkHo6GSc9)
19. [ncsu.edu](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQH5HxRsctnGfNG_f7tPNvT9kCFwjrle29bWAX91mwF_VUrOdwjT4RpfPsnyI44dXGf9kNp4dWkyTzD0gaL0VCLiJ37S60HVWhn3swPpafOS9jF4QFBsBTAqJnXQnP39ByoAIy9e3ZLcj11jXUGA2IkHTJhtxRQG_9cgwb0-QzYliBBp-mbHcSVvuO2JzAeKZmneuCjGSqjJOsORutJEQtB2tDb1Qyg=)
20. [backtestbase.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGklPLGnxUY0Tgtb8Yrxro3EDHoPmC48NXIP6LcYcWjibzGMNuePwwtKhlzikHIq9rPdC9VurRdv8ut5IEUleV6WRQdxaKWB9gQ-4HJIIEbczdMyv-Nda8_tpohfmzS7edTXfb7qLbp9jO_W848CqSvvJkx7lAVFVL6)
21. [quantdare.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFnHdzEoin9kz_BjjhZJFb2xhVCAwEkJsix6Y0pRHzG4S1d5UKLo-OnOY3NBwmrGN7QrrxbrNUZ2gh9Z4NV7XcQJu-8jaaKEKOOiMCV-XtfZzDiPVy-8bKMorwtwysZHyQ3hou44r87XT1XtJA4XBhjDEjd_DBgmxPEwUlCZiLJBcVDHoLo5uqRGg==)
22. [researchgate.net](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGQ9O8-zmdm9tAm-yghmBwMZtPXe2gI2LPtX1FhjhW86mePQVbxVhQmqIKwdeMdaeV1IKzKRPGCrNhZB8yqyOSINYaEHk3sgmA7lVFLJ_YmNbJGTcQOsckab1X-LQeQqum_rht9QI5OLLmvXR3vHMMkzV_yL5ybaDzkKSAWOV8-wqEEpl9hfsTKki9UYQABP9iQIfOGA0ZshORp9YmUXCahW6Gx9hgikLsl2yxhT9mgZKpP10zsXfMlFf3Zvg==)
23. [arxiv.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQG9t9QI96M3Lfw8If7yeN7Jf0ygE6oEJar-nXYk_0TZUkcTIZWxCiXpaqDQ0X80kVnhst3bCMee9LPOsBj8lhp4RH7Y6uFgCq48P1zb83V7UiYb2fLL9A==)
24. [pm-research.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQH2p_RqwMNN1U4WhABtiaWeAcLXeTQnFo_G1jgOKaie7sC89y4WxC9Tu-3-b_-Zy3x5TC54ecYjtNe4SUUDuq6gvfaFHwa8d1dpAZQ4KpBeViiiKprHJ3MrC0RKLbmlkZgjtEBCJKGp0VPqz20kveUie-OU7nl7mntNm8LETbw=)
25. [garp.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGxyFHc9lbbmBqUxFIYoO6nlaDVHoFN1yUWmyRmvXszzdHSlqhim0k3-VhuKu9lgURCFsQutjVA55Bx5aPUwsEEGyENDexoOOORbrHFEUOMDdjZGjFXyHZY3QzHCnR-RHwNlu7f1BZ353iZ0GOO2t969Ekp)
26. [ox.ac.uk](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGxwHbjcmWBVXZfOrwrfdx42pwCZ1COziwOueSABuu3NjCcVQtgUvzYSVxTn_BIqSaJTavbfP54MIkpedFWZnuEwoVIOPHwcwddwsF8YP7BN1EQHYaTPC0A1DcoNFiBbTM41Lt1kc1mLJj0xipDXYRfgM-Lj_0nDnVP4qdgSQMpcnbIt_Vr5VPA652rA70=)
27. [researchgate.net](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHgBDS-4D6LRHr0gqsp7rO3A1wIHRWZF4DJSe7x_SoVZIo026aHgPHmTZA-EnoqgCOeGdLpMseDF5HNDR6T-U5vXwWTTBeQf0MGVRGcwEYY1oEbGYCNPPhEGlMhaZRAr6TA-K3Woe3VO6n9jtvxQtdqWF9qBoboBWFGIlgOUP1jne7Ob8VBHExhfq5oRRPfvozZNBqu)
28. [emerald.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGY9PaqOyTW0qlx0D6Jr5c0bxWlWfSerbE_uIuRymkLHAcmBbs_vDQP1inMopISj3aWmbbH3kCdqGVqf9B3mgi5_k0nWyBkyknslzwPrZx0RdKscZoP4SivL5-GBsnZ3cx8EqzxMYDr6YvglHZjGsWmFRknB9rebvm6SvPYG8Oyt1vpmmzXnB5WyHkCIgA8BQj3oNm1CMM1msKFl_9a3ruch8WXqSsMnxes1YnTHOFQ)
29. [st-andrews.ac.uk](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQE-PeSC7U5moCjalSLG5qpn6arXBVJs4TwUTGZxsWz6b_iy8u_p0Mr9mIy3cRn5oZsw9xBRefokvfk4myIME0st58agD0dnJ57XSWTzXgu1sZagCTGq9lkSv6iE9thTse7WJRpFBLXkXxr7oTjy1q_iT1oKl8hrijyEPSPny-_fDdCfF275SRMl8FzWXskchoJzPYzgpLU=)
30. [arxiv.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHrgHfxBms4EyJcuy2t6WtWRDvYaRi0tturaH37wQ9CwCUpOuWRSXnXrLqVLf0CWeKGQtsl-1M8eD-7Z4zBqBnLWgKYthmHcEj1Tin3GH1EbYcmFdUbyrRZrA==)
31. [edhec.edu](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQE28fgGrTGCTh5CvUy8w-1n3ycVTP_5C659PLwUaWbSNwEImnuNMJm7vZH4_0dPbbZ64losk9nMHH3DJELiKY4mXk0VjUwItgkdM9OhZye5nRYJNKL919eKQpLWtHdQajBcT0WoB0X221jIo5CQ6SIJGNVsfDoEie9Lok0M5xsJYlXip1IAn-W_-vycGFx3lQYcMAZA_kcIIBWJU72Sg0nFo84WZzTRP3L-1lFxh5eaRGGc8CYVeESKDDZuaEwPX8gFBoIbIQ==)
32. [pm-research.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFbeTqRsa94Ek94FDxPQVwG1lIxQv4tOAKs919-mFv6hQCmgER0t745Nyu-wBStid7OjEPip_CnCAeCWP5eLK3zrPr1NTirlmi3z8z5Wsj4JIcm16eJGnzizjHJNHQca2QtE0Wo4rmEVQYP7trmL4fRuzaa9MqYad4I7A==)
33. [researchgate.net](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQErRkbQpbouXnZ5LQuFeoxFp_dFtXwzZi-u7cqb6u4wXjxmcSXMsRm5XAKyPbinbBIvf37lxnXYHq8I9BQk-wRn-8dQqE_WjWxjw8rpfoX4yIsmFvxb-x_8zepPvAwU14AFrtVo8Yg0bf94I6Z6vulE4QU-1mTeKG-cutz-6xibVO_LBwXwq5Y49bmG7kkb6qCMdIwM6pL5PSyb_b1dapo8y_orkHnDEhTljyGl55yLqy_Cj1rMT-PQK3_Ef1hHMEVNPoRcV8VfK4MeTVU=)
34. [imf.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQE5WVZbtE-BmLKkKDx5eVP2KYab06JJaJR3mrc2ISF1f_vxcu_sr9MGDqMlkn4V9uU1CgBQHF5tEI0kM1ELFtivDgnlj7BCf_uGMv4_1u3berOgE4yoaTXhuOS40-5H8c_zUHOTM77Y-8VPRWC1rvuhCWEChj-YgXu-hV96ylQ-XjYoMQ==)
35. [researchgate.net](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFxM457P1RWmmIX_YYd7lcIALSGHT6uY1wk_mjoN8Wklx2zFD8Ft6CoRocPwXp9i_fIdM4GZahnIHa2dUKN0efdulFtbAmNdwPkKyOxSH00s3n84uu1oigr0UoIQd8Gjftk34xA_ufDlEmyEyzRkWqQmQqBvGelO-tHn2EfUn_80285DzQHhuddzhghH3qKWiQTVPdWGF0WG9LcqIuhGwCQ1SHC3RXlmSToqRbhjaVVcGfBFtiqoK45)
36. [diva-portal.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEFeDJZjIhUds6_b_DVtZ07csceC7KyMFH98_gUnp1KYDaMn_PNefvU1xSFcpgHwmIpysE_mCUg2muvSTp6NS8MWGD8MfPaaMWxDrDLH3KWjTmwp8d9aCnawXONevm-zy8UUOx17L8aRDjGhykWiTrYXoNJ38I_OQ==)
37. [arekansoftware.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHKN9Cy2LXePcY2P7zaqoNQkXilPE80kQ1H2DrYnIKPLSY3w7-ifp7kBBL3cRxLiNBTd41Xe-jXG35WbO7IMzccS3QRcoKyhYOIZJCo9mhzeWrF3Vy27fGdYthPKsSo4t_lXKbW7tVVOSuDqHR85VL1f7QSwIHiUkFHqJgyUtxZwTLtyUmQNA==)
38. [utexas.edu](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQH83CB28l_Zl9cdlDw7k-C3fM69PZExjLKyuloqXLc8ivPvak7w8AlfFh7Lk0Xjtrq_BHxzvuxbSw9DhWKrsEQF1w76rZhObZXMSRxuzoCteGwZssYq8bnQ1NtfBNJ6O_5tkEnFZSlGtTF0IfkW6IZJ8JzsflNxCmzmBFWXtivkMG2VXXCo16n02x2y_w==)
39. [wallstreetscholars.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEmXU6iILyprK02peKFaoBaoe3gmtvBcv3NPBWZBUxeLwvryUWH0sxCRSAjvYQjLYPhwdvlYU3Hah98zPYOhqs67QzSYXuywA0x_RuC32qmkKa682fUKGAUVQd-41-n4_3SIg==)
40. [youtube.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFRa0aqGUxVgFAMflWqe3FoQIqlhAPiHsjATXSh2gqMatAe7AWe1n119iSI6bhYGQVeqKWU1NMhdSp-s0WGJh2ePoEbd3K8lx5kyIMioBqa2wQVDQ2R7L6HuceEC3AnSDak)
41. [arxiv.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFjaVZEy_mK0kxOEgxXyEMxtp06Svy79nZdMgeqkvBuBn4IZ9BiiWisVE441bkTiL9aInI76gHGfSADajTILc6jS3OMiP1cWzs7UBFWlQPWBikkW8kcqiuFdQ==)
42. [arxiv.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGetVrej4UUq0LA1ts-yBdivzCQU_Zk79HROxX77q8PAFjEpcylMoXLia9E_nvI_0kx_nvFxDcEeTv8NE87Z_4EKyyTyqAJYh0Mag0dQteZuZOudOCRDVxgUQ==)
43. [mckinsey.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFLuxzKyl39KqDTCouy-80r2-TdJBxlKSkFmWsjeSzaL8t1-XbEpCIhZPsYJ77VXvuR9IJ3RG49n3hadAERGgoEfpprlnN7UZt-dCgPFCLfwaakDLlB-m54cYYSzv_WeKxZJY1A-Ky_mjDaUe48CYozcb2iTIJ7lKpeHNsE2KrvRZkmTJUC)
44. [futuresearch.ai](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFzNIgJi7w1WbpuIxP678zV4sIDufRTNmSdlamT1V2sjpgyP2vyTRUOZdgPBLj07VTEqi6dxEXEGcuUvGj8ibnF7K7QB0_-Ua4axyrxOvTqueXnBGc_ZT2UJ6Z3klqYeo9rXdWFNQigiCxtxNf1bVwrBbQ=)
45. [portfoliooptimizationbook.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQE83h7Tvrt3BlW3iXV30ftUTkf3lbDGsVqK0yrtXw1Kf8o3qfPeZlFqPO2DMjHXTGcNHeMQclaabikSO_4dP0GOs3FvQ64FcQf0lgKcfHk8ZbE1ftBtFjM3yfbYigOxtC0ZFyorNFduh_O7oWnl66rPjY7jRcsXTb--)
46. [aclanthology.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQH1Yfnti2o-MPhpLU3qQbb0XV6HDE_JnvoHx7PvwcmwFWnBJM-68R3AS2IkShHsCpxtDSGPPf2Ths9JvFHtPgQTfjOJuOG81rnpZad9HRZ4xiJcocaoGzqmOqFVUw5dBw0hSel7N1l-CT0=)
47. [royalsocietypublishing.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQH9Hm1KaX_XtVrOa6PIMO7sI_3ViOq1v8rCeSuDj3UCrxZYM6BoRlKV45rDBbMfH9TLpmSVh3Mi0BnV5cBQHazJphIFty0t_ISUh5VdMFunKY2t6W2YHXLNLN_SX-_s0ySdftPD8tmPr9H6q0ePHkPHIExksWRw_Bz5KoeShlMOq0CX7Jf4nfq3uUIsUT2uMB-9A9aNZQ3nwZvb8IGR066ES-go3A==)
48. [aclanthology.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFyWgN2llBPhGZP1d7cal1u0r5wdwPDRdZqXlYnauk6RvVCrPYFPj_l1DIJdSS8h7DyzM8RRD4w8SSkWa_279WqX3M_4bflynrK1xdkeAfrhkNcuWKsWpRPCBoa4-5z3v7f_evTUrbPvpI=)
49. [researchgate.net](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQF33LQc5TClX-1zS1Cezjyz_8FJPRTaElDGwXL021uLffCGYnfm7zxAbhzJrYMQvt69NBjFbdAX1l6dIYjaR5kF8co7Xf9KV1JtzFcLCK2F7xA2qL7q2xdmGlWSwwO9mSZ08U-q1uJnA3ODEaGGzNQpMvlEb5afqIdt62cCTkGEeq36yhwNF3V3YRj2ko8OTMEVWJEX4qGWi3RiTVkll53OJQ==)
50. [repec.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEPsWZwQzAZ6Y7POlyTNVtwRIb76sRSL4A0We2FT0UqPoDKe51qBxezAnuxBvW4tIYPbSJptxQhJ_smyRl8iuCjgtEBFZauMCfNQDxrHxPTo2NSyWloZYhvznCcfHy-RiO7SVSM96nTDlJ6)
51. [rmcmgt.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQF_DwhVk6cvnSEgHrAZi1_Xnh601lOFldZLGQZ1IMConzXz0IcxwdXA9yDs8zQ1hVt8TQSb2zptLNVO1RsWLDmsg8QuyKYW6nF2UvvAZD7O2uJMyl0l717ZW6E3dIwiNa9f28UUByuEHLPg1lxn9TdItMiAO6tg1mhycHLETKhfkNUJ0Rt1qBgKwM0dhXM2tkA9UVW3VPWcJ4sEG_fgk3bGfVk4KNP2gwx13vOV6vxCpCCHFJZT)
52. [agent4science.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGUkOENedOclTir9iSbRlAuQivNJFuVZTA_nAAMr3Z59qy9Hk1dRIEVG5SnqPES6MbPDtXZbYkE_QZ48nhRLI72XLbuwYHfBKj6rUF38Kty9BhGFGafbJX9zTS-y6LM0ha6d_0mksghqGVC)
53. [medium.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEcLAzOxrxJhQMh_Hq9Nd_cqI6WU5UUK9YWrTN06ToNJNYujHzCgSRt26gwWL5QAuaVDeP0-1BIDYYJt6CCTwPlW8-RlfEWCLyJgnP1_yzq0rWQpCOqk3x30puIjHsyGgwB9TSWPhDnNcLN3lXD44C3HBZdLWfirgyIvBFJFnyGqU5Rfz1vpQMhyEzHrLCLUQUYslp1gSHGixU=)