# LLM-based signal generation for high-frequency trading in 2025

The integration of Large Language Models (LLMs) and other foundation architectures into quantitative finance has changed how algorithmic signals are generated. However, deploying these large-parameter models within high-frequency trading (HFT) and ultra-low-latency market making presents a severe computational paradox. LLMs offer strong capacities for semantic reasoning, multimodal data synthesis, and nonlinear pattern recognition, yet autoregressive inference produces tokens one at a time, which creates an inherent latency bottleneck. That latency profile is at odds with the microsecond demands of high-frequency order execution. In 2025, the research and deployment frontier has shifted away from generic, monolithic model querying toward highly specialized, latency-optimized signal generation strategies. These encompass direct microstructure transformers, asynchronous multi-agent orchestration frameworks, hybrid reinforcement learning pipelines augmented by semantic sentiment, and knowledge distillation techniques designed to map the deep reasoning capabilities of LLMs onto high-speed statistical learners and Field-Programmable Gate Arrays (FPGAs).

## Exchange Microstructure Dynamics

To evaluate the efficacy of any signal generation strategy, it is first necessary to define the physical and institutional parameters of the market microstructure in which the algorithm operates. High-frequency markets operate on strict queue-based order execution models where transactions are matched continuously based on absolute arrival time [cite: 1, 2]. In these environments, signal decay is not measured in minutes, but in microseconds. 

### Execution Mechanics and Transaction Cost Blind Spots

The practical consequence for a trading desk running passive execution strategies is that algorithmic fill-rate optimization may achieve fills at seemingly optimal prices while incurring severe queue-position-driven adverse selection costs [cite: 2]. These costs are often entirely invisible in standard Transaction Cost Analysis (TCA) outputs. The dynamic component of market making captures the optionality value of locking in a queue position; once an order is stationed at a specific queue depth, there is tangible economic value in maintaining that position because re-queuing after a cancellation incurs a substantial penalty in execution priority [cite: 2].

These mechanics are heavily influenced by the specific exchange protocol. For example, calibrations against the NASDAQ ITCH market-by-order data on highly liquid US equities and ETFs (where bid/ask spreads sit close to a single tick) demonstrate that Order-To-Trade Ratio (OTR) drift is a critical precursor to adverse selection [cite: 2, 3]. Elevated OTR on a specific instrument over a defined, time-windowed pre-fill period strongly indicates that market participants are aggressively submitting and canceling orders to probe liquidity or manipulate queue dynamics [cite: 2]. Under MiFID II RTS 9 regulations in Europe, OTR is a mandated surveillance metric utilized by major exchanges such as Eurex and the London Metal Exchange (LME) [cite: 2, 4]. If an LLM-based agent processes price data without explicitly accounting for these granular, protocol-specific microstructure signals, it will systematically misprice liquidity provision [cite: 2].
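A time-windowed OTR of the kind described above can be sketched as follows. This is a minimal illustration, not an exchange or regulatory definition: the class name `RollingOTR`, the event labels, and the choice to count every order message (submission, amendment, or cancellation) as one "order" event are assumptions made for the example.

```python
from collections import deque


class RollingOTR:
    """Order-to-trade ratio over a sliding time window (hypothetical helper)."""

    def __init__(self, window_s: float):
        self.window_s = window_s
        self.events = deque()  # (timestamp_seconds, kind)
        self.counts = {"order": 0, "trade": 0}

    def update(self, ts: float, kind: str) -> float:
        """Record one event and return the OTR over (ts - window_s, ts]."""
        if kind not in self.counts:
            raise ValueError("kind must be 'order' or 'trade'")
        self.events.append((ts, kind))
        self.counts[kind] += 1
        # Evict events that have fallen out of the window.
        while self.events and self.events[0][0] <= ts - self.window_s:
            _, old_kind = self.events.popleft()
            self.counts[old_kind] -= 1
        return self.ratio()

    def ratio(self) -> float:
        # Assumption: floor the trade count at 1 so that a window with
        # no trades still yields a finite ratio.
        return self.counts["order"] / max(self.counts["trade"], 1)


otr = RollingOTR(window_s=1.0)
for ts, kind in [(0.0, "order"), (0.1, "order"), (0.2, "trade"), (0.3, "order")]:
    current = otr.update(ts, kind)
```

A rising value of `current` ahead of a passive fill is the kind of pre-fill drift the paragraph above describes; the window length and any alert threshold would need to be calibrated per instrument.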

### Structural Exchange Asymmetries

Furthermore, global exchange structures introduce idiosyncratic volatility regimes that generic LLMs fail to model correctly. A primary example is the Tokyo Commodity Exchange (TOCOM), which operates with split trading sessions that induce distinct intraday and overnight volatility regimes [cite: 5]. For instruments such as TOCOM rubber futures, the dichotomy between intraday returns and overnight returns is exacerbated by trading halts and physical delivery constraints [cite: 5]. 

Standard daily Value-at-Risk (VaR) models, including asymmetric Generalized Autoregressive Conditional Heteroskedasticity (GARCH) variants, systematically underestimate true market risk by conflating these different volatility dynamics into a single daily series [cite: 5]. An effective LLM deployment in Asian or European markets must recognize these institutional boundaries, using two-tiered risk management frameworks that apply conventional models to intraday risk and jump-aware measures to overnight risk [cite: 5]. Failure to integrate the exchange's specific institutional mechanics into the model's representation of the market leaves speculators and clearinghouses dangerously exposed during periods of market stress [cite: 5].
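The first step of such a two-tiered framework, splitting a daily series into intraday and overnight components, can be sketched as below. The function names are hypothetical, and the empirical-quantile VaR is only a placeholder: the text above calls for a conventional model on the intraday tier and a jump-aware measure on the overnight tier, neither of which is implemented here.

```python
import math


def split_returns(bars):
    """Split (open, close) daily bars into intraday and overnight log returns."""
    intraday = [math.log(close / open_) for open_, close in bars]
    overnight = [
        math.log(bars[i][0] / bars[i - 1][1])  # today's open vs. yesterday's close
        for i in range(1, len(bars))
    ]
    return intraday, overnight


def historical_var(returns, alpha=0.95):
    """Empirical VaR: the loss at the alpha quantile, as a positive number.

    Placeholder estimator (assumption); a production system would swap in
    a tier-specific model.
    """
    losses = sorted(-r for r in returns)
    idx = max(0, min(len(losses) - 1, math.ceil(alpha * len(losses)) - 1))
    return losses[idx]


bars = [(100.0, 101.0), (99.0, 100.0), (102.0, 100.0)]  # toy (open, close) data
intraday, overnight = split_returns(bars)
```

Because log returns are additive, the two series together reconstruct the full open-to-close path, so risk can be estimated per tier without double counting.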

## Latency Specifications for Large Language Models

The deployment of generative AI in trading is strictly governed by the latency-quality trade-off. This dynamic is rigorously quantified by simulation systems such as HFTBench, which evaluates real-time decision-making of LLMs using historical per-second trading data sourced from Polygon.io [cite: 1]. HFTBench utilizes a linearly decaying price model to simulate queue-based execution, assigning execution prices where faster agents secure more favorable outcomes [cite: 1].

The HFTBench environment shows that trading tasks demand high response quality and low latency simultaneously [cite: 1]. If an LLM agent produces a highly accurate signal but suffers from high latency, the alpha decays entirely before the order reaches the matching engine, resulting in adverse execution [cite: 1]. Conversely, deploying excessively small models to reduce latency often results in degraded decision quality; if accuracy is overly compromised, the speed advantage is negated, and faster execution merely accelerates the rate of financial loss [cite: 1]. Empirical testing indicates that moderately larger models (e.g., 14B parameters) generally outperform smaller alternatives in trading benchmarks because their capacity to recognize high-reward patterns outweighs the marginal latency penalty, provided the infrastructure is heavily optimized [cite: 1].
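The trade-off can be made concrete with a toy model. The sketch below is not HFTBench's actual pricing formula, which is not reproduced in this text; it simply assumes that the capturable edge decays linearly to zero over a hypothetical `decay_horizon_ms` and that wins and losses are symmetric.

```python
def captured_edge_bps(raw_edge_bps, latency_ms, decay_horizon_ms):
    """Edge remaining after latency, under an assumed linear decay."""
    remaining = max(0.0, 1.0 - latency_ms / decay_horizon_ms)
    return raw_edge_bps * remaining


def expected_pnl_bps(accuracy, raw_edge_bps, latency_ms, decay_horizon_ms):
    """Expected PnL per trade with symmetric payoffs (assumption):
    win +edge with probability `accuracy`, lose -edge otherwise."""
    edge = captured_edge_bps(raw_edge_bps, latency_ms, decay_horizon_ms)
    return (2.0 * accuracy - 1.0) * edge


# Hypothetical agents: a fast, less accurate model vs. a slower, better one.
small = expected_pnl_bps(accuracy=0.52, raw_edge_bps=10.0, latency_ms=100, decay_horizon_ms=1000)
large = expected_pnl_bps(accuracy=0.58, raw_edge_bps=10.0, latency_ms=300, decay_horizon_ms=1000)
too_slow = expected_pnl_bps(accuracy=0.58, raw_edge_bps=10.0, latency_ms=900, decay_horizon_ms=1000)
```

With these illustrative inputs the larger model wins at 300 ms but loses to the small model once its latency approaches the decay horizon, which mirrors the qualitative finding above.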

### Hardware Bandwidth and Compute Architecture

The optimization of this trade-off requires specific hardware and inference software configurations. In 2025, the baseline for cloud-based LLM inference latency hovers between 800ms and 1.5 seconds, which is viable for daily portfolio rebalancing but disastrous for HFT [cite: 6, 7]. To penetrate the intraday and high-frequency execution windows, institutions are deploying customized hardware stacks.

Token generation for open-source models is fundamentally memory-bound: throughput is limited by how fast the GPU can read model weights from High Bandwidth Memory (HBM) [cite: 8, 9]. Migrating from NVIDIA H100 architectures (3.35 TB/s bandwidth) to H200 architectures (HBM3e with 4.8 TB/s bandwidth) on bare-metal deployments has allowed quantitative funds to drop Time to First Token (TTFT) metrics significantly [cite: 8]. For a Llama-3 8B parameter model, H200 bare-metal clusters allow the entire model to stay resident in HBM, with the L2 cache serving frequently reused data, dropping TTFT to 40 milliseconds and total response delay below 300ms [cite: 8]. 
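A back-of-the-envelope bound shows why bandwidth matters. The sketch assumes batch size 1, FP16 weights (2 bytes per parameter), and that every generated token requires reading all weights once; it ignores KV-cache traffic and kernel overhead, so real throughput will be lower. The function name is hypothetical.

```python
def decode_tokens_per_s_upper_bound(params_billions, bytes_per_param, bandwidth_tb_s):
    """Rough ceiling on single-stream decode speed for a memory-bound model."""
    model_bytes = params_billions * 1e9 * bytes_per_param
    return bandwidth_tb_s * 1e12 / model_bytes


# 8B parameters at FP16 is about 16 GB of weights.
h100 = decode_tokens_per_s_upper_bound(8, 2, 3.35)  # about 209 tokens/s
h200 = decode_tokens_per_s_upper_bound(8, 2, 4.8)   # about 300 tokens/s
speedup = h200 / h100                               # about 1.43x
```

Under these assumptions the H200's gain tracks the bandwidth ratio directly, and halving `bytes_per_param` through quantization doubles the ceiling on either GPU.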

Further latency reduction is achieved via the NVIDIA Blackwell architecture (B200/GB200 GPUs). Blackwell features 208 billion transistors and natively supports FP4 tensor core quantization, with a claimed 4x speedup in LLM inference over Hopper architectures [cite: 6].

Independent MLPerf 2025 benchmarks report 2.5-3x p95 latency reductions for GPT-scale models, dropping latencies from 800ms to 300ms on batch sizes of 128 [cite: 6]. Firms employing custom silicon, such as Groq's Language Processing Unit (LPU), bypass the traditional GPU execution model entirely, achieving processing speeds exceeding 300 tokens per second on 70B-parameter architectures [cite: 10].


### Runtime Engine Optimization

Hardware capability must be paired with optimized serving stacks. The choice of inference runtime determines how the system batches requests, overlaps prefill and decode phases, and manages the key-value (KV) cache [cite: 11].

The primary contention in 2025 is between vLLM and TensorRT-LLM [cite: 11, 12]. vLLM is built around PagedAttention, which partitions the KV cache into fixed-size blocks, reducing KV fragmentation to under 4% (compared to 60-80% in naive allocators) [cite: 11]. This enables high GPU utilization with continuous batching, making it highly effective for concurrent workloads [cite: 11]. 
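The block-based idea behind PagedAttention can be illustrated with a toy allocator. This is a conceptual sketch only, not vLLM's implementation or API: the class `PagedKVCache` and its methods are hypothetical, and no attention computation is modelled.

```python
class PagedKVCache:
    """Toy block allocator: each sequence owns a table of fixed-size blocks."""

    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size
        self.free = list(range(num_blocks))  # ids of unallocated blocks
        self.tables = {}                     # seq_id -> list of block ids
        self.lengths = {}                    # seq_id -> tokens stored

    def append_token(self, seq_id: str) -> None:
        n = self.lengths.get(seq_id, 0)
        if n % self.block_size == 0:         # last block is full (or none yet)
            if not self.free:
                raise MemoryError("no free KV blocks")
            self.tables.setdefault(seq_id, []).append(self.free.pop())
        self.lengths[seq_id] = n + 1

    def release(self, seq_id: str) -> None:
        """Return all blocks of a finished sequence to the free pool."""
        self.free.extend(self.tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

    def wasted_fraction(self) -> float:
        allocated = sum(len(t) for t in self.tables.values()) * self.block_size
        used = sum(self.lengths.values())
        return 0.0 if allocated == 0 else 1.0 - used / allocated


cache = PagedKVCache(num_blocks=8, block_size=16)
for _ in range(33):
    cache.append_token("a")   # 33 tokens -> 3 blocks
for _ in range(16):
    cache.append_token("b")   # 16 tokens -> 1 block
waste = cache.wasted_fraction()
```

Waste is confined to the partially filled last block of each sequence, which is why fragmentation shrinks as sequences grow relative to the block size, whereas a naive allocator reserves the maximum context length up front.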

Conversely, NVIDIA's TensorRT-LLM utilizes Kernel Fusion, which consolidates multiple tensor operations into single CUDA kernels to minimize DRAM-to-compute data movement [cite: 12]. It also leverages CUDA Graphs to capture GPU operation sequences for efficient replay, drastically reducing CPU-side overhead [cite: 12]. For algorithmic trading where sub-100ms latency is an absolute requirement, TensorRT-LLM often delivers lower single-request latency on identical hardware when strictly compiled for a specific model [cite: 11, 12]. However, telemetry indicates that at high concurrency (e.g., executing multiple simultaneous portfolio queries), TensorRT-LLM's TTFT degrades significantly, whereas vLLM maintains more stable tail latencies [cite: 12].

## Limit Order Book Transformer Architectures

Historically, quantitative finance relied on Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks to parse the spatial and temporal dependencies of Limit Order Book data [cite: 13, 14]. By 2025, sequence modeling for microstructure analysis advanced significantly with the introduction of the Limit Order Book Transformer (LiT).

### Deep Hierarchy Modeling

LiT is engineered specifically to address the "deep hierarchy" inherent in high-frequency order books [cite: 15, 16]. Unlike generic sequence models that treat the LOB as a flat vector of numerical prices and volumes, LiT is structurally aware [cite: 13, 15]. It utilizes customized attention mechanisms to map cross-level dependencies, explicitly differentiating the microstructural significance of activity at different depths—for example, recognizing that a Level 1 bid absorbing liquidity holds fundamentally different predictive power than a Level 10 bid indicating informed institutional stacking [cite: 15, 16]. 

By processing the full depth of the order book across both bid and ask sides simultaneously, LiT captures the nuanced pairwise comparisons necessary to forecast short-term mid-price movements, order book imbalances, and impending price jumps [cite: 13, 15]. In isolated classification tasks, architectures with positional or dual attention have demonstrated superior performance in intraday price-movement classification compared to traditional deep learning baselines [cite: 13, 17].
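For orientation, the order book imbalance that such models forecast can be computed by hand from a depth snapshot. The sketch below is a hand-built feature, not LiT's attention mechanism; the geometric `decay` weighting that emphasises levels near the touch is an assumption chosen for illustration.

```python
def level_imbalance(bid_sizes, ask_sizes):
    """Per-level imbalance in [-1, 1]: positive means more resting bid size."""
    out = []
    for bid, ask in zip(bid_sizes, ask_sizes):
        total = bid + ask
        out.append(0.0 if total == 0 else (bid - ask) / total)
    return out


def depth_weighted_imbalance(bid_sizes, ask_sizes, decay=0.5):
    """Collapse per-level imbalance into one number, weighting level k by decay**k."""
    imbalances = level_imbalance(bid_sizes, ask_sizes)
    weights = [decay ** k for k in range(len(imbalances))]
    return sum(w * i for w, i in zip(weights, imbalances)) / sum(weights)


# Toy two-level book: level 1 is bid-heavy, level 2 is balanced.
per_level = level_imbalance([300, 100], [100, 100])
summary = depth_weighted_imbalance([300, 100], [100, 100])
```

A fixed weighting like this treats every market state the same way; the appeal of a learned cross-level attention is that the relative importance of each depth can change with context.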

### Empirical Performance Limitations

Despite representing the state-of-the-art in transformer-based LOB modeling, direct LLM-style architectures applied to raw tick data face severe empirical realities when subjected to full-friction backtesting [cite: 14, 15]. 

In a rigorous walk-forward out-of-sample evaluation covering January 2020 through February 2026 (capturing the COVID-19 crash, the 2021 bull run, and the 2022 bear market), the LiT model generated an annualized return of 10.04% with an annual volatility of 7.90% [cite: 15]. While the Transformer successfully learned a return signal that forced the optimizer out of a low-return minimum-variance trap, it was structurally defeated by significantly simpler models [cite: 15]. 

A naive 1/N equal-weight baseline achieved a superior Sharpe ratio (1.04) and Calmar ratio (0.78), experiencing roughly half the maximum drawdown (-16.5%) of the Transformer strategy [cite: 15]. More critically, the Transformer was strictly beaten on return, Sharpe (0.32), drawdown, and Calmar (0.50) metrics by a rudimentary 20-day momentum factor [cite: 15]. Although a Memmel-corrected Jobson-Korkie test indicated that the underperformance relative to simple momentum was not statistically distinguishable from sampling noise, the result is still a practical failure: the far more complex model delivered no measurable benefit [cite: 15]. 
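The three metrics used in this comparison can be computed from a series of periodic returns as sketched below. The conventions are assumptions, since the cited evaluation's exact choices are not stated here: simple (not log) returns, sample standard deviation, a zero risk-free rate by default, and Calmar defined as compound annual growth divided by the absolute maximum drawdown.

```python
import math


def sharpe_ratio(returns, periods_per_year=252, rf_per_period=0.0):
    """Annualised Sharpe ratio using the sample standard deviation."""
    excess = [r - rf_per_period for r in returns]
    n = len(excess)
    if n < 2:
        raise ValueError("need at least two returns")
    mean = sum(excess) / n
    var = sum((x - mean) ** 2 for x in excess) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Worst peak-to-trough decline of the equity curve (a negative number)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


def calmar_ratio(returns, periods_per_year=252):
    """Compound annual growth rate divided by |max drawdown|."""
    total = math.prod(1.0 + r for r in returns)
    years = len(returns) / periods_per_year
    cagr = total ** (1.0 / years) - 1.0
    mdd = max_drawdown(returns)
    return math.inf if mdd == 0 else cagr / abs(mdd)


toy = [0.10, -0.50, 0.20]  # toy returns, not data from the cited study
mdd = max_drawdown(toy)
```

Because Calmar divides by the single worst drawdown, it penalises the Transformer's deeper trough more heavily than Sharpe does, which is consistent with the gap between the two ratios reported above.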

This empirical reality demonstrates that while transformers possess immense representational capability, feeding raw, high-dimensional, noise-heavy microstructure data into large attention blocks is computationally inefficient and highly susceptible to overfitting structural market breaks [cite: 15, 18]. The network learns a noisy, momentum-like signal with similar autocorrelation but at vastly higher computational cost and latency [cite: 15]. Consequently, the industry consensus has pivoted away from using massive transformers as direct end-to-end price predictors at the tick level, shifting instead toward decoupled architectures.

## Decoupled Multi-Agent Orchestration Frameworks

To mitigate the brittleness and latency of monolithic models, modern HFT and intraday trading systems have adopted multi-agent orchestration frameworks [cite: 19, 20, 21]. These systems decompose the trading workflow into specialized cognitive tasks, leveraging LLMs not for raw tick ingestion, but for complex state perception, regime classification, and strategic routing [cite: 21, 22].

### Temporal Horizon Constraints

The degradation of LLM alpha in direct trading is closely tied to the temporal horizon of the decision. LLMs act primarily as information filters; when forced to make high-frequency buy/sell decisions based on highly noisy intraday data, their performance falters [cite: 19, 23]. 

Extensive ablation testing of models like GPT-4.1-mini acting as autonomous trading agents across five major equities (AAPL, MSFT, AMZN, TSLA, NFLX) reveals a distinct "Goldilocks zone" for LLM decision-making [cite: 23]. Operating on a weekly rebalancing horizon yielded the highest performance, achieving a Sharpe ratio of 1.028 and a cumulative return of +62.4% over a two-year period (2023-2024), while utilizing 75% fewer trades than daily models [cite: 23]. Daily rebalancing resulted in a lower Sharpe ratio of 0.892, while monthly rebalancing degraded performance dramatically to a Sharpe of 0.421, as the model lost critical signaling information [cite: 23]. Notably, no LLM frequency surpassed a simple Buy-and-Hold strategy (Sharpe 1.620) during this sustained bull market period, reinforcing the necessity for LLMs to handle strategic asset allocation rather than rapid tick execution [cite: 19, 23].

### Agent Specialization: QuantAgent

QuantAgent represents a shift toward specialized, price-driven multi-agent systems designed for slightly longer temporal windows (e.g., 1-hour to 4-hour bars, functioning as mid-frequency trading) [cite: 19, 21, 24]. It decomposes the trading process into four distinct agents: Indicator, Pattern, Trend, and Risk [cite: 19, 21]. 

Each agent is equipped with domain-specific tools and structured reasoning capabilities to capture distinct aspects of market dynamics over short temporal windows [cite: 21]. By isolating signals into explicit modules, QuantAgent has demonstrated up to 80% directional accuracy at short horizons in zero-shot evaluations across ten financial instruments, including Bitcoin and Nasdaq futures [cite: 19, 21, 24].

### MM-DREX: Multimodal Dynamic Routing

Financial markets exhibit severe non-stationarity, rendering fixed-structure trading models obsolete during rapid regime shifts. The MM-DREX (Multimodal-Driven Dynamic Routed Expert) framework addresses this explicitly by decoupling market state perception from strategy execution, enabling adaptive sequential decision-making [cite: 20, 22, 25, 26].

The architecture introduces a Vision-Language Model (VLM) functioning as a dynamic router. This router jointly analyzes visual candlestick chart patterns and long-term temporal features to classify the current market regime [cite: 20, 22]. To generate high-quality regime labels for pre-training, datasets undergo classification through technical indicators validated by institutional traders into three states: uptrend, downtrend, and consolidation [cite: 20]. 

Based on this real-time perception, the router dynamically allocates weights across four heterogeneous trading experts: trend, reversal, breakout, and positioning [cite: 20, 22].

This allows for the generation of specialized, fine-grained sub-strategies [cite: 20]. MM-DREX is trained via a novel SFT-RL (Supervised Fine-Tuning and Reinforcement Learning) hybrid paradigm, which handles joint training of heterogeneous modules with different objectives, mitigating gradient interference while synergistically optimizing both classification and risk-adjusted decision-making [cite: 20, 22]. 
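The routing step can be pictured as a gating function over the four experts. The sketch below assumes a softmax over router scores and a weighted blend of each expert's target position; MM-DREX's actual gating and aggregation may differ, and the names `route` and `router_logits` are hypothetical.

```python
import math

EXPERTS = ("trend", "reversal", "breakout", "positioning")


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, expert_positions):
    """Blend expert target positions using router weights (assumed gating)."""
    weights = softmax(router_logits)
    blended = sum(w * expert_positions[name] for w, name in zip(weights, EXPERTS))
    return dict(zip(EXPERTS, weights)), blended


# Toy inputs: the router is indifferent, experts disagree on direction.
weights, position = route(
    router_logits=[0.0, 0.0, 0.0, 0.0],
    expert_positions={"trend": 1.0, "reversal": -1.0, "breakout": 1.0, "positioning": 0.0},
)
```

Keeping the weights explicit, as in the returned dictionary, is also what makes a routing audit trail possible: each decision can be logged together with the expert mix that produced it.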

Extensive experiments on multi-modal datasets covering U.S. and Chinese equity markets, futures, ETFs, and cryptocurrencies demonstrate that MM-DREX significantly outperforms 15 diverse baselines (including state-of-the-art financial LLMs and standard deep reinforcement learning models) across total return, Sharpe ratio, and maximum drawdown [cite: 20, 22, 25]. The system also incorporates an interpretability module that traces routing logic and expert behavior in real-time, providing a critical audit trail for strategy transparency [cite: 20, 25].


## Semantic Sentiment and Reinforcement Learning

While LLMs encounter friction with the numeric density of raw LOB data, they remain the strongest available tools for natural language processing, semantic extraction, and financial sentiment analysis [cite: 27, 28, 29]. Financial news and macroeconomic reports exhibit long-memory effects and delayed incorporation into asset prices due to limits to arbitrage [cite: 27, 28]. The optimal strategy relies on asynchronous fusion: utilizing LLMs to extract real-time sentiment from unstructured text and feeding these high-level signals into traditional high-speed statistical or Reinforcement Learning (RL) execution engines [cite: 30].

### Sentiment-Augmented Proximal Policy Optimization

The Sentiment-Augmented Proximal Policy Optimization (SAPPO) framework exemplifies this synthesis [cite: 31, 32, 33]. Standard PPO algorithms in quantitative finance rely exclusively on historical price and volume data, optimizing policies via reward functions tied to cumulative returns. However, they are entirely blind to the exogenous textual shocks that drive sudden regime shifts [cite: 31, 32]. 

SAPPO utilizes a finance-optimized LLM (such as LLaMA 3.3) to parse continuous streams of Refinitiv financial news and generate daily sentiment scores [cite: 31, 32]. The critical innovation lies in how this sentiment is integrated: rather than treating sentiment as a mere input feature alongside technical indicators, SAPPO directly alters the PPO advantage function via a sentiment-weighted term [cite: 31, 32]. 

The advantage function dictates how much a specific action exceeded the expected baseline return. By injecting a sentiment influence parameter (empirically optimized through ablation studies at $\lambda = 0.1$), SAPPO forces the policy network to aggressively scale up exposure when favorable price momentum coincides with positive semantic sentiment, while dampening allocations when price action diverges from the underlying news narrative [cite: 31, 32, 33]. 
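One plausible reading of the sentiment-weighted term is an additive adjustment to the advantage before it enters the standard PPO clipped objective. The additive form and the use of portfolio weights to aggregate per-asset sentiment are assumptions made for this sketch; only the clipped surrogate is standard PPO, and the source's exact formula is not reproduced here.

```python
def sentiment_adjusted_advantage(advantage, weights, sentiment, lam=0.1):
    """Assumed form: A' = A + lam * sum_i(w_i * s_i).

    weights   -- portfolio weights chosen by the action
    sentiment -- per-asset sentiment scores in [-1, 1]
    lam       -- sentiment influence parameter (0.1 in the text above)
    """
    exposure = sum(w * s for w, s in zip(weights, sentiment))
    return advantage + lam * exposure


def ppo_clipped_objective(ratio, advantage, eps=0.2):
    """Standard PPO surrogate for one sample: min(r*A, clip(r, 1-eps, 1+eps)*A)."""
    clipped_ratio = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped_ratio * advantage)


# Same price-based advantage, opposite news backdrop (toy values).
aligned = sentiment_adjusted_advantage(0.5, [0.5, 0.5], [1.0, 0.0])
diverging = sentiment_adjusted_advantage(0.5, [0.5, 0.5], [-1.0, -1.0])
objective = ppo_clipped_objective(ratio=1.5, advantage=aligned)
```

Under this form, an action that loads onto positively rated assets is reinforced more strongly than the price signal alone would justify, and the reverse when sentiment diverges, while a small `lam` keeps the text signal from dominating realised returns.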

In empirical testing on multi-asset portfolios (e.g., evaluating a three-stock portfolio of Google, Microsoft, and Meta), the integration of LLM sentiment via SAPPO increased the Sharpe ratio from 1.55 (standard PPO) to 1.90 [cite: 31, 32, 33]. The strategy also resulted in a statistically significant reduction in maximum drawdowns relative to purely price-based strategies, confirmed via $t$-tests ($p < 0.001$) [cite: 31, 32].

## Knowledge Distillation and Gradient Boosting

A key operational realization in quantitative finance in 2025 is that LLMs do not need to sit in the critical execution path to contribute to HFT alpha. The computational overhead of even heavily quantized parameter sets precludes them from generating responses in the microsecond windows required to compete with elite market makers [cite: 8, 12, 34]. To resolve this, researchers have developed distillation and rule extraction frameworks [cite: 35, 36, 37].

### The Distill-to-Select Paradigm

Model distillation in finance involves utilizing a massive, computationally expensive LLM (the "Teacher") in an offline environment to process deep historical datasets, unstructured filings, and complex factor interactions. The LLM identifies non-linear logic rules, penalizes hallucinated or adversarial signals, and curates a sparse set of highly predictive semantic features [cite: 35, 36, 37]. These insights are then distilled into a lightweight "Student" model—typically a Gradient Boosted Decision Tree (GBDT) architecture such as XGBoost or LightGBM [cite: 37, 38, 39, 40, 41].

Frameworks such as the Distill-to-Select approach build feature selection into the training process itself [cite: 37]. By combining label accuracy, teacher mimicry, and feature sparsity (via L1 regularization) into a single loss function, the resulting logistic regression or shallow tree is both compact and aligned with the deep reasoning of the high-performing teacher [cite: 37].
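A combined objective of this shape can be sketched for a logistic-regression student. The specific weighting (`alpha` between label and teacher terms, `l1` on the coefficients) and the use of cross-entropy against the teacher's soft probabilities are assumptions for illustration, not the published Distill-to-Select formulation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(p, target, eps=1e-12):
    """Binary cross-entropy; `target` may be a hard label or a soft probability."""
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def distill_select_loss(weights, bias, X, y, teacher_probs, alpha=0.5, l1=0.01):
    """(1 - alpha) * label loss + alpha * teacher-mimicry loss + l1 * ||w||_1."""
    n = len(X)
    label_loss = mimic_loss = 0.0
    for x, label, teacher_p in zip(X, y, teacher_probs):
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        label_loss += cross_entropy(p, label)
        mimic_loss += cross_entropy(p, teacher_p)
    sparsity = sum(abs(w) for w in weights)
    return (1.0 - alpha) * label_loss / n + alpha * mimic_loss / n + l1 * sparsity


# Toy data: two features, two samples, teacher gives soft probabilities.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1, 0]
teacher = [0.9, 0.2]
baseline = distill_select_loss([0.0, 0.0], 0.0, X, y, teacher)
```

Minimising this with any gradient-based or proximal optimiser drives uninformative coefficients to exactly zero, so the surviving features are the ones that help the student match both the labels and the teacher.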

More advanced frameworks like *Statsformer* integrate LLM-derived feature priors into supervised learning via "validated prior integration." Semantic priors supplied by the foundation model act as an inductive bias for the base learners (e.g., XGBoost, Lasso), while empirical risk validation determines how strongly the system should rely on that bias [cite: 35]. Similarly, logic rule learning systems utilize Monte Carlo Tree Search (MCTS) to extract interpretable first-order logic rules from offline data, which are then used to boost reasoning capabilities without the latency overhead of Retrieval-Augmented Generation (RAG) [cite: 36].

### Efficacy of XGBoost in HFT

XGBoost and similar tree-based learners remain strong single-model baselines for tabular tasks due to their robustness and their computational efficiency, which comes from histogram-based split finding and, in learners such as LightGBM, leaf-wise growth strategies [cite: 35, 40, 42]. When benchmarked directly against LLMs in credit risk and financial fraud prediction tasks, XGBoost frequently achieves superior accuracy (e.g., 99.4% on specific baseline datasets) [cite: 39, 41]. Furthermore, adversarial training reveals that XGBoost matches the robustness of LLMs against input disturbances while supporting interpretable SHapley Additive exPlanations (SHAP) values that quantify the marginal impacts of features on predictions [cite: 39, 40, 41]. Because a boosted tree evaluates only a short chain of threshold comparisons, inference times are reported on the nanosecond scale [cite: 37, 42]. This allows HFT firms to leverage the offline semantic reasoning of foundation models while executing trades using fast boosted trees directly on the trading desk [cite: 38, 39, 43].
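The reason tree inference is so cheap is visible in a flat-array traversal. The layout below (parallel arrays with `-1` marking a leaf) is a common representation but is an assumption here, not XGBoost's internal format; a production system would run equivalent logic in compiled code or hardware rather than Python.

```python
def predict_tree(feature, threshold, left, right, value, x):
    """Walk one tree stored as parallel arrays; a node with left == -1 is a leaf."""
    node = 0
    while left[node] != -1:
        if x[feature[node]] < threshold[node]:
            node = left[node]
        else:
            node = right[node]
    return value[node]


def predict_ensemble(trees, x, base_score=0.0):
    """Boosted prediction: base score plus the sum of each tree's leaf value."""
    return base_score + sum(predict_tree(*tree, x) for tree in trees)


# One depth-1 tree: split on feature 0 at 0.5 (toy values).
stump = (
    [0, -1, -1],       # feature index per node
    [0.5, 0.0, 0.0],   # split threshold per node
    [1, -1, -1],       # left child per node
    [2, -1, -1],       # right child per node
    [0.0, -1.0, 2.0],  # leaf values
)
low = predict_ensemble([stump], [0.2])
high = predict_ensemble([stump], [0.7])
```

Each prediction costs one comparison and one array lookup per tree level, with no matrix multiplication, which is what makes the structure easy to pipeline in low-latency code.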

## Hardware Acceleration and Field-Programmable Gate Arrays

For absolute latency minimization, the industry has bypassed general-purpose GPUs for the execution phase, shifting toward dedicated hardware acceleration via Field-Programmable Gate Arrays (FPGAs) [cite: 44, 45, 46, 47]. FPGAs allow quantitative researchers to program digital logic directly into the silicon, optimizing data movement and bypassing the operating system kernel entirely (kernel bypass) [cite: 44, 45, 48]. In the cloud-native architectures of 2025, FPGAs are deployed directly in the data path to manage streaming, act as smart Network Interface Cards (NICs), and pre-process infrastructure data before it reaches central processors [cite: 44, 45].

### Embedded Transformer Quantization

Recent advancements have enabled the deployment of compact Transformer architectures directly onto FPGA boards (e.g., AMD Alveo U50, Spartan-7, or Xilinx Ultra96V2) [cite: 48, 49]. Training transformers on resource-constrained embedded devices requires overcoming significant memory demands. Researchers leverage low-rank tensor compression and quantization-aware training down to 4-bit fixed-point precision to create unified Tiny Transformer deployments [cite: 48, 49].
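The 4-bit step can be illustrated with a uniform symmetric quantizer. This is a generic sketch, not the cited accelerators' fixed-point scheme: it assumes a single per-tensor scale and the signed range [-8, 7], and it shows only the quantize and dequantize arithmetic, not quantization-aware training itself.

```python
def quantize_int4(values):
    """Map floats to signed 4-bit integers in [-8, 7] with one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the 4-bit codes."""
    return [c * scale for c in codes]


weights = [7.0, -3.0, 0.0, 1.2]  # toy values
codes, scale = quantize_int4(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

In quantization-aware training the forward pass would use `restored` in place of `weights` so the network learns to tolerate the rounding error, which is bounded by half the scale for values inside the representable range.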

By storing all highly compressed model parameters and gradient information on-chip (utilizing BRAM and URAM), these architectures eliminate off-chip PCIe communication bottlenecks [cite: 49]. Custom computing kernels employ intra-layer parallelism and pipelining to enhance runtime efficiency [cite: 49]. These tensorized FPGA accelerators can execute time-series Transformer inference with latencies as low as 1.03 milliseconds, consuming merely 0.033 mJ of energy per inference on embedded systems [cite: 48]. 

On institutional-grade hardware, the STAC-ML (Markets) Inference benchmark, a critical standard for financial institutions measuring time series models under realistic market conditions, demonstrates the speed of customized hardware [cite: 46]. Benchmarks for LSTM models (Tacana suite) on NVIDIA GH200 Grace Hopper superchips achieved p99 latencies of 4.70 microseconds [cite: 46]. FPGAs achieve comparable single-digit microsecond latencies by focusing on custom-tailored solutions that leverage precomputations outside critical sections for the final time step of sliding windows [cite: 46]. In the context of HFT, an FPGA-accelerated model parsing stationary LOB features or distilled LLM rules provides a deterministic, ultra-low latency signal that standard neural network clusters cannot physically match [cite: 44, 47].

## Comparative Performance of Signal Generation Architectures

The effectiveness of these architectures varies significantly based on the intended trading horizon, the underlying data modality, and the rigid hardware constraints of the trading venue. The following table summarizes the comparative performance and characteristics of the leading LLM-based signal generation strategies in 2025.

| Architecture / Framework | Primary Data Modality | Core Mechanism | Execution Latency Tier | Net Alpha / Sharpe Impact |
| :--- | :--- | :--- | :--- | :--- |
| **LiT (LOB Transformer)** | Raw Tick / Limit Order Book | Cross-level attention on deep LOB hierarchy | High (GPU-bound) | Underperforms naive momentum (Sharpe 0.32) due to overfitting and data noise [cite: 15]. |
| **MM-DREX** | Multimodal (Candlesticks + Indicators) | VLM dynamic router delegating to specialized trading experts | Medium (Multi-agent inference) | High robustness; significant outperformance in Sharpe and drawdowns [cite: 20, 22, 25]. |
| **SAPPO** | Text (Financial News) + Price | LLM sentiment parameter integrated into PPO advantage function | Low (Asynchronous signal) | Increases Sharpe from 1.55 to 1.90 with reduced volatility [cite: 31, 32, 33]. |
| **Distill-to-XGBoost** | Structured Features + Semantic Rules | Deep LLM logic rules distilled into lightweight GBDT | Ultra-Low (Nanoseconds) | Maintains LLM accuracy profile; ideal for sub-millisecond execution [cite: 35, 38, 41]. |
| **FPGA Tiny-Transformer** | Time-Series / LOB Features | Tensor-compressed Transformer on bare-metal logic gates | Ultra-Low (Microseconds) | High deterministic execution; single-digit microsecond latency [cite: 46, 48, 49]. |

## Evaluation Realities and the Alpha Illusion

As the deployment of LLMs in quantitative finance accelerates, the academic literature has been saturated with frameworks claiming extraordinary risk-adjusted returns [cite: 50, 51]. However, rigorous systematic reviews have identified a pervasive "Alpha Illusion" resulting from severe evaluation flaws in agentic trading research [cite: 50, 51].

Many academic LLM trading agents (such as FinMem, TradingAgents, and FinCon) report headline Sharpe ratios exceeding 2.0 or 3.0 based on short, heavily prompted evaluation windows that suffer from temporal leakage [cite: 50, 51]. When these strategies are subjected to rigorous reproduction harnesses that account for full-friction deployment realities—specifically bid-ask spreads, trading commissions, slippage, execution latency, and LLM API token costs—the gross alpha is rapidly devoured [cite: 50, 51]. 

For example, in a 2025-2026 reproduction harness testing an equal-weight portfolio (TSLA, NVDA, KO, XOM, MSTR), the TradingAgents framework reached a gross portfolio value of $106.4K but only $102.3K net of frictions [cite: 50]. More severely, the QuantAgent framework ended at a net portfolio value of $77.9K, massively underperforming a naive buy-and-hold strategy that ended at $104.8K [cite: 50]. In contamination-free, point-in-time real-market evaluations, most LLM agents struggle to consistently outperform a passive benchmark, exposing the gap between theoretical architecture research and deployable trading capability [cite: 23, 50].
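The gross-to-net adjustment can be sketched as a simple cost ledger. The cost model below (a flat basis-point charge per trade covering spread, commission, and slippage, plus a flat API cost per LLM decision) and every input number are hypothetical; they are not the cited harness's parameters and do not reproduce its figures.

```python
def net_final_value(gross_final, n_trades, avg_notional, cost_bps,
                    n_llm_calls, cost_per_llm_call):
    """Subtract trading frictions and LLM inference spend from a gross result."""
    trading_cost = n_trades * avg_notional * cost_bps / 1e4
    llm_cost = n_llm_calls * cost_per_llm_call
    return gross_final - trading_cost - llm_cost


# Hypothetical agent: $110K gross, 200 trades of $10K each at 5 bps all-in,
# and 500 LLM calls at $0.02 per call.
net = net_final_value(
    gross_final=110_000.0,
    n_trades=200,
    avg_notional=10_000.0,
    cost_bps=5.0,
    n_llm_calls=500,
    cost_per_llm_call=0.02,
)
```

Because trading cost scales with turnover, an agent that trades frequently can give back most of its gross edge even at a few basis points per trade, which is the mechanism behind the gap between headline and net results.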

## Conclusion

In 2025, the most effective LLM-based signal generation strategies for high-frequency trading are explicitly those that remove the autoregressive generation of the LLM from the critical execution path. While direct sequence models like the Limit Order Book Transformer push the boundaries of spatial-temporal analysis, they are computationally heavy and highly vulnerable to underperforming simple, robust heuristics like momentum. True edge is achieved through architectural decoupling. The most robust frameworks operate asynchronously: utilizing multimodal LLM routers (MM-DREX) to determine market regimes, employing real-time semantic analysis to reshape the advantage functions of reinforcement learning agents (SAPPO), or conducting offline Knowledge Distillation to imprint deep semantic reasoning onto microsecond-capable XGBoost decision trees. Ultimately, securing an advantage in modern market microstructure demands that the vast intelligence of foundation models be constrained, translated, and hardware-accelerated via FPGAs to survive the brutal latency realities of the electronic limit order book.

## Sources
1. [Transformers for Limit Order Books](https://jonathankinlay.com/tag/deep-learning/)
2. [HFTBench: latency-sensitive LLM evaluation](https://arxiv.org/html/2505.19481v1)
3. [Deep learning and price formation in equity markets](https://www.researchgate.net/publication/334328402_Universal_features_of_price_formation_in_financial_markets_perspectives_from_deep_learning)
4. [Empirical Market Microstructure and Institutions](https://www.researchgate.net/publication/375216303_Empirical_Market_Microstructure_The_Institutions_Economics_and_Econometrics_of_Securities_Trading)
5. [Hacker News discussion on Exchange structures](https://news.ycombinator.com/item?id=43045558)
6. [Working Papers on Exchange Microstructure](https://fisher.osu.edu/academic-departments/department-finance/dice-center/working-papers)
7. [Global Stock Market Indices processing](https://openreview.net/pdf/72d36a13499c3a444ae294f75de20a8e061bc9ef.pdf)
8. [LLM inference latency benchmarks 2025](https://sparkco.ai/blog/gpt-51-api-latency)
9. [Latency-Sensitive Agent Decision Tasks](https://arxiv.org/html/2505.19481v1)
10. [vLLM vs TensorRT-LLM Definitive Comparison](https://medium.com/synthetic-futures/vllm-vs-tensorrt-llm-the-definitive-2026-comparison-for-llm-inference-ed0943fb81d2)
11. [Bare Metal GPUs for AI Inference](https://ecosystem.aethir.com/blog-posts/the-inference-revolution-why-bare-metal-gpus-are-becoming-the-secret-weapon-for-ai-companies)
12. [Microsecond Latency Inference for Capital Markets](https://developer.nvidia.com/blog/achieving-single-digit-microsecond-latency-inference-for-capital-markets/)
13. [Financial Data Workflows with AI Model Distillation](https://developer.nvidia.com/blog/build-efficient-financial-data-workflows-with-ai-model-distillation/)
14. [Integrating LLMs in Quantitative Finance](https://biztechmagazine.com/article/2025/03/quant-strats-2025-4-ways-integrate-llms-quantitative-finance)
15. [Financial Risk Prediction and Interdisciplinary Application](https://arxiv.org/html/2503.21422v1)
16. [LLM Model Distillation Research Survey](https://medium.com/@abhi-84/llm-model-distillation-a-research-survey-3c2a2eeb61a7)
17. [AI Agents for Scientific Discoveries in Quantitative Finance](https://aixplain.com/blog/a-new-paradigm-ai-agents-for-scientific-discoveries-in-quantitative-finance/)
18. [Exchange data processing and analysis](https://www.wfic.net/2024-sponsors/)
19. [Broadridge to acquire CQG](https://johnlothiannews.com/broadridge-to-acquire-cqg-expanding-global-futures-and-options-trading-capabilities/)
20. [The Full FX Site Map](https://thefullfx.com/site-map/)
21. [WFECM Programme details](https://gaam2025.wfecm.com/programme)
22. [Text Processing Unigrams](https://huggingface.co/Cherishh/wav2vec2-slu-1/resolve/refs%2Fpr%2F1/unigrams.txt?download=true)
23. [Statsformer: Rule distillation into statistical learning](https://arxiv.org/html/2601.21410v3)
24. [Prompt-based LLMs vs XGBoost in Credit Risk](https://www.scirp.org/journal/paperinformation?paperid=143252)
25. [InsightTab: Distilling Data into Actionable Insights](https://arxiv.org/html/2508.21561v1)
26. [Multimodal Detection Framework for Financial Fraud](https://www.researchgate.net/publication/395168847_Multimodal_detection_framework_for_financial_fraud_integrating_LLMs_and_interpretable_machine_learning)
27. [Cloud Architecture for Open Banking Analytics](https://ijrai.org/index.php/ijrai/article/download/241/227/451)
28. [AI in Trading Technology Stack](https://www.nomtek.com/blog/ai-in-trading)
29. [Global AI Report 2025](https://dpo-india.com/Resources/Global_AI_Reports_&_Handbooks/Global-AI-Report-2025.pdf)
30. [Low-latency financial decision-making](https://www.mdpi.com/2673-2688/7/4/117)
31. [Guide to fine-tuning LLMs](https://medium.com/data-science-collective/comprehensive-guide-to-fine-tuning-llm-4a8fd4d0e0af)
32. [AI strategy cost impacts](https://www.avidclan.com/blog/why-your-cheap-ai-strategy-is-bankrupting-you-silently/)
33. [Quantitative Tools for Asset Management](https://www.pm-research.com/content/iijpormgmt/52/2/local/complete-issue.pdf)
34. [Seminal Quant Finance AI Papers August 2025](https://www.smallake.kr/wp-content/uploads/2026/01/Seminal-Quant-Finance_AI_LLM-Papers-August-2025.pdf)
35. [Evolution of Alpha Strategy Investment](https://arxiv.org/html/2503.21422v1)
36. [LLMs for Sentiment Analysis and Stock Prediction](https://web.media.mit.edu/~xdong/paper/jpm24b.pdf)
37. [News-Aware Direct Reinforcement Trading](https://www.researchgate.net/publication/396789797_News-Aware_Direct_Reinforcement_Trading_for_Financial_Markets)
38. [LLMs as proxies for DRL methods](https://openreview.net/pdf?id=w7BGq6ozOL)
39. [TradExpert: Mix of Expert LLMs](https://huggingface.co/papers?q=trading%20style%20adaptation)
40. [Financially Grounded Loss Functions](https://arxiv.org/html/2509.04541v2)
41. [Long-Term Capital Market Assumptions](https://am.jpmorgan.com/content/dam/jpm-am-aem/americas/us/en/institutional/insights/portfolio-insights/ltcma-full-report.pdf)
42. [Large Investment Model Factor Mining](https://www.fitee.zjujournals.com/rc-pub/front/front-article/download/135063175/lowqualitypdf/Large%20investment%20model.pdf)
43. [Statistical Significance of LLM Alpha](https://www.mdpi.com/2673-2688/7/4/138)
44. [LLM Trading Agents Decision Horizons](https://agent4science.org/page/paper_mm2ewidc95mixg6s)
45. [Sentiment-Augmented PPO (SAPPO)](https://openreview.net/forum?id=6DeqCxngGp)
46. [Leveraging LLM-based sentiment analysis for PPO](https://aclanthology.org/2025.realm-1.12/)
47. [SAPPO Evaluation and Benchmarks](https://iclr.cc/virtual/2025/35584)
48. [Limit Order Book Transformer metrics](https://www.smallake.kr/wp-content/uploads/2026/01/Seminal-Quant-Finance_AI_LLM-Papers-August-2025.pdf)
49. [LiT Sharpe Ratio Evaluation](https://jonathankinlay.com/)
50. [Transformers for Limit Order Books comparison](https://www.researchgate.net/publication/339642189_Transformers_for_Limit_Order_Books)
51. [Market Microstructure and LOB Innovation](https://zanista.ai/paperpal)
52. [Price Prediction exploiting stationary LOB features](https://www.researchgate.net/publication/341468840_Using_Deep_Learning_for_price_prediction_by_exploiting_stationary_limit_order_book_features)
53. [QuantAgent Multi-Agent Framework](https://github.com/Hypogenic-AI/refine-llm-trading-e871-claude/blob/main/literature_review.md)
54. [The Alpha Illusion: Reported Alpha from LLM Agents](https://arxiv.org/html/2605.16895v1)
55. [Agentic Trading vs Algorithmic Trading](https://openreview.net/pdf?id=gBX7JibXub)
56. [Alpha-GPT Automated Alpha Discovery](https://aclanthology.org/2025.emnlp-demos.14.pdf)
57. [Deep Learning for Portfolio Decisions](https://blog.ml-quant.com/p/quant-letter-october-2025-week-1)
58. [MM-DREX: Multimodal-Driven Dynamic Routing](https://arxiv.org/html/2509.05080v1)
59. [MM-DREX Semantic Analysis](https://www.semanticscholar.org/paper/MM-DREX%3A-Multimodal-Driven-Dynamic-Routing-of-LLM-Chen-Jiang/d1370c155148e2fe7605895460beb9ecb2f378b6)
60. [Financial decision-making LLM models](https://huggingface.co/papers?q=financial%20decision-making)
61. [Trading style adaptation frameworks](https://huggingface.co/papers?q=trading%20style%20adaptation)
62. [Dynamic Routing of LLM Experts](http://www.arxivdaily.com/thread/71342)
63. [GMI Cloud AI Inference Benchmark](https://www.gmicloud.ai/en/blog/which-ai-inference-platform-is-fastest-for-open-source-models-2026-engineering-guide)
64. [Llama 3 8B Inference Latency](https://artificialanalysis.ai/models/llama-3-instruct-8b/providers)
65. [Azure High Performance Computing Inference](https://techcommunity.microsoft.com/blog/azurehighperformancecomputingblog/performance-of-llama-3-1-8b-ai-inference-using-vllm-on-nd-h100-v5/4448355)
66. [NVIDIA NIM Benchmarking](https://docs.nvidia.com/nim/benchmarking/llm/latest/performance.html)
67. [DGX Benchmarking Llama 3](https://github.com/NVIDIA/dgxc-benchmarking/blob/main/llama3.3/inference/README.md)
68. [LiT Sharpe Ratio Backtest Results](https://jonathankinlay.com/tag/deep-learning/)
69. [Quant Finance LLM Papers Review](https://www.smallake.kr/wp-content/uploads/2026/01/Seminal-Quant-Finance_AI_LLM-Papers-August-2025.pdf)
70. [Deep Learning Architectures for LOB](https://www.researchgate.net/publication/339642189_Transformers_for_Limit_Order_Books)
71. [CNN-LSTM for Stationary LOB Features](https://www.researchgate.net/publication/341468840_Using_Deep_Learning_for_price_prediction_by_exploiting_stationary_limit_order_book_features)
72. [Transformers compared to CNN/LSTM for LOB](https://comp.ita.br/~pauloac/papers/icaif24-90_private_copy)
73. [MM-DREX metrics and baseline comparisons](https://arxiv.org/html/2509.05080v1)
74. [Deterministic market simulator evaluation](https://huggingface.co/papers?q=deterministic%20market%20simulator)
75. [Trading style adaptation evaluations](https://huggingface.co/papers?q=trading%20style%20adaptation)
76. [MM-DREX validation of robustness](http://www.arxivdaily.com/thread/71342)
77. [Deep Research Evaluation with Agentic Metrics](https://huggingface.co/papers?q=dynamic%20capability%20classification)
78. [Machine Learning and Generative AI in Investing](https://www.hermes-investment.com/uploads/2025/10/39badafcb4e775bca88d59c716f3df69/0019262_bd015852_spectrum_q3_25_v375.pdf)
79. [Financial NLP and Classification](https://huggingface.co/papers?q=financial%20report%20interpretation)
80. [Logic Rule Learning and LLM Extraction](https://proceedings.iclr.cc/paper_files/paper/2025/file/5d7e8991f75f3e5af14edf7aebb5be5e-Paper-Conference.pdf)
81. [LLM Bias and Context Analysis](https://emergentmethods.medium.com/llm-bias-check-5b023b9c735d)
82. [Stock Price Prediction Using LLM-Based Sentiment](https://www.researchgate.net/publication/388092371_Stock_Price_Prediction_Using_LLM-Based_Sentiment_Analysis)
83. [TCA Blind Spots and Microstructure Signals](https://electronictradinghub.com/tca-blind-spots-four-microstructure-signals-your-execution-dashboard-is-missing/)
84. [Rust parser implementations](https://lib.rs/parser-implementations)
85. [Exchange Microstructure Insights](https://ideas.repec.org/f/c/pya634.html)
86. [Programmer Analyst Jobs in Trading](https://www.zippia.com/programmer-analyst-jobs/jobs/)
87. [C++ Jobs in Trading Singapore](https://jobs-radar.com/jobs/singapore/cpp)
88. [Dissertation Advisor Profiles](https://www.scribd.com/document/386773755/Dissertation-Advisor-Profiles)
89. [Brandeis University Scholarship Outputs](https://scholarworks.brandeis.edu/esploro/output/F)
90. [LLM-guided semantic feature selection](https://www.researchgate.net/publication/395577256_LLM-guided_semantic_feature_selection_for_interpretable_financial_market_forecasting_in_low-resource_financial_markets)
91. [Credit Risk Classification LLM vs Traditional Models](https://www.scirp.org/pdf/jcc_1733178.pdf)
92. [Distill-to-Select Approach for Features](https://medium.com/data-science-collective/teaching-models-to-choose-features-wisely-a-distill-to-select-approach-a9359e2ba5d1)
93. [LLMs for corporate credit rating forecasting](https://aclanthology.org/2025.finnlp-1.11.pdf)
94. [Performance-guided LLM Knowledge Distillation](https://www.federalreserve.gov/econres/feds/files/2025108pap.pdf)
95. [Comparing Top Inference Runtimes 2025](https://www.marktechpost.com/2025/11/07/comparing-the-top-6-inference-runtimes-for-llm-serving-in-2025/)
96. [Choosing a Low-Latency LLM Inference Provider](https://www.gmicloud.ai/en/blog/choosing-a-low-latency-llm-inference-provider-2026)
97. [Inference Speed Benchmarks Explained](https://blog.prodia.com/post/inference-speed-benchmarks-explained-compare-llm-performance-for-developers)
98. [LLM Latency Benchmark Analysis](https://aimultiple.com/llm-latency-benchmark)
99. [NVIDIA LLM Inference Metrics](https://docs.nvidia.com/nim/benchmarking/llm/latest/metrics.html)
100. [Constrained trading strategies](https://huggingface.co/papers?q=constrained%20trading%20strategies)
101. [Chart patterns analysis and LLMs](https://huggingface.co/papers?q=chart%20patterns)
102. [Multi-agent trading system evaluation](https://huggingface.co/papers?q=multi-agent%20trading%20system)
103. [Trading strategies evaluation benchmark](https://huggingface.co/papers?q=trading%20strategies)
104. [Financial decision-making decoupling](https://huggingface.co/papers?q=financial%20decision-making)
105. [FPGA-accelerated Preprocessing in HFT](https://www.ijfmr.com/papers/2025/4/53007.pdf)
106. [ML Inference Runtimes Guide](https://medium.com/@digvijay17july/ml-inference-runtimes-in-2026-an-architects-guide-to-choosing-the-right-engine-d3989a87d052)
107. [FPGAs in the High-Speed AI Era](https://semiengineering.com/fpgas-find-new-workloads-in-the-high-speed-ai-era/)
108. [STAC-ML Benchmark and Low-Latency FPGA](https://developer.nvidia.com/blog/achieving-single-digit-microsecond-latency-inference-for-capital-markets/)
109. [Top High-Frequency Trading Firms](https://www.quantvps.com/blog/top-10-high-frequency-trading-firms-dominating-global-markets)
110. [FPGA Acceleration for Transformers](https://www.worldscientific.com/doi/10.1142/S2705109925500014)
111. [Tensorized Transformer Training on FPGA](https://arxiv.org/html/2501.06663v2)
112. [Tiny Transformers on Embedded FPGAs](https://www.computer.org/csdl/proceedings-article/isvlsi/2025/11130202/29yF4bJrc76)
113. [Edge AI Architecture and Inference](https://ijetcsit.org/index.php/ijetcsit/article/view/618)
114. [Privacy-preserving Transformer serving](https://conferences.sigcomm.org/sigcomm/2025/program/papers-info/)
115. [Support Vector Machines vs LLMs in Trading](https://www.pm-research.com/content/iijpormgmt/52/2/local/complete-issue.pdf)
116. [Future Trends in AI for Stock Trading](https://www.bitget.com/en-CA/wiki/what-is-the-best-ai-for-stock-trading)
117. [End-to-End Portfolio Generation](https://arxiv.org/html/2503.21422v1)
118. [Generative AI for Limit Order Book Modeling](https://web.media.mit.edu/~xdong/paper/jpm24b.pdf)
119. [NLP and Sentiment Momentum Extraction](https://www.researchgate.net/publication/396789797_News-Aware_Direct_Reinforcement_Trading_for_Financial_Markets)
120. [Distilling Structured Search into Efficient Models](https://papers.lunadong.com/area/reasoning)
121. [Few-Shot Tabular Classification using LLMs](https://arxiv.org/html/2508.21561v1)
122. [Distilling RAG for SLMs](https://www.newmind.ai/pdf/NEWMIND%20AI%20JOURNAL%20WEEKLY%20CHRONICLES%20-%201st%20Week%20June.pdf)
123. [Specialized Tabular Reasoning Models](https://arxiv.org/html/2604.13392v2)
124. [FutureX: Live Evaluation Benchmark for LLMs](https://huggingface.co/papers?q=live%20benchmark)
125. [Disentangling Volatility in Rubber Futures](https://www.researchgate.net/publication/397344586_The_Overnight_Jump_Disentangling_Microstructural_and_Informational_Volatility_in_TOCOM_Rubber_Futures)
126. [ETF Connectedness and Secondary Asset Markets](https://ideas.repec.org/f/c/pma618.html)
127. [Financial Risk Disclosure Reactions](https://ideas.repec.org/r/bla/jfinan/v25y1970i2p383-417.html)
128. [Unlocking Economic Insights with ESG Integration](https://ideas.repec.org/f/c/pya634.html)

**Sources:**
1. [Link](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQF9nENAETh82eJMRfVw2Wymeo5fvKv8XholBd22SmykF5EjeY_sMjv9yKIZP6F4C--kFZyRQC5pt_UeFQvgxVkzdBVW-FxcYRBeHN2Pz5TOhDxZs2xysFFdkQ==)
2. [electronictradinghub.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEk-Hk2m4IkmIY1TMcm1JvWhy_5qbsduhQjUpYlJz9OLRo9_B59d98CIa7KF1a-6yXhDnmwQ2SxzKihrVRA1s4U8Ceig9vZfc2CYpdNV2mBK_Y5jARyKn73qwlI2XbkmyVtGD4WCPFX3Gu7rwf_dtjnzfsadVtvbCjpVKclkz4RgsDjDJuv357Wuk8bpwG-o2P8hapIWl6LqpTr6KUXi9fJS9gTL51Jeg==)
3. [researchgate.net](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFYAx1zzxAt15Uxbm1F6d1ncJmUPP14w2v3pHqEG7Cteeqz7YcxbzwnE1xwa9QiMlYM5JTcjPQBdwYpk217OI1r8PAHRJOlKlIYR8X3WVCpSY8bsPLx0sHQNsjU-JBbdLopPXcciAon_ch3hvC6e_bnR56kYHMReIoCn5uEU6w9HC4zDBMLIA8qCK2jmjfJejEM2IIWIaDrbTihcfJs2yAe2qf_NWXgsIh2fKGdlSL7o1MylxhZoLd9AAyyQ0bJ-eOgAFvizUWRwJM4-w==)
4. [wfic.net](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGLepEDvlWGgnQULF_32ylU9L51pAhQ3fgq1Oldn3TVssB8I2wF9Or4gEYZLxCsYPVbX_o2gOu9MJOWJdsoOBuK7gmTrGdEouv3PejbJyX5XfzaqZZWzfbCjA==)
5. [researchgate.net](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEAzaVUSpTxBVeban7SkBvnx1QBiORbwvD6UIB49v2x49uP8pgZQsh3Oq20j_hG0KWi3lH46_XiBf1rbqA-IAVxIt6xyV9ZFl4AUiVm4hYqvoI48s7Kns9X_gJ4QvRkyfXpYX2ObaNfofP1ZXca3FduaQsWRj9NeKPOLqYLUsHbJXRc55LmLsBcq__CkYboUNqQYq62RODYeFi50v3dp94FKpA8E8PWKT29hDqjIByRQBoVrFfvW--UWE_2V04HYXkhJbvCtXnDOk0ucU75hU8=)
6. [sparkco.ai](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEc_YKtDIGgf68PlU6MWyJL11kIvL3avrLhnqk8JdU95gVxuB2UtVMR4BicmLclWnET_-DrZNGyyWGxrDWM2lxcF-DHQYm_F-TwqW66p_VjqYcKb5EVIRdQiKmOZheegPQ=)
7. [gmicloud.ai](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGqVOHfVRvvUvqfkKiw-aYSLwqh0FaBiKFpOeD9sy9RmUmeV1z7sbqnrMbo1G6BzYWBpdsaBNxPVTIBbUuNIpZLpFOlQVbxu4TrNDYn3e2EK66dMUOplmTM9fg4ssskrt0nd6sZhHTQ2s0Qek6-6hJmG5ImfW062-OaLoRoiqGX5OSmEFrqXbOQ)
8. [gmicloud.ai](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQH2R5bnn63T6PdBo75XqE7r33koJXxkyp7VNwXZh1G7N0ajHkiCo8fo6uZuETn2FYe5z-EjRdy5H7CVrI19kVEYk0JtwcWBVoMDG3iMPl5E_csiPoBf-tMewrfbycPdv1aF06RYVpoGujKKDeBYnanKHFeXOi9g-go83qWiYT8UbpeI97zhVu6CZs8MCK4XWSByk5vt79wJNxwX8pXgrUB2_OUWe3oQz9Z8-g==)
9. [microsoft.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEaVGG_P0OLLKX8rTXMoV9jjJO-3Y7c35NfmrlYELE6R8Mu13vthJ8WP8wGbeLosIJbu8B-NS0WIZH7HxWab2bjZRb_30Iu0RkrqFNhNVl8uPW5_L97-DeAIQVauS9RomK08rPnEQzF211EF91Bs2MlFOReBwOU2PxYeLu_XEiNy6g_ROkQ0S_a-CgwZCQVhB4_yJjeRek1AmxBcbmPxd-orYd-0NvrsIStgAbPhU7H0gKPrR_5nYaTFeNm_-19PRXViGwAwG8SezXiaw==)
10. [nomtek.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQG_coGjIcbze5nnJLqAy-DcEGfkJWPZaM0rbeqspHvocGnaOI5OQxxpNvWYlj6vyQeU--7DTtD6JvsnIyNJkjblXVBe4y7Tcai3lBLNjwQR9T4gUOOCCiwkVXnjFPBRTw==)
11. [marktechpost.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHlU4xcWw64uG_k4OluRZ8Hb3sxRgN49zXXFzI6r8fhGwBoYO1KU_9xQTRbJqNS6pUUUkaKg3G9PiJhSpK4tA4JsxF23NTuWawS7q8WVSJw03FgWDN4jJ4RaasAMPlvbJGr6cAO0Q1YyFFtHqhHk3_E5QsYvksRyG7eY79CjrYC1D3a1yIAOMGlCnXqrit-pMzkeFY6yjOWZi01ELP5)
12. [medium.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGy6JbJoskNdjZBPY5mrhPw2cYDR8_G3nYrzQtWp5qzy6JzEsPenGWJwr9ucsHHBDgEJmuZxhxdAZBKoek-s_fqb5TFI47pncFbk1klMmv4J06E7lB_XAPng-voNjlRW_YeTBEVfGez1nNp7gh4VYFU6CJtJXygoKda8U_qvAKFkSVheXSH9FZBkX1s5hOYwmmBBPO_artEtp4KFufW397pEVWK514O9GDOYn9Few==)
13. [researchgate.net](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGhrlFaIgvUYF0WYqc-DmVkUnTL4tM5CVlJud-jLTQsPidai7v_QdzlOCDXS2gA9cpFstvaqi6gkEZ3-95Y9juuBBAp4_ec9H_9C4131zB1z_XPigmEPzk2PcsyhkkJl-IyUV89T_bydd8f0qwPL5cOOfJNQyeC7fdtQt3rQ7kmUqsxTR-EtX2R97sm)
14. [researchgate.net](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFH2lw5IBPhzATGvtxVRx0rCpbxvmEQPHoskTeTNWmEPbnOHrJ0aobEAbCa_Pj8_5yx5L01Fd-gUpXrhcoWGIm21DYsrcBOPDmCm9FZPikwa2YgoPLH6EzqWyzDivqDT0OWM3Y1B8iPRxf_I_N-uCR2OHZXOeTOP--jOfxsqSZKusJJwfd1eyh_1KrcXcsWxJzdZODUqD1XYvZy6Vn7XF-SXFuK2YPqnJGy7kSE_TZewUZOcMXj9DKUW24x7QII4EBnAxdNQQ==)
15. [Link](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEHq6m1KBHKUKMlx5UbCAFykExeaC1A8VF3RlZmds1yusNoOMlpvzJf6UDbvvX-oX8X2ZH-L_8y-aK1-mrKyHMG3H610sD-sh-KCslyCNzBRkcRS0C6POdTUXge8omOme_05r4=)
16. [jonathankinlay.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQET7henXA4sjRqlFCw_-c14XOdRJnyHFnbZStj91qajWP1_ctV8t2SRUTfYnxGodINrGFAf4CHcSfbNRXMWqVD5aZ8vNHISy19hNyyCW2_WBUk=)
17. [ita.br](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGDLTm8-zEfSogi0JPLzA8yLHNsUXlB1AiNFIbaTwfYtG7x8D-lLP_6taqF629iPjExOQ8L3Q0l5lg5KyZUnAuVXk09W0xMIzxe8bOPcyumlEIuYcM6DpV4dxyGkKzuNL67t-BMNG9ABwyETvwlCL781g==)
18. [zjujournals.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFIHlv8-VIO51RkNR8dPsdqPIP-CP46tX8gosfOhSAiLbb_1YRrVpWmHBJrcIe54F40vCprepFLwSdZrxlQ11hc3Udu-QXGq25W3eVMlsHU69wcN19rwTLcTIYhz2x4OJhe_ekgxTfDfgub61mYMEgJ3v8tUzaNEwcWNuL3IrGAURJj795yTI-0t6ZMcTllwq8y5MHddNVt3CYHqZTWdtz_8BbpgJyikw_W76gh0iVsIeDo)
19. [github.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFQYAc7IOQzMNkstQr_cKqLX_U46fvOd0qZAwReemfb3IPTv23rYLxZ8qYMOpD90dW8a7QSrD3YjenIXzxMEnWhhDm2pDTYvYFeAIqAdnjcWFXDZggiqOj6zWxhOb4cLkALIFFWHHWZde0daFQA7QKuUfE-2N7tTN-a6RtBW2sqwr0q9rh2QXRY6qUlxR09G4zZBLA=)
20. [arxiv.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGODnMyS0fYfXaGYkYVy-_urdAlEkVnb2aV6uFanuTw9CzTdWn1HxaaAfgBT_Yblvcdn1J0llN2KyYQLdHLzNwCcXfwrGDyazdJD55lxnPLJpBPeF_GI4VTbw==)
21. [huggingface.co](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGaL4f3NIwCV3-ImcHYcV4aMAxXEKgbAIu57_K7RR6NGFGrAopHPgYKHKMMAjZoHsbYQq40MDtHxQaSwX3WdJm3wGl_yl9irrnXyWrRY8Mt90TsMR24t1rymFlJKP_jp_A2pjyO0flFKRyFdXVj5alUja6zQg==)
22. [huggingface.co](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFyBRzy7TGmPVz9Q-GEuBu26A7hbw4QRAhywcX29L_oCfFu2BvCeLR36YWE7POprrovz_yxF1PQfwmMTMa9o8FAyN7IZqWbvKpT5HUmEH25pS--iBr6I5nEBthgytXdL6rY_Bl9d4DaXeW3uX-Vybj6dQ==)
23. [agent4science.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHInyEDcyLsA3w1ieB2F-SYrLCAWF1KNPhjLqBnNIv9Fo1ATJipccP2PHypXp1z0FRa76gThm6RdLAi2NmhFypgByOaOwiGIXwKL4PbONCvtEBodHYsi94gthEkd7caH_rjcz8d-5A3dyYCag==)
24. [huggingface.co](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQE4j7Xe13YPuTISFu2k1MkOBUh8OVV6tdVlFs073MS03uG9fXL9dCLfvypM7ax4CfNGi4LUM5gUej9-gox6sB0j2ghpETrJniSB-RArbonQi0F-baoK4OYRuXW9Dn3emwaJutJIsSs=)
25. [huggingface.co](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEGUmgjodri0IzZdaWyLxRLBqjd3hl_6BD0egvbXLJcGLsvZ7c8q1zChvobv_0Qv7cmJGrwlPfdbTnz5CMpdn7aMKTdgi79SpVt51A44Rgw80uRVfUgofJCF8ur0HxUeZnTpPbuazJ299PqjZ3G-3qbmxg=)
26. [huggingface.co](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHrE55L1iUGXWwB0epzxAt-cBZKHc_OBWe4vLjwnsctyLObSg1jiMmQUbb8HvRKPmq4de31HnDeHKBQ7-DqnqR8nK7Kp1AF-eDphn1Y4mi_dVocish3ce2WKrlg7UEYJGwobDH-bDu-58-IRMqIGTi_ukmOzqzCU5w=)
27. [arxiv.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGnODVtpwTbzxry2exIDfbR9lkeDr158TYoe0vjc6bGiioQMm6u-nXu52lSk8ZSL7QKabHet-JwMejEN_9DFR1GSFZ8t0SBTUEBj4MR0CxD-V2qWLymPrMnbw==)
28. [mit.edu](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEErb8_AEWBg1pe2eha2aoFI0fTKu7t8tOsw5_h89dVWcxNigptnWhc0hmC3nfdtpLBZfvfI1tvqTEnWw5aWWPD0OyTGDsZFzqUxYFh_09Crzow09324dz07x9u2UUjkJ1Pea8sFmrK)
29. [bitget.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEdY3d-OxIRjReQRChpXEIU22RqcQWgCI9srPM_nTzHZ-2fOcIetUo9SfxpBVX48DNjZUKL4-rE3wCRxIgaEXPWWvsh06FnCCCNLArDw9LC-bBVuhihRrlQ5E-J-nJ6fbzxADwy6kaGK3O9qx1jGtIL16rYs6NEvHhzSCngpQ==)
30. [researchgate.net](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQE-iVNLuAB5RgXBxA4yFXKIztG1GQ48hi5P7MYoZukafrpo41FBrlfsjFl2MGBKD1OVVE1EvLCgwVq9jxOcdOiV3MZcqKY4lLXOV17yY_a6Q8191vYKmniIMOmeb4BQUIO87iOCnxaN286CSbNyo778HFBsNQ-YuR7cbtu0VyC3QvLYQjkexnpOSMUy2QJjOLZVbS6yA_P6EhfVOPmvNIB8DauPBST4)
31. [openreview.net](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFQXbBdyvDWutOJSkeYO9Wa-6mbKxaD-GDrDMsAf9lqRBWWxyrS_WKbY2-T9PjPk3mctghyjoLlzekuKFSc0atBYQdVB7sea94NU4_cSdPtCwiYv7e9Inci7ykSjCeZAUc=)
32. [aclanthology.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGmDd1D8D1nyYptDNZKVc5R4-oayyToJBUOGD-1_vOwiVeL42WKMuPJPKDY_lH_y44HJqhZDhzH4_5no6mznWViI4lSwE3NX5KyvkkEfRDJMTxLLKVDi4aWSyHT_GG4pg==)
33. [iclr.cc](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFc-XhblvGJBZ0gv73qGMwYqWlLJkEcc3BlQxUeLhpShGRHWRXoa7QmR6ZGk89ehFRXUIKSTl1xF7RAKv48yWJhm_EumT8kENbHaAuLOplltkaOdrJz2t_G)
34. [quantvps.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQH0-or5ikXq8Rr6j6ZsvzIOW6_WBM0eW19AeAEYHKFNkKdNFqjZ80xyVSja7Drs1u_xWuVnNZF9sVVFxmyeHMyBfbAR84Dfd0Db1yYJI2VfHkGrV7_hapBNF8ah8iz4auBYRlEpw9vimctOjnkSrN8SzcY1uB0a2okB0G6EVVf_fqSY1Gyehu2pmkZIA-981-Yj)
35. [arxiv.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGVt2CXlR1kk_wQQ-Usja2Y1x-fEGKph7V6Soc8DLcBBCsaSNUk1j3gn0O_nLTi9sSv8KJ8jJ36GEiQdOjyLOxUgvrMJ7FWGc2yrfiD7VewLcGSsMYuYbAXlw==)
36. [iclr.cc](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEGHypTjNlp8XjiKT-1IpsLexnn8oxdeQxs0F7bnzI5thYamYXZZ5EwVZ018wFuajRVvmZWKfffOrTV9cTAKK1UiS5rz5o2VmAgLgWpDhbw04rVkLCvsGY8a61kVT4fJl-ERZK4xuE5iZD9KcisPl3qMSLOVeDIZda-_r_Ylw9q5H19ud3W-yB-jX3Wu-rZg5uBe-yZJR3MheA0kxDc4-Mh1Ovm)
37. [medium.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQH5mioqyhEsnsE6-mYahSqG4yKDkmwfb6bA6YFceEpACWDnFW-pSnMtcODsz9xuLHAYZGIhUZsQVGHmALsK08Gy3TSWFypDXgotGXbWVMmFE-It9vDEUqgX8wN1r2Vam-dB2ehPaBp5kIZqmKX6ZHf_11vbcVM1YUNBRjtERD1pLS-gUFLuUrPR9LRGAJxFyeS7OfEcq9FbUqN0HIyvpxSPT9zd0lvvBn9HAl8ZkV-uTV-LtzA=)
38. [nvidia.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGPS7YDv7jKoFb60QOXqVrnB6VVsvM30gczxECC-2vEpQUuPi8Yu_LzLcz-gUDkNnrRfid4jeKDB1Ahow6-hc5Y3DIgfPG9YcyuaHmm0TWRk2puAItiwgaXZ7ZMbex_h-QUCwwx4AbjtoUxA6F29URI3CNlmL9aEtPSzR8XDmH3ng12yo4K6tTp7F7N4gD4qrG1LmMGF5Z14Z0vZiI=)
39. [scirp.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHEEy9Zmofe55tGWRkudEFYBW_-gqeJWR4dSUcMyIQNd-cwdBV00LMFAz0tabu8jXzY9SeLT5n6Qn8ykBAm6WzJDDz6mJxuIzoWG4kyEVajdyZ79VrA-PzxVqXxayaQ4y8gwLF5fFiGzzSTVpTNOm39L6Sv)
40. [researchgate.net](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFGTwqQfr3c42PfkGkRDKgzAZoShkuNeCP2kUWTUSs2OJWLzNgGEdPHyn9UbKs_5kKwvSqcPolo8a_5wziL5zoGurQ65FFVJH_2HE4tN1gMH-LmkONQ-lHtTQahAV3qinjvPhSr2Cn9lUlmkZ2F5yBTgAdHnDyXYP_6Blz1VIkw5kyAnl0iyQVOuOYX0BaDljYI4B7OsNUCctWHKJAT5aY6hD-muoyJPJDM2OHbz5WPIeJqWe0lWb5FCbpta9sC2sMqt9ehcVswwrau1Mg76V7s)
41. [scirp.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHcwwHnrqJHjWtRXrbpLg5Y3_evM54_PVko_UjkTxg7G85_11ObclnVEJ4vUNMkFELWfTuT1XzwaQrRWPogBfNiZ3-hcw64qXdklcfmot1QNY9PE67nsVzkqsXdePcOig==)
42. [ijrai.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHjRCQcjB6fVqMludxKFg-R_ER6XyHMp5fJmwLkbzvTbja7rU6rjvs6nJpMY4hjwZVEEte7RHPoJF6vdtrMlKNE9Redv2wJdtLfQ-U9hb8ShO1vTNKJ4zhLZ8rBATVcgBtY1OQrcgqFxoErZiLlkpkjyfCbfg==)
43. [federalreserve.gov](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQF3l0JTpsGfBZlkLzLrvC29pNynXJOETEEHr-fJZalQvVCJ52OA8cT33daFm0jcI7JEPLREIj1DIaCF6ekDGrxSiPJhFyTQ1mDGoB7tdiSchLv0qbae57iiE_Sh8-nxkCBBRnSaUL2h1af1Rkdl2wgD5QB-S7zz)
44. [ijfmr.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQEDPFYV8gr_MSltNyLgINBx9pbUmb6IpAOmIzVSt0N4q7LefOP5fsdmMJaluOc5P3-elRfxiOfPL3ctIEbuYtrrCL6x1lqsp3jNd9dRKO3ArSwdwEHvGWm-NhHcpBsBmg5kNKg=)
45. [semiengineering.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQF2VcTjwykKVvJvSzsljUlkjSo8FPJUeq_fn6C9WYEMuCrdrs_Txd1JXyqev3aMzac7dGnlWyYRuDkeBAU7voK-i4ZcMBej1Usc7eINMfgG672z21hdGumK8b2pasyQiDtL91Pima6-qs2mJvnyQkFeduTraneemPUiSu6XqMK5novXWA8=)
46. [nvidia.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHRD9xUCE42jVp4DQpeNrHb5Qfztg3A84akdtfHXc2-ztuchtOlFIraTu5BM3HMtJ4ND5VZxDXYEfc-d21R1n7kqs0DzhAajz1qBbWxaM_kWod9p7ubnn12lGBIjNsrYm5kXvrcOuGB0I2RtCsbwhk5lm9HSlGqgwI1UUkhuGNZFGEuuAeJn1IRSYdWJQyVucOCHWAc8TwTjzDOH2OSiFllQA==)
47. [worldscientific.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHwRmnISWajaBoe3uSjXoYWkWu-zge6OV3s85L3_SNJQo_Ljv_2risgOJkurFMDMauP89hxQ_L9wC33NKf111rsMcVFFCVZ0WkPuZjkaVzMKe-WfIQEilWGHJciKRc-Ocurz3QiMGMGB3IYVSKkemwhJHma)
48. [computer.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHlY3jC_ahycI3YCudhFohqDEGbtPUKR5m7I0VjdbgjcoKSyMoSywP3Jlt5-t0RnqoZw8SjusCbRGJFeB0tCJc2Sx-pYxqBupxcqF0R2gvtYjfwfSYTfqAGFSg2Q4NTtoYpRYnY7JPl4mQvjZ-gZsZ6b2Rc63BKRCzjA2BwGmrLX_tkUB2SuU_G)
49. [arxiv.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQH80TW-T6qnGs47eyeCS6QWcAo-PMnqY-CAjoo26xNsokeVXpg75qEC8iB2HZch2aA1Xxc8LL_1QdQRUbDWHbPd5vfG9sMs8-ayDtNGBQqE3NoaUfqhAFa44A==)
50. [arxiv.org](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQFpX2l_zExUOVjhssECR4IGVdchw_RwrZRs1nCbyD-bFXyr-3LPmCiMpMbZvejiBBSsekKFbR3EMXj1pytwYEBLi9dmUkbtTMfwhuTLmDD25YNcyK3fsrmO3Q==)
51. [openreview.net](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQGZDX0LSMQQnRVYXZzJ1JSCFiaaDIfq9kBFmZI4aYvNlHzBzpJo3AHYU_cBKtT6XMT5q_chYC2no_g8nqSnr6ACYeNlMTb3t9_2UHGJ1p8odfZ-384EZPOc8OE8qWEG)