B Trajectory Synthesis Details
B.1 Offline Corpus Motivation
Online search APIs (e.g., Serper, Bing) are a major bottleneck for large-scale trajectory synthesis:
- They incur per-query monetary cost
- Their behavior is non-deterministic and evolves over time
- They suffer from latency spikes, rate limits, and occasional failures
- They make long-horizon experiments difficult to reproduce
To remove this dependency, we build a locally served search engine that acts as a drop-in replacement for web search.
B.2 Benefits of Offline Search
Our offline setup provides several advantages:
- Zero marginal cost for search at scale
- Full determinism and reproducibility
- No rate limits or latency variability
- Support for long-horizon trajectories (100+ turns)
(Implementation details, including hardware requirements and retrieval latency, are provided in Appendix X.)
B.3 Cost Estimation of Offline Corpus Construction
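Although online APIs are expensive at scale, constructing the offline corpus requires only a bounded, one-time number of calls:
- Serper search API: 6K queries × 1 credit = 6K credits
- Serper scrape API: 6K queries × 3 pages × 2 credits = 36K credits
This totals 42K credits, for a one-time cost of approximately $42 (an implied rate of roughly $1 per 1K credits).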
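The estimate above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not part of our pipeline: the function and constant names are illustrative, the 6K figure is taken to be the number of search queries, and the rate of $1 per 1,000 credits is an assumption inferred from the stated $42 total rather than a quoted Serper price.

```python
# Figures taken from the text above.
NUM_QUERIES = 6_000             # 6K search queries (interpretation of "6K")
SEARCH_CREDITS_PER_QUERY = 1    # Serper search API: 1 credit per query
PAGES_PER_QUERY = 3             # pages scraped per query
SCRAPE_CREDITS_PER_PAGE = 2     # Serper scrape API: 2 credits per page

# Assumption: rate implied by the ~$42 total, not an official price.
USD_PER_1K_CREDITS = 1.0


def corpus_cost(num_queries: int = NUM_QUERIES,
                usd_per_1k_credits: float = USD_PER_1K_CREDITS) -> dict:
    """Return the credit breakdown and dollar cost of building the corpus."""
    search_credits = num_queries * SEARCH_CREDITS_PER_QUERY
    scrape_credits = num_queries * PAGES_PER_QUERY * SCRAPE_CREDITS_PER_PAGE
    total_credits = search_credits + scrape_credits
    return {
        "search_credits": search_credits,
        "scrape_credits": scrape_credits,
        "total_credits": total_credits,
        "usd": total_credits / 1_000 * usd_per_1k_credits,
    }


if __name__ == "__main__":
    for name, value in corpus_cost().items():
        print(f"{name}: {value}")
```

Because the cost is linear in the number of queries, changing `num_queries` or `usd_per_1k_credits` gives a quick estimate for a larger corpus or a different pricing tier.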