Leap
The Unified API and Workflow Engine for Enterprise AI Automation
A high-performance guidance language for controlling large language models.
Guidance is a domain-specific programming paradigm designed to solve the inherent unpredictability of Large Language Models (LLMs). By treating LLM interactions not as simple text-in/text-out prompts but as the stateful execution of a template, Guidance allows developers to interleave generation, prompting, and control logic seamlessly.

Its technical architecture relies on a specialized interpreter that can force the model to follow specific grammars, such as regular expressions or JSON schemas, at the token level. This prevents the model from generating invalid syntax or hallucinating structural elements, significantly reducing the need for post-generation validation or retry loops.

In the 2026 landscape, Guidance serves as a critical infrastructure layer for 'Deterministic AI Agents,' bridging the gap between stochastic model outputs and strict software engineering requirements. It supports multiple backends, including OpenAI, Hugging Face, and Llama.cpp, and utilizes advanced features like 'token healing' to eliminate common tokenization artifacts that degrade model performance at the start of generated strings.
Uses a custom regex-based or CFG-based engine to force LLM token selection to match a specific syntax.
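The token-level forcing described above can be sketched in plain Python: at each decoding step, the engine masks out every vocabulary token whose text would break the target pattern. The `allowed_tokens` helper and the toy vocabulary below are hypothetical illustrations, not Guidance's API; the real engine compiles the regex or CFG into a token-aware automaton so that partial-match checking is exact and fast rather than a full-match test like this one.

```python
import re

def allowed_tokens(vocab, generated, pattern):
    """Toy constrained-decoding mask: keep only vocab tokens whose text,
    appended to what has been generated so far, still fully matches the
    pattern. (A real engine tracks regex *prefix* validity via a DFA.)"""
    return [t for t in vocab if re.fullmatch(pattern, generated + t)]

vocab = ["12", "ab", "3", "x", "45"]
# Constrain the output to a 1-4 digit number; '7' was already generated.
print(allowed_tokens(vocab, "7", r"\d{1,4}"))  # ['12', '3', '45']
```

Tokens like "ab" and "x" are excluded before sampling ever happens, which is why the model cannot emit structurally invalid text.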
Automatically fixes token boundary issues when prompts end at sub-token boundaries.
Allows Python code to execute between model generations without losing the model's internal state.
Efficiently caches prompt prefixes and intermediate states across different generation calls.
Integrated notebook UI that shows which parts of the text were fixed and which were generated.
Forces the model to generate only strings that satisfy a specific regular expression.
Unified syntax that works across local models (LlamaCpp) and remote APIs (OpenAI).
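The prefix-caching feature in the list above can be illustrated with a toy cache that reuses state for the longest previously seen prompt prefix. `PrefixCache` and its string "state" values are hypothetical stand-ins; a real implementation stores transformer KV-cache tensors keyed by token prefixes.

```python
class PrefixCache:
    """Sketch of prompt-prefix caching (illustrative, not Guidance's API):
    look up the longest cached prefix of a new prompt so only the suffix
    needs to be re-processed by the model."""
    def __init__(self):
        self.states = {}  # prompt prefix -> opaque cached model state

    def put(self, prefix, state):
        self.states[prefix] = state

    def longest_hit(self, prompt):
        best = ""
        for p in self.states:
            if prompt.startswith(p) and len(p) > len(best):
                best = p
        return best, self.states.get(best)

cache = PrefixCache()
cache.put("You are a helpful assistant. ", "kv-state-1")
hit, state = cache.longest_hit("You are a helpful assistant. Summarize:")
# hit == "You are a helpful assistant. "; only "Summarize:" is new work.
```

Across repeated generation calls that share a system prompt or template header, this kind of reuse is what keeps interleaved generate/execute loops cheap.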
LLMs often hallucinate extra text or break JSON formatting when asked to extract data from a document.
Registry Updated: 2/7/2026
Standard Chain-of-Thought can wander off-topic or fail to produce a final answer in a specific format.
Prompting an LLM for a file path starting with '/usr' can fail if the prompt ends at '/u': because '/usr' is typically a single token, the forced tokenization of '/u' followed by 'sr' is one the model rarely saw in training, making the natural continuation unlikely.
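Token healing addresses exactly this '/usr' case: the engine backs up over the final prompt token and constrains the model's next choice to tokens that extend the removed text, letting the model re-pick a natural boundary. The `heal_prompt` function and toy vocabulary below are a minimal sketch under that assumption, not Guidance's actual implementation.

```python
def heal_prompt(prompt_tokens, vocab):
    """Token-healing sketch: drop the last prompt token and return the
    remaining prompt text plus the vocab tokens that begin with the
    dropped text, so generation can cross the awkward boundary."""
    *prefix, last = prompt_tokens
    candidates = [t for t in vocab if t.startswith(last)]
    return "".join(prefix), candidates

vocab = ["/u", "/usr", "sr/", "usr/bin", "/usr/local"]
prefix, cands = heal_prompt(["The path is ", "/u"], vocab)
print(cands)  # ['/u', '/usr', '/usr/local']
```

The model can now emit '/usr' as the single token it actually learned, instead of being forced into the improbable '/u' + 'sr' split.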