Install LangExtract
The easiest way to install LangExtract is from PyPI. You can use your existing Python environment or create a dedicated virtual environment:
```bash
python -m venv langextract_env
source langextract_env/bin/activate  # Windows: langextract_env\Scripts\activate
pip install langextract
```
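A quick sanity check that the package is importable (printing `__version__` assumes the package exposes that attribute; a bare `import langextract` is enough on its own):

```bash
python -c "import langextract as lx; print(lx.__version__)"
```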
If you want to hack on the library itself or run tests locally, you can install from source using the instructions in the README on GitHub.
Configure a provider
LangExtract supports multiple LLM providers. Choose the one that fits your environment and follow the relevant configuration steps; you can always switch later via the `model_id` and `language_model_params` arguments.
Using Gemini with an API key
Set the LANGEXTRACT_API_KEY environment variable before running your script:
```bash
export LANGEXTRACT_API_KEY="your-api-key-here"
```
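If you prefer not to export the variable in every shell, a common alternative is a `.env` file in your working directory, which LangExtract is assumed here to pick up via python-dotenv:

```bash
echo 'LANGEXTRACT_API_KEY=your-api-key-here' >> .env
```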
Using Vertex AI
If you are using Vertex AI with service accounts, configure your Google Cloud credentials as usual, then pass `language_model_params`:
```python
language_model_params = {
    "vertexai": True,
    "project": "your-project-id",
    "location": "global",
}
```
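For context, here is a minimal sketch of how that dictionary is passed through to `lx.extract`. The `examples` variable is an assumption here, standing in for a list of `lx.data.ExampleData` objects as shown in the quickstart below:

```python
import langextract as lx

result = lx.extract(
    text_or_documents="Patient started on lisinopril 10mg daily.",
    prompt_description="Extract medications with dosage.",
    examples=examples,  # assumed: lx.data.ExampleData list, as in the quickstart
    model_id="gemini-2.5-flash",
    language_model_params={
        "vertexai": True,
        "project": "your-project-id",
        "location": "global",
    },
)
```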
Using OpenAI
For OpenAI models, install the optional extra and set your API key:
```bash
pip install "langextract[openai]"
export OPENAI_API_KEY="your-openai-key"
```
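A minimal sketch of an OpenAI-backed call; at the time of writing, OpenAI models need fenced output and schema constraints disabled. `input_text` and `examples` are assumed to be defined as in the quickstart below:

```python
import os
import langextract as lx

result = lx.extract(
    text_or_documents=input_text,
    prompt_description="Extract patient age, conditions, medications, and follow-up.",
    examples=examples,
    model_id="gpt-4o",
    api_key=os.environ.get("OPENAI_API_KEY"),
    fence_output=True,             # OpenAI models require fenced output...
    use_schema_constraints=False,  # ...and no schema constraints
)
```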
Using local models with Ollama
To run fully local, you can use the Ollama provider. Install Ollama, pull a model such as `gemma2:2b`, and keep the service running. Then point LangExtract at the local endpoint with `model_url`, as sketched below.
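A minimal sketch, assuming Ollama is serving on its default port (11434) and that `input_text` and `examples` are defined as in the quickstart below:

```bash
ollama pull gemma2:2b
```

```python
import langextract as lx

result = lx.extract(
    text_or_documents=input_text,
    prompt_description="Extract patient age, conditions, medications, and follow-up.",
    examples=examples,
    model_id="gemma2:2b",
    model_url="http://localhost:11434",  # default Ollama endpoint
    fence_output=False,
    use_schema_constraints=False,
)
```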
See the Providers page for complete configuration snippets for each provider type and tips on choosing the right model for your task.
Run your first extraction
Once LangExtract is installed and a provider is configured, you can run a basic extraction against any block of text:
```python
import langextract as lx

input_text = """
Patient is a 67-year-old with a history of hypertension.
Started on lisinopril 10mg daily. Follow-up in 3 months.
"""

# LangExtract relies on few-shot examples to define the output structure,
# so provide at least one ExampleData.
examples = [
    lx.data.ExampleData(
        text="Patient is a 45-year-old with diabetes. Started on metformin 500mg daily.",
        extractions=[
            lx.data.Extraction(extraction_class="age", extraction_text="45-year-old"),
            lx.data.Extraction(extraction_class="medication", extraction_text="metformin 500mg daily"),
        ],
    )
]

result = lx.extract(
    text_or_documents=input_text,
    prompt_description="Extract patient age, conditions, medications, and follow-up.",
    examples=examples,
    model_id="gemini-2.5-flash",
)

# Each extraction is grounded: char_interval maps it back to the source text.
for extraction in result.extractions:
    print(extraction.extraction_class, extraction.extraction_text, extraction.char_interval)
```
From here, you can iterate on your `prompt_description`, add more few-shot examples, or introduce explicit schemas, as described in the main Docs section.
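One useful next step is persisting results and reviewing grounding interactively. A sketch using LangExtract's JSONL I/O and HTML visualizer, with illustrative file names and `result` from the quickstart above:

```python
import langextract as lx

# Save annotated documents as JSONL, then render an interactive HTML review page.
lx.io.save_annotated_documents([result], output_name="extraction_results.jsonl", output_dir=".")

html = lx.visualize("extraction_results.jsonl")
with open("visualization.html", "w") as f:
    # In notebooks, lx.visualize returns an HTML object; elsewhere, a plain string.
    f.write(html.data if hasattr(html, "data") else html)
```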
Next steps
- Read about grounding to understand provenance and spans.
- Design robust outputs using schemas & validation.
- Explore domain-specific examples you can adapt to your own data.
- Review performance characteristics on the benchmarks page.
- Get involved via the community page.