Radiology Report Structuring: RadExtract
This example shows how to structure radiology reports with LangExtract by extracting key sections such as findings, impressions, and recommendations. It also showcases LangExtract's visualization capabilities and is available as a live interactive demo.
Overview
Radiology reports are typically organized into sections, each of which can be extracted:
- Findings: Detailed observations from the imaging study
- Impression: Radiologist's interpretation and diagnosis
- Recommendations: Suggested follow-up actions
- Technique: Imaging technique and parameters used
- Comparison: Comparison to previous studies
- Clinical history: Relevant patient history
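Because many reports label these sections with uppercase headers, a rule-based pre-split can serve as a quick baseline or sanity check alongside model-based extraction. The sketch below is not part of LangExtract: the `split_sections` helper and its header list are hypothetical, and it assumes headers of the form `NAME:`.

```python
import re
from typing import Dict

# Section headers this sketch recognizes (an assumption; real reports vary).
SECTION_HEADERS = [
    "TECHNIQUE", "CLINICAL HISTORY", "COMPARISON",
    "FINDINGS", "IMPRESSION", "RECOMMENDATIONS",
]

# Match a known uppercase header followed by a colon.
_HEADER_RE = re.compile(
    r"\b(" + "|".join(re.escape(h) for h in SECTION_HEADERS) + r")\s*:"
)


def split_sections(report: str) -> Dict[str, str]:
    """Return a mapping of lowercase section name to its text."""
    matches = list(_HEADER_RE.finditer(report))
    sections = {}
    for i, match in enumerate(matches):
        # A section runs until the next header or the end of the report.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(report)
        body = report[match.end():end]
        key = match.group(1).lower().replace(" ", "_")
        # Collapse line breaks and repeated spaces inside the section body.
        sections[key] = " ".join(body.split())
    return sections


sample = """CHEST X-RAY
TECHNIQUE: Single AP view of the chest.
FINDINGS: The lungs are clear bilaterally.
No pleural effusion or pneumothorax.
IMPRESSION: Normal chest x-ray.
RECOMMENDATIONS: None.
"""
sections = split_sections(sample)
print(sections["findings"])
```

A split like this cannot identify individual findings or measurements, which is where the schema-driven extraction comes in.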
Defining the Schema
Create a schema that captures the structure of radiology reports:
from pydantic import BaseModel, Field
from typing import Optional, List


class Finding(BaseModel):
    body_part: str = Field(description="Anatomical location or body part")
    description: str = Field(description="Detailed description of the finding")
    measurement: Optional[str] = Field(None, description="Any measurements mentioned")
    location: Optional[str] = Field(None, description="Specific location within the body part")


class Impression(BaseModel):
    primary_diagnosis: Optional[str] = Field(None, description="Primary diagnosis or finding")
    secondary_findings: List[str] = Field(default_factory=list, description="Additional findings")
    confidence: Optional[str] = Field(None, description="Level of confidence or certainty")


class RadiologyReport(BaseModel):
    technique: Optional[str] = Field(None, description="Imaging technique and parameters")
    clinical_history: Optional[str] = Field(None, description="Relevant clinical history")
    comparison: Optional[str] = Field(None, description="Comparison to previous studies")
    findings: List[Finding] = Field(description="Detailed findings from the study")
    impression: Impression = Field(description="Radiologist's impression and interpretation")
    recommendations: List[str] = Field(default_factory=list, description="Follow-up recommendations")
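To see what the schema enforces, the sketch below validates a hand-written dictionary against trimmed copies of the models above (field descriptions omitted for brevity). It assumes Pydantic is installed, and the payload is invented for illustration.

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


class Finding(BaseModel):
    body_part: str
    description: str
    measurement: Optional[str] = None
    location: Optional[str] = None


class Impression(BaseModel):
    primary_diagnosis: Optional[str] = None
    secondary_findings: List[str] = Field(default_factory=list)
    confidence: Optional[str] = None


class RadiologyReport(BaseModel):
    technique: Optional[str] = None
    clinical_history: Optional[str] = None
    comparison: Optional[str] = None
    findings: List[Finding]
    impression: Impression
    recommendations: List[str] = Field(default_factory=list)


# Illustrative payload, not real patient data.
payload = {
    "findings": [
        {"body_part": "right upper lobe", "description": "nodule", "measurement": "2.3 cm"}
    ],
    "impression": {"primary_diagnosis": "Right upper lobe pulmonary nodule"},
}

# Nested dictionaries are coerced into Finding and Impression instances.
report = RadiologyReport(**payload)
print(report.findings[0].measurement)
print(report.recommendations)

# `findings` and `impression` have no defaults, so omitting one is an error.
try:
    RadiologyReport(findings=[])
    missing_impression_rejected = False
except ValidationError:
    missing_impression_rejected = True
print(missing_impression_rejected)
```

Optional fields fall back to `None` or an empty list, so a report without a comparison or recommendations still validates, while one without an impression does not.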
Example Prompts
Provide few-shot examples that pair an input report with the expected structured output:
examples = [
    {
        "input": """
CHEST X-RAY
TECHNIQUE: Single AP view of the chest.
FINDINGS: The lungs are clear bilaterally. No pleural effusion or pneumothorax.
The cardiac silhouette is normal in size. The mediastinum is unremarkable.
IMPRESSION: Normal chest x-ray. No acute cardiopulmonary process.
RECOMMENDATIONS: None.
""",
        "output": {
            "technique": "Single AP view of the chest",
            "findings": [
                {
                    "body_part": "lungs",
                    "description": "clear bilaterally",
                    "location": "bilateral"
                },
                {
                    "body_part": "cardiac silhouette",
                    "description": "normal in size"
                }
            ],
            "impression": {
                "primary_diagnosis": "Normal chest x-ray",
                "secondary_findings": ["No acute cardiopulmonary process"]
            },
            "recommendations": []
        }
    }
]
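Since each extracted field is meant to be highlighted in the original text, example outputs are easiest to verify when their values appear verbatim in the input. The check below is a minimal sketch, not a LangExtract feature: the hypothetical `ungrounded_values` helper walks an example's output and lists every string that does not occur in the input, ignoring case.

```python
from typing import Any, Dict, Iterator, List


def iter_strings(value: Any) -> Iterator[str]:
    """Yield every string found in a nested dict/list structure."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def ungrounded_values(example: Dict[str, Any]) -> List[str]:
    """Return output strings that do not occur in the example input."""
    source = example["input"].lower()
    return [s for s in iter_strings(example["output"]) if s.lower() not in source]


# Illustrative example with one deliberately paraphrased value.
example = {
    "input": "FINDINGS: The lungs are clear bilaterally.\nIMPRESSION: Normal chest x-ray.",
    "output": {
        "findings": [{"body_part": "lungs", "description": "clear bilaterally"}],
        "impression": {"primary_diagnosis": "Normal chest radiograph"},
    },
}
print(ungrounded_values(example))  # the paraphrased diagnosis is reported
```

Running the same helper over each entry in `examples` gives a quick way to catch paraphrased or mistyped values before they are used as few-shot guidance.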
Extraction with Visualization
Extract and visualize the structured report:
import langextract as lx
radiology_report = """
CHEST CT WITH CONTRAST
TECHNIQUE: Axial images through the chest with IV contrast.
CLINICAL HISTORY: 65-year-old male with cough and shortness of breath.
FINDINGS: There is a 2.3 cm nodule in the right upper lobe.
Mild mediastinal lymphadenopathy. No pleural effusion.
IMPRESSION: Right upper lobe pulmonary nodule, recommend follow-up CT in 3 months.
Mild mediastinal lymphadenopathy, likely reactive.
RECOMMENDATIONS: Follow-up CT chest in 3 months to assess stability of the nodule.
"""
result = lx.extract(
    text_or_documents=radiology_report,
    prompt_description="Extract structured information from the radiology report, including technique, clinical history, findings, impression, and recommendations.",
    examples=examples,
    schema=RadiologyReport,
    model_id="gemini-2.0-flash-exp"
)
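# Render the interactive highlighting view and save it as an HTML file
html_content = lx.visualize(result)
with open("radiology_visualization.html", "w") as f:
    # In notebooks lx.visualize returns an IPython HTML object; elsewhere, a string
    f.write(html_content.data if hasattr(html_content, "data") else html_content)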
Interactive Visualization Features
The visualization highlights:
- Source spans: Each extracted field is highlighted in the original text
- Color coding: Different colors for findings, impressions, and recommendations
- Click navigation: Click on extracted values to see source locations
- Structured view: Formatted display of the extracted structure
This makes it easy to verify extraction accuracy and understand how the model interpreted the report.
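This kind of verification can also be approximated without the interactive view. The sketch below is a plain-Python stand-in, not LangExtract's renderer: the hypothetical `locate_spans` helper finds the character offsets of extracted values in the source text, and `mark_spans` wraps them in brackets for a quick textual review. It assumes non-overlapping spans and text whose length does not change under lowercasing.

```python
from typing import Dict, List, Optional, Tuple

Span = Optional[Tuple[int, int]]


def locate_spans(source: str, values: List[str]) -> Dict[str, Span]:
    """Map each value to the (start, end) offsets of its first occurrence, or None."""
    lowered = source.lower()
    spans: Dict[str, Span] = {}
    for value in values:
        start = lowered.find(value.lower())
        spans[value] = (start, start + len(value)) if start >= 0 else None
    return spans


def mark_spans(source: str, spans: Dict[str, Span]) -> str:
    """Wrap each located span in square brackets."""
    found = sorted((s for s in spans.values() if s is not None), reverse=True)
    marked = source
    # Insert brackets from the end so earlier offsets stay valid.
    for start, end in found:
        marked = marked[:start] + "[" + marked[start:end] + "]" + marked[end:]
    return marked


source = "There is a 2.3 cm nodule in the right upper lobe. No pleural effusion."
spans = locate_spans(source, ["2.3 cm", "right upper lobe", "pneumothorax"])
print(spans)
print(mark_spans(source, spans))
```

A value that maps to `None` was not found in the source and is a candidate for manual inspection.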
Handling Complex Reports
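Real-world radiology reports are often more complex than the examples above. When designing the schema and examples, account for:
- Multiple findings: Extract every finding, even when a report lists many
- Nested structures: Group findings under the body part they describe
- Measurements: Extract measurements together with their units
- Comparisons: Capture changes relative to previous studies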
- Uncertainty: Capture expressions of uncertainty or differential diagnoses
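Two of these points, measurements and uncertainty, lend themselves to simple post-processing of extracted strings. The sketch below is illustrative only: the `parse_measurements` and `hedging_terms` helpers, the unit list, and the cue phrases are assumptions, not part of LangExtract or any clinical standard.

```python
import re
from typing import List, Tuple

# A number followed by a length unit, e.g. "2.3 cm" or "8mm".
_MEASUREMENT_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(mm|cm)\b", re.IGNORECASE)

# Phrases that often signal uncertainty (a small, non-exhaustive sample).
_HEDGING_CUES = ["likely", "possible", "probable", "cannot exclude", "suspicious for"]


def parse_measurements(text: str) -> List[Tuple[float, str]]:
    """Return (value, unit) pairs, with units lowercased."""
    return [(float(num), unit.lower()) for num, unit in _MEASUREMENT_RE.findall(text)]


def to_millimeters(value: float, unit: str) -> float:
    """Normalize a length in mm or cm to millimeters."""
    return value * 10 if unit == "cm" else value


def hedging_terms(text: str) -> List[str]:
    """Return the cue phrases that appear as whole words in the text."""
    lowered = text.lower()
    # Word boundaries keep "unlikely" from matching the cue "likely".
    return [
        cue for cue in _HEDGING_CUES
        if re.search(r"\b" + re.escape(cue) + r"\b", lowered)
    ]


finding = "There is a 2.3 cm nodule in the right upper lobe."
impression = "Mild mediastinal lymphadenopathy, likely reactive."

print(parse_measurements(finding))
print(hedging_terms(impression))
```

Multi-dimensional sizes such as "2.3 x 1.5 cm" would need a richer pattern, since this one attaches the unit only to the last number.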
Production Considerations
For production use:
- HIPAA compliance: Ensure proper handling of protected health information
- Validation: Implement validation checks for critical findings
- Human review: Consider human review for critical diagnoses
- Audit trail: Maintain audit trails of extractions
- Error handling: Handle cases where structure cannot be determined
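Several of these points can be combined in a small post-extraction step. The sketch below assumes the extracted report is available as a plain dictionary shaped like the schema above; the `review_flags` and `audit_record` helpers, the term list, and the record fields are hypothetical. The audit record stores a SHA-256 digest instead of the report text, which keeps raw PHI out of the log but does not by itself make a pipeline HIPAA compliant.

```python
import hashlib
from datetime import datetime, timezone
from typing import Any, Dict, List

# Terms that trigger human review (illustrative, not a clinical standard).
CRITICAL_TERMS = ["pneumothorax", "nodule", "mass", "hemorrhage", "embolism"]


def _strings(value: Any) -> List[str]:
    """Collect every string value from a nested dict/list structure."""
    if isinstance(value, str):
        return [value]
    if isinstance(value, dict):
        return [s for v in value.values() for s in _strings(v)]
    if isinstance(value, list):
        return [s for v in value for s in _strings(v)]
    return []


def review_flags(extracted: Dict[str, Any]) -> List[str]:
    """Return the reasons, if any, that an extraction needs human review."""
    flags = []
    if not extracted.get("findings"):
        flags.append("no findings extracted")
    if not (extracted.get("impression") or {}).get("primary_diagnosis"):
        flags.append("missing primary diagnosis")
    text = " ".join(_strings(extracted)).lower()
    # Plain substring match: negated mentions are flagged too, erring toward review.
    flags.extend(f"critical term: {t}" for t in CRITICAL_TERMS if t in text)
    return flags


def audit_record(source_text: str, extracted: Dict[str, Any], model_id: str) -> Dict[str, Any]:
    """Build a log entry that identifies the source by digest, not by content."""
    flags = review_flags(extracted)
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "source_sha256": hashlib.sha256(source_text.encode("utf-8")).hexdigest(),
        "model_id": model_id,
        "flags": flags,
        "needs_review": bool(flags),
    }


# Illustrative extraction result, not real patient data.
extracted = {
    "findings": [{"body_part": "right upper lobe", "description": "nodule", "measurement": "2.3 cm"}],
    "impression": {"primary_diagnosis": "Right upper lobe pulmonary nodule"},
}
record = audit_record("FINDINGS: There is a 2.3 cm nodule.", extracted, "gemini-2.0-flash-exp")
print(record["flags"], record["needs_review"])
```

An empty or partial extraction is flagged rather than raising, so reports whose structure cannot be determined are routed to a reviewer instead of being dropped.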