SDK Overview
An introduction to the Ragnerock Python SDK.
The Ragnerock Python SDK provides a high-level interface to the Ragnerock platform. It uses an SQLAlchemy-inspired pattern with Engine and Session objects, making it familiar to Python developers who work with databases.
Installation
pip install ragnerock
Or with uv:
uv add ragnerock
Quick Example
from ragnerock import create_engine, Session, Document, Annotation

# Connect to your project
engine = create_engine("ragnerock://user@example.com:pass@api.ragnerock.com/my_project")

with Session(engine) as session:
    # List all documents
    for doc in session.list(Document):
        print(f"{doc.name} - {doc.status}")

    # Get annotations for a document
    doc = session.get(Document, name="Apple 10-K 2024")
    for ann in doc.list(Annotation, operator="financial_metrics"):
        print(ann.data)

    # Query annotation data with SQL
    result = session.query("""
        SELECT document_name, revenue, net_income
        FROM financial_metrics
        WHERE revenue > 100000
    """)
    df = result.to_pandas()
Core Concepts
The SDK is organized around a few key concepts:
| Concept | Description |
|---|---|
| Engine | Holds connection configuration and manages authentication |
| Session | A context manager for interacting with a project |
| Resources | Data objects like Document, Annotation, Operator, Workflow |
| PaginatedIterator | Lazy iterator for efficient pagination |
| QueryResult | Results from SQL queries on annotation data |
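To make "lazy iterator for efficient pagination" concrete, here is a minimal sketch of the general technique. It is not the SDK's PaginatedIterator: `iterate_pages`, `fake_fetch`, and the integer cursor are hypothetical names chosen for illustration, and the data source is an in-memory list so the sketch runs without a server.

```python
from typing import Callable, Iterator, List, Optional, Tuple, TypeVar

T = TypeVar("T")

# Hypothetical page fetcher: takes a cursor (None for the first page) and
# returns (items, next_cursor); next_cursor is None when no pages remain.
FetchPage = Callable[[Optional[int]], Tuple[List[T], Optional[int]]]


def iterate_pages(fetch_page: FetchPage) -> Iterator[T]:
    """Yield items one at a time, requesting the next page only when needed."""
    cursor: Optional[int] = None
    while True:
        items, cursor = fetch_page(cursor)
        yield from items
        if cursor is None:
            return


# Stand-in data source (an assumption for this sketch).
_DATA = list(range(7))
calls: List[Optional[int]] = []  # records every page request


def fake_fetch(cursor: Optional[int], page_size: int = 3):
    calls.append(cursor)
    start = cursor or 0
    end = start + page_size
    next_cursor = end if end < len(_DATA) else None
    return _DATA[start:end], next_cursor


pager = iterate_pages(fake_fetch)
first = next(pager)  # triggers exactly one page request
```

Because the generator fetches a page only when the previous one is exhausted, a caller that stops early never pays for the remaining pages.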
Connection String
Connect to Ragnerock using a connection string:
ragnerock://email:password@host[:port]/project_name
Examples:
# Production
engine = create_engine("ragnerock://user@company.com:pass@api.ragnerock.com/sec_analysis")
# Local development
engine = create_engine("ragnerock://dev@test.com:pass@localhost:8080/test_project")
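Because the email address itself contains an `@`, it may help to see how a string in this format can be split unambiguously. The following is a sketch only: `split_connection_string` and `ConnectionParts` are hypothetical helpers, not part of the SDK, and the sketch assumes the password contains no `/`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConnectionParts:
    email: str
    password: str
    host: str
    port: Optional[int]
    project: str


def split_connection_string(url: str) -> ConnectionParts:
    """Split ragnerock://email:password@host[:port]/project into its parts."""
    scheme, sep, rest = url.partition("://")
    if not sep or scheme != "ragnerock":
        raise ValueError("expected a ragnerock:// URL")
    authority, slash, project = rest.partition("/")
    if not slash or not project:
        raise ValueError("missing project name")
    # The email contains '@', so the LAST '@' separates credentials from host.
    credentials, at, hostport = authority.rpartition("@")
    if not at:
        raise ValueError("missing credentials")
    # An email has no ':', so the FIRST ':' separates email from password.
    email, colon, password = credentials.partition(":")
    if not colon:
        raise ValueError("missing password")
    host, _, port = hostport.partition(":")
    return ConnectionParts(email, password, host, int(port) if port else None, project)


parts = split_connection_string(
    "ragnerock://dev@test.com:pass@localhost:8080/test_project"
)
```

Splitting on the last `@` and the first `:` is what keeps the email intact; how the SDK itself parses the string is not stated in this overview.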
Session Pattern
All operations happen within a Session context:
from ragnerock import create_engine, Session, Document

engine = create_engine("ragnerock://...")

with Session(engine) as session:
    # get() - Retrieve a single resource
    doc = session.get(Document, id="uuid-here")
    doc = session.get(Document, name="My Document")

    # list() - Iterate over resources
    for doc in session.list(Document):
        print(doc.name)

    # create() - Create a new resource
    doc = Document(file_path="/path/to/file.pdf", name="New Doc")
    session.create(doc)

    # delete() - Delete a resource
    session.delete(doc)

    # query() - Run SQL queries
    result = session.query("SELECT * FROM annotations")

    # run() - Execute workflows (workflow is a Workflow object obtained beforehand)
    job = session.run(workflow, documents=[doc])
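For readers unfamiliar with the Engine and Session split, the sketch below shows the general pattern in isolation. `ToyEngine` and `ToySession` are hypothetical stand-ins written for this illustration; they say nothing about how the SDK's own classes are implemented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ToyEngine:
    """Holds connection settings only; opens nothing by itself."""
    url: str


@dataclass
class ToySession:
    """Short-lived unit of work bound to an engine."""
    engine: ToyEngine
    is_open: bool = False
    log: List[str] = field(default_factory=list)

    def __enter__(self) -> "ToySession":
        self.is_open = True
        self.log.append("open")
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Runs even if the body raised, so cleanup is never skipped.
        self.is_open = False
        self.log.append("close")
        return False  # do not swallow exceptions

    def get(self, name: str) -> str:
        if not self.is_open:
            raise RuntimeError("session is closed")
        self.log.append(f"get {name}")
        return name


engine = ToyEngine("ragnerock://...")
with ToySession(engine) as session:
    session.get("My Document")
```

The engine is cheap to create and can be shared, while each `with` block marks a clear start and end for a group of operations.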
Resources
The SDK provides these resource types:
| Resource | Description |
|---|---|
| Document | An uploaded document (PDF, Word, etc.) |
| Annotation | AI-generated structured data attached to a document |
| Operator | An annotation schema that defines extraction logic |
| Workflow | A DAG of operators that process documents |
| Job | A handle to track workflow execution |
| Chunk | A text segment within a document |
| Page | A page within a document |
Type Safety
The SDK is fully typed and works with mypy and Pyright:
from ragnerock import Session, Document, Annotation

with Session(engine) as session:
    doc: Document | None = session.get(Document, name="report.pdf")
    annotations: list[Annotation] = doc.list(Annotation).all() if doc else []
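The `if doc else []` guard above relies on optional-type narrowing, which mypy and Pyright both check. A self-contained sketch of the same idea follows, using a hypothetical `StubDocument` class and a plain dictionary in place of the SDK.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StubDocument:
    """Stand-in for a document resource (not the SDK's Document)."""
    name: str
    annotations: List[str] = field(default_factory=list)


def get_document(store: Dict[str, StubDocument], name: str) -> Optional[StubDocument]:
    # Returns None when the name is unknown, so callers must handle both cases.
    return store.get(name)


def annotation_names(doc: Optional[StubDocument]) -> List[str]:
    if doc is None:
        return []
    # After the None check, a type checker treats doc as StubDocument.
    return list(doc.annotations)


store = {"report.pdf": StubDocument("report.pdf", ["financial_metrics"])}
found = annotation_names(get_document(store, "report.pdf"))
missing = annotation_names(get_document(store, "other.pdf"))
```

Accessing `doc.annotations` without the `None` check would be flagged by a type checker before the code ever runs.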
Error Handling
The SDK provides typed exceptions for different error conditions:
from ragnerock import (
    RagnerockError,
    AuthenticationError,
    NotFoundError,
    ValidationError,
    QueryError,
)

try:
    doc = session.get(Document, id="nonexistent")
except NotFoundError:
    print("Document not found")
except AuthenticationError:
    print("Invalid credentials")
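A common way to use typed exceptions is to catch specific errors first and fall back to a shared base class. The sketch below defines local stand-in classes with the same names as the imports above. That the specific errors subclass RagnerockError is an assumption suggested by the naming and not confirmed in this overview; `run_safely` is a hypothetical helper.

```python
from typing import Callable


# Stand-in classes; the real SDK hierarchy is assumed, not confirmed.
class RagnerockError(Exception):
    """Assumed common base class for SDK errors."""


class NotFoundError(RagnerockError):
    pass


class QueryError(RagnerockError):
    pass


def run_safely(action: Callable[[], str]) -> str:
    """Run an action, handling the specific error first and the base class last."""
    try:
        return action()
    except NotFoundError:
        return "not found"
    except RagnerockError as exc:
        # Fallback for any other error derived from the base class.
        return f"sdk error: {exc}"


def missing_document() -> str:
    raise NotFoundError("no such document")


def bad_sql() -> str:
    raise QueryError("syntax error")
```

Order matters here: if the base class were listed first, the more specific `NotFoundError` branch would never be reached.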
Optional Dependencies
Install optional dependencies for additional features:
# pandas support for query results
# (the quotes keep shells such as zsh from interpreting the brackets)
pip install "ragnerock[pandas]"
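Code that depends on an optional extra can check for it up front and fail with a clear message. This is a generic sketch using only the standard library; `has_module` and `require_pandas` are hypothetical helpers, not SDK functions.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def require_pandas() -> None:
    """Raise a descriptive error when pandas is not installed."""
    if not has_module("pandas"):
        raise ImportError(
            'pandas is not installed; try: pip install "ragnerock[pandas]"'
        )
```

Calling `require_pandas()` before converting query results to a DataFrame turns a confusing import failure into an actionable hint.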
Next Steps
- Engine & Connection — Configure your connection
- Session — Learn about Session methods
- Resources — Understand Document, Annotation, and other resources
- SQL Queries — Query your annotation data
- Pagination — Work with large result sets
- Error Handling — Handle errors gracefully