# Soliplex Ingester Architecture

## Overview
Soliplex Ingester is a document processing and RAG (Retrieval-Augmented Generation) ingestion system designed to handle large-scale document workflows. It provides a FastAPI-based REST API, workflow orchestration, and integration with document parsing and embedding services.
## System Components

### 1. FastAPI Server

The server provides REST API endpoints for document and workflow management:
- **Document Routes** (`/api/v1/document/*`) - Document upload, retrieval, and management
- **Batch Routes** (`/api/v1/batch/*`) - Batch processing operations
- **Workflow Routes** (`/api/v1/workflow/*`) - Workflow execution and monitoring
- **Stats Routes** (`/api/v1/stats/*`) - System statistics and metrics
Server entry point: `src/soliplex/ingester/server/__init__.py:30`
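As a concrete illustration, a client request against the upload route might be assembled as below. The base URL and the `file` form-field name are assumptions, not taken from the codebase; check the route's actual signature before relying on either.

```python
import io
import urllib.request
import uuid

INGESTER_URL = "http://localhost:8000"  # assumed deployment address

def build_upload_request(filename: str, payload: bytes) -> urllib.request.Request:
    """Build a multipart/form-data POST for the document upload endpoint."""
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n".encode()
    )
    body.write(payload)
    body.write(f"\r\n--{boundary}--\r\n".encode())
    return urllib.request.Request(
        f"{INGESTER_URL}/api/v1/document/upload",
        data=body.getvalue(),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )

req = build_upload_request("report.pdf", b"%PDF-1.7 demo")
```

Sending it is then a single `urllib.request.urlopen(req)` call (or the equivalent with `httpx`/`requests`).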
### 2. Workflow System
The workflow system orchestrates multi-step document processing pipelines:
**Workflow Components:**

- `WorkflowDefinition` - Defines the steps and lifecycle events for a workflow
- `WorkflowRun` - Represents a single execution instance for one document
- `RunGroup` - Groups multiple workflow runs together
- `RunStep` - Individual step execution within a workflow run
**Step Types:**

- `INGEST` - Load document into system
- `VALIDATE` - Validate document format and content
- `PARSE` - Extract text and structure from document
- `CHUNK` - Split document into semantic chunks
- `EMBED` - Generate vector embeddings
- `STORE` - Save to RAG system (LanceDB + HaikuRAG)
- `ENRICH` - Add metadata or additional processing
- `ROUTE` - Conditional routing logic
Implementation: `src/soliplex/ingester/lib/wf/`
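The step vocabulary above can be sketched as a small enum. The class name, module placement, and the default pipeline ordering below are illustrative, not the project's actual definitions:

```python
from enum import Enum

class StepType(str, Enum):
    """Step vocabulary from the list above (names are the document's;
    this enum itself is a sketch, not the project's real class)."""
    INGEST = "INGEST"
    VALIDATE = "VALIDATE"
    PARSE = "PARSE"
    CHUNK = "CHUNK"
    EMBED = "EMBED"
    STORE = "STORE"
    ENRICH = "ENRICH"
    ROUTE = "ROUTE"

# A plausible linear pipeline; ENRICH and ROUTE are optional/conditional steps.
DEFAULT_PIPELINE = [
    StepType.INGEST, StepType.VALIDATE, StepType.PARSE,
    StepType.CHUNK, StepType.EMBED, StepType.STORE,
]
```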
### 3. Worker System
Async workers process workflow steps concurrently:
- Workers poll for pending workflow steps
- Configurable concurrency levels for different operations
- Automatic retry logic with configurable retry counts
- Health check/heartbeat system via `WorkerCheckin`
Worker implementation: `src/soliplex/ingester/lib/wf/runner.py`
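A minimal sketch of the polling pattern, using an in-memory queue in place of the database step table. The names and structure here are illustrative; the real runner polls the database and includes retry and heartbeat logic:

```python
import asyncio

# In-memory stand-in for the pending-step table.
pending: asyncio.Queue = asyncio.Queue()

async def handle(step: str) -> str:
    """Stand-in for a registered step handler."""
    await asyncio.sleep(0)  # yield control, as a real I/O-bound handler would
    return f"done:{step}"

async def worker(worker_id: str, results: list) -> None:
    """Drain pending steps until none remain; multiple workers share one queue."""
    while True:
        try:
            step = pending.get_nowait()
        except asyncio.QueueEmpty:
            return
        results.append((worker_id, await handle(step)))

async def main() -> list:
    for step in ("parse:doc1", "embed:doc1", "parse:doc2"):
        pending.put_nowait(step)
    results: list = []
    # Two concurrent workers, mirroring the configurable concurrency above.
    await asyncio.gather(worker("w1", results), worker("w2", results))
    return results

processed = asyncio.run(main())
```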
### 4. Storage Layer

**Database:**
- SQLModel + SQLAlchemy with async support
- Supports SQLite (dev) and PostgreSQL (production)
- Alembic for migrations
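The dev/production split typically shows up as two async database URLs. The driver choices below (aiosqlite, asyncpg) are assumptions about a typical async SQLAlchemy setup, not confirmed from the config:

```python
def database_url(env: str) -> str:
    """Map deployment mode to an async SQLAlchemy URL (illustrative only;
    the real value comes from the configuration schema)."""
    if env == "dev":
        return "sqlite+aiosqlite:///./ingester.db"
    return "postgresql+asyncpg://ingester:secret@db:5432/ingester"
```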
**File Storage:**
- Configurable backends (filesystem, S3-compatible via OpenDAL)
- Separate storage locations for different artifact types:
- Raw documents
- Parsed markdown
- Parsed JSON
- Chunks
- Embeddings
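One way to picture the separate storage locations is a key-layout helper. The prefixes below are purely illustrative, since the real locations come from the OpenDAL configuration:

```python
from pathlib import PurePosixPath

# Hypothetical prefixes for each artifact type listed above.
ARTIFACT_PREFIXES = {
    "raw": "raw",
    "markdown": "parsed/md",
    "json": "parsed/json",
    "chunks": "chunks",
    "embeddings": "embeddings",
}

def artifact_key(kind: str, doc_hash: str, ext: str) -> str:
    """Build a backend-agnostic storage key, keyed by document content hash."""
    return str(PurePosixPath(ARTIFACT_PREFIXES[kind]) / f"{doc_hash}.{ext}")
```

Keying artifacts by content hash keeps the layout identical whether the backend is a local filesystem or an S3-compatible store.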
**Vector Storage:**
- LanceDB for vector embeddings
- HaikuRAG client for retrieval operations
### 5. Document Processing Pipeline

```mermaid
graph LR
    A[Upload Document] --> B[Create DocumentURI]
    B --> C[Hash & Store as Document]
    C --> D[Queue Workflow Run]
    D --> E[Validate Step]
    E --> F[Parse with Docling]
    F --> G[Chunk Text]
    G --> H[Generate Embeddings]
    H --> I[Store in LanceDB]
    I --> J[Update RAG Index]
```
### 6. External Services

**Docling Server:**
- Document parsing service
- Extracts text, structure, and metadata
- Configurable via `DOCLING_SERVER_URL`
**HaikuRAG:**
- RAG backend for document retrieval
- Vector search and document management
- Optional (controlled by the `DO_RAG` setting)
## Data Flow

### Document Ingestion Flow
1. **Upload** - Client uploads document via `/api/v1/document/upload`
2. **Hash & Dedupe** - System computes SHA256 hash and checks for duplicates
3. **Create URI** - Maps source URI to document hash
4. **Batch Assignment** - Associates document with processing batch
5. **Workflow Creation** - Creates the `WorkflowRun` and its `RunStep`s
6. **Worker Processing** - Workers pick up and execute steps
7. **Status Updates** - Database tracks step and run status
8. **Completion** - Document marked complete when all steps succeed
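The hash-and-dedupe step can be sketched with an in-memory stand-in for the document table; the function name and return shape are illustrative:

```python
import hashlib

_documents: dict[str, str] = {}  # content hash -> first source URI (stand-in for the DB)

def ingest(uri: str, content: bytes) -> tuple[str, bool]:
    """Return (document hash, is_new); duplicate content maps to one document."""
    digest = hashlib.sha256(content).hexdigest()
    is_new = digest not in _documents
    if is_new:
        _documents[digest] = uri
    return digest, is_new

first = ingest("file:///a/report.pdf", b"same bytes")
second = ingest("file:///b/copy.pdf", b"same bytes")  # duplicate content
```

Because the document is keyed by content hash rather than by URI, the same file uploaded from two locations is stored and processed only once.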
### Workflow Execution Flow
1. **Worker Startup** - Worker registers and starts polling
2. **Step Selection** - Worker queries for PENDING steps with a `FOR UPDATE` lock
3. **Status Transition** - PENDING → RUNNING → COMPLETED/ERROR/FAILED
4. **Step Execution** - Calls registered handler method
5. **Artifact Storage** - Saves intermediate results
6. **Retry Logic** - Automatic retry on ERROR status
7. **Run Completion** - Aggregates step status to run status
8. **Group Completion** - Aggregates run status to group status
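The status transitions above can be captured as a small table-driven check. The ERROR → PENDING retry edge is an assumption consistent with the automatic-retry behaviour described, not confirmed from the code:

```python
# Transition table implied by the flow above.
TRANSITIONS = {
    "PENDING": {"RUNNING"},
    "RUNNING": {"COMPLETED", "ERROR", "FAILED"},
    "ERROR": {"PENDING"},  # assumed: retry re-queues the step
}

def advance(current: str, new: str) -> str:
    """Validate and apply a step status transition."""
    if new not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {new}")
    return new

status = advance(advance("PENDING", "RUNNING"), "COMPLETED")
```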
## Configuration
Configuration via environment variables with pydantic-settings:
- Database connection
- File storage paths
- Worker concurrency settings
- External service URLs
- Workflow and parameter directories
See `src/soliplex/ingester/lib/config.py:15` for the full configuration schema.
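A stripped-down sketch of environment-driven settings, using a plain dataclass instead of pydantic-settings so it stays self-contained. Only `DOCLING_SERVER_URL` and `DO_RAG` appear elsewhere in this document; the other variable names and all default values are assumptions:

```python
import os
from dataclasses import dataclass

@dataclass
class Settings:
    """Illustrative subset of the configuration; see lib/config.py for the real schema."""
    database_url: str = os.environ.get("DATABASE_URL", "sqlite+aiosqlite:///./ingester.db")
    docling_server_url: str = os.environ.get("DOCLING_SERVER_URL", "http://localhost:8080")
    do_rag: bool = os.environ.get("DO_RAG", "true").lower() == "true"
    worker_concurrency: int = int(os.environ.get("WORKER_CONCURRENCY", "4"))

settings = Settings()
```

pydantic-settings adds type coercion and validation errors on top of this pattern; the dataclass version shows only the env-to-field mapping.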
## Scalability

**Horizontal Scaling:**
- Multiple workers can run concurrently
- Database row-level locking prevents duplicate processing
- Stateless API servers can be load balanced
**Vertical Scaling:**
- Configurable concurrency per worker
- Batch size controls for embedding operations
- Connection pooling for database access
**Workflow Parallelism:**
- Multiple workflows can process simultaneously
- Steps within a workflow run sequentially
- Different documents process independently
## Technology Stack
- Web Framework: FastAPI 0.120+
- Database ORM: SQLModel 0.0.27+
- Async Runtime: asyncio
- CLI: Typer
- Document Parsing: Docling
- Vector DB: LanceDB 0.25+
- RAG: HaikuRAG
- Storage: OpenDAL (multi-backend support)
## Extension Points

**Custom Workflow Steps:** Define custom step handlers by:
- Creating a new async function matching the EventHandler signature
- Registering in workflow YAML configuration
- Implementing retry logic and error handling
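A custom ENRICH-style handler might look like the following. The payload-in/artifact-out signature is an assumption; match the real `EventHandler` signature in `lib/wf` before use:

```python
import asyncio

async def enrich_with_language(payload: dict) -> dict:
    """Hypothetical ENRICH step: tag a chunk with a naive language guess.
    The dict-based signature is an assumption, not the project's protocol."""
    text = payload["text"]
    return {"metadata": {"language": "en" if text.isascii() else "unknown"}}

artifact = asyncio.run(enrich_with_language({"text": "hello world"}))
```

The workflow YAML would then reference the handler (e.g. by dotted path) so the runner can dispatch the step to it.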
**Custom Storage Backends:** Configure via the `FILE_STORE_TARGET` environment variable and OpenDAL configuration.
**Custom Lifecycle Events:** Add event handlers in workflow configuration to respond to:

- `GROUP_START` / `GROUP_END`
- `ITEM_START` / `ITEM_END`
- `STEP_START` / `STEP_END`
- `ITEM_FAILED` / `STEP_FAILED`
## Monitoring

**Database Tables:**

- `workflowrun` - Track run status and duration
- `runstep` - Monitor individual step execution
- `workercheckin` - Worker health and activity
- `lifecyclehistory` - Audit trail of events
**Metrics Available:**
- Document processing throughput
- Step success/failure rates
- Worker utilization
- Processing durations
- Batch completion times
Access via the `/api/v1/stats/*` endpoints.
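For example, a step success rate can be derived from `runstep` status values; this is a client-side sketch, and the stats routes presumably expose similar aggregates directly:

```python
from collections import Counter

def step_success_rate(statuses: list[str]) -> float:
    """Fraction of steps that completed, given a list of runstep status values."""
    counts = Counter(statuses)
    total = sum(counts.values())
    return counts["COMPLETED"] / total if total else 0.0

rate = step_success_rate(["COMPLETED", "COMPLETED", "FAILED", "ERROR"])
```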