Soliplex Ingester Architecture
Overview
Soliplex Ingester is a document processing and RAG (Retrieval-Augmented Generation) ingestion system designed to handle large-scale document workflows. It provides a FastAPI-based REST API, workflow orchestration, and integration with document parsing and embedding services.
System Components
1. FastAPI Server
The server provides REST API endpoints for document and workflow management:
- Document Routes (`/api/v1/document/*`) - Document upload, retrieval, and management
- Batch Routes (`/api/v1/batch/*`) - Batch processing operations
- Workflow Routes (`/api/v1/workflow/*`) - Workflow execution and monitoring
- Stats Routes (`/api/v1/stats/*`) - System statistics and metrics
Server entry point: src/soliplex_ingester/server/__init__.py:30
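A hedged sketch of how a client might address these routes. Only the `/api/v1/*` prefixes come from the route table above; the host/port and any concrete sub-paths (e.g. a specific stats path) are assumptions for illustration.

```python
import urllib.request

# Assumed local dev address -- not specified by the document.
BASE_URL = "http://localhost:8000"

ROUTE_PREFIXES = {
    "document": f"{BASE_URL}/api/v1/document",
    "batch": f"{BASE_URL}/api/v1/batch",
    "workflow": f"{BASE_URL}/api/v1/workflow",
    "stats": f"{BASE_URL}/api/v1/stats",
}

def stats_url(metric_path: str) -> str:
    """Build a stats endpoint URL (concrete sub-paths are hypothetical)."""
    return f"{ROUTE_PREFIXES['stats']}/{metric_path.lstrip('/')}"

def fetch_stats(metric_path: str) -> bytes:
    """GET a stats endpoint; requires a running ingester server."""
    with urllib.request.urlopen(stats_url(metric_path)) as resp:
        return resp.read()
```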
2. Workflow System
The workflow system orchestrates multi-step document processing pipelines:
Workflow Components:
- WorkflowDefinition - Defines the steps and lifecycle events for a workflow
- WorkflowRun - Represents a single execution instance for one document
- RunGroup - Groups multiple workflow runs together
- RunStep - Individual step execution within a workflow run
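The relationships between these four components can be sketched as plain dataclasses. Field names and types here are assumptions, not the actual SQLModel schema:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class RunStep:
    name: str
    status: str = "PENDING"   # PENDING / RUNNING / COMPLETED / ERROR / FAILED

@dataclass
class WorkflowRun:
    document_id: str
    steps: List[RunStep] = field(default_factory=list)

@dataclass
class RunGroup:
    runs: List[WorkflowRun] = field(default_factory=list)

@dataclass
class WorkflowDefinition:
    name: str
    step_names: List[str] = field(default_factory=list)

    def new_run(self, document_id: str) -> WorkflowRun:
        """Instantiate a run with one RunStep per defined step."""
        return WorkflowRun(document_id, [RunStep(s) for s in self.step_names])
```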
Step Types:
- INGEST - Load document into system
- VALIDATE - Validate document format and content
- PARSE - Extract text and structure from document
- CHUNK - Split document into semantic chunks
- EMBED - Generate vector embeddings
- STORE - Save to RAG system (LanceDB + HaikuRAG)
- ENRICH - Add metadata or additional processing
- ROUTE - Conditional routing logic
Implementation: src/soliplex_ingester/lib/wf/
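The step types above could be modeled as an enum; the names match the list, but the enum itself and the "typical" ordering below are illustrative, not taken from `lib/wf/`:

```python
from enum import Enum

class StepType(str, Enum):
    INGEST = "INGEST"
    VALIDATE = "VALIDATE"
    PARSE = "PARSE"
    CHUNK = "CHUNK"
    EMBED = "EMBED"
    STORE = "STORE"
    ENRICH = "ENRICH"
    ROUTE = "ROUTE"

# A typical linear pipeline; ENRICH and ROUTE are workflow-specific extras.
DEFAULT_PIPELINE = [
    StepType.INGEST, StepType.VALIDATE, StepType.PARSE,
    StepType.CHUNK, StepType.EMBED, StepType.STORE,
]
```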
3. Worker System
Async workers process workflow steps concurrently:
- Workers poll for pending workflow steps
- Configurable concurrency levels for different operations
- Automatic retry logic with configurable retry counts
- Health check/heartbeat system via `WorkerCheckin`
Worker implementation: src/soliplex_ingester/lib/wf/runner.py
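A minimal sketch of the poll-and-retry loop, assuming an in-memory queue in place of the database; the real runner polls for pending steps in the database and records `WorkerCheckin` heartbeats:

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

async def worker(
    queue: "asyncio.Queue[dict]",
    handlers: Dict[str, Callable[[dict], Awaitable[None]]],
    max_retries: int = 3,
) -> List[str]:
    """Drain the queue, retrying failed steps up to max_retries.

    Single-consumer sketch only: the empty() check is not safe with
    multiple concurrent workers, which is why the real system uses
    database row locks instead.
    """
    completed: List[str] = []
    while not queue.empty():
        step = await queue.get()
        try:
            await handlers[step["name"]](step)
            completed.append(step["name"])
        except Exception:
            step["attempts"] = step.get("attempts", 0) + 1
            if step["attempts"] < max_retries:
                await queue.put(step)   # retry later
            # else: give up -> step would be marked FAILED
    return completed
```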
4. Storage Layer
Database:
- SQLModel + SQLAlchemy with async support
- Supports SQLite (dev) and PostgreSQL (production)
- Alembic for migrations
File Storage:
- Configurable backends (filesystem, S3-compatible via OpenDAL)
- Separate storage locations for different artifact types:
  - Raw documents
  - Parsed markdown
  - Parsed JSON
  - Chunks
  - Embeddings
Vector Storage:
- LanceDB for vector embeddings
- HaikuRAG client for retrieval operations
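Per-artifact-type, content-addressed file storage might look like the following on a plain filesystem backend. Soliplex Ingester abstracts this behind OpenDAL, and the layout here (artifact type + SHA256 prefix directories) is an assumption:

```python
import hashlib
from pathlib import Path

def store_artifact(root: Path, kind: str, data: bytes) -> Path:
    """Write data under <root>/<kind>/<aa>/<sha256>; identical bytes dedupe."""
    digest = hashlib.sha256(data).hexdigest()
    path = root / kind / digest[:2] / digest
    path.parent.mkdir(parents=True, exist_ok=True)
    if not path.exists():          # identical content is stored once
        path.write_bytes(data)
    return path
```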
5. Document Processing Pipeline
```mermaid
graph LR
    A[Upload Document] --> B[Create DocumentURI]
    B --> C[Hash & Store as Document]
    C --> D[Queue Workflow Run]
    D --> E[Validate Step]
    E --> F[Parse with Docling]
    F --> G[Chunk Text]
    G --> H[Generate Embeddings]
    H --> I[Store in LanceDB]
    I --> J[Update RAG Index]
```
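The "Chunk Text" stage in the diagram splits parsed text into pieces for embedding. A minimal stand-in using fixed-size character windows with overlap (the real chunker produces semantic, structure-aware chunks, so this only shows the shape of the operation):

```python
def chunk_text(text: str, size: int = 400, overlap: int = 50) -> list:
    """Split text into overlapping fixed-size windows."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap    # each window shares `overlap` chars
    return chunks
```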
6. External Services
Docling Server:
- Document parsing service
- Extracts text, structure, and metadata
- Configurable via DOCLING_SERVER_URL
HaikuRAG:
- RAG backend for document retrieval
- Vector search and document management
- Optional (controlled by DO_RAG setting)
Data Flow
Document Ingestion Flow
- Upload - Client uploads document via `/api/v1/document/upload`
- Hash & Dedupe - System computes SHA256 hash, checks for duplicates
- Create URI - Maps source URI to document hash
- Batch Assignment - Associates document with processing batch
- Workflow Creation - Creates WorkflowRun and RunSteps
- Worker Processing - Workers pick up and execute steps
- Status Updates - Database tracks step and run status
- Completion - Document marked complete when all steps succeed
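The Hash & Dedupe and Create URI steps above can be sketched together: hash the uploaded bytes, register only unseen content, and map every source URI to its document hash. The dicts below are in-memory stand-ins for the Document and DocumentURI tables:

```python
import hashlib

documents = {}   # sha256 -> content bytes
uri_index = {}   # source URI -> sha256

def ingest(uri: str, data: bytes):
    """Return (document hash, whether the content was new)."""
    digest = hashlib.sha256(data).hexdigest()
    is_new = digest not in documents
    if is_new:
        documents[digest] = data
    uri_index[uri] = digest   # duplicate URIs still map to the stored doc
    return digest, is_new
```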
Workflow Execution Flow
- Worker Startup - Worker registers and starts polling
- Step Selection - Worker queries for PENDING steps with a `FOR UPDATE` lock
- Status Transition - PENDING → RUNNING → COMPLETED/ERROR/FAILED
- Step Execution - Calls registered handler method
- Artifact Storage - Saves intermediate results
- Retry Logic - Automatic retry on ERROR status
- Run Completion - Aggregates step status to run status
- Group Completion - Aggregates run status to group status
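The Run Completion step aggregates step statuses into a run status (and Group Completion does the same one level up). A sketch of plausible precedence rules, which may differ from the real system's:

```python
def aggregate_status(step_statuses: list) -> str:
    """Collapse a list of step statuses into a single run status."""
    if any(s == "FAILED" for s in step_statuses):
        return "FAILED"
    if any(s in ("RUNNING", "ERROR") for s in step_statuses):
        return "RUNNING"            # ERROR steps are still retryable
    if any(s == "PENDING" for s in step_statuses):
        return "PENDING"
    return "COMPLETED"              # every step completed
```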
Configuration
Configuration via environment variables with pydantic-settings:
- Database connection
- File storage paths
- Worker concurrency settings
- External service URLs
- Workflow and parameter directories
See src/soliplex_ingester/lib/config.py:15 for full configuration schema.
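The project uses pydantic-settings; this stdlib sketch shows the same environment-driven pattern. Variable names below are assumptions, except `DOCLING_SERVER_URL` and `DO_RAG`, which appear elsewhere in this document:

```python
import os
from dataclasses import dataclass

@dataclass
class Settings:
    database_url: str
    docling_server_url: str
    do_rag: bool
    worker_concurrency: int

    @classmethod
    def from_env(cls) -> "Settings":
        # Defaults here are illustrative, not the project's actual defaults.
        return cls(
            database_url=os.environ.get("DATABASE_URL", "sqlite+aiosqlite:///dev.db"),
            docling_server_url=os.environ.get("DOCLING_SERVER_URL", "http://localhost:5001"),
            do_rag=os.environ.get("DO_RAG", "true").lower() in ("1", "true", "yes"),
            worker_concurrency=int(os.environ.get("WORKER_CONCURRENCY", "4")),
        )
```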
Scalability
Horizontal Scaling:
- Multiple workers can run concurrently
- Database row-level locking prevents duplicate processing
- Stateless API servers can be load balanced

Vertical Scaling:
- Configurable concurrency per worker
- Batch size controls for embedding operations
- Connection pooling for database access

Workflow Parallelism:
- Multiple workflows can process simultaneously
- Steps within a single workflow run execute sequentially
- Different documents process independently
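Row-level locking is what lets multiple workers claim steps without duplicate processing. Production PostgreSQL would use `SELECT ... FOR UPDATE`; the portable sqlite3 analogue below does a compare-and-set UPDATE and checks the affected row count instead (the table name and schema are assumptions):

```python
import sqlite3
from typing import Optional

def claim_step(conn: sqlite3.Connection) -> Optional[int]:
    """Atomically claim the oldest PENDING step, or return None."""
    cur = conn.cursor()
    while True:
        row = cur.execute(
            "SELECT id FROM runstep WHERE status = 'PENDING' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None                      # nothing left to claim
        cur.execute(
            "UPDATE runstep SET status = 'RUNNING' "
            "WHERE id = ? AND status = 'PENDING'",   # only if still unclaimed
            (row[0],),
        )
        conn.commit()
        if cur.rowcount == 1:                # we won the race
            return row[0]
```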
Technology Stack
- Web Framework: FastAPI 0.120+
- Database ORM: SQLModel 0.0.27+
- Async Runtime: asyncio
- CLI: Typer
- Document Parsing: Docling
- Vector DB: LanceDB 0.25+
- RAG: HaikuRAG
- Storage: OpenDAL (multi-backend support)
Extension Points
Custom Workflow Steps: Define custom step handlers by:
1. Creating a new async function matching the EventHandler signature
2. Registering in workflow YAML configuration
3. Implementing retry logic and error handling
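A hypothetical custom step handler, sketched as a plain async function; the actual EventHandler signature in `lib/wf/` may differ, and the input/output dict shape here is invented for illustration:

```python
import asyncio
from typing import Any, Dict

async def enrich_language(step_input: Dict[str, Any]) -> Dict[str, Any]:
    """An ENRICH-style step: tag the document with a (naively) detected language."""
    text = step_input.get("text", "")
    step_input["metadata"] = {
        **step_input.get("metadata", {}),
        "language": "en" if text.isascii() else "unknown",
    }
    return step_input
```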
Custom Storage Backends:
Configure via FILE_STORE_TARGET environment variable and OpenDAL configuration.
Custom Lifecycle Events:
Add event handlers in workflow configuration to respond to:
- GROUP_START / GROUP_END
- ITEM_START / ITEM_END
- STEP_START / STEP_END
- ITEM_FAILED / STEP_FAILED
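A minimal dispatcher for the lifecycle events listed above, to show the register-and-emit shape; in the real system, handlers are registered via the workflow YAML configuration, not decorators:

```python
from collections import defaultdict
from typing import Callable, Dict, List

_handlers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

def on(event: str):
    """Register a handler for a lifecycle event (e.g. 'STEP_FAILED')."""
    def register(fn: Callable[[dict], None]):
        _handlers[event].append(fn)
        return fn
    return register

def emit(event: str, payload: dict) -> int:
    """Invoke all handlers for an event; returns how many ran."""
    for fn in _handlers[event]:
        fn(payload)
    return len(_handlers[event])
```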
Monitoring
Database Tables:
- workflowrun - Track run status and duration
- runstep - Monitor individual step execution
- workcheckin - Worker health and activity
- lifecyclehistory - Audit trail of events
Metrics Available:
- Document processing throughput
- Step success/failure rates
- Worker utilization
- Processing durations
- Batch completion times
Access via /api/v1/stats/* endpoints.