System Architecture¶
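Champa Intelligence is a Flask application served by Gunicorn. It reads process data from a Camunda PostgreSQL database, keeps its own state in a separate PostgreSQL system database, and uses Redis for caching and sessions. The design aims for performance, reliability, and maintainability.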
Architecture Overview¶
graph TB
subgraph "Client Layer"
A[Web Browser]
B[Alpine.js Framework]
C[Tailwind CSS]
D[BPMN.js / DMN.js]
end
subgraph "Application Layer"
E[Gunicorn WSGI Server]
F[Flask Application]
G[Blueprint Architecture]
G --> H1[Auth BP]
G --> H2[Dashboard BP]
G --> H3[Health BP]
G --> H4[AI Analysis BP]
G --> H5[Journey BP]
G --> H6[Diff Tool BP]
G --> H7[Portfolio BP]
G --> H8[Linter BP]
end
subgraph "Data Layer"
I[(Camunda DB<br/>PostgreSQL)]
J[(System DB<br/>PostgreSQL)]
K[Redis Cache]
end
subgraph "External Services"
L[Google Gemini AI]
M[Camunda REST API]
N[Prometheus/Grafana]
end
A --> B
B --> E
E --> F
F --> G
F --> I
F --> J
F --> K
F --> L
F --> M
F --> N
Architectural Principles¶
1. Separation of Concerns¶
Blueprint Architecture:
- Each feature is a self-contained Flask Blueprint
- Clean separation between auth, analytics, monitoring, etc.
- Independent testing and deployment per blueprint
Database Separation:
- Camunda DB: Read-only access to process data
- System DB: Champa's own data (users, sessions, cache)
- No schema pollution in customer database
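One way to keep the two databases apart in code is to build their connection settings separately and mark the Camunda session read-only. The following is a minimal sketch, assuming psycopg2-style keyword arguments; the function names and the example host and database values are hypothetical. The `options` string relies on PostgreSQL's `default_transaction_read_only` setting.

```python
def camunda_conn_kwargs(host: str, dbname: str, user: str, password: str) -> dict:
    """Connection kwargs for the customer's Camunda DB (read-only session)."""
    return {
        "host": host,
        "dbname": dbname,
        "user": user,
        "password": password,
        # Ask PostgreSQL to start every transaction in this session read-only.
        "options": "-c default_transaction_read_only=on",
    }


def system_conn_kwargs(host: str, dbname: str, user: str, password: str) -> dict:
    """Connection kwargs for Champa's own system DB (read-write)."""
    return {"host": host, "dbname": dbname, "user": user, "password": password}


# Hypothetical values, for illustration only.
camunda = camunda_conn_kwargs("camunda-db.example", "camunda", "champa_ro", "secret")
system = system_conn_kwargs("localhost", "champa", "champa", "secret")
```

Either dict could be passed as `psycopg2.connect(**kwargs)`. A session can still change this setting, so a database role with only `SELECT` privileges remains the stronger guarantee.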
2. Performance First¶
Multi-Level Caching:
- Redis for hot data (sessions, query results)
- PostgreSQL fallback for session management
- Smart cache invalidation on deployments
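The Redis-first, PostgreSQL-second session lookup can be sketched as a two-tier store. This is an illustration only: plain dicts stand in for Redis and the `auth.sessions` table, the class name is hypothetical, and the sketch assumes the fallback is taken on a cache miss (the source does not say how the real fallback is triggered).

```python
from typing import Optional


class SessionStore:
    """Two-tier session lookup: fast cache first, durable store second."""

    def __init__(self, cache: dict, durable: dict):
        # Stand-ins: `cache` plays the role of Redis, `durable` of PostgreSQL.
        self.cache = cache
        self.durable = durable

    def save(self, session_id: str, data: dict) -> None:
        # Write to the durable store first so a cache loss never loses a session.
        self.durable[session_id] = data
        self.cache[session_id] = data

    def load(self, session_id: str) -> Optional[dict]:
        data = self.cache.get(session_id)
        if data is None:
            data = self.durable.get(session_id)
            if data is not None:
                self.cache[session_id] = data  # repopulate the fast tier
        return data
```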
Lazy Loading:
- Dashboard components load on-demand
- Reduces initial page load to <500ms
- Parallel data fetching with ThreadPoolExecutor
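Parallel fetching with `ThreadPoolExecutor` might look like the sketch below. The function name `fetch_sections` and the loader names are hypothetical; in the real application each loader would run a database query.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_sections(loaders: dict, max_workers: int = 4) -> dict:
    """Run each loader callable in a worker thread and collect results by name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(fn) for name, fn in loaders.items()}
        # .result() blocks until the task finishes and re-raises its exception.
        return {name: future.result() for name, future in futures.items()}


# Hypothetical loaders standing in for SQL queries.
sections = fetch_sections({
    "incidents": lambda: ["incident-1"],
    "performance": lambda: {"avg_ms": 120},
})
```

Threads suit this workload because the loaders spend their time waiting on I/O rather than computing.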
Optimized Queries:
- 80+ hand-crafted SQL queries
- Strategic indexes and query planning
- Batch operations for bulk data
3. Scalability¶
Horizontal Scaling:
- Stateless application design
- Session data in Redis (shared across instances)
- Multiple Gunicorn workers per container
Vertical Scaling:
- Configurable worker/thread count
- Connection pooling for databases
- Efficient memory management
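Gunicorn reads its settings from a Python file, so the worker and thread counts can be made configurable as in the sketch below. `bind`, `workers`, `threads`, and `timeout` are standard Gunicorn settings; the 4 workers and port 8088 come from the production diagram on this page, while the environment variable names, thread count, and timeout are assumptions.

```python
# gunicorn.conf.py (illustrative values)
import os

# Listen address inside the application container.
bind = "0.0.0.0:8088"

# Worker processes and threads per worker; env var names are hypothetical.
workers = int(os.environ.get("GUNICORN_WORKERS", "4"))
threads = int(os.environ.get("GUNICORN_THREADS", "2"))

# Seconds before an unresponsive worker is restarted (assumed value).
timeout = 60
```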
4. Security by Design¶
Defense in Depth:
- JWT-based authentication
- Role-based access control (RBAC)
- Audit logging for all actions
- SQL injection prevention (parameterized queries)
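Parameterized queries keep user input out of the SQL text. The sketch below uses the standard-library `sqlite3` module as a stand-in so it runs without a PostgreSQL server; psycopg2 follows the same pattern with `%s` placeholders instead of `?`. The table is a simplified stand-in for `act_ru_incident`, and the function name is hypothetical.

```python
import sqlite3


def incidents_by_type(conn: sqlite3.Connection, incident_type: str) -> list:
    # The value travels separately from the SQL string, so it is never
    # parsed as SQL, whatever characters it contains.
    cursor = conn.execute(
        "SELECT id_ FROM act_ru_incident WHERE incident_type_ = ?",
        (incident_type,),
    )
    return [row[0] for row in cursor.fetchall()]


# In-memory stand-in table with one row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE act_ru_incident (id_ TEXT, incident_type_ TEXT)")
conn.execute("INSERT INTO act_ru_incident VALUES ('i1', 'failedJob')")
```

An input such as `failedJob' OR '1'='1` is compared as a literal string and matches nothing.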
Data Protection:
- Salted password hashing (PBKDF2)
- Secure session management
- API token lifecycle management
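Salted PBKDF2 hashing is available in the Python standard library. The following is a sketch, not the project's actual code: the hash function (SHA-256), salt length, iteration count, and function names are all assumptions.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # illustrative; tune to current guidance and hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Storing the salt next to the digest is expected; its job is to make identical passwords hash differently, not to stay secret.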
Technology Stack¶
Backend¶
| Component | Technology | Purpose |
|---|---|---|
| Language | Python 3.12 | Core application logic |
| Framework | Flask 3.x | Web framework |
| WSGI Server | Gunicorn | Production application server |
| Database Driver | psycopg2-binary | PostgreSQL connectivity |
| Cache Client | redis | High-performance caching |
| AI SDK | google-genai | Gemini API (default) |
| HTTP Client | requests | External API calls |
| XML Parser | lxml | BPMN/DMN parsing |
| Auth | PyJWT | Token-based authentication |
Frontend¶
| Component | Technology | Purpose |
|---|---|---|
| JavaScript Framework | Alpine.js 3.x | Reactive UI components |
| CSS Framework | Tailwind CSS 3.x | Utility-first styling |
| BPMN Rendering | bpmn-js 18.x | Process diagram visualization |
| DMN Rendering | dmn-js 17.x | Decision table visualization |
| Charts | Chart.js 4.x | Data visualization |
| Build Tool | Webpack 5.x | Module bundling |
| Transpiler | Babel | ES6+ to ES5 |
| Documentation | MkDocs Material | Project documentation |
Infrastructure¶
| Component | Technology | Purpose |
|---|---|---|
| Container | Docker | Application containerization |
| Orchestration | Docker Compose | Multi-container deployment |
| Database | PostgreSQL 15+ | Data persistence |
| HA Database | Patroni + etcd | PostgreSQL high availability |
| Cache | Redis 7+ | In-memory data store |
| Web Server | Nginx | Reverse proxy, static files, docs |
| Monitoring | Prometheus + Grafana | Metrics and dashboards |
Component Architecture¶
Database Schema¶
Camunda Database (Read-Only):
-- Process Definitions
act_re_procdef
act_re_deployment
-- Process Instances
act_hi_procinst
act_ru_execution
-- Activities
act_hi_actinst
-- Variables
act_hi_varinst
act_ru_variable
-- Incidents
act_hi_incident
act_ru_incident
-- Jobs
act_hi_job_log
act_ru_job
-- User Tasks
act_hi_taskinst
act_ru_task
-- DMN
act_hi_decinst
act_re_decision_def
System Database (Champa):
-- Authentication Schema
auth.users
auth.roles
auth.permissions
auth.role_permissions
auth.sessions
auth.audit_log
-- Configuration
auth.system_config (linter rules, settings)
Caching Strategy¶
graph LR
A[Request] --> B{Cache?}
B -->|Hit| C[Return from Cache]
B -->|Miss| D[Query Database]
D --> E[Store in Cache]
E --> F[Return Data]
G[Deployment Event] --> H[Invalidate Cache]
H --> I[Static Data Only]
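The hit/miss path and the deployment-triggered invalidation in the diagram follow the cache-aside pattern. Here is a minimal in-memory sketch; the class name, the `static:` key prefix, and the use of a dict instead of Redis are assumptions made for illustration.

```python
import time


class TTLCache:
    """Cache-aside helper with per-entry expiry and prefix invalidation."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (expires_at, value)
        self._clock = clock

    def get_or_load(self, key, loader, ttl_seconds):
        now = self._clock()
        entry = self._data.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                      # hit: return from cache
        value = loader()                         # miss: query the database
        self._data[key] = (now + ttl_seconds, value)
        return value

    def invalidate_prefix(self, prefix):
        """Drop every entry whose key starts with `prefix`."""
        for key in [k for k in self._data if k.startswith(prefix)]:
            del self._data[key]
```

On a deployment event, calling `cache.invalidate_prefix("static:")` would clear only the static data and leave other entries to expire by TTL.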
Cache Layers:
Session Cache (Redis)
- TTL: 1 hour (normal) / 30 days (remember me)
- Fallback: PostgreSQL
- Purpose: User authentication
Query Cache (Redis)
- TTL: 5 min - 24 hours (data-dependent)
- Purpose: Expensive SQL queries
- Invalidation: Smart, per-query-type
AI Cache (Redis)
- TTL: 30 min - 24 hours
- Purpose: AI analysis components
- Key structure: ai:{type}:{process}:{version}:{params}
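A key in the `ai:{type}:{process}:{version}:{params}` shape could be built as follows. How the `params` segment is encoded is not stated here, so this sketch assumes a short hash of the canonical JSON form; the function name is hypothetical.

```python
import hashlib
import json


def ai_cache_key(kind: str, process_key: str, version: int, params: dict) -> str:
    """Build a cache key; equal params always produce the same key."""
    # Sorting keys makes the encoding independent of dict insertion order.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"ai:{kind}:{process_key}:{version}:{digest}"


key = ai_cache_key("incidents", "invoice", 3, {"days": 7, "limit": 10})
```

Hashing the parameters keeps keys short and free of characters such as `:` that would make the segments ambiguous.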
Request Flow¶
1. Authentication Flow¶
sequenceDiagram
participant Browser
participant Flask
participant Redis
participant SystemDB
Browser->>Flask: POST /auth/login
Flask->>SystemDB: Verify credentials
SystemDB-->>Flask: User data
Flask->>Flask: Generate JWT
Flask->>Redis: Store session
Redis-->>Flask: OK
Flask-->>Browser: Set cookie + JWT
Browser->>Flask: GET /dashboard (with JWT)
Flask->>Redis: Validate session
Redis-->>Flask: Session data
Flask->>Flask: Check permissions
Flask-->>Browser: Dashboard HTML
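To make the "Generate JWT" and validation steps concrete, the sketch below issues and verifies an HS256 token using only the standard library. The stack lists PyJWT, which would normally do this work; the hand-rolled version exists purely so the example runs without dependencies and is not meant for production. Function names and claim contents are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> str:
    mac = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256)
    return _b64url(mac.digest())


def issue_token(claims: dict, secret: bytes, ttl_seconds: int, now=None) -> str:
    now = int(time.time()) if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    payload = _b64url(json.dumps({**claims, "exp": now + ttl_seconds}).encode("utf-8"))
    return f"{header}.{payload}.{_sign(header + '.' + payload, secret)}"


def verify_token(token: str, secret: bytes, now=None):
    """Return the claims if signature and expiry check out, else None."""
    try:
        header, payload, signature = token.split(".")
        expected = _sign(header + "." + payload, secret)
        # Compare as bytes in constant time.
        if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
            return None
        claims = json.loads(_b64url_decode(payload))
    except ValueError:
        return None  # malformed token, bad base64, or bad JSON
    now = int(time.time()) if now is None else now
    return claims if claims.get("exp", 0) > now else None
```

In the flow above, a valid token is only the first gate; the server still looks up the session in Redis and checks permissions before rendering the dashboard.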
2. Dashboard Load Flow¶
sequenceDiagram
participant Browser
participant Flask
participant Redis
participant CamundaDB
Browser->>Flask: GET /dashboard/<key>/<v1>/<v2>
Flask->>Redis: Check auth session
Redis-->>Flask: Valid session
Flask-->>Browser: Dashboard shell (HTML)
Note over Browser: Page renders instantly
Browser->>Flask: GET /api/dashboard/section/incidents
Flask->>Redis: Check cache
alt Cache Hit
Redis-->>Flask: Cached data
else Cache Miss
Flask->>CamundaDB: Query incidents
CamundaDB-->>Flask: Raw data
Flask->>Redis: Store in cache
end
Flask-->>Browser: JSON data
Note over Browser: Section updates dynamically
3. AI Analysis Flow¶
sequenceDiagram
participant Browser
participant Flask
participant Redis
participant CamundaDB
participant Gemini
participant SystemDB
Browser->>Flask: POST /ai-analysis/api/generate
Flask->>Redis: Check cached components
par Parallel Data Fetch
Flask->>CamundaDB: Query incidents
Flask->>CamundaDB: Query performance
Flask->>CamundaDB: Query variables
end
CamundaDB-->>Flask: All data
Flask->>Redis: Cache components
Flask->>Flask: Build prompt
Flask->>Gemini: Generate analysis
Gemini-->>Flask: AI response
Flask->>SystemDB: Save to history
Flask-->>Browser: HTML report
Deployment Architecture¶
Development Environment¶
graph TB
subgraph "Developer Workstation"
A[Flask Dev Server<br/>:5000]
B[(Local PostgreSQL<br/>Camunda DB)]
C[(Local PostgreSQL<br/>System DB)]
D[Local Redis<br/>:6379]
E[npm watch<br/>Frontend Build]
A --> B
A --> C
A --> D
E -.->|Hot Reload| A
end
style A fill:#4CAF50
style E fill:#2196F3
Production (Single Server - Docker Compose)¶
graph TB
subgraph "Docker Host"
subgraph "Nginx Container :80/:443"
N1[Static Files<br/>/static]
N2[Documentation<br/>docs.champa-bpmn.com]
N3[Reverse Proxy<br/>www.champa-bpmn.com]
end
subgraph "Application Container :8088"
A[Gunicorn<br/>4 Workers]
F[Flask Application<br/>Blueprints]
end
subgraph "Data Layer"
SDB[(System DB<br/>PostgreSQL :5433)]
R[Redis<br/>:6379]
end
EXT[(External Camunda DB<br/>Customer Infrastructure)]
end
Internet --> N1
Internet --> N2
Internet --> N3
N3 --> A
A --> F
F --> SDB
F --> R
F --> EXT
style A fill:#4CAF50
style F fill:#66BB6A
style N3 fill:#2196F3
style EXT fill:#FF9800
High Availability Setup (K8s)¶
graph TB
subgraph "Load Balancer Layer"
LB[HAProxy/Nginx LB<br/>:80/:443]
end
subgraph "Application Layer - 3 Nodes"
APP1[Champa Node 1<br/>Gunicorn+Flask]
APP2[Champa Node 2<br/>Gunicorn+Flask]
APP3[Champa Node 3<br/>Gunicorn+Flask]
end
subgraph "Cache Layer - Redis with Sentinel"
subgraph "Redis Sentinel"
RS1[Sentinel 1<br/>:26379]
RS2[Sentinel 2<br/>:26379]
RS3[Sentinel 3<br/>:26379]
end
RMASTER[Redis Master<br/>:6379]
RSLAVE1[Redis Replica 1<br/>:6380]
RSLAVE2[Redis Replica 2<br/>:6381]
RS1 -.Monitor.-> RMASTER
RS2 -.Monitor.-> RMASTER
RS3 -.Monitor.-> RMASTER
RMASTER -->|Replicate| RSLAVE1
RMASTER -->|Replicate| RSLAVE2
end
subgraph "Database Layer - Patroni HA"
subgraph "etcd Cluster - Distributed Consensus"
E1[etcd Node 1<br/>:2379]
E2[etcd Node 2<br/>:2379]
E3[etcd Node 3<br/>:2379]
end
subgraph "PostgreSQL Cluster"
PG1[Patroni + PostgreSQL<br/>Primary :5432]
PG2[Patroni + PostgreSQL<br/>Standby :5432]
end
E1 <-->|Raft Consensus| E2
E2 <-->|Raft Consensus| E3
E3 <-->|Raft Consensus| E1
PG1 -.Patroni API.-> E1
PG1 -.Patroni API.-> E2
PG1 -.Patroni API.-> E3
PG2 -.Patroni API.-> E1
PG2 -.Patroni API.-> E2
PG2 -.Patroni API.-> E3
PG1 -->|Streaming<br/>Replication| PG2
end
subgraph "External Services"
CAMDB[(Customer<br/>Camunda DB)]
DOCS[Documentation<br/>Static Site]
end
Internet --> LB
LB --> APP1
LB --> APP2
LB --> APP3
APP1 --> RMASTER
APP2 --> RMASTER
APP3 --> RMASTER
APP1 --> PG1
APP2 --> PG1
APP3 --> PG1
APP1 -.ReadOnly.-> CAMDB
APP2 -.ReadOnly.-> CAMDB
APP3 -.ReadOnly.-> CAMDB
LB --> DOCS
style LB fill:#2196F3
style APP1 fill:#4CAF50
style APP2 fill:#4CAF50
style APP3 fill:#4CAF50
style RMASTER fill:#FF5722
style PG1 fill:#1976D2
style PG2 fill:#64B5F6
style E1 fill:#9C27B0
style E2 fill:#9C27B0
style E3 fill:#9C27B0
style CAMDB fill:#FF9800
style DOCS fill:#00BCD4
High Availability Components:
Load Balancer (HAProxy/Nginx)
- Health checks on application nodes
- Session affinity (sticky sessions)
- SSL termination
- Automatic failover
Application Layer (3+ Nodes)
- Stateless Flask applications
- Horizontal scaling
- Zero-downtime deployments
- Independent failure domains
Redis Sentinel (3 Nodes)
- Automatic master failover
- Configuration provider
- Notification system
- Quorum-based decisions
- Promotes replica to master on failure
PostgreSQL with Patroni (2+ Nodes)
- Automatic failover via Patroni
- Streaming replication
- Point-in-time recovery
- Connection pooling via PgBouncer
- Watchdog for split-brain prevention
etcd Cluster (3 Nodes)
- Distributed consensus (Raft algorithm)
- Configuration storage for Patroni
- Leader election
- Highly consistent key-value store
- Tolerates single node failure
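The claim that a 3-node etcd cluster tolerates a single node failure follows from Raft's majority rule, which can be worked through in a few lines. The function names are hypothetical.

```python
def quorum_size(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can fail while a majority is still reachable."""
    return nodes - quorum_size(nodes)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

A fourth node raises the quorum to 3 without tolerating any more failures than three nodes do, which is why odd cluster sizes are the usual choice.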
Failover Scenarios:
sequenceDiagram
participant App as Application
participant S1 as Sentinel 1
participant S2 as Sentinel 2
participant S3 as Sentinel 3
participant RM as Redis Master
participant RS as Redis Replica
Note over RM: Master Fails
S1->>RM: PING (timeout)
S1->>S2: Master down?
S1->>S3: Master down?
S2-->>S1: Confirmed down
S3-->>S1: Confirmed down
Note over S1,S3: Quorum reached (3/3)
S1->>RS: Promote to Master
RS-->>S1: Promotion complete
S1->>App: Config update: New master
App->>RS: Connect to new master
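The two voting steps behind a Sentinel failover can be sketched as simple arithmetic. As described in the Redis Sentinel documentation, the configured quorum decides when the master is considered objectively down, while starting the failover also needs a majority of all Sentinels. The function names and the quorum value of 2 are assumptions for illustration.

```python
def objectively_down(down_votes: int, quorum: int) -> bool:
    """The master is flagged as down once `quorum` Sentinels agree."""
    return down_votes >= quorum


def failover_authorized(leader_votes: int, total_sentinels: int, quorum: int) -> bool:
    """The leading Sentinel needs a majority of all Sentinels and at least `quorum` votes."""
    majority = total_sentinels // 2 + 1
    return leader_votes >= max(majority, quorum)


# Three Sentinels with an assumed quorum of 2.
print(objectively_down(down_votes=3, quorum=2))
print(failover_authorized(leader_votes=2, total_sentinels=3, quorum=2))
```

With three Sentinels, one can be lost and a failover can still proceed; with only two left and one of them unreachable, it cannot.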
sequenceDiagram
participant App as Application
participant E as etcd Cluster
participant P1 as Patroni Primary
participant P2 as Patroni Standby
participant PG1 as PostgreSQL Primary
participant PG2 as PostgreSQL Standby
Note over PG1: Primary DB Fails
P1->>E: Failed to update lease
P2->>E: Attempt to acquire lease
E-->>P2: Lease acquired (leader)
P2->>PG2: Promote to primary
PG2-->>P2: Promotion complete
P2->>E: Update cluster state
E-->>App: New primary endpoint
App->>PG2: Connect to new primary
Monitoring & Health Checks:
- Application: /health/ping, /health/db
- Redis: PING command via Sentinel
- PostgreSQL: pg_isready, replication lag monitoring
- etcd: HTTP health endpoint, cluster health API
- Patroni: REST API health endpoint
Security Architecture¶
Authentication & Authorization¶
graph TD
A[User Request] --> B{Has JWT?}
B -->|No| C[Redirect to Login]
B -->|Yes| D{Valid Token?}
D -->|No| C
D -->|Yes| E{User Active?}
E -->|No| C
E -->|Yes| F{Has Permission?}
F -->|No| G[403 Forbidden]
F -->|Yes| H[Process Request]
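The decision chain in the diagram reduces to a short function. This is a sketch of the ordering only; the function name and the returned labels are hypothetical, and the real checks would inspect the request, the token, and the user record.

```python
def decide(has_token: bool, token_valid: bool, user_active: bool, has_permission: bool) -> str:
    """Mirror the flowchart: three checks lead to login, the fourth to 403."""
    if not has_token or not token_valid or not user_active:
        return "redirect_to_login"
    if not has_permission:
        return "forbidden_403"
    return "process_request"
```

The distinction matters to the client: a redirect means "authenticate again", while 403 means the identity is known but the permission is missing.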
Security Layers:
- Authentication: JWT tokens with expiration
- Session Management: Redis-backed with TTL
- Authorization: RBAC with 12+ permissions
- Audit: All actions logged
- API Tokens: Token lifecycle management
Permission Model¶
PERMISSIONS = {
'full_access': 'Complete system access',
'api_access': 'Programmatic API access',
'portfolio_data': 'Portfolio dashboard',
'extended_dashboard_data': 'Process intelligence',
'bpmn_analysis_data': 'BPMN analytics viewer',
'dmn_analysis_data': 'DMN analytics',
'health_monitor_data': 'Health monitoring',
'journey_analysis_data': 'Journey monitoring',
'ai_analysis_data': 'AI-powered analysis',
'diff_tool_data': 'BPMN diff tool',
'model_validation_data': 'Model validator',
'manage_users': 'User management',
'manage_roles': 'Role management'
}
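A permission check against this model could resolve a user's roles into a permission set and then test membership, as sketched below. The role names and helper functions are hypothetical, and treating `full_access` as a wildcard is an assumption based on its description ("Complete system access").

```python
def effective_permissions(user_roles, role_permissions: dict) -> set:
    """Union of the permissions granted by each of the user's roles."""
    granted = set()
    for role in user_roles:
        granted |= set(role_permissions.get(role, ()))
    return granted


def has_permission(granted: set, required: str) -> bool:
    # Assumption: 'full_access' implies every other permission.
    return "full_access" in granted or required in granted


# Hypothetical role definitions.
ROLE_PERMISSIONS = {
    "admin": ["full_access"],
    "analyst": ["portfolio_data", "ai_analysis_data"],
}

analyst = effective_permissions(["analyst"], ROLE_PERMISSIONS)
admin = effective_permissions(["admin"], ROLE_PERMISSIONS)
```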
Monitoring & Observability¶
Application Logs¶
Log Levels:
- DEBUG: Detailed execution flow
- INFO: Normal operations
- WARNING: Recoverable issues
- ERROR: Errors with stack traces
- CRITICAL: System failures
Log Categories:
logs/
├── application.log # General application
├── access.log # HTTP requests
├── security.log # Auth & security events
├── database.log # DB queries & performance
├── cache.log # Cache operations
├── ai.log # AI analysis operations
└── structured.log # Machine-readable JSON
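A machine-readable log such as `structured.log` can be produced with a custom formatter on top of the standard `logging` module. This is a minimal sketch; the field names are assumptions, since the actual JSON schema is not documented here.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Include the stack trace for ERROR-level records raised with exc_info.
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)
```

Attaching it takes two lines: create a `logging.FileHandler("logs/structured.log")` and call `handler.setFormatter(JsonFormatter())`.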
Prometheus Metrics¶
Exported Metrics:
- Cluster health (nodes, instances, incidents)
- Per-node metrics (workload, job rates)
- JVM metrics (heap, GC, threads)
- Database metrics (connections, latency)
- Process-level KPIs (health scores, rates)
Health Checks¶
GET /health/ping # Simple liveness
GET /health/db # Database connectivity
GET /health/api/full # Comprehensive health
GET /health/light/metrics # Prometheus metrics
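A comprehensive endpoint such as `/health/api/full` typically runs several independent checks and rolls them up into one status. The sketch below shows that aggregation; the function name, the response shape, and the `healthy`/`degraded` labels are assumptions, not the documented response format.

```python
def full_health(checks: dict) -> dict:
    """Run each check callable; a check passes if it returns without raising."""
    results = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception as exc:  # report the failure instead of crashing the endpoint
            results[name] = f"error: {exc}"
    healthy = all(result == "ok" for result in results.values())
    return {"status": "healthy" if healthy else "degraded", "checks": results}
```

Running every check even after one fails gives the operator the full picture in a single request.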
Next Steps¶
- Database Schema - Detailed schema documentation
- Caching Strategy - Cache configuration guide
- Security Model - Security deep-dive
- Frontend Architecture - UI/UX architecture