Portfolio Dashboard

Executive command center for monitoring all process definitions at scale


Overview

The Portfolio Dashboard is your strategic command center for managing and monitoring your entire Camunda process landscape. Unlike feature-specific views that focus on individual processes or instances, the Portfolio Dashboard provides a bird's-eye view of your complete BPM ecosystem, enabling executives, operations managers, and process owners to understand portfolio-wide health, identify critical issues instantly, and track performance trends over time.


The Challenge: Managing Process Portfolios at Scale

Organizations running Camunda in production typically manage dozens or hundreds of process definitions across multiple versions, environments, and business domains. Without portfolio-level visibility, teams face:

  • No unified view - Must check each process individually to understand overall health
  • Reactive firefighting - Discover critical issues only after escalation
  • Resource inefficiency - Can't identify where to focus optimization efforts
  • Blind spots on trends - Miss patterns that indicate systemic problems
  • Lack of executive metrics - No way to report business velocity and operational excellence

The Portfolio Dashboard solves these problems by aggregating, analyzing, and visualizing metrics across your entire process portfolio in a single, actionable interface.


Key Capabilities

📊 At-a-Glance Executive KPIs

Eight critical metrics provide instant understanding of portfolio health:

Operational Metrics:

  • Total Processes - Number of unique process definitions being monitored
  • Active Instances - Real-time count of currently running process instances across all definitions
  • Open Incidents - Total unresolved incidents indicating immediate operational issues
  • Started This Month - New instances launched in the current calendar month

Business Velocity:

  • Started 30d - Instances started in the last 30 days (throughput indicator)
  • Success Rate - Percentage of instances completed without incidents in the last 30 days (quality metric)
  • Avg Completion Time - Historical average duration across all completed instances (efficiency benchmark)
  • Processes At Risk - Count of process definitions with one or more open incidents (risk exposure)

These KPIs answer the fundamental questions:

  • How much work are we processing? (Active Instances, Started metrics)
  • How well are we performing? (Success Rate, Completion Time)
  • What needs immediate attention? (Open Incidents, Processes At Risk)

🚨 Critical Attention Module

The "Critical Attention Required" section automatically identifies the top 5 most problematic processes based on a weighted score combining:

  • Incident count - Absolute number of open incidents
  • Incident rate - Percentage of active instances with incidents

This intelligent prioritization ensures teams focus optimization efforts where they'll have the greatest impact. Each entry shows:

  • Process key (clickable to Journey Monitoring)
  • Number of open incidents
  • Number of active instances
  • Incident rate percentage

Empty state: When no incidents exist, the section displays "All Systems Operational" with a green checkmark, giving stakeholders instant confidence.
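The prioritization described above can be sketched as follows. This is a minimal illustration, not the dashboard's actual implementation: the documentation does not state the real weights, so `COUNT_WEIGHT` and `RATE_WEIGHT` are hypothetical, as are the `ProcessStats` type and every process key other than `order-fulfillment`.

```python
from dataclasses import dataclass


@dataclass
class ProcessStats:
    process_key: str
    open_incidents: int
    active_instances: int

    @property
    def incident_rate(self) -> float:
        """Open incidents as a percentage of active instances (0 when idle)."""
        if self.active_instances == 0:
            return 0.0
        return 100.0 * self.open_incidents / self.active_instances


# Hypothetical weights: the documentation does not publish the real ones.
COUNT_WEIGHT = 1.0
RATE_WEIGHT = 0.5


def attention_score(p: ProcessStats) -> float:
    """Weighted blend of absolute incident count and incident rate."""
    return COUNT_WEIGHT * p.open_incidents + RATE_WEIGHT * p.incident_rate


def critical_attention(processes: list[ProcessStats], limit: int = 5) -> list[ProcessStats]:
    """Return up to `limit` processes with open incidents, worst first."""
    with_incidents = [p for p in processes if p.open_incidents > 0]
    return sorted(with_incidents, key=attention_score, reverse=True)[:limit]


portfolio = [
    ProcessStats("order-fulfillment", open_incidents=2, active_instances=47),
    ProcessStats("invoice-approval", open_incidents=6, active_instances=300),
    ProcessStats("customer-onboarding", open_incidents=0, active_instances=12),
    ProcessStats("refund-handling", open_incidents=3, active_instances=4),
]

top = critical_attention(portfolio)
for p in top:
    print(p.process_key, p.open_incidents, p.active_instances, f"{p.incident_rate:.1f}%")
if not top:
    print("All Systems Operational")  # empty state
```

With these assumed weights, a small process with a very high incident rate (`refund-handling`, 3 of 4 instances) outranks a large process with more incidents but a low rate, which is the kind of trade-off a combined score is meant to capture.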

📈 Portfolio KPIs Table

Comprehensive metrics table for every process definition in your portfolio:

| Metric | Description | Use Case |
|---|---|---|
| Process Key | Definition identifier (clickable) | Navigate to details |
| Ver. | Latest version number | Version tracking |
| Active | Currently running instances | Real-time workload |
| Month | Started this month | Current month activity |
| 30d Start | Started last 30 days | Recent throughput |
| 30d Done | Completed last 30 days | Completion throughput |
| Compl. % | Completion rate (Done/Start × 100) | Process efficiency |
| Daily | Average completions per day | Steady-state capacity |
| Total | All-time completed instances | Historical volume |
| Avg Duration | Mean completion time | Performance baseline |
| Incidents | Current open incidents | Real-time problems |
| Inc. Rate | Incidents/Active × 100 | Problem density |
| Success % | Completed without incidents (30d) | Quality metric |
| Health | Composite health score (0-100) | Overall status |

🔍 Process Health Matrix

Visual grid providing rapid health assessment of all processes:

Three Health States:

🟢 Healthy (Green)

  • No open incidents
  • Has active instances
  • Operating normally

Idle (Gray)

  • No open incidents
  • Zero active instances
  • Recently inactive

🔴 Incidents (Red)

  • One or more open incidents
  • Requires attention

Each tile shows:

  • Process key
  • Health status indicator
  • Active instances count
  • Instances started this month

Use Case: Quickly scan the portfolio to identify patterns (e.g., "Why are 5 processes idle?" or "Which healthy processes need capacity planning?")
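The three health states reduce to a small decision rule. The sketch below is one reading of the definitions above; the function name and the returned strings are illustrative, and the "recently inactive" nuance of the Idle state is not modeled.

```python
def health_state(open_incidents: int, active_instances: int) -> str:
    """Map a process to one of the three Health Matrix states."""
    if open_incidents > 0:
        return "incidents"  # red: one or more open incidents
    if active_instances > 0:
        return "healthy"    # green: running with no open incidents
    return "idle"           # gray: no incidents and nothing running


print(health_state(open_incidents=2, active_instances=47))  # incidents
print(health_state(open_incidents=0, active_instances=47))  # healthy
print(health_state(open_incidents=0, active_instances=0))   # idle
```

Incidents are checked first, so a process with open incidents is always red regardless of how many instances are active.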

📉 12-Month Trend Analysis

Interactive ApexCharts line chart plotting historical trends:

Dual Metrics:

  • Blue Line: Instances Started per month
  • Red Line: Incidents Created per month

Insights Enabled:

  • Seasonal Patterns - Identify monthly workload variations (e.g., month-end spikes)
  • Trend Detection - Spot increasing incident rates over time
  • Release Impact - Correlate deployments with stability changes
  • Capacity Planning - Forecast based on growth trends
  • Performance Degradation - Catch declining quality before crisis

Interactive Controls:

  • Hover to see exact values
  • Legend toggle to focus on single metric
  • Responsive to dark mode theme changes

🔄 Auto-Loading Architecture

The dashboard uses intelligent lazy loading to optimize performance:

Initial Page Load:

  • Renders shell immediately
  • Shows loading skeletons for data sections
  • No blocking database queries

Progressive Loading:

  • Overview data loads when summary cards enter viewport
  • Trends data loads when chart section enters viewport
  • Refresh button reloads all data on-demand

Benefits:

  • Fast perceived performance - Page appears instantly
  • Reduced server load - Only loads data when viewed
  • Responsive updates - Refresh specific sections independently

Understanding the Metrics

Time Scopes Explained

The dashboard uses four distinct time scopes for different purposes:

Real-Time (Current Moment)

  • Active Instances
  • Open Incidents
  • Incident Rate
  • Health Score

These reflect right now - the live state of your portfolio.

Last 30 Days (Rolling Window)

  • Started 30d
  • Completed 30d
  • Completion Rate
  • Success Rate

These track recent performance - how well you've been doing lately.

Current Month (Calendar Month)

  • Started This Month

Tracks this month's activity - useful for monthly reporting.

All-Time (Historical)

  • Total Completed
  • Average Duration

Provides long-term context - your complete history.
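The difference between the rolling 30-day window and the calendar-month window is easiest to see with concrete dates. The sketch below is illustrative only: the helper names and the start times are made up, and it assumes all timestamps are in UTC.

```python
from datetime import datetime, timedelta, timezone


def rolling_30d_start(now: datetime) -> datetime:
    """Start of the rolling window: exactly 30 days before `now`."""
    return now - timedelta(days=30)


def calendar_month_start(now: datetime) -> datetime:
    """Start of the calendar window: midnight on the 1st of the current month."""
    return now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


def count_started(start_times: list[datetime], window_start: datetime, now: datetime) -> int:
    """Count instances whose start time falls inside [window_start, now]."""
    return sum(1 for t in start_times if window_start <= t <= now)


utc = timezone.utc
now = datetime(2025, 11, 1, 10, 30, tzinfo=utc)
starts = [
    datetime(2025, 10, 5, 9, 0, tzinfo=utc),
    datetime(2025, 10, 20, 14, 0, tzinfo=utc),
    datetime(2025, 11, 1, 8, 0, tzinfo=utc),
]

print(count_started(starts, rolling_30d_start(now), now))     # 3
print(count_started(starts, calendar_month_start(now), now))  # 1
```

On the first day of a month the calendar-month count has just reset, while the rolling count still reflects the previous weeks. That is why the two "started" figures can differ sharply early in a month.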

Calculated Metrics

Completion Rate

Completion Rate = (Completed Last 30d / Started Last 30d) × 100
Values below 80% indicate backlog buildup.

Success Rate

Success Rate = ((Completed - Instances with Incidents) / Completed) × 100
Target: Above 95% for production-grade processes.

Incident Rate

Incident Rate = (Open Incidents / Active Instances) × 100
Any value above 5% warrants investigation.

Throughput (Daily)

Daily Throughput = Completed Last 30d / 30
Measures steady-state capacity.

Health Score

Health Score = 100 
  - (Open Incidents × 15)
  - (Recently Active but Now Idle ? 20 : 0)
Composite metric factoring incidents and activity. Range: 0-100.
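The formulas above translate directly into code. This is a sketch rather than the dashboard's implementation. Two behaviors are assumptions not stated in the formulas: rates return 0.0 when the denominator is zero, and the health score is clamped so it stays within the documented 0-100 range. The input figures are illustrative.

```python
def completion_rate(completed_30d: int, started_30d: int) -> float:
    """Completed / started over the last 30 days, as a percentage."""
    # Assumption: report 0.0 when nothing was started, to avoid dividing by zero.
    return 100.0 * completed_30d / started_30d if started_30d else 0.0


def success_rate(completed: int, completed_with_incidents: int) -> float:
    """Share of completed instances that finished without incidents."""
    if completed == 0:
        return 0.0  # assumption: no completions means no rate to report
    return 100.0 * (completed - completed_with_incidents) / completed


def incident_rate(open_incidents: int, active_instances: int) -> float:
    """Open incidents relative to active instances, as a percentage."""
    return 100.0 * open_incidents / active_instances if active_instances else 0.0


def daily_throughput(completed_30d: int) -> float:
    """Average completions per day over the 30-day window."""
    return completed_30d / 30


def health_score(open_incidents: int, recently_active_now_idle: bool) -> int:
    """100, minus 15 per open incident, minus 20 if recently active but now idle."""
    raw = 100 - 15 * open_incidents - (20 if recently_active_now_idle else 0)
    # Assumption: clamp so the result stays in the documented 0-100 range.
    return max(0, min(100, raw))


print(f"{completion_rate(845, 892):.1f}")  # 94.7
print(f"{success_rate(845, 21):.1f}")      # 97.5
print(f"{incident_rate(2, 47):.1f}")       # 4.3
print(f"{daily_throughput(845):.1f}")      # 28.2
print(health_score(2, False))              # 70
print(health_score(0, True))               # 80
print(health_score(8, False))              # 0 (raw value -20, clamped)
```

The last line shows why a lower bound is needed: at 15 points per incident, seven or more open incidents would otherwise push the score below zero.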


Using the Portfolio Dashboard

For Executives

Morning Check (2 minutes):

  1. Open Portfolio Dashboard
  2. Review KPI cards at top
  3. Check "Critical Attention Required" section
  4. If all green → Review "Processes At Risk" count
  5. If issues exist → Delegate to operations team with process keys

Key Questions Answered:

  • Is our BPM platform healthy overall?
  • How much business are we processing?
  • Are we meeting quality targets?
  • Where should we invest improvement efforts?

For Operations Managers

Daily Workflow:

  1. Morning Standup Prep
     • Review "Critical Attention Required"
     • Note top 3 processes by incident count
     • Check Success Rate trend

  2. Incident Triage
     • Click processes with incidents
     • Navigate to Journey Monitoring for root cause
     • Assign fixes to team

  3. Capacity Planning
     • Review the "Daily" (throughput) column
     • Compare "Active" to historical baseline
     • Identify processes approaching capacity

  4. Trend Analysis
     • Check 12-month chart weekly
     • Look for degrading trends
     • Correlate with deployments

For Process Owners

Weekly Review:

  1. Find Your Processes
     • Scan KPIs table for your process keys
     • Sort by "Health" to find issues

  2. Performance Assessment
     • Compare "Avg Duration" to target SLAs
     • Check "Success Rate" against goal (typically 95%+)
     • Review "Completion Rate" for efficiency

  3. Optimization Opportunities
     • Low "Daily Throughput" + High "Active Instances" = Bottleneck
     • High "Avg Duration" = Optimization candidate
     • Low "Completion Rate" = Investigation needed

For SRE/DevOps Teams

Integration Scenarios:

Prometheus/Grafana:

# prometheus.yml: scrape portfolio metrics
scrape_configs:
  - job_name: 'camunda-portfolio'
    metrics_path: '/portfolio/overview/metrics'
    scrape_interval: 5m
    static_configs:
      - targets: ['champa:5000']

Alerting Rules:

# Prometheus alerting rules (rule file; the group name is illustrative)
groups:
  - name: camunda-portfolio
    rules:
      - alert: HighIncidentRate
        expr: camunda_process_incident_rate > 10
        annotations:
          summary: "Process {{ $labels.process }} has high incident rate"

      - alert: ProcessAtRisk
        expr: camunda_portfolio_processes_at_risk > 3
        annotations:
          summary: "Multiple processes require attention"

Dashboard Integration: Create a Grafana dashboard that combines Portfolio metrics with infrastructure metrics for correlation analysis.


Prometheus Integration

All Portfolio KPIs are exposed in Prometheus text format at the endpoint /portfolio/overview/metrics

Available Metrics

Portfolio-Level Aggregates:

# Portfolio-wide totals
camunda_portfolio_total_processes
camunda_portfolio_active_instances  
camunda_portfolio_open_incidents
camunda_portfolio_started_last_30_days
camunda_portfolio_completed_last_30_days
camunda_portfolio_processes_at_risk

Per-Process Metrics:

# Labels: process, version
camunda_process_active_instances{process="order-fulfillment",version="3"}
camunda_process_open_incidents{process="order-fulfillment",version="3"}
camunda_process_started_last_30_days{process="order-fulfillment",version="3"}
camunda_process_completed_last_30_days{process="order-fulfillment",version="3"}
camunda_process_started_this_month{process="order-fulfillment",version="3"}
camunda_process_success_rate_30d{process="order-fulfillment",version="3"}
camunda_process_health_score{process="order-fulfillment",version="3"}
camunda_process_incident_rate{process="order-fulfillment",version="3"}
camunda_process_avg_duration_seconds{process="order-fulfillment",version="3"}
camunda_process_total_completed_all_time{process="order-fulfillment",version="3"}
camunda_process_completion_rate_30d{process="order-fulfillment",version="3"}
camunda_process_throughput_30d{process="order-fulfillment",version="3"}

Example PromQL Queries

Portfolio Health Check:

# Alert when portfolio has critical incident load
camunda_portfolio_open_incidents > 50

# Alert when too many processes at risk
camunda_portfolio_processes_at_risk > 5

# Portfolio-wide success rate (per-process rates weighted by completions)
sum(camunda_process_success_rate_30d * camunda_process_completed_last_30_days)
  / sum(camunda_process_completed_last_30_days)

Process Performance:

# Top 5 processes by incident rate
topk(5, camunda_process_incident_rate)

# Processes with low health (score below 60)
camunda_process_health_score < 60

# Unweighted mean of per-process average completion times
avg(camunda_process_avg_duration_seconds)

# Throughput leaders
topk(10, camunda_process_throughput_30d)

Capacity & Load:

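# Portfolio-wide ratio of in-flight instances to 30-day completions
# (a rough utilization proxy; throughput_30d is completions per day)
sum(camunda_process_active_instances) / 
sum(camunda_process_throughput_30d * 30)

# Processes holding more than 10 days of work at current daily throughput
camunda_process_active_instances / 
camunda_process_throughput_30d > 10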


Common Patterns & Use Cases

Pattern 1: Morning Health Check

Scenario: Daily standup prep

Workflow:

  1. Open Portfolio Dashboard
  2. Check "Open Incidents" KPI
  3. If 0 → "All clear, team!"
  4. If >0 → Review "Critical Attention Required"
  5. Note top 3 problem processes
  6. Click process key → Navigate to details
  7. Assign incidents to team members

Time: 2-3 minutes

Pattern 2: Capacity Planning

Scenario: Quarterly capacity review

Workflow:

  1. Review "12-Month Trend" chart
  2. Identify growth rate in instances started
  3. Calculate projected growth:
     • Current: 10,000/month
     • Growth: +15%/quarter
     • Projection: 11,500/month next quarter
  4. Check "Active Instances" across all processes
  5. Compare to infrastructure capacity
  6. Plan scaling if projected load exceeds 80% capacity
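The projection arithmetic in this workflow can be written out as a short calculation. The function names and the capacity figures (14,000 and 15,000 instances per month) are hypothetical. The 10,000/month volume, 15% quarterly growth, and 80% threshold come from the steps above.

```python
def project_monthly_volume(current: float, quarterly_growth: float, quarters: int = 1) -> float:
    """Compound the current monthly volume by a per-quarter growth rate."""
    return current * (1 + quarterly_growth) ** quarters


def needs_scaling(projected: float, capacity: float, threshold: float = 0.80) -> bool:
    """True when projected load exceeds the given fraction of capacity."""
    return projected > threshold * capacity


projected = project_monthly_volume(10_000, 0.15)
print(round(projected))                            # 11500
print(needs_scaling(projected, capacity=14_000))   # True  (80% of 14,000 is 11,200)
print(needs_scaling(projected, capacity=15_000))   # False (80% of 15,000 is 12,000)
```

Passing `quarters=2` compounds the growth for a two-quarter outlook, which a single "+15%" step does not cover.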

Time: 15 minutes

Pattern 3: Release Impact Analysis

Scenario: Post-deployment validation

Workflow:

  1. Note date of deployment
  2. Check "Success Rate" before vs after
  3. Review "12-Month Trend" for incident spike
  4. Check specific deployed process in KPIs table
  5. If incident rate increased → Rollback or hotfix

Decision Point:

  • Success Rate dropped >5% → Investigate immediately
  • Incident Rate >10% on new process → Consider rollback

Time: 5 minutes

Pattern 4: Executive Reporting

Scenario: Monthly board report

Workflow:

  1. Export Prometheus metrics to spreadsheet
  2. Calculate month-over-month changes:
     • Instances processed: +12%
     • Success rate: 96.5% (target: 95%)
     • Processes at risk: 2 (down from 5)
     • Avg completion time: 2.3 hours (improved from 2.8)
  3. Create PowerPoint slides with trend chart
  4. Highlight achievements and areas of concern
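The export step could be scripted along these lines. This is a simplified sketch using only the standard library: the regular expressions handle plain `name{labels} value` lines like the ones this dashboard documents, not the full Prometheus exposition format, and the sample values are invented. In practice the text would be fetched from /portfolio/overview/metrics. The `prometheus_client` package also provides a more complete parser.

```python
import csv
import io
import re

# Matches "metric_name{label="x",...} value" or "metric_name value".
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def metrics_to_rows(text: str) -> list[dict]:
    """Turn metrics text into one dict per sample, skipping comments and blanks."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_LINE.match(line)
        if not match:
            continue
        labels = dict(LABEL.findall(match.group("labels") or ""))
        rows.append({
            "metric": match.group("name"),
            "process": labels.get("process", ""),
            "version": labels.get("version", ""),
            "value": float(match.group("value")),
        })
    return rows


def rows_to_csv(rows: list[dict]) -> str:
    """Serialize the rows as CSV text that a spreadsheet can import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["metric", "process", "version", "value"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample values, standing in for a real scrape of the endpoint.
sample_text = """\
# Portfolio-wide totals
camunda_portfolio_processes_at_risk 2
camunda_process_success_rate_30d{process="order-fulfillment",version="3"} 96.5
"""

rows = metrics_to_rows(sample_text)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Portfolio-level metrics have no `process` or `version` label, so those columns are left empty for them.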

Time: 30 minutes monthly

Pattern 5: Process Portfolio Rationalization

Scenario: Annual process review

Workflow:

  1. Sort KPIs table by "Active Instances" ascending
  2. Identify processes with 0 active for 30+ days
  3. Check "Total Completed" to confirm legacy status
  4. Review "Health Matrix" for idle processes
  5. Propose decommissioning candidates
  6. Document business justification for remaining idle processes

Criteria for Decommissioning:

  • Zero active instances for 90 days
  • Low total completed (< 100)
  • No future business need
  • Replaced by newer process

Time: 2-4 hours annually


Troubleshooting

Issue: KPI Cards Show "Loading" Forever

Symptoms: Cards display skeleton animation indefinitely

Causes:

  • Database connectivity issue
  • Authentication token expired
  • API endpoint returning error

Solutions:

  1. Open browser developer console (F12)
  2. Check Network tab for failed requests
  3. Check Console tab for JavaScript errors
  4. Refresh page (Ctrl+R)
  5. Verify authentication status
  6. Contact administrator if issue persists

Issue: Metrics Seem Outdated

Symptoms: Numbers don't match current reality

Causes:

  • Looking at cached data (5-minute TTL)
  • Recent incident not yet reflected

Solutions:

  1. Wait 5 minutes for cache to expire
  2. Click "Refresh" button to reload
  3. Check "Last Updated" timestamp
  4. Verify database replication lag (distributed systems)

Issue: Health Matrix Shows Everything as Idle

Symptoms: All processes gray despite activity

Causes:

  • Database query issue
  • Timezone mismatch
  • No recent process starts

Solutions:

  1. Check "Active Instances" in KPI cards (should be >0)
  2. Verify instances are actually running in Camunda Cockpit
  3. Check database query filters for time zones
  4. Review process deployment status

Issue: Chart Not Rendering

Symptoms: Empty space where chart should be

Causes:

  • JavaScript error
  • No trend data available
  • Dark mode rendering bug

Solutions:

  1. Check browser console for errors
  2. Verify at least 1 month of historical data exists
  3. Try toggling dark mode
  4. Disable browser extensions
  5. Try different browser

Best Practices

Dashboard Review Cadence

Daily (5 minutes):

  • Check "Open Incidents" count
  • Review "Critical Attention Required"
  • Triage any red-flagged processes

Weekly (15 minutes):

  • Scan entire KPIs table for anomalies
  • Review "Success Rate" trends
  • Check "Health Matrix" for new idle processes
  • Note any declining health scores

Monthly (1 hour):

  • Analyze "12-Month Trend" chart
  • Calculate month-over-month metrics
  • Review capacity projections
  • Generate executive report

Quarterly (4 hours):

  • Full portfolio health assessment
  • Process rationalization review
  • Capacity planning
  • SLA compliance review

Metric Interpretation

Success Rate:

  • 98-100%: Excellent
  • 95-97%: Good (target range)
  • 90-94%: Acceptable but monitor
  • <90%: Investigation required

Incident Rate:

  • 0-2%: Healthy
  • 2-5%: Acceptable
  • 5-10%: Monitor closely
  • >10%: Critical - immediate action

Health Score:

  • 90-100: Excellent health
  • 70-89: Good, minor issues
  • 50-69: Fair, attention needed
  • <50: Poor, critical issues

Completion Rate:

  • 90-100%: Normal
  • 70-89%: Backlog building
  • 50-69%: Significant backlog
  • <50%: Capacity crisis
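The bands above are listed with whole-number edges, so a value such as 97.5% falls between two rows. One way to apply them to real values is to use lower-bound thresholds, as in the sketch below. The function names are illustrative, and treating each band as "at or above its lower edge" is an assumption.

```python
def success_rate_band(rate: float) -> str:
    """Classify a success rate (percent) using each band's lower edge."""
    if rate >= 98:
        return "Excellent"
    if rate >= 95:
        return "Good (target range)"
    if rate >= 90:
        return "Acceptable but monitor"
    return "Investigation required"


def health_score_band(score: float) -> str:
    """Classify a health score (0-100) using each band's lower edge."""
    if score >= 90:
        return "Excellent health"
    if score >= 70:
        return "Good, minor issues"
    if score >= 50:
        return "Fair, attention needed"
    return "Poor, critical issues"


print(success_rate_band(97.5))  # Good (target range)
print(health_score_band(85))    # Good, minor issues
print(health_score_band(49))    # Poor, critical issues
```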

Alerting Thresholds

Critical (Immediate Response):

  • Open Incidents > 50
  • Processes At Risk > 5
  • Any process Incident Rate > 20%
  • Portfolio Success Rate < 90%

Warning (Within 24 Hours):

  • Processes At Risk > 2
  • Any process Incident Rate > 10%
  • Any process Health Score < 50
  • Completion Rate < 80%

Info (Weekly Review):

  • Any process idle >7 days
  • Success Rate declined >2% week-over-week
  • Throughput increased >50% week-over-week
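The Critical and Warning tiers can be evaluated mechanically against the Overview API fields. The sketch below assumes the field names from the JSON response format and applies the Completion Rate check at portfolio level. The Info tier is left out because it needs week-over-week history that a single snapshot does not contain. The function name is hypothetical.

```python
def portfolio_severity(summary: dict, processes: list[dict]) -> str:
    """Return "critical", "warning", or "ok" for one snapshot of the portfolio."""
    max_incident_rate = max((p["incident_rate"] for p in processes), default=0.0)
    min_health = min((p["health_score"] for p in processes), default=100)

    started = summary["started_last_30_days"]
    # Assumption: treat completion rate as 100% when nothing was started.
    completion = 100.0 * summary["completed_last_30_days"] / started if started else 100.0

    if (
        summary["open_incidents"] > 50
        or summary["processes_at_risk"] > 5
        or max_incident_rate > 20
        or summary["success_rate_30d"] < 90
    ):
        return "critical"
    if (
        summary["processes_at_risk"] > 2
        or max_incident_rate > 10
        or min_health < 50
        or completion < 80
    ):
        return "warning"
    return "ok"


summary = {
    "open_incidents": 12,
    "processes_at_risk": 5,
    "success_rate_30d": 96.2,
    "started_last_30_days": 8901,
    "completed_last_30_days": 8234,
}
processes = [{"process_key": "order-fulfillment", "incident_rate": 4.3, "health_score": 85}]

print(portfolio_severity(summary, processes))  # warning
```

Here five processes at risk exceeds the warning threshold of 2 but not the critical threshold of 5, so the snapshot is classified as a warning.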

API Reference

REST Endpoints

| Endpoint | Method | Description |
|---|---|---|
| /portfolio | GET | Portfolio dashboard page |
| /portfolio/api/overview | GET | KPIs and process metrics (JSON) |
| /portfolio/api/trends | GET | 12-month trend data (JSON) |
| /portfolio/overview/metrics | GET | Prometheus metrics (text) |

JSON Response Format

Overview API:

{
  "success": true,
  "overview": [
    {
      "process_key": "order-fulfillment",
      "latest_version": 3,
      "total_completed_all_time": 15420,
      "avg_duration_seconds": 8640,
      "started_last_30_days": 892,
      "completed_last_30_days": 845,
      "started_this_month": 312,
      "active_instances": 47,
      "open_incidents": 2,
      "success_rate_30d": 97.5,
      "health_score": 85,
      "incident_rate": 4.3
    }
  ],
  "summary_stats": {
    "total_processes": 43,
    "active_instances": 1234,
    "open_incidents": 12,
    "started_this_month": 5678,
    "started_last_30_days": 8901,
    "completed_last_30_days": 8234,
    "processes_at_risk": 5,
    "avg_completion_time": 7200,
    "success_rate_30d": 96.2
  },
  "top_issues": [...],
  "timestamp": "2025-11-01T10:30:00Z"
}
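A client might consume this response roughly as follows. This is a sketch under stated assumptions: `fetch_overview` assumes the endpoint is reachable without extra authentication headers, which may not hold for a real deployment. The embedded payload is a trimmed variant of the sample above, with a second, made-up process (`invoice-approval`) added so the example runs offline.

```python
import json
from urllib.request import urlopen


def fetch_overview(base_url: str) -> dict:
    """GET /portfolio/api/overview and decode the JSON body.

    Assumption: no auth headers are needed; a real deployment may
    require a session cookie or token.
    """
    with urlopen(f"{base_url}/portfolio/api/overview", timeout=10) as resp:
        return json.load(resp)


def processes_with_incidents(payload: dict) -> list[tuple[str, int, float]]:
    """Return (process_key, open_incidents, incident_rate), worst rate first."""
    if not payload.get("success"):
        raise ValueError("overview API reported failure")
    rows = [
        (p["process_key"], p["open_incidents"], p["incident_rate"])
        for p in payload["overview"]
        if p["open_incidents"] > 0
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Trimmed, partly hypothetical payload so the example runs without a server.
sample = json.loads("""
{
  "success": true,
  "overview": [
    {"process_key": "order-fulfillment", "open_incidents": 2,
     "active_instances": 47, "incident_rate": 4.3},
    {"process_key": "invoice-approval", "open_incidents": 0,
     "active_instances": 10, "incident_rate": 0.0}
  ]
}
""")

print(processes_with_incidents(sample))
# [('order-fulfillment', 2, 4.3)]
```

Against a live instance, `processes_with_incidents(fetch_overview(base_url))` would give the same view, where `base_url` is the dashboard's address.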


FAQ

Q: How often does the dashboard update?
A: Data is cached for 5 minutes, so values can lag by up to that long. Click the "Refresh" button for an immediate update.

Q: Can I export the KPI table?
A: Not directly in the UI. Read the Prometheus metrics endpoint and convert the output to CSV, or copy and paste from the table.

Q: What's the difference between "Started This Month" and "Started 30d"?
A: "Started This Month" counts the current calendar month and resets at the start of each month. "Started 30d" is a rolling 30-day window, which is more consistent for trends.

Q: Why is Success Rate different from (100% - Incident Rate)?
A: They use different denominators. Success Rate is the share of completed instances that finished without incidents (quality). Incident Rate is open incidents relative to currently active instances (current problems).

Q: Can I filter to specific processes?
A: Not currently. As a workaround, use browser search (Ctrl+F) to find a process in the table, or query the Prometheus metrics with PromQL label filters.

Q: What causes Health Score to be low with no incidents?
A: The process may be "recently active but now idle", which carries a 20-point penalty. This can indicate a problem even when there are no open incidents.

Q: How do I track improvement over time?
A: Use the Prometheus integration to store metrics in a time-series database, then build Grafana dashboards that show trends.

Q: Can multiple teams use different Portfolio views?
A: Not currently. All users see the same portfolio. Recommendation: use Grafana with filtered views per team.


Summary

The Portfolio Dashboard transforms BPM operations from reactive firefighting to proactive management:

  • Unified visibility across entire process landscape
  • Intelligent prioritization of issues requiring attention
  • Executive-ready KPIs for business reporting
  • Trend analysis for strategic planning
  • Prometheus integration for enterprise monitoring
  • Performance-optimized with smart caching

By providing the right metrics at the right level of detail, the Portfolio Dashboard enables informed decision-making and operational excellence at scale.