Overview

The Workflow Load Testing feature lets you stress-test your MCP workflows with configurable parameters, measuring performance under concurrent execution. You get detailed metrics, visual timelines, and automated insights to validate that your workflows can handle real-world load.

Perfect for validating workflow reliability, identifying performance bottlenecks, and ensuring your MCP server implementations scale appropriately.


Key Capabilities

📊 Comprehensive Load Testing

Execute workflows repeatedly with configurable duration and parallelism

⏱️ Smart Ramp-Up/Down

Gradually increases parallelism during the first 60 seconds and decreases it during the last 60 seconds to avoid overwhelming the system

📈 Real-Time Progress Tracking

Live metrics during test execution showing success, failure, and active execution counts

📉 Detailed Performance Metrics

Duration statistics including Average, Min, Max, Median, P95, P99, plus throughput and peak concurrency

📊 Timeline Visualization

SVG chart showing cumulative successes, failures, and active executions over time

🤖 Automated Observations

Rule-based insights about reliability, performance variance, throughput patterns, and error analysis

💾 Results Management

Persistent storage with history panel, export to JSON, and delete capabilities

🔒 Execution Isolation

Load test runs do NOT appear in the Recent Executions list, keeping them separate from normal workflow runs


Getting Started

Step 1: Select a Workflow

Before running a load test, you need an existing workflow.

  1. Navigate to 🔗 Workflows tab
  2. Select a workflow from the list
  3. Verify the workflow has been tested manually at least once
  4. Locate the 📊 Load Test button next to “Run Workflow”

📸 Screenshot placeholder:

Description: Show the workflow viewer with the Load Test button (📊) positioned next to the Run Workflow button


Step 2: Configure Load Test

Click the 📊 Load Test button to open the configuration dialog.

Configuration options:

Field                     Description                                Range
Connection                MCP server connection to use               Required
Duration                  Total test duration in seconds             120-3600 seconds
Max Parallel Executions   Maximum concurrent workflow runs           1-100
Runtime Parameters        Values for prompt-at-runtime parameters    If required
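
If you script your own tooling around these settings, the configuration boils down to a handful of values. A minimal sketch as a Python dict (field names here are illustrative, not McpExplorer’s internal schema):

  # Illustrative configuration (hypothetical field names, not the app's schema)
  load_test_config = {
      "connection": "my-mcp-server",    # required: MCP server connection to use
      "duration_seconds": 300,          # 120-3600 seconds
      "max_parallel_executions": 5,     # 1-100 concurrent workflow runs
      "runtime_parameters": {           # only if the workflow prompts at runtime
          "value": "example-input",
      },
  }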

📸 Screenshot placeholder:

Description: Show the load test configuration dialog with connection dropdown, duration field showing “300”, max parallel field showing “5”, and runtime parameters section

info: Ramp Up/Down: The test automatically ramps up parallelism during the first 60 seconds and ramps down during the last 60 seconds. This prevents sudden load spikes that could overwhelm your server.


Step 3: Run the Load Test

  1. Click 📊 Start Load Test button
  2. Watch the real-time progress indicator
  3. Monitor live metrics as executions complete
  4. Wait for test completion (or click Cancel to stop early)

During execution, you’ll see:

  • Execution count: Total completed executions
  • Success badge (✓): Number of successful runs
  • Failure badge (✗): Number of failed runs
  • Average duration: Rolling average execution time

📸 Screenshot placeholder:

Description: Show the progress indicator with “42 executions completed”, success/failure badges showing “✓ 40 ✗ 2”, and average duration “⏱ 1250ms avg”


Step 4: Review Results

After the test completes, the Load Test Results Viewer opens automatically.

Results are organized into sections:

  1. Test Summary - Configuration details and final status
  2. Performance Metrics - Execution counts and duration statistics
  3. Execution Timeline - Visual chart of test progression
  4. Step Duration Breakdown - Per-step timing analysis
  5. Observations - Automated insights and recommendations
  6. Error Summary - Grouped error counts

📸 Screenshot placeholder:

Description: Show the full results viewer with all sections visible: summary, metrics cards, timeline chart, step durations, observations, and error summary


Understanding Results

Performance Metrics

The results viewer displays key metrics in easy-to-read cards:

Execution Counts:

  • Total: All execution attempts
  • Success: Completed successfully
  • Failed: Errored or timed out
  • Partial: Some steps succeeded, some failed

Duration Statistics:

  • Average: Mean execution time
  • Median: Middle value (50th percentile)
  • Min/Max: Fastest and slowest executions
  • P95: 95th percentile (95% of executions were faster)
  • P99: 99th percentile (99% of executions were faster)

Throughput Metrics:

  • Throughput: Executions per second
  • Peak Concurrent: Maximum simultaneous executions reached

📸 Screenshot placeholder:

Description: Show the metrics cards grid with Total (147), Success (142, 96.6%), Failed (5), Partial (0), Average (1250ms), Median (1180ms), Min (850ms), Max (2340ms), P95 (1890ms), P99 (2210ms), Throughput (2.89/sec), Peak Concurrent (5)


Timeline Visualization

The timeline chart shows how the test progressed over time:

Chart elements:

  • Green line: Cumulative successful executions
  • Red line: Cumulative failed executions
  • Blue dashed line: Active (in-progress) executions at each moment

What to look for:

  • Steady green line slope = consistent throughput
  • Red line spikes = failure clusters (investigate server issues)
  • Blue line shape = parallelism ramp-up and ramp-down pattern
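
If you export a result (see Exporting Results below), you can reproduce a similar chart yourself. A minimal matplotlib sketch, assuming the export’s timeline is a list of samples with elapsed time plus cumulative and active counts (all key names are assumptions; inspect your own export first):

  import json
  import matplotlib.pyplot as plt

  # Assumed sample shape: {"t": seconds, "success": n, "failed": n, "active": n}
  with open("loadtest_MyWorkflow_20250101_120000_abc123.json") as f:
      data = json.load(f)

  timeline = data["timeline"]  # hypothetical key; check your export JSON
  t = [s["t"] for s in timeline]
  plt.plot(t, [s["success"] for s in timeline], color="green", label="Cumulative successes")
  plt.plot(t, [s["failed"] for s in timeline], color="red", label="Cumulative failures")
  plt.plot(t, [s["active"] for s in timeline], color="blue", linestyle="--", label="Active executions")
  plt.xlabel("Elapsed time (s)")
  plt.ylabel("Count")
  plt.legend()
  plt.show()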

📸 Screenshot placeholder:

Description: Show the SVG timeline chart with green success line rising steadily, minimal red failure line, and blue dashed line showing ramp-up pattern in first 60s and ramp-down in last 60s


Step Duration Breakdown

See which workflow steps take the most time:

Bar chart shows:

  • Each step’s average duration
  • Percentage of total workflow time
  • Visual comparison between steps

Use this to:

  • Identify slow steps that need optimization
  • Understand where time is spent
  • Prioritize performance improvements
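
The same breakdown can be recomputed from exported execution records. A sketch assuming each record carries per-step names and durations (the record shape is an assumption):

  from collections import defaultdict

  # Assumed record shape:
  # {"steps": [{"name": "tool_1", "duration_ms": 650}, ...]}
  def step_breakdown(executions):
      totals = defaultdict(float)
      counts = defaultdict(int)
      for run in executions:
          for step in run["steps"]:
              totals[step["name"]] += step["duration_ms"]
              counts[step["name"]] += 1
      averages = {name: totals[name] / counts[name] for name in totals}
      grand_total = sum(averages.values())
      # Returns {step: (average ms, percent of total workflow time)}
      return {name: (avg, 100 * avg / grand_total) for name, avg in averages.items()}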

📸 Screenshot placeholder:

Description: Show horizontal bar chart with 3 steps: tool_1 (650ms, 52%), tool_2 (400ms, 32%), tool_3 (200ms, 16%)


Automated Observations

The system analyzes results and generates insights:

Observation types:

Icon   Type        Example
✓      Excellent   “Success rate is 99% or higher”
✓      Good        “Success rate is above 95%”
⚠      Warning     “P95 latency is 45% higher than average”
ℹ      Info        “Average throughput: 2.34 executions per second”

Common observations:

  • Reliability assessment: Based on success rate
  • Performance variance: Comparing P95 to average
  • Throughput classification: Low/Average/High
  • Error pattern detection: Systematic vs random failures
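
The exact rules are internal to the app, but the examples above suggest threshold logic along these lines. A hedged sketch, not the actual implementation:

  def generate_observations(success_rate, avg_ms, p95_ms, throughput):
      """Rule-based insights mirroring the documented examples (thresholds assumed)."""
      observations = []
      if success_rate >= 0.99:
          observations.append("Excellent reliability: success rate is 99% or higher")
      elif success_rate > 0.95:
          observations.append("Good reliability: success rate is above 95%")
      else:
          observations.append(f"Warning: success rate is only {success_rate:.1%}")
      variance = 100 * (p95_ms - avg_ms) / avg_ms
      if variance > 40:  # the cut-off is an assumption, not the app's actual rule
          observations.append(
              f"Moderate performance variance: P95 latency is "
              f"{variance:.0f}% higher than average")
      observations.append(f"Average throughput: {throughput:.2f} executions per second")
      return observations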

📸 Screenshot placeholder:

Description: Show the observations section with 4 observation badges: “✓ Good reliability: Success rate is above 95%”, “⚠ Moderate performance variance: P95 latency is 51% higher than average”, “ℹ Average throughput: 2.89 executions per second”, “ℹ Failures distributed across 3 different error types”


Error Summary

When failures occur, errors are grouped by message:

Display format:

  [count] Error message
  

Example:

  [3] Connection timeout after 30 seconds
  [1] Invalid parameter: 'value' is required
  [1] Tool execution failed: unexpected error
  

Use this to:

  • Identify systematic issues (same error many times)
  • Diagnose server problems
  • Prioritize error fixes
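
This grouping is easy to reproduce from exported data. A minimal sketch using collections.Counter, assuming failed execution records carry an error-message string (an assumption about the export shape):

  from collections import Counter

  # Assumed record shape: failed executions carry an "error" message string
  def error_summary(executions):
      counts = Counter(run["error"] for run in executions if run.get("error"))
      for message, count in counts.most_common():
          print(f"[{count}] {message}")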

Managing Load Tests

Load Test History

Recent load tests appear in the Load Test History section below the workflow viewer.

Each history entry shows:

  • Status badge (✓ Completed, ⚠ Cancelled)
  • Timestamp
  • Execution counts (total, succeeded, failed)
  • Delete button

📸 Screenshot placeholder:

Description: Show the Load Test History section with 3 entries: two completed tests and one cancelled test, with timestamps and execution counts


Viewing Past Results

  1. Click any entry in Load Test History
  2. Results viewer opens with full metrics
  3. Review timeline, observations, and errors
  4. Compare with other test runs

Exporting Results

Export load test results as JSON for further analysis:

  1. Open a load test result (current or from history)
  2. Click 📥 Export button
  3. JSON file downloads with descriptive filename

Filename format:

  loadtest_{WorkflowName}_{YYYYMMDD_HHmmss}_{TestId}.json
  

Export includes:

  • Complete configuration
  • All execution records
  • Calculated metrics
  • Timeline data
  • Observations

Deleting Load Tests

Remove load tests you no longer need:

  1. Find the test in Load Test History
  2. Click the 🗑 delete button
  3. Confirm deletion

warning: Permanent deletion: Load test results cannot be recovered after deletion.


Best Practices

🎯 Start Small

Begin with a short duration (120s) and low parallelism (2-3) to establish baseline metrics.

📊 Test Incrementally

Gradually increase parallelism to find the breaking point of your server.

🔄 Run Multiple Tests

Execute several tests to account for variance and validate consistency.

📝 Document Results

Export and save results for comparison over time as you optimize.

⚠️ Monitor Server Resources

Watch server CPU, memory, and network during load tests to identify bottlenecks.

🧪 Test Different Scenarios

Create separate workflows for different use cases and test each independently.

📈 Compare P95 to Average

Large differences indicate inconsistent performance that may affect users.

🔍 Investigate Failures

Don’t ignore even small failure rates; they may indicate systemic issues.


Troubleshooting

Load Test Won’t Start

Problem: Start button disabled or test fails immediately

Solutions:

  1. Verify a connection is selected
  2. Ensure connection is valid and server is running
  3. Check all runtime parameters are filled
  4. Verify workflow has at least 2 steps
  5. Test workflow manually first

All Executions Failing

Problem: 100% failure rate during load test

Solutions:

  1. Test workflow manually to verify it works
  2. Check server logs for errors
  3. Verify connection credentials are valid
  4. Ensure server can handle concurrent requests
  5. Check for rate limiting on the server

High P95/P99 Latency

Problem: Tail latencies much higher than average

Causes:

  • Server resource contention under load
  • Garbage collection pauses
  • Network latency variance
  • Database connection pool exhaustion

Solutions:

  1. Monitor server resources during test
  2. Check for memory pressure
  3. Review database query performance
  4. Consider connection pooling configuration

Results Not Persisting

Problem: Load test history is empty after test completes

Solutions:

  1. Check application has write permissions
  2. Verify storage location exists:
    • Windows: %APPDATA%\McpExplorer\load_tests\
    • macOS/Linux: ~/.local/share/McpExplorer/load_tests/
  3. Review console for storage errors
  4. Ensure adequate disk space

Timeline Chart Empty

Problem: No data appears in timeline visualization

Solutions:

  1. Ensure test ran for more than a few seconds
  2. Check that executions actually completed
  3. Verify timeline data was captured (check export JSON)
  4. Refresh the results viewer

Technical Details

Storage Location

Load test results are saved as JSON files:

  • Windows: %APPDATA%\McpExplorer\load_tests\
  • macOS/Linux: ~/.local/share/McpExplorer/load_tests/

Ramp-Up/Down Algorithm

  First 60 seconds:  parallelism = max(1, MaxParallel × (elapsed / 60))
  Middle period:     parallelism = MaxParallel
  Last 60 seconds:   parallelism = max(1, MaxParallel × (remaining / 60))
  
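
In code, the schedule reduces to a small function of elapsed time. A direct translation of the formulas above; the app’s actual implementation may differ in details such as rounding:

  def target_parallelism(elapsed_s, total_s, max_parallel, ramp_s=60):
      """Allowed concurrency at a given moment of the test (sketch)."""
      remaining_s = total_s - elapsed_s
      if elapsed_s < ramp_s:            # ramp-up window
          return max(1, int(max_parallel * elapsed_s / ramp_s))
      if remaining_s < ramp_s:          # ramp-down window
          return max(1, int(max_parallel * remaining_s / ramp_s))
      return max_parallel               # steady-state middle period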

Metrics Calculation

  • Median: Middle value when durations are sorted
  • P95: Value at 95th percentile position
  • P99: Value at 99th percentile position
  • Throughput: Total executions ÷ Total duration
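
These definitions translate directly into code. A sketch computing the same statistics from a list of execution durations, using nearest-rank percentiles (the app’s exact convention may differ):

  import statistics

  def percentile(sorted_ms, p):
      """Nearest-rank percentile over a pre-sorted list."""
      index = max(0, int(round(p / 100 * len(sorted_ms))) - 1)
      return sorted_ms[index]

  def duration_stats(durations_ms, total_duration_s):
      d = sorted(durations_ms)
      return {
          "average_ms": statistics.mean(d),
          "median_ms": statistics.median(d),
          "min_ms": d[0],
          "max_ms": d[-1],
          "p95_ms": percentile(d, 95),
          "p99_ms": percentile(d, 99),
          "throughput_per_s": len(d) / total_duration_s,
      }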

Create Workflows First

Load testing requires an existing workflow.

Configure Connections

Tests use MCP connections.

Understand Tool Execution

Learn about individual tool testing.


Next Steps

Now that you understand load testing:

  1. Create a test workflow - Build a simple 2-3 step workflow
  2. Run baseline test - Start with 120s duration, 2 parallel
  3. Analyze results - Review metrics and observations
  4. Scale up testing - Gradually increase parallelism
  5. Export and compare - Track performance over time

Load testing gives you confidence your workflows will perform under real-world conditions! 📊