Skip to main content
Version: Next

MCP Integration

Model Context Protocol (MCP) integration allows extensions to register custom AI agent capabilities that integrate seamlessly with Superset's MCP service. Extensions can provide both tools (executable functions) and prompts (interactive guidance) that AI agents can discover and use.

What is MCP?

MCP enables extensions to extend Superset's AI capabilities in two ways:

MCP Tools

Tools are Python functions that AI agents can call to perform specific tasks. They provide executable functionality that extends Superset's capabilities.

Examples of MCP tools:

  • Data processing and transformation functions
  • Custom analytics calculations
  • Integration with external APIs
  • Specialized report generation
  • Business-specific operations

MCP Prompts

Prompts provide interactive guidance and context to AI agents. They help agents understand how to better assist users with specific workflows or domain knowledge.

Examples of MCP prompts:

  • Step-by-step workflow guidance
  • Domain-specific context and knowledge
  • Interactive troubleshooting assistance
  • Template generation helpers
  • Best practices recommendations

Getting Started

MCP Tools

Basic Tool Registration

The simplest way to create an MCP tool is using the @tool decorator:

from superset_core.mcp import tool

@tool
def hello_world() -> dict:
"""A simple greeting tool."""
return {"message": "Hello from my extension!"}

This creates a tool that AI agents can call by name. The tool name defaults to the function name.

Decorator Parameters

The @tool decorator accepts several optional parameters:

Parameter details:

  • name: Tool identifier (AI agents use this to call your tool)
  • description: Explains what the tool does (helps AI agents decide when to use it)
  • tags: Categories for organization and discovery
  • protect: Whether the tool requires user authentication (defaults to True)

Naming Your Tools

For extensions, include your extension ID in tool names to avoid conflicts:

Complete Example

Here's a more comprehensive example showing best practices:

# backend/mcp_tools.py
import random
from datetime import datetime, timezone
from pydantic import BaseModel, Field
from superset_core.mcp import tool

class RandomNumberRequest(BaseModel):
"""Request schema for random number generation."""

min_value: int = Field(
description="Minimum value (inclusive) for random number generation",
ge=-2147483648,
le=2147483647
)
max_value: int = Field(
description="Maximum value (inclusive) for random number generation",
ge=-2147483648,
le=2147483647
)

@tool(
name="example_extension.random_number",
tags=["extension", "utility", "random", "generator"]
)
def random_number_generator(request: RandomNumberRequest) -> dict:
"""
Generate a random integer between specified bounds.

This tool validates input ranges and provides detailed error messages
for invalid requests.
"""

# Validate business logic (Pydantic handles type/range validation)
if request.min_value > request.max_value:
return {
"status": "error",
"error": f"min_value ({request.min_value}) cannot be greater than max_value ({request.max_value})",
"timestamp": datetime.now(timezone.utc).isoformat()
}

# Generate random number
result = random.randint(request.min_value, request.max_value)

return {
"status": "success",
"random_number": result,
"min_value": request.min_value,
"max_value": request.max_value,
"range_size": request.max_value - request.min_value + 1,
"timestamp": datetime.now(timezone.utc).isoformat()
}

Best Practices

Response Format

Use consistent response structures:

# Success response
{
"status": "success",
"result": "your_data_here",
"timestamp": "2024-01-01T00:00:00Z"
}

# Error response
{
"status": "error",
"error": "Clear error message",
"timestamp": "2024-01-01T00:00:00Z"
}

Documentation

Write clear descriptions and docstrings:

@tool(
name="my_extension.process_data",
description="Process customer data and generate insights. Requires valid customer ID and date range.",
tags=["analytics", "customer", "reporting"]
)
def process_data(customer_id: int, start_date: str, end_date: str) -> dict:
"""
Process customer data for the specified date range.

This tool analyzes customer behavior patterns and generates
actionable insights for business decision-making.

Args:
customer_id: Unique customer identifier
start_date: Analysis start date (YYYY-MM-DD format)
end_date: Analysis end date (YYYY-MM-DD format)

Returns:
Dictionary containing analysis results and recommendations
"""
# Implementation here
pass

Tool Naming

  • Extension tools: Use prefixed names like my_extension.tool_name
  • Descriptive names: calculate_tax_amount vs calculate
  • Consistent naming: Follow patterns within your extension

How AI Agents Use Your Tools

Once registered, AI agents can discover and use your tools automatically:

User: "Generate a random number between 1 and 100"
Agent: I'll use the random number generator tool.
→ Calls: example_extension.random_number(min_value=1, max_value=100)
← Returns: {"status": "success", "random_number": 42, ...}
Agent: I generated the number 42 for you.

The AI agent sees your tool's:

  • Name: How to call it
  • Description: What it does and when to use it
  • Parameters: What inputs it expects (from Pydantic schema)
  • Tags: Categories for discovery

Troubleshooting

Tool Not Available to AI Agents

  1. Check extension registration: Verify your tool module is listed in extension entrypoints
  2. Verify decorator: Ensure @tool is correctly applied
  3. Extension loading: Confirm your extension is installed and enabled

Input Validation Errors

  1. Pydantic models: Ensure field types match expected inputs
  2. Field constraints: Check min/max values and string lengths are reasonable
  3. Required fields: Verify which parameters are required vs optional

Runtime Issues

  1. Error handling: Add try/catch blocks with clear error messages
  2. Response format: Use consistent status/error/timestamp structure
  3. Testing: Test your tools with various input scenarios

Development Tips

  1. Start simple: Begin with basic tools, add complexity gradually
  2. Test locally: Use MCP clients (like Claude Desktop) to test your tools
  3. Clear descriptions: Write tool descriptions as if explaining to a new user
  4. Meaningful tags: Use tags that help categorize and discover tools
  5. Error messages: Provide specific, actionable error messages

MCP Prompts

Basic Prompt Registration

Create interactive prompts using the @prompt decorator:

from superset_core.mcp import prompt
from fastmcp import Context

@prompt("my_extension.workflow_guide")
async def workflow_guide(ctx: Context) -> str:
"""Interactive guide for data analysis workflows."""
return """
# Data Analysis Workflow Guide

Here's a step-by-step approach to effective data analysis in Superset:

## 1. Data Discovery
- Start by exploring your datasets using the dataset browser
- Check data quality and identify key metrics
- Look for patterns and relationships in your data

## 2. Chart Creation
- Choose appropriate visualizations for your data types
- Apply filters to focus on relevant subsets
- Configure proper aggregations and groupings

## 3. Dashboard Assembly
- Combine related charts into coherent dashboards
- Use filters and parameters for interactivity
- Add markdown components for context and explanations

Would you like guidance on any specific step?
"""

Advanced Prompt Examples

Domain-Specific Context

@prompt(
"sales_extension.sales_analysis_guide",
title="Sales Analysis Guide",
description="Specialized guidance for sales data analysis workflows"
)
async def sales_analysis_guide(ctx: Context) -> str:
"""Provides sales-specific analysis guidance and best practices."""
return """
# Sales Data Analysis Best Practices

## Key Metrics to Track
- **Revenue Growth**: Month-over-month and year-over-year trends
- **Conversion Rates**: Lead-to-opportunity-to-close ratios
- **Customer Lifetime Value**: Total value per customer segment
- **Sales Cycle Length**: Time from lead to close by product/region

## Recommended Chart Types
- **Time Series**: Revenue trends, seasonal patterns
- **Funnel Charts**: Conversion analysis across sales stages
- **Geographic Maps**: Regional performance comparison
- **Cohort Analysis**: Customer retention and growth patterns

## Common Pitfalls to Avoid
- Don't mix different time granularities without proper context
- Always normalize for business days when comparing periods
- Consider external factors (holidays, market events) in analysis
- Segment by relevant dimensions (product, region, channel)

## Next Steps
1. Identify your primary sales KPIs
2. Create baseline trend charts for each metric
3. Build comparative views across segments
4. Set up automated alerts for significant changes
"""

Interactive Troubleshooting

@prompt("support_extension.troubleshoot_charts")
async def troubleshoot_charts(ctx: Context) -> str:
"""Interactive troubleshooting assistant for chart issues."""
return """
# Chart Troubleshooting Assistant

Let's diagnose your chart issue step by step:

## Common Issues and Solutions

### 🚫 No Data Showing
**Possible causes:**
- Filters are too restrictive
- Date range doesn't match your data
- Database connection issues
- Missing permissions

**Check:** Try removing all filters and expanding the date range first.

### 📊 Unexpected Aggregation Results
**Possible causes:**
- Incorrect grouping dimensions
- Wrong aggregation function (SUM vs COUNT vs AVG)
- Data quality issues (duplicates, nulls)
- Time zone mismatches

**Check:** Verify your GROUP BY columns and aggregation logic.

### 🐌 Slow Performance
**Possible causes:**
- Large dataset without proper indexing
- Complex joins or calculations
- Missing query optimizations
- Resource constraints

**Check:** Simplify the query and add appropriate filters first.

## Debug Steps
1. **Start Simple**: Create a basic count query first
2. **Add Gradually**: Introduce complexity step by step
3. **Check SQL**: Review the generated SQL for issues
4. **Test Data**: Verify with a small sample first

What specific issue are you experiencing?
"""

Prompt Best Practices

Content Structure

  • Use clear headings and sections for easy navigation
  • Provide actionable steps rather than just theory
  • Include examples relevant to the user's domain
  • Offer next steps to continue the workflow

Interactive Design

  • Ask questions to engage the user
  • Provide options for different scenarios
  • Reference specific Superset features by name
  • Link to related tools when appropriate

Context Awareness

@prompt("analytics_extension.context_aware_guide")
async def context_aware_guide(ctx: Context) -> str:
"""Provides guidance based on current user context."""
# Access user information if available
user_info = getattr(ctx, 'user', None)

guidance = """# Personalized Analytics Guide\n\n"""

if user_info:
guidance += f"Welcome back! Here's guidance tailored for your role:\n\n"

guidance += """
## Getting Started
Based on your previous activity, here are recommended next steps:

1. **Review Recent Dashboards**: Check your most-used dashboards for updates
2. **Explore New Data**: Look for recently added datasets in your domain
3. **Share Insights**: Consider sharing successful analyses with your team

## Advanced Techniques
- Set up automated alerts for key metrics
- Create parameterized dashboards for different audiences
- Use SQL Lab for complex custom analyses
"""

return guidance

Combining Tools and Prompts

Extensions can provide both tools and prompts that work together:

# Tool for data processing
@tool("analytics_extension.calculate_metrics")
def calculate_metrics(data: dict) -> dict:
"""Calculate advanced analytics metrics."""
# Implementation here
pass

# Prompt that guides users to the tool
@prompt("analytics_extension.metrics_guide")
async def metrics_guide(ctx: Context) -> str:
"""Guide users through advanced metrics calculation."""
return """
# Advanced Metrics Calculation

Use the `calculate_metrics` tool to compute specialized analytics:

## Available Metrics
- Customer Lifetime Value (CLV)
- Cohort Retention Rates
- Statistical Significance Tests
- Predictive Trend Analysis

## Usage
Call the tool with your dataset to get detailed calculations
and recommendations for visualization approaches.

Would you like to calculate metrics for your current dataset?
"""

Next Steps

  • Development - Project structure, APIs, and dev workflow
  • Security - Security best practices for extensions