Building a Remote MCP Server for Azure DevOps with APIM, Functions, and Zero Secrets

February 16, 2026 Ā· 12 min read
šŸ‘ļø Loading...

Tell an AI assistant to "check why the build failed" and it'll politely explain that it doesn't have access to your CI/CD system. Fair enough. The Model Context Protocol gives agents a standard way to call external tools, but the interesting engineering happens when those tools live behind corporate authentication and the agent needs to act as the user, not as some over-privileged service account.

This post walks through how I built a remote MCP server that exposes Azure DevOps pipeline operations to AI assistants. Users authenticate via Entra ID, and every API call runs under their identity. No PATs. No shared service accounts. No client secrets. The entire credential chain uses Managed Identity and Federated Identity Credentials — the kind of setup that makes your security team slightly less nervous.

What we're building

Four MCP tools that let an AI assistant interact with Azure DevOps Pipelines:

Tool                    What it does
list_pipeline_runs      List recent builds with statuses, branches, and durations
get_run_failure_logs    Inspect a failed run — task names, error messages, log tails
list_deployments        List Classic Release deployments by environment and status
trigger_pipeline_run    Queue a new pipeline run with optional branch and parameters

The compute layer is an Azure Function App (Flex Consumption, Python 3.12). The authentication and session management layer is API Management. The identity layer is Entra ID. Everything is deployed via Terraform.

The token problem

The Azure Functions MCP extension¹ is what makes a Function App speak MCP — it handles the JSON-RPC protocol, tool discovery, and SSE transport. But it has a constraint that shapes the entire architecture: the extension doesn't pass HTTP headers to your function code.

When your function receives a tool invocation, the context object contains name, arguments, and _meta. That's it. No Authorization header. No cookies. No ambient user identity.

This means EasyAuth can sit in front of the Function App and validate tokens at the platform level, but your function code can never see that token. If you need to call a downstream API as the user — which we do, because Azure DevOps permissions are per-user — you need another way to get the token into the function.

The solution: API Management acts as an OAuth gateway. It handles the entire authentication flow, caches the user's Entra tokens, and on every MCP tool call it injects the user's access token directly into the JSON-RPC arguments. The function reads args["bearerToken"] like any other parameter and performs an On-Behalf-Of exchange for an Azure DevOps-scoped token.

APIM becomes the brain. The Function App becomes the hands.

Three layers sit between the public internet and the function code:

  1. APIM OAuth gateway — validates the encrypted session key on every request and manages token caching
  2. EasyAuth — the Function App requires a valid Entra ID bearer token; APIM authenticates using its System-Assigned Managed Identity
  3. Host key — APIM injects the Function App's host key as an additional backend header

Even if someone discovers the Function App's *.azurewebsites.net URL, they can't call it without both a valid MI-issued bearer token and the host key.

The OAuth dance

MCP clients that support remote servers (VS Code, for example) expect an OAuth 2.0 PKCE flow. APIM implements the full ceremony across nine API operations:

MCP Client                       APIM                             Entra ID
│                                  │                                  │
│ GET /.well-known/                │                                  │
│ oauth-authorization-server       │                                  │
│─────────────────────────────────►│                                  │
│◄──────────────────────────────────│ (returns metadata)               │
│                                  │                                  │
│ POST /register                   │                                  │
│─────────────────────────────────►│                                  │
│◄──────────────────────────────────│ (dynamic client registration)    │
│                                  │                                  │
│ GET /authorize                   │                                  │
│ ?code_challenge=...              │                                  │
│ &code_challenge_method=S256      │                                  │
│─────────────────────────────────►│                                  │
│◄──────────────────────────────────│ 302 /consent                     │
│                                  │                                  │
│ User clicks "Allow"              │                                  │
│─────────────────────────────────►│                                  │
│◄──────────────────────────────────│ 302 /authorize                   │
│ (with approval cookie)           │                                  │
│                                  │                                  │
│─────────────────────────────────►│ GET /authorize (Entra ID)        │
│                                  │─────────────────────────────────►│
│                                  │ (user signs in, consents)        │
│                                  │◄─ code + state                   │
│                                  │                                  │
│                                  │ POST /token                      │
│                                  │ (code + code_verifier +          │
│                                  │  FIC client assertion)           │
│                                  │─────────────────────────────────►│
│                                  │◄─ access_token + refresh_token   │
│                                  │                                  │
│                                  │ Cache tokens, encrypt            │
│                                  │ session ID with AES-256          │
│◄──────────────────────────────────│ (encrypted session key)          │
│                                  │                                  │
ā–¼                                  ā–¼                                  ā–¼

A few things worth noting:

The consent page is custom. Before redirecting to Entra ID, APIM shows a consent screen that tells the user which MCP client is requesting access and what scopes it wants. The user's approval decision is stored in a __Host-MCP_APPROVED_CLIENTS cookie, so returning users skip the consent page on subsequent authorizations. CSRF protection uses constant-time comparisons to prevent timing attacks.

The token exchange uses Federated Identity Credentials, not secrets. When APIM calls Entra ID's /token endpoint, it doesn't send a client_secret. Instead, it presents a JWT assertion from its Managed Identity with aud=api://AzureADTokenExchange. Entra ID trusts this because we've configured a Federated Identity Credential on the app registration that says: "when this specific MI presents a token with this audience, treat it as a valid client credential."
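
Inside APIM this is a send-request policy, but the exchange itself is plain OAuth 2.0. Here is the same call expressed in Python for clarity; the tenant, client ID, code, redirect URI, and scope values are placeholders to substitute:

import requests
from azure.identity import ManagedIdentityCredential

tenant_id = "<tenant-id>"
client_id = "<app-registration-client-id>"

# The MI-issued JWT that stands in for a client_secret, courtesy of the FIC
assertion = ManagedIdentityCredential(client_id="<mi-client-id>").get_token(
    "api://AzureADTokenExchange"
).token

resp = requests.post(
    f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token",
    data={
        "grant_type": "authorization_code",
        "client_id": client_id,
        "code": "<code from Entra ID redirect>",
        "redirect_uri": "<APIM callback URL>",
        "code_verifier": "<PKCE verifier>",
        "scope": f"api://{client_id}/access_as_user openid offline_access",
        # Federated credential instead of a secret
        "client_assertion_type": "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
        "client_assertion": assertion,
    },
    timeout=30,
)
tokens = resp.json()  # access_token, refresh_token, expires_in, ...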

Session state is encrypted. After a successful token exchange, APIM generates a random session ID, encrypts it with AES-256, caches the Entra tokens keyed by the session ID, and returns the encrypted session key to the client. On every subsequent MCP request, APIM decrypts the session key, looks up the cached tokens, and injects the access token into the JSON-RPC body.
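
APIM does the cryptography in policy expressions; conceptually it is just symmetric encryption of a random identifier. A Python sketch of the idea using AES-GCM (the policy's exact AES mode and key handling may differ):

import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)   # in APIM this lives in a named value
aesgcm = AESGCM(key)

session_id = os.urandom(16).hex()           # keys the cached Entra tokens
nonce = os.urandom(12)
session_key = nonce + aesgcm.encrypt(nonce, session_id.encode(), None)

# The client holds session_key; APIM decrypts it on every MCP request
recovered = aesgcm.decrypt(session_key[:12], session_key[12:], None).decode()
assert recovered == session_id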

How token injection works

This is the core trick that makes the architecture work. When the MCP client sends a tool invocation:

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "list_pipeline_runs",
    "arguments": {
      "top": 10,
      "status": "completed"
    }
  }
}

The APIM MCP API policy intercepts the request, decrypts the session key from the Authorization header, retrieves the cached Entra token, and rewrites the body:

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "list_pipeline_runs",
    "arguments": {
      "top": 10,
      "status": "completed",
      "bearerToken": "eyJ0eXAiOiJKV1Qi..."
    }
  }
}

The function code then picks up bearerToken from the arguments and uses it for the OBO exchange:

import os

from azure.identity import (
    AzureCliCredential,
    ManagedIdentityCredential,
    OnBehalfOfCredential,
)


def _get_devops_token(args: dict) -> str:
    """Exchange the APIM-injected user token for an Azure DevOps-scoped token."""
    bearer_token = args.get("bearerToken")
    mi_client_id = os.environ.get("AZURE_MI_CLIENT_ID")

    if bearer_token and mi_client_id:
        mi_cred = ManagedIdentityCredential(client_id=mi_client_id)
        obo_cred = OnBehalfOfCredential(
            tenant_id=os.environ["AZURE_TENANT_ID"],
            client_id=os.environ["AZURE_CLIENT_ID"],
            client_assertion_func=lambda: mi_cred.get_token(
                "api://AzureADTokenExchange"
            ).token,
            user_assertion=bearer_token,
        )
        return obo_cred.get_token(
            "499b84ac-1321-427f-aa17-267ca6975798/.default"
        ).token
    else:
        # Local dev: use az login session
        return AzureCliCredential().get_token(
            "499b84ac-1321-427f-aa17-267ca6975798/.default"
        ).token

The OnBehalfOfCredential does two things at once: it uses the Managed Identity to prove it's the app registration (via the FIC), and it exchanges the user's Entra token for one scoped to Azure DevOps. The result is a token that carries the user's identity and permissions — not the application's.

That magic string 499b84ac-1321-427f-aa17-267ca6975798 is the Azure DevOps first-party application ID. Every Azure DevOps token request uses it as the resource scope.
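
With the DevOps-scoped token in hand, the downstream calls are plain REST. A quick sanity check against the Builds API; the organisation, project, and 7.1 api-version here are assumptions to adjust:

import requests

org, project = "<org>", "<project>"
token = _get_devops_token({})  # empty args: falls back to the az login path locally

resp = requests.get(
    f"https://dev.azure.com/{org}/{project}/_apis/build/builds",
    params={"$top": 5, "api-version": "7.1"},
    headers={"Authorization": f"Bearer {token}"},
    timeout=30,
)
resp.raise_for_status()
for build in resp.json()["value"]:
    print(build["id"], build["status"], build.get("result"), build["sourceBranch"])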

The Function App

Each MCP tool is an Azure Function using the mcpToolTrigger binding from the experimental extension bundle:

@app.generic_trigger(
    arg_name="context",
    type="mcpToolTrigger",
    toolName="list_pipeline_runs",
    description="List recent pipeline runs. Returns build IDs, statuses, branches, and durations.",
    toolProperties=json.dumps([
        {
            "propertyName": "pipeline_id",
            "propertyType": "integer",
            "description": "Filter to a specific pipeline ID."
        },
        {
            "propertyName": "status",
            "propertyType": "string",
            "description": "Filter by status: completed, inProgress, cancelling, notStarted."
        },
        {
            "propertyName": "top",
            "propertyType": "integer",
            "description": "Number of results to return (default 20, max 50)."
        },
    ]),
)
async def list_pipeline_runs(context: str) -> str:
    ctx = json.loads(context)
    args = ctx.get("arguments", {})
    devops_token = _get_devops_token(args)
    client = get_devops_client()
    # ... call Azure DevOps REST API ...
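
The elided part is ordinary REST plumbing: fetch builds (as in the earlier sketch) and shape them into something compact for the model. Roughly, continuing inside the handler with data holding the parsed Builds API response (the summary shape is my own; the field names come from the Build resource):

    runs = [
        {
            "id": b["id"],
            "pipeline": b["definition"]["name"],
            "status": b["status"],          # notStarted, inProgress, completed
            "result": b.get("result"),      # succeeded, failed, canceled
            "branch": b["sourceBranch"],
            "queued": b["queueTime"],
            "finished": b.get("finishTime"),
        }
        for b in data["value"]
    ]
    return json.dumps({"count": len(runs), "runs": runs})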

The host.json opts into the experimental extension bundle and configures the MCP server metadata:

{
  "version": "2.0",
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle.Experimental",
    "version": "[4.*, 5.0.0)"
  },
  "extensions": {
    "mcp": {
      "serverName": "azure-devops-pipelines-mcp",
      "serverVersion": "1.0.0"
    }
  }
}

The Azure DevOps REST client is thin by design — retry logic for 429 rate limits, and that's about it:

import time

import requests


class AzureDevOpsClient:
    def _request_with_retry(self, method, url, *, params, json_body, bearer_token):
        headers = {"Authorization": f"Bearer {bearer_token}"}
        for attempt in range(1, self._retry_attempts + 1):
            resp = requests.request(method, url, headers=headers, params=params, json=json_body)
            # Back off on rate limiting, honouring Retry-After when present
            if resp.status_code == 429 and attempt < self._retry_attempts:
                retry_after = float(resp.headers.get("Retry-After", self._retry_delay))
                time.sleep(retry_after)
                continue
            resp.raise_for_status()
            return resp.json()

Infrastructure as Code

The entire stack is Terraform. No portal clicking, no post-deployment scripts, no "just run this one az command manually."

Entra ID

The app registration is the identity anchor. It defines the OAuth scopes, redirect URIs, required permissions, and pre-authorized clients:

resource "azuread_application" "mcp_server" {
  display_name     = "${var.project_name}-${random_id.suffix.hex}"
  sign_in_audience = "AzureADMyOrg"

  api {
    requested_access_token_version = 2

    oauth2_permission_scope {
      id      = random_uuid.scope_id.result
      value   = "access_as_user"
      type    = "User"
      enabled = true
    }
  }

  required_resource_access {
    resource_app_id = "499b84ac-1321-427f-aa17-267ca6975798" # Azure DevOps
    resource_access {
      id   = "ee69721e-6c3a-468f-a222-571a0b31e4c1" # user_impersonation
      type = "Scope"
    }
  }
}

The service principal has app_role_assignment_required = true, meaning only explicitly assigned users or groups can obtain tokens. A security group is provisioned for group-based access, and admin consent for Azure DevOps user_impersonation is pre-granted via Terraform — without this, Entra ID throws AADSTS65001 during the OBO exchange.

Federated Identity Credential

This is what eliminates secrets from the entire credential chain:

resource "azuread_application_federated_identity_credential" "mi_trust" {
  application_id = azuread_application.mcp_server.id
  display_name   = "${var.project_name}-mi-fic"
  audiences      = ["api://AzureADTokenExchange"]
  issuer         = "https://login.microsoftonline.com/${data.azuread_client_config.current.tenant_id}/v2.0"
  subject        = azurerm_user_assigned_identity.mcp.principal_id
}

Translation: "When the Managed Identity azurerm_user_assigned_identity.mcp presents a token with aud=api://AzureADTokenExchange, trust it as a client credential for this app registration." The MI can now do OBO exchanges — proving it's the application — without a client secret or certificate.

Function App

Flex Consumption (FC1 SKU) with MI-authenticated storage — no connection strings anywhere:

resource "azurerm_function_app_flex_consumption" "mcp" {
  name            = "${var.project_name}-func-${random_id.suffix.hex}"
  service_plan_id = azurerm_service_plan.mcp.id

  storage_container_type            = "blobContainer"
  storage_authentication_type       = "UserAssignedIdentity"
  storage_user_assigned_identity_id = azurerm_user_assigned_identity.mcp.id

  runtime_name    = "python"
  runtime_version = "3.12"

  app_settings = {
    "AZURE_DEVOPS_ORG"     = var.azure_devops_org
    "AZURE_DEVOPS_PROJECT" = var.azure_devops_project
    "AZURE_MI_CLIENT_ID"   = azurerm_user_assigned_identity.mcp.client_id
    "AZURE_TENANT_ID"      = data.azuread_client_config.current.tenant_id
    "AZURE_CLIENT_ID"      = azuread_application.mcp_server.client_id
  }

  auth_settings_v2 {
    auth_enabled           = true
    require_authentication = true
    unauthenticated_action = "Return401"

    active_directory_v2 {
      client_id            = azuread_application.mcp_server.client_id
      tenant_auth_endpoint = "https://login.microsoftonline.com/${data.azuread_client_config.current.tenant_id}/v2.0"
    }
  }
}

EasyAuth is configured as a defense-in-depth layer. APIM authenticates to the Function App using its own System-Assigned MI, and the Function App validates the bearer token audience. This means even if APIM were compromised, an attacker would need a token issued for this specific app registration to reach the functions.

APIM

Two APIs. The OAuth API handles the full PKCE ceremony (nine operations: discovery, registration, authorize, consent, callback, token, and their CORS counterparts). The MCP API proxies SSE and POST requests to the Function App with session validation and token injection.

Fifteen named values feed into the policies — Entra IDs, scopes, encryption keys, the Function App's host key. All generated dynamically by Terraform, no manual configuration.

Local development

For local dev, the OBO path is bypassed entirely. When AZURE_MI_CLIENT_ID isn't set (which it won't be on your laptop), the function falls back to AzureCliCredential:

# Login to Azure with DevOps permissions
az login

# Start the Function App locally
func start

The MCP extension starts an SSE endpoint on localhost. Point your MCP client at it and you're calling Azure DevOps as yourself, with zero infrastructure dependencies.
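
If you want to exercise it without a full editor setup, the MCP Python SDK works as a throwaway client. A sketch, assuming the extension's default local SSE path (it may differ across bundle versions):

import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client


async def main() -> None:
    # Assumed default endpoint of the Functions MCP extension under func start
    url = "http://localhost:7071/runtime/webhooks/mcp/sse"
    async with sse_client(url) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("tools:", [t.name for t in tools.tools])
            result = await session.call_tool("list_pipeline_runs", {"top": 5})
            print(result.content)


asyncio.run(main())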

Deployment

# Deploy infrastructure (APIM takes 30-45 minutes on first create)
cd infra
terraform init && terraform apply

# Deploy function code
func azure functionapp publish $(terraform -chdir=infra output -raw function_app_name)

# Verify
curl $(terraform -chdir=infra output -raw apim_oauth_metadata)

Then add the server to VS Code:

{
  "mcp": {
    "servers": {
      "azure-devops-pipelines": {
        "type": "sse",
        "url": "<apim_mcp_endpoint terraform output>"
      }
    }
  }
}

On first use, VS Code opens a browser window for Entra ID consent. After that, the encrypted session key handles re-authentication transparently.

Trade-offs and honest simplifications

A few things I'd change before running this in production at scale:

  • APIM Developer_1 SKU — fine for development, but you'd want a higher tier for production SLAs and throughput. The Developer SKU has no built-in SLA.
  • Token cache TTL — tokens are cached for one hour in APIM's internal cache. For a production deployment, you'd want to handle refresh token rotation and cache eviction more carefully.
  • Extension bundle is experimental — the Microsoft.Azure.Functions.ExtensionBundle.Experimental bundle contains the MCP trigger bindings. It works, but the API surface may change before GA.
  • Single project scope — the AZURE_DEVOPS_PROJECT environment variable pins the server to one project. Multi-project support would need either parameterised tool calls (see the sketch below) or multiple Function App deployments.
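
The parameterised option is a small change in principle: declare an optional project property on each tool and fall back to the environment variable. A hypothetical sketch, not part of the current code:

import os


def _resolve_project(args: dict) -> str:
    # Hypothetical helper: prefer an explicit "project" tool argument, which each
    # tool would also need to declare in its toolProperties list
    return args.get("project") or os.environ["AZURE_DEVOPS_PROJECT"]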

Wrapping up

The pattern here is broadly applicable beyond Azure DevOps. Any scenario where an MCP server needs to call downstream APIs as the user faces the same challenge: how do you get an identity-scoped token into function code when the MCP extension doesn't pass headers?

The answer is an APIM OAuth gateway that owns the full authentication lifecycle and injects tokens at the protocol level. The Function App stays focused on business logic. Entra ID handles identity. Managed Identity and FIC eliminate secrets. Terraform makes it reproducible.

The entire codebase — infrastructure, policies, and function code — is available in the repository.

Acknowledgements

The APIM OAuth gateway pattern and policy structure are adapted from Azure-Samples/remote-mcp-apim-functions-python by Den Delimarsky. Jay's MCP Azure OAuth2 OBO guide was the reference for the Federated Identity Credential and OBO exchange pattern.


Footnotes

  1. The Azure Functions MCP extension (mcpToolTrigger) is part of the experimental extension bundle. It adds MCP protocol support — JSON-RPC, tool discovery, SSE transport — as a Functions binding type.
