
MCP Demystified: A Starting Point for Engineers

  • Marco
  • 1 day ago
  • 8 min read

Updated: 6 hours ago

AI Canvas: this announcement made a significant impact, and everyone is discussing it. It raises questions about how it works under the hood and how you might integrate your own tools and systems. This is where MCP starts to shine.

AI is to network engineers what automation was a few years ago. It was a new topic that everyone was discussing. Many were apprehensive, while others were very curious. Today, it turns out that almost everyone in this field needs to know at least the basics, and most have developed a passion for automation, working with it regularly.

I believe MCP will soon become what Python was then: the essential tool to leverage this new innovation.


In this post, I am excited to explain what MCP is for beginners, and then dive a bit deeper. So, if you already have a high-level understanding of the concept, you might find the initial part uninteresting and prefer to skip to the in-depth section.


Chapter 1: MCP explained for beginners


To start, I want to clarify some components we will discuss. When people think of AI, they often think of ChatGPT. Essentially, "Chat" refers to the user interface that facilitates interaction between the user and the LLM (Large Language Model). The LLM, or GPT, is the actual AI. It is a neural network trained on a massive amount of text and data to model patterns in human language. You can ask questions, and the LLM will respond.

However, it cannot perform tasks for you. For example, if you ask ChatGPT to book a rental car or set up a Docker container, it will give you a step-by-step guide on how to do it yourself. The GUI itself cannot access the necessary resources.


example trying ChatGPT doing a task


This is where AI agents become relevant. They are programs equipped with their own memory and with access to the LLM, external resources, and the user. This enables them to make decisions and perform actions independently, yet on behalf of the user (human-in-the-loop).


What is a potential use case for an AI agent?

Imagine you want to book a flight from Munich to Los Angeles.

In the past, there were a few common ways to automate this task:


  • Macro automation: Automating repetitive steps by “teaching” a computer to click through the same screens and forms as a human

  • Web scraping: Extracting and processing the raw HTML from a site

  • APIs: you know what this is :-)


The most common way today is to use APIs, but what is the problem with them that an AI agent can help solve? In this scenario, hundreds or even thousands of flights across all airlines worldwide must be gathered and compared, and there is no exact standard for how the data must look.


Each API might use a different URL and endpoint to access flight information.


https://turbulence-taxi.com/api/flights
https://winginit-airlines.eu/api/v1/get-flights
https://skyhighparty.de/api/flight-list

Even more challenging is that they can change from time to time.

Another issue is the format of the result. Some airlines will provide a three-letter code, others the airport's name, and some a combination of both.


{
  "flights": [
    {
      "destination": "LAX",
      "number": "LH345",
      "origin": "MUC"
    }
  ]
}

{
  "flights-list": [
    {
      "destination": "los angeles",
      "number": "RX4543",
      "origin": "munich"
    }
  ]
}

{
  "flight-details": [
    {
      "destination": "LAX - los angeles",
      "number": "UI584",
      "origin": "MUC - munich"
    }
  ]
}


To ensure comparability, you need a code that transforms them into a comparable format. If a new airline comes into play or an existing API changes, your code might fail until you adjust it.
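A sketch of the kind of normalization code you would have to maintain today, handling the three response shapes shown above. The city-to-IATA mapping is invented for illustration; a real implementation would need a complete airport database, and the code still breaks the moment an airline introduces a fourth format.

```python
# Hypothetical mapping from city names to IATA codes; a real
# implementation would need a complete airport database.
CITY_TO_IATA = {"munich": "MUC", "los angeles": "LAX"}

def normalize_airport(value: str) -> str:
    """Reduce 'LAX', 'los angeles', or 'LAX - los angeles' to 'LAX'."""
    token = value.split("-")[0].strip()
    if len(token) == 3 and token.isupper():
        return token
    return CITY_TO_IATA.get(token.lower(), token.upper())

def normalize_response(payload: dict) -> list[dict]:
    # Each airline wraps the flight list under a different key.
    for key in ("flights", "flights-list", "flight-details"):
        if key in payload:
            return [
                {
                    "origin": normalize_airport(f["origin"]),
                    "destination": normalize_airport(f["destination"]),
                    "number": f["number"],
                }
                for f in payload[key]
            ]
    raise ValueError("unknown response format")
```

Every new airline means another key in the loop and more entries in the mapping; exactly the brittleness described above.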


The same issue arises when obtaining switch configurations in a string. You require a specific regex to parse it into an object. If Cisco decides to change the format, your code might stop working.
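A minimal sketch of such a regex parser, using an invented Cisco-style config snippet. The pattern works only as long as the vendor keeps the exact output format:

```python
import re

# Invented interface block from a Cisco-style running config.
CONFIG = """\
interface GigabitEthernet1/0/23
 switchport access vlan 512
 switchport mode access
"""

# If the vendor changes indentation or keywords, this pattern breaks.
pattern = re.compile(
    r"interface (?P<name>\S+)\n"
    r"(?: .*\n)*?"
    r" switchport access vlan (?P<vlan>\d+)"
)

match = pattern.search(CONFIG)
if match:
    port = {"interface": match.group("name"), "vlan": int(match.group("vlan"))}
```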


A large language model can address these challenges by leveraging natural language to interpret and interact with information.

However, a common question arises: Can an agent directly use APIs, find the correct documentation, and apply it correctly? The answer is no — not without additional help. This is where the Model Context Protocol (MCP), developed by Anthropic, becomes relevant.

MCP is frequently compared to USB-C, although I don't like that comparison much. A more fitting one may be SNMP. Basically, MCP functions as an adapter or broker that provides resources, either from within its own environment or from external systems such as APIs and databases.

Through MCP, an agent can provide an LLM with the context of a resource. MCP informs the agent about the available resource types exposed by the server. In other words, the server informs the agent about the actions it can perform with the server's help. The server, in turn, retrieves information, for example from an API, takes the result, and returns it in a standardized format.


approach ai agent using mcp


Thanks to the MCP server, the agent can now access all airlines through it. The server is set up to connect to the airlines via their APIs.

There are hundreds of pre-built MCP servers available on https://mcp.so/ and many other websites. However, you can also create your own server. I will explain how to do this in my next blog post.



Chapter 2: Technical deep-dive into MCP components


The Model Context Protocol is used exclusively for exchanging context with the AI application (or agent, if you prefer). It does not involve managing or controlling the large language model (LLM) itself.



Architecture


Let's clarify the components:

MCP Host

This component serves as the AI application (or agent) that coordinates interactions between the LLM and MCP clients. Examples include Claude Desktop and LM Studio.

MCP Client

This component establishes the connection to the MCP server and functions as the client module. It is responsible for capability negotiation, function discovery, and notification handling.

MCP Server

The server maintains and provides data, tools, and prompts to the client. It also executes API calls and returns the results in a format readable by the AI.



structure of all MCP components



Protocols


Data Layer

JSON-RPC 2.0 is used for message exchange. This is important if you're developing your own MCP Server. The response to the LLM must be in plain JSON.



Transport Layer

  • STDIO: Refers to standard input/output. This requires both the client and server to be on the same system. It's recommended only for testing your own MCP servers, as the server needs access to your local OS pipes, which can be a security risk.

  • Streamable HTTP: The preferred deployment method, which is also accessible through a network.

  • SSE: Refers to Server-Sent Events. This older transport has been deprecated in favor of Streamable HTTP and should not be used for new servers.




Primitives


The most crucial aspect within MCP is the primitives. Primitives refer to the types of resources and information that an MCP Server can provide.


There are three types:

Tools

Most important! Executable functions such as API calls or database queries

Resources

A data source providing context such as file contents and database records

Prompts

A set of predefined, reusable prompt templates. Examples include system prompts or few-shot examples*

*few-shot: prompt with some examples (2-10)

zero-shot: no examples at all in the prompt

one-shot: giving one example

fine-tune: training the model with a set of labeled data instead of a prompt



Methods


The three main methods are:

  • */list

  • */get

  • */call


Combined with primitives, this may look as follows:


  • tools/list --> list of all available tools

  • tools/call --> execute/call a specific tool


Notifications

If this capability is negotiated, the server can inform the client about changes instantly, so the client can react and adapt in time.



Chapter 3: Bringing Everything Together


This chapter will explore what happens when an MCP Server is specified in the mcp.json file of an AI Agent.


mcp.json edit


The client will start the connection and ask about the server's capabilities. The server will then reply with something like this:

 

{
  "jsonrpc": "2.0",
  "id": 326, --> MCP is asynchronous. ID same as in the request
  "result": {
    "protocolVersion": "2025-06-18",
    "capabilities": { -->  Primitives that are available
      "tools": {
        "listChanged": true --> notifications possible
      },
      "resources": {}
    },
    "serverInfo": {
      "name": "example-server",
      "version": "1.0.0"
    }
  }
}


I want to highlight a few sections here. First, the ID. Since MCP operates asynchronously, the client does not block while waiting for a response. Blocking would slow down or essentially freeze the entire MCP process. Using the flight-search example: if the server, and consequently the client, had to wait for the information about one flight before requesting the next, searching dozens of flights would take hours.

However, sending hundreds of requests to a server can result in responses arriving out of order, because some airlines process requests faster than others, and some do not respond at all. So how do we keep requests and responses matched? This is achieved through the ID: the server sends the response with the exact same ID as the request, allowing them to be paired. In this scenario, the initialization would look like this:


{
  "jsonrpc": "2.0",
  "id": 326,
  "method": "initialize",
<...>
}
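The matching logic itself is simple. A sketch of how a client can pair out-of-order responses with their requests via the ID, using only the standard library:

```python
import json

# id -> method of the outstanding request
pending = {}

def send(method: str, req_id: int) -> None:
    pending[req_id] = method  # remember what we asked for

def on_response(raw: str):
    msg = json.loads(raw)
    method = pending.pop(msg["id"])  # same id as in the request
    return method, msg.get("result")

send("initialize", 326)
send("tools/list", 52)

# Responses may arrive in any order; the id keeps them matched.
method, result = on_response('{"jsonrpc": "2.0", "id": 52, "result": {"tools": []}}')
```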

The next point I want to emphasize is the list of available primitives. In the example above, it includes only tools. However, at this stage, the host is unaware of which ones exactly exist or their functions. This information will be requested in the subsequent step.

"listChanged": true indicates to the host that the server can send notifications if there are any changes regarding tools, such as additions, removals, or modifications in descriptions.
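A minimal sketch of how a client could inspect the initialize result shown above and decide what to do next; the dictionary simply mirrors the example response:

```python
# Mirror of the example initialize result from the server.
init_result = {
    "protocolVersion": "2025-06-18",
    "capabilities": {
        "tools": {"listChanged": True},
        "resources": {},
    },
    "serverInfo": {"name": "example-server", "version": "1.0.0"},
}

caps = init_result["capabilities"]
# The server exposes tools, so the next step is a tools/list request.
supports_tools = "tools" in caps
# listChanged signals that the server can push change notifications.
wants_notifications = caps.get("tools", {}).get("listChanged", False)
```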


Now that the client knows which primitives are available, it will request the actual lists. This is accomplished using the list method discussed earlier in this post:

{
  "jsonrpc": "2.0",
  "id": 52,
  "method": "tools/list"
}

This request will result in a response like this:


{
  "jsonrpc": "2.0",
  "id": 52,
  "result": {
    "tools": [
      {
        "name": "inventory_list",
        "description": "List all managed devices; optional filter
                        by device family {SWITCH, ROUTER, AP, WLC}.",
        "inputSchema": {
          "type": "object",
          "properties": {
            "family": {
              "type": "string",
              "enum": ["SWITCH", "ROUTER", "AP", "WLC"]
            }
          }
        }
      },
      {
        "name": "port_assignment",
        "description": "Configure a switchport: set authentication 
                        type (open|close) and access VLAN on the 
                        given port.",
        "inputSchema": {
          "type": "object",
          "properties": {
            "deviceId": { "type": "string" },
            "interface": { "type": "string" },
            "vlan": { "type": "integer" },
            "auth": { "type": "string", "enum": ["open", "close"] }
          },
          "required": ["deviceId", "interface", "vlan", "auth"]
        }
      }
    ]
  }
}
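The required/optional distinction can be checked mechanically. A sketch of how a server might validate a tools/call against the `required` list of the inputSchema shown above (a full implementation would also validate types and enums, e.g. with a JSON Schema library):

```python
# Mirror of the port_assignment inputSchema from the tools/list response.
schema = {
    "type": "object",
    "properties": {
        "deviceId": {"type": "string"},
        "interface": {"type": "string"},
        "vlan": {"type": "integer"},
        "auth": {"type": "string", "enum": ["open", "close"]},
    },
    "required": ["deviceId", "interface", "vlan", "auth"],
}

def missing_arguments(schema: dict, arguments: dict) -> list[str]:
    """Return the required arguments that the call did not provide."""
    return [key for key in schema.get("required", []) if key not in arguments]

missing = missing_arguments(schema, {"deviceId": "abc", "interface": "1/0/23"})
```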

Once again, I want to emphasize some crucial elements here.

If you're developing your own MCP Server, the description is crucial. The Agent considers this to decide whether it will use the tools of this MCP Server or not. If it decides to use the tool, the description will influence how it is used, directly affecting the quality and functionality of the entire tool. So take your time with this.


The input schema is set by the creator of the MCP tool. This outlines which arguments can be passed to the tool to perform actions or influence the response. Some arguments are optional, while others are required, similar to RESTful APIs.

This is indicated in the required field. If an argument isn't listed here, it's optional.


Therefore, if you instruct your AI Agent to assign port 1/0/23 on a switch named edge01 to VLAN 512, it will search through all the tools of all MCP servers. If your description is well formulated, it will select this tool. Next, it should ideally recognize that a UUID is needed instead of a name and search for a tool that can obtain the deviceId from the name.

If both are found based on their descriptions, the agent will first obtain the UUID using the inventory tool and then proceed with the tool shown below, without explicit human instruction. A good LLM will make these decisions independently based on the available tools and complete the task. Finally, the call should look like this; it advises the MCP server to craft the appropriate API call to configure the Catalyst Center.


{
  "jsonrpc": "2.0",
  "id": 333,
  "method": "tools/call",
  "params": {
    "name": "port_assignment",
    "arguments": {
      "deviceId": "01f7cdf2-2298-42c7-bb74-dc68e3c3a051",
      "interface": "gigabitethernet1/0/23",
      "vlan": 512,
      "auth": "open"
    }
  }
}
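The two-step flow can be sketched end to end. Everything here is hypothetical: `call_tool` stands in for a real MCP tools/call round trip, and the inventory data is invented for illustration.

```python
# Invented inventory: hostname -> deviceId, standing in for a real backend.
INVENTORY = {"edge01": "01f7cdf2-2298-42c7-bb74-dc68e3c3a051"}

def call_tool(name: str, arguments: dict) -> dict:
    """Placeholder for a real MCP tools/call round trip."""
    if name == "inventory_list":
        return {"devices": [{"hostname": h, "id": i} for h, i in INVENTORY.items()]}
    if name == "port_assignment":
        return {"status": "ok", "configured": arguments}
    raise ValueError(f"unknown tool: {name}")

# Step 1: resolve the switch name to the deviceId the schema requires.
devices = call_tool("inventory_list", {"family": "SWITCH"})["devices"]
device_id = next(d["id"] for d in devices if d["hostname"] == "edge01")

# Step 2: call the configuration tool with the resolved id.
result = call_tool("port_assignment", {
    "deviceId": device_id,
    "interface": "gigabitethernet1/0/23",
    "vlan": 512,
    "auth": "open",
})
```

In a real agent, the LLM decides the ordering of these two calls itself, guided only by the tool descriptions.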

agent using multiple mcp tools

I hope you found the information valuable. In my next blog post, I'll discuss how to create your own MCP Server and utilize this technology in your everyday business activities.



