Understanding and leveraging Semantic Kernel - Unleashing the power of plugins

In this post, we’ll tap into the real power of Semantic Kernel: moving beyond its role as a simple orchestrator. Enter plugins, and get ready to experience AI on steroids.

LLMs are trained on massive amounts of data, often sourced from high-quality materials such as newspapers, books, and scientific articles. This training allows them to generate fluent and coherent text across a wide range of topics.
However, if we ask one of these models for real-time information, like the current weather in Paris or the score of last week's football match, we'll likely get an unsatisfactory response. That's because this kind of information is absent from the training data, and the model has no reliable way to produce facts it has never seen.

Even though these models are remarkably good at generating high-quality content on timeless topics, they struggle with recent or real-time information. To handle such queries and respond like a well-informed human, they need to be augmented with additional capabilities.

What is retrieval-augmented generation (RAG)?

The mechanism that allows LLMs to stay up to date is known in the literature as retrieval-augmented generation (RAG).

Information

RAG is a technique used in combination with large language models (LLMs) to improve their accuracy, especially when dealing with real-time, domain-specific, or factual information that may not be part of the model’s training data.

  • When a user asks a question, the system first searches an external knowledge base (like a document store, database, or API) to retrieve relevant information (retrieval).

  • The retrieved content is then injected into the model’s prompt or context (augmentation).

  • The LLM generates a response based on both the user’s query and the retrieved data (generation).

The key point here is that the LLM retrieves information from an external source before generating its response. This source could be the internet, or even private or specialized data such as internal company documents. In any case, the goal remains the same: to enhance accuracy and ensure the information is up-to-date.
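
To make these three steps concrete, here is a minimal sketch in C#. It assumes a configured kernel (as set up later in this post), and SearchKnowledgeBaseAsync is a hypothetical helper standing in for whatever retrieval backend is used (vector store, database, web search, and so on).

// Minimal RAG sketch. SearchKnowledgeBaseAsync is a hypothetical helper
// standing in for any retrieval backend (vector store, database, API).
var question = "What was announced in last week's press conference?";

// 1. Retrieval: look up content relevant to the user's question.
string context = await SearchKnowledgeBaseAsync(question);

// 2. Augmentation: inject the retrieved content into the prompt.
var prompt = $"Using only this context: {context}\nAnswer this question: {question}";

// 3. Generation: the LLM answers from both the query and the retrieved data.
var answer = await kernel.InvokePromptAsync(prompt);
Console.WriteLine(answer);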

Information

ChatGPT employs this technique by retrieving missing information from the internet, effectively functioning like a search engine.

As a comprehensive framework for building AI solutions, Semantic Kernel supports RAG. In Semantic Kernel’s terminology, this capability is implemented through what it calls a plugin.

Implementing a plugin in Semantic Kernel

As in the previous post, we will use Semantic Kernel within a Console application. We will thus set up our solution following the same steps outlined earlier (refer to the previous articles if needed).
In this section, we will simulate a Weather API to demonstrate a plugin in action. But before anything else, keep in mind that plugins in Semantic Kernel can be implemented in various ways. Here, we will use native plugins, meaning those coded directly in C#.

To code a plugin in Semantic Kernel, we generally follow these steps (note that this is not a rigid structure enforced by Microsoft; rather, it is our own approach).

  • Implement the plugin
We write a class that implements the desired logic and annotate it with accurate descriptions. This class can connect to external APIs, perform data processing, or carry out any other tasks our plugin requires.

  • Register the plugin with Semantic Kernel
    In our application setup, we register our plugin with the Semantic Kernel instance so it can be invoked by the AI agent.

Implementing the plugin

  • Create a folder in the solution named Plugins.
  • Add a WeatherPlugin.cs file in this folder.
  • Add the following code to this class.
using System.ComponentModel;
using Microsoft.SemanticKernel;

public class WeatherPlugin
{
    [KernelFunction("get_weather_by_location")]
    [Description("Get the weather in a specific location")]
    public string GetCurrentWeather(string location)
    {
        // Simulate calling a weather API
        return $"The current weather in {location} is sunny and 25°C.";
    }
}
Information

In a real-world scenario, we would of course call an actual API. Here, however, we are simply simulating a call.
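
For illustration, a real implementation might look something like the following sketch. The endpoint URL, query format, and response handling are hypothetical and depend entirely on the weather service being used.

using System.ComponentModel;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;

public class WeatherPlugin
{
    [KernelFunction("get_weather_by_location")]
    [Description("Get the weather in a specific location")]
    public async Task<string> GetCurrentWeatherAsync(string location)
    {
        // Hypothetical endpoint: substitute the real provider's URL, parameters, and API key.
        // In production, reuse a shared HttpClient instead of creating one per call.
        using var client = new HttpClient();
        return await client.GetStringAsync(
            $"https://api.example.com/weather?location={Uri.EscapeDataString(location)}");
    }
}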

Note that the most important element here is the description. It is this description that allows the LLM to understand the plugin’s purpose and how to use it effectively.
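
Descriptions can be attached to parameters as well, which helps the model fill in the arguments correctly. A possible enrichment of our function (the exact wording is up to us):

[KernelFunction("get_weather_by_location")]
[Description("Get the current weather conditions and temperature for a given location")]
public string GetCurrentWeather(
    [Description("The city or place to get the weather for, e.g. 'Paris'")] string location)
{
    // Same simulated call as before.
    return $"The current weather in {location} is sunny and 25°C.";
}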

Registering the plugin in Semantic Kernel

The plugin must, of course, be registered within Semantic Kernel in order to be recognized and utilized by the AI ecosystem. The following code must be added to the Program.cs file.

// Registering the plugin
kernel.Plugins.AddFromObject(new WeatherPlugin());
Information

We can see how simple it is to register a plugin in Semantic Kernel. In our scenario, we're using C# code for this task, but it's also possible to use semantic functions defined in .txt files. Additionally, there are multiple ways to register plugins; refer to the official Semantic Kernel documentation for more details.
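
For instance, a plugin can also be registered by type while the kernel is being built; a minimal sketch:

// Alternative: register the plugin by type at builder time.
var builder = Kernel.CreateBuilder();
builder.Plugins.AddFromType<WeatherPlugin>();
var kernel = builder.Build();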

Note also that the entire program fits in about twenty lines.

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI; // required for OpenAIPromptExecutionSettings

// Configure Semantic Kernel
var deploymentName = "<your_deployment_name>"; // see above
var endPoint = "<your_endpoint>"; // see above
var apiKey = "<your_apikey>"; // see above

// Add Azure OpenAI chat completion service
var builder = Kernel.CreateBuilder();
builder.AddAzureOpenAIChatCompletion(deploymentName, endPoint, apiKey);
var kernel = builder.Build();

// Registering the plugin
kernel.Plugins.AddFromObject(new WeatherPlugin());

var settings = new OpenAIPromptExecutionSettings()
{
    ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
};

var query = "What is the weather like in Rome?";
var response = await kernel.InvokePromptAsync(query, new(settings));
Console.WriteLine($"\nQuestion: {query}\n\nAnswer: {response}\n");
Information

We explicitly configure our application to automatically invoke plugins when necessary. This behavior is enabled by the following line of code.

var settings = new OpenAIPromptExecutionSettings()
{
    ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
};

That’s why the function description is crucial: it guides Semantic Kernel and the LLM in determining whether calling the function will help answer the user's query effectively.

Running the program
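
With everything in place, running the console application should produce output along these lines (the exact phrasing of the answer may vary from run to run):

Question: What is the weather like in Rome?

Answer: The current weather in Rome is sunny and 25°C.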

In short

Under the hood, when the LLM analyzes the user's prompt, it recognizes that the request involves retrieving the weather in Rome. Since this information is not part of its internal knowledge, it examines the available plugins—focusing particularly on their descriptions—to determine whether any of them can provide the necessary data. In this case, it identifies the appropriate plugin and invokes the corresponding function. The LLM correctly extracts "Rome" as the desired location and passes it along. The simulated weather API is then called and returns the temperature, which the LLM integrates and formats using its own natural language capabilities to generate a complete response.

We didn’t need to explicitly tell Semantic Kernel which plugin to invoke. Instead, Semantic Kernel automatically identified the appropriate function based on the user’s prompt. This powerful feature is known as automatic function calling.

It is configured in the code using the following line.

var settings = new OpenAIPromptExecutionSettings()
{
    ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
};
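
As a side note, recent versions of Semantic Kernel also expose a connector-agnostic way to enable the same behavior; a minimal sketch, assuming a sufficiently recent SDK:

// Newer equivalent using the connector-agnostic function-choice API.
var settings = new OpenAIPromptExecutionSettings()
{
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
};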

So plugins are incredibly powerful tools for adding dedicated features to our AI agents. As we saw in our example, we can invoke a Weather API. But while accessing the weather is certainly useful, we could just as easily need access to private data or company documents (which would require a separate plugin with its own API), or integrate a search engine (another plugin), or provide up-to-date information on football games (another plugin), or basketball matches (yet another plugin).
In the end, we might find ourselves managing dozens of APIs to ensure our AI agent stays cutting-edge and relevant—an effort that requires careful design and ongoing maintenance.
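
In code, that proliferation would look something like this (all plugin classes below besides WeatherPlugin are hypothetical):

// One plugin per capability, each wrapping its own external API.
kernel.Plugins.AddFromObject(new WeatherPlugin());
kernel.Plugins.AddFromObject(new CompanyDocumentsPlugin());
kernel.Plugins.AddFromObject(new WebSearchPlugin());
kernel.Plugins.AddFromObject(new FootballScoresPlugin());
kernel.Plugins.AddFromObject(new BasketballScoresPlugin());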

Wouldn’t it be fantastic if we could interact with every stakeholder uniformly in our code, without having to adapt to each individual API? To address this very need, a protocol was recently developed: enter the Model Context Protocol (MCP).

Understanding and leveraging Semantic Kernel - Integrating Semantic Kernel with the MCP protocol