Understanding and leveraging Semantic Kernel - Generative AI, LLMs and more
Before diving into the nitty-gritty details of Semantic Kernel, this post will first focus on understanding the context in which it operates and the problems it aims to address.
Artificial intelligence is more than just a buzzword—it has become a mandatory term in every conversation, lest we appear out of touch or uninformed. So much so that today, it often feels as if any problem in the world can be solved simply by invoking the magic of AI.
In reality, few people can clearly define what is actually meant by "AI", and everyone seems to have their own interpretation. The purpose of this section is simply to bring some clarity and establish a shared understanding of the key terms we’ll be using.
What is AI?
Defining what AI truly is can be a complex task, and we won’t venture into a deep philosophical debate here. Instead, we’ll assume a practical definition: AI refers to the recent advancements in natural language processing (NLP) that allow a program or agent to understand human language in its natural form and respond appropriately—or even take intelligent actions on its own.
In this sense, AI strives to mimic human behavior and simulate how a person would respond in real-world situations.
This also means we can hold question-and-answer exchanges much as we would with a friend; the most prominent example of this trend is arguably ChatGPT.
I often hear about generative AI as well—what exactly does it mean?
As we've just seen, AI is expected to answer both simple and complex questions. What sets modern AI apart from its predecessors is that the responses are no longer selected from a set of predefined patterns based on the input question. Instead, answers are generated in real time, taking into account various factors such as tone, language, history, and context. This approach is known as generative AI, because the content is dynamically created rather than pre-written.
Generative AI refers to a type of artificial intelligence designed to create new content—such as text, images, audio, or even code—based on patterns it has learned from existing data. Instead of simply analyzing or classifying information, generative AI produces original outputs that resemble human-created content.
In itself, generating text based on context is not a new concept—we’ve even written a post on this topic before, focusing on Google’s autocomplete feature. For those interested, follow this link: Implementing the Google's autocomplete feature with ngram language models.
The difference lies in the techniques used by modern tools. While earlier models were based on relatively simple mathematical concepts such as Markov chains or basic word embeddings, today’s tools rely on more advanced approaches like sentence embeddings or language models (see below). Moreover, they can be trained on massive datasets. This combination is what makes generative AI a true game-changer.
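To make the contrast concrete, here is a minimal sketch (not from the original post) of the kind of bigram Markov-chain generator those earlier approaches relied on; the tiny corpus is invented purely for illustration.

```csharp
using System;
using System.Collections.Generic;

class MarkovDemo
{
    static void Main()
    {
        // Toy corpus (illustrative only): the "model" is nothing more than bigram statistics.
        string corpus = "the cat sat on the mat the cat ate the fish";

        // Build a table mapping each word to the words observed right after it.
        var next = new Dictionary<string, List<string>>();
        string[] words = corpus.Split(' ');
        for (int i = 0; i < words.Length - 1; i++)
        {
            if (!next.TryGetValue(words[i], out var followers))
                next[words[i]] = followers = new List<string>();
            followers.Add(words[i + 1]);
        }

        // Generate text by repeatedly sampling a plausible next word.
        var rng = new Random();
        string current = "the";
        var output = new List<string> { current };
        for (int i = 0; i < 8 && next.ContainsKey(current); i++)
        {
            current = next[current][rng.Next(next[current].Count)];
            output.Add(current);
        }
        Console.WriteLine(string.Join(" ", output));
    }
}
```

The only "knowledge" such a model has is which word pairs appeared in its training text, which is precisely why its output quickly becomes incoherent compared to what modern language models produce.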
And what are LLMs?
LLMs (large language models) represent a new generation of language models (that is, systems capable of generating coherent and context-aware content). They are built on groundbreaking concepts such as transformers, encoders/decoders, and the attention mechanism. In addition, they leverage deep neural networks with billions of parameters, trained on vast datasets. This combination allows them to understand and produce language with remarkable accuracy and fluency. Naturally, these models are incredibly complex and expensive to train—often requiring several weeks of computation.
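For the mathematically inclined, the attention mechanism at the heart of transformers boils down to a single formula from the original "Attention Is All You Need" paper, where $Q$, $K$ and $V$ are the query, key and value matrices and $d_k$ is the dimension of the keys:

$$\operatorname{Attention}(Q, K, V) = \operatorname{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$$

Intuitively, each token's query is compared against every other token's key, and the resulting weights determine how much of each value contributes to the output; this is what lets the model take the whole context into account.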
It’s worth noting that some language models are specialized—for example, there are models specifically designed to answer mathematical questions, others focused on cooking, and so on. The advantage of these specialized models is that they are smaller in size and require less computational power and training time. As a result, there are now countless language models available, which can sometimes make it difficult to navigate the landscape and choose the right one.
- ChatGPT, for example, is designed to answer general-purpose questions across a wide range of topics.
- MathBERT and Minerva are specifically trained to solve mathematical problems.
- BloombergGPT is a large model trained on financial data and can handle finance- and accounting-related questions.
- ...
We can get a sense of the sheer number of available models by visiting Hugging Face—a platform that hosts a vast collection of language models for a wide range of tasks.
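As a quick illustration, the Hub also exposes a public REST API that can be queried programmatically; the sketch below assumes the https://huggingface.co/api/models endpoint and simply prints the raw JSON returned for a small search.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

class HubSearch
{
    static async Task Main()
    {
        // Query the public Hugging Face Hub API for models matching "sentiment".
        // (Endpoint and parameters per the Hub API docs; adjust as needed.)
        using var http = new HttpClient();
        http.DefaultRequestHeaders.UserAgent.ParseAdd("demo-client/1.0");
        string url = "https://huggingface.co/api/models?search=sentiment&limit=5";
        string json = await http.GetStringAsync(url);
        Console.WriteLine(json); // Raw JSON list of matching model ids and metadata.
    }
}
```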
What is an agent?
Now that we have a better understanding of what LLMs and generative AI are, we can envision scenarios where these advanced technologies can be leveraged. For example, we could design a program that responds to customer inquiries and takes appropriate actions. Such a program is often referred to as an agent, as it can analyze situations and make intelligent decisions autonomously.
An AI agent is a software program or system designed to perceive its environment, process information, and take actions to achieve specific goals—often autonomously or with minimal human intervention. Unlike simple algorithms, AI agents can learn, adapt, and make decisions based on data and context.
For example, virtual assistants like Siri or Alexa, chatbots that handle customer service, or recommendation systems can all be considered AI agents. In enterprise settings, AI agents might automate workflows, analyze data, or interact intelligently with users to streamline business processes.
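To show the shape of the idea in code, here is a toy sketch of the perceive/decide/act cycle described above; the IAgent interface and the EchoAgent implementation are entirely hypothetical and not taken from any framework.

```csharp
using System;

// A toy contract for an agent: perceive the environment, decide, then act.
// Names and behavior are illustrative, not from any particular library.
interface IAgent
{
    string Perceive();            // Gather input from the environment.
    string Decide(string input);  // Choose an action (an LLM call could go here).
    void Act(string action);      // Apply the action back to the environment.
}

class EchoAgent : IAgent
{
    public string Perceive() => Console.ReadLine() ?? "";
    public string Decide(string input) =>
        input.Contains("refund") ? "escalate to billing" : "answer directly";
    public void Act(string action) => Console.WriteLine($"Action: {action}");
}

class Program
{
    static void Main()
    {
        IAgent agent = new EchoAgent();
        Console.WriteLine("Type a customer message (empty line to quit):");
        while (true)
        {
            string observation = agent.Perceive();
            if (string.IsNullOrEmpty(observation)) break; // Stop on empty input/EOF.
            string action = agent.Decide(observation);
            agent.Act(action);
        }
    }
}
```

A real agent would replace the hard-coded Decide method with a call to a language model, which is exactly where the technologies discussed above come in.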
But how can we quickly develop such a program and bring it into production? This is where Semantic Kernel comes into play.
What is Semantic Kernel?
Building an AI agent involves selecting the right language model, connecting to it—typically via an API—querying it, and handling the responses. These tasks can be tedious and repetitive, and worse, they often need to be redone every time we switch providers. On top of that, we also have to implement all the supporting infrastructure—such as logging, tracing, and other boilerplate code—that's essential when working with cutting-edge technologies.
As we can see, constantly reinventing the wheel can become burdensome, especially when the true value lies in what we deliver to our customers, not in the underlying technology. To help avoid these complexities, Microsoft introduced Semantic Kernel.
Semantic Kernel is an open-source SDK developed by Microsoft that helps developers build AI-powered applications using large language models (LLMs) like OpenAI’s GPT or Azure OpenAI models. It acts as a bridge between traditional programming and modern AI capabilities by allowing us to integrate natural language processing, reasoning, and decision-making into our software.
As highlighted in this definition, Semantic Kernel is designed to simplify the implementation of AI agents and facilitate continuous iteration to deliver value to customers. By providing a unified interface for various models and connectors to different APIs, it streamlines the developer’s workflow—especially since these features are accompanied by enterprise-grade capabilities such as logging.
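To give a first taste before the next post goes deeper, here is a minimal sketch using the .NET SDK (assuming Microsoft.SemanticKernel 1.x; the model id is a placeholder and the API key is read from an environment variable, and builder methods may vary slightly between versions):

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;

class Program
{
    static async Task Main()
    {
        // Register an OpenAI chat model behind Semantic Kernel's unified interface.
        // Swapping providers (e.g., Azure OpenAI) means swapping this registration
        // line, not the calling code below.
        var builder = Kernel.CreateBuilder();
        builder.AddOpenAIChatCompletion(
            modelId: "gpt-4o-mini", // placeholder model id
            apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY")!);
        Kernel kernel = builder.Build();

        // Send a prompt and print the generated answer.
        FunctionResult result = await kernel.InvokePromptAsync(
            "In one sentence, what is Semantic Kernel?");
        Console.WriteLine(result);
    }
}
```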
But Semantic Kernel is more than just a simple passthrough that lets you orchestrate various language models with a few lines of code. It also introduces the powerful concept of plugins, which enable us to enrich data and perform retrieval-augmented generation (RAG). We'll explore that topic in more detail in a future post. Enough theory for now: let’s dive into some code and explore how to use this new tool.
Understanding and leveraging Semantic Kernel - Building a very simple agent in a few lines