
Let’s look at what it takes to build an AI-powered knowledge base. Imagine your company has a large set of documents, and you want employees to be able to search through them using natural language rather than exact keywords.
Suppose the knowledge base includes the employee handbook, and in the handbook is this sentence: “All employees are required to submit time-off requests at least two weeks in advance.”
But instead of forcing users to search for exact words, you want to allow something like this: “How far in advance do I need to request vacation time?”
A well-built, AI-powered app would locate the document containing the answer and display it in response to such a query—even if the query is misspelled or unclear. And keep this in mind: according to the 2025 edition of Dice’s Tech Salary Report, natural language processing and document databases were increasingly lucrative skills over the past year—companies will pay top dollar for a tech professional who can quickly and effectively build out a knowledge base using a company’s database and a core set of AI tools.
Choosing the Model Type
Before diving into the application, it’s important to choose the correct type of AI model. Because of the ubiquity of the term “LLM” (Large Language Model), you might think that’s the right choice. In fact, it’s not. Let’s compare.
An LLM is meant for generating text in response to a prompt (think ChatGPT and similar chatbots). When you provide an LLM with a prompt, the prompt is converted into tokens, where each token represents roughly one word. Those tokens are fed into the LLM, which returns the most likely token to come next; this serves as the first word of the answer. Then the entire prompt, along with that first response token, is fed in again, and the LLM provides the next most likely token. This repeats over and over until a full response is built.
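That generate-one-token-at-a-time loop can be sketched in a few lines. Everything here is invented for illustration: the “model” is just a hard-coded lookup table standing in for the billions of learned parameters in a real LLM:

```python
# Toy "model": maps the full token sequence so far to the most likely
# next token. A real LLM computes this from billions of parameters.
def toy_next_token(tokens):
    table = {
        ("How", "far", "in", "advance?"): "Two",
        ("How", "far", "in", "advance?", "Two"): "weeks",
        ("How", "far", "in", "advance?", "Two", "weeks"): "in",
        ("How", "far", "in", "advance?", "Two", "weeks", "in"): "advance.",
    }
    return table.get(tuple(tokens), "<end>")

def generate(prompt_tokens):
    tokens = list(prompt_tokens)
    response = []
    while True:
        # Feed the prompt plus everything generated so far back in...
        nxt = toy_next_token(tokens)
        if nxt == "<end>":   # ...until the model signals it's done.
            break
        tokens.append(nxt)
        response.append(nxt)
    return " ".join(response)

print(generate(["How", "far", "in", "advance?"]))  # Two weeks in advance.
```

The point of the sketch is the loop structure: every new token is produced by re-running the model on the prompt plus everything generated so far.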
While you could potentially train an LLM on the information in your knowledge base, that really isn’t a good way to go at this point. For starters, LLMs aren’t always correct (we’ve all heard recent stories where an AI told people to, for example, put glue on pizza). Now imagine using such an AI for the employee handbook: you don’t want the LLM handing out incorrect information to employees.
Also, training such a model takes a long time, and as new documents are created, adding those takes time as well. Instead, there’s a better option: a sentence transformer model. A sentence transformer model uses similar technology to an LLM, except instead of converting individual words into tokens, it uses the data in the model to convert entire sentences into individual vectors (a vector is basically a sequence of floating-point numbers, typically several hundred of them).
What this means is you can take an entire knowledge base, let the AI software use the sentence transformer model to convert the documents into multiple vectors, and then store those in what’s called a vector database. Note that the documents aren’t being stored in the AI model, which remains unchanged.
Now that the documents are stored in a database, they can be searched. But here’s where it gets fascinating: The sentence transformer model is built so that two sentences with similar meanings produce similar (but not exactly the same) vectors. And note that context matters. The sentence “I saw her duck” would produce one vector in a paragraph about somebody squatting down to avoid a baseball, and a different one in a paragraph about the animal called a duck. The sentence transformer model detects the difference based on the surrounding text included in the chunk being embedded.
And this means the two sentences mentioned earlier (one from the employee handbook, the other the user’s question) would generate vectors that are similar but not identical. The vector for the first would be stored in the database; the vector for the second would be generated in response to a query. The AI app would then search the database, find that the first is similar to the second, and either return that sentence as the answer or return the entire document, perhaps with the sentence highlighted.
That’s the essence of how a sentence transformer model can be used as a search engine for a knowledge base. And notice what’s fascinating about it: You don’t need to search for exact words. The employee typed “vacation,” which wasn’t even in the resulting sentence, yet the correct sentence was located. This is a huge advancement in database searches.
Talking Angles
When Stephen Hawking wrote “A Brief History of Time,” his editors encouraged him not to include any mathematical formulas beyond Einstein’s famous E = mc². They warned that every formula he added would cut the readership in half.
We won’t put a formula here, but we’ll mention a topic that comes up a great deal in current AI: cosine similarity. When you’re comparing two vectors generated by a sentence transformer model, the usual technique is cosine similarity: you treat the vectors as arrows in space and check whether they point in the same direction. The way to do that is by taking the cosine of the angle between them, which gives a real number anywhere from -1 to 1. A value close to 1 means the vectors (and therefore the sentences) are similar.
Pro tip: What we just described is very much a watered-down version of the math. If you’re serious about learning this, we highly encourage you to learn the actual mathematics. Learn the formula (which involves vector dot products) as well as how vectors extend into much higher-dimensional spaces beyond the three we live in.
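For the curious, here’s what that comparison looks like in code: a minimal sketch of the cosine-similarity formula using NumPy, with tiny made-up three-dimensional vectors standing in for real embeddings:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: their dot product
    divided by the product of their lengths (magnitudes)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Tiny three-dimensional stand-ins for real embeddings
# (real sentence embeddings have hundreds of dimensions).
u = np.array([1.0, 2.0, 3.0])
v = np.array([2.0, 4.0, 6.0])   # same direction as u
w = np.array([-1.0, 0.0, 1.0])  # a different direction

print(cosine_similarity(u, v))  # 1.0: identical direction
print(cosine_similarity(u, w))  # a much smaller value
```

In practice, libraries compute this for you, but seeing the dot product divided by the vector lengths makes the formula far less mysterious.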
Learning This in Python
Although we won’t build the full application here, we’ll point you to some resources where you can learn to build your own AI-powered knowledge base.
First, you’ll want to continue studying the concepts outlined above. You probably don’t need to learn how the sentence transformer model works internally (unless you really want to; be forewarned, it’s incredibly advanced technology with a lot of math behind it).
Next, learn how to use the library called Sentence Transformers. Start with the basic examples; there’s one right on the home page. At this point, the code should make sense:
- Load the model
- Put three sentences into a list
- Calculate the vectors, which are also called embeddings. (All three vectors are stored together in a single object.)
- Print out the “shape” of the embeddings: there are three vectors, and each contains 384 numbers.
- Then calculate the cosine similarities of all combinations of the three and print them out.
Pro tip: Play with the demo a bit. Try adding code that asks the user for a sentence, then compare the similarity of that sentence to each of the three example sentences.
Next, start looking at all the examples in the Usage section of the documentation, and carefully read the instructions. Make sure you understand each page before moving to the next.
As you work through the examples, pay special attention to the section called Semantic Search. That’s the one you’ll need to learn for your own knowledge base, and it’s the one you’ll want to spend time on.
The examples show how to load a sequence of sentences. In your case, consider how you would instead load a sequence of entire documents; you’re not limited to single sentences here.
Then instead of hardcoding the queries, prompt the user for a query.
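Putting those two ideas together, the core search step is small. This sketch runs standalone by faking the embeddings with hand-picked NumPy vectors; in a real app, every vector would come from the sentence transformer model’s encode() call, and the query would come from the user:

```python
import numpy as np

# Pretend embeddings for three stored documents. In a real app these
# come from the sentence transformer model and live in a vector database.
doc_vectors = np.array([
    [0.9, 0.1, 0.0],   # "...time-off requests two weeks in advance..."
    [0.0, 1.0, 0.2],   # "...dress code policy..."
    [0.1, 0.0, 1.0],   # "...expense reimbursement..."
])
doc_titles = ["Time-off policy", "Dress code", "Expenses"]

def best_match(query_vector, doc_vectors):
    """Return the index of the stored vector most similar to the query,
    using cosine similarity (dot products divided by vector lengths)."""
    norms = np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(query_vector)
    similarities = doc_vectors @ query_vector / norms
    return int(np.argmax(similarities))

# A query vector close in direction to the time-off document's vector.
query = np.array([0.8, 0.2, 0.1])
print(doc_titles[best_match(query, doc_vectors)])  # Time-off policy
```

The library’s Semantic Search utilities do exactly this kind of comparison for you, efficiently and at scale, but this is the idea underneath.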
And right there, you’ll have a basic searchable knowledge base.
Adding a Front End and Hosting
To run these AI models at reasonable speed, you’ll want an NVIDIA graphics card with CUDA support (smaller models can run on a CPU, just much more slowly). If you already have one, you can do your development on your own machine. If not, you have a couple of options: a free plan with limitations, or a paid one.
For a paid option, your best bet is to provision a GPU-powered EC2 instance on Amazon Web Services. (But be careful: Set a reminder on your phone to shut it down whenever you’re finished. Such an instance can cost several hundred dollars a month if left running.)
A great free option is Google Colab, which is Google’s hosted version of Jupyter Notebooks. Through these, you can load models and write test code. The limitation is that you can’t build entire applications; however, it’s a great place to practice. They even offer a starter notebook that loads a sentence transformer model.
The code you initially build will be a command-line program, which is hardly user-friendly. But the same code can be placed inside a backend on a web server such as one you provision on AWS (or more likely a local server at your work).
From there you would build an API in front of it; you’ll probably want to use Flask for this. (There are other options, but Flask is the most popular.)
Then you would create your beautiful web interface in a front end (probably React, Angular or Vue). That’s where you would include a search box where the user types a query, a button that sends the query to the back end, and a page that displays the resulting documents, perhaps with the answer highlighted.
Conclusion
Learning to add AI to your Python apps, such as building a searchable knowledge base, is entirely possible with the right tools. You can easily pull in the Sentence Transformers library, and the code isn’t very complex. But don’t just drop the code in; take the time to understand how it works, as we described here. Soon you’ll be able to add a line to your resume saying you’ve built a searchable knowledge base powered by sentence-transformer AI.