Build and test a LlamaIndex RAG application
LlamaIndex is a framework for building context-augmented large language model (LLM) applications. It enables you to augment the model with domain-specific data to customize it and improve its responses to your use case. You can use this framework to build a question-answering chat bot, a document-understanding and extraction application, or an autonomous agent. The framework provides tools to ingest data, process it, and implement query workflows that combine data access with LLM prompts.
LlamaIndex is available in Python and TypeScript. In this tutorial, you will learn how to build a RAG-powered question-answering (QA) system using LlamaIndex and expose the functionality as a REST API. You will learn how to write unit tests for it using pytest and automate the build and test process using continuous integration with CircleCI.
Prerequisites
For this tutorial, you need to set up a Python development environment on your machine. You also need a CircleCI account to automate the testing of your LLM model. Refer to this list to set up everything required for this tutorial.
Create a new Python project
You will be building a question-answering system with LLMs using the retrieval augmented generation (RAG) framework. RAG combines the power of LLMs with a retrieval mechanism that searches a database or knowledge source for relevant information, grounding its responses in specific, retrieved content rather than relying solely on pre-trained knowledge.
LlamaIndex supports multiple data formats, including SQL, CSV, and raw text files. This tutorial uses a Paul Graham essay as the input data for the QA system. You can replace it with a custom dataset of your choice.
First, create a new directory for your Python project and navigate into it.
mkdir llamaindex_question_answer_circleci
cd llamaindex_question_answer_circleci
Next, create a data/paul_graham_essay.txt file in your project’s root and add the contents from this file to it.
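If you prefer to script this step, here is a minimal, optional sketch that creates the data directory and the file; the placeholder text is only an example and must be replaced with the actual essay contents linked above.

# Optional helper: create the data directory and the essay file.
# The placeholder string is just an example; paste in the real essay text.
from pathlib import Path

data_dir = Path("data")
data_dir.mkdir(exist_ok=True)
(data_dir / "paul_graham_essay.txt").write_text(
    "Replace this placeholder with the essay text.", encoding="utf-8"
)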
Installing the dependencies
In this tutorial, we will use the llama-index Python package for building context-augmented LLM applications and Flask for exposing the model’s question-answer functionality as a REST API. We will also use the pytest package for unit testing. Create a requirements.txt file in the project’s root and add the following dependencies to it:
Flask
llama-index
pytest
To install the dependencies, first create a virtual environment to isolate them from your system packages. Enter these commands in your terminal to create an environment named menv and activate it:
python -m venv menv
source menv/bin/activate
Next, use the pip install command in your terminal to install the Python dependencies:
pip install -r requirements.txt
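To confirm the installation succeeded, you can optionally run a quick throwaway script that imports the key packages; if any import fails, re-check the previous steps.

# Optional sanity check: these imports should succeed after installation.
import flask
import pytest
from llama_index.core import VectorStoreIndex

print("Flask, pytest, and llama-index imported successfully")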
Defining the question answering script
To build a QA system, you need to define a script that loads the text document, creates a vector store index, and uses it to answer queries. A vector store index is a key RAG component that accepts a list of Node objects and builds an index over them.
Create a query.py file in the project’s root and add these contents to it:
import os.path

from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
    load_index_from_storage,
)


def get_query_index():
    # check if storage already exists
    PERSIST_DIR = "./storage"
    if not os.path.exists(PERSIST_DIR):
        # load the documents and create the index
        documents = SimpleDirectoryReader("data").load_data()
        index = VectorStoreIndex.from_documents(documents)
        # store it for later
        index.storage_context.persist(persist_dir=PERSIST_DIR)
    else:
        # load the existing index
        storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
        index = load_index_from_storage(storage_context)

    # Either way we can now query the index
    query_engine = index.as_query_engine()
    return query_engine


def answer_query(query):
    if not query or len(query) == 0:
        return None
    query_engine = get_query_index()
    response = query_engine.query(query)
    return response
Here’s what this code snippet does:
- get_query_index() creates a VectorStoreIndex if it doesn’t already exist and returns a query_engine for it.
- answer_query() performs basic validations on the input query and uses the query_engine to generate a response.
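Before wiring the script into an API, you can try it from a Python shell. This quick check assumes the OPENAI_API_KEY environment variable is already set (covered in the next section) and that the data directory contains the essay.

# Quick manual check from a Python shell; requires OPENAI_API_KEY to be set
# and the essay to be present in the data/ directory.
from query import answer_query

response = answer_query("What did the author do growing up?")
print(response)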
Adding unit tests for the QA script
Next, you will define a unit test to test the answer_query
function defined in the previous section. Create a test_query.py
file and add these contents to it:
from query import answer_query
import unittest


class TestQuestionAnswerAgent(unittest.TestCase):
    def test_valid_response(self):
        response = answer_query("What did the author do growing up?")
        self.assertIsNotNone(response)
        self.assertTrue(len(str(response)) > 0)

    def test_invalid_response(self):
        response = answer_query("")
        self.assertIsNone(response)


if __name__ == "__main__":
    unittest.main()
The test file defines unit tests for valid and invalid query scenarios, checking that answer_query() returns a non-null, non-empty response for a valid query and None for an empty one. You could add more specific assertions to increase the clarity and robustness of your tests.
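For example, here is a sketch of a more targeted test that checks whether the answer mentions topics the essay actually covers. The keywords assume the Paul Graham essay is the input data and should be adapted if you use different data; keep in mind that LLM output varies between runs, so keyword checks can be flaky.

# A sketch of a more specific test; the expected keywords assume the
# Paul Graham essay and should be adjusted for other datasets.
from query import answer_query
import unittest


class TestAnswerContent(unittest.TestCase):
    def test_response_mentions_expected_topics(self):
        response = answer_query("What did the author do growing up?")
        text = str(response).lower()
        # The essay describes writing and programming, so a grounded answer
        # is likely to contain at least one of these keywords.
        self.assertTrue("writ" in text or "program" in text)


if __name__ == "__main__":
    unittest.main()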
Before running the tests locally, you need to set the OPENAI_API_KEY environment variable. Use the API key that you created earlier in the prerequisites section.
export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>
Note: Make sure that your OpenAI account is funded before using the API key. Refer to this guide for details on how to fund your account.
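LlamaIndex’s default OpenAI integration reads the key from the environment. If you want to confirm the key is visible to your Python process before running the tests, an optional check looks like this:

# Optional check that the key is visible to the Python process; LlamaIndex's
# default OpenAI integration reads it from the environment.
import os

if not os.environ.get("OPENAI_API_KEY"):
    raise SystemExit("OPENAI_API_KEY is not set; export it before running the tests.")
print("OPENAI_API_KEY is set")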
Run the tests:
pytest ./
Defining a Flask web server
In this section, you will create a Flask app using Python 3 with an /answer endpoint that returns responses based on the input query. Flask is an open source Python framework for developing web applications and a popular choice for building the API service layer of lightweight applications. To define the Flask app, create an app.py file at the root of the project and add the following code snippet:
from flask import Flask, jsonify, request

from query import answer_query

app = Flask(__name__)


@app.route('/')
def index():
    return 'Welcome to the Q&A service!'


@app.route('/answer', methods=['POST'])
def get_answer():
    query = request.json.get('query')
    response = answer_query(query)
    if response is None:
        # answer_query returns None for missing or empty queries
        return jsonify({"error": "The 'query' field must not be empty."}), 400
    return jsonify({
        "response": f"{response}"
    })


if __name__ == '__main__':
    app.run()
In this code, get_answer() accepts query as input and uses the answer_query() function defined in the previous section to generate a response. If the query is missing or empty, answer_query() returns None and the endpoint responds with an HTTP 400 error.
To test the API endpoint, start the Flask web server:
flask --app app run
It will start a web server at http://127.0.0.1:5000, where you can test the /answer API using curl.
curl --location 'http://127.0.0.1:5000/answer' \
--header 'Content-Type: application/json' \
--data '{
"query": "What did the author do growing up?"
}'
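If you prefer to call the endpoint from Python instead of curl, a minimal client sketch using only the standard library looks like this; it assumes the Flask development server from the previous step is still running on port 5000.

# Minimal Python client for the /answer endpoint, using only the standard
# library; assumes the Flask dev server is running on http://127.0.0.1:5000.
import json
import urllib.request

payload = json.dumps({"query": "What did the author do growing up?"}).encode("utf-8")
request = urllib.request.Request(
    "http://127.0.0.1:5000/answer",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    print(json.loads(response.read()))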
Adding unit tests for the API endpoint
Create a test_app.py
at the root of the project. Add this code snippet to it:
from app import app
import unittest


class TestQAAPI(unittest.TestCase):
    def setUp(self):
        self.app = app
        self.client = self.app.test_client

    def test_valid_response(self):
        request = {"query": "What did the author do growing up?"}
        response = self.client().post("/answer", json=request)
        self.assertIsNotNone(response)
        self.assertTrue(len(str(response.data)) > 0)

    def test_invalid_response(self):
        request = {"query": ""}
        response = self.client().post("/answer", json=request)
        self.assertNotEqual(response.status_code, 200)
This code snippet defines test cases for both success and failure scenarios. You can run the tests by executing the following command:
pytest ./
The tests should pass, and you should see output similar to this:
===================== test session starts ====================
platform darwin -- Python 3.8.18, pytest-7.4.2, pluggy-1.3.0
rootdir: tutorials/llamaindex_question_answer_circleci
plugins: anyio-4.3.0
collected 4 items
test_app.py .. [ 50%]
test_query.py .. [100%]
===================== 4 passed in 3.73s ====================
Automate testing using CircleCI
Now that you have tested the QA system and the API locally, automate the workflow so that the unit tests can be executed every time you push code to the main branch.
Add the configuration script
Add a .circleci/config.yml
script in the project’s root containing the configuration file for the CI pipeline. Add this code snippet to it:
version: 2.1
orbs:
  python: circleci/python@2.1.1
workflows:
  build-app-with-test:
    jobs:
      - build-and-test
jobs:
  build-and-test:
    docker:
      - image: cimg/python:3.9
    steps:
      - checkout
      - python/install-packages:
          pkg-manager: pip
      - python/install-packages:
          pip-dependency-file: requirements.txt
          pkg-manager: pip
      - python/install-packages:
          args: pytest
          pkg-manager: pip
          pypi-cache: false
      - run:
          name: Run tests
          command: pytest
Take a moment to review the CircleCI configuration.
- The build-and-test job uses the circleci/python@2.1.1 orb to build and test the LlamaIndex script and API.
- The job checks out the repository, installs the pip packages listed in the requirements.txt file, and runs the tests using pytest.
Now that the configuration file has been properly set up, create a repository for the project on GitHub and push all the code to it. Review Pushing a project to GitHub for instructions.
Setting up the project on CircleCI
Log in to your CircleCI account. On the CircleCI dashboard, click the Projects tab, search for the GitHub repo name and click Set Up Project.
You will be prompted to add a new configuration file manually or use an existing one. Since you have already pushed the required configuration file to the codebase, select the Fastest option. Enter the name of the branch that hosts your configuration file. Click Set Up Project to continue.
[Image: CircleCI project configuration]
Clicking Set Up Project automatically triggers the pipeline. The pipeline will fail because you have not set the environment variables yet.
Setting up environment variables
On the project page, click Project settings. Go to the Environment variables tab and click Add environment variable. Add an environment variable for OpenAI API key. Set the key name as OPENAI_API_KEY
and the API key’s value in the value
field. Once you add the environment variables, the key values will be displayed on the dashboard.
Now that the environment variables are configured, trigger the pipeline again. This time the build should succeed.
Conclusion
In this tutorial, you learned how to automatically build and test a LlamaIndex question-answering RAG application using CircleCI. LlamaIndex is a simple, flexible data framework for connecting custom data sources to large language models. The pipeline can be used to execute unit tests for the QA query script and the corresponding Flask API using pytest, boosting development speed.
You can check out the complete source code used in this tutorial on GitHub.