LlamaIndex is a framework for building context-augmented large language model (LLM) applications. It enables you to augment the model with domain-specific data to customize it and improve its responses to your use case. You can use this framework to build a question-answering chat bot, a document understanding and extraction application, or an autonomous agent. The framework provides tools to ingest data, process it, and implement query workflows that combine data access with LLM prompts.

LlamaIndex is available in Python and TypeScript. In this tutorial, you will learn how to build a question-answering (QA) system using LlamaIndex and expose the functionality as a REST API. You will learn how to write unit tests for it using pytest and automate the build and test process using continuous integration with CircleCI.

Prerequisites

For this tutorial, you need to set up a Python development environment on your machine. You also need a CircleCI account to automate the testing of your LLM application. Refer to this list to set up everything required for this tutorial.

Create a new Python project

You will be building a question-answering system with LLMs using the retrieval augmented generation (RAG) framework. RAG combines the power of LLMs with a retrieval mechanism that searches a database or knowledge source for relevant information, grounding its responses in specific, retrieved content rather than relying solely on pre-trained knowledge.
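To make the retrieve-then-generate idea concrete, here is a minimal sketch in plain Python. It is not how LlamaIndex implements retrieval: the word-overlap scoring, the function names (tokenize, retrieve, build_prompt), and the two sample chunks are all assumptions made up for illustration. It only shows the shape of the flow, where relevant text is looked up first and then placed in the prompt ahead of the question.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, chunks, top_k=1):
    """Rank chunks by how many words they share with the question (toy scoring)."""
    question_words = tokenize(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: len(question_words & tokenize(chunk)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, context_chunks):
    """Put the retrieved text ahead of the question so the answer can be grounded in it."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )

# Made-up sample chunks, standing in for pieces of a real knowledge source
chunks = [
    "Before college the author worked on writing and programming.",
    "The weather in Paris is mild in spring.",
]
question = "What did the author work on before college?"
prompt = build_prompt(question, retrieve(question, chunks))
```

In a real RAG system the prompt would then be sent to the LLM; that step is left out here so the sketch runs without an API key.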

LlamaIndex supports multiple data formats, including SQL, CSV, and raw text files. This tutorial uses a Paul Graham essay as the input data for the QA system. You could replace the input data with custom data of your choice.

First, create a new directory for your Python project and navigate into it.

mkdir llamaindex_question_answer_circleci
cd llamaindex_question_answer_circleci

Next, create a data/paul_graham_essay.txt file in your project’s root and add the contents from this file to it.

Installing the dependencies

In this tutorial, we will use the llama-index Python package for building context-augmented LLM applications and Flask for exposing the model’s question-answer functionality as a REST API. We will also use the pytest package for unit testing. Create a requirements.txt file in the project’s root and add the following dependencies to it:

Flask
llama-index
pytest

To install the dependencies, first create a virtual environment to isolate them. Enter these commands in your terminal to create an environment named menv and activate it:

python -m venv menv 
source menv/bin/activate

Next, use the pip install command (in your terminal) to install the Python dependencies:

pip install -r requirements.txt

Defining the question answering script

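To build a QA system, you need to define a script that loads the text document, creates a vector store index, and uses it to answer queries. A vector store index is a key component of RAG: it accepts a list of Node objects (chunks of the source documents), creates an embedding for each one, and builds an index that can be searched for the content most relevant to a query.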
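The idea behind a vector index can be sketched with the standard library alone. This is a toy under stated assumptions, not the LlamaIndex implementation: the embed function uses bag-of-words counts instead of a learned embedding model, and ToyVectorIndex and the sample nodes are hypothetical names and data invented for this example.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: word counts (a real system would call an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

class ToyVectorIndex:
    def __init__(self, nodes):
        # Store each node next to its vector so queries only embed the question
        self.entries = [(node, embed(node)) for node in nodes]

    def query(self, text, top_k=1):
        """Return the top_k nodes most similar to the query text."""
        query_vector = embed(text)
        scored = sorted(
            self.entries,
            key=lambda entry: cosine_similarity(query_vector, entry[1]),
            reverse=True,
        )
        return [node for node, _ in scored[:top_k]]

# Made-up nodes for illustration
index = ToyVectorIndex([
    "The author wrote short stories as a teenager.",
    "Flask is a lightweight web framework.",
])
best_match = index.query("What did the author write?")
```

The expensive part in practice is computing embeddings, which is why the script that follows persists the index to disk instead of rebuilding it on every run.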

Create a query.py file in the project’s root and add these contents to it:

import os.path

from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
    load_index_from_storage,
)

def get_query_index():
    # check if storage already exists
    PERSIST_DIR = "./storage"

    if not os.path.exists(PERSIST_DIR):
        # load the documents and create the index
        documents = SimpleDirectoryReader("data").load_data()
        index = VectorStoreIndex.from_documents(documents)
        # store it for later
        index.storage_context.persist(persist_dir=PERSIST_DIR)
    else:
        # load the existing index
        storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
        index = load_index_from_storage(storage_context)

    # Either way we can now query the index
    query_engine = index.as_query_engine()

    return query_engine

def answer_query(query):
    if not query or len(query) == 0:
        return None

    query_engine = get_query_index()
    response = query_engine.query(query)

    return response

Here’s what this code snippet does:

  • get_query_index() creates a VectorStoreIndex if it doesn’t already exist and returns a query_engine for it.
  • answer_query() performs basic validations on the input query and uses the query_engine to generate a response.
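Note that answer_query() calls get_query_index() on every query, so the index is loaded from storage each time. One possible refinement, sketched here with a stub in place of the real engine, is to cache the query engine with functools.lru_cache. The names StubQueryEngine, get_cached_query_engine, and answer_query_cached are hypothetical and exist only for this example.

```python
from functools import lru_cache

class StubQueryEngine:
    """Stand-in for the object returned by index.as_query_engine()."""

    def query(self, text):
        return f"stub answer to: {text}"

@lru_cache(maxsize=1)
def get_cached_query_engine():
    """Build the engine on first use and reuse it on later calls."""
    # In the tutorial's script, this is where get_query_index() would be called
    return StubQueryEngine()

def answer_query_cached(query):
    if not query:
        return None
    return get_cached_query_engine().query(query)

first = answer_query_cached("What did the author do growing up?")
second = answer_query_cached("Where did the author study?")

# cache_info() reports how often the engine was built (misses) versus reused (hits)
stats = get_cached_query_engine.cache_info()
```

Whether caching is appropriate depends on how often the underlying data changes; a long-running server would need a way to refresh the engine when the documents are updated.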

Adding unit tests for the QA script

Next, you will define a unit test to test the answer_query function defined in the previous section. Create a test_query.py file and add these contents to it:

from query import answer_query

import unittest

class TestQuestionAnswerAgent(unittest.TestCase):
    def test_valid_response(self):

        response = answer_query("What did the author do growing up?")

        self.assertIsNotNone(response)
        self.assertTrue(len(str(response)) > 0)

    def test_invalid_response(self):
        response = answer_query("")
        self.assertIsNone(response)

if __name__ == "__main__":
    unittest.main()

The test file defines unit tests for a valid and an invalid query: a valid query should make answer_query() return a non-null, non-empty response, and an empty query should make it return None. You could add more specific assertions to increase the clarity and robustness of your tests.
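Because these tests call the LLM, they need an API key and their output can vary between runs. A common complement is a test that replaces the query engine with a mock, so the control flow of answer_query() can be checked offline. The sketch below is self-contained and therefore redefines stand-ins for get_query_index() and answer_query(); the helper answer_with_fake_engine is a hypothetical name. In the actual project, the equivalent would likely be mock.patch("query.get_query_index").

```python
from unittest import mock

def get_query_index():
    # Stand-in for the tutorial's function, which builds the index and needs an API key
    raise RuntimeError("the real index should not be built in this test")

def answer_query(query):
    # Mirrors the logic of answer_query() in query.py
    if not query:
        return None
    return get_query_index().query(query)

def answer_with_fake_engine(query, canned_answer):
    """Run answer_query() with get_query_index() swapped for a mock engine."""
    fake_engine = mock.Mock()
    fake_engine.query.return_value = canned_answer
    # Temporarily replace get_query_index in the namespace answer_query() reads from
    with mock.patch.dict(
        answer_query.__globals__, {"get_query_index": lambda: fake_engine}
    ):
        result = answer_query(query)
    return result, fake_engine

result, engine = answer_with_fake_engine(
    "What did the author do growing up?", "canned answer"
)
```

With the mock in place, assertions can be exact (for example, that the engine received the original query string) instead of only checking that something non-empty came back.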

Before running the tests locally, you need to set the OPENAI_API_KEY environment variable. Use the API key that you created earlier in the prerequisites section.

export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>

Note: Make sure that your OpenAI account is funded before using the API key. Refer to this guide for details on how to fund your account.

Run the test:

pytest ./

Defining a Flask web server

In this section, you will create a Flask app using Python 3 with an /answer endpoint that returns responses based on the input query. Flask is an open source Python framework for developing web applications. It is a popular choice for building the API service layer for lightweight applications. Create an app.py file at the root of the project and add the following code snippet to define a simple Flask app.

from flask import Flask, jsonify, request

from query import answer_query

app = Flask(__name__)

@app.route('/')
def index():
    return 'Welcome to the Q&A service!'

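@app.route('/answer', methods=['POST'])
def get_answer():
    # Parse the JSON body; silent=True yields None instead of raising on bad input
    payload = request.get_json(silent=True) or {}
    query = payload.get('query')

    response = answer_query(query)
    if response is None:
        # answer_query() returns None for a missing or empty query
        return jsonify({"error": "A non-empty 'query' is required."}), 400

    return jsonify({
        "response": str(response)
    })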

if __name__ == '__main__':
    app.run()

In this code, get_answer() accepts query as input and uses the answer_query() function defined in the previous section to return a response.

To test the API endpoint, start the Flask web server:

flask --app app run

It will start a web server at http://127.0.0.1:5000, where you can test the /answer API using curl.

curl --location 'http://127.0.0.1:5000/answer' \
--header 'Content-Type: application/json' \
--data '{
    "query": "What did the author do growing up?"
}'
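The same request can be made from Python using only the standard library. This is a sketch that mirrors the curl command above; the helper names build_answer_request and ask are hypothetical, and ask() only works while the Flask server is running locally.

```python
import json
import urllib.request

def build_answer_request(query, base_url="http://127.0.0.1:5000"):
    """Build a POST request to /answer with a JSON body, like the curl command."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/answer",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(query, base_url="http://127.0.0.1:5000"):
    """Send the request and return the 'response' field (needs the server running)."""
    with urllib.request.urlopen(build_answer_request(query, base_url)) as reply:
        return json.loads(reply.read().decode("utf-8"))["response"]

# Building the request does not contact the server
request = build_answer_request("What did the author do growing up?")
```

Calling ask("What did the author do growing up?") would then return the answer text from the running service.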

Adding unit tests for the API endpoint

Create a test_app.py at the root of the project. Add this code snippet to it:

from app import app

import unittest

class TestQAAPI(unittest.TestCase):
    def setUp(self):
        self.app = app
        self.client = self.app.test_client

    def test_valid_response(self):
        request = {"query": "What did the author do growing up?"}

        # json= sends the payload as application/json, which request.json expects
        response = self.client().post("/answer", json=request)

        self.assertEqual(response.status_code, 200)
        self.assertTrue(len(response.get_json()["response"]) > 0)

    def test_invalid_response(self):
        request = {"query": ""}

        response = self.client().post("/answer", data=request)

        self.assertNotEqual(response.status_code, 200)

The above code snippet defines test cases for both success and failure scenarios. You can run the test by executing the following command:
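One detail worth keeping in mind when testing a JSON endpoint: the Flask test client sends a dict passed as data= as form fields, while a dict passed as json= is serialized as a JSON body, and request.json in the view only parses the latter. The standard-library sketch below shows the two encodings side by side; try_parse_json is a hypothetical helper for this illustration.

```python
import json
from urllib.parse import parse_qs, urlencode

payload = {"query": "What did the author do growing up?"}

# Form encoding: key=value pairs, as a browser form submission would send
form_body = urlencode(payload)

# JSON encoding: what a JSON endpoint expects in the request body
json_body = json.dumps(payload)

def try_parse_json(body):
    """Return the parsed JSON object, or None if the body is not valid JSON."""
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return None

form_as_json = try_parse_json(form_body)   # not JSON, so this is None
json_as_json = try_parse_json(json_body)   # round-trips to the original dict
```

A request that is rejected because of its encoding never reaches answer_query(), so it is worth checking the status code in addition to the response body.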

pytest ./

The test should pass and you should get this kind of output:

===================== test session starts ====================
platform darwin -- Python 3.8.18, pytest-7.4.2, pluggy-1.3.0
rootdir: tutorials/llamaindex_question_answer_circleci
plugins: anyio-4.3.0
collected 4 items

test_app.py ..                                       [ 50%]
test_query.py ..                                     [100%]

===================== 4 passed in 3.73s ====================

Automate testing using CircleCI

Now that you have tested the QA system and the API locally, automate the workflow so that the unit tests can be executed every time you push code to the main branch.

Add the configuration script

Add a .circleci/config.yml file in the project’s root to hold the configuration for the CI pipeline. Add this code snippet to it:

version: 2.1
orbs:
  python: circleci/python@2.1.1

workflows:
  build-app-with-test:
    jobs:
      - build-and-test
jobs:
  build-and-test:
    docker:
      - image: cimg/python:3.9
    steps:
      - checkout
      - python/install-packages:
          pkg-manager: pip
      - python/install-packages:
          pip-dependency-file: requirements.txt
          pkg-manager: pip
      - python/install-packages:
          args: pytest
          pkg-manager: pip
          pypi-cache: false
      - run:
          name: Run tests
          command: pytest

Take a moment to review the CircleCI configuration.

  • The build-and-test job uses the circleci/python@2.1.1 orb to build and test the LlamaIndex script and API.
  • The job checks out the repository, installs pip packages using the requirements.txt file, and runs the tests using pytest.

Now that the configuration file has been properly set up, create a repository for the project on GitHub and push all the code to it. Review Pushing a project to GitHub for instructions.

Setting up the project on CircleCI

Log in to your CircleCI account. On the CircleCI dashboard, click the Projects tab, search for the GitHub repo name and click Set Up Project.

Circle CI set up project

You will be prompted to add a new configuration file manually or use an existing one. Since you have already pushed the required configuration file to the codebase, select the Fastest option. Enter the name of the branch that hosts your configuration file. Click Set Up Project to continue.

Circle CI project configuration

Once you click the Set Up Project button, the pipeline is triggered automatically. It will fail because you have not set the environment variables yet.
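When a required variable is missing, the error in the CI logs can be hard to trace back to its cause. One option, sketched here as an assumption and not as part of the original project, is a small check that fails early with an explicit message. The function name require_openai_key is hypothetical; it could be called at the top of a test module or from a conftest.py.

```python
import os

def require_openai_key(environ=None):
    """Return the API key, or raise a clear error if it is missing or blank."""
    environ = os.environ if environ is None else environ
    key = environ.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Add it as an environment variable "
            "in the CircleCI project settings or export it in your shell."
        )
    return key

# Passing a dict makes the check easy to exercise without touching the real environment
example_key = require_openai_key({"OPENAI_API_KEY": " sk-example "})
```

The pipeline still fails without the key, but the failure message now points directly at the missing configuration.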

Setting up environment variables

On the project page, click Project settings. Go to the Environment variables tab and click Add environment variable. Add an environment variable for the OpenAI API key: set the name to OPENAI_API_KEY and enter the API key in the value field. Once you add the environment variable, it will be listed on the dashboard with its value masked.

Circle CI set up environment variables

Now that the environment variables are configured, trigger the pipeline again. This time the build should succeed.

Circle CI pipeline builds successfully

Conclusion

In this tutorial, you learned how to automatically build and test a LlamaIndex question-answering RAG application using CircleCI. LlamaIndex is a simple, flexible data framework for connecting custom data sources to large language models. The pipeline runs the pytest unit tests for the QA query script and the corresponding Flask API on every push, which helps catch regressions early and boosts development speed.

You can check out the complete source code used in this tutorial on GitHub.


Vivek Kumar Maskara is a Software Engineer at JP Morgan. He loves writing code, developing apps, creating websites, and writing technical blogs about his experiences. His profile and contact information can be found at maskaravivek.com.
