Test-driven development using mocking and stubbing

In How and where to segregate test environments, I talked about building a structured path to production: which tests to include, when to do them, and why. In this post, we’ll get into exactly how to do each kind of test.

We’ll cover the techniques of mocking and stubbing, and test-driven development to help each testing layer. First, let’s review a concept from the previous post: the test pyramid. This helps illustrate the difference between different kinds of tests and when it’s advantageous to do them.

Unit or component tests (shown here at the bottom of our pyramid) are inexpensive and fast to perform. Rely heavily on these. Only once you’ve exhausted what these tests can do should you move on to more expensive tests (in both time and resources), such as integration tests, and UI layer tests.


In this series, I will cover an assortment of testing tools that should be in every developer’s toolbox, and go over when, why, and how to use them. I’ll cover testing across the layers of the pyramid, as well as the concepts of mocking, stubbing, and contract testing. In the second piece of this series, we’ll get into test-driven development and behavior-driven development (TDD and BDD).

What is mocking and stubbing used for?

A lot of people think that mocking and stubbing are used just for unit and component tests. However, I want to show you how mock objects or stubs can be used in other layers of testing as well.

What is mock testing?

Mocking means creating a fake version of an external or internal service that can stand in for the real one, helping your tests run more quickly and more reliably. When your implementation interacts with an object’s properties, rather than its function or behavior, a mock can be used.

What is stub testing?

Stubbing, like mocking, means creating a stand-in, but a stub fakes only a specific behavior rather than the entire object. This is used when your implementation only interacts with a certain behavior of the object.

A great blog post that covers the difference between mocking and stubbing is Martin Fowler’s “Mocks Aren’t Stubs” (https://martinfowler.com/articles/mocksArentStubs.html).
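To make the distinction concrete, here is a minimal sketch using Python’s unittest.mock. The PaymentGateway class and the receipt function are hypothetical names invented for this illustration. The first helper replaces the whole object with a mock, while the second keeps the real object and stubs only one behavior.

```python
from unittest import mock


class PaymentGateway:
    """Hypothetical dependency, used only for this illustration."""
    currency = "USD"

    def charge(self, amount):
        raise RuntimeError("the real gateway would make a network call")


def receipt(gateway, amount):
    """Hypothetical code under test: charge, then format a receipt line."""
    gateway.charge(amount)
    return f"{amount:.2f} {gateway.currency}"


def with_mock():
    # Mock: the entire object is a stand-in, including its properties.
    gateway = mock.Mock(spec=PaymentGateway)
    gateway.currency = "EUR"
    return receipt(gateway, 5), gateway.charge.call_args


def with_stub():
    # Stub: the real object is kept and only the charge() behavior is faked.
    gateway = PaymentGateway()
    with mock.patch.object(gateway, "charge", return_value=None):
        return receipt(gateway, 5)
```

In the mocked version the test controls the currency property as well as the call; in the stubbed version the real currency value is still used.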

Let’s discuss how we can apply these methods to improve our testing in all levels of the pyramid above.

Using mocking and stubbing in unit + component tests

I recommend mocking or stubbing when your code uses external dependencies like system calls, or accessing a database. For example, whenever you run a test, you’re exercising the implementation. So when a delete or create function happens, you’re letting it create a file, or delete a file. This work is not efficient, and the data it creates and deletes is not actually useful. Furthermore, it’s expensive to clean up, because now you have to manually delete something every time. This is a case where mocking/stubbing can help a lot.

Using mocks and stubs to fake the external functionality helps you create tests that are independent. For instance, say that the test writes a file to /tmp/test_file.txt and then the system under test deletes it. The problem then is not that the test is not independent; it is that the system calls take a lot of time. In this instance, you can stub the file system call’s response, which takes far less time because it returns immediately.

Another benefit is that you can reproduce complex scenarios more easily. For instance, it is much easier to test the many error responses you might get from the filesystem than to actually create the conditions that cause them. Say that you only wanted to delete corrupt files. Writing a corrupt file can be difficult programmatically, but returning the error code associated with a corrupt file is a matter of just changing what a stub returns.
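As a rough sketch of that idea, the example below stubs the file read so that it raises an I/O error instead of preparing a genuinely corrupt file. The delete_if_corrupt helper is hypothetical, and treating errno.EIO as the “corrupt file” signal is an assumption made for the illustration.

```python
import errno
import os
from unittest import mock


def delete_if_corrupt(path):
    """Hypothetical helper: remove the file only when reading it fails with EIO."""
    try:
        with open(path, "rb") as f:
            f.read()
    except OSError as exc:
        if exc.errno == errno.EIO:
            os.remove(path)
            return True
        raise
    return False


def run_corrupt_case():
    # The stubbed open() raises the error a corrupt file might produce.
    io_error = OSError(errno.EIO, "Input/output error")
    with mock.patch("builtins.open", side_effect=io_error), \
            mock.patch("os.remove") as fake_remove:
        deleted = delete_if_corrupt("/fake/corrupt.bin")
    return deleted, fake_remove.call_args


def run_healthy_case():
    # The stubbed open() returns readable content, so nothing is deleted.
    with mock.patch("builtins.open", mock.mock_open(read_data=b"ok")), \
            mock.patch("os.remove") as fake_remove:
        deleted = delete_if_corrupt("/fake/healthy.bin")
    return deleted, fake_remove.called
```

Switching between the two scenarios only requires changing what the stub returns or raises; no file ever touches the disk.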

Mock and stub testing example

def read_and_trim(file_path):
    # open() makes a system call to find the file at file_path; we read its content and strip the trailing newline.
    with open(file_path) as f:
        return f.read().rstrip("\n")

The code above uses Python’s built-in open function, which makes a system call to look for the file at the given path. This means that wherever and whenever you run the test for that function:

  1. You will need to ensure that the file that the test will be looking for exists; when it does not exist, the test fails.
  2. The test will need to wait for the system call’s response; if the system call times out, the test fails.

Neither case of failure means your implementation failed to do its job. These tests are now neither isolated (since they’re dependent on the system call’s response) nor efficient (since the system call connection will take time to deliver the request and response).

The test code for the implementation above looks like this:

import unittest
from unittest.mock import mock_open, patch

class TestReadAndTrim(unittest.TestCase):
    @patch("builtins.open", new_callable=mock_open, read_data="fake file content\n")
    def test_read_and_trim_content(self, mock_object):
        self.assertEqual(read_and_trim("/fake/file/path"), "fake file content")

We are using a Python mock patch to mock the built-in open call. In this way, we are only testing what we actually built.

Another good example of using mocks and stubs in unit testing is faking database calls. For example, let’s say you are testing whether your function deletes an entity from a database. For the first test, you manually create an entity so that there’s one to be deleted. The test passes. But the second time, someone else runs the test without knowing that they have to create the entity first. Now the test fails, because there was nothing to delete. A test that relies on manually prepared data like this is not independent.

In cases like these, you’ll want to avoid modifying real data or making operating system calls at all. Faking those calls keeps tests from becoming flaky whenever someone forgets to create test data.
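One way this might look in practice is sketched below. The delete_entity function and the db object with get and delete methods are hypothetical; they stand in for whatever data-access layer your code actually uses.

```python
from unittest import mock


def delete_entity(db, entity_id):
    """Hypothetical function under test: delete an entity if it exists."""
    if db.get(entity_id) is None:
        return False
    db.delete(entity_id)
    return True


def test_delete_existing_entity():
    # The mock pretends the entity exists, so nobody has to create it first.
    db = mock.Mock()
    db.get.return_value = {"id": 42}
    assert delete_entity(db, 42) is True
    db.delete.assert_called_once_with(42)


def test_delete_missing_entity():
    # The same mock can just as easily pretend the entity is missing.
    db = mock.Mock()
    db.get.return_value = None
    assert delete_entity(db, 42) is False
    db.delete.assert_not_called()
```

Because the database is faked, both tests pass no matter who runs them or in what order.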

Mocking and stubbing of internal functions

Mocks and stubs are very handy for unit tests. They help you to test a functionality or implementation independently, while also allowing unit tests to remain efficient and cheap, as we discussed in our previous post.

A great application of mocks and stubs in a unit/component test is when your implementation interacts with another method or class. You can mock the class object or stub the method behavior that your implementation is interacting with. Mocking or stubbing the other functionality or class, and therefore only testing your implementation logic, is the key benefit of unit tests, and the way to reap the biggest benefit from performing them.
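Here is a small sketch of stubbing an internal collaborator. TaxTable and Invoice are hypothetical classes made up for the example; the point is that only the logic in Invoice.total is exercised.

```python
from unittest import mock


class TaxTable:
    """Hypothetical internal class that the implementation depends on."""

    def rate_for(self, region):
        raise RuntimeError("the real lookup should not run in a unit test")


class Invoice:
    """Hypothetical implementation under test."""

    def __init__(self, tax_table):
        self.tax_table = tax_table

    def total(self, net, region):
        return round(net * (1 + self.tax_table.rate_for(region)), 2)


def test_total_uses_stubbed_rate():
    # Stub only the collaborator's method; Invoice itself runs for real.
    with mock.patch.object(TaxTable, "rate_for", return_value=0.2) as stub:
        invoice = Invoice(TaxTable())
        assert invoice.total(100.0, "EU") == 120.0
        stub.assert_called_once_with("EU")
```

If TaxTable.rate_for later changes its signature or meaning, this stub has to change with it, which is the maintenance cost described in the note that follows.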

Note: Your tests should grow with your code. Since the unit test is focused more on implementation details than the overall functionality of the feature, it’s the test that will change the most over time. It follows that when you are using a lot of mocked data in your testing, your mocking has to evolve the same way that your code evolves. Otherwise, it can potentially lead to unexpected bugs in the system. Tests aren’t something you write once and expect to always work. As you change your code and refactor, it’s your responsibility to maintain and evolve your tests to match.

Mocking in integration testing

With integration tests, you are testing relationships between services. One approach might be to get all the dependent services up and running for the testing environment. But this is unnecessary. It can create a lot of potential failure points from services you do not control, adding time and complexity to your testing. I recommend narrowing it down by writing a few service integration tests using mocks and stubs. I’ll show you how this makes your test suite more reliable.

In integration testing, the rules are different from unit tests. Here, you should only test the implementation and functionality that you have the control to edit. Mocks and stubs can be used for this purpose. First, identify which integrations are important. Then, you can decide which external or internal services can be mocked.

Let’s say your code interacts with the GitHub API, like in the example below. Since you personally can’t change how the GitHub API responds to your request, you don’t have to test it. Mocking the GitHub API’s expected response lets you focus more on testing the interactions within your internal code base.

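# mocked_git is injected by a mock.patch decorator on the GitHub client (not shown here).
def test_parsed_content_from_git(self, mocked_git):
    # Canned file content standing in for what the GitHub API call would return
    expected_decoded_content = "# Sample Hello World\n\n> How to run this app\n\n- installation\n\n dependencies\n"
    mocked_git.get_repo.return_value = expected_decoded_content

    parsed_content = read_parse_from_content(repo='my/repo')  # remaining arguments omitted

    self.assertEqual(parsed_content['titles'], ['Sample Hello World'])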

In the test code above, the read_parse_from_content method is integrated with the class that parses the JSON object from the GitHub API call. In this test, we are testing the integration in between two classes.

Since we are using a mock in the test above, the test is faster and more isolated because it avoids the call to the GitHub API. It also saves time and effort, since the environment that runs the test does not need internet access. However, to test reliably while mocking external services, it’s extremely important to understand how those dependencies behave in the real world. For example, if the expected_decoded_content in the code example above is not how GitHub returns the repo file content, incorrect assumptions baked into the mocked test can lead to unexpected breakage.

Before writing a test with a mocked response, it’s best to take a snapshot of the actual external dependency call and use it as the mocked response. Once you have created the mocked response from the snapshot, it should not change often, since the API should almost always be backward compatible. However, it is important to validate the API regularly for the occasional unexpected change.
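The snapshot approach could be sketched roughly as follows. The SNAPSHOT payload, the client object with its get_file method, and the extract_titles function are all hypothetical. In a real project the snapshot would be recorded once from an actual API call and stored in a fixture file; here it is built inline, with base64-encoded content as an assumption about the recorded shape.

```python
import base64
import json
from unittest import mock

# Hypothetical snapshot; in practice, load this from a recorded fixture file.
SNAPSHOT = json.dumps({
    "name": "README.md",
    "encoding": "base64",
    "content": base64.b64encode(
        b"# Sample Hello World\n\n> How to run this app\n"
    ).decode("ascii"),
})


def extract_titles(client, repo, path):
    """Hypothetical code under test: fetch a file and return its top-level titles."""
    payload = client.get_file(repo, path)
    text = base64.b64decode(payload["content"]).decode("utf-8")
    return [line[2:].strip() for line in text.splitlines() if line.startswith("# ")]


def test_titles_from_snapshot():
    # The mock replays the recorded response instead of calling the network.
    client = mock.Mock()
    client.get_file.return_value = json.loads(SNAPSHOT)
    assert extract_titles(client, "my/repo", "README.md") == ["Sample Hello World"]
    client.get_file.assert_called_once_with("my/repo", "README.md")
```

Re-recording the snapshot on a schedule is one way to catch the occasional unexpected API change.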



Mocks and stubs in contract-based testing (in a microservices architecture)

When two different services integrate with each other, they each have “expectations,” i.e. standards about what they’re giving and what they expect to get in return. We can think of these as contracts between integrated endpoints. Because of this standardization, contract tests can be used to test integrations.

Let’s walk through an example. As I mentioned, the version-tagged API should not change often, possibly not ever. For any API you choose, you will generally be able to find documentation about that API, and what to expect from it. And when you decide to use a certain version of an API, you can rely on the return of that API call. This is the presumed contract between the engineers who provide the API and the engineers who will use its data.

You can use the idea of contracts to test internal services as well. When testing a large-scale application built on a microservices architecture, it can be costly to stand up the entire system and infrastructure. Such applications can benefit greatly from using contract testing. In the testing pyramid, contract testing sits in between the unit/component testing and integration testing layers, depending on the coverage of the contract testing in your system. Some organizations utilize contract testing to completely replace end-to-end or functional testing.

Contract-based testing can cover two important things:

  1. Checking the connectivity of the endpoint that has been agreed upon
  2. Checking the response from the endpoint with a given argument

As an example, let’s imagine a weather-reporting application involving a weather service interacting with a user service. When the user service connects to the endpoint of the weather service with a date (the request), the weather service processes that date and returns the weather for it. These two services have a contract: the weather service will keep the endpoint accessible to the user service at all times and will provide the valid data that the user service is requesting, always in the same format.

Now, let’s take a look at how we can utilize mocks and stubs in the contract test. Instead of the user service making the actual request call to the weather service in the test, you can create a mocked response. Since there is a contract between two services, the endpoint and response should not change. This will free both services from depending on each other during tests, allowing tests to be faster and more reliable.
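A minimal sketch of this idea is shown below. The contract dictionary, the /weather endpoint, the field names, and the describe_weather function are all assumptions invented for the illustration, not part of any real service.

```python
from unittest import mock

# Hypothetical contract agreed between the user service and the weather service.
WEATHER_CONTRACT = {
    "endpoint": "/weather",
    "response_fields": {"date": str, "temperature_c": float, "summary": str},
}


def conforms(response, contract=WEATHER_CONTRACT):
    """Check that a response has exactly the agreed fields with the agreed types."""
    fields = contract["response_fields"]
    return set(response) == set(fields) and all(
        isinstance(response[name], kind) for name, kind in fields.items()
    )


def describe_weather(date, weather_client):
    """Hypothetical user-service code: ask the weather service about one date."""
    data = weather_client.get(WEATHER_CONTRACT["endpoint"], params={"date": date})
    return f"{data['date']}: {data['summary']}, {data['temperature_c']:.1f} C"


def test_user_service_against_contract():
    canned = {"date": "2024-05-01", "temperature_c": 18.5, "summary": "cloudy"}
    assert conforms(canned)  # the mocked response itself honors the contract
    client = mock.Mock()
    client.get.return_value = canned
    assert describe_weather("2024-05-01", client) == "2024-05-01: cloudy, 18.5 C"
    client.get.assert_called_once_with("/weather", params={"date": "2024-05-01"})
```

The weather team can run the same conforms check against its real responses, so both sides verify the same agreement without depending on each other at test time.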

In the last post, we talked about running different tests in different environments and how sometimes, it can be useful to run the same test in a different environment with a different configuration. Contract tests are one of the great examples of the latter case. We can achieve different goals when running contract tests in different environments with different configurations. When it’s a lower layer environment such as Dev or CI, running the test with a mocked contract would serve the purpose of testing our internal implementation within the constraints of the environment. However, when it goes to an upper layer environment such as QA or Staging, the same test can be used without a mocked contract but with the actual external dependency connection. Mbtest is one tool that can help with the kind of contract testing and mocking response I explained above.
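One simple way to run the same contract check with different configurations is to choose the client based on the environment. The sketch below assumes a TEST_ENV environment variable and a caller-supplied factory for the real client; both are hypothetical names for this illustration and are unrelated to Mbtest’s own API.

```python
import os
from unittest import mock

# Hypothetical canned response that matches the agreed contract.
CANNED_WEATHER = {"date": "2024-05-01", "summary": "cloudy"}


def make_weather_client(real_client_factory):
    """Return a mocked client in lower environments and the real one otherwise."""
    env = os.environ.get("TEST_ENV", "dev").lower()  # TEST_ENV is an assumed name
    if env in {"dev", "ci"}:
        client = mock.Mock()
        client.get.return_value = CANNED_WEATHER
        return client
    return real_client_factory()


def check_contract(client):
    """The same assertion logic runs in every environment."""
    data = client.get("/weather", params={"date": "2024-05-01"})
    return {"date", "summary"} <= set(data)
```

In Dev or CI the check runs against the mocked contract; in QA or Staging the factory supplies a client that talks to the actual dependency.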


We’ve taken a look at examples of different layers of testing using mocks and stubs. Now let’s recap why they are useful:

  1. Tests with mocks and stubs go faster because you don’t have to connect with external services. There’s no delay waiting for them to respond.
  2. You have the flexibility to scope the test to cover just the parts you can control and change. With external services, you are powerless in the case that they’re wrong or the test fails. Mocking ensures you are scoping tests to work that you can do – and not giving yourself problems you can’t fix.
  3. Mocking external API calls helps your tests to be more reliable.
  4. Contract testing empowers service teams to be more autonomous in development.

In Test-driven development and behavior-driven development, we’ll explore the principles of test-driven development (TDD) and behavior-driven development (BDD), and see how they can improve outcomes for everything from functional testing to unit testing.
