Preferable way of making code testable: Dependency injection vs encapsulation

Question

I often find myself wondering what is the best practice for these problems. An example:

I have a java program which should get the air temperature from a weather web service. I encapsulate this in a class which creates a HttpClient and does a Get REST request to the weather service. Writing a unit test for the class requires to stub the HttpClient so that dummy data can be received in stead. There are som options how to implement this:

Dependency Injection in constructor. This breaks encapsulation. If we switch to a SOAP web service in stead, then a SoapConnection has to be injected instead of HttpClient.

Creating a setter only for the purpose of testing. The "normal" HttpClient is constructed by default, but it is also possible to change the HttpClient by using the setter.

Reflection. Having the HttpClient as a private field set by the constructor (but not taking it by parameter), and then let the test use reflection to change it into a stubbed one.

Package private. Lower the field restriction to make it accessible in test.

When trying to read about best practices on the subject it seems to me that the general consensus is that dependency injection is the preferred way, but I think the downside of breaking encapsulation is not given enough thought.

What to you think is the preferred way to make a class testable?

Jamey · Accepted Answer

I believe the best way is through dependency injection, but not quite the way you describe. Instead of injecting an HttpClient directly, instead inject a WeatherStatusService (or some equivalent name). I would make this a simple interface with one method (in your use case) getWeatherStatus(). Then you can implement this interface with an HttpClientWeatherStatusService, and inject this at runtime. To unit test the core class, you have a choice of stubbing the interface yourself by implementing the WeatherStatusService with your own unit testing requirements, or using a mocking framework to mock the getWeatherStatus method. The main advantages of this way are that:

You don't break encapsulation (because changing to a SOAP implementation involves creating a SOAPWeatherStatusService and deleting the HttpClient handler).
You have broken your initial single class down, and now have two classes with a distinct purpose, one class explicitly handles retrieving the data from an API, the other class handles the core logic. This will probably be a flow like: Receive weather status request (from higher up) -> request data retrieval from api -> process/validate the returned data -> (optionally) store data or trigger other processes to operate on the data -> return the data.
You can re-use the WeatherStatusService implementation easily if a different use case emerges to utilise this data. (For example, perhaps you have one use case to store the weather conditions every 4 hours (to show the user an interactive map of the days' developments), and another use case to get the current weather. In this case, you need two different core logic requirements which both need to use the same API, so it makes sense to have the API access code consistent between these approaches).

This method is known as hexagonal/onion architecture which I recommend reading about here:

Or this post which sums the core ideas up:

http://blog.8thlight.com/uncle-bob/2012/08/13/the-clean-architecture.html

EDIT:

Further to your comments:

What about testing the HttpClientWeatherStatus? Ignore unit testing or else we have to find a way to mock HttpClient there?

With the HttpClientWeatherStatus class. It should ideally be immutable, so the HttpClient dependency is injected into the constructor on creation. This makes unit testing easy because you can mock HttpClient and prevent any interaction with the outside world. For example:

public class HttpClientWeatherStatusService implements WeatherStatusService {
    private final HttpClient httpClient;

    public HttpClientWeatherStatusService(HttpClient httpClient) {
        this.httpClient = httpClient;
    }

    public WeatherStatus getWeatherStatus(String location) {
        //Setup request.
        //Make request with the injected httpClient.
        //Parse response.
        return new WeatherStatus(temperature, humidity, weatherType);
    }
}

Where the returned WeatherStatus 'Event' is:

public class WeatherStatus {
    private final float temperature;
    private final float humidity;
    private final String weatherType;
    //Constructor and getters.
}

Then the tests look something like this:

public WeatherStatusServiceTests {
    @Test
    public void givenALocation_WhenAWeatherStatusRequestIsMade_ThenTheCorrectStatusForThatLocationIsReturned() {
        //SETUP TEST.
        //Create httpClient mock.
        String location = "The World";
        //Create expected response.
        //Expect request containing location, return response.
        WeatherStatusService service = new HttpClientWeatherStatusService(httpClient);
        //Replay mock.

        //RUN TEST.
        WeatherStatus status = service.getWeatherStatus(location);

        //VERIFY TEST.
        //Assert status contains correctly parsed response.
    }
}

You will generally find that there will be very few conditionals and loops in the integration layers (because these constructs represent logic, and all logic should be in the core). Because of this (specifically because there will only be a single conditional branching path in the calling code), some people would argue that there is little point unit testing this class, and that it can be covered by an integration test just as easily, and in a less brittle way. I understand this viewpoint, and don't have a problem with skipping unit tests in the integration layers, but personally I would unit test it anyway. This is because I believe unit tests in an integration domain still help me ensure that my class is highly usable, and portable/re-usable (if it's easy to test, then it's easy to use from elsewhere in the codebase). I also use unit tests as documentation detailing the use of the class, with the advantage that any CI server will alert me when the documentation is out of date.

Isn't it bloating the code for a small problem which could have been "fixed" by just some lines using reflection or simply changing to package private field access?

The fact that you put "fixed" in quotes speaks volumes about how valid you think such a solution would be. ;) I agree that there is definitely some bloat to the code, and this can be disconcerting at first. But the real point is to make a maintainable codebase which is easy to develop for. I think some projects start fast because they "fix" problems by using hacks and dodgy coding practices to maintain the pace. Often productivity grinds to a halt as the overwhelming technical debt renders changes which should be one liners into mammoth re-factors which take weeks or even months.

Once you have a project set up in a hexagonal way, the real payoffs come when you need to do one of the following:

Change the technology stack of one of your integration layers. (e.g. from mysql to postgres). In this case (as touched on above), you simply implement a new persistence layer making sure you use all the relevant interfaces from the binding/event/adapter layer. There should be no need to change core code or the interface. Finally delete the old layer, and inject the new layer in place.
Add a new feature. Often integration layers will already exist, and may not even need modification to be used. In the example of the getCurrentWeather() and store4HourlyWeather() use-cases above. Let's assume you've already implemented the store4HourlyWeather() functionality using the class outlined above. To create this new functionality (let's assume the process begins with a restful request), you need to make three new files. You need a new class in your web layer to handle the initial request, you need a new class in your core layer to represent the user story of getCurrentWeather(), and you need an interface in your binding/event/adaptor layer which the core class implements, and the web class has injected to its constructor. Now on the one hand, yes, you've created 3 files when it would have been possible to create only one file, or even just tack it onto an existing restful web handler. Of course you could, and in this simple example that would work fine. It is only over time that the distinction between layers become obvious and refactors become hard. Consider in the case where you tack it onto an existing class, that class no longer has an obvious single purpose. What will you call it? How will anyone know to look in it for this code? How complicated is your test set-up becoming so that you can test this class now that there are more dependencies to mock?
Update integration layer changes. Following on from the example above, if the weather service API (where you are getting your information from) changes, there is only one place where you need to make changes in your program to be compatible with the new API again. This is the only place in the code which knows where the data actually comes from, so it's the only place which needs changing.
Introduce the project to a new team member. Arguable point, since any well laid out project will be fairly easy to understand, but my experience so far has been that most code looks simple and understandable. It achieves one thing, and it's very good at achieving that one thing. Understanding where to look (for example) for Amazon-S3 related code is obvious because there is an entire layer devoted to interacting with it, and this layer will have no code in it relating to other integration concerns.
Fix bugs. Linked to the above, often reproducibility is the biggest step towards a fix. The advantage of all the integration layers being immutable, independent, and accepting clear parameters, is that it is easy to isolate a single failing layer and modify the parameters until it fails. (Although again, well designed code will do this well too).

I hope I've answered your questions, let me know if you have more. :) Perhaps I will look into creating a sample hexagonal project over the weekend and linking to it here to demonstrate my point more clearly.

Preferable way of making code testable: Dependency injection vs encapsulation

Answers (2)

Related Questions