Unit testing on implementation-specific database usage

Question

I'm trying to create unit tests for a function that uses database queries in its implementation. My understanding of unit testing is that you shouldn't be using outside resources such as databases for unit testing, and you should just create mock objects essentially hard coding the results of the queries.

However, in this case, the queries are implementation specific, and if the implementation would change, so would the queries. My understanding is also that unit testing is very useful because it essentially allows you to change the implementation of your code whenever you want while being sure it still works.

In this case, would it be better to create a database for testing purposes, or to make the testing tailored to this specific implementation and change the test code if we ever change the implementation?

JL Peyret · Accepted Answer

Well, to start with, I think this is very much something that depends on the application context, the QA/dev's skill set & preferences. So, what I think is right may not be right for others.

Having said that...

In my case, I have a system where an extremely complex ERP database, which I dont control, is very much in the driver's seat and my code is a viewer/observer, rather than a driver of that database. I don't, and can't really, use an ORM layer much, all my added value is in queries that deeply understand the underlying database data model. Note also that I am mostly a viewer of that db, in fact my code has read-only access to the primary db. It does have write access to its own tagging database which uses the Django ORM and testing there is different in nature because of my reliance on the ORM.

For me, it had better be tested with the database.

Mock objects? Please, mocking would have guzzled time if there is a lot of legitimate reasons to view/modify database contents with complex queries.

Changing queries. In my case, changing and tweaking those queries, which are the core of my application logic, is very often needed. So I need to make fully sure that they perform as intended against real data.

Multi-platform concerns. I started coding on postgresql, tweaked my connectivity libraries to support Oracle as well. Ran the unit tests and fixed anything that popped up as an error. Would a database abstraction have identified things like the LIMIT clause handling in Oracle?

Versioning. Again, I am not the master of the database. So, as versions change, I need to hook up my code to it. The unit testing is invaluable, but that's because it hits the raw db.

Test robustness. One lesson I learned along the way is to uncouple the test from the test db. Say you want to test a function that flags active customers that have not ordered anything in a year. My initial test approach involved manual lookups in the test database, find CUST701 to be a match to the condition. Then call my function and test if CUST701 is the result set of customers needing review. Wrong approach. What you want to do is to write, in your test, a query that finds active customers that have not ordered anything in a year. No hardcoded CUST701s at all, but your test query query can be as hardcoded as you want - in fact, it should look as little as your application queries as possible - you don't want your test sql to replicate what could potentially be a bug in your production code. Once you have dynamically identified a target customer meeting the criteria, then call your code under test and see if the results are as expected. Make sure your coverage tools identify when you've been missing test scenarios and plug those holes in the test db.

BDD. To a large extent, I am starting to approach testing from a BDD perspective, rather than a low-level TDD. So, I will be calling the url that handles the inactive customer lists, not testing individual functions. If the overall result is OK and I have enough coverage, I am OK, without wondering about the detailed low-level to and fro. So factor this as well in qualifying my answer.

Coders have always had test databases. To me, it seems logical to leverage them for BDD/unit-testing, rather than pretending they don't exist. But I am at heart a SQL coder that knows Python very well, not a Python expert who happens to dabble in SQL.

Unit testing on implementation-specific database usage

Answers (2)

Related Questions