Unit Testing with Large Collections - How do you handle these situations best?

Question

I have a situation where I need to unit test some scenarios that require some pre-initialization of very large collections, but the pre-initialized data needs to be hard-coded for the unit tests to work.

Are there typical practices for this kind of thing? Or do most of you just declare a clunky variable inside each unit test and use that? I'm curious how other developers tackle this situation.

Adam Houldsworth · Accepted Answer

If the data is used in lots of different tests, we use a static container to hold it (assuming the data doesn't change). Tests can then just reference it when needed.
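
As a rough illustration of the static-container idea, here is a minimal Python sketch (the answer itself is about a .NET code base, so treat this as a translation of the idea rather than the author's code). The names `SharedTestData`, `_build_customers` and `count_active` are hypothetical, and the generated records stand in for whatever hard-coded data the tests really need.

```python
import unittest
from types import MappingProxyType


def _build_customers(count):
    # Hypothetical stand-in for a large hard-coded data set.
    # A tuple of read-only mappings, so tests cannot modify the shared data.
    return tuple(
        MappingProxyType({"id": i, "name": f"customer-{i}", "active": i % 2 == 0})
        for i in range(count)
    )


class SharedTestData:
    # Built once at import time and referenced by any test that needs it.
    CUSTOMERS = _build_customers(10_000)


def count_active(customers):
    """Hypothetical code under test."""
    return sum(1 for c in customers if c["active"])


class CountActiveTests(unittest.TestCase):
    def test_counts_active_customers(self):
        self.assertEqual(count_active(SharedTestData.CUSTOMERS), 5_000)

    def test_shared_data_rejects_mutation(self):
        # Assigning through a mappingproxy raises TypeError.
        with self.assertRaises(TypeError):
            SharedTestData.CUSTOMERS[0]["name"] = "changed"
```

Making the container read-only is a design choice in this sketch: it enforces the "assuming the data doesn't change" condition instead of relying on convention.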

If the data is specific to a fixture, then it's just made a part of the fixture in order to keep the scope narrow.
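
For fixture-scoped data, something along these lines would keep the collection local to one group of tests. This is a sketch using Python's `unittest`; `ParcelTotalsTests` and `total_weight` are made-up names for illustration.

```python
import unittest


def total_weight(parcels):
    """Hypothetical code under test."""
    return sum(p["weight_kg"] for p in parcels)


class ParcelTotalsTests(unittest.TestCase):
    def setUp(self):
        # Rebuilt before every test in this fixture, so a test that mutates
        # the list cannot affect the next one.
        self.parcels = [{"id": i, "weight_kg": 2} for i in range(1_000)]

    def test_total_weight(self):
        self.assertEqual(total_weight(self.parcels), 2_000)

    def test_total_after_removing_one(self):
        self.parcels.pop()
        self.assertEqual(total_weight(self.parcels), 1_998)
```

If building the data is expensive and the tests only read it, `setUpClass` could hold it once per fixture instead, at the cost of sharing state between the tests in that class.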

For other parts of data, we can use mocking / stubbing techniques to expose test data. A lot of our data comes through our DAL interfaces, even the static stuff, so for that we have stubbed a test implementation of the interface that provides the static data through the normal interface methods we use. Lots of our tests are built on using this stub.
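
The stubbed-DAL approach might look roughly like the following. This is only a sketch under assumptions: `OrderRepository`, `StubOrderRepository`, `InvoiceService` and the order records are all hypothetical, and the real interfaces in the answer's code base are not shown in the post.

```python
from abc import ABC, abstractmethod


class OrderRepository(ABC):
    """Hypothetical DAL interface that production code depends on."""

    @abstractmethod
    def get_orders_for(self, customer_id):
        ...


class StubOrderRepository(OrderRepository):
    """Test implementation that serves canned data through the normal interface."""

    def __init__(self, orders):
        self._orders = tuple(orders)

    def get_orders_for(self, customer_id):
        return [o for o in self._orders if o["customer_id"] == customer_id]


class InvoiceService:
    """Hypothetical code under test; its DAL is injected via the constructor."""

    def __init__(self, repository):
        self._repository = repository

    def total_due(self, customer_id):
        orders = self._repository.get_orders_for(customer_id)
        return sum(o["amount"] for o in orders)


# Canned test data handed to the stub.
CANNED_ORDERS = [
    {"customer_id": 1, "amount": 40},
    {"customer_id": 1, "amount": 60},
    {"customer_id": 2, "amount": 15},
]

service = InvoiceService(StubOrderRepository(CANNED_ORDERS))
```

Because the service only sees the interface, the same stub can back many tests, and the test data lives in one place.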

We use this in conjunction with SpecFlow. We can define Background: tables that are fed into the DAL stub, and the stub is then injected wherever the code under test uses the DAL interfaces to talk to data. For large amounts of static data, we simply hard-code it or code-gen it into an area where the DAL stub can get it on request.

Of course, this isn't necessarily how you should do it. This is just how I've seen it handled.

"but the pre-initialized data needs to be hard-coded for the unit tests to work"

In my opinion, there is nothing wrong with tests requiring set data in order to prove output. We have a mix of true unit tests, where external dependencies are separated out and we test just the method in question, and, with SpecFlow, "use case" tests that cover a broader scope. Those still need defined input, however.

One important thing to keep under control is that unit tests should be as isolated from each other as possible. Fixtures allow you to expand the scope to a small collection of tests, but if you find yourself with lots of potentially mutable backing data being used across lots of tests, you need to take a step back.

We recently had this with a static list of configuration actions that weren't immutable. Making a change affected tests run after the change. We identified this and rectified it, but it wasn't trivial.
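
One possible way to guard against that kind of leak, sketched here in Python with hypothetical names (`_CONFIG_ACTIONS`, `fresh_config_actions`), is to keep the shared data private and hand each test its own deep copy. This is not necessarily how the problem above was fixed, only one option.

```python
import copy

# Hypothetical shared configuration actions, kept private to this module.
_CONFIG_ACTIONS = (
    {"name": "enable-cache", "order": 1},
    {"name": "load-plugins", "order": 2},
)


def fresh_config_actions():
    """Return an independent deep copy that the caller may mutate freely."""
    return copy.deepcopy(list(_CONFIG_ACTIONS))


def test_can_append_action():
    actions = fresh_config_actions()
    actions.append({"name": "warm-up", "order": 3})
    assert len(actions) == 3


def test_sees_original_actions():
    # Passes whether or not test_can_append_action ran first.
    names = [a["name"] for a in fresh_config_actions()]
    assert names == ["enable-cache", "load-plugins"]
```

Copying costs time on very large collections, so for read-only use an immutable container may be the cheaper choice.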
