Adibe7

Reputation: 3539

What are the downsides of using random values in unit testing?

I'm talking about a large-scale system with many servers and high-volume, nondeterministic input. By nondeterministic I mean that messages are sent, and you catch what you can and do the best you can with it. There are many types of messages, so the input can be very complicated. I can't imagine writing code for that many scenarios, and a simple non-random (deterministic) message generator is not good enough.

That's why I want a randomized unit test or server test that writes a log in case of a failure.

And I prefer a unit test over a random injector because I want it to run as part of the nightly build's automated tests.

Any downsides?

Upvotes: 34

Views: 9849

Answers (11)

relet

Reputation: 7009

They are random.

(Your test might randomly work, even if your code is broken.)

Upvotes: 13

Nestor Milyaev

Reputation: 6595

An additional downside that hasn't been mentioned yet is that your tests may fail intermittently, especially when you randomly generate several test variables whose values form convoluted and sometimes intractable dependencies. See example here.

Debugging these is a right pain in the backside and is sometimes (next to) impossible.

Also, it's often hard to tell what your test actually tests (and whether it tests anything at all).

Historically, my company has used random tests at multiple levels (unit, integration, single-service tests), and that seemed like a great idea initially: it saves code, space and time by letting you test multiple scenarios in one test.

But increasingly this is becoming a sticking point in our development, as tests (even old ones that were reliable in the past) start failing at random, and fixing them is very labour-intensive.

Upvotes: -1

AbuNassar

Reputation: 1235

Upsides: they reveal when your other tests have not covered all the invariants. Whether you want your CI server to run nondeterministic tests is another issue. Given how incredibly useful I've found https://www.artima.com/shop/scalacheck, I don't intend to do without it from now on. Let's say you're implementing a pattern-matching algorithm. Do you really know all the different corner cases? I don't. Randomized inputs may flush them out.

Upvotes: 1

Evan Grantham-Brown

Reputation: 138

Randomizing unit tests is using a screwdriver to hammer in a nail. The problem is not that screwdrivers are bad; the problem is you're using the wrong tool for the job. The point of unit tests is to provide immediate feedback when you break something, so you can fix it right there.

Let's say you commit a change, which we'll call BadChange. BadChange introduces a bug, which your random tests will sometimes catch and sometimes not. This time, the tests don't catch it. BadChange is given the all-clear and goes into the code base.

Later, someone commits another change, GoodChange. GoodChange is one hundred percent fine. But this time, your random tests catch the bug introduced by BadChange. Now GoodChange is flagged as a problem, and the developer who wrote it will be going in circles trying to figure out why this innocuous change is causing issues.

Randomized testing is useful to constantly probe the whole application for issues, not to validate individual changes. It should live in a separate suite, and runs should not be linked to code changes; even if no one has made a change, the possibility remains that the random tests will stumble across some exotic bug that previous runs missed.
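One way to keep such a suite separate is to gate it behind a switch that only the scheduled job sets. The following is a minimal sketch, assuming Python's `unittest`; the environment variable name `RUN_RANDOM_TESTS` and the test itself are hypothetical and only illustrate the idea.

```python
import os
import random
import unittest


def random_suite_enabled(env=os.environ):
    """Return True only when the (hypothetical) opt-in switch is set."""
    return env.get("RUN_RANDOM_TESTS") == "1"


@unittest.skipUnless(random_suite_enabled(), "set RUN_RANDOM_TESTS=1 to run")
class RandomProbeTests(unittest.TestCase):
    def test_reverse_twice_is_identity(self):
        # Pick a seed and report it on failure so the run can be replayed.
        seed = random.randrange(2**32)
        rng = random.Random(seed)
        data = [rng.randint(0, 99) for _ in range(20)]
        twice = list(reversed(list(reversed(data))))
        self.assertEqual(twice, data, f"seed={seed}")
```

With this arrangement the per-commit run skips the class, while a nightly job that exports the variable executes it independently of any particular change.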

Upvotes: 5

Manu

Reputation: 4137

I believe that generating random input values can be a reliable testing technique when used together with equivalence partitioning. If you partition your input space and then randomly pick values from each equivalence class, you are fine: you get the same coverage (by any measure, including statement, branch, all-uses etc.). This holds under the assumption that your equivalence partitioning procedure is sound. I would also recommend pairing equivalence partitioning and randomly generated inputs with boundary value analysis.

Finally, I would also recommend considering the TYPE of defects you want to detect: some testing techniques address specific types of defects that other techniques detect only rarely (and just by chance). An example: deadlock conditions.

In conclusion, I believe that generating random values is not a bad practice, particularly in some systems (e.g. web applications), but like any other technique it only addresses a subset of existing defects. One should be aware of that and complement the quality assurance process with an adequate set of activities.
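As a rough illustration of combining equivalence partitioning with random picks, here is a minimal Python sketch. The function `fare_category` and its age ranges are hypothetical and stand in for whatever code is under test; each partition is checked at its boundaries and at a few randomly drawn interior values.

```python
import random


def fare_category(age):
    """Hypothetical function under test: classify an age into a fare class."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age < 12:
        return "child"
    if age < 65:
        return "adult"
    return "senior"


# Each equivalence class: (inclusive low, inclusive high, expected result).
PARTITIONS = [
    (0, 11, "child"),
    (12, 64, "adult"),
    (65, 120, "senior"),
]


def check_partitions(rng, samples_per_class=5):
    """Check both boundaries plus random interior values of every class."""
    for low, high, expected in PARTITIONS:
        values = [low, high]
        values += [rng.randint(low, high) for _ in range(samples_per_class)]
        for value in values:
            actual = fare_category(value)
            assert actual == expected, f"{value} -> {actual}, expected {expected}"
    return True
```

Because every value in a class shares one expected result, the assertion stays exact even though the inputs are random.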

Upvotes: 0

Kendrick

Reputation: 3787

The results aren't repeatable, and depending on your tests, you may not know the specific conditions which caused the code to fail (thus making it tough to debug).
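A common mitigation, sketched below in Python under the assumption that all randomness in the test comes from one generator, is to seed that generator explicitly and log the seed when a check fails. The helper names `run_seeded_check` and `check_sorted_is_ordered` are made up for this example.

```python
import logging
import random

log = logging.getLogger("random_tests")


def run_seeded_check(check, seed=None):
    """Run check(rng) with a seeded generator and return (passed, seed)."""
    if seed is None:
        seed = random.randrange(2**32)  # fresh seed for an ordinary run
    rng = random.Random(seed)
    try:
        check(rng)
        return True, seed
    except AssertionError as exc:
        # The seed is all that is needed to replay the same inputs later.
        log.error("check %s failed with seed=%d: %s", check.__name__, seed, exc)
        return False, seed


def check_sorted_is_ordered(rng):
    """Example check: sorted() output must be in non-decreasing order."""
    data = [rng.randint(-1000, 1000) for _ in range(50)]
    result = sorted(data)
    assert all(a <= b for a, b in zip(result, result[1:])), f"input was {data}"
```

Passing the logged seed back in, for example `run_seeded_check(check_sorted_is_ordered, seed=12345)`, regenerates exactly the same inputs, which makes the failure repeatable.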

Upvotes: 2

Mark Simpson

Reputation: 23365

Downsides

Firstly, it makes the test more convoluted and slightly harder to debug, as you cannot directly see all the values being fed in (though there's always the option of generating test cases as either code or data, too). If you're doing some semi-complicated logic to generate your random test data, then there's also the chance that this code has a bug in it. Bugs in test code can be a pain, especially if developers immediately assume the bug is in the production code.

Secondly, it is often impossible to be specific about the expected answer. If you know the answer based on the input, then there's a decent chance you're just aping the logic under test (think about it -- if the input is random, how do you know the expected output?) As a result, you may have to trade very specific asserts (the value should be x) for more general sanity-check asserts (the value should be between y and z).

Thirdly, unless there's a wide range of inputs and outputs, you can often cover the same range using well-chosen values in standard unit tests with less complexity. E.g. pick the numbers -max, (-max + 1), -2, -1, 0, 1, 2, max-1, max (or whatever is interesting for the algorithm).
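The contrast between the second and third points can be sketched as follows. This is a minimal Python illustration, assuming a hypothetical function `absolute` and a 32-bit style `MAX`; the fixed boundary values allow exact expected answers, while the random values only allow general sanity checks.

```python
import random

MAX = 2**31 - 1  # assumed upper bound, chosen only for this example


def absolute(n):
    """Hypothetical function under test."""
    return -n if n < 0 else n


# Well-chosen fixed values: the expected answer is written down explicitly.
BOUNDARY_CASES = [
    (-MAX, MAX), (-MAX + 1, MAX - 1), (-2, 2), (-1, 1), (0, 0),
    (1, 1), (2, 2), (MAX - 1, MAX - 1), (MAX, MAX),
]


def check_boundaries():
    for value, expected in BOUNDARY_CASES:
        assert absolute(value) == expected, f"absolute({value})"
    return True


def check_random_sanity(rng, runs=100):
    """Random inputs: only range and relation checks, no exact answers."""
    for _ in range(runs):
        n = rng.randint(-MAX, MAX)
        result = absolute(n)
        assert 0 <= result <= MAX, f"out of range for {n}"
        assert result in (n, -n), f"unrelated to input {n}"
    return True
```

The random check never restates the logic of `absolute`; it asserts properties that any correct result must have.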

Upsides

When done well with the correct target, these tests can provide a very valuable complementary testing pass. I've seen quite a few bits of code that, when hammered by randomly generated test inputs, buckled due to unforeseen edge cases. I sometimes add an extra integration testing pass that generates a shedload of test cases.

Additional tricks

If one of your random tests fails, isolate the 'interesting' value and promote it into a standalone unit test to ensure that you can fix the bug and it will never regress prior to checkin.
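Promoting a value might look like the sketch below, written with Python's `unittest`. The function `normalize_whitespace` and the pinned input are hypothetical; in practice the input would be copied verbatim from the log of the failing random run.

```python
import unittest


def normalize_whitespace(text):
    """Hypothetical function under test: collapse runs of whitespace."""
    return " ".join(text.split())


class NormalizeWhitespaceRegression(unittest.TestCase):
    def test_tab_and_trailing_newline(self):
        # Pinned input standing in for a value found by a random run.
        # It is now fixed, so this test is fully deterministic.
        self.assertEqual(normalize_whitespace("a\t b \n"), "a b")
```

The random test keeps exploring, while the pinned case guards this specific bug on every check-in.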

Upvotes: 48

Eastern Monk

Reputation: 6645

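You need to remember which random values you generated so that you can use the same values during verification.

Example:

$username = "username" . rand();       // generate once and keep the value
Save_in_DB("user", $username);         // save it in the DB
Verify_if_Saved("user", $username);    // verify with the same value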

Upvotes: 0

Sam Bisbee

Reputation: 4441

As others have suggested, it makes your test unreliable because you don't know what's going on inside of it. That means it might work for some cases, and not for others.

If you already have an idea of the range of values that you want to test, then you should either (1) create a different test for each value in the range, or (2) loop over the set of values and make an assertion on each iteration. A quick, rather silly, example...

for ($i = 0; $i < 10; $i++) {
  $this->assertEquals($i + 1, Math::addOne($i));
}

You could do something similar with character encodings. For example, loop over the ASCII character set and test all of those crazy characters against one of your text manipulation functions.
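A rough Python sketch of the character-set idea follows. It assumes the text manipulation function is an escape/unescape pair, here the standard library's `html.escape` and `html.unescape`, and checks the round trip for every ASCII character deterministically instead of sampling at random.

```python
import html


def check_ascii_round_trip():
    """Escape then unescape each ASCII character and expect it unchanged."""
    failures = []
    for code in range(128):  # the full 7-bit ASCII range
        ch = chr(code)
        if html.unescape(html.escape(ch)) != ch:
            failures.append(code)
    return failures
```

An empty list from `check_ascii_round_trip()` means every character survived; a non-empty one names the exact code points to investigate.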

Upvotes: 1

mpenrow

Reputation: 5683

It is much better to have unit tests that are 100% repeatable and include all the edge cases. For example, test zero, negatives, positives, numbers too big, numbers too small, etc. If you want to include tests with random values in addition to all the edge cases and normal cases that would be fine. However, I'm not sure you would get much benefit out of the time spent. Having all the normal cases and edge cases should handle everything. The rest is "gravy".

Upvotes: 3

fernferret

Reputation: 856

Also, you won't be able to reproduce a test run. A unit test should run exactly the same way every time for the given parameters.

Upvotes: 8
