Reputation: 67223
When testing systems (any system, really, e.g. a database), is it important that the test data is in the same form (format) as the live data?
To what degree do you allow differences in the two types of data?
Thanks
Upvotes: 2
Views: 1189
Reputation: 755
I try to use both: test data that hits specific cases I have designed (often modified from live data), and a significant volume of live data whenever it is available. The live data covers a large number of scenarios that could affect customers and may include scenarios I haven't thought of.
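As a rough illustration of "modified from live data", here is a minimal Python sketch that takes one representative record and derives variants that each hit a single designed case. The function name `make_edge_case_variants`, the field names, and the choice of cases (missing field, empty value) are all assumptions made for the example, not anything from a particular system.

```python
import copy


def make_edge_case_variants(record):
    """Return named variants of a record, each altered to hit one specific case.

    Hypothetical helper: the two cases covered here (a missing field and an
    empty value) are only examples of the kind of case you might design.
    """
    variants = {}
    for field in record:
        # Case 1: the field is absent entirely.
        missing = copy.deepcopy(record)
        del missing[field]
        variants[f"missing_{field}"] = missing

        # Case 2: the field is present but empty.
        empty = copy.deepcopy(record)
        empty[field] = ""
        variants[f"empty_{field}"] = empty
    return variants


# Stand-in for an (anonymised) record sampled from live data.
sample = {"name": "Ada", "email": "ada@example.com"}
variants = make_edge_case_variants(sample)
```

The original record is copied rather than changed in place, so the same sample can also go through the system unmodified as part of the live-like volume run.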
Keep in mind precisely what you are testing at any given moment. If you are only testing that the data acceptance service picks up files, and it is supposed to pick up any file and reject bad formats later, then you don't care much about what is inside the files, and you will need at least some test files in other formats. In that case, changing the extension on a Notepad text file may be enough for functional testing, along with some large generated files to test file-size handling, etc.
Using test data that doesn't match the real format can be especially useful if the format is still being worked out while the devs start work on the other parts of the system. However, at some point you will want to run live or similar-to-live data through every part of your system for integration and end-to-end testing.
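The file-pickup case above could be set up with something like the following Python sketch. It writes the same placeholder text under several extensions and adds one large file whose contents don't matter. The function name `make_test_files`, the extensions, the file names, and the default size are assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path


def make_test_files(directory, extensions=(".csv", ".xml", ".bin"),
                    large_size_bytes=5 * 1024 * 1024):
    """Create placeholder files for testing file pickup (hypothetical helper).

    Each small file holds the same text; only the extension differs.
    The large file is created by seeking to the target size and writing
    a single byte, so its reported size is exactly large_size_bytes.
    """
    directory = Path(directory)
    created = []
    for ext in extensions:
        path = directory / f"sample{ext}"
        path.write_text("placeholder content\n")
        created.append(path)

    large = directory / "large.dat"
    with open(large, "wb") as f:
        f.seek(large_size_bytes - 1)
        f.write(b"\0")
    created.append(large)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in make_test_files(tmp, large_size_bytes=1024):
            print(path.name, path.stat().st_size)
```

Files like these only exercise the pickup step. They say nothing about how the later format validation behaves on realistic content.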
Upvotes: 1
Reputation: 165
I think it's more complex than some people have made out and I would generally have the following test environments
Upvotes: 0
Reputation: 116169
I disagree with MusiGenesis, unless you are testing your ability to read from the data source.
If you are just testing how the system performs with certain data, then you can just use mocking to remove all connectivity to external data sources. However, if you need to test things like handling failures in connections and dropping connections, then you will probably want to try to connect to the same type of data source.
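A minimal sketch of the mocking approach, using Python's standard `unittest.mock`. The function `average_order_value`, the `fetch_orders` method, and the `total` field are hypothetical names invented for this example; the point is only that the logic under test receives canned data and never touches a real connection.

```python
from unittest.mock import Mock


def average_order_value(source):
    """Hypothetical logic under test: average the 'total' of fetched rows."""
    rows = source.fetch_orders()
    if not rows:
        return 0.0
    return sum(row["total"] for row in rows) / len(rows)


# Replace the external data source with a mock that returns fixed rows.
source = Mock()
source.fetch_orders.return_value = [{"total": 10.0}, {"total": 30.0}]
result = average_order_value(source)

# An empty result set is another data condition worth checking.
empty_source = Mock()
empty_source.fetch_orders.return_value = []
empty_result = average_order_value(empty_source)
```

This covers how the code behaves with certain data. It does not cover failed or dropped connections, which is where connecting to the same type of data source becomes worthwhile.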
Upvotes: 0
Reputation: 51052
Barring specific reasons to use fake data, I think it's important to get as close as you can to the live data when testing. Otherwise you will definitely miss issues.
Specific reasons you might use fake data:
Upvotes: 2
Reputation: 75276
Put it this way: the more different your test data is from your live data, the less valuable the testing actually is. So yes, your test data should be as close as possible to your live data.
Upvotes: 5