Tuesday, 7 July 2015

Defining a good Unit Test suite

I measure test suites against 7 main criteria. The criteria are pretty hard and fast, and there are key indicators for measuring each of them. There is also an AND relationship between them, so if you can tick 6 out of the 7, the remaining one should still be addressed.

  1. Trust. Tests pass when the component is ok.
  2. Comprehensiveness. The majority of the ways the component is used are covered by the tests.
  3. Correct level of abstraction. Tests should be written to a stable, well-defined interface. Unit tests facilitate refactoring.
  4. Language. Tests should match the language of the problem.
  5. Reliability. Tests fail only when the code is not ok.
  6. Independence. Tests should be independent of other tests, methods and classes, in a pragmatic way, i.e. each test should only use methods that are "well used" in the public domain. This does not include data-driven approaches.
  7. Libraries. Don't test well-used and independently verified libraries.

What defines a useful test suite is:

  1. A developer can be pretty sure, once the unit test suite passes, that no other functional issues will be found. We are happy to release the product after the automated suite passes.
  2. The majority of problems are found at the unit-test level. Our Fault Slip-through Analysis of our bugs indicates that the majority of bugs are found at the right level of test.
  3. The unit tests are a vital tool to help refactoring. I can do multiple run-test, refactor, run-test cycles without making a change to the tests. The unit tests are written towards the "thing" wrapped in an interface, and not just any method or any class (see the first sketch after this list).
  4. They reflect the language of the problem definition, using the terminology the customer used. Ideally, customers should be able to understand the tests.
  5. When a test case fails, it points to an actual problem in the component.
  6. Test cases shouldn't change when we change or extend the system; therefore I can trust them. Test cases are the guarantee that what worked yesterday still works today. If we have common methods and utility classes referenced in our tests that are changed as the system grows, then we cannot depend on our tests (see the second sketch after this list). In other words, if I change my test code, who will test my tests?
  7. Libraries such as the extensive set of libraries in the JDK, and databases like MySQL, Neo4j etc. are published by competent organisations and are heavily re-used in a lot of software settings. You can trust that their functionality works. Don't write tests that extensively exercise database CRUD operations. You can trust that the Collections framework works. You may need to make an exception if you are using pre-release libraries, or if they are libraries from your own organisation that you can't trust (i.e. it's your code that is really testing them).
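To make points 3 and 4 concrete, here is a minimal sketch in Java (JUnit 4). All the names in it (FareCalculator, StandardFareCalculator, the fares themselves) are hypothetical, invented purely for illustration. The tests target one stable interface and speak the customer's language (peak, off-peak, fare), so the implementation behind the interface can be refactored freely without touching them.

    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    public class FareCalculatorTest {

        // The stable, well-defined interface the tests are written against.
        interface FareCalculator {
            int fareInCents(String journeyType);
        }

        // A deliberately trivial implementation; its internals can be
        // rewritten in refactor / run-test cycles without any change to
        // the tests below.
        static class StandardFareCalculator implements FareCalculator {
            public int fareInCents(String journeyType) {
                return "off-peak".equals(journeyType) ? 3000 : 4500;
            }
        }

        private final FareCalculator calculator = new StandardFareCalculator();

        // Test names use the customer's terminology, not the solution's.
        @Test
        public void peakJourneyIsChargedAtFullFare() {
            assertEquals(4500, calculator.fareInCents("peak"));
        }

        @Test
        public void offPeakJourneyGetsTheOffPeakDiscount() {
            assertEquals(3000, calculator.fareInCents("off-peak"));
        }
    }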
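Point 6 can be sketched the same way. Each test below builds exactly the state it needs through the public API, instead of going through a shared createStandardOrder()-style utility that would have to change as the system grows. Order and its methods are again hypothetical names.

    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    public class OrderTest {

        // A minimal, hypothetical domain class for this sketch.
        static class Order {
            private int totalInCents = 0;
            void addItem(int priceInCents) { totalInCents += priceInCents; }
            int totalInCents() { return totalInCents; }
        }

        // No shared fixture helper: each test is independent and uses
        // only the stable public API, so adding a feature never causes
        // a subtle change to ripple through every test.
        @Test
        public void totalIsTheSumOfTheItemPrices() {
            Order order = new Order();
            order.addItem(500);
            order.addItem(250);
            assertEquals(750, order.totalInCents());
        }

        @Test
        public void anEmptyOrderHasATotalOfZero() {
            assertEquals(0, new Order().totalInCents());
        }
    }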

What "smells" to measure that a unit test suite is useless:

  1. Developers don't trust the unit tests to verify the component. This means, more or less, that a developer isn't confident enough to release the component based on the unit tests alone. We require a manual test before we are confident enough to release the product.
  2. The majority of problems are being found in later stages of testing. Our Fault Slip-through Analysis shows large numbers of bugs appearing in later phases of test that could have been found in earlier phases.
  3. The unit tests are written at too low a level and now hinder refactoring. I change the internals of a component and several unit tests no longer compile, never mind that they don't run. Every method of every class has at least one unit test associated with it. Worse still, methods that should be private are made public to enable testing!
  4. They reflect the terminology of the code - we see the language of the solution in the tests. For example, things like factories or other design patterns start appearing in the tests.
  5. Test cases regularly fail at random times during various runs. Failures are "false" because they were caused by some environmental or platform problem. For example, a database service we needed wasn't started, or the disk was full.
  6. All my tests depend on a test utility method I wrote a good while back, and this utility method needs to be changed regularly when we add new features. Most times I add new tests, I have to change the utility method, causing a subtle change in all my tests.
  7. Lots and lots of tests that test POJOs (Plain Old Java Objects), and lots and lots of tests that check whether data entry into the database was successful. Often such tests are written due to inexperience and a need for coverage metrics. It may be fine to have one test that tests the connection to the database and ensures read/write works, but any more than that is overkill. For POJOs I recommend excluding them from code coverage altogether - there's no logic in there, and test cases for the getters/setters are noise (see the sketch after this list).
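As a sketch of the noise that point 7 describes, the test below exercises nothing but a getter/setter pair on a hypothetical Customer POJO. It can add to a coverage figure, but it can never find a real bug.

    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    public class CustomerTest {

        // A hypothetical POJO with no logic at all.
        static class Customer {
            private String name;
            String getName() { return name; }
            void setName(String name) { this.name = name; }
        }

        // Pure coverage noise: this verifies only that a field
        // assignment works, which the language already guarantees.
        @Test
        public void setNameThenGetNameReturnsTheName() {
            Customer customer = new Customer();
            customer.setName("Alice");
            assertEquals("Alice", customer.getName());
        }
    }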


Updated 26th May, 2016

Updated 3rd June, 2016

Updated 23rd August, 2016

Updated 4th September, 2017

Updated 26th October, 2017

Updated 27th October, 2017

Updated 24th February, 2020
