Writing a blog post every week requires a good source of inspiration. Fortunately, Twitter is there.
This time, it was the following tweet that sparked the flame.
I haven't been following Twitter's tech blog lately, but a fascinating post on testing. https://t.co/wiej11QoM1— Cindy Sridharan (@copyconstruct) January 9, 2018
- feature tests works *much* better than class tests (unit tests)
- advantages of feature tests over unit tests
- refactoring code shouldn't involve refactoring tests pic.twitter.com/6JunHDJksl
While I understand Twitter is not the place for subtle and complex thoughts, I believe such an approach does more harm than good. Given that I have devoted a significant portion of my time to thinking about, designing, and writing tests of all kinds, I believe my perspective can bring something to the table.
While you’re at it, have a look at my book Integration Testing from the Trenches, dedicated to, guess what, integration testing.
By definition, a tweet is not meant to provide detailed arguments supporting one’s point of view. So, let’s examine the statements made in the original post.
As in the post, I find the term "unit testing" ambiguous because of the scope of the unit. Likewise, the term "integration testing" can refer to different things in the minds of different people.
In the rest of this post, I’ll follow the terminology proposed in the article.
Agreeing on a vocabulary is very important:
"A safety net for refactoring"
Properly designed feature tests provide comprehensive code coverage and don’t need to be rewritten because they only use public APIs. Attempting to refactor a system that only has single-class tests is often painful, because developers usually have to completely refactor the test suite at the same time, invalidating the safety net. This incentivizes hacking, creating tech debt
The first part about testing public APIs, I completely agree with.
The second part is more nuanced: "Attempting to refactor a system that only has single-class tests is often painful."
This makes perfect sense. If the only tests you have are single-class tests, refactoring is going to break those. And in that case, you’re left without any clue as to whether the refactoring introduced regressions.
"Test end-to-end behavior"
With only single-class tests, the test suite may pass but the feature may be broken, if a failure occurs in the interface between modules. feature tests will verify end-to-end feature behavior and catch these bugs.
There’s an important condition there: "With only single-class tests".
Yes, a testing harness consisting only of single-class tests cannot ensure the whole system works as intended.
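The classic failure mode is a unit mismatch at a module boundary. Here is a hypothetical sketch (sensor and threshold values invented for illustration) where each module’s single-class tests pass, yet the assembled feature is broken:

```python
# Module A was unit-tested to return temperatures in Celsius.
# Module B was unit-tested (with stubbed inputs) assuming Fahrenheit.
# Both test suites are green; the bug lives in the interface between them.

def read_sensor() -> float:
    """Module A: returns the temperature in Celsius (80 °C is dangerously hot)."""
    return 80.0

def is_overheating(temp_fahrenheit: float) -> bool:
    """Module B: expects Fahrenheit, flags anything above 95 °F (= 35 °C)."""
    return temp_fahrenheit > 95.0

# Each module's single-class test passes in isolation:
assert read_sensor() == 80.0                # A: correct Celsius reading
assert is_overheating(100.0) is True        # B: correct for a Fahrenheit input

# Wired together, 80 °C is silently interpreted as 80 °F and the alarm
# never fires. Only an end-to-end check over both modules exposes this:
assert is_overheating(read_sensor()) is False  # passes, yet the feature is broken
```

No amount of additional single-class testing catches this; the defect is in the contract between the modules, which is exactly what a feature test covers.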
When I give presentations about testing, I usually use the following comparison. Consider the making of a car: single-class testing is akin to testing each nut and bolt separately. Imagine that testing those components brought no issues to light. It would still be very risky to mass-produce the car without having built a prototype and taken it for a test drive.
"Write fewer tests"
A feature test typically covers a larger volume of your system than a single class test
Again, there’s nothing to disagree with.
Still, I find the statement a bit strange, as it’s not a real advantage: while you write fewer feature tests to cover the same scope as single-class tests, each of those tests is bigger.
That size is a problem, because it makes it harder to pinpoint the root cause when a test fails. The bigger the scope of a test, the harder the analysis; see the next section for more details.
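A small sketch of why scope hurts diagnosis. The three-stage pipeline below (`parse`, `double`, `render` are invented names) contains one bug; compare what each test granularity tells you about it:

```python
# Hypothetical three-stage pipeline with a bug in the middle stage.

def parse(raw: str) -> list[int]:
    return [int(x) for x in raw.split(",")]

def double(values: list[int]) -> list[int]:
    return [v * v for v in values]   # bug: squares instead of doubling

def render(values: list[int]) -> str:
    return "-".join(str(v) for v in values)

# A feature test only reports that the end result is wrong ("1-4-9" instead
# of "2-4-6"); any of the three stages could be the culprit.
feature_output = render(double(parse("1,2,3")))
assert feature_output == "1-4-9"

# A single-class test of the middle stage points straight at double():
unit_output = double([1, 2, 3])
assert unit_output == [1, 4, 9]
```

Both granularities detect the bug; the single-class test simply names the faulty stage for free, while the feature test leaves that analysis to the developer.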
“My feature test broke, and it’s way harder to debug than single class tests.”
Think about how much time you will spend if a customer reports a new bug. First, you spend time trying to reproduce the bug. Next, you have to debug and fix the problem. Finally, you have to write single class tests. In general, reproducing the bug alone will cost more time than you spent debugging a feature test. In fact, if you have trouble debugging a properly-designed feature test, it is actually the same as debugging your whole server and we assert you have bigger issues if you have trouble debugging your server.
At this point, I start to disagree.
First, I don’t think the production bug could have been captured by a feature test: had that been the case, the bug would have been caught before deployment instead of making it into the production system.
Then, there’s one bold statement: "if you have trouble debugging a properly-designed feature test". This looks like the No True Scotsman logical fallacy: if I have trouble debugging a feature test, the author can always answer that it was not properly designed. Yet, the author offers no advice on how to design a feature test properly.
I could continue, but I think I’ve made my point. While at the beginning the author takes care not to oppose single-class testing and feature testing, the post then tries to prove that the latter is superior to the former. On the contrary, I think they complement each other.
Keeping the car analogy above: would anyone assemble the prototype car and take it for a test drive without having tested each nut and bolt separately? Probably not, because if the car crashes during the test drive:
- It will take time and effort to understand the root cause
- If the root cause is one faulty nut or bolt, it could have been caught much earlier with less time and effort
And guess what, this translates directly into one of the disadvantages of feature tests:
- Hard to debug
Of course, feature tests have advantages over single-class tests. But single-class tests have advantages over feature tests, too.
I think people discard the Testing Pyramid too easily.
If you want to know more about testing in general and integration testing specifically, don’t forget to check Integration Testing from the Trenches.