Archive

Posts Tagged ‘code coverage’
  • Why are you testing your software?



15 years ago, automated tests didn’t exist in the Java ecosystem. One had to build the application and painfully test it manually by using it. I was later introduced to the practice of adding a main method to every class and putting some testing code there. That was only marginally better, as it still required running those methods manually. Then came https://junit.org[JUnit^], the reference unit testing framework in Java, which brought test execution automation. At that point, I had to convince the teams I was part of that we had to use it to create an automated test harness to prevent regression bugs. Later, this became an expectation: no tests meant no changes in the code, for fear of breaking something.

More recently, however, it has happened that I sometimes have to advocate for the opposite: not writing too many tests. Yes, you read that right, and yet I’m no turncoat. The reason for this lies in the title of the post: why are you testing your software? It may sound like the answer to this question is pretty obvious - but it’s not, and the answer is tightly coupled to the concept of quality.

[quote, Wikipedia]
____
In the context of software engineering, software quality refers to two related but distinct notions that exist wherever quality is defined in a business context:

• Software functional quality reflects how well it complies with or conforms to a given design, based on functional requirements or specifications. That attribute can also be described as the fitness for purpose of a piece of software or how it compares to competitors in the marketplace as a worthwhile product. It is the degree to which the correct software was produced.
• Software structural quality refers to how it meets non-functional requirements that support the delivery of the functional requirements, such as robustness or maintainability. It is the degree to which the software was produced correctly.
____

Before answering the question of the why, let’s consider some of the reasons why not that I’ve already been confronted with:

    • Because everyone does it
    • Because the boss/the lead/colleagues/authority figures say so
    • To achieve link:{% post_url 2011-04-18-100-code-coverage %}[100% of code coverage^]
    • To achieve more code coverage than another team/colleague
    • And so on, and so forth

    All in all, those “reasons” boil down to either plain cargo culting or mistaking a metric for the goal. Which brings us back to the question, why do you test software?

[IMPORTANT]
====
The only valid reason for testing is that, over the course of time, the resources spent making sure the software conforms to its functional and non-functional requirements will be less than the resources spent if no testing is done.
====

That’s pure and simple Return On Investment. If the ROI is positive, do test; if it’s negative, don’t. It’s as simple as that.
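As an illustration, the reasoning could look like this - with deliberately made-up numbers, since every context yields different ones:

----
cost of testing:      5 man-days to write and maintain the test harness
cost of not testing:  4 expected regressions x 2 man-days to fix each = 8 man-days
ROI:                  8 - 5 = 3 man-days saved over the app's lifespan -> do test
----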

[quote]
____
Perfect is the enemy of good
____

    The real difficulty lies in estimating the cost of testing vs the cost of not testing. The following is a non-exhaustive list of ROI-influencing parameters:

Business::
Some industries, e.g. medical devices, avionics or banking, cannot allow any bugs without serious consequences to the business, while bugs are less critical in others, such as mobile gaming.
Estimated lifespan of the app::
The longer the lifespan of an app, the better the ROI, because the same amount of testing code will pay off more times, e.g. nearly no tests for one-shot single-event apps vs. more traditional testing for business apps running for a decade or so.
Nature of the app::
Some technologies are more mature than others, allowing for easier automated testing. The testing ecosystem around webapps is richer than the one around native or mobile apps.
Architecture of the app::
The more distributed the app, the harder it is to test. In particular, the migration from monoliths to microservices has some interesting side-effects on the testing side. It’s easier to test each component separately, but harder to test the whole system. Also, testing specific scenarios in clustered/distributed environments, such as node failure in a cluster, increases the overall cost.
Nature and number of infrastructure dependencies::
The higher the number of dependencies, the more test doubles are required to test the app in isolation, which in turn drives up testing costs. Also, some dependencies are more widespread, e.g. databases and web services, with many available tools, while others, e.g. FTP servers, are not.
Size of the app::
Of course, the bigger the app, the bigger the number of possible combinations that need to be tested.
Maturity of the developers, and the size of the team(s)::
Obviously, developers range from those who don’t care about testing to those who integrate testing requirements into their code from the start. Also, just as for developers, adding more testers is subject to the https://en.wikipedia.org/wiki/Diminishing_returns[law of diminishing returns^].
Nature of the tests::
I don’t want to start a war; suffice to say there are many kinds of tests - unit, integration, end-to-end, performance, penetration, etc. Each one is good at one specific thing and has its pros and cons. Get to know them and use them wisely.
Strength of the type system::
Developing in dynamically-typed languages requires more tests to take over the job the compiler does in statically-typed languages.

While it’s good to listen to others’ advice - including well-established authority figures and this post - it’s up to every delivery team to draw the line between not enough testing and too much testing according to its own context.

Categories: Java Tags: code quality, testing, code coverage
Travis CI tutorial for Java projects



As a consultant, I’ve done a lot of Java projects in different “enterprise” environments. In general, the Continuous Integration stack - when there is one - comprises:

    • Github Enterprise or Atlassian Stash for source version control
Jenkins as the CI server, sometimes but rarely Atlassian Bamboo
    • Maven for the build tool
    • JaCoCo for code coverage
    • Artifactory as the artifacts repository - once I had Nexus

Recently, I started to develop https://github.com/nfrankel/kaadin[Kaadin^], a Kotlin DSL to design Vaadin applications. I naturally hosted it on Github, and wanted the same features as for my real-life projects above. This post describes how to achieve all the desired features using a whole new stack, one that might not be familiar to enterprise Java developers.

Github was a perfect match. Then I went on to search for a Jenkins cloud provider to run my builds… to no avail. This wasn’t such a surprise, as I had already searched for one last year, for a course on Continuous Integration, without any success. I definitely could have installed my own instance on any IaaS platform, but it always pays to be idiomatic. There are plenty of Java projects hosted on Github. They mostly use https://travis-ci.org[Travis CI^], so I went down this road.

    == Travis CI registration

    Registering is as easy as signing in using Github credentials, accepting all conditions and setting the project(s) that need to be built.

image::travis-projects-choice.png[Travis CI projects choice,668,538,align="center"]

    == Basics

The project configuration is read from a .travis.yml file at its root. Historically, the platform was aimed at Ruby, but nowadays different kinds of projects in different languages can be built. The most important part of the configuration is to define the language. Since mine is a Java project, the second most important is which JDK to use:

[source,yaml]
----
language: java
jdk: oraclejdk8
----

    From that point on, every push on the Github repository will trigger the build - including branches.

    == Compilation

If a POM exists at the root of the project, it’s automatically detected, and Maven (or the Maven wrapper) will be used as the build tool in that case. It’s a good idea to use the wrapper, for it pins the Maven version. Barring that, there’s no hint of the version used.
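If the project doesn’t include the wrapper yet, it can be generated with the Takari plugin - a sketch, the Maven version being just an example:

[source,bash]
----
# Generates mvnw, mvnw.cmd and the .mvn folder at the project's root
mvn -N io.takari:maven:wrapper -Dmaven=3.5.0
----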

    As written above, Travis CI was originally designed for Ruby projects. In Ruby, dependencies are installed system-wide, not per project as for Maven. Hence, the lifecycle is composed of:

. an install dependencies phase, translated by default into ./mvnw install -DskipTests=true -Dmaven.javadoc.skip=true -B -V
. a build phase, explicitly defined by each project

    This two-phase mapping is not relevant for Maven projects as Maven uses a lazy approach: if dependencies are not in the local repository, they will be downloaded when required. For a more idiomatic management, the first phase can be bypassed, while the second one just needs to be set:

[source,yaml]
----
install: true
script: ./mvnw clean install
----

    == Improve build speed

    In order to speed up future builds, it’s a good idea to keep the Maven local repository between different runs, as it would be the case on Jenkins or a local machine. The following configuration achieves just that:

[source,yaml]
----
cache:
  directories:
  - $HOME/.m2
----

    == Tests and failing the build

As in standard Maven builds, phases are run sequentially, so calling install will first run compile and then test. A single failing test, and the build fails. A report is sent by email when that happens.

image::travis-build-failed.png[A build failed in Travis CI,369,321,align="center"]

Most Open Source projects display their build status badge on their homepage to build trust with their users. This badge is provided by Travis CI; it just needs to be hot-linked, as seen below:

image::https://travis-ci.org/nfrankel/kaadin.svg?branch=master[Kaadin build status,align="center"]
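For projects whose README is written in Markdown instead of Asciidoctor, the equivalent hot-link would look like the following - a sketch using this repository’s URL:

[source,markdown]
----
[![Build Status](https://travis-ci.org/nfrankel/kaadin.svg?branch=master)](https://travis-ci.org/nfrankel/kaadin)
----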

    == Code coverage

Travis CI doesn’t use the JaCoCo reports generated during the Maven build. One has to use another tool. There are several available: I chose https://codecov.io/[https://codecov.io/^], for no other reason than that Mockito also uses it. The drill is the same: register using Github, accept the conditions and it’s a go.

Codecov happily uses JaCoCo coverage reports, but chooses to display only line coverage - the link:{% post_url 2014-10-05-your-code-coverage-metric-is-not-meaningful %}[least meaningful metric^] IMHO. Yet, it’s widespread enough, so let’s display it to users via a nice badge that can be hot-linked:

image::https://codecov.io/gh/nfrankel/kaadin/branch/master/graph/badge.svg[Codecov,align="center"]

    The build configuration file needs to be updated:

[source,yaml]
----
script:
  - ./mvnw clean install
  - bash <(curl -s https://codecov.io/bash)
----

    On every build, Travis CI calls the online Codecov shell script, which will somehow update the code coverage value based on the JaCoCo report generated during the previous build command.

    == Deployment to Bintray

    Companies generally deploy built artifacts on an internal repository. Open Source projects should be deployed on public repositories for users to download - this means https://bintray.com/nfrankel[Bintray^] and JCenter. Fortunately, Travis CI provides a lot of different remotes to deploy to, including Bintray.

Usually, only a dedicated branch, e.g. release, should deploy to the remote. That parameter is available in the configuration file:

[source,yaml]
----
deploy:
  - on:
      branch: release
    provider: bintray
    skip_cleanup: true
    file: target/bin/bintray.json
    user: nfrankel
    key: $BINTRAY_API_KEY
----

Note the above $BINTRAY_API_KEY variable. Travis CI offers environment variables to allow for some flexibility. Each of them can be defined so as not to be displayed in the logs. Beware that in that case, it’s treated as a secret and cannot be displayed again in the user interface.

image::travis-environment-variables.png[Travis CI environment variables,409,194,align="center"]

    For Bintray, it means getting hold of the API key on Bintray, creating a variable with a relevant name and setting its value. Of course, other variables can be created, as many as required.

Most of the deployment configuration is delegated to a dedicated Bintray-specific JSON descriptor file. Among other information, it contains both the artifactId and version values from the Maven POM. Maven filtering is configured to automatically read them from the POM and inject them into the descriptor: notice that the path referenced above is under the generated target folder.
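For reference, such a descriptor could look like the following - a minimal sketch, where the repo and subject values are assumptions, and the ${...} placeholders are the ones Maven filtering replaces with POM values:

[source,json]
----
{
  "package": {
    "name": "${project.artifactId}",
    "repo": "maven",
    "subject": "nfrankel"
  },
  "version": {
    "name": "${project.version}"
  },
  "files": [
    { "includePattern": "target/(.*\\.jar)", "uploadPattern": "$1" }
  ],
  "publish": true
}
----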

    == Conclusion

    Open Source development on Github requires a different approach than for in-house enterprise development. In particular, the standard build pipeline is based on a completely different stack. Using this stack is quite time-consuming because of different defaults and all the documentation that needs to be read, but it’s far from impossible.

  • Your code coverage metric is not meaningful


    Last week, I had a heated but interesting https://twitter.com/nicolas_frankel/status/514766152883777536[Twitter debate^] about Code Coverage with my long-time friend (and sometimes squash partner) Freddy Mallet.

    The essence of my point is the following: the Code Coverage metric that most quality-conscious software engineers cherish doesn’t guarantee anything. Thus, achieving 80% (or 100%) Code Coverage and bragging about it is just as useful as blowing in the wind. For sure, it’s quite hard to have a fact-based debate over Twitter, as 140 chars put a hard limit on any argument. This article is an attempt at writing down my arguments in a limitless space.

    The uselessness of raw Code Coverage can be proved quite easily. Let’s have a simple example with the following to-be-tested class:

[source,java]
----
public class PassFilter {

    private int limit;

    public PassFilter(int limit) {
        this.limit = limit;
    }

    public boolean filter(int i) {
        return i < limit;
    }
}
----

This class is quite straightforward; there’s no need for comments. A possible test would be the following:

[source,java]
----
public class PassFilterTest {

    private PassFilter passFilterFive;

    @BeforeMethod
    protected void setUp() {
        passFilterFive = new PassFilter(5);
    }

    @Test
    public void should_pass_when_filtering_one() {
        boolean result = passFilterFive.filter(1);
    }

    @Test
    public void should_not_pass_when_filtering_ten() {
        boolean result = passFilterFive.filter(10);
    }
}
----

This test class will happily return 100% line coverage as well as 100% branch coverage: executing the tests will go through all the code’s lines and on both sides of its single branch. Isn’t life sweet? Too bad there are no assertions; they could have been “forgotten” on purpose by a contractor who couldn’t achieve the previously agreed-on code coverage metric. Let’s give the contractor the benefit of the doubt, assume programmers are of good faith - and add assertions:

[source,java]
----
public class PassFilterTest {

    private PassFilter passFilterFive;

    @BeforeMethod
    protected void setUp() {
        passFilterFive = new PassFilter(5);
    }

    @Test
    public void should_pass_when_filtering_one() {
        boolean result = passFilterFive.filter(1);
        Assert.assertTrue(result);
    }

    @Test
    public void should_not_pass_when_filtering_ten() {
        boolean result = passFilterFive.filter(10);
        Assert.assertFalse(result);
    }
}
----

Still 100% line coverage and 100% branch coverage - and this time, “real” assertions! But it still is of no use… It is a well-known fact that developers tend to test for passing cases. In this case, the two tests use parameters of 1 and 10, while the potential bug sits at the exact threshold of 5 (should the filter let this value pass or not?).

In conclusion, raw Code Coverage only guarantees the maximum possible Code Coverage. If it’s 0%, of course, it will be 0%; however, with 80%, it gives you nothing… just the assurance that, at most, your code is covered at 80%. But it can also be anything in between: 60%, 40%… or even 0%. What good is a metric that only hints at the maximum? In fact, in this light, Code Coverage behaves like a http://en.wikipedia.org/wiki/Bloom_filter[Bloom Filter^]: a low value is conclusive, a high value proves nothing. IMHO, the only way to guarantee that test cases and the associated test coverage are really meaningful is to use http://en.wikipedia.org/wiki/Mutation_testing[Mutation Testing^]. For a basic introduction to Mutation Testing, please check my http://vimeo.com/105758362[Introduction to Mutation Testing^] talk at JavaZone (10 minutes).
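To make this concrete, here is what a typical mutant of the PassFilter class above looks like - a hand-written sketch of a conditional-boundary mutation, not actual tool output:

[source,java]
----
public class PassFilter {

    private int limit;

    public PassFilter(int limit) {
        this.limit = limit;
    }

    public boolean filter(int i) {
        // Original code: return i < limit;
        // Mutant: the conditional boundary is flipped from < to <=.
        // Both existing tests (inputs 1 and 10) still pass against this
        // mutant, so it survives; only an input of exactly 5 can kill it.
        return i <= limit;
    }
}
----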

The good thing is that Mutation Testing is not some kind of academic paper only known by nerdy scientists; there are tools in Java to start using it right now. Configuring http://pitest.org/[PIT^] with the previous test will yield the following result:

image::pit-report.png[PIT report,558,802,align="center"]

    This report pinpoints the remaining mutant to be killed and its associated line (line 12), and it’s easy to add the missing test case:

[source,java]
----
@Test
public void should_not_pass_when_filtering_five() {
    boolean result = passFilterFive.filter(5);
    Assert.assertFalse(result);
}
----
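For those who want to try it at home, PIT integrates with Maven. A minimal invocation could look like the following - assuming the pitest-maven plugin can locate classes and tests, which may require setting its targetClasses and targetTests parameters in the POM:

[source,bash]
----
# Runs PIT's mutation analysis; the HTML report lands under target/pit-reports
mvn org.pitest:pitest-maven:mutationCoverage
----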

Now that we’ve demonstrated without doubt the value of Mutation Testing, what do we do? Some might try a couple of arguments against Mutation Testing. Let’s review each of them:

• Mutation Testing takes a long time to execute. For sure, the combination of all possible mutations takes much longer than standard Unit Testing - which has to stay under 10 minutes. However, this is not a problem, as Code Coverage is not a metric that has to be checked at each build: a nightly build is more than enough.
• Mutation Testing takes a long time to analyze results. Also right… but what about the time needed to analyze Code Coverage results which, as seen before, only hint at the maximum possible Code Coverage?

    On one hand, the raw Code Coverage metric is only relevant when too low - and requires further analysis when high. On the other hand, Mutation Testing lets you have confidence in the Code Coverage metric at no additional cost. The rest is up to you…

Categories: Development Tags: code coverage, quality
  • 100% code coverage!

The basis of this article is a sequence of tweets between Mr. Robert Martin and me, on April 8th, 2011:

    1. If you have 100% coverage you don’t know if your system works, but you do know that every line you wrote does what you thought it should.
    2. @unclebobmartin 100% code coverage doesn’t achieve anything, save making you safer while nothing could be further from the truth.
    3. @nicolas_frankel 100% code coverage isn’t an achievement, it’s a minimum requirement. If you write a line of code, you’d better test it.
@unclebobmartin I can get 100% code coverage and test nothing because I have no assert. Stop making coverage a goal, it’s only a way!

This left me a little speechless, even more so coming from someone as respected as Mr. Martin. Since the size of my arguments is well beyond the 140-character limit, a full-fledged article is in order.

    Code coverage doesn’t test anything

Let’s begin with the base assertion of Mr. Martin, namely that 100% code coverage ensures that every code line behaves as intended. Nothing could be further from the truth!

    Code coverage is simply a way to trace where the flow passed during the execution of a test. Since coverage is achieved through instrumentation, code coverage could also be measured during the execution of standard applications.

Testing, however, is executing some code by providing input and verifying the output is the desired one. In Java applications, this is done at the unit level with the help of frameworks like JUnit and TestNG. Both provide assertion methods to check outputs.

Thus, code coverage and testing are completely different things. Taken to the extreme, we could code-cover the entire application, that is, achieve 100%, and yet not test a single line because we have no assertions!

    100% coverage means test everything

Not all of our code is critical or complex. In fact, some of it can even be seen as downright trivial. Any doubt? Think about getters and setters then. Should they be tested, even though any IDE worth its salt can generate them faithfully for us? If your answer is yes, you should also consider testing the frameworks you use, because they are a bigger source of bugs than generated getters and setters.
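As an illustration of how trivial such tests are, consider the following sketch - a hypothetical Person bean with a generated getter/setter pair is assumed:

[source,java]
----
public class PersonTest {

    @Test
    public void should_get_what_was_set() {
        Person person = new Person();
        person.setName("John Doe");
        // This covers both generated methods, but only re-checks what the
        // IDE generated: a field assignment and a field read. The coverage
        // percentage goes up; the likelihood of bugs doesn't go down.
        Assert.assertEquals(person.getName(), "John Doe");
    }
}
----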

    Confusion between the goal and the way to achieve it

The second assertion of Mr. Martin is that 100% code coverage is a requirement. I beg to differ, but I use the word ‘requirement’ with a different meaning. Business analysts translate business needs into functional requirements, whereas architects translate them into non-functional requirements.

Code coverage and unit testing are just ways to decrease the likelihood of bugs; they are not requirements. In fact, users couldn’t care less about code coverage, as long as the software meets their needs. Only IT, and only in the case of long-lived applications, may have an interest in high code coverage, and even then not 100%.

    Cost effectiveness

I understand that in certain contexts, software cannot fail: in surgery or aeronautics, lives are at stake, whereas in financial applications, a single failure can cost millions or even billions.

However, what my meager experience has taught me so far is that money reigns, whether we want it to or not. The equation is very simple: what’s the cost of the test, what’s the cost of the bug, and what’s its likelihood? If the cost of the bug is a human life and the likelihood is high, the application had better be tested like hell. On the contrary, if the cost of the bug is half a man-day and the likelihood low, why should we spend 1 man-day to prevent it in advance? The technical debt point of view helps us answer this question. Moreover, it’s a decision managers have to make, with the help of IT.

Point is, achieving 100% testing (and not 100% coverage) is overkill for most software.

    For the sake of it

    Last but not least, I’ve seen in my line of work some interesting deviant behaviours.

The first is quality for quality’s sake. Me being a technical person and all, I must admit I fell for it: “Wow, if I just test this, I can gain 1% code coverage!” Was it necessary? More importantly, did it increase maintainability or decrease the likelihood of bugs? In all cases, you should ask yourself these questions. If both answers are no, forget about it.

A slightly different case is the challenge between teams. Whereas challenges are good in that they create healthy competition and can make everyone better, the object of the challenge should bring some added value, something the raw code coverage percentage doesn’t.

Finally, I’ve already ranted about it, but some egos just do things to put on their CV, and 100% coverage is one of them. This is bad, and if you have any influence over such individuals, you should strive to orient them toward more project-oriented goals (such as 100% passing acceptance tests).

    Conclusion

With the above arguments, I proved beyond doubt that assertions like ‘we must achieve 100% code coverage’ or ‘it’s a requirement’ cannot be taken as general rules and are utter nonsense without some proper context.

As for myself, before beginning a project, one of the things on my todo list is to agree on a certain level of code coverage with all concerned team members (mainly QA, PM and developers). Then, I take great care to explain that this metric is not a hard-and-fast one, and I list getters/setters and assertion-less tests as ways of artificially increasing it. The more critical and the more complex the code, the higher its coverage should be. If offered a choice, I prefer not reaching the agreed-upon number and yet having both the critical and the complex code firmly tested.

    For teams to embrace quality is a worthy goal. It’s even more attainable if we set SMART objectives: 100% code coverage is only Measurable, which is why it’s better to forget it, the sooner the better, and focus on more business-oriented targets.

Categories: Development Tags: code coverage, test