Sri Sankaran argues that research and experience prove that a standardized, effective code review process mitigates costs and produces better code.
Reviewing code and getting it right
Need to be sure your program really runs right? Oh sure, testing's a part of it, but so are code reviews. Sri Sankaran argues that research and experience prove that a standardized, effective code review process mitigates costs and produces better code.
Problems in software are easier to fix the earlier they are detected. In describing the software constructive cost model (COCOMO), Boehm (in Software Engineering Economics ) contends that the cost of remediation goes up nearly three-fold from coding to testing and a further ten-fold if the problem is not detected before release. Each layer of software built on top of unratified code aggravates the problem by infesting other parts of the application.
A code review is an excellent checkpoint that can help flush out erroneous assumptions and gaps in reasoning. It helps minimize the impact of a problem by means of early detection. While testing is an excellent way to improve the quality, testing alone is not enough. For one, as shown by McConnell in Code Complete , it is statistically impossible to completely test a non-trivial software project. Testing must be bolstered by up-front code reviews.
Conducting a code review, however, can also be a costly proposition, due to the involvement (interruption) of several individuals and their loss of productivity. With the context switch and the cost to the project as a whole, code reviews can add a sizable lengthening of the project timeline. The purist may point to the benefits of identifying, diagnosing, and fixing problems early in the project lifecycle.
How does one reconcile these polarizing effects of a code review? The following sections show how you can stay on the high horse and yet deliver on time. They synthesize the essence of lessons learned from experience and best practices such as those evangelized by the Fagan review process (PDF, 7MB). Use these lessons as guidelines while developing in a more agile and cost-effective manner.
Purpose of Code Reviews
Adherence to coding standards Writing to standards has several benefits, such as a shared vocabulary, which makes the code easier to reuse and maintain. One of the primary focuses of a code review is to ensure abidance with these norms. As you'll see later, this benefit of a review is best derived in an automated manner. Such an approach has the dual benefit of a machine's thoroughness while circumventing the cost of a manual review.
Addressing the requirements One of the key goals of a code review is to ensure that the code under scrutiny is feature-complete. Beautiful code that fails to meet user requirements is useless. The review must ensure that the logic does what it is called upon to do.
Code correctness Ensure that the authors are following proper programming techniques as appropriate:
- Object orientation: Be on the lookout for monolithic, jack-of-all-trades methods.
- Code reuse: Promote the use of tried and tested (and debugged) APIs and avoid code duplication by recommending the use of available APIs.
- Adequate and proper documentation: Ensure that the code is amply commented. This is especially important when the logic is complex. Such inline documentation must explain what is being done and not how it is being done.
- Defensive coding practices: See any number of online resources.
- Maintainability: The ability to maintain the code is assessed from documentation and code organization.
Errors and omissions A well-written code fragment that abides by all the departmental norms, satisfies requirements, and otherwise "stays within the lines" could still be in error. It is the task of the reviewer to make sure that the code does not make incorrect business/usage assumptions and that all possible usage scenarios are taken into account. A business process expert helping with the review will be quick to spot logic that presupposes the state of the application while the logic is executed. For example, an application that allows guest users cannot be guaranteed to know the current user's identity.
Test for coverage. Test coverage is a measure of the quality of testing. Tools like Maven and CruiseControl simplify the process of testing and auditing coverage reports. The reviewer must understand the significance of branch and path coverage.
Silo-busting pedagogical aid This tool exposes others in the team to the code. The reviewer gets to read, analyze, and discuss someone else's work. Such cross-pollination is an excellent way to raise the skill of the entire group. This approach has the added tangential benefit of busting silos of expertise that projects tend to unwittingly foster.
How can a project have an effective review and oversight process, thereby assuring quality and on-time delivery? How can interruptions be minimized so that the developer/reviewer isn't constantly switching between coding and reviewing? Such task switches not only add to the timeline, they have a serious deleterious effect on the quality of the work. There are different facets to the solution. I will demonstrate that you can make a significant impact to the code base and yet work within the real-world project delivery constraints.
Managing what is reviewed You must accept that you cannot realistically manually review every last line of code that goes into a project. So what can you omit? Are there aspects of code that are, by nature, robust and whose veracity is implicit? The next section shows that you can make such a distinction and focus on what must be reviewed.
Using tools An automated tool is an ideal partner in your quest for code quality. A good tool functions while the developer is working, either directly during the writing of the code or during the build process. This means that the developer is not interrupted, and a formal manual review does not have to be conducted. However, tools aren't a panacea for this problem. A tool cannot validate business requirements. It cannot ensure against incorrect assumptions. It can only perform minimal oversight against insecure coding practices. Nevertheless, it is a major ally in the review process. It greatly simplifies what must be manually examined. Automated reviews are discussed in a later section.
Pair programming Proponents of extreme programming (XP) and other agile project development techniques have long sung the virtues of pair programming. In the current context, it has the benefit of being an on-the-spot code review. It is as if you are at a formal review process while you are coding. The added benefit is that it doesn't incur the additional cost that would otherwise be the case for a manual (scheduled) review.
What to Review
The purpose of a code review is to ensure a high level of code quality. The ideal way to ensure this is by verifying every line and every statement that has been written. This, however, is unrealistic given the premium on delivery faced by most projects. This statement should not be mistaken as indicating that you do not have time for quality assurance. It points out the need to be more judicious with your use of the time to deliver the highest quality product.
Focus on the programmer Java development tools have matured to the point where several routine/repetitive tasks can be safely delegated to tools. Examples include IDE getter/setter generators, JavaBeans, and XDoclet-generated code. I'll make a simplifying assumption that such code is accurate and does not need to be further validated by a code review. However, you must be aware of possible manual updates to machine-generated code. For example, a developer may tweak the Hibernate mapping file generated by
middlegen . It is the responsibility of the developer to highlight such exceptions, so that the reviewer is aware of the need to include the affected file during the review. The reviewer will assign the appropriate weight to such code during the examination process.
Risk-driven The higher the risk of using a given software component, the better its oversight should be. One way of quantifying risk is to use the formula from Madiera's Experimental software risk assessment [PDF 364KB]:
Risk = probability of bug x probability of bug activation x impact of bug activation
The probability of a bug is predicated on a couple factors. It is directly proportional to the frequency of usage of the affected code fragment and its complexity. The more complex the logic, the more prone it is to human error. This brings me to the next item on which to focus the review.
Complexity-driven Complexity in code is sometimes unavoidable; it may just be the nature of the problem. However, I have found that more often than not, complexity is a result of poor coding practice. Using metrics such as cyclomatic complexity , afferent , and efferent coupling allows you to quantify the complexity of the code. Several tools (IDE or Maven plug-ins) measure and report on the complexity of the code base. These are an integral part of a code review.
Test coverage-driven As cited above, a high degree of test coverage isn't a guarantor of quality. However, it is a step in the right direction. Conversely, logic that is not well tested is a candidate for thorough evaluation during a code review. It could also be argued that this egregious problem be fixed prior to a review.
When to Review
Just as important as identifying what to review is the determination of when to conduct a review. If the reviews are held too frequently, they become ineffective due to the fluidity of the code under review. If they are held too far apart, you risk a mushrooming cost of remedy due to afferent coupling of the code under review. I therefore postulate that a scheduled review must be conducted either after a development task has been completed or after a bug has been fixed.
Automate the review where possible. A review conducted by a computer has several benefits:
- Reduced manual cost of code reviews
- Fast, consistent, and repeatable
- Removes emotion from the reviews: pride, ego, and ownership need to be constantly recognized when conducting a review. When offering a critique you should take care to not to step on the author's feelings. Using an automated process abdicates this responsibility.
- In some cases you have tools that allow for real-time reviews, such as the Eclipse plug-in CodePro Analytix . These tools perform an examination of the code as it is being written. This is the holy grail of fixing broken windows.
The benefits notwithstanding, it must be acknowledged that the scope of automated reviews is limited to rules and conventions and are hard to extend to study business logic. The latter have to be handled by manual (or peer) examination.
From the prior sections you have seen that interactive and build-time tools at your disposal, while a great boon to the review process, cannot substitute for manual review for a certain class of problems. You have to manually review for certain types of improper coding practices. Tools cannot highlight errors of omission as described in the earlier section. Given that, let's see how a manual review must be conducted.
Participants: The review must focus on different facets of code correctness. Therefore, the participants of a code review must be chosen to satisfy a diverse set of roles.
- Technical lead: S/he plays the role of the facilitator and moderator
- The author(s)
- Quality Assurance: This role focuses on ensuring that the code under examination satisfies the set standards
- Functional expert: The person who is an expert in the business domain being reviewed
How to conduct: The technical lead will work with the authors to coordinate the review. This process includes:
- Identifying the participants in the review
- Determining the amount of preview time
- Providing the participants the necessary information
- Requirement/bug fix that was implemented
- CVS location of the files affected. The lines in the file that were affected by this change.
- Project reports with coverage and complexity details
- Review time duration
- Setting up a meeting to discuss the results of the preview
- Moderate the meeting – During the actual review the authors will describe the intent of the code and provide an overview of the implementation before the code is examined
- Capture comments
- Repeat above steps if necessary
There is some debate as to the effectiveness of review meetings (for an example, see A Controlled Experiment to Assess the Effectiveness of Inspection Meetings [PDF, 120KB]). I recommend that the participants and roles be determined as described above. All participants will meet after they have had a chance to inspect the code. The purpose of the meeting is solely administrative and to clarify corrections recommended by the reviewer.
A code review is an integral part of a project. Just as every coding task has an associated test process, so too should it have a code review. Therefore, it must be treated as another task that is tracked on the project plan by the project's manager. The project's technical lead who is working in concert with the authors will be able to assign appropriate resources to the review process.
This article has shown how you can have your cake and eat it too. You canconduct thorough code reviews without succumbing to time pressures. Did you know that more than 60 percent of the defects can be captured by effective use of peer reviews (c.f., Boehm and Basili's Software Defect Reduction Top 10 List [PDF, 124KB])? With a judicious combination of automation, interactive tools, and risk-based evaluation, you can reap the benefits of early problem detection. The techniques described here come from experience and are directly applicable to your next project. Use the references for additional relevant information on the subject of quality assurance as a whole.
- Software Engineering Economics , Boehm, B. -- Reference on analyzing software project costs.
- Code Complete , McConnell, S. -- Microsoft Press 1993 -- The seminal work on quality software development.
- Experimental software risk assessment (PDF, 364KB), Madeira, H. -- An article on ways of measuring risk in application software.
- Cyclomatic Complexity , VanDoren, E. -- A description of McCabe cyclomatic complexity
- Eye on performance , Shirzai, J., and Pepperdine, K. -- Explanation of code coupling, ways of measuring it and eliminating it when appropriate.
- CodePro Analytix -- A tool (for Eclipse) that ensures Java code quality by providing powerful software tools for code audit, metrics, automated test case generation, code coverage analysis, dependency analysis and team collaboration.
- Fagan review process (PDF, 7MB), Fagan, M. -- Original work by Michael Fagan describing the eponymous review process
- Software Defect Reduction Top 10 List (PDF, 124KB), Boehm, B., and Basili, V. -- An article co-authored by Boehm highlighting ten ways to improved quality.
- Metrics 2001 (PDF, 120KB), Bianchi, A., Lanubile, F., and Visaggio, G. -- A study on the effectiveness of inspection meetings.
width="1" height="1" border="0" alt=" " />