Software Testing and Code Quality
Testing: execute (part of) the implementation and see if it behaves as expected. We test systems to gain confidence that the code is correct.
Software Testing Methods
We typically test at different levels:
- Unit testing: test specific classes and methods (fine-grained)
- Subsystem testing: test an entire “package” at a high level. For example: say we have a compiler broken into subsystems for tokenizing the input file, parsing the input file, and generating code. Describe how each subsystem might be tested.
- System testing: test the system as a whole (possibly by simulating a user of the system)
We will focus on unit testing: making sure that our classes and methods work correctly. Each class will have a suite of unit tests.
Basic idea in writing a unit test:
- create one or more objects
- call a method or methods on those objects
- check if the behavior of the called method(s) was correct (typically, by checking the return value(s) using assertions)
A unit testing framework can make writing and running tests easier.
E.g., JUnit for testing Java classes
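The three steps of a unit test can be sketched as follows. This is only an illustration: the Counter class and its methods are hypothetical and do not come from these notes, and the test is written in plain Java so that it runs without any library. With JUnit, the test method would typically be annotated with @Test and would use assertEquals instead of returning a boolean.

```java
// Hypothetical class under test (an assumption made for this example).
class Counter {
    private int count = 0;

    void increment() {
        count++;
    }

    int getCount() {
        return count;
    }
}

class CounterTest {
    // Returns true if the test passes.
    static boolean testIncrementTwice() {
        Counter c = new Counter();   // 1. create an object
        c.increment();               // 2. call methods on the object
        c.increment();
        return c.getCount() == 2;    // 3. check that the behavior was correct
    }

    public static void main(String[] args) {
        System.out.println(testIncrementTwice() ? "PASS" : "FAIL");
    }
}
```

A suite of unit tests for a class is a collection of small methods like testIncrementTwice, each checking one behavior.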
How do we know when we have done “enough” testing? We know that we can never (in general) exhaustively test software. However, we should have some idea of when we have an “adequate” set of tests. This is the problem of test adequacy.
One kind of metric for test adequacy is code coverage. There are various levels of code coverage:
- method coverage: ensure that every method is called
- statement coverage: ensure that every statement in every method is executed
- branch coverage: ensure that every branch (decision point) in the program is exercised in all possible ways. For example, the tests must make the condition of every if statement and every loop evaluate to both true and false.
A code coverage tool can automatically analyze your tests and the program being tested in order to determine the exact level of code coverage.
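The difference between statement coverage and branch coverage can be seen in a small sketch. The method clampToZero is a hypothetical example, not part of these notes. The first test alone executes every statement in the method, yet it only ever makes the if condition true; the second test is needed to make the condition false.

```java
class CoverageDemo {
    // Hypothetical method under test: negative values are replaced by 0.
    static int clampToZero(int x) {
        int result = x;
        if (x < 0) {
            result = 0;
        }
        return result;
    }

    // Executes every statement (100% statement coverage),
    // but the if condition is only ever true.
    static boolean testNegative() {
        return clampToZero(-5) == 0;
    }

    // Makes the if condition false, which completes branch coverage.
    static boolean testPositive() {
        return clampToZero(3) == 3;
    }

    public static void main(String[] args) {
        System.out.println(testNegative() && testPositive() ? "PASS" : "FAIL");
    }
}
```

If the assignment inside the if were wrong only for non-negative inputs, testNegative would still pass, which is why statement coverage on its own is a weaker adequacy criterion.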
As a rule of thumb, your tests should achieve:
- 100% method coverage. If a method is never called, then remove it.
- 100% statement coverage (except debugging and error handling code). If a statement can’t be executed, remove it.
- As close as possible to 100% branch coverage. It is surprisingly difficult to reach 100% branch coverage, but you should try.
We can look at the problem from an economic perspective. (This is appropriate for system-level testing.) We have finite resources (money, time). We have two objectives in testing:
- reach an “acceptable” level of quality
- maximize the return on the investment of resources in testing
Although we should always strive for the highest quality, the total absence of defects is not necessary for most applications.
Testing effort will eventually reach a point of diminishing returns.
Once a testing effort reaches the point where the rate at which bugs are being found is sufficiently close to 0, we can stop testing. (Or, at least, more testing is not likely to yield many more bugs.)
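One possible way to make this stopping rule concrete is sketched below. The notes do not define "sufficiently close to 0", so the averaging window, the threshold, and the weekly bug counts are all assumptions chosen for illustration.

```java
class StoppingRule {
    // Returns the number of weeks of testing after which the average number
    // of bugs found per week, over the most recent `window` weeks, first
    // drops to `threshold` or below. Returns -1 if that never happens.
    // Both `window` and `threshold` are hypothetical tuning parameters.
    static int weeksUntilStop(int[] bugsPerWeek, int window, double threshold) {
        if (window <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        for (int end = window; end <= bugsPerWeek.length; end++) {
            int sum = 0;
            for (int i = end - window; i < end; i++) {
                sum += bugsPerWeek[i];
            }
            double rate = (double) sum / window;
            if (rate <= threshold) {
                return end;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] bugsPerWeek = {12, 9, 5, 2, 1, 0, 0};  // made-up data
        System.out.println(weeksUntilStop(bugsPerWeek, 2, 0.5));
    }
}
```

A longer window makes the rule less sensitive to a single quiet week, at the cost of testing for longer before stopping.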
Ensuring Code Quality
Testing is the main way that we can ensure that the software is sufficiently free of defects.
However, there are many other techniques and practices that are valuable to us.
Code Inspections, Pair Programming
A very useful way to ensure that software has high quality and is free from defects is to inspect the code after it’s written.
Ideally, the person who inspects the code is not the same person who wrote the code. If the goal of a code inspection is to try to find defects in the code, the person who wrote it has an inherent bias towards seeing it as correct.
We’ve all had the experience of trying to debug a program, and having someone walk by, look at the code, and immediately spot the error.
Advantage of code inspections:
- Humans are good at playing the “adversary”: trying to think of situations in which the code will fail.
Disadvantages of code inspections:
- A thorough code inspection takes time.
- Code inspections can be very tedious.
- Developer time is expensive.
- Developers tend to be assigned to other tasks.
Because code inspection is time-consuming, it should not be applied too frequently. One idea is to inspect only newly written code.
Pair programming is a special case of code inspection. In pair programming, all code (and unit tests) is written by a pair of developers, working together. Basically, this is two people sitting at the same mouse/keyboard/monitor.
Pair programming works like a code inspection that is applied continuously; there is always a second set of eyeballs looking out for problems.
In general, we can reasonably expect code written using pair programming to contain fewer bugs than code written by a single developer. One phenomenon that makes programming difficult is that it is often difficult to “see” a defect once it has been introduced; we often see what the “intent” of the code is, rather than looking at what the code actually does. Pair programming takes advantage of the fact that two different people will have largely different “blind spots” - bugs that are “invisible”, i.e.: