Thursday, May 26, 2005

Data Flow Analysis Techniques for Test Data Selection

Abstract
This paper examines a family of program test data selection
criteria derived from data flow analysis techniques similar to those
used in compiler optimization. It is argued that currently used path
selection criteria, which examine only the control flow of a
program, are inadequate. Our procedure associates, with each point
in a program at which a variable is defined, those points at which
the value is used. Several related path criteria, which differ in the
number of these associations needed to adequately test the
program, are defined and compared.
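The association the abstract describes can be sketched as a small reaching-definitions analysis over a control-flow graph, from which definition-use pairs fall out directly. The representation below (node ids, `edges`, `defs`, `uses`, and the function name `def_use_pairs`) is illustrative, not taken from the paper; it is a minimal sketch of the underlying data flow technique, not the paper's actual procedure.

```python
from collections import defaultdict

def def_use_pairs(nodes, edges, defs, uses):
    """Compute (def-node, use-node, variable) associations.

    nodes: iterable of node ids
    edges: dict mapping node -> list of successor nodes
    defs:  dict mapping node -> set of variables defined there
    uses:  dict mapping node -> set of variables used there
    """
    preds = defaultdict(list)
    for n, succs in edges.items():
        for s in succs:
            preds[s].append(n)

    # IN[n]: set of (def_node, var) pairs whose definition reaches
    # the entry of node n; computed by simple iteration to a fixpoint.
    IN = {n: set() for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            new_in = set()
            for p in preds[n]:
                # A definition at p kills earlier defs of the same variable.
                surviving = {(d, v) for (d, v) in IN[p] if v not in defs[p]}
                generated = {(p, v) for v in defs[p]}
                new_in |= surviving | generated
            if new_in != IN[n]:
                IN[n] = new_in
                changed = True

    # (d, n, v): the definition of v at node d reaches a use of v at node n.
    return {(d, n, v) for n in nodes for (d, v) in IN[n] if v in uses[n]}

# Toy program: node 1 defines x, node 2 branches, node 3 redefines x
# using its old value, node 4 uses x.
nodes = [1, 2, 3, 4]
edges = {1: [2], 2: [3, 4], 3: [4], 4: []}
defs  = {1: {'x'}, 2: set(), 3: {'x'}, 4: set()}
uses  = {1: set(), 2: set(), 3: {'x'}, 4: {'x'}}
pairs = def_use_pairs(nodes, edges, defs, uses)
# pairs == {(1, 3, 'x'), (1, 4, 'x'), (3, 4, 'x')}
```

A path selection criterion of the kind the abstract compares would then require test paths that exercise some or all of these associations.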

Introduction
Program testing is the most commonly used method for
demonstrating that a program actually accomplishes its intended
purpose. The testing procedure consists of selecting elements from
the program's input domain, executing the program on these test
cases, and comparing the actual output with the expected output
(in this discussion, we assume the existence of an "oracle", that is,
some method to correctly determine the expected output). While
exhaustive testing of all possible input values would provide the
most complete picture of a program's performance, the size of the
input domain is usually too large for this to be feasible. Instead,
the usual procedure is to select a relatively small subset of the
input domain which is, in some sense, representative of the entire
input domain. An evaluation of the performance of the program
on this test data is then used to predict its performance in general.
Ideally, the test data should be chosen so that executing the
program on this set will uncover all errors, thus guaranteeing that
any program which produces correct results for the test data will
produce correct results for any data in the input domain.
However, discovering such a perfect set of test data is a difficult,
if not impossible, task [1,2]. In practice, test data is selected to give
the tester a feeling of confidence that most errors will be
discovered, without actually guaranteeing that the tested and
debugged program is correct. This feeling of confidence is
generally based upon the tester's having chosen the test data
according to some criterion; the degree of confidence depends on
the tester's perception of how directly the criterion approximates
correctness. Thus, if a tester has a "good" test data criterion, the
problem of test data selection is reduced to finding data that meet
the criterion.
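The testing procedure described above can be sketched in a few lines: select inputs from the input domain, execute the program on them, and compare each actual output against the expected output supplied by the oracle. The program under test and the oracle below are hypothetical examples invented for illustration; the seeded fault shows why the choice of test data matters.

```python
def run_tests(program, oracle, test_data):
    """Execute the program on each selected input and return the
    inputs whose actual output disagrees with the oracle's expected
    output."""
    return [x for x in test_data if program(x) != oracle(x)]

# Hypothetical program under test: intended to compute x squared,
# but with a fault seeded on negative inputs.
def buggy_square(x):
    return x * x if x >= 0 else x * x + 1

# The oracle: an independent way to determine the expected output.
def oracle(x):
    return x * x

# Only test data that includes a negative input exposes the fault.
print(run_tests(buggy_square, oracle, [0, 1, 2]))    # -> []
print(run_tests(buggy_square, oracle, [0, 1, -2]))   # -> [-2]
```

The two runs illustrate the paper's point: both data sets satisfy some plausible selection criterion, yet only the second reveals the error, so confidence in a tested program is only as good as the criterion used to choose the data.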