Advanced Software Fault Localization for Programs with Multiple Bugs







In practice, a program may contain multiple bugs, and their simultaneous presence can reduce the effectiveness of existing fault localization techniques. For a program with exactly one bug, it is acceptable to use all failed and successful tests to identify suspicious code. The same approach is not appropriate for a program with multiple bugs, because the due-to relationship between failed tests and their underlying bugs cannot be easily identified. One solution is to generate fault-focused clusters by grouping failed tests caused by the same bug into the same cluster. That is, failed test cases in the same cluster are related to the same bug, whereas failed test cases in different clusters are related to different bugs. A fault-focused suspiciousness ranking is then generated using the failed test cases of a given cluster and some or all of the successful test cases. Examining code in the order given by this ranking can help programmers locate the bug associated with it.
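As an illustration of how a fault-focused ranking might be computed from one cluster, the sketch below scores each statement with the Ochiai formula. This is an assumption made for the example: the abstract does not name a suspiciousness formula, and the function name, the coverage layout (test id mapped to the set of executed statement ids), and the sample data are all hypothetical.

```python
import math


def ochiai_ranking(coverage, failed, passed):
    """Rank statements from most to least suspicious.

    coverage: dict mapping a test id to the set of statement ids it executed
              (hypothetical layout chosen for this sketch).
    failed:   ids of the failed tests in ONE fault-focused cluster.
    passed:   ids of the successful tests used with that cluster.
    """
    statements = set().union(*coverage.values())
    total_failed = len(failed)
    scores = {}
    for stmt in statements:
        # Number of failed / passed tests that executed this statement.
        ef = sum(1 for t in failed if stmt in coverage[t])
        ep = sum(1 for t in passed if stmt in coverage[t])
        denom = math.sqrt(total_failed * (ef + ep))
        scores[stmt] = ef / denom if denom else 0.0
    # Highest score first; ties broken by statement id for determinism.
    return sorted(scores, key=lambda s: (-scores[s], s))


# Toy data: statements 2 and 3 are executed by both failed tests.
coverage = {
    "f1": {1, 2, 3},
    "f2": {2, 3},
    "p1": {1, 3},
    "p2": {1},
}
ranking = ochiai_ranking(coverage, failed=["f1", "f2"], passed=["p1", "p2"])
print(ranking)
```

Running the same function once per cluster, each time with only that cluster's failed tests, would give one ranking per suspected bug.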

In this dissertation, MSeer, an advanced fault localization technique for locating multiple bugs in parallel, is proposed. Major contributions of MSeer include (1) a revised Kendall tau distance to measure the distance between two failed tests, (2) an innovative approach to simultaneously estimate the number of clusters and assign initial medoids to these clusters, and (3) a revised K-medoids clustering algorithm to better identify the due-to relationship between failed tests and their corresponding bugs. Case studies on 720 multiple-bug versions of six programs suggest that MSeer is more effective and more efficient than two other techniques for locating multiple bugs in parallel.
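To make the distance and clustering ideas concrete, the following sketch uses the standard (unrevised) Kendall tau distance, which counts the statement pairs that two rankings order differently, followed by a plain nearest-medoid assignment step. It is not MSeer's revised distance or its revised K-medoids algorithm, whose details are not given in this abstract. It also assumes that each failed test is represented by a ranking of statement ids. All names and data are hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(rank_a, rank_b):
    """Count the pairs of items that the two rankings order differently."""
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    if pos_a.keys() != pos_b.keys():
        raise ValueError("both rankings must contain the same items")
    return sum(
        1
        for x, y in combinations(rank_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def assign_to_medoids(rankings, medoid_ids):
    """Assign each failed test to the medoid whose ranking is nearest."""
    clusters = {m: [] for m in medoid_ids}
    for test_id, ranking in rankings.items():
        nearest = min(
            medoid_ids,
            key=lambda m: kendall_tau_distance(ranking, rankings[m]),
        )
        clusters[nearest].append(test_id)
    return clusters


# Toy rankings of three statements, one ranking per failed test.
rankings = {
    "f1": [1, 2, 3],
    "f2": [1, 3, 2],
    "f3": [3, 2, 1],
    "f4": [3, 1, 2],
}
clusters = assign_to_medoids(rankings, medoid_ids=["f1", "f3"])
print(clusters)
```

In a full K-medoids loop, the medoid of each cluster would then be recomputed and the assignment repeated until the clusters stop changing. Here the medoids are fixed by hand to keep the example short.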

However, while fault localization techniques such as MSeer are quite effective at locating faults, they still suffer from one fundamental limitation: most existing fault localization techniques assume the existence of a test oracle. Without an oracle, the program spectrum cannot be associated with a test result of failed or passed, and as a consequence there is insufficient information to generate the ranking. Therefore, in this dissertation, we also propose an execution result prediction framework for software fault localization techniques. Case studies using 22 programs and seven fault localization techniques were conducted to evaluate the fault localization effectiveness of the proposed framework on 1203 faulty versions, some of which have a single bug and others multiple bugs. A discussion of factors that may affect the accuracy of execution result prediction and the resulting fault localization effectiveness is also presented. Our data suggest that, in general, fault localization techniques using predicted execution results can be as good as, or even more effective than, the same techniques using execution results verified against the expected outputs, where more effective means examining a smaller number of statements to locate the first faulty statement.



Software failures, Debugging in computer science, Software measurement


Copyright ©2017 is held by the author. Digital access to this material is made possible by the Eugene McDermott Library. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.