Systematic Black-Box Testing of Computer-Based …Systematic Black-Box Testing of Computer-Based...

Systematic Black-Box Testing of

Computer-Based Systems through

Formal Abstraction Techniques

Bernhard Klaus Aichernig

Dissertation

vorgelegt zur Erreichung des akademischen Grades

Doktor der Technischen Wissenschaften

an der

Technischen Universitat Graz

Graz, im Janner 2001

Abstract

In the past of computer science, sometimes, a gap between the testing and the formalmethods community could be observed. Testers did not believe in the applicability offormal verification techniques to real world problems, and formal methods’ advocatescould not accept testing-techniques as an adequate verification method for producingcorrect software. However, today the gap is closing and this thesis contributes tobuilding the bridge by offering new results on testing based on formal developmenttechniques.

Others have worked on testing based on formal specifications before, but only afew considered the related development techniques such as program synthesis. Mostof the previous work focused on generating test-cases by partitioning the formalspecification into equivalence classes of input-output behavior. This related workon testing is reviewed.

In this thesis three formal development techniques have been examined, regardingtheir support in the testing of computer-based systems: Light-weight approachesto formal methods, the refinement calculus, and data-reification techniques. As aresult, three novel testing techniques have been developed.

First, a light-weight approach to testing, based on an executable formal specifi-cation, has been applied in two industrial experiments in the domain of voice com-munication for air-traffic control. Abstract VDM prototypes have been interpretedin order to assess and extend existing system-level test-cases. The encouraging re-sults demonstrate that formality is essential for testing high-quality systems. Exactfigures of the detected errors in the test documents and of the efforts are presented.

Second, a main contribution of this thesis is a novel approach to calculate test-cases. It is based on the observation that test-cases are abstractions of systemspecifications. The refinement calculus of Back and von Wright is used to formulateabstraction rules for deriving correct test-case scenarios from formal specifications.The advantage of this abstraction approach is that simple input-output test-cases aswell as testing scenarios can be handled. This is demonstrated by giving abstractionrules for partition analysis, mutation testing, and scenarios.

Third, it is demonstrated that abstract formal specifications can be used as auto-matic testing oracles for concrete implementations. Here, classical data-refinementtechniques are combined with features of modern tools in order to evaluate testresults with respect to a formal specification.

iii

Zusammenfassung

In der Vergangenheit der Informatik konnte manchmal eine Kluft zwischen denAnhangern des Testens und der formalen Methoden festgestellt werden. Testerglaubten nicht an die Anwendbarkeit von formalen Verifikationsmethoden in derrealen Welt. Befurworter von formalen Methoden wiederum, konnten die Testtech-niken nicht als adaquate Verifikationsmethode akzeptieren. Heute aber, schließt sichdie Kluft, und diese Dissertation tragt dazu bei, diese zu uberbrucken.

Andere haben bereits vorher im Bereich Testen basierend auf formalen Spezifi-kationen gearbeitet, aber nur wenige haben die verwandten Entwicklungsmethoden,wie die Programmsynthese, mit in Betracht gezogen. Es wird ein Uberblick dieservorangegangenen Arbeiten gegeben.

In Rahmen dieser Dissertation wurden drei formale Entwicklungsmethoden be-zuglich ihrer Unterstutzung des Testens von computerbasierten Systemen unter-sucht: Leichte Ansatze zu formalen Methoden, der Refinement-Kalkul und Daten-verfeinerungstechniken. Das Resultat sind drei neue Testtechniken.

Erstens: Ein leichter Ansatz des Testens, basierend auf ausfuhrbaren formalenSpezifikationen, wurde in zwei industriellen Experimenten im Bereich der Sprachver-mittlung zur Flugkontrolle angewandt. Abstrakte VDM-Prototypen wurden inter-pretiert, um existierende Testfalle zu bewerten und zu erweitern. Die ermutigendenErgebnisse zeigen, daß Formalitat essentiell fur das Testen von hoch-qualitativenSystemen ist. Exakte Zahlen der in den Testdokumenten gefundenen Fehler unddes Aufwands werden angegeben.

Zweitens: Ein Hauptbeitrag dieser Arbeit ist ein neuer Ansatz, um Testfalle zuberechnen. Er basiert auf der Beobachtung, daß Testfalle Abstraktionen von System-spezifikationen sind. Der Refinement-Kalkul von Back und vonWright wird verwen-det, um formale Abstraktionsregeln zu formulieren. Mit diesen Regeln konnen kor-rekte Testfallszenarien aus formalen Spezifikationen abgeleitet werden. Der Vorteildieses Abstraktionsansatzes ist es, daß sowohl einfache Eingabe-Ausgabe Testfalleals auch Testszenarien behandelt werden konnen. Es werden Regeln fur Partitions-testen, Mutationstesten und Szenarien angegeben.

Drittens: Es wird gezeigt, daß abstrakte formale Spezifikationen als automatischeTestorakel fur konkrete Implementierungen verwendet werden konnen. Hier werdenklassische Datenverfeinerungstechniken mit den Moglichkeiten moderner Werkzeugekombiniert, um Testergebnisse bezuglich einer formalen Spezifikation auszuwerten.

v

Acknowledgments

I would like to express my gratitude to all those people who have supported methroughout my research.

First of all, I wish to thank Prof. Peter Lucas, my supervisor, for his help andpatience during the course of this work. It was him, who introduced me to formaldevelopment methods and has given me the opportunity to work as his assistant.

Then, I would also like to thank my colleague Andreas Kerschbaumer, who sharedin many fruitful discussions his opinions and points of view with me.

Thanks belong also to my students in software technology. It has been theirconstructive feedback on the subject, that convinced me that the formal techniquespresented in this thesis might be accepted by well-educated engineers.

Especially, I want to thank my students Johann Horl and Andreas Gerstingerwho took part in the two industrial experiments at Frequentis. It has been theirability that made the experiments a success. Both experiments have been fundedby Frequentis.

Finally, I am indebted to my wife, Kathrin, for her support and for encouragingme all the way. Thanks,

Bernhard K. Aichernig

vii

Contents

1 Introduction 11.1 Software Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.1.1 The Challenge . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1.2 Black-Box and White-Box Testing . . . . . . . . . . . . . . . . 31.1.3 Test Automation . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.2 Formal Development Methods . . . . . . . . . . . . . . . . . . . . . . 51.2.1 Formal Specification Techniques . . . . . . . . . . . . . . . . . 61.2.2 Design Methods . . . . . . . . . . . . . . . . . . . . . . . . . . 81.2.3 Formal Methods and Testing . . . . . . . . . . . . . . . . . . . 9

1.3 Objective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101.4 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111.5 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111.6 Publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121.7 Structure of this Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . 12

2 Review of Related Work on Testing 152.1 Partition Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.1.1 Domain Testing . . . . . . . . . . . . . . . . . . . . . . . . . . 152.1.2 DNF-Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . 162.1.3 Separation of the Input . . . . . . . . . . . . . . . . . . . . . . 192.1.4 Structural Partitioning . . . . . . . . . . . . . . . . . . . . . . 20

2.2 Mutation Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212.3 Structuring Test-Cases . . . . . . . . . . . . . . . . . . . . . . . . . . 22

2.3.1 Classification Trees . . . . . . . . . . . . . . . . . . . . . . . . 222.3.2 Abstraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

2.4 Test Sequencing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232.4.1 Finite State Machine (FSM) Calculation . . . . . . . . . . . . 232.4.2 Refinement of a FSM . . . . . . . . . . . . . . . . . . . . . . . 23

2.5 Test Oracles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242.6 Algebraic Specifications . . . . . . . . . . . . . . . . . . . . . . . . . 25

2.6.1 Testing a Data Type Implementation . . . . . . . . . . . . . . 252.6.2 Selection Strategies . . . . . . . . . . . . . . . . . . . . . . . . 26

2.7 Behavioral Specifications . . . . . . . . . . . . . . . . . . . . . . . . . 28

ix

x CONTENTS

2.8 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282.8.1 A New Approach to Test Strategies . . . . . . . . . . . . . . . 282.8.2 Algebraic vs. Model-Based Specifications . . . . . . . . . . . . 292.8.3 Taking the Dynamic Behavior into Account . . . . . . . . . . 302.8.4 How Much Does It Cost? . . . . . . . . . . . . . . . . . . . . . 302.8.5 New Tools — New Oracle Approaches . . . . . . . . . . . . . 30

3 A Light-Weight Approach to Testing 333.1 Light-Weight Formal Methods . . . . . . . . . . . . . . . . . . . . . . 333.2 VDM and Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

3.2.1 The Vienna Development Method . . . . . . . . . . . . . . . . 353.2.2 IFAD VDMTools . . . . . . . . . . . . . . . . . . . . . . . . . 37

3.3 Test Design with Abstract Prototypes . . . . . . . . . . . . . . . . . . 383.3.1 Abstract Prototypes . . . . . . . . . . . . . . . . . . . . . . . 383.3.2 The Light-Weight Test Approach . . . . . . . . . . . . . . . . 39

3.4 Formal Specifications as a Catalyst in Validation . . . . . . . . . . . . 413.4.1 Issues in the Requirements and Test Documentation . . . . . . 413.4.2 A Catalyst in Validation . . . . . . . . . . . . . . . . . . . . . 42

3.5 Summary and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 42

4 Industrial Experiences 434.1 The Application Domain . . . . . . . . . . . . . . . . . . . . . . . . . 43

4.1.1 Air Traffic Control . . . . . . . . . . . . . . . . . . . . . . . . 434.1.2 Voice Communication in Air Traffic Control . . . . . . . . . . 444.1.3 A Network of Voice Communication Systems . . . . . . . . . . 45

4.2 Experiment I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464.2.1 Requirements Validation . . . . . . . . . . . . . . . . . . . . . 464.2.2 Test Case Validation . . . . . . . . . . . . . . . . . . . . . . . 504.2.3 Remarks on Experiment I . . . . . . . . . . . . . . . . . . . . 53

4.3 Experiment II . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 554.3.1 Requirements Validation . . . . . . . . . . . . . . . . . . . . . 554.3.2 Examples of Requirement Issues . . . . . . . . . . . . . . . . . 564.3.3 Test-Case Validation . . . . . . . . . . . . . . . . . . . . . . . 594.3.4 Remarks on Experiment II . . . . . . . . . . . . . . . . . . . . 60


5 Test-Cases are Abstractions! 635.1 The New Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

5.1.1 Motivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 635.1.2 Test-Design as a Formal Synthesis Problem . . . . . . . . . . . 63

5.2 The Refinement Calculus . . . . . . . . . . . . . . . . . . . . . . . . . 645.2.1 Contracts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 645.2.2 Example Contracts . . . . . . . . . . . . . . . . . . . . . . . . 65

CONTENTS xi

5.2.3 Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675.2.4 Refinement and Abstraction . . . . . . . . . . . . . . . . . . . 68

5.3 Test-Cases as Abstractions . . . . . . . . . . . . . . . . . . . . . . . . 695.3.1 Input-Output Tests . . . . . . . . . . . . . . . . . . . . . . . . 695.3.2 Non-Deterministic Test-Cases . . . . . . . . . . . . . . . . . . 715.3.3 Partition Tests . . . . . . . . . . . . . . . . . . . . . . . . . . 71

5.4 Testing Interactive Systems . . . . . . . . . . . . . . . . . . . . . . . 725.4.1 User Interaction . . . . . . . . . . . . . . . . . . . . . . . . . . 735.4.2 Iterative Choice . . . . . . . . . . . . . . . . . . . . . . . . . . 735.4.3 Scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74


6 Testing Strategies 776.1 Partition Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 776.2 Structural Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . 806.3 Mutation Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 816.4 Calculating Scenarios for Testing . . . . . . . . . . . . . . . . . . . . 83

6.4.1 Critical Remarks on FSM approaches . . . . . . . . . . . . . . 836.4.2 Compositions . . . . . . . . . . . . . . . . . . . . . . . . . . . 846.4.3 Scenario Synthesis . . . . . . . . . . . . . . . . . . . . . . . . 846.4.4 Scenario Based Testing Strategies . . . . . . . . . . . . . . . . 84

6.5 Example: Process Scheduler . . . . . . . . . . . . . . . . . . . . . . . 856.5.1 Interactive Process Scheduler . . . . . . . . . . . . . . . . . . 856.5.2 Interaction Partitioning . . . . . . . . . . . . . . . . . . . . . 876.5.3 Compositions . . . . . . . . . . . . . . . . . . . . . . . . . . . 886.5.4 Scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89


7 Automated Test Result Verification 917.1 From Formal Specifications to Oracles . . . . . . . . . . . . . . . . . 91

7.1.1 VDM-SL Specifications as Oracles . . . . . . . . . . . . . . . . 917.1.2 The Homomorphism of Abstraction . . . . . . . . . . . . . . . 93

7.2 Automated Test Evaluation . . . . . . . . . . . . . . . . . . . . . . . 947.2.1 Code-Generation of Oracles . . . . . . . . . . . . . . . . . . . 947.2.2 Interpretation of Oracles with the Dynamic Link Facility . . . 967.2.3 CORBA Oracles . . . . . . . . . . . . . . . . . . . . . . . . . 96


8 Case Study 998.1 Roles in Voice Communication . . . . . . . . . . . . . . . . . . . . . . 99

8.1.1 An Abstraction of the System’s State . . . . . . . . . . . . . . 1008.1.2 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

8.2 Tester-System Interactions . . . . . . . . . . . . . . . . . . . . . . . . 102

xii CONTENTS

8.3 DNF-Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1058.4 Scenario Synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

8.4.1 Interaction Compositions . . . . . . . . . . . . . . . . . . . . . 1068.4.2 Selecting Test Scenarios . . . . . . . . . . . . . . . . . . . . . 111


9 Concluding Remarks 1199.1 Research Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

9.1.1 Light-Weight Approach . . . . . . . . . . . . . . . . . . . . . . 1199.1.2 Abstraction-Based Test Synthesis . . . . . . . . . . . . . . . . 1209.1.3 Formal Specifications as Oracles . . . . . . . . . . . . . . . . . 121

9.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1219.3 Epilogue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

Bibliography 122

Chapter 1

Introduction

This dissertation presents novel approaches to software testing. The testing tech-niques are based on formal development methods for software. These formal meth-ods usually start with a formal description of the software requirements and refinethis specification into an executable implementation. The advantage is the possibil-ity of proving the correctness of the implementation with respect to its specification.In this work, formal requirements specifications as well as formal development meth-ods are applied in the design, validation and execution of software tests. It turnedout that abstraction techniques are the key to a more systematic testing process.This chapter gives a general introduction into the subject, motivations, aims andscientific contributions of our work.

1.1 Software Testing

1.1.1 The Challenge

Software testing is a challenging task. It should evaluate the software to demonstratethat it meets the requirements. This is difficult, both in theory and in practice.However, software is increasingly being used in our daily life as well as in safety-critical systems where failure could be life threatening. This enforces the need foran improved systematic discipline of testing which must be scientifically justifiable.

Testing and Correctness

From theory it is known that testing cannot guarantee the absence of failures incomplex computer-based systems. The reason is the discrete nature of software.For example, if two test-points in the input-space of a piece of software have beentested successfully, nothing is known about the behavior of the software for otherinputs situated between these test-points. This is fundamentally different to otherengineering disciplines, mainly dealing with continuous physical domains.

1

2 CHAPTER 1. INTRODUCTION

In order to certify a software for being correct with respect to its requirements, itshould be stimulated with all possible inputs. However, the input-space of complexsoftware is in general too large. Consequently, a representative subset of inputs, thetest-cases, has to be selected. What is considered to be representative is based on ahypothesis. This means that testing does only show that a system behaves correctlywith respect to its requirements for the actual test-cases, the rest is an assumption.Therefore, the purpose of software testing is rather the detection of failures than thedemonstration of their absence. Edsgar Dijksta summarized this in his well-knownstatement: “Program testing can be used to show the presence of bugs, but never toshow their absence” [DDH72]. In practice, further difficulties with respect to testingarise.

Testing and Requirements

The testing process is strongly dependent on the quality of the requirements doc-umentation available to a tester. User requirements that have not been explicitlydocumented cannot be systematically tested. It is known from practice that ap-proximately 60 percent of all defects in the requirements phase are due to missingrequirements [Per95]. Therefore, the systematic capturing of user requirements is aprerequisite to successful software testing. However, experience shows that require-ment documents written in a natural language tend to be ambiguous and unsound.Consequently, formal specification languages have been designed for capturing re-quirements in an unambiguous and sound software specification. In Chapter 3 ourown experience from two industrial experiments is presented where such formalspecification techniques have been successfully applied for increasing the quality offunctional requirements in a safety-critical domain.

Testing in the Development Process

Testing has to be considered with respect to the development process, too. In theclassical waterfall-model of the software development process, testing is presented asthe last phase [Roy70]. This causes a tendency to cut the testing efforts, if projectsexceed their schedules. The situation for testing is even more dramatic, if it isrealized that most of today’s big software projects do not meet their schedules, asreported in [Gib94] and more recently in [GSPC99]. Consequently, the tests haveto be designed as soon as possible in the development process, after the softwarerequirements are available.

Execution of the test can take place as soon as software units are ready for com-pilation or interpretation. If components are ready for integration, these tests areperformed until the final system-level tests are carried out. The V-model [fSSaCB91]stresses the different levels of testing and shows the relation to the developmentphases explicitly.

1.1. SOFTWARE TESTING 3

In Boehm’s spiral process model testing takes a central role [Boe86]. This ap-proach proposes rapid prototyping. A prototype of software is an incomplete imple-mentation of software that mimics the behavior we think the user need [Sta89]. Inrapid prototyping, software is implemented by the incremental development of a pro-totype that is tested and analyzed before new functionality is added. This processmodel is motivated by the fact that testing should be performed as soon as possible.The earlier in the process a fault is discovered and corrected, the cheaper the correc-tion — early detected faults are cheap faults. Thus, with respect to testing Boehmsays: “Do it early!” [Boe81]. In Chapter 3, executable formal specifications are pro-posed as a prototype for an early validation of requirements as well as system-leveltest-cases.

Beck’s recent Extreme Programming approach is even more radical [Bec99]. Herejects the classical process models. However, testing plays a central role in ExtremeProgramming. One of its rules is that for each class, sufficient test-routines haveto be implemented as a part of the class prior to the methods design. In addition,more frequent regression tests are proposed after each working day of a program-ming team, when the software has been extended or small changes took place. Theadvantage is twofold: First, the scheduling problem is solved, since testing is anintegral part of the design and implementation process. Second, the testing routinesact as a formal specification of the class methods. This will become more obvious inChapter 5, when it is demonstrated that test-cases are in fact formal specifications.

1.1.2 Black-Box and White-Box Testing

This thesis is about test techniques. A test technique or test strategy is a systematicmethod used to select and/or generate tests to be included in a test-suite. Twofundamentally different categories of test techniques exist, black-box and white-boxtechniques. Black-box test techniques are based on requirements. They are thereforealso called behavioral or functional techniques. Testing done under a black-boxstrategy is called black-box testing. Black-box test strategies do ignore the internalstructure of the object under test. Thus, a black-box strategy does not consider howthe test-object have been implemented, but what its requirements are. The subjectof this thesis is black-box testing. For a more detailed introduction to black-boxtechniques we refer to [Bei95].

In white-box testing, the tests are derived from the structure of the tested object.Consequently, this testing method is also called structural testing. Well-knownexample test strategies of this kind are to execute every statement at least once(statement testing), to execute every branch at least once (branch testing), and totest the use of all data object (data-flow testing). For a detailed presentation andcomparison of these techniques we refer to [Bei90] and [Mye79].

In this thesis it is shown that white-box test strategies can be applied to require-ments, if these requirements are captured in a formal specification. This means thatstructural techniques can be applied for black-box testing, if the structure of the


specification is analyzed. One approach is to specify an abstract and executableprototype solely based on the requirements. Then, for example, branch testingis possible. This method and its encouraging results are presented in Chapter 3.A more advanced method derives test-cases directly from a more abstract non-executable specification by applying synthesis rules to the specification. Here, thestructural techniques are incorporated into the synthesis rules. This novel approachis shown in Chapter 6.

1.1.3 Test Automation

Testing is expensive and its efforts are often underestimated. The costs of testingare dependent on the number and subtlety of detected faults during testing. Brookspoints out that software engineers are usually too optimistic with respect to failurerates, “Therefore, testing is usually the most miss-scheduled part of programming”[Bro75]. He proposes that the half of the schedule should be devoted to testing: 1/3planning, 1/6 coding, 1/4 component test and early system test, 1/4 system testwith all components in hand. Another much-quoted heuristic rule distributes theefforts as follows: 40 % design, 20 % coding, 40% testing [Per95]. The figures givenby Boehm [Boe75] distinguish between different types of software systems. He foundthat for testing of command and control systems 34 %, space-born systems 46 %,operating systems 50 %, scientific systems 30 %, and business systems 28 % of thetotal development costs are spent. Boehm’s figures from 1975 are still in use as theirpresentation in the standard literature for software engineering [Som92] indicates.Although, more elaborate regression models for estimating the efforts of testing exist(see e.g. [Lyu95, MIO90]), these heuristic data demonstrate how expensive testingcan be.

One way to decrease these huge efforts is the automation of testing. Otherarguments for test automation summarized in [Bei95] are that

• manual tests are error-prone: a typical test-case contains 250 characters.Statistics from professional keypunchers indicate that manually entered testshave at least one keystroke-error. Furthermore visual test-verification is likelyto fail;

• manual testing leads to false confidence: manual test-execution is difficult andboring. Often effort and accomplishment are equated, giving the impressionthat ’hard work’ means ’well tested’;

• dependability requirements demand automated testing: all statistically validmethods of estimating software dependability are based on fully automatictest methods for executing enough test-cases;

• a test has to be run several times: in practice, tests are executed more thanonce. The reasons are that a test-design may have been wrong, a failure

1.2. FORMAL DEVELOPMENT METHODS 5

has to be checked if it is repeatable, a fault has to be demonstrated to theprogrammer, the fixing of the fault has to be shown.

• maintenance needs automated regression testing: In today’s software engi-neering, 80 % of the whole software engineering efforts are in maintenance ofexisting code. Only an automated test process of the changed software makesmaintenance feasible.

More detailed justifications and examples of test-automation can be found in [FH88,Hen89, Mil86, Nom87, Per86, VGvM80]. In principal, all three phases of testing canbe automated or at least supported by tools:

1. test design: test-case generators derive test-cases by analyzing a formal repre-sentation of the software. Most of the tools are based on the program source-code (white-box testing).

2. test execution: test-scripts or test-routines that can be run automatically maketests repeatable and reduce costs.

3. test evaluation: automated oracles are programs that evaluate the tests bycomparing the actual test results with the expected outcome.

Traditionally, the automation of the test design has been focused on white-boxtechniques. The reason is that the source-code of a programming language is astandardized formal representation which can be analyzed mechanically. For au-tomating black-box testing based on the requirements, a formal representation of therequirements is needed1. Formal development methods provide such specificationtechniques that enable the automation of black-box testing. A general summaryof the possibilities of automating black-box testing, not solely focused on formalmethods, is [Pos96].

This thesis contributes to the automation of black-box testing with respect to allthree test phases above. In order to understand the contributions, motivations andaims of this work a short introduction into formal development methods is presented.The summary of the related work in Chapter 2 will introduce into the existing testtechniques based on formal methods in more detail.

1.2 Formal Development Methods

All classical engineering disciplines rely on mathematics for creating their products.At a NATO conference in 1968, it was realized that the construction of software isso complex that its development deserves to be an engineering discipline of its own— the term software engineering was born [NR69]. In the same decade, the firstformal development methods have been invented.

1Random testing techniques are exceptions.


Formal development methods are mathematical methods for producing software(and today digital hardware). Due to the discrete nature of software, the mathemat-ics needed is fundamentally different to that used in other engineering disciplinesmainly dealing with continuous domains. In software engineering, formal logic andalgebraic structures are the most important mathematical tools. The term formalmeans the use of methods of reasoning that are sound by virtue of their form andindependent of their content.

The UK Military of Defense standard on the procurement of safety-critical soft-ware [oD89] defines a formal method as:

A software specification and production method, based on a mathe-matical system, that comprises: a collection of mathematical notationsaddressing the specification, design and development phases of softwareproduction; a well-founded logical system in which formal verificationand proofs of other properties can be formulated; and methodologicalframework within which software may be verified from the specificationin a formally verifiable manner.

From the definition above it can be concluded that formal specifications of require-ments and formal verification of software are the corner-stone of a formal method.However, recently light-weight approaches to formal methods have gained popularityfor transferring these techniques into industry [ELC+98, JW96, Jon96]. In this con-text light-weight indicates that the method’s focus is rather on specification than onformal proofs. In Chapter 3 these approaches are examined with respect to testing.

In the following, an overview on the different formal techniques should give anintroduction to the advantages mathematics provides for software development —and for testing. A more detailed introduction to formal development methods canbe found, for example, in the formal methods guidebooks of NASA [NAS95, KK97].

1.2.1 Formal Specification Techniques

A formal specification is the documentation of the functional requirements of acomputer-based system in a mathematical language, the specification language.Since, the semantics of a specification language is formally defined, formal spec-ifications can capture software requirements in an unambiguous and yet abstractstyle.

Dependent on the mathematical apparatus used, several different specificationtechniques have been established. In this work, the following three fundamentallydifferent approaches to specification are distinguished: model-based, algebraic andbehavioral specifications. The reason for this classification are the completely dif-ferent consequences of these techniques for testing (see the survey of related workin Chapter 2).


Model-based Specification

Model-based specification techniques use mathematical objects for modeling a sys-tem’s state explicitly. Numbers, sets, sequences, finite mappings, Cartesian prod-ucts and composite types are typical types of objects that are applied to model asystem on an adequate level of abstraction. In contrast to conventional program-ming languages, logical constraints on the types, here called data-invariants, maybe added to specify additional properties of a model. Functionality is specified bystate-changes expressed by functions or relations on the state model. Consequently,these specifications are also called “state-based”.

Prominent formal methods of this kind are VDM [Jon90, FL98], Z [Spi88, Spi92,WD96], and B [Abr96]. Furthermore, the formalism of Back and von Wright’srefinement calculus is model-oriented [BvW98]. The testing techniques presentedin this work are solely based on model-based specification techniques. However, forthe discussion of the related work the next chapter, a basic knowledge of the othertwo specification methods is necessary.

Algebraic Specification

In an algebraic specification technique an object class or type is specified in termsof the relationships between the operations defined on that type. Unlike in model-based specifications, nothing is said about the internal representation of the definedobjects. Hence, in general, algebraic specifications are more abstract than model-based specifications.

Algebraic specification was brought to prominence by Guttag [Gut77] in thespecification of abstract data types. Various notations for algebraic specificationhave been developed, including OBJ [FGJM95] , Larch [GHW85, GH93] and CASL[Mos99]. In RAISE [Gro92, Gro95], algebraic specification and the model-basedtechniques from VDM have been combined. A notation independent presentationof the subject can be found in [HL93].

Behavioral Specification

Abstract specifications of the behavior of a system are called behavioral specifica-tions. This type of formalisms describe rather how a system behaves over time thanwhat the possible computations can be. Control and concurrent systems are typicalapplication domains for this specification style.

If the description of the behavior should mainly focus on the set of possiblereachable states, finite state machines are used to capture the dynamic state-changes.

If the specification is focused rather on the possible events that can occur in asystem than on its reachable states, then labeled transition systems are in use. Theprocess algebras of CSP [Hoa85] and the π-Calculus [Mil99] belong to this class ofspecification techniques.


Further well-known behavioral specification techniques are Petri-nets [Pet77,Pet81], and Estelle, LOTOS, SDL, [Tur93], well-suited for defining communicationprotocols.

However, some development methods, like UML, use diagrammatic behavioralspecifications that lack a formal semantics. These methods are not considered to beformal by the author.

Behavioral specification techniques can be combined with non-behavioral meth-ods. VDM++ (see Chapter 3) is an example of combining model-based with be-havioral specifications, LOTOS an example of the combination with the algebraicspecification style.

1.2.2 Design Methods

A toolbox of a formal development method should include both, a formal specifi-cation language as well as a formal design technique. Usually, a formal design isstarted from an abstract formal specification and then successively refined into anexecutable implementation. This process is called refinement. In model-based tech-niques, both, the data model (data refinement, also called data reification) as wellas the operations (operational refinement) are refined.

Two different formal approaches to the refinement problem exist, known as inventand verify and program synthesis. In this thesis it is demonstrated how both methodscan be adapted for the application in testing.

Invent and Verify

In this classical design paradigm, a refined version of the specification is invented andafterwards verified that the refinement is correct with respect to its specification.Each method provides the proof obligations necessary for proving the refinementrelation. For example, the ’M’ of VDM (Vienna Development Method) appliesinvent and verify [Jon90].

It is demonstrated in Chapter 7 that this approach can be adapted for testingsuch that an abstract formal VDM specification can be used as a test oracle.

Program Synthesis

Program Synthesis is the systematic transformation of a specification into an im-plementation. Formal transformation rules guarantee that the implementation iscorrect by construction if these synthesis rules are respected.

Dijkstra has been the first who proposed this approach for the development ofstate-based programs [Dij76]. Additional heuristics for calculating programs havebeen presented by Gries [Gri81]. Further progress in program synthesis has beenmade by Back and von Wright [BvW98], and Morgan [Mor90]. They extended the


approach to refinement calculi with their own language for defining contracts andlarge sets of rules for calculating programs.

Program synthesis have been applied to functional programs, too: Martin-Loefpresented an intuitionistic type theory for deriving functional programs from typedeclarations [ML85]. Manna and Waldinger [MW92] developed a tabular prooftechnique, such that functional programs are developed as side effects during theproof.

A formal refinement calculus enables the development of tools that supportthe interactive synthesis of programs. In [vW94] an automation of Back and vonWright’s refinement calculus have been described. Coen adapted Martin-Loef’s workand developed a constructive type theory for the theorem prover Isabelle [Coe92].

In this thesis it is the first time that program synthesis is applied to test-casesynthesis (Chapter 5 & 6). Here, Back and von Wright’s refinement calculus is used.

1.2.3 Formal Methods and Testing

In the past of computer science, sometimes a gap between the testing and the formalmethods community could be observed. Testers did not believe in the applicability offormal verification techniques to real world problems, and formal method’s advocatescould not accept testing-techniques as an adequate verification method for producingcorrect software. However, recently the gap is closing.

Today, light-weight approaches to formal methods invite engineers to take theadvantages of formal specification techniques without focusing alone on correctnessproofs. Having precise and unambiguous formal specifications available, is the pre-requisite in order to automate black-box testing. The importance of black-boxtesting has increased, due to the growing awareness that a combination of black-and the more traditional white-box testing uncovers more defects than applying onetechnique solely. Especially, the object-oriented paradigm and the increasing use ofComponents Of The Shelf (COTS), shifted the focus of interest towards black-boxapproaches [Bei95].

A further reason for an easier integration of testing methods and formal methodshas been the success of model-checking. Model-checking is in fact exhaustive testingof a formal model. For model-checking to be applicable to software, abstract modelsof the software with finite data domains have to be build. These abstractions have tobe proven to be correct. As a consequence, integrated tools with model-checkers andtheorem provers are under development, as presented recently in [Rus00, BGL+00].

In this work the application of formal development methods to system-level test-ing is considered. This part of the testing process will always be necessary. Forproving a computer-based system to be correct, it would be necessary to model thewhole environment the system interacts with — which is impossible.

The Philosophical Point of View. The relation between testing and formalspecification is reflected in the philosophical points of view of famous Austrians:


Ludwig Wittgenstein, the Wiener Kreis, and Sir Karl Popper. Wittgenstein and theWiener Kreis proposed logic for describing and solving scientific problems [Wit21,Kra50]. Popper realized that only through falsification truth can be obtained. Heconsequently proposed his demarcation criterion according to which any empiricallyrefutable consistent proposition should be regarded as scientific. For theories to beobservationally falsifiable, they must be stated precisely and clearly [Pop35, Pop63].Their results can be applied to software development as well, with the followingconsequences:

• A formal specification of requirements helps to solve systematically the prob-lem of software development by offering the apparatus of logic (or more generalmathematics).

• The validation of the formal specification as well as the implemented solutiononly is feasible through falsification — which is testing.

• For being falsifiable (or testable), the requirements description has to be un-ambiguous and sound — ensured by formal specification and verification tech-niques.

1.3 Objective

The overall objective of this thesis is to contribute to the combination of formaldevelopment techniques and testing. Despite the fact that a lot of previous workrelated to formal methods and testing exists, this thesis should demonstrate thatmore can be done. The aim of this thesis is to examine the formal developmenttechniques with respect to their usefulness in the testing process. Three formaltechniques are considered from a testers perspective:

• Light-weight formal methods recently gained increased popularity, due to theireasier integration into an industrial environment. This thesis should demon-strate the advantages, costs and feasibility of applying this kind of formaltechniques to testing.

• Invent and verify development techniques are traditionally applied in formalprogram verification with proofs. Where proofs are not feasible, but formalspecification have been successfully implemented, it is our aim to show thatthese more abstract specifications can be used as test oracles.

• Program synthesis techniques ensure the development of correct programs.Our aim is to develop system-level test-cases with the same level of rigor.This thesis should prove that program synthesis techniques can be applied toproduce correct test-cases.

1.4. MOTIVATION 11

It has not been our objective to produce a ready to use tool-set for testing software,but to examine and develop the techniques necessary for developing new testingtools — tools more advanced than available today.

1.4 Motivation

During the last few years, the interest in formal software development has been grow-ing rapidly. However, despite their long existence since the 60s, formal methods arenot yet widely accepted and applied in the software industry. Our overall motivationfor this work is to facilitate this technological transfer and thus to contribute to theefforts in turning software development into a mature engineering discipline.

From our experience in industry, we are convinced that test-engineers are apromising target group for introducing formal methods. Especially, in the domainof safety-critical systems, testers are really concerned about the effectiveness andmaturity of their methods. In this thesis both, practical results that can be applieddirectly, and new theoretical results that need further adaption and tool support tobe transfered into industrial practice are presented.

The central role of testing in software engineering demands scientifically justi-fiable testing methods. Today, test-plans are rather complex procedures, includingtest-cases with several sequences of interaction with a system. In a systematic soft-ware development process, these test-cases have to be documented correctly andunambiguously. If a tester cannot rely on his test-plans, they are rather useless.This was the motivation for our work on the application of program synthesis tech-niques to test-case synthesis.

A further motivation of our work is the fact that test-cases can be used forvalidating a formal specification by users not familiar with formal notations. Thepremise for such a test-case based validation is that the test-cases have been correctlyderived from the specification.

1.5 Contributions

In this thesis three formal development techniques are examined, if and how theysupport the testing of computer-based systems: Light-weight approaches to formalmethods, the refinement calculus, and data-reification techniques. As a result, noveltesting techniques are presented.

The scientific contributions of this work can be summarized as follows:

• A light-weight approach to specification based testing has been developed.The approach is based on executable VDM specifications and the commercialIFAD VDMTools. Two industrial experiments demonstrate its successful ap-plicability. Figures of the efforts and efficiency of the method are presented.The experiments have been designed, supervised and controlled by the author.


The actual specification work has been done by two master thesis students ofthe author.

• A main contribution of this work is the realization and formal proof that test-cases are in fact abstract formal specifications. Our abstraction theorems holdfor simple input-output test-cases as well as for test scenarios consisting ofseveral steps of interaction.

• The observed property that test-cases are abstractions of specifications leadto a new test synthesis approach based on program synthesis. In this thesis itis the first time that the refinement calculus is applied to the problem of testsynthesis. The result is an abstraction calculus for testing in order to calculatetest-cases according to a testing strategy.

• New formal synthesis rules for the test strategies of partition testing, mutationtesting and test-sequencing are presented and formally proved.

• Finally, it is shown how formal specifications can be used as test-oracles. Aframework using VDM and new features of the IFAD VDMTools is discussed.

1.6 Publications

The research for this thesis lead to 10 publications of the intermediate scientificresults.

• Our light-weight approach to testing has been described in [HA99, HA00a,HA00b, AGA00]. It has been shown that this validation approach can beextended to hybrid systems as well in [AK00]. The article in the IEEE Softwarejournal [HA00b] was first submitted to the Fourth International Conferenceon Requirements Engineering (ICRE2000). The program committee selectedthis article as best paper and forwarded it to the IEEE Software journal.

• The work on a calculus for test-synthesis lead to the most recent publication[Aic01].

• The ideas and techniques to use formal specifications as test-oracles can befound in [Aic98, Aic99a, Aic99b, Aic99c].

1.7 Structure of this Thesis

Chapter 1 introduces to the general subject of this thesis: software testing andformal development methods. Furthermore, the objectives, the motivations, asummary of contributions, and the structure of this thesis are presented.

1.7. STRUCTURE OF THIS THESIS 13

Chapter 2 summarizes the related work on testing based on formal developmentmethods. In this chapter, the existing test-case generation techniques basedon formal specifications are explained. The summary distinguishes betweentesting techniques based on model-based, algebraic, and behavioral formalspecifications. Based on this technical review of the related work, the newcontributions of this thesis are discussed.

Chapter 3 presents a light-weight approach to testing based on executable formalspecifications. The technique uses an executable VDM specification of thesystem requirements and IFAD VDMTools for validating the design of thesystem-level test-cases.

Chapter 4 Here, two industrial experiments in the domain of air-traffic control arepresented in which the light-weight approach have been applied. Its applicabil-ity in an industrial environment is demonstrated by the encouraging results ofthese experiments. Detailed empirical data support our claims. Furthermore,example serve to illustrate the proposed method.

Chapter 5 develops the foundation for a new test-synthesis calculus. It is provedthat test-cases are in fact abstract formal specifications. This insight holds forsimple input-output test-cases as well as for test scenarios consisting of severalsteps of user interaction. The refinement calculus is used to formulate andprove this abstraction property.

Chapter 6 presents formal rules of the new abstraction calculus for test synthesis.This is the first time that program synthesis techniques are applied to thetest-case synthesis problem. Different test strategies are formulated in thisnew formal framework. Rules for partition testing, mutation testing and thecalculation of test scenarios of interactive systems are defined.

Chapter 7 shows how abstract model-based specifications can be used as test or-acles for an automatic evaluation of the test results. The technique is basedon VDM but can be applied to other model-based formal methods. It is ex-plained how the invent and verify proof technique can be applied to testing.Furthermore, it is shown that the new features of IFAD VDMTools facilitatethe implementation of this oracle framework for testing.

Chapter 8 demonstrates the new test-calculus by means of a case-study. A voicecommunication system for air-traffic serves as an example for showing theapplication of the different testing techniques.

Chapter 9 gives a brief summary of this work. Furthermore, the final conclusionsare drawn by the author. The thesis closes with an outlook and motivationof the possible future work that may be initiated by the contributions of thisthesis.

Chapter 2

Review of Related Work onTesting

In this chapter a review of related work on testing based on model-based formalmethods is presented. Since our approaches to testing is based on the Vienna De-velopment Method (VDM) and the refinement calculus, the previous work on testingwith model-based specifications is most closely related to our own work. However,at the end of this chapter, testing methods based on algebraic and behavioral spec-ifications are presented. The discussion of the testing techniques of these methodsshows the different challenges with respect to testing, depending on the choice ofthe specification style.

2.1 Partition Testing

The most common strategy for generating tests from model-based specifications ispartition testing. Partition testing applies strategies based on dividing the set of allpossible inputs into equivalence classes under an appropriately defined equivalencerelation [HT88, OB88, RC81]. In the late 80’s the formal methods communityrealized that partition testing techniques may be applied to formal specifications.Especially, for state-based specifications the adaption of well-known techniques fromsource-code based testing is straight forward.

2.1.1 Domain Testing

Scullard reports in [Scu88] on the industrial use of VDM for selecting test cases inorder to validate a micro-processor. More precisely, domain testing is applied toVDM specifications.

Domain testing analyzes the input-domain of a system’s functions, i.e. its signa-tures. The strategy is based on a simple processing model in which the system undertest accepts an input vector, which then goes through a classification process, e.g.

15

16 CHAPTER 2. REVIEW OF RELATED WORK ON TESTING

a case statement. Each of these cases is processed separately. The main concern indomain testing is the correctness of the classification process as contrasted with thecorrectness of the processing done for each case. Usually, in domain testing, domainsare defined by a set of boundary inequalities. A simple example of a domain testingtechnique is the choice of extreme values of input parameters [Coh78, JW94].

Since all input values of a model-based specification have a concrete represen-tation, domain analysis can be performed on it. Furthermore, preconditions, data-invariants, and post-conditions can be used to synthesize subdomains. Scullard’stest-case extraction process for VDM can be summarized as follows:

1. Domain reduction: Reduction of the signature of the function to basic typesonly. Compound types are replaced by the Cartesian product of their compo-nents. Furthermore preconditions and data-invariants are taken into accountto reduce the input-domain further.

2. External analysis: Selecting a set of ’typical’ values for each basic type of thereduced signature. For ordered types extreme values and one or more centralvalues are chosen, for unordered types all values are selected.

3. Internal analysis: Identification of possible cases by skolemizing the postcon-dition of the function.

Scullard gives an example of his technique and discusses that an automated testgeneration system would be useful. However, he did not formulize the underlyingpartitioning method. This has been done in the subsequent work on VDM andtesting by Dick & Faivre.

2.1.2 DNF-Partitioning

Dick & Faivre. One of the most cited works on test-case generation from model-based specifications is [DF93]. Dick and Faivre realized that transforming a formalVDM specification into a disjunctive normal form (DNF) constitutes an adequatepartitioning method. A Prolog tool has been developed by them that rewrites data-invariants, preconditions and postconditions according to the following rules for aDNF construction:

A ∨ B ≡ (A ∧ B) ∨ (¬A ∧ B) ∨ (A ∧ ¬B)

A ⇒ B ≡ ¬A ∨ B

For model-based specifications this partitioning process is extended to type asser-tions of the form x : T denoting the values of variable x to be of type T . A specifi-cation containing x :T can be further partitioned into special values v :T and othervalues:

x : T ≡ (x = v ∨ x 6= v) ∧ x : T

2.1. PARTITION TESTING 17

A simple example serves to demonstrate the method of DNF partitioning.

Example 2.1 (DNF partitioning) Consider a specification of a program com-puting the maximum of two integer numbers. Assuming that the input values arestored in two state variables x : Z and y : Z and the result in variable max , thenthe problem can be specified with the following relational update Max . The square-brackets denote a relational update [R] that updates the state variables according toan input-output relation R. This relational update corresponds to a post-conditionin VDM.

Max = [(max = x ∨max = y) ∧max ≥ x ∧max ≥ y ]

Rewriting the disjunction into a DNF gives:

≡ [((max = x ∧max = y) ∨(max 6= x ∧max = y) ∨(max = x ∧max 6= y)) ∧max ≥ x ∧max ≥ y ]

Distributing the ∧ and simplification results in three disjoint cases:

≡ [(max = x ∧max = y) ∨(max > x ∧max = y) ∨(max = x ∧max > y)]

The subexpressions above characterize three subrelations, describing the test parti-tions for testing Max with respect to its specification. ¤

Stocks. Most of the later work on testing from model-oriented specifications isbased on this method. In Stocks’ PhD thesis and his subsequent work, this formalpartitioning technique is applied to Z [Sto93, SC93b, SC93a, CS94]. Stocks definesa framework in which classes of test-cases can be defined with Z-schemas. He showshow different partitioning strategies can be applied to Z specifications. In addition,mutation testing is applied to formal specifications (see Section 2.2 below).

Horcher & Peleska. In [HP94, HP95] a Z-based tool for partitioning is presentedby Horcher & Peleska. Its industrial application demonstrated that the test-caseselection process was not significantly faster with this approach. The reason wasthat the same degree of understanding for the specification must be reached as formanual test case selection. However, the evaluation of the tests took only 5% of theusual time since test oracles had been compiled from the Z specifications [HM96].Test oracles are discussed below in Section 2.5.


Helke et al. [HNS97] demonstrate that a theorem prover like Isabelle [Pau94]can be used to generate such test-classes based on DNF rewriting. It is shown howconstructive proofs may be used to calculate the partitions of a Z representation inthe HOL instance of this generic theorem prover. HOL stands for higher order logicand is based on Alonzo Church’s work on a simple theory of types [Chu40].

Van Aertryck. In [Aer98] the tool-supported method CASTING for automat-ically generating test-cases is presented. CASTING is generic in the sense thatit is applicable to different model-based specification languages as well as to pro-gramming languages. This is realized by an internal representation using attributegrammars in which the test-case selection strategies can be encoded. An instan-tiation of CASTING for the B-method is presented, where DNF partitions can begenerated automatically from B’s abstract machine specifications. The tool uses aconstraint solver for finding actual test-cases.

Behnia & Waeselynck. A problem concerning DNF partitioning and classicallogics is discussed in [BW99]: DNF rewriting of B specifications may lead to unde-fined partition specifications.

Consider a specification expression, where f is a partial function.:

¬(x ∈ dom(f ) ∧ f (x ) = y)

Applying the DNF approach would result in three disjoint partitions:

(x /∈ dom(f ) ∧ f (x ) 6= y) ∨ (x /∈ dom(f ) ∧ f (x ) = y) ∨ (x ∈ dom(f ) ∧ f (x ) 6= y)

It can be easily seen that the first two cases are undefined, since f (x ) is undefined.The authors propose an extended deduction system, where definedness is treatedexplicitly by an ∆ operator denoting well-definedness. Such problems have beenforeseen by Jones and are solved by extending VDM’s logic to a three-valued logicof partial functions, where the third logic value is undefinedness [Jon90, BFL+94].Consequently, for VDM this problem does not exist — the partitions above arewell-defined in VDM-SL. This shows that Jones is right, when he says in [Jon98]:

The lessons which I believe it is important to draw from my series ofattempts to find a satisfactory way to reason about partial functionsinclude the fact that a real difficulty should be faced rather than ignored( I can accept any approach to this problem much more readily thanpretending it does not exist).

Behnia & Waeselynck solve this problem for the B method by designing an alterna-tive set of partitioning heuristics that do not generate undefined partition predicates.Their approach is described in Section 2.1.4.


2.1.3 Separation of the Input

Hieron

Hieron proposes another heuristic of partitioning model-based specifications that iscoarser than pure DNF-partitioning [Hie97]. His partitioning technique rewrites Zoperation predicates as (X1 ∧ Y1) ∨ · · · ∨ (Xn ∧ Yn) where

• predicates Xi refer only to the inputs;

• predicates Yi refer to the outputs.

This form of the predicates separates the description of the input Xi from the op-erational part Yi specified by a relation. Since Xi and Yi may contain disjunction,the rewritten operation predicate is not necessarily in DNF. The motivation is notto separate constraints that apply to the same category of variables.

For example, let us reconsider an operation for calculating the maximum of twonumbers with the formal input parameters x , y and the output parameter max .Then the predicate that specifies the computation of the maximum of x and y

((x > y ∨ x = y) ∧ (max = x )) ∨ ((x < y) ∧ (max = y))

is of the desired form (X1 ∧ Y1) ∨ (X2 ∧ Y2). As argued by Hieron, the predicateX1 needs not to be further split: it characterizes an input subdomain with uniformbehavior.

However, unlike in our example above, the resulting input domains X1 and X2

may overlap. Hence, for obtaining disjoint partitions of the input domain, theinput predicates have to be rewritten with the rule for DNF-partitioning. Then,the resulting partitions for two predicates ,like in our example, would be: X1 ∧ X2,¬X1 ∧ X2 and X1 ∧ ¬X2.

A further advantage of this transformation is that non-determinism can be han-dled as well. A non-deterministic specification of the form Xi ∧ (Y A

i ∨Y Bi ) leads to

three possible states, if rewritten as a DNF:

(Xi ∧ ¬Y Ai ∧ Y B

i ) ∨ (Xi ∧ Y Ai ∧ ¬Y B

i ) ∨ (Xi ∧ Y Ai ∧ Y B

i )

However, these three states do not directly correspond to three valid tests if consid-ered separately, i.e., tests that will not reject a correct program. Here, a union oftwo possible test-results is split into three deterministic test-results. If the systemis testes with these three test-cases, two of the test-cases might fail. In Hieron’sframework this problem does not occur, since such non-deterministic specificationsare not further split.

This partitioning is courser than pure DNF-rewriting as described in the previoussection, since the individual Xi and Yi have not been expanded. Even more controlof the number of test-cases is provided, if the syntactical structure of a specificationlanguage is taken into account. Such an approach is described in the next section.


Donat

Donat [Don98] takes a similar approach as Hierons did. Donat distinguishes betweenthe stimulus (input) expressions S and the response (output) expressions R in asystem’s specification. The difference to Hieron’s work is that specifications arerewritten into a disjunction of, what he calls, test-classes:

(S1 ⇒ R1) ∨ · · · ∨ (Sn ⇒ Rn)

In contrast to the previous work on partition testing, quantified expressions canbe handled in his partitioning algorithm. This is done by translating a specificationinto a test-class normal form, where the quantifiers ∀ and ∃ are shifted insideconjunctions and outside disjunctions, if possible.

From a test-class S ⇒ R, a set of test-frames A ⇒ R is derived. In contrast toa test-class stimulus S , a test-frame stimulus A does not include disjunctions andA ⇒ S holds. A variety of different test-frame sets can be derived from a test-class.Several strategies and their resulting test coverage are described. Donat uses BinaryDecision Diagrams [Bry86] to perform the test frame construction and selection.

2.1.4 Structural Partitioning

The notation of the B method is a general purpose language ranging from abstractspecification constructs to implementation oriented statements. Consequently, spec-ification and implementation oriented testing strategies can be combined. Recently,this unification has been presented in [BW99]. The unification is made by extendingthe concept of control-flow graphs to non-deterministic specification operators. Inthis work, the deterministic programming statements are not rewritten into basicpredicates and then are DNF partitioned, but path-oriented strategies known fromwhite-box testing techniques are applied. The advantage of considering the syntaxof a modeling language like B, is that a better control of the partitioning is provided:In general, pure DNF partitioning results into too many test-cases. Therefore, cov-erage strategies from program testing are extended to handle the whole specificationlanguage of B. The result is a hierarchy of testing strategies:

All-paths criterion: This criterion demands that all feasible paths of the controlflow graph are covered. For a specification that is rewritten as a set of non-deterministic choices, here denoted by S1 u . . . u Sn , each Si is a path on thecontrol flow graph. For example an if-statement can be rewritten as follows:

if P then S1 else S2 ≡ (P ⇒ S1) u (¬P ⇒ S2)

Hence, test-cases from two input-domains satisfying P and ¬P have to betested for covering the two paths.

2.2. MUTATION TESTING 21

All-paths combinations criterion: This criterion demands that all combinationsof feasible paths on the control flow graph should be covered by test data. Ifseveral input-domains are obtained according to the all-paths criterion, then aDNF-rewriting of these input-domains provides a coverage of all possible pathcombinations.

All-expanded-path-combinations criterion: This criterion splits infeasible pathconditions into further sub-cases. A relational update [P ⇒ S ] may be infea-sible either if P is false or if P is true and S is infeasible.

These criteria can be further refined by splitting guard condition predicates intodisjoint cases.

2.2 Mutation Testing

An alternative and complementary approach to partition testing is mutation testing.Mutation testing is a fault-based testing technique introduced by Hamlet [Ham77]and DeMillo et al [DLS78]. It is a means of assessing test suites. When a programpasses all tests in a suite, mutant programs are generated by introducing faults intothe code of the program under test. Then, the suite is assessed in terms of howmany mutants it distinguishes from the original program.

Budd & Gopal [BG81] applied this concept to the domain of specifications givenin predicate calculus. Potential faults are represented as mutants of the specification.Test data are then generated that can distinguish between the mutated and originalspecification. Several promissing program mutations are given as examples, but thegeneration technique is left implicit.

Stocks extends this technique in [Sto93] to model-based specification languagesby defining a collection of mutation operators for Z’s specification language. Anexample for specification mutation is the exchange of the join operator ∪ of setswith intersection ∩. From these mutants, test-cases are generated for demonstratingthat the implementation does not implement one of the specification mutations.

Kuhn developed a similar approach [Kuh99]. Faults were represented as mutatedformal specifications and the conditions for distinguishing them from the originalspecification were calculated. This information was then used to calculate a faultcoverage hierarchy. However, the conditions for detecting the faults were only shownfor mutations involving Boolean operators.


2.3 Structuring Test-Cases

In general, an automated DNF-partitioning produces many unstructured test-cases.Therefore, some kind of user interaction is necessary for selecting and structuringthe relevant test-cases in a test suite. Classification trees and abstraction hierarchiesare suitable tools for handling this problem, as explained in the following sections.

2.3.1 Classification Trees

In order to enhance DNF partitioning, the classification tree method is applied forselecting only interesting partitions in [SCS97]. A classification tree is a graphicalrepresentation of the stepwise partitioning of an input domain into relevant parti-tions [GG93, GWG95]. The root of the tree represents the complete input domainof the system under test. Its children nodes are predicates representing the parti-tions. The tree grows as the partitions are refined. The leaves represent the finalpartitioning of the input domain and serves as the head of a selection table fromwhich the actual test-cases are chosen. The presented tool provides a graphical userinterface that represents the partitions of a Z specification in such a classificationtree. The advantage of this approach is that the partitions are well structured andthat only relevant ones can be refined or chosen to be final interactively. This solvesthe test-case explosion problem due to DNF partitioning.

2.3.2 Abstraction

Stepney realized the abstraction relation between Z specifications of test-cases andobject-oriented Z specifications of operations [Ste95]. Test-case specifications areabstract in the sense that only relevant input and before-states are tested and thatonly interesting consequences of an operation should be observed. Formally, thismeans that for designing a test-case specification of an operation, the preconditionis strengthened and the postcondition is weakened. This is the inverse process ofrefinement — abstraction. In other words, her method of test suit design works byintroducing non-determinism into the original specification.

Her group developed a tool for interactively calculating such partition abstrac-tions and to structure them for reuse. With this tool, the test-cases of a test suiteare structured in a hierarchy of abstractions visualized in a tree. The root of the treeis the original specification. Following the tree towards its leaves means abstraction.The children are either a set of partitions, or more non-deterministic versions of theparent test-case specifications.

Our work can be seen as an extension of this view on testing. Unlike Stepney, weuse the refinement calculus for defining the abstraction rules for designing test-cases.Furthermore, we cover the sequencing of test-cases that is only mentioned by her.Chapter 5 presents our approach to this topic.

2.4. TEST SEQUENCING 23

2.4 Test Sequencing

Simple input-output test-cases are not sufficient for non-trivial systems. An interac-tive system has to consider the possible user actions. Usually an internal state underwhich an operation should be tested cannot be set directly, but must be set up byinvoking a series of preceding operations. The process of finding such sequences ofoperations is called test sequencing.

2.4.1 Finite State Machine (FSM) Calculation

For model-based specifications, this search for test sequences is non-trivial. Thereason is that before- and after-states are not explicitly available, but implicitlydefined in postcondition relations.

In [DF93] Dick and Faivre describe a test sequencing method, based on the cal-culation of a finite state machine (FSM) from a VDM specification. Their approachworks as follows:

1. Partitioning of the original specifications using the DNF technique describedabove. A set of sub-operations is obtained that form the transitions of theFSM.

2. Calculation of the before- and after-states of each suboperation by existentialquantifying every variable external to the state in question and then simplify-ing.

3. DNF-partitioning of the disjunction of the set of states obtained in 2. Theresult is a description of disjoint states — the states of the FSM.

4. Construction of the FSM by resolving the constraints of sub-operations againststates. The FSM has a transition between states s1 and s2 if a postconditionrelation R of a sub-operation is satisfied, hence R.s1.s2 holds. The transitionis labeled with the name of the corresponding sub-operation.

Having the FSM explicitly available, paths through the FSM can be extractedas test sequences. The strategies for selecting the sequences can be chosen from thewell-known path strategies of white-box testing. Examples for such strategies areto visit each node (here state) once, or to visit each transition (here sub-operation)once, or to cover all possible paths without repetition. More about path strategiescan be found in [Bei90] and [Bei95].

2.4.2 Refinement of a FSM

Recently, Derrick and Boiten discussed the issues of test sequencing and formalrefinement in Z [DB99]. They provide means to calculate a concrete FSM from anabstract one using a data refinement relation (retrieve relation). If this refinement


relation is functional and the refined specification is the weakest refinement (noadditional operational refinement) then disjoint abstract states and transitions willproduce disjoint concrete ones — the two FSMs are isomorphic. If the refinementrelation is not functional no further transitions can be added, but states may beidentified under refinement. However, if further operational refinement is performed,then the process of refinement adds new transitions and states by weakening aprecondition. In addition, existing transitions may be deleted by strengthening apostcondition.

2.5 Test Oracles

In the previous sections, the review focused on test-case generation techniques, es-pecially on techniques for partitioning the input domain of a system under test. Thesecond main task in testing, to be considered, is the evaluation of the test resultsusing an oracle. An oracle predicts the outcome of a given test. Previous work hasshown the derivation of such test oracles from formal specifications.

The conceptually simplest form of an oracle is a comparison of the actual outputfor some input against a pre-calculated expected output for the same input. Anexecutable prototype can be used to calculate the expected outputs. Some of theformal method tools support this kind of prototyping by providing

• specification interpreters, and/or

• code-generators

for executable specifications, e.g. IFAD VDMTools and Atelier B to name only two.For example, [TDS98] reports the use of a C-prototype that has been generated

from a B specification, for preparing test-cases. In this project, the test-results ofthe prototype had been stored into a test file and then were refined into a suitableform for testing the Ada implementation. Similar works exist for VDM [Dro00,vdBVW99].

The alternative and more general approach is the use of possibly non-deterministicformal specifications as oracles. The partitioning approaches described above rewritea specification as a set of test-class specifications with disjoint input domains. Theseformal specifications do more than describe conditions on the input. The relation-ship between the input states and output states is precisely specified. This meansthat the specification predicates can serve as an oracle — if refined into an executableform.

Consider the specification predicate that defines a computation of the square-root of a positive number x . If the result is stored in a variable s , and e is a givenprecision, then we have the following relation:

-e ≤ x – s2 ≤ e

2.6. ALGEBRAIC SPECIFICATIONS 25

If this relation is evaluated it can serve as an oracle, and can decide if a testsucceeds. This is in contrast to using a prototype for calculating the actual square-root which is far more complicated then using the specification above.

In [HM96] such a test environment is described that generates a predicate eval-uator as an oracle from a Z specification.

In Chapter 7 an alternative approach is shown, where implicit VDM specifica-tions are directly used as test-oracles.

2.6 Algebraic Specifications

In the following, an overview of the algebraic testing techniques proposed by Marie-Claude Gaudel et al. [BGM91, BGM92, Gau95, GJ98] is presented. Although,algebraic specifications are not the subject of this thesis, the review of this workshould explain the differences to model-based testing techniques. It has to be men-tioned that Dick & Faivre’s early testing techniques on VDM have been initiallyinspired by this work [BGM91].

Figure 2.1 shows an algebraic specification of sorted lists in the notation ofLOTOS [ISO89]. Such a specification has two parts: a signature Σ = (S ,F ), whereS is a finite set of sorts and F is a finite name of operation names over the sorts in S ,and a finite set of axioms. The sorts in our examples are List, Elem and Boolean;the new operation sorted is defined by three axioms.

2.6.1 Testing a Data Type Implementation

Testing code against an algebraic specification consists of showing that the finalsystem satisfies the axioms in the specification. To create tests for a given axiom,the variables of the axiom are instantiated with values. To run such a test, theresulting expressions are evaluated. If the results satisfy the axiom then the test ispassed, otherwise it is failed.

An advantage of algebraic specifications in testing is the ease with which testdata are constructed. The constructive nature of the specifications shows how tobuild various instances of the defined data types, which also implicitly include testsequencing information. There is also great potential for automatically generatingtest cases from specifications by steadily building up data from the base cases usingconstructor operations.

In the SortedList example, a possible test, inspired by the third equation, is:

sorted(cons(A, cons(B , cons(C , nil)))) = (A ≤ B) ∧ sorted(cons(B , cons(C , nil)))

The corresponding test procedure for testing an implementation P consists of twocomputations and one comparison, namely

• computing the representation in P of cons(A, cons(B , cons(C , nil))), callingthe sorted function in P on it, and storing the result;


specification Sorted_List_Example : exit

library Boolean, Lists endlib

type SortedList is

List, Elem, Boolean

opns

sorted : List -> Boolean

eqns

forall xs: List, a, b : Elem

ofsort Boolean

sorted(nil) = true;

sorted(cons(a, nil)) = true;

sorted(cons(a, cons(b,xs)) = (a <= b) and sorted(cons(b,xs));

endtype

Figure 2.1: An algebraic specification of sorted lists as an extension of general lists.

• similarly, calling sorted on cons(B , cons(C , nil)), checking if A ≤ B and build-ing the conjunction of the results.

• comparing the two results (oracle).

This shows that in algebraic testing, the test-cases are equations and the job of anoracle is the comparison of the computed results of the left- and right-hand side ofan equation.

2.6.2 Selection Strategies

Gaudel et al. distinguishes between two kinds of selection hypothesis for reducingthe size of test set to practical levels.

A uniformity hypothesis is an assumption that the input-space can be dividedinto sub-domains such that if a test set containing a single element from each sub-domain is passed, then the exhaustive test set is also passed. This correspondsto the hypothesis used in partition testing. Examples of uniformity hypothesesare the selection of representative values, or the partitioning of the input-space byconsidering the conditions in conditional axioms.

A regularity hypothesis uses a size function from ground Σ -terms to the naturalnumbers and assumes that if a set of tests made up of ground terms of size less thanor equal to a given number, then the exhaustive test set is also passed.

Example 2.2 (Algebraic Testing) A first selection strategy on the SortedList

example (Fig. 2.1) would be to make a uniformity hypothesis on all the variables.

2.6. ALGEBRAIC SPECIFICATIONS 27

Case 3.1: (a <= b) = true and sorted(cons(b,xs)) = true

⇒ sorted(cons(a, cons(b,xs))) = true

Case 3.2: (a <= b) = true and sorted(cons(b,xs)) = false

⇒ sorted(cons(a, cons(b,xs))) = false

Case 3.3: (a <= b) = false and sorted(cons(b,xs)) = true


Case 3.4: (a <= b) = false and sorted(cons(b,xs)) = false


Figure 2.2: Examples of algebraic test-cases for testing an implementation of sortedlists as specified in Figure 2.1. The tests are synthesized by unfolding the Booleandata type definition in the third axiom.

This means that that there are three tests, which are arbitrary instantiations of theaxioms.

If for some reason more tests are desired, one possibility is to unfold the definitionof and in the third axiom. Since the and operations is defined as follows

true and true = true

true and false = false

false and true = false

false and false = false

, the four new conditional axioms in Figure 2.2 are obtained.More test-cases can be derived, if the <= operation is further unfolded. To

obtain the actual test cases, the variables have to be instantiated. Here, a regularityhypothesis could be applied that limits the length of the lists to be instantiated totwo. ¤

The method has been automated in the tool LOFT (LOgic for Functions andTesting) using Prolog [Mar95]. Furthermore, it has been successfully applied on asubstantial case-study [DM91]. Hughes & Stotts applied the technique for testingobject-oriented programs [HS96]. More recently, these algebraic testing techniqueshave been combined with behavioral techniques of LOTOS [GJ98]. Testing withbehavioral specifications is discussed in the next section.

Currently, Andrej Pietschker is developing a new test-case generator based onalgebraic specifications [PF98]. In this work the problem of mapping the abstracttest-cases of the specification to concrete implementation level is considered. Giventhe abstract tests, a reverse parsing technique is applied that uses a part of thespecification expressed by means of a grammar to produce tests in the concreterepresentation.


Antoy & Hamlet have presented a framework, where implementations are auto-matically checked against their algebraic specifications [AH00]. In contrast to thealgebraic test approach described above, they evaluate the specification by rewritingit following the axioms. Here, the test-results of the implementation are mappedand compared to the resulting terms of the specification. Only, simple examples oftesting a set implementation are shown.

2.7 Behavioral Specifications

A specification is called behavioral, if the dynamic behavior of a system is describedrather than its data and data transformations. Finite state machines and labeledtransition systems are examples of this kind of specification style. Behavioral spec-ifications are applied in domains where systems have a rather complex state butsimple state transitions, like in communication protocols and control applications.

Behavioral specifications are well suited for automatically generating sequencesof test-cases, since feasible applications of state operations are made explicit. There-fore, Dick & Faivre have shown how to produce a behavioral finite state machinefrom a computational VDM specification as has been explained in Section 2.4.

A class of test generation approaches starts directly from such behavioral speci-fications, like finite state machines [Cho78, FvBKG91, LY94] or finite labeled tran-sition systems [dNH84, Hen88, Bri89, PF90, Tre92, Pel96, PS96] .

Another class of approaches combines computational and behavioral specifica-tions. As we do, MacColl and Carrington are planing to develop a testing frameworkfor interactive systems. They use a combination of Z and behavioral specifications(the process algebras CSP and CCS) for specifying such systems [MC99]. In [GJ98]such behavioral specifications are combined with algebraic testing techniques. Fur-thermore, in [SS98] testing from combined Z and statechart specifications is dis-cussed.

2.8 Discussion

As shown above, a lot of work has been done on generating test-cases from formalspecifications. However, in this thesis it is demonstrated that there is still room forinnovation.

2.8.1 A New Approach to Test Strategies

This thesis seeks to extend the body of work for model-based specifications summa-rized above. A new framework under which all testing strategies, like partitioningand mutation heuristics, can be formally specified, analyzed and used to automatethe generation of test case specifications. In contrast to previous works with the

2.8. DISCUSSION 29

same goal, e.g. Burton’s recent work on Z [Bur00, BCGM00], our mathematicaltreatment of test-cases can handle test-sequencing as well. Unlike the work above,our approach is based on program synthesis techniques. Since test-case generationis a synthesis process, we find it more natural to use a development method likethe refinement calculus in order to calculate test-cases. The basis to this uniformframework is the view that all kinds of test-cases are mathematical abstractionsof model-based specifications. This new approach to testing will be presented inChapter 5.

Having realized the abstraction relation of test-cases, the refinement calculuscan be applied for formalizing synthesis rules for testing. This is the first time thatBack and von Wright’s refinement calculus [BvW98] is applied for the developmentof test-cases. In Chapter 6 test strategies for partitioning, mutation testing aswell as test sequencing are formalized as proof rules in the refinement calculus. Itwill be shown that the formalization itself revealed interesting properties of theheuristics that have to be considered, when automation is envisaged. The uniformframework based on the abstraction relation provides an environment under whichthe relationship between various testing heuristics can be investigated in order todevelop more efficient heuristics.

2.8.2 Algebraic vs. Model-Based Specifications

Algebraic specifications are not the topic of this thesis. As explained in the introduc-tion, one of our motivations for this work is the technology transfer of formal spec-ification techniques into industry. The author shares the opinion of Jones [Jon90]that model-based specifications are more adequate for the practicing engineer andthat algebraic definitions should be left to the formal method experts. Although ingeneral, algebraic specifications are more abstract than model-based specifications,their validation is more difficult for non-mathematicians: It is not easy to determine,if the axioms in a complex algebraic specification are sufficient and sound.

With respect to testing Hayes [Hay86] argues that algebraic techniques are bestsuited to testing primitive data types and that, for more complex abstract datatypes, a model-based specification is simpler. As noted by Gaudel, predicate logicspecifications are more general than algebraic specifications. However, the price ofthis generality is the restriction that, in general, only specifications of classes oftest-cases can be generated automatically. Algebraic techniques can generate testdata as well.

It is our opinion, that a higher degree of automation of test-case generationshould not be gained through a specification technique that is more error-pronefor non-experts in formal methods. The experiments presented in Chapter 3 willdemonstrate how error-prone informal specifications are. The situation should notbe made worse by introducing another source of errors.


2.8.3 Taking the Dynamic Behavior into Account

The following chapter will show that today’s safety-critical systems may be highlyinteractive as well as involve complex data manipulations. Typical test-cases of suchcomplex systems are scenarios of usage. Thus, a testing strategy must treat both,the input-output relation as well as the possible sequences of interaction.

As has been described above, recent work on model-based specifications hasfocused on the combination of behavioral and computational methods for coveringthe different aspects of functionality. In contrast, this work uses Back and vonWright’s contract language for describing the behavior of interactive systems. Ithas been shown that Back and von Wright’s refinement calculus covers such type ofsystems in [BMvW99].

The innovation in the work presented here is that the synthesis of test-sequences(scenarios) from a specification is considered as an abstraction problem. Our se-quencing technique does not calculate the complete finite state machine prior totest-case selection and thus differs from the approaches above. In an environment,where change requests are common, calculating the whole finite state machine isnot very efficient. Our method focuses on the impact on the possible scenarios bysystematically analyzing interaction compositions.

2.8.4 How Much Does It Cost?

All the methods presented above rely on formal specifications and thus will not beapplied in industry if formal specifications fail to be accepted. A common questionof the practicing software engineer is: “What are the efforts and the payoffs?”, orsimply “How much does it cost?”.

Too few of the previous works answer this question. One of the reasons is thatoften only simple examples serve to demonstrate new research in the area of testing.In this work, we try to answer the question partly by presenting the results of twoindustrial experiments described in the next chapter. In these experiments a light-weight approach to testing has been used that does not represent fundamentallynew techniques. The test method is based on executable specifications. However,the exact figures of the efforts and detected issues are presented. Furthermore, theempirical results have motivated our more theoretical research of Chapter 5 and 6.

2.8.5 New Tools — New Oracle Approaches

It has been shown that formal specifications have been used as test oracles before.However, in this area of automating the test process, the availability of new tools isessential. Previous work on oracles has mainly focused on code generation or directlyexecutable specifications. In Chapter 7 we will present a novel oracle technique,where recently available VDM tools are combined with well-known proof techniques

2.8. DISCUSSION 31

of VDM. This enables the direct use of VDM’s pre-post condition specifications astest oracles.

Chapter 3

A Light-Weight Approach toTesting

In this chapter a test design method is presented that has been applied in twoindustrial projects. It is a method for designing system-level test cases for complexcomputer-based systems. This light-weight approach is based on executable formalspecifications that capture the user requirements of a system under test.

The details of the two projects and their encouraging results are described inthe next chapter. Here, the method and its foundation are introduced. After adefinition and discussion on light-weight formal methods, the Vienna DevelopmentMethod is described in more detail. Next, it is shown how VDM and VDMTools canbe used in test design. Finally, the relation between our approach and the importantvalidation aspect in testing are discussed.

3.1 Light-Weight Formal Methods

During the last few years, the interest in formal software development has beengrowing rapidly. One indicator for this growing popularity is the fact that everymain conference on software engineering features sessions about formal methods.

Several reasons can be identified; among others the following three:

• the combination of formal and more traditional approaches,

• the availability of commercial tools to support the developer,

• the beginning awareness that the application of formal methods does not nec-essarily imply the use of formal correctness proofs.

The last item lead to a new class of formal methods that is often called light-weightformal methods or formal methods light.

33

34 CHAPTER 3. A LIGHT-WEIGHT APPROACH TO TESTING

These light-weight approaches to formal methods provide the advantages of aprecise and unambiguous specification language to raise the quality of a system’sspecification, without focusing on proofs [DKLM98, ELC+98, JW96, Jon96].

In this chapter such a light-weight method is presented and its relevance forsystem-level testing is discussed and demonstrated.

It is important to distinguish between the terms light-weight and informal meth-ods.

light-weight implies a formal specification language with a formal semantics. Al-though, formal proofs are out of a light-weight formal method’s scope, formalverification is possible if needed or wanted.

informal means that no formal semantics of the requirements description is avail-able. Therefore, formal proofs of properties of a described system are im-possible. Natural language and the Unified Modeling Language (UML) areexamples of informal specification methods.

The advantages of a light-weight formal method will be demonstrated in the nextchapter, when the results of our industrial projects are presented. These advantagescan be summarized as follows:

+ easy to learn: our students learn our method during a one semester course con-sisting of two hours of lecture and one hour of exercise per week.

+ easy to apply: formal proofs are difficult or even infeasible for software developersthat are not trained in discrete mathematics. However, experience shows thatsoftware engineers are able to specify their systems in a formal language aftertwo weeks of training.

+ unambiguous descriptions: the formal language provides a tool for specifyingwithout the ambiguity introduced by informal description techniques.

+ complete descriptions: in a formal specification it is obvious and precisely de-fined what kind of properties are described and what is postponed for laterrefinement.

The disadvantages of skipping a formal verification process are that questions con-cerning

– soundness: a formal specification may contain subtile errors that may solely bedetected by formal proof techniques;

– correctness: the correctness of an implementation with respect to its specificationcannot be guaranteed without formal correctness proofs;

– deeper insight: in general, formal proofs of system properties provide a deeperinsight into the consequences of system models; cannot be answered withoutformal proofs.

3.2. VDM AND TOOLS 35

Some formal methods facilitate the specification in an explicit style such thata tool can interpret or generate code from this kind of formal specifications. Theadvantage of these executable specifications is

+ that testing can be used for raising the confidence in the specification.

The disadvantage of executable formal specifications is that

– they are in general less abstract than more implicitly defined specifications, likenon-deterministic pre- and postconditions.

Our light-weight approach to testing is based on executable specifications of asystem’s requirements. The method uses these abstract system prototypes not onlyfor checking the specification, but for:

• the validation of the requirements,

• the validation of existing system-level test cases,

• the design of system-level test-cases, and

• the validation of the formal specification.

The formal method that provides such a light-weight application is the ViennaDevelopment Method supported by IFAD’s VDMTools. The following section willgive a short introduction to VDM and the used set of tools.

3.2 VDM and Tools

3.2.1 The Vienna Development Method

In the following, the Vienna Development Method (VDM) and its specification lan-guages are described. The contents are partly based on [PT89, PvKT91, LHB+96,Lar95].

VDM is a formal method for the description and development of computer basedsystems. It is a model-based method, i.e. its formal descriptions (VDM specifica-tions) consist of a model of the system being constructed.

A system design is generated through a series of specifications, where each spec-ification is more concrete and closer to the implementation than the previous one[Bjø89]. Each of these development steps introduces proof obligations which, whenappropriately discharged, ensure the (relative) correctness of the implemented sys-tem.

VDM is one of the most widely used formal methods and has been applied tothe construction of a large variety of software systems [Bee86, Be80, Bod88, BFL96,Cot84, Han94, JS93, Lam91, MS93, OH95, Sad88, TaA88].


Its history starts in the 60-ies, when, in the IBM laboratory in Vienna, a group ofpeople, managed by Heinz Zemanek, worked on formal language definition and com-piler design. The goal was to create an operational semantics for the programminglanguage PL/I [LW69, Luc87]. By outsiders the meta-language used was dubbedthe “Vienna Definition Language” or VDL. Due to the complicated formal reasoningabout the operational semantics, a denotational semantics approach, was taken in1972 with the meta-language Meta-IV. In the past, Meta-IV has been used in manydifferent variants. First of all, these variants reflect several different styles in the useof VDM:

• the Danish school,

• the English school,

• the Irish school.

The differences between the schools is reflected in their different application areas.The Danish school focused on modeling large software systems, and thus used VDMto abstract from complex systems for better understanding, whereas the Englishschool has concentrated more on the correctness of smaller systems and used amore implicit style in order to ease the verification of the systems. The Irish schoolof VDM can be described as a constructive style of VDM with a heavy use ofalgebraic structures, like monoids or groups, for modeling. Here, all specificationsare formulated in an explicit (executable) style.

In the mid-eighties the British Standards Institute (BSI) started to harmonizethe different variants producing a standard for Meta-IV [Sen87, PT92, Par94]. Inthe nineties this was taken up by the International Organization for Standardization(ISO) [LHB+96] and Meta-IV became VDM-SL.

VDM consists of two major components:

• The VDM Specification Language, VDM-SL. The language describes a notationin which VDM specifications can be expressed.

• A method describing the development from a specification to an implementa-tion in a certain programming language. The method part of VDM is not asprecisely defined as the VDM-SL notation.

The VDM bibliography [Lar00] contains more than 500 references including tu-torial text books [Jon90, FL98], a reference manual of VDM-SL [Daw91] and a bookabout proofs in VDM [BFL+94].

In the 90-ies, an object-oriented version of VDM has been created — VDM++.Beside its object-oriented features, VDM++ offers specification techniques for cap-turing the dynamic behavior of parallel systems as well as for the timing constraintsof real-time systems. In contrast to VDM-SL, the specification language of VDM++is not standardized and therefore its syntax and semantics are still subject to changes[Lan95, DG95, DGP95, KL96, IFA00, Ins00].

3.2. VDM AND TOOLS 37

In the light-weight use of VDM, the author follows the tradition of the Danishschool of VDM. In this thesis the main focus concerning VDM is rather on modelingcomplex computer-based systems than on proving their correct implementations.However, the proof obligations for a correct refinement will be used for testingpurposes in Chapter 7.

It might be due to the long tradition of light-weight applications of VDM inDenmark that the tools that provide the best support of this approach are Danish— the IFAD VDMTools.

3.2.2 IFAD VDMTools

Tool support of formal methods is crucial for transferring these techniques intoindustry. Our light-weight approach is based on the features currently offered bythe commercial IFAD VDMTools. Two versions of these tools exist, one for thetraditional VDM-SL, one for the object-oriented VDM++. Both have been used inour industrial experiments.

The following features are provided by the last versions of IFAD VDMTools:

Syntax checking of VDM specifications. To provide a literate programming style[Knuth84] for specifications, the VDM source code may include LATEX typesettext under the UNIX family of operating systems. Under Windows, VDMspecification can be included into MS Word documents.

Static and dynamic type checking features the testing of pre- and postcondi-tions, and data-invariants.

Pretty printing. The VDM source code in ASCII format is converted into themathematical notation for VDM-SL by generating a LATEX document underUNIX.

Execution. Specification interpretation is provided for all executable language con-structs.

Test coverage analysis. Test coverage information can be automatically recordedduring the evaluation of a test-suite. The specifier can at any point check whichparts of the specification are most frequently evaluated and which parts havenot been covered at all.

Source-level debugging. The debug mode is used to interpret the specification.This mode provides an overview of break points, function trace, and definitionsavailable in the specification.

Specification management maintains a project by keeping track of the status ofmodules and remembering this status from session to session. Furthermore,all the functionalities are accessible through a graphical user interface.


Code generation An explicit form of a specification that may be compiled intoC++ or Java source code.

Dynamic Link Facility With this feature external C++ libraries may be linkedto the VDM-SL tool. This enables the combined interpretation of VDM-SLspecifications and C++ source code [FL96].

CORBA API For VDM++ an API has been developed through which the tool canbe accessed as a CORBA object. This enables the call of the specification frominside a programming language that is supported by the CORBA standard[OMG96].

The specification interpreter together with the test coverage analysis tool ismainly used for our light-weight approach to system-level testing. The possibili-ties of combining programs and executable specifications establishes the test oracleframework that is introduced in Chapter 2.5.

3.3 Test Design with Abstract Prototypes

3.3.1 Abstract Prototypes

The subject of this thesis are abstraction techniques in testing. In the following,our light-weight approach that is based on an abstract but executable model of thesystem’s requirements is introduced. Such an abstract prototype that is defined ina formal specification language, like VDM-SL, differs from conventional prototypesas used in rapid prototyping processes in many ways:

abstract model of state: the prototype is abstract in the sense that implemen-tation details are not included in the model of the state. Here, an adequateabstraction level mainly depends on the abstraction level of the requirements.

data invariants: the state model can be further constrained by adding logicalproperties to a data model. These data invariants must hold before and afterthe execution of an operation and can be dynamically checked during inter-pretation of the prototype.

preconditions: further abstraction of the formal prototype may be gained by re-ducing the definition of a system’s functionality to common cases explicitlystated in the requirements. Special cases, like exceptions may be subject for alater, more detailed design. Preconditions are used for defining the propertythat must hold for executing an operation succesfully. Here, successful execu-tion means an execution that terminates. Preconditions may be dynamicallychecked during execution of the prototype.

3.3. TEST DESIGN WITH ABSTRACT PROTOTYPES 39

advanced language features: A specification language usually offers more ad-vanced language constructs than traditional programming languages. This isdue to their mathematical nature. Examples of such features are

• logical quantors that are executable over finite domains;

• comprehension expressions, like set comprehensions;

• pattern matching, like in functional programming languages.

These language features enable an executable formal specification to serve as anabstract prototype for test-case design and validation. In the following, the approachis explained in more detail.

3.3.2 The Light-Weight Test Approach

Our method proposes the use of a model-based specification language with an exe-cutable language subset supported by a specification interpreter. In our case VDMand IFAD VDMTools have been chosen for building a formal and executable modelrepresenting the requirements of the system under test. Although the model is exe-cutable, it is still abstract in the sense that implementation details are not included.Only abstract mathematical objects like sets, sequences, finite mappings are usedto capture the relevant requirements.

Once an executable formal model has been established, the interpreter tool canbe used to run test-cases on this formal prototype. With this test-case animationalready designed test-cases can be validated, or new test cases can be designed usingstructural techniques. IFAD VDMTools support this process with their test coverageanalysis tool that highlights uncovered parts in the VDM specification.

Figure 3.1 presents an overview of the processes of our approach that have beendeveloped for the two experiments in industry. The phases are described as follows:

1. The starting point is the informal requirements documentation containing theuser and software requirements of the system that should be tested.

2. Usually, in an industrial environment a traditional test design process is al-ready established. This traditional test design method takes the informalrequirement documents and produces one test-case for each requirement.

3. In parallel to the conventional test design, a formal methods expert creates anabstract VDM prototype out of the informal requirements descriptions. Thisformalization process usually highlights a lot of open issues in the requirements.These are documented in a catalogue of questions concerning the requirements.

4. The requirement engineer reacts to the questions in the catalogue. Somequestions are due to a lack of domain knowledge of the specifier, others needa clarification and then a modification of the requirements. In some cases the


Requirements Formalization

Requirements Document

Formal Specification

Conventional Test-Cases

Formalized Test-Cases

Synthesized Test-Cases

Test-Case Animation

Test-Case Formalization

Test-Case Synthesis

Test-Case Correction

Requirements Modification

Test-Case Generation

Error Report

Catalog of Questions

Figure 3.1: Processes with inputs and outputs.

questions have to be forwarded to the customer (user) of the system underdevelopment — validation has been initiated.

5. When test-cases are available, these are formalized, as well. The formalizedtest-cases are sequences of calls to the operations of the abstract prototype.Furthermore, observer routines are used for checking the test results as spec-ified in the conventional test documents. This phase uncovers errors in thetest-cases, like missing steps, or wrong input/output specifications.

6. The abstract prototype is executed (animated) with the formalized test-cases.The coverage information that is collected during the test runs highlights un-covered parts in the requirement specification.

7. Additional test-cases are designed from the test-specification such that 100 %of the specification are covered.

One will argue that the conventional test-design approach can be skipped andtest-cases can be directly synthesized from the formal specification. However, expe-rience shows that this twofold test-design approach constitutes a valuable validationprocess. This is of high importance, since this kind of test-cases are often used asacceptance test-cases of the product. Furthermore, it is sometimes the case that aconventional test engineer has an implicit knowledge of the domain and the test-cases needed. This means that he designs test-cases for requirements that are not

3.4. FORMAL SPECIFICATIONS AS A CATALYST IN VALIDATION 41

explicitly stated in the requirements Then, it is a promising method to keep theestablished test design method and to validate it systematically as described above.

In the following, the validation role of a the formal specification and its impor-tance for testing is emphasized.

3.4 Formal Specifications as a Catalyst in Valida-

tion

3.4.1 Issues in the Requirements and Test Documentation

System-level testing is based on the user and software requirements of the systemunder test. Consequently, the quality of the test-cases depends on the quality of therequirement documentation. Common sources of errors in the test case design are

• missing requirements,

• ambiguous requirements, or even

• unsound requirement descriptions.

This is especially valid for large projects, where system-level tests are designed andcarried out by independent test engineers. This was the case in the two projectsdescribed in the next chapter.

A further source of errors is the complexity of the system-level test-cases itself.Usually, several interactions with the system are necessary for establishing an inter-nal state prior to the testing of a given requirement. Designing such a test from aninformal prose document by hand leads to several mistakes in the test design.

Common errors that have been found in the test documentation during our twoexperiments are

• missing test cases,

• missing interactions (test steps) in a test case,

• wrong interactions (test steps),

• wrong input-data of a test,

• wrongly predicted test results.

It has been demonstrated in the two industrial experiments that a light-weight for-mal method can be used very efficiently for detecting issues in both, the requirementsand the test-cases.


3.4.2 A Catalyst in Validation

Our experiments demonstrated the fact that formal specifications increased the num-ber of detected issues in the documents. Therefore, the author describes specificationtechniques as a catalyst in the validation process of requirements and system-leveltest-cases.

The American Heritage Dictionary defines a catalyst as a substance, usuallypresent in small amounts relative to the reactants, that modifies and especiallyincreases the rate of a chemical reaction without being consumed in the process.

The presentation of the empirical data gained in the next chapter will demon-strate the catalyst role of our approach in validating the requirements as well as invalidating test cases.

3.5 Summary and Discussion

In this chapter, a light-weight formal approach to testing has been presented. Light-weight formal methods and their properties have been discussed. Then, VDM andits most prominent tool-set has been explained. Both, the specification languageof VDM together with the powerful tool support enables the efficient applicationof an abstract prototype in testing as has been explained. Finally, the importantvalidation role has been discussed.

The chapter does not present any technical novelties concerning tool developmentof new formal methods. It is an application of a light-weight formal method fortesting.

However, the strong emphasize of the approach on the validation of both, therequirements as well as the test-design is new. Especially, to our present knowledge,it is the first time that this light-weight method has been applied in two industrialexperiments and that detailed empirical data have been collected — data about theefforts and efficiency of the method. These data are needed for convincing decisionmakers in industry. It is a prerequisite for transferring the techniques. A fact thatis too often neglected by the research in the domain of formal methods.

Chapter 4

Industrial Experiences

In the previous chapter, a light-weight formal method for testing has been presented.This chapter reports about our experiences with the application of the method,gained in two industrial projects. The results of the two experiments support ourclaims previously made. The presented data will give insight into the costs andbenefits of formal approaches to black-box testing. Short examples from the projectshelp to illustrate the approach further. Finally, it is discussed how these experiencesmotivated our more advanced test framework described in Chapter 5 & 6.

4.1 The Application Domain

The work presented in this chapter has been performed in two joint projects of theAustrian company FREQUENTIS1 and the Institute for Software Technology at theGraz University of Technology.

The goal of the two experiments has been to apply the light-weight formal testmethod in projects of a representative size. The size of the projects made it necessarythat the actual specification work has been carried out by two students in theirmaster’s thesis projects for the degree of a diploma engineer in Telematics2. Thestudents have been educated in formal methods and the test method by the author.Furthermore, their work has been supervised and steered by him in a very interactivemanner.

4.1.1 Air Traffic Control

The core sphere of business of the company FREQUENTIS is air traffic control. Itprovides voice and data communication systems for safely routing aircrafts betweenthe world’s airports. The existing voice communication system for air traffic control

1FREQUENTIS Nachrichtentechnik GmbH, Spittelbreitengasse 34, 1120 Vienna, Aus-tria/Europe, URL: http://www.frequentis.co.at/

2A combination of computer-science and telecommunications.

43

44 CHAPTER 4. INDUSTRIAL EXPERIENCES

line bound voice communicationvoice communication via radioVCS internal voice communication

Airport withVCS 3020S system

Another airport

External party

·

VCS

·

··

PSTNISDN

Figure 4.1: A typical environment of the VCS 3020S system

VCS 3020 is one of the central products of the company FREQUENTIS and hasbeen chosen as the target system in the first project. Voice communication amongair traffic controllers and between the controllers and the pilots is crucial to thesafety of commercial aviation. The voice communication system is responsible fortransmitting the commands of the air-traffic controllers, concerning starting andlanding clearances, positioning of the air-planes, emergency situations etc.

The work in this domain demonstrates the crucial role that formal specificationtechniques play in software engineering. They may serve as a tool for validatingboth, the functional requirements as well as existing system-level test-cases. It willbe shown by the data that the validation process of the requirements is essential forthe design of black-box test-cases.

4.1.2 Voice Communication in Air Traffic Control

Among other important systems on an airport, like radars, instrument landing sys-tems, etc., the voice communication system can be seen as the nerve system ofthe airport. It serves as the sole communication system between the pilots, the air-traffic control personnel working on the airport (denoted by • in Fig. 4.1), the groundpersonnel on the runways, other parties external to the airport and even other air-ports. This typical environment of the FREQUENTIS voice communication systemVCS 3020S is shown in Fig. 4.1.

The system offers all well known services of PABX3 or PSTN4 systems, i.e. callset up and termination, conferencing, call forwarding, call diversion, etc. In contrastto these systems, it also supports voice communication via radio. The system makesa clear distinction between line bound duplex voice communication (voice can be

3PABX . . . Private Automatic Branch Exchange4PSTN . . . Public Switched Telephone Network

4.1. THE APPLICATION DOMAIN 45

OP

OP

OP

RIF

LIF

LIF

LIF

VCS 3020S System

PSTN/PABX/phone set

ISDN

MFC

radio set

SWITCH

Figure 4.2: VCS 3020S system overview

transmitted and received at the same time) and simplex communication via radio(voice must be either transmitted or received).

Figure 4.2 shows the essential units of the system as well as the interfaces to theexternal components. The main voice communication path is established betweenthe operators working at the operator positions (OP) and line bound parties as wellas radio parties (shown at the left side of the figure). The radio sets, necessary forcommunication via radio, are external to the system and are connected to it viaradio interfaces (RIF s). Line bound voice communication is established via the lineinterfaces (LIF s), which are highly configurable.

The VCS 3020S system is a switched system intended for real-time transmissionof continuous voice data. This makes it easy to distinguish the VCS 3020S fromboth, non-switched broadcast systems and data networks, which are not intendedfor real-time transmission of continuous media. The architecture of the system isdistributed and the components are connected by means of bi-directional interfaces.Furthermore, the safety critical core components of the system are duplicated andoperate in hot stand-by mode.

4.1.3 A Network of Voice Communication Systems

In contrast to the first project, which covered the already existing voice commu-nication system VCS 3020S , the second project considered a new communicationsystem still under development.

The primary goal of the new system is the complete interconnection of air trafficcontrol centers via a fully digital network. The digital network lines will be used forvoice (telephone and radio) and data transmissions. There are several interlinkedair traffic control centers, each of them consisting of a network node and a voicecommunication system (VCS). See Figure 4.3 for an example of such a networkconnecting the main airports in Austria.

One of the advantages of the new system is that lines between the network nodes


VCS

VCS

VCS

VCS

VCS VCS

N1

N3

N5

N6

N4 N2

Figure 4.3: A network for air traffic control

are allocated dynamically. Therefore, additional lines can be allocated to compen-sate a broken line or to provide additional bandwidth in times of high network traffic.This dynamic allocation is also valuable from a cost effectiveness point of view.

Another important concept are roles. Roles are simply functionalities: for in-stance the functionality to control a certain sector of air traffic. These roles - orfunctionalities - are not bound to specific OPs, but can be distributed within the net-work, to optimize the work-load on the controllers. Therefore, it should be possibleto control air space, independent of the physical location of the controller.

4.2 Experiment I

The specification work in this project has been carried out by Johannes Horl, astudent of the author. He had a training of a one semester course in VDM, butno domain knowledge of voice communication in air traffic control. In this project,VDM++ has been used the first time in our group for specifying the requirementsof a non-trivial system.

4.2.1 Requirements Validation

In the following, the techniques and results of modeling the system in VDM++ arepresented. First, the informal UML model of the object-oriented architecture andits tool-based mapping to VDM++ classes are outlined. Then the specification offrequency coupling serves to demonstrate the specification language of VDM++.Finally, the statistics of the ’errors’ found in the informal requirement documents ispresented.

4.2. EXPERIMENT I 47

Architecture of the System

In order to structure the specification, an object-oriented architecture reflectingthe physical components has been chosen. The states and methods of the fourclasses OP , SWITCH , LIF and RIF define the main functionality of the system.Additional auxiliary classes have been used to define system-wide constraints andconfiguration details. Figure 4.4 shows the relations between these classes in UMLnotation. The UML class diagrams have been automatically generated from theformal class specification using the IFAD VDM++ Toolbox link to the RationalRose 98 CASE-tool [IFA98].

LIFid : LIF_IDs witch : @SW ITCH

RIFid : RIF_IDs witch : @SW ITCH

OPid : OP_IDs witch : @SW ITCH

SWITCHops : s et of @OPlifs : s et of @LIFrifs : s et of @RIF

lifs

0..*switch

0..*

rifs

switch

switch

0..*

ops0..*

0..*

0..*

Figure 4.4: Main class relations (in UML notation).

The following specification describes the architecture of the VCS 3020S system inVDM++ notation. Here, the specifications of the classes RIF and LIF are skipped,since they are similar to the one of the OP class.

class OP

instance variables

id : OP -ID ;switch : @SWITCH

end OP

class SWITCH

instance variables

ops : @OP -set;inv ops 4 card ops ≤ 24;lifs : @LIF -set;inv lifs 4 card lifs ≤ 96;rifs : @RIF -set;inv rifs 4 card rifs ≤ 48

end SWITCH

The bidirectional associations between the components are specified as sets ofreferences to each other. Additional invariants constrain the number (cardinality)of possible operator positions and interfaces.


An Example of the System’s Functionality

Within the scope of this project, specification has been limited to the radio commu-nication system part, since this is the most safety critical part and thus of particularinterest for analyzing the use of formal methods. In order to keep the demonstrationexample small, discussion is primarily focussed on the function “Coupling/RadioRe-transmission” in this thesis.

Note, that a physical frequency used for voice transmission is associated witha single Radio Interface (RIF ) of the system. The RIF includes hardware andsoftware necessary for the voice communication and the associated signaling, likethe PTT signal and the SQUELCH-Break signal. It is not possible to change thephysical frequency on which voice is sent or received from within the VCS 3020Ssystem. The system represents these frequencies as frequency objects, which have anumber of possible operations. The “Coupling/Radio Re-transmission” function isone of them.

With this function the operator can couple two or more frequencies. This meansthat voice received on one frequency is automatically (re-)transmitted on the othercoupled frequencies. By means of this feature, two pilots, operating on differentfrequencies, are enabled to talk to each other using the VCS 3020S system. Couplingcan be switched on/off for each frequency at the operator position and all frequenciescoupled form a so-called “coupling group”. Consequently, several different couplinggroups may exist, one at each operator position.

The instance variable frq-couplings of the class SWITCH stores the couplinggroups as a finite mapping from operator positions to sets of frequencies. TheSWITCH has to guarantee, that a frequency can only be member of a single couplinggroup and that not more than 15 frequencies are in one coupling group. Thisproperty is expressed by means of a data-invariant denoted by the keyword inv.

instance variables

frq-couplings : OP -IDm→ (FRQ-ID -set);

inv frq-couplings 4(∀ s ∈ rng frq-couplings · card s ≤ 15) ∧(card rng frq-couplings > 1 ⇒ ⋂

rng frq-couplings = {});init objectstate 4

frq-couplings := {7→};

The following two methods are called from the OP if frequency coupling isswitched on or off for a frequency. The purpose of the method O-FrqCouplingStartis to decide, whether frequency coupling is allowed for the given frequency or not.If the invariant of frq-couplings would be violated, false is returned, otherwise thefrequency is added to the coupling group associated with the OP .


methods

O-FrqCouplingStart (op : OP -ID , frq : FRQ-ID) value B 4if frq ∈ ⋃

rng frq-couplings ∨(op ∈ dom frq-couplings ∧ card frq-couplings (op) > 14)

then return false

else ( if op ∈ dom frq-couplingsthen frq-couplings := frq-couplings †

{op 7→ {frq} ∪ frq-couplings (op)}else frq-couplings := frq-couplings † {op 7→ {frq}};return true

)pre op ∈ frqs (frq).tx

The method O-FrqCouplingEnd is the counterpart of the previous one. It isused to switch off the coupling of a particular frequency. The frequency is eitherremoved from the coupling group, or the whole coupling group entry is removed, ifthe frequency was the last one in the coupling group.

methods

O-FrqCouplingEnd (op : OP -ID , frq : FRQ-ID) 4if card frq-couplings (op) = 1then frq-couplings := {op} −C frq-couplingselse frq-couplings := frq-couplings † {op 7→ frq-couplings (op) \ {frq}}pre op ∈ dom frq-couplings ∧ frq ∈ frq-couplings (op)

The specification of the OP is not shown here. It mainly deals with details of theMMI (Man Machine Interface) in order to call methods of the SWITCH . However,the specification of the OP has been important, to be able to formulate systemtest-cases at the appropriate level of abstraction.

Analysis of Identified Issues

During the formal specification of the VCS 3020S system a total number of 64ambiguous, contradictory, unclear or erroneous issues have been found in the variousdocuments of the system. Figure 4.5 shows the statistics of them. The issuescategorized as Reading have been detected by ’active’ reading the documents, havinga clear vision of the modeling method to be carried out.

After about a month of practical experience with the system parts, the informaldocuments have been read again and thereby more issues have been found, whichhave been categorized as Reading with system knowledge. Next the configuration, i.e.the static initial state of the system, and the functionality of the system have beenspecified. The issues found thereby have been named Specification of configurationand Specification of functionality, respectively. The reader should note that thereis an enormous difference in the type of issues found by reading the documents


20

8

5

17

12

0 0

2

0

5

10

15

20

25

Reading Reading with system-

knowledge

Specification of

configuration

Specification of

functionality

Issu

es

Ambiguous, contradictiory or unclear issues

Erroneous issues

Figure 4.5: Statistics of the identified issues

in comparison to those found by writing the formal specification. A typical errorfound by reading the documents would be a missing character in a table of validcharacters for a configurable name. Contrary to this, an issue found by specifyingthe functionality of the system typically deals with undefined behavior of the systemunder certain conditions or/and after a specific sequence of user actions. It wouldhave been very difficult or even impossible to find such issues only by reading orreviewing the documents, even if this is carried out by a VCS 3020S system expert.

The identified issues have been resolved and the results have been integrated intothe informal documents, which is certainly helpful to improve the overall quality ofthe system.

4.2.2 Test Case Validation

The formalization of the test cases is shown by means of an example. Figure 4.6is a copy of the informal test case “Frequency Coupling” out of the Internal TestSpecification of FREQUENTIS .

It can be seen in Fig. 4.6 that a test case is divided into several steps. Each steprequires an action from the test-personnel. The expected reaction of the VCS 3020Ssystem, which is considered to be correct is specified in the reaction column for eachtest-step. The test-personnel has to check, whether the reaction of the system isidentical to the specified behavior or not.

The formal specification of test cases follows the informal specification closely.The actions required for every step of the informal specification are explicitly speci-fied as method invocations. These methods have a prefix ‘E-’ to indicate the exter-nal call mechanism. The informally specified reactions after the actions are formallydefined as calls to observation methods with the prefix ‘Test-’. The VDM++ spec-ification in Figure 4.7 is the formalization of the informal test case presented in


StepNo.

Actions Reactions OK Remarks

1 Couple frequency 1 withfrequency 2 by pressing the<CPL> key on both frequencyobjects.

The frequencies are coupled(the coupling symbol appears for both frequencies).All audio received onfrequency 1 is transmitted onfrequency 2 and vice versa.

2 Press PTT Voice is routed to frequency 1and 2

3 Activate TX for frequency 3 Frequency 1 and 2 are stillcoupled, so all audio received onfrequency 1 is transmitted onfrequency 2 and vice versa.Attention: no voice is routed tofrequency 3

4 Press PTT Voice is routed to frequency 1, 2and 3.

5 Stop the frequency coupling bypressing the <CPL> key onboth frequency objects again.

The frequency coupling symbol( ) disappears.

Figure 4.6: Informal test case for the “Coupling/Radio Re-transmission” feature.

Figure 4.6.

The action part of the first step of the informal test case is represented by thefirst two calls of the method E-FrqChangeCoupling of the class op1 (which representsan Operator Position). This call turns the frequencies with the ids ‘1’ and ‘2’ intotransmit mode and couples them. The following two calls to the method Test-IsFrqCoupled check if it succeeded. This corresponds to the reaction part of theinformal test case. These test methods return a boolean value and the conjunctionof these values is the total result of the test case.

The test steps 2–4 have been formalized using the same pattern. The steps 2and 4 require, that the PTT key has to be pushed (formalized with the E -PttPushmethod). Note, that the PTT key also needs to be released after the reaction hasbeen tested (formalized with the E -PttRelease method). This is not mentioned inthe informal test case. Furthermore, the last three method invocations are not partof the informal test case specification. These method invocations bring the systemback into its initial state. This is very important, since the next test case to be runassumes that the system is in this initial state. When the real system is tested, it isassumed that the the test personnel sets the system into an appropriate state beforea test is performed.

The example shows how the specification based test automation leads to a moreprecise test case description. The results of a test case are not relevant for the testcoverage analysis, but further confidence into the formal specification is gained bychecking them.


methods

TestCoupling () value B 4( dcl result : B := true;

op1!E -FrqChangeCoupling (mk-FRQ-ID (1)) ;op1!E -FrqChangeCoupling (mk-FRQ-ID (2)) ;result := result ∧ op1!Test-IsFrqCoupled (mk-FRQ-ID (1)) ∧

op1!Test-IsFrqCoupled (mk-FRQ-ID (2));op1!E -PttPush () ;result := result ∧ rif 1!Test-IsPttActiveBy (mk-OP -ID (1)) ∧

rif 2!Test-IsPttActiveBy (mk-OP -ID (1));op1!E -PttRelease () ;op1!E -FrqChangeTx (mk-FRQ-ID (3)) ;op1!E -PttPush () ;result := result ∧ rif 1!Test-IsPttActiveBy (mk-OP -ID (1)) ∧

rif 2!Test-IsPttActiveBy (mk-OP -ID (1)) ∧rif 3!Test-IsPttActiveBy (mk-OP -ID (1));

op1!E -PttRelease () ;op1!E -FrqChangeCoupling (mk-FRQ-ID (1)) ;op1!E -FrqChangeCoupling (mk-FRQ-ID (2)) ;result := result ∧ ¬ op1!Test-IsFrqCoupled (mk-FRQ-ID (1)) ∧

¬ op1!Test-IsFrqCoupled (mk-FRQ-ID (2));op1!E -FrqChangeRx (mk-FRQ-ID (1)) ;op1!E -FrqChangeRx (mk-FRQ-ID (2)) ;op1!E -FrqChangeRx (mk-FRQ-ID (3)) ;return result

)

Figure 4.7: Formalized test cases.

Coverage Analysis Results

By means of the IFAD VDM++ Toolbox all test cases for the specified functionalityhave been animated. Thereby, runtime information, i.e. which expressions/statementshave been evaluated, has been collected.

The overall result of this analysis is that approximately 80% of the formal spec-ification is covered by the test cases, leaving some 20% of the specification’s func-tionality untested! The result is of major importance since it points out that anunacceptable large part of the radio functionality of the VCS 3020S system remainsuntested in the system integration test phase. A more detailed summary is shownin Table 4.1.

The IFAD VDM++ Toolbox can also generate a colored specification, i.e. thecovered and not covered parts of the specification are pretty-printed with different


Class Name Number of Methods Coverage

OP 31 91%RIF 11 70%

SWITCH 35 79%

average – 83%

Table 4.1: Summary of the test coverage results

colors. The colored specification can be used to find missing test cases. At present,it is not possible to generate the missing test cases automatically, however, it ispossible to find the missing test cases systematically. Based on this informationnew test cases have been specified and a re-run of the test coverage analysis showed,that 99% of the formal specification were covered by the combination of the old andthe new test cases, which is an acceptable coverage ratio.

4.2.3 Remarks on Experiment I

Issues

The whole model is about 2400 lines5 of VDM++ specification. About 1000 lineshave been used for the test cases and the Workspace, which was necessary to makethe specification executable.

During the development of the formal model, a total number of 64 defects havebeen found in the informal system documents. These ambiguous, contradictory,unclear or erroneous issues have been summarized in a catalogue of questions. Allof them have been resolved, supported by system experts.

A main result of this work was the realization that only 80% of the system’sradio functionality has been covered by the former existing test cases. Consequently,additional test cases have been derived out of the formal specification. Furthermore,the specification high-lighted how much more economic test cases could be designed,in order to cover more system functionality in a single run.

Efforts

During the whole work of this project, the amount of time spent on different taskshas been recorded. The summary in Fig. 4.8 shows the efforts, which have beennecessary for the different tasks. This information can be used for estimating thecosts of future projects.

Note, that the time-efforts have been recorded on a per hour basis. This meansthat the values have not been estimated after an activity has been performed. Theeffort has rather been recorded while the work has been performed.

5The VDM++ LOCs have been determined by removing all comments and counting only lineswith more than 2 characters


1,2

2,3

3,5

1,2

1,1

3,0

Read FRQ Documents

Read FM Literature

Formal Specification

Test Coverage Analysis

Modification

Meetings, Presentations

Working times are given in

40 hour units (approx. 1 week)

Figure 4.8: Time analysis for the whole work

As can be seen from the figure, the time efforts have been categorized. The timespent on reading the documents of the VCS 3020S system (Read FRQ Documents)was rather short because of an additional practical work with the existing VCS 3020Ssystem. About twice the time has been spent on reading literature about formalmethods (Read FM Literature). For the specification of the architecture and thefunctionality of the system 31

2weeks have been used. The amount of time spent

on the test coverage analysis also includes the time spent to find the additionaltest cases. The time used for the coverage analysis and for modifications due torequirement changes seems to be extremely short. About three weeks have beenspent to prepare and make several presentations and for numerous meetings.

Experience with VDM++

The specification was done using the VDM++ language definition of the IFADVDM++ Toolbox version 6.0. Besides some minor errors which have been discoveredin the Toolbox, the main drawbacks, that have been discovered, were found on theVDM++ language itself, three of them are worth being pointed out:

• Type definitions inside a class are not public. Thus, types defined in one classcannot be used in another. The consequence is that an artificial inheritancehierarchy has to be created in order to make types available to more than oneclass.

• Method calls are not allowed in expressions.

• Member variables are private and can therefore only be accessed via methods.This property combined with the previous one can be considered as the worstdrawback of VDM++, since it is not possible to make implicit specificationswithout either calling methods from within expressions in order to retrievevalues from other objects member variables, nor is it possible to get access tothese member variables directly.

4.3. EXPERIMENT II 55

The consequence of these limitations is, that it is in most cases impossible to giveimplicit specifications, and so we felt that the specification became much too explicit.It is worth mentioning, that IFAD has also identified these drawbacks independentlyand a newer version of the Toolbox with a changed language definition is alreadyavailable.

The success and acceptance of our method depends mainly on tool support.Consequently, the development of tools to support the generation of test cases canbe identified as a central issue for future work. One experience is that professionaltesters are seriously interested in the presented techniques, since they try to improvetheir testing techniques continuously.

Concerning VDM++, the new language version has to be evaluated. Futurecase studies will show, if the reported weaknesses have been solved. For the nextexperiment, the author has taken the decision that conventional VDM-SL is sufficientfor our purposes.

However, it should be pointed out that the encouraging results convinced FRE-QUENTIS to take the second step: A new project in co-operation with our institutehas been carried out. This time, VDM-SL is used in parallel to an ongoing project,in which a safety-critical system has to be developed from scratch. The results arereported in the next section.

4.3 Experiment II

The specification work in this project has been carried out by Andreas Gerstinger, astudent of the author. Due to our experiences with VDM++ in this project VDM-SLhas been used as the specification language. This decision proved to be right sincea higher level of abstraction of the VDM prototype could be observed.

4.3.1 Requirements Validation

The requirements are contained in a regularly updated document of almost 3000 re-quirements formulated in natural language. The task of the formal methods engineerwas to create an executable model of a part of the system—specifically a networknode—which plays a crucial role. There was also a first version of the architecturaldesign document available which was helpful for gaining a general overview of thenetwork node, however, the formal specification is solely based on the requirements.

The formal methods engineer had no domain knowledge in air traffic control.Therefore, he was introduced to the domain and the system to be developed in a11

2day training session held by a system engineer. It should be mentioned that

the formal methods engineer was not directly involved in the main project—theformalization was done in parallel.

The first step was an intense study of the requirements and architectural designdocument. The formal methods engineer was not given a list of requirements to be


formalized, he was rather instructed to formalize certain functionalities. Althoughthe requirements document did contain some structure, one of the most challengingtasks was the selection of the relevant requirements.

The next step was the actual formalization of the selected requirements. The for-mal model was constructed in two stages: first a model of the system’s state, usingtypes constrained by data-invariants, then the functional model containing the op-erations. This formalization process yielded the most valuable output. Formalizingrequirements requires understanding them rigorously and in an unambiguous way.Since natural language requirements are doomed to be ambiguous, several questionshave been raised during the formalization process. These questions have been incor-porated into a “catalog of questions” which was regularly exchanged between theformal methods engineer and the requirements manager. During the whole project,108 questions have been raised, with 33 of them resulting in changes in the require-ments document. These questions uncovered inconsistencies, ambiguities, errors,omissions, redundancies and inaccuracies in the requirements document—which areunavoidable in textual requirements documents of this size. This led to an improve-ment in the quality of the requirements.

Studying the documents and selecting the relevant requirements took 5.5 weeks,the actual formalization process took another 6.5 weeks for 140 requirements. Thesize of the formal specification was 60 pages including comments.

This phase could have been completed in considerably shorter time if it wouldhave been performed by a domain expert. We believe however, that in this phase,the lack of domain knowledge is not a disadvantage—in contrast—it might be advan-tageous because of the type of unbiased questions which arise. This phase resemblesa validation of the requirements.

4.3.2 Examples of Requirement Issues

The following examples demonstrate the kind of issues being raised during the for-malization process. Relatively simple examples are chosen to keep the explanationsat a minimum.

Layout of a Touch Panel

The example is about the key layout on the touch panels of the OPs. If an OPhandles the functionality of a specific role, we say the OP has the role assigned.Roles are grouped into function groups. A function group contains roles which havesimilar functionality. A VCS may have a special OP with additional functionalitywhich is called Test-OP. The original requirements for displaying the correct layoutof a touch panel are summarized in Figure 4.9.

If these requirements in plain text are read quickly, they seem to be quite rea-sonable. However, if they are read more thoroughly they give rise to questions. Ifthey are read with the process of formalization in mind, they give rise to even more


ID Text

#0025 A function group contains all roles which have about thesame function and therefore have the same key layout.There are up to 9 function groups 1..9. Each role belongsto exactly one function group. The function group 0contains the key layout for OPs which own roles frommore than one function group.

#0042 If an OP has roles from more than one function groupassigned, a default layout shall be displayed.

#0062 If all assigned roles belong to the same function group,the key layout corresponding to the function group shallbe displayed.

#1047 The key layout of the Test-OP must always be the layoutof the function group 0.

Figure 4.9: Requirements for the touch panel layout on OPs

questions. And once the actual formalization takes place, even more issues may beraised. This was the process by which the whole project was conducted: some issueswere detected just by reading and having the formalization in mind, but some issuesrequire the actual formalization to be raised. Since the goal of the formalizationwas an executable specification—a high level prototype—the focus was also on im-plementation. Issues were raised which might otherwise have been detected muchlater in the project, namely in the implementation phase.

The most obvious inconsistency is the following. Requirement #0025 is inconsis-tent in itself: First, it is stated that there are 9 function groups which are numberedfrom 1 to 9. The same requirement, however, mentions function group 0! This isan easy-to-detect inconsistency, which probably does not require formalization to beuncovered.

However, if we read on, it will be seen that requirement #0042 talks about acertain default layout. If we don’t have the formalization in mind, we might justthink about this as a certain layout. However, if we view the ”default layout”inconnection with the other requirements, the question is raised if the ”layout of thefunction group 0”and the ”default layout”are equivalent. This is in fact true, sothere is an inconsistent terminology in the requirements.

An unambiguous specification can be given using VDM type definitions:

Functiongroup = Ninv f 4 f ∈ {1, . . . , 9};

Layout = Functiongroup | DEFAULT;


A data-invariant and a union type model the fact that there are 9 function groupsnumbered 1 to 9, but ten layouts: one for each function group plus an additionaldefault layout.

During the formal specification of the actual layout calculation another issue wasraised. So far we know:

• if an OP has roles from only one function group, the layout of the functiongroup shall be displayed.

• if an OP has roles from more than one function group, the default layout shallbe displayed.

• the Test-OP shall always display the default layout.

Below, the formal definition is shown:

CalcLayout : APLID × VCSIDo→ Layout

CalcLayout (opid , vcsid) 4( if is-TestOP (vcss , opid)

then return DEFAULTelse let assigned -roles = get-roles-of -OP (rv , opid),

functiongroups = {get-fg (roles , vcsid , rid) | rid ∈ assigned -roles} in

cases functiongroups :{} → return DEFAULT,{f } → return f ,others → return DEFAULT

end

)The if-clause is used to display the correct layout on the Test-OP. Then, the set

of assigned function groups is constructed from the set of assigned roles with thehelp of a set comprehension expression. In the following cases statement, a layoutis returned depending on the set of function groups. A set can have one, more, orno elements. The case for one or more function groups (i.e. the second and thirdbranch of the cases statement) is covered in the requirements. However, what if anOP has no roles (and therefore no function groups) assigned? This case (i.e. thefirst branch of the cases statement) is not covered in the requirements!

This is not necessarily an error in the requirements, since this case may havebeen deliberately omitted, and left to the discretion of the engineers responsiblefor the following phases. In this case, it could have been left open in the formalspecification as well. However, it was not meant to be left open. There should be arequirement saying:

If an OP has no roles assigned, a default layout shall be displayed.


ID Text

#1735 If several roles come into question, then an InterCOMconnection will be prioritized. If no InterCOM role isavailable, the lowest role will be used.

Figure 4.10: Part of the rules to determine the source of a call

Order of Roles

This is an example of an ambiguity in the requirements. A controller can initiate callsto other controllers by using the devices on his OP. On the receiving controller’s OP,the call is signaled, together with the source information of who initiated the call, i.e.a caller can be identified before the call is answered. The difficulty with this conceptis the following: since OPs can be assigned functionalities (called roles) dynamically,the source information depends on the assigned roles of the initiating OP. Therefore,there have to be some rules of how the source information is determined. Therequirement in figure 4.10 is part of these rules.

It is not important to understand the requirement. However, when formalizingthis requirement, there will—at some point—be the need to determine the “lowest”role from a set of roles.

In order to formalize this requirement correctly, there must be a total orderdefined on the roles. However, this is not the case—the corresponding requirementis missing. It is interesting to note, that it was not even clear among the systemspecialists which order is meant, since roles can be ordered by different attributes.Even test cases were generated, based on a wrong assumption of an engineer. Therequirement has been clarified by exactly stating the desired order of the roles.

4.3.3 Test-Case Validation

For parts of the functionality covered by the formal specification, conventionallydesigned test-cases already existed. These test-cases were formulated in a veryclear—we call it semi-formal—way. Each test case is broken down to atomic actionsand the results of each action are specified in great detail.

The task of the formal methods engineer was to formalize these test-cases andanimate them on the formal specification using the interpreter tool. There were 65test-cases consisting of approximately 200 single steps of which 60% were formal-ized. Due to the detailed specification of the test cases, their formalization was astraightforward task.

By comparing the expected results specified in the original test-cases with theresults returned by the formal specification—which acted as a prototype—a test-case validation has taken place. 16 errors within the original test-cases have beendetected and could therefore be corrected. Furthermore, the animation of test cases


also led to a validation of the formal model. Seven minor errors were found in theformal specification and consequently corrected. Another output of this stage wasthe calculation of the test coverage of the formalized test cases with the help of thetest coverage metrics built-in in IFAD’s VDMTools.

It was realized that this coverage metrics, which is basically expression coverage,is rather weak and that complete coverage was quickly attained. This is also due tothe fact that the conventional test-cases were designed in a very thorough manner:for each requirement one or more test-cases have been specified. The time effort forthe formalization and animation of the test-cases was 2 weeks.

4.3.4 Remarks on Experiment II

Summing up, the formalization of 140 requirements uncovered 108 open issues lead-ing to 33 changes in the requirements document, and 16 errors in the acceptancetest-cases have been detected. This was done in 14 man weeks. Considering thefact that requirements engineering is one of the most difficult parts in a project thiseffort seems to be acceptable. Especially, in a safety-critical domain like air trafficcontrol, where voice communication must not fail. The customer Austro-Controlappreciated the increased rate of early questions for clarification. The requirementengineer Robert Aster said:

In the mentioned project, the application of VDM took place at a rela-tively late stage. At this time, each member of the project team alreadypossessed a high level of insider-knowledge. Considering this fact, it waseven more astonishing, that the VDM-engineer quickly became familiarwith the complex subject. Subsequently, the application of VDM quicklyled to qualified questions. Consequently, several inconsistencies in therequirements were uncovered and corrected prior to the implementationphase. Apart from this, it was realized that several requirements werewell known to the members of the project team without being actuallywritten down.

This emphasizes the important catalyst role the formalization plays in validation.However, the experience indicates that this method could be much more effective ifit would be more tightly integrated into the conventional development process.


In this chapter the results of two experiments of applying light-weight formal meth-ods to testing, have been presented. The experiments have been carried out in anindustrial environment in the highly safety-critical domain of voice-communicationfor air-traffic control. Both experiments followed the method as described in Chap-ter 3. Examples of parts of the abstract prototype specification served to illustratethe formalisms and their role in validation.

4.4. SUMMARY AND DISCUSSION 61

Experiment I detected 64 problems in the requirement documentation of theexisting voice communication system. Furthermore, an unacceptable low coveragerate of 80 % has been analyzed by testing the abstract prototype with the existingsystem-level test-cases. As a consequence, additional test-cases have been derivedfrom the formal specification. The work has been done in less than 13 man-weeks.

Experiment II detected 33 problems in the requirements documentation of thenew voice communication network under development. This time the existing test-cases covered 100 % of the abstract prototype, but the 65 test-cases with 200 test-steps contained 16 faults. Since almost all detected faults occurred in different test-cases, this means that approximately 25 % of the test-cases have been erroneous.The effort took 14 man-weeks.

These results indicate the efficient applicability of our light-weight approach inpractice. The high number of faults in the requirements reflect the fact that require-ments engineering is a difficult task. It has been shown that formal specificationsmay help to raise the quality of both, the requirements and the system-level testcases. That supervised students could do this work underpins our opinion that themethod could be adopted by practicing engineers. Furthermore, the author thinksthat the efforts are justifiable, especially for such safety-critical applications.

However, the experiments also showed that more can be achieved. We learnedthat executable specifications have the disadvantage that non-experienced users tendto think too operationally. Consequently, the prototypes lack abstraction and in-corporate too many design details. However, the engineers liked the availability oftest-cases to validate our formal specifications.

The solution is a test-case generation method that derives the complex test-cases from a general non-executable formal specification. These test-cases have tobe correct by definition. A premise for using test-cases for validating a specificationis that they are correct with respect to the specification.

Therefore, a new test-case synthesis method has been developed by the authorand is presented in the following. The general correctness relation for all kinds oftest-cases is abstraction, as is explained in the following chapter.

Chapter 5

Test-Cases are Abstractions!

In this chapter, the foundation for the new test-case synthesis approach is presented.It is shown that test-cases are abstractions of model-based formal specifications. Theframework of the refinement calculus is used for formally defining and proving thisabstraction relation for simple input-output test cases, non-deterministic test cases,partition test cases and test scenarios for interactive systems. A short introductionto the refinement calculus is included.

5.1 The New Idea

5.1.1 Motivations

In the concluding discussion of the previous chapter, the motivation for our newtest-synthesis approach has been given: Test-cases must be correct with respect tothe formal specification. Here, by the correctness of a test-case it is meant thatthe functionality described by a test-case (for a given input) must be consistentwith the functionality described by the formal specification. The reason is that thetest-cases should serve as a tool for validating an abstract possibly non-executableformal requirement specification. Test-cases can be easily understood by a customeror engineer who is not familiar with a formal specification language.

A further motivation for this work is the need of a general formal frameworkfor certifying test-case generation tools. Simon Burton has a similar motivation inhis work on Z that has been recently published [Bur00]. However, our framework ismore general as will be seen in the following.

5.1.2 Test-Design as a Formal Synthesis Problem

As can be seen in the previous chapter, test-cases are complex algorithms that haveto be executed either by a tester manually, or by test drivers.

The goal of our framework, is the derivation of such test-cases T from a formalspecification S . A test-case T should be correct with respect to S , and a program

63

64 CHAPTER 5. TEST-CASES ARE ABSTRACTIONS!

P that implements S should be tested with T . Thus, the derivation of T from Sconstitutes a formal synthesis problem. In order to formalize a certification criteriafor this synthesis process, the relation between T , S and P must be clarified.

It is well-known that the relation between a specification S and its correct im-plementation P is called refinement. We write

S v P

for expressing that P is a correct refinement (implementation) of S . The problemof deriving an unknown P? from a given S is generally known as program synthesis:

S v P?

In this chapter, it is shown that correct test-cases T are abstractions of thespecifications S , or:

T v S v P

Consequently, test-case synthesis is a reverse refinement problem. The reverserefinement from a given S into an unknown test-case T? is here called abstraction,and denoted as:

S w T?

Hence, formal synthesis techniques can be applied for deriving test-cases froma formal specification. This is the most important idea in this thesis. Its maincontribution is to use the theory of Back and von Wright’s refinement calculus[BvW98] for formulating abstraction rules for deriving correct test-cases.

The refinement calculus is a theory for reasoning about the correctness and therefinement of programs. In the following it is shown how its program synthesistechniques can be applied to test-case synthesis.

5.2 The Refinement Calculus

5.2.1 Contracts

The prerequisite for testing is some form of contract between the user and theprovider of a system that specifies what it is supposed to do. In case of system-level testing usually user and software requirement documents define the contract.Formal methods propose mathematics to define such a contract unambiguously andsoundly. In the following two chapters the formal contract language of Back and vonWright [BvW98] is used. It is a generalization of the conventional pre- and post-condition style of formal specifications known in VDM, B and Z. The foundation ofthis refinement calculus is based on lattice-theory and higher-order logic (HOL).

5.2. THE REFINEMENT CALCULUS 65

A system is modeled by a global state space Σ . A single state x in this statespace is denoted by x : Σ . Functionality is either expressed by functional statetransformers f or relational updates R. A state transformer is a function f :Σ → Γmapping a state space Σ to the same or another state space Γ .

A relational update R : Σ → Γ → Bool specifies a state change by relating thestate before with the state after execution. In HOL, relations are modeled by func-tions mapping the states to Boolean valued predicates. For convenience, a relationalassignment (x : = x ′ | b) is available and generalizes assignment statements. It setsa state variable x to a new state x ′ such that b, relating x and x ′, holds.

The language further distinguishes between the responsibilities of communicatingagents in a contract. Here, the contract models the viewpoint of one agent calledthe angel who interacts with the rest of the system called the demon. In our workfollowing [BvW98, BMvW99], the user is considered the angel and the system undertest the demon. Relational contract statements denoted by {R} express relationalupdates under control of the angel (user). Relational updates of the demon aredenoted by [R] and express updates that are non-deterministic from the angel’spoint of view. Usually, we take the viewpoint of the angel.

The contract statement 〈f 〉 denotes a functional update of the state determinedby a state transformer f . There is no choice involved here, neither for the angel northe demon agent, since there is only one possible next state for a given state.

Two contracts can be combined by sequential composition C1;C2 or choice op-erators. The angelic choice C1 t C2 and the demonic choice C1 u C2 define non-deterministic choice of the angel or demon between two contracts C1 and C2. Fur-thermore, predicate assertions {p} and assumptions [p] define conditions the an-gel, respectively the demon, must satisfy. In this language of contract statements{p}; 〈f 〉 denotes partial functions and {p}; [R] pre-postcondition specifications. Fur-thermore, recursive contracts, using the least fix-point operator µ and the greatestfix-point operator ν, are possible for expressing several patterns of iteration.

The core contract language used in this work can be summarized by the followingBNF grammar, where p is a predicate and R a relation.

C : = {p} | [p] | {R} | [R] | C ;C | C t C | C u C | µX · CTo simplify matters, we will extend this core language by our own contract

statements. However, all new statements will be defined by means of the abovecore language. Thus, our language extensions are conservative. This means that noinconsistencies into the theory of the refinement calculus are introduced by our newdefinitions.

5.2.2 Example Contracts

A few simple examples should illustrate the contract language. The following con-tract is a pre- postcondition specification of a square root algorithm:


{x ≥ 0 ∧ e > 0}; [x : = x ′ | -e ≤ x – x ′2 ≤ e]

The precondition is an assertion about an input variable x and a precision e. Arelational assignment expresses the demonic update of the variable x to its new valuex ′. Thus, the contract is breached unless x ≥ 0 ∧ e > 0 holds in the state initially.If this condition is true, then x is assigned some value x ′ for which -e ≤ x -x ′2 ≤ eholds.

Consider the following version of the square root contract that uses both kindsof non-determinism:

{x , e : = x ′, e ′ | x ′ ≥ 0 ∧ e ′ > 0}; [x : = x ′ | -e ≤ x – x ′2 ≤ e]

In this contract the interaction of two agents is specified explicitly. This contractrequires that our agent, called the angel, first chooses new values for x and e. Thenthe other agent, the demon, is given the task of computing the square-root in thevariable x .

The following example should demonstrate that programming constructs can bedefined by means of the basic contract statements. A conditional statement can bedefined by an angelic choice as follows:

if P then S1 else S2 , {P}; S1 t {¬P}; S2

Thus, the angel agent can choose between two alternatives. The agent will,however, always choose only one of these, the one for which the assertion is true,because choosing the alternative where the guard is false would breach the contract.Hence, the agent does not have a real choice if he wants to satisfy the contract.

Alternatively, we could also define the conditional in terms of choices made bythe other agent (demon) as follows:

if P then S1 else S2 fi , [P ]; S1 u [¬P ]; S2

These two definitions are equivalent. The choice of the demon agent is notcontrollable by our agent, so to achieve some desired condition, our agent has tobe prepared for both alternatives. If the demon agent is to carry out the contractwithout violating our agent’s assumptions, it has to choose the first alternative whenP is true and the second alternative when P is false.

Iteration can be specified by recursive contracts (µX · C ). Here X is a variablethat ranges over contract statements, while (µX · C ) is the contract statement C ,where each occurrence of X in C is interpreted as a recursive invocation of thecontract C . For example, the standard while loop is defined as follows:

while g do S od , (µX · if g then S ;X else skip fi)

We write skip = 〈id〉 for the action that applies the identity function to thepresent state.

5.2. THE REFINEMENT CALCULUS 67

5.2.3 Semantics

The semantics of the contract statements is defined by weakest precondition pred-icate transformers. A predicate transformer C : (Γ → Bool) → (Σ → Bool) is afunction mapping postcondition predicates to precondition predicates. The set ofall predicate transformers from Σ to Γ is denoted by Σ 7→ Γ , (Γ → Bool) →(Σ → Bool).

The different roles of the angel and the demon are reflected in the followingweakest-precondition semantics. Here q denotes a postcondition predicate and σa particular state, p is an arbitrary predicate, and R a relation. Following theconvention, we identify contract statements with predicate transformers that theydetermine. The notation f .x is used for function application instead of the morecommon form f (x ).

{p}.q , p ∩ q (assertion)

[p].q , ¬p ∪ q (assumption)

{R}.q .σ , (∃ γ ∈ Γ ¦ R.σ.γ ∧ q .γ) (angelic update)

[R].q .σ , (∀ γ ∈ Γ ¦ R.σ.γ ⇒ q .γ) (demonic update)

(C1;C2).q , C1.(C2.q) (sequential composition)

(C1 t C2).q , C1.q ∪ C2.q (angelic choice)

(C1 u C2).q , C1.q ∩ C2.q (demonic choice)

In this semantics, the breaching of a contract by our angel agent, means that theweakest-precondition is false. If a demon agent breaches a contract, the weakest-precondition is trivially true. The semantics of the specification constructs abovecan be interpreted as follows:

• The weakest precondition semantics of an assertion contract reflects the factthat, if the final state of the contract should satisfy the post-condition q ,then in addition the assertion predicate p must hold. It can be seen that theglobal state is not changed by an assertion statement. Consequently, the angelbreaches this contract if p ∩ q evaluates to false.

• The semantics of an assumption shows that the demon is responsible for satis-fying an assumption predicate p. If the assumption does not hold, the demonbreaches the contract and the angel is released from the contract. In this case,the weakest-precondition trivially evaluates to true.

• The angelic update definition says that a final state γ must exist in the rela-tion R, such that the postcondition q holds. The existential quantifier in theweakest-precondition shows that the angel has control of this update. The an-gel can satisfy the contract, as long as one update exists that satisfies the post-condition. In the set notation this update is defined as {R}.q .σ , R.σ∩q 6= ∅.


• This is in contrast to the definition of the demonic update. Here, all possiblefinal states γ have to satisfy the postcondition. The reason is that the demonicupdate is out of our control. It is not known, to which of the possible states,described by the relation R, the state variables will be set. In the set notationthis update is defined as [R].q .σ , R.σ ⊆ q .

• The weakest-precondition of two sequentially combined contracts is defined bythe composition of the two weakest-preconditions.

• The angelic choice definition shows that the weakest-precondition is the unionof the weakest-precondition of the two contracts. Thus, a further choice of theangel, further weakens the weakest-preconditions.

• The demonic choice is defined as the intersection of the weakest-preconditionsof the two contracts. Thus, demonic choice means a strengthening of theweakest-preconditions.

For further details of the predicate transformer semantics, we refer to [BvW98].

5.2.4 Refinement and Abstraction

The notion of contracts includes specification statements as well as programmingstatements. More complicated specification statements as well as programmingstatements can be defined by the basic contract statements presented above. Therefinement calculus provides a synthesis method for refining specification statementsinto programming statements that can be executed by the target system. The re-finement rules of the calculus ensure by construction that a program is correct withrespect to its specification.

Formally, refinement of a contract C by C ′, written C v C ′, is defined by thepointwise extension of the subset ordering on predicates: For Γ being the after statespace of the contracts, we have

C v C ′ , ∀q ∈ (Γ → Bool) · C .q ⊆ C ′.q

This ordering relation defines a lattice of predicate transformers (contracts) withthe lattice operators meet u and join t. The top element > is magic.q , true, astatement that is not implementable since it can magically establish every postcon-dition. The bottom element ⊥ of the lattice is abort.q , false defining the notionof abortion. The choice operators and negation of contracts are defined by point-wise extension of the corresponding operations on predicates. A large collection ofrefinement rules can be found in [BvW98, Mor90].

Abstraction is dual to refinement. If C v C ′, we can interchangeable say C isan abstraction of C ′. In order to emphasize rather the search for abstractions than

5.3. TEST-CASES AS ABSTRACTIONS 69

for refinements, we write C w C ′ to express C ′ is an abstraction of C . Trivially,abstraction can be defined as

C w C ′ , C ′ v C

Hence, abstraction is defined as the reverse of refinement.

5.3 Test-Cases as Abstractions

In the following we will demonstrate that test-cases common in software engineeringare in fact contracts – highly abstract contracts. To keep our discussion simple,we do not consider parameterized procedures, but only global state manipulations.In [BvW98] it is shown how procedures can be defined in the contract language.Consequently, our approach scales up to procedure calls.

5.3.1 Input-Output Tests

The simplest form of test-cases are pairs of input i and output o data. We candefine such an input-output test-case TC as a contract between the user and theunit under test:

TC i o , {x = i}; [y : = y ′ | y ′ = o]

Intuitively, the contract states that if the user provides input i , the state will beupdated such that it equals o. Here, x is the input variable and y the outputvariable.

In fact, such a TC is a formal pre-postcondition specification solely defined for asingle input i . This demonstrates that a collection of n input-output test-cases TCsare indeed pointwise defined formal specifications:

TCs , TC i1 o1 t . . . t TC in on

Moreover, such test-cases are abstractions of general specifications, if the specifica-tion is deterministic for the input-value of the test-case, as the following theoremshows.

Theorem 5.1 Let p : Σ → Bool be a predicate, Q : Σ → Γ → Bool a relation onstates, and TC i o a test-case with input i in variable x and output o in variabley. Then

{p}; [Q ] w TC i o ≡ (x = i) ⊆ p ∧ (| x = i |;Q) ⊆| y : = o |, where

• | p | denotes the coercion of a predicate p :Σ → Bool to a relation (here x = i).The relation | p | :Σ → Σ → Bool is defined as follows:

| p | .σ.γ , σ = γandp.σ


• | f | denotes the coercion of a state transformer f : Σ → Γ to a relation (herey : = o). The relation | f | :Σ → Γ → Bool is defined as follows:

| f | .σ.γ , f .σ = γ

• the composition operator ; is overloaded for relations. The relation compositionP ;Q is defined as follows:

(P ;Q).σ.δ , (∃γ · P .σ.γ ∧Q .γ.δ)

Proof.{p}; [Q ] w TC i o

≡ by definitions∀ σ r · p.σ ∧Q .σ ⊆ r ⇐ (x = i).σ ∧ [y : = y ′ | y ′ = o].r

≡ by definition of demonic relational assignment∀ σ r · p.σ ∧Q .σ ⊆ r ⇐ (x = i).σ ∧ (∀ y ′ · (y ′ = o) ⇒ r [y : = y ′])

≡ by simplification of update∀ σ r · p.σ ∧Q .σ ⊆ r ⇐ (x = i).σ ∧ r [y : = o]

≡ by definition of substitution r : = (y : = y ′ | y ′ = o).σ∀ σ · p.σ ∧Q .σ ⊆ (y : = y ′ | y ′ = o).σ ⇐ (x = i).σ

≡ distributivity, subset definition(∀ σ · (x = i).σ ⇒ p.σ) ∧(∀ σ σ′ · (x = i).σ ∧Q .σ.σ′ ⇒ (y : = y ′ | y ′ = o).σ.σ′)

≡ definitions(x = i) ⊆ p∧ | x = i |;Q ⊆| y : = o |

¤Theorem 1 shows that only for deterministic specifications, simple input-output

test-cases are sufficient, in general. The theorem becomes simpler if the whole inputand output is observable.

Corollary 5.1 Let p : Σ → Bool be a predicate, Q : Σ → Γ → Bool a relation onstates, and TC i o a test-case, where the whole change of state is observable. Thus,input i : Σ and output o : Γ. Then

{p}; [Q ] w TC i o ≡ p.i ∧Q .i .o

Proof. The corollary follows from Theorem 1 and the assumption that i : Σ ando : Γ .¤

The fact that test-cases are indeed formal specifications and as Theorem 1 showsabstractions of more general contracts explains why test-cases are so popular. Theyare abstract, and thus easy to understand. Furthermore, they are formal and thusunambiguous.

Furthermore, the selection of certain test-cases out of a collection of test-casescan be considered as abstraction:

5.3. TEST-CASES AS ABSTRACTIONS 71

Corollary 5.2TC i1 o1 t . . . t TC in on w TC ik ok

for all k , 1 ≤ k ≤ n.

Proof. The theorem is valid by definition of the join operator a tb w a or a tb w b,respectively.¤

5.3.2 Non-Deterministic Test-Cases

In general, a contract can permit more than one result. In this case, testing therequirements with simple input-output values is insufficient. An output predicateω : Σ → Bool can be used for describing the set of possible outputs. We define sucha test-case as follows:

TCp i ω , {x = i}; [y : = y ′ | ω]

For being a correct test-case with respect to a contract this type of test-caseshould be an abstraction of the contract.

Theorem 5.2 Let p : Σ → Bool be a predicate, Q : Σ → Σ → Bool a relation onstates, and TC2 i o a test-case with input i in variable x and output in variable ysuch that the output predicate ω holds . Then we have:

{p}; [Q ] w TCp i ω ≡ (x = i) ⊆ p ∧ (| x = i |;Q) ⊆| ω |The theorem shows that a test-case for non-deterministic results can be calcu-

lated by strengthening the precondition to a single input value and weakening thepostcondition to the outputs of interest. The fact that the output predicate ω mightbe weaker than Q represents the case that not all properties of an output might beobserved. This can be useful if not all variables or only selected properties of theoutput should be checked.

5.3.3 Partition Tests

Partition analysis of a system is a powerful testing technique for reducing the pos-sible test-cases: Here, a contract is analyzed and the input domains are split intopartitions. A partition is an equivalence class of test-inputs for which the testerassumes that the system will behave the same. These assumptions can be basedon a case analysis of a contract, or on the experience that certain input values arefault-prone.

In case of formal specifications, the transformation into a disjunctive normalform (DNF) is a popular partition technique as already mentioned in the discussionof the related work in Section 1. This technique is based on rewriting according therule A ∨ B ≡ (A ∧ B) ∨ (¬A ∧ B) ∨ (A ∧ ¬B).


A partitioning of a contract statement {p}; [R] is a collection of n disjoint par-titions {pi}; [Ri ], such that

{p}; [R] = {p1}; [R1] t . . . t {pn}; [Rn ]

and

∀i , j ∈ {1, . . . , n} ¦ i 6= j ⇒ pi ∩ pj = ∅

These partitions describe classes of test-cases, here called partition test-cases.Often in the literature, if the context is clear, a partition test-case is simply calleda test-case.

Partition test-cases are abstractions of specifications, too:

Theorem 5.3 Let {pi}; [Ri ] be a partition of a specification {p}; [R]. Then

{p}; [R] w {pi}; [Ri ]

Proof. The result follows directly from the definition of partitioning above, and thedefinition of t.¤

Above, only the commonly used pre-postcondition contracts have been consid-ered. They are a normal form for all contracts not involving angelic actions. Thismeans that arbitrary contracts excluding t and {R} can be formulated in a pre-postcondition style. (see Theorem 26.4 in [BvW98]). However, our result thattest-cases are abstractions holds for general contract statements involving user inter-action. In order to justify this, user-interaction has to be discussed with respect totesting. The next section will introduce the necessary concepts.

5.4 Testing Interactive Systems

The synthesis of black-box tests for an interactive system has to consider the pos-sible user actions. Furthermore, simple input-output test-cases are insufficient forpractical systems. Moreover, sequences of interactions, called scenarios, are nec-essary for setting the system under test into the interesting states. Consequently,scenarios of the system’s use have to be developed for testing.

Scenarios are gaining more and more popularity in software engineering. Thereasons are the same as for other test-cases: Scenarios are abstractions of an inter-active system. For a comprehensive introduction into the different roles of scenariosin software engineering see [JKS98]. In this work, the focus is on validation andverification.

5.4. TESTING INTERACTIVE SYSTEMS 73

5.4.1 User Interaction

Testing interactive systems, typically involves the selection of a series of parameters.Some of these parameters can be entered directly, some have to be set up, by initiat-ing a sequence of preceding actions. Adequate test-cases should distinguish betweenthese two possibilities of parameter setup. Therefore, simple pre-postcondition con-tracts are not sufficient to specify test-cases. Moreover, the tester’s interaction withthe system has to be modeled.

We define an atomic interaction IA of a tester, as a composition of the testerssystem update T and the following system’s response Q .

IA , {T}; [Q ]

The fact that we define an atomic interaction by means of angelic and demonicupdates does not exclude other contract statements for modeling interaction. Theo-rem 13.10 in [BvW98] states that {T};[Q] is a normal form, thus arbitrary contractstatements can be defined by means of interactions.

In this context a simple input-output test-case TCI i o includes the actual settingof the input variable to i .

TCI i o , {x : = x ′ | x ′ = i}; [y : = y ′ | y ′ = o]

Again the abstraction relation holds for this kind of test-cases.

Theorem 5.4 Let T : Σ → Γ → Bool and Q : Γ → Θ → Bool relations on states,and TCI i o a test-case with input i in variable x and output o in variable y. Then

{T}; [Q ] w TCI i o ⇐ | x : = i |⊆ T ∧Q ⊆| y : = o |Proof.

The theorem holds by homomorphism and monotonicity properties. For ab-stracting an interaction, demonic updates may be weakened and angelic updatesstrengthened.¤

The proof is similar to that of Theorem 5.1.

5.4.2 Iterative Choice

The application of an iterative choice statement for specifying and refining interac-tive systems have been extensively discussed in [BMvW99]. This statement, intro-duced in [BvW98], is defined as a recursive selection of possible interactions S .

do ♦ni gi : : Si od , (µX · {g1}; S1;X t . . . t {gn}; Sn ;X t skip)

The skip statement, models the user’s choice of stopping the dialog with the system.µ denotes the least fix-point operator. In general, a recursive contract µX · S is


interpreted as the contract statement S , but with each occurrence of statementvariable X in S treated as a recursive invocation of the whole contract.

The iterative choice statement follows a common iteration pattern, called angeliciteration. This iteration construct over S is defined as the following fixpoint:

SΦ , (µX · S ;X t skip)

Therefore, we have

do ♦ni gi : : Si od = ({g1}; S1 t . . . t {gn}; Sn)Φ

Iterative choice should not be mixed with guarded command iterations used byDijkstra [Dij76]. Guarded command iterations are strong iterations defined by Sω ,(µX · S ;X u skip) with, in contrast to angelic iteration, the termination out of auser’s control.

In [BMvW99] refinement rules for iterative choice are given. However, for testingwe need abstraction rules for the synthesis of test-cases — scenarios are our goal.

5.4.3 Scenarios

An arbitrary scenario SC of an interactive system with n possible interactions Si

and of length l is a sequence of l sequential user interactions Si . We write a sequencecomprehension expression

〈Si(k) | (1 ≤ i ≤ n) ∧ (1 ≤ k ≤ l)〉to denote such arbitrary sequences, where k is the position in the sequence. Itshould be mentioned that this sequence comprehension expression is not a validpredicate transformer, but rather serves as a scheme for sequences of predicatetransformers. We use sequence comprehensions as a convenient notation, but theycannot be defined in higher-order logic.

Scenarios are abstractions of interactive systems, modeled by iterative choice, asthe following theorem shows.

Theorem 5.5

do ♦ni gi : : Si od w 〈({gi}; Si)(k) | (1 ≤ i ≤ n) ∧ (1 ≤ k ≤ l)〉

Proof. The theorem is valid by definition of the angelic iteration statement and thusby definition of iterative choice:

do ♦ni gi : : Si od

≡ skip t {g1}; S1 t {g2}; S2 t {g1}; S1; {g1}; S1 t {g1}S1; {g2}S2 t . . .

Hence, by definition of t any choice of sequences of {gi}Si is an abstraction.¤


However, for test-case generation, we are only interested in valid scenarios. Ascenario is considered a test-scenario if it terminates for every possible initial state.Thus its weakest precondition should be true for an unspecified final state:

〈Si(k) | (1 ≤ i ≤ n) ∧ (1 ≤ k ≤ l〉.true = true

Consequently, the abstraction should not equal the abort statement. Since abortis the bottom element ⊥ of the predicate transformer lattice, it is the trivial ab-straction of every statement. Therefore, we define a notion of testing abstractionwT

S wT T , S w T A abort

and get the abstraction rule for testing scenarios:

Theorem 5.6 Let g(k) denote the guard at the kth position in a scenario and as-sume that the system specification is consistent. Hence we assume that for all inter-actions Si .true ⊆ gi . Furthermore, g(l + 1) 6= false should be an arbitrary predicatecalled the goal.

do ♦ni gi : : Si od wT

〈({gi}; Si)(k) | (1 ≤ i ≤ n) ∧ (1 ≤ k ≤ l) ∧ gi(k) ⊆ Si(k).g(k + 1) ∧ g(1) = true〉

Proof. Abstraction follows from Theorem 5.5. Termination is valid by induction:The weakest precondition of the first interaction is true, due to the assumptionthat for all interactions Si .true ⊆ gi and g(1) chosen to be true. Consequently Si(1)terminates. An interaction Si(k +1) terminates due to the fact that its pre-conditiong(k + 1) can be reached by definition.¤

This abstraction rule defines the calculation of valid test scenarios. The goalpredicate is a condition that defines the states that should be reached by a sequenceof interactions. Trivially, it can be chosen to be true. For developing a scenario forsetting a system to a certain state, this goal predicate represents the correspondingstate description.

The theorem above shows that the question if a scenario terminates, can bereduced to the question if two following interactions are composeable. From thisobservation a new testing strategy will be derived in the next section.


In this chapter, the theory of the refinement calculus have been used for showingthat test-cases are abstractions of formal specifications (contracts). The abstractionrelation have been proved for different kinds of test-cases. Simple input-output test-cases, non-deterministic test-cases, test partitions, and test-sequences, here called


test scenarios, have been covered. The presented theorems clarify the role thattest-cases play in a development process, formally we have:

test-cases v specification v implementation

The fact that many different types of test-cases can be handled, demonstratesthe generality of our approach of choosing abstraction as the correctness criterion fortest-cases. In contrast to previous work on testing, one criterion covers the simplestform of input-output test-cases as well as the most complex sequences of test-cases.

Susan Stepney did use an abstraction relation, as well [Ste95]. However, in herwork abstraction is mainly used for finding and structuring partition test-cases forobject-oriented Z specifications. She covered neither input-output test-cases, nortest-sequences.

The important conclusion that can be drawn from this chapter is: Test-cases areabstract formal contracts that partially define a system’s functionality.

The fact that test-cases are more abstract than formal specifications that coverthe whole input-output domain, like pre- postconditions, does not imply that test-cases are better suited for specification purposes.

Test-cases are indeed too abstract for replacing specifications using a formal spec-ification language. However, the reasons why test-cases are so popular in software-engineering can be given in the presented context:

• Test-cases are formal contracts. Therefore, they are more precise than naturallanguage specifications.

• Test-cases are abstract. The fact that test-cases include less detail than pro-grams or pre- postcondition specifications is the reason why they are easilyand quickly understood by engineers as well as by customers.

The theorems that have been presented in this chapter, are the prerequisite fordeveloping formal synthesis rules for testing. In the following chapter, such formalabstraction rules are presented. The rules define how test-cases can be calculatedfrom a formal specification. The different kinds of rules represent the different teststrategies how useful test cases are to be selected.

Chapter 6

Testing Strategies

A strategy for selecting test-cases is based on a hypothesis about faults in a pro-gram. This chapter shows that such strategies can be formulated as formal synthesisrules for deriving test-cases. This is a consequence of our observation that correcttest-cases are abstractions of contracts, as has been explained in Chapter 5. The ab-straction rules of the prominent strategies of partition testing and mutation testingare presented. In addition, it is shown that structural testing strategies are coveredby this approach.

Furthermore, a new technique for calculating sequences of test-cases, here calledscenarios, is presented. In contrast to previous work on test sequencing, our ap-proach does not need to compute a finite state machine. This is in contrast toprevious work on testing from state-based specifications. A well-known examplefrom the testing literature serves to demonstrate this unusual application of therefinement calculus in order to synthesize tests rather than implementations.

6.1 Partition Testing

Partition testing techniques are based on a uniformity hypothesis. This strategyassumes that a system shows the same behavior for a certain partition of inputvalues. Therefore, once the equivalence partitions are selected, it is sufficient to testone case for every partition.

For partitioning model-based specifications, [DF93] proposed the rewriting ofspecifications into their disjunctive normal form (DNF). This popular technique isbased on rewriting disjunctions with:

a ∨ b ≡ (a ∧ ¬b) ∨ (¬a ∧ b) ∨ (a ∧ b)

The proof of the following abstraction rule for this well-known strategy high-lighted a problem with this technique. In general, DNF-rewriting results in disjointpartitions. Applied to a relational specification it gives disjoint input-output rela-tions, but the input partitions (the domain of the relation) may overlap.

77

78 CHAPTER 6. TESTING STRATEGIES

In such pathological cases, selecting a test-case for each partition is not cor-rect. If test-cases are derived for overlapping domains and the specification is non-deterministic then two test-cases with the same input may define two different de-terministic outputs.

Therefore, the disjointness of the resulting input partitions dom.Qi is assumedin the following synthesis rule.

Theorem 6.1 Let a and b be arbitrary Boolean expressions and Q1, Q2, Q3 berelations. The following rule for deriving partition test-cases from pre-postconditioncontracts is then generally valid:

[x : = x ′ | a ∧ ¬b] w [Q1],[x : = x ′ | ¬a ∧ b] w [Q2],[x : = x ′ | a ∧ b] w [Q3],

∀i , j ∈ {1, 2, 3} ¦ i 6= j ⇒ dom.Qi ∩ dom.Qj = ∅[x : = x ′ | a ∨ b] w {dom.Q1}; [Q1] t

{dom.Q2}; [Q2] t{dom.Q3}; [Q3]

Proof.[x : = x ′ | a ∨ b]

= by the DNF-partitioning rule[x : = x ′ | (a ∧ ¬b) ∨ (¬a ∧ b) ∨ (a ∧ b)]

= by definition of demonic choice[x : = x ′ | (a ∧ ¬b)] u [x : = x ′ | ¬a ∧ b] u [x : = x ′ | a ∧ b]

w by w-assumptions and monotonicity of u[Q1] u [Q2] u [Q3]

= by explicitly stating the domains as assumption[dom.Q1]; [Q1] u [dom.Q2]; [Q2] u [dom.Q3]; [Q3]

= by the fact that the choice is deterministic, since:(1) the domains are disjoint by assumption,(2) the whole input domain is covered,since abstraction cannot reduce the domain of a relational update.{dom.Q1}; [Q1] t {dom.Q2}; [Q2] t {dom.Q3}; [Q3]

¤The derivation rule of Theorem 6.1 yields three disjoint partitions, although a

derived partition might be empty, thus dom.Qi = false. It follows from Theorem 5.3that each of these partitions is an abstraction of the original contract. It is obviousthat the rule has to be applied recursively to the resulting partitions if further sub-partitions are needed.

The proof of Theorem 6.1 shows that the derivation rule yields partition test-cases that cover the full input domain. Furthermore, it follows from the assumptionsthat for one partition more non-determinism can be introduced due to abstraction.This reflects the possibility of weakening certain tests.


Weakening of Tests: The rule shows that test-cases can be further abstractedby weakening a demonic update. In the rule above, for example the premise [x : =x ′ | a ∧¬b] w [Q1] indicates such a possible weakening of a test-case. This form ofweakening a test-case is a further strategy for test-case design. The reason for thisfurther abstraction of a partition might be that

• a tester does not wish to observe the (demonic) changes of the whole statespace, or

• not the whole state and thus its updates are observable.

Therefore, parts of the demonic updates of the system are skipped in the test-casespecifications. Formally this is an abstraction process carried out by weakening apost-condition.

Example 6.1 The rule is illustrated by deriving the test-cases from a specificationof the computation of the minimum of two numbers.

[z : = z ′ | (z ′ = x ∧ x ≤ y) ∨ (z ′ = y ∧ x ≥ y)]

= by the partitioning rule

{x < y}; [z : = z ′ | (z ′ = x ∧ x ≤ y) ∧ (z ′ 6= y ∨ x < y)] t{x > y}; [z : = z ′ | (z ′ 6= x ∨ x > y) ∧ (z ′ = y ∧ x ≥ y)] t{x = y}; [z : = z ′ | (z ′ = x ∧ x ≤ y) ∧ (z ′ = y ∧ x ≥ y)]

= by simplification

{x < y}; [z : = z ′ | z ′ = x ] t{x > y}; [z : = z ′ | z ′ = y ] t{x = y}; [z : = z ′ | z ′ = x ∧ z ′ = y ]

In the third test partition {x = y} a further abstraction by weakening the rela-tional update might be applied. For testing the correct computation for input x = y ,a tester might-choose to compare the new value z ′ only with one input variable. Ifx is compared to z , then the following three test cases are obtained:

w by weakening the post-condition in the third test-case

{x < y}; [z : = z ′ | z ′ = x ] t{x > y}; [z : = z ′ | z ′ = y ] t{x = y}; [z : = z ′ | z ′ = x ]

¤


Note that it has been the search for a valid abstraction that highlighted thenecessary condition of disjointness. The last step in the proof shows that only if thedemonic choice u is in fact deterministic it can be transformed into an angelic choicet of partitions. For a general demonic choice of partitions, the abstraction relationwould not hold. The consequence could be the derivation of incorrect test-cases.The second example shows such a pathological case, where the rule must not beapplied.

Example 6.2 The example demonstrates the consequence, if the rule is applied tonon-disjoint partitions.

[z : = z ′ | z ′ = 0 ∨ z ′ = 1]

w by a wrong application of the partitioning rule

{true}; [z : = z ′ | z ′ = 0 ∧ z ′ 6= 1] t{true}; [z : = z ′ | z ′ 6= 0 ∧ z ′ = 1] t{false}; [z : = z ′ | z ′ = 0 ∧ z ′ = 1]

= by simplification

{true}; [z : = z ′ | z ′ = 0] t {true}; [z : = z ′ | z ′ = 1]

w by selecting input-output test-cases

(TC 1 0) t (TC 1 1)

The result of this wrong application of the rule is the derivation of two deter-ministic test-cases for a non-deterministic result. If applied in testing an implemen-tation of this specification, these test-cases indicate an error in the implementationthat does not exist. The correct solution, for non-disjoint domains is to merge thepartitions. Then, only non-deterministic test-cases would have been possible, likeTCp 1 (z ′ = 0 ∨ z ′ = 1).¤

6.2 Structural Partitioning

In the review of the related work, structural approaches for partitioning B-specificationshave been discussed (see Section 2.1.4). Here, the syntax of the specification lan-guage is considered for partitioning. This structural technique provides a bettercontrol of the partitioning process, then pure DNF-partitioning.

Our abstraction approach can be extended for defining test-case synthesis rulesfor different kinds of specification statements. For example, the rule for partitioning

6.3. MUTATION TESTING 81

a conditional specification can be defined as follows:

Theorem 6.2 Given a conditional contract statement as defined in Section 5.2.2.C1 and C2 be arbitrary contracts and P a predicate. Then, a conditional contractcan be partitioned by

C1 w C ′1,

C2 w C ′2

if P then C1 else C2 fi w {P};C ′1 t {¬P};C ′

2

Proof.The rule is valid by the definition of the if-statement and the monotonicity of thew relation.¤

Synthesis rules, like the one for if-statements, can be given for arbitrary specifi-cation statements. Each such syntax oriented synthesis rule reflects a test-selectionstrategy, known from white-box testing. For example, the rule above representsbranch testing, a strategy, where every branch in the control-flow should be testedonce.

The following example demonstrates that a structural strategy might help toreduce the set of test-cases.

Example 6.3 The rule is illustrated by deriving the test-cases from an alternativespecification of the computation of the minimum of two numbers.

if x ≤ y then [z : = z ′ | z ′ = x ] else [z : = z ′ | z ′ = y ]

= by the partitioning rule for if-statements

{x ≤ y}; [z : = z ′ | z ′ = x ] t {x > y}; [z : = z ′ | z ′ = y ]

In contrast to the three test-cases of Example 6.1, here only two test-cases arederived for the same problem.¤

6.3 Mutation Testing

Mutation testing is a fault-based testing technique introduced by Hamlet [Ham77]and DeMillo et al [DLS78]. It is a means of assessing test suites. When a pro-gram passes all tests in a suite, mutant programs are generated and the suite isassessed in terms of how many mutants it distinguishes from the original program.The hypothesis is that programmers only make small errors. Stocks extends this


technique in [Sto93] to model-based specification languages by defining a collectionof mutation operators for Z’s specification language. An example for specificationmutation is the exchange of the join operator ∪ of sets with intersection ∩. Fromthese mutants, test-cases are generated for demonstrating that the implementationdoes not implement one of the specification mutations.

We can formulate abstraction rules for deriving mutation test-cases for a givencontract and its mutants.

Theorem 6.3 Given a contract C and a mutation Cm generated by applying amutation operator m such that m.C = Cm holds. Then a test-case TC i o thatcovers the mutation must be derived by the rule

C w TC i o,Cm 6w TC i oC w TC i o

The rule shows that a test-case for finding errors in a mutant, has to be, first, acorrect test-case of the original contract, second, it must not be a an abstraction ofthe mutated contract. Test-cases that are abstractions of the mutations do not coverthe mutated parts of a contract. Consequently, the coverage criteria of a collectionof test-cases T for a contract C in mutation testing with a collection of mutationoperators M can be given by:

∀ m ∈ M ¦ ∃ tc ∈ T ¦ (C w tc ∧m.C 6w tc)

This represents a new approach to the quality criteria for test-cases based onmutations. Again, the minimum example serves to illustrate the synthesis rule.

Example 6.4 This example shows how mutation testing techniques might be ap-plied to specification based testing. Consider this specification C of a minimumcomputation:

if x ≤ y then [z : = z ′ | z ′ = x ] else [z : = z ′ | z ′ = y ]

In mutation testing, the assumption is made that programmers produce smallerrors. A common fault made by programmers is that operators are mixed. In thisexample a possible fault would be to implement the < operator instead of the ≤.Hence, a mutation operator

m(... ≤ ...) = ... ≥ ...

is defined that changes all occurrences of ≤ in a contract to ≥. Applying thismutant operator m.C = Cm computes the following mutant:

if x ≥ y then [z : = z ′ | z ′ = x ] else [z : = z ′ | z ′ = y ]

6.4. CALCULATING SCENARIOS FOR TESTING 83

The synthesis rule says that a successful test-case for this mutation must not bean abstraction. Since

if x ≥ y then [z : = z ′ | z ′ = x ] else [z : = z ′ | z ′ = y ] 6w {x < y}; [z : = z ′ | z ′ = x ]

but

if x ≤ y then [z : = z ′ | z ′ = x ] else [z : = z ′ | z ′ = y ] w {x < y}; [z : = z ′ | z ′ = x ]

a test-case {x < y}; [z : = z ′ | z ′ = x ] is a valid test-case for detecting a specificationmutation of this kind in our implementation. On the other hand, the test-case{x = y}; [z : = z ′ | z ′ = x ] would not detect the mutation since it is an abstractionof both, the original specification and the mutated one.¤

6.4 Calculating Scenarios for Testing

In this section an alternative strategy for the sequencing of test-cases into test-scenarios is proposed.

6.4.1 Critical Remarks on FSM approaches

In the literature on related work on test-sequencing for model-oriented specifications,authors have been concentrating solely on the approach proposed by Dick and Faivrein [DF93]. This strategy first calculates partitions of the available operations andstates. Then a finite state machine (FSM) is constructed by calculating possibletransitions between the states. The result is a graph with nodes that are statepartitions and transitions that are operation partitions. To derive test-sequences(scenarios) the tester follows the paths in the graph. See the related work summa-rized in Chapter 2 for examples of this approach.

One disadvantage of this technique is that the whole FSM has to be calculated inadvance, even if full coverage is out of the tester’s scope due to resource limitations.This situation is even worse: Due to the focus on state partitions, the number ofstates increases exponentially with the number of partitioned state variables. Hence,rather large FSMs have to be calculated in advance. The second disadvantage is thata state based testing strategy is enforced, although the contract does not emphasizestates but, like for interactive systems, possible interactions are the central paradigmof description.

In the following, a scenario oriented testing strategy is proposed. We call it a lazytechnique, since the test-cases are calculated by need. It does not calculate a FSM,since it is not based on states. It is based on atomic scenarios, called compositions.


6.4.2 Compositions

We define a composition of an interactive system as a terminating sequential com-position of two interactions:

Composition = C1;C2

The following corollaries follows directly from Theorem 5.6 and defines a rule forcalculating such compositions.

Corollary 6.1 For a consistent specification of interactions we have that

(p ∩ ga) ⊆ Sa .gb ⇒do ♦n

i gi : : Si od wT {p ∩ ga}; Sa ; {gb}; Sb

where 1 ≤ a, b ≤ n holds and p is an arbitrary predicate such that p 6= false.

In practice, we will not calculate the compositions from the original specifica-tion, but will previously perform a partition analysis on the interactions, leading tomore (partition) interactions. However, the approach keeps the same. These com-positions should be calculated for all interaction partitions of interest. Next, thesecompositions are combined into scenarios.

6.4.3 Scenario Synthesis

The following rule defines the general calculation of scenarios by combining twocompositions of interest.

Corollary 6.2 Let the interactions with indices 1 ≤ i , j , k ≤ n be interactions ofan interactive system with n interaction partitions, then

do ♦ni gi : : Si od wT {p1 ∩ gi}; Si ; {gj}; Sj u {p2 ∩ gj}; Sj ; {gk}; Sk∧

p ∩ p1 ∩ g1 ⊆ Sj .(p2 ∩ g2) ⇒do ♦n

i gi : : Si od wT {gi}; Si ; {gj}; Sj ; {gk}; Sk

In order to generate valid scenarios, a tester can e.g. start by an initial interactionwith a guard equal to true and then he further searches for compositions leading tohis test-goal. Which scenarios and how many scenarios are tested, depends on thetesting strategy.

6.4.4 Scenario Based Testing Strategies

The new test approach can be divided into three phases:

1. calculation of interesting partitions for each interaction.

6.5. EXAMPLE: PROCESS SCHEDULER 85

2. calculation of compositions.

3. combination of compositions to validate or to generate test-scenarios.

Different test-coverage strategies can be derived, determined by the strategy forcombining the compositions. Interesting scenario analysis strategies are:

Derive scenarios that include for each partition

• one composition consisting of the partition: for each partition one scenario.

• all possible compositions consisting of the partition: for each partition, onescenario for each interaction reaching the partition.

• all possible combinations of compositions between two interactions of interest:all scenarios leading from one interaction of interest to another.

• all possible combinations of compositions: all possible scenarios.

The strategies are similar to the testing strategies used in data-flow testing[Bei90]. The difference is that here atomic scenarios, called compositions, are con-sidered, and in data-flow testing data-objects. An example will serve to demonstratethe approach at work.

6.5 Example: Process Scheduler

In this section, the application of our test synthesis method is demonstrated byan example, well known in the formal methods testing literature. It is the processscheduler introduced in [DF93] and translated to Back and von Wright’s contractnotation. We have chosen this example, although it is not new, because it allows acomparison to the traditional FSM approach most easily. Test-cases for industrialexamples from our projects have been calculated, too. These more complex exampleswill be published in future publications.

6.5.1 Interactive Process Scheduler

The system consists of processes either ready to be scheduled or waiting to becomeready and, optionally, a single active process. These processes are identified by aunique Pid , Nat. A process cannot be both ready and waiting, and the activeprocess is neither ready nor waiting. In addition, there must be an active processwhenever there are processes ready to be scheduled. The scheduling algorithm isnot further specified.

We can model the interactions with this process scheduler as shown in Figure 6.1.In this specification a : Pid | nil is a global variable representing an optional activeprocess, the global variable r : set of Pid and w : set of Pid represent the sets of ready


Init , [a, r ,w : = a ′, r ′,w ′ | a ′ = nil ∧ r ′ ∪ w ′ = ∅]

New , {p: = p ′ | p ′ 6= a ∧ p ′ 6∈ (r ∪ w)}; [w : = w ′ | w ′ = w ∪ pset ]

Ready , {p: = p ′ | p ′ ∈ w}; [a, r ,w : = a ′, r ′,w ′ |w ′ = w -pset ∧a = nil ⇒ (r ′ = r ∧ a ′ = p) ∧a 6= nil ⇒ (r ′ = r ∪ pset ∧ a ′ = a)]

Swap , {a 6= nil}; [a, r ,w : = a ′, r ′,w ′ |r = ∅ ⇒ (a ′ = nil ∧ r ′ = ∅) ∧r 6= ∅ ⇒ (a ′ ∈ r ∧ r ′ = r -a ′set) ∧w ′ = w ∪ aset ]

Figure 6.1: Contracts of the process scheduler’s initialization and interactions.

and waiting processes. Furthermore, p : Pid is a global input variable solely used forsetting a parameter.

New introduces another process, Ready puts a process into the ready state, andSwap changes the active process. In order to prevent a confusion with assertions,pset and aset are used for denoting the sets {p}, {a} containing the single elementp and a. Swap is an example of an operation with a precondition {a 6= nil} thatcannot be directly satisfied by choosing a parameter. This is in contrast to Ready ,where the precondition {w 6= ∅} follows from the angelic update {p: = p ′ | p ′ ∈ w}and thus must not be explicitly stated.

The interactive process scheduler can be defined by iterative choice of theseinteractions. The initialization statement should be executed once prior to userinteraction. In Figure 6.2 this model is shown. Note that the precondition of Swaphas become a guard. Furthermore, necessary conditions such that a parameterselection is possible are documented in the guards. Here w 6= ∅ is such a preconditionof Ready .

Scheduler , Init ; do true : :New♦ w 6= ∅ : :Ready♦ a 6= nil : :Swapod

Figure 6.2: Contract of the interactive process scheduler.

6.5. EXAMPLE: PROCESS SCHEDULER 87

Ready1 , {a = nil}; {p: = p ′ | p ′ ∈ w}; [a, r ,w : = a ′, r ′,w ′ |w ′ = w -pset ∧r ′ = r ∧ a ′ = p ]

Ready2 , {a 6= nil}; {p: = p ′ | p ′ ∈ w}; [a, r ,w : = a ′, r ′,w ′ |w ′ = w -pset ∧r ′ = r ∪ pset ∧ a ′ = a]

Swap1 , {a 6= nil ∧ r = ∅}; [a, r ,w : = a ′, r ′,w ′ |(a ′ = nil ∧ r ′ = ∅)∧w ′ = w ∪ aset ]

Swap2 , {a 6= nil ∧ r 6= ∅}; [a, r ,w : = a ′, r ′,w ′ |(a ′ ∈ ready ∧ r ′ = r -a ′set)∧w ′ = w ∪ aset ]

Figure 6.3: Partitioned process scheduler.

6.5.2 Interaction Partitioning

For test-case synthesis, we first partition the basic interactions. Here, our parti-tion strategy will be based solely on the case distinctions in the contract. As aconsequence, the interaction New is not partitioned. Figure 6.3 presents the newpartitions after some simplification.

Further partitioning based on interesting states would be possible. Here, forexample, New may be further partitioned into cases where w = nil and w 6= nil .Any partition is possible and can be formulated as a rewriting rule, such that theresulting partitions are correct abstractions as stated in Theorem 2.

As a consequence of this partitioning, a new interactive system contract can begiven. In this new description shown in Figure 6.4, the partition preconditions areincorporated into the guards. This is necessary such that our scenario synthesisapproach works.

Scheduler ′ , Init ; do true : :New♦ w 6= ∅ ∧ a = nil : :Ready1

♦ w 6= ∅ ∧ a 6= nil : :Ready2

♦ a 6= nil ∧ r = ∅ : :Swap1

♦ a 6= nil ∧ r 6= ∅ : :Swap2

od

Figure 6.4: Partitioned contract of the interactive process scheduler.


{a = nil};New ;Ready1 {a 6= nil};New ;Ready2

Swap1;Ready1 {card w > 1};Ready1;Ready2

{card w > 1};Ready2;Ready2

Swap2;Ready2

{r = ∅};Ready1; Swap1 {card r > 1}; Swap2; Swap2

{r = ∅ ∧ a 6= nil};New ; Swap1 Ready2; Swap2

{card r = 1}Swap2; Swap1 {a 6= nil};New ; Swap2

S ;New

Figure 6.5: Compositions of the process scheduler.

6.5.3 Compositions

The next step, is the calculation of atomic scenarios — the compositions. Thecalculation is done by applying the rule of Corollary 3 to the partitioned interactivesystem contract. In many cases, the precondition p of a composition is stronger thanthe guard ga of the first interaction Sa . In the formula of Corollary 3 this meansthat p 6= true. The reason for the additional constraint p is that the interactionSa does not guarantee that gb is satisfied. This fits perfectly into our approach,since precondition strengthening is in fact abstraction. However, such strengtheningindicates paths that are more difficult to establish. In the trivial case p = gb , whichmeans that the precondition of the composition is the conjunction of the two guardsga and gb .

The compositions of the scheduler are listed in Figure 6.5. In this presentationthe guard assertions g are skipped. Only if a guard g has been strengthened bya precondition p the additional assertion is shown as the precondition {p} of thecomposition of the two interactions.

This collection of possible compositions has several advantages: (1) Several sce-narios can be calculated step-wise, without calculating the weakest precondition forthe whole sequence of interactions again and again. (2) It carries the informationwhich interactions are easily established and which are difficult to set up. For settingup a goal as quickly as possible, choosing simple compositions will lead to shorterscenarios. On the other hand, strong preconditions indicate that these combinationsare complicated to carry out. A tester should include such complex combinations.

The compositions are grouped by the second statement, which is more practicalfor searching scenarios backwards. Backwards scenario development is more usefulif scenarios are used to reach a certain test goal, as will be seen next.


6.5.4 Scenarios

Applying the rule for composing two compositions, correct scenarios can be synthe-sized in a systematic way. In Figure 6.6 one scenario for testing each partition ispresented. For each scenario the additional precondition to be established is docu-mented. Scenario3 serves to discuss the synthesis process in more detail.

The actual scenario synthesis starts with the last statement. Here, this is Ready2,the interaction to be tested. From Figure 6.5 it can be seen that four compositionsare available. Ready1 is chosen because a scenario for Ready1 is already available.However, the new precondition forces to choose New twice.

New is chosen, because it is the most promising: Here, New can follow eachstatement S , since it has the precondition true. It is a general strategy to chooseinteractions with weak guards.

Scenario4 and Scenario5 shows that scenarios can be reused for setting a systemin a state of interest.

Based on the table of possible compositions, all scenarios according to one ofthe selection strategies that have been presented in the previous section can becalculated. Here, we applied the first strategy: One scenario for each partition.

It should be noted that for this simple strategy, not all compositions have to becalculated in advance. However, compositions carry the information which combi-nations are most easily achieved, those with the weakest additional precondition.Trivially, it equals true.


What we have presented, is to our knowledge, the first application of the refinementcalculus for generating test-cases. In this chapter several formal abstraction rules forcalculating correct test-cases have been presented. These rules represent differentstrategies for test-case selection, known as DNF-partition testing, structural testingand mutation testing.

The presented synthesis rules for scenario calculation define an alternative methodfor finding sequences of interactions. In contrast to finite state machine (FSM) basedapproaches, the focus is on finding possible compositions of interactions. Whichcompositions are possible is determined by the abstraction rules. A well-know ex-ample has been translated into an interactive contract specification and served forillustrating purposes of the method.

The advantage of our scenario calculation method is that it can be applied byneed. Therefore, we call it a lazy technique. Dependent on the system’s level ofreliability new test-scenarios can be added. The focus on compositions, high-lightswhich sequences appear to be trivial and which seem to be more-interesting, from atester’s perspective.

It is the author’s opinion that an abstraction calculus for testing, provides deeper


Scenario1 , Init ; (Testing New)New

Scenario2 , Init ; (Testing Ready1){a = nil};New ;Ready1

Scenario3 , Init ; (Testing Ready2){a = nil};New ;{a = nil ∧ card w > 0};New ;{card w > 1};Ready1;Ready2

Scenario4 , Scenario2; (Testing Swap1)Swap1

Scenario5 , Scenario3; (Testing Swap2)Swap2

Figure 6.6: Testing scenarios for the process scheduler.

insight into the methods of test-case derivation. Especially, the understanding ofnew black-box testing strategies is supported by reasoning about the synthesis pro-cess. The examined testing strategies showed that even existing techniques revealinteresting properties. New test-case generation techniques may be the result.

Chapter 7

Automated Test ResultVerification

In this chapter the possibilities to automate the test-result verification through for-mal requirement specifications are explored. More precisely, the formal methodVDM (Vienna Development Method) serves to demonstrate that abstract require-ment models can be used as test oracles for concrete software. It is shown thattechniques of VDM’s invent and verify development method can be adapted to setup a test environment. The automation of the resulting testing frame-work is basedon modern CASE-tools that support a light-weight approach to formal methodsas explained in Chapter 3. Here, the specification language used is VDM-SL, butthe results are easily transferred into similar model-based methods such as B, Z orRAISE.

7.1 From Formal Specifications to Oracles

The Vienna Development Method provides the two concepts needed to supportthe presented testing process. First, the formal specification itself, which preciselydefines what a function or operation should compute. Second, the concept of data-reification, that provides a formal definition of abstraction. In the following, bothconcepts are explained and their role in our testing approach is clarified.

7.1.1 VDM-SL Specifications as Oracles

As already mentioned, in VDM a system is specified in terms of abstract math-ematical models. VDM-SL, the general purpose specification language of VDM,provides mathematical objects, like sets, sequences, maps etc., to model a system’sstate. The functionality is formulated by imperative operations, which manipulatethese abstract state representations or applicative functions. Two styles can bedistinguished to define functionality: implicit and explicit definitions. An implicit

91

92 CHAPTER 7. AUTOMATED TEST RESULT VERIFICATION

operation defines what should happen by pre- and post-conditions.

The following example is inspired by the industrial projects that have been pre-sented in Chapter 4 in which a voice communication system for air traffic controlhas been formally specified. Here, an abstract view of the system’s radio switch ismodeled. The switch assigns frequencies to operator positions on which communi-cation may be established. The VDM-SL model refers to frequencies and operatorpositions by the identifiers FrqId and OpId. The relation is modeled by a finite map-ping from operator positions to a set of frequencies. A map can be interpreted as atwo-column table whose left- and right-side entries can be accessed by the domain(dom) and range (rng) operators.

The set of frequencies represents frequency coupling, which means that commu-nication is forwarded to all frequencies in the set. A requirement of the switch isthat a frequency must not belong to two different coupling sets. This is expressedby means of a data-invariant, stating that for all two frequency sets, with more thanone element, they may not have frequencies in common.

Switch-a = OpIdm→ FrqId -set

inv s 4 ∀ fs1, fs2 ∈ rng s · card fs1 > 1 ∧ card fs2 > 1 ⇒ fs1 ∩ fs2 = {}

A function couple-frequency, defined by pre- and post-conditions, adds frequenciesto an operator position. The pre-condition says that if the operator position hasalready a frequency associated, then no frequency set with a cardinality greater thanone must exist that already contains the frequency to be added. This would lead toviolation of the system’s invariant.

The post-condition establishes the following relation between the input (f, opand s) and the resulting output r: If the operator position has already a frequencyassociated, then the resulting map equals the old one with the input frequency addedto the existing set of frequencies of the operator position. This is expressed by over-riding (†) a map entry. In case of a new operator position, the result equals the oldswitch with the new map entry.

couple-frequency (f : FrqId , op : OpId , s : Switch-a) r : Switch-apre op ∈ dom s ⇒ ¬ ∃ fs ∈ rng s · f ∈ fs ∧ card fs > 1post if op ∈ dom s

then r = s † {op 7→ s (op) ∪ {f }}else r = s † {op 7→ f }

From a testers perspective, the post-condition serves as a test oracle. In general,an oracle is any program, process or data that specifies the expected outcome of aset of tests as applied to a tested object [Bei90]. Here, the oracle is a predicate,a Boolean function, which evaluates to true if the correct output is returned, with

7.1. FROM FORMAL SPECIFICATIONS TO ORACLES 93

C

A

C

A

r r

opabstract

opconcrete

Figure 7.1: The morphism of abstraction.

the premise that the pre-condition holds. The signature of the post-condition oracleabove is:

post-couple-frequency : OpId ∗ FrqId ∗ Switch-a ∗ Switch-a → B

The two arguments of type Switch-a define the relation between the old and newvalue of the switch. Note that pre-conditions define the input values for ’good’ tests,where valid outputs can be expected.

However, the system’s model and consequently the oracle is abstract and cannotbe directly used as a test-oracle for a test on implementation level. What is needed,is a link between the abstract level and the concrete level. In VDM, and other formalmethods, this link is provided by a precise definition of abstraction.

7.1.2 The Homomorphism of Abstraction

The M in VDM is based on a refinement of an abstract specification to a moreconcrete one, which usually is carried out in several steps. Refining the data modelis called data-reification, which is correct if it establishes a homomorphism betweenthe abstract and refined, more concrete, level.

In Figure 7.1 our notion of abstraction is represented by a commuting diagram,where operations (op) are viewed as functions from input to output states. Theabstract operation opabstract manipulates an abstract state representation of type A.

The concrete operation opconcrete incorporates detailed design decisions and mapsa refined input state to an output state. Examples for such data-refinement wouldbe to implement a set through a balanced binary-tree, or the other way round, toview a data-base system as a set of data. The relationship between the abstract andthe concrete is defined by a a function r mapping a concrete state to its abstractcounterpart.

The diagram shows that the retrieve function r is a homomorphism for whichthe following formula holds:


∀ in : C · opabstract(r(in)) = r(opconcrete(in))

Using an implicit function definition on the abstract level, the formula for acorrect implementation (refinement) is:

∀ in : C · pre-opabstract(r(in)) ⇒ post-opabstract(r(in), r(opconcrete(in)))

This simply means that the abstract post-condition must hold for the concreteinput and output, mapped to the abstract level by r, if the pre-condition is satisfied.

Modern tools allow the interpretation of explicit VDM-SL definitions. If theretrieve function and the post-condition predicate are defined explicitly, the formulaabove can be evaluated and thus provides a testing frame-work inside VDM-SL.However, to test functionality implemented in a programming language like C++,the language gap between programming languages and the specification languagehas to be overcome. In the following it is shown how modern tools provide thesebridges and help to automate the approach.

7.2 Automated Test Evaluation

The combination of the notion of abstraction and the possibility to use specifica-tions as test oracles leads to a testing framework with abstract oracles. Moderntools like the IFAD VDMTools (see Chapter 3) allow the interpretation or evencode-generation of such pre- and post-conditions which leads to an automated testevaluation through post-condition oracles.

In Figure 7.2 the data-flow of the new testing framework is presented. An imple-mentation is executed with a concrete test input (state). An implemented retrievefunction maps the concrete input and output to its abstract representations. Apre-condition check validates the input and feeds it into the oracle which checks therelation to the produced output. If the post-condition evaluates to true, the testpassed.

The advantage of the approach is the automation of black-box testing by theusage of structural test-data. Another possibility would be to code the reverse of rin order to test with abstract test-data derived from the specification.

In the following, the possibilities for automation by using the IFAD tools areexplained in more detail.

7.2.1 Code-Generation of Oracles

The IFAD VDMTools provide a C++ code-generator, which translates explicitVDM-SL specifications to C++ source code, using a VDM library. With this tool the

7.2. AUTOMATED TEST EVALUATION 95

postconditionoracle

implementation

preconditioncheck

r r

concrete input concrete output

abstractinput

valid abstractinput

abstractoutput

input valid? test ok?

Figure 7.2: DFD of the testing framework.

post-condition functions, like in our Switch example, can be automatically trans-lated to C++ . Even the pre-condition with its quantors over finite sets can becode-generated. Below the generated C++ code for post-couple-frequency is shown:

Bool post_couple_frequency(int f, int op, Map s, Map r) {Bool varRes;Bool cond;cond = (Bool) s.Dom().InSet(op);if (cond.GetValue()) {

Map var2;Map modmap;Set tmpVar2;Set var2;var2 = Set().Insert(f);tmpVar2 = (Set) s[op];tmpVar2.ImpUnion(var2);modmap = Map().Insert(op, tmpVar2);var2 = s; var2.ImpOverride(modmap);varRes = (Bool) (r == var2);

}else {

Map var2;Map modmap;modmap = Map().Insert(op, f);var2 = s; var2.ImpOverride(modmap);varRes = (Bool) (r == var2);


}return varRes;

}

What remains to implement manually is the retrieve function. Special C++generator functions in the VDM library facilitate the conversions of C++ types intoVDM objects.

7.2.2 Interpretation of Oracles with the Dynamic Link Fa-cility

Another approach is to use the specification interpreter in order to evaluate the oracleinside VDM-SL. The dynamic link facility may be used to call the C++ functions tobe tested from inside the IFAD Toolbox [FL96]. The dynamic link facility enablesa VDM-SL specification to execute parts written in C++, during interpretation. Itis called dynamic link facility because the C++ code is dynamically linked with theToolbox.

In order to access the implementation, a dlmodule has to be defined which con-tains the function’s declarations that are to be tested. For our example that would be

dlmodule CPP

exports

functions couple-frequency-ext : Z× Z× Z m→ Z-set → Z m→ Z-set

end CPP

if we model frequencies and operator positions as integers (Z). In order to testcouple-frequency in this case two functions are to be implemented. One mappingthe input to the implementation representation and calling couple-frequency, and theretrieve function which converts the output back to the abstract VDM-SL data-type.

With this approach abstract VDM-SL test-cases are fit into the implementation.Hence, the following must hold for a correct implementation opconcrete :

∀ in : A · pre-opabstract(in) ⇒ post-opabstract(in, r(opconcrete(rep(in))))

where the function rep :A → C maps the abstract input to a concrete representation.

7.2.3 CORBA Oracles

A third elegant way to bridge the language gap is CORBA, the Common ObjectRequest Broker Architecture [OMG96]. The VDM++ Toolbox provides a CORBAbased API (Application Programming Interface) through which a program can ac-cess the functionality of the VDMTools [Sto98].


The CORBA interface of the VDMTools is defined in IDL, the interface definitionlanguage of CORBA. The IDL provides a language independent interface that mapsthe different programming languages to the specification language of VDM. The setof programming languages that are supported is dependent on the object requestbroker (ORB) that is in use.

The IDL interface of the VDMTools, defines the VDMTools as a class of objects.This tool-class includes a set of methods through which the full functionality of arunning VDM toolbox may be invoked. Thus, all of the functionality that usually isinvoked through the graphical user interface may be called from other applications.

This includes the loading of a specification into the tool, starting the type-checker, and, most important for our approach, the invocation of the specificationinterpreter. IFAD’s intention for this API is the combined execution of VDM spec-ifications and program source-code. This facilitates, for example, the integration ofgraphical user interfaces with executable VDM specifications for validation purposes.Since the API is based on the distributed architecture of CORBA, the program ob-jects and the VDM interpreter may run on different machines that communicateover a network.

However, with this technique a post-condition oracle can be accessed and evalu-ated through the API, as well. The VDMTools may even run on a different (test-)server providing the VDM oracles. Since CORBA IDL converts different program-ming languages, the test approach can be applied to as many programming languagesas IDL mappings have been implemented, e.g. C++ and Java.


In this chapter we have presented an approach for automated test evaluation throughVDM-SL oracles. We have shown the general strategy and presented several wayshow the resulting test approach can be automated. The three presented solutionsfor automation are based on the commercial IFAD VDMTools.

To our present knowledge, only one automated black-box testing approach hasbeen based on an explicit mapping from an implementation level to a formally speci-fied and abstract post-condition oracle [BF86]. In this early work, the post-conditionoracle had to be translated manually into Prolog. Of course, other formal modeloriented approaches to testing have been published, but they differ in two aspects:First, the focus lies on test-case generation, rather than on oracle generation and au-tomated test-result verification [Las90, DF93, Sto93, HNS97, PS97, Don97, Aer98,Ped98]. Hence, the abstraction problem is not considered at all. Second, explicitspecifications serve as active oracles which calculate and compare the specified out-put to the program under test [Har92, ALR98, TDS98]. In contrast to our passiveoracles, these solutions cannot handle non-deterministic results.

Our idea has been presented in [Aic98] the first time. This work could alsobe extended with test-case generation techniques in order to automate the whole


test process. However, the automated test-result verification technique especiallysupports random testing.

In future we envisage an instrumentation of objects with post-condition oracles.Objects can be instrumented through inheritance without changing the actual code.Then, testable objects inherit the functionality from its superclass and provide ad-ditional retrieve and oracle functions as testing methods.

Our test-approach has been successfully applied in the master’s thesis project ofour student Pascal Fenkam [Fen00]. Currently, an object-oriented Java frameworkis under development that eases the application of the presented test approach.

We feel that the application of both, abstract formal specifications and formaldevelopment tools to testing, as presented here, is a powerful combination.

Chapter 8

Case Study

This chapter demonstrates our abstraction test approach by a case study that is in-spired by our industrial experiments. The simplified requirements of a functionalitytaken from a voice-communication system for air-traffic control serve as an illus-trating example. First, the formalization of the requirements is presented. Then,a contract that specifies the interactions of tester with the system is presented.From this contract test scenarios are calculated by searching compositions of thepartitioned operations.

8.1 Roles in Voice Communication

The functions that have to be formalized in this case study provide the managementof roles in a voice-communication system for air-traffic control.

A role is a group of functionalities that is available at an operator position: forinstance the functionality to control a certain sector of air traffic. These roles —or functionalities — are not bound to specific operator positions (OPs), but canbe dynamically distributed within the network, to optimize the work-load on thecontrollers. Therefore, it should be possible to control air space, independently ofthe physical location of the controller.

The functionality that is modeled in our system’s view is

• role request: an operator position requests a role, if possible the role is trans-ferred to the operator position.

• role release: an operator position releases one of its roles.

• call: a call is established from one operator position to a another by dialingthe phone number of a role.

• hang up: a call is released.

• enable OP: an operator position is enabled.

99

100 CHAPTER 8. CASE STUDY

• disable OP: an operator position is disabled.

The details of the functionality above, are presented in the following formal specifi-cation. First, the global state is modeled, then the operations are defined.

8.1.1 An Abstraction of the System’s State

The global state space of the system is defined by the three state variables for

• ops :Ops , the operator positions connected to the voice-communication system,

• roles : Roles that are statically and system-widely defined,

• calls : Calls , currently established between operator positions.

An array of operator positions Ops represents the existing working places of theair-traffic controllers. Each operator position Op is modeled by its set of functiongroups fgs that are installed at this operator position, the roles that are currentlyassociated to this OP, and a Boolean value denoting if the OP is enabled . Here,function groups and roles are identified by natural numbers.

Ops , array 1..opmax of Op

Op , record fgs : set of Natroles : set of Natenabled : Bool

The globally known roles in a system are defined by an array of roles. Eachrole represents a function group fg and a number num through which an operatorposition can be called.

Roles , array 1..rmax of Role

Role , record fg : Natnum : Nat

Currently established calls between operator positions are represented by a setof calls. A call consists of its source src, the initiator of the call, and the targetoperator position trg .

Calls , set of Call

Call , record src : Nattrg : Nat

External parties calling via radio or phone interfaces are not included in thismodel. The reason is that it is assumed that external calls have the same behav-ior as calls between operator positions with respect to roles. Consequently, onlyoperator to operator calls are used for testing the roles functionality. This unifor-mity hypothesis on calls demonstrates that an abstract view on a system alreadyestablishes a testing strategy.

8.1. ROLES IN VOICE COMMUNICATION 101

8.1.2 Operations

In the following the functionalities under consideration are specified formally. Eachof the operations depends on the operator position at which the operation is initi-ated. Auxiliary variables are used for storing the parameter values that are explicitlychosen by a user. Assertions further restrict a user’s choice to cases, where a functionmay be applied successfully.

A role request ReqRole is initiated by a user’s selection of an operator position oand a role r . A succeeding assertion further restricts the possible choices to operatorpositions that have less than 16 roles associated and that are enabled. Furthermore,for a requested role, the corresponding function group must be installed on thechosen operator position. If this holds, then the system demonically updates thestate variable ops such that the requested role is associated to the selected operatorposition. This implies that the role may be removed from another operator position.

The syntax of accessing a field in an array and in a record is the same. Forexample, ops [o][roles ] denotes the roles field in the oth entry of variable ops . Theoperator # denotes the cardinality of a set.

ReqRole , {o, r : = o ′, r ′ | 1 ≤ o ′ ≤ opmax ∧ 1 ≤ r ′ ≤ rmax};{#ops [o][roles ] < 16};{ops [o][enabled ]};{roles [r ][fg ] ∈ ops [o][fgs ]};[ops : = ops ′ | ∀op · (r ∈ ops ′[op][roles ] ∧ op = o) ∨

(r 6∈ ops ′[op][roles ] ∧ op 6= o)]

Releasing a role, RelRole, is specified by first selecting an appropriate operatorposition o and a role r , and then removing the role from the operator position. Theuser’s (angelic) selection is restricted to operator positions that have the role to bereleased.

RelRole , {o, r : = o ′, r ′ | 1 ≤ o ′ ≤ opmax ∧ 1 ≤ r ′ ≤ rmax};{ops [o][enabled ]};{r ∈ ops [o][roles ]};[ops : = ops ′ | ∀op · r 6∈ ops ′[op][roles ] ]

A Call to another operator position via the phone number of a role, involves theselection of the calling operator position o1 and the role number n. A selection isvalid if there exists both, a role with this number, and an enabled and not callingtarget operator position with this role associated. Furthermore, the calling operatorposition must not take part in a call already. For valid input, the system successfullyselects the target operator position o2 and establishes a call to o2.


Call , {o1, n: = o1′, n ′ | 1 ≤ o1′ ≤ opmax};{ops [o1][enabled ]};{6 ∃c ∈ calls · c[src] = o1 ∨ c[trg ] = o1};{∃r , op · roles [r ][num] = n ′ ∧ r ∈ ops [op][roles ]∧

ops [op][enabled ]∧∀c ∈ calls · c[trg ] 6= op ∧ c[src] 6= op};

[o2: = o2′ | ∃r · roles [r ][num] = n ∧ r ∈ ops [o2′][roles ]];[calls : = calls ′ | calls ′ = calls ∪ Call(o1, o2)]

A HangUp disconnects a call at an operator position o. A precondition is that acall is established previously. The remove of a call is specified implicitly, by statingthat a call is removed in which o is either the source or the target.

HangUp , {o: = o ′ | 1 ≤ o ′ ≤ opmax};{∃c ∈ calls · c[src] = o ∨ c[trg ] = o};[calls : = calls ′ | ∃c · calls ′ = calls\c ∧ (c[src] = o ∨ c[trg ] = o)]

EnableOp enables a selected operator position, if it has been disabled in advance.

EnableOp , {o: = o ′ | 1 ≤ o ′ ≤ opmax};{¬ops [o][enabled ]};ops [o][enabled ]: = T

DisableOp disables an enabled operator position. As a consequence, all roles ofthe operator position are released.

DisableOp , {o: = o ′ | 1 ≤ o ′ ≤ opmax};{ops [o][enabled ]};ops [o][enabled ]: = F;ops [o][roles ]: = ∅;

With these definitions our abstract view on the functionality of the voice commu-nication system is precisely defined. However, in the next section it will be seen thatfor testing purposes the possible user interactions of the system should be explicitlyspecified.

8.2 Tester-System Interactions

In the previous section, the operations of the system have been formally specified.Most of the formal methods, like VDM and Z, stop the specification process at thispoint. However, in Back and von Wright’s contract language the behavior of theuser, her interactions, can be defined more precisely.

8.2. TESTER-SYSTEM INTERACTIONS 103

One possibility would be to include the operations in an iterative choice state-ment as has been done in the process scheduler example of Chapter 6. However, froma tester’s perspective this would not be the most elegant choice for this example.

The difference to the process scheduler is that the communication system has adistributed interface — the operator positions. Consequently, a single tester has tomove locally when operations from a different operator position are to be initiated.Therefore, it makes sense to

• separate the selection of operator positions from the other user interactions;

• design test scenarios such that operations initiated at one operator positionare grouped together.

The result of this considerations is the specification of the voice communicationsystem below. In this contract, the selection of the operator position is separatedfrom the other selections and is included in an enclosing iterative choice statement.

Furthermore, the operations have been divided into their angelic contract partand their demonic contract part. The angelic part of an Operation is denoted byOperationa and represents the user’s inputs (angelic updates) and his constraints(assertions). The demonic part is denoted by Operationd and represents the reactionsof the system (demonic updates). This separation makes the contract more readable,but has its advantages for deriving the test scenarios, too, as will be seen later.

VCS , {ops , roles , calls : = ops ′, roles ′, calls ′ | ops ′[roles ] = ∅ ∧ calls ′ = ∅};do true : :{o: = o ′ | 1 ≤ o ≤ opmax};

do ops [o][enabled ] ∧#ops [o][roles ] < 16 : : ReqRolea ;ReqRoled

♦ ops [o][enabled ] ∧ ops [o][roles ] 6= ∅ : : RelRolea ;RelRoled

♦ ops [o][enabled ]∧(@c ∈ calls · c[src] = o ∨ c[trg ] = o) : : Calla ;Calld

♦ ops [o][enabled ]∧(∃c ∈ calls · c[src] = o ∨ c[trg ] = o) : : HangUpd

♦ ¬ops [o][enabled ] : : Enabled

♦ ops [o][enabled ] : : Disabled

odod

The contract of the VCS can be interpreted from a tester’s point of view asfollows:

1. In advance of the actual testing, the testing environment has to be set up. Thisinvolves the installation of function groups at operator positions, the enteringof role data (function group and number). Furthermore, certain operatorpositions may be disabled in advance. These data can be fit into the system


over a special working place for parameterizing the VCS. This initializationis modeled by an angelic update with the constraints that no calls can beestablished in advance, and no roles can be previously associated.

2. The tester selects an operator position and initiates one of the following oper-ations:

(a) If an enabled operator position has less than 16 roles associated, then anew role may be requested:

ReqRolea , {r : = r ′ | 1 ≤ r ′ ≤ rmax ∧ roles [r ′][fg ] ∈ ops [o][fgs ]}

ReqRoled , [ops : = ops ′ | ∀op · (r ∈ ops ′[op][roles ] ∧ op = o) ∨(r 6∈ ops ′[op][roles ] ∧ op 6= o)]

(b) If an enabled operator position has at least one role associated, then arole can be released:

RelRolea , {r : = r ′ | 1 ≤ r ′ ≤ rmax ∧ r ′ ∈ ops [o][roles ]}RelRoled , [ops : = ops ′ | ∀op · r 6∈ ops ′[op][roles ]]

(c) If an enabled operator position is not already involved in a call, then acall can be initiated:

Calla , {n: = n ′ | ∃r , op · roles [r ][num] = n ′ ∧ r ∈ ops [op][roles ]∧ops [op][enabled ] ∧ ∀c ∈ calls · c[trg ] 6= op ∧ c[src] 6= op}

Calld , [o2: = o2′ | ∃r · roles [r ][num] = n ∧ r ∈ ops [o2′][roles ]];[calls : = calls ′ | calls ′ = calls ∪ Call(o, o2)]

(d) If an enabled operator position is involved in a call, then the operatorcan stop calling by hanging up. Here, no further selection of the tester isneeded. Therefore, a demonic update is sufficient:

HangUpd , [calls : = calls ′ | ∃c · calls ′ = calls\c ∧ (c[src] = o ∨ c[trg ] =o)]

(e) A disabled operator position may be enabled:

EnableOpd , ops [o][enabled ]: = T

(f) An enabled operator position may be disabled:

DisableOpd , ops [o][enabled ]: = F; ops [o][roles ]: = ∅;(g) The tester may skip the operation selection and may choose a new oper-

ator position for entering his commands.

3. The tester decides to stop testing.

8.3. DNF-PARTITIONING 105

8.3 DNF-Partitioning

As in the process scheduler example, the next step is the partitioning of the inter-actions. This can be done by applying the rule for generating a disjunctive normalform (DNF).

However, in this contract none of the demonic updates includes cases such thatthe DNF rule can be applied. The only ∨ that is included in ReqRoled is inside a∀-quantifier expression which cannot be split apart.

The only partitioning that can take place is the partitioning of the guard expres-sion for hanging up. There, the existential quantified expressions can be split intotwo separate expressions:

(∃c ∈ calls · c[src] = o ∨ c[trg ] = o)

= by splitting the quantifier

((∃c ∈ calls · c[src] = o) ∨ (∃c ∈ calls · c[trg ] = o))

= by DNF-rewriting

((@c ∈ calls · c[src] = o) ∧ (∃c ∈ calls · c[trg ] = o))∨((∃c ∈ calls · c[src] = o) ∧ (@c ∈ calls · c[trg ] = o))∨((∃c ∈ calls · c[src] = o) ∧ (∃c ∈ calls · c[trg ] = o))

= by the invariant: only one call per operator position

(∃c ∈ calls · c[src] 6= o ∧ c[trg ] = o)∨(∃c ∈ calls · c[src] = o ∧ c[trg ] 6= o))∨((∃c ∈ calls · c[src] = o ∧ c[trg ] = o))

= by the invariant: self-calls are not possible

(∃c ∈ calls · c[trg ] = o)∨(∃c ∈ calls · c[src] = o))

This gives two partitions for hanging up. In the first, the called operator hangsup. In the second, the calling operator hangs up.


Note that the guard for the call operation cannot be partitioned since

@c ∈ calls · c[src] = o ∨ c[trg ] = o ≡ ∀c ∈ calls · c[src] 6= o ∧ c[trg ] 6= o

Consequently, the partitioned contract VCS ′ is only modified with respect to theHangUp operation.

VCS ′ , {ops , roles , calls : = ops ′, roles ′, calls ′ | ops ′[roles ] = ∅ ∧ calls ′ = ∅};do true : :{o: = o ′ | 1 ≤ o ≤ opmax};

do ops [o][enabled ] ∧#ops [o][roles ] < 16 : : ReqRolea ;ReqRoled

♦ ops [o][enabled ] ∧ ops [o][roles ] 6= ∅ : : RelRolea ;RelRoled

♦ ops [o][enabled ] ∧(@c ∈ calls · c[src] = o ∨ c[trg ] = o) : : Calla ;Calld

♦ ops [o][enabled ] ∧ (∃c ∈ calls · c[src] = o) : : HangUpd

♦ ops [o][enabled ] ∧ (∃c ∈ calls · c[trg ] = o) : : HangUpd

♦ ¬ops [o][enabled ] : : Enabled

♦ ops [o][enabled ] : : Disabled

odod

The contract VCS is an example of a class of systems where domain partitioningis trivial and thus not sufficient. The complex guards indicate that, here, the chal-lenging task is the sequencing of the tests — the test scenario synthesis. As explainedin the previous chapter, our strategy starts with the search for valid compositions.

8.4 Scenario Synthesis

8.4.1 Interaction Compositions

Our test synthesis method for test scenarios is based on compositions (see Chap-ter 6). These compositions are pairwise sequential compositions of interactions.Their construction plays three different roles:

1. compositions are atomic elements for finding possible scenarios;

2. the compositions highlight which interactions are more interesting than otherswith respect to testing;

3. by searching valid compositions, the preconditions and guards of the interac-tions of a contract are validated.

In the following, a set of compositions for each of the interactions in the VCS ′

contract is presented. Since the contract consists of 7 basic operations on a certain

8.4. SCENARIO SYNTHESIS 107

operator position, theoretically 7 × 7 = 49 compositions might exist. However, fortesting only a reduced number of compositions is necessary. The following type ofcompositions can be skipped:

• non-terminating composition: the weakest precondition of the sequential com-position is false. Thus for an arbitrary predicate q , (S1; S2).q = false. Thisrepresents a non-feasible test sequence.

• independent composition: the weakest precondition of the sequential composi-tion S1; S2 equals the conjunction of the weakest preconditions of S1 and S2.Thus, for an arbitrary predicate q , (S1; S2).q = S1.q ∩ S2.q . This represents atest sequence, where S1 does not contribute in setting up the system’s state,such that S2 can be invoked. These compositions are not interesting from atester’s point of view.

The following compositions are grouped by the second operation.

ReqRole

Five operations may precede ReqRole and yield independent compositions. A pre-ceding HangUp is not interesting for testing since it constitutes an independentcomposition.

(C1) {ops [o][enabled ] ∧ ops [o][roles ] < 15 ∧ rmax ≥ 2∧∃r1, r2 · r1 6= r2 ∧ roles [r1][fg ] ∈ ops [o][fgs ] ∧ roles [r2][fg ] ∈ ops [o][fgs ]∧r1 6∈ ops [o][roles ]};ReqRolea ;ReqRoled ;ReqRolea ; {r 6∈ ops [o][roles ]};ReqRoled

(C2) {ops [o][enabled ] ∧ ops [o][roles ] < 16∧∃r · roles [r ][fg ] ∈ ops [o][fgs ]};ReqRolea ;ReqRoled ;ReqRolea ; {r ∈ ops [o][roles ]};ReqRoled

(C3) {ops [o][enabled ] ∧ ∃r · r ∈ ops [o][roles ]};RelRolea ;RelRoled ;ReqRolea ;ReqRoled

(C4) {ops [o][enabled ] ∧ (∃r , op · op 6= o ∧ r ∈ ops [op][roles ] ∧ ops [op][enabled ]∧∀c ∈ calls · c[trg ] 6= op ∧ c[src] 6= op)∧roles [r ][fg ] ∈ ops [o][fgs ] ∧#ops [o][roles ] < 16};Calla ;Calld ;ReqRolea ; {roles [r ][num] = n};ReqRoled


(C5) {¬ops [o][enabled ] ∧ ∃r · roles [r ][fg ] ∈ ops [o][fgs ]∧#ops [o][roles ] < 16};Enabled ;ReqRolea ;ReqRoled

ad C1: This constitutes the request of two different roles. The additional assertion{r 6∈ ops [o][roles ]} after the second ReqRolea reduces the possibilities of thetester’s role selection to new roles. This is the reason why the operationshave been split into an angelic and demonic part. This additional assertion isconsistent with our approach of abstraction, since reducing the choices of theangel is abstraction.

The precondition states that the operator position (OP) must be enabled; twofurther roles can be added to the OP; at least two roles exist in the system;two roles exists that have the permission to be added to the OP and one roleis not added yet.

ad C2: This means requesting the same role twice. The additional assertion {r ∈ops [o][roles ]} after the second ReqRolea reduces the possibilities of the tester’srole selection to a role that has been already added to the OP.

The precondition states that the operator position must be enabled; one fur-ther role can be added to the OP; one role exists that can be added to theOP.

ad C3: For releasing a role and then requesting one, at least one role must beassociated to the enabled OP previously.

ad C4: This represents the interesting case, where during a call the role of thepartner is requested. This raises the question if this should be possible — anexample for the validation role of the composition synthesis process.

The precondition states that the OP must be enabled; a role associated to asecond enabled OP exists which is not involved in a call and this role can berequested.

ad C5: This composition results in a case where the first role is added to the OPafter its enabling. For enabling an OP it must be disabled in advance, fur-thermore the preconditions for requesting a role must hold.

RelRole

Only three operations can be composed with RelRole to independent and termi-nating compositions. Enabling and disabling operations cannot directly precedeRelRole. A preceding HangUp and Call result in an independent composition. SinceCall is interesting from a validation point of view its composition is included, thequestion is: Is it possible to release a role during a call?


(C6) {ops [o][enabled ] ∧ ∃r · roles [r ][fg ] ∈ ops [o][fgs ]∧ops [o][roles ] = ∅};ReqRolea ;ReqRoled ;RelRolea ;RelRoled

(C7) ops [o][enabled ] ∧#ops [o][roles ] ≥ 2};RelRolea ;RelRoled ;RelRolea ;RelRoled

(C8) {ops [o][enabled ] ∧ (∃r , op · op 6= o ∧ r ∈ ops [op][roles ] ∧ ops [op][enabled ]∧∀c ∈ calls · c[trg ] 6= op ∧ c[src] 6= op)∧ops [o][roles ] 6= ∅};Calla ;Calld ;RelRolea ; {roles [r ][num] = n};RelRoled

ad C6: This composition represents the case, where a certain role is requested andafterwards released. Here, the precondition is further restricted (abstracted)to guarantee that the same role is requested and afterwards released: No rolesshould be associated previously. The more general case of requesting one roleand releasing another would be less interesting.

Therefore, the precondition states that the OP must be enabled; a role withthe appropriate function group exists; and the set of associated roles is empty.

ad C7: This is a test for releasing two roles successively. A precondition is thattwo roles are associated to the enabled OP.

ad C8: This composition represents a test for checking if releasing a role is possibleduring a call. The complex precondition shows that this composition is anindependent one.

Call

The two interesting compositions for testing the call functionality, is to hang up andthen to initiate a call. For hanging up we have identified two partitions. The otherpossible preceding operations are not very interesting for testing due to their highindependence.

(C9) {ops [o][enabled ] ∧ (∃c ∈ calls · c[src] = o)};HangUpd ;Calla ;Calld

(C10) {ops [o][enabled ] ∧ (∃c ∈ calls · c[trg ] = o) ∧ (∃op · ops [op][roles ] 6= ∅)};HangUpd ;Calla ;Calld


ad C9: This is the case, where the OP has called another one, hangs up, andagain initiates a call. Due to the composition the precondition becomes rathersimple.

The precondition says that a call must exist that has been initiated by theOP, and OP must be enabled.

ad C10: This is the case, where the OP has been called by another one, the testerat the called OP hangs up, and again initiates a call.

Here, the precondition in addition demands that another OP exists that hasa role associated. It might be the case that the other OP, which previouslycalled, has meanwhile released its roles.

HangUp

The only interesting and possible preceding operation for hanging up at an operatorposition is the initiation of a call. The other partition of HangUp can only be reachedby first changing the actual operator position.

(C11) {ops [o][enabled ] ∧ (∃r , op · r ∈ ops [op][roles ] ∧ ops [op][enabled ]∧∀c ∈ calls · c[trg ] 6= op ∧ c[src] 6= op)};Calla ;Calld ;HangUpd

ad C11: This represents a test-case for testing the hanging up during an initiatedcall. The precondition is the precondition of Call .

EnableOp

For enabling an operator position, the only possible preceding operation at the sameoperator position is its disabling.

(C12) {ops [o][enabled ]};Disabled ;Enabled

ad C12: Disabling and then enabling demands that the operator position is firstenabled.

DisableOp

Disabling an operator position is trivial in the sense that it is possible after eachpreceding operation. This operation builds independent compositions with everyother operation. The question the experts have to answer is: Is this a good de-sign decision? Again, the validation process of the specification (design) has been


improved by raising this kind of questions.

8.4.2 Selecting Test Scenarios

Based on the compositions above, different strategies for selecting the scenarios arepossible. Examples are:

1. test all (partitioned) operations once: a maximum of one scenario per availableoperation is needed;

2. test all compositions once: a maximum of one scenario per composition;

3. test each composition with all possible scenarios between a given operation ofinterest and the composition;

4. test each composition with all possible preceding scenarios: maximum numberof test-cases;

5. test each operation with all possible preceding scenarios including independentcompositions: maximum number of test-sequences.

The strategies are ordered by the increasing number of needed scenarios for coveringthese criteria. The additional tests of independent compositions in the last strategymight detect unintended feature interactions [MZ94].

In the following, the test scenarios for the first (Strategy 1) and second strategy(Strategy 2) are presented.

Strategy 1

The following six test scenario contracts have been developed following Strategy 1.This strategy only demands one test for each partitioned operation.Therefore, thesimplest compositions can be chosen to set up a state such that the operation to betested can be invoked.

Where compositions have been used, a scenario is annotated by the identifier ofthe composition. The process follows a backtracking approach where the synthesisprocess starts from the operation under test and ends at the parameterization of thesystem.

In the test-cases, certain input selections have been further abstracted to con-crete test-data. This has been done by adding an additional assertion after anangelic update. In the contract language, the sequential composition of an angelicupdate and an assertion {R}; {p} reduces the range (final states) of the relation Rto (ran.R ∩ p).


Scenarios for Testing Each Operation

(T1) {ops , roles , calls : = ops ′, roles ′, calls ′ | calls ′ = ∅ ∧ ops ′[roles ] = ∅∧¬ops ′[1][enabled ] ∧ roles ′[1][fg ] ∈ ops ′[1][fgs ]};{o: = o ′ | o ′ = 1};

(C5) Enabled ;ReqRolea ;ReqRoled


(C5) Enabled ;(C6) ReqRolea ;ReqRoled ;

RelRolea ;RelRoled

(T3) {ops , roles , calls : = ops ′, roles ′, calls ′ | calls ′ = ∅ ∧ ops ′[roles ] = ∅∧¬ops ′[1][enabled ] ∧ ¬ops ′[2][enabled ]∧roles ′[1][fg ] ∈ ops ′[1][fgs ]};{o: = o ′ | o ′ = 1};

(C5) Enabled ;ReqRolea ;ReqRoled ;{o: = o ′ | o ′ = 2};Enabled ;Calla ;Calld

(T4) {ops , roles , calls : = ops ′, roles ′, calls ′ | calls ′ = ∅ ∧ ops ′[roles ] = ∅¬ops ′[1][enabled ] ∧ ¬ops ′[2][enabled ] ∧ roles ′[1][fg ] ∈ ops ′[1][fgs ]};{o: = o ′ | o ′ = 1};

(C5) Enabled ;ReqRolea ;ReqRoled ;{o: = o ′ | o ′ = 2};Enabled ;

(C11) Calla ;Calld ;HangUpd



(C5) Enabled ;ReqRolea ;ReqRoled ;{o: = o ′ | o ′ = 2};Enabled ;

(C11) Calla ;Calld ;{o: = o ′ | o ′ = 1};HangUpd

(T6) {ops , roles , calls : = ops ′, roles ′, calls ′ | calls ′ = ∅ ∧ ops ′[roles ] = ∅∧¬ops ′[1][enabled ]};{o: = o ′ | o ′ = 1};Enabled ;Disabled

It should be noted that the test-case (T1) is included in (T2), and (T3) is includedin (T4). Therefore, the two test-cases (T1) and (T3) can be skipped for testing alloperations once. They are redundant.

It should be obvious that all these tests have to be carried out on all operatorposition in the voice communication system.

Strategy 2

Next, the scenarios for covering all of the 12 compositions are presented. Here, wewill skip those scenarios that are detected to be redundant during the developmentof the test-cases, i.e. scenarios that would test compositions that are already coveredby other test scenarios.

Scenarios for Testing Each Composition

(T1) {ops , roles , calls : = ops ′, roles ′, calls ′ | calls ′ = ∅ ∧ ops ′[roles ] = ∅∧¬ops ′[1][enabled ] ∧ roles ′[1][fg ] ∈ ops ′[1][fgs ]∧roles ′[2][fg ] ∈ ops ′[1][fgs ]};{o: = o ′ | o ′ = 1};

(C5) Enabled ;(C1) ReqRolea ; {r = 1};ReqRoled ;

ReqRolea ; {r = 2};ReqRoled






(C5) Enabled ;(C6) ReqRolea ; {r = 1};ReqRoled ;(C3) RelRolea ; {r = 1};RelRoled ;


(T4) {ops , roles , calls : = ops ′, roles ′, calls ′ | calls ′ = ∅ ∧ ops ′[roles ] = ∅∧¬ops ′[1][enabled ] ∧ ¬ops ′[2][enabled ]∧roles ′[1][fg ] ∈ ops ′[1][fgs ] ∧ roles ′[1][fg ] ∈ ops ′[2][fgs ]};{o: = o ′ | o ′ = 1};

(C5) Enabled ;ReqRolea ; {r = 1};ReqRoled ;{o: = o ′ | o ′ = 2};Enabled ;

(C4) Calla ; {roles [1][num] = n};Calld ;ReqRolea ; {r = 1};ReqRoled

(T5) {ops , roles , calls : = ops ′, roles ′, calls ′ | calls ′ = ∅ ∧ ops ′[roles ] = ∅∧¬ops ′[1][enabled ] ∧ roles ′[1][fg ] ∈ ops ′[1][fgs ]∧roles ′[2][fg ] ∈ ops ′[1][fgs ]};{o: = o ′ | o ′ = 1};


ReqRolea ; {r = 2};ReqRoled ;(C7) RelRolea ; {r = 1};RelRoled ;

RelRolea ; {r = 2};RelRoled



(C5) Enabled ;ReqRolea ; {r = 1};ReqRoled ;{o: = o ′ | o ′ = 2};

(C5) Enabled ;ReqRolea ; {r = 2};ReqRoled ;

(C8) Calla ; {roles [1][num] = n};Calld ;RelRolea ; {roles [r ][num] = n};RelRoled


(C5) Enabled ;ReqRolea ; {r = 1};ReqRoled ;{o: = o ′ | o ′ = 2};Enabled ;

(C11) Calla ; {roles [1][num] = n};Calld ;(C9) HangUpd ;

Calla ; {roles [1][num] = n};Calld


(C5) Enabled ;ReqRolea ; {r = 1};ReqRoled ;{o: = o ′ | o ′ = 2};

(C5) Enabled ;ReqRolea ; {r = 2};ReqRoled ;

(C11) Calla ; {roles [1][num] = n}Calld ;{o: = o ′ | o ′ = 1};

(C10) HangUpd ;Calla ; {roles [2][num] = n};Calld


(T9) {ops , roles , calls : = ops ′, roles ′, calls ′ | calls ′ = ∅ ∧ ops ′[roles ] = ∅∧¬ops ′[1][enabled ]};{o: = o ′ | o ′ = 1};

(C12) Enabled ;Disabled

These nine test scenarios include all twelve compositions that have been identifiedto be of interest for testing. Test-case (T5) includes composition (C1) and thus test-case (T1) might be skipped. However, a tester may want to check the results of thecompositions at the end of a test-case. Then, test-case (T5) should be included andthe other skipped test-cases for composition (C5), (C6), and (C11) may be added.

Depending on the need and on the test hypothesis made, more tests can bedesigned. The following possibilities for extending the set of test-cases can be con-sidered:

• More scenarios may be developed following the other scenario strategies.

• The scenarios should be tested on all operator positions in the voice commu-nication system.

• Different roles than the first two might be chosen.

• Additionally, extreme values might be chosen. Examples are maximum andminimum numbers of roles and function groups at an operator position.

• Additional mutation tests might be invented from the specification. Then, itcan be checked if the test scenarios would detect the specification mutations,by showing that the derived scenarios are not abstractions of the specificationmutants.


In this chapter, an important functionality of a real voice communication systemhas served as a case study for illustrating our new test approach based on formalcontracts. An abstract model of the operations for role management has been con-structed. Next, a contract that defines the possible tester-system interactions hasbeen presented, and then test-scenarios have been derived (abstracted) from thisformal contract. Therefore, the contract had been first partitioned and then inter-action compositions have been developed. From these compositions, a set of testscenarios has been constructed.


The example demonstrates that a rather complex functionality, as the manage-ment of roles in a voice communication system, can be specified in a rather coarsecontract. The key to this simplicity of the contract is abstraction — abstraction toa level needed for system testing, not for testing the specification.

The roles management functionality has been part of our second experimenton light-weight approaches as described in Chapter 4. The executable VDM-SLspecification turned out to be much more complex (30 pages). One reason is thelarge set of auxiliary functions that has to be defined for manipulating the statespace. In our contract, these state updates are specified implicitly by relationsexpressed by quantifiers that are not executable but more compact.

The second reason, why the implicit model of this chapter is less complex, is moretrivial: Requirements concerning exceptional behavior and architectural details havebeen skipped. Mathematically, this means that data-abstraction has been appliedfor modeling the central behavior first. And the test scenarios of the example showthat this central behavior is complex enough. It is our strong opinion that, first,the most important parts of a system have to be understood, validated and tested.Then, the view on a system (its model) may be refined further and more detailscan be considered. The refinement calculus provides the theory for this step-wisetop-down approach.

Refinement is an important aspect in our new test approach. In our methodthe use of the refinement calculus supports the data-refinement of a specificationtogether with its test-cases, since both are contracts. This is in contrast to test-approaches that use finite state machines (FSM), where refinement is rather difficult.Derrick and Boiten showed that states and transitions in the FSM might disappearor are to be added under data-refinement [DB99]. In our approach the interactioncompositions and thus the scenarios can be data-refined quite straight-forwardly.See [BMvW99] for an example of data-refinement of an interactive system in therefinement calculus.

Finally, the validation aspect of constructing the compositions should be pointedout. Several missing preconditions have been detected, due to this scenario synthesismethod. Further questions have been raised by considering, if a possible composi-tion of interactions does indeed reflect the intended behavior. This questions areeasily communicated to a customer, since they are caused by concrete examples ofbehavior.

Chapter 9

Concluding Remarks

In this thesis three formal development techniques have been examined regardingtheir support in the testing of computer-based systems: Light-weight approaches toformal methods, the refinement calculus, and data-reification techniques. As a re-sult, novel testing techniques have been developed. In this chapter some concludingremarks regarding this work are provided. This is done by first reviewing the resultsof this research. This is followed by a presentation of areas of future work.

9.1 Research Results

The research of this thesis resulted in three new approaches to testing:

1. a light-weight approach to specification based testing that has been suc-cessfully applied in industry,

2. an abstraction-based test synthesis method that uses the mathematicalframework of the refinement calculus,

3. a method for using abstract formal specifications as automated test-oracles.

In the following the contributions of these test methods are discussed in more detail.

9.1.1 Light-Weight Approach

The presented light-weight approach uses abstract VDM prototypes for assessingand designing test-cases. It is light-weight in the sense that no proofs are carriedout in this method. Two industrial experiments in the domain of air-traffic controldemonstrated the relevance of our method.

This part of the work is an application of an existing light-weight formal methodto testing. However, the strong emphasis of the approach on the validation ofboth, the requirements as well as the test-design is new. Especially, to our presentknowledge, it is the first time that this light-weight method has been applied in two

119

120 CHAPTER 9. CONCLUDING REMARKS

industrial experiments and that detailed empirical data have been collected — dataabout the efforts and efficiency of the method. The results of these experimentsdemonstrated the main advantages of our method:

• the formal specification of the requirements is a catalyst in validation in thesense that many questions are raised during the modeling process: Exper-iment I detected 64 problems in the given requirement documentation andExperiment II found 33 problems;

• previously designed test-cases can be improved: Experiment I uncovered anunacceptable low coverage of appr. 80 % in the existing test-cases. ExperimentII found that appr. 25 % of the existing test-cases were erroneous;

• engineers like the availability of test-cases to validate our formal specifications.Having an abstract prototype, an engineer or customer is not forced to under-stand the formal specification. The test-cases demonstrate the consequencesof the model by examples;

• the efforts are acceptable low.

However, the experiments also showed that more can be achieved. We learned thatexecutable specifications have the disadvantage that non-experienced users tend to,and sometimes are forced to, think too operationally. Consequently, the specifica-tions lack abstraction and incorporate too many design details.

9.1.2 Abstraction-Based Test Synthesis

This part of the work resulted in the most important contributions of this thesis.It has been observed that test-cases are abstractions of formal specifications. Thislead to a new test synthesis approach based on program synthesis. In this thesisit is the first time that the refinement calculus is applied to the problem of testsynthesis. The result is an abstraction calculus for testing in order to calculate test-cases according to a testing strategy. New formal synthesis rules for several teststrategies have been defined. This research has identified and proved the followingfacts:

• test-cases can be viewed as formal contracts,

• test-cases are abstractions of formal specifications,

• program synthesis techniques can be applied to the test-case synthesis problem,

• the refinement calculus of Back and von Wright provides a powerful mathe-matical framework for defining an abstraction calculus for testing,

• simple input-output test-cases as well as complex test scenarios can be treatedby an abstraction approach to testing,

9.2. FUTURE WORK 121

• rather different testing strategies can be formulated as formal abstraction rules:partition testing, structural testing, mutation testing, and scenario testing(test sequencing).

It is the author’s opinion that an abstraction calculus for testing, provides deeperinsight into the methods of test-case derivation. Especially, the understanding ofnew black-box testing strategies is supported by reasoning about the synthesis pro-cess. The examined testing strategies showed that even existing techniques revealinteresting properties. New test-case generation techniques may be the result.

For example, the presented synthesis rules for scenario calculation define analternative method for finding sequences of interactions. In contrast to finite statemachine (FSM) based approaches, the focus is on finding possible compositionsof interactions. Which compositions are possible is determined by the abstractionrules.

9.1.3 Formal Specifications as Oracles

In this part of the research, the possibilities to automate the test-result verificationthrough formal requirement specifications have been explored. More precisely, theformal method VDM (Vienna Development Method) serves to demonstrate thatabstract requirement models can be used as test oracles for concrete software. Itis shown that techniques of VDM’s invent and verify development method can beadapted to set up a test environment.

Our ideas differ to most of the previous work on test-result verification in twoaspects:

• a possibly non-executable specification may serve as a rather general oraclefor checking the test results. Non-deterministic test results can be handled aswell.

• the abstraction problem is solved by using VDM’s data-reification techniquesto relate the test-data for the implementation to the abstract specificationoracle.

9.2 Future Work

Naturally, further extensions to our new abstraction calculus for test-synthesis mustbe included in the list of future work. More synthesis rules can be found thatrepresent other test strategies that are not considered in this thesis. This includes,strategies for testing several forms of iterations, procedure calls and data-types.

One could think of a calculus for design-patterns that provides a testing strat-egy for each pattern. Integration tests could be designed for certain software ar-chitectures, like for pipe-and-filters, object-oriented architectures and black-boardsystems.

122 CHAPTER 9. CONCLUDING REMARKS

Furthermore, the test calculus could be extended to handle concurrency. Theframework of action systems [BS94], an extension of the refinement calculus fordescribing reactive systems, could be used for defining new rules.

The synthesis rules could be refined such that they define a structural operationalsemantics of the test-selection strategy. Like for programming languages it shouldbe possible to derive test-case generators out of this operation semantics definition.Furthermore, program transformation tools could be applied to derive the test-casessemi-automatically.

The development of future tools that support our test-synthesis process is oneof our central interests. We envisage the use of a theorem prover to calculate thescenarios. The author has already started first experiments with constructive proofsin Isabelle [Pau94]. Alternatively, model-checkers may be used for checking theabstraction relations. However, for the application in industry new interfaces areneeded that hide the complex interior of a theorem prover. The advantage of ourmethod is that a test-case generator based on our approach can be certified byproving that the correct abstractions are calculated.

We hope that the presented work stimulates further research on test-synthesisbased on other program-synthesis approaches. Especially, the application of programsynthesis and transformation tools for testing could be a promising topic of futureresearch.

9.3 Epilogue

The introduction started with the sentence “Software testing is a challenging task.”If testing is taken seriously, it could be as challenging as the development of the pro-gram itself. Therefore, the test design deserves the same level of rigor than softwaredesign. In this thesis, we have provided new methods that facilitate this aim. Theview that test-cases are abstractions of specifications should help to further unifyefforts in the testing and formal methods community. We conclude by expressingthe hope that in future more people will be aware of the central property:

test-cases v specification v implementation

Bibliography

[Abr96] Jean-Raymond Abrial. The B-Book: Assigning Programs to Mean-ings. Cambridge University Press, Trumpington Street, CambridgeCB2 1RP, Great Britain, 1996.

[Aer98] Lionel Van Aertryck. Une methode et un outil pour l’aide a lageneration de jeux de tests de logiciels. PhD thesis, Universite deRennes, January 1998.

[AGA00] Bernhard K. Aichernig, Andreas Gerstinger, and Robert Aster. Formalspecification techniques as a catalyst in validation. In Proceedings ofthe 5th Conference on High-Assurance Software Engineering, 15th–17thNovember 2000, Albuquerque, New Mexico, USA. IEEE, 2000.

[AH00] S. Antoy and D. Hamlet. Automatically checking an implementationagainst its formal specification. IEEE Transactions on Software Engi-neering, 26(1):56–69, 2000.

[Aic98] Bernhard K. Aichernig. Automated requirements testing with abstractoracles. In ISSRE’98: The Ninth International Symposium on Soft-ware Reliability Engineering, Paderborn, Germany, pages 21–22, IBMThomas J.Watson Research Center, P.O.Box 218, Route 134, York-town Heights, NY, USA, November 1998. Ram Chillarege. ISBN 3-00-003410-2.

[Aic99a] Bernhard K. Aichernig. Automated black-box testing with abstractVDM oracles. In John Fitzgerald and Peter Gorm Larsen, editors,Workshop Materials: VDM in Practice!, Part of the FM’99 WorldCongress on Formal Methods, Toulouse. Springer, 1999.

[Aic99b] Bernhard K. Aichernig. Automated black-box testing with abstractVDM oracles. In M. Felici, K. Kanoun, and A. Pasquini, editors, Com-puter Safety, Reliability and Security: proceedings of the 18th Interna-tional Conference, SAFECOMP’99, Toulouse, France, September 1999,volume 1698 of Lecture Notes in Computer Science, pages 250–259.Springer, 1999.

123

124 BIBLIOGRAPHY

[Aic99c] Bernhard K. Aichernig. Executable specifications in software reliabilityengineering. In 2nd Workshop on Formal Descriptions and SoftwareReliability (FDSR), Boca Raton, Florida, 31st of October 1999.

[Aic01] Bernhard K. Aichernig. Test-case calculation through abstraction. InProceedings of Formal Methods Europe 2001, FME 2001, March 12–162001, Berlin, Germany, Lecture Notes in Computer Science. SpringerVerlag, 2001. To be published.

[AK00] Bernhard K. Aichernig and Reinhold Kainhofer. Modeling and validat-ing hybrid systems using VDM and Mathematica. In C.Michael Hol-loway, editor, Lfm2000, Fifth NASA Langley Formal Methods Work-shop, Williamsburg, Virginia, June 2000, number CP-2000-210100 inConference Publication, pages 35–46. NASA, June 2000.

[ALR98] Sten Agerholm, Pierre-Jean Lecoeur, and Etienne Reichert. Formalspecification and validation at work: A case study using VDM-SL.In Proceedings of Second Workshop on Formal Methods in SoftwarePractice, Florida, Marts. ACM, 1998.

[BCGM00] Simon Burton, John Clark, Andy Galloway, and John McDermid. Au-tomated V & V for high integrity systems, a targeted formal methodsapproach. In C.Michael Holloway, editor, Lfm2000, Fifth NASA Lan-gley Formal Methods Workshop, Williamsburg, Virginia, June 2000,number CP-2000-210100 in Conference Publication. NASA, June 2000.

[Be80] D. Bjørner and O. Oest (eds.). Towards a Formal Description of Ada,volume 98 of Lecture Notes in Computer Science. Springer-Verlag,1980.

[Bec99] Kent Beck. Extreme Programming Explained: Embrace Change.Addison-Wesley, 1999.

[Bee86] D. Beech, editor. Concepts in User Interfaces: A (VDM) ReferenceModel for Command and Response Languages, volume 234 of LectureNotes in Computer Science. Springer-Verlag, 1986.

[Bei90] Boris Beizer. Software Testing Techniques. Van Nostrand Reinhold,New York, 2nd edition, 1990.

[Bei95] Boris Beizer. Black-Box Testing: Techniques for Functional Testing ofSoftware and Systems. John Wiley & Sons, Inc., 1995.

[BF86] R.E. Bloomfield and P.K.D. Froome. The application of formal methodsto the assessment of high integrity software. IEEE Transactions onSoftware Engineering, SE-12(9):988–993, September 1986.

BIBLIOGRAPHY 125

[BFL+94] Juan C. Bicarregui, John S. Fitzgerald, Peter A. Lindsay, RichardMoore, and Brian Ritchie. Proof in VDM: A Practitioner’s Guide.FACIT. Springer-Verlag, 1994.

[BFL96] T.M. Brookes, J.S. Fitzgerald, and P.G. Larsen. Formal and informalspecifications of a secure system component: Final results in a com-parative study. In Marie-Claude Gaudel and Jim Woodcock, editors,FME’96: Industrial Benefit and Advances in Formal Methods, pages214–227. Springer-Verlag, March 1996.

[BG81] Timothy A. Budd and Ajei S. Gopal. Program testing by specificationmutation. Computer Program Testing, pages 129–148, 1981.

[BGL+00] S. Bensalem, V. Ganesh, Y. Lakhnech, C. Munoz, S. Owre, H. Ruesz,J. Rushby, V. Rusu, N. Shankar, E. Singerman, and A. Tiwari. Anoverview of SAL. In C.Michael Holloway, editor, Lfm2000, FifthNASA Langley Formal Methods Workshop, Williamsburg, Virginia,June 2000, number CP-2000-210100 in Conference Publication, pages187–196. NASA, June 2000.

[BGM91] G. Bernot, M.-C. Gaudel, and B. Marre. Software testing based on for-mal specifications: A theory and a tool. Software Engineering Journal,6(6):387–405, November 1991.

[BGM92] G. Bernot, M.-C. Gaudel, and B. Marre. A formal approach to soft-ware testing. In Maurice Nivat, Charles Rattray, and Teodor Rus,editors, Proceedings of the Second International Conference on Alge-braic Methodology and Software Technology, Workshops in Computing,pages 243–253, London, May 22–25 1992. Springer-Verlag.

[Bjø89] Dines Bjørner. Specification and transformation, towards a meaningof ‘M’ in VDM. Lecture notes to IFIP TC2/WG2.2 South AmericaTutorial: Formal Description of Programming Concepts, April 1989,February 1989. A revised and expanded version of a TAPSOFT article.

[BMvW99] Ralph Back, Anna Mikhajlova, and Joakim von Wright. Reasoningabout interactive systems. In Jeannette M. Wing, Jim Woodcock, andJim Davies, editors, FM’99 — Formal Methods, World Congress onFormal Methods in the Development of Computing Systems, Toulouse,France, September 1999, Proceedings, Volume II, volume 1709 of Lec-ture Notes in Computer Science. Springer, 1999.

[Bod88] Grahan Boddy. The use of VDM within the Alvey Flagship Project. InR. Bloomfield, L. Marshall, and R. Jones, editors, VDM ’88: VDM– The Way Ahead, pages 153–166. VDM-Europe, Springer-Verlag,September 1988.

126 BIBLIOGRAPHY

[Boe75] B.W. Boehm. The high cost of software. In E. Horowitz, editor, Practi-cal Strategies for Developing Large Software Systems. Addison-Wesley,Reading MA, 1975.

[Boe81] Barry W. Boehm. Software Engineering Economics. Prentice-Hall,Englewood Cliffs, NJ, 1981.

[Boe86] Barry W. Boehm. A spiral model of software development and en-hancements. ACM Software Engineering Notes, 11(4):22–42, 1986.

[Bri89] E. Brinksma. A theory for the derivation of tests. In Peter H.J. van Eijk,Chris A. Vissers, and Michel Diaz, editors, The Formal DescriptionTechnique LOTOS: Results of the ESPRIT/SEDOS Project, pages 235–247. Elsevier Science Publishers North-Holland, 1989.

[Bro75] Frederick P. Brooks. The Mythical Man-Month, Essays on SoftwareEngineering. Addison Wesley, 1975.

[Bry86] Randal E. Bryant. Graph-based algorithms for Boolean function ma-nipulation. IEEE Transactions on Computers, 35(8):677–691, August1986.

[BS94] R.J.R. Back and K. Sere. From modular systems to action systems. InProceedings of Formal Methods Europe’94, Spain, October 1994, Lec-ture Notes in Computer Science. Springer-Verlag, 1994.

[Bur00] Simon Burton. Automated testing from Z specifications. Technical re-port, Department of Computer Science, University of York, November2000.

[BvW98] Ralph-Johan Back and Joakim von Wright. Refinement Calculus,a Systematic Introduction. Graduate Texts in Computer Science.Springer, 1998.

[BW99] Salimeh Behnia and Helene Waeselynck. Test criteria definition for Bmodels. In Jeannette M. Wing, Jim Woodcock, and Jim Davies, editors,FM’99 — Formal Methods, World Congress on Formal Methods inthe Development of Computing Systems, Toulouse, France, September1999, Proceedings, Volume I, volume 1709 of Lecture Notes in ComputerScience, pages 509–529. Springer, 1999.

[Cho78] T.S. Chow. Testing software design modeled by finite-state machine.IEEE Transactions on Software Engineering, 4(3), 1978.

[Chu40] Alonzo Church. A formulation of the simple theory of types. Journalof Symbolic Logic, 5:56–68, 1940.

BIBLIOGRAPHY 127

[Coe92] Martin David Coen. Interactive Program Derivation. PhD thesis,St. John’s College, University of Cambridge, 1992.

[Coh78] E.I. Cohen. A Finite Domain-Testing Strategy for Computer ProgramTesting. PhD thesis, Ohio State University, June 1978.

[Cot84] I.D. Cottam. The rigorous development of a system version controlprogram. IEEE Transactions of Software Engineering, 10(2):143–154,March 1984.

[CS94] D. Carrington and P. Stocks. A tale of two paradigms: Formal meth-ods and software testing. In Proceedings of the 8th Z User Meeting.Springer-Verlag, 1994.

[Daw91] John Dawes. The VDM-SL Reference Guide. Pitman, 1991. ISBN0-273-03151-1.

[DB99] John Derrick and Eerke Boiten. Testing refinements of state-basedformal specifications. Software Testing, Verification and Reliability,9:27–50, July 1999.

[DDH72] O.-J. Dahl, E.W. Dijkstra, and C.A.R. Hoare. Structured Programming,chapter I. Notes on Structured Programming, page 6. A.P.I.C. Studiesin Data Processing. Academic Press, London and New York, 1972.

[DF93] Jeremy Dick and Alain Faivre. Automating the generation and sequenc-ing of test cases from model-based specifications. In J.C.P. Woodcockand P.G. Larsen, editors, FME’93: Industrial-Strength Formal Meth-ods. Springer-Verlag, April 1993.

[DG95] Eugene Durr and Stephen Goldsack. The development of concurrentand real-time systems using VDM++, June 1995.

[DGP95] Eugene Durr, Stephen J. Goldsack, and Nico Plat. Object reification inVDM++. In M. Wirsing, editor, ICSE-17 Workshop on Formal MethodsApplication in Software Engineering Practice, pages 194–201, 24–25April 1995.

[Dij76] E.W. Dijkstra. A Discipline of Programming. Series in AutomaticComputation. Prentice-Hall International, 1976.

[DKLM98] Martin Dunstan, Tom Kelsey, Steve Linton, and Ursula Martin.Lightweight formal methods for computer algebra systems. In IS-SAC ’98: International Symposium on Symbolic and Algebraic Com-putation, University of Rostock, Germany, August 1998.

128 BIBLIOGRAPHY

[DLS78] R. DeMillo, R. Lipton, and F. Sayward. Hints on test data selection:Help for the practicing programmer. IEEE Computer, 11(4):34–41,April 1978.

[DM91] P. Dauchy and B. Marre. Test data selection from algebraic spec-ifications: Application to an automatic subway module. In A. vanLamsweerde and A. Fugetta, editors, 3rd European Software Engineer-ing Conference, ESEC’91, volume 550 of Lecture Notes in ComputerScience, 1991.

[dNH84] R. de Nicola and M. Hennessy. Testing equivalences for processes.Theoretical Computer Science, 34, 1984.

[Don97] Michael R. Donat. Automating formal specification-based testing. InMichel Bidoit and Max Dauchet, editors, TAPSOFT ’97:Theory andPractice of Software Development, 7th International Joint ConferenceCAAP/FASE, volume 1214 of Lecture Notes in Computer Science,pages 833–847. Springer-Verlag, April 1997.

[Don98] Michael R. Donat. A Discipline of Specification-Based Test Derivation.PhD thesis, Department of Computer Science, The University of BritishColumbia, September 1998.

[Dro00] Georg Droschl. Formal Specification and Analysis of Requirements inSoftware Development. PhD thesis, Institute for Software Technology,TU-Graz, Austria, April 2000. Supervisor: Peter Lucas.

[ELC+98] S. M. Easterbrook, R. Lutz, R. Covington, J. Kelly, Y. Ampo, andD. Hamilton. Experiences using lightweight formal methods for re-quirements modeling. IEEE Transactions on Software Engineering,24(1), January 1998.

[Fen00] Pascal Christian Fenkam. Dynamic user management system for websites. Master’s thesis, Institute for Software Technology, TU-Graz, Aus-tria, September 2000. Supervisor: Harald Gall and Peter Lucas.

[FGJM95] K. Futatsugi, J.A. Goguen, J.P. Jouannaud, and J. Meseguer. Princi-ples of OBJ2. In Proceedings of the 12th ACM Symposium on Principlesof Programming Languages, New Orleans, pages 52–66, 1995.

[FH88] P.A. Freeman and H.S. Hunt. Software quality improvement throughautomated testing. In Fifth International Conference on Testing Com-puter Software, Washington, DC, June 13–16, 1988, 1988.

[FL96] Brigitte Frohlich and Peter Gorm Larsen. Combining VDM-SL specifi-cations with C++ code. In Marie-Claude Gaudel and Jim Woodcock,

BIBLIOGRAPHY 129

editors, FME’96: Industrial Benefit and Advances in Formal Methods,pages 179–194. Springer-Verlag, March 1996.

[FL98] John Fitzgerald and Peter Gorm Larsen. Modelling Sytems, PracticalTools and Techniques. Cambridge University Press, 1998.

[fSSaCB91] ESA Board for Software Standardisation and Control (BSSC). ESASoftware Engineering Standards. European Space Agency PublicationsDivision, ESTEC, Noordwijk, The Netherlands, February 1991. Doc.Id.: ESA PSS-05-0 Issue 2.

[FvBKG91] G. Fujiwara, S. von Bochmann, F. Khendek, and A. Ghedamsi. Testselection based on finite-state models. IEEE Transactions on SoftwareEngineering, 17(6), 1991.

[Gau95] Marie-Claude Gaudel. Testing can be formal too. In TAPSOFT’95:Theory and Practice of Software Development, 6th International JointConference CAAP/FASE, volume 915 of Lecture Notes in ComputerScience, pages 82–96. Springer-Verlag, May 1995.

[GG93] M. Grochtmann and K. Grimm. Classification trees for partition test-ing. Software Testing, Verification and Reliability, 3:63–82, 1993.

[GH93] John V. Guttag and James J. Horning. Larch: Languages and Toolsfor Formal Specification. Texts and Monographs in Computer Science.Springer Verlag, 1993.

[GHW85] J.V. Guttag, J.J. Horning, and J.M. Wing. The Larch family of speci-fication languages. IEEE Software, 2(5):24–36, 1985.

[Gib94] W. Wayt Gibbs. Software’s chronic crisis. Scientific American, Septem-ber 1994.

[GJ98] Marie-Claude Gaudel and Perry R. James. Testing algebraic data typesand processes: A unifying theory. Formal Aspects of Computing, 10(5& 6):436–451, 1998.

[Gri81] David Gries. The Science of Programming. Texts and Monographs inComputer Science. Springer Verlag, 1981.

[Gro92] The RAISE Language Group. The RAISE Specification Language. TheBCS Practitioners Series. Prentice-Hall, 1992.

[Gro95] The RAISE Method Group. The RAISE Development Method. TheBCS Practitioners Series. Prentice-Hall, 1995.

130 BIBLIOGRAPHY

[GSPC99] Neil Gross, Marcia Stepanek, Otis Port, and John Carey. Software hell.Business Week, pages 38–44, December 1999.

[Gut77] J. Guttag. Abstract data types and the development of data structures.Communications of the ACM, 20(6):369–405, 1977.

[GWG95] M. Grochtmann, J. Wegner, and K. Grimm. Test case design usingclassification trees and the classification-tree editor. In Proceedings ofthe 8th International Quality Week, San Francisco, 1995.

[HA99] Johann Horl and Bernhard K. Aichernig. Formal specification of a voicecommunication system used in air traffic control, an industrial appli-cation of light-weight formal methods using VDM++ (abstract). InJ.M. Wing, J. Woodcock, and J. Davies, editors, Proceedings of FM’99– Formal Methods, World Congress on Formal Methods in the Devel-opment of Computing Systems, Toulouse, France, September 1999, vol-ume 1709 of Lecture Notes in Computer Science, page 1868. Springer,1999.

[HA00a] Johann Horl and Bernhard K. Aichernig. Requirements validation ofa voice communication system used in air traffic control, an industrialapplication of light-weight formal methods (abstract). In Proceedingsof the Fourth International Conference on Requirements Engineering(ICRE2000), June 19–23, 2000, Schaumburg, Illinois, page 190. IEEE,2000. Selected as one of three best papers.

[HA00b] Johann Horl and Bernhard K. Aichernig. Validating voice communi-cation requirements using lightweight formal methods. IEEE Software,pages 21–27, May/June 2000.

[Ham77] Richard G. Hamlet. Testing programs with the aid of a compiler. IEEETransactions on Software Engineering, 3(4):279–290, July 1977.

[Han94] Kirsten Mark Hansen. Formalising railway interlocking systems. InNordic Seminar on Dependable Computing Systems, pages 83–94, Tech-nical University of Denmark, August 1994. Department of ComputerScience.

[Har92] Andrew Harry. The value of reference implementations and proto-typing in a formal design and testing methodology. Report 208/92,National Physical Laboratory, Queen’s Road, Teddington, MiddelsexTW11 0LW, UK, October 1992.

[Hay86] I.J. Hayes. Specification directed module testing. IEEE Transactionson Software Engineering, 12(1):124–133, 1986.

BIBLIOGRAPHY 131

[Hen88] M. Hennessy. An algebraic theory of processes. MIT Press, 1988.

[Hen89] B.M. Henderson. Big brother — automated test controller. In SixthInternational Conference on Testing Computer Software, Washington,DC, May 22–25, 1989, 1989.

[Hie97] R.M. Hierons. Testing from a Z specification. Software Testing, Veri-fication and Reliability, 7:19–33, 1997.

[HL93] Ivo Van Horebeek and Johan Lewi. Algebraic Specifications in SoftwareEngineering, an Introduction. Springer Verlag, 1993.

[HM96] E.M. Horcher and E. Mikk. Test automation using Z specifications. InTools for System Development and Verification, Workshop, Bremen,Germany, BISS Monographs. Shaker Verlag, 1996.

[HNS97] Steffen Helke, Thomas Neustupny, and Thomas Santen. Automatingtest case generation from Z specifications with Isabelle. In ZUM’97,1997.

[Hoa85] C.A.R. Hoare. Communicating Sequential Processes. International Se-ries in Computer Science. Prentice Hall, 1985. ISBN 0-13-153271-5(0-13-153289-8 PBK).

[HP94] Hans-Martin Horcher and Jan Peleska. The role of formal specificationsin software testing. In Tutorial Notes for the FME’94 Symposium.Formal Methods Europe, October 1994.

[HP95] Hans-Martin Horcher and Jan Peleska. Using formal specifications tosupport software testing. Software Quality Journal, 4(4):309–327, De-cember 1995.

[HS96] Merlin Hughes and David Stotts. Daistish: Systematic algebraic testingfor OO programs in the presence of side-effects. In ISSTA’96, 1996.

[HT88] R. Hamlet and R. Taylor. Partition testing does not inspire confidence.In Second Workshop on Software Testing, Verification and Analysis,Banff, Canada, July 19–21, 1988, 1988.

[IFA98] IFAD. INFORMA project EP23163: User manual for the Rose-VDM++

link, January 1998. Doc.Id.: INFORMA-IFAD-40-V1.3.

[IFA00] The VDM Tool Group, IFAD. VDMTools: The IFAD VDM++ Lan-guage. The Institute of Applied Computer Science, Forskerparken 10,5230 Odense M, Denmark/Europe, 6.6 edition, 2000.

132 BIBLIOGRAPHY

[Ins00] The Institute of Applied Computer Science, Forskerparken 10,5230 Odense M, Denmark/Europe. VDMTools: VDM++ Toolbox UserManual, 6.6 edition, 2000.

[ISO89] ISO. Lotos: A formal description technique based on the temporal or-dering of observational behaviour. Technical Report 8807, InternationalStandards Organisation, 1989.

[JKS98] Mathias Jarke and Reino Kurki-Suoni. Special issue on scenario man-agement. IEEE Transactions on Software Engineering, 24(12), 1998.

[Jon90] Cliff B. Jones. Systematic Software Development Using VDM. Seriesin Computer Science. Prentice-Hall, second edition, 1990.

[Jon96] Cliff B. Jones. Formal methods light: A rigorous approach to formalmethods. IEEE Computer, 29(4):20–21, April 1996.

[Jon98] Cliff B. Jones. Some mistakes I have made and what I have learned fromthem. In Egidio Astesiano, editor, FASE’98, volume 1382 of LNCS,pages 7–20. Springer, 1998.

[JS93] P.S. Jackson and P.A. Stokes. Formal Specification and Animation of aWater Level Monitoring System. Technical Report INFO-0428, AtomicEnergy Control Board, Ottawa, Canada, March 1993.

[JW94] B.C. Jeng and E.J. Weyuker. A simplified domain-testing strategy.ACM Transactions on Software Engineering and Methodology, 3:254–270, July 1994.

[JW96] Daniel Jackson and Jeannette Wing. Formal methods light:Lightweight formal methods. IEEE Computer, 29(4):21–22, April 1996.

[KK97] John C. Kelly and Kathryn Kemp. Formal methods, specification andverification guidebook for software and computer systems, volume ii: Apractitioner’s companion, planning and technology insertion. TechnicalReport NASA-GB-001-97, NASA, Washington, DC 20546, May 1997.

[KL96] Stuart Kent and Kevin Lano. Axiomatic semantics for VDM++,August 1996. ESPRIT project 6500: AFRODITE, Doc.ID.:AFRO/IC/SKKL/SEM/V1,ftp://theory.doc.ic.ac.uk/papers/Lano/VPP/skkl.ps.

[Kra50] Victor Kraft. Der Wiener Kreis. Der Ursprung des Neopositivismus.Springer-Verlag, Vienna, Austria, 1950.

BIBLIOGRAPHY 133

[Kuh99] D. Richard Kuhn. Fault classes and error detection capability ofspecification-based testing. ACM Transactions on Software Engineer-ing and Methodology, 8(4):411—424, October 1999.

[Lam91] Richard Lampard. Specification of the Two-way Handshake in VDM.Report 188/91, National Physical Laboratory, Queen’s Road, Tedding-ton, Middelsex TW11 0LW, UK, September 1991.

[Lan95] Kevin Lano. Formal Object-Oriented Development. FACIT. Springer-Verlag, London, 1995.

[Lar95] Peter Gorm Larsen. Towards Proof Rules for VDM-SL. PhD thesis,Technical University of Denmark, Department of Computer Science,March 1995. ID-TR:1995-160.

[Lar00] Peter Gorm Larsen. The VDM bibliography. Technical report, TheInstitute of Applied Computer Science, Forskerparken 10, DK-5230Odense M, Denmark, 2000.

[Las90] Janusz Laski. Data flow testing in STAD. The Journal of Systems andSoftware, 12(1):3–14, 1990.

[LHB+96] P. G. Larsen, B. S. Hansen, H. Bruun, N. Plat, H. Toetenel, D. J. An-drews, J. Dawes, G. Parkin, et al. Information technology — program-ming languages, their environments and system software interfaces —Vienna Development Method — specification language — part 1: Baselanguage, December 1996. International Standard ISO/IEC 13817-1.

[Luc87] Peter Lucas. VDM: Origins, hopes, and achievements. In Bjørner,Jones, Airchinnigh, and Neuhold, editors, VDM ’87 VDM – A FormalMethod at Work, pages 1–18. VDM-Europe, Springer-Verlag LNCS 252,1987.

[LW69] P. Lucas and K. Walk. On the formal description of PL/I. AnnualReview Automatic Programming Part 3, 6(3), 1969.

[LY94] D. Lee and M. Yannakakis. Testing finite-state machines: state iden-tification and verification. IEEE Transactions on Computers, 43(3),1994.

[Lyu95] Michael R. Lyu, editor. Handbook of Software Reliability Engineering.McGraw-Hill, 1995.

[Mar95] B. Marre. LOFT: A tool for assisting selection of test data sets fromalgebraic specifications. In TAPSOFT’95: Theory and Practice of Soft-ware Development, 6th International Joint Conference CAAP/FASE,

134 BIBLIOGRAPHY

volume 915 of Lecture Notes in Computer Science, pages 799–800.Springer-Verlag, May 1995.

[MC99] Ian MacColl and David Carrington. A model of specification-basedtesting of interactive systems (abstract). In J.M. Wing, J. Woodcock,and J. Davies, editors, Proceedings of FM’99 – Formal Methods, WorldCongress on Formal Methods in the Development of Computing Sys-tems, Toulouse, France, September 1999, volume 1709 of Lecture Notesin Computer Science, page 1862. Springer, 1999.

[Mil86] E.F. Miller. Mechanizing software testing. In TOCG Meeting, WestlakeVillage, CA, April 15, 1986, San Francisco, 1986. Software ResearchAssociates.

[Mil99] Robin Milner. Communicating and Mobile Systems: the Pi-Calculus.Cambridge University Press, 1999.

[MIO90] John D. Musa, Anthony Iannino, and Kazuhira Okumoto. SoftwareReliability: Professional Edition. McGraw-Hill, New York, 1990.

[ML85] P. Martin-Lof. Mathematical Logic and Programming Languages, chap-ter Constructive mathematics and computer programming, pages 167–184. International Series in Computer Science. Prentice-Hall, 1985.

[Mor90] Carrol C. Morgan. Programming from Specifications. Series in Com-puter Science. Prentice-Hall International, 1990.

[Mos99] Peter D. Mosses. CASL: A guided tour of its design. In Proceedings ofWADT’98, volume 1589 of Lecture Notes in Computer Science, pages216–240. Springer-Verlag, 1999.

[MS93] Paul Mukherjee and Victoria Stavridou. The formal specification ofsafety requirements for storing explosives. Formal Aspects of Comput-ing, 5(4):299–336, 1993.

[MW92] Zohar Manna and Richard Waldinger. Fundamentals of deductiveprogram synthesis. IEEE Transactions on Software Engineering,18(8):674–704, August 1992.

[Mye79] G.J. Myers. The Art of Software Testing. John Wiley & Sons, NewYork, 1979.

[MZ94] Peter Mataga and Pamela Zave. Formal specification of telephone fea-tures. In J. P. Bowen and J. A. Hall, editors, Z User Workshop, Cam-bridge 1994, Workshops in Computing, pages 29–50. Springer-Verlag,1994.

BIBLIOGRAPHY 135

[NAS95] NASA. Formal methods specification and verification guidebook forsoftware and computer systems, volume I: Planning and technologyinsertion. Technical Report NASA-GB-002-95, NASA, Washington,DC 20546, USA, July 1995.

[Nom87] T. Nomura. Use of software engineering tools in Japan. In Ninth Inter-national Conference on Software Engineering, Monterey, CA, March30–April 2, 1987., 1987.

[NR69] Peter Naur and Brian Randell. SOFTWARE ENGINEERING. NATOScience Committee, January 1969.

[OB88] T.J. Ostrand and M.J. Balcer. The category-partition method for spec-ifying and generating functional tests. Communications of the ACM,31(6):676–686, 1988.

[oD89] The UK Ministry of Defence. Defence standard for military safety-critical software 00-59. draft, 1989.

[OH95] Takahiko Ogino and Yuji Hirao. Formal methods and their applicationsto safety-critical systems of railways. QR of RTRI, 36(4):198–203, De-cember 1995.

[OMG96] OMG. The common object request broker architecture and specifica-tion, revision 2.0. Technical report, OMG, July 1996.

[Par94] Graeme I. Parkin. Vienna development method specification language(VDM-SL). Computer Standard & Interfaces, 16:527–530, 1994. Specialissue with the general title: The Programming Language StandardsScene, Ten Years On.

[Pau94] L.C. Paulson. Isabelle — A Generic Theorem Prover, volume 828 ofLecture Notes in Computer Science. Springer Verlag, 1994.

[Ped98] Jesper Pedersen. Automatic test case generation and instantiation forVDM-SL specifications. Master’s thesis, Department of Mathematicsand Computer Science, Odense University, September 1998.

[Pel96] Jan Peleska. Test automation for safety-critical systems: Industrialapplication and future developments. In Marie-Claude Gaudel andJim Woodcock, editors, FME’96: Industrial Benefit and Advances inFormal Methods, pages 39–59. Springer-Verlag, March 1996.

[Per86] I.M. Perelmuter. Directions of automation in software testing. In ThirdConference on Testing Computer Software, Washington, DC, Septem-ber 29 – October 1, 1986, 1986.

136 BIBLIOGRAPHY

[Per95] William Perry. Effective methods for software testing. John Wiley &Sons, 1995.

[Pet77] J. Peterson. Petri nets. ACM Computer Surveys, 9(3):223–252, 1977.

[Pet81] J.L. Peterson. Petri Net Theory and the Modelling of Systems.McGraw-Hill, New York, 1981.

[PF90] D.H. Pitt and D. Freestone. The derivation of conformance tests fromLOTOS. IEEE Transactions on Software Engineering, 16(12), 1990.

[PF98] Andrej Pietschker and John S. Fitzgerald. Techniques for automatedtest generation from system specifications. In 28th International Sym-posium on Fault-Tolerant Computing, Munich, June 23-25, 1998. IEEEComputer Society, 1998.

[Pop35] Karl R. Popper. Logik der Forschung. Wien, 1935.

[Pop63] Karl R. Popper. Conjectures and Refutations. The Growth of ScientificKnowledge. London, 1963.

[Pos96] Robert M. Poston. Automating Specification-Based Software Testing.IEEE Computer Society Press, Los Alamitos, California, 1996.

[PS96] Jan Peleska and Michael Siegel. From testing theory to test driverimplementation. In Marie-Claude Gaudel and Jim Woodcock, editors,FME’96: Industrial Benefit and Advances in Formal Methods, pages538–556. Springer-Verlag, March 1996.

[PS97] J. Peleska and M. Siegel. Test automation of safety-critical reactivesystems. South African Computer Jounal, 19:53–77, 1997.

[PT89] Nico Plat and Hans Toetenel. Tool support for VDM. Technical Report89-81, Delft University of Technology, November 1989.

[PT92] Nico Plat and Hans Toetenel. A formal transformation from theBSI/VDM-SL concrete syntax to the core abstract syntax. TechnicalReport 92-07, Delft University, March 1992.

[PvKT91] Nico Plat, Jan van Katwijk, and Hans Toetenel. Applications andbenefits of formal methods in software development. Technical Report91-33, Delft University of Technology, Faculty of Technical Mathemat-ics and Informatics, April 1991.

[RC81] D.J. Richardson and L.A. Clarke. A partition analysis method to in-crease program reliability. In Fifth International Conference on Soft-ware Engineering, San Diego, CA, March 9–12, 1981, 1981.

BIBLIOGRAPHY 137

[Roy70] W.W. Royce. Managing the development of large software systems. InIEEE WESCON, pages 1–9, August 1970.

[Rus00] John Rushby. Disappearing formal methods. In Proceedings of the5th Conference on High-Assurance Software Engineering, 15th–17thNovember 2000, Albuquerque, New Mexico, USA, pages 95–96. IEEE,2000. Invited talk.

[Sad88] S.J. Sadler. Nuclear reactor protection software, an application ofVDM. Appendix to: VDM ’88. VDM - The Way Ahead. Springer-Verlag, 1988. From: Rolls-Royce and Associates Limited.

[SC93a] P.A. Stocks and D.A. Carrington. Test templates: A specification-based testing framework. In Proceedings of the 15th International Con-ference on Software Engineering, pages 405–414, August 1993.

[SC93b] Phil Stocks and David Carrington. Test template framework: Aspecification-based testing case study. In Proceedings of the Interna-tional Symposium on Software Testing and Analysis, pages 11–18, 1993.

[SCS97] Harbhajan Singh, Mirko Conrad, and Sadegh Sadeghipour. Test casedesign based on Z and the classification-tree method. In Proceedingsof the 1st International Conference on Formal Engineering Methods(ICFEM ’97), 1997.

[Scu88] G.T. Scullard. Test case selection using VDM. In R. Bloomfield,L. Marshall, and R. Jones, editors, VDM’88, VDM — The Way Ahead,2nd VDM-Europe Symposium, Dublin, Ireland, September 1988, Pro-ceedings, volume 328 of Lecture Notes in Computer Science, pages 178–186. Springer Verlag, 1988.

[Sen87] Dev Sen. Objectives of the British standardization of a language tosupport the VDM. In Airchinnigh Bjørner, Jones and Neuhold, editors,VDM ’87 VDM – A Formal Method at Work, pages 321–323. VDM-Europe, Springer-Verlag LNCS 252, 1987.

[Som92] Ian Sommerville. Software Engineering. International Computer Sci-ence Series. Addison-Wesley, 4th edition, 1992.

[Spi88] J.M. Spivey. Understanding Z, a Specification Language and its formalsemantics. Cambridge University Press, 1988.

[Spi92] J. Michael Spivey. The Z Notation: A Reference Manual. Prentice-HallInternational Series in Computer Science. Prentice-Hall, New York,N.Y., second edition, 1992.

138 BIBLIOGRAPHY

[SS98] S. Sadeghipour and H. Singh. Test strategies on the basis of extendedfinite state machines. Technical report, Daimler-Benz AG, Researchand Technology, 1998.

[Sta89] M.E. Staknis. The use of software prototypes in software testing. InSixth International Conference on Testing Computer Software, Wash-ington, DC, May 22–25, 1989, 1989.

[Ste95] Susan Stepney. Testing as abstraction. In J. P. Bowen and M. G.Hinchey, editors, ZUM ’95: 9th International Conference of Z Users,Limerick 1995, volume 967 of Lecture Notes in Computer Science.Springer, 1995.

[Sto93] Philip Alan Stocks. Applying formal methods to software testing. PhDthesis, The Department of computer science, The University of Queens-land, 1993.

[Sto98] Ole Storm. The VDM Toolbox API users guide. Technical report,IFAD, 1998.

[TaA88] G.S. Teo and M. Mac an Airchinnigh. The use of VDM in the spec-ification of chinese characters. In VDM ’88 VDM – The Way Ahead,pages 476–499, September 1988.

[TDS98] H. Treharne, J. Draper, and S. Schneider. Test case preparation usinga prototype. In Didier Bert, editor, B’98, volume 1393 of LNCS, pages293–311. Springer, 1998.

[Tre92] J. Tretmans. A Formal Approach to Conformance Testing. PhD thesis,Universiteit Twente, the Netherlands, December 1992.

[Tur93] Kenneth J. Turner, editor. Using Formal Description Techniques —Estelle, LOTOS, SDL. John Wiley and Sons Ltd., 1993.

[vdBVW99] Manuel van den Berg, Marcel Verhoef, and Mark Wigmans. Formalspecification and development of a mission critical data handling sub-system, an industrial usage report. In John Fitzgerald and Peter GormLarsen, editors, Workshop Materials: VDM in Practice!, Part of theFM’99 World Congress on Formal Methods, Toulouse. Springer, 1999.

[VGvM80] U. Voges, L. Gmeiner, and A.A. von Mayerhauser. Sadat — anautomated test tool. IEEE Transactions on Software Engineering,16(3):286–290, 1980.

[vW94] J. von Wright. Program refinement by theorem prover. Technical re-port, Abo Akademi University, Dept. of Computer Science, 1994.

BIBLIOGRAPHY 139

[WD96] Jim Woodcock and Jim Davies. Using Z — Specification, Refinement,and Proof. International Series in Computer Science. Prentice Hall,1996.

[Wit21] Ludwig Wittgenstein. Tractatus logico-philosophicus. Annalen derNaturphilosophie, 1921.

Systematic Black-Box Testing of Computer-Based …Systematic Black-Box Testing of Computer-Based...

Documents

Transcript of Systematic Black-Box Testing of Computer-Based …Systematic Black-Box Testing of Computer-Based...