1. Softnet-Workshop “Testen und Verifikation” · Test case Generation), dessen Ziel die...

www.soft-net.at

1. Softnet-Workshop“Testen und Verifikation”

TU Graz, 7. November 2007

Sehr geehrte Damen und Herren !

Wir bedanken uns für Ihre Teilnahme beim ersten Workshop des Kompetenznetzwerks Softnet

Austria. Ziel des Workshops war Bekanntmachung von neuen Test- und Verifikationstechniken in

Industrie und Gewerbe. Folgenden konkreten Verfahrenstechniken und Methoden wurden im

Rahmen dieses Workshops vorgestellt:

Performance Evaluierung im betrieblichen Umfeld

Testdaten- und Testsequenzgenerierung für Protokolle

Modell-basiertes Testen mit strukturierten Modellen

Qualitätssicherung von UML Modellen

Testautomatisierung

Testoptimierung und Testmanagement

Model Checking von Hard- und Software

Modell-basierte Fehler lokalisierung

Anbei finden Sie ein Z usammenstellung aller Vorträge und die Kontaktadressen der Teilnehmerinnen

und Teilnehmer des Workshops. Sollten wir Ihr Interesse mit einem konkreten Beitrag oder im

Rahmen eines persönlichen Gesprächs geweckt haben, so zögern Sie nicht, uns umgehend zu

kontaktieren. Gerne erteilen wir weiter Auskünfte, leiten Anfragen an die jeweiligen

Netzwerkpartner weiter, oder unterstützen Sie bei themenbezogenen Frag estellungen.

Mit freundlichen Grüßen,

Prof. Dr. Franz Wotawa und Dr. Bernhard Peischl

Kontakt:

Technische Universitä t Graz, Institut für SoftwaretechnologieSoftnet AustriaInffeldgasse 16b/2, A-8010 Graz, Austr iawww.soft-net.at

www.ist.tugraz.at

{wotawa,peischl}@ist.tugraz.at

www.soft-net.at

www.soft-net.at

Programm

10:15-10:30 Einleitung und Begrüßung

(Prof. Dr. Franz Wotawa, Technische Universität Graz)

10:30-11:00 Server Performance Evaluation im betrieblichen Umfeld

(Prof. Dr. Helmut Hlavacs, Universität Wien)

11:00-11:30 Testfallgenerierung in der Praxis: Höhere Qualität bei geringeren Kosten

(Stefan Mohacsi, Siemens PSE)

11:30-12:00 10 Entwickler, 100 Modelle, 1000 Inkonsistenzen: Qualitätssicherung von

Modellen

(Dr. Michael Breu, arctis Softwaretechnologie GmbH)

13:30-14:00 Protokoll-Conformance-Testen: Eine industrielle Anwendung von Formalen

Methoden

(Martin Weigelhofer, Technische Universität Graz)

14:00-14:30 Testoptimierung

(Ingrid Unterguggenberger, Christian Lackner, Cicero Consulting)

14:30-15:00 Mining Repositories for Decision-Support in Software Test Management

(Rudolf Ramler, Software Competence Center Hagenberg)

15:30-16:00 Model Checking in Hardware und Software

(Dr. Roderick Bloem, Technische Universität Graz)

16:00-16:30 Softnet Austria - Bringing Software Engineering into Practice

(Dr. Bernhard Peischl, Softnet Austria)

16:30-17:30 Model-based Software Debugging

(Prof. Dr. Markus Stumptner, University of South Australia)

12:00-13:30 Mittagspause mit Buffet

15:00-15:30 Kaffeepause

17:30- "Come Together" Buffet

Abstracts + CVs

Server Performance Evaluation im betrieblichen Umfeld

Testfallgenerierung in der Praxis: Höhere Qualität bei

geringeren Kosten

10 Entwickler, 100 Modelle, 1000 Inkonsistenzen:

Qualitätssicherung von Modellen

Mining Repositories for Decision-Support in Software Test

Management

Prof. Dr. Helmut Hlavacs, Universität Wien

Stefan Mohacsi, Siemens PSE

Dr. Michael Breu, arctis Softwaretechnolgie GmbH

Rudolf Ramler, Software Competence Center Hagenberg, Austria

Durch die Verfügbarkeit des omnipräsenten Internets und der

darauf aufbauenden Geschäftsabwicklung, e.g., B2B oder B2C,

hängt der betriebliche Geschäftserfolg in zunehmendem Maße

von der Verfügbarkeit der eigenen Serverinfrastruktur ab. Neben

der reinen Verfügbarkeit wird jedoch mit steigendem

Businessvolumen auch die Performance der eingesetzten IT

wichtig. Als Beispiel dient die Wartezeit bei Webseiten, welche

einen entscheidenden Einfluss auf das Käuferverhalten hat. In

diesem Vortrag wird erklärt, welche Auswirkungen ungenügende

Performance auf den Umsatz haben kann, wie Performance

modelliert und gemessen wird, und welche Aussagen man durch

Performanceevaluation erhält.

Ao.Univ.Prof. Dr. Helmut Hlavacs studierte Technische

Mathematik an der TU Wien, wo er auch im Bereich

Performance Evaluation von High-Performance Rechnern und

Computer Numerik promovierte. Seit 1998 arbeitet er an der

Fakultät für Informatik der Universität Wien, zunächst als

wissenschaftlicher Mitarbeiter, dann als Universitätsassistent. Im

Jahr 2004 habilitierte er sich zum Thema Quality of Service in

Distributed Systems. Dr. Hlavacs ist Autor zahlreicher

wissenschaftlicher Publikationen zum Thema Performance

Evaluation in verteilten Systemen, und ist Autor eines Buches

über High-Performance Computer.

Ein systematischer Test ist eine unverzichtbare Voraussetzung für

den Erfolg jedes SW-Produkts. Wie kann dieser Test aber mit

vertretbarem Aufwand durchgeführt werden? Um die Qualität

zu gewährleisten, müssen sämtliche Testfälle laufend aktualisiert

und dann erneut ausgeführt werden - ohne Automatisierung ein

aussichtsloses Unterfangen.

Andererseits wurde bislang der Anreiz für die Einführung

automatisierter Testmethoden durch komplizierte

Scriptsprachen, hohe Toolkosten und die aufwändige Wartung

der Testfälle beträchtlich vermindert.

Aus diesem Grund startete Siemens bereits 1997 das

Forschungsprojekt IDATG (Integrating Design and Automated

Test case Generation), dessen Ziel die Generierung von Testfällen

aus einer formalen Spezifikation war. Der Vorteil liegt auf der

Hand: Anstelle jedesmal sämtliche Testfälle per Hand

aktualisieren zu müssen, muss nur die Spezifikation angepasst

werden, was eine enorme Ersparnis bei der Wartung der Tests

bedeutet. Ausserdem garantiert die Methode eine

systematische Abdeckung aller Requirements, logischen

Abhängigkeiten und Äquivalenzklassen. IDATG wird

beispielsweise zum Test des Spacecraft Operating Systems der

ESA eingesetzt.

Stefan Mohacsi wurde 1970 geboren und studierte Informatik an

der Technischen Universität Wien. Seit 1997 arbeitet er für

Siemens, wo er schon früh die Leitung des Forschungsprojekts

IDATG (Integrating Design and Automated Test Case Generation)

übernahm. Im Laufe der Jahre entwickelte sich der erste

Prototyp zu einem professionellen Werkzeug, das heute den

neuesten Stand der Testfallgenerierungs-Technik repräsentiert.

Die bislang bedeutendste Anwendung der IDATG-Technologie ist

das Testautomatisierungs-Framework der Europäischen

Raumfahrtsbehörde ESA.

Abgesehen von der Weiterentwicklung von IDATG ist Stefan

Mohacsi Mitglied des Siemens Test Support Centers, wo er als

Senior Consultant für Testautomatisierung und Codetesten tätig

ist. Ausserdem hielt er zahlreiche Vorträge auf internationalen

Testkonferenzen und ist Mitglied des Austrian Testing Boards.

Neben dem Einsatz von informellen Spezifikationtechniken

gewinnt der Einsatz von UML zur Beschreibung von

Anforderungen, Softwarearchitekturen, und

Unternehmensmodellen zunehmend an Bedeutung. Mit der

wachsenden Komplexität der Modelle wachsen auch die

Fehleranfälligkeiten im Entwurf dieser Modelle. Die Ursache liegt

zum einen in der stark angewachsenen Komplexität von UML 2.0

selbst, als auch in der schieren Größe und den Querbeziehungen

zwischen Modellen, und letztendlich auch im Einsatz

inkompatibler Werkzeuge. Der Vortrag diskutiert den aktuellen

Stand der Technik im Bereich Modellvalidierung und stellt den

Softnet-Ansatz zur Qualtitätsicherung von Modellen vor.

Dr. Michael Breu provomierte 1991 an der TU München und war

anschliessend als Methodenberater und Softwarearchitekt für

Siemens-Nixdorf und FAST München tätig. Seit 2002 ist er

Geschäftsführer der arctis Softwaretechnologie GmbH. Neben

seiner langjährigen praktischen Erfahrung in der Durchführung

und Leitung von Softwareentwicklungsprojekten, hat er auch

zahlreiche wissenschaftliche Publikationen im Bereich

Softwareengineering Methoden und der Modellierung verteilter

System geschrieben. Daneben ist er auch Co-Author bzw.

Mitherausgeber dreier Bücher.

Resources and time available for testing are usually limited as

testing is a major cost factor in software development. One way

to deal with these constraints is to increase efficiency of testing

by directing the testing effort to those parts of the system that

contain the highest risk, e.g., where the most defects can be

found. As studies report, "about 80 percent of the defects come

from 20 percent of the modules, and about half the modules are

defect free". Yet, the resulting question is: "What are the high-

risk parts of the system?" In this work we investigate whether

answers to this question can be found by mining the software

repositories established in the course of software development.

First results from mining repositories of a medium-size software

project are presented.

Rudolf Ramler studied business informatics at the Johannes

Kepler University of Linz, Austria, and has been a member of the

scientific staff at the Software Competence Center Hagenberg,

Austria, since 2001. His research interests include software

testing, quality management and requirements engineering. He

has led research projects on testing of Web-based systems as

well as test automation, and he directed the development of

tools for test management. Rudolf works as a consultant in

industry and is a lecturer with the Upper Austria University of

Applied Sciences at Hagenberg, the Vienna University of

Technology, and the Free University of Bolzano, Italy.

www.soft-net.at

Protokoll-Conformance-Testen: Eine industrielle Anwendung

von Formalen Methoden

Model Checking in Hardware und Software

Testoptimierung

Model-based Software Debugging

Martin Weiglhofer, Technische Universität Graz

Dr. Roderick Bloem, Technische Universität Graz

Ingrid Unterguggenberger und Christian Lackner, Cicero

Consulting

Prof. Dr. Markus Stumptner, University of South Australia

Ein wichtiges Qualitätskriterium von

Hochverfügbarkeitssystemen ist deren funktionale Korrektheit in

Hinblick auf eine gegebene Spezifikation. Systematisches Testen

ist ein langwieriger und schwieriger Prozess. In der Theorie wird

seit langem die Anwendung formaler Methoden zur

systematischen Generierung von Testfällen diskutiert. Testfälle

werden automatisch aus einem Modell des implementierten

Softwaresystems generiert. Die Komplexität aktueller

Softwareprodukte führt jedoch häufig zu sehr großen Modellen,

die von vielen Algorithmen nicht verarbeitet werden können.

Dieser Vortrag behandelt aktuelle Techniken und Tools die einen

Einsatz formaler Methoden zur Testfallgenerierung in der

industriellen Praxis ermöglichen sollen. Am Beispiel eines

Session Initiation Protocol Registrars (Internet Telefonie) wird ein

mögliches Anwendungsszenario dargestellt.

Martin Weiglhofer ist Doktorand und Forschungsassistent am

Institut für Software Technologie der Technischen Universität

Graz. Er absolvierte sein Bachelorstudium

(Softwareentwicklung&Wissensmangement; 2005) und sein

Masterstudium (Softwareentwicklung&Wirtschaft; 2006) an der

Technischen Universität Graz. Seine Forschungsinteressen sind

Modell-basiertes Testen, automatisiertes Software Testen sowie

Formale Methoden zur Softwareentwicklung.

We will review techniques for exhaustive verification of

hardware and software. For hardware, such formal techniques

have made great headway and are in use at virtually all major

design companies. The first challenge is to provide the user with

an expressive, but simple language to express the specifications.

Second, we must provide pushbutton technology and not expose

the user to any kind of mathematics, and third, we must find

ways to deal with more than very small systems. We will look at

existing techniques to tackle these challenges and look at some

success stories in both hardware and software verification.

Dr. Roderick Bloem received his M.S. Degree in computer science

from Leiden University in The Netherlands in 1996 and his PhD

degree in computer science from the University of Colorado at

Boulder in 2001. He has held internship positions at Lucent Bell

Labs, at Cadence Berkeley Labs, and at the Corporate technology

Department of Siemens AG. He has been employed as an

/assistant /at Institute for Software Technology at Graz University

of Technology since 2002. His fields of interest include formal

verification, fault localization and repair, and methods for design

of correct-by-construction hardware. Dr. Bloem has published

several dozen papers in international conferences and journals,

is a member of several program committees, and has received a

Best Paper Award at the 2006 Haifa Verification Conference for

his work on fault localization.

Test und Qualitätssicherung (samt Fehlerbereinigung) stellen mit

Abstand den größten Aufwandsblock in der Entwicklung und

Pflege von Softwaresystemen dar. Eine Optimierung dieses

Kostenblocks ist möglich und notwendig und bringt dem IT-

Manager wesentliche Einsparungen. Der Vortrag zeigt ein

quantitatives Modell für Testoptimierung basierend auf Basis-

Maßzahlen wie Function Points, LOC und dem damit

prognostizierbaren Fehlerpotenzial auf. Dazu werden im

wesentlichen vier Maßnahmen zur Testoptimierung diskutiert:

QS als implizite Aufgabe im Software Life Cycle, Standardisierung

der Testmethodik mit Werkzeugunterstätzung,

Testautomatisierung und Zentralisierung von Testaufgaben in

einem QualityLab.

Ingrid Unterguggenberger: Studium der Angewandten Informatik

an der Universität Klagenfurt, Tätig in den Bereichen

Multiprojektmanagement, QS und Softwaremetriken der Firma

Cicero-Consulting GmbH IFPUG CFPS in diversen Benchmarking

Projekten in enger Zusammenarbeit mit DASMA.

Christian Lackner: Studium der Angewandten Informatik an der

Universität Klagenfurt, Seit 1994 selbständiger Dienstleister im

und rund um den Bereich der Softwareentwicklung Außerdem

tätig in den Bereichen Multiprojektmanagement und QS in

Projekten der Firma Cicero-Consulting GmbH

A considerable body of work on model-based software

debugging (MBSD) has been published in the past decade. We

summarise the underlying ideas and present the different

approaches as abstractions of the concrete semantics of the

programming language. We compare the model-based

framework with other well-known Automated Debugging

approaches and present open issues, challenges and potential

future directions of MBSD.

Markus Stumptner studied Computer Science and got his Ph.D.

degree 1990 at TU Vienna, where he worked at the Institute for

Information Systems since 1986. He was Senior Researcher at

the Christian-Doppler-Labor for Expert SystemS, 1999-2000

Professor at Johannes Kepler University Linz in the field of Data

& Knowledge Engineering. Since 2001 Chair of Computing an the

University of South Australia, Adelaide, head of the research

group Knowledge & Software Engineering, since 2003 Director of

the Advanced Computing Research Centre, co-founder of the

"Mawson Institute for Advanced Manufacturing". His present

research interests are in the fields of Data- and Process

Integration, model-based reasoning particularly within the

ranges diagnosis and debugging. Markus Stumptner is

participant in the Collaborative Research Centres (CRCs - CRCs -

distributed authority centers) for Enterprise Distributed Systems

Technology, Integrated Engineering Asset Management and

Advanced Automotive Technology. Present main project partners

are amongst others General Motors/Holden, Australian Nuclear

Science & Technology Organisation (ANSTO), Defense Science &

Technology Organisation (DSTO).

http://www.unisanet.unisa.edu.au/staff/homepage.asp?Name=

Markus.Stumptner

www.soft-net.at

Danksagung

Das Kompetenznetzwerk Softnet Austria wird durch Mittel des Bundesministeriums für

Wirtschaft und Arbeit über die Österreichische Forschungsförderungsgesellschaft (FFG)

sowie des Bundeslands Steiermark über die Steirische Wirtschaftsförderungsgesellschaft

mbH (SFG) und die Stadt Wien über das Zentrum für Innovation und Technologie (ZIT)

gefördert.

www.soft-net.at

1

1. Softnet WorkshopTesten und Verifikation

1. Workshop Testen und Verifikation, 7.11.2007 1

7.11.2007, Aula TU Graz

www.soft-net.at

10:15-10:30 Einleitung und Begrüßung(Prof. Dr. Franz Wotawa, Technische Universität Graz)

10:30-11:00 Server Performance Evaluation im betrieblichen Umfeld(Prof. Dr. Helmut Hlavacs, Universität Wien)

11:00-11:30 Testfallgenerierung in der Praxis: Höhere Qualität bei geringeren Kosten(Stefan Mohacsi, Siemens PSE)

11:30-12:00 10 Entwickler, 100 Modelle, 1000 Inkonsistenzen: Qualitätssicherung von Modellen(Dr. Michael Breu, arctis Softwaretechnologie GmbH)

12:00-13:30 Mittagspause mit Buffet13:30-14:00 Protokoll-Conformance-Testen: Eine industrielle Anwendung von Formalen


Methoden(Martin Weigelhofer, Technische Universität Graz)

14:00-14:30 Testoptimierung(Ingrid Unterguggenberger, Christian Lackner, Cicero Consulting)

14:30-15:00 Mining Repositories for Decision-Support in Software Test Management(Rudolf Ramler, Software Competence Center Hagenberg)

15:00-15:30 Kaffeepause15:30-16:00 Model Checking in Hardware und Software

(Dr. Roderick Bloem, Technische Universität Graz)16:00-16:30 Softnet Austria - Bringing Software Engineering into Practice

(Dr. Bernhard Peischl, Softnet Austria)16:30-17:30 Model-based Software Debugging

(Prof. Dr. Markus Stumptner, University of South Australia)17:30- "Come Together" Buffet

Motivation


Beispiel-Binärsuche (1)

low higha

key ?


high

high

high

low

low

low

………….………….………….

Beispiel – Binärsuche (2)

1: public static int binarySearch(int[] a, int key) {2: int low = 0;3: int high = a.length - 1;4:5: while (low <= high) { Integerbereich: [-231,231-1]


5: while (low < high) {6: int mid = (low + high) / 2;7: int midVal = a[mid];8:9: if (midVal < key)10: low = mid + 1;11: else if (midVal > key)12: high = mid - 1;13: else14: return mid; // key found15: }16: return -(low + 1); // key not found.17: }

Kann überschritten werden!

Unterschiedliche Fehlerursachen

• Fehler– In der Spezifikation – Im Design– In der Implementierung

• Fehler aufgrund von – Interaktionen– Annahmen

• ….1. Workshop Testen und Verifikation, 7.11.2007 6

2

Fehlererkennung?

• Verifikation basierend auf einer Spezifikation

• Testen– Conformance testing– Unit tests– System tests


10:15-10:30 Einleitung und Begrüßung(Prof. Dr. Franz Wotawa, Technische Universität Graz)

10:30-11:00 Server Performance Evaluation im betrieblichen Umfeld(Prof. Dr. Helmut Hlavacs, Universität Wien)

11:00-11:30 Testfallgenerierung in der Praxis: Höhere Qualität bei geringeren Kosten(Stefan Mohacsi, Siemens PSE)

11:30-12:00 10 Entwickler, 100 Modelle, 1000 Inkonsistenzen: Qualitätssicherung von Modellen(Dr. Michael Breu, arctis Softwaretechnologie GmbH)

12:00-13:30 Mittagspause mit Buffet13:30-14:00 Protokoll-Conformance-Testen: Eine industrielle Anwendung von Formalen


Methoden(Martin Weigelhofer, Technische Universität Graz)

14:00-14:30 Testoptimierung(Ingrid Unterguggenberger, Christian Lackner, Cicero Consulting)

14:30-15:00 Mining Repositories for Decision-Support in Software Test Management(Rudolf Ramler, Software Competence Center Hagenberg)

15:00-15:30 Kaffeepause15:30-16:00 Model Checking in Hardware und Software

(Dr. Roderick Bloem, Technische Universität Graz)16:00-16:30 Softnet Austria - Bringing Software Engineering into Practice

(Dr. Bernhard Peischl, Softnet Austria)16:30-17:30 Model-based Software Debugging

(Prof. Dr. Markus Stumptner, University of South Australia)17:30- "Come Together" Buffet

1

Server Performance Evaluationim Betrieblichen Umfeld

Helmut Hlavacs

Universität WienInstitut für Verteilte und Multimediale Systeme

Business-Beispiel

• Virtuelle Autohändler-Plattform• Verkauf über Web Frontend• Beobachtung

– Wartezeit zwischen 4 und 6 Sekunden ->

Softnet Workshop, TU Graz 7.11.2007

Wartezeit zwischen 4 und 6 Sekunden60% der Suchen werden abgebrochen

– Wartezeit > 6 Sekunden ->95% der Suchen werden abgebrochen

• 5% der erfolgreichen Suchanfragen erzeugen einen Verkauf

• Durchschn. 360 € Gewinn pro Verkauf

Business-Beispiel

Fragen des Managements• Kann der Webserver einen Last-Zuwachs von

10%, 20%, 30% verkraften?• Wann ist die Server Kapazität überschritten?


• Wann ist die Server-Kapazität überschritten?• Wieviel Gewinn würde dann verloren gehen?

-> Durchführen von Experimenten (Lasttests)

Business-Beispiel

aktuell +10% +20% +30%

Suchanfragen pro Tag 92448 101693 110938 120182

Response time 2,9 3,8 5,7 11,3

Verlorene Verkaufsabschlüsse (%) 0 0 60 95


Verkaufsabschl. pro Tag 4.622 5.085 2.219 300

Tägliches Einkommen (in 1000 €) 1.664 1.831 1.997 2.163

Potentielles tägliches Einkommen (in 1000 €) 1.664 1.831 799 108

Täglicher Einkommensverlust (in 1000 €) 1,198 2.055

Service Levels

• Suchanfragen sollten in weniger als 4 Sekunden ein Resultat liefern

• Zuviel Last > System kann Service Levels nicht


• Zuviel Last -> System kann Service Levels nichtmehr einhalten

Ergebnis der Performance-Studie

System muss durch ein leistungsfähigeres ersetzt werden


Welches?

2

Auswirkungen schlechter Performance

• Empirische Erkenntnis: Bei zu hoher Wartezeit verlassen die Benutzer eine Webseite (Zona Research 1999, 2001)


• Kostet $4.35 Billionen jährlich (Zona Research 2000)

• Reduktion der Ladezeit um eine Sekunde könnte die Abbruchrate um 30% - 60% senken (Zona Research 2001)

• Verluste durch Ausfallszeiten: 0.1 - 0.3% des jährlichen Einkommen pro Stunde (Zona Research 2001)

Betriebliche IT – Beobachtungen

Historisch gewachsene Systeme• Komplexe Topologien• Dutzende unterschiedliche Technolgien• Dynamische Datenerzeugung• Probleme: Flaschenhälse komplexe Abhängigkeiten


Probleme: Flaschenhälse, komplexe Abhängigkeiten

Zurück zur Zentralisierung• Billige Thin Clients• Einfach zu installieren, geringe Wartungskosten• Voraussetzung: gute Netzwerkverbindungen verfügbar• Problem: Zentralisierte Systeme skalieren schlecht• z.B. nächtliche Batchjobs, Abrechungen

Was ist Performance Evaluierung?

Teil von ISO/IEC 9126• Qualität des Produktes Software• Funktionalität, Benutzbarkeit, Änderbarkeit, Übertragbarkeit• Effizienz, Zuverlässigkeit


, g

Performance Evaluierung• Performance Test• Kapazitätsplanung• Benchmarking• Quality of Experience

Performance Test

Welche Performance hat mein „System under Test“ (SUT)?

• Gegeben– System


– Typische individuelle Arbeitslast

• Gesucht– Systemverhalten– Performance Metriken– Werden SLAs eingehalten?

Metriken

Performance• Antwortszeit• Durchsatz• Auslastung


• Ressourcenbelastung• Stabilität

Reliability(Zuverlässigkeit), Availability(Verfügbarkeit)• Mean-Time-To-Failure• Mean-Time-To-Repair• Mean-Time-Between-Failure

Performance Test und Überwachung

Offline• Testumgebung• Load Testing: typische Workload• Stress Testing: Worst Case Workload• Spike Testing: Extreme Überschreitung


Spike Testing: Extreme Überschreitung• Wichtig: Testumgebung ähnlich zu Produktion

Online• Produktionsumgebung• Individuell: Logfiles• End-to-end: Messagenten• Wichtig: Test darf Produktion nicht beeinflussen

3

Kapazitätsplanung

• Service-Level Agreement (SLA)• Vom Management vorgegeben• Ausgedrückt durch Zielmetriken


„95% der Anfragen sollten innerhalb von 4 Sekunden beantwortet sein“

Kapazitätsplanung

Zu erzeugende Modelle• Workload• Performance und Availability Prediction

C


• Cost

Mit welcher Plattform/Konfiguration können bei gegebener Kostenobergrenze und Workloaddie SLAs eingehalten werden?

Benchmarking

• Machen unterschiedliche Systeme bzgl. Performance vergleichbar

• Orientierungshilfe für Kauf• Ablauf


• Ablauf– Normierte Programme/Arbeitslast wird erzeugt– Workload: Real, Synthetisch, Künstlich– Performance Metriken werden gesammelt– Verdichtung auf eine Performance-Kennzahl

Benchmarks

• SPEC– CPU (Integer, FP)– Grafik– Java Client/Server

M il / W b

• BABCo – SysMark– WebMark– MobileMark

3DM k


– Mail / Webserver– Network File System– Power and Performance– Virtualisierung– MPI, OpenMP– SIP

– 3DMark

• TPC• Linpack (Top500)• WEBstone• SIPstone

Quality of Experience

Wie erlebt ein Mensch die subjektive Qualität des Services?

Subjective/perceived QoS (PQoS)• Subjektiv erlebte Qualität eines Einzelservices


• zB: Sprachqualität eines Telefonats, IPTV Bildqualität• Metrik: Mean Opinion Score (MOS)

Quality of (User) Experience (QoE)• Subjektiv erlebte Qualität des Anbieters/Gesamtservices• Bewertung des Angebotes (welche Services gibt es?)• Bewertung der PQoS der einzelnen Services

PQoS Beispiele

Wartezeit• Zona 8-Second Rule: “Nach 8 Sekunden

brechen 70% der Besucher die Transaktion ab”


Sprache• ITU E-Model, PESQ, 3SQM, …

Video• PSNR, DVQ, SSIM, VQM, JND, …

4

PQoS Beispiele – Bouch, Kuchinsky, Bhatti 2000

10

12

14

16

18

20

user

s

High Average Low


0

2

4

6

8

10

2 3 4 5 6 8 10 12 16

Latency

No.

of

Firefox Plugin Fasterfox

Weitere Methoden der Performance Evaluation

• Analytische Modellierung– Warteschlangentheorie– Petri-Netze– Markov-Ketten


Markov Ketten– Fluid-Modelle

• Discrete Event Simulation– Modell als Programm– Modellbestandteile in Datenstrukturen

Was kann die Wissenschaft beisteuern?

Grundlagenforschung• Weiterentwicklung der Methodik• Modellentwicklung


Angewandte Forschung• Als Projekt-Partner

– Auswahl von Methoden, Tools zur Performance Messung

– Durchführung von Messungen, Performance Studien• Perceived QoS/QoE Studien

Frühere und laufende Projekte

• Modellierung von Anwenderverhalten• Satellitenkommunikation• Mobilfunk (Zellen)• Webserver• IP Broadcasting


IP Broadcasting• PQoS von Mobile Video• Mobile Computing und Mobile Applikationen• Mobile Community Support Systeme• Heimnetzwerke und Applikationen• Energieffizienz• Ad-hoc Vehicle Communikation• …

Unsere Arbeiten in Softnet

• Availability und Reliability

• Performance Evaluation und Benchmarking


• Security

Voice over IP – Szenario

TER1. REG


2. INVITE2. INVITE

3. 200 OK3. 200 OK1. REGISTE EGISTER

5

Voice over IP – Problem

• Clients und Proxies von unterschiedlichen Quellen• SIP lässt manche Fragen offen• Manchmal verstehen einander Proxy und

Client nicht


Client nicht• Internet ist ein offenes globales Netzwerk• Nicht standard-konforme SIP-Nachrichten

– Clients, Proxies könnten abstürzen– Angriffsmöglichkeit, Denial of Service

Voice over IP – Schutzschild


INVITE

Voice over IP – Schutzschild

• Selbst-lernend– Welche SIP-Nachrichten waren erfolgreich?– Welche SIP-Nachrichten führten zu Fehlern?


• Klassifikation– Enthält Header-Teile die problematisch sind?– SIP-Nachricht (Header) wird durch eine möglichst

ähnliche ersetzt– Minimale Distanz

C4.5 Decision Tree for SIP Register Messages

- critical header parameters identifiedby the decision tree:


Voice over IP – Weitere Einsatzgebiete

• DoS Angriff erkennen

• Voice SPAM / SPIT-Detection


SIP Proxy Performance Benchmarks

• SIPstone

• IMS/NGN Performance Benchmark


• SPEC SIP Subgroup

6

SIP Security – ZRTP

• Secure RTP (SRTP)• Verschlüsselung des Medienkanals• Schlüsselaustausch nicht definiert

– PKI, SIP basiert


PKI, SIP basiert– ZRTP

ZRTP• Keine PKI notwendig• Diffie-Hellman• Verifizieren der Schlüssel: Vorlesen von Wörtern

SIP Security – ZRTP

• Man-In-The-Middle Angriff– Analyse– Implementation– Durchführung


• Implementierungsangriff

• Erweiterungen– Challenge-Response– Random Call Delay

Danke für ihre Aufmerksamkeit!


SEM-T Grundlagen des Software Testen 2.1.2002

1

1 © Copyright 2007 SC TestNovember 2007 Siemens IT Solutions and Services PSE

Practical Experience withPractical Experience withTest Case GenerationTest Case Generation

Higher Product Quality Higher Product Quality -- Reduced Test CostsReduced Test Costs

Stefan Mohacsi (Siemens)Stefan Mohacsi (Siemens)

Overview

Problems of Traditional Test AutomationThe IDATG MethodologyFormal Test Specification with IDATG


Automated Generation of Test ScriptsSuccess Stories

Why Test Automation is Essential

Manual maintenance and execution of all test cases...requires a tremendous effortis monotonous and frustrating


is affected by human mistakes

„Solution“ in many projects: Testing is skipped!

The only alternative is test automation!

Depending on the test phase, different test languages, methods and tools are requiredNo solution combining different types of GUI and API testing is availableTest cases and test data have still to be created

Problems of TraditionalTest Automation


Test cases and test data have still to be createdmanuallyTest maintenance remains a huge problemTest specification depends on implementation detailsFor GUI testing, simple Capture & Replay is usually not sufficient

IDATG = Integrating Design andAutomated Test Case Generation

Functionality of Siemens IDATG

Specification of API and GUI tests on 3 abstraction levels using graphical editors


g g pMapping of project requirements to test packagesBuilt-in GUI Spy for mapping test steps to GUI elementsGeneration of test data using various systematic methodsGeneration of test scripts in various formats (XML, WinRunner, SilkTest etc.)

The IDATG Methodology

IDATG

Test Case Generation

RequirementSpecification

Low-LevelSpecification

Test Data Generation


Test Execution

Test Execution Tool (e.g. WinRunner)

GUI informationrecorded withGUI Spy

Task FlowModeling


2

User Interface of IDATG


Convenient Workspace showing window and task hierarchy

Visual connection between tasksteps and GUI objects

Requirements Specificationand Task Modeling

Based on the functions described in the software requirements specification


specificationDefinition of re-usable and parametrizablebuilding blocks improves clearness and maintainability

Test Data Generation with IDATG

Test data can be generated, imported from an external file (.txt, .csv etc.), or defined manually.For each field in a data set, equivalence classes and boundary values can be defined.


Subsequently, IDATG can generate data recordsaccording to different coverage criteriaThe link between test data and the actual test sequences can easily be established

CECIL Test Methodology

CECIL = Cause-Effect Coverage Improved by Linear programming

Innovative test data design method combining the benefits of:• Multi-dimensional equivalence partitioning• Boundary value analysis• Cause/effect analysis


• Cause/effect analysis

Properties:• Well suited for complex semantic dependencies• High error-detection potential• Difficult to apply manually, but fully automated by IDATG

CECIL Test Methodology


Task Flow Model


The sequence of test steps for each task can be defined with the Task Flow EditorBuilding Block Concept: Each step may either represent an atomic step (blue) or an entire sub task flow (yellow)Re-use of sub tasks minimizes effort for test maintenance


3

Step Description

Test InstructionsFor GUI testing the user input, e.g. a mouse click)For API testing, either test commands in a l l l l th th t t t i t

Each Step is defined by :


low-level language or the path to a test scriptTest Driver that will execute the instructionsTest Data to be used as input or expected resultSemantic Pre-Conditions(e.g. all fields have to be filled out correctly)

Expected effects (Post-Conditions)(e.g. a new window opens)

Cursor

Recording Windowswith the GUI Spy


The GUI Spy allows you to capture windows directly from the screenSupportsVC++, VB, .NET, Web testing and Java

Test Case Generation with IDATG

As soon as part of the application has been specified,


As soon as part of the application has been specified,it is possible to generate test casesGeneration algorithm tries to cover all edges in the task flow graphs in consideration of semantic conditions (C1 coverage)Generated test cases are displayed as flow diagramsRequirements coverage can be checked at a single glance

Exporting Test Cases

XML for ESA Test Commander

WinRunner / SilkTest

IDATG

...OtherFormats


The specification as well as the generated test cases are stored in XML formatTest cases can be converted into arbitrary formats (e.g. WinRunner or SilkTest scripts)

Success Stories

ECS-IDE (Hicom-Knowledge Base Editor)Test Automation successful in spite of very complex GUIEffort for maintenance and test execution is very lowIt is possible to generate WinRunner tests for both the English and the German GUI version (!)


FLES (Flexible Logic Editor for Telecommunication Services)Huge Java application with complex dependenciesTest maintenance would be infeasible without building block concept of IDATG Excellent cooperation with test team in Slovakia

Success Story: ESA


SIS PSE Space Business and IDATG team create test automation framework for ESA’sSpacecraft Operating SystemTests are specified with IDATG

Photo: ESA


4

ESA Intelligent Test FrameworkESA Intelligent Test FrameworkSIS PSE IND AS & SC TestSIS PSE IND AS & SC Test

API T tAPI T t


IDATGIDATGTest Test

CommanderCommander(Tcl/Tk)(Tcl/Tk)

GUI Test GUI Test DriversDrivers

Spacecraft Spacecraft Operating Operating

SystemSystem

API TestAPI TestDriversDrivers

ManualManualTest Test

Summary

3-layered graphical specification method significantly reduces maintenance effortMany errors already found in design phaseVisualization of test sequences and GUI rather than an abstract language


an abstract languageComplete, faultless test scripts incl. GUI-MapFormal specification is a valuable aid for the programmer

Contact

Stefan Mohacsi Siemens IT Solutions and Services


PSE Support Center Test

Phone: +43(0)51707 / 43417e-Mail: [email protected]

1

10 Entwickler, 100 Modelle, 1000 Inkonsistenzen

Qualitätsicherung in ModellenQ g

Michael Breu

Workshop Softnet, Graz, 7. Nov. 07

about arctis

2002: Spin-Off of the research group "Quality Engineering" at the University of Innsbruck

with the competence fieldsSoftware EngineeringSoftware ArchitecturesSecurity Engineering

Mission: apply state-of-the-art technology in practice

M. Breu Nov-07/2 ©

Mission: apply state-of-the-art technology in practiceJoint research projects with the University of InnsbruckPersonnel

4 permanent employees with up to 15 years of experience in professional sw developmentprofound know how in software engineering concepts

project specific assignment of staff of the research group "Quality Engineering"

Content

Status Quo: Modelling in Practice

The Models@Work Solution

M. Breu Nov-07/3 ©

The Models@Work Solution

Future Work

Modelling in Practice

The number, the size and the complexity of models used in industry are continuously growing Real applications result in complex models

Enterprise modelsSoftware architecturesRequirement specificationsM d l d i T i

M. Breu Nov-07/4 ©

Model driven TestingModel driven software development (MDSD)

Sources of complexitymany models and models typescomplex interrelationships between model elementsheterogeneous modelling environmentdomain specific languages and rules

Modelling in Practice

The ProblemManual quality assessment (inspection, management and analysis) of huge models is no more possible

Our ApproachCreating a framework and a tool

M. Breu Nov-07/5 ©

to check the consistency of models across modelling tools (modelling)to provide the modeller with aggregated measures about models (analysis)

Related Work

Industrial pragmatic approaches for checking consistency of models

by sd&m, BMW, Siemens, . . .

OO MetricsStatic Code Analysis (e.g. Sotograph)e.g. BorCon 2004, SDMetrics

M. Breu Nov-07/6 ©

UML related languagesOCL constraints on meta model level (e.g. OCLE tool)Executable UML (Kennedy Carter)

2

Key Ideas

Provide a model repositorygeneration based on a meta modelnot restricted to UML models or even graphical notation for modelsallowing to store model elements across modelling tool boundaries

Provide a mechanism to specify and evaluate queries over models

b d d i ifi d l

M. Breu Nov-07/7 ©

based on a domain specific meta modelinterpreted over a collection of elementsin a generic model repository

Rules for Quality Assessment

Development Guidelines ExamplesPrinciple 5: “Ist die Anzahl der BCI’s zu einem BC überschaubar (idealerweise 1-5)”Principle 20:“Greifen die BA’s nur auf BE’s ihres eigenen BOP’s zu?” Principle 25:“Gibt es zu jedem BCI genau eine BF, welche die Funktionalität

M. Breu Nov-07/8 ©

implementiert?”Source: „Methoden und Techniken zur Qualitaetssicherung im SW-Entwicklungsprozess der BMWGroup.“

Master’s thesis, Technical University of Munich, Dep. of Computer Science, September 2004.

Metrics for Quality Assessment

General Metrics ExampleCa : Afferent Couplings : The number of classes outside this category that depend upon classes within this category.Ce : Efferent Couplings : The number of classes inside this category that depend upon classes outside this categories.I : Instability : (Ce ÷ (Ca+Ce)) : This metric has the range [0 1] I=0

M. Breu Nov-07/9 ©

I : Instability : (Ce ÷ (Ca+Ce)) : This metric has the range [0,1]. I 0indicates a maximally stable category. I=1 indicates a maximally instable category.

Source: „OO Design Quality Metrics-An Analysis of Dependencies.“ Position Paper, Workshop on Pragmatic andTheoretical Directions in Object-Oriented Software Metrics, OOPSLA94, 1994.

Example

Each actor present in the Use Case Descriptionhas to have a corresponding actor in the Organisational Model.

M. Breu Nov-07/10 ©

Architecture


Integration in Eclipse


3

Status quo

The framework for a domain–specific model assessment is based on a common meta modelPrototypic implementations of our methodology:

1. integration with heterogeneous modelling environments2. generic analysis module with OCL interpretator (EMFT)

Integration with Eclipse


Case studies:1. analysis of clinical processes based on a catalogue of checks2. support for model consistency and analysis of models in MDSD

Future Work

ResearchMetrics, queries and tests for model quality assessmentPilot application in real world projects

Application in Model Driven TestingImplementation:

Modelling tools integration with the EMFT prototype


Adapters for other modelling tools for MOF conform notations (currently for MSVisio, MagicDraw and XML)

Institut für Softwaretechnologie

Protokoll-Konformitäts-Testen: Eine

Industrielle Anwendung Formaler

Methoden

Martin Weiglhofer

Institut für SoftwaretechnologieTechnische Universität Graz

1. Softnet Workshop

Martin Weiglhofer Graz, 7.11.2007 Protokoll-Konformitäts-Testen

1 / 14


Warum Protokoll Konformitätstests?

Internet-Telefonie als BeispielMögliches Signalisierungsprotokoll: SIP

RFC 3261: ca 200 Seiten

48 zusätzliche RFC’s (Erweiterungen)

24 "Internet-Drafts"

2005: A1-Netzausfall (Steiermark)500.000 mobilkom-Kunden betroffen

Software-Fehler


2 / 14








Software-Fehler


2 / 14








Software-Fehler


2 / 14



Methoden

Spezifikationzu testendes

System

formale

Spezifikation

Autom.

Testfallgen.

Testfälle


3 / 14



Methoden


System

formale

Spezifikation

Autom.

Testfallgen.

Testfälle


3 / 14



Methoden


System

formale

Spezifikation

Autom.

Testfallgen.

Testfälle


3 / 14



Methoden


System

formale

Spezifikation

Autom.

Testfallgen.

Autom.

Testfallgen.

Testfälle


3 / 14



Methoden


System

formale

Spezifikation

Autom.

Testfallgen.

Testfälle


3 / 14



Methoden


System

formale

Spezifikation

Autom.

Testfallgen.

Testfälle


3 / 14



Methoden


System

formale

Spezifikation

Autom.

Testfallgen.

Testfälle


3 / 14


Outline

1 Testfall Generierungs-Techniken

Endliche Zustandsautomaten

Labeled Transition Systems

2 LTS-Basiertes Konformitäts-Testen

3 Erste Ergebnisse einer industriellen Anwendung

4 Zusammenfassung


4 / 14

TestfallGenerierungs-Techniken

Endliche

Zustandsautomaten

Labeled Transition

Systems

LTS-BasiertesKonformitäts-Testen

ErsteErgebnisseeinerindustriellenAnwendung

Zusammenfas-sung


Testfallgenerierung mittels

Zustandsautomaten

s0

s1

s2a/1

b/0

a/0

b/1 b/0

a/1

i0

i1

i2a/1

b/0

a/0

b/1 b/0

a/1

Internen Zustand der Impl. prüfenStatus-Nachricht

Distinguishing-Sequence

. . .

AnnahmenGleiche Anzahl von Zuständen in Impl. und

Spez.

Impl. verändert sich während des Tests nicht

. . .


5 / 14


Endliche

Zustandsautomaten

Labeled Transition

Systems



Zusammenfas-sung



Zustandsautomaten

s0

s1

s2a/1

b/0

a/0

b/1 b/0

a/1

i0

i1

i2a/1

b/0

a/0

b/1 b/0

a/1



. . .


Spez.


. . .


5 / 14


Endliche

Zustandsautomaten

Labeled Transition

Systems



Zusammenfas-sung



Zustandsautomaten

s0

s1

s2a/1

b/0

a/0

b/1 b/0

a/1

i0

i1

i2a/1

b/0

a/0

b/1 b/0

a/1s/0

s/1

s/2



. . .


Spez.


. . .


5 / 14


Endliche

Zustandsautomaten

Labeled Transition

Systems



Zusammenfas-sung



Zustandsautomaten

s0

s1

s2a/1

b/0

a/0

b/1 b/0

a/1

i0

i1

i2a/1

b/0

a/0

b/1 b/0

a/1s/0

s/1

s/2



. . .


Spez.


. . .


5 / 14


Endliche

Zustandsautomaten

Labeled Transition

Systems



Zusammenfas-sung


Testfallgenerierung mittels LTS

Spezifikation

a

b b

a b

b

c

c a

Implementierung

a

b

a b

b

c

c a

conforms-to

Relation zwischen Spez. und Impl.

Isomorphie, Bisimulation, . . .

Input-Output Konformität (ioco)

Input-Output Labeled Transition System


6 / 14


Endliche

Zustandsautomaten

Labeled Transition

Systems



Zusammenfas-sung



Spezifikation

a

b b

a b

b

c

c a

Implementierung

a

b

a b

b

c

c a

conforms-to






6 / 14


Endliche

Zustandsautomaten

Labeled Transition

Systems



Zusammenfas-sung



Spezifikation

a

b b

a b

b

c

c a

Implementierung

a

b

a b

b

c

c a

conforms-to






6 / 14




Zusammenfas-sung


Input/Output Konformität

Input/Output Konformitäts-Relation

Eine Implementation I ist konform zu einer

Spezifikation S, dann und nur dann, wenn die Outputs

von I Outputs von S nach jeder beliebigen

Aktionsfolge von S sind.

Spez.

?REG

!OK !FAIL

Impl.

?REG

!OK

ioco

?


7 / 14




Zusammenfas-sung








Spez.

?REG

!OK !FAIL

Impl.

?REG

!OK

ioco

?


7 / 14




Zusammenfas-sung








Spez.

?REG

!OK !FAIL

Impl.

?REG

!OK

ioco

�{!OK} ⊆ {!OK,!FAIL}


7 / 14




Zusammenfas-sung








Spez.

?REG

!OK !FAIL

Impl.

?REG

!OK

?DEL

!DONE

ioco

?


7 / 14




Zusammenfas-sung








Spez.

?REG

!OK !FAIL

Impl.

?REG

!OK

?DEL

!DONE

ioco

�<?DEL> keine

Aktionsfolge von S


7 / 14




Zusammenfas-sung








Spez.

?REG

!OK !FAIL

Impl.

?REG

!OK

?REGioco

?


7 / 14




Zusammenfas-sung








Spez.

?REG

!OK !FAIL

Impl.

?REG

!OK

?REGioco

�{!OK,δ} �⊆ {!OK,!FAIL}

δ


7 / 14




Zusammenfas-sung








Implikationen

Impl. kann mehr Inputs akzeptieren

(unvollständige Spez.)

Impl. darf weniger Outputs liefern

nicht-deterministische Impl./Spez.

Annahmen

"Quiescence" detektierbar

Impl. ist "input-enabled"


7 / 14




Zusammenfas-sung


TGV - Ein praxistaugliches Tool

Spezifikation unvollständig

nicht-deterministisch

TestpurposeRefuse Zustände

Accept Zustände

TGVbasiert auf ioco

On-the-fly Testfallgenerierung

abstract

tc

det. Testfall

Pass, Fail , Inconc. Zustände

CTG Alle Testfälle für Testpurpose


8 / 14




Zusammenfas-sung






Accept Zustände

TGVbasiert auf ioco


abstract

tc

det. Testfall




8 / 14




Zusammenfas-sung






Accept Zustände

TGVbasiert auf ioco


abstract

tc

det. Testfall




8 / 14




Zusammenfas-sung






Accept Zustände

TGVbasiert auf ioco


abstract

tc

det. Testfall




8 / 14




Zusammenfas-sung






Accept Zustände

TGVbasiert auf ioco


abstract

tc

det. Testfall




8 / 14




Zusammenfas-sung


Fehler-basierte Testpurposes

Spezifikation

Mutant

IOLTSS

IOLTSM

Unterscheidente

Sequenz = TP

Bernhard K. Aichernig and Carlo Corrales Delgado: From Faults Via Test Purposes to Test Cases: On theFault-Based Testing of Concurrent Systems. FASE, pp. 324-338, Springer, 2006.


9 / 14




Zusammenfas-sung



Spezifikation

Mutant

IOLTSS

IOLTSM

Unterscheidente

Sequenz = TP



9 / 14




Zusammenfas-sung



Spezifikation

Mutant

IOLTSS

IOLTSM

Unterscheidente

Sequenz = TP



9 / 14




Zusammenfas-sung



Spezifikation

Mutant

IOLTSS

IOLTSM

Unterscheidente

Sequenz = TP



9 / 14




Zusammenfas-sung



Spezifikation

Mutant

IOLTSS

IOLTSM

Unterscheidente

Sequenz = TP

out of mem

nach 11d



9 / 14




Zusammenfas-sung



Spezifikation

Mutant

IOLTSS

IOLTSM

Unterscheidente

Sequenz = TP

Bernhard K. Aichernig, Bernhard Peischl, Martin Weiglhofer and Franz Wotawa: Protocol ConformanceTesting a SIP Registrar: an Industrial Application of Formal Methods. SEFM, pp. 215-224, IEEE, 2007.


9 / 14




Zusammenfas-sung


Details zur Spezifikation

Session Initiation Protocol (SIP)

Signalisierungs-Teil für Voice-over-IP

Text basiertes Protocol

User/Session Management

Verschiedene SIP Einheiten (z.B., Proxy,

Registrar, ...)Registrar

Zuständig für Benutzer-Verwaltung

Registrar (LOTOS) Modell

20 Abstrakte Daten Typen (ca. 2.5KLOC netto)

10 Prozesse (ca. 500LOC netto)

3 Implementierungen

OpenSER - Open Source Impl.

2 Versionen einer Kommerziellen Impl.


10 / 14




Zusammenfas-sung


Ergebnisse mit manuell erstellten Test

Purposes

Test- tgv Test A B Cpurpose [sec] fälle Pass Fail Pass Fail Pass Fail

notfound 12 1280 16 1264 1148 132 16 1264

invalid req. 12 1328 0 1328 1008 320 0 1328

unauth. 15 880 0 880 880 0 880 0

ok 12 1488 1104 384 1104 384 1104 384

delete 7006 432 260 172 130 302 260 172

Summe 7057 5408 1380 4028 4270 1138 2260 3148

Abweichungen von der Spezifikationen

A: 9

B: 4

C: 8


11 / 14




Zusammenfas-sung


Ergebnisse mit manuell erstellten Test

Purposes

Test- tgv Test A B Cpurpose [sec] fälle Pass Fail Pass Fail Pass Fail

notfound 12 1280 16 1264 1148 132 16 1264

invalid req. 12 1328 0 1328 1008 320 0 1328

unauth. 15 880 0 880 880 0 880 0

ok 12 1488 1104 384 1104 384 1104 384

delete 7006 432 260 172 130 302 260 172

Summe 7057 5408 1380 4028 4270 1138 2260 3148

Abweichungen von der Spezifikationen

A: 9

B: 4

C: 8


11 / 14




Zusammenfas-sung


Ergebnisse aus Fehler-basierten Test

Purposes

Op. No. A B C

TC � � ? � � ? � � ?

EIO 22 3 18 1 0 5 17 3 8 11

EIO+ 35 4 24 7 1 7 27 4 14 17

ESO 5 0 4 1 0 4 1 2 2 1

EIO/a. 22 4 0 18 6 0 16 6 0 16

EIO+/a. 35 8 0 27 8 0 27 8 0 27

ESO/a. 5 2 1 2 2 1 2 2 1 2

Total 124 21 47 56 17 17 90 25 25 74

Zusätzliche Abweichung von der SpezifikationenA: 1, B: 0, C: 0


12 / 14




Zusammenfas-sung


Ergebnisse aus Fehler-basierten Test

Purposes

Op. No. A B C

TC � � ? � � ? � � ?

EIO 22 3 18 1 0 5 17 3 8 11

EIO+ 35 4 24 7 1 7 27 4 14 17

ESO 5 0 4 1 0 4 1 2 2 1

EIO/a. 22 4 0 18 6 0 16 6 0 16

EIO+/a. 35 8 0 27 8 0 27 8 0 27

ESO/a. 5 2 1 2 2 1 2 2 1 2

Total 124 21 47 56 17 17 90 25 25 74

Zusätzliche Abweichung von der SpezifikationenA: 1, B: 0, C: 0


12 / 14




Zusammenfas-sung


Zusammenfassung

Formales ioco testen: Praktisch Einsetzbar

Unvollständige Spezifikation

Nicht-Determinisitische Spezifikationen

Anwendbar auf industrielle Spezifikationen

Manuelle/Fehlerbasierte Testfälle


13 / 14




Zusammenfas-sung


Danke für Ihre

Aufmerksamkeit.


14 / 14

Seite 1Kosten und Nutzen des Softwaretests messen und optimieren.Copyright 2007: CICERO CONSULTING - all rights reserved.

Marcus Tullius CICERO

[106 v. Chr. - 43 v. Chr]

Christian [email protected]

Ingrid [email protected]

CICERO CONSULTING GmbHwww.cicero-consulting.com


1. Firmenvorstellung2. Qualitätskosten3. Nutzenkategorien4. Maßnahmen zur Testoptimierung5. Beispiel6. Daumenregeln


CICERO CONSULTING – Kompetenz umsetzen.

S SoftwareP ProduktivitätQ QualitätR Reengineering


Laut ISO-Normen misst die Industrie 4 Bereiche von Qualitätskosten.

Die Empirie belegt, dass besonders performante Unternehmen die Kosteneinsparung durch Investition in die 2 Bereiche Prävention und Testmethodik gewinnen.

• Fehlerverhütung und Prüfkosten

• Kosten für Fehlerverhütung

• Kosten für Prüfung / Test

• Fehlerkosten (Verluste)• Interne Fehlerkosten

(vor Auslieferung)

• Externe Fehlerkosten (nach Auslieferung)


KostenAufwand reduzieren – meist 1. Wahl

ZeitStraffere Testprozesse kürzeren Projektlaufzeittime-to-markethöhere Flexibilität

QualitätSteigerung von Qualität oft leider KEIN strategisches Ziel


Welcher Qualitätsanspruch ist wirtschaftlich?


Studie Fachkonzept Design Realisierung Test Stabilis. Wartung

1

10

100Problem-behebungs-kosten (rel.)




Seite 9

Entstehung des Problems

Erkennen und Beheben

Chaos-Zeitraum mit hohen Kosten




Seite 11

•

•

•

•

•

•

•

•

•

•




Seite 13

:


Seite 14

Initial(1)

Defined(2)

Planung des Tests (leider erst recht spät),Ausführungsbasiertes Testen

Integrated(3)

Integraler BestandteilQS ab Anforderungsanalyse

Managed(4)

Quantitative VorgabenMessen/VorhersagenDurchgängiges Testmanagement

Optimizing(5)

Ständige Test-Prozessverbesserung

Hier kostest Testautomatisierung mehr als es bringt !

Illinois Institute of Technology


Seite 15

Testschritt Manuelle Tests

Autom.Tests

Prozentsatz Verbesserung

Erstellung Testplan 32 40 -25 % (-50 %)

Testvorbereitung 262 117 55 % (-150 %)

Testausführung 466 23 95 % (- 60 %)

Analyse Testergebnisse 117 58 50 % (-25 %)

Überwachung Fehlerstatus 96 16 83 % (30 %)

Testberichte 96 16 83 % (0 %)

Gesamtaufwand 1069 270 75 % (-50 %)

Quality Assurance Institute, 1995 Bei ungenügender ReifeSeite 16Kosten und Nutzen des Softwaretests messen und optimieren.

Copyright 2007: CICERO CONSULTING - all rights reserved.Seite 16

Testschritt Manuelle Tests

Autom.Tests

Prozentsatz Verbesserung

Erstellung Testplan 100 100 0 %

Testvorbereitung 100 135 -35 bis -100 %

Testausführung 100 60 40 %

Analyse Testergebnisse 100 80 20 %

Testwiederholungen 250 25 90 %

Testberichte 100 40 60 %

Gesamtaufwand 750 440 ~ 40 %


Grundidee: Der Fachbereich/User soll ein “schlüsselfertiges” Produkt von derEntwicklung bekommen und macht maximal noch einen kurzenAbnahmetest.

Vorteile:• Hohe Effektivität und Effizienz in Testvorbereitung und Testdurchführung.

Erfahrungswert: Einsparung von 20% des Testaufwandes gegenüberdezentraler Durchführung.

• Optimierte Qualitätssicherung auch bei zunehmender Spezialisierung und Komplexität der IT.

• Einheitlicher Testprozeß und Einsatz der leistungsstärksten Tools.• 4-Augenprinzip (Test durch Spezialisten) und bessere Dokumentation der

Ergebnisse.


Der IT-Manager will den Test-Aufwand reduzieren, die Produktionsqualität darf nichtfallen. Kern der Maßnahmen ist, Fehler durch bessere Organisation und Methodikfrüher zu finden.

*) Annahme, siehe Literaturwerte und Benchmarking:


(Achtung: mit Vorsicht anwenden, dienen nur der ersten Plausibilisierung. Aktive Entscheidungen müssen mit genaueren Ansätzen begründet werden)In Literatur (ua Capers Jones), Benchmarking-Studien (zB ISBSG) bzw. Veröffentlichungen aus großen IT-Unternehmen (IBM, Boeing, ...) findet man diverse Daumenregeln, die man zur Plausibilisierung von QS-Statistiken, Vergleichswerten verwenden kann:

1 Function Point entspricht ~100 LOC (strukturierte Programmierung mit Cobol, C, uä)Fehlerpotenzial einer Neuentwicklung = FP^1,25 jeder einzelne QS-Schritt entdeckt nur 30% der wirklich vorhanden Fehler, 70% werden weitergeschleppt. Müssen 10 Fehler korrigiert werden, wird 1 neuer („bad fix“) eingebaut. Ohne ReTest geht es folglich nicht: Zyklus1: 300 Fehler, es werden im Zyklus2 nach Korrektur wieder 30 (neue) Fehler auftreten.Individualsoftware (Transaktionssysteme) werden mit ca. 75-85% Testeffizienz ausgeliefert, also mind. 15% der Fehler werden in der Wartung von den Anwendern gefunden. Je älter ein System, desto stabiler.Jede 6. Testfalleingabe (TDK) bringt einen neuen Fehler hervor.#Testfälle während der Entwicklung: FP^1,2 (zB: 1.000 FP ca. 4.000 TDK)Aufwand für TDK-Entwurf: ca. 30 min; Aufwand für TDK-Durchführung: ca. 20 min; Aufwand für Fehlerkorrektur: 2-10 h (abhängig von Teststufe)Phasenverteilung im Projekt: 30% Fachkonzept - 35% Programmierung/Build - 35% Test


1

Das SCCH ist eine Initiative der Das SCCH befindet sich im

Mining Repositories for Decision-Support in Software Test Management Softnet Workshop "Test and Verification"

Mag. Rudolf Ramler

+43 7236 3343 [email protected]

2Rudolf Ramler, 2007

Software Competence CenterHagenberg

Software Research Company

About 70 employees

Located at Softwarepark Hagenberg

Supported by the Kplus Program,Comet Program (starting 2008)

We focus on the areasSoftware Architectures and TechnologiesIntegrated Software Engineering ToolsKnowledge-based Vision SystemsIndustrial Data Warehousing & Data MiningProcess and Quality Engineering

Closing the Gap between Software Research and Software Product Development

2


Overview

Reality of Software TestingHow Defect Prediction WorksResults from IndustryConclusion and Research in Progress


Reality of Software Testing

Restrictive schedules, limited resources, fixed deadlines

100%0%

Test caseDefect found

Complete test

As scheduled

In the end

3


Solution: Setting Priorities

First things first Valuate and prioritize testing activitiesFocus on the defective modules!

Ideal setting

100%0%

Default

Open issue ???? ??? ??? ?????? ?????


80:20 Rule in Software Testing

Industrial data from a mature productAprox. 80% of the classes aredefect freeTop 10% contain 80% of the defects

"About 80 percent of the defects come from 20 percent of the modules,

and about half the modules are defect free."

Boehm & Basili, IEEE Computer 01/2001

0 253 507 760 1014 1267CLASS rank

0

11.

22.

33.

44.

55.

sdohteMcilbuPssalC

0 0.2 0.4 0.6 0.8 1.% CLASS

0

0.2

0.4

0.6

0.8

1.

%sdohteMcilbuPssalC

20%

4


Overview



module

How to predict defective modules?

A) GuessB) Expert estimationC) Defect prediction techniques

Ad C) Defect predictionFind an indicator i that predicts thedefectiveness d for a moduleStudies show that such indicators existUsually a combination: i1 * w1 + i2 * w2 …+ in * wn dResulting questions

Where to get the indicators from? SE repositoriesHow to combine the indictors? Data mining

correctdefective

module

?

5


Where to get the indicators from?

Versioning System

Software Metrics

Release Management

…Bug

Database

• Lines of Code• Cyclomatic complexity• Inheritance level• Coupling and Cohesion• Cyclic references• etc.

• Age in days since creation• Number of changes

- total, since last release- last week, 1/3/6 month, year

• Different developers• etc.

• Requirement priority• Requirement complexity• Development effort• Assigned developer• etc.

• Previous test results• Static analysis violations• Process maturity• Week day• etc.


Where to get the indicators from?

Versioning System

Software Metrics

Release Management

…Bug

Database

no…1580m2

yes…741224mn

:…………:

yes…991147m1

Defecti4 … ini3i2i1Module

6


How to combine the indictors?

Prediction model created via data mining:


Workflow for defect prediction

Versioning System

Software Metrics

Release Management

…

0.31222

0.27887

0.28660

03FIX Total_Binned _ 0, 1, 5, Infinity _Is_Zero

057FIX Total_Binned _ 0, 1, 5, Infinity _Is_High

ifthen

else

CCUser_staudere_Is_L

0.83427

0.88110Node size too small MinSup .

017FIX Total_Binned _ 0, 1, 5, Infinity _Is_Low

ifthen

else

CCUser_Pointner_IsAtLeast_L

ifthen

else

CCUser_penz_IsAtLeast_L

0.194135

07FIX Total_Binned _ 0, 1, 5, Infinity _Is _High

0.251128

06FIX Total_Binned _ 0, 1, 5, Infinity _Is_High

0.0591122Stopped MinEnt

ifthen

else

CCUser_penz_Is_VH

ifthen

else

CCUser_mittend_Is_VH

ifthen

else

CCUser_lnzccadmin_IsAtLeast_M

???? ??? ??? ?????? ?????

RISK

LOW HIGH

7


Overview



Study Setting

Data from product developmentSoftware system in C++7 consecutive versions (oneyear of development)

Analyzed repositoriesDefect databaseRelease databaseSource code repositorySotograph

Extracted indicatorsSize and structure metricsChange history per fileRelease history per fileUser activities

8


Study Results

Defect Prediction Accuracy

0,00

0,10

0,20

0,30

0,40

0,50

0,60

0,70

0,80

0,90

1 2 3 4 5 6 7

Release DBSource RepositorySotographAll Combined


Overview


9


Summary & Conclusion

Prediction of defective / defect-free modulesClassification of modules

Prediction uses indicators extracted from existing repositories

Repositories are available in software product development Effort remains for data extraction

Prediction performance 60% to 80% (not optimized)Performance equally for defective / defect-free modules


Effort for Data Extraction

Insufficient quality of dataInappropriate data models, manual data processing, rules and processes not followedMissing, wrong, inaccurate data

Data islandsHeterogeneous DBs, different semantic record matching issues

Complexity of the real worldDiscontinuities of processesNested versions and branches

Data volumeLarge number of entity typesToo few entries no significance

10


Research in Progress

Test Cockpit

Platform to organize and visualize information

Plug-ins to calculate measures and predictions

Data Warehouse

Store data prepared for analysis along timeline

Provide access for querying

Adapter

Retrieve data from test and development repositories

Data Warehouse

...

0102030405060708090

1st Qtr 2nd Qtr 3rd Qtr 4th Qtr

Cockpit

0102030405060708090

1st Qtr 2nd Qtr 3rd Qtr 4th Qtr

Plug-in

Adapter


Contact

Mag. Rudolf [email protected]+43 7236 3343 872http://www.scch.at

Institute for Software Technology

Model Checking forModel Checking forHardware and Software

Roderick BloemInstitute for Software Technology

Graz University of TechnologyGraz University of Technology

Professor Horst Cerjak, 19.12.20051

Roderick Bloem Graz, 7 November 2007 Model Checking


The ProblemMore time spent verifying than codingMore time spent verifying than coding

• Increasing design Complexity• Decreasing time to market• Faults unacceptable in many application areas

• Software test / hardware simulation inadequate– needs reference model– forgets corner cases– low coverage– Intel uses 6000 computers to simulate a microprocessor. They simulate

< 1 minute of CPU time before tape-out.– You cannot exhaustively simulate something as simple as a multiplier

• Don’t be the next to spent $500m on a recall




Help from Formal Methods?Formal MethodsFormal Methods• No Reference Model• No Self-Checking TestsN T t B h• No Test Bench

Instead:• Specify Properties Formally• Check Adherence Automatically. Full Coverage.• No Bugs!g

But:• Designers will not write a full specificationDesigners will not write a full specification• No time for hand proofs• Verification should not be limited to PhDs




Formal MethodsO i f t t f th t i f l th dOverview of state of the art in formal methods

• Hardware– Model checking in hardware becoming mainstream– Assertion Based Verification (not full coverage)– Model Checking

• Software– Model checking limited to specific domains

S– Simple code: Device Drivers– Simple properties: Buffer Overflows




Hardware• Systems are reactive

– Computation does not complete• Systems may be very complex• Systems are finiteSystems are finite• Fixing bugs is extremely expensive




Formal Light: Assertion-Based VerificationProperties stated in a temporal logic (PSL SVA )Properties stated in a temporal logic (PSL, SVA,…)

– always(if req-coffee then eventually coffee)– never(coffee and tea)– not(coffee before req-coffee)

• Properties automatically translated to HDLvoid not_coffee_before_req-coffee(req-coffee, coffee){

int state = 0;

if(state == 0 && coffee)emit_warning;

else if(req-coffee)state = 1;

}}

• Properties checked during simulation

Design

Property

Monitor




Formal Light: Assertion-Based Verification

N ld d l d d+ No golden model needed- Coverage not full

- Coverage determined by (random) simulation test bench

Assertions can help define coverage measures• Assertions can help define coverage measures

l lexample coverage goal:sequence req-coffee … req-tea … coffee

How do we get complete coverage?




Model Checking• Circuit = finite state machine = graph• Properties can be checked using exhaustive

graph search• Sizes of several thousand latchesS es o se e a ousa d a c es• Dealing with large systems

Abstraction– Abstraction– Assume/Guarantee Reasoning

Binary Decision Diagrams– Binary Decision Diagrams– SAT-based verification




BMC ExampleIn HDL: In hardware:In HDL:

module main(clk, in){i l 0

In hardware:

int val = 0;

if(in){

val

val = val + 1;} else {

val = 0;}if(val == 3){

alarm;alarm;}

}

in alarm




BMC ExampleUnrolling the circuit Can alarmx be one?

val0 = 0 val1 val2

Unrolling the circuit. Can alarmx be one?

val3

in0 alarm0 in1 alarm1 in2 alarm2

Propositional encoding.c� (a � b) Question: is there an assignment

ab

c (�c � a) �(� c � b) �( a � b � c)

to in0, in1, in2 (and the wires) such that alarm0=1 or alarm1 = 1 or alarm2 = 1?

Q ti i l d b SAT l



(�a � �b � c) Question is solved by a SAT solver


Model Checking in HardwareB i i t• Becoming mainstream

• Finds bugs in circuits with thousands of latches• Full coverage• Full coverage

– Great at finding corner-case bugs• Excellent complement to simulation-based verificationExcellent complement to simulation based verification• Specify what you check, check what you specify• Drawback: Block-based verification

• Assertion-based verification can be used with simulation approachesapproaches

• In use at IBM, Infineon, Intel, Motorola, TI, etc. etc.




Software• Infinite state systems• Focus on easy programs or easy properties

– Driver code– Embedded Software – Buffer overflows in C– Concurrency problems (deadlocks, races)y p ( , )

About my camera







Driver Verification at MicrosoftD i iti lDrivers are critical

– Run in kernel space, can wreak havocD i t d MS t lDrivers are not under MS control

– Developed by hardware companiesMS t if t i di– MS cannot verify correctness, impose codingstandards, educate designers, etc

Drivers are simpleDrivers are simple– Correctness often does not depend on details of driver

implementation or hardwareimplementation or hardware– Drivers are relatively small– Rules are simple



p


Examplevoid example(){do{do{lock();nPacketsOld = nPackets;req = devExt->WLHV;if(req && req > status){if(req && req-> status){devExt->WLHV = req->next;release();irp = req->irp;if(req->status > 0){if(req->status > 0){irp->IoS.status = SUCCESS;irp->IoS.Info = req->Stat;

} else {irp->IoS status = FAIL;irp->IoS.status = FAIL;irp->IoS.Info = req->Stat;

}smartDevFreeBlock(req);IoCompleteRequest(irp);IoCompleteRequest(irp);nPackets++;

}// if req} while(nPackets != nPacketsOld);release();



release();}


ExampleA d i t t k

void example(){do{ • A driver may not take or

release a lock twice in a row

do{lock();nPacketsOld = nPackets;req = devExt->WLHV;if(req && req > status){

• In this program, this is ensured by keeping track of

if(req && req-> status){devExt->WLHV = req->next;release();irp = req->irp;if(req->status > 0){ ensured by keeping track of

nPackets and nPacketsOldif(req->status > 0){irp->IoS.status = SUCCESS;irp->IoS.Info = req->Stat;

} else {irp->IoS status = FAIL;

• We can abstract the other details

irp->IoS.status = FAIL;irp->IoS.Info = req->Stat;

}smartDevFreeBlock(req);IoCompleteRequest(irp); detailsIoCompleteRequest(irp);nPackets++;




release();}


Examplevoid example(){do{

void example(){do{

b: (nPacketsOld == nPackets)

do{lock();b = true;skip;if(*){

do{lock();nPacketsOld = nPackets;req = devExt->WLHV;if(req && req > status){ if(*){

skip;release();skip;if(*){

if(req && req-> status){devExt->WLHV = req->next;release();irp = req->irp;if(req->status > 0){ if(*){

skip;skip;

} else {skip;

if(req->status > 0){irp->IoS.status = SUCCESS;irp->IoS.Info = req->Stat;

} else {irp->IoS status = FAIL; skip;

skip;}skip;skip;

irp->IoS.status = FAIL;irp->IoS.Info = req->Stat;

}smartDevFreeBlock(req);IoCompleteRequest(irp); skip;

b = b ? false: *;}// if

} while(!b);release();

IoCompleteRequest(irp);nPackets++;




release();}

release();}


Abstraction/Refinement

C Program AbstractAbstractProgram Boolean Model

Ch kg Checker

Predicates YES C Program

NO

Predicates(initially empty) Correct?

NO

YES C Programis correct

CounterexampleAnalyze

counterexampleon original program

True counterexam

ple?

YESYES

BUG




ExperienceSLAM is part of Microsoft’s Static Driver Verifier:• SLAM is part of Microsoft’s Static Driver Verifier:– SDV comes with set of standard rules that device

drvers must adhere to• Run daily on device drivers inside and outside of

Microsoft• Many critical bugs fixed using the SDV.




Array Bound ChecksB ff fl j blBuffer overflows a major problem• Software written in C (or C++), no bound checks• Buffer overruns have drastic consequences• Buffer overruns have drastic consequences• Array bound are not stored in single place

– In length field,In length field,– As pair of start/end pointers,– Using a terminator,…

ButS ti l d i l• Semantics clear and simple

This part of the talk is based on a presentation by Manuvir Das of Microsoft at CAV 2006




ExampleFunction header:Function header:BOOL WINAPI SetupGetStringFieldW(

IN PINFCONTEXT Context,i ld dIN DWORD FieldIndex,

OUT PWSTR ReturnBuffer,IN DWORD ReturnBufferSize,… );

Function call:WCHAR szPersonalFlag[20];…SetupGetStringFieldW(&Context,1,szPersonalFlag,50,…);SetupGetStringFieldW(&Context,1,szPersonalFlag,50,…);

Is this correct?


Roderick Bloem Graz, 7 November 2007 Model CheckingExample due to Manuvir Das, (c) Microsoft


ExampleFunction header:Function header:BOOL WINAPI SetupGetStringFieldW(

IN PINFCONTEXT Context,IN DWORD FieldIndexIN DWORD FieldIndex,__out_ecount(ReturnBufferSize)OUT PWSTR ReturnBuffer, IN DWORD ReturnBufferSize,IN DWORD ReturnBufferSize,… );

Function call:WCHAR szPersonalFlag[20];…SetupGetStringFieldW(&Context,1,szPersonalFlag,50,…);

Annotation tells compiler where lengths are stored• Warning when compiling call: Buffer is of size 20, not 50.


Roderick Bloem Graz, 7 November 2007 Model CheckingExample due to Manuvir Das, (c) Microsoft


Standard Annotation Language (SAL)SAL annotates function callsYou can state

– Buffer size: length, endpointer, sentinal– Non-nullness of pointersp– writability, readability of buffers

both before and after callboth before and after call




Checking Correctnessid FillSt i (void FillString(

__out_ecount(cchBuf) TCHAR* buf,

size t cchBuf,size_t cchBuf,

char ch) {

for (size_t i = 0; i < cchBuf; i++) {

buf[i] = ch;

}}

}

Tools can infer that the access to buf[i] is safe (and that it is unsafe if you use <=.)



y )


Tool Support• Infer annotations automatically

– 40% of annotations inferred automatically• Find bugs in calls

– Calls with incorrect sizes– Code that does not respect bounds




SAL in Practice• Windows Vista

– Goal: annotate 100 000 function calls– Achieved: 500 000 function calls annotated (40% by

tools)– more than 20 000 bugs found and fixed

• Low false alarm rate




ConclusionsFormal methods: Catch the hard bugsFormal methods: Catch the hard bugs

– Complements test– Checks only what you specify

H dHardware– Becoming mainstream– Routinely finds bugs in circuit with thousands of latches– Can be integrated in simulation (with loss of coverage)

Software– Very promising– Found its way into commercial toolsy– Currently limited to simple properties or simple software (embedded etc)

Extensions– Locate faultsLocate faults– Repair faults– Generate test cases from specs– Synthesize systems from specs



1

Bringing Software Engineering Science into Practice

Professor Horst Cerjak, 19.12.20051/18

Bernhard Peischl 1. Softnet Workshop – Test & Verifikation Kompetenznetzwerk Softnet Austria

Dr. Bernhard PeischlInstitut für Softwaretechnologie

Technische Universität Graz

Outline

• Motivation und Abgrenzung• Industrielle und akademische Partner• Kernkompetenzen



• Rechtsform• Fördergeber und Förderstruktur• Beispiel eines Referenzprojekts• Zusammenfassung

Softnet Austria Kompetenznetzwerk (1) • Komplexität in Software

– Telekommunikation– Automobilindustrie– Luftfahrt– Eingebettete Systeme

A i i h ik



– Automatisierungstechnik– Banken und Versicherungen

• Softwarekrise– Gepäcksystem Flughafen

Denver– Flugreservierungssystem

SABRE– Flugkontrollsystem AAS– …

Softnet Austria Kompetenznetzwerk (2)

• Ziele– Vermeidung von Softwarefehlern– Verbesserung von Software Qualität

Kostensenkung bei gleichbleibender Qualität



– Kostensenkung bei gleichbleibender Qualität– Steigerung der Wettbewerbsfähigkeit– Förderung der anwandten Forschung in

Österreich– Sprachrohr der österreichischen Softwareindustrie– Institutionalisierter Zugang zum Themenbereich

Software Engineering

Softnet Austria – Partner (1)• Partner aus fast allen Bundesländern

– Wien, Steiermark, Tirol, Kärnten, Oberösterreich• Industrielle Partner

– KMUs und „big player“:– Siemens Austria AG, PSE– Kapsch CarrierCom AG

SoftLab GmbH



– SoftLab GmbH.– Mobilkom AG– SailLabs GmbH.– UMA GmbH.– Ximes GmbH.– Cicero Consulting GmbH.– AVL GmbH.– Arctis GmbH.– SCCH– E-Novation GmbH.

Softnet Austria – Partner (2)

• Akademische Partner– Universität Wien– Technische Universität Wien– Technische Universität Graz– Universität Klagenfurt



Universität Klagenfurt– Universität Innsbruck– Center of Usability Research

and Engineering, CURE

• Leitung des Netzwerks– Univ.-Prof. Dipl.-Ing. Dr. Franz Wotawa– TU-Graz, Institut für Softwaretechnologie

2

Softnet Austria – Kernkompetenzen (1)

• SOFT T&M – Techniken und Modelle (formale Methoden, Verifikation)– 3 Projekte

• SOFT WEB



– Web Entwicklung und Benutzerschnittstellen– 3 Projekte

• Forschungslandschaft– Koordination vom Institut für Softwaretechnologie, TU-Graz– Angewandte Forschung im Software Engineering neben

• Gundlagenforschung (FWF)• EU-Projekte• FIT-IT, FFG-Projekte

Softnet Austria – Kernkompetenzen (2)• Softwareentwicklungsmethoden

– Anwandte Forschung, Adaption, Evaluierung– Deployment von Verfahrenstechniken– Unterstützung bei Entwicklung

• Wissenschaftliche Betreuung von Referenzprojekten– Mehrere Partner pro Projekt



Mehrere Partner pro Projekt– Akademisch geleitete Arbeitsgruppen– Arbeitsgruppen sind inhaltlich vernetzt

• Aufbau einer Wissensdatenbank – Öffentlicher Zugang– Aufbereitung der Ergebnisse aus Referenzprojekten– Recommender System (Aufwandschätzung, Test)

• Wissenschaftliche Publikationstätigkeit• Durchführung von Workshops & Seminaren

Softnet Austria – SOFT-WEB

• SOFT-WEB– Leitung: Univ.-Prof. Dr. Manfred Tscheligi – Center of Usability Research and Engineering

• InhalteUser Interface Design



– User Interface Design– Semistrukturierte & semantische Datenhaltung– Planungs- und Optimierungslösungen– Software Entwicklungsprozeß (Agile Methoden)

• Anwendungsgebiete– Benutzerschnittstellen für Motorprüfstände– Arbeitszeitplanung– Mobile Multi-Media Applikationen– Web Services und Semantik Web

Softnet Austria – SOFT T&M (1)

• SOFT T&M– Leitung: Univ.-Prof. Dr. Franz Wotawa– Institut für Softwaretechnologie, TU-Graz

• Inhalte



– Komplexitätsbeherrschung im Software Engineering– Testdatenmanagement– Testdatengenerierung– Formale Verifikationstechniken– Debugging, Fehlerlokalisierung– Software Architektur– Modellierungstechniken

Softnet Austria – SOFT T&M (2)

• Typische Applikationsdomänen– Sicherheitskritische Software

• Automobilindustrie• Eingebettete Systeme

H h fü b S ft S t



– Hochverfügbare Software-Systeme• Internettelefonie (VoIP)

– Geschäftskritische Software• Banken und Versicherungen• Transaktionsserver etc.

– Software Modelle für E-Government

Softnet Austria - Rechtsform

• Gemeinnütziger Verein– „Verein zur Förderung der Forschung im Bereich der angewandten

Softwareentwicklung in Österreich – Softnet Austria“– Vorsitzender

• Univ Prof Dr Franz Wotawa



• Univ.-Prof. Dr. Franz Wotawa– Keine Gewinnorientierung

• Zentrale Aufgaben– Inderdisziplinäre Förderung innovativer Forschung– Koordination von Forschungsprojekten– Etablierung eines zentralen Informationsknotens– Förderung der Weiterbildung

• Workshops, Tagungen, Schulungen, Seminare– Förderung des wissenschaftlichen Nachwuchses

3

Softnet Austria - Förderstruktur

• Bund– 2/3 der 60% des Gesamtvolumens– Bundesministerium für Wirtschaft und Arbeit

• Länder1/3 d P j kt l



– 1/3 des Projektvolumens– Land Steiermark– Land Wien, Stadt Wien

• Industrielle Partner– 40% des Gesamtprojektvolumens– Barleistung und In-Kind

Zusammenfassung• Angewandte Forschung im Softw. Eng.• Ziele

– Qualitätssteigerung, Fehlervermeidung, Kompexitätsbeherrsch.– Kostensenkung bei gleichbleibender Qualität

t l F h bi t



• zentrale Forschungsgebiete– SOFT-T&M

• Formale Verifikation• Hochverfügbare & sicherheitskritische Systeme

– SOFT-WEB• Web Entwicklung und Benutzerschnittstellen• Semantische Datenhaltung, agile Prozesse

• Partner aus fast allen Bundesländern– Big player und KMUs

Gefördertes Projektgeschäft im Software Engineering

Siemens

ISTTestsequenz-, Testdatengenerierung,Spezifikationen

Entwicklungspradigmen,Eingebettete SystemeSicherheitskritische SoftwareZeritifizierungTools

Umestzungs-KompetenzProzesse, IST Analyse,Pil ti

Fundierung MethodikResearch Directions

Module Mutations-Analyse,Statistischer Test



Softlab

CiceroConsulting

SCCHTestautomatisierung,Testprozess,Dataware House

Pilotierung

Testaufwandschätzung,Tools,Legacy Aspekte,KMU Fokus

Research DirectionsModul Testcockpit(Entscheidungsfindung)

Modul Testcockpit(Testaufwands-Schätzung)

Innovationfür Beratung

Testmethodik und Testmanagement

• Portfolio von Techniken• Methodenratgeber

– Branchenspezifische Sicht• Automotive, Telekommunikation, Banken- und Versicherungen



, , g– Entwicklungsparadigma

• Web Entwicklung, Embedded Systems– Domänenwissen und Fehlermodelle– Testabdeckung: Code, Spezifikation und Fehlermodelle– Testmanagement und Test(management)prozess– Anforderungen an Zertifizierung

Protokoll Conformance Test

• Existierende Prototypen– CADP Toolbox– TGV Tool

• Herausforderungen in der Praxis



• Herausforderungen in der Praxis– Korrekte Spezifikation (Review)– Ausführung der Testfälle– Fokussierung der Testfallgenerierung– Strukturelle und formale Methodik

Kontakt

Verein zur Förderung der Forschung im Bereich der angewandten Softwareentwicklung in Österreich -

Softnet Austria kurz: Softnet Austria



Inffeldgasse 16b A-8010 Graz, Austria

Tel: (+43|0)316-873-5711 Fax: (+43|0)316-873-5706

http://[email protected]

4

Softnet Austria



Danke für Ihr Interesse !

Model-Based Software Debugging

Markus Stumptner

University of South Australia

Advanced Computing Research Centre

Joint work with Wolfgang Mayer, Franz Wotawa,

Dominik Wieland, Gerhard Friedrich,

Cristinel Mateis, Thomas Havelka, . . .

Outline

• Introduction to automated debugging

• Modelling approaches

– Static, dynamic, mixed, and abstract models

• Empirical comparison

• Complex fault models

• Outlook

SACS’07 1

Origins of MBD

• Classic diagnostic Expert Sys-

tem reasoning: from symptoms to

causes.

MYCIN: If the patient has red spots

and the patient has high fever

and the patient has received medi-

cation of type X

then there is reason to suspect an

infection of type Y

• Worked well with limited problem

space

SACS’07 2

Origins of MBD

• Classic diagnostic Expert Sys-

tem reasoning: from symptoms to

causes.

MYCIN: If the patient has red spots

and the patient has high fever

and the patient has received medi-

cation of type X

then there is reason to suspect an

infection of type Y

• Worked well with limited problem

space

• ...but did not work well with this:

Create PDF with GO2PDF for free, if you wish to remove this line, click here to buy Virtual PDF Printer

SACS’07 3

Model-Based Diagnosis (MBD)

• Model: Predicts behavior of individual components

∀B : battery : ¬ab(B) → ∃V(V > 1.2 ∧ V < 1.6 ∧ volt(B,V)

• Observations: reflect behavior of the actual system

• Diagnosis: Derive conclusions about component correctness

• Structure and Function

Model

System Description

Predictions Discrepancies Observations

Diagnoses

Actual System

SACS’07 4

Model-Based Diagnosis (MBD)

Diagnosis Problem [Reiter AIJ 87 ] (SD,COMP,OBS) where

• SD . . . Model

• COMP . . . Diagnosis components

• OBS . . . Observations

Diagnosis [Reiter] ∆ ⊆ COMP where

SD ∪ OBS ∪ {¬AB(C)|C ∈ COMP \∆} ∪ {AB(C)|C ∈ ∆}

is consistent

¬AB(C) . . . Component C is correct.

Efficient computation: A Conflict is a set of components that cannot be all

working correctly given OBS.

SACS’07 5

Outcomes

• Applications in power network, car, engine diagnosis

• Many Algorithms, all handle multiple faults

– Hitting Set [Reiter 87] for the tree of minimal conflict sets

– GDE [de Kleer, Williams 88]: ATMS-based

– Specialised Algorithms for trees: SAB [El Fattah, Dechter IJCAI 95],

TREE* [Stumptner, Wotawa AIJ 01]

– Conflict-Directed A* [Williams, Ragno 01]

– Relational Constraint Solver [Mauss, Tatar AI Com 04]

• General methods for prioritisation: set minimality, cardinality, probability

• General measurement point selection: entropy based [de Kleer AAAI 91]

• Approaches for abstraction [Genesereth, Mozetic, Provan ...]

SACS’07 6

• Combining diagnosis and repair

• General focus on modeling

SACS’07 7

Debugging Problem

Debugging scenario

• Given: faulty program and test cases

• Goal: find program elements that explain failing test runs

• Replace faulty program elements such that tests no longer fail

× fault detection, test generation, verification

SACS’07 8

Modelling Programs

Debugging Problem (P,COMP,SPEC) where

P . . . Model representing the program

COMP . . . Program components (methods, statements)

SPEC . . . Anticipated behaviour (test cases)

Diagnosis [Reiter, AIJ 87] ∆ ⊆ COMP where

P ∪ SPEC ∪ {¬AB(C) | C ∈ COMP \∆} ∪ {AB(C) | C ∈ ∆}

is consistent.

¬AB(C) . . . Component C is not responsible for a failure.

SACS’07 9

Purpose of MB Debugging

• Locating (and possibly correcting) a bug in a piece of software

• Given: Program and (a set of) test cases

• No formal specification required (not verification) specified

• Program is compiled to a (logical) model without human interference specified

• History of models

– Prolog (Console, Friedrich, Theseider Dupre, IJCAI ’93, Bond ’94)

– VHDL (Stumptner, Friedrich, Wotawa, AIJ ’99, Wotawa AIJ ’02)

– Functional Language (Stumptner, Wotawa IJCAI ’99)

– Java (Mayer, Stumptner, Wotawa 2002+)

SACS’07 10

Exampled=2

(1) r = d÷ 3

(2) A = r2π

(3) C = 2rπ

A=1.39626, C=2.09439

A=3.14159, C=6.28318

d=2

(1′) r = � ⇐ assume AB(1)

(2) A = r2π

(3) C = 2rπ

A=3.14159, C=6.28318

• Test if an expression � exists to make program compute the anticipated

result - must be approximated.

SACS’07 11

Hardware

• System Description describes the correct behavior and structure

• Observations: output of real system

• Faults: components delivering incorrect results

• Repairs: expensive physical activity (either in parts or in work time)

SD

1 0

Reality (OBS)

SACS’07 12

Software

• System description = incorrect behavior

• Correct behavior → test cases: prescribed observations with limited or no

structural correspondence (only output values in the worst case)

• Structural faults possible (missing or additional connections or program parts)

• Repair: trivial to execute but not find

Test Case

Program (SD)

0

1

...or 0

SACS’07 13

A Model-Based Debugger

Diagnosis Engine

Converter

Java Compiler

...........

SemanticsLanguage

Test−Cases

InternalRepresentation

ModelFragments

Models

ModelN

Diagnoses

Formal

Model1

Program

SACS’07 14

Java Model Families

• Dependency-based models

– Loops collapsed into single components

– No backward reasoning

• Structured value-based models

– structured control flow

– Limited backward execution

• Unstructured Control Flow

– Exception Handling (8% to 23% of Java methods depending on study)

– recursion

– flexible hierarchical components

– backward execution

• Trace-based model computation

SACS’07 15

Dependency-Based Models

• Components correspond to statements

– Ports: represent used and modified variables

– Model dependencies between input and output ports

– Distinguish “correct” and “incorrect” value

– Static analysis to ensure model is finite

– Fast but imprecise

b

c

d

e

f

g

h

s1

s2

s3

i1 i2

i3 i4

a

SACS’07 16

Interactive Debugger

Entropy-based automatic break point selection

SACS’07 17

Value-Based Models[Mateis et.al., DX 98; Mayer & Stumptner, ASE 03]• Predict exact value; no information otherwise

• Test cases specify anticipated result

• Model derived from data flow equations

x y

z=x * y

z

¬AB(C) → z = x ∗ y

• Static approximations imply large models (represents all possible executions;

Dynamic data structures; Loops, recursion)

• Spurious “consistency” due to weak predictor

SACS’07 18

Abstraction-based Models

• Dynamically generate model equations

– Combination of execution and static analysis

– Tailored representation for each test case

– Complex analyses are unnecessary if most values are known

– Non-relational abstraction

• Bidirectional Abstract Interpretation [Bourdoncle, PLDI 93]

– Interval lattice

– Invariant and intermittent assertions

– Eliminate paths that cannot satisfy assertions

– Approximate program traces if control flow is not known

– Constraints in program and test spec prevent accumulation of imprecision

• Certain programs difficult to abstract (e.g., bit operations, data structures)

SACS’07 19

Abstract Traces

i = 0

while (i < values.length) {if (values[i] < k)

values[i] = 0;

i++;

}

SACS’07 20

Debugging by Model-Checking [Griesmayer/Staber/Bloem 06]

• Use a model checker to prove that a (short) correct execution exists (does not

exist) given some fault assumptions

• Test specification translated into assertions

• Program executions represented as propositional SAT problem

– Upper bound on execution steps assumed

– Restricted to paths that satisfy test specification

• If the final program state can be reached, an explanation has been found;

otherwise, the set of fault assumptions is inconsistent

• Precise modelling of language semantics and data structures

• High complexity

SACS’07 21

Measuring Report Quality [Renieris, KBSE 03]

• Different debugging paradigms are difficult to compare

• Estimation of manual effort to find true fault given implicated components as

quality indicator

• Use graph-based representation to compute measure (Score)

• Program Dependence Graph: represent statements, control and data

dependencies

score := 1− |BreadthFirstSearch(PDG(P),Report ,TrueFault)||PDG(P)|

• “Fraction of the program that is eliminated given Report.”

SACS’07 22

Score Computation

score := 1− |BreadthFirstSearch(PDG(P),Report ,TrueFault)||PDG(P)|

SACS’07 23

Experimental Results

9 programs, 27–105 LoC, 2–131 tests,

5–41 variants with a single seeded fault each

●

●

●

●

SDM DDM Exec VBM AIM

0.0

0.2

0.4

0.6

0.8

Sco

re

SACS’07 24

Debugging by State Mutation [Cleve, ICSE 05]

• Compare close correct and failing executions

• Interrupt and alter program states

• Dynamic programming to isolate (minimal) failure-inducing differences and

causal chains “The program failed because the value of x changed from 5 to 1, which

caused the value of y in line 273 to change from 3 to 4, which violated the test case.”

• Effective if suitable test runs are available

z=10

x=1

y=4

x=5

y=3

z=10

inject

ok okfail

SACS’07 25

Score Distribution

Difference-based approaches may localise some faults precisely, Model-based

approaches are more consistent

[0,0

.1)

[0.1

,0.2

)

[0.2

,0.3

)

[0.3

,0.4

)

[0.4

,0.5

)

[0.5

,0.6

)

[0.6

,0.7

)

[0.7

,0.8

)

[0.8

,0.9

)

[0.9

,1]

%

0

10

20

30

40

50AIMBMCNNDD

SACS’07 26

Isolating Assignment Faults

• Previous models assume target variable is correct

• Relaxed fault assumptions: RHS previous models, LHS switch target variable,

GSF allow global side-effects

• 38 variants of TCAS, 10 revealing test cases,

models AIM and BMC

Model Diagnoses Located Score Time (s)

RHS 5 21 % 0.840 12

LHS 6 100 % 0.923 44

GSF 18 100 % 0.769 51

• Model checker CBMC cannot find a solution for LHS, AIM and BMC similar for

GSF

• Less precise abstraction may provide superior results

SACS’07 27

Towards Isolating Structural Faults[Mayer & Stumptner, AADEBUG 03; Wotawa, IEA/AIE 05]

• Complementary models to detect and confine omission and commission

faults, wrong variables, etc.

• Design information: UML, abstr. dependencies, contracts, . . .

• Conflicts in complementary models hint at likely faults and limit the scope of

fault assumptions

context Queue :: add(o : Object) :

void

pre: self .size < self .capacity

post: self .elements.exists (e : e =

o)

SACS’07 28

Assumptions

• Correct program must be a close variant

– Cannot deal with wrong algorithms, missing classes, . . .

• Computational processes only

– No external interfaces: files, GUI, sockets, native code, . . .

– Single thread of execution

• No ”structural faults” (relaxed later)

– Set of modified variables is the same in both programs

– Assignments to wrong variables

– Missing code

• (Short error traces)

SACS’07 29

Outlook

• How much specification is required to isolate faults?

• Ranking of explanations

• Generalised fault assumptions

• Model selection

• Concurrency

• Combination of different approaches

SACS’07 30

Conclusions

• MBSD as a flexible approach to automated debugging

• Potential to integrate different techniques through “components+fault modes”

interface

• Program and test cases only; no formal specification is required

• Static models unsuitable for complex programs

• MBSD provides consistent results; Difference-based approaches isolate only

some faults better if a similar correct test case is available

• Precision of non-interactive debugging is limited by redundancy in data-flow

• Dynamic models not effective if initialisation of DS is faulty

• A better idea of anticipated behaviour is required

• Further application to debugging of knowledge bases and problem solving

requirements [Felfernig et al. AIJ 04]

SACS’07 31

Teilnehmerliste

Name E-Mail Adresse Ahsan Syed Nadeem [email protected] Aichholzer Oswin [email protected]

Alvera Georg [email protected] Beer Andreas [email protected] Bloem Roderick [email protected]

Brandl Harald [email protected] Brandstätter Gerhard [email protected] Breu Michael [email protected]

Ennemoser Christoph [email protected] Galler Stefan [email protected] Garcia-Barrios Victor Manuel [email protected] Gruber Thomas [email protected] Happich René [email protected] Herget Jenö [email protected] Herzner Wolfgang [email protected] Hlavacs Helmut [email protected]

Hoisl Alexander [email protected] Hussain Zahid [email protected] Jürgen Wuchta [email protected] Kogler Andreas [email protected] Krenn Willibald [email protected] Lackner Christian [email protected]

Lang Richard [email protected] Lechner Martin [email protected] Lindner Daniel [email protected] Marks Peter [email protected] Mihai Nica [email protected] Milchrahm Harald [email protected] Mohacsi Stefan [email protected]

Neuber Eugen [email protected] Overbeck Jan [email protected] Peer Franz Josef [email protected] Peischl Bernhard [email protected]

Pichler Günther [email protected] Pichler Petra [email protected]

Pilz Alexander [email protected] Prügger Johannes [email protected] Ramler Rudolf [email protected]

Reicher Brigitte [email protected] Schaffler Gerald [email protected] Shahzad Sara [email protected]

Stefan Lukas [email protected] Steinberger Reinhold [email protected] Stumptner Markus [email protected]

Thurnher Christopher [email protected] Uitz Josef [email protected] Unterberger Robert [email protected] Unterguggenberger Ingrid [email protected]

Weiglhofer Martin [email protected]

Wöber Gernot [email protected] Wotawa Franz [email protected]

1. Softnet-Workshop “Testen und Verifikation” · Test case Generation), dessen Ziel die...

Documents

Transcript of 1. Softnet-Workshop “Testen und Verifikation” · Test case Generation), dessen Ziel die...