Modeling History to Understand Software Evolution with Hismo 2008-03-12

Modeling History to Understand Software Evolution Tudor Gîrba www.tudorgirba.com

description

Over the past three decades, a growing body of research has been devoted to understanding software evolution. However, the approaches developed so far rely on ad-hoc models or on overly specific meta-models, which makes it difficult to reuse or compare their results. We argue for an explicit and generic meta-model that recognizes evolution as an explicit phenomenon and models it as a first-class entity. Our solution is to encapsulate evolution in the explicit notion of a history as a sequence of versions, and to build a meta-model, called Hismo, around these notions. To show the usefulness of our meta-model, we exercise its different characteristics by building several reverse engineering applications.

Transcript of Modeling History to Understand Software Evolution with Hismo 2008-03-12

Page 1: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Modeling History to Understand Software Evolution

Tudor Gîrba

www.tudorgirba.com

Page 2: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Modeling History

to Understand Software Evolution

submitted by

Tudor Gîrba

from Romania

Inaugural dissertation of the Faculty of Science (Philosophisch-naturwissenschaftliche Fakultät) of the University of Bern

Thesis supervisors:

Prof. Dr. Stéphane Ducasse

Prof. Dr. Oscar Nierstrasz

Institute of Computer Science and Applied Mathematics

Page 3: Modeling History to Understand Software Evolution with Hismo 2008-03-12

[Figure: forward engineering]

Page 4: Modeling History to Understand Software Evolution with Hismo 2008-03-12

[Figure: forward engineering, actual development]

Page 5: Modeling History to Understand Software Evolution with Hismo 2008-03-12

[Figure: reverse engineering, forward engineering, actual development]

Page 6: Modeling History to Understand Software Evolution with Hismo 2008-03-12

[Figure: reverse engineering, forward engineering, actual development, reverse engineering]

Page 7: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Lehman et al., 2001

Most often, time is put on the horizontal axis and a property on the vertical axis.

Page 8: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Lanza, Ducasse, 2002

Evolution Matrix shows how classes evolve. Time is still on the horizontal axis.

Page 9: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Gall et al., 2003

Co-change analysis recovers hidden dependencies. Time is represented by the lines.

Page 10: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Eick et al., 2002

Evolution information can be mapped onto structural information.

Page 11: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Lehman et al., 2001

Eick et al., 2002

Lanza, Ducasse, 2002

Gall et al., 2003

...

Page 12: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Lehman et al., 2001

Eick et al., 2002

Lanza, Ducasse, 2002

Gall et al., 2003

...

How can we accommodate all these techniques?

Page 13: Modeling History to Understand Software Evolution with Hismo 2008-03-12

short intermezzo

What is a model?

Page 14: Modeling History to Understand Software Evolution with Hismo 2008-03-12

short intermezzo

A model is a simplification of the subject, and its purpose is to answer some particular questions aimed towards the subject.

Bézivin, Gerbé, 2001

Page 15: Modeling History to Understand Software Evolution with Hismo 2008-03-12

short intermezzo

What is a meta-model?

Page 16: Modeling History to Understand Software Evolution with Hismo 2008-03-12

short intermezzo

A meta-model is a model that makes statements about what can be expressed in valid models.

Seidewitz, 2003

Page 17: Modeling History to Understand Software Evolution with Hismo 2008-03-12

short intermezzo

A good meta-model allows for succinct expression of analyses.

Page 18: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Lehman et al., 2001

Eick et al., 2002

Lanza, Ducasse, 2002

Gall et al., 2003

...

Page 19: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Lehman et al., 2001

Eick et al., 2002

Lanza, Ducasse, 2002

Gall et al., 2003

...

How can we accommodate all these techniques?

Page 20: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Evolution Matrix shows class changes.

Idle class

Pulsar class

Supernova class

White dwarf class

Class

attributes

methods

Lanza, Ducasse, 2004

Page 21: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Evolution Matrix shows class changes.

Idle class

Pulsar class

Supernova class

White dwarf class

Class

attributes

methods

Lanza, Ducasse, 2004

Page 22: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Evolution Matrix shows class changes.

Idle class

Pulsar class

Supernova class

White dwarf class

Class

attributes

methods

Evolution needs to be modeled as a first-class entity.

Lanza, Ducasse, 2004

Page 23: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Evolution Matrix shows class changes.

Idle class

Pulsar class

Supernova class

White dwarf class

Class

attributes

methods

Lanza, Ducasse, 2004

Page 24: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Evolution Matrix shows class changes.

Idle class history

Pulsar class history

Supernova class history

White dwarf class history

ClassHistory

isPulsar

isIdle

...

Page 25: Modeling History to Understand Software Evolution with Hismo 2008-03-12

ClassVersion

SystemVersion

Page 26: Modeling History to Understand Software Evolution with Hismo 2008-03-12

ClassVersion

ClassHistory

SystemVersion

Page 27: Modeling History to Understand Software Evolution with Hismo 2008-03-12

ClassVersion

ClassHistory

SystemVersion

SystemHistory

Page 28: Modeling History to Understand Software Evolution with Hismo 2008-03-12

ClassVersion

ClassHistory

SystemVersion

SystemHistory

Page 29: Modeling History to Understand Software Evolution with Hismo 2008-03-12

ClassVersion

ClassHistory

SystemVersion

SystemHistory

Hismo models history as a first-class entity.

Gîrba, 2005
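The four entities named on this slide can be read as a small data structure. The sketch below is only an illustration under assumptions: the field names (`name`, `number_of_methods`, `label`, `classes`, `versions`) and the helper `class_histories` are hypothetical and are not taken from Hismo itself. It shows the core idea that a history is an ordered sequence of versions, and that class histories can be derived from a system history.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ClassVersion:
    """Snapshot of one class in one system version (fields are illustrative)."""
    name: str
    number_of_methods: int


@dataclass
class SystemVersion:
    """Snapshot of the whole system: the class versions that exist in it."""
    label: str
    classes: Dict[str, ClassVersion] = field(default_factory=dict)


@dataclass
class ClassHistory:
    """Time-ordered sequence of the versions of one class."""
    name: str
    versions: List[ClassVersion] = field(default_factory=list)


@dataclass
class SystemHistory:
    """Time-ordered sequence of system versions."""
    versions: List[SystemVersion] = field(default_factory=list)

    def class_histories(self) -> Dict[str, ClassHistory]:
        """Group class versions by name so that each class gets its own history."""
        histories: Dict[str, ClassHistory] = {}
        for system_version in self.versions:
            for name, class_version in system_version.classes.items():
                histories.setdefault(name, ClassHistory(name)).versions.append(class_version)
        return histories
```

With this shape, questions about evolution become queries over `ClassHistory` objects instead of ad-hoc comparisons between snapshots.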

Page 30: Modeling History to Understand Software Evolution with Hismo 2008-03-12

1. Measuring history

2. Yesterday’s Weather

3. Time-based Detection Strategies

4. Visualizing the evolution of hierarchies

5. Detecting co-change patterns

6. How developers drive evolution

Page 31: Modeling History to Understand Software Evolution with Hismo 2008-03-12

1. Measuring history

Page 32: Modeling History to Understand Software Evolution with Hismo 2008-03-12

2 4 3 5 7

2 2 3 4 9

2 2 1 2 3

2 2 2 2 2

1 5 3 4 4

What changed? When did it change? ...

Page 33: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Number of methods of class C per version: 1 5 3 4 4

Evolution of Number of Methods:

ENOM(C) = \sum_{i=2}^{n} |NOM_i(C) - NOM_{i-1}(C)|

ENOM(C) = 4 + 2 + 1 + 0 = 7

Page 34: Modeling History to Understand Software Evolution with Hismo 2008-03-12

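Number of methods of class C per version: 1 5 3 4 4

Latest Evolution of Number of Methods:

LENOM(C) = \sum_{i=2}^{n} |NOM_i(C) - NOM_{i-1}(C)| \cdot 2^{i-n}

LENOM(C) = 4 \cdot 2^{-3} + 2 \cdot 2^{-2} + 1 \cdot 2^{-1} + 0 \cdot 2^{0} = 1.5

Earliest Evolution of Number of Methods:

EENOM(C) = \sum_{i=2}^{n} |NOM_i(C) - NOM_{i-1}(C)| \cdot 2^{2-i}

EENOM(C) = 4 \cdot 2^{0} + 2 \cdot 2^{-1} + 1 \cdot 2^{-2} + 0 \cdot 2^{-3} = 5.25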
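The three measurements can be checked with a few lines of Python. This is a minimal sketch, assuming a class history is given as a plain list of method counts, one per version; the function names are chosen here for illustration. The weights follow the slide: 2^(i-n) for LENOM and 2^(2-i) for EENOM, with versions numbered from 1 to n.

```python
from typing import List


def changes(nom: List[int]) -> List[int]:
    """Absolute change in number of methods between consecutive versions."""
    return [abs(nom[i] - nom[i - 1]) for i in range(1, len(nom))]


def enom(nom: List[int]) -> int:
    """Evolution of Number of Methods: unweighted sum of changes."""
    return sum(changes(nom))


def lenom(nom: List[int]) -> float:
    """Latest Evolution: recent changes weigh more, weight 2^(i - n)."""
    n = len(nom)
    return sum(c * 2.0 ** (i - n) for i, c in enumerate(changes(nom), start=2))


def eenom(nom: List[int]) -> float:
    """Earliest Evolution: early changes weigh more, weight 2^(2 - i)."""
    return sum(c * 2.0 ** (2 - i) for i, c in enumerate(changes(nom), start=2))


history = [1, 5, 3, 4, 4]
print(enom(history), lenom(history), eenom(history))  # 7 1.5 5.25
```

Comparing LENOM with EENOM for the same history indicates whether the class changed mostly late or mostly early.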

Page 35: Modeling History to Understand Software Evolution with Hismo 2008-03-12

ENOM  LENOM  EENOM
7     3.5    3.25
7     5.75   1.37
3     1      2
0     0      0
7     1.25   5.25

Number of methods per version (one row per class):

2 4 3 5 7
2 2 3 4 9
2 2 1 2 3
2 2 2 2 2
1 5 3 4 4

Page 36: Modeling History to Understand Software Evolution with Hismo 2008-03-12

ENOM  LENOM  EENOM
7     3.5    3.25   balanced changer
7     5.75   1.37   late changer
3     1      2
0     0      0      dead stable
7     1.25   5.25   early changer

Page 37: Modeling History to Understand Software Evolution with Hismo 2008-03-12

ENOM  LENOM  EENOM
7     3.5    3.25   balanced changer
7     5.75   1.37   late changer
3     1      2
0     0      0      dead stable
7     1.25   5.25   early changer

History as a first-class entity enables comparison through measurements.

Page 38: Modeling History to Understand Software Evolution with Hismo 2008-03-12

History can be measured in many ways.

Measurement: Evolution, Stability, Historical Max, Growth Trend, ...

of

Property: Number of Methods, Number of Lines of Code, Cyclomatic Complexity, Number of Modules, ...

Page 39: Modeling History to Understand Software Evolution with Hismo 2008-03-12

2. Yesterday’s Weather

Page 40: Modeling History to Understand Software Evolution with Hismo 2008-03-12

The recently changed parts are likely to change in the near future.

Common wisdom

Page 41: Modeling History to Understand Software Evolution with Hismo 2008-03-12

The recently changed parts are likely to change in the near future.

Common wisdom

Are they really?

Page 42: Modeling History to Understand Software Evolution with Hismo 2008-03-12

30% 90%

Page 43: Modeling History to Understand Software Evolution with Hismo 2008-03-12
Page 44: Modeling History to Understand Software Evolution with Hismo 2008-03-12
Page 45: Modeling History to Understand Software Evolution with Hismo 2008-03-12

present

Page 46: Modeling History to Understand Software Evolution with Hismo 2008-03-12

present

past

Page 47: Modeling History to Understand Software Evolution with Hismo 2008-03-12

present

past future

Page 48: Modeling History to Understand Software Evolution with Hismo 2008-03-12

present

past future

Page 49: Modeling History to Understand Software Evolution with Hismo 2008-03-12

present

past future

Page 50: Modeling History to Understand Software Evolution with Hismo 2008-03-12

present

past future

prediction hit

Page 51: Modeling History to Understand Software Evolution with Hismo 2008-03-12

present

past future

YesterdayWeatherHit(present):

    past := histories.topLENOM(start, present)
    future := histories.topEENOM(present, end)
    past.intersectWith(future).notEmpty()

prediction hit

Page 52: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Overall Yesterday’s Weather shows the localization of changes in time.

Gîrba et al., 2004

hit hit hit

YW = 3 / 8 = 37.5%

hit hit hit hit hit hit hit

YW = 7 / 8 = 87.5%
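The hit check and the overall YW ratio can be sketched in Python as follows. Several things here are assumptions made for illustration and are not stated on the slides: histories are given as a dictionary from class name to a list of method counts, the "top" set holds the `k` most-changed histories, histories with no change at all are left out of the top set, and every version that has both a past and a future is used once as the present.

```python
from typing import Dict, List, Set

# Hypothetical input format: class name -> number of methods in each version.
Histories = Dict[str, List[int]]


def weighted_change(values: List[int], favor_latest: bool) -> float:
    """LENOM-style (favor_latest) or EENOM-style weighted sum of changes."""
    deltas = [abs(b - a) for a, b in zip(values, values[1:])]
    last = len(deltas) - 1
    return sum(
        delta * 2.0 ** ((i - last) if favor_latest else -i)
        for i, delta in enumerate(deltas)
    )


def top(histories: Histories, first: int, last: int, favor_latest: bool, k: int) -> Set[str]:
    """Names of the k most-changed histories in versions first..last (inclusive)."""
    scores = {
        name: weighted_change(values[first:last + 1], favor_latest)
        for name, values in histories.items()
    }
    changed = [name for name in scores if scores[name] > 0]  # assumption: skip unchanged
    return set(sorted(changed, key=scores.get, reverse=True)[:k])


def yesterday_weather_hit(histories: Histories, present: int, k: int = 1) -> bool:
    """True if a top recent changer is also a top changer right after `present`."""
    end = len(next(iter(histories.values()))) - 1
    past = top(histories, 0, present, True, k)
    future = top(histories, present, end, False, k)
    return bool(past & future)


def yesterday_weather(histories: Histories, k: int = 1) -> float:
    """Fraction of candidate presents for which the prediction is a hit."""
    end = len(next(iter(histories.values()))) - 1
    presents = range(1, end)  # versions that have both a past and a future
    if not presents:
        raise ValueError("need at least three versions")
    hits = sum(yesterday_weather_hit(histories, p, k) for p in presents)
    return hits / len(presents)
```

A value near 1 suggests that changes stay localized in the same parts over time, so recent changes are a useful predictor; a value near 0 suggests that the focus of change keeps moving.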

Page 53: Modeling History to Understand Software Evolution with Hismo 2008-03-12

3. Time-based Detection Strategies

Page 54: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Detection Strategies are metric-based queries to detect design flaws.

Rule 1: METRIC 1 > Threshold 1

Rule 2: METRIC 2 < Threshold 2

Rule 1 AND Rule 2 => Quality problem

Lanza, Marinescu, 2006

Page 55: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Example: a God Class centralizes too much intelligence in the system.

ATFD > FEW: the class directly uses more than a few attributes of other classes

WMC ≥ VERY HIGH: the functional complexity of the class is very high

TCC < ONE THIRD: class cohesion is low

All three conditions (AND) => GodClass

Lanza, Marinescu, 2006

Page 56: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Example: a God Class centralizes too much intelligence in the system.

ATFD > FEW: the class directly uses more than a few attributes of other classes

WMC ≥ VERY HIGH: the functional complexity of the class is very high

TCC < ONE THIRD: class cohesion is low

All three conditions (AND) => GodClass

Lanza, Marinescu, 2006

But what if it is stable?

Page 57: Modeling History to Understand Software Evolution with Hismo 2008-03-12

History-based Detection Strategies take evolution into account.

Ratiu et al., 2004

isGodClass(last): God Class in the last version

Stability > 90%: stable throughout the history

Both conditions (AND) => Harmless God Class
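The God Class rule and its history-based refinement can be written as two small predicates. This is a sketch under stated assumptions: the numeric values for FEW and VERY HIGH are placeholders picked for illustration, and stability is taken here to be the fraction of version transitions in which the number of methods did not change, which is one possible reading and is not defined on the slide.

```python
from dataclasses import dataclass
from typing import List

# Placeholder thresholds, assumed for illustration only.
FEW = 5
VERY_HIGH = 47
ONE_THIRD = 1 / 3


@dataclass
class ClassVersion:
    atfd: int   # access to foreign data
    wmc: int    # weighted method count
    tcc: float  # tight class cohesion
    nom: int    # number of methods


def is_god_class(version: ClassVersion) -> bool:
    """Snapshot rule: all three metric conditions must hold."""
    return version.atfd > FEW and version.wmc >= VERY_HIGH and version.tcc < ONE_THIRD


def stability(history: List[ClassVersion]) -> float:
    """Assumed definition: share of transitions where the method count is unchanged."""
    transitions = list(zip(history, history[1:]))
    if not transitions:
        return 1.0
    unchanged = sum(1 for before, after in transitions if before.nom == after.nom)
    return unchanged / len(transitions)


def is_harmless_god_class(history: List[ClassVersion]) -> bool:
    """History rule: God Class in the last version and stable throughout."""
    return is_god_class(history[-1]) and stability(history) > 0.9
```

The second predicate has the same shape as the first: a measurement over the history is compared with a threshold exactly like a measurement over a single version.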

Page 58: Modeling History to Understand Software Evolution with Hismo 2008-03-12

History-based Detection Strategies take evolution into account.

Ratiu et al., 2004

isGodClass(last): God Class in the last version

Stability > 90%: stable throughout the history

Both conditions (AND) => Harmless God Class

Time and space are treated the same.

Page 59: Modeling History to Understand Software Evolution with Hismo 2008-03-12

4. Visualizing the evolution of hierarchies

Page 60: Modeling History to Understand Software Evolution with Hismo 2008-03-12

What happens with inheritance?

ver. 1 ver. 2 ver. 3 ver. 4 ver. 5

A A A A A

B B B B B

C C C

D D D

E

Page 61: Modeling History to Understand Software Evolution with Hismo 2008-03-12

History contains too much data.

[Figure: the five-version hierarchy of classes A, B, C, D, E from the previous slide, repeated 16 times]

Page 62: Modeling History to Understand Software Evolution with Hismo 2008-03-12

ClassVersion

ClassHistory

SystemVersion

SystemHistory

Page 63: Modeling History to Understand Software Evolution with Hismo 2008-03-12

ClassVersion

ClassHistory

SystemVersion

SystemHistory

InheritanceVersion

Page 64: Modeling History to Understand Software Evolution with Hismo 2008-03-12

InheritanceHistory

ClassVersion

ClassHistory

SystemVersion

SystemHistory

InheritanceVersion

Page 65: Modeling History to Understand Software Evolution with Hismo 2008-03-12

ver. 1 ver. 2 ver. 3 ver. 4 ver. 5

A A A A A

B B B B B

C C C

D D D

E

A is persistent, B is stable, C was removed, E is newborn ...

Page 66: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Hierarchy Evolution View encapsulates time.

Classes shown: A, B, D, C, E

A is persistent, B is stable, C was removed, E is newborn ...

Legend: age, changed methods, changed lines, Removed

Gîrba et al., 2005

Page 67: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Hierarchy Evolution View reveals patterns.

Gîrba et al., 2005

Page 68: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Hierarchy Evolution View reveals patterns.

Gîrba et al., 2005

History as a first-class entity enables mapping to a graph.

Page 69: Modeling History to Understand Software Evolution with Hismo 2008-03-12

5. Identifying co-change patterns

Page 70: Modeling History to Understand Software Evolution with Hismo 2008-03-12

[Figure: classes A, B, C, D, E over versions 1 to 6; groups shown: B E, C D, A]

Gall et al., 1998

Page 71: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Co-change patterns are n-ary relationships.

Page 72: Modeling History to Understand Software Evolution with Hismo 2008-03-12

[Figure: A, B, C, D, E (rows) against versions 1 to 6 (columns); each cell is a Version]

Page 73: Modeling History to Understand Software Evolution with Hismo 2008-03-12

[Figure: A, B, C, D, E (rows) against versions 1 to 6 (columns); each cell is a Version, and some versions are marked as changed]

Page 74: Modeling History to Understand Software Evolution with Hismo 2008-03-12

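[Figure: A, B, C, D, E (rows) against versions 1 to 6 (columns); each row is a History, each cell is a Version, and changed(i) marks the versions in which a history changed]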

Page 75: Modeling History to Understand Software Evolution with Hismo 2008-03-12

What is Concept Analysis?

Page 76: Modeling History to Understand Software Evolution with Hismo 2008-03-12

[Figure: matrix of A, B, C, D, E (rows) against 1 to 6 (columns)]

Page 77: Modeling History to Understand Software Evolution with Hismo 2008-03-12

[Figure: FCA maps the matrix of A, B, C, D, E against 1 to 6 onto a lattice of concepts]

{A, B, C, D, E} Ø

{D, B} {2, 4}

{A, D} {2, 6}

{A, E, C} {5, 6}

{A, D, B} {2}

{A, E, C, D} {6}

{D} {2, 4, 6}

{A} {2, 5, 6}

{C} {3, 5, 6}

Ø {1, 2, 3, 4, 5, 6}

Page 78: Modeling History to Understand Software Evolution with Hismo 2008-03-12

[Figure: FCA maps the matrix of A, B, C, D, E against 1 to 6 onto a lattice of concepts]

{A, B, C, D, E} Ø

{D, B} {2, 4}

{A, D} {2, 6}

{A, E, C} {5, 6}

{A, D, B} {2}

{A, E, C, D} {6}

{D} {2, 4, 6}

{A} {2, 5, 6}

{C} {3, 5, 6}

Ø {1, 2, 3, 4, 5, 6}

Gîrba et al., 2007
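The concepts listed on this slide can be reproduced with a brute-force formal concept analysis. The sketch below makes one assumption: the change sets per history (for example, A changed in versions 2, 5 and 6) are inferred from the listed concepts, because the matrix itself did not survive in the transcript. Enumerating every subset of histories is only practical for tiny examples like this one and is not meant as an efficient algorithm.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set, Tuple

# Assumed context, inferred from the concepts on the slide:
# history -> versions in which it changed.
CHANGED: Dict[str, Set[int]] = {
    "A": {2, 5, 6},
    "B": {2, 4},
    "C": {3, 5, 6},
    "D": {2, 4, 6},
    "E": {5, 6},
}
VERSIONS: Set[int] = {1, 2, 3, 4, 5, 6}

Concept = Tuple[FrozenSet[str], FrozenSet[int]]


def common_versions(histories) -> FrozenSet[int]:
    """Versions in which all the given histories changed."""
    result = set(VERSIONS)
    for history in histories:
        result &= CHANGED[history]
    return frozenset(result)


def histories_changed_in(versions: FrozenSet[int]) -> FrozenSet[str]:
    """Histories that changed in every one of the given versions."""
    return frozenset(h for h, changed in CHANGED.items() if versions <= changed)


def concepts() -> Set[Concept]:
    """All (histories, versions) pairs that are closed in both directions."""
    found: Set[Concept] = set()
    names = sorted(CHANGED)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            versions = common_versions(subset)
            found.add((histories_changed_in(versions), versions))
    return found


for histories, versions in sorted(concepts(), key=lambda c: (-len(c[0]), sorted(c[0]))):
    print(sorted(histories), sorted(versions))
```

Concepts with at least two histories and at least two versions, such as B and D changing together in versions 2 and 4, are the candidates for co-change patterns.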

Page 79: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Parallel Inheritance: children are added simultaneously to several classes

Shotgun Surgery: several classes change simultaneously, but no methods are added

Page 80: Modeling History to Understand Software Evolution with Hismo 2008-03-12

[Figure: FCA maps the matrix of A, B, C, D, E against 1 to 6 onto a lattice of concepts]

{A, B, C, D, E} Ø

{D, B} {2, 4}

{A, D} {2, 6}

{A, E, C} {5, 6}

{A, D, B} {2}

{A, E, C, D} {6}

{D} {2, 4, 6}

{A} {2, 5, 6}

{C} {3, 5, 6}

Ø {1, 2, 3, 4, 5, 6}

Page 81: Modeling History to Understand Software Evolution with Hismo 2008-03-12

[Figure: FCA maps the matrix of A, B, C, D, E against 1 to 6 onto a lattice of concepts]

{A, B, C, D, E} Ø

{D, B} {2, 4}

{A, D} {2, 6}

{A, E, C} {5, 6}

{A, D, B} {2}

{A, E, C, D} {6}

{D} {2, 4, 6}

{A} {2, 5, 6}

{C} {3, 5, 6}

Ø {1, 2, 3, 4, 5, 6}

History as a first-class entity enables mapping to FCA.

Page 82: Modeling History to Understand Software Evolution with Hismo 2008-03-12

6. How developers drive software evolution

Page 83: Modeling History to Understand Software Evolution with Hismo 2008-03-12

CVS shows activity.

Page 84: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Who is responsible for this?

Page 85: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Who is responsible for this?

Page 86: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Alphabetical order is no order.

Page 87: Modeling History to Understand Software Evolution with Hismo 2008-03-12

The Hausdorff metric can be used to compute the similarity between commits.

A

B

d(A, B) = \sum_{a \in A} \min^2 \{ |a - b| : b \in B \}
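The distance above is easy to compute directly. The sketch assumes that A and B are lists of commit timestamps given as plain numbers, and it reads "min2" on the slide as the squared minimum; both points are interpretations and are not spelled out in the transcript.

```python
from typing import Sequence


def commit_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum, over each commit in a, of the squared distance to its nearest commit in b."""
    if not b:
        raise ValueError("b must contain at least one commit")
    return sum(min(abs(x - y) for y in b) ** 2 for x in a)


print(commit_distance([1, 5], [2, 9]))  # 10
print(commit_distance([2, 9], [1, 5]))  # 17
```

As the two calls show, the measure is not symmetric, so the order of the arguments matters when it is used to place files with similar commit patterns next to each other.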

Page 88: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Alphabetical order is no order.

Page 89: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Ownership Map reveals development patterns.

Gîrba et al., 2006

Page 90: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Ownership Map reveals development patterns.

Gîrba et al., 2006

History as a first-class entity enables reasoning about holistic changes.

Page 91: Modeling History to Understand Software Evolution with Hismo 2008-03-12

1. Measuring history

2. Yesterday’s Weather

3. Time-based Detection Strategies

4. Visualizing the evolution of hierarchies

5. Detecting co-change patterns

6. How developers drive evolution

Page 92: Modeling History to Understand Software Evolution with Hismo 2008-03-12

InheritanceHistory

ClassVersion

ClassHistory

SystemVersion

SystemHistory

InheritanceVersion

Page 93: Modeling History to Understand Software Evolution with Hismo 2008-03-12

History

VersionHistory

VersionHistory

Version

Page 94: Modeling History to Understand Software Evolution with Hismo 2008-03-12

History

VersionHistory

VersionHistory

Version

Hismo models history as a first-class entity.

Gîrba, 2005

Page 95: Modeling History to Understand Software Evolution with Hismo 2008-03-12

Modeling History

to Understand Software Evolution

submitted by

Tudor Gîrba

from Romania

Inaugural dissertation of the Faculty of Science (Philosophisch-naturwissenschaftliche Fakultät) of the University of Bern

Thesis supervisors:

Prof. Dr. Stéphane Ducasse

Prof. Dr. Oscar Nierstrasz

Institute of Computer Science and Applied Mathematics

www.tudorgirba.com