PSI 2005N. Ramangaseheno


Page 1: PSI 2005N. Ramangaseheno

PSI 2005 N. Ramangaseheno

Unformatting SVG Documents

● Presentation of the Project
– Environment of Work
– Introducing the Subject
– Unformat Process Applied to Graphics Indexing
● Contributions
● Applications

Page 2: PSI 2005N. Ramangaseheno

Environment of Work

Laboratories :

• DI (Document Interaction) team of the PSI (Perception, Systèmes, Information) laboratory, University of Rouen, France,

• IPI (Image Processing and Interpretation) research group of SCSIT (School of Computer Science and Information Technology), University of Nottingham, England.

Tutors :

• Eric Trupin, « Directeur de Recherche » (research director) within the PSI,

• Tony Pridmore, Senior Lecturer, member of the IPI research group.

Collaborations :

• Mathieu Delalandre, post-doc within the IPI group,

• Karim Zouba, master's trainee within the PSI.

Integrated project : Indexation de graphiques vectoriels (indexing of vector graphics)

Programming language : Java

Training period : April 2005 - September 2005


Page 4: PSI 2005N. Ramangaseheno

Vector Graphics

Common vector formats :

• AI (Adobe Illustrator)

• SVG (Scalable Vector Graphics)

• WMF (Windows Metafile)

• EPS (Encapsulated PostScript)

• DXF (AutoCAD)

Example of SVG code :

<rect x="400" y="100" width="400" height="200" fill="yellow" stroke="navy" stroke-width="10" />

Figures : a vector graphic (miyamoto.ai) ; an SVG rectangle (panels (a) and (b)) ; a WMF pen ; an EPS plane.

Page 5: PSI 2005N. Ramangaseheno

Image Indexing

Graphics types :
- raster images or bitmaps
- vector images

Indexing : automatic extraction of informative characteristics from multimedia containers to aid retrieval and browsing through large databases.

Figure : Architecture of an image indexing system (components : graphic document, design of visual descriptors, image signature, image database, measure of similarity, classification).


Page 7: PSI 2005N. Ramangaseheno

Diagram of the Indexing System

Figure : diagram of the indexing system (components : vector graphic, generator of synthetic documents, unformat process, analyzer ; outputs : low-level representation, high-level representation).

Page 8: PSI 2005N. Ramangaseheno

Problem with Unformatting

Why unformat before analysing ?

Figure : an SVG document and the corresponding unformatted document (regions R1, R2, R3).

Page 9: PSI 2005N. Ramangaseheno

Problem with Unformatting

Figure : SVG document, unformatted document and region graph (regions R1, R2, R3) ; processing steps : acquiring, modelling, filtering.

Page 10: PSI 2005N. Ramangaseheno

Unformatting SVG Documents

● Presentation of the Project
● Contributions
– Diagram of the Unformat System
– The SVG Format
– Parsing
– Modelling
– Filtering
– Intersection Search
● Applications

Page 11: PSI 2005N. Ramangaseheno

Diagram of the Unformat System

Figure : the SVG document goes through four stages : parsing, modelling, filtering, intersection search.


Page 13: PSI 2005N. Ramangaseheno

The SVG Format

Standard and advantages :

● W3C standard for describing 2D graphics => open standard

● Growing format => growing number of viewers and users

● Vector description of graphical objects => scalability

● Based on XML (described by a DTD) ; compatible with XLink, XPointer, CSS/XSL, and SMIL (animation language) => textual, separation between semantics and presentation

● Scripts and animations triggered by associated events => interactive

Drawback :

● Lack of realism

Suited for :

● Interactive geographic maps

● Technical drawings

● XML accounts

Example, an SVG rectangle (figure panels (a) and (b)) :

<rect x="400" y="100" width="400" height="200" fill="yellow" stroke="navy" stroke-width="10" />

Page 14: PSI 2005N. Ramangaseheno

Structure of an SVG Document

Example :

<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN"
  "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd">
<svg width="5cm" height="4cm">
  <desc>A pretty rectangle</desc>
  <rect x="3cm" y="0.5cm" width="1.5cm" height="2cm"/>
</svg>

SVG tag                          corresponding declaration

<svg>                            SVG document
<g>                              group of objects
<‘symbol’>                       geometrical shape
<text>, <tspan>, or <tref>       text
<image>                          image
<defs>                           definition of links
<use>                            link towards an internal graphical object

Page 15: PSI 2005N. Ramangaseheno

Geometrical Shapes

Common shapes

● Ellipse: <ellipse cx="400" cy="300" rx="72" ry="50" />

● Rectangle: <rect x="150" y="50" width="135" height="100" />

● Circle: <circle cx="70" cy="100" r="50" />

● Line: <line x1="375" y1="50" x2="425" y2="150" />

● Polyline: <polyline points="50,250,75,350,100,250,125,350" />

● Polygon: <polygon points="250,250,297,284,279,340,220,340" />

Complex shape

● Path: <path d="M 50 250 L 100 250 L 150 300"/>


Page 17: PSI 2005N. Ramangaseheno

SAX Parser

● Any XML handling needs a parser

– a parser is a syntactic analyzer ; it sits between the XML file and the application

– a parser can be used :

● from a program (script, Java, C++)

● from a browser

● SAX, an event-driven parser

– handler methods are called when specific events occur

– the file is analyzed sequentially before being transmitted to the application

Figure : Architecture of a SAX application. The handler processes the events startDocument(), startElement(), endElement(), endDocument().

Page 18: PSI 2005N. Ramangaseheno

SVG Parsing

Figure : the SAX parser reads the SVG document (SVG code) ; the events met (startDocument(), startElement(), characters(), endElement(), endDocument()) are turned into graphical objects (vectorial representation).
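The event-driven parsing described above can be sketched in Java with the standard SAX API. This is a minimal illustration, not the project's actual parser: the class name SvgLineCollector and the choice to collect only <line> elements as {x1, y1, x2, y2} arrays are assumptions made for the example, and attribute values are assumed to be unitless numbers.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: collects every <line> element as {x1, y1, x2, y2}. */
class SvgLineCollector extends DefaultHandler {
    final List<double[]> lines = new ArrayList<>();

    /** Called by the SAX parser each time an opening tag is met. */
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("line".equals(qName)) {
            lines.add(new double[] {
                num(atts, "x1"), num(atts, "y1"), num(atts, "x2"), num(atts, "y2")
            });
        }
    }

    /** Missing coordinates default to 0; values are assumed unitless (no "cm", "px"). */
    private static double num(Attributes atts, String name) {
        String value = atts.getValue(name);
        return value == null ? 0.0 : Double.parseDouble(value);
    }

    /** Parses an SVG string sequentially and returns the collected lines. */
    static List<double[]> parse(String svg) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            SvgLineCollector handler = new SvgLineCollector();
            byte[] bytes = svg.getBytes(StandardCharsets.UTF_8);
            parser.parse(new ByteArrayInputStream(bytes), handler);
            return handler.lines;
        } catch (Exception e) {
            throw new IllegalStateException("SVG parsing failed", e);
        }
    }
}
```

A real unformat system would handle the other shapes (rect, polyline, path, ...) in the same startElement() callback and convert each of them into line primitives.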


Page 20: PSI 2005N. Ramangaseheno

Graphical Objects Modelling

GOMLib [Delalandre 2004]:

• Graphical Objects Modelling Library

• XML and SVG export

• Multi-model : different representations possible

Figure (class diagram) : OGraphic, OGraphicImpl, OPoint, OLine, OHL, OExtremity, OJunction.

Figure (possible representations) : line graph, hierarchical list, hierarchical graph, linked squares, point list, line list.


Page 22: PSI 2005N. Ramangaseheno

Filtering superfluous lines

Figure : the same drawing as it appears visually and as it is in reality (with superfluous lines).

Filtering is needed to enforce the rule : 1 point of the 2D plane = 1 single representation.

Page 23: PSI 2005N. Ramangaseheno

Preliminary Tests (1/2)

Given
- two lines L1 and L2
- b1(xb1,yb1) begin point of L1 ; b2(xb2,yb2) begin point of L2
- e1(xe1,ye1) end point of L1 ; e2(xe2,ye2) end point of L2

•L1 isEqual L2 : L1 and L2 are equal if

xb1 = xb2 ; yb1 = yb2 ; xe1 = xe2 ; ye1 = ye2 ;

•L1 isParallel L2 : L1 and L2 are parallel if

(( xe1 - xb1 ) * ( ye2 - yb2 ) - ( ye1 - yb1 ) * ( xe2 - xb2 )) = 0

•L1 isColinear p : a point p(x,y) is colinear to L1 if

y = t * x + o

(t = ( ye1 - yb1 ) / ( xe1 - xb1 ) and o = yb1 - t * xb1 ; defined only when xe1 ≠ xb1, i.e. L1 is not vertical)

•L1 isColinear L2 : L2 is colinear to L1 if

L1 isColinear b2 and L1 isColinear e2

Figure : two lines, from b1 to e1 and from b2 to e2.
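As an illustration, the equality, parallelism and colinearity tests can be sketched in Java as below. The class name LineBasics, the {xb, yb, xe, ye} array layout and the tolerance EPS are assumptions of this sketch. Colinearity is checked here with a cross product rather than with the slope form y = t * x + o, a substitution made so that vertical lines need no special case.

```java
/** Hypothetical helper; a line is stored as {xb, yb, xe, ye}. */
final class LineBasics {
    /** Assumed numerical tolerance for comparisons with zero. */
    static final double EPS = 1e-9;

    /** isEqual: same begin point and same end point. */
    static boolean isEqual(double[] l1, double[] l2) {
        return l1[0] == l2[0] && l1[1] == l2[1] && l1[2] == l2[2] && l1[3] == l2[3];
    }

    /** isParallel: the cross product of the two direction vectors is zero. */
    static boolean isParallel(double[] l1, double[] l2) {
        double cross = (l1[2] - l1[0]) * (l2[3] - l2[1]) - (l1[3] - l1[1]) * (l2[2] - l2[0]);
        return Math.abs(cross) < EPS;
    }

    /** isColinear (point): p lies on the infinite line carrying l. */
    static boolean isColinear(double[] l, double x, double y) {
        double cross = (l[2] - l[0]) * (y - l[1]) - (l[3] - l[1]) * (x - l[0]);
        return Math.abs(cross) < EPS;
    }

    /** isColinear (line): both end points of l2 lie on the line carrying l1. */
    static boolean isColinear(double[] l1, double[] l2) {
        return isColinear(l1, l2[0], l2[1]) && isColinear(l1, l2[2], l2[3]);
    }
}
```

With this formulation, two parallel lines that are shifted apart are reported parallel but not colinear, which is the distinction the filtering tests rely on.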

Page 24: PSI 2005N. Ramangaseheno

Preliminary Tests (2/2)

•L1 overlaps p : L1 overlaps a point p(x,y) if

(( x - xb1 ) * ( x - xe1 )) < 0 or (( y - yb1 ) * ( y - ye1 )) < 0

•L1 overlaps L2 : L1 overlaps L2 if

L1 overlaps( b2 ) or L1 overlaps( e2 )

•L1 isConnected p : L1 is connected to the point p(x,y) if

b1 = p

or e1 = p

•L1 isConnected L2 : L1 is connected to L2 if

L1 isConnected b2

or L1 isConnected e2

Figure : l1 is connected to l2.
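The overlap and connection tests translate almost literally into Java. The sketch below assumes a hypothetical class SegmentTests and the same {xb, yb, xe, ye} array layout for a line. Note that the overlap test only compares coordinates, so it is meaningful when combined with a colinearity test, as the filtering tests do.

```java
/** Hypothetical helper; a line is stored as {xb, yb, xe, ye}. */
final class SegmentTests {
    /** overlaps (point): p lies strictly between the end points, in x or in y. */
    static boolean overlaps(double[] l, double x, double y) {
        return (x - l[0]) * (x - l[2]) < 0 || (y - l[1]) * (y - l[3]) < 0;
    }

    /** overlaps (line): l1 overlaps the begin point or the end point of l2. */
    static boolean overlaps(double[] l1, double[] l2) {
        return overlaps(l1, l2[0], l2[1]) || overlaps(l1, l2[2], l2[3]);
    }

    /** isConnected (point): p is the begin point or the end point of l. */
    static boolean isConnected(double[] l, double x, double y) {
        return (l[0] == x && l[1] == y) || (l[2] == x && l[3] == y);
    }

    /** isConnected (line): l1 is connected to the begin or the end point of l2. */
    static boolean isConnected(double[] l1, double[] l2) {
        return isConnected(l1, l2[0], l2[1]) || isConnected(l1, l2[2], l2[3]);
    }
}
```

Because the inequalities are strict, an end point of L1 is never "overlapped" by L1 itself; that case is covered by isConnected instead.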

Page 25: PSI 2005N. Ramangaseheno

Filtering Tests (1/3)

•L1 sameAs L2 : L1 and L2 are the same if

L1 isEqual L2

or xb1 = xe2 ; yb1 = ye2 ; xe1 = xb2 ; ye1 = yb2 ;

in this case, line L2 is filtered (erased)

Figure : l1 same as l2.
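A minimal Java sketch of this first filtering test is given below, under stated assumptions : the class name SameAsFilter, the {xb, yb, xe, ye} array layout and the pairwise loop are illustrative choices, not the project's actual code. Each candidate line is compared with the lines already kept, and the later duplicate (L2) is the one erased.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical filter; a line is stored as {xb, yb, xe, ye}. */
final class SameAsFilter {
    /** sameAs: identical lines, or identical lines drawn in the opposite direction. */
    static boolean sameAs(double[] l1, double[] l2) {
        boolean equal = l1[0] == l2[0] && l1[1] == l2[1] && l1[2] == l2[2] && l1[3] == l2[3];
        boolean reversed = l1[0] == l2[2] && l1[1] == l2[3] && l1[2] == l2[0] && l1[3] == l2[1];
        return equal || reversed;
    }

    /** Keeps the first occurrence of each line and erases the later duplicates. */
    static List<double[]> filter(List<double[]> lines) {
        List<double[]> kept = new ArrayList<>();
        for (double[] candidate : lines) {
            boolean duplicate = false;
            for (double[] line : kept) {
                if (sameAs(line, candidate)) {
                    duplicate = true;
                    break;
                }
            }
            if (!duplicate) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```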

Page 26: PSI 2005N. Ramangaseheno

Filtering Tests (2/3)

•L1 includes L2 : L1 includes L2

case (a) : L2 totally included inside L1

if L1 isColinear L2

and L1 overlaps b2

and L1 overlaps e2

case (b) : L2 included inside and connected to L1

or L1 isConnected L2

and L1 isParallel L2

and [ L1 overlaps b2 or L1 overlaps e2 ]

Figure : l1 includes l2, cases (a) and (b).

Page 27: PSI 2005N. Ramangaseheno

Filtering Tests (3/3)

•L1 isJoined L2 : L1 and L2 join together

case (a) : L1 is extended by L2, without overlapping

if L1 isConnected L2

and L1 isParallel L2

and « L1 overlaps e2 » is false

case (b) : L1 is extended by L2, with overlapping

or L1 isColinear L2

and L1 overlaps L2

and L2 overlaps L1

Figure : l1 and l2 join together, cases (a) and (b).


Page 29: PSI 2005N. Ramangaseheno

Get Line Intersection (1/3)

Figure : types of junction (X junction, T junction, multi-degree junction) and separation of segments at the junction.

Page 30: PSI 2005N. Ramangaseheno

Get Line Intersection (2/3)

Intersections processing algorithm

1. Search and list all junctions of the document.
2. For each line, test whether it contains the junction.
3. If yes, break the line in two at the junction point.

Tests used

•L1 isIntersected L2 : L1 and L2 intersect at a point p(x,y) such that p <-- L1 getIntersection L2

case (a) : X junction

if p is not null

case (b) : T junction

if p is null

L1 overlaps p and L2 overlaps p

or L1 isConnected p and L2 overlaps p

or L2 isConnected p and L1 overlaps p

In both cases, add junction p(x,y) to the junctions list
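The step "break the line in two at the junction point" can be sketched as follows. This is a hedged illustration : the class name JunctionSplitter, the {xb, yb, xe, ye} array layout, the tolerance and the way a line is said to "contain" a junction (colinear with it and overlapping it strictly) are assumptions of the example, and the junction list is taken as already computed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical splitter; a line is {xb, yb, xe, ye}, a junction is {x, y}. */
final class JunctionSplitter {
    /** True if the junction lies on the line, strictly between its end points. */
    static boolean contains(double[] l, double x, double y) {
        double cross = (l[2] - l[0]) * (y - l[1]) - (l[3] - l[1]) * (x - l[0]);
        boolean inside = (x - l[0]) * (x - l[2]) < 0 || (y - l[1]) * (y - l[3]) < 0;
        return Math.abs(cross) < 1e-9 && inside;
    }

    /** For each junction, breaks in two every line that contains it. */
    static List<double[]> split(List<double[]> lines, List<double[]> junctions) {
        List<double[]> current = new ArrayList<>(lines);
        for (double[] j : junctions) {
            List<double[]> next = new ArrayList<>();
            for (double[] l : current) {
                if (contains(l, j[0], j[1])) {
                    next.add(new double[] {l[0], l[1], j[0], j[1]});
                    next.add(new double[] {j[0], j[1], l[2], l[3]});
                } else {
                    next.add(l);
                }
            }
            current = next;
        }
        return current;
    }
}
```

For an X junction both crossing lines are broken, giving four segments ; for a T junction only the line crossed in its interior is broken, since the junction is an end point of the other one.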

Page 31: PSI 2005N. Ramangaseheno

Get Line Intersection (3/3)

L1 isIntersected L2 : returns

• the intersection point p(xc,yc) between lines L1 and L2

• null if the lines are parallel or colinear

• null if xc < 0 or yc < 0

Four cases to take into account (a regular line has an equation y = a * x + b ; an irregular line is horizontal or vertical) :

1. L1 and L2 are regular

y1 = a1 * x1 + b1 ; y2 = a2 * x2 + b2 ;

yc = y1 = y2 ; xc = x1 = x2 ;

xc = (b2 - b1)/(a1 - a2) ; yc = a1 * xc + b1 = a2 * xc + b2 ;

2. L1 is regular, L2 irregular

Two cases:

• L2 is horizontal

y1 = a1 * x1 + b1 ; y2 = c for any x2 ;

yc = c ; xc = (c - b1)/ a1 ;

• L2 is vertical

y1 = a1 * x1 + b1 ; x2 = c for any y2 ;

xc = c ; yc = a1 * c + b1 ;

3. L1 is irregular, L2 regular (see case 2.)

Two cases:

• L1 is horizontal

• L1 is vertical

4. L1 and L2 are irregular (see case 2.)

Two cases:

• L1 is vertical, L2 horizontal

• L2 is vertical, L1 horizontal
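For illustration, the intersection of the two infinite lines can also be computed without distinguishing the four cases. The Java sketch below swaps in a determinant (cross product) formulation, which is not the slope-based method of the slide but returns the same point for regular, horizontal and vertical lines alike. The class name LineIntersection, the {xb, yb, xe, ye} array layout and the tolerance are assumptions.

```java
/** Hypothetical helper; a line is stored as {xb, yb, xe, ye}. */
final class LineIntersection {
    /**
     * Returns {xc, yc}, the intersection of the infinite lines carrying l1 and l2,
     * or null if the lines are parallel or colinear, or if xc < 0 or yc < 0.
     */
    static double[] getIntersection(double[] l1, double[] l2) {
        double d1x = l1[2] - l1[0], d1y = l1[3] - l1[1]; // direction of L1
        double d2x = l2[2] - l2[0], d2y = l2[3] - l2[1]; // direction of L2
        double det = d1x * d2y - d1y * d2x;
        if (Math.abs(det) < 1e-12) {
            return null; // parallel or colinear: no single intersection point
        }
        // Solve b1 + t * d1 = b2 + s * d2 for t.
        double t = ((l2[0] - l1[0]) * d2y - (l2[1] - l1[1]) * d2x) / det;
        double xc = l1[0] + t * d1x;
        double yc = l1[1] + t * d1y;
        if (xc < 0 || yc < 0) {
            return null; // outside the positive quadrant, as in the slide
        }
        return new double[] {xc, yc};
    }
}
```

The point returned is on the infinite lines only ; whether it actually falls on both segments is decided afterwards with the overlaps and isConnected tests.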

Page 32: PSI 2005N. Ramangaseheno

Unformatting SVG Documents

● Presentation of the Project
● Contributions
● Applications

Page 33: PSI 2005N. Ramangaseheno

Experiments & Results (1/3)

Unformatting results on SVG documents created with the 2gT system ("graphic ground Truth").

Page 34: PSI 2005N. Ramangaseheno

Experiments & Results (2/3)

• Colour index

– black vectors : no change

– red vectors : filtered vectors

– blue vectors : « broken » vectors

• « Visually », all original lines are retrieved

• After reduction, we can see that all intersections have been erased

Figure : original SVG document ; unformatted document ; unformatted document after a reduction of 20%.

Page 35: PSI 2005N. Ramangaseheno

Experiments & Results (3/3)

• Algorithm complexity : n(n-1)

• Filtering : n + (n-1) + (n-2) + … + 1 comparisons

• Intersections retrieval : n + (n-1) + (n-2) + … + 1 comparisons

• Runtime : about 1.5 min for 100 documents (2500 vectors and 100 intersections per document)

Page 36: PSI 2005N. Ramangaseheno

Conclusion

Outcome

• Technical achievement

• Unformatting system effective and functional

Perspectives

• Use upstream of pattern recognition tools