Post on 28-Mar-2015

Acoustics Research Institute

Austrian Academy of Sciences

MPEG-7: Today's Multimedia Standard

Peter Balazs

http://www.kfs.oeaw.ac.at

Institut für Schallforschung der Österreichischen Akademie der Wissenschaften: A-1010 Wien; Liebiggasse 5. Tel. +43 1/4277-29500; Fax +43 1/4277-9296; email: xxl@kfs.oeaw.ac.at; http://www.kfs.oeaw.ac.at

OeAW-ISF

Peter Balazs

1999: started as programmer at the ISF

2001: finished mathematics (University of Vienna)

MPEG-7

• ISO/IEC Standard: "Multimedia Content Description Interface"

• Multimedia data / metadata description system: Low Level – High Level; content based

• Open system: Inheritance

• Description of methods: normative – informative

<AudioDescriptor xsi:type="SoundModelStatePathType">
  <SoundModelRef>IDDogBarks</SoundModelRef>
  <StateRef>IDState1</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState2</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState3</StateRef> <RelativeFrequency>0.045</RelativeFrequency>
  <StateRef>IDState4</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState5</StateRef> <RelativeFrequency>0.442</RelativeFrequency>
  <StateRef>IDState6</StateRef> <RelativeFrequency>0.513</RelativeFrequency>
</AudioDescriptor>
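Because the description is plain XML, any XML parser can read it. The following is a minimal sketch in Python (standard library only) that pairs each StateRef with its RelativeFrequency. The function name `read_state_frequencies` is hypothetical, and the `xmlns:xsi` declaration is an assumption added here: the slide omits it, but a standalone parser needs the `xsi` prefix to be bound.

```python
import xml.etree.ElementTree as ET

# Assumption: the xsi prefix is bound to the usual XML Schema instance namespace.
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Copy of the example description from the slide, with the namespace added.
DESCRIPTION = f"""
<AudioDescriptor xmlns:xsi="{XSI}" xsi:type="SoundModelStatePathType">
  <SoundModelRef>IDDogBarks</SoundModelRef>
  <StateRef>IDState1</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState2</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState3</StateRef> <RelativeFrequency>0.045</RelativeFrequency>
  <StateRef>IDState4</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState5</StateRef> <RelativeFrequency>0.442</RelativeFrequency>
  <StateRef>IDState6</StateRef> <RelativeFrequency>0.513</RelativeFrequency>
</AudioDescriptor>
"""


def read_state_frequencies(xml_text):
    """Return (sound model reference, {state reference: relative frequency})."""
    root = ET.fromstring(xml_text.strip())
    model = root.findtext("SoundModelRef")
    states = [element.text for element in root.findall("StateRef")]
    frequencies = [float(element.text) for element in root.findall("RelativeFrequency")]
    if len(states) != len(frequencies):
        raise ValueError("every StateRef needs exactly one RelativeFrequency")
    return model, dict(zip(states, frequencies))


model, histogram = read_state_frequencies(DESCRIPTION)
dominant = max(histogram, key=histogram.get)  # state visited most often
print(model, dominant, histogram)
```

In this example the relative frequencies sum to 1, and the state with the largest share is IDState6.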

MPEG-7

• History

Call for Proposals October 1998

Evaluation February 1999

First version of Working Draft (WD) December 1999

Committee Draft (CD) October 2000

Final Committee Draft (FCD) February 2001

Final Draft International Standard (FDIS) July 2001

International Standard (IS) September 2001

• Development

Amendment Audio May 2002

Call for Proposals (Systems, version 2) July 2002

MPEG 21 international standard April 2009

XML = eXtensible Markup Language

XML

• Metalanguage

• Hypertext

• Markup: markup = tag, <command> ... </command>

• Open Standard

<?xml version="1.0"?>
<!-- XML test -->
<!DOCTYPE document [
  <!ELEMENT ADRESSE (Vorname, Nachname, Wohnort)>
  <!ELEMENT Vorname (#PCDATA)>
  ....
]>
<ADRESSE>
  <Vorname> Peter </Vorname>
  <Nachname> Balazs </Nachname>
  <Wohnort> Tulln </Wohnort>
</ADRESSE>
<ADRESSE> ........

<Set ID="Viewer3" RunMode="Multiple">
  <Table ID="Settings">
    CursorOpts = 0 0 1 440
    SignalOpts = 1 1
  </Table>
  <Set ID="Profiles">
    <Table ID="Default">
      FrameOpts = 40 1 75 2 0 1
      GraphXY = 0 1e4 1 -80 50 1
      Method = 0 32 20 0 1 0 0 0 1 0 0
      Average = 0 0 99
    </Table>
  </Set>
</Set>
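The nested Set / Table markup above can be read with a generic XML parser. The sketch below (Python, standard library only) turns it into nested dictionaries. Two assumptions are made here that the slide does not state: each `key = values` entry sits on its own line, and all values are numeric. The helper names `parse_table` and `parse_set` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Copy of the Set/Table example, with the RunMode attribute properly quoted.
SETTINGS = """
<Set ID="Viewer3" RunMode="Multiple">
  <Table ID="Settings">
    CursorOpts = 0 0 1 440
    SignalOpts = 1 1
  </Table>
  <Set ID="Profiles">
    <Table ID="Default">
      FrameOpts = 40 1 75 2 0 1
      GraphXY = 0 1e4 1 -80 50 1
      Method = 0 32 20 0 1 0 0 0 1 0 0
      Average = 0 0 99
    </Table>
  </Set>
</Set>
"""


def parse_table(table):
    """Read 'key = v1 v2 ...' lines from a Table element into {key: [floats]}."""
    entries = {}
    for line in (table.text or "").splitlines():
        if "=" not in line:
            continue  # skip blank lines
        key, _, values = line.partition("=")
        entries[key.strip()] = [float(value) for value in values.split()]
    return entries


def parse_set(node):
    """Recursively collect the Tables and nested Sets of a Set element by ID."""
    result = {}
    for child in node:
        if child.tag == "Table":
            result[child.get("ID")] = parse_table(child)
        elif child.tag == "Set":
            result[child.get("ID")] = parse_set(child)
    return result


root = ET.fromstring(SETTINGS.strip())
config = {root.get("ID"): parse_set(root)}
print(config["Viewer3"]["Profiles"]["Default"]["GraphXY"])
```

The hierarchy of the markup maps directly onto the nesting of the dictionaries: `config["Viewer3"]["Profiles"]["Default"]` holds the entries of the innermost Table.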

MPEG-7

• Descriptors (D): Low Level

• Description Schemes (DS): High Level, container

• Description Definition Language (DDL): XML Schema, STX Schema

• System Tools: ASCII text – binary

MPEG-7

(Figure taken from [1])

MPEG-7 Audio: Low Level Descriptors

• Single Sample

• Segments: DS, compare to STX

(Figure taken from [1])

MPEG-7 Audio: Low Level Descriptors

• Scalar

• Vector

• Single

• Series: series of vectors = table, matrix

• Scalable Series

(Figure taken from [2])
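The idea behind a scalable series is that a long series of values can be stored at reduced resolution by summarising groups of consecutive samples. The following Python sketch illustrates this for a series of scalars. It is not the normative definition from [2]: the function name `scale_series`, the fixed `ratio` parameter, and the choice of minimum, maximum and mean as summaries are assumptions made for illustration.

```python
def scale_series(samples, ratio):
    """Summarise each group of `ratio` consecutive samples by min, max and mean.

    The last group may be shorter when len(samples) is not a multiple of ratio.
    """
    if ratio < 1:
        raise ValueError("ratio must be a positive integer")
    groups = [samples[i:i + ratio] for i in range(0, len(samples), ratio)]
    return {
        "Min": [min(group) for group in groups],
        "Max": [max(group) for group in groups],
        "Mean": [sum(group) / len(group) for group in groups],
    }


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
scaled = scale_series(series, ratio=2)
print(scaled)
```

With `ratio=2` the six samples are reduced to three values per summary; a larger ratio gives a coarser, more compact description of the same series.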

MPEG-7 Audio: Low Level Descriptors

• Basic: AudioWaveform, AudioPower

• Basic Spectral: AudioSpectrumEnvelope, AudioSpectrumCentroid, AudioSpectrumSpread, AudioSpectrumFlatness

• Spectral Basis: AudioSpectrumBasis, AudioSpectrumProjection

• Signal Parameters: AudioHarmonicity, AudioFundamentalFrequency

• Timbral Temporal: LogAttackTime, TemporalCentroid

• Timbral Spectral: SpectralCentroid, HarmonicSpectralCentroid, HarmonicSpectralDeviation, HarmonicSpectralSpread, HarmonicSpectralVariation

(Figure taken from [2])

• Silence

(Figure taken from [1])
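To give a feeling for the two Basic descriptors in the list above, here is a simplified Python sketch: AudioWaveform is approximated by the minimum and maximum sample value per frame, AudioPower by the mean squared sample value per frame. The frame length `hop` in samples, the non-overlapping framing and the function names are assumptions; the normative extraction is described in [2].

```python
def frames(samples, hop):
    """Split the signal into consecutive, non-overlapping frames of `hop` samples."""
    return [samples[i:i + hop] for i in range(0, len(samples), hop)]


def audio_waveform(samples, hop):
    """Waveform envelope: (minimum, maximum) sample value of each frame."""
    return [(min(frame), max(frame)) for frame in frames(samples, hop)]


def audio_power(samples, hop):
    """Power: mean of the squared sample values of each frame."""
    return [sum(x * x for x in frame) / len(frame) for frame in frames(samples, hop)]


signal = [0.0, 0.5, -0.5, 1.0, -1.0, 0.0, 0.25, -0.25]
envelope = audio_waveform(signal, hop=4)
power = audio_power(signal, hop=4)
print(envelope)
print(power)
```

Both results are series with one entry per frame, so they can be stored as a Series (or Scalable Series) of scalars or vectors.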

MPEG-7 Audio: High Level DSs

• AudioSignature: AudioSpectrumFlatness

• Musical Instrument Timbre Description Tools: HarmonicInstrumentTimbre (LAT + timbral spectral); PercussiveInstrumentTimbre (timbral temporal + SpectralCentroid)

• Melody Description Tools: MelodyContour DS, MelodySequence DS

• General Sound Recognition and Indexing Description Tools: SpectralBasis; SoundClassificationModel: SoundModels, classification scheme; SoundModelStatePath, SoundModelStateHistogram

• Spoken Content Description Tools: SpokenContentHeader: WordLexicon, PhoneLexicon; SpokenContentLattice: WordLinks, PhoneLinks
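As an illustration of the Melody Description Tools, the sketch below derives a five-step melody contour (values -2 to +2) from a sequence of notes. Several things are assumptions here and should be checked against [2]: notes are given as MIDI note numbers (100 cents per semitone), and the thresholds of 50 and 250 cents separate "no change", "small step" and "minor third or more". The function names are hypothetical.

```python
def contour_value(cents):
    """Quantise a pitch interval in cents to one of five contour steps."""
    if cents <= -250:
        return -2  # descent of about a minor third or more
    if cents <= -50:
        return -1  # descent of a half step or whole step
    if cents < 50:
        return 0   # no change
    if cents < 250:
        return 1   # ascent of a half step or whole step
    return 2       # ascent of about a minor third or more


def melody_contour(midi_notes):
    """Contour of a melody: one value per interval between successive notes."""
    intervals = [100 * (b - a) for a, b in zip(midi_notes, midi_notes[1:])]
    return [contour_value(cents) for cents in intervals]


melody = [60, 62, 65, 65, 64, 60]  # C4 D4 F4 F4 E4 C4
contour = melody_contour(melody)
print(contour)
```

A melody of n notes yields n - 1 contour values. The coarse quantisation discards exact intervals, which is what makes such a description tolerant of imprecise input when comparing melodies.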

MPEG-7 Audio: Amendment

• New base types: optional attribute for channel

• Modification of Spoken Content Description Tools: "acoustics only" score possible for speech recognition; prosody, syllables

• Audio Signal Quality DS: BackgroundNoiseLevel, BalanceType, DCoffsetType, BandwidthType. TransmissionTechnologyType: shellac, vinyl, ...

• Additional tools: tempo description, compact variable precision representation (BAM)

• Linguistic Description Tools: semantic structure of linguistic data

MPEG-7

Literature:

[1] José M. Martínez, MPEG-7 Overview (version 8), ISO/IEC JTC1/SC29/WG11 N4980, Klagenfurt, July 2002, http://mpeg.telecomitalialab.com/standards/mpeg-7/mpeg-7.htm

[2] ISO/IEC, Information Technology – Multimedia Content Description Interface – Part 4: Audio, Geneva, July 2001

[3] Oliver Pott, Günter Wielange, XML Praxis und Referenz, München 2001

[4] J. Bitzer, J. H. Martínez, Information Technology – Multimedia Content Description Interface – Part 4: Audio – Proposed Draft Amendment, Fairfax, May 2002

Links:

[5] MPEG Home Page, http://mpeg.telecomitalialab.com/

[6] Extensible Markup Language, http://www.w3.org/XML/

[7] STX, http://www.kfs.oeaw.ac.at/software.htm