Acoustics Research Institute Austrian Academy of Science MPEG-7 Todays Multimedia Standard Peter...


Acoustics Research Institute

Austrian Academy of Sciences

MPEG-7: Today's Multimedia Standard

Peter Balazs

http://www.kfs.oeaw.ac.at

Institut für Schallforschung der Österreichischen Akademie der Wissenschaften: A-1010 Wien; Liebiggasse 5. Tel. +43 1/4277-29500; Fax +43 1/4277-9296; email: [email protected]; http://www.kfs.oeaw.ac.at

OeAW-ISF

Peter Balazs

1999: started as programmer at the ISF

2001: finished mathematics (University of Vienna)

MPEG-7

• ISO/IEC standard "Multimedia Content Description Interface"

• Multimedia data / metadata description system: low level to high level; content based

• Open system: inheritance

• Description of methods: normative and informative

<AudioDescriptor xsi:type="SoundModelStatePathType">
  <SoundModelRef>IDDogBarks</SoundModelRef>
  <StateRef>IDState1</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState2</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState3</StateRef> <RelativeFrequency>0.045</RelativeFrequency>
  <StateRef>IDState4</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState5</StateRef> <RelativeFrequency>0.442</RelativeFrequency>
  <StateRef>IDState6</StateRef> <RelativeFrequency>0.513</RelativeFrequency>
</AudioDescriptor>
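A description like the one above is ordinary XML, so a generic XML parser can read it. The following is a minimal sketch in Python, not part of the original slides. It assumes that each StateRef is followed by its RelativeFrequency, and it adds an xsi namespace declaration (not shown on the slide) so that the fragment parses on its own. The function name state_frequencies is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the xmlns:xsi declaration is added here so the fragment is
# parseable in isolation; the slide shows the element without it.
FRAGMENT = """
<AudioDescriptor xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:type="SoundModelStatePathType">
  <SoundModelRef>IDDogBarks</SoundModelRef>
  <StateRef>IDState1</StateRef><RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState2</StateRef><RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState3</StateRef><RelativeFrequency>0.045</RelativeFrequency>
  <StateRef>IDState4</StateRef><RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState5</StateRef><RelativeFrequency>0.442</RelativeFrequency>
  <StateRef>IDState6</StateRef><RelativeFrequency>0.513</RelativeFrequency>
</AudioDescriptor>
"""


def state_frequencies(xml_text):
    """Pair each StateRef with the RelativeFrequency that follows it."""
    root = ET.fromstring(xml_text)
    refs = [element.text.strip() for element in root.findall("StateRef")]
    values = [float(element.text) for element in root.findall("RelativeFrequency")]
    if len(refs) != len(values):
        raise ValueError("StateRef and RelativeFrequency counts differ")
    return dict(zip(refs, values))


freqs = state_frequencies(FRAGMENT)
# The state with the largest relative frequency dominates this sound.
dominant = max(freqs, key=freqs.get)
print(dominant, freqs[dominant])
```

In this example the three non-zero frequencies add up to one, which is what one would expect from relative frequencies over a fixed set of states.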

MPEG-7

• History

Call for Proposals October 1998

Evaluation February 1999

First version of Working Draft (WD) December 1999

Committee Draft (CD) October 2000

Final Committee Draft (FCD) February 2001

Final Draft International Standard (FDIS) July 2001

International Standard (IS) September 2001

• Development

Amendment Audio May 2002

Call for Proposals (Systems, version 2) July 2002

MPEG 21 international standard April 2009


XML

XML = eXtensible Markup Language

• Metalanguage

• Hypertext

• Markup: markup = tag, e.g. <command> ... </command>

• Open standard

<?xml version="1.0"?>
<!-- XML test -->
<!DOCTYPE document [
  <!ELEMENT ADRESSE (Vorname, Nachname, Wohnort)>
  <!ELEMENT Vorname (#PCDATA)>
  ....
]>
<ADRESSE>
  <Vorname> Peter </Vorname>
  <Nachname> Balazs </Nachname>
  <Wohnort> Tulln </Wohnort>
</ADRESSE>
<ADRESSE> ........

<Set ID="Viewer3" RunMode="Multiple">
  <Table ID="Settings">
    CursorOpts = 0 0 1 440
    SignalOpts = 1 1
  </Table>
  <Set ID="Profiles">
    <Table ID="Default">
      FrameOpts = 40 1 75 2 0 1
      GraphXY = 0 1e4 1 -80 50 1
      Method = 0 32 20 0 1 0 0 0 1 0 0
      Average = 0 0 99
    </Table>
  </Set>
</Set>
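The element declaration for ADRESSE says that each record contains Vorname, Nachname and Wohnort, in that order. The sketch below, which is not part of the original slides, reads such records with Python's standard library and checks that order by hand. It is not DTD validation. It assumes a single <document> root element wrapping the records, and the names read_addresses and EXPECTED_CHILDREN are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: one <document> root wraps all records (the name used in the
# DOCTYPE line); the slide only shows the <ADRESSE> records themselves.
DOCUMENT = """<?xml version="1.0"?>
<document>
  <ADRESSE>
    <Vorname> Peter </Vorname>
    <Nachname> Balazs </Nachname>
    <Wohnort> Tulln </Wohnort>
  </ADRESSE>
</document>
"""

# Child sequence taken from <!ELEMENT ADRESSE (Vorname, Nachname, Wohnort)>.
EXPECTED_CHILDREN = ["Vorname", "Nachname", "Wohnort"]


def read_addresses(xml_text):
    """Return one dict per <ADRESSE>, rejecting records with other children."""
    root = ET.fromstring(xml_text)
    records = []
    for adresse in root.findall("ADRESSE"):
        tags = [child.tag for child in adresse]
        if tags != EXPECTED_CHILDREN:
            raise ValueError(f"unexpected children in ADRESSE: {tags}")
        # Strip the padding whitespace around the text content.
        records.append({child.tag: child.text.strip() for child in adresse})
    return records


records = read_addresses(DOCUMENT)
print(records)
```

A validating parser would perform this check from the DTD itself; the manual comparison here only illustrates what the declaration constrains.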

MPEG-7

• Descriptors: low level

• Descriptor Schemes: high level, container

• Descriptor Definition Language (DDL): XML Schema, STX Schema

• System Tools: ASCII text or binary

MPEG-7

Source: [1]

MPEG-7 Audio: Low Level Descriptors

• Single sample

• Segments: DS, compare to STX

Source: [1]

MPEG-7 Audio: Low Level Descriptors

• Scalar

• Vector

• Single

• Series: series of vectors = table, matrix

• Scalable Series

Source: [2]
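The idea of a scalable series can be sketched as follows. This is an illustration only, not the normative definition from [2]. It assumes that a series is stored at reduced resolution by summarising each group of consecutive values with its minimum, maximum and mean. The function name scale_series and the parameter ratio are hypothetical names chosen for this sketch.

```python
def scale_series(values, ratio):
    """Summarise each group of `ratio` consecutive values by min, max and mean.

    The last group may be shorter if len(values) is not a multiple of ratio.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    summary = []
    for start in range(0, len(values), ratio):
        group = values[start:start + ratio]
        summary.append({
            "min": min(group),
            "max": max(group),
            "mean": sum(group) / len(group),
        })
    return summary


raw = [1.0, 3.0, 2.0, 6.0, 4.0, 4.0]
scaled = scale_series(raw, 2)
for entry in scaled:
    print(entry)
```

With a ratio of 1 the summary carries the same information as the raw series; larger ratios trade temporal resolution for a more compact description.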


MPEG-7 Audio: Low Level Descriptors

• Basic: AudioWaveform, AudioPower

• Basic Spectral: AudioSpectrumEnvelope, AudioSpectrumCentroid, AudioSpectrumSpread, AudioSpectrumFlatness

• Spectral Basis: AudioSpectrumBasis, AudioSpectrumProjection

• Signal Parameters: AudioHarmonicity, AudioFundamentalFrequency

• Timbral Temporal: LogAttackTime, TemporalCentroid

• Timbral Spectral: SpectralCentroid, HarmonicSpectralCentroid, HarmonicSpectralDeviation, HarmonicSpectralSpread, HarmonicSpectralVariation

• Silence

Source: [1], [2]
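To give a feeling for the two basic descriptors, here is a simplified sketch in Python. It is not the normative extraction method from [2]. It assumes non-overlapping frames, takes the minimum and maximum sample per frame as a waveform envelope in the spirit of AudioWaveform, and takes the mean squared sample value per frame in the spirit of AudioPower. The function name frame_descriptors and the parameter frame_size are hypothetical.

```python
def frame_descriptors(samples, frame_size):
    """Compute min, max and mean power for each complete frame of samples."""
    if frame_size < 1:
        raise ValueError("frame_size must be at least 1")
    frames = []
    # Incomplete trailing frames are skipped in this sketch.
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        frames.append({
            "min": min(frame),    # lower waveform envelope
            "max": max(frame),    # upper waveform envelope
            "power": sum(x * x for x in frame) / frame_size,
        })
    return frames


signal = [0.0, 0.5, -0.5, 1.0, -1.0, 0.0, 0.25, -0.25]
result = frame_descriptors(signal, 4)
for frame in result:
    print(frame)
```

Each frame is reduced to three numbers, which is the point of a low-level descriptor: a compact summary that can be stored and compared without the original samples.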


MPEG-7 Audio: High Level DSs

• AudioSignature: AudioSpectrumFlatness

• Musical Instrument Timbre Description Tool: HarmonicInstrumentTimbre (LAT + timbre spectral), PercussiveInstrumentTimbre (timbre temporal + SpectralCentroid)

• Melody Description Tools: MelodyContour DS, Melody Sequence DS

• General Sound Recognition and Indexing Description Tool: SpectralBasis, SoundClassificationModel: SoundModels, classification scheme; SoundModelStatePath, SoundModelStateHistogram

• SpokenContent Description Tools: SpokenContentHeader: WordLexicon, PhonLexicon; SpokenContentLattice: WordLinks, PhonLinks.
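The relation between a state path and a state histogram can be illustrated with a short sketch. It is not taken from the standard. It assumes that a state path is the sequence of model state indices visited over time (states numbered from 1), and that the histogram holds the relative frequency of each state in that path. The function name state_histogram and the example path are hypothetical.

```python
from collections import Counter


def state_histogram(state_path, num_states):
    """Return the relative frequency of states 1..num_states in the path."""
    if not state_path:
        raise ValueError("state_path must not be empty")
    counts = Counter(state_path)
    total = len(state_path)
    return [counts.get(state, 0) / total for state in range(1, num_states + 1)]


# Hypothetical path through a six-state sound model, one state per frame.
path = [5, 5, 6, 6, 6, 3, 5, 6]
hist = state_histogram(path, 6)
print(hist)
```

The histogram discards the order of the states, so two sounds with the same state statistics but different temporal structure produce the same histogram, while their state paths differ.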

MPEG-7 Audio: Amendment

• New base types: optional attribute for channel

• Modification of Spoken Content Description Tools: "acoustics only" score possible for speech recognition; prosody, syllables

• Audio Signal Quality DS: BackgroundNoiseLevel, BalanceType, DCoffsetType, BandwidthType. TransmissionTechnologyType: shellac, vinyl, ....

• Additional tools: tempo description, compact variable precision representation (BAM)

• Linguistic Description Tools: semantic structure of linguistic data

MPEG-7

References:

[1] José M. Martínez, MPEG-7 Overview (version 8) ISO/IEC JTC1/SC29/WG11N4980, Klagenfurt, July 2002, http://mpeg.telecomitalialab.com/standards/mpeg-7/mpeg-7.htm

[2] ISO / IEC, Information Technology – Multimedia Content Description Interface – Part 4: Audio, Geneva, July 2001

[3] Oliver Pott, Günter Wielange, XML Praxis und Referenz, München 2001

[4] J. Bitzer, J. H. Martínez, Information Technology – Multimedia Content Description Interface – Part 4: Audio – Proposed Draft Amendment, Fairfax, May 2002

Links:

[4] MPEG Home Page, http://mpeg.telecomitalialab.com/

[5] Extensible Markup Language, http://www.w3.org/XML/

[6] STX, http://www.kfs.oeaw.ac.at/software.htm