Jérôme Daniel, France Telecom R&D

France Télécom Recherche & Développement
Workshop « From 5.1 to Sound Field Synthesis... », AES 120th Convention, Paris 2006
Higher Order Ambisonics: promises and reality
Jérôme Daniel, France Telecom R&D


Page 1: Jérôme Daniel, France Telecom R&D

France Télécom Recherche & Développement

Workshop « From 5.1 to Sound Field Synthesis... », AES 120th Convention, Paris 2006

Higher Order Ambisonics: promises and reality

Jérôme Daniel, France Telecom R&D

Page 2: Jérôme Daniel, France Telecom R&D

#2

Traditional 1st order Ambisonics: B-Format encoding

- Panoramic sound recording
  - Coincident omni (W) and bidirectional (X, Y) microphones
  - Front-back, left-right separation
  - Directional information = amplitude relationships
  - Description of wave propagation: direction & speed, hence localization
  - Independent of any loudspeaker layout

[Figure: axes Front (X), Back, Left (Y), Right]
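The encoding described on this slide can be sketched as follows. This is a minimal illustration, not the author's code: the function name, the azimuth convention (counter-clockwise from the front, so +90° is left) and the 1/√2 gain on W (a common B-Format convention) are assumptions.

```python
import math


def encode_bformat_2d(sample, azimuth_deg, w_gain=1 / math.sqrt(2)):
    """Encode one mono sample into horizontal B-Format (W, X, Y).

    azimuth_deg is measured counter-clockwise from the front (X axis),
    so +90 degrees is the left (Y axis). w_gain is an assumed convention.
    """
    theta = math.radians(azimuth_deg)
    w = w_gain * sample           # omnidirectional component
    x = math.cos(theta) * sample  # front-back figure-of-eight
    y = math.sin(theta) * sample  # left-right figure-of-eight
    return w, x, y
```

The direction of the source appears only in the amplitude ratios between W, X and Y, which is the sense of "directional information = amplitude relationships".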

Page 3: Jérôme Daniel, France Telecom R&D

#3

Reproduction over loudspeakers: spatial decoding

- Simulate any coincident mic setup
  - Recombine B-Format directivity patterns
  - Decoding operation: matrixing of signals W, X, Y
  - One virtual microphone per loudspeaker
  - ... as many as wanted, but...
  - ... sound image blur remains the same

[Figure: B-Format patterns combined (+, -) into a virtual microphone pattern; axes Front (X), Back, Left (Y), Right]
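A hedged sketch of "one virtual microphone per loudspeaker": each feed is a weighted sum of W, X and Y, i.e. one row of a decoding matrix. The function name, the pattern parameter `p` and the assumption that W carries a 1/√2 gain are illustrative choices, and no overall level normalization is applied.

```python
import math


def decode_virtual_mics(w, x, y, speaker_azimuths_deg, p=0.5):
    """Return one feed per loudspeaker from horizontal B-Format.

    Each feed is a virtual first-order microphone aimed at its loudspeaker,
    with polar pattern p + (1 - p) * cos(angle): p=1 omni, p=0.5 cardioid,
    p=0 figure-of-eight. Assumes W carries a 1/sqrt(2) gain.
    """
    feeds = []
    for az in speaker_azimuths_deg:
        phi = math.radians(az)
        omni = math.sqrt(2) * w                              # undo assumed W gain
        directional = x * math.cos(phi) + y * math.sin(phi)  # figure-of-eight aimed at phi
        feeds.append(p * omni + (1 - p) * directional)
    return feeds
```

Adding loudspeakers only adds rows to the matrix. The virtual microphones stay first order, so their patterns do not get any narrower, which is why the blur remains the same.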

Page 4: Jérôme Daniel, France Telecom R&D

#4

Reproduction over loudspeakers: spatial decoding

- Simulate any coincident mic setup
  - Recombine B-Format directivity patterns
  - Decoding operation: matrixing of signals W, X, Y
  - One virtual microphone per loudspeaker
  - ... as many as wanted, but...
  - ... sound image blur remains the same
- Optimized decoding for localization
  - LF (< 600-700 Hz): reproduce true wave propagation at the listener scale (good ITD)
  - HF (> 600-700 Hz): concentrate energy contributions in the expected direction (less altered ILD, ITD); minimise opposite contributions
  - Optimize localization at the sweet spot [Gerzon]
  - Compromise for large area [Malham]

[Figure: axes Front (X), Back, Left (Y), Right]
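The two decoding criteria can be illustrated with Gerzon's velocity and energy vectors. The sketch below is not from the slides; it assumes a regular horizontal layout, first-order panning gains of the form (1 + 2·g1·cos(angle))/N, and the 2D "max rE" weight g1 = cos(π/4). All names are hypothetical.

```python
import math


def decoder_gains(source_deg, speakers_deg, g1):
    """Equivalent panning gains of a first-order 2D decoder on a regular layout."""
    n = len(speakers_deg)
    return [(1 + 2 * g1 * math.cos(math.radians(source_deg - az))) / n
            for az in speakers_deg]


def _weighted_direction(weights, speakers_deg):
    """Norm and azimuth (degrees) of the weighted mean of loudspeaker directions."""
    total = sum(weights)
    vx = sum(wt * math.cos(math.radians(az)) for wt, az in zip(weights, speakers_deg)) / total
    vy = sum(wt * math.sin(math.radians(az)) for wt, az in zip(weights, speakers_deg)) / total
    return math.hypot(vx, vy), math.degrees(math.atan2(vy, vx))


def velocity_vector(gains, speakers_deg):
    """LF criterion: directions weighted by amplitude gains."""
    return _weighted_direction(gains, speakers_deg)


def energy_vector(gains, speakers_deg):
    """HF criterion: directions weighted by squared gains."""
    return _weighted_direction([g * g for g in gains], speakers_deg)


OCTAGON = [i * 45.0 for i in range(8)]  # assumed regular 8-loudspeaker ring
G1_BASIC = 1.0                          # LF: velocity vector norm of 1
G1_MAX_RE = math.cos(math.pi / 4)       # HF: longest energy vector at first order (2D)
```

Under these assumptions the basic weights give a unit velocity vector, while the max rE weights lengthen the energy vector from 2/3 to about 0.71. The loudspeakers opposite the source then contribute less.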

Page 5: Jérôme Daniel, France Telecom R&D

#5

"Traditional" 1st order Ambisonics: pros & cons

- Pros
  - Compact multichannel format (no redundancy)
  - Spatial homogeneity
  - Acoustic fidelity (regarding propagation properties)
  - Easily extended to 3D (additional Z)
  - Flexibility: sound field transformation; reproduction setups
  - Commercialized B-Format microphones (e.g. SoundField™)
- Cons
  - Blurred / unstable sound images ("tiny" sweet spot)
  - Not well adapted to irregular/unbalanced loudspeaker arrangements (esp. ITU setup)
  - Limitations due to low directivity of usual microphones, esp. at LF
  - ... that's why non-coincident microphone approaches might be preferred

Page 6: Jérôme Daniel, France Telecom R&D

#6

Introducing Higher Order Ambisonics (HOA)

- Increase angular discrimination in spatial encoding
  - Add directivities with "faster" angular variation

[Figure: directivity patterns at 1st, 2nd, 3rd and 4th order; axes Front (X), Back, Left (Y), Right]
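As an illustration of directivities with "faster" angular variation, horizontal HOA encoding can be sketched with circular harmonics cos(mθ) and sin(mθ) up to order M. The function name, channel ordering and the absence of any normalization factor are assumptions made for this example.

```python
import math


def encode_hoa_2d(sample, azimuth_deg, order):
    """Encode a mono sample into 2*order + 1 horizontal HOA components.

    Layout (assumed): [W, cos(1t), sin(1t), cos(2t), sin(2t), ...],
    with t the azimuth measured counter-clockwise from the front.
    """
    theta = math.radians(azimuth_deg)
    components = [sample]  # order 0: omnidirectional
    for m in range(1, order + 1):
        # Order m varies m times faster with angle than order 1.
        components.append(sample * math.cos(m * theta))
        components.append(sample * math.sin(m * theta))
    return components
```

With `order=1` this reduces to the W, X, Y set (up to the gain on W). Each extra order adds two horizontal components.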

Page 7: Jérôme Daniel, France Telecom R&D

#7

Introducing Higher Order Ambisonics (HOA)

- Increase angular discrimination in spatial encoding
  - Add directivities with "faster" angular variation
- Increase angular selectivity of loudspeakers' contributions
  - Selective virtual microphone directivities
  - Better use of narrowed loudspeakers

[Figure: directivity patterns combined (+, =); axes Front (X), Back, Left (Y), Right]
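How selectivity grows with order can be sketched by summing the horizontal components with equal weights ("basic" decoding, assumed here). The resulting virtual microphone pattern is (1 + 2·Σ cos(mγ)) / (2M + 1), whose first null sits at 360°/(2M + 1). Names are hypothetical.

```python
import math


def basic_pattern(angle_deg, order):
    """Virtual microphone gain at angle_deg off-axis, equal weights on all orders."""
    gamma = math.radians(angle_deg)
    total = 1 + 2 * sum(math.cos(m * gamma) for m in range(1, order + 1))
    return total / (2 * order + 1)  # normalized to 1 on-axis


def first_null_deg(order):
    """Off-axis angle of the first zero of basic_pattern."""
    return 360.0 / (2 * order + 1)
```

The first null moves from 120° at 1st order to 40° at 4th order, so each loudspeaker picks up a narrower angular sector.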

Page 8: Jérôme Daniel, France Telecom R&D

#8

Introducing Higher Order Ambisonics (HOA)

- Increase angular discrimination in spatial encoding
  - Add directivities with "faster" angular variation
- Increase angular selectivity of loudspeakers' contributions
  - Selective virtual microphone directivities
  - Better use of narrowed loudspeakers

[Figure: directivity patterns at 1st, 2nd, 3rd and 4th order; axes Front (X), Back, Left (Y), Right]

Page 9: Jérôme Daniel, France Telecom R&D

#9

Rendering properties of higher spatial resolution

- Acoustic reconstruction
  - Enlarged sweet area: "Holophony" [Nicol, Daniel]
  - Enhanced distance encoding: control of the wave curvature

[Figure: reconstruction of a monochromatic plane wave (f = 600 Hz) at 1st, 2nd, 5th and 10th order, and of a spherical wave (R = 1 m, Gaussian pulse)]

- Quality of sound images: localization cues for a centred listener

Order M   1        2         3         4
flim      700 Hz   1300 Hz   1900 Hz   2500 Hz
E         45°      30°       22.5°     18°

  - flim: good reconstruction (good ITD) up to flim
  - E: blur angle due to alteration of HF cues (ILD & ITD) above flim
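The blur angles in the table are consistent with the arccosine of the energy vector norm for 2D "max rE" decoding. The sketch below assumes the weights g_m = cos(mπ/(2M + 2)), an assumption not stated on the slide, and derives the angle from them. It does not model flim.

```python
import math


def max_re_gains_2d(order):
    """Assumed per-order weights g_0..g_M of a 2D 'max rE' decoder."""
    return [math.cos(m * math.pi / (2 * order + 2)) for m in range(order + 1)]


def energy_vector_norm_2d(gains):
    """Energy vector norm of the pattern g_0 + 2 * sum(g_m * cos(m * angle))."""
    order = len(gains) - 1
    num = 2 * sum(gains[m] * gains[m + 1] for m in range(order))
    den = gains[0] ** 2 + 2 * sum(g * g for g in gains[1:])
    return num / den


def blur_angle_deg(order):
    """Blur angle taken as arccos of the energy vector norm."""
    return math.degrees(math.acos(energy_vector_norm_2d(max_re_gains_2d(order))))
```

For M = 1 to 4 this gives 45°, 30°, 22.5° and 18°, i.e. 90°/(M + 1).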

Page 10: Jérôme Daniel, France Telecom R&D

#10

Compatibility with irregular/unbalanced arrangements

- Synthesize directivities adapted to ITU inter-loudspeaker angles
  - From 4th order ambisonics [Craven, 2003]
  - Using 5th order resolution [Laborie et al]: better front channels separation
  - Possible decoding criterion (among others): imitate pair-wise pan-pot

[Figures: directivities from [Craven, 2003] and [Laborie et al]]

Page 11: Jérôme Daniel, France Telecom R&D

#11

Compatibility with irregular/unbalanced arrangements

- Synthesize directivities adapted to ITU inter-loudspeaker angles
  - From 4th order ambisonics [Craven, 2003]
  - Using 5th order resolution [Laborie et al]: better front channels separation
  - Possible decoding criterion (among others): imitate pair-wise pan-pot
- 4th order decoding over enriched ITU setup (5+2+1)
  - C (0°), L&R (±30°), SL&SR (±120°) ... + L&R (±70°) ... + B (180°)
  - Demonstration on an 8-loudspeaker setup (kindly provided by Cabasse)

[Figure: ◊ = "energy vector" (* = target, i.e. ideal sound image)]
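The pair-wise pan-pot target and the "energy vector" diagnostic can be sketched on the 8-loudspeaker layout listed above. This is only an illustration: the constant-power sine/cosine law between adjacent loudspeakers is a stand-in chosen for simplicity, not the decoder used in the demonstration, and all names are hypothetical.

```python
import math

# Enriched ITU layout (5+2+1): C, L/R, added +-70, SL/SR, B (azimuths in degrees).
LAYOUT_DEG = [0, 30, 70, 120, 180, -120, -70, -30]


def pairwise_gains(source_deg, layout_deg=LAYOUT_DEG):
    """Constant-power panning between the two loudspeakers enclosing the source."""
    ring = sorted(range(len(layout_deg)), key=lambda i: layout_deg[i] % 360)
    gains = [0.0] * len(layout_deg)
    src = source_deg % 360
    for k, i in enumerate(ring):
        j = ring[(k + 1) % len(ring)]                  # next loudspeaker around the ring
        span = (layout_deg[j] - layout_deg[i]) % 360   # angular width of this pair
        offset = (src - layout_deg[i]) % 360           # source position inside the pair
        if offset <= span:
            frac = offset / span
            gains[i] = math.cos(frac * math.pi / 2)
            gains[j] = math.sin(frac * math.pi / 2)
            break
    return gains


def energy_vector(gains, layout_deg=LAYOUT_DEG):
    """Norm and azimuth (degrees) of the energy vector for a set of gains."""
    energy = sum(g * g for g in gains)
    ex = sum(g * g * math.cos(math.radians(az)) for g, az in zip(gains, layout_deg)) / energy
    ey = sum(g * g * math.sin(math.radians(az)) for g, az in zip(gains, layout_deg)) / energy
    return math.hypot(ex, ey), math.degrees(math.atan2(ey, ex))
```

Comparing the energy vector direction with the intended source direction is one way to read the ◊ versus * markers in the figure.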

Page 12: Jérôme Daniel, France Telecom R&D

#12

Extension to 3D encoding and reproduction

- 3D encoding and decoding
- Dynamic binaural reproduction
  - Virtual loudspeakers don't sound so good
  - Enhanced method: better efficiency (CPU) & rendering
  - Sound field rotation driven by head-tracker
  - Demo: Poster session P31, Tuesday, 14:00 - 15:30

[Diagram: encoding into 3D HOA format (SoundField; K HOA signals); spatial decoding (similar to 2D) to N loudspeaker signals for reproduction over a 3D rig; rotation (head-tracker) and "virtualization" (HRTF filtering) for reproduction over headphones]
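Sound field rotation driven by a head-tracker can be sketched in the horizontal plane, where each order-m pair of components rotates by m times the rotation angle. This is a yaw-only simplification of the 3D case on the slide. The channel layout and function names are assumptions.

```python
import math


def encode_hoa_2d(sample, azimuth_deg, order):
    """Horizontal HOA encoding, layout [W, cos(1t), sin(1t), cos(2t), sin(2t), ...]."""
    theta = math.radians(azimuth_deg)
    out = [sample]
    for m in range(1, order + 1):
        out += [sample * math.cos(m * theta), sample * math.sin(m * theta)]
    return out


def rotate_hoa_2d(components, angle_deg):
    """Rotate the whole sound field counter-clockwise by angle_deg."""
    alpha = math.radians(angle_deg)
    order = (len(components) - 1) // 2
    rotated = [components[0]]  # the omnidirectional component is unchanged
    for m in range(1, order + 1):
        c, s = components[2 * m - 1], components[2 * m]
        cm, sm = math.cos(m * alpha), math.sin(m * alpha)
        rotated.append(c * cm - s * sm)
        rotated.append(s * cm + c * sm)
    return rotated


def compensate_head_yaw(components, head_yaw_deg):
    """Counter-rotate the field so sources stay fixed when the head turns."""
    return rotate_hoa_2d(components, -head_yaw_deg)
```

The rotation acts on the HOA signals before any decoding, so its cost depends on the order and not on the number of sources.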

Page 13: Jérôme Daniel, France Telecom R&D

#13

First conclusion on Higher Order Ambisonics

- Pros
  - Scalable multichannel format
  - Spatial homogeneity
  - Acoustic fidelity + "high spatial definition": wave field reconstruction
  - Easily extended to 3D; efficient binaural spatialisation
  - Even more flexibility: sound field transformation; reproduction setups, including irregular arrangements like ITU
- Cons
  - Nothing?
- What do we need in practice?
  - HOA (or "high spatial resolution") microphone systems
  - Spatial processing tools

Page 14: Jérôme Daniel, France Telecom R&D

#14

Higher Order Ambisonics Microphone Systems

- Synthesis of Spherical Harmonics
  - Extension of differential microphones: pressure gradient and higher order derivatives using non-coincident acoustic sensors!
  - Non concentric sensor distribution (Trinnov)
  - Distribution over a rigid sphere (FT)
  - [Meyer, Elko, Kubli] [Rafaely] [Ward, Abhayapala]...
  - Trade-off on the size of the array
    - bigger is better to have spatial resolution at LF
    - smaller is better to reduce spatial aliasing (at HF)
- A few words on FT prototype
  - Designed for "proof of concept" (homogeneous 3D)
  - 32 sensors: 4th order 3D (and even 5th order 2D)
  - Objective measurements & validation [Moreau et al]: Poster session P31, Tuesday, 14:00 - 15:30
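The sensor count and the size trade-off can be illustrated with simple counting. A full 3D description of order M has (M + 1)² components and a horizontal one has 2M + 1, and an array needs at least as many sensors as components. The frequency estimate uses the rule of thumb kr ≈ M, which is an assumption here, not a figure from the slides.

```python
import math


def components_3d(order):
    """Number of spherical harmonic components up to a given order."""
    return (order + 1) ** 2


def components_2d(order):
    """Number of horizontal (circular harmonic) components up to a given order."""
    return 2 * order + 1


def max_order_3d(sensor_count):
    """Highest 3D order whose component count does not exceed the sensor count."""
    return math.isqrt(sensor_count) - 1


def rule_of_thumb_limit_hz(order, radius_m, speed_of_sound=343.0):
    """Frequency at which k * r equals the order (assumed rule of thumb)."""
    return order * speed_of_sound / (2 * math.pi * radius_m)
```

With 32 sensors the count allows 4th order in 3D (25 components) but not 5th (36), while 5th order in 2D needs only 11. In the rule of thumb, doubling the radius halves the frequency, which matches the direction of the size trade-off listed above.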

Page 15: Jérôme Daniel, France Telecom R&D

#15

Tools and applications

- Recording and mixing tools
  - Prototypes of HOA mic (FT, Trinnov)
  - Suite of VST plug-ins: demo
  - Use in common audio editing tools, or interactive audio progr.
- Applications
  - Music, documentary, fiction
  - Sharing of events/ambiances (e.g. family use), teleconferences
  - Interactive audio and multimedia:
    - A flexible multi-channel 3D audio format
    - Games, Virtual/Mixed Reality
    - New nodes for virtual scene description in MPEG4 (AudioBIFSV3):
      - label a multi-channel stream as HOA content (AudioChannelConfig)
      - a new kind of sound object that describes a Surrounding Sound Field (SurroundingSound)

Page 16: Jérôme Daniel, France Telecom R&D

#16

Demonstrations

- Loudspeaker reproduction
  - Reproduction of 4th order 3D recordings over enriched ITU setup (5 to 8 loudspeakers)
  - Acknowledgment: many thanks to Cabasse and R&D manager Yvon Kernéis
- Head-tracked binaural reproduction
  - [Moreau et al] Poster session P31, Tuesday, 14:00 - 15:30
  - Could also be shown after this workshop