AES 2008

San Francisco

Jim Wheeler’s notes

These are quick notes, jotted down on-the-fly…

sorry ‘bout the typos.

 

125th Audio Engineering Society conference

the organization’s 60 year anniversary

Opening Ceremonies, Awards, Keynote Speech


Opening Remarks
• Executive Director Roger Furness
• President Bob Moses
• Convention Co-chairs John Strawn, Valerie Tyler
Program
• AES Awards Presentation
• Introduction of Keynote Speaker
• Keynote Address by Chris Stone


Awards Presentation
Please join us as the AES presents special awards to those who have made outstanding contributions to the Society in such areas of research, scholarship, and publications, as well as other accomplishments that have contributed to the enhancement of our industry. The awardees are:
PUBLICATIONS AWARD: Roger S. Grinnip III
BOARD OF GOVERNORS AWARD: Jim Anderson, Peter Swarte
FELLOWSHIP AWARD: Jonathan Abel, Angelo Farina, Rob Maher, Peter Mapp, Christoph Musialik, Neil Shaw, Julius Smith, Gerald Stanley, Alexander Voishvillo, William Whitlock
SILVER MEDAL AWARD: Keith Johnson
GOLD MEDAL AWARD: George Massenburg
DISTINGUISHED SERVICE MEDAL AWARD: Jay McKnight


 

Keynote Speaker
Record Plant co-founder Chris Stone will explore new trends and opportunities in the music industry and what it takes to succeed in today's environment, including how to utilize networking and free services to reduce risk when starting a new small business. Speaking from his strengths as a business/marketing entrepreneur, Stone will focus on the artist’s need to develop a sophisticated approach to operating their own business and also how traditional engineers can remain relevant and play a meaningful role in the ongoing evolution of the recording industry. Stone’s keynote address is entitled: The Artist Owns the Industry.

 

Chris Stone, Record Plant founder

The Artist owns the industry! The Music Industry is alive and well.  Problem is:  only a few smart people seem to be aware of how to take commercial advantage of that fact with today’s new music business model.

Yesterday, today and tomorrow:

Who IS  the artist?  Musician, film director, game developer, music pastor….    They’re our target customer.  Today the music artist has to be involved in everything… 360.  Marketing, promotion, touring, song-writing.    The CD used to be the major revenue stream. Today, the artist gives the CD away.  It’s now the tour, merchandise.  All this has led to a different way of marketing.  It’s now very web-centric.  Buy a ticket to see Metallica and get a voucher for a free CD.  Also a free download of the concert you just bought a ticket to.

The Internet has changed everything!  Napster, iTunes.   Major labels stopped doing artist development.

Recording gear got a lot cheaper, so it’s now shifted to a home-based studio.  DIY became the word of the day.  They all have their own studios, and stay on the road 250 days a year.  Must have an email list, and sells 10-30 CD’s out of the trunk.  When this happens, that’s when an artist needs a manager.

Major Labels and Hi-End recording studios still remain very viable.

More of Today’s facts:  Brick and Mortar record chains are dying.  Wal-Mart, Target, Best Buy (just bought Napster):  all use music sales as a ‘loss leader.’

iTunes is #1   overall music retailer.

DRM has gone away.   MySpace Music took its place.

CD sales have decreased 46% since downloading began.   But it’s still 82.6% of the US market.

Ringtones are now a 7 billion dollar industry.

But How does the artist get above the noise?

Future of the Music Industry.  Continuing reduction in CD sales, unrestricted downloading is a fact.

The PC, not the Mobile Phone,  is key to Market.   Only 9% of phones download music.

But, Digital downloading continues to grow.

The Music Industry continues to morph from 4 major labels to thousands of indie labels all over the world.  Regional music distribution and promotion is returning as a method to publicize local artists.

Artists continue to gain more control of the music (vertical integration).

Downloading, Music Discovery (Pandora) and social networking    are where the industry is going!

 

There are many new opportunities for music entrepreneurs:

Provide label startup and mktg and promo svcs for band and indie labels

Music publishing sales and admin to service growth to new industry users

Artist mgmt is now the band itself. The band has to be its own COO!  It’s a business.

Booking agent for more genre related, regional venues

On-site merchandising services for touring artist.

Marketing is way more than just sales

Short music videos, website design, publicists are now regional

Project recording post studios, audio engineers and music producers (new niche to fill)

Music placement services   krtipsheet.com

To find these opportunities, join music industry associations:

AES, NAMM, NARAS, NARIP, SPARS

Social network with like-minded industry people (you gotta hang out)

 Competitive Advantage:   decide what it is you do better than other people… as a company or individual.  First do a feasibility study.  Download a free template from google.  Target demographic, geographic, psychographic, competition. 

Assess your SWOT:   strengths, weaknesses, opportunities and threats.

Know your competition better than you know yourself.

BEFORE YOU PUT YOUR MONEY DOWN:

Create a business plan.  Convince others. 

Marketing 4P’s  product, pricing, promo, place

Talk to the experts/winners in your specialty.

Score counseling   www.score.org   10500 counselors  389 chapters, face to face and email since 1964.  Affiliate of SBA.

 

Loudness Workshop
Moderator - John Chester - consultant

Speakers:

·         Thomas Lund - TC Electronic

·         Jeffery Riedmiller - Dolby

·         Andrew Mason - BBC

·         Marvin Caesar - Aphex

·         James D. Johnston - Neural Audio

·         Robert Orban - Orban/CRL

New challenges and opportunities await broadcast engineers concerned about optimum sound quality in this contemporary age of multichannel sound and digital broadcasting. The earliest studies in the measurement of loudness levels were directed to telephony issues, with the publication in 1933 of the equal-loudness contours of Fletcher and Munson, and the Bell Labs tests of more than a half-million listeners at the 1938 New York Worlds Fair demonstrating that age and gender are also important factors in hearing response. A quarter of a century later, broadcasters began to take notice of the often-conflicting requirements of controlling both modulation and loudness levels. These are still concerns today as new technologies are being adopted. This session will explore the current state of the art in the measurement and control of loudness levels and look ahead to the next generation of techniques that may be available to audio broadcasters.

John Chester

NYT or WSJ article 9-25-08  Thanks to latest Metallica release, people are complaining that everything’s getting too loud!

 

Marvin Caesar, Aphex

Apologized for being complicit in loudness wars.  Listener fatigue is an overlooked factor.

Metallica’s stuff actually sounds better on Guitar Hero than on the CD.  Sounds like the mastering engineer is to blame.  Ultimately, though, it’s probably the producer and artist to blame. Competition among artist and labels drives this race to the crush.

Loudness Wars:   has leaked from broadcast world to the production world.

Processing to get maximum level may require clipping

Leveling, compression, limiting, clipping     dynamic control should be done in this order.

Look Ahead processing simulates, but prevents clipping.

Be aware of the effect asymmetry has on limiter output.  Male voice tends to be more asymmetrical.

Loss of harmonics (Gibbs Effect) increases amplitude

One of the causes of the Loudness War was the 75 Microsecond Pre-Emphasis curve.  This curve has created an industry.
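
For reference, the 75 microsecond figure is a time constant. A minimal Python sketch, assuming an ideal first-order pre-emphasis response (the function names are illustrative, not from the talk), shows how much boost the curve puts on the top end:

```python
import math

TAU_SECONDS = 75e-6  # 75 microsecond pre-emphasis time constant

def corner_frequency_hz(tau=TAU_SECONDS):
    """Frequency at which an ideal first-order pre-emphasis boost reaches +3 dB."""
    return 1.0 / (2.0 * math.pi * tau)

def preemphasis_boost_db(freq_hz, tau=TAU_SECONDS):
    """Boost in dB of an ideal first-order pre-emphasis network at freq_hz."""
    w_tau = 2.0 * math.pi * freq_hz * tau
    return 10.0 * math.log10(1.0 + w_tau * w_tau)

if __name__ == "__main__":
    print(round(corner_frequency_hz()))           # about 2122 Hz
    print(round(preemphasis_boost_db(15000), 1))  # about 17.1 dB at 15 kHz
```

With roughly 17 dB of boost at 15 kHz, bright material runs out of headroom long before the midrange does, which is where the heavy high-frequency limiting and clipping comes in.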

Salience and Listener Fatigue: 

reduced amplitude changes

reduced frequency response changes

reduced transient response changes

The more difficult it is to discern salience, the faster and greater the fatigue.

 

James Johnston (JJ), Neural Audio Corp.

Loudness is not intensity.    Loudness is the internal, subjective experience of how loud a signal is.

The term Loudness dates back to the Fletcher-Munson curves.

Intensity is an objective measure of volume.   Higher distortion = higher loudness.

Doubling of loudness is equivalent to a change of power of 10 to the 3.5

Loudness vs. dB

dB SPL is a measure of the intensity of a signal.  dB does not measure loudness.

A doubling of loudness is about 10 dB or so, whereas a doubling of intensity (power) occurs with every 3 dB.
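
The arithmetic behind those two figures, as a small Python sketch (function names are illustrative). Doubling power works out to about 3 dB, while 10 dB is a tenfold increase in power. The "10 dB per doubling of loudness" figure is the rule of thumb from the talk, not something the code derives.

```python
import math

def power_ratio_to_db(ratio):
    """Convert a power (intensity) ratio to decibels."""
    return 10.0 * math.log10(ratio)

def db_to_power_ratio(db):
    """Convert a decibel change back to a power (intensity) ratio."""
    return 10.0 ** (db / 10.0)

if __name__ == "__main__":
    print(round(power_ratio_to_db(2.0), 2))   # 3.01 dB for double the power
    print(round(db_to_power_ratio(10.0), 1))  # 10.0 times the power for +10 dB
```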

 

Implications:  single band weighting filters can’t get it right.

 They can get it moderately right for wide band signals with similar spectrum, where spectrum is smoothed on a critical band basis.  How to handle varying content that’s speech, music, effects.

Distortion, especially in the upper 70-120 Hz region, can throw off loudness measurement by a phenomenal amount.

Overall, loudness models for extended periods are still in development

We don’t know if loudness or annoyance, or something else, is what people adjust volume controls for

We need a system that can be adapted at the point of playback, NOT at the source.  Then just maybe, we might get some dynamic range back.

JJ’s recommendation:  Broadcast uncompressed program material (with digital transmission, there’s no reason to try and get above a noise floor), then give the listener two volume controls:  one to determine the softest and the other to set the loudest.  Leave it up to the listener.
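
One way to picture the two-knob idea, as a hedged Python sketch: the listener's "softest" and "loudest" settings define a target range, and program levels are mapped onto it. JJ did not describe a mapping law; the straight-line dB mapping, the function name, and all the numbers below are assumptions for illustration only.

```python
def map_level_db(level_db, src_min_db, src_max_db, softest_db, loudest_db):
    """Linearly map a program level (dB) from the source range onto the
    listener's chosen playback range. Levels outside the source range are
    pinned to the nearest end."""
    if src_max_db <= src_min_db:
        raise ValueError("source range must be non-empty")
    position = (level_db - src_min_db) / (src_max_db - src_min_db)
    position = min(1.0, max(0.0, position))
    return softest_db + position * (loudest_db - softest_db)

if __name__ == "__main__":
    # Hypothetical: program spans -60..0 dBFS, listener wants 50..80 dB SPL.
    for level in (-60, -30, 0):
        print(level, map_level_db(level, -60, 0, 50, 80))
```

A late-night listener would pull "loudest" down and "softest" up; a home theatre owner would spread them apart and get the full dynamic range back.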

JJ ended by saying that “if you’re in radio, I no longer listen to you.  I can’t stand loud all the time.”

 

Thomas Lund, TC Electronic A/S  Denmark

Dynamic Range Tolerance

Cinema, home theatre, living room, kitchen, bedroom, iPod, car, in-Flight entertainment

A loudness measure must be based on statistics.     Not only dialog as reference.

Center of Gravity:  indicates the overall loudness of a program or music track

Consistency,  indicates intrinsic loudness changes inside a program or track   0 at top of scale

LM5D loudness meter v.1.1.2,   TDM plugin made by TC Electronic.

Dialog Normalization   easy to find examples where it doesn’t work

Universal Solution:  reclaim land  -  analog transmission has room for emphasis which digital doesn’t need   true peaks can go to -1 dBFS in DTV

Metadata:  universal solution works with linear audio  AC3, AAC and other codecs. 

If broadcast gets anchored only on dialog level, we are heading for consumer chaos w/extreme level jumps between programming, commercials and other home sources.

Therefore, don’t hang your hat too firmly on dialog.

Metallica  The Day that Never Comes   album is way loud.

Suggested delivery specs for int’l broadcast enabling quality audio for use w/linear audio or any codec.

Longterm loudness -20 LKFS

True peak -1 dBFS

Dialog  -26 to -22
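
A quick Python sketch of checking a program against those three delivery numbers. It assumes the measurements already come from a BS.1770-style meter (the code does not measure anything), and the 1 dB tolerance on long-term loudness is my own placeholder, not something Thomas specified.

```python
# Targets as jotted down from the talk; the 1 dB tolerance is an assumption.
TARGET_LONGTERM_LKFS = -20.0
MAX_TRUE_PEAK_DBFS = -1.0
DIALOG_RANGE_LKFS = (-26.0, -22.0)

def check_delivery(longterm_lkfs, true_peak_dbfs, dialog_lkfs, tolerance_db=1.0):
    """Return a list of problems; an empty list means the program passes."""
    problems = []
    if abs(longterm_lkfs - TARGET_LONGTERM_LKFS) > tolerance_db:
        problems.append(f"long-term loudness {longterm_lkfs} LKFS is off target")
    if true_peak_dbfs > MAX_TRUE_PEAK_DBFS:
        problems.append(f"true peak {true_peak_dbfs} dBFS exceeds ceiling")
    low, high = DIALOG_RANGE_LKFS
    if not low <= dialog_lkfs <= high:
        problems.append(f"dialog level {dialog_lkfs} is outside {low} to {high}")
    return problems

if __name__ == "__main__":
    print(check_delivery(-20.3, -1.5, -24.0))       # passes all three checks
    print(len(check_delivery(-14.0, -0.2, -18.0)))  # fails all three checks
```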

 

Andrew Mason, BBC Research

Loudness Measurement

People complain that things are too loud. 

In UK, we use a PPM meter, it doesn’t measure loudness.

ITU-R BS.1770     Soulodre

Showed a cool loudness meter, BBC R&D.R1_7    2,4,10,30 sec needle segments, longer at bottom of needle, shortest at top.


Bob Orban / CRL

Compared the CBS and the ITU BS.1770 loudness meters

Short-term vs. Long-term loudness

Bob played a 3.5 minute montage thru Optimod 6300, then turned on the loudness controller, which used the CBS value as a controller.  The ITU BS.1770 value seemed to remain the same, but the CBS value was a good 5 or 6 dB less.

free download of his loudness meter.  www.orban.com/meter

 

Jeffery Riedmiller, Dolby Laboratories

Broadcast Loudness:  the Practical side of implementation

Audio Loudness and Dialnorm (DN)   which is the AC3 metadata value representing the average dialogue loudness of a single program

Addresses INTER-pgm level differences only.  Does not address INTRA-pgm level differences.

DRC Dynamic Range Control subsystem handles the INTRA-pgm level differences.

DN=-31 results in no boost,  big problem at output decoder
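
How dialnorm turns into a level shift, as a short Python sketch. This follows the way the AC-3 scheme is commonly described (the decoder attenuates by 31 plus the dialnorm value, so dialogue lands at -31 dBFS); the function name is illustrative and this is not decoder code.

```python
def dialnorm_attenuation_db(dialnorm):
    """Attenuation in dB a decoder applies for a given dialnorm value.
    dialnorm is expressed as a negative number from -1 to -31."""
    if not -31 <= dialnorm <= -1:
        raise ValueError("dialnorm must be between -31 and -1")
    return 31 + dialnorm

if __name__ == "__main__":
    for dn in (-31, -27, -20):
        print(dn, dialnorm_attenuation_db(dn))  # 0, 4 and 11 dB of attenuation
```

So a program tagged -31 passes through untouched, which is why a wrong or default value causes level jumps at the decoder output.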

 

Joint Dolby / MSO (multi system operators) Case Study

Schedules 250k ads per day.  Each ad now has loudness and dialnorm metadata, and customer complaints have dropped.

Proper use of ac-3 metadata and dialnorm does improve listener/viewer satisfaction

 

 

The Art of Sound Effects

Sound effects: footsteps, doors opening and closing, a bump in the night. These are the sounds that can take the flat one-dimensional world of audio, television, and film and turn them into realistic three-dimensional environments. From the early days of radio to the sophisticated modern day High Def Surround Sound of contemporary film, sound effects have been the final color on the director's palette. Join Sound Effects and Foley Artists Sue Zizza and David Shinn of SueMedia Productions as they present a 90 minute session that explores the art of sound effects; creating and performing manual effects; recording sound effects with a variety of microphones; and using various primary sound effect elements for audio, video and film projects.

·         Sue Zizza, sfx/foley artist

·         David Shinn, her engineer

“I’m a Sound Effects artist – for radio or audio books,  but a  Foley artist – for TV or Film.”

Sound Effects add context, create location and spaces in which the characters live.

There are two types of SFX:  Spots FX vs. Ambience FX

To make sfx ‘read’ it often helps to add length to the sound.  Sue gave the example of just picking up or hanging up a prop telephone, vs. making it rattle around a bit in its cradle to make a longer SFX.

Check out the toy plastic Megaphone from Bed Bath and Beyond   

Lots of in-studio SFX creation employ weird source toys, but it’s about fooling the listener.

Walking in Snow – squeezing box of corn-starch

Open a Rum bottle with cork stopper, dice going into a tumbler for ice sound, pouring water with distance to make sure it ‘prints’ in the listener’s mind.

Throw in a 7w nitelite bulb, glass-end first, into tumbler, followed immediately by dice   for a richer ice into glass sound.

Toys and junk:  Refrigerator latch, old electric screwdriver, tie-down strap ratchet,   to simulate robot walking and turning head.

If actors were recorded Binaural, then you want to record SFX in Binaural.  Or a 5.1 surround mic, etc.

Stereo vs. Mono     Record SPOT EFX in mono (better for localization).  Record AMBO EFX in stereo.

Resonating Box:  creaking door, ship, getting it on below decks. [pix]

When doing a ‘period piece’  try to simulate the actual setting.  eg. A 1964 Jack Kerouac dialog with his mother, recorded in an actual living room, with FX grabbed in mono and placed later with pan pots.
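
For the "grab in mono, place later with pan pots" approach, here is a minimal Python sketch of a pan pot. It assumes the common constant-power (sine/cosine) pan law; Sue and David did not say which law their console or DAW uses, and the function name is mine.

```python
import math

def constant_power_pan(sample, pan):
    """Place a mono sample in the stereo field.
    pan runs from -1.0 (hard left) through 0.0 (center) to +1.0 (hard right).
    Returns (left, right)."""
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 and +1.0")
    angle = (pan + 1.0) * math.pi / 4.0  # 0 at hard left, pi/2 at hard right
    return sample * math.cos(angle), sample * math.sin(angle)

if __name__ == "__main__":
    left, right = constant_power_pan(1.0, 0.0)
    print(round(left, 3), round(right, 3))  # equal levels at center, about 0.707 each
```

Total power stays the same wherever the effect is placed, so a footstep does not get louder or softer as it is moved across the room.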

Campbell Scott trying to get himself and his dead buddy out of the desert.  Talking turkey-vultures.

Built a facility to foley the entire scene.  Bought 400 lbs of kitty litter, in a 4’x30’x4” walking lane.   So that it could be a continuous walking effect.  Director wanted nothing from the ‘can.’

Footsteps Foley:    “a sfx artist can walk a mile in 2 ½ inches” 

In film, footsteps are always recorded with a shotgun, for presence.  A mono mic accentuates front to back movement, so you get perspective changes.

2’x4’x1” Plywood on 1x4 sticks, so it’s off the floor (for more resonance):  natural on one side, linoleum on the other.  Mic-ed with an x-y mic, so you get left-right movement, or single cardioid or shotgun.

When you stop, always place a scuff.  And while they’re standing there, maybe they should subtly scuff around or shift weight.

Heavy piece of granite 2’x4’x1”   add a little Morton salt for grit.  For beach, grind some kitty litter under feet.  Sand on wood good for a beach. 

Leather Bottom shoes give the best detail to the sound.

Gravel Bag (stage sand bag) filled with gravel   you walk on that.  Size of gravel, loose stone.  Good for outdoor walking.

Alien Fish walking:  cellulose sponge on feet, to get a squishing sound.

Breathing through the mouth, losing noisy clothing, jewelry.

“Yap (Yak?) Box”  small blue synth generator.

Surround demo:  Sonic Force needed a field recording of a helicopter

Cavalry  soldiers yelling and running by mic.

 

DPA-5100  a 5.1 surround mic, from Denmark   brand new

 

Wind wand   rubber bands stretched over dowel, spin it around   didgeridoo sound-alike

 

 

 

Naked door lockset, add wood door (lid of wooden box)

 

 

 

A padded box with coconut halves, for horse, along with a block and tackle for harness sounds.

Material in box:  fresh step kitty litter, but prefer unscented clay.  Don’t want too much dust

2x4 built gate with hinge and latch


How do you decide between live created sounds vs. library sounds?

Well, does the in-the-can sfx sound like it was recorded in the space where the voices were recorded?  That’s what determines the answer.  Also, it can be quicker to foley sfx live than manually place canned.

 

Fabric sounds recorded on a ribbon mic are warmer than on a condenser.   There was an Outdoor walla vs. Indoor walla discussion.

 

 

Scorsese adds animal sounds behind natural sounds in Raging Bull

Sue also mentioned a Nicolas Cage film where a match strike had a jet engine layered under.

 

 

Analyzing, Recommending, and Searching Audio Content—Commercial Applications of Music Information Retrieval
Chair:
Jay LeBoeuf, Imagine Research
Panelists:
Markus Cremer, Gracenote  Director of DSP Technology  mcremer@gracenote.com  510-428-7217

Matthias Gruhne, Fraunhofer Institute for Digital Media Technology
Tristan Jehan, The Echo Nest
Keyvan Mohajer, Melodis Corporation

Abstract:
This workshop will focus on the cutting-edge applications of music information retrieval technology (MIR). MIR is a key technology behind music startups recently featured in Wired and Popular Science. Online music consumption is dramatically enhanced by automatic music recommendation, customized playlisting, song identification via cell phone, and rich metadata / digital fingerprinting technologies. Emerging startups offer intelligent music recommender systems, lookup of songs via humming the melody, and searching through large archives of audio. Recording and music software now offer powerful new features, leveraging MIR techniques. What’s out there and where is this all going? This workshop will inform AES members of the practical developments and exciting opportunities within MIR, particularly with the rich combination of commercial work in this area. Panelists will include industry thought-leaders: a blend of established commercial companies and emerging start-ups.

MIR:  Analyzing, recommending and searching audio content – commercial applications of music information retrieval

Jay LeBoeuf, Imagine Research

 

Markus Cremer, head of small research group   Gracenote Media Database

CDDB database, over 2.0B searches / month   GN acquired by Sony 2 months ago.

doesn’t make their own stuff, they buy it (unlike Melodis).

Genius from Apple has licensed access from GraceNote.

Text mining and collaborative filtering.

My Question:  are any library music companies or Soundminer approaching you guys to utilize your search technology?

Markus Answer:  some, but it’s not the most lucrative financial model for us, so it’s a low priority.  Talk to Markus after the session to get names of which companies have approached him.

 

Matthias Gruhne, Fraunhofer Institute for Digital Media Technology

Semantic Metadata Systems

 

Tristan Jehan, The Echo Nest

Echo Nest is a music intelligence platform       more meaningful connections between people and music

Combines cultural and musical context automatically

Free web API for developers @ developer.echonest.com

 

Keyvan Mohajer, Melodis Corporation

Midomi mobile, an iPod app.  Voice recognition, hum a few bars, or say name of artist or song, or hold phone up to a radio.   www.midomi.com   sounds  like a better version of Shazam.  Wrote all their own stuff.

Humming Engine built 4 yrs ago. 

Sound Source separation to delineate melody.   It’s a manual process.  Goal is ‘all the music in the world.’  Currently at 10 mil songs in database.  Users submit melodies, user generated, like Wikipedia.

3 providers:  GraceNote, Midomi, Shazam.  Shazam does not offer humming.

Multi-modal adaptive research   is technique used.

When you hum or sing a melody into Midomi, you can upload your picture.  Then when others get a hit on the song you uploaded, they see your picture and they can become your friend on MySpace.  They also rate your performance.

 

MIR has been called subjective, but only the recommendation part is subjective.  The identification part is not subjective… it just works.

 

I asked the panelists if any consideration had been given to employing this new identification and recommendation technology to library music / needledrops.  THEY LOOKED AT ME LIKE I WAS FROM MARS!   No one had ever heard of such a thing.  One of them replied to me:  “Do you mean, like, bootleg recordings?” when I mentioned ‘cleared music.’

YIKES!

AES has turned into a show of wanna-be rock-stars and bedroom guitar players.

I probably don’t need to come to this show again!

This no longer represents my industry.

 

Listening Tests on Existing and New HDTV Surround Coding Systems
Moderator: Gerhard Stoll, IRT

  Speakers:

·         Steve Lyman, Dolby Laboratories

·         Florian Camerer, ORF

·         Kimio Hamasaki, NHK Science & Technical Research Laboratories

·         Andrew Mason, BBC R&D

·         Bosse Ternstrom, SR

With the advent of HDTV services, the public is increasingly being exposed to surround sound presentations using so-called home theater environments. However, the restricted bandwidth available into the home, whether by broadcast, or via broadband, means that there is an increasing interest in the performance of low bit rate surround sound audio coding systems for emission coding. The European Broadcasting Union Project Group D/MAE (Multichannel Audio Evaluations) conducted immense listening tests to assess the sound quality of multichannel audio codecs for broadcast applications in a range from 64 kbit/s to 1.5 Mbit/s. Several laboratories in Europe have contributed to this work.

This Broadcast Session will provide profound information about these tests and the results. Further information will be provided, how the professional industry, i.e. codec proponents and decoder manufacturers, is taking further steps to develop new products for multichannel sound in HDTV.

 

Gerhard Stoll, IRT

Listening tests of hi-end codecs:  Dolby Digital +, High-Efficiency AAC, ProLogic

SACD, DVD A, Linear PCM, lossless coding

Discrete 5.1 MCA coding

 

Bit-rate is still very important.  448kbps and above is best

Old MCA codecs are still very good when operated at appropriate bit-rates

Low bit-rate codecs can produce hi qual in many, but not all source material.

 

Steve Lyman, Dolby Labs

Steve.lyman@dolby.com

Codecs provide excellent audio quality:

DD AC-3 @ 448 kb/s

DD+ E-AC3 @ 448

AAC @ 320 kb/s

HE-AAC @ 192 or 160

MetaData:  opens up delivery possibilities to listeners

Consistent loudness

Control of dynamic range

Program to speaker configuration

Downmixing as necessary

Addl. Svcs   (video descriptive, emergency, svcs for hearing impaired)

A good mixer will place the average level at an anchor point, say -20 dBFS, which allows ample headroom, or DialNorm value.   Then, a variety of styles (action movie, drama, sports, symphony, rock, news) can be controlled in your living room.   Means dropping that -20 to -31, then dialing around attenuation based upon material category.

 

Dynamic Range control     little vs. big (full theatre system)

 

WorldWide:  Evolution of terrestrial specs (Europe)

Needs all-in-one decoder / transcoder

 

Bavaria   new Dolby products      allows stream mixing in the receiver   ATSC Service Types

Main Services

Complete Main

Music and Effects

Associated Svcs

Visually impaired

Hearing impaired

Dialogue

Etc.

 

Andrew Mason, BBC Research

Leading to… 5.1 Surround

BS-775 listening position of speaker placement.  Rears are real wide, like at my place.

Bit-Rate reduction:  bandwidth costs money.    Cable compresses video 100:1.  But for HD, audio winds up for 5.1 384kb/s

BBC has bundled surround with HDTV broadcasts – TWC could take a lesson.

DVB-2, DAB+
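
To put the 384 kb/s figure in perspective, a back-of-envelope Python sketch comparing it with uncompressed 5.1. The 48 kHz / 24-bit format, and counting the LFE as a full-rate channel, are my assumptions; Andrew did not give a source format.

```python
def pcm_bitrate_kbps(channels, sample_rate_hz, bits_per_sample):
    """Uncompressed linear PCM bit rate in kb/s (1 kb = 1000 bits)."""
    return channels * sample_rate_hz * bits_per_sample / 1000.0

def compression_ratio(pcm_kbps, coded_kbps):
    """How many times smaller the coded stream is than the PCM original."""
    return pcm_kbps / coded_kbps

if __name__ == "__main__":
    pcm = pcm_bitrate_kbps(6, 48000, 24)  # assumed 5.1 source format
    print(pcm)                            # 6912.0 kb/s uncompressed
    print(compression_ratio(pcm, 384.0))  # 18.0, i.e. 18:1 for the audio
```

Even at 18:1 the audio is squeezed far less than the 100:1 quoted for video.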

 

Bosse Ternstrom, Swedish Radio

Hi-Quality Multi-Channel  in Radio

Proponent of higher bit-rates in audio   448 should be a minimum.  But he really prefers 640kb/s.

Lower bit-rates result in artifacts when gaining up center channel, for example.  He also advocates 96x24.

 

Florian Camerer, ORF   (Austrian TV)

Works in multichannel audio

Live:  Dolby E

Post:  wav files   Poly-WAV or Dolby E

Quality -> PCM is goal,  Dolby E is an intermediate solution

EBU:  >320 kb/s

Austrian HDTV data rates:

Video  12 Mb/s,  audio  448 kb/s

Challenge:  applause is the toughest    lo-bit rate sounds like French fries

Goal:  Audio free of artifacts

 

Kimio Hamasaki, NHK Science and Tech Research Labs

Do we need audio lossy coding for future of HD broadcasting?

In Japan, ISDB-Tsb

Archiving and Preservation for Audio Engineers
Chair:
Konrad Strauss
Panelists:
Chuck Ainlay
George Massenburg
John Spencer

Abstract:
The art of audio recording is 130 years old. Recordings from the late 1890s to the present day have been preserved thanks to the longevity of analog media, but can the same be said for today's digital recordings? Digital storage technology is transient in nature, making lifespan and obsolescence a significant concern. Additionally, digital recordings are usually platform specific; relying on the existence of unique software and hardware platforms, and the practice of nondestructive recording creates a staggering amount of data much of which is redundant or unneeded. This workshop will address the subject of best practices for storage and preservation of digital audio recordings and outline current thinking and archiving strategies from the home studio to the large production facility.

Konrad Strauss, Indiana University, Jacobs School of Music, Bloomington

Backup:  ensure against catastrophic data loss

Archiving:  ensure that data is preserved (long-term storage)

Analog Recordings: 

once recorded, master becomes archival format.  Generally accepted conventions for linear tape and accompanying documentation

Problems

Gradual degradation:  deterioration of medium, degradation thru migration,

Limited lifespan of media

Digital Recordings:

Lack of generally accepted conventions

Short life of formats: can a format be read in 5 yrs?

Short life of hdwe:  will hdwe be supported in 5 yrs?

Short life of media:  Exabyte anyone?

Digital Advantage

Unlimited migration: infinite copies w/no degradation

IT data mgmt techniques:  planned migration to new formats, preserve the data, not the media

Digital Recordings:

Active mgmt required:  ongoing funding, regular migration for eternity

Archiving Paradigm: 

Eternal  file, not eternal carrier

 

NARAS DAW guidelines for preserving media   (standardization of deliverables)  download and circulate to Corey

Save copy as    ensures that you don’t wind up with missing audio files.

Daily backup plan to ensure against data loss

 

Archiving:  risk mitigation on multiple formats, multiple copies, geographical separation (off-site), regular migration.

Validation:  integrity checks, checksums

 

Year-Month-Day (the ISO 8601 date standard) is a good way to start a filename.
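
Why year-month-day works, as a small Python sketch: names that start with a zero-padded date sort chronologically with a plain alphabetical sort. The underscore separator, the helper name, and the sample descriptions are just illustrations.

```python
from datetime import date

def dated_filename(day, description, extension="wav"):
    """Build a filename that starts with the date as YYYY-MM-DD."""
    return f"{day.isoformat()}_{description}.{extension}"

if __name__ == "__main__":
    names = [
        dated_filename(date(2008, 10, 3), "keynote"),
        dated_filename(date(2008, 2, 14), "demo"),
        dated_filename(date(2007, 12, 1), "rehearsal"),
    ]
    for name in sorted(names):  # alphabetical order is also date order
        print(name)
```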

 

George Massenburg

Larry Blake, SMPTE naming conventions   addresses OMF issues     on SMPTE site

AES Media Label   encapsulates documentation, leads to MetaData

It will be up on the AES Nashville site.

NARAS:  Retrospect, Talus Group Brew, Tar (EMI likes TAR) 

     another layer of software modality     no good at incremental

Never add any kind of compression or encryption.

Record Companies are starting to demand stems, so they can remix your album.

 

Chuck Ainlay

What can we do to manage our data to ensure we can retrieve it?

Labelling is big, specs of sample rate, bit depth, DAW and version.

Hierarchical organization in folders is also important.

Place a leading 01,02 ahead of track-name, so they’ll sort on a hand-off.
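
The leading-number trick in a few lines of Python. The zero padding is what makes it work: without it, "10 Vocal" sorts ahead of "2 Snare". The helper name and track names are made up for the example.

```python
def numbered_track_names(track_names):
    """Prefix each track name with a zero-padded index (01, 02, ...)
    so an alphabetical sort keeps the original track order."""
    width = max(2, len(str(len(track_names))))
    return [f"{i:0{width}d} {name}" for i, name in enumerate(track_names, start=1)]

if __name__ == "__main__":
    for name in numbered_track_names(["Kick", "Snare", "Bass", "Lead Vocal"]):
        print(name)
```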

Uses separate session for each song.

 

John Spencer

We’re not there yet   MXF Interchange still needs work in the industry.

John’s a data-wrangler, not a recording engineer.  Consolidation is important, see NARAS document.

BWAV, consolidate.

Consolidation eliminates the need to have the plugin.  But it causes the road to fork.  SO, best solution is to create a new session and pull consolidate there.  George says it’s easier to just make a duplicate track, and consolidate one of them, so you can always get back to pre-consolidation edit.

 

LTO Tape is an industry standard.  Cheaper than burning.

Be proactive about pulling any archive off the shelf every 6 mos or so and see if it restores!!!

 

Don’t charge for archival, but charge for retrieval.  Great business model.

 

Simple Archival Workflow:

Keep two firewire drive backups and a DVD-R of rendered (consolidated) multi-track files and mix files.

 

Barry:  Version Control needs standardization

SCCS   or SVN

GIT

BagIt

www.digitalpreservation.gov  NDIIPP

 

MD-5  is a utility that goes thru a subdir and creates a 128-bit ‘checksum’ that is sent along with the file.  When they get it, they run MD-5; if it doesn’t compare, the file needs to be re-transmitted.

Checksum validates a file sent to an archive server.
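
The send-and-verify routine, sketched in Python with the standard library's hashlib. Function names are illustrative; the panel talked about an MD5 utility in general, not this code.

```python
import hashlib

def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 checksum of a file as a hex string, reading it in
    chunks so large audio files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def transfer_ok(path, expected_md5):
    """True if the received file matches the checksum sent along with it."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

The sender runs md5_of_file before shipping and includes the result; the receiver calls transfer_ok and asks for a re-send if it comes back False.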

 

FAT32 has a 2 GB limitation on wav files.   Problem with Microsoft, not wav file.

Mac’s HFS+ doesn’t have that problem.
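
How quickly a continuous recording runs into a size ceiling, as a rough Python sketch. It takes the 2 GB figure from the notes as given, treats it as 2 x 1024^3 bytes, and ignores the small WAV header; the formats shown are just examples.

```python
def seconds_until_limit(limit_bytes, sample_rate_hz, bits_per_sample, channels):
    """Seconds of uncompressed PCM audio that fit under a file-size limit."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return limit_bytes / bytes_per_second

if __name__ == "__main__":
    two_gb = 2 * 1024 ** 3
    mono_48k = seconds_until_limit(two_gb, 48000, 24, 1) / 60
    stereo_96k = seconds_until_limit(two_gb, 96000, 24, 2) / 60
    print(round(mono_48k, 1))    # minutes of mono 48 kHz / 24-bit
    print(round(stereo_96k, 1))  # minutes of stereo 96 kHz / 24-bit
```

A stereo 96 kHz / 24-bit file gets there in about an hour, so long live recordings are the ones to watch.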

 

Binaural Audio Technology—History, Current Practice, & Emerging Trends

Presenter:
Robert Schulein,
RBS Consultants – Consultants in Acoustics - Schaumburg, IL USA


Abstract:
During the winter and spring of 1931-32, Bell Telephone Laboratories, in cooperation with Leopold Stokowski and the Philadelphia Symphony Orchestra, undertook a series of tests of musical reproduction using the most advanced apparatus obtainable at that time. The objectives were to determine how closely an acoustic facsimile of an orchestra could be approached using both stereo loudspeakers and binaural reproduction. Detailed documents discovered within the Bell Telephone archives will serve as a basis for describing the results and problems revealed while creating the binaural demonstrations. Since these historic events, interest in binaural recording and reproduction has grown in areas such as sound field recording, acoustic research, sound field simulation, audio for electronic games, music listening, and artificial reality. Each of these technologies has its own technical concerns involving transducers, environmental simulation, human perception, position sensing, and signal processing. This Master Class will cover the underlying principles germane to binaural perception, simulation, recording, and reproduction. It will include live demonstrations as well as recorded audio/visual examples.

 

GRAS (Denmark)  supplied the KEMAR Mannequin (the Head)

KEMAR=Knowles Electronics Mannequin for Acoustic Research

Perception and Sound

5 human senses    data rates from all 5 sensors:

sight=10m b/s

Taste=1 b/s

Smell = 100 b/s

Hearing=100 b/s

 

Showed a film by Brandon Pletsch   detailing how human hearing works, metabolically and electronically

Major attributes of hearing

Dynamic range of sounds   you can get as much as you want

Tonality of sounds   old news

Sound localization   focus of this talk

It is important to accurately localize sounds in the horizontal plane.  Two ears do this.  Binaural=having to do with two ears

Lord Rayleigh (John Strutt) postulated the Duplex Theory of Localization:

The difference in arrival time (phase delay) of a sound at each ear is how this is accomplished at low frequencies; at high frequencies the level difference between the ears takes over.

But the Duplex Theory doesn’t take into account the pinnae, which allow perception of vertical as well as horizontal direction.
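
To put numbers on the time-difference cue, here is a rough sketch using Woodworth's spherical-head approximation, ITD = (a/c)(θ + sin θ). The formula, the 8.75 cm head radius and the 343 m/s speed of sound are assumptions brought in for illustration; none of them came from the talk.

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Interaural time difference (seconds) for a distant source.

    azimuth_deg: 0 = straight ahead, 90 = directly to one side.
    Uses Woodworth's rigid-sphere approximation (valid for 0..90 degrees).
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

for az in (0, 30, 60, 90):
    print(f"{az:3d} deg -> {woodworth_itd(az) * 1e6:6.0f} microseconds")
```

With these assumed values the largest difference, for a source directly to one side, comes out at roughly two thirds of a millisecond.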

 

Bob built a lo-cost knock-off of KEMAR,

calls it DEXTER=Dual-channel Experimental Transducers for Electroacoustical Research

He built it from a store mannequin, drilled holes in his head, etc.

 

Excellent ability to localize sounds in the frontal portion of the horizontal plane

Good ability to localize sound in back portion of horiz. plane

Poor ability to localize sound in vertical plane

So we’re designed to work in a horizontal plane.  If we were billy goats, maybe we’d have one ear on our chin.

 

Christine Rankovic and Jont Allen wrote papers in ‘30s at Bell Labs

Harvey Fletcher 1884-1981   The Speech, Hearing and Communications Theory,  at Bell Labs

Showed a film of Harvey Fletcher, at age 79, filmed in 1963 at Bell Labs

Stereo, or binaural, made the telephone sound like natural conversation, after many failed attempts to modify freq. response and different mics and speakers.

Harry Nyquist, Edward C. Wente, William B. Snow

Leopold Stokowski (conductor of Philadelphia Symphony) collaborated with Harvey Fletcher to do the first stereo PA.  Built their own discrete filters from R’s and L’s.  Used a tailor’s mannequin named Oscar, with large mics (dynamic) in his cheeks.  ¼” in diameter, small for the time.  Would have used a condenser, but they were vacuum tube-ed at the time, so it would have melted the wax dummy’s head.

Used earphone distribution listening stations. 

Early test subjects reported front-back localization ambiguity, much as we experienced with Bob’s current-day tests.  But when test subjects were able to see the sound source, it improved the front-back localization accuracy.  Sight augments sound in localization accuracy.

 

2008 Applications of Binaural Technology

What’s changed since 1932

Our understanding of hearing process

Our understanding and refinement of electronic and acoustic devices

Some Binaural Technology Applications

In-Ear Monitor

Why add ambience audio to In-Ear Monitor System? 

User does not have a natural perception of the acoustics of the performance space

User not able to communicate with others

Adding ambience to In-Ear monitoring (IEM) system:

Accuracy in sound source localization

Accuracy in frequency response

Accuracy in dynamic range

Hearing protection for performer

Ease of use by performer

Custom Sensaphonics 3D Ambience IEM

User has ability to set ambience vs. monitor signal.  Then performer can switch to full ambience (for conversation).  Perform vs. Talk switch.

 

Demo film:

Binaural Sound in a Theme Park Environment

Pirate Soul Museum in Key West Florida

Produced by Sam Bruckner, with Bob Schulein

 

Binaural Synthesis and Head Tracking

Head Related Transfer Functions:  linear systems analysis defines the transfer function as the complex ratio between the output signal spectrum and the input signal spectrum as a function of frequency.    Free-Field Transfer Function  FFTF    Blauert 1974, 1981
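
A small sketch of that definition, H(f) = Y(f) / X(f), on a toy system. The system here (halve the level, delay by one sample) and the function names are invented for illustration; a real HRTF would be measured at the ear of a head or manikin.

```python
import cmath
import math

def dft(signal):
    """Naive discrete Fourier transform (fine for short illustrative signals)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def transfer_function(x, y):
    """Complex ratio Y(f)/X(f) per frequency bin; x must have energy in every bin."""
    return [yk / xk for xk, yk in zip(dft(x), dft(y))]

# Hypothetical system: halve the level and delay by one sample.
x = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # unit impulse: flat input spectrum
y = [0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

for k, h in enumerate(transfer_function(x, y)[:4]):
    print(f"bin {k}: magnitude {abs(h):.2f}, phase {cmath.phase(h):+.2f} rad")
```

The magnitude of each bin carries the level change and the phase carries the delay, which is why the ratio has to be complex.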

Sound Field Synthesis Applications, from Moller, Sorensen, Hammershoi, and Jensen JAES vol.43, no.5 1995 May

Sound Field Recording and Head Tracking  more work done by Algazi, [Dick] Duda and Thompson JAES vol. 52, no.11, 2004

Examples of Binaural Synthesis   

Dolby Headphone:  convolves a set of generic HRTFs for a 5.1 loudspeaker layout, with a synthesized listening room to develop a 5.1 Channel Headphone Playback Experience.

Beyerdynamic Headzone:  similar to Dolby Headphone
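
The convolution step that this kind of product relies on can be sketched as follows. This is a toy illustration, not Dolby's or Beyerdynamic's implementation: the two-tap impulse responses are made-up placeholders for measured HRTF data, and only two virtual loudspeakers are shown instead of 5.1.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def mix(a, b):
    """Sum two sample lists, padding the shorter one with silence."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return [p + q for p, q in zip(a, b)]

def render_binaural(speaker_feeds, hrirs):
    """Fold virtual loudspeaker feeds down to a (left, right) headphone pair.

    speaker_feeds: {name: samples}
    hrirs: {name: (left_ear_impulse_response, right_ear_impulse_response)}
    """
    left, right = [], []
    for name, feed in speaker_feeds.items():
        ir_left, ir_right = hrirs[name]
        left = mix(left, convolve(feed, ir_left))
        right = mix(right, convolve(feed, ir_right))
    return left, right

# Placeholder impulse responses: the far ear gets a later, quieter copy.
hrirs = {
    "L": ([1.0, 0.0], [0.0, 0.5]),
    "R": ([0.0, 0.5], [1.0, 0.0]),
}
feeds = {"L": [1.0, 0.0, 0.0], "R": [0.0, 0.0, 0.0]}
left, right = render_binaural(feeds, hrirs)
```

Each virtual speaker contributes to both ears, so a 5.1 layout needs one pair of impulse responses per speaker position.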

 

Multi-Sensory Influences on Hearing:  have you ever been fooled by your senses?

                The McGurk Effect   seeing fools hearing

 

Demo-ed a video of a bluegrass band, so I left.

DTV Audio Myths

There is no limit to the confusion created by the audio options in DTV. What do the systems really do? What happens when the Systems fail? How much control can be exercised at each step in the content food chain? There are thousands of opinions and hundreds of options but what really works and how do you keep things under control? Bring your questions and join the discussion as four experts from different stages in the chain try to sort it out.

Moderator:   Jim Kutzner, Chief Engineer, PBS

Speakers:

Tim Carroll - Linear Acoustic, Inc    Formerly with Dolby   tim@linearacoustic.com

Metadata is supposed to handle dynamic range, thus eliminating need to jack with levels at the stations.  But stations need to achieve loudness consistency between disparate sources:  network, local, outliers (w/o metadata).

What is the problem?

Local stations simply do not have the resources   in the past, this has all been handled by an unseen processor.

Broadcasters are struggling with overly dynamic programs, metadata not helping

If the goal is to preserve the original content (and satisfy viewers), it requires lots of time and/or advanced tools.

Yeah, but most of my viewers are via cable

During NAB, leading cable engineers described how analog tier will be served:  station provides downconverted audio/video in some cases

Mostly, IRDs will be used to downconvert video and downmix audio to feed analog modulators

DTV signal should be appropriate for the HD/5.1 viewers, but must also work for the downconverted/downmixed stereo viewers (aka most of your audience).

Who is responsible for getting it right?

                Ultimately, it is the license holder who must maintain FCC compliance

The Tools Exist

Loudness meters are required in post, and are the new DTV Mod Monitors for stations

File-based content can be corrected in the file domain

Smart DTV audio processors that satisfy all of the new needs are available today.

Metadata MUST be supplied to eliminate global processing that whacks your content down.

 

Far less processing is required

Modern production coupled with metadata can alleviate the need for aggressive processing

Processors MUST support metadata and take advantage of the clues

Smart processors coupled with loudness and DRC Metadata subsystems can co-exist

Upmixing (turns 2 channels into 5.1) is not a replacement for a good 5.1 mix… used as a production tool to help make better 5.1 mixes faster.

What Now?

ATSC and CEA are investigating loudness

Until, and if, something changes, dialnorm must be set appropriately, or audio must be adjusted to match fixed value

The onus is on the creative community to use the system to its appropriate best, lest we end up back in the 1980’s with that NTSC sound.

Dialnorm is analogous to reference fluxivity   +6 over 185 nWb/m   it’s just a reference level of loudness.
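
A rough sketch of how that reference works in practice: in AC-3 the dialnorm value states the program's dialog level (from -1 to -31 dBFS), and the decoder attenuates so that dialog lands at -31 dBFS. The function names are illustrative only.

```python
def dialnorm_attenuation_db(dialnorm_dbfs):
    """Attenuation (dB) an AC-3 decoder applies for a given dialnorm value.

    dialnorm_dbfs: indicated dialog level, -1 to -31 dBFS.
    Dialog is normalized to -31 dBFS, so a value of -31 means no attenuation.
    """
    if not -31 <= dialnorm_dbfs <= -1:
        raise ValueError("dialnorm must be between -31 and -1 dBFS")
    return dialnorm_dbfs + 31

def playback_dialog_level(actual_dialog_dbfs, dialnorm_dbfs):
    """Dialog level after decoding, given the true level and the metadata."""
    return actual_dialog_dbfs - dialnorm_attenuation_db(dialnorm_dbfs)

# Correct metadata: dialog at -24 dBFS tagged -24 lands at -31 dBFS.
print(playback_dialog_level(-24, -24))
# Wrong metadata: the same audio tagged -31 plays 7 dB louder than its neighbors.
print(playback_dialog_level(-24, -31))
```

This is why a wrong dialnorm value shows up to the viewer as a loudness jump between programs or channels.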

What if it is not solved?

Consumer based loudness control

TV sound regulator – external

Terk VR-1 TV loudness regulator - external

Philips Sone Others proprietary - internal

Dolby Volume – internal

But there are millions of legacy sets

 

 

More legislation:  Eshoo,   CALM Act

 

IMG_1593.jpg

Robert Bleidt - Fraunhofer USA Digital Media Technologies

Audio Codecs in new ATSC standard   future

ATSC Mobile   US mobile TV service    same channels as used in homes today

Transmission based on IP, not an MPEG-2 Transport Stream   UDP, RTP, SDP, Internet RFCs  RFC 3640 for audio streaming

w/limited mobile bitrate, mp3 and AC-3 are not robust enough

ATSC-M replaces these   HE-AAC v2 is the audio codec

Markets:  iPod is a primary outlet

Advocating MPEG Surround   cars offer a great locale for surround

 

David Wilson - Consumer Electronics Association  dwilson@ce.org

CEA-CEB11:  NTSC/ATSC loudness matching

                Applies to DTV rcvrs capable of rcving both NTSC and ATSC signals

                Will be needed for quite some time

Purpose:  consumers typically set speech levels to preferred constant acoustic level

CEA-CEB11 recommends those levels

NTSC:  Speech peaks are typically 2 dB below 100% modulation after de-emphasis

ATSC (Dolby Digital): line mode=highest quality w/more options   RF mode

                Dialnorm allows for normalization of dialog from NTSC to ATSC

MPEG:  some products have NTSC, ATSC and MPEG   MPEG has no speech normalization requirements

If MPEG channels set speech levels similar to NTSC, then MPEG levels must be attenuated 17 dB
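
For a feel of what a 17 dB cut does to the signal, here is the standard dB-to-amplitude conversion. The helper names are made up for illustration.

```python
def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude multiplier."""
    return 10 ** (db / 20)

def attenuate(samples, db):
    """Scale samples down by `db` decibels."""
    gain = db_to_gain(-db)
    return [s * gain for s in samples]

print(round(db_to_gain(-17), 3))  # prints 0.141, roughly 1/7 of the amplitude
```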

[see pix]

Top DTV myth:  any 200 watt per channel x 5 channel surround sound amplifier will have a 1kW power supply in it.

FTC Rule  the manufacturer’s rated power output shall be measured with all associated channels fully driven to rated per channel power.

Audio metadata:  new standards project is just beginning

Goal is to make it easy for consumers to identify and navigate to alternate DTV audio, such as alternate language feeds.   Official scope is still being developed. 

Summary:

CEA-CEB11 recommends a way for receivers to match ATSC/NTSC audio (dual digital/analog signals being sent to receivers).

Comparison of surround sound amps more challenging than one might think.

New bulletin will recommend rcvr response to audio metadata.

 

IMG_1617.jpg

Ken Hunold – Dolby   krh@dolby.com

Myth 1:  DTV audio is the same as NTSC audio – just digital

Reality:  DTV audio incorporates several improvements over NTSC

more channels (1-6), greater freq resp, wider dyn range, no need for pre-emphasis

transparent audio delivery, consistent reproduction of program dialog: pgm to pgm, chan to chan

Uniform Loudness:  it was recognized that loudness varied chan to chan, pgm to pgm

It was identified that the avg perceived loudness of dialog should be uniform

It was not considered feasible to require that all pgms be created with the same loudness

Myth 2:  loudness problems are unique to DTV

Reality:  loudness problems have plagued TV audio for many years

The solutions are unique to DTV, but the problems already existed in NTSC.

While disabling metadata can be done, it should only be done when JUST the AC-3 bit rate reduction system is used.

 

 

The Lip Sync Issue

This is a complex problem, with several causes and fewer solutions. From production to broadcast, there are many points in the signal path and post process where lip sync can either be properly corrected, or made even worse.

This session's panel will discuss several key issues. Where do the latency issues exist in post? Where do they exist in broadcast? Is there an acceptable window of latency? How can this latency be measured? What correction techniques exist? Does one type of video display exhibit less latency than another? What is being done in display design to address the latency? What proposed methods are on the horizon for addressing this issue in the future? Join us as our panel covers the field from measurement, to post, to broadcast, and to the home.

 

Moderator: Jonathan S. Abrams, Nutmeg Audio Post   Chief Technical Engineer

HDPostSync   Jonathan’s solution for lip sync at Nutmeg

Panelists:

Robert Bleidt, Fraunhofer

 

Andrew Mason and Richard Salmon, BBC Research

Factors affecting perception of audio-video synchronization

Acquisition:  cmos vs. tube cameras   cmos captures at one instant, while the tube raster scans

Wet film can be scanned as well, but the focal-plane shutter exposes the frame at different times.

Aliasing is the result.

Production:  aspect ratio conversion introduces a delay, as does frame rate conversion… pull-down delay

…as much as 160ms of delay! 

Dolby E introduces 1 frame of delay for encode, 1 frame for decode.
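
In milliseconds, that encode-plus-decode delay depends on the video frame rate. A quick conversion, with the two frame rates chosen here as examples rather than taken from the talk:

```python
def frames_to_ms(frames, frame_rate):
    """Duration of a number of video frames in milliseconds."""
    return frames * 1000.0 / frame_rate

RATE_29_97 = 30000 / 1001  # NTSC-derived frame rate
RATE_25 = 25.0             # PAL-derived frame rate

# One frame to encode plus one frame to decode.
dolby_e_frames = 2
for name, rate in (("29.97 fps", RATE_29_97), ("25 fps", RATE_25)):
    print(f"{name}: {frames_to_ms(dolby_e_frames, rate):.1f} ms")
```

Each additional encode/decode pass in the chain adds the same amount again unless the video is delayed to match.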

[pix] details graphically how transmission delays are corrected (in a perfect world).

SMPTE / EBU Lip Sync Cookbook  download it.

 

Richard Fairbanks, Pharoah Editorial, Inc.

Where does sync go wrong?

Capture on Set

Cameras may not capture audio and video sync

Limited use of ‘old school’ clapper sticks

Settings during capture / ingest of recordings

Encoder / decoder / transcoder errors

Editing mistakes

Editing

Film Speed on set   video speed in post

Import aaf/omf audio to different frame rate

Lack of 2-pop at start and/or end

Layback to video deck with timecode output set wrong

Dolby E delays not compensated for

Some people’s lips appear ‘out of sync’

Sound Post Prodn

 

Problems are not self-correcting:  there is no EASY button, manual re-sync is often the best option

The buck stops here… do not pass it along.

Audio Video sync errors come in two flavors:

                Errors less than a frame due to display/movie frame rate mismatch

                Errors more than a frame due to any cause

We are the synchronization gate-keepers

A simple way to check AV sync:  “Flash-Pip”   his company markets a meter  “SyncCheck”

Simultaneous visual flash and audio burst

Play them together

They should be simultaneous

Any difference is an error to be corrected.

Signal Outputs must be electrically in sync:  use a dual trace scope

Sound and Image occur simultaneously

Digital image displays lag behind the data:  LCD, DLP, PLASMA (some plasmas 4 frames out)   latency

Digital audio processing also causes some time shift

Time delay as sound travels through air.

Correction:  typically, audio signals are delayed

Total system error less than 10ms (1/2 frame) is okay
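
The delay budget above can be sketched as simple arithmetic. The speed of sound (343 m/s) and the example setup values are assumptions for illustration, and the function names are hypothetical; the 10 ms tolerance is the figure from the talk.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: air at about 20 C

def acoustic_delay_ms(distance_m):
    """Time for sound to travel from loudspeaker to listener."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0

def audio_delay_needed_ms(display_latency_ms, audio_processing_ms, listener_distance_m):
    """Extra audio delay that lines sound up with picture at the listener.

    Positive: delay the audio by this much.
    Negative: audio is already late, and delaying it further cannot help.
    """
    audio_path_ms = audio_processing_ms + acoustic_delay_ms(listener_distance_m)
    return display_latency_ms - audio_path_ms

def within_tolerance(residual_error_ms, tolerance_ms=10.0):
    """True if the remaining error is under the stated tolerance."""
    return abs(residual_error_ms) < tolerance_ms

# Hypothetical setup: display lags 4 frames at 29.97 fps,
# 5 ms of audio processing, listener 3 m from the speakers.
display_lag_ms = 4 * 1001 / 30
needed_ms = audio_delay_needed_ms(display_lag_ms, 5.0, 3.0)
print(f"delay audio by about {needed_ms:.0f} ms")
```

Because a flash-pip measurement is taken at the listening position, it captures all three terms at once instead of requiring each to be known separately.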

Benefits of Flash-Pip method

Intuitive tech, non proprietary

Simple creation of signal source

Highly adaptable to most editing and playback situations

Measurement includes speaker distance / speed of sound

Good accuracy vs. cost

Weaknesses

Display scanning and overscan reduce detection resolution of a flash-pip’s leading edge, up to several milliseconds

Trade off maximum measurable error time with measurement resolution across time

Audio detection is limited by frequency content and speed of sound thru air

Must be performed during ‘tech time/setup time’ not during live situations.

 

Kent Terry, Dolby Laboratories  involved in development team of Dolby E

Dolby E requires tight sync with video, and Dolby’s standard provides sync down to the sample level

[pix]

 

Scott Anderson, Video Engineer, Syntax-Brillian

 

 

David Moulton, Sausalito Audio, LLC    TV Technology Magazine   dmoulton@moultonlabs.com

An Informed Couch Potato’s View of the Lip-Sync Problem

LipSync problems come from:

Separation of Digital Video and Audio Data Tracks

Varying latency between those tracks: video latency seems to be much greater than audio latency

Some LCD’s have huge delays

Some bad news:

Fixing errors at one point in the flow doesn’t correct errors downstream

With HD video, the errors have become large

With HD video, the errors have become much more obvious and annoying

Golden Ears audio ear training   a product David developed

10ms of latency is the point beyond which musicians cannot play.

Humans have latency:  between ear-drum and auditory cortex  is 7ms

What does High Definition mean?

Is it a video standard?  Yes

Is it an audio standard?  No

Is it a synchronization standard?  NO

Is it a marketing claim?  Unfortunately, Yes

Does that Marketing Claim imply High Quality audio and synchronization?  Absolutely!

What does this mean?

The good news:  nobody can sue us

The bad news:  nobody can take our claim seriously, because we don’t live up to it.

Lack of sync makes on-cam persons appear to be ‘drooling’ their words, with an attendant loss of impact and credibility

Right now, it appears that we are unable to sustain sync reliably during live broadcast, in transmission, or reliably for all end users.

Right now, we are unable to sustain sync reliably during live broadcast because of the latency of playback devices.

David’s Standards:

1000:1  60 dB dyn range

1000 lines

20-20k

1ms in sync

3 sec/hr  length of time we can maintain HD

This is all possible with current technology

Until we can deliver this, WE DON’T HAVE HD

We aren’t delivering on the marketing claim, though we can’t be sued for it

No single group is responsible for the failure

 

Listener Fatigue & Longevity

This panel will discuss listener fatigue and its impact on listener retention. While listener fatigue is an issue of interest to broadcasters, it is also an issue of interest to telecommunications service providers, consumer electronics manufacturers, music producers and others. Fatigued listeners to a broadcast program may tune out, while fatigued listeners to a cell phone conversation may switch to another carrier, and fatigued listeners to a portable media player may purchase another company's product. The experts on this panel will discuss their research and experiences with listener fatigue and its impact on listener retention.

Moderator: David Wilson CEA   runs CES in Las Vegas

Participants :

James D. Johnston - Neural Audio

Listener Fatigue – Some Speculation    listener fatigue has not been quantified, no one knows how to measure it.

A Reminder – Loudness vs. Intensity

·         Intensity

o   Sound Pressure Level, SPL

o   Measured excitation in the atmosphere

·         Loudness

o   Perceived “sound level”

o   Proportional to inner hair cell firing on the basilar membrane

·         How do they relate to fatigue?
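
The distinction can be made concrete with two standard formulas: sound pressure level in dB re 20 µPa for the physical side, and the common rule of thumb that each 10-phon increase doubles loudness in sones for the perceived side. Both are brought in here as assumptions for illustration; the rule of thumb only holds above roughly 40 phons, and equating phons with dB SPL is only true for a 1 kHz tone.

```python
import math

P_REF = 20e-6  # reference pressure in pascals (0 dB SPL)

def spl_db(pressure_pa):
    """Sound pressure level in dB re 20 micropascals."""
    return 20 * math.log10(pressure_pa / P_REF)

def sones_from_phons(phons):
    """Rule of thumb above about 40 phons: +10 phons doubles loudness."""
    return 2 ** ((phons - 40) / 10)

# For a 1 kHz tone, phons equal dB SPL by definition.
for pressure in (0.02, 0.2, 2.0):
    level = spl_db(pressure)
    print(f"{pressure} Pa -> {level:.0f} dB SPL -> {sones_from_phons(level):.0f} sones")
```

A hundredfold increase in pressure (40 dB) is only a sixteenfold increase in loudness on this scale, which is one reason the two quantities may relate to fatigue differently.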

Is there one kind of fatigue?

·         Mechanical Cochlear fatigue – due to high intensity

o   Outer hair cell damage, at least, seems to correlate to intensity, but outer hair cells also depolarize due to high loudness

o   This raises the issue of both biochemical and mechanical fatigue

·         Neural / physiological Cochlear fatigue – due to high loudness

o   Inner hair cell firing rate is pretty much proportional to loudness (not intensity), and there must be some biochemical fatigue there

·         CNS fatigue – due to missing, false or contradictory cues

o   Does having to extract “what was that” from missing information create some kind of CNS fatigue?

·         Reflexive fatigue???

o   What about balance / hearing interaction?

§  Do balance and hearing interact?

§  What do conflicts in the two cause?

·         Especially at Rock’n’Roll levels

How does one measure Listener Fatigue?

·         There are no units

·         There is no external manifestation that can be singled out as listener fatigue

o   Annoyance

§  Fatigue?  Material?  Genre?  Lyrics?

o   “Upset” sensations

§  Motion sickness

§  Normal fatigue

·         Time spent listening willingly, under controlled circumstances

o   How in the world will we avoid other factors like

§  Boredom

§  Lack of time

§  Dislike of test setup

Then, some very speculative ideas

·         But first level:

o   Too loud (either intensity  or loudness) is bad in many ways

·         Conflicting Cues

o   Hearing very close to balance organs.  At least anecdotal examples of induced motion sickness have been reported

o   Unnatural effects (cognitive effort?)

 

Conclusion:  Basic research is necessary.  There is so little hard data known about this subject.

Headphones don’t convey the physiological component of bass.  Bass is meant to be felt in the gut… headphones don’t do that.  Plus, the bass output of the headphone amp in an iPod is horrible.

 

Marvin Caesar – Aphex, President since 1975

Mostly anecdotal info on listener fatigue.

How come radio stations are louder in the morning?   You had it cranked the night before.

Many mixes don’t pass the ‘light of day’ test.  On airplanes, you turn up the volume and shove the buds further in your ears.  Constantly changing channels on the radio, even if you like the music.

People turn off when you crush the audio.

One effect is irritability:  you’re having to work harder to hear stuff and it pisses you off.

 

Aphex Model 454 Headpod   almost impossible to find good demo material.

If you can provide your listener with better transient response, the listener doesn’t have to work as hard, so it slows fatigue.

So, what is listener fatigue?

Physiological

The Active Cochlea, Peter Dallos

If the outer hair cells are damaged, there is less gain

Temporary threshold shift

Loss of low level sensitivity

Loss of discrimination – masking spread

What’s necessary for intelligibility is salience:    differentiation in frequency … it gives you more cues.  You want to keep this differential… you don’t want to crush everything.

Psychological:  To summarize, successful speech perception under less than ideal listening conditions depends on greater functional integration across a very distributed left hemispheric network of cortical areas.  Therefore, speech perception is facilitated when hi-order cognitive subsystems become engaged, and it cannot be…

What causes listener fatigue?

Short and long term exposure to extremely loud signals

Long term exposure to over processed signals

But, even at ‘safe’ levels, you become emotionally disengaged… so you turn it up.

How to avoid it as a listener:  avoid listening at high levels, protect your ears at concerts.  Select pgm mtrl that is not overly processed (good luck).   Don’t ask the mastering engineer for the loudest CD ever.

As a broadcaster, as well, don’t get caught in the loudness wars.  Avoid heavy processing, especially look ahead limiting.  You need those edges to the sound, to keep it emotionally interesting.  Listen to your body as well as your ears.  And if you’re a broadcaster, make sure you enjoy the sound of your station.

STOP SQUEEZIN’ THE SHIT OUTTA EVERYTHING is what JJ said.

Salience definition:  difference between background noise and transients that cut thru.  Amplitude and Frequency Response changes.

IMG_1651.jpg

Ted Ruscitti - On-Air Research

Empirical Evidence and Consumer Opinions on Listener Fatigue   subtitle:  Quality Matters

No original thinking necessary in my line of work:  I’m in marketing.

Mine is a consumer research perspective.  I’m reporting your customers feedback.

A Perfect Storm – Two things are happening today to make high-quality and low-fatigue audio more important than ever before:

1.       Dramatically more competition for ears

2.       Minute-by-minute metered ratings for TV, radio and internet radio

Competing for Ears

·         Listeners, viewers and gamers have more choices to directly compare

·         There’s more minute-to-minute switching

·         We know that certain things increase switching

·         Bad audio really stands out.  Consumer backlash is growing

Useful metrics for audio exposure

·         TSL = Time Spent Listening

·         TSV = Time Spent Viewing

·         TSG = Time Spent Gaming

TSL: Time Spent Listening

·         TSL is a very good indicator of what doesn’t drive listeners away

·         TSL is a marker of listener fatigue

·         Besides listener fatigue, other things affect TSL, but research suggests a strong correlation

How do we measure listener longevity?

What we measure:

·         Listener satisfaction

·         Listener longevity

·         Listener discontent and fatigue

How we measure:

·         PPM  Portable People Meters

·         Group / Individual listening tests

 

Listening tests produced the following results:

The top 10 list for listener fatigue

1.       Low bitrates and lossy formats

2.       GIGO – garbage in-garbage out

3.       Stacked codecs  – Typical FM signal chain – no wonder it sounds so bad!

·         Label sends an MP3 to the station.  Bitrate = ???

·         PD burns it onto a CD

·         CD gets played through a cheap PC audio card

·         Audiovault saves it as an MP2

·         It plays on air through a cheap PC card

·         Studio-transmitter link uses compression

·         iBiquity codec processes it

·         Listener hears it on HD2 at 32 kbps

4.       Bad A/D/A conversion

5.       Clipped audio

6.       Bad audio-to-video sync

a.       Very fatiguing for viewers.

b.      And if sound hits before video, tune-out rates go sky-high.  Humans can handle sound being late to picture – that occurs in nature over distances – but not early.

7.       HD Radio

a.       HD Radio is not HiDef !!!  it’s Hybrid-Digital   it’s a marketing scam.

b.      32kbps is definitely NOT HiDef

8.       L-R issues

a.       Increases multipath distortion

b.      Reduces Loudness:  L-R or too much spread to the stereo, or too much stereo verb, actually reduces loudness, increases distortion when broadcast on FM, and drives listener fatigue faster than anything else.

c.       Low Frequency L-R is worst

d.      When mixing for FM or TV, mix drier.  Some L-R processing accentuates reverb.

9.       Bad mono compatibility

10.   Big level differences

11.   Too much audio compression

a.       Surprise, this didn’t make the top 10

b.      When done artfully, compression doesn’t seem to offend the current generation of  music consumers – they grew up with it.

c.       Artful = Multiband       Offensive = Broadband

d.      Loudness doesn’t create fatigue.  Intensity is much more closely related to fatigue.

 

 

What happens to my music when it’s played on the radio?  An article by Bob Orban

Clipped audio will not pass thru a perceptual encoder   it’ll sound like mush.

 

Take away point:  Quality Matters, really, and listeners can tell the difference.

After 6 months of listening to a codec, listeners learn the artifacts, and the artifacts won’t go away.  And you quit listening.  They become expert listeners.  And they don’t like that codec anymore, particularly when the program material is smashed up way too loud.  They emotionally disengage from the music, and sales suffer.