The Silicon Audition Project

  • John Lazzaro - Research Specialist
  • John Wawrzynek - Professor

Silicon audition involves implementing computational models of biological auditory processing in micropower analog VLSI circuits. We conducted research in silicon audition at Berkeley from September 1991 until May 1997, funded by the National Science Foundation and the Office of Naval Research. An annotated bibliography and a concise bibliography describe the accomplishments of the project.

This webpage was written shortly after the research ended. In January 2007, I did some light editing on these pages, to remove dead links and better specify the timeline of the research.

Why Model Audition?

Many engineering systems that process sound, such as speech recognition systems, pitch detectors, and psychoacoustic audio compressors, include simple models of human audition in their processing. Extensive research into the physiological and psychological basis of hearing provides a substrate for computational models of audition that are much more accurate than traditional engineering practice.

Engineering research into creating comprehensive auditory models, and applying them to practical problems, flourished in the 1980s and 1990s. Examples from that era include:

  • Cochleogram representations by Richard Lyon and Malcolm Slaney.
  • The stabilized auditory image representations by Roy Patterson.
  • Sound localization research by Richard Duda.
  • Auditory scene analysis systems by Guy Brown and Dan Ellis.
  • Auditory browsing systems by Muscle Fish.

Why Silicon Audition?

Engineering systems based on auditory models require substantial computational resources, especially when judged by the state of the art of computing in the early 1990s. For example, in that era a sound separation system designed by Guy Brown at Sheffield University operated at approximately 4000 times real time, running under UNIX on a Sun SPARCstation 1. The computational needs of auditory models are especially troubling when considering the use of such systems in battery-operated portable devices.

Digital signal processors (DSPs) are the traditional solution to speeding up computationally-intensive audio algorithms. However, in many applications, the audio input takes an analog form: a voltage signal from a microphone or a guitar pickup. For these applications, an alternative approach to DSPs is to use a special-purpose analog-to-digital converter that computes auditory model representations directly on the analog signal before digitization.

Auditory Models in Analog VLSI

The years 1985-1995 saw continual improvement in analog circuit models of biological audition. Silicon models of cochlear function began with work by Richard F. Lyon and Carver Mead. Several generations of improved designs followed, including work by:

  • Lloyd Watts
  • Rahul Sarpeshkar
  • Andreas Andreou
  • Andre van Schaik
  • Neal A. Bhadkamkar
  • Shihab Shamma
  • Mohammed Ismail
  • Chris Toumazou

and their respective collaborators.

Research in that time period also addressed models of biological auditory processing at higher neural centers. John Lazzaro's graduate work at Caltech, in collaboration with Carver Mead and Richard Lyon, focused on circuit models beyond cochlear mechanics, as did contemporaneous work by Andre van Schaik and collaborators, and by Richard Lyon.

Silicon Audition at UC Berkeley

Our research at Berkeley explored systems issues in silicon audition: how can these circuits be used to create VLSI systems for practical applications? From 1991 to 1997, we focused on creating technology that would bring silicon audition to commercial viability. In this section, we review some highlights of this research; see this annotated bibliography and this concise bibliography for a complete listing of the results of the project.

Our early efforts focused on creating special-purpose analog-to-digital converter chips. These chips receive analog input, and perform several stages of analog computation on the signal, using circuit techniques from silicon audition. The output of these chips is a digital encoding of the final auditory representation, suitable for direct processing by computing systems.
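As a software analogy for this pipeline, the sketch below runs a signal through a cascade of one-pole low-pass sections, a crude stand-in for the analog filter cascade of a silicon cochlea, then half-wave rectifies each output tap. All parameters (stage count, cutoff ratio, sample rate) are illustrative choices for this sketch, not values taken from the actual chips.

```python
import math

def cochlea_cascade(signal, n_stages=8, fs=16000.0, f_top=4000.0, ratio=0.7):
    """Cascade of first-order low-pass sections, a crude software stand-in
    for a silicon cochlea's analog filter cascade.  Returns one half-wave
    rectified output tap per stage.  All parameters are hypothetical."""
    taps = []
    x = list(signal)
    fc = f_top
    for _ in range(n_stages):
        a = math.exp(-2.0 * math.pi * fc / fs)   # one-pole filter coefficient
        y, out = 0.0, []
        for sample in x:
            y = (1.0 - a) * sample + a * y       # y[n] = (1-a) x[n] + a y[n-1]
            out.append(y)
        taps.append([max(v, 0.0) for v in out])  # half-wave rectification
        x = out        # next stage filters this stage's (unrectified) output
        fc *= ratio    # cutoff frequency falls along the cascade
    return taps

# A 3 kHz tone survives the early, high-cutoff stages of the cascade but is
# strongly attenuated by the time it reaches the low-cutoff end.
tone = [math.sin(2.0 * math.pi * 3000.0 * n / 16000.0) for n in range(1600)]
energies = [sum(v * v for v in tap) for tap in cochlea_cascade(tone)]
```

In the real chips, of course, the filtering happens in continuous-time analog circuitry; only the final representation is digitized.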

We developed several core technologies for these special-purpose analog-to-digital converter chips.

  • Efficient digital communication protocols. The address-event representation (AER) is a digital communication protocol optimized for communicating neural representations between chips. In collaboration with researchers from Caltech, we developed auditory chips that used AER for point-to-point communications; we later extended AER to support multi-chip systems.
  • Non-volatile, digitally-controlled parameter storage. Silicon audition prototypes typically used off-chip potentiometers for parameter control. In collaboration with Alan Kramer we developed an on-chip, non-volatile analog memory architecture, which can be programmed via a microprocessor-compatible asynchronous bus.
  • Low-power spiking neuron circuits. Pulse coding is an important part of many silicon auditory models; we developed microwatt versions of popular spiking circuits.

We combined these technologies to produce several generations of analog-to-digital converter chips. An early-generation device was featured in our article in the June 1994 issue of IEEE Micro. The final-generation chip includes the multi-chip AER protocol, supporting the construction of systems with several converters.

Using this chip, we designed a multi-chip system that serves as a front-end for an auditory scene analysis system, computing multiple auditory representations from an analog input signal. We performed a study on using this multi-chip system as a front-end for speaker-independent, isolated-word speech recognition, published in our 1997 article in Analog Integrated Circuits and Signal Processing. The article describes a "recognizer-representation gap" that prevents traditional speech recognition algorithms from making good use of auditory representations.

We also investigated creating complete audio signal processing systems that operate in the micropower regime, in collaboration with Richard Lippmann of MIT Lincoln Laboratory. Our NIPS*96 paper describes a micropower hidden Markov model state decoder for wordspotting applications, operating in the analog domain. A second paper describes a power management architecture for DSP systems that uses micropower analog processing.
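An analog HMM state decoder computes, in parallel per-state circuitry, the same max-plus recursion that a digital Viterbi decoder evaluates frame by frame. As a point of reference, here is a minimal digital sketch of that recursion; the two-state left-to-right model in the example, and all its probabilities, are hypothetical.

```python
import math

def viterbi(obs_loglik, log_trans, log_init):
    """Viterbi decoding over HMM states.  obs_loglik[t][s] is the log
    observation likelihood of state s at frame t; log_trans[i][j] and
    log_init[s] are log transition and initial probabilities."""
    n_states = len(log_init)
    delta = [log_init[s] + obs_loglik[0][s] for s in range(n_states)]
    back = []                             # backpointers, one list per frame
    for t in range(1, len(obs_loglik)):
        ptr, new = [], []
        for j in range(n_states):
            best = max(range(n_states),
                       key=lambda i: delta[i] + log_trans[i][j])
            new.append(delta[best] + log_trans[best][j] + obs_loglik[t][j])
            ptr.append(best)
        delta, back = new, back + [ptr]
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]                        # trace best path backwards
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Hypothetical two-state left-to-right model: frames 0-1 favor state 0,
# frames 2-3 favor state 1; LOG0 stands in for log(0).
LOG0 = -1e9
path = viterbi([[0, -5], [0, -5], [-5, 0], [-5, 0]],
               [[math.log(0.8), math.log(0.2)], [LOG0, 0.0]],
               [0.0, LOG0])
# path == [0, 0, 1, 1]
```

The analog implementation's advantage is that the per-state max and add operations run continuously and in parallel at microwatt power levels, rather than being time-multiplexed through a digital arithmetic unit.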
