Silicon Audition: Annotated Bibliography

Between 1983 and 1997, my research centered on Silicon Audition: the design of analog integrated circuits that model auditory neurophysiology, and the use of these circuits in engineering systems. This research was done in collaboration with many people at several institutions: as an undergraduate working with Paul Mueller at the University of Pennsylvania, as a graduate student working with Carver Mead and Dick Lyon at Caltech, and as a postdoctoral fellow and research scientist at UC Berkeley, working with John Wawrzynek. Grants from the Office of Naval Research and the National Science Foundation funded the work at Berkeley.

This web page is an annotated bibliography of my silicon audition research publications, including related work in analog circuit design. A concise reverse-chronological publication list is also available.

The Penn Years: 1982-1984.

While I was an undergraduate in Paul Mueller's lab, we worked on a board-level circuit model of the cochlear nucleus: the final system consisted of dozens of circuit boards full of SSI analog components. We used this hardware in a simple phonetic speech recognition system. We published an account of this research in the following book, the conference proceedings of the first Snowbird "Neural Networks for Computing Conference." An electronic version of this publication is not available.

Mueller, P. and Lazzaro, J. (1986). A machine for neural computation of acoustical patterns with applications to real-time speech recognition. In Denker, J. (ed), Neural Networks for Computing (Snowbird, Utah). American Institute of Physics Conference Proceedings 151, pp. 321-327.

The Caltech Years: 1984-1990.

While I was a graduate student in Carver Mead's lab, we focused on the design of analog integrated circuits that model auditory neurophysiology. Most of the systems use the silicon cochlea design of Dick Lyon and Carver Mead as a core component. The simplest chip adds circuits to the silicon cochlea to make a full model of sensory transduction in the cochlea. The chip is described in this book chapter, which is available as a PDF by clicking the title below.

Lazzaro, J. and Mead, C. (1989). Circuit models of sensory transduction in the cochlea. In Mead, C. and Ismail, M. (eds), Analog VLSI Implementations of Neural Networks. Norwell, MA: Kluwer Academic Publishers, pp. 85-101.

This design was used as a building block for several larger systems. One is a model of pitch perception, published in the Proceedings of the National Academy of Sciences; the link is to a chapter of my Ph.D. thesis which also describes the system.

Lazzaro, J. and Mead, C. (1989). Silicon modeling of pitch perception. Proceedings of the National Academy of Sciences 86: 9597--9601.

Another design models auditory localization processing in the barn owl, focusing on interaural time delay computations. This was published in the journal Neural Computation, and as a book chapter; the PDF links are to a chapter of my Ph.D. thesis which also describes the system.

Lazzaro, J. and Mead, C. (1989). A silicon model of auditory localization. Neural Computation 1: 41--70.

Lazzaro, J. and Mead, C. (1990). Silicon models of auditory localization. In Zornetzer, Davis, and Lau (eds), An Introduction to Neural and Electronic Networks. New York: Academic Press, pp. 158--174.

Another design uses both spectral and temporal cues to create a binaural representation; my contribution to this project was minor. This paper was published in IEEE Journal of Neural Networks; see the IEEE Xplore link for a PDF copy of the paper.

Mead, C. A., Arreguit, X., and Lazzaro, J. P. (1991). Analog VLSI models of binaural hearing. IEEE Journal of Neural Networks 2: 230--236. [See IEEE Xplore]

Another chip creates a monaural representation of spectral shape; this paper was published in IEEE Journal of Solid State Circuits.

Lazzaro, J. P. (1991). A silicon model of an auditory neural representation of spectral shape. IEEE Journal of Solid State Circuits 26: 772--777.

Other publications from this period describe systems that are not auditory models, but share a similar architecture. For example, this publication, presented at the Advanced Research in VLSI conference, models a sensor in the cardiovascular system that shares similarities with the cochlear representation. This research was done in collaboration with experimentalists at Dupont.

Lazzaro, J. P., Schwaber, J., and Rogers, W. (1991). Silicon baroreceptors: modeling cardiovascular pressure transduction in analog VLSI. In Sequin, C. (ed), Advanced Research in VLSI, Proceedings of the 1991 Santa Cruz Conference, Cambridge, MA: MIT Press, pp. 163--177.

Other publications document a visual motion chip (complete with on-chip photoreceptors) whose architecture is similar to the pitch perception chip. This research was done with collaborators from Christof Koch's lab and Rodney Goodman's lab at Caltech, and was presented at NIPS and included in an IJCV paper.

Horiuchi, T., Lazzaro, J. P., Moore, A., and Koch, C. (1991). A correlation-based motion detection chip. In Lippmann, R., Moody, J., Touretzky, D. (eds), Advances in Neural Information Processing Systems 3. San Mateo, CA: Morgan Kaufmann Publishers, pp. 406--413.

Horiuchi, T., Bair, W., Bishofberger, B., Moore, A., Koch, C., Lazzaro, J. (1992). Computing motion using analog VLSI chips - an experimental comparison among different approaches. International Journal of Computer Vision 8:3, 203-216. [see IEEE Xplore]

Other publications from this era document circuit components that are a part of the larger systems described above. One such circuit performs the winner-take-all computation, and was presented at NIPS.

Lazzaro, J., Ryckebusch, S., Mahowald, M. A., and Mead, C. (1988). Winner-take-all networks of O(n) complexity. In Touretzky, D. (ed), Advances in Neural Information Processing Systems 1. San Mateo, CA: Morgan Kaufmann Publishers, pp. 703-711.

Lazzaro, J., Ryckebusch, S., Mahowald, M. A., and Mead, C. (1988). Winner-take-all networks of O(n) complexity. Caltech Computer Science Technical Report Caltech-CS-TR-88-21.
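For readers unfamiliar with the term, the winner-take-all computation named above can be sketched functionally in software. This is only an illustration of the input-output behavior (mark the largest of n inputs, suppress the rest); it is not the analog circuit described in the paper, and the function name and tie-breaking rule are assumptions made for the example.

```python
def winner_take_all(inputs):
    """Return a one-hot list marking the largest input.

    Hypothetical software illustration only: ties go to the lowest index,
    a choice made here for determinism, not taken from the paper.
    """
    if not inputs:
        raise ValueError("inputs must be non-empty")
    # Single pass over the inputs to find the index of the maximum.
    winner = max(range(len(inputs)), key=lambda i: inputs[i])
    return [1 if i == winner else 0 for i in range(len(inputs))]


if __name__ == "__main__":
    print(winner_take_all([0.2, 0.9, 0.5]))
```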

An improved version of sensory transduction circuits for the silicon cochlea, featuring temporal adaptation, was also presented at NIPS.

Lazzaro, J. P. (1992). Temporal adaptation in a silicon auditory nerve. In Moody, J., Hanson, S., Lippmann, R. (eds), Advances in Neural Information Processing Systems 4. San Mateo, CA: Morgan Kaufmann Publishers, pp. 813--820.

A short review article of much of the work of this period was presented at the Asilomar Conference, at the invitation of Richard Duda, and my Ph.D. thesis is a longer review of much of this work. Another review resource is a bibliography that was part of a Silicon Audition tutorial I gave at NIPS.

Lazzaro, J. P. (1991). Biologically-based auditory signal processing in analog VLSI. IEEE Asilomar Conference on Signals, Systems, and Computers, pp. 790-794.

Lazzaro, J. (1989). Silicon Models of Early Audition. Ph.D. Thesis, Computer Science, California Institute of Technology, Caltech Computer Science Technical Report Caltech-CS-TR-89-10. By chapter: [Table of Contents (0)] [Introduction (1)] [Auditory Nerve Circuits (2)] [Winner Take All Circuits (3)] [Auditory Localization (4)] [Pitch Perception (5)] [Conclusions (6)] [References (7)]

Finally, my Master's thesis describes an analog circuit simulator, which was extensively used to simulate silicon auditory chips. This simulator later became part of the freely-redistributable Chipmunk Tools.

Lazzaro, J. (1989). anaLOG: A Functional Simulator for VLSI Neural Systems. Master's Thesis, Computer Science, California Institute of Technology, Caltech Computer Science Technical Report 5229:TR:86.

The Berkeley Years: 1991-1997.

At Berkeley, John Wawrzynek and I focused on using the core technologies of the Caltech years to create useful engineering systems. One of our motivations was to leverage the low-power potential of these earlier circuits, to create complete systems that ran on microwatts of power; micropower performance is a capability that the competing digital approach struggles to deliver. While most of the circuit techniques used in the papers above are in fact micropower, some are not; the publications below, an ISCAS paper and a book chapter, describe micropower replacements for these circuits.

Lazzaro, J. P. (1992). Low-power silicon spiking neurons and axons. IEEE International Symposium on Circuits and Systems, San Diego, CA, pp. 2220-2224.

Lazzaro, J. P., and Wawrzynek, J. (1994). Low-power silicon axons, neurons, and synapses. In Zaghloul, M. E., Meador, J. L., and Newcomb, R. W., (eds) Silicon Implementations of Pulse Coded Neural Networks. Norwell, MA: Kluwer Academic Publishers, pp. 153-164.

In real-world computing devices, efficient I/O is an important consideration. In collaboration with researchers at several other sites, we implemented and documented a novel technique for communicating neural representations off chip. This was presented at NIPS and was published in IEEE Journal of Neural Networks.

Lazzaro, J. P., Wawrzynek, J., Mahowald, M., Sivilotti, M., Gillespie, D. (1993). Silicon auditory processors as computer peripherals. IEEE Journal of Neural Networks 4:3, 523--528.

Lazzaro, J. P., Wawrzynek, J., Mahowald, M., Sivilotti, M., Gillespie, D. (1993). Silicon auditory processors as computer peripherals. In Hanson, S., Cowan, J., and Giles, C., (eds), Advances in Neural Information Processing Systems 5. San Mateo, CA: Morgan Kaufmann Publishers, pp. 820--827.

Using this communication scheme, we created a special-purpose analog-to-digital converter chip that takes analog input, processes it through a silicon cochlea, a spectral-shape algorithm, and a temporal adaptation algorithm, and sends the final representation off chip efficiently. All parameters for the analog circuits are held on chip in analog floating-gate cells, whose values can be digitally programmed. We combined the chip with a workstation to generate a real-time display of the sound representation. This system is described in a paper published in IEEE Micro.

Lazzaro, J. P., Wawrzynek, J., and Kramer, A. (1994). Systems technologies for silicon auditory models. IEEE Micro, 14:3, 7-15. [see IEEE Xplore]

We extended the communication scheme to handle multi-chip systems; this method was presented at the Advanced Research in VLSI conference.

Lazzaro, J. P. and Wawrzynek, J. (1995). A multi-sender asynchronous extension to the address-event protocol. In Dally, W. J., Poulton, J. W., Ishii, A. T. (eds), 16th Conference on Advanced Research in VLSI, pp. 158--169.

We used this multi-chip communications protocol in a second-generation special-purpose analog-to-digital converter chip. This design let us combine several copies of the same chip, tuned with different parameters, to create a real-time auditory scene analysis system. We describe this system in a NIPS paper.

Lazzaro, J. P., Wawrzynek, J. (1995). Silicon models for auditory scene analysis. In Mozer, M., Touretzky, D., and Hasselmo, M. (eds), Advances in Neural Information Processing Systems 8. Cambridge, MA: MIT Press.

We used this multi-chip system as a front-end for speaker-independent speech recognition experiments, to verify that the system was usable for a real engineering task. These experiments are described in a paper published in the journal Analog Integrated Circuits and Signal Processing, and in a book chapter.

Lazzaro, J. P., Wawrzynek, J. (1997). Speech recognition experiments with silicon auditory models. Analog Integrated Circuits and Signal Processing, 13:1-2, 37-51.

Lazzaro, J. P., Wawrzynek, J. (1998). Speech recognition experiments with silicon auditory models. In Lande, T. S. (ed), Neuromorphic Systems Engineering: Neural Networks in Silicon. Boston: Kluwer Academic.

This paper showed that a micropower analog integrated circuit could be a useful front-end for speech recognition. But could an entire speech recognition system be implemented in this technology? A key element in such a system is a hidden Markov model (HMM) state decoder. In collaboration with Richard Lippmann, we designed an analog integrated circuit that implemented the HMM component of a wordspotting algorithm. We presented this paper at NIPS, and it appeared in the IEEE Journal of Solid State Circuits.

Lazzaro, J. P., Wawrzynek, J., and Lippmann, R. (1996). A micropower analog VLSI HMM state decoder for wordspotting. In Jordan, M., Mozer, M., and Petsche, T. (eds), Advances in Neural Information Processing Systems 9. Cambridge, MA: MIT Press.

Lazzaro, J., Wawrzynek, J., Lippmann, R. P. (1997). A micropower analog circuit implementation of hidden Markov model state decoding. IEEE Journal of Solid State Circuits 32:8, 1200--1209.
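As background on what an HMM state decoder computes, here is a minimal software sketch of one textbook form of state decoding, the Viterbi recursion in the log domain. It is an assumption-laden illustration: this page does not say which recursion the chip implements, and the function name, argument layout, and the two-state toy model are all hypothetical.

```python
import math

NEG_INF = float("-inf")


def viterbi(log_init, log_trans, log_emit):
    """Most likely state path for a small HMM (textbook Viterbi, log domain).

    log_init[s]        log probability of starting in state s
    log_trans[p][s]    log probability of moving from state p to state s
    log_emit[t][s]     log likelihood of frame t under state s
    All names and the data layout are illustrative assumptions.
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at frame t
    for t in range(1, len(log_emit)):
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + log_trans[p][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_emit[t][s])
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # Hypothetical two-state left-to-right model with three input frames.
    log_init = [0.0, NEG_INF]
    log_trans = [[math.log(0.5), math.log(0.5)], [NEG_INF, 0.0]]
    log_emit = [[math.log(0.9), math.log(0.1)],
                [math.log(0.8), math.log(0.2)],
                [math.log(0.1), math.log(0.9)]]
    print(viterbi(log_init, log_trans, log_emit))
```

Working in the log domain turns products of probabilities into sums, which keeps the scores in a manageable numeric range over long inputs.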

These papers showed the feasibility of integrating a complete speech recognition system in analog VLSI. However, actually implementing such a complex system is a daunting task. In collaboration with Richard Lippmann, we developed an architecture to combine a small analog system with a digital signal processor, to get most of the low-power benefits of a full analog speech recognition implementation while simplifying the analog part of the design.

Lazzaro, J., Wawrzynek, J., Lippmann, R. P. (1997). Anawake: Signal-based power management for digital signal processing systems. CNS Group Internal Technical Report, UC Berkeley.

In combining conventional speech recognition systems and biological auditory representations, we found that a "recognizer-representation gap" limits the effectiveness of using auditory representations with standard speech-recognition algorithms. We discuss the nature of the recognizer-representation gap in this paper:

Lazzaro, J. P., Wawrzynek, J. (1997). Speech recognition experiments with silicon auditory models. Analog Integrated Circuits and Signal Processing, 13:1-2, 37-51.
