Mammalian audition is highly resilient to acoustic variability, such as background noise and multiple talkers, yet how the brain accomplishes this seemingly effortless feat remains unknown. One plausible hypothesis is that the ascending auditory pathway is organized into hierarchical processing stages whose feature-extraction capabilities change sequentially, culminating in an invariant, noise-robust representation. We are currently developing biologically inspired speech and sound recognition models, as illustrated above, that 1) reflect the hierarchical architecture of the ascending auditory pathway and 2) incorporate a variety of physiologically realistic mechanisms, such as ascending and descending inhibitory projections. These models differ from the conventional neural networks used in speech recognition both in their architecture and in their use of spiking neurons with realistic temporal dynamics. Using the spike train outputs of this network, we can accurately recognize speech even in the presence of background noise.
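As a rough illustration of the spiking-neuron building block such models rest on, the sketch below simulates a single leaky integrate-and-fire neuron and returns a binary spike train. This is a deliberately minimal, standard simplification for illustration only; the function name, parameters, and dynamics here are our own assumptions, not the actual model, which uses richer temporal dynamics and inhibitory connectivity.

```python
import numpy as np

def lif_spike_train(input_current, dt=1e-3, tau=20e-3,
                    v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Simulate a leaky integrate-and-fire neuron.

    input_current: 1-D array of input drive at each time step.
    Returns a binary array marking the time steps at which the
    neuron fired.
    """
    v = v_rest
    spikes = np.zeros_like(input_current)
    for t, i_t in enumerate(input_current):
        # Euler step for the membrane potential:
        # dv/dt = (-(v - v_rest) + i_t) / tau
        v += dt * (-(v - v_rest) + i_t) / tau
        if v >= v_thresh:
            spikes[t] = 1.0
            v = v_reset  # reset the membrane potential after a spike
    return spikes

# Sustained suprathreshold drive yields periodic spiking; silence yields none.
driven = lif_spike_train(np.full(200, 2.0))
quiet = lif_spike_train(np.zeros(200))
```

In a hierarchical model, the spike trains of one such layer would serve as the input drive to the next, with inhibitory projections shaping the currents between stages.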