About Me
Hi, my name is Alex! I am currently a PhD student at the University of Connecticut College of Engineering, working with Monty Escabi and Ian Stevenson in the Physiological Acoustics Lab. I am part of the Biomedical Engineering program, focusing my coursework on Machine Learning, Digital Signal Processing, and Human Psychoacoustics. I am also a fellow in the Institute for Brain and Cognitive Sciences, where I help facilitate STEM education outreach initiatives across Connecticut. Bleeding blue, I graduated from the University of Connecticut in 2022 with undergraduate degrees in Electrical Engineering and Molecular Cell Biology.
My Current Research
My current work is focused on the bottom-up acoustic cues (spectrum, temporal modulation, spectral modulation) that drive speech perception in natural environmental noise. Natural backgrounds can be quite diverse, with high degrees of spectrotemporal variability arising from environmental acoustic generators. Speech, in contrast, carries unique acoustic idiosyncrasies imposed by articulation (e.g., fundamental frequency, intonation) that shape vocal quality, pronunciation, and phonetic implementation. I am interested in how these acoustic cues interfere with one another and how they relate to real-world human perception. See below for more details about our paper on bioRxiv!
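As a rough illustration of what I mean by temporal and spectral modulation cues, here is a minimal sketch (my own simplified example, not code from our study) that estimates a modulation power spectrum by taking the 2D Fourier transform of a log-spectrogram:

```python
# Minimal sketch: estimate spectrotemporal modulation content of a sound.
# This is an illustrative simplification, not the analysis used in our paper.
import numpy as np
from scipy.signal import spectrogram

def modulation_power_spectrum(x, fs, nperseg=512, noverlap=384):
    """Rough modulation power spectrum of a 1-D signal x sampled at fs Hz."""
    f, t, S = spectrogram(x, fs=fs, nperseg=nperseg, noverlap=noverlap)
    logS = np.log(S + 1e-12)          # compress dynamic range
    logS -= logS.mean()               # remove the DC component before the 2D FFT
    M = np.abs(np.fft.fftshift(np.fft.fft2(logS))) ** 2
    # Modulation axes implied by the spectrogram sampling:
    temp_mod = np.fft.fftshift(np.fft.fftfreq(logS.shape[1], d=t[1] - t[0]))  # Hz
    spec_mod = np.fft.fftshift(np.fft.fftfreq(logS.shape[0], d=f[1] - f[0]))  # cycles/Hz
    return temp_mod, spec_mod, M

# Usage (file name is a placeholder):
# fs, x = scipy.io.wavfile.read("babble.wav")
# temp_mod, spec_mod, M = modulation_power_spectrum(x.astype(float), fs)
```

Comparing these spectra for speech versus different natural backgrounds gives a quick sense of where their modulation energy overlaps, which is one intuition behind studying interference between foreground and background cues.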
Machine and Human Audition
While my work is predominantly focused on human hearing, I am also interested in the differences between human and machine audition. In particular, automatic speech recognition (ASR) systems, such as Alexa and Siri, often reach a point of failure in diverse environmental noise. In contrast, humans are innately able to disentangle a foreground target from complex auditory scenes. I am interested in why this dichotomy exists and in how we can interrogate black-box deep learning methods for ASR to create biologically informed, effective interventions.
Sound Synthesis
I am also interested in applying machine learning methods and optimization tools to create synthetic sounds that drive perceptual studies of audition. Natural sounds have rich spectrotemporal diversity compared to controlled stimuli, which gives rise to exciting perceptual phenomena present in real-life listening. Through sound synthesis procedures, we can preserve the natural diversity of auditory scenes while selectively manipulating specific acoustic features central to how we perceive sounds (e.g., temporal fine structure, spectrum, modulation, reverberation). I am interested in applying generative models for sound synthesis (e.g., diffusion models, GANs) to further probe the robustness and sensitivity of human hearing. A simple example of this kind of manipulation is sketched below.
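As a toy example (my own illustration, not a specific lab procedure), one of the simplest synthesis manipulations is phase randomization: it preserves a recording's long-term power spectrum while scrambling its temporal structure, so listeners hear a "spectrum-matched" version of the original scene.

```python
# Minimal sketch: spectrum-matched noise via phase randomization.
# Illustrative only; real synthesis pipelines control many more statistics.
import numpy as np

def spectrum_matched_noise(x, rng=None):
    """Return noise with the same Fourier magnitude spectrum as x."""
    rng = np.random.default_rng() if rng is None else rng
    X = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=X.shape)
    X_rand = np.abs(X) * np.exp(1j * phases)     # keep magnitudes, randomize phases
    y = np.fft.irfft(X_rand, n=len(x))
    return y / np.max(np.abs(y)) * np.max(np.abs(x))  # roughly match peak level

# Usage (file name is a placeholder):
# fs, x = scipy.io.wavfile.read("rain.wav")
# y = spectrum_matched_noise(x.astype(float))
```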
Engineering Education
In tandem with my research, I am very interested in STEM education and in diversity, equity, and inclusion initiatives within the field. I work with the Experiential Education Office to develop curriculum and mentor first-year engineering students alongside Nick Delaney, Monica Bullock, and Jenn Pascal in the Engineering House Learning Community, a collaborative program between the Office of First Year Programs and the UConn College of Engineering. There we integrate community building and academic support to guide students along their path in engineering. We also make it a goal to weave service-learning initiatives into the curriculum, showing students the impact their education can have on their community and promoting narratives of DEI and outreach in engineering early on.
Publications
Low-dimensional interference of mid-level sound statistics predicts human speech recognition in natural environmental noise (in review, 2024)
Alex C. Clonan, Xiu Zhai, Ian H. Stevenson, Monty A. Escabí
I am really excited to share our current preprint, which you can see posted on bioRxiv. Here we assess the influence of bottom-up acoustic features of the foreground and background by adversarially positioning them against one another. This feature representation is inspired by computations in the auditory midbrain (inferior colliculus, IC). Our approach allows us to investigate perceptual transfer functions indicative of the cues we rely on for specific acoustic tasks.