The Evolved Person Perception & Cognition Lab



In no particular order...

(please note that you will probably need to seek permission to use these resources)

Voice databases

  • The Speech Accent Archive: In addition to voice samples that can be downloaded and chopped up, there are also additional web links.


A collection of useful voice-related papers and voice stimuli can be found here:

  • Voice Neurocognition Lab: A collection of voice samples and other stuff from Professor Pascal Belin’s Lab (Glasgow University):



  • Montreal Affective Voices (MAV) Audio Collection. This has been billed as the voice equivalent of the Ekman faces. It contains 90 NON-VERBAL emotional sounds (anger, disgust, fear, pain, sadness, surprise, happiness and pleasure, plus some neutral expressions). It contains 10 voices: 5 male and 5 female.

CITE: Belin, P., Fillion-Bilodeau, S., & Gosselin, F. (2008). The Montreal Affective Voices: a validated set of nonverbal affect bursts for research on auditory affective processing. Behavior Research Methods, 40(2), 531-539.



(There’s not a lot of this stuff around and this is free)

  • The GRID audiovisual sentence corpus. This is a collection of HQ video and auditory stimuli (multispeaker). It is free to download (but the files are quite big). Information and access can be found here.


  • The VidTIMIT Audio-Video Dataset. These are video and audio recordings of individuals (N=43) reciting fairly short sentences. They are ideal for a variety of research questions involving person identification, voice/face processing, etc.


  • The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)

    • Courtesy of Livingstone & Russo 

      • The RAVDESS (open access database) contains 7356 files. Each file was rated 10 times on emotional validity, intensity, and genuineness. Ratings were provided by 247 individuals who were characteristic of untrained adult research participants from North America. A further set of 72 participants provided test-retest data. High levels of emotional validity, interrater reliability, and test-retest intrarater reliability were reported.

    • The construction and validation of the RAVDESS is described in:

CITE: Livingstone SR, Russo FA (2018) The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PLoS ONE 13(5): e0196391.


  • The Geneva Faces and Voices (GEFAV) database: “The GEFAV has been developed and is distributed by the Swiss Center for Affective Sciences at the University of Geneva, Switzerland. It is a collection of European faces and voices of 111 individuals, including 61 women and 50 men, aged 18-35 years old. For each individual, we provide three kinds of facial stimuli (static neutral, static smiling and dynamic neutral) and two kinds of vocal stimuli (a three-vowel sequence /i/-/a/-/o/ and a sentence in French: ‘Bonjour. Il est deux heures moins dix’). The facial and vocal stimuli are available for download.”

Cite: Ferdenzi C, Delplanque S, Mehu-Blantar I, Da Paz Cabral KM, Domingos Felicio M, Sander D. (in press). The GEneva Faces And Voices (GEFAV) database. Behavior Research Methods.


Cite: Nolan, F. (2011). Dynamic Variability in Speech: a Forensic Phonetic Study of British English, 2006-2007. [data collection]. UK Data Service. SN: 6790,


OTHER sounds stuff

  • Sound effects archive from the BBC (British Broadcasting Corporation)

  • Sound Archive from the British Library

  • Create your very own sine waves and download them as .wav files. Courtesy of Audio Check (please give generously for using the function: these things are not free to the creator).

  • Two very useful references with information about voice characteristics can be found here (HAL.INRA) and here (DEA.BRUNEL)

  • Find sounds: a searchable web site for free auditory stimuli (e.g. bells, whistles, barks etc). It is free from copyright.
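
As an offline alternative to the Audio Check sine-wave generator listed above, a pure tone can also be written to a .wav file with a few lines of Python. This is only a minimal sketch using the standard library. The function names (sine_samples, write_wav), the 440 Hz frequency, the 44.1 kHz sample rate, the 0.5 amplitude and the output filename are all illustrative assumptions and are not taken from any of the resources listed here.

```python
import math
import struct
import wave


def sine_samples(freq_hz, duration_s, sample_rate=44100, amplitude=0.5):
    """Return a list of signed 16-bit integer samples for a sine tone.

    amplitude is a fraction of full scale (0.0 to 1.0); all defaults
    here are illustrative choices, not recommendations.
    """
    n_samples = int(round(duration_s * sample_rate))
    peak = int(amplitude * 32767)  # 32767 is the largest 16-bit sample value
    return [
        int(round(peak * math.sin(2 * math.pi * freq_hz * i / sample_rate)))
        for i in range(n_samples)
    ]


def write_wav(target, samples, sample_rate=44100):
    """Write mono 16-bit PCM samples to a file path or a file-like object."""
    with wave.open(target, "wb") as wf:
        wf.setnchannels(1)           # mono
        wf.setsampwidth(2)           # 2 bytes per sample = 16-bit
        wf.setframerate(sample_rate)
        # Pack the samples as little-endian signed 16-bit integers.
        wf.writeframes(struct.pack("<%dh" % len(samples), *samples))


if __name__ == "__main__":
    # Hypothetical example: a one-second 440 Hz tone.
    write_wav("tone_440hz.wav", sine_samples(440.0, 1.0))
```

Running the script writes a one-second mono tone to tone_440hz.wav in the current folder. Changing freq_hz or duration_s gives other tones.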