Adults' Production of Speech (Audio-Visual Recordings)
  • Description

    This data set contains audio-visual recordings of adult native speakers of Thai, Cantonese, Mandarin and Malaysian-Mandarin producing syllables with all their tones, and of native Lancaster English speakers producing syllables with different intonations. There are 9,500 tokens from 10 Thai speakers; 7,440 tokens from 8 Cantonese speakers; 6,144 tokens from 8 Mandarin speakers; 9,360 tokens from 10 Malaysian-Mandarin speakers; 6,500 tokens from an additional 5 Thai speakers; and 4,752 tokens from 8 Lancaster English speakers.

    Audio was extracted from every video file using Adobe Premiere Pro 2.0 or VirtualDub, then segmented and labelled syllable by syllable within each file using Praat. The resulting segmentation files were then used as cues to cut the video files into individual syllables for use as stimuli in speech perception experiments.

    The videos are in .avi format and the audio files extracted from them are in .wav format; both can be played in most media players. The labelled segmentation files are in .TextGrid format and can be opened in Praat. The size of each video file is approximately 2 to 4 GB depending on the language, with 266 files altogether. Each extracted audio file is approximately 100 to 250 MB, and each TextGrid file approximately 150 to 250 KB, depending on the language.
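    As a rough illustration of how the labelled .TextGrid files can serve as segmentation cues, the sketch below pulls the start time, end time and label of every labelled interval out of a TextGrid. It is not the procedure used to build this data set. It assumes the file was saved in Praat's long text format and has already been read into a string; the function name `labelled_intervals`, the sample labels `ma1` and `ma2`, and the tier name are hypothetical.

```python
import re

# One interval entry in Praat's long text TextGrid format (assumed layout):
#     intervals [2]:
#         xmin = 0.5
#         xmax = 0.9
#         text = "ma1"
INTERVAL_RE = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = (?P<start>[0-9.eE+-]+)\s*'
    r'xmax = (?P<end>[0-9.eE+-]+)\s*'
    r'text = "(?P<label>[^"]*)"'
)


def labelled_intervals(textgrid_text):
    """Return (start_seconds, end_seconds, label) for each non-empty interval."""
    cues = []
    for match in INTERVAL_RE.finditer(textgrid_text):
        label = match.group("label").strip()
        if label:  # unlabelled intervals (e.g. pauses) are skipped
            cues.append(
                (float(match.group("start")), float(match.group("end")), label)
            )
    return cues


# Hypothetical miniature TextGrid with two labelled syllables.
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 2
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
        name = "syllable"
        xmin = 0
        xmax = 2
        intervals: size = 4
        intervals [1]:
            xmin = 0
            xmax = 0.5
            text = ""
        intervals [2]:
            xmin = 0.5
            xmax = 0.9
            text = "ma1"
        intervals [3]:
            xmin = 0.9
            xmax = 1.2
            text = ""
        intervals [4]:
            xmin = 1.2
            xmax = 1.65
            text = "ma2"
'''

for start, end, label in labelled_intervals(SAMPLE):
    print(f"{label}: {start:.2f} s to {end:.2f} s")
```

    Each returned tuple gives the time span of one syllable, which is the information a video or audio editor needs to cut the matching clip. Real files may use a different encoding, several tiers, or labels containing quotation marks, none of which this sketch handles.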


    • Data publication title Adults' Production of Speech (Audio-Visual Recordings)
    • Data type dataset
    • Keywords
      • Speech perception
      • Speech
      • Tone (Phonetics)
      • Language acquisition
    • Funding source
      • Australian Research Council
    • Grant number(s)
      • DP0988201
    • Temporal (time) coverage
      • Start date 2009/01/01
      • End date 2012/12/01
    • Data manager Denis Burnham
    • Access conditions Open
    • The data will be licensed under: Other license
    • Statement of rights in data Copyright Western Sydney University
    • Citation Burnham, Denis (2011): Adults' Production of Speech (Audio-Visual Recordings). Western Sydney University.