Making a corpus from YouTube: dialects in North America

11 Mar

Here is a link to Rick Aschmann’s amazing collection of YouTube speech clips from Canadian and American speakers, which uses the Atlas of North American English as a starting point:


Aschmann’s work is a great example of how to use YouTube. You should also be aware that YouTube allows users to subtitle and caption clips, which means that you can potentially find words and expressions in particular languages, translations into other languages, or both.

You may want to turn the YouTube clips into waveforms that you can analyze. Here are my instructions for how to get YouTube clips into Praat. Basically, you’ll capture the video and then convert the video into audio. Note that YouTube compresses its audio, so a clip is not the same as a lossless recording. That may or may not matter, depending on the phenomena you’re studying:
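
Separately from the linked instructions, here is a rough Python sketch of just the "convert the video into audio" step. It is not the method from those instructions, only one possible way to do it. It assumes that you have already captured the clip as a local video file and that the ffmpeg command-line tool is installed. The function names, the file names, and the 44100 Hz sample rate are all my own placeholders. The output is a mono 16-bit WAV file, a format Praat can open.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, wav_path, sample_rate=44100):
    """Return the ffmpeg argument list that extracts mono 16-bit PCM audio."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it already exists
        "-i", str(video_path),     # input: the captured video (hypothetical local file)
        "-vn",                     # drop the video stream, keep audio only
        "-acodec", "pcm_s16le",    # write uncompressed 16-bit PCM (standard WAV)
        "-ac", "1",                # mix down to a single channel
        "-ar", str(sample_rate),   # output sample rate in Hz (assumed value)
        str(wav_path),
    ]


def video_to_wav(video_path, wav_path=None, sample_rate=44100):
    """Run ffmpeg on a captured video and return the path of the WAV file."""
    video_path = Path(video_path)
    # Default to the same file name with a .wav extension.
    wav_path = Path(wav_path) if wav_path else video_path.with_suffix(".wav")
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(video_path, wav_path, sample_rate), check=True)
    return wav_path
```

Calling `video_to_wav("clip.mp4")` on a hypothetical captured file would write `clip.wav` next to it. Keep in mind that saving as WAV only changes the container: whatever the original compression threw away is still gone.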