Tuesday, 26 August 2008

Text to Speech for EFL ESL Materials

Text to Speech (TTS) technology has come a long way in recent years and this is nowhere more evident than on the Read The Words website.

I've just been having a look at the site and trying to decide whether it has real potential for helping EFL ESL students with their listening, reading and pronunciation.


As an experiment I decided to select quite a challenging text and see what the site could do. I also decided to select a British English accent, as I know that in the past TTS systems struggled more with UK accents than US ones, due to the wider range of sounds in UK English.

Anyway, here are the results. The text is from Wikipedia.org at: http://en.wikipedia.org/wiki/Text_to_speech and is about the challenges of text normalisation in TTS.

  • Click here to watch Elizabeth read the text to you.
    Or
  • Listen using this media player

This is the actual text you should be hearing:

"Text normalization challenges

The process of normalizing text is rarely straightforward. Texts are full of heteronyms, numbers, and abbreviations that all require expansion into a phonetic representation. There are many spellings in English which are pronounced differently based on context. For example, "My latest project is to learn how to better project my voice" contains two pronunciations of "project".

Most text-to-speech (TTS) systems do not generate semantic representations of their input texts, as processes for doing so are not reliable, well understood, or computationally effective. As a result, various heuristic techniques are used to guess the proper way to disambiguate homographs, like examining neighboring words and using statistics about frequency of occurrence.

Deciding how to convert numbers is another problem that TTS systems have to address. It is a simple programming challenge to convert a number into words, like "1325" becoming "one thousand three hundred twenty-five." However, numbers occur in many different contexts; when a year or perhaps a part of an address, "1325" should likely be read as "thirteen twenty-five", or, when part of a social security number, as "one three two five". A TTS system can often infer how to expand a number based on surrounding words, numbers, and punctuation, and sometimes the system provides a way to specify the context if it is ambiguous.

Similarly, abbreviations can be ambiguous. For example, the abbreviation "in" for "inches" must be differentiated from the word "in", and the address "12 St John St." uses the same abbreviation for both "Saint" and "Street". TTS systems with intelligent front ends can make educated guesses about ambiguous abbreviations, while others provide the same result in all cases, resulting in nonsensical (and sometimes comical) outputs."
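
For anyone curious what "expanding a number based on context" might look like in practice, here is a minimal Python sketch of the "1325" example from the text. It is only an illustration and not how Read The Words or any real TTS engine works: the function names and the context labels ("cardinal", "year", "digits") are my own assumptions, and the year reading is only meant for years like 1325 (a year such as 2000 would need extra rules).

```python
# Illustrative sketch only: hypothetical names, not a real TTS front end.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def two_digits(n):
    """Words for 0..99, e.g. 25 -> 'twenty-five'."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def cardinal(n):
    """Words for 0..9999 read as a quantity (US style, no 'and')."""
    parts = []
    thousands, rest = divmod(n, 1000)
    hundreds, last_two = divmod(rest, 100)
    if thousands:
        parts.append(f"{ONES[thousands]} thousand")
    if hundreds:
        parts.append(f"{ONES[hundreds]} hundred")
    if last_two or not parts:
        parts.append(two_digits(last_two))
    return " ".join(parts)


def as_year(n):
    """Read a four-digit number as two pairs, e.g. 1325 -> 'thirteen twenty-five'."""
    first, last = divmod(n, 100)
    if last == 0:
        return f"{two_digits(first)} hundred"
    if last < 10:
        return f"{two_digits(first)} oh {ONES[last]}"
    return f"{two_digits(first)} {two_digits(last)}"


def as_digits(token):
    """Read each digit separately, e.g. '1325' -> 'one three two five'."""
    return " ".join(ONES[int(ch)] for ch in token)


def expand_number(token, context="cardinal"):
    """Pick a reading for a digit string given a (hypothetical) context label."""
    if context == "year":
        return as_year(int(token))
    if context == "digits":
        return as_digits(token)
    return cardinal(int(token))


for context in ("cardinal", "year", "digits"):
    print(context, "->", expand_number("1325", context))
```

The hard part for a real system is not these conversions but deciding which context applies, which is where the guessing from surrounding words and punctuation comes in.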

What I like about the site
  • The site is free though you do have to register.
  • The site offers a number of options once it has converted the text to speech. These include creating an MP3 file to download, creating an embed code to embed the audio into a blog or website, or downloading to an iPod.
  • They have quite a selection of avatars and voices
  • The site can convert text from a number of sources including Word, PDF, a website (just type in the URL) or even an RSS feed!
  • You can make the texts private or public
  • There doesn't seem to be a limit on how many you can create
What I wasn't so sure about
  • I found it hard to get a link to the avatar reading the text. It would have been nice to be able to embed her into my blog, but I just couldn't get that to work.
  • Processing the text can take a while.
I haven't added any teaching suggestions yet for this posting, as I'm interested to see what other teachers think about this before I do that.

So, if you've listened to the text, please do send in a comment and let me know what you think about the usability of a tool like this with EFL ESL students.

Best

Nik Peachey

7 comments:

Gail Haythorne said...

Hi Nik,
I have an article in with the TES on this very site (out in Autumn I think) so was interested to read this. I've suggested some applications for teachers including keeping on top of (boring and long?) documentation by listening rather than reading, and creating revision documents for students. Will be interested to see ideas others post. Thanks for your really helpful blog.

Nik Peachey said...

Hi Gail

Thanks for the contribution. I'm working on a few more ideas myself too and hope to post some time next week. Will your TES article be available online? I would be really interested to read it. Glad you like the blog (hope it doesn't fall into that boring and long category you mention!)

best

Nik

Helene Cruz said...

I actually tried it out and it is good. The only thing is that the voices do sound computerized. Other than that, I think that this could be used to increase comprehension and communication for our students.

Helene Cruz
Guam

Nik Peachey said...

Hi Helene

I agree. I'm sure it has uses, though I don't think you could fool someone that it was a real person. I spent a lot of time looking at text to speech a few years back and it's amazing how much better this one is than they were just 2 - 3 years ago. TTS certainly has a future I think.

Best

Nik

Anonymous said...

http://www.ispeech.org/convert.text.php

This one is free (totally, since you don't even have to register) and the voice is quite natural sounding.

The danger of TTS for foreign language study is that, since the voices are sounding more and more like native speakers, students may assume that they are listening to a native speaker and thus learn incorrect English.

I've heard "podcasts" done with these by non-native speakers that are full of grammatical errors.

Also, though more often than not the intonation and rhythm of the speech is natural, there are times when this is not so. If a student does "listen and repeat" practice after TTS, he/she is likely to be mis-learning something.

Students may benefit from TTS, but they should be taught (warned) about potential problems, too.

Nik Peachey said...

Hi Anonymous,

I agree. I think we need to make students aware that these aren't the voices of 'real' people and that they are different. But I think TTS is going to become part of our everyday communication experience, and some companies are already outsourcing telephone services to TTS, so students are going to need to understand TTS and be able to identify it. iSpeech looks like a handy tool.

Best

Nik

Special Education Teacher said...

This is the first time I have seen this program and I'm currently working with ESL students so this would be nice to try.