New Software Puts Human Singing Voices On Tap

Rense.com

By Celeste Biever
NewScientist.com
11-27-3

Software that can sing any song you choose in the voice of a world-class soul diva will go on sale in January for just over $300.

Vocaloid will come in the form of two vocal "fonts", known as Leon and Lola. Simple vocal fonts are currently available that can turn a man's voice to a woman's. But Leon and Lola will be the first commercially available ones capable of imparting the tone and pronunciation of a specific human voice to any chosen song.

"We have cracked the genome of the English singing voice," claims Ed Stratton, founder and president of Zero-G, the UK audio sampling company that has licensed Vocaloid from Yamaha.

In theory, Vocaloid could be trained to sing like anyone, from Edith Piaf to Eminem. But for now the product will be limited to the voices of two professional, "breathy" soul singers. A company called Crypton, the only other licensee of Vocaloid, is also due to start selling a "bubbly", female Japanese font.

Sound transition

Synthesising a human singing voice is harder than mimicking a speaking voice because it involves both lyrics and melody. It is also harder than synthesising most musical instruments, because the voice has the widest range of transitions between sounds.

Vocaloid tackles this problem by breaking the human voice into 2500 phonemes, or syllables. To encode a voice, Zero-G's engineers record the waveforms produced when each of these phonemes is sung at four different pitches - a process that takes five working days.

Then the Vocaloid software extracts all possible transitions between phonemes. For example, the word "part" is made of five transitions: silence to p, p-a, a-r, r-t and t-silence. Vocaloid captures the singer's unique tone and pronunciation of the transitions and converts them to a mathematical form that it stores in a database.
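
The transition idea can be illustrated with a short Python sketch. This is not Vocaloid's code: the function name `transitions` and the `"sil"` silence marker are assumptions made for this example, and the letters of "part" stand in for its sounds, as in the article.

```python
def transitions(sounds, silence="sil"):
    """Return each adjacent pair of sounds, with silence added at both ends."""
    padded = [silence, *sounds, silence]
    # Pair every element with its successor: (sil, p), (p, a), ...
    return list(zip(padded, padded[1:]))


pairs = transitions(["p", "a", "r", "t"])
for left, right in pairs:
    print(f"{left}-{right}")
print(len(pairs), "transitions")
```

A word with n sounds always yields n + 1 transitions under this scheme, which matches the five listed for the four sounds of "part".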

When Vocaloid is fed the electronic score of a song, it pulls the relevant transitions from its database and stitches them together to produce the sound. Effects such as vibrato, and the way someone "attacks" or leads into a note, are recorded separately and added in.
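
The look-up-and-stitch step might be sketched as below. This is a loose illustration under stated assumptions, not the product's method: the article says Vocaloid stores transitions in a mathematical form, whereas this sketch stores raw sample lists, and the names `crossfade_join`, `stitch` and the toy `database` are hypothetical. A linear crossfade is swapped in as one simple way to smooth the joins.

```python
def crossfade_join(a, b, overlap):
    """Join two sample lists, blending the end of `a` into the start of `b`."""
    if overlap == 0:
        return a + b
    head, tail = a[:-overlap], a[-overlap:]
    blended = []
    for i, (x, y) in enumerate(zip(tail, b[:overlap])):
        w = (i + 1) / (overlap + 1)  # weight of `b` rises across the overlap
        blended.append(x * (1 - w) + y * w)
    return head + blended + b[overlap:]


def stitch(score, database, overlap=2):
    """Look up each transition in the score and join the units in order."""
    out = list(database[score[0]])
    for key in score[1:]:
        out = crossfade_join(out, database[key], overlap)
    return out


# Toy database (assumption): constant-valued units stand in for recordings.
database = {
    ("sil", "p"): [0.0, 0.0, 0.0, 0.0],
    ("p", "a"): [1.0, 1.0, 1.0, 1.0],
}
samples = stitch([("sil", "p"), ("p", "a")], database)
print(len(samples))
```

Each join shortens the output by `overlap` samples, so every unit must be at least that long.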

Little or liddle

Timing, volume, brightness, resonance and more detailed pronunciation can be layered on top by the software's user to make the voice sound even more realistic. "For example, a user may decide that 'liddle' sounds better than 'little' depending on the exact situation," says Stratton.

"There are other approaches for synthesising a realistic human singing voice but this one appears to have gone much farther at the high quality level," says Julius Smith of the Stanford University Center for Computer Research in Music and Acoustics.

But he would be more excited by tools that could turn anyone's voice into software. Stratton hopes to do this within five years, but says the process currently requires a deep knowledge of phonetics and signal processing: "Automating the audio processing is a very tall order indeed."

Although Vocaloid is the most convincing synthetic voice software yet, it could not replace a lead singer. So Stratton expects Leon and Lola to be a source of cheap, automatic backing vocals for musicians creating demo tracks. Experimental musicians could also use the fonts to create totally new sounds made of speeds and intervals that a real human could not produce.

© Copyright Reed Business Information Ltd.

http://www.newscientist.com/news/news.jsp?id=ns99994415
