- Software that can sing any song you choose in the voice
of a world-class soul diva will go on sale in January for just over $300.
-
- Vocaloid will come in the form of two vocal "fonts",
known as Leon and Lola. Simple vocal fonts are currently available that
can turn a man's voice to a woman's. But Leon and Lola will be the first
commercially available ones capable of imparting the tone and pronunciation
of a specific human voice to any chosen song.
-
- "We have cracked the genome of the English singing
voice," claims Ed Stratton, founder and president of Zero-G, the UK
audio sampling company that has licensed Vocaloid from Yamaha.
-
- In theory, Vocaloid could be trained to sing like anyone,
from Edith Piaf to Eminem. But for now the product will be limited to the
voices of two professional, "breathy" soul singers. A company
called Crypton, the only other licensee of Vocaloid, is also due to start
selling a "bubbly", female Japanese font.
-
- Sound transition
-
- Synthesising a human singing voice is harder than mimicking
a speaking voice because it involves both lyrics and melody. It is also
harder than synthesising most musical instruments, because the voice has
the widest range of transitions between sounds.
-
- Vocaloid tackles this problem by breaking the human voice
into 2500 phonemes, or syllables. To encode a voice, Zero-G's engineers
record the waveforms produced when each of these phonemes is sung at four
different pitches - a process that takes five working days.
-
- Then the Vocaloid software extracts all possible transitions
between phonemes. For example, the word "part" is made of five
transitions: silence to p, p-a, a-r, r-t and t-silence. Vocaloid captures
the singer's unique tone and pronunciation of the transitions and converts
them to a mathematical form that it stores in a database.
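-
- As a rough illustration of the "part" example, the sketch below pads a phoneme sequence with silence at both ends and lists every adjacent pair. The function name `transitions` and the `"sil"` marker are assumptions made for this example; the article does not describe how Vocaloid itself represents transitions.

```python
def transitions(phonemes, silence="sil"):
    """List the adjacent-pair transitions in a phoneme sequence.

    The sequence is padded with a silence marker on both sides, so a word
    with n phonemes yields n + 1 transitions. The "sil" marker is a
    hypothetical label chosen for this sketch.
    """
    padded = [silence, *phonemes, silence]
    # Pair each unit with the one that follows it.
    return list(zip(padded, padded[1:]))


# The word "part", written here as the four sounds named in the article.
pairs = transitions(["p", "a", "r", "t"])
for left, right in pairs:
    print(f"{left}-{right}")
print(len(pairs), "transitions")
```

The four sounds of "part" give five pairs, matching the count in the text.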
-
- When Vocaloid is fed the electronic score of a song,
it pulls the relevant transitions from its database and stitches them together
to produce the sound. Effects such as vibrato, and the way someone "attacks"
or leads into a note, are recorded separately and added in.
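-
- The lookup-and-stitch step can be pictured with a toy sketch. Everything here is an assumption for illustration: the `stitch` function, the `"sil"` silence marker, and a database that maps each transition to a short list of made-up sample values. Vocaloid stores transitions in a mathematical form the article does not specify, and it adds vibrato and attack separately, which this sketch leaves out.

```python
def stitch(phonemes, database, silence="sil"):
    """Concatenate stored transition snippets for a phoneme sequence.

    `database` is a hypothetical dict mapping (from_unit, to_unit) pairs
    to lists of audio sample values. Plain concatenation stands in for
    whatever joining method the real software uses.
    """
    padded = [silence, *phonemes, silence]
    samples = []
    for pair in zip(padded, padded[1:]):
        if pair not in database:
            raise KeyError(f"no recorded transition for {pair}")
        samples.extend(database[pair])
    return samples


# Made-up sample values, two per transition, purely for demonstration.
toy_database = {
    ("sil", "p"): [0.0, 0.1],
    ("p", "a"): [0.3, 0.5],
    ("a", "r"): [0.4, 0.2],
    ("r", "t"): [0.1, 0.05],
    ("t", "sil"): [0.02, 0.0],
}

audio = stitch(["p", "a", "r", "t"], toy_database)
print(len(audio), "samples")
```

A missing transition raises an error rather than being skipped, since a gap in the database would otherwise produce a silent hole in the output.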
-
- Little or liddle
-
- Timing, volume, brightness, resonance and more detailed
pronunciation can be layered on top by the software's user to make the
voice sound even more realistic. "For example, a user may decide that
'liddle' sounds better than 'little' depending on the exact situation,"
says Stratton.
-
- "There are other approaches for synthesising a realistic
human singing voice but this one appears to have gone much farther at the
high quality level," says Julius Smith of the Stanford University
Center for Computer Research in Music and Acoustics.
-
- But he would be more excited by tools that could turn
anyone's voice into software. Stratton hopes to do this within five years,
but says the process currently requires a deep knowledge of phonetics and
signal processing: "Automating the audio processing is a very tall
order indeed."
-
- Although Vocaloid is the most convincing synthetic voice
software yet, it could not replace a lead singer. So Stratton expects Leon
and Lola to be a source of cheap, automatic backing vocals for musicians
creating demo tracks. Experimental musicians could also use the fonts to
create totally new sounds made of speeds and intervals that a real human
could not produce.
-
- © Copyright Reed Business Information Ltd.
-
- http://www.newscientist.com/news/news.jsp?id=ns99994415