VoiceOverKids

.agency

International Child voice over casting agency & studio

Real child voice overs! 

VoiceOverKids Child voice generator 2.0 is here!

VoiceOverKids AI is based on the latest (04.2025) AI text-to-speech model. For the best results, read this tutorial or watch our tutorial video.

Things to take into account when using our AI text-to-speech child voice generator
 

Note that you are using a text-to-speech generator. If you add a line like “Read this with a joyful tone of voice”, you are typing it into a text field, so the AI will simply read your remark aloud as text instead of treating it as a direction.
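A small pre-check can catch direction-style lines before they are pasted into the text field. The sketch below is only an illustration: the function name and the list of trigger phrases are assumptions, not part of the generator, and the patterns are far from exhaustive.

```python
import re

# Assumed, non-exhaustive openers that look like a direction to the voice
# rather than text that should be spoken.
DIRECTION_OPENERS = re.compile(r"^\s*(read|say|speak)\s+this\b", re.IGNORECASE)

def looks_like_direction(line: str) -> bool:
    """Return True if the line starts like an instruction (hypothetical check)."""
    return bool(DIRECTION_OPENERS.search(line))

script_lines = [
    "Read this with a joyful tone of voice",
    "The rain keeps falling.",
]
for line in script_lines:
    if looks_like_direction(line):
        print(f"Will be read aloud as text, not followed: {line!r}")
```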

 

Our child voices are designed in specific styles and tones of voice. One voice might be very cheerful and another more neutral. If you choose a very cheerful voice and want it to read a line in a sad tone, you will not get the best results, because the voice was not trained with this in mind. In that case, choose a more neutral voice.

 

 

How to enhance emotions in your AI child voice output?

 

Add pauses: insert the tag <break time="1.0s"/> into your text and change the value to the desired pause length in seconds.
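If you assemble scripts programmatically, a tiny helper keeps the pause tags consistent. This is a minimal sketch; the function name break_tag is an assumption, and only the tag format itself comes from this tutorial.

```python
def break_tag(seconds: float) -> str:
    """Return a pause tag in the form <break time="1.0s"/>."""
    if seconds < 0:
        raise ValueError("pause length must not be negative")
    # One decimal place, matching the examples in this tutorial.
    return f'<break time="{seconds:.1f}s"/>'

line = f"The rain keeps falling. {break_tag(1.5)} I miss the sun."
print(line)
```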

 

AI systems are not good at guessing, so give the AI context to understand how your script should be delivered. Do this with engaging storytelling: write the script like a passage from a book. You can use a tool such as ChatGPT to help you create a storytelling script.

 

Example 1
    
Instead of: 
Read this with a slow, melancholic tone, conveying a deep sense of sadness and disappointment: 'The rain keeps falling, and all the plans we made seem so far away.'

 

Use:

 

 The girl was reading a book out loud with a slow, melancholic tone, conveying a deep sense of sadness and disappointment:   <break time="2.0s"/> 'The rain keeps falling, and all the plans we made seem so far away. <break time="1.0s"/> I can't help but feel like we've lost something important.'
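The storytelling pattern in Example 1 can be sketched as a small function that wraps the spoken sentences in a narrating sentence and places pauses between them. The function name, parameter names, and default pause lengths are assumptions chosen to mirror the example, not a feature of the generator.

```python
from typing import Sequence

def narrated_line(narration: str, spoken_parts: Sequence[str],
                  lead_pause: float = 2.0, inner_pause: float = 1.0) -> str:
    """Wrap spoken sentences in a storytelling sentence with pause tags."""
    def tag(seconds: float) -> str:
        return f'<break time="{seconds:.1f}s"/>'

    # Pauses between the spoken sentences, one longer pause before the quote.
    spoken = f" {tag(inner_pause)} ".join(spoken_parts)
    return f"{narration}: {tag(lead_pause)} '{spoken}'"

script = narrated_line(
    "The girl was reading a book out loud with a slow, melancholic tone",
    ["The rain keeps falling.", "I feel like we've lost something important."],
)
print(script)
```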

 

If you want to emphasize words or lines, write them in CAPITALS:

 

 The girl was reading a book out loud with a slow, melancholic tone, conveying a deep sense of sadness and disappointment: <break time="2.0s"/> 'THE RAIN KEEPS FALLING, and all the plans we made seem SO FAR AWAY. <break time="2.0s"/> I can't help it, but <break time="1.0s"/> I feel like we've lost SOMETHING IMPORTANT.'
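Capitalizing by hand is easy to get wrong in longer scripts. As a hedged illustration, the sketch below upper-cases chosen phrases while leaving <break .../> tags untouched; the function name emphasize and the tag-matching pattern are assumptions for this example.

```python
import re
from typing import Sequence

# Matches pause tags such as <break time="1.0s"/> so they are never altered.
BREAK_TAG = re.compile(r"(<break\b[^>]*/>)")

def emphasize(script: str, phrases: Sequence[str]) -> str:
    """Upper-case each phrase (case-insensitive match) outside pause tags."""
    parts = BREAK_TAG.split(script)  # the capturing group keeps the tags in the list
    for i, part in enumerate(parts):
        if BREAK_TAG.fullmatch(part):
            continue  # leave tags exactly as written
        for phrase in phrases:
            part = re.sub(re.escape(phrase), lambda m: m.group(0).upper(),
                          part, flags=re.IGNORECASE)
        parts[i] = part
    return "".join(parts)

text = 'The rain keeps falling. <break time="1.0s"/> It all seems so far away.'
print(emphasize(text, ["the rain keeps falling", "so far away"]))
```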

 

Example 2

 

Instead of: “I can't believe it, this is amazing”

    

Use:

    

'I CAN’T BELIEVE IT, this is amazing!' <break time="1.0s"/> she said with a bright and joyful voice.

Note that each time you click Generate, you will get a different result.

Advanced settings
 

Speed

This adjusts how fast or slow the voice speaks. Slower speeds tend to convey more emotion.

 

Stability

The stability slider controls how stable the voice is and the level of randomness in each generation. A lower setting introduces a broader emotional range, but setting it too low may cause the voice to sound too random or too fast. On the other hand, a high setting leads to a more monotonous tone with less emotion.

For a lively and dramatic performance, set the stability lower and generate multiple samples until you find one that fits. For a more serious or monotone performance, use a higher setting. We recommend setting the stability between 45 and 50 for a balanced result.
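The guidance above could be captured as a simple lookup from performance type to a starting stability value. Only the 45 to 50 range for a balanced result comes from this tutorial; the 0 to 100 scale, the preset names, and the "dramatic" and "serious" values are assumptions you should tune by ear.

```python
# Hypothetical starting points on an assumed 0-100 slider scale.
STABILITY_PRESETS = {
    "dramatic": 30,  # assumed value: lower stability, broader emotional range
    "balanced": 48,  # inside the recommended 45-50 range
    "serious": 75,   # assumed value: higher stability, more even delivery
}

def stability_for(performance: str) -> int:
    """Return a suggested starting stability for a performance type."""
    try:
        return STABILITY_PRESETS[performance]
    except KeyError:
        raise ValueError(f"unknown performance type: {performance!r}") from None

print(stability_for("balanced"))
```

For lower settings, plan to generate several samples and keep the one that fits, since results vary more between generations.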

 

Similarity
 

This slider adjusts how closely the AI adheres to the original voice. Higher values will make the voice sound closer to the selected one.

 

Style Exaggeration

This setting amplifies the style of the original voice. However, it can slightly reduce stability since the model is focusing more on imitating the style. We recommend keeping this setting at 0 to maintain stability.

 

Speaker Boost
 

This setting boosts the similarity to the original speaker's voice.
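To keep track of the settings used for a given take, the advanced settings can be noted down in one place. The sketch below is purely illustrative: the class, its field names, the value scales, and every default except the recommended stability range and style exaggeration of 0 are assumptions, not the generator's actual interface.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class VoiceSettings:
    speed: float = 1.0           # assumed scale: 1.0 = normal speed
    stability: int = 48          # recommended range in this tutorial: 45-50
    similarity: int = 75         # assumed default on an assumed 0-100 scale
    style_exaggeration: int = 0  # recommended value in this tutorial: 0
    speaker_boost: bool = True   # assumed to be an on/off toggle

    def notes(self) -> List[str]:
        """List deviations from the recommendations in this tutorial."""
        found = []
        if not 45 <= self.stability <= 50:
            found.append("stability is outside the recommended 45-50 range")
        if self.style_exaggeration != 0:
            found.append("style exaggeration above 0 can reduce stability")
        return found

take = VoiceSettings(stability=30, style_exaggeration=20)
for note in take.notes():
    print(note)
```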