November 23, 2023
O. Wolfson
In this article, we will walk through the process of using the Google Cloud Text-to-Speech API to convert text to speech in a Node.js application. This guide covers setting up Google Cloud credentials, managing billing, preparing the source data, and running a script that converts phrases from JSON data into audio files.
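Initialize a new Node.js project and install the required packages (fs, util, and path are built into Node.js and do not need to be installed):
npm init
npm install @google-cloud/text-to-speech dotenv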
Create a .env file in your project root and add the line below, replacing YOUR_CREDENTIALS with the content of the downloaded JSON file:
GOOGLE_CREDENTIALS_CONTENT='YOUR_CREDENTIALS'
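A malformed or missing credentials value is an easy mistake at this step, and it otherwise only surfaces as a bare JSON.parse error. The following is a minimal sketch of a check you could run before creating the client. The helper name parseCredentials is hypothetical and not part of the Google library; it assumes a service account key file, which contains client_email and private_key fields.

```javascript
// Sketch only: parseCredentials is a hypothetical helper, not a Google API.
// It turns the raw .env value into an object and fails with a readable message.
function parseCredentials(raw) {
  if (!raw) {
    throw new Error("GOOGLE_CREDENTIALS_CONTENT is not set; check your .env file");
  }

  let credentials;
  try {
    credentials = JSON.parse(raw);
  } catch (err) {
    throw new Error(`GOOGLE_CREDENTIALS_CONTENT is not valid JSON: ${err.message}`);
  }

  if (credentials === null || typeof credentials !== "object") {
    throw new Error("GOOGLE_CREDENTIALS_CONTENT must be a JSON object");
  }

  // Assumption: a service account key file, which includes these two fields.
  for (const field of ["client_email", "private_key"]) {
    if (typeof credentials[field] !== "string" || credentials[field] === "") {
      throw new Error(`Credentials are missing the "${field}" field`);
    }
  }
  return credentials;
}
```

In a script, this could be called as parseCredentials(process.env.GOOGLE_CREDENTIALS_CONTENT) in place of a direct JSON.parse call.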
Your source data should be in JSON format and contain the phrases you want to convert. Here is an example (/data/phrases.json):
[
  {
    "id": "79546973-07ee-4b13-ae8e-2ebf07db7b2d",
    "phrase": "Potremmo vederci domani?",
    "translation": "Could we meet tomorrow?"
  }
  // Additional phrases...
]
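Because each id becomes a file name and each phrase is sent to the API, it can help to check the data before spending any API calls. The sketch below is one possible check; findInvalidPhrases is a hypothetical helper, and the rules it enforces (non-empty id and phrase, unique ids) are assumptions based on how the example data is used. Note also that the comment line in the example above is only a placeholder: a real JSON file cannot contain comments.

```javascript
// Sketch only: findInvalidPhrases is a hypothetical helper.
// It returns a list of human-readable problems; an empty list means the data looks usable.
function findInvalidPhrases(phrases) {
  if (!Array.isArray(phrases)) {
    throw new TypeError("phrases must be an array");
  }

  const seenIds = new Set();
  const problems = [];

  phrases.forEach((entry, index) => {
    const id = entry && entry.id;
    const phrase = entry && entry.phrase;

    if (typeof id !== "string" || id.trim() === "") {
      problems.push(`entry ${index}: missing "id"`);
    } else if (seenIds.has(id)) {
      // Duplicate ids would map to the same output file name.
      problems.push(`entry ${index}: duplicate id "${id}"`);
    } else {
      seenIds.add(id);
    }

    if (typeof phrase !== "string" || phrase.trim() === "") {
      problems.push(`entry ${index}: missing "phrase"`);
    }
  });

  return problems;
}
```

A script could print the returned problems and exit early when the list is not empty.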
The following script converts each phrase in the JSON file into an audio file using the Google Cloud Text-to-Speech API.
const textToSpeech = require("@google-cloud/text-to-speech");
const fs = require("fs");
const util = require("util");
const path = require("path");
require("dotenv").config();

const phrases = require("../data/phrases.json");

// Build the client from the service account JSON stored in .env
const google_credentials_content = process.env.GOOGLE_CREDENTIALS_CONTENT;
const credentials = JSON.parse(google_credentials_content);
const client = new textToSpeech.TextToSpeechClient({ credentials });

// Promise-based version of fs.writeFile
const writeFile = util.promisify(fs.writeFile);

async function convertTextToAudioFile(obj) {
  // console.log("convertTextToAudioFile:", obj.phrase);
  const request = {
    input: { text: obj.phrase },
    voice: { languageCode: "it-IT", ssmlGender: "FEMALE" },
    audioConfig: { audioEncoding: "MP3" },
  };
  const [response] = await client.synthesizeSpeech(request);
  // Specify the path where the audio should be saved
  const outputPath = path.join(
    __dirname,
    "../public/audio/",
    `${obj.id}.audio.mp3`
  );
  // Write the audio content to the file
  await writeFile(outputPath, response.audioContent, "binary");
}

// Convert the phrases one at a time
async function processPhrases() {
  for (const phraseObj of phrases) {
    await convertTextToAudioFile(phraseObj);
  }
}

processPhrases();
The script imports the @google-cloud/text-to-speech, fs, util, path, and dotenv modules and loads the phrases from phrases.json.
Place the script in your project directory (e.g., as utils/process-audio.js) and run it using Node.js:
node utils/process-audio.js
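Running the script again converts every phrase again, including those that already have an audio file. One way to avoid repeated API calls is to filter the list first. The sketch below assumes the same `<id>.audio.mp3` naming used by the script; phrasesNeedingAudio is a hypothetical helper, not something the original script defines.

```javascript
const fs = require("fs");
const path = require("path");

// Sketch only: phrasesNeedingAudio is a hypothetical helper.
// It returns the phrases whose audio file does not exist yet in audioDir.
function phrasesNeedingAudio(phrases, audioDir) {
  // Create the output directory if it is missing, so later writes do not fail.
  fs.mkdirSync(audioDir, { recursive: true });

  return phrases.filter((obj) => {
    const outputPath = path.join(audioDir, `${obj.id}.audio.mp3`);
    return !fs.existsSync(outputPath);
  });
}
```

In the script, the loop could then iterate over phrasesNeedingAudio(phrases, path.join(__dirname, "../public/audio/")) instead of the full list. If you paste the helper into the same file, drop its two require lines, since the script already imports fs and path.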
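Be aware of the billing for the Text-to-Speech API: it is a paid service that charges by the number of characters sent for synthesis, so check the current pricing and free-tier limits in the Google Cloud console before converting large batches of phrases.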
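Since Text-to-Speech usage is billed by character count, a rough estimate before a large run can be useful. The sketch below only adds up the characters in the phrase list; estimateUsage is a hypothetical helper, and pricePerMillionChars is a value you would take from the current pricing page, not a number this article supplies. The count is an approximation and may differ slightly from how the API meters requests.

```javascript
// Sketch only: estimateUsage is a hypothetical helper.
// pricePerMillionChars is an assumption supplied by the caller from the pricing page.
function estimateUsage(phrases, pricePerMillionChars) {
  // Array.from counts Unicode code points rather than UTF-16 code units.
  const characters = phrases.reduce(
    (total, obj) => total + Array.from(obj.phrase).length,
    0
  );
  const estimatedCost = (characters * pricePerMillionChars) / 1e6;
  return { characters, estimatedCost };
}
```

For example, estimateUsage(phrases, price).characters gives the total number of characters the script would send in one full run.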
By following these steps, you can integrate Google Cloud Text-to-Speech into your Node.js application to create a dynamic and interactive voice experience. Remember to monitor your usage to align with your budget and regularly update your application as needed.