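Pure JavaScript Text-to-Speech: A Plugin-Free Approach

Author: 菠萝爱吃肉 · 2025.09.19 18:14

Summary: This article explains how to implement text-to-speech with the browser's native Web Speech API in plain JavaScript, with no external packages or plugins. It covers a basic implementation, finer-grained control, and practical use cases.

Native JavaScript Text-to-Speech: A Complete Guide with No Packages or Plugins

In web development, text-to-speech (TTS) is commonly used for read-aloud features, voice guidance, and accessibility. Implementations have traditionally relied on third-party libraries or browser plugins, but modern browsers ship the Web Speech API, which makes TTS possible in plain JavaScript. This article shows how to use the native API to turn text into speech without any external dependencies.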

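I. Introduction to the Web Speech API

The Web Speech API is a browser API specified through the W3C community process (it is not a formal W3C Recommendation). It has two parts: speech recognition (SpeechRecognition) and speech synthesis (SpeechSynthesis). Speech synthesis, the TTS half, is exposed through the SpeechSynthesis interface and turns text into audible speech.

Core interfaces

SpeechSynthesis: the synthesis controller, available as the global speechSynthesis object; it manages the utterance queue and playback.
SpeechSynthesisUtterance: a single speech request, holding the text plus properties such as rate, pitch, and volume.
speechSynthesis.speak(): the method that queues an utterance for playback.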

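Browser compatibility

Speech synthesis is supported in current versions of the major browsers (Chrome, Firefox, Edge, Safari), but details such as the set of available voices vary by browser and operating system. Test in your target browsers before relying on it.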
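
Because support varies, a feature check before touching the API avoids runtime errors where it is missing. The following is a minimal sketch; the helper name `supportsSpeechSynthesis` and its `scope` parameter are illustrative and not part of the Web Speech API.

```javascript
// Hypothetical helper: reports whether speech synthesis looks usable in `scope`.
// `scope` defaults to globalThis (the window object in a browser).
function supportsSpeechSynthesis(scope = globalThis) {
  return (
    typeof scope.speechSynthesis === "object" &&
    scope.speechSynthesis !== null &&
    typeof scope.SpeechSynthesisUtterance === "function"
  );
}

if (supportsSpeechSynthesis()) {
  console.log("Speech synthesis is available.");
} else {
  console.log("Speech synthesis is not available; fall back to plain text.");
}
```

Passing the scope as a parameter is only a convenience for testing the check with a stand-in object; in page code the default is normally enough.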

II. Basic Implementation: Starting from Scratch

1. Create an utterance

const utterance = new SpeechSynthesisUtterance();

The SpeechSynthesisUtterance object is the core of speech synthesis; its properties control the speech output.

2. Set the text

utterance.text = "Hello, this is a text-to-speech example.";

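The text property holds the text to synthesize. Multiple languages are supported, provided the browser has a voice for the language.

3. Configure speech parameters

// Speaking rate (0.1 to 10, default 1)
utterance.rate = 1.0;
// Pitch (0 to 2, default 1)
utterance.pitch = 1.0;
// Volume (0 to 1, default 1)
utterance.volume = 1.0;

Adjusting rate, pitch, and volume controls the speaking speed, pitch, and loudness.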
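
When these values come from user input such as sliders or stored settings, browsers may handle out-of-range numbers differently, so clamping them first keeps behavior predictable. The sketch below uses the ranges listed above; `SPEECH_RANGES` and `clampSpeechParams` are hypothetical names introduced for illustration.

```javascript
// Ranges and defaults for the three numeric utterance properties.
const SPEECH_RANGES = {
  rate: { min: 0.1, max: 10, fallback: 1 },
  pitch: { min: 0, max: 2, fallback: 1 },
  volume: { min: 0, max: 1, fallback: 1 },
};

// Hypothetical helper: returns a new object with rate, pitch, and volume
// clamped into range; missing or non-numeric values use the default.
function clampSpeechParams(params = {}) {
  const result = {};
  for (const [key, { min, max, fallback }] of Object.entries(SPEECH_RANGES)) {
    const raw = params[key];
    const value = typeof raw === "number" && Number.isFinite(raw) ? raw : fallback;
    result[key] = Math.min(max, Math.max(min, value));
  }
  return result;
}

console.log(clampSpeechParams({ rate: 20, pitch: -1 }));
// { rate: 10, pitch: 0, volume: 1 }
```

In a browser, the result can be copied onto an utterance with `Object.assign(utterance, clampSpeechParams(settings))`.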

4. Speak the utterance

speechSynthesis.speak(utterance);

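Calling speechSynthesis.speak() adds the utterance to the synthesis queue. If nothing else is queued, playback starts right away; otherwise it starts after the earlier utterances finish.

Complete example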

function speakText(text) {
  const utterance = new SpeechSynthesisUtterance();
  utterance.text = text;
  utterance.rate = 1.0;
  utterance.pitch = 1.0;
  utterance.volume = 1.0;
  speechSynthesis.speak(utterance);
}
// Example call
speakText("Welcome to JavaScript text-to-speech.");

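III. Advanced Features: Voice Selection and Event Handling

1. List the available voices

Different browsers and operating systems ship different voices. Retrieve them with speechSynthesis.getVoices():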

function listAvailableVoices() {
  const voices = speechSynthesis.getVoices();
  voices.forEach(voice => {
    console.log(`Name: ${voice.name}, Lang: ${voice.lang}, Default: ${voice.default}`);
  });
}
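// The first call may return an empty array, so also listen for the voiceschanged event
speechSynthesis.onvoiceschanged = listAvailableVoices;
listAvailableVoices(); // Try immediately (some browsers return the list right away)

2. Select a specific voice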

function speakWithVoice(text, voiceName) {
  const voices = speechSynthesis.getVoices();
  const voice = voices.find(v => v.name === voiceName);
  if (voice) {
    const utterance = new SpeechSynthesisUtterance(text);
    utterance.voice = voice;
    speechSynthesis.speak(utterance);
  } else {
    console.error("Voice not found.");
  }
}
// Example: use an English female voice (requires a browser that provides it)
speakWithVoice("Hello world!", "Google US English");
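
Since a voice name that exists in one browser may be absent in another, a fallback chain is often more robust than an exact-name lookup alone. The sketch below is one possible ordering (name, then language, then the default voice); `pickVoice` is a hypothetical helper, and plain objects stand in for SpeechSynthesisVoice instances so the logic can be tried outside a browser.

```javascript
// Hypothetical helper: chooses a voice from `voices` (the array returned by
// speechSynthesis.getVoices()) using a fallback chain:
// exact name -> matching language prefix -> voice marked default -> null.
function pickVoice(voices, { name, lang } = {}) {
  const byName = name ? voices.find(v => v.name === name) : undefined;
  if (byName) return byName;
  const byLang = lang
    ? voices.find(v => v.lang.toLowerCase().startsWith(lang.toLowerCase()))
    : undefined;
  if (byLang) return byLang;
  return voices.find(v => v.default) || null;
}

// Plain objects stand in for SpeechSynthesisVoice instances here.
const sampleVoices = [
  { name: "Voice A", lang: "de-DE", default: true },
  { name: "Voice B", lang: "en-GB", default: false },
];
console.log(pickVoice(sampleVoices, { name: "Google US English", lang: "en" }).name);
// Voice B
```

If the helper returns null, leaving utterance.voice unset lets the browser use its own default.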

3. Event handling

const utterance = new SpeechSynthesisUtterance("Event demo");
// Fired when speech starts
utterance.onstart = () => console.log("Speech started.");
// Fired when speech ends
utterance.onend = () => console.log("Speech ended.");
// Fired when an error occurs
utterance.onerror = (event) => console.error("Error:", event.error);
speechSynthesis.speak(utterance);
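
The end and error events also make it straightforward to wrap speech in a Promise, which helps when later code has to wait for speech to finish. The following is a sketch under stated assumptions: `speakAsync` is a hypothetical helper, and its `synth` and `Utterance` options exist only so the function can be exercised with stand-ins outside a browser.

```javascript
// Hypothetical helper: speaks `text` and returns a Promise that resolves when
// the utterance ends and rejects if the engine reports an error.
// `synth` and `Utterance` default to the browser globals.
function speakAsync(text, {
  synth = globalThis.speechSynthesis,
  Utterance = globalThis.SpeechSynthesisUtterance,
} = {}) {
  return new Promise((resolve, reject) => {
    if (!synth || !Utterance) {
      reject(new Error("Speech synthesis is not available."));
      return;
    }
    const utterance = new Utterance(text);
    utterance.onend = () => resolve();
    utterance.onerror = (event) => reject(new Error(event.error));
    synth.speak(utterance);
  });
}

// Usage in a browser, inside an async function:
//   await speakAsync("First sentence.");
//   await speakAsync("Second sentence.");
```

Cancelling speech may surface as an error event in some browsers, so callers may want to treat that case separately from real failures.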

4. Pause, resume, and cancel

// Pause speech
function pauseSpeech() {
  speechSynthesis.pause();
}
// Resume paused speech
function resumeSpeech() {
  speechSynthesis.resume();
}
// Cancel all speech and clear the queue
function cancelSpeech() {
  speechSynthesis.cancel();
}
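
For a single play/pause button, the speaking and paused properties of speechSynthesis tell which of these calls applies. The sketch below assumes such a button exists; `togglePlayback` is a hypothetical helper, and the `synth` parameter lets it run against a stand-in object.

```javascript
// Hypothetical helper for a single play/pause button.
// `synth` defaults to the browser's speechSynthesis object.
// Returns the action taken: "idle", "paused", or "resumed".
function togglePlayback(synth = globalThis.speechSynthesis) {
  // `speaking` stays true while paused, so check it first.
  if (!synth || !synth.speaking) return "idle";
  if (synth.paused) {
    synth.resume();
    return "resumed";
  }
  synth.pause();
  return "paused";
}
```

The return value can drive the button label, for example showing "Resume" after the function reports "paused".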

IV. Practical Use Cases

1. Accessibility

Read page content aloud for visually impaired users:

document.querySelectorAll("p").forEach(paragraph => {
  paragraph.addEventListener("click", () => {
    const utterance = new SpeechSynthesisUtterance(paragraph.textContent);
    speechSynthesis.speak(utterance);
  });
});

2. Voice guidance

In a single-page application (SPA), prompt the user with spoken instructions:

function navigateWithVoice(step) {
  const steps = {
    1: "Welcome to the dashboard. Please select an option.",
    2: "You've chosen settings. Proceeding...",
  };
  const utterance = new SpeechSynthesisUtterance(steps[step] || "Unknown step.");
  speechSynthesis.speak(utterance);
}

3. Multi-language support

Switch voices according to the user's language setting:

function speakInLanguage(text, lang) {
  const utterance = new SpeechSynthesisUtterance(text);
  // Set the language tag so the browser can still choose a voice if none is found below
  utterance.lang = lang;
  const voice = speechSynthesis.getVoices().find(v => v.lang.startsWith(lang));
  if (voice) {
    utterance.voice = voice;
  }
  speechSynthesis.speak(utterance);
}
// Example: read aloud in French
speakInLanguage("Bonjour!", "fr");
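
When a page offers a language picker, it can help to group the voice list by language first. The sketch below is one way to do that; `groupVoicesByLanguage` is a hypothetical helper, and accepting an underscore separator (such as "en_GB") is a defensive assumption about how some platforms may report language tags.

```javascript
// Hypothetical helper: groups voices by primary language subtag,
// so "en-US" and "en-GB" both land under "en".
function groupVoicesByLanguage(voices) {
  const groups = new Map();
  for (const voice of voices) {
    // Defensive assumption: accept "en_US" as well as "en-US".
    const primary = voice.lang.replace("_", "-").split("-")[0].toLowerCase();
    if (!groups.has(primary)) groups.set(primary, []);
    groups.get(primary).push(voice);
  }
  return groups;
}

// Plain objects stand in for SpeechSynthesisVoice instances here.
const groupedDemo = groupVoicesByLanguage([
  { name: "A", lang: "en-US" },
  { name: "B", lang: "en_GB" },
  { name: "C", lang: "fr-FR" },
]);
console.log([...groupedDemo.keys()]); // [ 'en', 'fr' ]
```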

V. Caveats and Optimization Tips

1. Voices load asynchronously

The first call to getVoices() may return an empty array, so listen for the voiceschanged event:

let voicesLoaded = false;
function initVoices() {
  const voices = speechSynthesis.getVoices();
  if (voices.length > 0 && !voicesLoaded) {
    voicesLoaded = true;
    console.log("Voices loaded:", voices);
  }
}
speechSynthesis.onvoiceschanged = initVoices;
initVoices(); // Try immediately
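
The same pattern can be packaged as a Promise so callers simply wait for the list. This is a sketch rather than a definitive implementation: `loadVoices` and its `timeoutMs` parameter are hypothetical, it assumes the synthesis object supports addEventListener, and the timeout is there for browsers that may never fire voiceschanged.

```javascript
// Hypothetical helper: resolves with the voice list, waiting for the
// voiceschanged event if the list is still empty. After `timeoutMs` it
// resolves with whatever getVoices() returns at that point.
function loadVoices(synth = globalThis.speechSynthesis, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const initial = synth.getVoices();
    if (initial.length > 0) {
      resolve(initial);
      return;
    }
    const finish = () => {
      clearTimeout(timer);
      synth.removeEventListener("voiceschanged", finish);
      resolve(synth.getVoices());
    };
    const timer = setTimeout(finish, timeoutMs);
    synth.addEventListener("voiceschanged", finish);
  });
}

// Usage in a browser:
//   loadVoices().then(voices => console.log(voices.length, "voices available"));
```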

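2. Mobile compatibility

Some mobile browsers may restrict background speech playback, so make sure speech is triggered by a user interaction such as a click: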

document.getElementById("speakButton").addEventListener("click", () => {
  speakText("Triggered by user interaction.");
});

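3. Performance

Avoid creating SpeechSynthesisUtterance objects more often than needed; an instance can be reused once it has finished speaking.
Split long text into chunks. Synthesis does not block the UI thread, but some browsers have been reported to cut off very long utterances, and shorter chunks are easier to pause or cancel.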

function speakLongText(text, chunkSize = 100) {
  const chunks = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    // slice() replaces the deprecated substr()
    chunks.push(text.slice(i, i + chunkSize));
  }
  // speak() appends to a queue, so the chunks play in order without timers
  chunks.forEach(chunk => {
    speechSynthesis.speak(new SpeechSynthesisUtterance(chunk));
  });
}
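
Fixed-size chunks can break in the middle of a word or sentence, which tends to sound unnatural. A possible refinement is to split at sentence boundaries instead; the sketch below is one such approach, with `splitIntoChunks` as a hypothetical helper and a deliberately simple punctuation rule.

```javascript
// Hypothetical helper: splits `text` into chunks of at most `maxLen`
// characters, preferring to break after sentence-ending punctuation.
// A single sentence longer than maxLen is hard-split.
function splitIntoChunks(text, maxLen = 200) {
  // A sentence is a run of text plus any trailing ., !, ? (or the
  // full-width forms) and whitespace; unpunctuated trailing text is kept.
  const sentences = text.match(/[^.!?。！？]+[.!?。！？]*\s*/g) || [];
  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    if (current && current.length + sentence.length > maxLen) {
      chunks.push(current.trim());
      current = "";
    }
    let rest = sentence;
    while (rest.length > maxLen) {
      const piece = rest.slice(0, maxLen).trim();
      if (piece) chunks.push(piece);
      rest = rest.slice(maxLen);
    }
    current += rest;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}

console.log(splitIntoChunks("One. Two. Three.", 10));
// [ 'One. Two.', 'Three.' ]
```

In a browser, each chunk can then be queued in order with `speechSynthesis.speak(new SpeechSynthesisUtterance(chunk))`.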

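VI. Summary

With the native Web Speech API, developers can add text-to-speech in JavaScript without any external packages or plugins. This article walked through the full path from a basic implementation to finer control, including speech parameters, voice selection, event handling, and practical use cases. In real projects, pay attention to browser compatibility, asynchronous voice loading, and mobile restrictions to keep the feature stable and the user experience good.

Key takeaways

Native support: uses the browser's built-in Web Speech API, with no extra dependencies.
Flexible control: adjust rate, pitch, and volume through SpeechSynthesisUtterance properties.
Multi-language support: choose a suitable voice based on its lang property.
Event-driven: listen for onstart, onend, and onerror for fine-grained control.
Practical uses: suited to accessibility, voice guidance, and multi-language scenarios.

With these techniques, developers can add speech to web applications efficiently and improve both user experience and accessibility.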
