7 Recommended Open-Source TTS (Text-to-Speech) Systems

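Foreword: In television products, TTS can help blind and visually impaired people who cannot use the visual interface of a TV set. Europe and the United States have already begun drawing up implementation standards and the accompanying regulations.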

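TTS (Text To Speech) is one application of speech synthesis: it converts files stored on a computer, such as help files or web pages, into natural speech output. TTS can help visually impaired people read information on a computer, or simply improve the readability of text documents. TTS is often used together with speech recognition programs.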

This article introduces 7 open-source TTS systems that you can use for learning or in your own projects.

1. MARY - Text-to-Speech System

MARY is a multilingual text-to-speech platform written in Java. It supports German, British English, American English, Telugu, Turkish, and Russian.

MaryTTS (the MARY Text-to-Speech System) is an open-source, multilingual Text-to-Speech Synthesis platform written in Java. It was originally developed as a collaborative project of DFKI’s Language Technology Lab and the Institute of Phonetics at Saarland University. It is now maintained by the Multimodal Speech Processing Group in the Cluster of Excellence MMCI and DFKI.

As of version 5.2, MaryTTS supports German, British and American English, French, Italian, Luxembourgish, Russian, Swedish, Telugu, and Turkish; more languages are in preparation. MaryTTS comes with toolkits for quickly adding support for new languages and for building unit selection and HMM-based synthesis voices.
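As a rough illustration of how a client might ask MaryTTS for audio, the sketch below builds a request URL for a MaryTTS server. It assumes a server is already running locally on port 59125 and exposes a `/process` endpoint with the parameters shown; none of this is stated in the article, and the helper name `marytts_process_url` is hypothetical.

```python
from urllib.parse import urlencode


def marytts_process_url(text, locale="en_US", host="localhost", port=59125):
    """Build a synthesis request URL for a MaryTTS server.

    Assumptions (not from the article): the server listens on `port`
    and accepts these query parameters on /process.
    """
    params = {
        "INPUT_TEXT": text,
        "INPUT_TYPE": "TEXT",
        "OUTPUT_TYPE": "AUDIO",
        "AUDIO": "WAVE_FILE",
        "LOCALE": locale,
    }
    # urlencode percent-escapes the text so it is safe inside a query string.
    return f"http://{host}:{port}/process?{urlencode(params)}"


if __name__ == "__main__":
    print(marytts_process_url("Hello world"))
```

If such a server is available, the resulting URL could be fetched with `urllib.request.urlopen` and the response bytes saved as a WAV file.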

2. SpeakRight Framework - Helps to build Speech Recognition Applications

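SpeakRight is a Java framework for writing speech recognition applications, based on VoiceXML technology. It uses the StringTemplate template engine to generate VoiceXML documents automatically.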
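To make the idea of template-generated VoiceXML concrete, here is a minimal sketch that fills a fixed VoiceXML skeleton with a prompt. It uses Python's standard `string.Template` as a stand-in for StringTemplate and does not reflect SpeakRight's actual API; the names `VXML_TEMPLATE` and `render_prompt_page` are hypothetical.

```python
from string import Template
from xml.sax.saxutils import escape

# A minimal VoiceXML page: one form with a single spoken prompt.
VXML_TEMPLATE = Template(
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">\n'
    '  <form id="$form_id">\n'
    '    <block>\n'
    '      <prompt>$prompt</prompt>\n'
    '    </block>\n'
    '  </form>\n'
    '</vxml>\n'
)


def render_prompt_page(form_id, prompt):
    """Return a VoiceXML document that speaks `prompt`."""
    return VXML_TEMPLATE.substitute(
        # Escape XML special characters so user text cannot break the markup.
        form_id=escape(form_id, {'"': "&quot;"}),
        prompt=escape(prompt),
    )


if __name__ == "__main__":
    print(render_prompt_page("welcome", "Welcome. Press 1 & wait."))
```

Keeping the markup in a template and the dialogue text in code is the design choice being illustrated: the application logic never has to assemble XML by hand.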

3. Festival - Speech Synthesis System

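Festival provides a general framework for building speech synthesis systems and includes examples of various modules. It offers a complete text-to-speech API, runs natively on Mac OS, and supports languages including English and Spanish.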

Festival offers a general framework for building speech synthesis systems as well as including examples of various modules. As a whole it offers full text to speech through a number of APIs: from shell level, through a Scheme command interpreter, as a C++ library and an Emacs interface. Festival is multi-lingual (currently English (British and American), and Spanish) though English is the most advanced. Other groups release new languages for the system. Full tools and documentation for building new voices are available through Carnegie Mellon's FestVox project (http://festvox.org).

The system is written in C++ and uses the Edinburgh Speech Tools Library for low level architecture and has a Scheme (SIOD) based command interpreter for control. Documentation is given in the FSF texinfo format, which can generate a printed manual, info files and HTML.
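Two of the access routes mentioned above, the shell level and the Scheme interpreter, can be sketched from a script as follows. This assumes the `festival` binary is installed and on the PATH; the helper names `scheme_say_text` and `speak_with_festival` are hypothetical, and the sketch is not part of Festival itself.

```python
import shutil
import subprocess


def scheme_say_text(text):
    """Build a Scheme expression that asks Festival to speak `text`."""
    # Escape backslashes first, then double quotes, for a Scheme string literal.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'(SayText "{escaped}")'


def speak_with_festival(text):
    """Speak `text` via the shell-level interface; return False if unavailable."""
    if shutil.which("festival") is None:
        return False  # Festival is not installed on this machine.
    # `festival --tts` reads plain text from standard input and speaks it.
    subprocess.run(["festival", "--tts"], input=text.encode("utf-8"), check=True)
    return True


if __name__ == "__main__":
    print(scheme_say_text("Hello from Festival"))
    speak_with_festival("Hello from Festival")
```

The expression returned by `scheme_say_text` is the kind of command that can be typed at Festival's interactive prompt.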

Festival is free software. Festival and the speech tools are distributed under an X11-type licence allowing unrestricted commercial and non-commercial use alike.

This distribution includes:

4. FreeTTS - Speech Synthesizer in Java

FreeTTS is a speech synthesis system written entirely in Java. It is based on Flite, a small speech synthesis engine developed at Carnegie Mellon University.

5. Festvox - Builds New Synthetic Voices

The Festvox project aims to make building new synthetic voices more systematic. Festvox serves as the basis for many speech synthesis libraries.

6. eSpeak - Text to Speech

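eSpeak is a small, open-source speech synthesis system that supports many languages. It uses a formant synthesis method, which keeps the language files it ships with very small. The system supports SAPI5 on Windows, so it can be used with screen readers and other programs that support the Windows SAPI5 interface. eSpeak can also convert text into phoneme codes, so it can serve as a front end for another speech synthesis engine.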
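The front-end use case (text in, phoneme codes out) might look like the following sketch, which calls the `espeak` command-line tool. It assumes `espeak` is installed and on the PATH; the helper names `espeak_phoneme_command` and `text_to_phonemes` are hypothetical.

```python
import shutil
import subprocess


def espeak_phoneme_command(text, voice="en"):
    """Build the argument list for a phoneme-only eSpeak run."""
    # -q: produce no audio, -x: print phoneme mnemonics, -v: select the voice.
    return ["espeak", "-q", "-x", "-v", voice, text]


def text_to_phonemes(text, voice="en"):
    """Return eSpeak's phoneme mnemonics for `text`, or None if unavailable."""
    if shutil.which("espeak") is None:
        return None  # eSpeak is not installed on this machine.
    result = subprocess.run(
        espeak_phoneme_command(text, voice),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(text_to_phonemes("hello world"))
```

The returned string could then be handed to a different synthesis back end in place of raw text.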

eSpeak is a compact open source software speech synthesizer for English and other languages, for Linux and Windows (http://espeak.sourceforge.net). eSpeak uses a "formant synthesis" method. This allows many languages to be provided in a small size. The speech is clear, and can be used at high speeds, but is not as natural or smooth as larger synthesizers which are based on human speech recordings.
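To give a feel for why formant synthesis needs so little data, the toy sketch below produces a vowel-like tone from a handful of numbers: a pitch and three resonance peaks. It is a simplified additive approximation (harmonics weighted by resonance curves), not eSpeak's actual algorithm, and the formant frequencies and bandwidths are illustrative assumptions for an "ah"-like sound.

```python
import math
import struct
import wave

SAMPLE_RATE = 16000
# (center frequency in Hz, width in Hz) per formant; illustrative values only.
FORMANTS_AH = [(730.0, 90.0), (1090.0, 110.0), (2440.0, 170.0)]


def harmonic_gain(freq, formants):
    """Sum of simple resonance peaks: large near a formant, small elsewhere."""
    return sum(1.0 / (1.0 + ((freq - fc) / bw) ** 2) for fc, bw in formants)


def synthesize_vowel(f0=120.0, duration=0.5, formants=FORMANTS_AH):
    """Return samples in [-1, 1] for a steady vowel-like tone at pitch f0."""
    harmonics = []
    k = 1
    while k * f0 < SAMPLE_RATE / 2:  # stay below the Nyquist frequency
        harmonics.append((k * f0, harmonic_gain(k * f0, formants)))
        k += 1
    total = sum(gain for _, gain in harmonics)  # normalizes peak amplitude
    samples = []
    for i in range(round(SAMPLE_RATE * duration)):
        t = i / SAMPLE_RATE
        value = sum(g * math.sin(2 * math.pi * f * t) for f, g in harmonics)
        samples.append(value / total)
    return samples


def write_wav(path, samples):
    """Write mono 16-bit PCM audio to `path`."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        frames = b"".join(struct.pack("<h", int(s * 32767)) for s in samples)
        wav.writeframes(frames)


if __name__ == "__main__":
    write_wav("vowel_ah.wav", synthesize_vowel())
```

Changing only the three formant pairs changes the perceived vowel, which is the intuition behind describing a whole language with a small set of parameters rather than recorded speech.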

eSpeak is available as:

7. Flite - Fast Run time Synthesis Engine

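Flite is a small, fast TTS system. It is a lightweight engine written in C, derived from the well-known Festival speech synthesis system, and is suitable for embedded systems.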

Original English source: http://www.findbestopensource.com/tagged/text-to-speech

Flite (festival-lite) is a small, fast run-time synthesis engine developed at CMU and primarily designed for small embedded machines and/or large servers. Flite is designed as an alternative synthesis engine to Festival for voices built using the FestVox suite of voice building tools. Flite 1.4-release is now released as source. Flite offers:

Several TTS websites verified to be accessible:

Other references:

Architecture Walkthrough