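Introduction to Tencent Cloud Streaming TTS
Integration documentation: https://cloud.tencent.com/document/api/441/19499
The interface takes its request parameters as JSON and does not yet support Cloud API 3.0 authentication. Responses use HTTP chunked transfer encoding, and the audio is returned either as Opus-compressed fragments or as a raw PCM stream. Starting with authentication, this article walks through a client implementation of streaming TTS in detail.
API Authentication
1. Build the JSON request parameters. A TreeMap stores them so that they stay sorted by key: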
mRequestMap.put("Action", "TextToStreamAudio");
mRequestMap.put("Text", text);
mRequestMap.put("SessionId", "session-1234");
mRequestMap.put("AppId", "1255824371");
mRequestMap.put("Timestamp", "" + System.currentTimeMillis() / 1000L);
mRequestMap.put("Expired", "" + (System.currentTimeMillis() / 1000L + 600));
mRequestMap.put("Speed", "0");
mRequestMap.put("SecretId", SECRET_ID);
mRequestMap.put("VoiceType", 0 + "");
mRequestBody = (new JSONObject(mRequestMap)).toString();
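2. Generate the signature string: concatenate the parameters as required, then sign the result with HMAC-SHA1 and Base64-encode it. Read the authentication documentation carefully here, because this step is easy to get wrong.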
private static String generateSign(TreeMap<String, String> params) {
    // Signature source string: "POST" + domain + "?" + key=value pairs in sorted key order
    String paramStr = "POST" + DOMAIN_NAME + "?";
    StringBuilder builder = new StringBuilder(paramStr);
    for (Map.Entry<String, String> entry : params.entrySet()) {
        builder.append(String.format(Locale.CHINESE, "%s=%s", entry.getKey(), String.valueOf(entry.getValue())))
                .append("&");
    }
    // Remove the trailing '&'
    builder.deleteCharAt(builder.lastIndexOf("&"));
    String sign = "";
    String source = builder.toString();
    System.out.println(source);
    try {
        Mac mac = Mac.getInstance("HmacSHA1");
        // Use an explicit charset so the result does not depend on the platform default
        SecretKeySpec keySpec = new SecretKeySpec(SECRET_KEY.getBytes(StandardCharsets.UTF_8), "HmacSHA1");
        mac.init(keySpec);
        mac.update(source.getBytes(StandardCharsets.UTF_8));
        // android.util.Base64; NO_WRAP (value 2) keeps the output on a single line
        sign = Base64.encodeToString(mac.doFinal(), Base64.NO_WRAP);
    } catch (NoSuchAlgorithmException | InvalidKeyException e) {
        e.printStackTrace();
    }
    System.out.println("Generated signature: " + sign);
    return sign;
}
At this point we have a complete signature string. Next comes the core of this article: the network request and the parsing of the response.
Chunked Transfer Encoding
Tencent Cloud returns the result with HTTP chunked transfer encoding. Unlike an ordinary HTTP response such as a single JSON body, the data arrives in multiple fragments. The message body consists of an unspecified number of chunks and ends with a final chunk of size 0.
Each non-empty chunk starts with the number of data bytes it contains, written in hexadecimal, followed by a CRLF (carriage return and line feed), then the data itself, and finally a closing CRLF. Some implementations pad the gap between the chunk size and the CRLF with spaces (0x20).
The last chunk is a single line consisting of the chunk size (0), optional padding spaces, and a CRLF. It carries no data, but optional trailers containing header fields may follow it.
The message ends with a final CRLF. A complete chunked response looks like this:
HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked

25
This is the data in the first chunk

1C
and this is the second one

3
con
8
sequence
0

For a complete description of the chunked protocol, see the Wikipedia article on chunked transfer encoding.
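To make the framing rules above concrete, here is a minimal sketch of a decoder for a chunked message body, written for illustration only. The names `ChunkedBodySketch`, `decode`, and `decodeToString` are made up for this example and do not come from any Tencent Cloud SDK. The sketch assumes the HTTP headers have already been consumed and ignores the content of trailers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Illustrative decoder for an HTTP chunked message body (hypothetical helper). */
final class ChunkedBodySketch {

    /** Reads one CRLF-terminated line and returns it without the terminator. */
    private static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\r') {
                if (in.read() != '\n') {
                    throw new IOException("CR not followed by LF");
                }
                return sb.toString();
            }
            sb.append((char) b);
        }
        throw new EOFException("stream ended inside a line");
    }

    /** Decodes all chunks and returns the concatenated payload bytes. */
    static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            // Drop optional chunk extensions (";name=value") and padding spaces.
            int semi = sizeLine.indexOf(';');
            if (semi >= 0) {
                sizeLine = sizeLine.substring(0, semi);
            }
            int size = Integer.parseInt(sizeLine.trim(), 16); // size is hexadecimal
            if (size == 0) {
                break; // the zero-size chunk ends the body
            }
            byte[] chunk = new byte[size];
            int off = 0;
            while (off < size) {
                int n = in.read(chunk, off, size - off);
                if (n == -1) {
                    throw new EOFException("stream ended inside a chunk");
                }
                off += n;
            }
            body.write(chunk, 0, size);
            if (!readLine(in).isEmpty()) {
                throw new IOException("missing CRLF after chunk data");
            }
        }
        // Skip optional trailer fields up to the final empty line.
        while (!readLine(in).isEmpty()) {
            // trailer content is ignored in this sketch
        }
        return body.toByteArray();
    }

    /** Convenience wrapper for text bodies; wraps IOException for easy experimentation. */
    static String decodeToString(String raw) {
        try {
            byte[] bytes = raw.getBytes(StandardCharsets.ISO_8859_1);
            return new String(decode(new ByteArrayInputStream(bytes)), StandardCharsets.ISO_8859_1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Fed the body of the example response, the decoder yields the two text lines followed by `consequence`, because the chunks `con` and `sequence` are simply concatenated. The client code in this article does not need such a step: `HttpURLConnection` removes the chunk framing itself and hands over the already decoded stream.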
Requesting TTS Data
The code is shown below. We take the response input stream directly and use it to read the data.
private static InputStream obtainResponseStreamWithJava(String postJsonBody, TreeMap<String, String> requestMap) throws IOException {
    // Send the POST request
    URL url = new URL(SERVER_URL);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    String authorization = generateSign(requestMap);
    conn.setRequestMethod("POST");
    // Required before writing a request body; without it getOutputStream() throws
    conn.setDoOutput(true);
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setRequestProperty("Authorization", authorization);
    conn.connect();
    OutputStream out = conn.getOutputStream();
    out.write(postJsonBody.getBytes("UTF-8"));
    out.flush();
    out.close();
    if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) { // TODO: handle error responses
        Log.w(TAG, "HTTP Code: " + conn.getResponseCode());
    }
    // Return the body stream as-is; HttpURLConnection has already removed the chunk framing
    InputStream inputStream = conn.getInputStream();
    return inputStream;
}
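The Authorization header set in the request above carries the signature produced by generateSign. Since that method depends on `android.util.Base64`, it cannot be run on a desktop JVM as written. The following is a minimal sketch for checking the signing logic off-device, assuming the same source-string layout as in the article. `SignSketch`, `buildSource`, `hmacSha1Base64`, the host `example.invalid/stream`, and the key `placeholder-key` are all placeholders invented for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.InvalidKeyException;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.Map;
import java.util.TreeMap;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** JVM-only sketch of the signing steps (hypothetical helper, not an SDK class). */
final class SignSketch {

    /** Builds method + hostAndPath + "?" + key=value pairs joined by '&' in sorted key order. */
    static String buildSource(String method, String hostAndPath, TreeMap<String, String> params) {
        StringBuilder sb = new StringBuilder(method).append(hostAndPath).append('?');
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) { // TreeMap iterates in key order
            if (!first) {
                sb.append('&');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
            first = false;
        }
        return sb.toString();
    }

    /** HMAC-SHA1 over the source string, Base64-encoded without line breaks. */
    static String hmacSha1Base64(String secretKey, String source) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secretKey.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
            byte[] digest = mac.doFinal(source.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException | InvalidKeyException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TreeMap<String, String> params = new TreeMap<>();
        params.put("Text", "hello");
        params.put("Action", "TextToStreamAudio");
        params.put("AppId", "1255824371");
        // Placeholder host and key, for illustration only.
        String source = buildSource("POST", "example.invalid/stream", params);
        System.out.println(source);
        System.out.println(hmacSha1Base64("placeholder-key", source));
    }
}
```

Printing the source string before signing makes it easy to compare against the format required by the authentication documentation, which is where most signing mistakes show up.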
OPUS
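According to the official documentation, the audio comes in two forms: Opus-compressed data and a raw PCM stream. From what I have learned, Opus offers a good compression ratio (about 10:1), which saves a lot of transfer time and network bandwidth.
Opus is an open-source library, but it is written in native code (C). Because Android only supports Opus playback from 5.0 onward, supporting systems older than 5.0 requires compiling the native .so library yourself. The Opus source code is publicly available.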
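As a rough illustration of what the 10:1 figure means, the following sketch computes the raw PCM data rate for 16 kHz, 16-bit, mono audio (the format the player in this article is configured for) and the rate after an assumed 10:1 compression. `BandwidthSketch` and `pcmBytesPerSecond` are hypothetical names, and the ratio is the approximate value quoted above, not a measured one.

```java
/** Back-of-the-envelope data rates (hypothetical helper). */
final class BandwidthSketch {

    /** Bytes per second of uncompressed PCM audio. */
    static int pcmBytesPerSecond(int sampleRateHz, int bitsPerSample, int channels) {
        return sampleRateHz * (bitsPerSample / 8) * channels;
    }

    public static void main(String[] args) {
        int pcm = pcmBytesPerSecond(16000, 16, 1); // 16 kHz, 16-bit, mono
        int compressed = pcm / 10;                 // assumed 10:1 ratio
        System.out.println("raw PCM:    " + pcm + " bytes/s");
        System.out.println("compressed: " + compressed + " bytes/s");
    }
}
```

Raw PCM in this format needs 32000 bytes per second (256 kbit/s), so a 10:1 ratio brings the stream down to roughly 3200 bytes per second.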
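Parsing TTS Data
This part mainly follows the Java sample on the official site. Each packet consists of a head, a sequence number, a length, and the audio data; the loop reads these fields over and over until it reaches the end of the data.
Sample code: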
private void processProtocolBufferStream(final InputStream inputStream) throws DeserializationException {
    final long start = System.currentTimeMillis();
    YoutuOpusDecoder decoder = null;
    List<PcmData> pcmCache = new ArrayList<>();
    boolean fillSuccess;
    int pbPkgCount = -1;
    while (!Thread.currentThread().isInterrupted()) {
        pbPkgCount++;
        try {
            // read head
            byte[] headBuffer = new byte[4];
            fillSuccess = fill(inputStream, headBuffer);
            if (!fillSuccess) {
                throw new ReadBufferException(String.format("read PB pkg#%s head fail, break;", pbPkgCount));
            }
            // read seq
            byte[] seqBuffer = new byte[4];
            fillSuccess = fill(inputStream, seqBuffer);
            if (!fillSuccess) {
                throw new ReadBufferException(String.format("read PB pkg#%s seq fail, break;", pbPkgCount));
            }
            int seq = bytesToInt(seqBuffer);
            // read pkg size
            byte[] pbPkgSizeHeader = new byte[4];
            fillSuccess = fill(inputStream, pbPkgSizeHeader);
            if (!fillSuccess) {
                throw new ReadBufferException(String.format("read PB pkg#%s size header fail, break;", pbPkgCount));
            }
            int pbPkgSize = bytesToInt(pbPkgSizeHeader);
            Log.i(TAG, String.format("PB pkg#%s size = %s", pbPkgCount, pbPkgSize));
            if (pbPkgCount == 0) {
                // time until the first packet arrives
                sTimeEnd = System.currentTimeMillis();
                sTimeCost = sTimeEnd - sTimeStart;
            }
            if (pbPkgSize <= 0) {
                throw new ReadBufferException(String.format("PB pkg#%s size %s <= 0, break;", pbPkgCount, pbPkgSize));
            } else if (pbPkgSize > 5000) {
                throw new ReadBufferException(String.format("PB pkg#%s size %s > 5000 bytes, too large, break;", pbPkgCount, pbPkgSize));
            }
            // read pb pkg
            byte[] pbPkg = new byte[pbPkgSize];
            fillSuccess = fill(inputStream, pbPkg);
            if (!fillSuccess) {
                throw new ReadBufferException(String.format("read PB pkg#%s fail, break;", pbPkgCount));
            }
            // init decoder
            if (decoder == null) {
                decoder = new YoutuOpusDecoder();
                decoder.config();
            }
            // decode
            Log.i("DEBUG-1", "seq:" + seq);
            Pair<Integer, short[]> pair = decoder.decodeTTSData(seq, pbPkg);
            short[] pcm = pair.second;
            Log.d(TAG, (pcm == null ? "fail decode #" : "decode #") + pbPkgCount);
            // packaging pcm; seq == -1 marks the last packet
            if (pcm == null) {
                pcm = new short[0];
            }
            PcmData pcmData = new PcmData(pcm, seq == -1);
            // stop check
            if (Thread.currentThread().isInterrupted()) {
                Log.w(TAG, "pcm data ready, but thread is interrupted, break;");
                break;
            }
            // init player
            if (mOpusPlayer == null) {
                mOpusPlayer = new OpusPlayer();
                mOpusPlayer.setPcmSampleRate(16000);
                mOpusPlayer.setUncaughtExceptionHandler(new UncaughtExceptionHandler() {
                    @Override
                    public void uncaughtException(Thread thread, Throwable ex) {
                        if (mTtsExceptionHandler != null) {
                            mTtsExceptionHandler.onPlayException(thread, ex);
                        }
                    }
                });
            }
            // enqueue
            if (pbPkgCount < mCacheCount) { // still buffering
                pcmCache.add(pcmData);
            } else { // flush the buffer, then enqueue the current packet
                for (PcmData d : pcmCache) {
                    mOpusPlayer.enqueue(d);
                }
                pcmCache.clear();
                mOpusPlayer.enqueue(pcmData);
            }
            // end
            if (seq == -1) {
                long ms = System.currentTimeMillis() - start;
                Log.d(TAG, "finish last pb pkg#" + pbPkgCount + ", total cost time " + ms + " ms");
                break;
            }
        } catch (Exception e) {
            if (mOpusPlayer != null) {
                mOpusPlayer.forceStop();
            }
            if (e instanceof InterruptedIOException) {
                // normal flow, no need to rethrow; leave the loop instead of reading again
                Log.i(TAG, "Interrupted while reading server response InputStream", e);
                break;
            } else {
                throw new DeserializationException(e);
            }
        }
    }
}
The fields are read in little-endian byte order. The helper that fills a buffer from the stream is as follows:
/**
 * Reads from the InputStream into buffer until buffer is full.
 *
 * @return false if the InputStream does not contain enough data to fill buffer.
 * @throws IOException if reading from the stream fails
 */
private static boolean fill(InputStream in, byte[] buffer) throws IOException {
    int length = buffer.length;
    int hasRead = 0;
    while (true) {
        int offset = hasRead;
        int count = length - hasRead;
        int currentRead = in.read(buffer, offset, count);
        if (currentRead >= 0) {
            hasRead += currentRead;
            if (hasRead == length) {
                return true;
            }
        }
        if (currentRead == -1) {
            return false;
        }
    }
}
TTS Audio Playback
After parsing, the data is decoded by YoutuOpusDecoder and played through the OpusPlayer class. It wraps two functions: playing raw PCM audio with AudioTrack, and continuously feeding the parsed audio into the player.
The complete code is as follows:
public class OpusPlayer {
    private static final String TAG = "OpusPlayer";
    private final BlockingQueue<PcmData> mPcmQueue = new LinkedBlockingQueue<>();
    private volatile Thread mPlayThread;
    private int mPcmSampleRate;
    private UncaughtExceptionHandler mUncaughtExceptionHandler;

    public void setUncaughtExceptionHandler(UncaughtExceptionHandler handler) {
        mUncaughtExceptionHandler = handler;
    }

    public void setPcmSampleRate(int pcmSampleRate) {
        mPcmSampleRate = pcmSampleRate;
    }

    public void enqueue(PcmData pcmData) {
        mPcmQueue.add(pcmData);
        if (mPlayThread == null) {
            mPlayThread = new Thread(new Runnable() {
                PcmPlayer mPlayer;

                @Override
                public void run() {
                    Log.d(TAG, getThreadLogPrefix() + "start");
                    int playerPrepareFailCount = 0;
                    int playCount = 0;
                    long start = System.currentTimeMillis();
                    while (!Thread.currentThread().isInterrupted()) {
                        // prepare the player
                        boolean isPlayerReady = preparePlayerIfNeeded();
                        if (!isPlayerReady) {
                            releasePlayer();
                            playerPrepareFailCount++;
                            if (playerPrepareFailCount > 5) {
                                releasePlayer();
                                throw new RuntimeException("prepare player fail too many times, abort."); // give up
                            } else {
                                Log.w(TAG, getThreadLogPrefix() + "prepare player fail, retry.");
                                continue; // retry
                            }
                        }
                        // dequeue (blocks until data is available)
                        PcmData pcmData;
                        try {
                            pcmData = mPcmQueue.take();
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                            Log.d(TAG, getThreadLogPrefix() + "force stop");
                            break;
                        }
                        // play
                        if (pcmData != null) {
                            try {
                                short[] pcm = pcmData.getPcm();
                                if (pcm != null) {
                                    mPlayer.play(pcm);
                                    Log.d(TAG, getThreadLogPrefix() + "play #" + playCount);
                                } else {
                                    Log.d(TAG, getThreadLogPrefix() + "play #" + playCount + " fail, pcm == null !!");
                                }
                                if (pcmData.isLastOne()) {
                                    Log.d(TAG, getThreadLogPrefix() + "finish all task, will stop");
                                    break;
                                }
                                playCount++;
                            } catch (AudioTrackException e) {
                                e.printStackTrace();
                                releasePlayer(); // the next iteration will try to re-initialize the player
                            }
                        } else {
                            Log.w(TAG, getThreadLogPrefix() + "mPcmQueue.take() == null, nothing to play");
                        }
                    }
                    releasePlayer();
                    long time = System.currentTimeMillis() - start;
                    Log.d(TAG, getThreadLogPrefix() + "stop, ran " + time + " ms");
                }

                /**
                 * @return true: player is ready
                 */
                boolean preparePlayerIfNeeded() {
                    if (mPlayer == null) {
                        mPlayer = new PcmPlayer();
                        try {
                            mPlayer.prepare(AudioManager.STREAM_MUSIC, mPcmSampleRate,
                                    AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT);
                        } catch (AudioTrackException e) {
                            e.printStackTrace();
                            releasePlayer();
                        }
                    }
                    return mPlayer != null;
                }

                void releasePlayer() {
                    if (mPlayer != null) {
                        mPlayer.release();
                        mPlayer = null;
                    }
                }
            });
            // Playback takes the longest; a slightly lower priority than the decoding thread
            // leaves more CPU time for decoding.
            mPlayThread.setPriority(Thread.NORM_PRIORITY - 1);
            mPlayThread.setName(TAG + ".mPlayThread");
            if (mUncaughtExceptionHandler != null) {
                mPlayThread.setUncaughtExceptionHandler(mUncaughtExceptionHandler);
            }
            mPlayThread.start();
        }
    }

    private static String getThreadLogPrefix() {
        Thread currentThread = Thread.currentThread();
        String s = currentThread.getName() + "#" + currentThread.getId() + ": ";
        return s;
    }

    public void forceStop() {
        if (mPlayThread != null && !mPlayThread.isInterrupted()) {
            mPlayThread.interrupt();
            mPlayThread = null;
        }
        mPcmQueue.clear();
    }

    public static class PcmData {
        private final short[] mPcm;
        private final boolean mIsLastOne;

        public PcmData(short[] pcm, boolean isLastOne) {
            mPcm = pcm;
            mIsLastOne = isLastOne;
        }

        short[] getPcm() {
            return mPcm;
        }

        boolean isLastOne() {
            return mIsLastOne;
        }
    }
}
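The parsing code earlier in this article calls a `bytesToInt` helper that is not shown. Below is a minimal sketch of how such a helper and a single-packet reader might look, assuming that each field is a 4-byte little-endian integer as described above and reusing the 5000-byte size limit from the sample. `FrameSketch`, `Frame`, `readFrame`, and `parse` are hypothetical names, and the head bytes in the usage are placeholders, since the article does not say what the head contains.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Sketch of reading one head/seq/length/data packet (hypothetical helper). */
final class FrameSketch {

    /** One parsed packet: sequence number plus the still-encoded audio payload. */
    static final class Frame {
        final int seq;
        final byte[] payload;

        Frame(int seq, byte[] payload) {
            this.seq = seq;
            this.payload = payload;
        }
    }

    /** Interprets the first 4 bytes as a little-endian signed 32-bit integer. */
    static int bytesToInt(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    /** Reads one packet; throws EOFException if the stream ends in the middle of it. */
    static Frame readFrame(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        byte[] head = new byte[4];
        byte[] seq = new byte[4];
        byte[] length = new byte[4];
        data.readFully(head);   // head content is not interpreted in this sketch
        data.readFully(seq);
        data.readFully(length);
        int size = bytesToInt(length);
        if (size <= 0 || size > 5000) { // same sanity limits as the sample above
            throw new IOException("unexpected payload size " + size);
        }
        byte[] payload = new byte[size];
        data.readFully(payload);
        return new Frame(bytesToInt(seq), payload);
    }

    /** Convenience wrapper for in-memory experiments. */
    static Frame parse(byte[] raw) {
        try {
            return readFrame(new ByteArrayInputStream(raw));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this byte order, the four bytes `FF FF FF FF` decode to -1, the sequence number that the sample treats as the end-of-stream marker. `DataInputStream.readFully` plays the same role here as the `fill` method, except that it signals a short read with an exception instead of returning false.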