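Implementing a Client for Tencent Cloud Streaming TTS Speech Synthesis

Introduction to Tencent Cloud Streaming TTS

Integration documentation: https://cloud.tencent.com/document/api/441/19499

This API takes its request parameters as JSON and does not yet support Cloud API 3.0 authentication. The response is delivered using HTTP chunked transfer encoding, and the audio comes either as Opus-compressed fragments or as a raw PCM audio stream. Starting with authentication, this article walks through a streaming TTS client implementation in detail.

API Authentication

1. Build the JSON request parameters. A TreeMap is used to store them so that they stay sorted by key.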

// mRequestMap is a TreeMap<String, String>; its keys stay sorted, which the signing step relies on.
mRequestMap.put("Action", "TextToStreamAudio");
mRequestMap.put("Text", text);
mRequestMap.put("SessionId", "session-1234");
mRequestMap.put("AppId", "1255824371");
mRequestMap.put("Timestamp", "" + System.currentTimeMillis() / 1000L);
// The request expires 600 seconds after the current time.
mRequestMap.put("Expired", "" + (System.currentTimeMillis() / 1000L + 600));
mRequestMap.put("Speed", "0");
mRequestMap.put("SecretId", SECRET_ID);
mRequestMap.put("VoiceType", "0");
mRequestBody = new JSONObject(mRequestMap).toString();
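
To make the role of the TreeMap concrete, here is a minimal plain-JVM sketch (not part of the original client). It shows that the keys come out in sorted order no matter the order in which they were inserted, so the joined parameter string is deterministic. The class name `SortedParamsSketch` and the method `joinParams` are hypothetical names used only for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

final class SortedParamsSketch {
    /** Joins entries as key=value pairs separated by '&', in the map's iteration order. */
    static String joinParams(Map<String, String> params) {
        StringBuilder builder = new StringBuilder();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (builder.length() > 0) {
                builder.append('&');
            }
            builder.append(entry.getKey()).append('=').append(entry.getValue());
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // Insert the keys out of order on purpose; TreeMap sorts them by key.
        Map<String, String> params = new TreeMap<>();
        params.put("Text", "hello");
        params.put("Action", "TextToStreamAudio");
        params.put("SessionId", "session-1234");
        System.out.println(joinParams(params));
    }
}
```

Note that TreeMap orders String keys with `String.compareTo`, which is case-sensitive, so the key spelling has to match the documentation exactly.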
                                         

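2. Generate the signature: concatenate the parameters into a string as the documentation requires, sign it with HMAC-SHA1, and Base64-encode the result. Read the authentication documentation carefully here, because the details are easy to get wrong.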

import android.util.Base64;
import java.nio.charset.StandardCharsets;
import java.security.InvalidKeyException;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

private static String generateSign(TreeMap<String, String> params) {
    // String to sign: HTTP method + domain + "?" + key=value pairs joined by "&" (keys sorted by the TreeMap)
    StringBuilder builder = new StringBuilder("POST" + DOMAIN_NAME + "?");
    for (Map.Entry<String, String> entry : params.entrySet()) {
        builder.append(entry.getKey()).append('=').append(entry.getValue()).append('&');
    }
    // Remove the trailing '&'
    builder.deleteCharAt(builder.lastIndexOf("&"));

    String sign = "";
    String source = builder.toString();
    System.out.println(source);
    try {
        Mac mac = Mac.getInstance("HmacSHA1");
        SecretKeySpec keySpec = new SecretKeySpec(SECRET_KEY.getBytes(StandardCharsets.UTF_8), "HmacSHA1");
        mac.init(keySpec);
        mac.update(source.getBytes(StandardCharsets.UTF_8));
        // Base64.NO_WRAP (value 2) keeps the signature on a single line
        sign = Base64.encodeToString(mac.doFinal(), Base64.NO_WRAP);
    } catch (NoSuchAlgorithmException | InvalidKeyException e) {
        e.printStackTrace();
    }

    System.out.println("Generated signature: " + sign);
    return sign;
}

With that we have a complete signature string. Next comes the core of this article: sending the network request and parsing the response.

Chunked Transfer Encoding

Tencent Cloud returns the data using HTTP chunked transfer encoding. Unlike an ordinary HTTP response such as a JSON body, the data comes back in multiple fragments: the message body consists of an unspecified number of chunks and ends with a final chunk of size 0.

Each non-empty chunk starts with the number of data bytes it contains (written in hexadecimal), followed by a CRLF (carriage return and line feed), then the data itself, and finally a CRLF that closes the chunk. In some implementations, the gap between the chunk size and the CRLF is padded with spaces (0x20).

The last chunk is a single line made up of the chunk size (0), optional padding spaces, and a CRLF. It carries no data, but an optional trailer containing message header fields may follow it.

The message ends with a final CRLF. A complete chunked response looks like this:

HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked

25
This is the data in the first chunk

1C
and this is the second one

3
con

8
sequence

0

For a complete description of the chunked protocol, see the Wikipedia article on chunked transfer encoding (分块传输编码).
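
The framing rules above can be sketched as a small decoder. This is only an illustration of the wire format: in practice `HttpURLConnection` decodes chunked bodies itself, so the client in this article never has to do this by hand. The class and method names (`ChunkedBodySketch`, `decode`, `decodeToString`) are hypothetical, and the sketch assumes it is given only the message body, after the header block.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class ChunkedBodySketch {
    /** Decodes a chunked message body into the raw payload bytes. */
    static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            // Ignore optional chunk extensions (";name=value") and padding spaces.
            int semicolon = sizeLine.indexOf(';');
            String hex = (semicolon >= 0 ? sizeLine.substring(0, semicolon) : sizeLine).trim();
            int size = Integer.parseInt(hex, 16);
            if (size == 0) {
                // Last chunk: skip optional trailer lines up to the final empty line.
                while (!readLine(in).isEmpty()) { }
                return body.toByteArray();
            }
            byte[] chunk = new byte[size];
            int read = 0;
            while (read < size) {
                int n = in.read(chunk, read, size - read);
                if (n < 0) {
                    throw new EOFException("stream ended inside a chunk");
                }
                read += n;
            }
            body.write(chunk, 0, size);
            readLine(in); // consume the CRLF that terminates the chunk data
        }
    }

    /** Reads one CRLF-terminated line as ASCII, without the line terminator. */
    private static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                break;
            }
            if (b != '\r') {
                line.write(b);
            }
        }
        if (b == -1 && line.size() == 0) {
            throw new EOFException("unexpected end of stream");
        }
        return new String(line.toByteArray(), StandardCharsets.US_ASCII);
    }

    /** Convenience wrapper for text bodies; wraps IOException in an unchecked exception. */
    static String decodeToString(String rawBody) {
        try {
            byte[] bytes = rawBody.getBytes(StandardCharsets.ISO_8859_1);
            return new String(decode(new ByteArrayInputStream(bytes)), StandardCharsets.ISO_8859_1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decodeToString("4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"));
    }
}
```

The point to take away is that chunk boundaries are a transport detail: they do not have to line up with the TTS fragments parsed later, which is why the client reads the decoded stream and applies its own framing.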

Requesting TTS Data

The code is below. We obtain the response input stream directly and use it to read the data.

private static InputStream obtainResponseStreamWithJava(String postJsonBody, TreeMap<String, String> requestMap) throws IOException {
    // Send the POST request
    URL url = new URL(SERVER_URL);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    String authorization = generateSign(requestMap);
    conn.setRequestMethod("POST");
    conn.setDoOutput(true); // required before writing a request body
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setRequestProperty("Authorization", authorization);
    conn.connect();
    // Write the JSON body and close the stream even if writing fails
    try (OutputStream out = conn.getOutputStream()) {
        out.write(postJsonBody.getBytes("UTF-8"));
        out.flush();
    }
    int responseCode = conn.getResponseCode();
    if (responseCode != HttpURLConnection.HTTP_OK) { // TODO: handle non-200 responses
        Log.w(TAG, "HTTP Code: " + responseCode);
    }
    // HttpURLConnection decodes the chunked body; the caller reads the payload from this stream
    return conn.getInputStream();
}

OPUS

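According to the official documentation, the data comes in two forms: Opus-compressed audio and a raw PCM audio stream. Opus offers a good compression ratio (about 10:1), which saves a good deal of transfer time and network bandwidth.

Opus is an open-source library, but it is written in C. Because Android only supports Opus playback natively from 5.0 onward, supporting systems below 5.0 requires compiling it into a native .so library. The source code is available from the Opus project.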
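
To put the 10:1 figure in perspective, here is a small back-of-the-envelope calculation. It assumes the audio format used by the player later in this article (16 kHz, 16-bit, mono) and treats 10:1 as a rough ratio; the real Opus bitrate depends on the encoder settings. The class and method names are hypothetical.

```java
final class OpusBandwidthSketch {
    /** Bytes per second of uncompressed PCM audio. */
    static int pcmBytesPerSecond(int sampleRateHz, int bitsPerSample, int channels) {
        return sampleRateHz * (bitsPerSample / 8) * channels;
    }

    public static void main(String[] args) {
        int pcm = pcmBytesPerSecond(16000, 16, 1);   // 16 kHz, 16-bit, mono
        double compressed = pcm / 10.0;              // assumed 10:1 compression
        System.out.println("PCM bytes per second: " + pcm);
        System.out.println("Approximate Opus bytes per second at 10:1: " + compressed);
    }
}
```

At these settings raw PCM needs 32,000 bytes per second, so a 10:1 ratio brings that down to roughly 3,200 bytes per second.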

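Parsing TTS Data

This mainly follows the Java sample on the official site: loop over the stream and, following the format shown below, keep reading the head, sequence number, length, and audio data until the end of the data is reached.

[Figure: TTS fragment format]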
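
As a compact illustration of that layout, the sketch below reads one fragment from a stream: a 4-byte head (kept opaque here, since this article does not interpret it), a 4-byte little-endian sequence number, a 4-byte little-endian length, and then the audio bytes. It is a plain-JVM sketch, not the official sample; `TtsFrameSketch`, `Frame`, `readFrame`, and `parse` are hypothetical names, and `bytesToInt` is only an assumed equivalent of the helper of the same name that the article's code calls but does not show.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class TtsFrameSketch {
    /** One parsed fragment: opaque 4-byte head, sequence number, and audio payload. */
    static final class Frame {
        final byte[] head;
        final int seq;
        final byte[] payload;

        Frame(byte[] head, int seq, byte[] payload) {
            this.head = head;
            this.seq = seq;
            this.payload = payload;
        }
    }

    /** Interprets the first 4 bytes as a little-endian signed 32-bit integer. */
    static int bytesToInt(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    /** Reads one fragment; readFully throws EOFException if the stream ends mid-fragment. */
    static Frame readFrame(InputStream in, int maxPayload) throws IOException {
        DataInputStream data = new DataInputStream(in);
        byte[] head = new byte[4];
        data.readFully(head);
        byte[] word = new byte[4];
        data.readFully(word);
        int seq = bytesToInt(word);
        data.readFully(word);
        int size = bytesToInt(word);
        if (size <= 0 || size > maxPayload) {
            throw new IOException("implausible payload size: " + size);
        }
        byte[] payload = new byte[size];
        data.readFully(payload);
        return new Frame(head, seq, payload);
    }

    /** Convenience wrapper for in-memory data; uses the 5000-byte limit from the article's code. */
    static Frame parse(byte[] raw) {
        try {
            return readFrame(new ByteArrayInputStream(raw), 5000);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] raw = {0, 0, 0, 0, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, 3, 0, 0, 0, 10, 20, 30};
        Frame frame = parse(raw);
        System.out.println("seq=" + frame.seq + ", payload bytes=" + frame.payload.length);
    }
}
```

`DataInputStream.readFully` does the same job as a hand-written "read until the buffer is full" loop, except that it signals a short stream with an `EOFException` instead of a boolean.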

Sample code:

    private void processProtocolBufferStream(final InputStream inputStream) throws DeserializationException {
        final long start = System.currentTimeMillis();

        YoutuOpusDecoder decoder = null;

        List<PcmData> pcmCache = new ArrayList<>();
        boolean fillSuccess;
        int pbPkgCount = -1;

        while (!Thread.currentThread().isInterrupted()) {
            pbPkgCount++;
            try {
                // read head (4 bytes, not used further here)
                byte[] headBuffer = new byte[4];
                fillSuccess = fill(inputStream, headBuffer);
                if (!fillSuccess) {
                    throw new ReadBufferException(String.format("read PB pkg#%s head fail, break;", pbPkgCount));
                }
                // read seq
                byte[] seqBuffer = new byte[4];
                fillSuccess = fill(inputStream, seqBuffer);
                if (!fillSuccess) {
                    throw new ReadBufferException(String.format("read PB pkg#%s seq fail, break;", pbPkgCount));
                }
                int seq = bytesToInt(seqBuffer);
                // read pkg size
                byte[] pbPkgSizeHeader = new byte[4];
                fillSuccess = fill(inputStream, pbPkgSizeHeader);
                if (!fillSuccess) {
                    throw new ReadBufferException(String.format("read PB pkg#%s size header fail, break;", pbPkgCount));
                }
                int pbPkgSize = bytesToInt(pbPkgSizeHeader);
                Log.i(TAG, String.format("PB pkg#%s size = %s", pbPkgCount, pbPkgSize));
                if (pbPkgCount == 0) {
                    // time until the first package arrives
                    sTimeEnd = System.currentTimeMillis();
                    sTimeCost = sTimeEnd - sTimeStart;
                }
                if (pbPkgSize <= 0) {
                    throw new ReadBufferException(String.format("PB pkg#%s size %s <= 0, break;", pbPkgCount, pbPkgSize));
                } else if (pbPkgSize > 5000) {
                    throw new ReadBufferException(String.format("PB pkg#%s size %s > 5000 bytes, too large, break;", pbPkgCount, pbPkgSize));
                }

                // read pb pkg
                byte[] pbPkg = new byte[pbPkgSize];
                fillSuccess = fill(inputStream, pbPkg);
                if (!fillSuccess) {
                    throw new ReadBufferException(String.format("read PB pkg#%s fail, break;", pbPkgCount));
                }

                // init decoder
                if (decoder == null) {
                    decoder = new YoutuOpusDecoder();
                    decoder.config();
                }
                // decode
                Log.i("DEBUG-1", "seq:" + seq);
                Pair<Integer, short[]> pair = decoder.decodeTTSData(seq, pbPkg);
                short[] pcm = pair.second;

                Log.d(TAG, (pcm == null ? "fail decode #" : "decode #") + pbPkgCount);

                // packaging pcm; seq == -1 marks the last package
                if (pcm == null) {
                    pcm = new short[0];
                }
                PcmData pcmData = new PcmData(pcm, seq == -1);

                // stop check
                if (Thread.currentThread().isInterrupted()) {
                    Log.w(TAG, "pcm data ready, but thread is interrupted, break;");
                    break;
                }

                // init player
                if (mOpusPlayer == null) {
                    mOpusPlayer = new OpusPlayer();
                    mOpusPlayer.setPcmSampleRate(16000);
                    mOpusPlayer.setUncaughtExceptionHandler(new UncaughtExceptionHandler() {
                        @Override
                        public void uncaughtException(Thread thread, Throwable ex) {
                            if (mTtsExceptionHandler != null) {
                                mTtsExceptionHandler.onPlayException(thread, ex);
                            }
                        }
                    });
                }

                // enqueue
                if (pbPkgCount < mCacheCount) { // still buffering
                    pcmCache.add(pcmData);
                } else { // buffer is full: flush the cache, then enqueue the current package
                    for (PcmData d : pcmCache) {
                        mOpusPlayer.enqueue(d);
                    }
                    pcmCache.clear();
                    mOpusPlayer.enqueue(pcmData);
                }

                // end
                if (seq == -1) {
                    long ms = System.currentTimeMillis() - start;
                    Log.d(TAG, "finish last pb pkg#" + pbPkgCount + ", total cost time " + ms + " ms");
                    break;
                }
            } catch (Exception e) {
                if (mOpusPlayer != null) {
                    mOpusPlayer.forceStop();
                }
                if (e instanceof InterruptedIOException) {
                    // Normal flow, no need to rethrow; stop reading because the stream position is no longer reliable
                    Log.i(TAG, "Interrupted while reading server response InputStream", e);
                    break;
                } else {
                    throw new DeserializationException(e);
                }
            }
        }
    }

The fields are little-endian; the bytes are read from the stream as follows:

/**
 * Reads from the InputStream into buffer until buffer is full.
 *
 * @return false if the InputStream does not contain enough data to fill buffer.
 * @throws IOException if reading from the stream fails
 */
private static boolean fill(InputStream in, byte[] buffer) throws IOException {
    int length = buffer.length;
    int hasRead = 0;
    while (true) {
        int offset = hasRead;
        int count = length - hasRead;
        int currentRead = in.read(buffer, offset, count);
        if (currentRead >= 0) {
            hasRead += currentRead;
            if (hasRead == length) {
                return true;
            }
        }
        if (currentRead == -1) {
            return false;
        }
    }
}

TTS Audio Playback

Once parsed, the TTS data is decoded by YoutuOpusDecoder and then played through the OpusPlayer class. It wraps two things: first, playing raw PCM audio with AudioTrack; second, continuously feeding the decoded audio into the player.

The complete code is as follows:

import android.media.AudioFormat;
import android.media.AudioManager;
import android.util.Log;
import java.lang.Thread.UncaughtExceptionHandler;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
// PcmPlayer and AudioTrackException are project classes that are not shown in this article.

public class OpusPlayer {
    private static final String TAG = "OpusPlayer";

    private BlockingQueue<PcmData> mPcmQueue = new LinkedBlockingQueue<>();
    private volatile Thread mPlayThread;
    private int mPcmSampleRate;
    private UncaughtExceptionHandler mUncaughtExceptionHandler;

    public void setUncaughtExceptionHandler(UncaughtExceptionHandler handler) {
        mUncaughtExceptionHandler = handler;
    }

    public void setPcmSampleRate(int pcmSampleRate) {
        mPcmSampleRate = pcmSampleRate;
    }

    public void enqueue(PcmData pcmData) {
        mPcmQueue.add(pcmData);

        // Lazily start the playback thread on the first enqueue
        if (mPlayThread == null) {
            mPlayThread = new Thread(new Runnable() {

                PcmPlayer mPlayer;

                @Override
                public void run() {
                    Log.d(TAG, getThreadLogPrefix() + "start");
                    int playerPrepareFailCount = 0;
                    int playCount = 0;
                    long start = System.currentTimeMillis();

                    while (!Thread.currentThread().isInterrupted()) {

                        // prepare the player
                        boolean isPlayerReady = preparePlayerIfNeeded();
                        if (!isPlayerReady) {
                            releasePlayer();
                            playerPrepareFailCount++;
                            if (playerPrepareFailCount > 5) {
                                releasePlayer();
                                throw new RuntimeException("prepare player fail too many times, abort."); // give up
                            } else {
                                Log.w(TAG, getThreadLogPrefix() + "prepare player fail, retry.");
                                continue; // retry
                            }
                        }

                        // dequeue (blocks until data is available)
                        PcmData pcmData;
                        try {
                            pcmData = mPcmQueue.take();
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                            Log.d(TAG, getThreadLogPrefix() + "force stop");
                            break;
                        }

                        // play
                        if (pcmData != null) {
                            try {
                                short[] pcm = pcmData.getPcm();
                                if (pcm != null) {
                                    mPlayer.play(pcm);
                                    Log.d(TAG, getThreadLogPrefix() + "play #" + playCount);
                                } else {
                                    Log.d(TAG, getThreadLogPrefix() + "play #" + playCount + " fail, pcm == null !!");
                                }
                                if (pcmData.isLastOne()) {
                                    Log.d(TAG, getThreadLogPrefix() + "finish all task, will stop");
                                    break;
                                }
                                playCount++;
                            } catch (AudioTrackException e) {
                                e.printStackTrace();
                                releasePlayer(); // the next iteration will try to re-initialize the player
                            }
                        } else {
                            Log.w(TAG, getThreadLogPrefix() + "mPcmQueue.take() == null, nothing to play");
                        }
                    }

                    releasePlayer();
                    long time = System.currentTimeMillis() - start;
                    Log.d(TAG, getThreadLogPrefix() + "stop, ran " + time + " ms");
                }

                /**
                 * @return true: player is ready
                 */
                boolean preparePlayerIfNeeded() {
                    if (mPlayer == null) {
                        mPlayer = new PcmPlayer();
                        try {
                            mPlayer.prepare(AudioManager.STREAM_MUSIC, mPcmSampleRate, AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT);
                        } catch (AudioTrackException e) {
                            e.printStackTrace();
                            releasePlayer();
                        }
                    }
                    return mPlayer != null;
                }

                void releasePlayer() {
                    if (mPlayer != null) {
                        mPlayer.release();
                        mPlayer = null;
                    }
                }

            });
            // Playback takes the longest; run it at slightly lower priority than the decoding thread to leave that thread more time
            mPlayThread.setPriority(Thread.NORM_PRIORITY - 1);
            mPlayThread.setName(TAG + ".mPlayThread");
            if (mUncaughtExceptionHandler != null) {
                mPlayThread.setUncaughtExceptionHandler(mUncaughtExceptionHandler);
            }
            mPlayThread.start();
        }
    }

    private static String getThreadLogPrefix() {
        Thread currentThread = Thread.currentThread();
        String s = currentThread.getName() + "#" + currentThread.getId() + ": ";
        return s;
    }

    public void forceStop() {
        if (mPlayThread != null && !mPlayThread.isInterrupted()) {
            mPlayThread.interrupt();
            mPlayThread = null;
        }
        mPcmQueue.clear();
    }

    public static class PcmData {
        private final short[] mPcm;
        private final boolean mIsLastOne;

        public PcmData(short[] pcm, boolean isLastOne) {
            mPcm = pcm;
            mIsLastOne = isLastOne;
        }

        short[] getPcm() {
            return mPcm;
        }

        boolean isLastOne() {
            return mIsLastOne;
        }
    }

}
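
The player above is a producer/consumer setup: the parsing thread enqueues decoded audio, and a dedicated thread blocks on the queue and stops once it sees the item flagged as the last one. The sketch below isolates that pattern on the plain JVM, with strings standing in for PCM buffers and a list standing in for the audio sink. All names in it (`QueuePlaybackSketch`, `Item`, `startConsumer`, `demo`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class QueuePlaybackSketch {
    /** A queued unit of work; 'last' plays the role of the end-of-stream flag. */
    static final class Item {
        final String data;
        final boolean last;

        Item(String data, boolean last) {
            this.data = data;
            this.last = last;
        }
    }

    /** Starts a consumer that drains the queue until it sees the item flagged as last. */
    static Thread startConsumer(BlockingQueue<Item> queue, List<String> played) {
        Thread thread = new Thread(() -> {
            try {
                while (true) {
                    Item item = queue.take();   // blocks while the queue is empty
                    played.add(item.data);      // stand-in for writing PCM to the audio sink
                    if (item.last) {
                        return;
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // treat interruption as a forced stop
            }
        });
        thread.start();
        return thread;
    }

    /** Produces three items, waits for the consumer to finish, and returns what was "played". */
    static List<String> demo() {
        BlockingQueue<Item> queue = new LinkedBlockingQueue<>();
        List<String> played = new ArrayList<>();
        Thread consumer = startConsumer(queue, played);
        queue.add(new Item("chunk-0", false));
        queue.add(new Item("chunk-1", false));
        queue.add(new Item("chunk-2", true));
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return played;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Carrying the end-of-stream flag inside the queued item, as `PcmData.isLastOne()` does, lets the consumer stop exactly after the final buffer has been played, without a separate signal that could race with data still waiting in the queue.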