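1. Introduction
The short speech recognition REST API accepts a complete audio file of no more than one minute, uploaded in a single POST request. The recognition result is returned in one piece, as JSON in the response body, so the developer must make sure the connection is not interrupted before the result is returned.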
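As a rough pre-flight check for the one-minute limit, the sketch below estimates the duration of raw PCM audio from its byte count. It assumes 16-bit mono samples, which this document does not state, and the class and method names are illustrative rather than part of the API.

```java
/** Sketch: estimate the duration of raw PCM audio before uploading it. */
final class AudioLengthCheck {
    // Limit stated in the introduction: at most one minute of audio per request.
    static final double MAX_SECONDS = 60.0;

    /** Duration in seconds of uncompressed PCM data with the given layout. */
    static double pcmSeconds(long byteCount, int sampleRate, int bytesPerSample, int channels) {
        if (sampleRate <= 0 || bytesPerSample <= 0 || channels <= 0) {
            throw new IllegalArgumentException("sampleRate, bytesPerSample and channels must be positive");
        }
        return (double) byteCount / ((double) sampleRate * bytesPerSample * channels);
    }

    /** Assumption (not from the API documentation): 16-bit samples, one channel. */
    static boolean fitsShortRecognition(long byteCount, int sampleRate) {
        return pcmSeconds(byteCount, sampleRate, 2, 1) <= MAX_SECONDS;
    }

    public static void main(String[] args) {
        // 160000 bytes at 16 kHz, 16-bit mono is 5 seconds of audio.
        System.out.println(pcmSeconds(160_000, 16_000, 2, 1));
        System.out.println(fitsShortRecognition(160_000, 16_000));
    }
}
```

Compressed formats cannot be measured this way; for those the duration has to come from the container or the encoder.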
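2. Choosing a speech model and encoding format
Developers need to choose a speech model that suits their application scenario. Each speech model corresponds to particular encodings and data formats, and achieves higher recognition accuracy in the scenarios it is designed for.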
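In the request example of section 3 and in the demo of section 5.1, the chosen model is passed as the model query parameter (model=chat). A minimal sketch of building that URL follows; the class and method names are illustrative, and chat is the only model name that appears in this document.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Sketch: append the selected model to the recognition endpoint. */
final class RecognizeUrlSketch {
    // Endpoint taken from the request example in section 3.
    static final String ENDPOINT = "https://nlsapi.aliyun.com/recognize";

    static String withModel(String model) {
        try {
            // URL-encode the value so that unexpected characters cannot break the query string.
            return ENDPOINT + "?model=" + URLEncoder.encode(model, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is required to be supported by every JVM", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(withModel("chat")); // https://nlsapi.aliyun.com/recognize?model=chat
    }
}
```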
3. Uploading the audio file
- POST https://nlsapi.aliyun.com/recognize?model=chat
- Authorization: Dataplus *****
- Content-type: audio/pcm; samplerate=16000
- Accept: application/json
- Date: Sat, 11 Mar 2017 08:33:32 GMT
- Content-Length: *
- [audio data]
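The Date line in the request above uses the usual HTTP date layout in GMT. A minimal sketch of producing such a value with java.time is shown here; DateHeaderSketch and httpDate are illustrative names, and the same string should be used both in the header and in the signature described in section 3.3.1.1.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

/** Sketch: format a timestamp like "Sat, 11 Mar 2017 08:33:32 GMT". */
final class DateHeaderSketch {
    // Same layout as the Date line in the request example; the zone is written as a literal.
    private static final DateTimeFormatter HTTP_DATE =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.ENGLISH);

    static String httpDate(ZonedDateTime time) {
        // Convert to UTC first so that the literal GMT suffix is correct.
        return HTTP_DATE.format(time.withZoneSameInstant(ZoneOffset.UTC));
    }

    public static void main(String[] args) {
        ZonedDateTime sample = ZonedDateTime.of(2017, 3, 11, 8, 33, 32, 0, ZoneOffset.UTC);
        System.out.println(httpDate(sample)); // Sat, 11 Mar 2017 08:33:32 GMT
        System.out.println(httpDate(ZonedDateTime.now()));
    }
}
```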
A complete speech recognition request consists of the following elements:
3.1 URL
3.2 Input parameters
3.3 HTTP Header Field
3.3.1 Authorization Header
Every call to the Alibaba Intelligent Speech Interaction platform must pass strict authentication. Before processing a request, the server validates the Authorization header to make sure the request was not maliciously tampered with or replaced in transit.
- Authorization: Dataplus access_id:signature
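The Authorization value starts with the fixed string Dataplus, followed by the access_id obtained from Alibaba Cloud and the computed signature, separated by a colon (:). The signature itself is Base64-encoded (see 3.3.1.1) before it is added to the header.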
3.3.1.1 Computing the signature
The procedure differs slightly from the standard Alibaba Cloud signing rules. First, apply MD5 followed by Base64 encoding to the audio data, twice. Then join the result with the request method and the values of the Accept, Content-Type and Date headers to form the feature string. Finally, sign the feature string with HMAC-SHA1, using the access_secret obtained from Alibaba Cloud as the key, to produce the signature.
- // 1. Apply MD5 + Base64 to the body twice
- String bodyMd5 = MD5Base64(MD5Base64(body));
- // 2. Build the feature string (the string to sign)
- String feature = method + "\n" + accept + "\n" + bodyMd5 + "\n" + content_type + "\n" + date;
- // 3. Sign the feature string with HMAC-SHA1
- String signature = HMACSha1(feature, access_secret);
3.3.1.2 Computing MD5 + Base64
- public static String MD5Base64(String s) throws UnsupportedEncodingException {
- if (s == null)
- return null;
- String encodeStr = "";
- // the string must be encoded as UTF-8
- byte[] utfBytes = s.getBytes("UTF-8");
- MessageDigest mdTemp;
- try {
- mdTemp = MessageDigest.getInstance("MD5");
- mdTemp.update(utfBytes);
- byte[] md5Bytes = mdTemp.digest();
- BASE64Encoder b64Encoder = new BASE64Encoder();
- encodeStr = b64Encoder.encode(md5Bytes);
- } catch (Exception e) {
- throw new Error("Failed to generate MD5 : " + e.getMessage());
- }
- return encodeStr;
- }
3.3.1.3 Computing HMAC-SHA1
- public static String HMACSha1(String data, String key) {
- String result;
- try {
- SecretKeySpec signingKey = new SecretKeySpec(key.getBytes(), "HmacSHA1");
- Mac mac = Mac.getInstance("HmacSHA1");
- mac.init(signingKey);
- byte[] rawHmac = mac.doFinal(data.getBytes());
- result = (new BASE64Encoder()).encode(rawHmac);
- } catch (Exception e) {
- throw new Error("Failed to generate HMAC : " + e.getMessage());
- }
- return result;
- }
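The helpers above rely on sun.misc.BASE64Encoder. As a hedged alternative, the sketch below follows the same three steps using only java.util.Base64 and the standard crypto classes, and then assembles the Authorization value. The class and method names are illustrative; the second MD5 pass hashes the Base64 text of the first result, as the HttpUtil class in section 5.2 does.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Sketch: compute the signature and the Authorization header value. */
final class SignatureSketch {
    /** MD5 digest of the data, Base64-encoded. */
    static String md5Base64(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            return Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Failed to generate MD5", e);
        }
    }

    /** HMAC-SHA1 of the data keyed with the secret, Base64-encoded. */
    static String hmacSha1Base64(String data, String secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
            return Base64.getEncoder().encodeToString(mac.doFinal(data.getBytes(StandardCharsets.UTF_8)));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Failed to generate HMAC", e);
        }
    }

    static String authorization(String accessId, String accessSecret, byte[] body,
                                String method, String accept, String contentType, String date) {
        // 1. MD5 + Base64 twice; the second pass hashes the text of the first result.
        String bodyMd5 = md5Base64(md5Base64(body).getBytes(StandardCharsets.UTF_8));
        // 2. Feature string: method, Accept, body hash, Content-Type and Date joined by newlines.
        String feature = String.join("\n", method, accept, bodyMd5, contentType, date);
        // 3. HMAC-SHA1 signature, then the final header value.
        return "Dataplus " + accessId + ":" + hmacSha1Base64(feature, accessSecret);
    }

    public static void main(String[] args) {
        // Placeholder credentials and an empty body, for illustration only.
        System.out.println(authorization("my_access_id", "my_access_secret", new byte[0],
                "POST", "application/json", "audio/pcm; samplerate=16000",
                "Sat, 11 Mar 2017 08:33:32 GMT"));
    }
}
```

The accept, contentType and date arguments must be exactly the strings sent in the corresponding request headers, otherwise the server-side check cannot reproduce the same feature string.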
3.3.2 Content-Type Header
Specifies the format and sampling rate of the audio data being uploaded, for example:
- Content-Type: audio/wav; samplerate=8000
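A header value of this shape can be assembled and read back as sketched here. The class and method names are illustrative, and the fallback of 16000 reflects the default this section documents for a missing samplerate field.

```java
/** Sketch: build and inspect the Content-Type value used for audio uploads. */
final class ContentTypeSketch {
    // Default sampling rate documented for requests that omit samplerate.
    static final int DEFAULT_SAMPLE_RATE = 16000;

    static String build(String format, int sampleRate) {
        return "audio/" + format + "; samplerate=" + sampleRate;
    }

    /** Returns the samplerate parameter, or the documented default when it is absent. */
    static int sampleRateOf(String contentType) {
        String[] parts = contentType.split(";");
        // parts[0] is the media type; the remaining parts are name=value parameters.
        for (int i = 1; i < parts.length; i++) {
            String[] kv = parts[i].trim().split("=", 2);
            if (kv.length == 2 && kv[0].trim().equalsIgnoreCase("samplerate")) {
                return Integer.parseInt(kv[1].trim());
            }
        }
        return DEFAULT_SAMPLE_RATE;
    }

    public static void main(String[] args) {
        System.out.println(build("wav", 8000));        // audio/wav; samplerate=8000
        System.out.println(sampleRateOf("audio/pcm")); // 16000
    }
}
```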
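If the samplerate field is not provided, the sampling rate defaults to 16 kHz (16000). The currently supported audio encoding formats are as follows: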
4. Returning the recognition result
HTTP status code 200 indicates that recognition succeeded, and the result is returned in the response body as application/json. Any other HTTP status code indicates that recognition failed, and the specific error message is likewise returned in the response body as application/json.
4.1 Recognition succeeded
- {
- "request_id":"5552717cb25e4f64a180feecbe478889",
- "result":"测试五秒钟长度的语音"
- }
4.2 Recognition failed
- {
- "request_id":"5552717cb25e4f64a180feecbe478889",
- "error_code":80103,
- "error_message":"Failed to invoke auth service!"
- }
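The two response shapes above can be told apart by the HTTP status code first and the JSON fields second. The sketch below does this with a deliberately simple regular expression so that it stays free of dependencies; it only handles flat objects like the two examples and is not a general JSON parser (the demo in section 5 uses fastjson, and a real client would use a JSON library too). The class and method names, and the status code 400 in the usage lines, are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: summarize a recognition response from its status code and body. */
final class AsrResponseSketch {
    // Extracts a string or integer field from a flat JSON object.
    // Limitation: no support for escaped quotes, nesting or arrays.
    static String field(String json, String name) {
        Pattern p = Pattern.compile("\"" + Pattern.quote(name) + "\"\\s*:\\s*(\"([^\"]*)\"|-?\\d+)");
        Matcher m = p.matcher(json);
        if (!m.find()) {
            return null;
        }
        // Group 2 is the unquoted text of a string value; group 1 covers numbers.
        return m.group(2) != null ? m.group(2) : m.group(1);
    }

    static String describe(int httpStatus, String body) {
        if (httpStatus == 200) {
            return "OK: " + field(body, "result");
        }
        return "ERROR " + field(body, "error_code") + ": " + field(body, "error_message");
    }

    public static void main(String[] args) {
        String ok = "{\"request_id\":\"5552717cb25e4f64a180feecbe478889\",\"result\":\"hello\"}";
        String failed = "{\"request_id\":\"5552717cb25e4f64a180feecbe478889\","
                + "\"error_code\":80103,\"error_message\":\"Failed to invoke auth service!\"}";
        System.out.println(describe(200, ok));     // OK: hello
        System.out.println(describe(400, failed)); // ERROR 80103: Failed to invoke auth service!
    }
}
```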
4.3 Error code definitions
5. Code examples
5.1 Request demo
- package com.alibaba.idst.nls;
- import java.io.File;
- import java.io.IOException;
- import java.io.UnsupportedEncodingException;
- import java.nio.file.FileSystem;
- import java.nio.file.FileSystems;
- import java.nio.file.Files;
- import java.nio.file.Path;
- import com.alibaba.fastjson.JSON;
- import com.alibaba.idst.nls.response.HttpResponse;
- import com.alibaba.idst.nls.utils.HttpUtil;
- import org.slf4j.Logger;
- import org.slf4j.LoggerFactory;
- public class HttpAsrDemo {
- private static Logger logger = LoggerFactory.getLogger(HttpAsrDemo.class);
- private static String url = "http://nlsapi.aliyun.com/recognize?";
- public static void main(String[] args) throws IOException {
- // Use the access credentials obtained from https://ak-console.aliyun.com/
- // Activate the intelligent speech service in advance (https://data.aliyun.com/product/nls)
- String ak_id = args[0];
- String ak_secret = args[1];
- // Select the ASR model to use; see section 2 of this document for details
- String model = "chat";
- url = url+"model="+model;
- // Read the local audio file
- Path path = FileSystems.getDefault().getPath("src/main/resources/demo.wav");
- byte[] data = Files.readAllBytes(path);
- HttpResponse response = HttpUtil.sendAsrPost(data,"pcm",16000,url,ak_id,ak_secret);
- logger.info(JSON.toJSONString(response));
- }
- }
5.2 The HttpUtil class used to call the service
- package com.alibaba.idst.nls.utils;
- /**
- * Created by songsong.sss on 16/5/23.
- */
- import java.io.*;
- import java.net.HttpURLConnection;
- import java.net.URL;
- import java.security.MessageDigest;
- import java.text.SimpleDateFormat;
- import java.util.*;
- import javax.crypto.spec.SecretKeySpec;
- import com.alibaba.idst.nls.response.HttpResponse;
- import javax.crypto.Mac;
- import org.slf4j.Logger;
- import org.slf4j.LoggerFactory;
- import sun.misc.BASE64Encoder;
- @SuppressWarnings("restriction")
- public class HttpUtil {
- static Logger logger = LoggerFactory.getLogger(HttpUtil.class);
- /*
- * Compute MD5 + Base64
- */
- public static String MD5Base64(byte[] s) throws UnsupportedEncodingException {
- if (s == null){
- return null;
- }
- String encodeStr = "";
- // the input is already a byte array; any text must be converted to bytes as UTF-8 beforehand
- MessageDigest mdTemp;
- try {
- mdTemp = MessageDigest.getInstance("MD5");
- mdTemp.update(s);
- byte[] md5Bytes = mdTemp.digest();
- BASE64Encoder b64Encoder = new BASE64Encoder();
- encodeStr = b64Encoder.encode(md5Bytes);
- /* Supported on Java 1.8 and later
- Encoder encoder = Base64.getEncoder();
- encodeStr = encoder.encodeToString(md5Bytes);
- */
- } catch (Exception e) {
- throw new Error("Failed to generate MD5 : " + e.getMessage());
- }
- return encodeStr;
- }
- /*
- * Compute HMAC-SHA1
- */
- public static String HMACSha1(String data, String key) {
- String result;
- try {
- SecretKeySpec signingKey = new SecretKeySpec(key.getBytes(), "HmacSHA1");
- Mac mac = Mac.getInstance("HmacSHA1");
- mac.init(signingKey);
- byte[] rawHmac = mac.doFinal(data.getBytes());
- result = (new BASE64Encoder()).encode(rawHmac);
- /* Supported on Java 1.8 and later
- Encoder encoder = Base64.getEncoder();
- result = encoder.encodeToString(rawHmac);
- */
- } catch (Exception e) {
- throw new Error("Failed to generate HMAC : " + e.getMessage());
- }
- return result;
- }
- /*
- * Equivalent to new Date().toUTCString() in JavaScript
- */
- public static String toGMTString(Date date) {
- SimpleDateFormat df = new SimpleDateFormat("E, dd MMM yyyy HH:mm:ss z", Locale.UK);
- df.setTimeZone(new java.util.SimpleTimeZone(0, "GMT"));
- return df.format(date);
- }
- /*
- * Send an ASR (speech recognition) POST request
- */
- public static HttpResponse sendAsrPost(byte[] audioData, String audioFormat, int sampleRate, String url,String ak_id, String ak_secret) {
- PrintWriter out = null;
- BufferedReader in = null;
- String result = "";
- HttpResponse response = new HttpResponse();
- try {
- URL realUrl = new URL(url);
- /*
- * HTTP header parameters
- */
- String method = "POST";
- String accept = "application/json";
- String content_type = "audio/"+audioFormat+";samplerate="+sampleRate;
- int length = audioData.length;
- String date = toGMTString(new Date());
- // 1. Apply MD5 + Base64 to the body twice
- String bodyMd5 = MD5Base64(audioData);
- String md52 = MD5Base64(bodyMd5.getBytes());
- String stringToSign = method + "\n" + accept + "\n" + md52 + "\n" + content_type + "\n" + date ;
- // 2. Compute HMAC-SHA1 over the string to sign
- String signature = HMACSha1(stringToSign, ak_secret);
- // 3. Build the Authorization header
- String authHeader = "Dataplus " + ak_id + ":" + signature;
- // Open a connection to the URL
- HttpURLConnection conn = (HttpURLConnection) realUrl.openConnection();
- // Set the common request properties
- conn.setRequestProperty("accept", accept);
- conn.setRequestProperty("content-type", content_type);
- conn.setRequestProperty("date", date);
- conn.setRequestProperty("Authorization", authHeader);
- conn.setRequestProperty("Content-Length", String.valueOf(length));
- // The following two lines are required for sending a POST request
- conn.setDoOutput(true);
- conn.setDoInput(true);
- // Get the output stream of the URLConnection
- OutputStream stream = conn.getOutputStream();
- // Send the request body (the audio data)
- stream.write(audioData);
- // Flush the output stream buffer
- stream.flush();
- stream.close();
- response.setStatus(conn.getResponseCode());
- // Read the response through a BufferedReader
- if (response.getStatus() ==200){
- in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
- }else {
- in = new BufferedReader(new InputStreamReader(conn.getErrorStream()));
- }
- String line;
- while ((line = in.readLine()) != null) {
- result += line;
- }
- if (response.getStatus() == 200){
- response.setResult(result);
- response.setMassage("OK");
- }else {
- response.setMassage(result);
- }
- System.out.println("post response status code: ["+response.getStatus()+"], response message : ["+response.getMassage()+"] ,result :["+response.getResult()+"]");
- } catch (Exception e) {
- System.out.println("An exception occurred while sending the POST request: " + e);
- e.printStackTrace();
- }
- // Use a finally block to close the output and input streams
- finally {
- try {
- if (out != null) {
- out.close();
- }
- if (in != null) {
- in.close();
- }
- } catch (IOException ex) {
- ex.printStackTrace();
- }
- }
- return response;
- }
- /*
- * Send a TTS (speech synthesis) POST request
- */
- public static HttpResponse sendTtsPost(String textData,String audioType, String audioName,String url,String ak_id, String ak_secret) {
- PrintWriter out = null;
- BufferedReader in = null;
- String result = "";
- HttpResponse response = new HttpResponse();
- try {
- URL realUrl = new URL(url);
- /*
- * HTTP header parameters
- */
- String method = "POST";
- String content_type = "text/plain";
- String accept = "audio/"+audioType+",application/json";
- int length = textData.getBytes().length; // Content-Length counts bytes, not characters
- String date = toGMTString(new Date());
- // 1. Apply MD5 + Base64 to the body (a single pass here)
- String bodyMd5 = MD5Base64(textData.getBytes());
- // String md52 = MD5Base64(bodyMd5.getBytes());
- String stringToSign = method + "\n" + accept + "\n" + bodyMd5 + "\n" + content_type + "\n" + date ;
- // 2. Compute HMAC-SHA1 over the string to sign
- String signature = HMACSha1(stringToSign, ak_secret);
- // 3. Build the Authorization header
- String authHeader = "Dataplus " + ak_id + ":" + signature;
- // Open a connection to the URL
- HttpURLConnection conn = (HttpURLConnection) realUrl.openConnection();
- // Set the common request properties
- conn.setRequestProperty("accept", accept);
- conn.setRequestProperty("content-type", content_type);
- conn.setRequestProperty("date", date);
- conn.setRequestProperty("Authorization", authHeader);
- conn.setRequestProperty("Content-Length", String.valueOf(length));
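- // The following two lines are required for sending a POST request
- conn.setDoOutput(true);
- conn.setDoInput(true);
- // Get the output stream of the URLConnection
- OutputStream stream = conn.getOutputStream();
- // Send the request body (the text to synthesize)
- stream.write(textData.getBytes());
- // Flush the output stream buffer
- stream.flush();
- stream.close();
- response.setStatus(conn.getResponseCode());
- // Read the response: audio bytes on success, an error message otherwise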
- InputStream is = null;
- String line = null;
- if (response.getStatus() ==200){
- is=conn.getInputStream();
- }else {
- in = new BufferedReader(new InputStreamReader(conn.getErrorStream()));
- while ((line = in.readLine()) != null) {
- result += line;
- }
- }
- FileOutputStream fileOutputStream = null;
- File ttsFile = new File(audioName+"."+audioType);
- fileOutputStream = new FileOutputStream(ttsFile);
- byte[] b=new byte[1024];
- int len=0;
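- while(is!=null&&(len=is.read(b))!=-1){ // read a chunk into the buffer, then write it to the file
- fileOutputStream.write(b, 0, len);
- }
- fileOutputStream.close(); // release the file handle once the audio has been written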
- if (response.getStatus() == 200){
- response.setResult(result);
- response.setMassage("OK");
- System.out.println("post response status code: ["+response.getStatus()+"], generate tts audio file :" + audioName+"."+audioType);
- }else {
- response.setMassage(result);
- System.out.println("post response status code: ["+response.getStatus()+"], response message : ["+response.getMassage()+"]");
- }
- } catch (Exception e) {
- System.out.println("An exception occurred while sending the POST request: " + e);
- e.printStackTrace();
- }
- // Use a finally block to close the output and input streams
- finally {
- try {
- if (out != null) {
- out.close();
- }
- if (in != null) {
- in.close();
- }
- } catch (IOException ex) {
- ex.printStackTrace();
- }
- }
- return response;
- }
- }
5.3 The response class
- package com.alibaba.idst.nls.response;
- public class HttpResponse {
- private int status;
- private String result;
- private String massage;
- public int getStatus() {
- return status;
- }
- public void setStatus(int status) {
- this.status = status;
- }
- public String getResult() {
- return result;
- }
- public void setResult(String result) {
- this.result = result;
- }