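Feature overview
The speech iOS SDK provides speech synthesis (TTS), which converts text into Mandarin speech.
SDK download
Speech synthesis iOS SDK
Obtaining an appkey
Obtain an appkey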
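Key interfaces
NlsRequest.h: the speech request object and its methods
Initializing a speech request
- (instancetype)init;
- Description: Initializes a speech request; used for both speech recognition and speech synthesis.
- Return value: self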
Setting the appkey of a speech request
- (void)setAppkey:(NSString *)appKey;
- Return value: none
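Setting whether the request carries speech data
- (void)setBstreamAttached:(BOOL)bstreamAttached;
- Description: Sets whether the request is followed by speech data. Use YES for a speech recognition request and NO for a speech synthesis request.
- Parameter bstreamAttached: whether the request carries speech data
- Return value: none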
Setting the TTS text
- (void)setTTSText:(NSString *)text;
- Return value: none
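Setting the TTS speech encoding format
- (void)setTtsEncodeType:(NSString *)encode_type;
- Description: Sets the encoding format of the synthesized speech.
- Parameter encode_type: speech data encoding; valid values are pcm, wav, or alaw; default pcm
- Return value: none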
Setting the TTS sample rate
- (void)setTTSSampleRate:(NSString *)sampleRate;
- Parameter sampleRate: sample rate; valid range 8000~16000; default 16000
- Return value: none
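The documented values for the encoding type and sample rate can be checked before they are passed to the setters. The following is a minimal sketch in plain C (which also compiles inside an Objective-C source file). The function names are hypothetical and not part of the SDK, and treating 8000~16000 as an inclusive numeric range is an assumption.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if encode_type is one of the documented
   values pcm, wav, or alaw. */
bool tts_encode_type_is_valid(const char *encode_type) {
    static const char *const allowed[] = { "pcm", "wav", "alaw" };
    if (encode_type == NULL) return false;
    for (size_t i = 0; i < sizeof allowed / sizeof allowed[0]; i++) {
        if (strcmp(encode_type, allowed[i]) == 0) return true;
    }
    return false;
}

/* Hypothetical helper. Assumption: the documented range 8000~16000
   is inclusive on both ends. */
bool tts_sample_rate_is_valid(long sample_rate) {
    return sample_rate >= 8000 && sample_rate <= 16000;
}
```

Rejecting a value on the client side avoids sending a request the server might refuse; how the SDK itself reacts to out-of-range values is not stated in this document.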
Setting the TTS speech rate
- (void)setTtsSpeechRate:(NSInteger)speechRate;
- Description: Sets the TTS speech rate.
- Parameter speechRate: playback rate; valid range -500~500; default 0
- Return value: none
Setting the TTS volume
- (void)setTtsVolume:(NSInteger)volume;
- Parameter volume: playback volume; valid range 0~100; default 50
- Return value: none
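User-supplied speech rate and volume values can be clamped to the documented ranges before calling setTtsSpeechRate: and setTtsVolume:. This is a small sketch in plain C; the helper names are hypothetical and not part of the SDK.

```c
/* Clamps value into the inclusive range [lo, hi]. */
int tts_clamp(int value, int lo, int hi) {
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

/* Speech rate: documented range -500~500, default 0. */
int tts_clamp_speech_rate(int rate) {
    return tts_clamp(rate, -500, 500);
}

/* Volume: documented range 0~100, default 50. */
int tts_clamp_volume(int volume) {
    return tts_clamp(volume, 0, 100);
}
```

For example, a slider value could be passed as `[nlsRequest setTtsVolume:tts_clamp_volume(sliderValue)]`, where `sliderValue` is an illustrative variable.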
Setting the TTS mode
- (void)setTtsNus:(NSInteger)nus;
- Return value: none
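Shujia (数加) authentication
- (void)Authorize:(NSString *)authId withSecret:(NSString *)secret;
- Description: Performs Shujia authentication. Any speech request that has not passed Shujia authentication is treated as illegal.
- Parameter authId: the ak_id used for Shujia authentication
- Parameter secret: the ak_secret used for Shujia authentication
- Return value: none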
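Converting an NlsRequest object to a JSON string
+ (NSString *)getJSONStringfromNlsRequest:(NlsRequest *)nlsRequest;
- Description: Converts an NlsRequest speech request object into its JSON string form.
- Parameter nlsRequest: the NlsRequest object
- Return value: the JSON string of the NlsRequest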
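Converting an object to a JSON string
+ (NSString *)getJSONString:(id)obj options:(NSJSONWritingOptions)options error:(NSError **)error;
- Description: Converts an object into a JSON string.
- Parameter obj: the object to convert
- Parameter options: NSJSONWritingOptions
- Parameter error: NSError
- Return value: the JSON string of the object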
Converting an object to an NSDictionary
+ (NSDictionary *)getObjectData:(id)obj;
- Description: Converts an object into an NSDictionary.
- Parameter obj: the object to convert
- Return value: NSDictionary
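NlsRecognizer.h: the core class of the speech service SDK
The core class of the speech service SDK. It encapsulates complex logic such as recording-device initialization, compression, and voice activity detection (VAD), and automatically streams speech data to the speech server. Developers only need to pass the correct delegate to perform speech recognition and speech synthesis.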
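Key callback for speech synthesis
-(void)recognizer:(NlsRecognizer *)recognizer didCompleteTTSWithVoiceData:(NSData *)voiceData error:(NSError *)error;
- Description: Receives the speech data returned by the server. The callback is invoked multiple times; each call delivers an NSData of at most 8004 bytes, whose first 4 bytes carry data-length-related information. Callbacks stop once the first 4 bytes of a returned chunk are 0000.
- Parameter recognizer: the NlsRecognizer
- Parameter voiceData: the speech data returned by the server
- Parameter error: the speech synthesis error or exception, as an NSError
- Return value: none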
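The chunk layout described above (a 4-byte header followed by audio, with an all-zero header marking the end) can be handled with two small helpers. This is a sketch in plain C under stated assumptions: the names are hypothetical, "0000" is taken to mean four zero-valued bytes (as in the sample code's comparison), and the header is not interpreted further because its exact encoding is not specified here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TTS_HEADER_SIZE 4u     /* leading length-related bytes */
#define TTS_MAX_CHUNK   8004u  /* documented upper bound of one callback's data */

/* True when the chunk is the end marker: its first 4 bytes are all zero. */
bool tts_chunk_is_end(const uint8_t *data, size_t len) {
    if (data == NULL || len < TTS_HEADER_SIZE) return false;
    return data[0] == 0 && data[1] == 0 && data[2] == 0 && data[3] == 0;
}

/* Returns a pointer to the audio payload after the header and stores its
   size in *payload_len. Returns NULL for malformed, oversized, or
   end-marker chunks. */
const uint8_t *tts_chunk_payload(const uint8_t *data, size_t len,
                                 size_t *payload_len) {
    if (payload_len == NULL) return NULL;
    *payload_len = 0;
    if (data == NULL || len < TTS_HEADER_SIZE || len > TTS_MAX_CHUNK) return NULL;
    if (tts_chunk_is_end(data, len)) return NULL;
    *payload_len = len - TTS_HEADER_SIZE;
    return data + TTS_HEADER_SIZE;
}
```

Inside the delegate callback these could be called with `voiceData.bytes` and `voiceData.length`.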
Speech synthesis data starts arriving
-(void)recognizerDidStartRecieveTTSData:(NlsRecognizer *)recognizer;
- Return value: none
Speech synthesis data stops arriving
-(void)recognizerDidStopRecieveTTSData:(NlsRecognizer *)recognizer;
- Return value: none
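Setting the SDK working mode
NlsRecognizer @property(nonatomic,assign,readwrite) kNlsRecognizerMode mode;
- Description: Sets the working mode of the speech SDK; defaults to kMODE_RECOGNIZER if not set.
Setting whether the SDK observes the app state
NlsRecognizer @property(nonatomic,assign,readwrite) BOOL cancelOnAppEntersBackground;
- Description: Sets whether the SDK observes the app state; defaults to NO. If set to YES, the SDK observes the app state and automatically cancels the request as soon as the app moves to the background.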
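Configuring the basic parameters of the speech service module
+(void)configure;
- Description: Configures the basic parameters of the speech service module. Call it when the app launches.
- Parameters: none
- Return value: none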
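Initializing NlsRecognizer
-(id)initWithNlsRequest:(NlsRequest *)nlsRequest svcURL:(NSString *)svcURL;
- Description: Initializes NlsRecognizer. Note: when called from elsewhere, it can be invoked via dispatch_once.
- Parameter nlsRequest: the NlsRequest speech request
- Parameter svcURL: the speech service address
- Return value: the initialized NlsRecognizer instance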
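Checking whether the main speech service is available
+(BOOL)isServiceAvailable;
- Description: Reports whether the main speech service is available. Developers can adjust UI behavior based on the return value.
- Return value: whether the main speech service is currently available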
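Sending a speech synthesis request
-(void)sendText;
- Description: Starts speech synthesis (TTS). The network request continues in the background. If a recognition result is returned, the text result is delivered through the delegate's didCompleteRecognizingWithResult callback, and the speech data is delivered through the didCompleteTTSWithVoiceData callback.
- Return value: none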
Parameters and error codes
Parameters of a TTS speech request:
    {
        "requests" : {
            "tts_in" : {
                "text" : "", // input text
                "format" : "normal",
                "encode_type" : "pcm", // speech encoding type, default pcm
                "sample_rate" : "16000", // sample rate, default 16000
                "version" : "1.0", // protocol version
                "volume": 50,
                "speech_rate": 0,
                "nus": 1
            },
            "context": {
                "auth": {}
            }
        },
        "app_key" : "",
        "bstream_attached" : false, // whether a binary speech stream follows the request packet; NO for synthesis requests
        "version" : "4.0" // protocol version
    }
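To make the request layout concrete, the sketch below formats the same structure with snprintf in plain C. In practice the SDK produces this string itself (see getJSONStringfromNlsRequest). The function name is hypothetical, and the sketch assumes that app_key and text contain no characters that need JSON escaping. bstream_attached is written as false, following the setBstreamAttached: description for synthesis requests.

```c
#include <stdio.h>

/* Hypothetical helper: writes a TTS request in the layout shown above into
   out (capacity cap). Returns the number of characters written, or -1 if
   the buffer is too small.
   Assumption: app_key and text need no JSON escaping. */
int tts_build_request(char *out, size_t cap, const char *app_key,
                      const char *text, int volume, int speech_rate) {
    int n = snprintf(out, cap,
        "{\"requests\":{\"tts_in\":{\"text\":\"%s\",\"format\":\"normal\","
        "\"encode_type\":\"pcm\",\"sample_rate\":\"16000\",\"version\":\"1.0\","
        "\"volume\":%d,\"speech_rate\":%d,\"nus\":1},"
        "\"context\":{\"auth\":{}}},"
        "\"app_key\":\"%s\",\"bstream_attached\":false,\"version\":\"4.0\"}",
        text, volume, speech_rate, app_key);
    if (n < 0 || (size_t)n >= cap) return -1;
    return n;
}
```

The `auth` object is left empty here, as in the sample above; its contents are not described in this document.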
The returned result is an NlsRecognizerResult object:
    {
        "status" : "1", // server status: 0 means failure, nonzero means success
        "id" : "", // UUID passed through the whole system; the server configuration decides whether it is returned
        "finish" : "1", // whether processing has finished: 0 means not finished, nonzero means finished
        "results" : {
            "tts_out" : {
                "encode_type" : "pcm", // encoding of the output speech data
                "id" : "", // UUID passed through the whole system; the server configuration decides whether it is returned
                "status" : "OK", // server status: 0 means failure, nonzero means success
                "speech_key_prefix" : "" // TTS prefix
            },
            "out" : {} // reserved field
        },
        "bstream_attached" : false, // whether a binary speech stream follows the response packet
        "version" : "4.0" // protocol version
    }
If an error occurs, the error argument of the recognizer:didCompleteTTSWithVoiceData:error: callback is not nil. The corresponding error codes are listed in the table below.
Complete example
Creating the app
Use Xcode to create an iOS application project.
Adding the framework
The Xcode project must link the required framework, NlsClientSDK.framework.
How to add it: select the project, click TARGETS, choose Link Binary With Libraries under Build Phases on the right, click the + button at the lower left, and add each required framework in the dialog that appears.
    $ lipo -info NlsClientSDK
    Architectures in the fat file: NlsClientSDK are: armv7 i386 x86_64 arm64
Importing the header
In each file that calls the SDK, add the following header:
    #import <NlsClientSDK/NlsClientSDK.h>
Registering the speech service
Register the speech service in AppDelegate:
    #import "AppDelegate.h"
    #import "ViewController.h"
    #import <NlsClientSDK/NlsClientSDK.h>

    @interface AppDelegate ()
    @end

    @implementation AppDelegate
    - (BOOL)application:(UIApplication *)application didFinishLaunchingWithOptions:(NSDictionary *)launchOptions {
    #warning Configure the speech service; this must run before any speech service call.
        [NlsRecognizer configure];
        // ……
        return YES;
    }
    @end
Implementing speech synthesis
Implement speech synthesis in ViewController:
    #import "ViewController.h"
    #import <NlsClientSDK/NlsClientSDK.h>

    // self.inputTextView and self.audioPlayer are assumed to be declared in ViewController.h
    @interface ViewController ()<NlsRecognizerDelegate>
    @property(nonatomic,strong) NlsRecognizer *recognizer;
    @end

    @implementation ViewController
    - (void)viewWillAppear:(BOOL)animated
    {
        [super viewWillAppear:animated];
        // Check whether the main speech service is available
        if ([NlsRecognizer isServiceAvailable]) {
            NSLog(@"Speech service is available");
        }
        else {
            NSLog(@"Speech service is unavailable");
        }
        // Observe speech service status changes
        [[NSNotificationCenter defaultCenter] addObserver:self selector:@selector(asrStatusChanged:) name:kNlsRecognizerServiceStatusChanged object:nil];
    }

    #pragma mark - Actions
    - (void)onTtsButtonClick:(id)sender {
        // Create the speech request
        NlsRequest *nlsRequest = [[NlsRequest alloc] init];
    #warning Obtain the appkey from the appkey list on the "Quick Start" help page
        [nlsRequest setAppkey:@""]; // required
        [nlsRequest setBstreamAttached:NO]; // required; no speech data attached
        [nlsRequest setTtsText:self.inputTextView.text]; // required
    #warning Replace with the credentials (Authorize withSecret) you applied for on Alibaba Cloud
        [nlsRequest Authorize:@"" withSecret:@""]; // required; Access Key ID and Access Key Secret
        // Create the core speech service object
        NlsRecognizer *r = [[NlsRecognizer alloc] initWithNlsRequest:nlsRequest svcURL:nil]; // required
        r.delegate = self;
        r.cancelOnAppEntersBackground = YES;
        r.enableUserCancelCallback = YES;
        self.recognizer = r;
        // Print the request as JSON
        NSString *nlsRequestJSONString = [NlsRequest getJSONStringfromNlsRequest:nlsRequest];
        NSLog(@"setupTtsIn : %@", nlsRequestJSONString);
        // Start speech synthesis
        [self.recognizer sendText];
        self.audioPlayer = [[NLSPlayAudio alloc] init];
    }

    #pragma mark - Notification Callbacks
    -(void)asrStatusChanged:(NSNotification*)notify{
        // Handle network changes
    }

    #pragma mark - RecognizerDelegate
    -(void) recognizer:(NlsRecognizer *)recognizer didCompleteRecognizingWithResult:(NlsRecognizerResult*)result error:(NSError*)error{
        // Handle the recognition result and error information
    }
    -(void) recognizer:(NlsRecognizer *)recognizer didCompleteTTSWithVoiceData:(NSData*)voiceData error:(NSError*)error{
        // Handle the speech data returned by the server
        if (error != nil || [voiceData length] < 4) {
            return;
        }
        const Byte *messageByte = (const Byte *)[voiceData bytes];
        // Build a string from the first 4 bytes; @"0000" marks the end of the data
        NSString *pre4Str = [[NSString alloc] init];
        for (int i = 0; i < 4; i++) {
            pre4Str = [pre4Str stringByAppendingString:[NSString stringWithFormat:@"%d", messageByte[i]]];
        }
        if (![pre4Str isEqual:@"0000"]) {
            NSUInteger len = [voiceData length];
            Byte *voiceByte = (Byte *)malloc(len - 4);
            // Copy the payload that follows the 4-byte header
            for (NSUInteger i = 0; i < len - 4; i++) {
                voiceByte[i] = messageByte[i + 4];
            }
            // Process the speech data here
            free(voiceByte);
        } else {
            NSLog(@"voiceData stop");
        }
    }
    -(void)recognizerDidStartRecieveTTSData:(NlsRecognizer *)recognizer {
        // Handle the start of speech synthesis data
    }
    -(void)recognizerDidStopRecieveTTSData:(NlsRecognizer *)recognizer {
        // Handle the end of speech synthesis data
    }
    @end
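The example leaves the processing of each chunk open. One possible approach, sketched here in plain C, is to collect the audio payload of every callback into a single growable buffer until the end marker arrives. The type and function names are hypothetical and not part of the SDK; the 4-byte header and the all-zero end marker follow the callback description, and the 8192-byte initial capacity is an arbitrary choice.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Growable byte buffer that collects the audio payload of every chunk. */
typedef struct {
    uint8_t *bytes;
    size_t   len;
    size_t   cap;
} TtsAudioBuffer;

/* Appends one callback's data without its 4-byte header.
   Returns 1 if the chunk was accepted, 0 if it was the end marker
   (first 4 bytes all zero), and -1 on malformed input or allocation failure. */
int tts_audio_append(TtsAudioBuffer *buf, const uint8_t *chunk, size_t chunk_len) {
    if (buf == NULL || chunk == NULL || chunk_len < 4) return -1;
    if ((chunk[0] | chunk[1] | chunk[2] | chunk[3]) == 0) return 0;

    size_t payload = chunk_len - 4;
    if (payload == 0) return 1;  /* header only, nothing to copy */
    if (buf->len + payload > buf->cap) {
        size_t new_cap = buf->cap ? buf->cap : 8192;
        while (new_cap < buf->len + payload) new_cap *= 2;
        uint8_t *grown = realloc(buf->bytes, new_cap);
        if (grown == NULL) return -1;
        buf->bytes = grown;
        buf->cap = new_cap;
    }
    memcpy(buf->bytes + buf->len, chunk + 4, payload);
    buf->len += payload;
    return 1;
}

/* Releases the collected audio and resets the buffer for reuse. */
void tts_audio_reset(TtsAudioBuffer *buf) {
    free(buf->bytes);
    buf->bytes = NULL;
    buf->len = buf->cap = 0;
}
```

In the delegate callback this could be invoked as `tts_audio_append(&buffer, voiceData.bytes, voiceData.length)`, handing `buffer.bytes` to the player once 0 is returned.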
FAQ
Problem 1: bitcode.
    ld: 'xxx/NlsClientSDK.framework/NlsClientSDK(NlsRecognizer.o)' does not contain bitcode. You must rebuild it with bitcode enabled (Xcode setting ENABLE_BITCODE), obtain an updated library from the vendor, or disable bitcode for this target. for architecture arm64
    clang: error: linker command failed with exit code 1 (use -v to see invocation)
Solution 1: Open the project, go to TARGETS > Build Settings > Enable Bitcode, and set it to No.
Problem 2: Shujia authentication fails with code 4403.
    {
        NSLocalizedDescription = "server closed connection, code:4403, reason:Unauthorized AppKey [xxx], wasClean:1";
    }
Solution 2: Check that the ak_id and ak_secret used for Shujia authentication are correct, and check that the appkey is filled in correctly.
    #warning Replace with the credentials (Authorize withSecret) you applied for on Alibaba Cloud
    [nlsRequest Authorize:@"ak_id" withSecret:@"ak_secret"]; // required

    #warning Obtain the appkey from the appkey list on the "Quick Start" help page.
    [nlsRequest setAppkey:@"your_appkey"]; // required
Problem 3: Assertion failed.
Solution 3: Check that the speech service was registered before synthesis starts, and check that svcURL is nil when NlsRecognizer is initialized.
    //Config appkey.
    [NlsRecognizer configure];

    NlsRecognizer *r = [[NlsRecognizer alloc] initWithNlsRequest:nlsRequest svcURL:nil];
Problem 4: error code 4400.
    {
        NSLocalizedDescription = "server closed connection, code:4400, reason:illegal params, operation forbidden, wasClean:1";
    }
Solution 4: Check whether the appkey is empty, and fill in the correct appkey.
    #warning Replace with the APP_KEY you applied for on Alibaba Cloud
    [nlsRequest setAppkey:@"your_appkey"]; // required
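The error descriptions shown in this FAQ embed the server code as "code:<number>". The sketch below, in plain C, extracts that number and maps the two codes discussed here to the suggested checks. Both function names are hypothetical, and the sketch assumes the description keeps the format shown in the logs above; whether NSError's own code property carries the same value is not stated in this document.

```c
#include <stdlib.h>
#include <string.h>

/* Extracts the numeric code from a description such as
   "server closed connection, code:4403, reason:...".
   Returns the code, or -1 if no "code:<digits>" field is found. */
int nls_error_code_from_description(const char *description) {
    if (description == NULL) return -1;
    const char *field = strstr(description, "code:");
    if (field == NULL) return -1;
    field += strlen("code:");
    char *end = NULL;
    long code = strtol(field, &end, 10);
    if (end == field || code < 0 || code > 99999) return -1;
    return (int)code;
}

/* Maps the codes covered in this FAQ to the suggested check. */
const char *nls_error_hint(int code) {
    switch (code) {
    case 4403: return "check ak_id, ak_secret, and appkey";
    case 4400: return "check that appkey is not empty";
    default:   return "unknown code";
    }
}
```

In the delegate callback, the input could come from `error.localizedDescription.UTF8String`.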