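Introduction
In the previous articles I covered how to use WebRTC's Native API, so by now you should be familiar with its usual usage patterns. Starting with this article, I will describe how I overrode the default implementations behind the Native API. This article covers how to feed audio and video data from Java into the WebRTC library. My other notes on using WebRTC from Java are collected in <Using WebRTC in Java>; have a look if this topic interests you. The source code for this article can be obtained by scanning the official-account QR code below the article, or through a paid download.
Audio and Video Data Capture
Capturing Audio Data from Java
Interface Overview
When we discussed creating the PeerConnectionFactory earlier, we mentioned the AudioDeviceModule interface. WebRTC captures audio data through it, and by implementing this interface we can inject a custom audio capture module into WebRTC. Let's first take a brief look at what the interface contains.
// Only the key parts are kept here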
class AudioDeviceModule : public rtc::RefCountInterface {
public:
// This callback is the key to audio capture: when new audio data is available, wrap it in the expected form and pass it on through this callback
// Full-duplex transportation of PCM audio
virtual int32_t RegisterAudioCallback(AudioTransport* audioCallback) = 0;
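// Enumerate the available audio input and output devices. Because we proxy the whole audio capture (and output) module, these functions only need to report a single device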
// Device enumeration
virtual int16_t PlayoutDevices() = 0;
virtual int16_t RecordingDevices() = 0;
virtual int32_t PlayoutDeviceName(uint16_t index,
char name[kAdmMaxDeviceNameSize],
char guid[kAdmMaxGuidSize]) = 0;
virtual int32_t RecordingDeviceName(uint16_t index,
char name[kAdmMaxDeviceNameSize],
char guid[kAdmMaxGuidSize]) = 0;
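// When capture or playout is needed, the upper layer selects the device through the functions below. Since the functions above report only one device, the upper layer will always use that device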
// Device selection
virtual int32_t SetPlayoutDevice(uint16_t index) = 0;
virtual int32_t SetPlayoutDevice(WindowsDeviceType device) = 0;
virtual int32_t SetRecordingDevice(uint16_t index) = 0;
virtual int32_t SetRecordingDevice(WindowsDeviceType device) = 0;
// Initialization
// Audio transport initialization
virtual int32_t PlayoutIsAvailable(bool* available) = 0;
virtual int32_t InitPlayout() = 0;
virtual bool PlayoutIsInitialized() const = 0;
virtual int32_t RecordingIsAvailable(bool* available) = 0;
virtual int32_t InitRecording() = 0;
virtual bool RecordingIsInitialized() const = 0;
// Starting and stopping recording/playout
// Audio transport control
virtual int32_t StartPlayout() = 0;
virtual int32_t StopPlayout() = 0;
virtual bool Playing() const = 0;
virtual int32_t StartRecording() = 0;
virtual int32_t StopRecording() = 0;
virtual bool Recording() const = 0;
// The remaining part mostly concerns playout, volume and mute control, none of which I use
// Audio mixer initialization
virtual int32_t InitSpeaker() = 0;
virtual bool SpeakerIsInitialized() const = 0;
virtual int32_t InitMicrophone() = 0;
virtual bool MicrophoneIsInitialized() const = 0;
// Speaker volume controls
virtual int32_t SpeakerVolumeIsAvailable(bool* available) = 0;
virtual int32_t SetSpeakerVolume(uint32_t volume) = 0;
virtual int32_t SpeakerVolume(uint32_t* volume) const = 0;
virtual int32_t MaxSpeakerVolume(uint32_t* maxVolume) const = 0;
virtual int32_t MinSpeakerVolume(uint32_t* minVolume) const = 0;
// Microphone volume controls
virtual int32_t MicrophoneVolumeIsAvailable(bool* available) = 0;
virtual int32_t SetMicrophoneVolume(uint32_t volume) = 0;
virtual int32_t MicrophoneVolume(uint32_t* volume) const = 0;
virtual int32_t MaxMicrophoneVolume(uint32_t* maxVolume) const = 0;
virtual int32_t MinMicrophoneVolume(uint32_t* minVolume) const = 0;
// Speaker mute control
virtual int32_t SpeakerMuteIsAvailable(bool* available) = 0;
virtual int32_t SetSpeakerMute(bool enable) = 0;
virtual int32_t SpeakerMute(bool* enabled) const = 0;
// Microphone mute control
virtual int32_t MicrophoneMuteIsAvailable(bool* available) = 0;
virtual int32_t SetMicrophoneMute(bool enable) = 0;
virtual int32_t MicrophoneMute(bool* enabled) const = 0;
// Multi-channel (stereo) support
// Stereo support
virtual int32_t StereoPlayoutIsAvailable(bool* available) const = 0;
virtual int32_t SetStereoPlayout(bool enable) = 0;
virtual int32_t StereoPlayout(bool* enabled) const = 0;
virtual int32_t StereoRecordingIsAvailable(bool* available) const = 0;
virtual int32_t SetStereoRecording(bool enable) = 0;
virtual int32_t StereoRecording(bool* enabled) const = 0;
// Playout delay
virtual int32_t PlayoutDelay(uint16_t* delayMS) const = 0;
};
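Implementation
After skimming AudioDeviceModule, you probably already have an idea of how to proceed. Since I only deal with audio capture, I implemented just a few of these functions. In short, my approach is to create a thread inside the AudioDeviceModule. Once StartRecording is called, that thread starts calling the Java code at a fixed rate to fetch PCM audio data, and then hands the data over through the callback. The core of my implementation is described below.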
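To make the idea of a fixed-rate capture thread concrete, here is a minimal sketch in plain C++ that does not depend on WebRTC or JNI. The names `CaptureFn`, `DeliverFn` and `RunCaptureLoop` are hypothetical stand-ins for the Java capturer and the WebRTC callback, and `std::thread` with `sleep_until` replaces the timer used in the real implementation.

```cpp
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for the Java-side capturer: fills one frame of PCM
// and returns false when there is nothing more to capture.
using CaptureFn = std::function<bool(std::vector<int16_t>& frame)>;
// Hypothetical stand-in for the callback that hands data to WebRTC.
using DeliverFn = std::function<void(const std::vector<int16_t>& frame)>;

// Runs a worker thread that calls `capture` once per `period` and forwards
// every captured frame to `deliver`. Stops when `capture` returns false or
// after `max_frames` frames. Returns the number of frames delivered.
inline int RunCaptureLoop(const CaptureFn& capture, const DeliverFn& deliver,
                          std::chrono::milliseconds period, int max_frames) {
  int delivered = 0;
  std::thread worker([&] {
    auto next_tick = std::chrono::steady_clock::now();
    std::vector<int16_t> frame;
    while (delivered < max_frames && capture(frame)) {
      deliver(frame);
      ++delivered;
      // Sleep until the next tick rather than for a fixed duration, so the
      // time spent capturing does not accumulate as drift.
      next_tick += period;
      std::this_thread::sleep_until(next_tick);
    }
  });
  worker.join();  // join() makes the worker's writes visible to this thread
  return delivered;
}
```

With a 10 ms period, each call to `capture` would be expected to produce 10 ms of audio, which is the contract the real capturer follows.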
// First, I define two lower-level interfaces that correspond to the interfaces on the Java side
class Capturer {
public:
virtual bool isJavaWrapper() {
return false;
}
virtual ~Capturer() {}
// Returns the sampling frequency in Hz of the audio data that this
// capturer produces.
virtual int SamplingFrequency() = 0;
// Replaces the contents of |buffer| with 10ms of captured audio data
// (see FakeAudioDevice::SamplesPerFrame). Returns true if the capturer can
// keep producing data, or false when the capture finishes.
virtual bool Capture(rtc::BufferT<int16_t> *buffer) = 0;
};
class Renderer {
public:
virtual ~Renderer() {}
// Returns the sampling frequency in Hz of the audio data that this
// renderer receives.
virtual int SamplingFrequency() const = 0;
// Renders the passed audio data and returns true if the renderer wants
// to keep receiving data, or false otherwise.
virtual bool Render(rtc::ArrayView<const int16_t> data) = 0;
};
// These two lower-level interfaces are implemented as follows
class JavaAudioCapturerWrapper final : public FakeAudioDeviceModule::Capturer {
public:
// The constructor stores the global reference to the Java audio capturer and looks up the methods it needs
JavaAudioCapturerWrapper(jobject audio_capturer)
: java_audio_capturer(audio_capturer) {
WEBRTC_LOG("Instance java audio capturer wrapper.", INFO);
JNIEnv *env = ATTACH_CURRENT_THREAD_IF_NEEDED();
audio_capture_class = env->GetObjectClass(java_audio_capturer);
sampling_frequency_method = env->GetMethodID(audio_capture_class, "samplingFrequency", "()I");
capture_method = env->GetMethodID(audio_capture_class, "capture", "(I)Ljava/nio/ByteBuffer;");
// The class is only a local reference, so release it here rather than in the destructor,
// which may run on another thread after the reference has become invalid
env->DeleteLocalRef(audio_capture_class);
audio_capture_class = nullptr;
WEBRTC_LOG("Instance java audio capturer wrapper end.", INFO);
}
// The destructor releases the global reference to the Java object
~JavaAudioCapturerWrapper() {
JNIEnv *env = ATTACH_CURRENT_THREAD_IF_NEEDED();
if (java_audio_capturer) {
env->DeleteGlobalRef(java_audio_capturer);
java_audio_capturer = nullptr;
}
}
bool isJavaWrapper() override {
return true;
}
// Call the Java method to get the sampling rate; the value is cached after the first call
int SamplingFrequency() override {
if (sampling_frequency_in_hz == 0) {
JNIEnv *env = ATTACH_CURRENT_THREAD_IF_NEEDED();
this->sampling_frequency_in_hz = env->CallIntMethod(java_audio_capturer, sampling_frequency_method);
}
return sampling_frequency_in_hz;
}
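// Call the Java method to fetch PCM data. Note that it must return 16-bit little-endian PCM (std::min needs <algorithm>)
bool Capture(rtc::BufferT<int16_t> *buffer) override {
buffer->SetData(
FakeAudioDeviceModule::SamplesPerFrame(SamplingFrequency()), // computes the size of the data buffer
[&](rtc::ArrayView<int16_t> data) { // receives a block of the size given by the previous argument
JNIEnv *env = ATTACH_CURRENT_THREAD_IF_NEEDED();
// Java works in bytes, so request size * 2
jobject audio_data_buffer = env->CallObjectMethod(java_audio_capturer, capture_method,
static_cast<jint>(data.size() * 2));
if (audio_data_buffer == nullptr) {
return static_cast<size_t>(0); // no data: Capture() then reports that capturing has finished
}
void *audio_data_address = env->GetDirectBufferAddress(audio_data_buffer);
jlong audio_data_size = env->GetDirectBufferCapacity(audio_data_buffer);
size_t length = 0;
if (audio_data_address != nullptr && audio_data_size > 0) {
// One int16 is 2 bytes; never copy more than the destination can hold
length = std::min(static_cast<size_t>(audio_data_size) / 2, data.size());
memcpy(data.data(), audio_data_address, length * 2);
}
env->DeleteLocalRef(audio_data_buffer);
return length;
});
return buffer->size() == buffer->capacity();
}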
private:
jobject java_audio_capturer;
jclass audio_capture_class;
jmethodID sampling_frequency_method;
jmethodID capture_method;
int sampling_frequency_in_hz = 0;
};
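The byte-to-sample conversion in Capture is easy to get wrong, so here is a small standalone sketch of just that step. `CopyPcmBytes` is a hypothetical helper, not part of WebRTC or of the code above. It assumes the source bytes are already in the host's byte order, which is little-endian on x86 and on most ARM targets.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies 16-bit PCM from a raw byte buffer (for example the address of a
// direct ByteBuffer) into `dst`, writing at most `dst_samples` samples.
// A trailing odd byte is ignored. Returns the number of samples copied.
inline std::size_t CopyPcmBytes(const void* src, std::size_t src_bytes,
                                int16_t* dst, std::size_t dst_samples) {
  if (src == nullptr || dst == nullptr) {
    return 0;
  }
  const std::size_t samples =
      std::min(src_bytes / sizeof(int16_t), dst_samples);
  std::memcpy(dst, src, samples * sizeof(int16_t));
  return samples;
}
```

Clamping to the destination size matters because the capacity of a direct buffer is decided on the Java side and may be larger than the frame the native side asked for.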
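constexpr int kFrameLengthMs = 10; // capture data every 10 ms
constexpr int kFramesPerSecond = 1000 / kFrameLengthMs; // number of frames captured per second
// Number of samples in one 10 ms frame; the constants above must be declared before this function
size_t FakeAudioDeviceModule::SamplesPerFrame(int sampling_frequency_in_hz) {
return rtc::CheckedDivExact(sampling_frequency_in_hz, kFramesPerSecond);
}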
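As a worked example of the frame arithmetic, the sketch below computes how many samples and bytes one 10 ms frame holds. `BytesPerFrame` is a hypothetical helper added for illustration, and plain integer division stands in for `rtc::CheckedDivExact`, which additionally requires the sampling rate to be an exact multiple of 100.

```cpp
#include <cstddef>
#include <cstdint>

constexpr int kFrameLengthMs = 10;                        // one frame is 10 ms
constexpr int kFramesPerSecond = 1000 / kFrameLengthMs;   // 100 frames per second

// Samples per channel in one frame, e.g. 48000 Hz / 100 = 480 samples.
constexpr std::size_t SamplesPerFrame(int sampling_frequency_in_hz) {
  return static_cast<std::size_t>(sampling_frequency_in_hz / kFramesPerSecond);
}

// Bytes the Java side must supply per frame for 16-bit PCM.
constexpr std::size_t BytesPerFrame(int sampling_frequency_in_hz, int channels) {
  return SamplesPerFrame(sampling_frequency_in_hz) *
         static_cast<std::size_t>(channels) * sizeof(int16_t);
}

static_assert(SamplesPerFrame(48000) == 480, "48 kHz gives 480 samples per 10 ms");
static_assert(BytesPerFrame(48000, 1) == 960, "mono 16-bit 48 kHz gives 960 bytes");
```

This is why the capturer asks Java for `data.size() * 2` bytes: at 48 kHz mono that is 960 bytes every 10 ms.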
// The renderer actually does nothing ^.^
class DiscardRenderer final : public FakeAudioDeviceModule::Renderer {
public:
explicit DiscardRenderer(int sampling_frequency_in_hz)
: sampling_frequency_in_hz_(sampling_frequency_in_hz) {}
int SamplingFrequency() const override {
return sampling_frequency_in_hz_;
}
bool Render(rtc::ArrayView<const int16_t>) override {
return true;
}
private:
int sampling_frequency_in_hz_;
};
// Next comes the core of the AudioDeviceModule. I use WebRTC's EventTimerWrapper and its cross-platform thread class to call the Java capture function periodically
std::unique_ptr<webrtc::EventTimerWrapper> tick_;
rtc::PlatformThread thread_;
// Constructor
FakeAudioDeviceModule::FakeAudioDeviceModule(std::unique_ptr<Capturer> capturer,
std::unique_ptr<Renderer> renderer,
float speed)
: capturer_(std::move(capturer)),
renderer_(std::move(renderer)),
speed_(speed),
audio_callback_(nullptr),
rendering_(false),
capturing_(false),
done_rendering_(true, true),
done_capturing_(true, true),
tick_(webrtc::EventTimerWrapper::Create()),
thread_(FakeAudioDeviceModule::Run, this, "FakeAudioDeviceModule") {
}
// Mainly sets rendering_ to true
int32_t FakeAudioDeviceModule::StartPlayout() {
rtc::CritScope cs(&lock_);
RTC_CHECK(renderer_);
rendering_ = true;
done_rendering_.Reset();
return 0;
}
// Mainly sets rendering_ to false
int32_t FakeAudioDeviceModule::StopPlayout() {
rtc::CritScope cs(&lock_);
rendering_ = false;
done_rendering_.Set();
return 0;
}
// Mainly sets capturing_ to true
int32_t FakeAudioDeviceModule::StartRecording() {
rtc::CritScope cs(&lock_);
WEBRTC_LOG("Start audio recording", INFO);
RTC_CHECK(capturer_);
capturing_ = true;
done_capturing_.Reset();
return 0;
}
// Mainly sets capturing_ to false
int32_t FakeAudioDeviceModule::StopRecording() {
rtc::CritScope cs(&lock_);
WEBRTC_LOG("Stop audio recording", INFO);
capturing_ = false;
done_capturing_.Set();
return 0;
}
// Set the EventTimer period and start the thread
int32_t FakeAudioDeviceModule::Init() {
RTC_CHECK(tick_->StartTimer(true, kFrameLengthMs / speed_));
thread_.Start();
thread_.SetPriority(rtc::kHighPriority);
return 0;
}
// Store the upper layer's audio callback; we use it later to hand over the audio data
int32_t FakeAudioDeviceModule::RegisterAudioCallback(webrtc::AudioTransport *callback) {
rtc::CritScope cs(&lock_);
RTC_DCHECK(callback || audio_callback_);
audio_callback_ = callback;
return 0;
}
bool FakeAudioDeviceModule::Run(void *obj) {
static_cast<FakeAudioDeviceModule *>(obj)->ProcessAudio();
return true;
}
void FakeAudioDeviceModule::ProcessAudio() {
{
rtc::CritScope cs(&lock_);
if (needDetachJvm) {
WEBRTC_LOG("In audio device module process audio", INFO);
}
auto start = std::chrono::steady_clock::now();
if (capturing_) {
// Capture 10ms of audio. 2 bytes per sample.
// Fetch the audio data
const bool keep_capturing = capturer_->Capture(&recording_buffer_);
uint32_t new_mic_level = 0;
if (keep_capturing) {
// Hand the data over through the callback: samples, sample count, bytes per sample, channel count, sampling rate, delay and so on
audio_callback_->RecordedDataIsAvailable(
recording_buffer_.data(), recording_buffer_.size(), 2, 1,
static_cast<const uint32_t>(capturer_->SamplingFrequency()), 0, 0, 0, false, new_mic_level);
}
// Stop capturing once there is no more audio data
if (!keep_capturing) {
capturing_ = false;
done_capturing_.Set();
}
}
if (rendering_) {
size_t samples_out;
int64_t elapsed_time_ms;
int64_t ntp_time_ms;
const int sampling_frequency = renderer_->SamplingFrequency();
// Fetch audio data from the upper layer
audio_callback_->NeedMorePlayData(
SamplesPerFrame(sampling_frequency), 2, 1, static_cast<const uint32_t>(sampling_frequency),
playout_buffer_.data(), samples_out, &elapsed_time_ms, &ntp_time_ms);
// Play the audio data
const bool keep_rendering = renderer_->Render(
rtc::ArrayView<const int16_t>(playout_buffer_.data(), samples_out));
if (!keep_rendering) {
rendering_ = false;
done_rendering_.Set();
}
}
auto end = std::chrono::steady_clock::now();
auto diff = std::chrono::duration<double, std::milli>(end - start).count();
if (diff > kFrameLengthMs) {
WEBRTC_LOG("JNI capture audio data timeout, real capture time is " + std::to_string(diff) + " ms", DEBUG);
}
// If the AudioDeviceModule is about to be destroyed, detach this thread from the JVM
if (capturer_->isJavaWrapper() && needDetachJvm && !detached2Jvm) {
DETACH_CURRENT_THREAD_IF_NEEDED();
detached2Jvm = true;
} else if (needDetachJvm) {
detached2Jvm = true;
}
}
// Wait until the timer fires; every 10 ms it triggers the next round of audio processing
tick_->Wait(WEBRTC_EVENT_INFINITE);
}
// Destructor
FakeAudioDeviceModule::~FakeAudioDeviceModule() {
WEBRTC_LOG("In audio device module FakeAudioDeviceModule", INFO);
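StopPlayout(); // stop playout
StopRecording(); // stop capturing
needDetachJvm = true; // ask the worker thread to detach from the JVM
// Wait until the worker thread has detached. Both flags are shared between threads,
// so they need to be std::atomic<bool> for this spin-wait to be well defined
while (!detached2Jvm) {
}
WEBRTC_LOG("In audio device module after detached2Jvm", INFO);
thread_.Stop();// stop the thread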
WEBRTC_LOG("In audio device module ~FakeAudioDeviceModule finished", INFO);
}
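The shutdown handshake in the destructor can be sketched without WebRTC as follows. This is only an illustration under assumptions: `PeriodicWorker` is a hypothetical class, `std::thread` replaces `rtc::PlatformThread`, and the per-thread cleanup is where the real code would detach from the JVM. It uses an atomic flag plus `join()` instead of spinning on a second flag.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

class PeriodicWorker {
 public:
  // `cleaned_up` is set by the worker thread after it has done its own
  // per-thread cleanup; it lets the caller observe that the cleanup ran.
  explicit PeriodicWorker(std::atomic<bool>* cleaned_up)
      : cleaned_up_(cleaned_up), thread_([this] { Loop(); }) {}

  ~PeriodicWorker() {
    stop_requested_.store(true);  // ask the worker to finish
    thread_.join();               // blocks until Loop() has returned
  }

 private:
  void Loop() {
    while (!stop_requested_.load()) {
      // Periodic work would go here.
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    // Cleanup that must run on the worker thread itself.
    cleaned_up_->store(true);
  }

  std::atomic<bool>* cleaned_up_;
  std::atomic<bool> stop_requested_{false};
  std::thread thread_;  // declared last so the flags exist before it starts
};
```

Because `join()` only returns after the worker has left its loop and finished its cleanup, the destructor needs no busy-wait, and the atomic flag avoids a data race between the two threads.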
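By the way, on the Java side I pass the audio data through direct memory (a direct ByteBuffer), mainly because it avoids an extra memory copy.
Capturing Video Data from Java
Capturing video data from Java is very similar to capturing audio. The differences are that the video capture module is injected when the VideoSource is created, and that the VideoCapturer has to be created on the signaling thread.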
//...
video_source = rtc->CreateVideoSource(rtc->CreateFakeVideoCapturerInSignalingThread());
//...
FakeVideoCapturer *RTC::CreateFakeVideoCapturerInSignalingThread() {
if (video_capturer) {
return signaling_thread->Invoke<FakeVideoCapturer *>(RTC_FROM_HERE,
rtc::Bind(&RTC::CreateFakeVideoCapturer, this,
video_capturer));
} else {
return nullptr;
}
}
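For readers unfamiliar with the "invoke on another thread and wait for the result" pattern used above, here is a minimal standalone sketch. `TaskThread` is a hypothetical class written for illustration; it is not `rtc::Thread` and only mimics the blocking behaviour of its `Invoke`.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

class TaskThread {
 public:
  TaskThread() : thread_([this] { Run(); }) {}

  ~TaskThread() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      quit_ = true;
    }
    cv_.notify_one();
    thread_.join();
  }

  // Runs `fn` on the owned thread and blocks the caller until it returns.
  template <typename T>
  T Invoke(std::function<T()> fn) {
    auto task = std::make_shared<std::packaged_task<T()>>(std::move(fn));
    std::future<T> result = task->get_future();
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push([task] { (*task)(); });
    }
    cv_.notify_one();
    return result.get();
  }

 private:
  void Run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return quit_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // quit requested and queue drained
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();  // run outside the lock so new tasks can be queued meanwhile
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  bool quit_ = false;
  std::thread thread_;  // declared last so the members above exist first
};
```

An object created inside the invoked function is constructed on the owned thread, which is the property the capturer creation above relies on.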
The VideoCapturer interface does not require much from us either: the key pieces are the main loop, start and stop. Here is my implementation.
// Constructor
FakeVideoCapturer::FakeVideoCapturer(jobject video_capturer)
: running_(false),
video_capturer(video_capturer),
is_screen_cast(false),
ticker(webrtc::EventTimerWrapper::Create()),
thread(FakeVideoCapturer::Run, this, "FakeVideoCapturer") {
// Look up the Java methods that will be used
JNIEnv *env = ATTACH_CURRENT_THREAD_IF_NEEDED();
video_capture_class = env->GetObjectClass(video_capturer);
get_width_method = env->GetMethodID(video_capture_class, "getWidth", "()I");
get_height_method = env->GetMethodID(video_capture_class, "getHeight", "()I");
get_fps_method = env->GetMethodID(video_capture_class, "getFps", "()I");
capture_method = env->GetMethodID(video_capture_class, "capture", "()Lpackage/name/of/rtc4j/model/VideoFrame;");
// The class is a local reference that is only needed for the lookups above, so release it right away
env->DeleteLocalRef(video_capture_class);
video_capture_class = nullptr;
width = env->CallIntMethod(video_capturer, get_width_method);
previous_width = width;
height = env->CallIntMethod(video_capturer, get_height_method);
previous_height = height;
fps = env->CallIntMethod(video_capturer, get_fps_method);
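// Announce the format handed to WebRTC: YUV420 (I420)
// Not static: a static array would keep the width, height and fps of the first instance
const cricket::VideoFormat formats[] = {
{width, height, cricket::VideoFormat::FpsToInterval(fps), cricket::FOURCC_I420}
};
SetSupportedFormats({&formats[0], &formats[arraysize(formats)]});
// Java delivers JPEG images, so I use libjpeg-turbo to decompress them into YUV420.
// The decompressor is created before the capture thread starts
decompress_handle = tjInitDecompress();
// Set the main-loop interval from the FPS reported by Java
RTC_CHECK(ticker->StartTimer(true, rtc::kNumMillisecsPerSec / fps));
thread.Start();
thread.SetPriority(rtc::kHighPriority);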
WEBRTC_LOG("Create fake video capturer, " + std::to_string(width) + ", " + std::to_string(height), INFO);
}
// Destructor
FakeVideoCapturer::~FakeVideoCapturer() {
thread.Stop();
SignalDestroyed(this);
// Release the Java resources
JNIEnv *env = ATTACH_CURRENT_THREAD_IF_NEEDED();
if (video_capture_class != nullptr) {
env->DeleteLocalRef(video_capture_class);
video_capture_class = nullptr;
}
// Release the decompressor
if (decompress_handle) {
if (tjDestroy(decompress_handle) != 0) {
WEBRTC_LOG("Release decompress handle failed, reason is: " + std::string(tjGetErrorStr2(decompress_handle)),
ERROR);
}
}
WEBRTC_LOG("Free fake video capturer", INFO);
}
bool FakeVideoCapturer::Run(void *obj) {
static_cast<FakeVideoCapturer *>(obj)->CaptureFrame();
return true;
}
void FakeVideoCapturer::CaptureFrame() {
{
rtc::CritScope cs(&lock_);
if (running_) {
int64_t t0 = rtc::TimeMicros();
JNIEnv *env = ATTACH_CURRENT_THREAD_IF_NEEDED();
// Fetch the next frame from the Java side
jobject java_video_frame = env->CallObjectMethod(video_capturer, capture_method);
if (java_video_frame == nullptr) { // if Java returns no image, deliver an all-black frame
rtc::scoped_refptr<webrtc::I420Buffer> buffer = webrtc::I420Buffer::Create(previous_width,
previous_height);
webrtc::I420Buffer::SetBlack(buffer.get());
OnFrame(webrtc::VideoFrame(buffer, (webrtc::VideoRotation) previous_rotation, t0), previous_width,
previous_height);
} else { // no early return, so the timer wait at the end of this function still paces the loop
// Java passes the image through direct memory
jobject java_data_buffer = env->CallObjectMethod(java_video_frame, GET_VIDEO_FRAME_BUFFER_GETTER_METHOD());
auto data_buffer = (unsigned char *) env->GetDirectBufferAddress(java_data_buffer);
auto length = (unsigned long) env->CallIntMethod(java_video_frame, GET_VIDEO_FRAME_LENGTH_GETTER_METHOD());
int rotation = env->CallIntMethod(java_video_frame, GET_VIDEO_FRAME_ROTATION_GETTER_METHOD());
int width;
int height;
// Decode the JPEG header to get the width and height
tjDecompressHeader(decompress_handle, data_buffer, length, &width, &height);
previous_width = width;
previous_height = height;
previous_rotation = rotation;
// Decompress into YUV420 with 32-byte aligned strides and hand the frame over. I use 32-byte alignment because encoding is more efficient that way, and the VideoToolbox encoder on macOS requires it
int chroma_width = (width + 1) / 2; // I420 chroma planes are half the width, rounded up
rtc::scoped_refptr<webrtc::I420Buffer> buffer =
webrtc::I420Buffer::Create(width, height,
width % 32 == 0 ? width : width / 32 * 32 + 32,
chroma_width % 32 == 0 ? chroma_width : chroma_width / 32 * 32 + 32,
chroma_width % 32 == 0 ? chroma_width : chroma_width / 32 * 32 + 32);
uint8_t *planes[] = {buffer->MutableDataY(), buffer->MutableDataU(), buffer->MutableDataV()};
int strides[] = {buffer->StrideY(), buffer->StrideU(), buffer->StrideV()};
tjDecompressToYUVPlanes(decompress_handle, data_buffer, length, planes, width, strides, height,
TJFLAG_FASTDCT | TJFLAG_NOREALLOC);
env->DeleteLocalRef(java_data_buffer);
env->DeleteLocalRef(java_video_frame);
// OnFrame is the function that hands the data to WebRTC
OnFrame(webrtc::VideoFrame(buffer, (webrtc::VideoRotation) rotation, t0), width, height);
}
}
}
ticker->Wait(WEBRTC_EVENT_INFINITE);
}
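The stride calculation used when creating the I420 buffer can be written more compactly. The sketch below is a standalone illustration; `AlignUp`, `I420Strides` and `AlignedI420Strides` are hypothetical names, not WebRTC functions. It rounds the chroma width up to `(width + 1) / 2`, which is what an I420 buffer needs for odd widths.

```cpp
// Rounds `value` up to the next multiple of `alignment` (alignment > 0).
constexpr int AlignUp(int value, int alignment) {
  return (value + alignment - 1) / alignment * alignment;
}

struct I420Strides {
  int y;
  int u;
  int v;
};

// Strides in bytes for an I420 frame of the given width, each rounded up to
// a multiple of `alignment`. The chroma planes are half the luma width,
// rounded up for odd widths.
constexpr I420Strides AlignedI420Strides(int width, int alignment) {
  return I420Strides{AlignUp(width, alignment),
                     AlignUp((width + 1) / 2, alignment),
                     AlignUp((width + 1) / 2, alignment)};
}

static_assert(AlignUp(640, 32) == 640, "already aligned values are unchanged");
static_assert(AlignUp(100, 32) == 128, "100 rounds up to 128");
```

For a 1280-pixel-wide frame this gives strides of 1280, 640 and 640 bytes, the same values as the conditional expressions in the capturer.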
// Start
cricket::CaptureState FakeVideoCapturer::Start(
const cricket::VideoFormat &format) {
//SetCaptureFormat(&format); This will cause crash in CentOS
running_ = true;
SetCaptureState(cricket::CS_RUNNING);
WEBRTC_LOG("Start fake video capturing", INFO);
return cricket::CS_RUNNING;
}
// Stop
void FakeVideoCapturer::Stop() {
running_ = false;
//SetCaptureFormat(nullptr); This will cause crash in CentOS
SetCaptureState(cricket::CS_STOPPED);
WEBRTC_LOG("Stop fake video capturing", INFO);
}
// YUV420
bool FakeVideoCapturer::GetPreferredFourccs(std::vector<uint32_t> *fourccs) {
fourccs->push_back(cricket::FOURCC_I420);
return true;
}
// Delegate to the default implementation
void FakeVideoCapturer::AddOrUpdateSink(rtc::VideoSinkInterface<webrtc::VideoFrame> *sink,
const rtc::VideoSinkWants &wants) {
cricket::VideoCapturer::AddOrUpdateSink(sink, wants);
}
void FakeVideoCapturer::RemoveSink(rtc::VideoSinkInterface<webrtc::VideoFrame> *sink) {
cricket::VideoCapturer::RemoveSink(sink);
}
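That concludes how to obtain audio and video data from the Java side. As you can see, it is not particularly difficult. Treat my implementation as a starting point that helps you understand this part of the flow more quickly.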
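About This Article
More articles are collected in 贝贝猫's article index.
Copyright notice: unless stated otherwise, all articles on this blog are licensed under BY-NC-SA. Please credit the source when reposting.
Authorship notice: this article draws on all of the references listed below, which may involve copying, modifying or adapting them. The images come from the internet; if anything infringes your rights, please contact me and I will remove it promptly.
References
[1] An alternative to JNI: using JNA to access functions outside Java
[2] Linux shared objects: the fPIC compiler option
[3] A summary of using JNI on Android
[4] FFmpeg repository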