libwebrtc is the open-source C++ library, created by Google, that implements the WebRTC protocol. WebRTC connects a local peer with one or more remote peers: the local peer sends audio/video tracks to the remote peers and also receives audio/video tracks from them.
In the context of Android, this article breaks down how the local peer creates its audio/video tracks, walking through the source code of react-native-webrtc and libwebrtc to see what happens behind the scenes.
- Set up the PeerConnectionFactory (source) – responsible for configuring the A/V codecs and the audio device; a distilled sketch follows the snippet below.
PeerConnectionFactory.initialize(
    PeerConnectionFactory.InitializationOptions.builder(reactContext)
        .setNativeLibraryLoader(new LibraryLoader())
        .setInjectableLogger(injectableLogger, loggingSeverity)
        .createInitializationOptions());

if (injectableLogger == null && loggingSeverity != null) {
    Logging.enableLogToDebugOutput(loggingSeverity);
}

if (encoderFactory == null || decoderFactory == null) {
    // Initialize EGL context required for HW acceleration.
    EglBase.Context eglContext = EglUtils.getRootEglBaseContext();

    if (eglContext != null) {
        encoderFactory
            = new DefaultVideoEncoderFactory(
                eglContext,
                /* enableIntelVp8Encoder */ true,
                /* enableH264HighProfile */ false);
        decoderFactory = new DefaultVideoDecoderFactory(eglContext);
    } else {
        encoderFactory = new SoftwareVideoEncoderFactory();
        decoderFactory = new SoftwareVideoDecoderFactory();
    }
}

if (adm == null) {
    adm = JavaAudioDeviceModule.builder(reactContext)
        .setEnableVolumeLogger(false)
        .createAudioDeviceModule();
}

mFactory
    = PeerConnectionFactory.builder()
        .setAudioDeviceModule(adm)
        .setVideoEncoderFactory(encoderFactory)
        .setVideoDecoderFactory(decoderFactory)
        .createPeerConnectionFactory();

// Saving the encoder and decoder factories to get codec info later when needed
mVideoEncoderFactory = encoderFactory;
mVideoDecoderFactory = decoderFactory;

getUserMediaImpl = new GetUserMediaImpl(this, reactContext);
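Stripped of the react-native-webrtc specifics (LibraryLoader, EglUtils, GetUserMediaImpl), the same setup boils down to a handful of libwebrtc calls. Below is a minimal sketch, assuming an Android application Context and an EglBase that you create and own yourself; the class and parameter names are only illustrative:

import android.content.Context;
import org.webrtc.DefaultVideoDecoderFactory;
import org.webrtc.DefaultVideoEncoderFactory;
import org.webrtc.EglBase;
import org.webrtc.PeerConnectionFactory;
import org.webrtc.audio.JavaAudioDeviceModule;

final class FactoryBootstrap {
    static PeerConnectionFactory create(Context appContext, EglBase eglBase) {
        // One-time, process-wide initialization (loads and initializes the native library).
        PeerConnectionFactory.initialize(
            PeerConnectionFactory.InitializationOptions.builder(appContext)
                .createInitializationOptions());

        // The HW codec factories share the EGL context so frames can stay as textures.
        return PeerConnectionFactory.builder()
            .setAudioDeviceModule(
                JavaAudioDeviceModule.builder(appContext).createAudioDeviceModule())
            .setVideoEncoderFactory(new DefaultVideoEncoderFactory(
                eglBase.getEglBaseContext(),
                /* enableIntelVp8Encoder= */ true,
                /* enableH264HighProfile= */ false))
            .setVideoDecoderFactory(
                new DefaultVideoDecoderFactory(eglBase.getEglBaseContext()))
            .createPeerConnectionFactory();
    }
}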
- Create the PeerConnection (source) — creates a peer; a plain-API sketch follows the snippet below.
public void peerConnectionInit(ReadableMap configuration, int id) {
    PeerConnection.RTCConfiguration rtcConfiguration = parseRTCConfiguration(configuration);

    try {
        ThreadUtils.submitToExecutor(() -> {
            PeerConnectionObserver observer = new PeerConnectionObserver(this, id);
            PeerConnection peerConnection = mFactory.createPeerConnection(rtcConfiguration, observer);

            observer.setPeerConnection(peerConnection);
            mPeerConnectionObservers.put(id, observer);
        }).get();
    } catch (ExecutionException | InterruptedException e) {
        e.printStackTrace();
        throw new RuntimeException(e);
    }
}
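parseRTCConfiguration and PeerConnectionObserver are react-native-webrtc helpers. With the plain libwebrtc Java API the same step looks roughly like the sketch below; the STUN URL, the Unified Plan choice and the injected observer are illustrative assumptions rather than anything mandated by the library:

import java.util.Collections;
import java.util.List;
import org.webrtc.PeerConnection;
import org.webrtc.PeerConnectionFactory;

final class PeerBootstrap {
    static PeerConnection create(PeerConnectionFactory factory, PeerConnection.Observer observer) {
        // ICE servers used for candidate gathering; a public STUN server as a placeholder.
        List<PeerConnection.IceServer> iceServers = Collections.singletonList(
            PeerConnection.IceServer.builder("stun:stun.l.google.com:19302").createIceServer());

        PeerConnection.RTCConfiguration config = new PeerConnection.RTCConfiguration(iceServers);
        config.sdpSemantics = PeerConnection.SdpSemantics.UNIFIED_PLAN;

        // The observer receives ICE candidates, added tracks, connection state changes, etc.
        return factory.createPeerConnection(config, observer);
    }
}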
- Create the video track – we capture frames from the camera as textures and render them onto the surface provided by libwebrtc. We can also render textures from sources other than the camera. Here, the (OpenGL) texture acts as the input for the video track; a standalone capturer sketch follows the snippet below.
VideoTrack createVideoTrack(AbstractVideoCaptureController videoCaptureController) {
    videoCaptureController.initializeVideoCapturer();

    VideoCapturer videoCapturer = videoCaptureController.videoCapturer;
    if (videoCapturer == null) {
        return null;
    }

    PeerConnectionFactory pcFactory = webRTCModule.mFactory;
    EglBase.Context eglContext = EglUtils.getRootEglBaseContext();
    SurfaceTextureHelper surfaceTextureHelper =
        SurfaceTextureHelper.create("CaptureThread", eglContext);

    if (surfaceTextureHelper == null) {
        Log.d(TAG, "Error creating SurfaceTextureHelper");
        return null;
    }

    String id = UUID.randomUUID().toString();

    TrackCapturerEventsEmitter eventsEmitter = new TrackCapturerEventsEmitter(webRTCModule, id);
    videoCaptureController.setCapturerEventsListener(eventsEmitter);

    VideoSource videoSource = pcFactory.createVideoSource(videoCapturer.isScreencast());
    videoCapturer.initialize(surfaceTextureHelper, reactContext, videoSource.getCapturerObserver());

    VideoTrack track = pcFactory.createVideoTrack(id, videoSource);
    track.setEnabled(true);

    tracks.put(id, new TrackPrivate(track, videoSource, videoCaptureController, surfaceTextureHelper));

    videoCaptureController.startCapture();

    return track;
}
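AbstractVideoCaptureController, TrackCapturerEventsEmitter and TrackPrivate are react-native-webrtc classes; the libwebrtc building blocks underneath are VideoCapturer, SurfaceTextureHelper, VideoSource and VideoTrack. A minimal camera-based sketch, assuming a Camera2-capable device and placeholder capture parameters, could look like this:

import android.content.Context;
import java.util.UUID;
import org.webrtc.Camera2Enumerator;
import org.webrtc.CameraVideoCapturer;
import org.webrtc.EglBase;
import org.webrtc.PeerConnectionFactory;
import org.webrtc.SurfaceTextureHelper;
import org.webrtc.VideoSource;
import org.webrtc.VideoTrack;

final class CameraTrackBootstrap {
    static VideoTrack create(Context context, PeerConnectionFactory factory, EglBase eglBase) {
        // Pick the first available camera; a real app would choose front/back explicitly.
        Camera2Enumerator enumerator = new Camera2Enumerator(context);
        String deviceName = enumerator.getDeviceNames()[0];
        CameraVideoCapturer capturer = enumerator.createCapturer(deviceName, /* eventsHandler= */ null);

        // The capturer delivers camera frames into this SurfaceTexture as OpenGL textures.
        SurfaceTextureHelper helper =
            SurfaceTextureHelper.create("CaptureThread", eglBase.getEglBaseContext());

        VideoSource source = factory.createVideoSource(capturer.isScreencast());
        capturer.initialize(helper, context, source.getCapturerObserver());

        VideoTrack track = factory.createVideoTrack(UUID.randomUUID().toString(), source);
        track.setEnabled(true);

        // Placeholder resolution/frame rate; pick values the device actually supports.
        capturer.startCapture(/* width= */ 1280, /* height= */ 720, /* framerate= */ 30);
        return track;
    }
}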
- Create the audio track (source) — this involves capturing audio in Java and then sending the captured PCM data to the native C++ code for processing. Let's see how this is implemented.
On Android, libwebrtc supports capturing audio from the following sources:
- OpenSL ES (create an OpenSLESAudioDeviceModule)
- AAudio (create an AAudioAudioDeviceModule)
- Java audio (CreateJavaAudioDeviceModule from the C++ side, or JavaAudioDeviceModule from the Java side)
We will look at capturing audio with JavaAudioDeviceModule, since we will be pushing audio from the Java side.
When creating the PeerConnectionFactory, a JavaAudioDeviceModule must be created and set as the audio device. The JavaAudioDeviceModule is responsible for capturing PCM data from android.media.AudioRecord and creating the audio track.
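In practice that means building the module with JavaAudioDeviceModule.builder(...) and handing it to PeerConnectionFactory.builder().setAudioDeviceModule(...). A minimal sketch; the hardware-effect flags shown here are illustrative options, not required settings:

import android.content.Context;
import org.webrtc.PeerConnectionFactory;
import org.webrtc.audio.AudioDeviceModule;
import org.webrtc.audio.JavaAudioDeviceModule;

final class AdmBootstrap {
    static PeerConnectionFactory createFactory(Context appContext) {
        // Capture and playout go through android.media.AudioRecord / AudioTrack.
        AudioDeviceModule adm = JavaAudioDeviceModule.builder(appContext)
            .setUseHardwareAcousticEchoCanceler(true)   // prefer the built-in AEC when available
            .setUseHardwareNoiseSuppressor(true)        // prefer the built-in NS when available
            .createAudioDeviceModule();

        return PeerConnectionFactory.builder()
            .setAudioDeviceModule(adm)
            .createPeerConnectionFactory();
    }
}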
Let's look at JavaAudioDeviceModule's createAudioDeviceModule function. It creates the JavaAudioDeviceModule from a WebRtcAudioRecord (audio input) and a WebRtcAudioTrack (audio output).
/**
 * Construct an AudioDeviceModule based on the supplied arguments. The caller takes ownership
 * and is responsible for calling release().
 */
public JavaAudioDeviceModule createAudioDeviceModule() {
    Logging.d(TAG, "createAudioDeviceModule");
    if (useHardwareNoiseSuppressor) {
        Logging.d(TAG, "HW NS will be used.");
    } else {
        if (isBuiltInNoiseSuppressorSupported()) {
            Logging.d(TAG, "Overriding default behavior; now using WebRTC NS!");
        }
        Logging.d(TAG, "HW NS will not be used.");
    }
    if (useHardwareAcousticEchoCanceler) {
        Logging.d(TAG, "HW AEC will be used.");
    } else {
        if (isBuiltInAcousticEchoCancelerSupported()) {
            Logging.d(TAG, "Overriding default behavior; now using WebRTC AEC!");
        }
        Logging.d(TAG, "HW AEC will not be used.");
    }
    // Low-latency mode was introduced in API version 26, see
    // https://developer.android.com/reference/android/media/AudioTrack#PERFORMANCE_MODE_LOW_LATENCY
    final int MIN_LOW_LATENCY_SDK_VERSION = 26;
    if (useLowLatency && Build.VERSION.SDK_INT >= MIN_LOW_LATENCY_SDK_VERSION) {
        Logging.d(TAG, "Low latency mode will be used.");
    }
    ScheduledExecutorService executor = this.scheduler;
    if (executor == null) {
        executor = WebRtcAudioRecord.newDefaultScheduler();
    }
    final WebRtcAudioRecord audioInput = new WebRtcAudioRecord(context, executor, audioManager,
        audioSource, audioFormat, audioRecordErrorCallback, audioRecordStateCallback,
        samplesReadyCallback, useHardwareAcousticEchoCanceler, useHardwareNoiseSuppressor);
    final WebRtcAudioTrack audioOutput =
        new WebRtcAudioTrack(context, audioManager, audioAttributes, audioTrackErrorCallback,
            audioTrackStateCallback, useLowLatency, enableVolumeLogger);
    return new JavaAudioDeviceModule(context, audioManager, audioInput, audioOutput,
        inputSampleRate, outputSampleRate, useStereoInput, useStereoOutput);
}
WebRtcAudioRecord.java holds a pointer to the native audio record (C++) and transfers the captured PCM bytes to it by calling the nativeDataIsRecorded method:
while (keepAlive) {
    int bytesRead = audioRecord.read(byteBuffer, byteBuffer.capacity());
    if (bytesRead == byteBuffer.capacity()) {
        if (microphoneMute) {
            byteBuffer.clear();
            byteBuffer.put(emptyBytes);
        }
        // It's possible we've been shut down during the read, and stopRecording() tried and
        // failed to join this thread. To be a bit safer, try to avoid calling any native methods
        // in case they've been unregistered after stopRecording() returned.
        if (keepAlive) {
            long captureTimeNs = 0;
            if (Build.VERSION.SDK_INT >= 24) {
                if (audioRecord.getTimestamp(audioTimestamp, AudioTimestamp.TIMEBASE_MONOTONIC)
                    == AudioRecord.SUCCESS) {
                    captureTimeNs = audioTimestamp.nanoTime;
                }
            }
            nativeDataIsRecorded(nativeAudioRecord, bytesRead, captureTimeNs);
        }
        if (audioSamplesReadyCallback != null) {
            // Copy the entire byte buffer array. The start of the byteBuffer is not necessarily
            // at index 0.
            byte[] data = Arrays.copyOfRange(byteBuffer.array(), byteBuffer.arrayOffset(),
                byteBuffer.capacity() + byteBuffer.arrayOffset());
            audioSamplesReadyCallback.onWebRtcAudioRecordSamplesReady(
                new JavaAudioDeviceModule.AudioSamples(audioRecord.getAudioFormat(),
                    audioRecord.getChannelCount(), audioRecord.getSampleRate(), data));
        }
    } else {
        String errorMessage = "AudioRecord.read failed: " + bytesRead;
        Logging.e(TAG, errorMessage);
        if (bytesRead == AudioRecord.ERROR_INVALID_OPERATION) {
            keepAlive = false;
            reportWebRtcAudioRecordError(errorMessage);
        }
    }
}
So WebRtcAudioRecord.java captures PCM data from android.media.AudioRecord and sends it through nativeDataIsRecorded to the native C++ library for processing.
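The audioSamplesReadyCallback branch in the loop above is also the hook an application can use to observe the captured PCM: register a SamplesReadyCallback on the builder. A small sketch, where the body of the callback is only a placeholder for real processing:

import android.content.Context;
import org.webrtc.audio.AudioDeviceModule;
import org.webrtc.audio.JavaAudioDeviceModule;

final class TappedAdm {
    static AudioDeviceModule create(Context appContext) {
        return JavaAudioDeviceModule.builder(appContext)
            .setSamplesReadyCallback(samples -> {
                // Called on the recording thread with a copy of each captured buffer.
                byte[] pcm = samples.getData();
                int sampleRate = samples.getSampleRate();
                int channels = samples.getChannelCount();
                // Placeholder: feed the PCM into your own recorder, analyzer, etc.
            })
            .createAudioDeviceModule();
    }
}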
Let's see how WebRtcAudioRecord.java and the native code are hooked together.
WebRtcAudioRecord.java receives the native audio record pointer through setNativeAudioRecord:
public void setNativeAudioRecord(long nativeAudioRecord) {
    this.nativeAudioRecord = nativeAudioRecord;
}
Now let's see who calls the setNativeAudioRecord method. It is called from the constructor of AudioRecordJni in audio_record_jni.cc (see Java_WebRtcAudioRecord_setNativeAudioRecord below):
AudioRecordJni::AudioRecordJni(JNIEnv* env,
                               const AudioParameters& audio_parameters,
                               int total_delay_ms,
                               const JavaRef<jobject>& j_audio_record)
    : j_audio_record_(env, j_audio_record),
      audio_parameters_(audio_parameters),
      total_delay_ms_(total_delay_ms),
      direct_buffer_address_(nullptr),
      direct_buffer_capacity_in_bytes_(0),
      frames_per_buffer_(0),
      initialized_(false),
      recording_(false),
      audio_device_buffer_(nullptr) {
  RTC_LOG(LS_INFO) << "ctor";
  RTC_DCHECK(audio_parameters_.is_valid());
  Java_WebRtcAudioRecord_setNativeAudioRecord(env, j_audio_record_,
                                              jni::jlongFromPointer(this));
  // Detach from this thread since construction is allowed to happen on a
  // different thread.
  thread_checker_.Detach();
  thread_checker_java_.Detach();
}
AudioRecordJni (C++) extends AudioInput (C++), and therefore acts as the audio input:
class AudioRecordJni : public AudioInput
AudioRecordJni is created in the JNI_JavaAudioDeviceModule_CreateAudioDeviceModule method in java_audio_device_module.cc:
static jlong JNI_JavaAudioDeviceModule_CreateAudioDeviceModule(
    JNIEnv* env,
    const JavaParamRef<jobject>& j_context,
    const JavaParamRef<jobject>& j_audio_manager,
    const JavaParamRef<jobject>& j_webrtc_audio_record,
    const JavaParamRef<jobject>& j_webrtc_audio_track,
    int input_sample_rate,
    int output_sample_rate,
    jboolean j_use_stereo_input,
    jboolean j_use_stereo_output) {
  AudioParameters input_parameters;
  AudioParameters output_parameters;
  GetAudioParameters(env, j_context, j_audio_manager, input_sample_rate,
                     output_sample_rate, j_use_stereo_input,
                     j_use_stereo_output, &input_parameters,
                     &output_parameters);
  auto audio_input = std::make_unique<AudioRecordJni>(
      env, input_parameters, kHighLatencyModeDelayEstimateInMilliseconds,
      j_webrtc_audio_record);
  auto audio_output = std::make_unique<AudioTrackJni>(env, output_parameters,
                                                      j_webrtc_audio_track);
  return jlongFromPointer(CreateAudioDeviceModuleFromInputAndOutput(
                              AudioDeviceModule::kAndroidJavaAudio,
                              j_use_stereo_input, j_use_stereo_output,
                              kHighLatencyModeDelayEstimateInMilliseconds,
                              std::move(audio_input), std::move(audio_output))
                              .release());
}
When we call nativeCreateAudioDeviceModule from JavaAudioDeviceModule.java, JNI_JavaAudioDeviceModule_CreateAudioDeviceModule gets invoked (that is, nativeCreateAudioDeviceModule is mapped to JNI_JavaAudioDeviceModule_CreateAudioDeviceModule through code generation).
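On the Java side the counterpart is just a native method declaration that the generated JNI glue binds to the C++ function above. Its shape can be inferred from the call site shown next; the declaration below is a reconstruction for illustration, not a verbatim copy of JavaAudioDeviceModule.java:

// Reconstructed from the call site and the JNI parameters above.
private static native long nativeCreateAudioDeviceModule(Context context, AudioManager audioManager,
    WebRtcAudioRecord audioInput, WebRtcAudioTrack audioOutput, int inputSampleRate,
    int outputSampleRate, boolean useStereoInput, boolean useStereoOutput);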
Now let's see where nativeCreateAudioDeviceModule is called in JavaAudioDeviceModule.java. The method getNativeAudioDeviceModulePointer in JavaAudioDeviceModule.java calls nativeCreateAudioDeviceModule:
public long getNativeAudioDeviceModulePointer() {
    synchronized (nativeLock) {
        if (nativeAudioDeviceModule == 0) {
            nativeAudioDeviceModule = nativeCreateAudioDeviceModule(context, audioManager, audioInput,
                audioOutput, inputSampleRate, outputSampleRate, useStereoInput, useStereoOutput);
        }
        return nativeAudioDeviceModule;
    }
}
Continuing, let's see who calls getNativeAudioDeviceModulePointer in JavaAudioDeviceModule.java; it is called in the createPeerConnectionFactory method of PeerConnectionFactory.java:
public PeerConnectionFactory createPeerConnectionFactory() {
    checkInitializeHasBeenCalled();
    if (audioDeviceModule == null) {
        audioDeviceModule = JavaAudioDeviceModule.builder(ContextUtils.getApplicationContext())
            .createAudioDeviceModule();
    }
    return nativeCreatePeerConnectionFactory(ContextUtils.getApplicationContext(), options,
        audioDeviceModule.getNativeAudioDeviceModulePointer(),
        audioEncoderFactoryFactory.createNativeAudioEncoderFactory(),
        audioDecoderFactoryFactory.createNativeAudioDecoderFactory(), videoEncoderFactory,
        videoDecoderFactory,
        audioProcessingFactory == null ? 0 : audioProcessingFactory.createNative(),
        fecControllerFactoryFactory == null ? 0 : fecControllerFactoryFactory.createNative(),
        networkControllerFactoryFactory == null
            ? 0
            : networkControllerFactoryFactory.createNativeNetworkControllerFactory(),
        networkStatePredictorFactoryFactory == null
            ? 0
            : networkStatePredictorFactoryFactory.createNativeNetworkStatePredictorFactory(),
        neteqFactoryFactory == null ? 0 : neteqFactoryFactory.createNativeNetEqFactory());
}
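With the factory and the audio device module wired together, the Java side creates the actual audio track from an AudioSource backed by that module and attaches it to the peer connection. A minimal sketch; the track id "audio0" and the empty constraints are placeholders:

import org.webrtc.AudioSource;
import org.webrtc.AudioTrack;
import org.webrtc.MediaConstraints;
import org.webrtc.PeerConnection;
import org.webrtc.PeerConnectionFactory;

final class AudioTrackBootstrap {
    static AudioTrack addLocalAudio(PeerConnectionFactory factory, PeerConnection peerConnection) {
        // Constraints can carry audio-processing hints; empty means library defaults.
        AudioSource source = factory.createAudioSource(new MediaConstraints());
        AudioTrack track = factory.createAudioTrack("audio0", source);
        track.setEnabled(true);

        // Attach the track to the peer connection so it gets encoded and sent.
        peerConnection.addTrack(track);
        return track;
    }
}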
In this article, we have seen how video and audio data are passed from the Java side to the C++ side on Android.
libwebrtc then encodes this audio/video data and sends it to the remote peers.
Author: Selvan
Original article: https://morsetree.medium.com/libwebrtc-in-android-88905b1ce453
This article was contributed by the author; copyright belongs to the original author. To reprint, please credit the source: https://www.nxrte.com/jishu/webrtc/41209.html