Last updated: April 9, 2025 (afternoon)
Preface
It's been a long time since I wrote a blog post; almost all of my energy lately has gone into settling into a new job. Today I finally have some time to sort out a feature I've been working on and write it down to reinforce the memory. The requirement was to add screen recording to our rendering engine: write the textures rendered by the engine, along with audio captured from the microphone, into a local mp4 file.
Implementation Approach
One option is to record the video and audio separately: use AVAssetWriter to write the RGB data called back from the underlying engine into an mp4 file, record a separate audio track to a local m4a (with AVAudioRecorder, for example), and then merge the two with AVAssetExportSession. This would certainly work, but the process is rather convoluted, and the audio and video go through multiple rounds of encoding and decoding, which is very inefficient.
AVAssetWriter supports writing audio and video in real time; the only part that takes some care is audio/video synchronization. Capturing audio with AVCaptureSession is straightforward: the callback delivers audio as CMSampleBufferRef, which carries timestamp information. That timestamp does not start from 0, so when the first audio frame arrives we record its timestamp as the base time for video frame timestamps; every subsequent video frame's timestamp is then this base time plus the elapsed system time.
Code
First comes the initialization of AVAssetWriter. There are two cases: if the user has not granted microphone permission, only the engine's video texture is recorded; if they have, both the texture and the audio are recorded:
```objc
- (BOOL)setupAssetWriter:(RecorderSourceType)sourceType {
    _sourceType = sourceType;
    NSError *error;
    if ([[NSFileManager defaultManager] fileExistsAtPath:_outputFilePath]) {
        [[NSFileManager defaultManager] removeItemAtPath:_outputFilePath error:&error];
        if (error) {
            NSLog(@"Remove item at outputFilePath failed: %@", error.localizedDescription);
            return NO;
        }
    }
    _assetWriter = [[AVAssetWriter alloc] initWithURL:[NSURL fileURLWithPath:_outputFilePath]
                                             fileType:AVFileTypeMPEG4
                                                error:&error];
    if (error) {
        NSLog(@"AVAssetWriter create failed: %@", error.localizedDescription);
        return NO;
    }

    if (sourceType & RecorderSourceType_Video) {
        CGSize outputSize = CGSizeMake(GetWidth(), GetHeight());
        NSInteger numPixels = outputSize.width * outputSize.height;
        CGFloat bitrateFactorPerPixel = 3.0;
        NSInteger bitrate = numPixels * bitrateFactorPerPixel;
        NSInteger fps = 30;
        NSInteger keyFrameInterval = fps * 3;
        NSDictionary *videoCompressionProperties = @{AVVideoAverageBitRateKey : @(bitrate),
                                                     AVVideoExpectedSourceFrameRateKey : @(fps),
                                                     AVVideoMaxKeyFrameIntervalKey : @(keyFrameInterval),
                                                     AVVideoProfileLevelKey : AVVideoProfileLevelH264BaselineAutoLevel};
        NSDictionary *videoCompressionSettings = @{AVVideoCodecKey : AVVideoCodecTypeH264,
                                                   AVVideoScalingModeKey : AVVideoScalingModeResizeAspectFill,
                                                   AVVideoWidthKey : @(outputSize.width),
                                                   AVVideoHeightKey : @(outputSize.height),
                                                   AVVideoCompressionPropertiesKey : videoCompressionProperties};
        if ([_assetWriter canApplyOutputSettings:videoCompressionSettings forMediaType:AVMediaTypeVideo]) {
            _videoInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo
                                                             outputSettings:videoCompressionSettings];
            _videoInput.expectsMediaDataInRealTime = YES;
            NSDictionary *sourcePixelBufferAttributes = @{(NSString *)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_32BGRA),
                                                          (NSString *)kCVPixelBufferWidthKey : @(outputSize.width),
                                                          (NSString *)kCVPixelBufferHeightKey : @(outputSize.height)};
            _videoAdaptor = [AVAssetWriterInputPixelBufferAdaptor assetWriterInputPixelBufferAdaptorWithAssetWriterInput:_videoInput
                                                                                             sourcePixelBufferAttributes:sourcePixelBufferAttributes];
            if ([_assetWriter canAddInput:_videoInput]) {
                [_assetWriter addInput:_videoInput];
            }
        }
    }

    if (sourceType & RecorderSourceType_Audio) {
        NSInteger sampleRate = 48000;
        NSInteger channelCount = 1;
        NSInteger bitrate = 64000;
        NSInteger bitDepth = 16;
        NSDictionary *audioCompressionSettings = @{AVFormatIDKey : @(kAudioFormatMPEG4AAC),
                                                   AVEncoderBitRatePerChannelKey : @(bitrate),
                                                   AVSampleRateKey : @(sampleRate),
                                                   AVNumberOfChannelsKey : @(channelCount)};
        if ([_assetWriter canApplyOutputSettings:audioCompressionSettings forMediaType:AVMediaTypeAudio]) {
            _audioInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeAudio
                                                             outputSettings:audioCompressionSettings];
            _audioInput.expectsMediaDataInRealTime = YES;
            if ([_assetWriter canAddInput:_audioInput]) {
                [_assetWriter addInput:_audioInput];
            }
        }
    }
    return YES;
}
```
The code above initializes AVAssetWriter and configures the audio and video inputs for each scenario. If microphone permission has been granted, we also need to set up an AVCaptureSession:
```objc
- (BOOL)setupAudioRecorder {
    NSError *error;
    [[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryPlayAndRecord error:&error];
    [[AVAudioSession sharedInstance] setActive:YES error:&error];
    if (error) {
        NSLog(@"AVAudioSession set category & active error: %@", error.localizedDescription);
        return NO;
    }
    _audioBufferQueue = dispatch_queue_create("com.audioBuffer.IPhoneScreenRecorder", DISPATCH_QUEUE_SERIAL);
    _captureSession = [[AVCaptureSession alloc] init];
    AVCaptureDevice *audioDevice = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeAudio];
    _audioDeviceInput = [AVCaptureDeviceInput deviceInputWithDevice:audioDevice error:&error];
    if (error) {
        NSLog(@"AVCaptureDeviceInput create failed: %@", error.localizedDescription);
        return NO;
    }
    _audioDataOutput = [[AVCaptureAudioDataOutput alloc] init];
    [_audioDataOutput setSampleBufferDelegate:self queue:_audioBufferQueue];
    if ([_captureSession canAddInput:_audioDeviceInput]) {
        [_captureSession addInput:_audioDeviceInput];
    }
    if ([_captureSession canAddOutput:_audioDataOutput]) {
        [_captureSession addOutput:_audioDataOutput];
    }
    [_captureSession startRunning];
    return YES;
}
```
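The two setup paths above assume we already know whether the user has granted microphone access, but the post doesn't show that check. A minimal sketch of how one might decide which source type to record, assuming a hypothetical helper method (the name and completion-based flow here are mine, not from the original code):

```objc
// Hypothetical helper, not part of the original post: ask for microphone
// permission and pick the recording sources accordingly before calling
// setupAssetWriter: / setupAudioRecorder.
- (void)requestMicPermissionThenSetup {
    [[AVAudioSession sharedInstance] requestRecordPermission:^(BOOL granted) {
        dispatch_async(dispatch_get_main_queue(), ^{
            if (granted) {
                [self setupAssetWriter:(RecorderSourceType_Video | RecorderSourceType_Audio)];
                [self setupAudioRecorder];
            } else {
                // No microphone access: record the engine texture only.
                [self setupAssetWriter:RecorderSourceType_Video];
            }
        });
    }];
}
```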
With initialization done, the next things to handle are when to start writing and how to keep the audio and video timestamps in sync. processVideoBuffer is the interface through which the engine calls back texture data, and it again handles the two scenarios separately:
```objc
- (void)processVideoBuffer:(void *)data width:(int)width height:(int)height {
    if (!_isRecording) return;
    if (!_canWrite) return;
    dispatch_async(_operationQueue, ^{
        if (_sourceType & RecorderSourceType_Audio) {
            if (_timeBase == 0) {
                return;
            }
            TimeFormat currentTimeStamp = GetTimeStampOffset(_timeBase) + _timeBase;
            CMTime pts = CMTimeMake(currentTimeStamp, 1000);
            CVPixelBufferRef pixelBuffer = CreatePixelBuffer(data, width, height);
            if (pixelBuffer == NULL) return;
            [self appendVideoPixelBuffer:pixelBuffer pts:pts];
        } else {
            if (_timeBase == 0) {
                _timeBase = GetTimeStampInMS();
                [_assetWriter startWriting];
                [_assetWriter startSessionAtSourceTime:kCMTimeZero];
            }
            TimeFormat currentTimeStamp = GetTimeStampOffset(_timeBase);
            CMTime pts = CMTimeMake(currentTimeStamp, 1000);
            CVPixelBufferRef pixelBuffer = CreatePixelBuffer(data, width, height);
            if (pixelBuffer == NULL) return;
            [self appendVideoPixelBuffer:pixelBuffer pts:pts];
        }
    });
}
```
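CreatePixelBuffer isn't shown in the post. A minimal sketch of what it might look like, assuming the engine hands back tightly packed 32-bit BGRA data matching the adaptor's kCVPixelFormatType_32BGRA settings (the real implementation could just as well pull buffers from the adaptor's pixel buffer pool):

```objc
#import <CoreVideo/CoreVideo.h>

// Hypothetical sketch of CreatePixelBuffer, not from the original post.
// Returns a +1 retained buffer; the caller releases it in appendVideoPixelBuffer.
static CVPixelBufferRef CreatePixelBuffer(void *data, int width, int height) {
    CVPixelBufferRef pixelBuffer = NULL;
    NSDictionary *attrs = @{(NSString *)kCVPixelBufferIOSurfacePropertiesKey : @{}};
    CVReturn ret = CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                                       kCVPixelFormatType_32BGRA,
                                       (__bridge CFDictionaryRef)attrs,
                                       &pixelBuffer);
    if (ret != kCVReturnSuccess) {
        return NULL;
    }
    CVPixelBufferLockBaseAddress(pixelBuffer, 0);
    uint8_t *dst = (uint8_t *)CVPixelBufferGetBaseAddress(pixelBuffer);
    size_t dstBytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer);
    size_t srcBytesPerRow = (size_t)width * 4;
    uint8_t *src = (uint8_t *)data;
    // Copy row by row because the pixel buffer's stride may be padded.
    for (int row = 0; row < height; row++) {
        memcpy(dst + row * dstBytesPerRow, src + row * srcBytesPerRow, srcBytesPerRow);
    }
    CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
    return pixelBuffer;
}
```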
If only video is being recorded, there is no timestamp alignment to worry about: on the first frame after AVAssetWriter has been initialized we can record the time base and start writing. If audio also needs to be written, the audio and video must share the same time base. Since the CMSampleBufferRef from the audio callback already carries a timestamp, we might as well use the first audio frame's timestamp as the video's time base and then keep adding the elapsed system time on top of it. That is why, when _timeBase is still 0 here, we have to wait for the first audio frame to arrive. Now let's look at how the audio callback handles it:
```objc
#pragma mark - <AVCaptureAudioDataOutputSampleBufferDelegate>

- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    if (!_canWrite) return;
    dispatch_sync(_operationQueue, ^{
        if (_timeBase == 0) {
            CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
            _timeBase = (TimeFormat)(CMTimeGetSeconds(pts) * 1000);
            [_assetWriter startWriting];
            [_assetWriter startSessionAtSourceTime:CMSampleBufferGetPresentationTimeStamp(sampleBuffer)];
        }
        [self appendAudioSampleBuffer:sampleBuffer];
    });
}
```
Once AVAssetWriter is initialized, the timestamp carried by the first audio frame we receive is assigned to _timeBase as the time base and the writing session is started; startSessionAtSourceTime must be given that first audio frame's timestamp. In processVideoBuffer, video frames are likewise written relative to the first audio frame's timestamp, which gives us a simple, fairly rough timestamp synchronization.
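The post never shows GetTimeStampInMS, GetTimeStampOffset, or the TimeFormat type. A minimal sketch of how they could be implemented, assuming TimeFormat is a millisecond count and that the helpers read the same host clock that AVCaptureSession uses to stamp its sample buffers:

```objc
#import <QuartzCore/QuartzCore.h>

// Hypothetical helpers, not from the original post.
typedef int64_t TimeFormat; // milliseconds

// Current host time in milliseconds. CACurrentMediaTime() is derived from
// mach_absolute_time, which is the same clock family AVCaptureSession uses
// for its presentation timestamps.
static TimeFormat GetTimeStampInMS(void) {
    return (TimeFormat)(CACurrentMediaTime() * 1000.0);
}

// Milliseconds elapsed since the given base time.
static TimeFormat GetTimeStampOffset(TimeFormat baseTime) {
    return GetTimeStampInMS() - baseTime;
}
```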
Finally, appending the audio and video buffers:
```objc
- (void)appendVideoPixelBuffer:(CVPixelBufferRef)pixelBuffer pts:(CMTime)pts {
    if (!_isRecording || _assetWriter.status != AVAssetWriterStatusWriting) {
        CFRelease(pixelBuffer);
        return;
    }
    dispatch_async(_writingQueue, ^{
        AVAssetWriterInput *input = _videoInput;
        if (input.readyForMoreMediaData) {
            BOOL success = [_videoAdaptor appendPixelBuffer:pixelBuffer withPresentationTime:pts];
            if (!success) {
                [self stopRecordingWithCompletion:nil];
                NSError *error = _assetWriter.error;
                NSLog(@"AVAssetWriterInputPixelBufferAdaptor appendPixelBuffer failed with error: %@", error.localizedDescription);
            }
        } else {
            NSLog(@"Video input not ready for more media data, dropping buffer");
        }
        CFRelease(pixelBuffer);
    });
}

- (void)appendAudioSampleBuffer:(CMSampleBufferRef)sampleBuffer {
    if (!_isRecording || _assetWriter.status != AVAssetWriterStatusWriting) {
        return;
    }
    CFRetain(sampleBuffer);
    dispatch_async(_writingQueue, ^{
        if (_audioInput.readyForMoreMediaData) {
            BOOL success = [_audioInput appendSampleBuffer:sampleBuffer];
            if (!success) {
                [self stopRecordingWithCompletion:nil];
                NSError *error = _assetWriter.error;
                NSLog(@"AVAssetWriterInput appendSampleBuffer failed with error: %@", error.localizedDescription);
            }
        } else {
            NSLog(@"Audio input not ready for more media data, dropping buffer");
        }
        CFRelease(sampleBuffer);
    });
}
```
For video frames I initially used AVAssetWriterInput's appendSampleBuffer, the same as for audio, but the writes kept failing; Apple's error only said unknown error without giving a concrete reason. I later found people on Stack Overflow hitting the same problem, and switching to AVAssetWriterInputPixelBufferAdaptor to append CVPixelBufferRef solved it. I can confirm it works, though I haven't dug into the underlying cause.
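The append methods above call stopRecordingWithCompletion: on failure, but its implementation isn't shown in the post either. A minimal sketch of how the recording might be finished, assuming the same queues and inputs as above (the original code may differ):

```objc
// Hypothetical sketch of stopping the recording, not from the original post.
- (void)stopRecordingWithCompletion:(void (^)(void))completion {
    _isRecording = NO;
    dispatch_async(_writingQueue, ^{
        // Stop delivering further audio buffers.
        [_captureSession stopRunning];
        if (_assetWriter.status == AVAssetWriterStatusWriting) {
            // Messaging a nil input (e.g. video-only recording) is a no-op.
            [_videoInput markAsFinished];
            [_audioInput markAsFinished];
            [_assetWriter finishWritingWithCompletionHandler:^{
                if (completion) completion();
            }];
        } else if (completion) {
            completion();
        }
    });
}
```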
References
Mainly Apple's API documentation, plus an AVFoundation usage demo I found online for reference: Demo
Postscript
It's 23:26 now. I was planning to write another post about crash handling, but it's getting a bit late and I still have my diary to write, so that will have to wait until tomorrow.