Chris

Reputation: 282

Merging Videos with Overlaid Text

In Swift for iOS, I have an array of AVURLAsset. I pass it through a function to stitch/merge the video assets together into one final video. For each sub-video, my goal is to overlay text centered in the video frame.

So far, I've achieved a semi-working version of this based on a previous post of mine, but I've run into the following issues that I can't get my head around:

  1. The overlaid text for each video overlaps the others for the entire duration of the final merged video. Each text should only be visible during its own sub-video's time range, i.e. its start/end time on the merged timeline (see the sketch after this list).

  2. Videos are recorded in portrait orientation. To keep this consistent, I thought setting each AVMutableVideoCompositionLayerInstruction's transform to the corresponding AVAssetTrack video's .preferredTransform would do that, but it doesn't: videos come out rotated -90 degrees, and the bottom half of the screen is black.

  3. I originally played back the final video via AVAssetExportSession's .asset property, but I noticed the text overlay only displayed if I created a new AVAsset from the session's .outputURL property. That requires the export to fully complete before playback, whereas playing from .asset started immediately with whatever had loaded (though, again, with no text overlay). Is there any way to return to the original behavior while still rendering the text overlay?
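
For issue 1, here is roughly what I believe each text layer needs, though I haven't managed to get it working. This is an untested sketch; limitDisplay is a made-up helper name, and start/duration would come from the merge loop's running time cursor:

import AVFoundation
import UIKit

//Untested sketch: show a text layer only during its segment of the merged
//timeline. Outside [start, start + duration) the layer stays at opacity 0.
func limitDisplay(of textLayer: CATextLayer, from start: CMTime, for duration: CMTime) {
    textLayer.opacity = 0 //hidden by default, outside its segment

    let show = CABasicAnimation(keyPath: "opacity")
    show.fromValue = 1
    show.toValue = 1
    //Core Animation treats beginTime == 0 as "now"; on a video composition's
    //timeline, AVCoreAnimationBeginTimeAtZero must be used to mean time zero.
    show.beginTime = start == .zero ? AVCoreAnimationBeginTimeAtZero : start.seconds
    show.duration = duration.seconds
    show.isRemovedOnCompletion = false
    textLayer.add(show, forKey: "showDuringSegment")
}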

I tried following an existing answer and Ray Wenderlich's tutorial, but had no success.

Any guidance would be extremely appreciated.

func merge(videos: [AVURLAsset], completion: @escaping (_ url: URL, _ asset: AVAssetExportSession)->()) {

let videoComposition = AVMutableComposition()
var lastTime: CMTime = .zero

var maxVideoSize = CGSize.zero

guard let videoCompositionTrack = videoComposition.addMutableTrack(withMediaType: .video, preferredTrackID: Int32(kCMPersistentTrackID_Invalid)),
      let audioCompositionTrack = videoComposition.addMutableTrack(withMediaType: .audio, preferredTrackID: Int32(kCMPersistentTrackID_Invalid)) else { return }

let mainComposition = AVMutableVideoComposition()

let mainParentLayer = CALayer()
let mainVideoLayer = CALayer()
mainParentLayer.frame = CGRect(x: 0, y: 0, width: maxVideoSize.width, height: maxVideoSize.height)
mainVideoLayer.frame = CGRect(x: 0, y: 0, width: maxVideoSize.width, height: maxVideoSize.height)

mainParentLayer.addSublayer(mainVideoLayer)

var instructions = [AVMutableVideoCompositionInstruction]()

for video in videos {
    
    if let videoTrack = video.tracks(withMediaType: .video)[safe: 0], let text = savedTexts[video.url] {
                        
        videoCompositionTrack.preferredTransform = videoTrack.preferredTransform
        
        do {
            try videoCompositionTrack.insertTimeRange(CMTimeRangeMake(start: .zero, duration: video.duration), of: videoTrack, at: lastTime)
            
            if let audioTrack = video.tracks(withMediaType: .audio)[safe: 0] {
                try audioCompositionTrack.insertTimeRange(CMTimeRangeMake(start: .zero, duration: video.duration), of: audioTrack, at: lastTime)
            }
            
            lastTime = CMTimeAdd(lastTime, video.duration)
            
            let videoSize = videoTrack.naturalSize.applying(videoTrack.preferredTransform)
            let videoRect = CGRect(x: 0, y: 0, width: abs(videoSize.width), height: abs(videoSize.height))
            maxVideoSize = CGSize(width: max(maxVideoSize.width, videoRect.width), height: max(maxVideoSize.height, videoRect.height))
            
            let textLayer = CATextLayer()
            textLayer.string = text
            textLayer.foregroundColor = UIColor.white.cgColor
            textLayer.font = UIFont(name: "Helvetica-Bold", size: min(videoRect.height / 10, 100))
            textLayer.shadowOpacity = 0.5
            textLayer.alignmentMode = .center
            textLayer.contentsScale = UIScreen.main.scale
            textLayer.isWrapped = true
            
            let textHeight: CGFloat = min(videoRect.height / 10, 120)
            let textWidth: CGFloat = videoRect.width
            let xPos = (videoRect.width - textWidth) / 2
            let yPos = (videoRect.height - textHeight) / 2
            textLayer.frame = CGRect(x: xPos, y: yPos, width: textWidth, height: textHeight)
            textLayer.zPosition = 1
            
            let parentLayer = CALayer()
            parentLayer.backgroundColor = UIColor.clear.cgColor
            parentLayer.frame = videoRect
            parentLayer.addSublayer(textLayer)
            
            let videoCompositionInstruction = AVMutableVideoCompositionInstruction()
            videoCompositionInstruction.timeRange = CMTimeRangeMake(start: lastTime - video.duration, duration: video.duration)
            let layerInstruction = AVMutableVideoCompositionLayerInstruction(assetTrack: videoTrack)
            layerInstruction.setTransform(videoTrack.preferredTransform, at: lastTime)
            videoCompositionInstruction.layerInstructions = [layerInstruction]
            instructions.append(videoCompositionInstruction)
            
            parentLayer.zPosition = 0
            mainParentLayer.addSublayer(parentLayer)
            
        } catch {
            print("Failed to insert track: \(error.localizedDescription)")
            return
        }
    }
}

mainParentLayer.frame = CGRect(x: 0, y: 0, width: maxVideoSize.width, height: maxVideoSize.height)
mainVideoLayer.frame = mainParentLayer.frame

mainComposition.renderSize = maxVideoSize
mainComposition.instructions = instructions
mainComposition.frameDuration = CMTime(value: 1, timescale: 30)
mainComposition.animationTool = AVVideoCompositionCoreAnimationTool(postProcessingAsVideoLayer: mainVideoLayer, in: mainParentLayer)

let outputUrl = URL(fileURLWithPath: NSTemporaryDirectory() + "merged.mp4")

guard let exporter = AVAssetExportSession(asset: videoComposition, presetName: AVAssetExportPresetHighestQuality) else { return }

exporter.videoComposition = mainComposition
exporter.outputURL = outputUrl
exporter.outputFileType = .mp4
exporter.shouldOptimizeForNetworkUse = true

exporter.exportAsynchronously {

    DispatchQueue.main.async {
        
        if let outputUrl = exporter.outputURL {
            
            if exporter.status == .completed {
                
                self.play(video: AVAsset(url: outputUrl))
                completion(outputUrl, exporter)
                
            } else if let error = exporter.error {
                
                print("Export failed: \(error.localizedDescription)")
            } else {
                
                print("Export status:", exporter.status)
            }
        }
    }
}
//Originally played video here via AVPlayer, which played back immediately
//play(video: exporter.asset)
}

Upvotes: 2

Views: 164

Answers (1)

Meep

Reputation: 531

I wish I had more time to review this post properly, but here is some code below.

  1. The timings might need to be relative to the main composition's timeline (see the short sketch right after this list).
  2. I had this issue as well, but I may be confusing text positioning during streaming with positioning while creating the video. Here's the code, but take it with a grain of salt.
  3. I'm trying to understand what the problem is: if the app just wants to play the final video, can it use the exported video file instead of the .asset property? If you want to show the produced video to the user while it is being made, read CMSampleBuffers from the original video and add some layers on top to show the work in progress (see also the playback sketch at the end of this answer).
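
Here's a rough, untested sketch of what I mean by point 1 (makeInstructions is a made-up helper; pass in your own videos array and the composition track the segments were inserted into):

import AVFoundation

//Rough sketch of point 1: each instruction's timeRange is relative to the
//merged timeline (a running cursor), the layer instruction targets the
//composition track that the segments live in, and the transform is applied
//at the segment's start rather than its end.
func makeInstructions(for videos: [AVURLAsset],
                      compositionTrack: AVMutableCompositionTrack) -> [AVMutableVideoCompositionInstruction] {
    var instructions = [AVMutableVideoCompositionInstruction]()
    var cursor = CMTime.zero
    for video in videos {
        guard let track = video.tracks(withMediaType: .video).first else { continue }
        let segmentRange = CMTimeRange(start: cursor, duration: video.duration)

        let instruction = AVMutableVideoCompositionInstruction()
        instruction.timeRange = segmentRange //relative to the merged timeline

        let layerInstruction = AVMutableVideoCompositionLayerInstruction(assetTrack: compositionTrack)
        layerInstruction.setTransform(track.preferredTransform, at: segmentRange.start)
        instruction.layerInstructions = [layerInstruction]

        instructions.append(instruction)
        cursor = CMTimeAdd(cursor, video.duration)
    }
    return instructions
}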

And here's the full code for creating a video.

func addLayersInCompositionTrack(videoEditorSettings: VideoEditorSettings, layer: CALayerStrategy, completion: @escaping (URL?) -> Void) {
    let asset = AVURLAsset(url: pathToVideo)
    let composition = AVMutableComposition()
    
    guard asset.isExportable else {
        fatalError()
    }
    
    do {
      let videoTrack = asset.tracks(withMediaType: .video).first!
      let videoTrackDuration = videoTrack.timeRange.duration
        
      let assetTimeRange = CMTimeRangeMake(start: .zero, duration: asset.duration)
        
        let compositionTrack : AVMutableCompositionTrack = composition.addMutableTrack(
            withMediaType: .video, preferredTrackID: CMPersistentTrackID(kCMPersistentTrackID_Invalid))!
        try compositionTrack.insertTimeRange(assetTimeRange, of: videoTrack, at: .zero)
        
        compositionTrack.preferredTransform = videoTrack.preferredTransform
        let videoInfo = orientation(from: videoTrack.preferredTransform)

        let videoSize: CGSize
        if videoInfo.isPortrait {
          videoSize = CGSize(
            width: videoTrack.naturalSize.height,
            height: videoTrack.naturalSize.width)
        } else {
          videoSize = videoTrack.naturalSize
        }
        
        let parentLayer = CALayer()
        parentLayer.frame = CGRect(origin: .zero, size: videoSize)
        
        let videoLayer = CALayer()
        videoLayer.frame = CGRect(origin: .zero, size: videoSize)
        
        
        let backgroundLayer = CAShapeLayer()
        backgroundLayer.frame = CGRect(origin: .zero, size: videoSize)
        backgroundLayer.backgroundColor = videoEditorSettings.backgroundColor
        
        let titleLayer = CATextLayer()
        titleLayer.backgroundColor = UIColor.clear.cgColor
        titleLayer.string = "Hello World!"
        titleLayer.shouldRasterize = true
        titleLayer.rasterizationScale = UIScreen.main.scale
        titleLayer.foregroundColor = UIColor.red.cgColor
        titleLayer.font = UIFont(name: "Helvetica", size: 28)
        titleLayer.shadowOpacity = 0.5
        titleLayer.alignmentMode = .natural
        titleLayer.frame = CGRect(x: 0, y: 0, width: videoSize.width/6, height: videoSize.height/6)
        
        let shapeLayer = CAShapeLayer()
        shapeLayer.frame = CGRect(x: videoSize.width/2, y: videoSize.height/2, width: videoSize.height/8, height: videoSize.height/8)
        shapeLayer.backgroundColor = UIColor.green.cgColor
        
        //validatePoints
        for point in layer.layerModel.points {
            if !parentLayer.contains(point) {
                fatalError()
            }
        }
        
        
        layer.generate(videoSize: videoSize, videoDuration: asset.duration.seconds)
        
        if videoEditorSettings.backgroundColor == nil {
            parentLayer.addSublayer(videoLayer)
        }
        else {
            parentLayer.addSublayer(backgroundLayer)
        }
        
        parentLayer.addSublayer(titleLayer)
        parentLayer.addSublayer(shapeLayer)
        parentLayer.addSublayer(layer.layer)

        let layerComposition = AVMutableVideoComposition()
        layerComposition.frameDuration = CMTimeMake(value: 1, timescale: 30)
        layerComposition.renderSize = videoSize
        layerComposition.animationTool = AVVideoCompositionCoreAnimationTool(
          postProcessingAsVideoLayer: videoLayer,
          in: parentLayer)
        
        
        let instruction = AVMutableVideoCompositionInstruction()
        instruction.timeRange = CMTimeRangeMake(
          start: .zero,
          duration: composition.duration)
        
        let layerTrack = composition.tracks(withMediaType: .video).first!
        let layerInstruction = AVMutableVideoCompositionLayerInstruction(assetTrack: layerTrack)
        
        //let transform = videoTrack.preferredTransform
        //layerInstruction.setTransform(transform, at: .zero)
        
        instruction.layerInstructions = [layerInstruction]
        layerComposition.instructions = [instruction]
        
        
        export = AVAssetExportSession(
          asset: composition,
          presetName: AVAssetExportPresetHighestQuality)
        
        statusObservation = export!.observe(\AVAssetExportSession.status, options: .new) {
            export, change in
            os_log("export.status : %d", log: self.tag, type: .debug, export.status.rawValue)
        }
        
        let dateFormatter = DateFormatter()
        dateFormatter.dateFormat = "MMM_d_HH:mm"

        let videoName = dateFormatter.string(from: Date()) + "_" + UUID().uuidString
        let exportURL = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0].appendingPathComponent("\(videoName).mov")
        
        if FileManager.default.isDeletableFile(atPath: exportURL.path) {
            do {
                try FileManager.default.removeItem(at: exportURL)
            }
            catch {
                fatalError()
                //os_log("fileManager.error : %@", log: self.tag, type: .debug, error.localizedDescription)
            }
        }

        export!.videoComposition = layerComposition
        export!.outputFileType = .mov
        export!.outputURL = exportURL
        
        export!.exportAsynchronously {
            DispatchQueue.main.async { [self] in
                switch self.export!.status {
                    case .completed:
                        completion(exportURL)
                    case .exporting:
                        print("video editor exporting")
                    case .waiting:
                        print("video editor waiting")
                    default:
                        print(self.export!.error ?? "unknown error")
                }
            }
        }
    } catch {
      print(error)
      fatalError()
    }
}


private func orientation(from transform: CGAffineTransform) -> (orientation: UIImage.Orientation, isPortrait: Bool) {
  var assetOrientation = UIImage.Orientation.up
  var isPortrait = false
  if transform.a == 0 && transform.b == 1.0 && transform.c == -1.0 && transform.d == 0 {
    assetOrientation = .right
    isPortrait = true
  } else if transform.a == 0 && transform.b == -1.0 && transform.c == 1.0 && transform.d == 0 {
    assetOrientation = .left
    isPortrait = true
  } else if transform.a == 1.0 && transform.b == 0 && transform.c == 0 && transform.d == 1.0 {
    assetOrientation = .up
  } else if transform.a == -1.0 && transform.b == 0 && transform.c == 0 && transform.d == -1.0 {
    assetOrientation = .down
  }
  
  return (assetOrientation, isPortrait)
}

private func compositionLayerInstruction(for track: AVCompositionTrack, assetTrack: AVAssetTrack) -> AVMutableVideoCompositionLayerInstruction {
  let instruction = AVMutableVideoCompositionLayerInstruction(assetTrack: track)
  let transform = assetTrack.preferredTransform
  
  instruction.setTransform(transform, at: .zero)
  
  return instruction
}
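
And for point 3, an alternative to the CMSampleBuffer idea: as far as I know, AVVideoCompositionCoreAnimationTool only renders during export, so for immediate playback you can leave it off the video composition and host the overlay in an AVSynchronizedLayer instead. A rough sketch (makePreviewPlayer is a made-up helper, untested in your exact setup):

import AVFoundation
import QuartzCore

//Sketch: play the composition immediately with the overlay rendered live.
//AVVideoCompositionCoreAnimationTool only works with AVAssetExportSession,
//so the player item's videoComposition must not have an animationTool set;
//AVSynchronizedLayer slaves the overlay's timing to the item's timeline.
func makePreviewPlayer(composition: AVMutableComposition,
                       videoComposition: AVMutableVideoComposition,
                       overlay: CALayer) -> (player: AVPlayer, overlayLayer: AVSynchronizedLayer) {
    let item = AVPlayerItem(asset: composition)
    item.videoComposition = videoComposition //no animationTool here

    let player = AVPlayer(playerItem: item)

    let syncLayer = AVSynchronizedLayer(playerItem: item)
    syncLayer.addSublayer(overlay) //layer beginTime/duration now follow item time

    //Insert syncLayer above the AVPlayerLayer that displays this player.
    return (player, syncLayer)
}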

Upvotes: 0
