Organizing Video Clips for an Online Course

I’ve signed on to do a Coursera online course on cloud security. I’ll share more details as production progresses. This post contains a few notes on organizing video clips for a large project.

The video almost always consists of two synchronized streams: one of my bearded face narrating the video and the other of animated images, text, and diagrams. This is more complicated than my older video efforts, which consisted of animated presentations with voiceover.

I’ve now learned the value of the famous movie-studio clapperboard slate. I’ve also learned that your file naming process has to blend well with your editing style.

This is my third attempt to organize video clips for online lecture/demo purposes. I used my first organizing plan to produce the Cryptosmith series. I produced those videos using Apple’s Keynote for visuals, animation, and audio. Keynote lets me record my narration in real time as the presentation unfolds, and then lets me export the result as a video file.

I arranged the Cryptosmith videos to be played in a particular order. A video’s numeric place in the series forms a 2-digit prefix to its file name. This keeps the files sorted into sequential order. File names may also contain a “v” suffix with a revision number. For example:

01 Introduction v3.key
01 Introduction v3.mp4
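
As a rough sketch of how a script could read this naming scheme, the Python below splits a name into its sequence number, title, and revision, and keeps only the newest revision of each file. The regular expression and the function names `parse_series_name` and `latest_revisions` are illustrative assumptions, not part of my actual workflow.

```python
import re

# Assumed pattern: two-digit sequence number, title, optional " v<revision>", extension.
NAME_RE = re.compile(
    r"^(?P<seq>\d{2}) (?P<title>.+?)(?: v(?P<rev>\d+))?\.(?P<ext>\w+)$"
)

def parse_series_name(filename):
    """Split a series file name into (sequence, title, revision, extension)."""
    m = NAME_RE.match(filename)
    if m is None:
        raise ValueError(f"not a series file name: {filename!r}")
    rev = int(m.group("rev")) if m.group("rev") else None
    return int(m.group("seq")), m.group("title"), rev, m.group("ext")

def latest_revisions(filenames):
    """Keep the highest revision per (sequence, extension), in playing order."""
    best = {}
    for name in filenames:
        seq, _title, rev, ext = parse_series_name(name)
        key = (seq, ext)
        # A name with no "v" suffix is treated as revision 0.
        if key not in best or (rev or 0) > best[key][0]:
            best[key] = (rev or 0, name)
    return [best[key][1] for key in sorted(best)]

print(parse_series_name("01 Introduction v3.key"))
```

Because the sequence prefix is always two digits, a plain alphabetical sort already gives the playing order. The parser only matters when a script needs the revision number.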

Rick's home-made video clapper board

I’m now working on the Coursera specialization. It contains 4 courses, and each course contains 4 one-week modules. Each module may contain 8-12 videos in the 3-7 minute range. I shoot each animated presentation using Apple’s Keynote, and capture my on-screen narration using a Canon DSLR. The photo at left shows an image from the DSLR of me starting a narration. My home-made clapper slate identifies videos in terms of course#, module#, video#, and take#.
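
To get a feel for the scale those numbers imply, here is a back-of-the-envelope calculation in Python. The constant and function names are made up for illustration. The figures come straight from the ranges above.

```python
COURSES = 4
MODULES_PER_COURSE = 4
VIDEOS_PER_MODULE = (8, 12)   # low and high estimates
MINUTES_PER_VIDEO = (3, 7)    # low and high estimates

def project_scale():
    """Return ((low, high) video counts, (low, high) total minutes)."""
    modules = COURSES * MODULES_PER_COURSE
    low_videos = modules * VIDEOS_PER_MODULE[0]
    high_videos = modules * VIDEOS_PER_MODULE[1]
    low_minutes = low_videos * MINUTES_PER_VIDEO[0]
    high_minutes = high_videos * MINUTES_PER_VIDEO[1]
    return (low_videos, high_videos), (low_minutes, high_minutes)

videos, minutes = project_scale()
print(f"{videos[0]}-{videos[1]} videos, {minutes[0]}-{minutes[1]} minutes")
```

That works out to somewhere between 128 and 192 finished videos, each with several source files and possibly several takes, which is why the naming scheme matters.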

I use chromakey to merge my talking head with the animation. Here’s the result:

image from lecture video showing my talking head

Second Organizing Plan

I started the Coursera project using a second organizing plan. I organized each course into its own folder and used module#, video#, and take# to construct file names. Video #4’s file names looked like this:

M1 V04 t2 slides cvss.key – Keynote file with recorded narration; topic: CVSS
M1 V04 t2 slides cvss.mp4 – Audio/Video of Keynote with narration; topic: CVSS
M1 V04 t2.mp4 – Audio/Video of my face narrating.
M1 V04 t2 cvss.mp4 – Final form of video; topic: CVSS
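
The four names above follow a pattern regular enough for a script to tell the files apart. The Python below is a minimal sketch, assuming the role of each file can be inferred from the word "slides", the presence of a topic, and the extension. The function name `classify_plan2` and the role labels are hypothetical.

```python
import re

# Assumed layout: "M<module> V<video> t<take>[ slides][ <topic>].<ext>"
PLAN2_RE = re.compile(
    r"^M(?P<module>\d+) V(?P<video>\d+) t(?P<take>\d+)"
    r"(?P<slides> slides)?(?: (?P<topic>.+?))?\.(?P<ext>key|mp4)$"
)

def classify_plan2(filename):
    """Return the numbers, topic, and inferred role encoded in a second-plan name."""
    m = PLAN2_RE.match(filename)
    if m is None:
        raise ValueError(f"not a second-plan name: {filename!r}")
    if m.group("slides"):
        # Keynote material: the deck itself or its exported video.
        role = "keynote deck" if m.group("ext") == "key" else "keynote export"
    elif m.group("topic"):
        role = "final"    # topic but no "slides": the finished video
    else:
        role = "camera"   # bare name: raw DSLR footage
    return {
        "module": int(m.group("module")),
        "video": int(m.group("video")),
        "take": int(m.group("take")),
        "topic": m.group("topic") or "",
        "role": role,
    }

print(classify_plan2("M1 V04 t2 slides cvss.mp4"))
```

Note how much the role depends on what is missing from the name rather than what is present. That ambiguity is one of the things the third plan removes.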

When I remove a memory card from the DSLR, the first thing I do is rename the files. The Mac Finder displays the first frame of the video, which shows the clapper slate. I use that information to rename the file.

To help synchronize things, I’ve developed a ritual for starting the DSLR recording and Keynote’s narration recording. I hold up the slate (photo above), start the DSLR, and maybe recite the slate contents. I click “start” on Keynote’s recorder. There’s a countdown. I center myself in the video frame (it’s displayed on my iPad), and I smile.

Right before I narrate, I clap my hands. This yields an audio marker on both the Keynote and DSLR sound tracks. The photo at right shows my home-made clapper. If I mess up the narration, I pause the Keynote recording and restart the DSLR recording. I back up over the bad part on Keynote and re-record over it. The re-recording includes a hand clap recorded on both sources. I use the hand clap to synchronize the replacement DSLR video.

Unfortunately, a hand clap isn’t as clean and simple as I’d like. It has a fairly sharp rising edge, but then trails off. Worse, it’s hard to decide just where the clap’s noise starts on a video track with no audio. This is probably why people use genuine clapper slates like the one shown at the top of this post.
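
On the audio side, the rising edge of a clap can be located with a simple threshold test. The Python below is only a sketch: it assumes the audio has already been decoded into a list of sample values, and that the clap is the loudest event in the stretch being examined. The function name `find_clap_onset` and the `threshold_ratio` parameter are hypothetical.

```python
def find_clap_onset(samples, sample_rate, threshold_ratio=0.5):
    """Return the time in seconds of the first sample whose magnitude reaches
    threshold_ratio times the loudest peak, or None if the audio is silent."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return None
    threshold = peak * threshold_ratio
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            return i / sample_rate
    return None

# Synthetic example: silence, then a sharp edge that trails off.
samples = [0.0] * 100 + [1.0, 0.8, 0.6, 0.4, 0.2] + [0.0] * 50
print(find_clap_onset(samples, sample_rate=1000))
```

This finds the audio marker but does nothing for the harder problem of spotting the matching frame in silent video.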

A/V Synch Problems

The hand clap was supposed to solve all of my audio/video synchronization (A/V synch) problems. It helped, a little.

I have a very good mic that plugs into my desktop. I have various poor mics that connect to my camera. The speaking audio is recorded to Keynote on the computer along with the presentation video. The speaking video is recorded on the camera along with whatever audio the on-camera mic picks up. This is pretty close to the speaking audio, but noisier.

I’m using Apple’s Final Cut Pro X (FCPX), and learning a lot about A/V synch. If I limit my face time on camera, synch is pretty easy. Things get tricky if I’m there narrating for a few minutes in a row, or if I pop in and out of the frame towards the end. My mouth goes out of synch with my audible words after only a couple of minutes unless I make careful adjustments. I’ve managed to do this by hand for a few videos.

After producing about a half-dozen videos I looked for better A/V synch on FCPX. My preferred solution now uses multicam clips. A multicam clip combines two tracks onto one clip and adjusts the track speeds to synchronize the audio. The clip retains the individual tracks. I drag the multicam clip to the FCPX timeline and select the audio and video track I want. I usually drag two copies of the multicam clip to the timeline and line them up in parallel. I pick the camera video file and no audio from one, and the presentation audio and video files from the other. I chromakey the camera video and move it to an appropriate spot over the presentation video. And then everything is in synch.
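
I don’t know how FCPX matches the audio internally. As an illustration of the general idea of lining up two recordings of the same sound, the Python sketch below tries every shift in a range and keeps the one where the two tracks agree best (a brute-force cross-correlation). It handles only a constant offset, not speed differences, and the function name `best_offset` is hypothetical.

```python
def best_offset(reference, other, max_shift):
    """Return the shift, in samples, at which `other` best matches `reference`.

    A positive result means the shared sound occurs that many samples
    later in `other` than in `reference`.
    """
    best_shift, best_score = 0, float("-inf")
    for shift in range(-max_shift, max_shift + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + shift
            if 0 <= j < len(other):
                score += r * other[j]   # large when loud samples coincide
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift

# Synthetic example: the same clap, arriving 5 samples later on the second track.
desktop_mic = [0.0] * 10 + [1.0, 0.5, 0.25] + [0.0] * 20
camera_mic = [0.0] * 15 + [1.0, 0.5, 0.25] + [0.0] * 15
print(best_offset(desktop_mic, camera_mic, max_shift=10))
```

A loop like this is far too slow for full-length recordings at real sample rates, but it shows why two noisy tracks of the same narration can still be aligned automatically.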

Unfortunately, I can’t use multicam if I just re-record flubbed parts of the narration. I need to change my organizing plan.

Third Organizing Plan

I will use an industry-standard clapper slate, and also put slate information at the start of my Keynote presentation decks. I’ll also reformat file names to always include a brief marker for the video source: Keynote presentation (“p”), DSLR camera (“c”), and Final Cut Pro (“f”). The source also shows up on all slates. I’m ordering a 7-part clapper, and here’s how I’ll fill it out:

Prod – “Cloud Computing Course 1”
Roll – Module number: M1
Scene – Video number and optionally the starting slide number
Director – empty
Camera – the video source: “c” for camera on the clapper slate
Date, etc. – ignored

Until I receive the clapper slate, I’ll just keep writing everything on a piece of card stock.

My file names are getting too long, so I’m going to shorten all the data embedded in them. The coded numbers will come first, followed by a few descriptive words about the video. Here’s the format:

m0v00 t0w description
m0v00 t0s00w description

The starting “m0” gives the module (week) number the video appears in (4 weeks per course). The “v00” is the video number within that module. The “t0” captures the take number, optimistically capped at 9. If I’m restarting partway through a slide deck, the “s00” captures the starting slide number. The “w” stands for the video source marker: “c” for camera, “p” for presentation, or “f” for FCPX. The description contains a few words to remind me which video this is. As before, names are relative to a course in the specialization. Here are revised file names for video 4, take 2:

m1v04 t2p cvss.key – Keynote file with recorded narration; topic: CVSS
m1v04 t2p cvss.mp4 – Audio/Video of Keynote with narration; topic: CVSS
m1v04 t2c.mp4 – Audio/Video of my face narrating.
m1v04 t2f cvss.mp4 – Final form of video; topic: CVSS
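
The new format is strict enough to parse and generate mechanically. The Python below is a minimal sketch under that assumption. The regular expression, the `SOURCES` table, and the functions `parse_plan3` and `build_plan3` are illustrative names, not tools I actually use.

```python
import re

# Source markers from the naming plan.
SOURCES = {"c": "camera", "p": "presentation", "f": "fcpx"}

# Assumed layout: m<module>v<video> t<take>[s<slide>]<source>[ description].<ext>
PLAN3_RE = re.compile(
    r"^m(?P<module>\d)v(?P<video>\d{2}) t(?P<take>\d)"
    r"(?:s(?P<slide>\d{2}))?(?P<source>[cpf])"
    r"(?: (?P<description>.+?))?\.(?P<ext>\w+)$"
)

def parse_plan3(filename):
    """Decode a third-plan file name into its fields."""
    m = PLAN3_RE.match(filename)
    if m is None:
        raise ValueError(f"not a third-plan name: {filename!r}")
    slide = m.group("slide")
    return {
        "module": int(m.group("module")),
        "video": int(m.group("video")),
        "take": int(m.group("take")),
        "slide": int(slide) if slide else None,
        "source": SOURCES[m.group("source")],
        "description": m.group("description") or "",
        "ext": m.group("ext"),
    }

def build_plan3(module, video, take, source, description="", slide=None, ext="mp4"):
    """Assemble a third-plan file name from its fields."""
    if source not in SOURCES:
        raise ValueError(f"unknown source marker: {source!r}")
    if not (0 <= module <= 9 and 0 <= video <= 99 and 0 <= take <= 9):
        raise ValueError("module, video, or take is out of range for the format")
    slide_part = f"s{slide:02d}" if slide is not None else ""
    desc_part = f" {description}" if description else ""
    return f"m{module}v{video:02d} t{take}{slide_part}{source}{desc_part}.{ext}"

print(parse_plan3("m1v04 t2p cvss.key"))
print(build_plan3(1, 4, 2, "c"))
```

Because the module, video, and take fields have fixed widths, a plain alphabetical sort keeps all the files for one video together, in take order.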

When I mess up a video, I may want to start over completely. If I don’t want to sacrifice the existing video, I’ll make a copy of the Keynote file and delete all the slides I had already narrated successfully. The new file’s name will be the same, but will include the starting slide number of the re-recording. I then re-record both tracks starting at the broken slide. This yields two video tracks with closely matching audio. In FCPX I produce one multicam clip of the first set of recorded slides and another multicam clip starting just before the mistake.
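
Deriving the name for a partial re-recording is a small string operation. The Python below sketches it, assuming names in the “m0v00 t0w description” format. The function name `retake_name` and the slide number 7 in the example are hypothetical.

```python
import re

# Matches the coded prefix: module/video/take, an optional existing
# slide marker, and the one-letter source marker (c, p, or f).
TAKE_RE = re.compile(r"^(m\d+v\d+ t\d+)(?:s\d{2})?([cpf])")

def retake_name(filename, start_slide):
    """Return `filename` with an "sNN" starting-slide marker inserted
    (or replaced) just before the source letter."""
    if not 1 <= start_slide <= 99:
        raise ValueError("starting slide must fit in two digits")
    new_name, count = TAKE_RE.subn(
        lambda m: f"{m.group(1)}s{start_slide:02d}{m.group(2)}",
        filename,
        count=1,
    )
    if count == 0:
        raise ValueError(f"not a recognized file name: {filename!r}")
    return new_name

# Re-recording video 4, take 2, starting at (hypothetical) slide 7:
print(retake_name("m1v04 t2p cvss.key", 7))
print(retake_name("m1v04 t2c.mp4", 7))
```

Both the Keynote copy and the matching camera file get the same slide marker, so the pair that belongs in the second multicam clip sorts together.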