Or, Sony Vegas 101

Over the last few weeks a lot of my spare time has been going into learning how to edit videos – mostly of rocket launches and tests. Video is a great medium for capturing and carefully reviewing fast-moving objects – a standard (NTSC) camcorder captures 60 fields (half-frames, sort of) each second, which gives you a lot of time resolution to see what is going on.

For the last dozen years or so, ever since video editing became reasonably practical on a PC, I’ve periodically attempted to learn how to edit video, but it always seemed impossibly complex. This was despite the fact that I have a pretty good technical background in video – I sat on the committee that developed the H.264 standard. (Hi Gary, Hi Thomas. I don’t miss your meetings at all.)

But I’ve finally managed it, and it turned out to be much less bad than it seemed (as usual). I’ll try to pass along the key tricks to getting it working.

Caveat: My focus has been on editing digital video from a consumer-type HD camcorder (a Canon HF100), for ultimate viewing on a computer (a Windows box). So I’m assuming you have already copied the .MTS files of your video clips from the SD card and have them to start with.

I’ll start with the Executive Summary (applicable to rocketry-type videos), then explain:

  • Camcorder setup (Canon HF100 or similar):
    • Use a gun sight as a viewfinder
    • Shortest possible shutter speed (1/2000 second is good)
    • Manually focus at infinity (turn off autofocus)
    • Turn on image stabilization
    • Set highest possible bit rate
    • Record at 60i
    • Get far away (> 100′)
  • Video editing – use Sony Vegas
    • Project settings
      • 1920×1080
      • 59.94 frames/second
      • Progressive scan
      • Deinterlace by interpolation
      • 32-bit floating point (video levels)
      • UNCHECK “Adjust source media”
    • Preview rendering quality: Set to Good (auto) or Best
      • Anything less than Good won’t de-interlace (on preview)
    • Add Sony Color Corrector
      • Gain = ((17+255)/255) = 1.06666667
      • Offset = -17
    • Options>Grid Spacing>Frames
    • Options>Quantize to Frames
    • Output: Render As…
      • Audio: Default stereo at 44.1 kHz
      • MainConcept AVC/AAC (H.264):
        • 1920×1080 (do not resample)
        • Progressive scan
        • Best
        • Main profile
        • 2-pass VBR (for quality)
        • Use CPU only (only if you get errors with the GPU)
        • 4 Mbps minimum; 10 to 14 Mbps is better

ABOUT VIDEO

Just like Thomas Edison’s motion picture film, video is a series of still pictures that are captured and shown in quick succession, to create the illusion of smooth motion. When you’re going to carefully analyze an event after the fact, it can be really helpful to look at those still pictures one at a time, or in slow-motion. You can easily measure the duration and timing of events by counting frames (pictures), because the frames are taken at fixed intervals of time.

In NTSC countries (USA, Canada, Japan), the standard video format is 30 frames (pictures)/second; in the rest of the world it’s 25 frames/second (PAL and SECAM standards). Since I’m in the USA I’m going to use the 30 frames/second number (adjust accordingly if that doesn’t fit where you live).

So, for example, if frame 30 shows an event starting, and frame 36 shows it ending, you know the event was 6 frame intervals long. That’s 6/30ths of a second (0.2 seconds).

Only…it’s not really 30 frames/second, it’s actually (30 * 1000/1001) frames/second, which is a tiny bit more than 29.97 frames/second. The reason for that is related to the transition from black-and-white to color broadcasting in the 1950s, the details of which are irrelevant today. Just accept it – when people say “30 Hz” in video, they mean 30 * 1000/1001  Hz. (They also mean that if they say “29.97 Hz”, which is a lot closer to the exact value, but not quite there.)
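To make the frame-counting arithmetic concrete, here is a minimal Python sketch. The function name and the choice of exact fractions are mine, purely for illustration; it just converts a count of frame intervals into seconds, at both the nominal 30 Hz and the exact 30 * 1000/1001 Hz.

```python
from fractions import Fraction

# Exact NTSC frame rate: 30 * 1000/1001 frames per second.
NTSC_FRAME_RATE = Fraction(30000, 1001)

def event_duration_seconds(start_frame, end_frame, frame_rate=NTSC_FRAME_RATE):
    """Time between two frame numbers, in seconds."""
    intervals = end_frame - start_frame   # frame intervals, not frames
    return float(intervals / frame_rate)

# The example from the text: event starts on frame 30 and ends on frame 36.
nominal = event_duration_seconds(30, 36, Fraction(30))
exact = event_duration_seconds(30, 36)
print(f"nominal: {nominal:.4f} s, exact: {exact:.4f} s")
# prints "nominal: 0.2000 s, exact: 0.2002 s"
```

The difference is only one part in a thousand, so for short events the nominal 30 Hz figure is usually close enough.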

Sometimes you’ll hear about “progressive” video, often abbreviated to “p” as in “720p” and “1080p”. Progressive video is what you’d expect video to be – a simple series of still pictures captured and shown one after another, like frames of a movie film.

Other times you’ll hear about “interlaced” video (as in “1080i”). That…I’ll get to.

VIDEO CAPTURE WITH THE CAMCORDER

I’ve been using a Canon HF100 consumer-level HD camcorder. It’s pretty good. My only complaints are that it has a mild rolling shutter problem (not nearly as bad as my DSLR’s video mode), and a clunky UI for manual control. The newer models are probably better.

Viewfinder – use a gun sight

The biggest problem I’ve had with it is tracking fast-moving rockets with the non-existent viewfinder. I don’t count the LCD screen as a viewfinder – because it’s 4 inches from your nose, your eyes can’t focus on the screen and the distant rocket at the same time. And if you look at the LCD screen, then the moment the rocket (or whatever) goes off the screen, you have no idea what direction to go to find it. This is a serious problem when you’re zoomed in a long way, as your field of view is small.

After trying several alternatives (“sports finders”, cardboard tubes, optics), the best solution was to attach a gun sight to the camcorder. It has no magnification, just projects a target into your field of view. It has little set-screws for adjusting it, so you tweak these until the target points at the exact middle of the picture when the camera is zoomed in all the way. That way, as long as you keep the target on what you’re trying to shoot, it’ll be in the picture. The one I used cost about $45; I attached it with cable ties, a block of wood, some Blu-tack (under the wood), and a strip of scrap metal.

Canon HF100 camcorder with gun sight viewfinder. The masking tape is to prevent my face from hitting the “record” button.

Setting up the camcorder

The camcorder has dozens of things you can set. These are the ones that matter for videos of fast-moving things like rockets:

Shutter speed – set to 1/2000 second (or whatever is the fastest it’ll go)

The less time the shutter is open, the less motion blur you’ll get in each picture. As long as you’re shooting in daylight, there will still be enough light – the camera will open up the aperture and/or crank up the gain to compensate. Don’t set the shutter to “auto”; it’ll be much too slow to freeze fast motion and you’ll get blur. (“Auto” is fine for videos of your kids.)

Use manual focus

Camcorders autofocus slowly and unreliably when you’re taking a picture of a tiny moving dot against the blue sky. If you use autofocus, most of your video will be a blur. Manually focus at infinity and leave it there. (“Focus assist” doesn’t help outdoors.)

Image stabilization (IS) – turn it on

Ideally, you’d use a tripod and get nice steady video. But if you do that, you won’t be able to slew the camera fast enough to track really fast-moving things (at least I haven’t been able to, even with a damped “video head” on it). So I ended up shooting hand-held, with a collapsed monopod to help steady the camera against my body. Image stabilization helps reduce the shakes (not as much as I’d like).

Bit rate – MAXIMUM

The higher the bit rate, the fewer compression artifacts and the more detail in your video. The files are bigger, but you can always drop the bit rate later when you’re editing the video. On the Canon cameras, at any rate lower than FXP (17 Mbps) the camera captures only 1440×1080 pixels instead of the full 1920×1080. Always capture the best source video you can. FXP is the highest the HF100 will go, but newer cameras will run at a higher bit rate – use whatever rate is the highest.

Wind screen for mic – ON

Mic Level – AUTOMATIC

You want to reduce wind noise (there’s a lot outdoors) and the audio AGC works fine.

Frame rate – 60i (usually), 30p (if detail is more important than motion)

Back in the 20th century, around the time Philo Farnsworth was inventing the kind of electronic television that nobody uses anymore (and long before he invented desktop nuclear fusion that nobody uses), some bright fellow came up with a horribly clever idea to seemingly pack twice as many frames per second into a television channel as would otherwise fit – interlace. This was universally adopted, and in the 1990s when HDTV standards were set, broadcasters insisted on continuing with interlace.

Wikipedia has a pretty good article explaining the difference between interlaced video (what you get, usually) and progressive video (what you want, usually). I’m not going to repeat it here, except to say that normal computers display progressive video at 60 Hz (60 frames/second). So unless you use slow-motion, the best temporal resolution (picture rate) you can get on a typical computer is 60 Hz – that’s what you want for output.

Of the various frame rates on the camcorder, the 30p mode (30 frames/sec, progressive) offers the best spatial resolution (picture detail) – you get 30 frames/second (progressive), each picture at the full HD 1920 column x 1080 line resolution (at least in the “FXP” high bit rate mode, which is also what you want).

The 60i mode (60 fields/sec, interlaced) gives the best temporal resolution – you get 60 fields/second (interlaced), each of which has half the spatial resolution of the 30p mode; essentially 540 lines with 1920 columns. But because they’re interlaced fields instead of progressive frames, it’s not as simple as that. (If the scene isn’t moving much, you can combine the two fields of each frame to make a single full-resolution picture; but this doesn’t help you any with fast-moving things – go read the Wikipedia entry if you want the details.)

The bottom line is this:

  • If spatial resolution (detail) is more important than temporal resolution (how often pictures are captured), use 30p.
  • If temporal resolution (how often pictures are captured) is more important than spatial resolution (detail), use 60i.

Either way you’re going to be capturing the same total number of pixels each second – in 30p you’ll have half as many pictures, each with twice as much detail. In 60i, vice-versa.
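If you want to check that claim, the arithmetic is short. This is just a sketch using the nominal 30 and 60 Hz rates; the function name is made up for the example.

```python
WIDTH, HEIGHT = 1920, 1080

def pixels_per_second(width, lines, pictures_per_second):
    """Raw pixels captured each second, before compression."""
    return width * lines * pictures_per_second

rate_30p = pixels_per_second(WIDTH, HEIGHT, 30)       # 30 full-height frames
rate_60i = pixels_per_second(WIDTH, HEIGHT // 2, 60)  # 60 half-height fields
print(rate_30p, rate_60i)   # both are 62208000
```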

For the rest of this post, I’m going to assume you’re using 60i, because that’s the best setting for the kind of video I do (rockets), and because it’s the most complicated to set up properly. I’ll mention the few places where you’d do things differently for 30p. (And, if picture detail is really most important, you probably ought to use a still camera.)

Get far away

Last, if you’re going to be capturing mid- or high-power rockets, get far away from the launch pad – at least 100 feet. Any closer and the angular rate of climb is just too fast to follow with your hands. Don’t zoom in too much on the launch.

EDITING – USE SONY VEGAS

First, I tried Adobe Premiere Elements (versions 7 and 8), thinking that the low-end “Elements” version would be relatively simple to learn and use. Without going into any details, don’t do that. Just trust me on this.

Use Sony Vegas. It works, it’s reasonably priced, and it does everything you need. I’m currently using “Vegas Movie Studio HD Platinum”, version 10. It’s a horrible mouthful of a name, but it’s a good program and it retails for only $100 – well worth it. I don’t see anything in the higher-end versions that looks useful. (Update 2013-11 – I’m using Vegas Pro 11 now & have updated this post slightly for that – see also notes at the end of this post.)

The timeline

Fire up the program and you’ll see a bunch of horizontal stripes along the bottom of the screen with labels like “Video Overlay”, “Video”, “Voice”, etc. These are timelines that you’ll use to edit the various components of your media. You can (and should) drag-and-drop your source .MTS video files onto the timeline. Put the audio track on “Voice” and the video track on “Video”. “Video Overlay” will be used for captions and the like (if you choose to add them).

Separate clips can be joined together on the timeline by dragging them so they abut each other.

You can cut part of a clip out by finding the cut point and hitting S (for Split). This splits the clip into two sections. Then select the section you don’t want with the mouse, and hit Del to delete it. That’ll leave a hole, so drag the other clips around to fill it up. It’s pretty straightforward if you play around with it for a while (viewing the tutorial videos that come with Vegas is helpful to get started). Vegas works with pointers into the source files without modifying them – once you’re done editing, you’ll “render” the project into a new output video file.

Vegas setup

Before you do anything with Vegas, set up a few things:

Options>Quantize to Frames – turn this ON

This forces Vegas to make all edits on exact frame boundaries, not in the middle of a frame display period. If you don’t do this, your edits will introduce a partial-frame time offset and you’ll end up with “ghosting”; Vegas will construct every output frame as a blurry mixture of the preceding and succeeding input frames. You don’t want that.
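To illustrate what “on exact frame boundaries” means, here is a small Python sketch that snaps an arbitrary edit time to the nearest frame boundary at the 60000/1001 Hz project rate. It shows the idea only; I’m not claiming this is how Vegas implements it internally, and the function name is my own.

```python
from fractions import Fraction

FRAME_RATE = Fraction(60000, 1001)   # project rate used for 60i source

def quantize_to_frame(time_seconds, frame_rate=FRAME_RATE):
    """Snap a time to the nearest frame boundary.

    Returns (frame_index, snapped_time_in_seconds).
    """
    frame_index = round(Fraction(time_seconds) * frame_rate)
    return frame_index, float(frame_index / frame_rate)

# An edit placed at exactly 1.0 s falls between frames 59 and 60,
# so it gets moved to the start of frame 60.
index, snapped = quantize_to_frame(1.0)
print(index, snapped)   # prints "60 1.001"
```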

Options>Grid Spacing>Frames

This will draw vertical lines on the timeline showing you when each new frame starts. It’s really useful if you want to measure the timing or duration of events.

Project settings – Video tab

Correct project settings are critical to making Vegas editing work the way you want. You can get at the project settings with Project>Properties… (as well as a couple of other ways).

Video comes in various shapes and sizes. In full HD video, each picture is 1920 columns by 1080 lines, with square pixels. There are lots of other popular shapes and sizes; very often much smaller pictures are used for display on YouTube, phones, etc.

This is important: The “project settings” in Vegas control the format that Vegas uses while editing, NOT the final output format. The final output format is determined at the end of editing when you render (generate) your final output video. So do not set the project settings according to the output video you want, set it according to the input video you are working with. If the project settings in Vegas are different than the input format, Vegas will do its best to convert the input video to the format in the project settings.

For the video tab, ignore the Template choice – we’re going to use custom values, because none of the templates are what you want. I will explain.

Use:

  • Width: 1920
  • Height: 1080
  • Frame rate: 59.940 (for 60i input; use 29.970 if using 30p input. See below.)
  • Field order: None (progressive scan)
  • Pixel aspect ratio: 1.00 (Square)
  • Pixel format: 32-bit floating point (video levels)
  • Full-resolution rendering quality: Best
  • Deinterlace method: Interpolate fields
  • Adjust source media…: Do NOT check

The width and height should be set to match your input (source) video. In the case of the Canon HF100 (or any full-HD camera), this is 1920×1080.

The frame rate should be set at an exact multiple of the source video frame rate. Set it at 29.97 fps if you’re using 30p input, and at 59.94 if you’re using 60i input. (Vegas seems to internally round these up to 30000/1001 and 60000/1001 respectively.)

By setting 59.94 frames per second, you are telling Vegas to treat each of the two fields of the interlaced frame as a separate progressive output picture. This is critical if you’re using 60i input and want to avoid interlace artifacts such as combing and ghosting in your output. If you configure the project settings this way, Vegas will produce two progressive output frames for every interlaced input frame. This is exactly what you want for video of fast-moving objects, because it gives you a real 60 Hz picture rate.

The “deinterlace method” controls how Vegas will construct those progressive output pictures from the interlaced input video. “None” will simply repeat each line of a single interlace field to convert a 1920×540 interlace field to an output 1920×1080 picture. This looks OK, but you can do better. “Blend fields” produces the even-numbered lines (2, 4, 6…) in the output picture from one input field, and the odd-numbered lines (1, 3, 5…) from the other input field. This means that every other line in your output picture was captured at a different time. This looks great if there wasn’t any motion in the scene between the time the two fields were captured. But if there was (and I’m assuming there was), then it will look horrible – don’t do it. Choose “Interpolate fields”. This creates a single output picture from a single input field, generating the missing every-other line by averaging the lines above and below. This produces the best output for high-motion video.
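Here is a rough Python sketch of the “interpolate” idea as described above: take the lines of one field, put them in their places, and fill each missing line with the average of its neighbors. This is plain linear interpolation written for illustration. I don’t know the exact filter Vegas uses, and the edge handling (copying the nearest line) is my own assumption.

```python
def interpolate_field(field_rows, top_field=True):
    """Build a full-height picture from the lines of ONE field.

    field_rows: list of rows, each a list of pixel values.
    top_field:  True if the field holds output rows 0, 2, 4, ...;
                False if it holds rows 1, 3, 5, ...
    """
    height = 2 * len(field_rows)
    frame = [None] * height
    offset = 0 if top_field else 1

    # Copy the captured lines into every other output row.
    for i, row in enumerate(field_rows):
        frame[2 * i + offset] = list(row)

    # Fill each missing row from the captured rows above and below it.
    for y in range(height):
        if frame[y] is not None:
            continue
        above = frame[y - 1] if y > 0 else None
        below = frame[y + 1] if y + 1 < height else None
        if above is None:            # top edge: nothing above (assumption: copy)
            frame[y] = list(below)
        elif below is None:          # bottom edge: nothing below (assumption: copy)
            frame[y] = list(above)
        else:
            frame[y] = [(a + b) // 2 for a, b in zip(above, below)]
    return frame

# A tiny 3-line "field", two pixels wide, becomes a 6-line picture.
field = [[0, 0], [100, 100], [200, 200]]
picture = interpolate_field(field, top_field=True)
for row in picture:
    print(row)
```

With 60i source you would run this once per field, alternating top and bottom fields, which is where the two progressive output pictures per interlaced frame come from.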

Project settings – other tabs

For the other tabs in the Project Properties, the defaults work fine. I like to set the Ruler time format to “Seconds” to make it easier to measure time differences between frames. (Note also that the frame count is visible in the preview window.)

Color spaces – Studio RGB vs. Computer RGB vs. Canon RGB

One thing you may notice when using Vegas or viewing the output files after processing in Vegas is that the video looks “washed out” or too dark or contrasty. These problems come from color space conversion problems. Different video players use different color spaces, so the same video file might look fine in one player, and bad in another.

The brightness and color of pixels displayed on your computer screen are determined by three numbers, for the brightness of Red, Green, and Blue in each pixel. Each number can range from 0 (darkest) to 255 (brightest). If Red, Green, and Blue (RGB) are all 255, they combine to make bright white. This is the “Computer RGB” color space – each color has the range from 0 to 255.

The “Studio RGB” color space uses only the values from 16 to 235 to represent the RGB brightness. Values in the range of 0 to 15 are called “superblacks” and values 236-255 are “superwhites”. Standard televisions and video cameras are supposed to use the Studio RGB system. On a TV display, superblacks are all displayed as the same color of absolute black, and superwhites are all displayed as the same shade of whitest white, so 16 is as black as you can get, and 235 is as white as you can get.

But this is not true in the Computer RGB system. 16 is not pure black – it’s dark, but 0 is a lot darker. And 235 is not as bright as the computer display can get – full brightness comes at 255. So if you show a Studio RGB video (16-235) on a computer (0-255), it will look washed out, with low contrast. The blacks won’t be very black, and the whites will be grayish. And vice-versa; if you show a Computer RGB video on a TV, there will be too much contrast, with all the dark details (0 to 15) lost as black, and all the white highlights (236-255) “blown out” to pure white.

So a color space correction is needed when converting from one system to the other. Sony Vegas has the “Sony Color Corrector” video effect that you can use to do this conversion – it has presets for converting Computer to Studio RGB and another for the other direction.

But. If you look at the video files the Canon HF100 generates, you’ll find it uses the values 16-255, instead of either of the two standard systems. If you treat this as Studio RGB, the video will lose the bright highlights (236-255). If you treat it as Computer RGB, the black levels are wrong (16 instead of 0).

You can fix this by converting Canon RGB to Computer RGB (assuming you’ll view on a computer).

In the Sony Color Corrector, set Gain to 1.067 (this is (17+255)/255), and set an Offset of -17. The Color Corrector first multiplies all the color values by the Gain setting, so 16-255 becomes 17-272. Then it adds the Offset to all the values, so the range 17-272 becomes 0-255. I saved this setting as “Canon HFxxx to Computer RGB”.
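The gain-then-offset arithmetic can be sketched in a few lines of Python. This only mimics the numbers described above on a single 8-bit value; the rounding and the clamp to 0-255 are my assumptions, not a description of what the Color Corrector does internally.

```python
GAIN = (17 + 255) / 255   # about 1.067, the Gain value from the text
OFFSET = -17              # the Offset value from the text

def canon_to_computer_rgb(value, gain=GAIN, offset=OFFSET):
    """Multiply by the gain, add the offset, then round and clamp to 0-255."""
    corrected = value * gain + offset
    return max(0, min(255, round(corrected)))   # clamp is an assumption

for v in (16, 128, 255):
    print(v, "->", canon_to_computer_rgb(v))
# 16 maps to 0 (black) and 255 stays at 255 (white)
```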

Rendering (video output encoding)

Once you’re done editing in Vegas, you produce your output video file using Project>Render As…

The rendering (video encoding) process reads your input video files and creates a new output video file. The rendering settings control what format the video is in, the frame rate of the output video, and the compression method.

Choose the “MainConcept AVC/AAC (*.mp4)” encoder (in “Save as type”). This uses H.264 to compress the video (the modern standard) and AAC for the audio (ditto).

But then hit the “Custom” button to the right to control how it will do that. Use:

  • Width: 1920
  • Height: 1080
  • Profile: Main
  • Frame rate: 59.94 for 60i input (use 29.97 for 30p input video)
  • Field order: None (progressive)
  • Pixel aspect ratio: 1.0
  • For best quality, choose “Variable bit rate” and “Two pass”. (Two pass takes twice as long, but produces better quality)

These settings control the format of the output video you’re generating. If you want smaller pictures than full HD, just put in the appropriate dimensions for width and height (and maybe pixel aspect ratio, depending on the format); the conversion will happen during rendering. (As I said above, do this here and not in the project settings.)

Of the profiles my version of Vegas offers, Main provides the best quality at a given bit rate (best compression). If yours offers “High”, choose that – it’s even better.

The bit rates that you choose have a very big effect on the quality of your video. Higher bit rates will look better, but will produce proportionally larger files. I find that 4 Mbps (4,000,000 bps) is the absolute minimum for HD video, and a rate of 10 to 14 Mbps is much better. There isn’t much point in setting the bit rate higher than the bit rate of the source video (17 Mbps in my case).
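Since file size scales directly with bit rate, a quick estimate helps when choosing. This is a back-of-the-envelope Python sketch that treats the setting as the average video bit rate and ignores the audio track and container overhead; the function name is made up for the example.

```python
def estimated_video_size_mb(bit_rate_mbps, duration_seconds):
    """Approximate video stream size in megabytes (1 MB = 10**6 bytes)."""
    bits = bit_rate_mbps * 1_000_000 * duration_seconds
    return bits / 8 / 1_000_000

# One minute of video at the rates mentioned in the text.
for rate in (4, 10, 14, 17):
    size = estimated_video_size_mb(rate, 60)
    print(f"{rate} Mbps for 60 s: about {size:.1f} MB")
```

So a one-minute clip at 4 Mbps is roughly 30 MB, and the same clip at 14 Mbps is a bit over 100 MB.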

Comments & questions are welcome.


Update, December 5 2013:

I’m using Vegas Pro 11 now. It’s pretty much the same, but I’ll offer 3 tips:

  • If you have trouble de-interlacing video, right click on the source video and look in Properties…. Make sure Field order doesn’t show None (progressive scan). If Vegas thinks the source video is already progressive, it won’t deinterlace it and you’ll get terrible combing or ghosting artifacts. I found this was a problem with interlaced .mp4 files (MPEG-4 encoded) made with ffmpeg. (VLC has no trouble playing these files, but Vegas gets confused.)
  • If you get “An error occurred while creating the media file. The reason for the error could not be determined” when rendering the final video (now under File>Render As…), it’s probably because Vegas Pro 11 has a bug when attempting to use GPU rendering. The workaround is to click the Customize Template… button, and change the Encode Mode from Automatic to CPU only.  It’s slower, but it works and doesn’t crash.
  • Set Project Properties>Pixel format to 32-bit floating point (video levels). It gives better colors and your computer is probably fast enough to handle it.
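For the first tip, it can help to check how a file is flagged before blaming Vegas. The sketch below is my own addition, not part of the workflow above: it assumes you have ffprobe (shipped with ffmpeg) on your path, and it asks ffprobe for the field order of the first video stream. The file name in the usage note is hypothetical.

```python
import subprocess

def field_order_command(path):
    """Build an ffprobe command that prints the first video stream's field order."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_entries", "stream=field_order",
        "-of", "default=noprint_wrappers=1:nokey=1",
        path,
    ]

def is_flagged_interlaced(field_order):
    """ffprobe reports tt, bb, tb or bt for streams flagged as interlaced."""
    return field_order.strip().lower() in {"tt", "bb", "tb", "bt"}

def probe_field_order(path):
    """Run ffprobe (must be installed) and return its field-order string."""
    result = subprocess.run(field_order_command(path),
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

Calling probe_field_order("launch.mp4") and passing the result to is_flagged_interlaced tells you whether the file itself is marked as interlaced. If it reports “progressive” or “unknown” for footage you know is interlaced, that flag is a likely reason Vegas is skipping the deinterlace step.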