Scaling and Scalers
In a Nutshell
Scaling in video usually means reformatting the picture to occupy more or fewer scan lines than it did before, without cropping the subject matter. For digital video, scaling also refers to reformatting the picture to occupy more or fewer pixels across each scan line, again without cropping the subject matter. The term scaling has also been applied to changing the number of frames presented each second without speeding up or slowing down the motion.
All scaling in video requires that the video be in digital form (converting it first if necessary) and then processing it, otherwise referred to as working in the digital domain.
When a DVD player is set up for use with a 4:3 TV, it scales the picture from anamorphic DVD's. Some picture quality loss occurs.
Updated 5/15/03
Simple Examples of Scaling
1. So-called anamorphic DVD's (enhanced for 16:9) have the same 480 scan lines (rows of pixels) and about 720 pixels on a line as other DVD's. The video frame is intended to be stretched to a 16:9 shape when viewed. Meanwhile most older TV sets will display the 480 visible scan lines only in a 4:3 shape, and the largest 16:9 shaped area that fits in the screen encompasses 360 scan lines. Every DVD player has a scaler which will optionally scale the 480 scan lines of picture to occupy 360 scan lines (and add 60 black lines on top and 60 black lines on the bottom).
2. If your TV set has a "picture in picture" feature, the smaller picture is a scaled down video picture.
3. Often you will see on TV a picture that normally fills the screen occupying only a portion of the screen, with news headlines or stock ticker results rolling across the bottom. Scaling may be used during video production to produce this screen display, as opposed to cutting off the material at the bottom of the picture.
4. NTSC pictures have (approximately) 480 scan lines. PAL and SECAM pictures have (approximately) 576 scan lines. In order to convert from one format to the other, scaling is done. Actually the simpler NTSC to PAL (and vice versa) converters work with one interlaced field at a time, thus scaling 240 scan lines to become 288 or vice versa.
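The line counts in examples 1 and 4 follow from simple ratios. The sketch below reproduces them. The function name letterbox_lines is ours, chosen for illustration, and is not taken from any player's design.

```python
from fractions import Fraction

def letterbox_lines(total_lines, screen_aspect, picture_aspect):
    """Return (picture_lines, black_top, black_bottom) for a letterboxed picture.

    Aspects are (width, height) tuples, e.g. (4, 3) and (16, 9).
    """
    screen = Fraction(*screen_aspect)
    picture = Fraction(*picture_aspect)
    # A wider picture on a narrower screen uses only part of the height.
    picture_lines = int(total_lines * screen / picture)
    spare = total_lines - picture_lines
    return picture_lines, spare // 2, spare - spare // 2

# Example 1: 16:9 picture on a 4:3 screen with 480 visible lines.
print(letterbox_lines(480, (4, 3), (16, 9)))  # (360, 60, 60)

# Example 4: per-field line counts, NTSC 240 versus PAL 288.
field_ratio = Fraction(288, 240)
print(field_ratio)  # 6/5: every 5 NTSC field lines become 6 PAL field lines
```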
More Advanced Examples of Scaling
5. Many HDTV sets display only 1080i or only 720p (or only one other scan line count). In the case of CRT based sets this simplifies the calibrations especially for geometry and convergence. The TV set has a built in scaler to handle other video formats including 480p.
6. Some TV sets, notably CRT projectors, actually give a better picture at some non-standard scan line count, or scan rate, such as 675p. The aim is to specify a scan line count so that the scan lines touch but don't overlap. Scalers are made that offer an almost continuously variable scan rate. (Adjusting the electron beam spot size using the focus control will change the sweet spot scan rate of such a TV set.)
Some Terms
Downconversion -- Scaling to fewer scan lines per field or frame, or to fewer pixels per scan line.
Interpolation -- Estimating or guessing what goes in between known items or quantities, or in the case of video, in between two scan lines or pixels.
Judder -- Minute unnatural irregularities in motion as reproduced. In video it is most often caused by repeated or dropped video fields or frames.
Resolution as in "a monitor or TV with several resolutions" -- Scan rate, more or less. So named because at each possible scan rate the monitor can resolve a different amount of subject detail.
Sampling -- In video, the picking of (usually uniformly spaced) spots in the picture or values along a video waveform to become pixels.
Oversampling -- Taking more samples from the source material than there were pixels, or more samples than will be used in the final product.
Scan Rate -- The number of scan lines per second, including the "invisible" scan lines "drawn" during the time the electron beam returns to the top of the screen (vertical retrace interval).
Upconversion -- Scaling to a greater number of scan lines per field or frame or a greater number of pixels per scan line.
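As a worked example of the Scan Rate definition above, the sketch below multiplies total lines per frame by frames per second. The totals of 525 and 625 lines (visible plus "invisible") and the frame rates of 30000/1001 and 25 are the standard NTSC and PAL figures; they are not stated elsewhere in this article and are supplied here as assumptions.

```python
from fractions import Fraction

def scan_rate_hz(total_lines_per_frame, frames_per_second):
    """Scan lines drawn per second, counting the lines in vertical retrace."""
    return total_lines_per_frame * frames_per_second

# Assumed standard figures: NTSC 525 lines at 30000/1001 frames per second,
# PAL 625 lines at 25 frames per second.
ntsc = scan_rate_hz(525, Fraction(30000, 1001))
pal = scan_rate_hz(625, 25)

print(round(float(ntsc)))  # 15734
print(pal)                 # 15625
```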
How is Scaling Done?
In a simple sense, if the finished picture needs to occupy more scan lines, we duplicate a scan line every now and then. If the finished picture needs to occupy fewer scan lines, we discard a scan line every now and then. (You might have guessed this.) The simplest scalers do just this.
A small step up in complexity, which gives a big improvement in quality, is to make an inserted scan line a blend (such as an average) of the scan lines above and below it. For deleted scan lines, the next scan line could be a blend of itself and the deleted scan line.
There are many methods of blending scan line content, with many levels of complexity. The blending is sometimes referred to as interpolation since the intent is to guess what the original picture content was between the two scan lines in question. Elaborate interpolation methods might work with blocks of pixels taken from several consecutive scan lines. Elaborate interpolation methods include "bisinc" and "cubic". These mathematical formulas are too complex to discuss here, and understanding them is not really needed for the subject of scaling itself.
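The two simplest approaches described above can be sketched as follows. This is an illustration only, not how any particular scaler is built: a picture is modeled as a list of scan lines, each scan line as a list of brightness values, and the names scale_nearest and scale_linear are ours.

```python
def scale_nearest(lines, out_count):
    """Duplicate or discard whole scan lines to reach out_count lines."""
    n = len(lines)
    return [lines[i * n // out_count] for i in range(out_count)]

def scale_linear(lines, out_count):
    """Blend the two nearest source lines to make each output line."""
    n = len(lines)
    out = []
    for i in range(out_count):
        # Position of output line i, measured in source-line units.
        pos = i * (n - 1) / (out_count - 1) if out_count > 1 else 0.0
        upper = int(pos)
        lower = min(upper + 1, n - 1)
        w = pos - upper  # 0.0 means all upper line, 1.0 means all lower line
        out.append([(1 - w) * a + w * b
                    for a, b in zip(lines[upper], lines[lower])])
    return out

source = [[0], [30], [60]]          # three scan lines, one pixel each
print(scale_nearest(source, 5))     # [[0], [0], [30], [30], [60]]
print(scale_linear(source, 5))      # [[0.0], [15.0], [30.0], [45.0], [60.0]]
print(scale_nearest(source, 2))     # [[0], [30]]
```

With duplication the brightness jumps in steps; with blending the inserted lines take in-between values, which is why the small extra effort pays off in picture quality.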
Pixel Width; Oversampling
Every scaler has a "pixel width" which is the number of pixels per scan line used during the scaling process. This may or may not be the same as the number of pixels per scan line that the source material may have.
The wider the pixel width, the better the picture quality. The scaler's pixel width sometimes varies depending on the amount of scaling, but the entire pixel width is used to process the video. If the pixel width of the scaler is greater than the pixel width of the subject matter, the scaler is said to be oversampling.
The horizontal resolution of the scaled picture is never greater than the smaller of: the resolution of the source material and the pixel width of the scaler. Sometimes it is even less.
Suppose there is a thin diagonal line in the picture and it passes through pixel 15 on one scan line and pixel 16 on the next. If we needed to insert a scan line in between these scan lines, the diagonal line wants to be represented by pixel position 15-1/2. We cannot do that with a scaler whose pixel width is the same as the resolution of the video source. One thing we can do is 2-for-1 oversampling, which consists of having a pixel width in the scaler twice that of the source. If the picture were to be viewed, each pixel in the scaler would have half the physical width of each original pixel in the subject material. Here we could have scaling and interpolation in the horizontal direction, as opposed to just making each pair of pixels in the scaler the same as the matching pixel in the source. In the above example, the spot in pixel position 15 of the source is represented by pixel positions 29 and 30 in the scaler, pixel position 16 in the source occupies pixel positions 31 and 32 in the scaler, and "source pixel position 15-1/2" would be pixel positions 30 and 31 in the scaler.
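The pixel arithmetic in this example can be checked with a short sketch. It assumes pixels are numbered from 1 and that a spot at position p is one source pixel wide; the function name scaler_pixels is ours.

```python
from fractions import Fraction

def scaler_pixels(source_position, factor=2):
    """Scaler pixels (numbered from 1) covered by a one-pixel-wide source spot.

    source_position may be fractional, e.g. 15.5 for "15-1/2".
    """
    p = Fraction(source_position)
    first = (p - 1) * factor + 1
    last = p * factor
    if first.denominator != 1:
        # The spot would start partway through a scaler pixel.
        raise ValueError("position cannot be represented at this pixel width")
    return list(range(int(first), int(last) + 1))

print(scaler_pixels(15))    # [29, 30]
print(scaler_pixels(16))    # [31, 32]
print(scaler_pixels(15.5))  # [30, 31]
```

Calling scaler_pixels(15.5, factor=1) raises ValueError, which corresponds to the statement that position 15-1/2 cannot be represented when the scaler's pixel width equals that of the source.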
How It Might Look
In this example we are upscaling every three scan lines to become four scan lines. The inserted lines are marked with a red dot. How this scaling might be used is to take the inner 360 scan lines of a non-anamorphic wide screen DVD program and spread them out over 480 scan lines to better fit a 16:9 screen without having a smaller picture with black on all four sides.
The second square shows a small picture detail on the third and seventh scan lines (the inserted lines) pushed over to the right a bit.
The third square shows one method of averaging one scan line above and one line below to come up with the new scan line to insert.
The fourth square shows one possible placement of picture details in between the original picture pixel positions. If adjacent scan lines in the original shown above represented details in consecutive pixel positions, the scaler must use two of its own pixels to represent each of the original pixels across a scan line, and have a pixel width at least twice as wide.
Sophisticated interpolation methods may well make the line in the fourth diagram above absolutely straight. If there is additional subject matter besides just the single line on a solid background, how straight the line is made can vary.
De-Interlacing
All scalers we know of that accept interlaced video as input contain de-interlacers.
All of the original analog video formats -- NTSC, PAL, and SECAM -- are interlaced. One video frame is said to consist of two consecutive fields, or top to bottom scans of the picture tube, first the odd scan lines and then the even scan lines, or vice versa.
De-interlacing is the process of making a complete video frame out of each field with some semblance of making the result look as if it was not originally interlaced.
Scaling, by contrast, operates on one field or one frame at a time. Scaling interlaced video without de-interlacing first can be done, but it produces a much softer picture. For example, NTSC has odd and even fields of 240 scan lines each. Scaling one field at a time will produce a result that has no more than 240 lines of vertical resolution.
Click here for more on de-interlacing.
Progressive DVD Player or External Scaler?
If you are in the market for a stand alone de-interlacer scaler unit, you will not be at a significant disadvantage shopping for your scaler first and living with your older standard interlaced DVD player until your budget recovers.
"Everyone knows" that a progressive scan DVD player has an advantage over a standard DVD player together with a de-interlacer in the TV or a de-interlacer that stands alone. "They" say that the picture quality is better when the video from the MPEG decoder (first stage in the DVD player) is connected to the player's own de-interlacer by a digital link instead of connected to an outside de-interlacer by an analog link, the component video cables.
With a de-interlacer scaler unit in the video signal path, the progressive scan DVD player loses this advantage. The progressive player's component video cable going to the scaler is an analog link, whereas a de-interlacer scaler unit has the digital link between de-interlacer and scaler within. Either way the number of analog to digital reconversions needed is now the same. (As we mentioned earlier, both de-interlacing and scaling are performed in the digital domain.)
While progressive scan DVD players have reached the low price range, the low end and even medium range players on average do not de-interlace non-film source programs as well as some of the better modestly priced de-interlacer scaler units.
Progressive DVD players will regain their advantage once digital (such as DVI) output cabling supersedes (analog) component video cables, or if scalers are built into DVD players.
Meanwhile the stand alone and TV built-in de-interlacer scaler units keep their own advantage of handling all video sources as opposed to just DVD.
Almost all external scalers have their own de-interlacers and comb filters and even at modest price ranges (under $3000) some are extremely good. Currently, if you install an external scaler, you can no longer use the comb filter and/or de-interlacer already in your TV set.
Do You Need A New Scaler?
We have given the same overall advice with other specialized pieces of video gear, such as de-interlacers. We do not recommend running out and buying a new scaler until after you have found something wrong with the scaler(s) already in your equipment. On average the scalers in TV sets (and projectors), in DVD players, etc. are getting better and better as the years go by.
There is of course one situation where you need a scaler, that is, if you have an upscale TV and you must use or wish to use a scan rate different from the video source.
Unfortunately we don't have much advice on what to look for in terms of scaler picture quality or deficiencies that would get you interested in a new scaler. The more obvious scaling deficiencies include irregular diagonal lines and edges.
Judder
The current generation of scalers does not reduce judder in film based video.
When the frame rate is changed, or "scaled", it is necessary to add or drop video frames. If a frame is added, some viewers will see that motion freezes for a moment unnaturally. If a frame is deleted, some viewers will see a slight jerk in motion. Unfortunately it is not practical in today's technology to blend, or interpolate, two video frames for the purpose of smoothing the resulting motion if, say, one of the frames is to be deleted.
For live video, where motion can occur with every succeeding frame, the state of the art accepts the judder caused by duplicating or deleting frames here and there to achieve a different frame rate.
For movies, the source frame rate is much lower. The most common example is U.S. filming at 24 frames per second and NTSC video at (approximately) 60 fields per second (60 frames per second after de-interlacing). Video repeats the film frame content in a 3, 2, 3, 2 pattern known as 3-2 pulldown. Some viewers notice judder when viewing this video standard.
There has been some interest in acquiring TV sets (and projectors) that operate at 72 fps rather than 60 fps to reduce the judder and also reduce flicker. While there are some PC based systems that generate an accurate 3-3 pulldown from film source DVD's, scalers are just as apt to produce 4-2 pulldown some of the time as opposed to a consistent 3-3 pulldown. The scaler would have to analyze the video subject matter, taking hints from the pixel content, to consistently duplicate only frames of twosomes or delete only frames of threesomes in the 3-2 pulldown cadence. The technology to do this has not yet reached affordability.
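The difference between 3-3 and 4-2 pulldown can be illustrated with a small sketch. It models a scaler that goes from 60 to 72 pictures per second by duplicating one picture in every five at a fixed offset, without looking at the content. That model, and the names pulldown_32, to_72_fixed, and run_lengths, are our assumptions for illustration; real scalers may behave differently.

```python
from itertools import groupby

def pulldown_32(film_frames):
    """Repeat film frames in a 3, 2, 3, 2 pattern (3-2 pulldown)."""
    video = []
    for i, frame in enumerate(film_frames):
        video.extend([frame] * (3 if i % 2 == 0 else 2))
    return video

def to_72_fixed(video, dup_index):
    """Duplicate the picture at offset dup_index in every group of five."""
    out = []
    for i, frame in enumerate(video):
        out.append(frame)
        if i % 5 == dup_index:
            out.append(frame)
    return out

def run_lengths(video):
    """How many consecutive pictures each film frame occupies."""
    return [len(list(group)) for _, group in groupby(video)]

video = pulldown_32("ABCD")                  # AAABBCCCDD
print(run_lengths(video))                    # [3, 2, 3, 2]
print(run_lengths(to_72_fixed(video, 3)))    # [3, 3, 3, 3]
print(run_lengths(to_72_fixed(video, 0)))    # [4, 2, 4, 2]
```

Whether the result is 3-3 or 4-2 depends only on where the duplicated picture falls in the cadence, which is why the scaler would need to detect the cadence from the picture content to get 3-3 consistently.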
Component Video versus RGB/VGA
Check the instructions of the TV (or projector) or external scaler to find out what kind of input(s) it expects and in the case of an external scaler what kind of output it delivers. Computer VGA is RGB. Component video is roughly red, white, and blue (luminance, a somewhat red sub-image, and a somewhat blue sub-image).
If either the input or the output is selectable as to whether it is RGB or component, the selection is always manual. Video circuits cannot tell whether the input is component or RGB by analyzing the subject matter. If the selection is wrong, the picture will be discolored but will still be intelligible enough to let you make the correct selection using on screen menus.
Some stand alone scalers deliver only RGB for "computer" scan rates 600p, 768p, and 1024p and deliver only component video output for "TV" scan rates 720p and 1080i. We feel that this is a handicap.
If a scaler is intended to accept and output RGB, it will deliver component video output with no settings changes if you feed it component video input. The reverse is also true with the slight chance of discolored fine detail depending on the quality of the circuits. The standard for component video is to carry the Pb and Pr components at half the horizontal resolution of the Y component.
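For readers curious about how RGB and component video relate, here is a minimal sketch of the conversion for a single pixel. It assumes the ITU-R BT.601 coefficients commonly used for standard definition video (HDTV uses different coefficients) and signal values scaled so that R, G, and B run from 0 to 1. The function names are ours.

```python
def rgb_to_ypbpr(r, g, b):
    """Convert one pixel from RGB (0..1) to Y, Pb, Pr (BT.601 assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    pb = (b - y) / 1.772                    # scaled "blue minus luminance"
    pr = (r - y) / 1.402                    # scaled "red minus luminance"
    return y, pb, pr

def ypbpr_to_rgb(y, pb, pr):
    """Inverse of rgb_to_ypbpr."""
    r = y + 1.402 * pr
    b = y + 1.772 * pb
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b

print(rgb_to_ypbpr(1.0, 1.0, 1.0))  # white: Y near 1, Pb and Pr near 0
print(rgb_to_ypbpr(1.0, 0.0, 0.0))  # pure red: Pr near 0.5
```

Feeding component video into a circuit that assumes RGB (or the reverse) skips this conversion, which is why a wrong selection gives a discolored but still recognizable picture.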
Video Synchronizing Signals
Every once in a while two pieces of equipment, for example a DVD player and a TV set, don't work together when one would suppose they should. Often the problem has to do with what happens after the end of one scan line and before the start of the next. In that interval, called the horizontal retrace interval, there are synchronizing pulses and a reference for what is considered black (the back porch).
The video sync. from an external scaler must match the TV requirements with accuracy to about five milliseconds for standard definition TV and perhaps two milliseconds for HDTV. Scaler manufacturers have had to second guess these requirements and it can be confusing to the user. Sometimes the scaler has selectable sync. choices, sometimes not.
For component video the sync. is normally combined with the Y signal. For RGB the sync. may be combined with the green (RGsB), carried on a fourth cable (RGBS), or split into horizontal and vertical sync. using five cables (RGBHV). Computer VGA is officially RGBHV.
We have heard of at least one scaler which delivered both sync. on green and separate H and V sync. at the same time. One particular TV set behaved differently when the H and V sync. inputs were active even though it accepted sync. on green. The problem arose because all five video signal lines were carried using one cable with multi-pin plugs and jacks (VGA connectors). The solution given to that user was to dissect the cable and cut the sync. wires.
The Lowdown On DVD 4:3 Downconversion
Yes, 16:9 enhanced (so-called anamorphic) DVD's play on a 4:3 TV worse with some DVD players compared with others.
All DVD players have a letterbox mode to display 16:9 enhanced disks on a 4:3 TV set.
To do this the player reconstructs (scales; downconverts) the video, discarding and/or blending scan lines so that there are 360 lines of picture with 60 black lines above and 60 black lines below. 360 is a magic number because that many scan lines occupy the largest 16:9 shaped space within a 4:3 screen.
Ideally the dropped lines should be evenly selected from the original 480 scan lines. Unfortunately some players drop two in a row here and there, and this can cause moving objects to flicker objectionably.
Downconversion Also Increases "Jaggies"
All video has some "jaggies" or "stairstepping" of diagonal lines or edges due to the use of pixels or scan lines. The greater number of finer scan lines in HDTV means that stairstepping is less noticeable.
When scan lines are dropped during downconversion, the slope of a diagonal line or edge becomes more shallow at that spot. Overall, what was once a straight diagonal line becomes a zigzag line of shallow, steep, shallow, steep short segments. Stand back from the screen and observe the diagram below.
The exact method of dropping scan lines for playback of 16:9 enhanced DVD's on 4:3 TV sets without 16:9 mode varies from one brand of player to another. The above diagrams show a near worst case. See below for additional explanations.
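The zigzag effect can also be shown with numbers. The sketch below takes a 45 degree diagonal (the line passes through pixel n on scan line n), drops scan lines, and lists how far the diagonal moves sideways from each remaining line to the next. The setup and the name horizontal_steps are ours, for illustration only.

```python
def horizontal_steps(kept_lines):
    """Sideways movement of a 45 degree diagonal between successive kept lines.

    The diagonal is assumed to pass through pixel n on scan line n, so the
    step between two kept lines equals the difference of their line numbers.
    """
    return [b - a for a, b in zip(kept_lines, kept_lines[1:])]

every_fourth = [n for n in range(1, 13) if n % 4 != 0]
two_in_a_row = [n for n in range(1, 17) if n % 8 not in (7, 0)]

print(horizontal_steps(every_fourth))  # [1, 1, 2, 1, 1, 2, 1, 1]
print(horizontal_steps(two_in_a_row))  # [1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1]
```

A step of 1 is the original slope; the steps of 2 or 3 are the shallow segments where lines were dropped, and they are larger when two lines are dropped together.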
Dropping Two Scan Lines In A Row
The usual interlaced video (includes component video output) has the scan lines arranged as follows prior to downconversion:
odd   even
1     2
3     4
5     6
7     8
9     10
11    12
Doing this downconversion, we want to drop every fourth line. But we cannot simply drop lines 4, 8, 12, 16, 20, 24, and so on because they are all even. We have to drop lines equally from both the odd and even fields.
Suppose we drop lines 4, 7, 12, 15, 20, 23, which is approximately every fourth line with every other dropped line from the opposite field. What remains is:
odd   even
1     2
3     6
5     8
9     10
11    14
13    16
17    18
We now have another problem: the remaining lines are out of order. For example, line 6 is above line 5.
To keep the remaining lines in order we have to drop lines at the same place in each field, as in dropping 7, 8, 15, 16, 23, 24. The result is dropping lines two at a time, which causes greater gaps in the vertical resolution:
odd   even
1     2
3     4
5     6
9     10
11    12
13    14
17    18
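The two dropping schemes can be reproduced with a short sketch. It assumes the player handles each field separately, packing the surviving lines of each field together, and that the TV then draws the fields interleaved. The name play_fields is ours.

```python
def play_fields(frame_lines, dropped):
    """Drop lines, then list them in the order an interlaced TV shows them.

    The surviving odd lines are packed into the odd field and the surviving
    even lines into the even field; on screen they alternate top to bottom
    (odd, even, odd, even, ...).
    """
    odd = [n for n in frame_lines if n % 2 == 1 and n not in dropped]
    even = [n for n in frame_lines if n % 2 == 0 and n not in dropped]
    shown = []
    for o, e in zip(odd, even):
        shown += [o, e]
    return shown

frame = range(1, 25)
staggered = play_fields(frame, {4, 7, 12, 15, 20, 23})
paired = play_fields(frame, {7, 8, 15, 16, 23, 24})

print(staggered[:14])  # [1, 2, 3, 6, 5, 8, 9, 10, 11, 14, 13, 16, 17, 18]
print(staggered == sorted(staggered))  # False: lines are out of order
print(paired[:14])     # [1, 2, 3, 4, 5, 6, 9, 10, 11, 12, 13, 14, 17, 18]
print(paired == sorted(paired))        # True: in order, but with double gaps
```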
The results are better if the dropped lines are blended with the next or previous line respectively rather than dropped outright. There is still some jerkiness for thin objects and contrasting colored edges moving vertically. An object moving vertically uniformly in the original video will spend twice as much time on a line that was blended from two lines compared with on a line that was not affected by the downconversion or upconversion.
Beginning in 1999, progressive scan DVD players started to appear on the market in quantity. These players construct an entire 480 line video frame before sending it to the TV which must be capable of accepting it.
With the entire video frame on hand we don't have the problem of dropping two scan lines in a row. We still need to drop (or blend) one out of every four scan lines, but this time they can be more evenly spaced. For example lines 4, 8, 12, 16, 20, 24, etc. might be dropped, leaving lines 1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14, 15. We can even generate a new interlaced video signal to output to TV sets that don't accept progressive scan (the vast majority). In the above example, lines 1, 3, 6, 9, 11, 14 are picked out and treated as the odd field. Then lines 2, 5, 7, 10, 13, 15 are picked out and treated as the even field.
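The same example written as a sketch: drop every fourth line from the full frame, then deal the remaining lines alternately into new odd and even fields. The name downconvert_progressive is ours, and outright dropping (rather than blending) is used only to keep the illustration short.

```python
def downconvert_progressive(line_count):
    """Drop every fourth line of a full frame, then re-interlace the rest.

    Returns (kept_lines, odd_field, even_field) using original line numbers.
    """
    kept = [n for n in range(1, line_count + 1) if n % 4 != 0]
    odd_field = kept[0::2]    # 1st, 3rd, 5th, ... remaining line
    even_field = kept[1::2]   # 2nd, 4th, 6th, ... remaining line
    return kept, odd_field, even_field

kept, odd_field, even_field = downconvert_progressive(16)
print(kept)        # [1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14, 15]
print(odd_field)   # [1, 3, 6, 9, 11, 14]
print(even_field)  # [2, 5, 7, 10, 13, 15]

print(len(downconvert_progressive(480)[0]))  # 360
```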
The "unknown" is that we don't know for sure which manufacturers generate the full video frames first and then drop the scan lines for downconverting the 16:9 enhanced DVD program for the 4:3 screen.
Click here for more information on how interlaced fields are combined to make full video frames (de-interlacing). Quality varies with make and model of DVD player.
Scaling and Downconversion Test
All parts (c) copyright 1997-2003, Allan W. Jayne, Jr. unless otherwise noted or other origin stated. All rights reserved.
P.O. Box 762, Nashua, NH 03061
603-889-1111 -- ajaynejr @ aol.com