Commit b48b2a5c authored by Juha Jeronen, committed by Jean-Baptiste Kempf

IVTC trivial fixes 2

Signed-off-by: Jean-Baptiste Kempf <jb@videolan.org>
parent f7c77e92
@@ -2454,14 +2454,13 @@ static int RenderPhosphor( filter_t *p_filter,
* Transcode, some from TVTime, and some original.
*
* If the input material is pure NTSC telecined film, inverse telecine
* (also known as "film mode") will (ideally) exactly recover the original
* (progressive film frames. The output will run at 4/5 of the original
* (framerate with no loss of information. Interlacing artifacts are removed,
* and motion becomes as smooth as it was on the original film.
* For soft-telecined material, on the other hand, the progressive frames
* already exist, so only the timings are changed such that the output
* becomes smooth 24fps (or would, if the output device had an infinite
* framerate).
* will (ideally) exactly recover the original progressive film frames.
* The output will run at 4/5 of the original framerate with no loss of
* information. Interlacing artifacts are removed, and motion becomes
* as smooth as it was on the original film. For soft-telecined material,
* on the other hand, the progressive frames already exist, so only the
* timings are changed such that the output becomes smooth 24fps (or would,
* if the output device had an infinite framerate).
*
* Put in simple terms, this filter is targeted for NTSC movies and
* especially anime. Virtually all 1990s and early 2000s anime is
@@ -2507,8 +2506,8 @@ static int RenderPhosphor( filter_t *p_filter,
* Finally, note also that IVTC is the only correct way to deinterlace NTSC
* telecined material. Simply applying an interpolating deinterlacing filter
* (with no framerate doubling) is harmful for two reasons. First, even if
* (the filter does not damage already progressive frames, it will lose half
* (of the available vertical resolution of those frames that are judged
* the filter does not damage already progressive frames, it will lose half
* of the available vertical resolution of those frames that are judged
* interlaced. Some algorithms combining data from multiple frames may be
* able to counter this to an extent, effectively performing something akin
* to the frame reconstruction part of IVTC. A more serious problem is that
@@ -2584,7 +2583,7 @@ static int RenderPhosphor( filter_t *p_filter,
* field renderer displays the material (one field at a time, dominant
* field first).
*
* Note that the VFD may, *correctly*, flip mid-stream, if soft field repeats
* The VFD may, *correctly*, flip mid-stream, if soft field repeats
* (repeat_pict) have been used. They are commonly used in soft telecine
* (see below), but also occasional lone field repeats exist in some streams,
* e.g., Sol Bianca.
@@ -2597,7 +2596,7 @@ static int RenderPhosphor( filter_t *p_filter,
* The reason for the words "classical telecine" above, when field
* duplication was first mentioned, is that there exists a
* "full field blended" version, where the added fields are not exact
* "duplicates, but are blends of the original film frames. This is rare
* duplicates, but are blends of the original film frames. This is rare
* in NTSC, but some material like this reportedly exists. See
* http://www.animemusicvideos.org/guides/avtech/videogetb2a.html
* In these cases, the additional fields are a (probably 50%) blend of the
@@ -2638,7 +2637,7 @@ static int RenderPhosphor( filter_t *p_filter,
* Finally, note that telecined video is often edited directly in interlaced
* form, disregarding safe cut positions as pertains to the telecine sequence
* (there are only two: between "d" and "e", or between "e" and the
* (next "a"). Thus, the telecine sequence will in practice jump erratically
* next "a"). Thus, the telecine sequence will in practice jump erratically
* at cuts [**]. An aggressive detection strategy is needed to cope with
* this.
*
@@ -2651,8 +2650,8 @@ static int RenderPhosphor( filter_t *p_filter,
* if the interlaced picture is viewed as-is, the luma alternates every line,
* while the chroma alternates only every two lines of the picture.
*
* That is, an interlaced frame from a 4:2:0 telecine looks like this
* (numbers indicate which frame the data comes from):
* That is, an interlaced frame in a 4:2:0 telecine looks like this
* (numbers indicate which film frame the data comes from):
*
* luma stored 4:2:0 chroma displayed chroma
* 1111 1111 1111
@@ -2661,10 +2660,9 @@ static int RenderPhosphor( filter_t *p_filter,
* 2222 2222
* ... ... ...
*
* The deinterlace filter sees the stored 4:2:0 chroma.
* The "displayed chroma" is only generated later in the filter chain
* (probably when YUV is converted to the display format, if the display
* does not accept YUV 4:2:0 directly).
* The deinterlace filter sees the stored 4:2:0 chroma. The "displayed chroma"
* is only generated later in the filter chain (probably when YUV is converted
* to the display format, if the display does not accept YUV 4:2:0 directly).
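The luma/chroma relationship described in this hunk can be sketched in C as follows. This is only an illustration, not code from the filter: the function names are hypothetical, rows are counted from 0, and the "displayed chroma" is assumed to be produced by plain row repetition (each stored chroma row reused for two picture rows), which the comment does not specify.

```c
/* Film frame (1 or 2) that supplies luma row y of the interlaced picture.
 * Luma alternates every line: 1, 2, 1, 2, ... */
static int luma_source(int y)
{
    return 1 + (y % 2);
}

/* Film frame that supplies row c of the stored 4:2:0 chroma plane.
 * The stored plane has half as many rows, and they also alternate. */
static int stored_chroma_source(int c)
{
    return 1 + (c % 2);
}

/* Film frame whose chroma ends up on picture row y once the chroma is
 * upsampled for display.
 * ASSUMPTION: upsampling by row repetition, i.e. picture row y reuses
 * stored chroma row y / 2. */
static int displayed_chroma_source(int y)
{
    return stored_chroma_source(y / 2);
}
```

For picture rows 0 to 3 this yields luma sources 1, 2, 1, 2 but displayed chroma sources 1, 1, 2, 2, which is the "alternates only every two lines" behaviour the comment describes.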
*
*
* Next, how NTSC soft telecine works:
......@@ -2721,7 +2719,7 @@ static int RenderPhosphor( filter_t *p_filter,
*
* Finally, note also that a stream may also request a lone field repeat
* (a sudden "3" surrounded by "2"s). Fortunately, these can be handled as
* (a two-frame soft telecine, as they match the first and third
* a two-frame soft telecine, as they match the first and third
* flag patterns above.
*
* Combinations with several "3"s in a row are not valid for soft or hard
@@ -2783,16 +2781,15 @@ static int RenderPhosphor( filter_t *p_filter,
* From these cadence tables we can extract two strategies for
* cadence detection. We use both.
*
* Strategy 1: duplicated fields.
* Strategy 1: duplicated fields ("vektor").
*
* Consider that each stencil position has a unique duplicate field
* condition. In one unique position, "dea", there is no match; in all
* other positions, exactly one. By conservatively filtering the
* possibilities based on detected hard field repeats (identical fields
* in successive input frames), it is possible to gradually lock on
* to the cadence. This kind of strategy is used by Vektor's classic
* IVTC filter from TVTime (although there are some implementation
* differences when compared to ours).
* to the cadence. This kind of strategy is used by the classic IVTC filter
* in TVTime/Xine by Billy Biggs (Vektor), hence the name.
*
* "Conservative" here means that we do not rule anything out, but start at
* each stencil position by suggesting the position "dea", and then only add
@@ -2807,7 +2804,7 @@ static int RenderPhosphor( filter_t *p_filter,
* duplicate field detection against the input. It is very good at staying
* locked on once it acquires the cadence, and it does so correctly very
* often. These are indeed characteristics that can be observed in the
* behaviour of Vektor's classic filter.
* behaviour of the TVTime/Xine filter.
*
* Note especially that 8fps/12fps animation, common in anime, will cause
* spurious hard-repeated fields. The conservative nature of the method
@@ -2835,10 +2832,10 @@ static int RenderPhosphor( filter_t *p_filter,
* is detected.
*
*
* Strategy 2: progressive/interlaced field combinations.
* Strategy 2: progressive/interlaced field combinations ("scores").
*
* We can also form a second strategy, which is not as reliable in practice,
* but which locks on faster. This is original to this filter.
* but which locks on faster when it does. This is original to this filter.
*
* Consider all possible field pairs from two successive frames: TCBC, TCBN,
* TNBC, TNBN. After one frame, these become TPBP, TPBC, TCBP, TCBC.
@@ -2846,18 +2843,20 @@ static int RenderPhosphor( filter_t *p_filter,
* are the exhaustive list of possible field pairs from two successive
* frames in the three-frame PCN stencil.
*
* The field pairs can be used for cadence position detection. The above
* tables list triplets of field pair combinations for each cadence position,
* which should produce progressive frames. All the given triplets are unique
* in each table alone, although the one at "dea" is indistinguishable from
* the case of pure progressive material. It is also the only one which is
* not unique across both tables.
* The above tables list triplets of field pair combinations for each cadence
* position, which should produce progressive frames. All the given triplets
* are unique in each table alone, although the one at "dea" is
* indistinguishable from the case of pure progressive material. It is also
* the only one which is not unique across both tables.
*
* Thus, all sequences of two neighboring triplets are unique across both
* tables. (For "neighboring", each table is considered to wrap around from
* "eab" back to "abc", i.e. from the last row back to the first row.)
* Furthermore, each sequence of three neighboring triplets is redundantly
* unique (i.e. is unique, and reduces the chance of false positives).
* (In practice, though, we already know which table to consider, from the fact
* that TFD and VFD must match. Checking only the relevant table makes the
* strategy slightly more robust.)
*
* The important idea is: *all other* field pair combinations should produce
* frames that look interlaced. This includes those combinations present in
@@ -2866,27 +2865,26 @@ static int RenderPhosphor( filter_t *p_filter,
* uniqueness property, *every* "wrong" row will always contain at least one
* combination that differs from those in the "correct" row).
*
* As for how we use these observations, we generate the artificial frames
* TCBC, TCBN, TNBC and TNBN (virtually; no data is actually moved).
* Two of these are just the frames C and N, which already exist; the two
* others correspond to composing the given field pairs. We then compute
* the interlace score for each of these frames. The interlace scores
* of what are now TPBP, TPBC and TCBP, also needed, were computed by
* this same mechanism during the previous input frame. These can be slid
* in history and reused.
* We generate the artificial frames TCBC, TCBN, TNBC and TNBN (virtually;
* no data is actually moved). Two of these are just the frames C and N,
* which already exist; the two others correspond to composing the given
* field pairs. We then compute the interlace score for each of these frames.
* The interlace scores of what are now TPBP, TPBC and TCBP, also needed,
* were computed by this same mechanism during the previous input frame.
* These can be slid in history and reused.
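The reuse of scores across input frames can be sketched as below. This is a minimal illustration under stated assumptions, not the filter's actual code: the enum, the `SCORE_UNKNOWN` marker and the function name are hypothetical. The mapping itself follows the comment: after one frame, TCBC, TCBN, TNBC and TNBN become TPBP, TPBC, TCBP and TCBC.

```c
#include <stdint.h>

/* Hypothetical indices for the seven field pairs of the P/C/N stencil. */
enum field_pair {
    TPBP, TPBC, TCBP, TCBC, TCBN, TNBC, TNBN,
    NUM_FIELD_PAIRS
};

/* Hypothetical marker for "not computed yet for the current input frame". */
#define SCORE_UNKNOWN (-1)

/* Slide the interlace scores one input frame into the past.
 * What was C is now P and what was N is now C, so four of the old
 * scores can be reused under their new names. */
static void slide_scores(int32_t scores[NUM_FIELD_PAIRS])
{
    scores[TPBP] = scores[TCBC];
    scores[TPBC] = scores[TCBN];
    scores[TCBP] = scores[TNBC];
    scores[TCBC] = scores[TNBN]; /* the previous N is the new C */

    /* Only the pairs involving the new N frame need a fresh analysis. */
    scores[TCBN] = SCORE_UNKNOWN;
    scores[TNBC] = SCORE_UNKNOWN;
    scores[TNBN] = SCORE_UNKNOWN;
}
```

After the slide, three entries remain to be computed for the new frame, which matches the remark further down in the comment that three full-frame analyses per input frame are enough.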
*
* We then check, using the computed interlace scores, and taking into
* account the video field dominance information (to only check valid
* combinations), which field combination triplet given in the tables
* produces the smallest sum of interlace scores. Unless we are at
* PCN = "dea" (which could also be pure progressive!), this immediately
* gives us the most likely current cadence position. Combined with a
* two-step history, the sequence of three most likely positions found this
* way always allows us to make a more or less reliable detection. (That is,
* when a reliable detection is possible; note that if the video has no
* motion at all, every detection will report the position "dea". In anime,
* still shots are common. Thus we must augment this with a full-frame motion
* detection that switches the detector off if no motion was detected.)
* account the video field dominance information, which field combination
* triplet given in the appropriate table produces the smallest sum of
* interlace scores. Unless we are at PCN = "dea" (which could also be pure
* progressive!), this immediately gives us the most likely current cadence
* position. Combined with a two-step history, the sequence of three most
* likely positions found this way always allows us to make a more or less
* reliable detection. (That is, when a reliable detection is possible; if the
* video has no motion at all, every detection will report the position "dea".
* In anime, still shots are common. Thus we must augment this with a
* full-frame motion detection that switches the detector off if no motion
* was detected.)
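The selection step of the "scores" strategy can be sketched as follows. This is a hedged illustration, not the filter's implementation: the function name and signature are hypothetical, and the cadence table is a caller-supplied parameter because the real tables are not part of this diff. Each table row lists the three field pair indices that should look progressive at one cadence position.

```c
#include <stdint.h>

/* Return the index of the table row whose three field pairs have the
 * smallest sum of interlace scores, or -1 if detection is switched off.
 *
 * scores:     interlace score per field pair, indexed as in the table
 * table:      one row per cadence position, three field pair indices each
 *             (supplied by the caller; contents not reproduced here)
 * num_rows:   number of rows in the table
 * has_motion: result of a full-frame motion check; without motion every
 *             row looks progressive, so no position is reported */
static int best_cadence_row(const int32_t scores[],
                            const int table[][3],
                            int num_rows,
                            int has_motion)
{
    if (!has_motion)
        return -1;

    int best_row = -1;
    int64_t best_sum = 0;

    for (int row = 0; row < num_rows; row++) {
        int64_t sum = 0;
        for (int k = 0; k < 3; k++)
            sum += scores[table[row][k]];

        if (best_row < 0 || sum < best_sum) {
            best_row = row;
            best_sum = sum;
        }
    }
    return best_row;
}
```

A caller would pass only the table that matches the detected field dominance, then feed the per-frame result into the two-step history the comment mentions. Ties are resolved in favour of the earlier row here, which is an arbitrary choice for this sketch.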
*
* The detection seems to need four full-frame interlace analyses per frame.
* Actually, three are enough, because the previous N is the new C, so we can
@@ -2923,11 +2921,11 @@ static int RenderPhosphor( filter_t *p_filter,
* reliably on a valid cadence.
*
* When the cadence fails (we detect this from a sudden upward jump in the
* interlace scores of the constructed frames), we reset the "TVTime"
* interlace scores of the constructed frames), we reset the "vektor"
* detector strategy and fall back to an emergency frame composer, where we
* use ideas from Transcode's IVTC.
*
* In the emergency mode, we simply output the least interlaced frame out of
* In this emergency mode, we simply output the least interlaced frame out of
* the combinations TNBN, TNBC and TCBN (where only one of the last two is
* tested, based on the stream TFF/BFF information). In this mode, we do not
* touch the timestamps, and just pass all five frames from each group right
@@ -2944,7 +2942,8 @@ static int RenderPhosphor( filter_t *p_filter,
*
* To make five into four we need to extend frame durations by 25%.
* Consider the following diagram (times given in 90kHz ticks, rounded to
* integers; this is just for illustration):
* integers; this is just for illustration, and for comparison with the
* "scratch paper" comments in pulldown.c of TVTime/Xine):
*
* NTSC input (29.97 fps)
* a b c d e a (from next group) ...
@@ -2955,7 +2954,7 @@ static int RenderPhosphor( filter_t *p_filter,
*
* Three of the film frames have length 3754, and one has 3753
* (it is 1/90000 sec shorter). This rounding was chosen so that the lengths
* (of the group of four sum to the original 15015.
* of the group of four sum to the original 15015.
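The frame length arithmetic can be checked with a small C sketch. It is illustrative only: the function name is hypothetical, and which of the four film frames receives the shorter length is an assumption (the last one here), since the diagram itself is not shown in this hunk. One NTSC frame lasts 1001/30000 s, that is 3003 ticks at 90 kHz, so a group of five input frames spans 15015 ticks.

```c
#include <stdint.h>

/* Duration of one telecine group: five NTSC frames of 3003 ticks each. */
#define NTSC_FRAME_TICKS 3003
#define GROUP_TICKS      (5 * NTSC_FRAME_TICKS) /* 15015 */

/* Split one group into four integer film frame durations that sum
 * exactly to group_ticks.
 * ASSUMPTION: the leftover ticks go to the earliest frames, so any
 * shorter frame ends up last. */
static void film_frame_durations(int64_t group_ticks, int64_t out[4])
{
    int64_t base = group_ticks / 4;      /* 3753 for 15015 */
    int64_t remainder = group_ticks % 4; /* 3 for 15015 */

    for (int i = 0; i < 4; i++)
        out[i] = base + (i < remainder ? 1 : 0);
}
```

Calling `film_frame_durations(GROUP_TICKS, d)` gives three frames of 3754 ticks and one of 3753, so the output group has the same total length as the input group and timestamps do not drift from group to group.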
*
* From the diagram we get these deltas for presentation timestamp adjustment
* (in 90 kHz ticks, for illustration):
@@ -2979,9 +2978,9 @@ static int RenderPhosphor( filter_t *p_filter,
* position "d". (Alternatively, upon lock-on, we could wait until we are
* at "a" before switching on IVTC, but this makes the maximal delay
* [max. detection + max. wait] = 3 + 4 = 7 input frames, which comes to
* [7/30 ~ 0.23 seconds instead of the 3/30 = 0.10 seconds from purely
* the detection. I prefer the one-time jerk, which also happens to be
* simpler to implement.)
* 7/30 ~ 0.23 seconds instead of the 3/30 = 0.10 seconds from purely
* the detection. The one-time jerk is simpler to implement and gives the
* faster lock-on.)
*
* It is clear that "e" is a safe choice for the dropped frame. This can be
* seen from the timings and the cadence tables. First, consider the timings.