Commit b48b2a5c authored by Juha Jeronen, committed by Jean-Baptiste Kempf

IVTC trivial fixes 2

Signed-off-by: Jean-Baptiste Kempf <jb@videolan.org>
parent f7c77e92
...@@ -2454,14 +2454,13 @@ static int RenderPhosphor( filter_t *p_filter, ...@@ -2454,14 +2454,13 @@ static int RenderPhosphor( filter_t *p_filter,
* Transcode, some from TVTime, and some original. * Transcode, some from TVTime, and some original.
* *
* If the input material is pure NTSC telecined film, inverse telecine * If the input material is pure NTSC telecined film, inverse telecine
* (also known as "film mode") will (ideally) exactly recover the original * will (ideally) exactly recover the original progressive film frames.
* (progressive film frames. The output will run at 4/5 of the original * The output will run at 4/5 of the original framerate with no loss of
* (framerate with no loss of information. Interlacing artifacts are removed, * information. Interlacing artifacts are removed, and motion becomes
* and motion becomes as smooth as it was on the original film. * as smooth as it was on the original film. For soft-telecined material,
* For soft-telecined material, on the other hand, the progressive frames * on the other hand, the progressive frames alredy exist, so only the
* alredy exist, so only the timings are changed such that the output * timings are changed such that the output becomes smooth 24fps (or would,
* becomes smooth 24fps (or would, if the output device had an infinite * if the output device had an infinite framerate).
* framerate).
* *
* Put in simple terms, this filter is targeted for NTSC movies and * Put in simple terms, this filter is targeted for NTSC movies and
* especially anime. Virtually all 1990s and early 2000s anime is * especially anime. Virtually all 1990s and early 2000s anime is
...@@ -2507,8 +2506,8 @@ static int RenderPhosphor( filter_t *p_filter, ...@@ -2507,8 +2506,8 @@ static int RenderPhosphor( filter_t *p_filter,
* Finally, note also that IVTC is the only correct way to deinterlace NTSC * Finally, note also that IVTC is the only correct way to deinterlace NTSC
* telecined material. Simply applying an interpolating deinterlacing filter * telecined material. Simply applying an interpolating deinterlacing filter
* (with no framerate doubling) is harmful for two reasons. First, even if * (with no framerate doubling) is harmful for two reasons. First, even if
* (the filter does not damage already progressive frames, it will lose half * the filter does not damage already progressive frames, it will lose half
* (of the available vertical resolution of those frames that are judged * of the available vertical resolution of those frames that are judged
* interlaced. Some algorithms combining data from multiple frames may be * interlaced. Some algorithms combining data from multiple frames may be
* able to counter this to an extent, effectively performing something akin * able to counter this to an extent, effectively performing something akin
* to the frame reconstruction part of IVTC. A more serious problem is that * to the frame reconstruction part of IVTC. A more serious problem is that
...@@ -2584,7 +2583,7 @@ static int RenderPhosphor( filter_t *p_filter, ...@@ -2584,7 +2583,7 @@ static int RenderPhosphor( filter_t *p_filter,
* field renderer displays the material (one field at a time, dominant * field renderer displays the material (one field at a time, dominant
* field first). * field first).
* *
* Note that the VFD may, *correctly*, flip mid-stream, if soft field repeats * The VFD may, *correctly*, flip mid-stream, if soft field repeats
* (repeat_pict) have been used. They are commonly used in soft telecine * (repeat_pict) have been used. They are commonly used in soft telecine
* (see below), but also occasional lone field repeats exist in some streams, * (see below), but also occasional lone field repeats exist in some streams,
* e.g., Sol Bianca. * e.g., Sol Bianca.
...@@ -2597,7 +2596,7 @@ static int RenderPhosphor( filter_t *p_filter, ...@@ -2597,7 +2596,7 @@ static int RenderPhosphor( filter_t *p_filter,
* The reason for the words "classical telecine" above, when field * The reason for the words "classical telecine" above, when field
* duplication was first mentioned, is that there exists a * duplication was first mentioned, is that there exists a
* "full field blended" version, where the added fields are not exact * "full field blended" version, where the added fields are not exact
* "duplicates, but are blends of the original film frames. This is rare * duplicates, but are blends of the original film frames. This is rare
* in NTSC, but some material like this reportedly exists. See * in NTSC, but some material like this reportedly exists. See
* http://www.animemusicvideos.org/guides/avtech/videogetb2a.html * http://www.animemusicvideos.org/guides/avtech/videogetb2a.html
* In these cases, the additional fields are a (probably 50%) blend of the * In these cases, the additional fields are a (probably 50%) blend of the
...@@ -2638,7 +2637,7 @@ static int RenderPhosphor( filter_t *p_filter, ...@@ -2638,7 +2637,7 @@ static int RenderPhosphor( filter_t *p_filter,
* Finally, note that telecined video is often edited directly in interlaced * Finally, note that telecined video is often edited directly in interlaced
* form, disregarding safe cut positions as pertains to the telecine sequence * form, disregarding safe cut positions as pertains to the telecine sequence
* (there are only two: between "d" and "e", or between "e" and the * (there are only two: between "d" and "e", or between "e" and the
* (next "a"). Thus, the telecine sequence will in practice jump erratically * next "a"). Thus, the telecine sequence will in practice jump erratically
* at cuts [**]. An aggressive detection strategy is needed to cope with * at cuts [**]. An aggressive detection strategy is needed to cope with
* this. * this.
* *
...@@ -2651,8 +2650,8 @@ static int RenderPhosphor( filter_t *p_filter, ...@@ -2651,8 +2650,8 @@ static int RenderPhosphor( filter_t *p_filter,
* if the interlaced picture is viewed as-is, the luma alternates every line, * if the interlaced picture is viewed as-is, the luma alternates every line,
* while the chroma alternates only every two lines of the picture. * while the chroma alternates only every two lines of the picture.
* *
* That is, an interlaced frame from a 4:2:0 telecine looks like this * That is, an interlaced frame in a 4:2:0 telecine looks like this
* (numbers indicate which frame the data comes from): * (numbers indicate which film frame the data comes from):
* *
* luma stored 4:2:0 chroma displayed chroma * luma stored 4:2:0 chroma displayed chroma
* 1111 1111 1111 * 1111 1111 1111
...@@ -2661,10 +2660,9 @@ static int RenderPhosphor( filter_t *p_filter, ...@@ -2661,10 +2660,9 @@ static int RenderPhosphor( filter_t *p_filter,
* 2222 2222 * 2222 2222
* ... ... ... * ... ... ...
* *
* The deinterlace filter sees the stored 4:2:0 chroma. * The deinterlace filter sees the stored 4:2:0 chroma. The "displayed chroma"
* The "displayed chroma" is only generated later in the filter chain * is only generated later in the filter chain (probably when YUV is converted
* (probably when YUV is converted to the display format, if the display * to the display format, if the display does not accept YUV 4:2:0 directly).
* does not accept YUV 4:2:0 directly).
* *
* *
* Next, how NTSC soft telecine works: * Next, how NTSC soft telecine works:
...@@ -2721,7 +2719,7 @@ static int RenderPhosphor( filter_t *p_filter, ...@@ -2721,7 +2719,7 @@ static int RenderPhosphor( filter_t *p_filter,
* *
* Finally, note also that a stream may also request a lone field repeat * Finally, note also that a stream may also request a lone field repeat
* (a sudden "3" surrounded by "2"s). Fortunately, these can be handled as * (a sudden "3" surrounded by "2"s). Fortunately, these can be handled as
* (a two-frame soft telecine, as they match the first and third * a two-frame soft telecine, as they match the first and third
* flag patterns above. * flag patterns above.
* *
* Combinations with several "3"s in a row are not valid for soft or hard * Combinations with several "3"s in a row are not valid for soft or hard
...@@ -2783,16 +2781,15 @@ static int RenderPhosphor( filter_t *p_filter, ...@@ -2783,16 +2781,15 @@ static int RenderPhosphor( filter_t *p_filter,
* From these cadence tables we can extract two strategies for * From these cadence tables we can extract two strategies for
* cadence detection. We use both. * cadence detection. We use both.
* *
* Strategy 1: duplicated fields. * Strategy 1: duplicated fields ("vektor").
* *
* Consider that each stencil position has a unique duplicate field * Consider that each stencil position has a unique duplicate field
* condition. In one unique position, "dea", there is no match; in all * condition. In one unique position, "dea", there is no match; in all
* other positions, exactly one. By conservatively filtering the * other positions, exactly one. By conservatively filtering the
* possibilities based on detected hard field repeats (identical fields * possibilities based on detected hard field repeats (identical fields
* in successive input frames), it is possible to gradually lock on * in successive input frames), it is possible to gradually lock on
* to the cadence. This kind of strategy is used by Vektor's classic * to the cadence. This kind of strategy is used by the classic IVTC filter
* IVTC filter from TVTime (although there are some implementation * in TVTime/Xine by Billy Biggs (Vektor), hence the name.
* differences when compared to ours).
* *
* "Conservative" here means that we do not rule anything out, but start at * "Conservative" here means that we do not rule anything out, but start at
* each stencil position by suggesting the position "dea", and then only add * each stencil position by suggesting the position "dea", and then only add
...@@ -2807,7 +2804,7 @@ static int RenderPhosphor( filter_t *p_filter, ...@@ -2807,7 +2804,7 @@ static int RenderPhosphor( filter_t *p_filter,
* duplicate field detection against the input. It is very good at staying * duplicate field detection against the input. It is very good at staying
* locked on once it acquires the cadence, and it does so correctly very * locked on once it acquires the cadence, and it does so correctly very
* often. These are indeed characteristics that can be observed in the * often. These are indeed characteristics that can be observed in the
* behaviour of Vektor's classic filter. * behaviour of the TVTime/Xine filter.
* *
* Note especially that 8fps/12fps animation, common in anime, will cause * Note especially that 8fps/12fps animation, common in anime, will cause
* spurious hard-repeated fields. The conservative nature of the method * spurious hard-repeated fields. The conservative nature of the method
...@@ -2835,10 +2832,10 @@ static int RenderPhosphor( filter_t *p_filter, ...@@ -2835,10 +2832,10 @@ static int RenderPhosphor( filter_t *p_filter,
* is detected. * is detected.
* *
* *
* Strategy 2: progressive/interlaced field combinations. * Strategy 2: progressive/interlaced field combinations ("scores").
* *
* We can also form a second strategy, which is not as reliable in practice, * We can also form a second strategy, which is not as reliable in practice,
* but which locks on faster. This is original to this filter. * but which locks on faster when it does. This is original to this filter.
* *
* Consider all possible field pairs from two successive frames: TCBC, TCBN, * Consider all possible field pairs from two successive frames: TCBC, TCBN,
* TNBC, TNBN. After one frame, these become TPBP, TPBC, TCBP, TCBC. * TNBC, TNBN. After one frame, these become TPBP, TPBC, TCBP, TCBC.
...@@ -2846,18 +2843,20 @@ static int RenderPhosphor( filter_t *p_filter, ...@@ -2846,18 +2843,20 @@ static int RenderPhosphor( filter_t *p_filter,
* are the exhaustive list of possible field pairs from two successive * are the exhaustive list of possible field pairs from two successive
* frames in the three-frame PCN stencil. * frames in the three-frame PCN stencil.
* *
* The field pairs can be used for cadence position detection. The above * The above tables list triplets of field pair combinations for each cadence
* tables list triplets of field pair combinations for each cadence position, * position, which should produce progressive frames. All the given triplets
* which should produce progressive frames. All the given triplets are unique * are unique in each table alone, although the one at "dea" is
* in each table alone, although the one at "dea" is indistinguishable from * indistinguishable from the case of pure progressive material. It is also
* the case of pure progressive material. It is also the only one which is * the only one which is not unique across both tables.
* not unique across both tables.
* *
* Thus, all sequences of two neighboring triplets are unique across both * Thus, all sequences of two neighboring triplets are unique across both
* tables. (For "neighboring", each table is considered to wrap around from * tables. (For "neighboring", each table is considered to wrap around from
* "eab" back to "abc", i.e. from the last row back to the first row.) * "eab" back to "abc", i.e. from the last row back to the first row.)
* Furthermore, each sequence of three neighboring triplets is redundantly * Furthermore, each sequence of three neighboring triplets is redundantly
* unique (i.e. is unique, and reduces the chance of false positives). * unique (i.e. is unique, and reduces the chance of false positives).
* (In practice, though, we already know which table to consider, from the fact
* that TFD and VFD must match. Checking only the relevant table makes the
* strategy slightly more robust.)
* *
* The important idea is: *all other* field pair combinations should produce * The important idea is: *all other* field pair combinations should produce
* frames that look interlaced. This includes those combinations present in * frames that look interlaced. This includes those combinations present in
...@@ -2866,27 +2865,26 @@ static int RenderPhosphor( filter_t *p_filter, ...@@ -2866,27 +2865,26 @@ static int RenderPhosphor( filter_t *p_filter,
* uniqueness property, *every* "wrong" row will always contain at least one * uniqueness property, *every* "wrong" row will always contain at least one
* combination that differs from those in the "correct" row). * combination that differs from those in the "correct" row).
* *
* As for how we use these observations, we generate the artificial frames * We generate the artificial frames TCBC, TCBN, TNBC and TNBN (virtually;
* TCBC, TCBN, TNBC and TNBN (virtually; no data is actually moved). * no data is actually moved). Two of these are just the frames C and N,
* Two of these are just the frames C and N, which already exist; the two * which already exist; the two others correspond to composing the given
* others correspond to composing the given field pairs. We then compute * field pairs. We then compute the interlace score for each of these frames.
* the interlace score for each of these frames. The interlace scores * The interlace scores of what are now TPBP, TPBC and TCBP, also needed,
* of what are now TPBP, TPBC and TCBP, also needed, were computed by * were computed by this same mechanism during the previous input frame.
* this same mechanism during the previous input frame. These can be slided * These can be slided in history and reused.
* in history and reused.
* *
* We then check, using the computed interlace scores, and taking into * We then check, using the computed interlace scores, and taking into
* account the video field dominance information (to only check valid * account the video field dominance information, which field combination
* combinations), which field combination triplet given in the tables * triplet given in the appropriate table produces the smallest sum of
* produces the smallest sum of interlace scores. Unless we are at * interlace scores. Unless we are at PCN = "dea" (which could also be pure
* PCN = "dea" (which could also be pure progressive!), this immediately * progressive!), this immediately gives us the most likely current cadence
* gives us the most likely current cadence position. Combined with a * position. Combined with a two-step history, the sequence of three most
* two-step history, the sequence of three most likely positions found this * likely positions found this way always allows us to make a more or less
* way always allows us to make a more or less reliable detection. (That is, * reliable detection. (That is, when a reliable detection is possible; if the
* when a reliable detection is possible; note that if the video has no * video has no motion at all, every detection will report the position "dea".
* motion at all, every detection will report the position "dea". In anime, * In anime, still shots are common. Thus we must augment this with a
* still shots are common. Thus we must augment this with a full-frame motion * full-frame motion detection that switches the detector off if no motion
* detection that switches the detector off if no motion was detected.) * was detected.)
* *
* The detection seems to need four full-frame interlace analyses per frame. * The detection seems to need four full-frame interlace analyses per frame.
* Actually, three are enough, because the previous N is the new C, so we can * Actually, three are enough, because the previous N is the new C, so we can
...@@ -2923,11 +2921,11 @@ static int RenderPhosphor( filter_t *p_filter, ...@@ -2923,11 +2921,11 @@ static int RenderPhosphor( filter_t *p_filter,
* reliably on a valid cadence. * reliably on a valid cadence.
* *
* When the cadence fails (we detect this from a sudden upward jump in the * When the cadence fails (we detect this from a sudden upward jump in the
* interlace scores of the constructed frames), we reset the "TVTime" * interlace scores of the constructed frames), we reset the "vektor"
* detector strategy and fall back to an emergency frame composer, where we * detector strategy and fall back to an emergency frame composer, where we
* use ideas from Transcode's IVTC. * use ideas from Transcode's IVTC.
* *
* In the emergency mode, we simply output the least interlaced frame out of * In this emergency mode, we simply output the least interlaced frame out of
* the combinations TNBN, TNBC and TCBN (where only one of the last two is * the combinations TNBN, TNBC and TCBN (where only one of the last two is
* tested, based on the stream TFF/BFF information). In this mode, we do not * tested, based on the stream TFF/BFF information). In this mode, we do not
* touch the timestamps, and just pass all five frames from each group right * touch the timestamps, and just pass all five frames from each group right
...@@ -2944,7 +2942,8 @@ static int RenderPhosphor( filter_t *p_filter, ...@@ -2944,7 +2942,8 @@ static int RenderPhosphor( filter_t *p_filter,
* *
* To make five into four we need to extend frame durations by 25%. * To make five into four we need to extend frame durations by 25%.
* Consider the following diagram (times given in 90kHz ticks, rounded to * Consider the following diagram (times given in 90kHz ticks, rounded to
* integers; this is just for illustration): * integers; this is just for illustration, and for comparison with the
* "scratch paper" comments in pulldown.c of TVTime/Xine):
* *
* NTSC input (29.97 fps) * NTSC input (29.97 fps)
* a b c d e a (from next group) ... * a b c d e a (from next group) ...
...@@ -2955,7 +2954,7 @@ static int RenderPhosphor( filter_t *p_filter, ...@@ -2955,7 +2954,7 @@ static int RenderPhosphor( filter_t *p_filter,
* *
* Three of the film frames have length 3754, and one has 3753 * Three of the film frames have length 3754, and one has 3753
* (it is 1/90000 sec shorter). This rounding was chosen so that the lengths * (it is 1/90000 sec shorter). This rounding was chosen so that the lengths
* (of the group of four sum to the original 15015. * of the group of four sum to the original 15015.
* *
* From the diagram we get these deltas for presentation timestamp adjustment * From the diagram we get these deltas for presentation timestamp adjustment
* (in 90 kHz ticks, for illustration): * (in 90 kHz ticks, for illustration):
...@@ -2979,9 +2978,9 @@ static int RenderPhosphor( filter_t *p_filter, ...@@ -2979,9 +2978,9 @@ static int RenderPhosphor( filter_t *p_filter,
* position "d". (Alternatively, upon lock-on, we could wait until we are * position "d". (Alternatively, upon lock-on, we could wait until we are
* at "a" before switching on IVTC, but this makes the maximal delay * at "a" before switching on IVTC, but this makes the maximal delay
* [max. detection + max. wait] = 3 + 4 = 7 input frames, which comes to * [max. detection + max. wait] = 3 + 4 = 7 input frames, which comes to
* [7/30 ~ 0.23 seconds instead of the 3/30 = 0.10 seconds from purely * 7/30 ~ 0.23 seconds instead of the 3/30 = 0.10 seconds from purely
* the detection. I prefer the one-time jerk, which also happens to be * the detection. The one-time jerk is simpler to implement and gives the
* simpler to implement.) * faster lock-on.)
* *
* It is clear that "e" is a safe choice for the dropped frame. This can be * It is clear that "e" is a safe choice for the dropped frame. This can be
* seen from the timings and the cadence tables. First, consider the timings. * seen from the timings and the cadence tables. First, consider the timings.
......
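The timing arithmetic described in the comment (five NTSC frames of 3003 ticks each, re-timed as four film frames whose lengths sum to 15015) can be checked with a small sketch. This is not code from the commit: the helper names `film_frame_start` and `film_frame_duration` are hypothetical, and rounding each frame boundary to the nearest tick is an assumption. The comment only states that three frames have length 3754 and one has 3753, not which frame is the short one.

```c
#include <assert.h>
#include <stdint.h>

/* One NTSC frame at 30000/1001 fps, in 90 kHz ticks: 90000 * 1001 / 30000. */
#define NTSC_FRAME_TICKS 3003
/* One telecine group is five NTSC frames: 5 * 3003 = 15015 ticks. */
#define GROUP_TICKS (5 * NTSC_FRAME_TICKS)

/* Hypothetical helper: start offset of film frame k (0..4) within a group,
 * i.e. k * 15015 / 4 rounded to the nearest tick (assumed rounding rule).
 * k = 4 is the start of the first film frame of the next group. */
static int64_t film_frame_start(int k)
{
    assert(k >= 0 && k <= 4);
    return ((int64_t)k * GROUP_TICKS + 2) / 4;
}

/* Hypothetical helper: duration of film frame k (0..3) in 90 kHz ticks. */
static int64_t film_frame_duration(int k)
{
    return film_frame_start(k + 1) - film_frame_start(k);
}
```

Under this rounding rule the four durations are 3754, 3754, 3753 and 3754 ticks, so the group still spans exactly 15015 ticks. Each film frame lasts about 25% longer than the 3003-tick input frame.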