14 Commits

Author SHA1 Message Date
Carl Eugen Hoyos
19a6431ec2 Print a warning if a subtitle demuxer changes utf16 to utf8.
This does not fix anything but gives users a chance to
know that they must not pass -sub_charenc UTF-16 to ffmpeg.

Fixes ticket .
2014-10-29 01:32:44 +01:00
wm4
b7f641dc9b avformat/realtextdec: UTF-16 support
Also remove ff_smil_extract_next_chunk - this was the last user of it.
2014-09-05 23:13:07 +02:00
wm4
231a514dd3 avformat/samidec: UTF-16 support
ff_smil_extract_next_chunk() is still used by RealText.
2014-09-05 23:13:07 +02:00
wm4
d658ef18e3 avformat/srtdec: UTF-16 support 2014-09-05 23:13:07 +02:00
wm4
3e8426170c avformat/assdec: UTF-16 support
Use the UTF-16 BOM to detect UTF-16 encoding. Convert the file contents
to UTF-8 on the fly using FFTextReader, which acts as a converting
wrapper around AVIOContext. It can also work on a static buffer, which
is needed for format probing. The FFTextReader wrapper now also takes
care of skipping the UTF-8 BOM.
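
A minimal sketch of BOM-based detection on a static buffer, as described
above. The enum and the function name detect_bom() are hypothetical and
are not the FFTextReader API; UTF-32 BOMs are deliberately not handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only. */
enum text_encoding {
    ENC_NO_BOM,   /* no BOM found: treat the data as UTF-8/unknown */
    ENC_UTF8_BOM, /* EF BB BF */
    ENC_UTF16LE,  /* FF FE */
    ENC_UTF16BE   /* FE FF */
};

/* Inspect the first bytes of buf and report which BOM, if any, is present.
 * *bom_len receives the number of bytes the caller should skip. */
static enum text_encoding detect_bom(const uint8_t *buf, size_t size,
                                     size_t *bom_len)
{
    *bom_len = 0;
    if (size >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF) {
        *bom_len = 3;
        return ENC_UTF8_BOM;
    }
    if (size >= 2 && buf[0] == 0xFF && buf[1] == 0xFE) {
        *bom_len = 2;
        return ENC_UTF16LE;
    }
    if (size >= 2 && buf[0] == 0xFE && buf[1] == 0xFF) {
        *bom_len = 2;
        return ENC_UTF16BE;
    }
    return ENC_NO_BOM;
}
```

Because the function only looks at a byte buffer, the same check can
serve both probing and the start of actual reading.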

Fix Ticket .
2014-09-05 23:13:07 +02:00
Clément Bœsch
dbfe61100b avformat/vobsub: fix several issues.
Here is an extract of fate-samples/sub/vobsub.idx, with an additional
text at the end of each line to better identify each bitmap:

    timestamp: 00:04:55:445, filepos: 00001b000 Ace!
    timestamp: 00:05:00:049, filepos: 00001b800 Wake up, honey!
    timestamp: 00:05:02:018, filepos: 00001c800 I gotta go to work.
    timestamp: 00:05:02:035, filepos: 00001d000 <???>
    timestamp: 00:05:04:203, filepos: 00001d800 Look after Clayton, okay?
    timestamp: 00:05:05:947, filepos: 00001e800 I'll be back tonight.
    timestamp: 00:05:07:957, filepos: 00001f800 Bye! Love you.
    timestamp: 00:05:21:295, filepos: 000020800 Hey, Ace! What's up?
    timestamp: 00:05:23:356, filepos: 000021800 Hey, how's it going?
    timestamp: 00:05:24:640, filepos: 000022800 Remember what today is? The 3rd!
    timestamp: 00:05:27:193, filepos: 000023800 Look over there!
    timestamp: 00:05:28:369, filepos: 000024800 Where are they going?
    timestamp: 00:05:28:361, filepos: 000025000 <???>
    timestamp: 00:05:29:946, filepos: 000025800 Let's go see.
    timestamp: 00:05:31:230, filepos: 000026000 I can't, man. I got Clayton.

Note the two "<???>": they are basically split subtitles (with the
previous one), which the dvdsub decoder is now supposed to reconstruct
with a previous commit. But also note that while the first chunk has
increasing timestamps,

    timestamp: 00:05:02:018, filepos: 00001c800
    timestamp: 00:05:02:035, filepos: 00001d000

...this is not the case for the second one (and this is not an exception
in the original file):

    timestamp: 00:05:28:369, filepos: 000024800
    timestamp: 00:05:28:361, filepos: 000025000

For the dvdsub decoder, they need to be ordered by filepos, but the
FFDemuxSubtitlesQueue is ordered by timestamp. This is the reason for
the introduction of a sub sort method in the context, which allows
giving priority to the position, and then to the timestamps. With that
change, the dvdsub decoder gets fed with ordered packets.
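
A minimal sketch of a "position first, then timestamp" ordering, using
qsort(). The struct sub_entry and both function names are hypothetical
and only illustrate the comparison; they are not the
FFDemuxSubtitlesQueue implementation.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical entry type: one index line (file position and timestamp). */
struct sub_entry {
    int64_t pos; /* filepos from the index */
    int64_t pts; /* timestamp, e.g. in milliseconds */
};

/* Order by file position first; fall back to the timestamp on ties. */
static int cmp_pos_then_pts(const void *a, const void *b)
{
    const struct sub_entry *s1 = a;
    const struct sub_entry *s2 = b;

    if (s1->pos != s2->pos)
        return s1->pos > s2->pos ? 1 : -1;
    if (s1->pts != s2->pts)
        return s1->pts > s2->pts ? 1 : -1;
    return 0;
}

static void sort_by_pos(struct sub_entry *entries, size_t nb_entries)
{
    qsort(entries, nb_entries, sizeof(*entries), cmp_pos_then_pts);
}
```

With the two entries quoted above (filepos 000024800 at 00:05:28:369 and
filepos 000025000 at 00:05:28:361), this ordering keeps 000024800 first
even though its timestamp is the later one.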

The packet size estimation was also broken: the filepos differences
in the vobsub index define the full amount of data read between two
subtitle chunks, and it is necessary to take into account what is read
by the mpegps_read_pes_header() function, since the length returned by
that function doesn't count the size of the data it reads. This is fixed
with the introduction of total_read, and {old,new}_pos. With this
change, we can drop the unreliable len16 heuristic and simplify the
whole loop. Note that mpegps_read_pes_header() often reads more than one
PES packet (typically, one call can read 0x1ba and 0x1be chunks along
with the relevant 0x1bd packet), which triggers the "total_read +
pkt_size > psize" check. This is expected behaviour, which could be
avoided by having a more chunked version of mpegps_read_pes_header().
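
A rough sketch of the accounting described above, assuming old_pos and
new_pos are the stream positions before and after the header read and
psize is the filepos difference between two index entries. The helper
account_packet() is hypothetical and is not the demuxer's actual loop.

```c
#include <stdint.h>

/* Hypothetical helper. Adds the bytes consumed while reading the header
 * (new_pos - old_pos) to the payload length reported by the header
 * reader, then checks the result against the chunk budget psize.
 * Returns 1 and updates *total_read if the packet fits, 0 otherwise. */
static int account_packet(int64_t *total_read, int64_t old_pos,
                          int64_t new_pos, int64_t payload_len,
                          int64_t psize)
{
    int64_t pkt_size = payload_len + (new_pos - old_pos);

    if (*total_read + pkt_size > psize)
        return 0; /* would read past the current subtitle chunk */
    *total_read += pkt_size;
    return 1;
}
```

For the first two index entries quoted above, psize would be
0x1b800 - 0x1b000 = 0x800 bytes.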

The last change is the extraction of each stream into its own
subtitles queue. If we don't do this, the maximum size for a subtitle
chunk is broken, and the previous changes cannot work. Having each
stream in a different queue requires some small adjustments in the
seek code of the demuxer.

This commit is only meaningful as a whole change and cannot be easily
split. The FATE test changes because it uses the vobsub demuxer.
2013-10-04 07:59:49 +02:00
Alexander Strasser
069010ffae lavf/subtitles: Make comment less arrogant
Signed-off-by: Alexander Strasser <eclipse7@gmx.net>
2013-09-15 22:37:13 +02:00
Clément Bœsch
378a830e7b avformat/subtitles: support standalone CR (MacOS).
Recent .srt files with CR only were found in the wild.
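
A minimal sketch of recognizing the three line endings (LF, CRLF and
standalone CR). The function name eol_length() is hypothetical and is
not the helper used in lavf/subtitles.

```c
#include <stddef.h>

/* Hypothetical helper: length of the line ending at p, or 0 if p does
 * not point at one. Handles "\n", "\r\n" and a standalone "\r".
 * p must be NUL-terminated; p[1] is only read when p[0] is '\r'. */
static size_t eol_length(const char *p)
{
    if (p[0] == '\r')
        return p[1] == '\n' ? 2 : 1;
    return p[0] == '\n' ? 1 : 0;
}
```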
2013-09-08 18:48:35 +02:00
Clément Bœsch
90fc00a623 avformat/subtitles: add a next line jumper and use it.
This fixes several possible overreads in avformat with the idiom p +=
strcspn(p, "\n") + 1 (strcspn() can stop at the trailing '\0' if no
'\n' is found, so the +1 leads to an overread).
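
A minimal sketch of a safe variant of that idiom. The name
next_line_offset() is hypothetical and is not necessarily how
ff_subtitles_next_line() is written; it only shows that the increment
must be conditional on a '\n' actually being found.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of bytes to skip to reach the start of
 * the next line. If no '\n' is found, the result points at the
 * terminating '\0' instead of one byte past it. */
static size_t next_line_offset(const char *p)
{
    size_t n = strcspn(p, "\n");

    if (p[n] == '\n')
        n++; /* step over the newline only when there is one */
    return n;
}
```

A caller would then write p += next_line_offset(p) and stop when *p is
'\0'.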

Note on lavf/matroskaenc: no extra subtitles.o Makefile dependency is
added because only the header is required for ff_subtitles_next_line().

Note on lavf/mpsubdec: the code gets slightly more complex to avoid an
infinite loop in the probing, since there is no longer a forced
increment.
2013-09-08 18:48:09 +02:00
Clément Bœsch
949506191a lavf/subtitles: fix CLRF/CRLF typo. 2012-12-30 23:14:34 +01:00
Clément Bœsch
d9ac8d2967 lavf: move srtdec:read_chunk() to subtitles utils.
This function can be useful for various other subtitle formats.
2012-12-30 22:58:58 +01:00
Clément Bœsch
ff3624b1ad lavf/subtitles: add ff_subtitles_queue_seek().
This function is almost identical to lavf/assdec:read_seek2(). It
performs a generic seek for text subtitle demuxers for the new seeking
API.

The only difference with assdec:read_seek2 is the ts_diff being
unsigned to avoid overflows.
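
A minimal sketch of why an unsigned difference helps: the distance
between two int64_t timestamps can exceed INT64_MAX, which a signed
subtraction cannot represent. The function name ts_distance() is
hypothetical and is not the code of ff_subtitles_queue_seek().

```c
#include <stdint.h>

/* Hypothetical helper: absolute distance between two timestamps.
 * The subtraction is done on unsigned values, where wraparound is
 * well defined, so the result is exact for every pair of inputs. */
static uint64_t ts_distance(int64_t a, int64_t b)
{
    return a > b ? (uint64_t)a - (uint64_t)b
                 : (uint64_t)b - (uint64_t)a;
}
```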

The seek callback in the ASS demuxer will be removed when it is
redesigned to use FFDemuxSubtitlesQueue.
2012-12-02 00:06:03 +01:00
Clément Bœsch
d948893dbd lavf/subtitles: add some SMIL helpers.
This is needed for SAMI and RealText demuxers.
2012-06-29 20:20:02 +02:00
Clément Bœsch
7c9f9685ae lavf: add internal demuxer helpers for subtitles. 2012-06-29 19:13:24 +02:00