ffmpeg

Author	SHA1	Message	Date
Christophe Gisquet	96b165fae2	dnxhd: interleave AC levels and flags This allows more efficient access to the array as the level and flags are contiguous. Around 4% faster coefficient decoding. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-15 02:46:29 +02:00
Hendrik Leppkes	15db457ea8	Merge commit 'd15368ee3926152a3a301c13cc638fbf7a062ddf' * commit 'd15368ee3926152a3a301c13cc638fbf7a062ddf': h264: Run VLC init under pthread_once Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2015-10-14 23:06:06 +02:00
Hendrik Leppkes	b66a94ab53	Merge commit '08377f9c3bf6dbe216512a2e05c9fac837b13fc0' * commit '08377f9c3bf6dbe216512a2e05c9fac837b13fc0': dxva: Include last the internal header Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2015-10-14 23:02:00 +02:00
Hendrik Leppkes	8ededd5836	Merge commit '6a23a34274b747280c1e4a00ad22f97f99bbb48a' * commit '6a23a34274b747280c1e4a00ad22f97f99bbb48a': mimic: drop AVPicture usage Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2015-10-14 15:01:54 +02:00
Hendrik Leppkes	3d93ff289e	Merge commit '6fdd4c678ac1ce0776f9645cd534209e5f1ae1e3' * commit '6fdd4c678ac1ce0776f9645cd534209e5f1ae1e3': libschroedinger: Properly use AVFrame API Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2015-10-14 15:00:53 +02:00
Hendrik Leppkes	9c3f75c29d	Merge commit '901f9c0a32985f48672fd68594111dc55d88a57a' * commit '901f9c0a32985f48672fd68594111dc55d88a57a': qtrle: Properly use AVFrame API Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2015-10-14 14:56:16 +02:00
Derek Buitenhuis	d15368ee39	h264: Run VLC init under pthread_once This makes the h.264 decoder threadsafe to initialize. Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com> Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2015-10-14 14:35:34 +02:00
Luca Barbato	08377f9c3b	dxva: Include last the internal header It redefines _WIN32_WINNT, possibly causing problems with the w32pthreads.h header.	2015-10-14 14:35:34 +02:00
Hendrik Leppkes	037b44a3b4	Merge commit '00332e0a064dad866812de9162b009cbaba6f5df' * commit '00332e0a064dad866812de9162b009cbaba6f5df': wrapped_avframe: Initial implementation Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>	2015-10-14 13:26:17 +02:00
wm4	6a23a34274	mimic: drop AVPicture usage Work on the AVFrame references directly. Instead of setting up a flipped/swapped "view" on the pictures, flip/swap them when returning decoded frames to the API user.	2015-10-14 11:25:53 +02:00
Vittorio Giovara	6fdd4c678a	libschroedinger: Properly use AVFrame API Rather than copying data buffers around, allocate a proper frame, and use the standard AVFrame functions. This effectively makes the decoder capable of direct rendering. Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	2015-10-14 11:24:55 +02:00
Vittorio Giovara	901f9c0a32	qtrle: Properly use AVFrame API Rather than copying data buffers around, just add a reference to the current frame. Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>	2015-10-14 11:24:24 +02:00
James Almer	74a87ae210	x86/vp9itxfm: fix register clobbering in ff_vp9_idct_idct_4x4_add_12_sse2 Reviewed-by: Henrik Gramner <henrik@gramner.com> Signed-off-by: James Almer <jamrial@gmail.com>	2015-10-13 20:21:33 -03:00
Christophe Gisquet	234369d0fd	dnxhdenc: fix access outside of image This is the same test as for the 8bit case.	2015-10-13 18:53:10 -03:00
Christophe Gisquet	74c414202f	x86: simple_idct10_template: use const This avoids going through constants.c while still sharing them with proresdsp.asm Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-13 22:52:33 +02:00
Nedeljko Babic	de262d018d	avcodec/mips/aaccoder_mips: Sync with the generic code This patch fixes build of AAC encoder optimized for mips that was broken due to some changes in generic code that were not propagated to the optimized code. Also, some functions in the optimized code are basically duplicates of functions from generic code. Since they do not bring enough improvement to the optimized code to justify their existence, they are removed (which improves maintainability of the optimized code). Optimizations disabled in `97437bd` are enabled again. Signed-off-by: Nedeljko Babic <nedeljko.babic@imgtec.com> Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-13 17:22:56 +02:00
Ronald S. Bultje	e578638382	vp9: use registers for constant loading where possible.	2015-10-13 11:06:01 -04:00
Ronald S. Bultje	408bb8556f	vp9: refactor itx coefficients and share between 8 and 10/12bpp.	2015-10-13 11:06:01 -04:00
Ronald S. Bultje	eb4b5ff738	vp9: add itxfm_add eob shortcuts to 10/12bpp functions. These aren't quite as helpful as the ones in 8bpp, since over there, we can use pmulhrsw, but here the coefficients have too many bits to be able to take advantage of pmulhrsw. However, we can still skip cols for which all coefs are 0, and instead just zero the input data for the row itx. This helps a few % on overall decoding speed.	2015-10-13 11:06:01 -04:00
Ronald S. Bultje	488fadebbc	vp9: add 10/12bpp idct_idct_32x32 sse2 SIMD version.	2015-10-13 11:06:00 -04:00
Ronald S. Bultje	3d0ca2fe89	vp9: 10/12bpp sse2 SIMD for iadst16.	2015-10-13 11:06:00 -04:00
Ronald S. Bultje	0e80265b0a	vp9: refactor 10/12bpp dc-only code in 4x4/8x8 and add to 16x16.	2015-10-13 11:06:00 -04:00
Ronald S. Bultje	1338fb79d4	vp9: add 10/12bpp sse2 SIMD version for idct_idct_16x16.	2015-10-13 11:06:00 -04:00
Ronald S. Bultje	cb054d061a	vp9: add 10/12bpp sse2 SIMD versions of iadst8x8.	2015-10-13 11:05:59 -04:00
Ronald S. Bultje	e0610787b2	vp9: add 10/12bpp sse2 SIMD for idct_idct_8x8.	2015-10-13 11:05:59 -04:00
Ronald S. Bultje	a35f6bdb38	vp9: add 12bpp sse2 versions of iadst4.	2015-10-13 11:05:59 -04:00
Ronald S. Bultje	235e76aeb8	vp9: initial attempt at a idct_idct_4x4 12bpp x86 simd (sse2) impl. The trouble with this function is that intermediates overflow 31+sign bits, so I've added some helpers (that will also be used in 10/12bpp 8x8, 16x16 and 32x32) to make that easier, basically emulating a half-assed pmaddqd using 2xpmaddwd. It's currently sse2-only; if anyone sees potential in adding ssse3, I'd love to hear it.	2015-10-13 11:05:58 -04:00
Ronald S. Bultje	f76423d097	vp9: add x86 simd (sse2/ssse3) for iadst4 10bpp functions.	2015-10-13 11:05:58 -04:00
Ronald S. Bultje	6b579cf547	vp9: add 10bpp simd (mmxext/ssse3) for idct_idct_4x4.	2015-10-13 11:05:58 -04:00
Ronald S. Bultje	1c3be32533	vp9: add 10/12bpp mmxext-optimized iwht_iwht_4x4 function.	2015-10-13 11:05:57 -04:00
Christophe Gisquet	b6594a9605	x86: dct-test: add more idcts In particular for 10 and 12 bits. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-13 16:03:04 +02:00
Michael Niedermayer	a745d1a9e4	avcodec/dct-test: Print failure notice below the failed *dct This makes it easier to see where a failure happens Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-13 16:03:03 +02:00
Christophe Gisquet	7ece8b50b1	x86: simple_idct: 12bits versions On 12 frames of a 444p 12 bits DNxHR sequence, _put function: C: 78902 decicycles in idct, 262071 runs, 73 skips avx: 32478 decicycles in idct, 262045 runs, 99 skips Difference between the 2: stddev: 0.39 PSNR:104.47 MAXDIFF: 2 This is unavoidable and due to the scale factors used in the x86 version, which cannot match the C ones. In addition, the trick of adding an initial bias to the input of a pass can overflow, as the input coefficients are already 15bits, which is the maximum this function can handle. Overall, however, the omse on 12 bits samples goes from 0.16916 to 0.16883. Reducing rowshift by 1 improves to 0.0908, but causes overflows. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-13 15:34:32 +02:00
Derek Buitenhuis	17e41cf361	avcodec: Do not lock during init if there is no init function Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com> Signed-off-by: Luca Barbato <lu_zero@gentoo.org>	2015-10-13 13:43:29 +02:00
Christophe Gisquet	4369b9dc7b	x86: simple_idct(_put): 10bits versions Modeled from the prores version. Clips to [0;1023] and is bitexact. Bitexactness requires adding offsets in different places compared to prores or C, and makes the function approximately 2% slower. For 16 frames of a DNxHD 4:2:2 10bits test sequence: C: 60861 decicycles in idct, 1048205 runs, 371 skips sse2: 27567 decicycles in idct, 1048216 runs, 360 skips avx: 26272 decicycles in idct, 1048171 runs, 405 skips The add version is not implemented, so the corresponding dsp function is set to NULL to make it clear in a code executing it. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-13 13:32:21 +02:00
Christophe Gisquet	e652f69b35	x86: simple_idct10_template: fix overflow in pass When the input of a pass has 15 or 16 bits of precision (in particular the column pass), the addition of a bias to W4 may lead to overflows in the input to pmaddwd. This requires postponing the adding of the bias to after the first butterfly. To do so, the fact that m15, unused although zeroed, is exploited. In case the pass is safe, an address can be directly used, and the number of xmm regs can be decreased. Otherwise, the 32bits bias is loaded into it. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-13 12:51:10 +02:00
Christophe Gisquet	2fd14dd8eb	avcodec/simple_idct10: improve precision omse goes from 0.03060703 (which fails for dct-test) to 0.01663750. This also actually improves the error of decoding the sample generated by fate-vsynth3-dnxhd1080i-10bit using simple_idct10 to FAANI, which goes (when resampled to yuv422p) from: stddev: 0.06 PSNR: 72.28 MAXDIFF: 1 to identical. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-13 02:10:51 +02:00
Christophe Gisquet	e9a68b0316	x86: prores: templatize 10 bits simple_idct This should be reused for a generic simple_idct10 function. Requires a bit of trickery to declare common constants in C. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-13 01:10:34 +02:00
Rostislav Pehlivanov	93e6b23c9f	aacenc: shorten name of ff_aac_adjust_common_prediction To keep it similar to the other functions which are all named *_pred.	2015-10-12 23:33:07 +01:00
Rostislav Pehlivanov	65f5b96dd8	aacenc: increase size of s->planar_samples[] from 6 to 8 Left out of last commit which added support for eight channel audio.	2015-10-12 23:25:45 +01:00
Christophe Gisquet	9f3bfe30dd	mpegvideo: dnxhdenc: permute 10bits content Dequant or encoding were trying to reverse a scan that hadn't been applied... Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-13 00:01:39 +02:00
Michael Niedermayer	97437bd17a	avcodec/mips/aaccoder_mips: Disable ff_aac_coder_init_mips() to prevent build failure Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>	2015-10-13 00:01:39 +02:00
Ricardo Constantino	53886d6955	avcodec/webvttdec: Deal with WebVTT escapes Bare ampersand characters are still accepted, even though out-of-spec. Also fixes adjacent tags not being parsed. Fixes trac #4915 Signed-off-by: Ricardo Constantino <wiiaboo@gmail.com>	2015-10-12 22:04:05 +02:00
Derek Buitenhuis	1156b634c1	avcodec: Don't lock on init for codecs without an init function Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>	2015-10-12 15:25:51 -03:00
Rostislav Pehlivanov	e2749ef60a	aacenc_utils: fit find_form_factor() below 80 chars per line	2015-10-12 17:14:50 +01:00
Rostislav Pehlivanov	0f4334df45	aacenc: add support for changing options based on a profile This commit adds the ability for a profile to set the default options, as well as for the user to override such options by simply stating them in the command line while still keeping the same profile, as long as those options are still permitted by the profile. Example: setting the profile to aac_low (the default) will turn PNS and IS on. They can be disabled by -aac_pns 0 and -aac_is 0, respectively. Turning on -aac_pred 1 will cause the profile to be elevated to aac_main, as long as no options forbidding aac_main have been entered (like AAC-LTP, which will be pushed soon). A useful feature is that by setting the profile to mpeg2_aac_low, all MPEG4 features will be disabled and if the user tries to enable them then the program will exit with an error. This profile is signalled with the same bitstream as aac_low (MPEG4) but some devices and decoders will fail if any MPEG4 features have been enabled.	2015-10-12 16:57:56 +01:00
Rostislav Pehlivanov	b3deaece87	aacenc: add support for encoding 7.1 channel audio This commit implements support for 7.1 channel audio. There's no more predefined bitstream channel mappings so going beyond 8 channels (and 7 channels exactly) will require programmable channel elements, which is already underway.	2015-10-12 15:53:17 +01:00
Rostislav Pehlivanov	e679a1e65f	aacenc_quantization: fix header description Two guesses as to which file was used as boilerplate.	2015-10-12 15:41:50 +01:00
Claudio Freire	b629c67ddf	AAC encoder: memoize quantize_band_cost The bulk of calls to quantize_band_cost are replaced by a call to a version that memoizes, greatly improving performance, since during coefficient search there is a great deal of repeat work. Memoization cannot always be applied, so do this in a different function, and leave the original as-is.	2015-10-12 03:56:22 -03:00
Claudio Freire	07b3b779a9	AAC encoder: fix assertion error re SF differences Intermediate results can indeed violate SF delta. Instead of asserting there, just make the code safe, and assert on the final result. Also re-clamp SFs more often in short windows (which tend to violate the restriction when encoding the switch from one window to the other)	2015-10-11 23:00:46 -03:00
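
Commit b629c67ddf above describes memoizing quantize_band_cost in a separate function while leaving the original as-is. A minimal sketch of that general pattern follows, written in Python for brevity rather than FFmpeg's C. Every name, argument, and the cost formula here are hypothetical illustrations and are not taken from the encoder source.

```python
# Hypothetical stand-in for an expensive cost function. The real
# quantize_band_cost in FFmpeg has a different signature and logic.
stats = {"calls": 0}

def band_cost(band, scale_idx, codebook):
    """Uncached cost; counts how often it actually runs."""
    stats["calls"] += 1
    return sum(abs(c) for c in band) * (scale_idx + 1) + codebook

# Cache keyed on the full argument tuple, so `band` must be hashable
# (a tuple here, not a list).
_cache = {}

def band_cost_cached(band, scale_idx, codebook):
    """Memoizing wrapper; the original function stays untouched."""
    key = (band, scale_idx, codebook)
    if key not in _cache:
        _cache[key] = band_cost(band, scale_idx, codebook)
    return _cache[key]

def reset_cache():
    """Drop cached results, e.g. when the underlying inputs change."""
    _cache.clear()
```

Callers that repeat the same query, as a coefficient search might, would use band_cost_cached, while callers whose inputs are not safe to cache keep calling band_cost directly.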

33907 Commits