Original x86 intrinsics code and initial yasm port by Pierre-Edouard Lepere.
Refactoring and optimizations by James Almer.
Benchmarks of BQTerrace_1920x1080_60_qp22.bin with an Intel Core i5-4200U
Width 32
158583 decicycles in edge, sao_edge_filter_8 runs, 0 skips
5205 decicycles in ff_hevc_sao_edge_filter_32_8_ssse3, 32767 runs, 1 skips
2942 decicycles in ff_hevc_sao_edge_filter_32_8_avx2, 32767 runs, 1 skips
Width 64
705639 decicycles in sao_edge_filter_8, 262144 runs, 0 skips
19224 decicycles in ff_hevc_sao_edge_filter_64_8_ssse3, 262111 runs, 33 skips
10433 decicycles in ff_hevc_sao_edge_filter_64_8_avx2, 262115 runs, 29 skips
Signed-off-by: James Almer <jamrial@gmail.com>
This ensures we do not loose the frame in case or multiple clears
Fixes out of array read
Fixes: asan_heap-oob_2fa47ea_2100_cov_1278768963_ff_add_pixels_clamped_mmx.m2ts
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
ff_avc_write_annexb_extradata() allocates extradata, but don't add
FF_INPUT_BUFFER_PADDING_SIZE value
Signed-off-by: Lukasz Marek <lukasz.m.luki2@gmail.com>
For reasons we are not privy to, nvidia decided that the nvenc encoder
should apply aspect ratio compensation to 'DVD like' content, assuming that
the content is not bt.601 compliant, but needs to be bt.601 compliant. In
this context, that means that they make the following, questionable,
assumptions:
1) If the input dimensions are 720x480 or 720x576, assume the content has
an active area of 704x480 or 704x576.
2) Assume that whatever the input sample aspect ratio is, it does not account
for the difference between 'physical' and 'active' dimensions.
From, these assumptions, they then conclude that they can 'help', by adjusting
the sample aspect ratio by a factor of 45/44. And indeed, if you wanted to
display only the 704 wide active area with the same aspect ratio as the full
720 wide image - this would be the correct adjustment factor, but what if you
don't? And more importantly, what if you're used to ffmpeg not making this kind
of adjustment at encode time - because none of the other encoders do this!
And, what if you had already accounted for bt.601 and your input had the
correct attributes? Well, it's going to apply the compensation anyway!
So, if you take some content, and feed it through nvenc repeatedly, it
will keep scaling the aspect ratio every time, stretching your video out
more and more and more.
So, clearly, regardless of whether you want to apply bt.601 aspect ratio
adjustments or not, this is not the way to do it. With any other ffmpeg
encoder, you would do it as part of defining your input paramters or
do the adjustment at playback time, and there's no reason by nvenc
should be any different.
This change adds some logic to undo the compensation that nvenc would
otherwise do.
nvidia engineers have told us that they will work to make this
compensation mechanism optional in a future release of the nvenc
SDK. At that point, we can adapt accordingly.
Signed-off-by: Philip Langdale <philipl@overt.org>
Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes out of array read
Fixes: asan_heap-oob_1fb2f9b_3780_cov_3984375136_usf.mkv
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes integer overflow and out of array read
Fixes: asan_heap-oob_1fb2f9b_3780_cov_3984375136_usf.mkv
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
As with sao_band_filter, pass instead the two variables from the struct needed in the function.
This simplifies writing asm optimized versions.
Reviewed-by: Mickaël Raulet <mraulet@insa-rennes.fr>
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes out of array accesses
Fixes: asan_heap-oob_1c1a4ea_1242_cov_2274415971_TESTcmyk.jpg
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The scaling list can be specified in either the SPS or PPS.
Additionally, compensate for the diagonal scan permutation applied in the decoder.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'd615187f74ddf3413778a8b5b7ae17255b0df88e':
aacdec: Support for ER AAC ELD 480.
Conflicts:
libavcodec/aacdec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '0ee2573347ecdb9cb5656001f7201d819eec16d8':
aacdec: Support for ER AAC in LATM
Conflicts:
libavcodec/aacdec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Some files contain a few additional, all-0 bits.
Check for that case and don't print incorrect "not supported"
message.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Signed-off-by: Alex Converse <alex.converse@gmail.com>
Use edge emu buffers
And enable the code unconditionally
Speed difference without USE_SAO_SMALL_BUFFER and with the new code:
Decicycles: 26772->26220 (BO32), 83803->80942 (BO64)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
cherry picked from commit 5d9f79edef2c11b915bdac3a025b59a32082f409
SAO edge filter uses pre-SAO pixel data on the left and top of the ctb, so
this data must be kept available. This was done previously by having 2
copies of the frame, one before and one after SAO.
This commit reduces the storage to just that, instead of the previous whole
frame.
Commit message taken from patch by Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>