generic-library/vpx

Author	SHA1	Message	Date
Yunqing Wang	0d43bd77e5	Bug fix in ssse3 quantize function A bug was reported in Issue 702: "SIGILL (Illegal instruction) when transcoding with vp9 - using FFmpeg". It was reproduced and fixed. Change-Id: Ie32c149a89af02856084aeaf289e848a905c7700	2014-02-07 14:32:30 -08:00
Jingning Han	09bc942b47	Fix overflow issue in 16x16 quantization SSSE3 The 16x16 transform unit test suggested that the peak coefficient value can reach 32639. This could cause potential overflow issue in the SSSE3 implmentation of 16x16 block quantization. This commit fixes this issue by replacing addition with saturated addition. Change-Id: I6d5bb7c5faad4a927be53292324bd2728690717e	2013-09-06 21:06:10 -07:00
Jingning Han	458c2833c0	Use saturated addition in SSSE3 of 32x32 quant The 32x32 forward transform can potentially reach peak coefficient value close to 32700, while the rounding factor can go upto 610. This could cause overflow issue in the SSSE3 implementation of 32x32 quantization process. This commit resolves this issue by replacing the addition operations with saturated addition operations in 32x32 block quantization. Change-Id: Id6b98996458e16c5b6241338ca113c332bef6e70	2013-09-05 12:49:12 -07:00
Jingning Han	abff678866	Fix overflow issue in SSSE3 32x32 quantization The 32x32 quantization process can potentially have the intermediate stacks over 16-bit range, thereby causing enc/dec mismatch. This commit fixes this overflow issue in the SSSE3 implementation, as well as the prototype, of 32x32 quantization. This fixes issue 607 from webm@googlecode. Change-Id: I85635e6ca236b90c3dcfc40d449215c7b9caa806	2013-08-29 11:00:54 -07:00
Ronald S. Bultje	e5fb4b61b6	Use pmovmskb to skip quantize loops over empty coefficients. If none of the 16 coefficients that we quantize per loop iteration are larger than the zbin, directly skip to the next round of coeffs, rather than doing a full quantize loop that will eventually result in 16 zeroes. This incurs a jump cost, but saves a lot of other work. 32x32 quant goes from 1349 -> 1184 cycles. The same approach yielded no significantly positive results for smaller transforms, so is not used there (8x8: 103 -> 101 cycles; 16x16: 302 -> 306 cycles). Change-Id: I8fca17dc2543fc8eed1dbcd5100145e3c3a9b647	2013-07-02 16:34:24 -07:00
Ronald S. Bultje	c8defcfdee	Update quantize SSSE3 SIMD to cover 32x32 transform case also. Encode time of bus (speed 0) 50 frames @ 1500kbps goes from 2min14.4 to 2min10.1, i.e. a 2.3% overall speed increase. Change-Id: I3699580e74ec26c7d24e03681bc47ba25ee1ee87	2013-07-01 11:36:33 -07:00
Ronald S. Bultje	7353ceab9d	Quantize (64-bit only, for now) SSSE3 SIMD. Total encoding time for first 50 frames of bus (speed 0) @ 1500kbps goes 2min34.8 to 2min14.4, i.e. a 10.4% overall speedup. The code is x86-64 only, it needs some minor modifications to be 32bit compatible, because it uses 15 xmm registers, whereas 32bit only has 8. Change-Id: I2df53770c2e850813ffa713e1a91b45b0082b904	2013-07-01 11:36:07 -07:00
Johann	e43662e8e6	Remove unused quantize optimizations. Files were copied from vp8 and never maintained. Change-Id: I9659a8755985da73e8c19c3c984423b6666d8871	2013-04-30 18:42:05 -07:00
John Koleszar	15255eef82	Move dequant from BLOCKD to per-plane MACROBLOCKD This data can vary per-plane, but not per-block. Change-Id: I1971b0b2c2e697d2118e38b54ef446e52f63c65a	2013-04-25 11:57:20 -07:00
Frank Galligan	f67d740b34	Add support for x64 and win64 yasm flags. Some projects must define only win64 for Windows 64bit builds using yasm. Change-Id: I1d09590d66a7bfc8b4412e1cc8685978ac60b748	2013-01-31 16:25:37 -08:00
Jim Bankoski	1dffce7f96	add private to assembly files to insure proper chromebuild Change-Id: I6e43ca73f35401a974ed8ee27738d4318f09fd37	2012-12-20 09:40:18 -08:00
John Koleszar	fcccbcbb39	Add vp9_ prefix to all vp9 files Support for gyp which doesn't support multiple objects in the same static library having the same basename. Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc	2012-11-27 14:12:30 -08:00

12 Commits