generic-library/vpx

Author	SHA1	Message	Date
Gabriel Marin	0549f5aae9	Simplify address arithmetic in vp9_optimize_b Simplify address arithmetic on token_costs to reduce the number of generated instructions that are used for address arithmetic inside routine vp9_optimize_b. It also helps improve instruction scheduling depending on compiler and optimization level. Measured a 9.3% reduction in retired instructions and 5.3% reduction in execution time for this routine with GCC v4.8.4 and optimization flags -O3, and a reduction of up to 11.6% in execution time with other compilers. No change in behavior. TEST=Verified that encoded files match bit for bit, with and without this change. BUG=b/33678225 Change-Id: I6098650fb5cd2aa04e014fe6e68ca20761f3a21f	2016-12-19 13:10:04 -08:00
clang-format	5f6d143b41	apply clang-format Change-Id: I501597b7c1e0f0c7ae2aea3ee8073f0a641b3487	2016-09-15 15:07:53 -07:00
clang-format	e0cc52db3f	vp9/encoder: apply clang-format Change-Id: I45d9fb4013f50766b24363a86365e8063e8954c2	2016-08-02 16:47:11 -07:00
Alex Converse	e446ffda45	Cache optimizations in optimize_b(). Move best index into the token state. Shrink it down to one byte. This is more cache friendly (accesses are grouped together) and uses less total memory. Results in 4% fewer cycles in optimize_b(). Change-Id: I75db484fb3dc82f59928d54b659d79c80ee40452	2016-07-29 12:06:49 -07:00
hui su	248f6ad771	Revert "Eliminate isolated and small tail coefficients:" This reverts commit `ff19cdafdb`. Change-Id: I81f68870ca27a1ff683ee22090530b6997815fb2	2016-07-13 11:14:44 -07:00
Jingning Han	2f28f9072e	Enable coeff optimization for intra modes This further improves the coding performance by lowres 0.3% midres 0.5% hdres 0.6% Change-Id: I6a03b6da210b9cbc261474bad4a103e0ba021c68	2016-07-07 12:25:41 -07:00
Jingning Han	44354ee7bf	Use precise context to estimate coeff rate cost Use the precise context to estimate the zero token cost in trellis optimization process. This improves the speed 0 coding performance by 0.15% for lowres and 0.1% for midres. It improves the speed 1 coding performance by 0.2% for midres and hdres. Change-Id: I59c7c08702fc79dc4f8534b64ca594da909e2c91	2016-07-07 12:25:33 -07:00
Jingning Han	62aa642d71	Enable uniform quantization with trellis optimization in speed 0 This commit allows the inter prediction residual to use uniform quantization followed by trellis coefficient optimization in speed 0. It improves the coding performance by lowres 0.79% midres 1.07% hdres 1.44% Change-Id: I46ef8cfe042a4ccc7a0055515012cd6cbf5c9619	2016-07-07 12:25:33 -07:00
Min Ye	ff19cdafdb	Eliminate isolated and small tail coefficients: Improve hdres PSNR by 0.696% Improve midres PSNR by 0.313% Improve lowres PSNR by 0.142% Change-Id: Icabde78aa9689f539f6a03ec09f712c20758796c	2016-07-06 11:08:23 -07:00
Jingning Han	14011f037d	Remove txfrm_block_to_raster_xy() from vp9 encoder The transform block row and column positions are always available outside the callees. There is no need to re-compute these values again. This approach has been used by the decoder. This commit removes txfrm_block_to_raster_xy() function. Change-Id: I5b90f91a0d8b7c35cfa7d171da9edf8202630108	2016-07-04 18:41:47 -07:00
Alex Converse	50d3629c61	Repack vp9_token_state. Reduces size from 32 bytes to 24 bytes on x86_64. Change-Id: I8a22552343a1fc916117f35267fe6a295250f742	2016-06-20 12:56:32 -07:00
Jingning Han	9e185ed177	Refactor optimize_b for speed performance This commit refactors the trellis coefficient optimization process. It saves multiplications used to generate the final dequantized coefficients. It removes two memset operations on quantized and dequantized coefficient sets. This improves the unit speed by 10%. Change-Id: I23f47c6e14582520a7f952f03ce8f72183e7f0e6	2016-06-17 17:41:09 -07:00
Jingning Han	dba1d1a63d	Port optimize_b speed-up from vp10 This commit back ports the speed-up from vp10. It improves the unit speed by 15%. Change-Id: Ibe8c0e0974b03266d6abd16a41e89c3b91d8db2a	2016-06-17 17:41:05 -07:00
Jingning Han	f99f78c7af	Use 64-bit integer to store distortion in optimize_b This fixes the overflow issue. Bug=webm:1241 Change-Id: Ia168b7fae1ad214a6837aaa785a08bf8506987dd	2016-06-17 15:07:00 -07:00
hui su	a554bd8dac	Avoid a potential assertion fail in optimize_b() The eob of a block is not properly set when skip_recode is true, thus triggering assert(eob <= default_eob) to fail. Change-Id: Ifecbe33dce2dc4903e0a80bd384dc09bf0dd8a44	2016-06-07 15:45:04 -07:00
Yaowu Xu	81eb71f00c	Change to use proper type in vp{9,10}_token_state "qc" in vp{9,10}_token_state is used to save quantized coefficients, this commit changes the type from short to tran_low_t to properly reflect the value range for highbitdepth build. This fixes an out-of-range bug when optimize_b is used in highbitdepth build. Change-Id: Ibf330879e6ac6ae8f099e085caa9d3d9a889fde8	2016-05-04 12:14:11 -07:00
hui su	c3a9247e09	VP9: adjust trellis quant optimization RD parameters Coding gain: lowres 0.64% midres 0.38% hdres 0.58% Change-Id: I233fa2a4b24bd1e15091a5f5ef6aff661f3f50ec	2016-04-26 10:17:38 -07:00
hui su	c8f56d2303	VP9: enable trellis quantization optimization for intra blocks Coding gain: lowres 0.18% midres 0.23% hdres 0.36% Change-Id: I044c8afbc481fc55b23d440352941071355b0afb	2016-04-26 10:17:29 -07:00
Jim Bankoski	1de659af06	vp9_encodemb.c: TODO clean up huisu did in nextgen branch -> please try in vp9 Change-Id: I0ff35db07ac38464e0e2858e303be686c03a5d0e	2016-04-21 20:35:54 +00:00
Alex Converse	d13385cee7	Switch to 9-bit rate cost constants built on a 256 probability denominator. -.220 BDRATE derf: https://x20web.corp.google.com/~aconverse/results/cost256_derf.html -.675 BDRATE hevcmr: https://x20web.corp.google.com/~aconverse/results/cost256_hevcmr.html Change-Id: Ifb1646d8ce65ffe0eff9953a911b1b88735b335f	2016-01-27 19:34:30 +00:00
Alex Converse	4326cffa65	Merge "Tie the bit cost scale to a define."	2016-01-21 19:17:56 +00:00
Scott LaVarnway	5232326716	VP9: Eliminate MB_MODE_INFO Change-Id: Ifa607dd2bb366ce09fa16dfcad3cc45a2440c185	2016-01-19 16:40:20 -08:00
Alex Converse	269428e35c	Tie the bit cost scale to a define. This is a pure-refactor in preparation to potentially raise the bit-cost resolution. Verified at good speed 0 and rt speed -6. Change-Id: I5347e6e8c28a9ad9dd0aae1d76a3d0f3c2335bb9	2016-01-15 15:59:31 -08:00
Scott LaVarnway	2f8625d824	VP9: remove plane_type from macroblockd_plane Change-Id: Ia5072a3a92212d8565f33359f6c146469bdfbbec	2015-09-30 15:15:11 -07:00
Alex Converse	a8a08ce57e	Move vp9_systemdependent.h to vpx_ports bitops.h and system_state.h Use system_state.h in vpx_dsp and remove unneeded includes of vp9_systemdependent.h. Change-Id: I92557ec6dd5aa790160b4f31fe7967db0d7ec3c4	2015-08-10 15:37:14 -07:00
Jingning Han	b4f2c567c8	Cosmetic - align format in vp9 Change-Id: I83ed3422f1f4009675ad2f5c4b7236bc7b83b30e	2015-08-06 15:56:11 -07:00
Jingning Han	d621de7e8d	Change vp9_quantize to vpx_quantize This commit clears all the vp9_ prefix use case in vpx_dsp. It gets the vp9 folder ready to branch out vp10. Change-Id: I2906eec179ee792b4af8c9b4161313653050e931	2015-08-04 15:31:49 -07:00
Alex Converse	4ac5058afc	Give skip_txfm constants names. This is using a define instead of an enum to keep byte packing. Change-Id: I3abb07c8bfe377e19be4531b624af7b7b4207792	2015-07-31 10:08:08 -07:00
Jingning Han	4b5109cd73	Replace vp9_ prefix in 2D-DCT functions with vpx_ Clean up the forward 2D-DCT function names in vpx_dsp. Change-Id: I3117978596d198b690036e7eb05fe429caf3bc25	2015-07-28 16:06:44 -07:00
Hui Su	a15edeb76d	Merge "Code cleanup in vp9_encode_block_intra"	2015-07-24 17:40:37 +00:00
hui su	e298d650cb	Code cleanup in vp9_encode_block_intra Change-Id: Ie4d958b26e586db218f8ee95d5df4bf11f2345a1	2015-07-22 10:53:12 -07:00
Jingning Han	389ed6da10	Refactor highbd forward transform use case Separate the hybrid transform case from 2D-DCT case. This will allow us to clear up cross dependency between c and SIMD implementations later. Change-Id: Iaa499e8b096850a1c5a0c50a3b6e63e15d0184bf	2015-07-20 10:31:17 -07:00
Yunqing Wang	38f1fbbb75	Migrate quantization functions from vp9/ to vpx_dsp/ The following quantization functions were moved: vp9_quantize_b vp9_quantize_b_32x32 vp9_highbd_quantize_b vp9_highbd_quantize_b_32x32 vp9_quantize_dc vp9_quantize_dc_32x32 vp9_highbd_quantize_dc vp9_highbd_quantize_dc_32x32 The purpose of doing that was to allow these functions to be shared by multiple codecs. Change-Id: Id8ab939f283353cdd07bd930d47db3d932a5d87f	2015-07-17 16:38:14 -07:00
Jingning Han	81452cf0b7	Refactor intra block prediction function This commit simplifies the intra block boundary condition logic. It removes the block index from the argument set. Change-Id: If00142512eb88992613d6609356dfd73ba390138	2015-07-13 15:20:47 -07:00
Jingning Han	535cc6d87f	Format fixes in vp9_encodeframe.c and vp9_encodemb.c Change-Id: Ib1303dac9043ab1b1f8fce54611cf4ea8a208038	2015-07-09 00:04:28 +00:00
Jingning Han	432cd4bfb7	Move subtract functions from vp9 to vpx_dsp Factor out the subtraction operator as common function. Change-Id: I526e703477c6a290e0e3e3c8898f8bb1ca82779b	2015-07-06 12:22:47 -07:00
Scott LaVarnway	b962646fc5	Re-worked header files Various header/test files had to be re-worked in order to build "Remove cm parameter from vp9_decode_block_tokens()". This patch reverts the "Remove cm" part and only contains the re-worked header files. Change-Id: I520958a88d1991fee988a3c784d0eac40e117a32	2015-05-22 11:19:51 -07:00
Johann	1d7ccd5325	Relocate memory operations for common code With the sad functions, and hopefully the variance functions soon, moving to the vpx_dsp location, place the defines used in the reference C code in a common location. Change-Id: I4c8ce7778eb38a0a3ee674d2f1c488eda01cfeca	2015-05-13 11:41:15 -07:00
James Zern	f58011ada5	vpx_mem: remove vpx_memset vestigial. replace instances with memset() which they already were being defined to. Change-Id: Ie030cfaaa3e890dd92cf1a995fcb1927ba175201	2015-04-28 20:00:59 -07:00
Scott LaVarnway	8b17f7f4eb	Revert "Remove mi_grid_* structures." (see I3a05cf1610679fed26e0b2eadd315a9ae91afdd6) For the test clip used, the decoder performance improved by ~2%. This is also an intermediate step towards adding back the mode_info streams. Change-Id: Idddc4a3f46e4180fbebddc156c4bbf177d5c2e0d	2015-04-21 11:16:45 -07:00
Deb Mukherjee	6910e92d04	dc quantizer fix for 32x32 transforms The rounding factor needs to be scaled down by a factor of 2. Also, the quantized and dequantized coefficients are memset to 0 when dc quantizer is used. Change-Id: Ifa68bab02addbf1b83d249c5b4cbd5cda796b1cf	2015-03-03 15:58:27 -08:00
Yaowu Xu	364b92dc88	Fix compiler warnings for msvc2013 Change-Id: I1e32bf8f6872a6fb7e9cabe86483e94805e2f790	2015-01-05 17:31:19 -08:00
Jim Bankoski	b3c66f8a2f	WIP: Remove giant value cost table Change-Id: Iabe8a8868a747626c24bb13f1796f4c7827af367	2014-12-23 15:06:17 -08:00
Jim Bankoski	d6d431c476	Merge "Revert "Revert "Removal of legacy zbin_extra / zbin_oq_value."""	2014-12-22 13:43:56 -08:00
Jingning Han	d0f2377027	Revert "Revert "Removal of legacy zbin_extra / zbin_oq_value."" This reverts commit `9946ee23e0`. Fix the ssse3 asm function. Change-Id: I07f77a63aa98087626e45c4e87aa5dcafc0b0b07	2014-12-22 10:09:25 -08:00
Jim Bankoski	4b8c6d96ec	Tokenization without huge tables. Change-Id: Iff528c4b7528cc70320343b3a7ce07a92b024dfd	2014-12-22 08:42:52 -08:00
Paul Wilkins	9946ee23e0	Revert "Removal of legacy zbin_extra / zbin_oq_value." This reverts commit `e9b586e21b`. Change-Id: I5b36e6727da6c05278d97e2c37b80c109f79bed4	2014-12-19 15:02:58 +00:00
Paul Wilkins	e9b586e21b	Removal of legacy zbin_extra / zbin_oq_value. zbin extra / zbin_oq_value was widely passed around, hence removal touches a lot of code. Change-Id: Idc94359735b60c38a160e4385ae09d5ca8b6b8e5	2014-12-18 16:49:11 +00:00
Peter de Rivaz	a306bd8274	Use the RTC optimizations when in high bitdepth mode. Change 72193 made the encoder behave differently when configured with and without high bitdepth. This change means the same algorithm is used for both. Change-Id: I707a44a94afca773a9e0c2f7ebeeea83030257c5	2014-12-04 15:48:42 -08:00
Jingning Han	7428cebe4f	Rework forward txfm/quantization skip system in RTC coding mode This commit allows more aggressive decision to skip forward transform and quantization for luma component in RTC coding mode. The chroma components still go through the normal coding routine, since they are not included in the non-RD mode search process. It reduces the runtime cost by 2% - 10%. In speed -6, vidyo1 1000 kbps 16576 b/f, 40.281 dB, 8402 ms -> 16576 b/f, 40.323 dB, 7764 ms nik720p 1000 kbps 33337 b/f, 38.622 dB, 7473 ms -> 33299 b/f, 38.660 dB, 7314 ms dark720p 1000 kbps 33330 b/f, 39.785 dB, 13505 ms -> 33325 b/f, 39.714 dB, 13105 ms The compression performance of speed -6 is improved by 0.44% in PSNR and 1.31% in SSIM. Change-Id: Iae9e3738de6255babea734e5897f29118bebc6d7	2014-11-21 12:46:40 -08:00

341 Commits