generic-library/vpx

Author	SHA1	Message	Date
Yaowu Xu	7e89c102c4	vp9-highbitdepth -> vpx-highbitdepth Change-Id: I1e90cf7ab4bb02c0ef119b0bd1596771edefedff	2016-08-05 15:41:33 -07:00
Yaowu Xu	3826383ca1	Fix compiling issues Change-Id: I530348b12a1c039842ce4e33d21046fe63878f19	2016-07-22 09:43:22 -07:00
Debargha Mukherjee	e5848dea5a	Rectangular transforms 4x8 & 8x4 Added a new experiment rect-tx to be used in conjunction with ext-tx. [rect-tx is a temporary config flag and will eventually be merged into ext-tx once it works correctly with all other experiments]. Added 4x8 and 8x4 transforms for use initially with rectangular sub8x8 y blocks as part of this experiment. There is about a -0.2% BDRATE improvement on lowres, others pending. When var-tx is on rectangular transforms are currently not used. That will be enabled in a subsequent patch. Change-Id: Iaf3f88ede2740ffe6a0ffb1ef5fc01a16cd0283a	2016-07-21 10:46:41 -07:00
Johann	2967bf355e	Merge changes from libvpx/master by cherry-pick This commit brings all up-to-date changes from master that are applicable to nextgenv2. Due to the removal of VP10 code in master, we had to cherry pick the following commits to get those changes: Add default flags for arm64/armv8 builds Allows building simple targets with sane default flags. For example, using the Android arm64 toolchain from the NDK: https://developer.android.com/ndk/guides/standalone_toolchain.html ./build/tools/make-standalone-toolchain.sh --arch=arm64 \ --platform=android-24 --install-dir=/tmp/arm64 CROSS=/tmp/arm64/bin/aarch64-linux-android- \ ~/libvpx/configure --target=arm64-linux-gcc --disable-multithread BUG=webm:1143 vpx_lpf_horizontal_4_sse2: Remove dead load. Change-Id: I51026c52baa1f0881fcd5b68e1fdf08a2dc0916e Fail early when android target does not include --sdk-path Change-Id: I07e7e63476a2e32e3aae123abdee8b7bbbdc6a8c configure: clean up var style and set_all usage Use quotes whenever possible and {} always for variables. Replace multiple set_all calls with able_feature(). Conflicts: build/make/configure.sh vp9-svc: Remove some unneeded code/comment. datarate_test,DatarateTestLarge: normalize bits type quiets a msvc warning: conversion from 'const int64_t' to 'size_t', possible loss of data mips added p6600 cpu support Removed -funroll-loops psnr.c: use int64_t for sum of differences Since the values can be negative. .asm: normalize label format add a trailing ':', though it's optional with the tools we support, it's more common to use it to mark a label. this also quiets the orphan-labels warning with nasm/yasm. BUG=b/29583530 Prevent negative variance Due to rounding, hbd variance may become negative. This commit puts in a check and clamps negative values to 0. 
configure: remove old visual studio support (<2010) BUG=b/29583530 Conflicts: configure configure: restore vs_version variable inadvertently lost in the final patchset of: 078dff7 configure: remove old visual studio support (<2010) this prevents an empty CONFIG_VS_VERSION and avoids make failure Require x86inc.asm Force enable x86inc.asm when building for x86. Previously there were compatibility issues so a flag was added to simplify disabling this code. The known issues have been resolved and x86inc.asm is the preferred abstraction layer (over x86_abi_support.asm). BUG=b:29583530 convolve_test: fix byte offsets in hbd build CONVERT_TO_BYTEPTR(x) was corrected in: 003a9d2 Port metric computation changes from nextgenv2 to use the more common (x) within the expansion. offsets should occur after converting the pointer to the desired type. + factorized some common expressions Conflicts: test/convolve_test.cc vpx_dsp: remove x86inc.asm distinction BUG=b:29583530 Conflicts: vpx_dsp/vpx_dsp.mk vpx_dsp/vpx_dsp_rtcd_defs.pl vpx_dsp/x86/highbd_variance_sse2.c vpx_dsp/x86/variance_sse2.c test: remove x86inc.asm distinction BUG=b:29583530 Conflicts: test/vp9_subtract_test.cc configure: remove x86inc.asm distinction BUG=b:29583530 Change-Id: I59a1192142e89a6a36b906f65a491a734e603617 Update vpx subpixel 1d filter ssse3 asm Speed test shows the new vertical filters have degradation on Celeron Chromebook. Added "X86_SUBPIX_VFILTER_PREFER_SLOW_CELERON" to control the vertical filters activated code. Now just simply activate the code without degradation on Celeron. Later there should be 2 sets of vertical filters ssse3 functions, and let a jump table choose based on CPU type. improve vpx_filter_block1d* by replacing paddsw+psrlw with pmulhrsw Make set_reference control API work in VP9 Moved the API patch from NextGenv2. An example was included. 
To try it, for example, run the following command: $ examples/vpx_cx_set_ref vp9 352 288 in.yuv out.ivf 4 30 Conflicts: examples.mk examples/vpx_cx_set_ref.c test/cx_set_ref.sh vp9/decoder/vp9_decoder.c deblock filter : moved from vp8 code branch The deblocking filters used in vp8 have been moved to vpx_dsp for use by both vp8 and vp9. vpx_thread.[hc]: update webp source reference + drop the blob hash, the updated reference will be updated in the commit message BUG=b/29583578 vpx_thread: use native windows cond var if available BUG=b/29583578 original webp change: commit 110ad5835ecd66995d0e7f66dca1b90dea595f5a Author: James Zern <jzern@google.com> Date: Mon Nov 23 19:49:58 2015 -0800 thread: use native windows cond var if available Vista / Server 2008 and up. no speed difference observed. 100644 blob 4fc372b7bc6980a9ed3618c8cce5b67ed7b0f412 src/utils/thread.c 100644 blob 840831185502d42a3246e4b7ff870121c8064791 src/utils/thread.h vpx_thread: use InitializeCriticalSectionEx if available BUG=b/29583578 original webp change: commit 63fadc9ffacc77d4617526a50c696d21d558a70b Author: James Zern <jzern@google.com> Date: Mon Nov 23 20:38:46 2015 -0800 thread: use InitializeCriticalSectionEx if available Windows Vista / Server 2008 and up 100644 blob f84207d89b3a6bb98bfe8f3fa55cad72dfd061ff src/utils/thread.c 100644 blob 840831185502d42a3246e4b7ff870121c8064791 src/utils/thread.h vpx_thread: use WaitForSingleObjectEx if available BUG=b/29583578 original webp change: commit 0fd0e12bfe83f16ce4f1c038b251ccbc13c62ac2 Author: James Zern <jzern@google.com> Date: Mon Nov 23 20:40:26 2015 -0800 thread: use WaitForSingleObjectEx if available Windows XP and up 100644 blob d58f74e5523dbc985fc531cf5f0833f1e9157cf0 src/utils/thread.c 100644 blob 840831185502d42a3246e4b7ff870121c8064791 src/utils/thread.h vpx_thread: use CreateThread for windows phone BUG=b/29583578 original webp change: commit d2afe974f9d751de144ef09d31255aea13b442c0 Author: James Zern <jzern@google.com> Date: Mon Nov 
23 20:41:26 2015 -0800 thread: use CreateThread for windows phone _beginthreadex is unavailable for winrt/uwp Change-Id: Ie7412a568278ac67f0047f1764e2521193d74d4d 100644 blob 93f7622797f05f6acc1126e8296c481d276e4047 src/utils/thread.c 100644 blob 840831185502d42a3246e4b7ff870121c8064791 src/utils/thread.h vp9_postproc.c missing extern. BUG=webm:1256 deblock: missing const on extern const. postproc - move filling of noise buffer to vpx_dsp. Fix encoder crashes for odd size input clean-up vp9_intrapred_test remove tuple and overkill VP9IntraPredBase class. postproc: noise style fixes. gtest-all.cc: quiet an unused variable warning under windows / mingw builds vp9_intrapred_test: follow-up cleanup address few comments from ce050afaf3e288895c3bee4160336e2d2133b6ea Change-Id: I3eece7efa9335f4210303993ef6c1857ad5c29c8	2016-07-18 10:31:10 -07:00
Yi Luo	bfe4c0ae07	Integrate HBD inverse HT flip types sse4.1 optimization - tx_size: 4x4, 8x8, 16x16. - tx_type: FLIPADST_DCT, DCT_FLIPADST, FLIPADST_FLIPADST, ADST_FLIPADST, FLIPADST_ADST. - Encoder speed improvement: park_joy_1080p_12: ~11%, crowd_run_1080p_12: ~7%. - Add unit test cases for bit-exact against C. Change-Id: Ia69d069031fa76c4625e845bfbfe7e6f6ed6e841	2016-05-25 12:32:10 -07:00
Angie Chiang	6f28581b26	Turn on flip in inverse txfm2d Fix build failed Reduce txfm test time Change-Id: Ieaf6b27f3a272d06286f817f01230413fa8adcf6	2016-05-18 11:26:57 -07:00
Jingning Han	4b639fcf43	Merge "Remove unused highbd_ihalfcenter32_c function" into nextgenv2	2016-05-10 23:16:35 +00:00
Jingning Han	6b9a507f82	Remove unused highbd_ihalfcenter32_c function Change-Id: I4390fcbdf353d79dadc021d83d40891e518997dc	2016-05-10 14:27:16 -07:00
Yi Luo	cd8cfb8675	Change inverse HT function argument from TXFM_2D_CFG* to int This change has no performance impact. It prepares the proper function interface for better performance optimization. Change-Id: I12e2f2deaf7f3adc603de0a74852116468c762f6	2016-05-09 18:34:16 -07:00
Angie Chiang	02d23fbbf4	Fit adst/dct's stage range into 32-bit in bd12 Change-Id: Ie428c6f0655873de3e77e844a2f2e4203cf47dff	2016-04-14 15:44:05 -07:00
Angie Chiang	ff8c490b9a	Branch dct to new implementation for bd12 Change-Id: I9281935653aacce22ac3100f79fb956c249e2bf3	2016-04-04 12:40:10 -07:00
Angie Chiang	f1060f5bc4	Change dct32x32's range Bitdepth 10/12: Fit coefficient range into 32 bits Fit coefficient * const range into 32 bits Bitdepth 8: Fit coefficient range into 16 bits Fit coefficient * constant range into 32 bits Change-Id: I50b5a3132e8a9f5155c971ab0f6eb52876d2b5ca	2016-04-04 11:21:11 -07:00
Angie Chiang	64413a6ca7	Parameterize transform scale for quantizer This is to facilitate changing transform scale later Change-Id: Ic8ca5afba57d2489ebd191ccc40c1b31605a0d8c	2016-03-30 15:25:26 -07:00
Angie Chiang	46b234478f	Use vp10_[fwd/inv]_txfm2d_add_32x32 for bd 10 Change-Id: I996c48a90d7d71b52594a91a35cb8712c7fc212e	2016-03-28 11:08:40 -07:00
Angie Chiang	d9a0cbb1b7	Use vp10_[fwd/inv]_txfm2d_add_#x# for bd 10 Change-Id: Ie35bdbd7aafae693e3106d7ccbbdd8e65ee8800c	2016-03-23 12:05:12 -07:00
Debargha Mukherjee	1b17559327	Adds 1D transforms for ADST/FlipADST to make 16 Makes a set of 16 transforms total, adding all 1D combinations of ADST and FlipADST, and removing all DST transforms. lowres, midres both improve by about 0.1% and hdres by -0.378% in BDRATE but with fewer transforms that are also simpler. Further experiments to continue later. Change-Id: I7348a4c0e12078fdea5ae3a2d36a89a319ffcc6e	2016-03-21 11:19:36 -07:00
Debargha Mukherjee	9b88762b17	Refactor 1D transforms In preparation for adding more 1D variants with ADST/FlipADST/etc. BDRATE actually improves by 0.21% on lowres. Change-Id: I2fa4720c69fe001fa666119a284dfc6b17fffab2	2016-03-14 22:30:09 -07:00
Jingning Han	c453ae53d0	Enable hybrid 1-D/2-D transform coding for highbd setting This commit enables the hybrid 1-D/2-D transform coding scheme for high bit-depth setting. It improves the compression performance of ext-tx experiment by 0.98% for lowres_all set. Change-Id: Ic27f5037f2c36b095a93b9f15dbae34bdcdf00aa	2016-03-10 08:58:07 -08:00
Jingning Han	a8dc9694a4	Hybrid 1-D/2-D transform coding This commit enables a hybrid 1-D/2-D transform coding scheme and the accompanying entropy coding system. It currently uses hybrid 1-D/2-D DCT transform coding. It provides coding performance gains: lowres_all 0.55% hdres_all 0.43% Change-Id: I2b30dcafd21eb2bb3371f6e854cbab440a4dfa78	2016-03-07 09:27:46 -08:00
Debargha Mukherjee	7485498773	Extends ext-tx to support 32x32 masked transforms Adds new 32x32 masked 1-d transforms that combine 1-D length-16 DCT with length-16 identity transforms. To be continued in subsequent patches. Change-Id: I0b4f66492d44c079b3c3b531ba48a97201de1484	2016-02-17 09:31:34 -08:00
Debargha Mukherjee	1badceada8	Code cleanup: remove redundant DST1 code Removes the USE_DST2 flag that was on by default. DST2 performs slightly better than DST1 and is faster to compute. Change-Id: Ifb788f3f0a0e1995d7625230cec144b876f01206	2016-02-16 10:36:02 -08:00
Debargha Mukherjee	49d9730f60	Replace DST1 in ext_tx experiment with DST2 The DST2 is implemented by input alternate sign-flip, followed by DCT, followed by output reversal. Results are roughly the same, but it should be easier to optimize the DST2. [Interestingly a matrix multiply implementation is about 0.1% better]. Change-Id: If9ae5fdba87767fb0e6c163a62b77ee66a8d3afc	2015-12-15 11:30:48 -08:00
Angie Chiang	2b3f1d36b3	Merge changes Iea45fd22,If174d8dd,I9f539491 into nextgenv2 * changes: Add facade to inverse txfm Create hybrid_fwd_txfm.c merge txfm_#x#_1 into txfm_#x#	2015-12-03 22:29:03 +00:00
Angie Chiang	a245d9f88c	Add facade to inverse txfm Add inv_txfm and highbd_inv_txfm as facades of inverse transform such that the code flow in encodemb.c can be simpler Change-Id: Iea45fd22dd8b173f8eb3919ca6502636f7bcfcf7	2015-11-25 13:50:40 -08:00
Debargha Mukherjee	13e0cfb8c7	Fix ext-tx experiment for highbitdepth Change-Id: I610e18f150d73378283882ae81f5f77c367d2956	2015-11-24 10:38:37 -08:00
Geza Lore	4f5108090a	Flip the result of the inverse transform for FLIPADST. When using FLIPADST, the vp10_inv_txfm_add functions used to flip the destination array, add the result of the inverse transform to it, and then flip the destination back. This has been replaced by flipping the result of the inverse transform before adding it to the destination. Up-Down flipping is done by negating the destination stride, and starting from the bottom, so it should now be free. Left-right flipping is done with the usual SSE2 instructions in the optimized code. The C functions match the SSE2 functions as expected, so the C functions now do the flipping as well when required. Adding this cleanly required some refactoring of the C functions, but there is no measurable performance impact when ext-tx is not enabled. Encode speedup with ext-tx enabled is about 3%. Change-Id: I5b04e5d720f0b9f0d54fd8607a8764f2314c7234	2015-11-04 17:11:44 +00:00
Geza Lore	2b39bcec29	Fix transform tables in C implementations. These tables were out of sync with the indexing enum since the refactoring in commit 4f16f119 (change 303389), due to the removal of the ext_tx_to_txtype lookup table. This patch just puts them back in order. Change-Id: Ieb7d57654f61b99b511d54c9ba09abbd5e8d0d14	2015-11-03 17:10:51 +00:00
Debargha Mukherjee	8a4292441f	Refactoring tx-types to add more flexibility Allows inter and intra tx_types to have different sets of transforms for different tx_size/sb_type combinations. Change-Id: Ic0ac1daef7a9fb15c4210271e4d04cd36e5cec8e	2015-10-28 23:31:32 -07:00
Jingning Han	3ff3313502	Silence compiler warnings when high bit-depth is turned on Clear the compiler warnings when both ext-tx and high bit-depth are turned on. Change-Id: I2e02f1f29043f2952fe215f8183b5bfd80e16f58	2015-10-23 14:51:16 -07:00
Yaowu Xu	4ac2ae3a4d	Merge branch 'masterbase' into nextgenv2 Conflicts: configure test/vp9_encoder_parms_get_to_decoder.cc vp10/common/blockd.h vp10/common/entropymode.c vp10/common/entropymode.h vp10/common/idct.c vp10/decoder/decodeframe.c vp10/decoder/decodemv.c vp10/encoder/bitstream.c vp10/encoder/encodeframe.c vp10/encoder/encodemb.c vp10/encoder/encoder.c vp10/encoder/encoder.h vp10/encoder/rd.c vp10/encoder/rdopt.c vp10/encoder/tokenize.c vp10/encoder/tokenize.h vp9/decoder/vp9_decodeframe.c vp9/decoder/vp9_decoder.h vp9/encoder/vp9_aq_cyclicrefresh.c vp9/encoder/vp9_encoder.h vp9/vp9_cx_iface.c vpx/vp8cx.h vpx_dsp/x86/vpx_subpixel_8t_intrin_ssse3.c vpx_scale/yv12config.h Change-Id: I604a329d38badec7a11e8ede16ca1404476e9b93	2015-10-22 11:40:44 -07:00
hui su	2afe7320c8	Add identity transform to ext-tx experiment ext-tx on derflr: +1.756% (was +1.648) Change-Id: I8a87970fa589e8f5f96db7aa68ec9b6c98e20188	2015-09-30 18:47:46 -07:00
Debargha Mukherjee	3e8cceb3fc	Speed up of DST and the search in ext_tx Adds an early termination to the ext_tx search, and also implements the DST transforms more efficiently. About 4 times faster with the ext-tx experiment. There is a 0.09% drop in performance on derflr from 1.735% to 1.648%, but worth it with the speedup achieved. Change-Id: I2ede9d69c557f25e0a76cd5d701cc0e36e825c7c	2015-09-29 19:11:43 -07:00
Yaowu Xu	7c514e2dfd	Merged branch 'master' into nextgenv2 Resolved Conflicts in the following files: configure vp10/common/idct.c vp10/encoder/dct.c vp10/encoder/encodemb.c vp10/encoder/rdopt.c Change-Id: I4cb3986b0b80de65c722ca29d53a0a57f5a94316	2015-09-29 16:17:32 -07:00
Ronald S. Bultje	bab8d38f7f	vp10: remove MACROBLOCK.{highbd_,}itxfm_add function pointer. This is preparatory work for allowing per-segment lossless coding. See issue 1035. Change-Id: I9487d02717ee3e766aee61a487780056bb35d2d3	2015-09-25 19:30:46 -04:00
Debargha Mukherjee	b8bc026c72	Misc. ext_tx fixes/enhancements derflr: +1.732% (8-bit) Change-Id: I9c04c8249646ff96eacacfa1dcb0bd118c04e84a	2015-09-15 10:00:54 -07:00
Debargha Mukherjee	4ce81d666e	Comprehensive support for symmetric DST Creates new hybrid transforms combining symmetric DST with ADST and DCT. Thus a total of 16 transforms are supported. derfl: +1.659% (up about 0.2%) Change-Id: Idde1cecdb59527890bf05da740099c3f6a5b9764	2015-09-10 11:13:59 -07:00
Debargha Mukherjee	9fc691efbe	Backport EXT_TX experiment from nextgen Does not include DST1 yet. derflr: +1.437 (8-bit internal), +7.243 (12-bit internal) with --enable-ext-tx Change-Id: I91f1759fd2de794755eb6384cda52e80e979cb7d	2015-09-09 09:42:51 -07:00
hui su	d76e5b3652	Refactoring on transform types Prepare for adding more transform varieties (EXT_TX and TX_SKIP in nextgen). Change-Id: I2dfe024f6be7a92078775917092ed62abc2e7d1e	2015-08-24 10:47:25 -07:00
Jingning Han	3acfe46e8d	Sync vp10 with vpx_ports/system_state.h Change-Id: Ic5004f8bdc1c2b025b598e80374ee1f286ea95ee	2015-08-12 09:21:25 -07:00
Jingning Han	54d66ef165	Remove vp9_ prefix from vp10 files Remove the vp9_ prefix from vp10 file names. Change-Id: I513a211b286a57d6126fc1b0fbfd6405120014f1	2015-08-11 21:24:08 -07:00
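
Commit 49d9730f60 above describes DST2 as "input alternate sign-flip, followed by DCT, followed by output reversal". The sketch below checks that identity numerically. It is not the libvpx code: it assumes "DST2" means the unnormalized type-II DST and "DCT" the unnormalized type-II DCT, and the function names `dct2`, `dst2_direct` and `dst2_via_dct` are made up for illustration.

```python
import math


def dct2(x):
    """Unnormalized DCT-II: C[k] = sum_n x[n] * cos(pi * (n + 1/2) * k / N)."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def dst2_direct(x):
    """Unnormalized DST-II: S[k] = sum_n x[n] * sin(pi * (n + 1/2) * (k + 1) / N)."""
    n = len(x)
    return [
        sum(x[i] * math.sin(math.pi * (i + 0.5) * (k + 1) / n) for i in range(n))
        for k in range(n)
    ]


def dst2_via_dct(x):
    """DST-II computed through the DCT-II, following the three steps in the commit."""
    # Step 1: flip the sign of every odd-indexed input sample.
    flipped = [-v if i % 2 else v for i, v in enumerate(x)]
    # Step 2: run the ordinary DCT-II on the sign-flipped input.
    coeffs = dct2(flipped)
    # Step 3: reverse the output order.
    return coeffs[::-1]


if __name__ == "__main__":
    sample = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
    direct = dst2_direct(sample)
    via_dct = dst2_via_dct(sample)
    print(max(abs(a - b) for a, b in zip(direct, via_dct)))
```

The two routes agree up to floating-point rounding, which is why an optimized DCT kernel can be reused for the DST, as the commit message suggests.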

40 Commits
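
Commit 4f5108090a in the log above says the up-down flip for FLIPADST is obtained "by negating the destination stride, and starting from the bottom". A minimal sketch of that idea on a flat row-major buffer follows. The helper names `add_residual` and `add_residual_flipud` are hypothetical, and the pixel clamping a real codec would apply is omitted.

```python
def add_residual(dest, start, stride, residual, rows, cols):
    """Add a rows x cols residual into the flat buffer dest.

    Row r of the residual is added at offset start + r * stride.
    """
    for r in range(rows):
        base = start + r * stride
        for c in range(cols):
            dest[base + c] += residual[r][c]


def add_residual_flipud(dest, start, stride, residual, rows, cols):
    """Add the residual flipped top-to-bottom, without copying or reversing it.

    The same loop is reused: it starts at the last destination row and
    walks upward by passing a negated stride.
    """
    add_residual(dest, start + (rows - 1) * stride, -stride, residual, rows, cols)


if __name__ == "__main__":
    dest = [0] * 8                      # two rows, stride 4, two visible columns
    residual = [[1, 2], [3, 4]]
    add_residual_flipud(dest, 0, 4, residual, 2, 2)
    print(dest)                         # [3, 4, 0, 0, 1, 2, 0, 0]
```

Because only the start offset and the sign of the stride change, the flipped case costs nothing extra per pixel, in line with the commit's remark that the flip "should now be free".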