generic-library/vpx

Author	SHA1	Message	Date
James Zern	ce88d74d34	vp9_reconintra_neon: add d45 4x4 based on webp's LD4() ~59% faster over 20M pixels Change-Id: I371eaed9ce8f470451046997e130b0ba1a2f7a9c	2015-06-18 15:25:07 -07:00
James Zern	337b221e00	vp9_reconintra_neon: add d135 4x4 based on webp's RD4() ~50% faster over 20M pixels Change-Id: Ifcb7bf7f7fc8eabf79d9e3b219ce1be67abc524a	2015-06-18 15:25:06 -07:00
James Zern	41d8545ab6	Merge "vp9_reconintra_neon: add DC 4x4 predictors"	2015-06-18 22:24:55 +00:00
James Zern	6e44bf20f7	vp9_reconintra_neon: add DC 4x4 predictors ~85-89% faster over 20M pixels Change-Id: I3812e8adfffe5255034da88dfe6546e12f4d10ee	2015-06-18 15:22:43 -07:00
James Zern	e77f859d72	Merge "vp9_reconintra_neon: add DC 32x32 predictors"	2015-06-18 22:17:51 +00:00
Parag Salasakar	d9fedf7832	mips msa vp9 fdct 32x32 optimization average improvement ~4x-6x Change-Id: Ibcac3ef8ed5e207cf8c121e696570e6b63d3c0f4	2015-06-17 07:58:34 +05:30
Parag Salasakar	fa53008fb7	Merge "mips msa vp9 fdct 16x16 optimization"	2015-06-17 01:21:59 +00:00
Parag Salasakar	89b4b315aa	mips msa vp9 fdct 16x16 optimization average improvement ~4x-6x Change-Id: Id3b2243e5b3c7844c90c4231a5e75fa69911362c	2015-06-16 12:49:34 +05:30
James Zern	79fb3a013e	vp9_reconintra_neon: add DC 32x32 predictors ~84-85% faster over 20M pixels Change-Id: Ia67a7f4a342bf7b0a9280e05c25d81a774d90469	2015-06-15 20:57:28 -07:00
James Zern	98f0178611	enable vp9_d153_predictor_32x32_ssse3 unused since its initial commit ~91% faster over 20M pixels Change-Id: Ic8b5b3246bc97c8406be8bc4496601370403b70a	2015-06-12 19:48:22 -07:00
Parag Salasakar	fbac961b47	mips msa vp9 filter by weight optimization filter by weight - average improvement ~2x-3x Change-Id: I4832033335d339cdafdce697f07ce3e643920057	2015-06-12 12:06:42 +05:30
Parag Salasakar	a2288d274c	mips msa vp9 intra-pred optimization intra pred - average improvement ~2x-3x Change-Id: Ie3f7d6eded5ecb7ed7ee506ba8e4d98f93803b09	2015-06-06 22:29:32 +05:30
Parag Salasakar	d43fd99822	mips msa vp9 loopfilter 4, 8 optimization average improvement ~3x-4x Change-Id: I59279293ce4b2a1e99bd10579ac97740e943643f	2015-06-05 09:56:08 +05:30
Parag Salasakar	914f8f9ee0	mips msa vp9 loopfilter 16 optimization average improvement ~3x-4x Change-Id: I8ef263da6ebcf8f20aabaefeccf25a84640ba048	2015-06-04 11:50:41 +05:30
Parag Salasakar	bdfbc3e876	mips msa vp9 convolve8 avg hv optimization average improvement ~4x-6x Change-Id: I7c8b4f2334491be8a859592606e568bc95d019aa	2015-06-04 08:11:01 +05:30
Parag Salasakar	b8c1cdcd12	mips msa vp9 convolve8 avg horiz optimization average improvement ~5x-8x Change-Id: I179a69ec620fbd69979bd128f05d18113618aab4	2015-06-03 11:33:42 +05:30
Parag Salasakar	c543d38ac7	mips msa vp9 convolve8 avg vert optimization average improvement ~4x-6x Change-Id: Ia2e6f770da46416ebec31fdcea5cc7878879a9d9	2015-06-03 09:55:25 +05:30
Parag Salasakar	54a6f73958	mips msa vp9 idct4x4 and iwht4x4 optimization average improvement ~3x-4x moved assert to respective files Change-Id: I6c915059d456a00bdd76fab0dd2eede8b6c6ea58	2015-06-02 12:16:28 +05:30
Parag Salasakar	ebf7466cd8	mips msa vp9 updated convolve horiz, vert, hv, copy, avg module Updated sources according to improved version of common MSA macros. Enabled respective convolve MSA hooks and tests. Overall, this is just upgrading the code with styling changes. Change-Id: If5ad6ef8ea7ca47feed6d2fc9f34f0f0e8b6694d	2015-06-02 12:03:51 +05:30
Parag Salasakar	6af9d7f2e2	mips msa vp9 updated idct 8x8, 16x16 and 32x32 module Updated sources according to improved version of common MSA macros. Enabled idct MSA hooks and tests. Overall, this is just upgrading the code with styling changes. Change-Id: I1f488ab2c741f6c622b7a855388a202168082209	2015-06-01 09:24:23 +05:30
Parag Salasakar	71e88f903d	Merge "mips msa vp9 updated macros and disable all MSA functions"	2015-05-30 02:52:27 +00:00
James Zern	a2a13cbe5f	vp9_reconintra_neon: add DC 16x16 predictors 85-89% faster over 20M pixels Change-Id: I9b320ed6b9e67f27df738b84c8b43b65a93c50c2	2015-05-29 15:41:44 -07:00
James Zern	e97b849219	vp9_reconintra_neon: add DC 8x8 predictors ~90% faster over 20M pixels Change-Id: Iab791510cc57c8332c2f9a5da0ed50702e5f5763	2015-05-29 15:39:08 -07:00
Parag Salasakar	f9f078ebb6	mips msa vp9 updated macros and disable all MSA functions Done little restructuring/styling changes to the sources like generic macro definitions, their use to reduce code lines, better code alignments etc. Disabled all MSA hooks and tests Change-Id: Ic6f2dce0b501f46b80c06c46c0fe2043d557b190	2015-05-29 13:34:33 +05:30
Johann	c3bdffb0a5	Move variance functions to vpx_dsp subpel functions will be moved in another patch. Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce	2015-05-26 12:01:52 -07:00
James Zern	330fba41e2	vp9 intrinsics: add vp9_rtcd include silences a missing declaration warning Change-Id: I59a34e1a1377cf3529b678d7ec0122bd43ab1bf1	2015-05-15 10:43:47 -07:00
Parag Salasakar	686616a989	Merge "mips msa vp9 idct 8x8 optimization"	2015-05-13 04:36:34 +00:00
hkuang	f5574fb44c	Merge "Add more sse2 code for intra prediction."	2015-05-08 17:26:30 +00:00
Parag Salasakar	7c5f00f868	mips msa vp9 idct 8x8 optimization average improvement ~4x-6x Change-Id: I5edf713721b9e24c7e0ce2e69d8fc3ecab625d91	2015-05-08 12:23:27 +05:30
Parag Salasakar	a8a9c2bb45	Merge "mips msa vp9 idct 32x32 optimization"	2015-05-08 04:27:44 +00:00
Johann	76a08210b6	Merge "Move shared SAD code to vpx_dsp"	2015-05-07 18:33:06 +00:00
Parag Salasakar	1601c1385a	mips msa vp9 idct 32x32 optimization average improvement ~4x-6x Change-Id: Idaba7e49fbd7f388caee0d73773ccf6e4807ef17	2015-05-07 12:42:23 +05:30
hkuang	7153b822ed	Add more sse2 code for intra prediction. vp9_dc_left_predictor_16x16 vp9_dc_top_predictor_32x32 vp9_dc_left_predictor_32x32 vp9_dc_128_predictor_32x32 Change-Id: Ib9861deefd01c3527235b92ff6b3d571ef6b4bc6	2015-05-06 17:17:00 -07:00
Johann	d5d9289800	Move shared SAD code to vpx_dsp Create a new component, vpx_dsp, for code that can be shared between codecs. Move the SAD code into the component. This reduces the size of vpxenc/dec by 36k on x86_64 builds. Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a	2015-05-06 16:58:20 -07:00
Parag Salasakar	d1cdda88bd	Merge "mips msa vp9 idct 16x16 optimization"	2015-05-06 06:40:56 +00:00
James Zern	ccae5d99d2	fix and enable vp9_dc_128_predictor_16x16 widen the loads and stores to 128-bit. this was added, but not enabled in: `493a857` Add some sse2 code for intra prediction. Change-Id: I277d7db608a7db7d75cc0bde86f48fa66ad487e4	2015-05-05 11:40:13 -07:00
hkuang	e47811ef8f	Merge "Add some sse2 code for intra prediction."	2015-05-05 17:11:07 +00:00
Parag Salasakar	60052b618f	mips msa vp9 idct 16x16 optimization average improvement ~4x-6x Change-Id: I55e95b7f2ba403dff11813958dc7c73a900dd022	2015-05-05 12:37:06 +05:30
Yaowu Xu	2061359fcf	Merge "Remove vp9_idct16x16_10_add_ssse3()"	2015-04-30 23:13:33 +00:00
hkuang	493a8579f1	Add some sse2 code for intra prediction. Change-Id: I16c0a62e52dab62837c547345df31e7518620ed4	2015-04-30 15:42:57 -07:00
Yaowu Xu	47767609fe	Remove vp9_idct16x16_10_add_ssse3() The rotation computation using 2X of cos(pi/16) has the potential to overflow 32 bits; this commit disables the function to allow further investigation and optimization. Change-Id: I4a9803bc71303d459cb1ec5bbd7c4aaf8968e5cf	2015-04-30 09:07:30 -07:00
Parag Salasakar	95cb130f32	Merge "mips msa vp9 copy and avg convolve optimization"	2015-04-30 04:39:13 +00:00
Yaowu Xu	486a73a9ce	Disable ssse3 version idct16x16_256_add() The version is currently producing different results from the C version for some inputs. Disable the use of it for now to allow time to investigate the source of the mismatch. Change-Id: Id039455494ee531db4886a9f1fa4761174ef6df3	2015-04-29 16:58:59 -07:00
Parag Salasakar	2301d10f73	mips msa vp9 copy and avg convolve optimization average improvement ~3x-5x Change-Id: I422e4c33ea7e6d6783ba40029438ccf21b0e76bb	2015-04-29 12:28:17 +05:30
Parag Salasakar	ca90d4fd96	mips msa vp9 convolve8 horiz optimization average improvement ~6x-8x Change-Id: I7c91eec41aada3b0a5231dda7869b3b968f3ad18	2015-04-21 12:31:26 +05:30
Parag Salasakar	ef51c1ab5b	mips msa vp9 convolve8 hv optimization average improvement ~5x-8x Change-Id: I3214734cb3716e742907ce0d2d7a042d953df82b	2015-04-21 09:17:49 +05:30
Parag Salasakar	2e36149ccd	Merge "mips msa vp9 convolve8 vert optimization"	2015-04-18 23:39:25 -07:00
Parag Salasakar	27d083c1b9	mips msa vp9 convolve8 vert optimization average improvement ~6x-10x Change-Id: Ie3f3ab3a9005be84935919701e56b404e420affa	2015-04-18 08:13:04 +05:30
Marco Paniconi	f76ccce5bc	Revert "Revert "Force_split on 16x16 blocks in variance partition."" This reverts commit `004b9d83e3` Change-Id: I2f2d0bdb9368c2c07f1d29a69cd461267a3a8743	2015-04-16 17:52:13 -07:00
Yunqing Wang	004b9d83e3	Revert "Force_split on 16x16 blocks in variance partition." This reverts commit `eb8c667570`. The patch caused mismatch while using multi-threads. Change-Id: Icd646340af25b5d91e32f03ed3ea212e00e3e0be	2015-04-14 15:19:31 -07:00
Marco	eb8c667570	Force_split on 16x16 blocks in variance partition. Force split on 16x16 block (to 8x8) based on the minmax over the 8x8 sub-blocks. Also increase variance threshold for 32x32, and add exit condition in choose_partition (with very safe threshold) based on sad used to select reference frame. Some visual improvement near moving boundaries. Average gain in psnr/ssim: ~0.6%, some clips go up ~1 or 2%. Encoding time increase (due to more 8x8 blocks) from ~1-4%, depending on clip. Change-Id: I4759bb181251ac41517cd45e326ce2997dadb577	2015-04-13 12:05:07 -07:00
Jingning Han	93d9c50419	Merge "SSSE3 assembly implementation of 8x8 Hadamard transform"	2015-04-09 11:16:11 -07:00
Jingning Han	7f629dfca4	SSSE3 assembly implementation of 8x8 Hadamard transform It uses about 10% less CPU cycles than the SSE2 intrinsic implementation. Change-Id: I91017c0c068679a214b98cdd4cff3a6facfb7499	2015-04-04 09:59:37 -07:00
James Zern	44e3640923	Merge "vp9: enable sse4 sad functions"	2015-04-03 14:57:52 -07:00
James Zern	b644384bb5	Merge "vp9: fix high-bitdepth NEON build"	2015-04-01 23:36:17 -07:00
Jingning Han	1470529f62	Refactor block_yrd function for RTC coding mode This commit separates Hadamard transform/quantization operations from rate and distortion computation in block_yrd. This allows one to skip SATD computation when all transform blocks are quantized to zero. It also uses a new block error function that skips repeated computation of sum of squared residuals. It reduces the CPU cycles spent on block error calculation in block_yrd by 40%. Change-Id: I726acb2454b44af1c3bd95385abecac209959b10	2015-04-01 12:00:43 -07:00
James Zern	14e24a1297	vp9: enable sse4 sad functions sse4 isn't set by configure or used in rtcd, correct the sad entries to use sse4_1 without changing the signatures for now. this was done in vp8 post-vp9 branch. Change-Id: Ia9f1fff9f2476fdfa53ed022778dd2f708caa271	2015-03-31 21:00:55 -07:00
James Zern	8845334097	vp9: fix high-bitdepth NEON build remove incorrect specializations in rtcd and update a configuration check in partial_idct_test.cc Change-Id: I20f551f38ce502092b476fb16d3ca0969dba56f0	2015-03-31 17:45:25 -07:00
Jingning Han	26d3d3af6a	Enable 16x16 Hadamard transform in SATD based mode decision This commit replaces the 16x16 2D-DCT transform with Hadamard transform for RTC coding mode. It reduces the CPU cycles cost on 16x16 transform by 5X. Overall it makes the speed -6 encoding speed 1.5% faster without compromise on compression performance. Change-Id: If6c993831dc4c678d841edc804ff395ed37f2a1b	2015-03-30 15:43:31 -07:00
Jingning Han	8c411f74e0	Hadamard transform based coding mode decision process This commit uses Hadamard transform based rate-distortion cost estimate for rtc coding mode decision. It improves the compression performance of speed -6 for many hard clips at lower bit-rates. For example, 5.5% for jimredvga, 6.7% for mmmoving, 6.1% for niklas720p. This will introduce extra encoding cycle costs at this point. Change-Id: Iaf70634fa2417a705ee29f2456175b981db3d375	2015-03-30 14:46:05 -07:00
Jingning Han	1790d45252	Use variance metric for integral projection vector match This commit replaces the SAD with variance as metric for the integral projection vector match. It improves the search accuracy in the presence of slight light change. The average speed -6 compression performance for rtc set is improved by 1.7%. No speed changes are observed for the test clips. Change-Id: I71c1d27e42de2aa429fb3564e6549bba1c7d6d4d	2015-03-01 10:42:56 -08:00
Jingning Han	ed2dc59c1b	Integral projection based motion estimation This commit introduces a new block match motion estimation using integral projection measurement. The 2-D block and the nearby region is projected onto the horizontal and vertical 1-D vectors, respectively. It then runs vector match, instead of block match, over the two separate 1-D vectors to locate the motion compensated reference block. This process is run per 64x64 block to align the reference before choosing partitioning in speed 6. The overall CPU cycle cost due to this additional 64x64 block match (SSE2 version) takes around 2% at low bit-rate rtc speed 6. When strong motion activities exist in the video sequence, it substantially improves the partition selection accuracy, thereby achieving better compression performance and lower CPU cycles. The experiments were tested in RTC speed -6 setting: cloud 1080p 500 kbps 17006 b/f, 37.086 dB, 5386 ms -> 16669 b/f, 37.970 dB, 5085 ms (>0.9dB gain and 6% faster) pedestrian_area 1080p 500 kbps 53537 b/f, 36.771 dB, 18706 ms -> 51897 b/f, 36.792 dB, 18585 ms (4% bit-rate savings) blue_sky 1080p 500 kbps 70214 b/f, 33.600 dB, 13979 ms -> 53885 b/f, 33.645 dB, 10878 ms (30% bit-rate savings, 25% faster) jimred 400 kbps 13380 b/f, 36.014 dB, 5723 ms -> 13377 b/f, 36.087 dB, 5831 ms (2% bit-rate savings, 2% slower) Change-Id: Iffdb6ea5b16b77016bfa3dd3904d284168ae649c	2015-02-19 13:47:19 -08:00
Frank Galligan	e3167f7fbf	Add vp9_sad32x32x4d_neon Neon intrinsic function. On Nexus 7 speed -6 saw ~18% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. BUG=https://code.google.com/p/webm/issues/detail?id=908 Change-Id: I70ccdea0326750552ed946fb004507d6efe02d5c	2015-01-27 08:54:00 -08:00
Frank Galligan	9f574d0316	Add vp9_sad16x16x4d_neon Neon intrinsic function. On Nexus 7 speed -6 saw ~15% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. BUG=https://code.google.com/p/webm/issues/detail?id=908 Change-Id: I4b2006b644c488f42bf06d8a22ef0e6120a96bf9	2015-01-27 08:42:17 -08:00
Frank Galligan	54fa956715	Add vp9_sad64x64x4d_neon Neon intrinsic function. On Nexus 7 speed -6 saw ~30% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. BUG=https://code.google.com/p/webm/issues/detail?id=908 Change-Id: Id12af7d1883243c23e6692e898aea82299633d58	2015-01-27 08:33:40 -08:00
Frank Galligan	9f6eba419a	Add Neon intrinsic vp9_fdct8x8_quant_neon On Nexus 7 speed -5 got ~2%, -6 got ~15%, -7 and -8 got ~30% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: I83246d63b96674d170098a572fa4fe28a05aaf51	2015-01-24 22:49:50 -08:00
JackyChen	65f60f8e8c	Merge "SSE2 code for the filter in MFQE."	2015-01-23 11:08:16 -08:00
JackyChen	09673deba9	SSE2 code for the filter in MFQE. The SSE2 code is from VP8 MFQE, reuse it in VP9. No change on VP8 side. In our testing, we achieve 2X speed by adopting this change. Change-Id: Ib2b14144ae57c892005c1c4b84e3379d02e56716	2015-01-18 16:07:59 -08:00
Frank Galligan	6e7e1cf32f	Add Neon intrinsics for vp9_avg_8x8_neon On Nexus 7 speed -5, -6, -7, and -8 saw about a 1% increase in perf for 480p. Speeds -5, -6, -7, and -8 saw about a 1.5% increase in perf for 720p. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: Ibf17ebfd952a6aec941719bd8306df8ec4574bee	2015-01-15 15:32:40 -08:00
Frank Galligan	ec1d8387e1	Add 64x64 sub_pel_variance Neon function On Nexus 7 speed -5, -6, -7, and -8 saw about a 15% increase in perf for 480p. Speeds -5, -6, -7, and -8 saw about a 10% increase in perf for 720p. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: I2fa5315845e3021c9a6e2ea47e52e68b398d8334	2015-01-14 08:36:24 -08:00
Frank Galligan	74d40cd507	Add 64x variance Neon functions Add optimized Neon functions of: vp9_variance32x64 vp9_variance64x32 vp9_variance64x64 On Nexus 7 speed -5 and -6 saw about a 4% increase in perf. Speeds -7 and -8 saw about a 6% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: I5a81f13c9897eb927fa39662530f5524a0f768fa	2015-01-13 15:08:13 -08:00
Johann	377b6682f9	Disable vp9 _8_ loopfilters Investigating https://code.google.com/p/chromium/issues/detail?id=443839 Change-Id: Ibb7485d835c5aa5e1d40f31715596ba8d208eedb	2015-01-06 19:26:11 -08:00
Jingning Han	d0f2377027	Revert "Revert "Removal of legacy zbin_extra / zbin_oq_value."" This reverts commit `9946ee23e0`. Fix the ssse3 asm function. Change-Id: I07f77a63aa98087626e45c4e87aa5dcafc0b0b07	2014-12-22 10:09:25 -08:00
Paul Wilkins	9946ee23e0	Revert "Removal of legacy zbin_extra / zbin_oq_value." This reverts commit `e9b586e21b`. Change-Id: I5b36e6727da6c05278d97e2c37b80c109f79bed4	2014-12-19 15:02:58 +00:00
Paul Wilkins	e9b586e21b	Removal of legacy zbin_extra / zbin_oq_value. zbin extra / zbin_oq_value was widely passed around, hence removal touches a lot of code. Change-Id: Idc94359735b60c38a160e4385ae09d5ca8b6b8e5	2014-12-18 16:49:11 +00:00
James Yu	aeeaa67987	VP9 common for ARMv8 by using NEON intrinsics 15 Re-write - vp9_lpf_horizontal_4_dual_neon in vp9_loopfilter_16_neon.c Change-Id: Ie14f63d352f9564ad01db3939a61d91cf6d21a31 Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-16 20:00:26 -08:00
James Yu	aa8dd897c1	VP9 common for ARMv8 by using NEON intrinsics 16 Add vp9_reconintra_neon.c - vp9_v_predictor_4x4_neon - vp9_v_predictor_8x8_neon - vp9_v_predictor_16x16_neon - vp9_v_predictor_32x32_neon - vp9_h_predictor_4x4_neon - vp9_h_predictor_8x8_neon - vp9_h_predictor_16x16_neon - vp9_h_predictor_32x32_neon - vp9_tm_predictor_4x4_neon - vp9_tm_predictor_8x8_neon - vp9_tm_predictor_16x16_neon - vp9_tm_predictor_32x32_neon Change-Id: Ib5d54a4766a1b5127169045659974f33aa98376d Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-16 12:57:52 -08:00
James Yu	4f856cd7fa	VP9 common for ARMv8 by using NEON intrinsics 06 Add vp9_iht8x8_add_neon.c - vp9_iht8x8_64_add_neon The assembly did not previously implement tx_type 0 BUG=716 Change-Id: Icfc99dd24f3d59047f9184a7d0c761ba7e3de934 Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-15 12:18:06 -08:00
James Yu	6b71013277	VP9 common for ARMv8 by using NEON intrinsics 05 Add vp9_iht4x4_add_neon.c - vp9_iht4x4_16_add_neon The assembly did not previously implement tx_type 0 BUG=715 Change-Id: I60034d1568de034edba45c5cdd13f3d87dbc73b6 Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-15 12:16:19 -08:00
James Yu	3f7c12dab9	VP9 common for ARMv8 by using NEON intrinsics 18 Add vp9_idct32x32_add_neon.c - vp9_idct32x32_1024_add_neon Change-Id: Ic598b772c28bd3487a8ead7a4598a66b25f9b00f Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-10 18:20:04 -08:00
James Yu	3cfed4bf76	VP9 common for ARMv8 by using NEON intrinsics 14 Add vp9_idct16x16_add_neon.c - vp9_idct16x16_256_add_neon_pass1 - vp9_idct16x16_256_add_neon_pass2 - vp9_idct16x16_10_add_neon_pass1 - vp9_idct16x16_10_add_neon_pass2 Change-Id: I54d25b54a36f4371760f54e4036693aaea40a5de Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-10 18:19:54 -08:00
James Yu	ce76aeb00d	VP9 common for ARMv8 by using NEON intrinsics 13 Add vp9_idct8x8_add_neon.c - vp9_idct8x8_64_add_neon - vp9_idct8x8_10_add_neon Change-Id: I6ee7b4496765aa36ed52990f2ef73e9f24459610 Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-10 14:56:54 -08:00
James Yu	8c25f4af6a	VP9 common for ARMv8 by using NEON intrinsics 12 Add vp9_idct4x4_add_neon.c - vp9_idct4x4_16_add_neon Change-Id: I011a96b10f1992dbd52246019ce05bae7ca8ea4f Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-10 14:49:59 -08:00
James Yu	420f58f2d2	VP9 common for ARMv8 by using NEON intrinsics 11 Add vp9_idct16x16_1_add_neon.c - vp9_idct16x16_1_add_neon Change-Id: I7c6524024ad4cb4e66aa38f1c887e733503c39df Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-10 13:06:58 -08:00
James Yu	030ca4d0e5	VP9 common for ARMv8 by using NEON intrinsics 10 Add vp9_idct32x32_1_add_neon.c - vp9_idct32x32_1_add_neon Change-Id: If9ffe9a857228f5c67f61dc2b428b40965816eda Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-10 13:04:29 -08:00
James Yu	2772b45ac0	VP9 common for ARMv8 by using NEON intrinsics 09 Add vp9_idct8x8_1_add_neon.c - vp9_idct8x8_1_add_neon Change-Id: I9d23e01fa96013febbf64db6c76c6c955f14e3ff Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-10 12:52:33 -08:00
James Yu	9114f0afdb	VP9 common for ARMv8 by using NEON intrinsics 08 Add vp9_idct4x4_1_add_neon.c - vp9_idct4x4_1_add_neon Change-Id: Ieab9af107dbd07a4f9503bc945890c90faccb8ac Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-10 12:49:28 -08:00
James Yu	01fc6f51e0	VP9 common for ARMv8 by using NEON intrinsics 07 Add vp9_convolve8_neon.c - vp9_convolve8_horiz_neon - vp9_convolve8_vert_neon Change-Id: I0bdd99ff72d275223fe211ac7243c25a5a60cf87 Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-09 20:03:07 -08:00
James Yu	893534a996	VP9 common for ARMv8 by using NEON intrinsics 04 Add vp9_convolve8_avg_neon.c - vp9_convolve8_avg_horiz_neon - vp9_convolve8_avg_vert_neon Change-Id: I617971e37b02186fec5aca181f4f9622050ea2df Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-09 20:03:07 -08:00
James Yu	d12757f5c6	VP9 common for ARMv8 by using NEON intrinsics 03 Add vp9_copy_neon.c - vp9_convolve_copy_neon Change-Id: I291fc5423d06240876411bbceab03eae5ef585be Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-09 20:02:46 -08:00
Scott LaVarnway	617382a2e3	VP9 common for ARMv8 by using NEON intrinsics 02 Add vp9_avg_neon.c - vp9_convolve_avg_neon Change-Id: Id2c9d5bcfa37cff1a16417aba1656ff07bdf10fd Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-09 19:00:21 -08:00
James Yu	5b098b1825	VP9 common for ARMv8 by using NEON intrinsics 01 Add vp9_loopfilter_neon.c - vp9_lpf_horizontal_4_neon - vp9_lpf_vertical_4_neon - vp9_lpf_horizontal_8_neon - vp9_lpf_vertical_8_neon Change-Id: I97a0d7b399a431c21ee77396be3d5f5a1f7ebccb Signed-off-by: James Yu <james.yu@linaro.org>	2014-12-09 12:26:56 -08:00
Peter de Rivaz	a306bd8274	Use the RTC optimizations when in high bitdepth mode. Change 72193 made the encoder behave differently when configured with and without high bitdepth. This change means the same algorithm is used for both. Change-Id: I707a44a94afca773a9e0c2f7ebeeea83030257c5	2014-12-04 15:48:42 -08:00
Marco	8fd3f9a2fb	Enable non-rd mode coding on key frame, for speed 6. For key frame at speed 6: enable the non-rd mode selection in speed setting and use the (non-rd) variance_based partition. Adjust some logic/thresholds in variance partition selection for key frame only (no change to delta frames), mainly to bias to selecting smaller prediction blocks, and also set max tx size of 16x16. Loss in key frame quality (~0.6-0.7dB) compared to rd coding, but speeds up key frame encoding by at least 6x. Average PSNR/SSIM metrics over RTC clips go down by ~1-2% for speed 6. Change-Id: Ie4845e0127e876337b9c105aa37e93b286193405	2014-12-03 09:18:08 -08:00
Peter de Rivaz	7e40a55ef9	Added high bitdepth sse2 transform functions Also removes some spurious changes in common/vp9_blockd.h which was introduced by a rebase issue between nextgen and master branches. Change-Id: If359f0e9a71bca9c2ba685a87a355873536bb282 (cherry picked from commit `005d80cd05`) (cherry picked from commit `08d2f54800`) (cherry picked from commit `4230c2306c`)	2014-12-02 11:16:24 -08:00
Debargha Mukherjee	02355a4abf	Merge "Added highbitdepth sse2 acceleration for quantize"	2014-11-21 16:08:47 -08:00
Peter de Rivaz	a7b2d09f36	Added highbitdepth sse2 acceleration for quantize Also includes block error. (This patch is mostly cherry picked from commit `db7192e0b0`) Change-Id: Idef18f90b111a0d0c9546543d3347e551908fd78	2014-11-19 23:55:19 -08:00
Jingning Han	c42715b721	Enable ssse3 version of vp9_fdct8x8_quant It improves the speed performance of vp9_fdct8x8_quant_sse2 by about 5%. Change-Id: I74b093ba4d81df64caf71ac7693f3d917f673097	2014-11-19 22:14:19 -08:00
Jingning Han	bf63652d34	Merge "Combine fdct8x8 and quantization process"	2014-11-19 11:17:44 -08:00
Jingning Han	ce77a7bcb0	Merge "Add sse2 version for vp9_quantize_fp"	2014-11-19 11:17:36 -08:00
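
Note on commit `ed2dc59c1b` (Integral projection based motion estimation): the message describes projecting a 2-D block and its surrounding region onto horizontal and vertical 1-D vectors, then matching vectors instead of blocks. The following is a minimal Python sketch of that idea only, not the libvpx implementation. The function names (`project`, `vector_match`, `estimate_motion`, `make_demo`), the SAD cost, the exhaustive offset scan, and the toy frame are all assumptions made for illustration; commit `1790d45252` later replaced SAD with a variance metric, which is not shown here.

```python
def project(img, x0, y0, w, h):
    """Return (column sums, row sums) of the w x h window whose top-left is (x0, y0)."""
    cols = [sum(img[y][x] for y in range(y0, y0 + h)) for x in range(x0, x0 + w)]
    rows = [sum(img[y][x] for x in range(x0, x0 + w)) for y in range(y0, y0 + h)]
    return cols, rows


def vector_match(ref_vec, src_vec, search):
    """Scan offsets d in [-search, search]; return the d with the lowest SAD.

    ref_vec must have len(src_vec) + 2 * search entries, so that offset d
    compares src_vec against ref_vec[search + d : search + d + len(src_vec)].
    """
    n = len(src_vec)
    best_d, best_cost = 0, None
    for d in range(-search, search + 1):
        start = search + d
        cost = sum(abs(ref_vec[start + i] - src_vec[i]) for i in range(n))
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def estimate_motion(ref, src, x0, y0, size, search):
    """Estimate (dx, dy) so the src block at (x0, y0) resembles ref at (x0+dx, y0+dy)."""
    src_cols, src_rows = project(src, x0, y0, size, size)
    # Reference projections cover the block plus `search` pixels on each side.
    ref_cols, _ = project(ref, x0 - search, y0, size + 2 * search, size)
    _, ref_rows = project(ref, x0, y0 - search, size, size + 2 * search)
    return (vector_match(ref_cols, src_cols, search),
            vector_match(ref_rows, src_rows, search))


def make_demo(dx=3, dy=-2, dim=32):
    """Hypothetical toy frames: a 3x3 blob in ref, displaced by (dx, dy) in src."""
    ref = [[0] * dim for _ in range(dim)]
    for r in range(3):
        for c in range(3):
            ref[14 + r][15 + c] = 10 + c + 3 * r
    src = [[0] * dim for _ in range(dim)]
    for y in range(dim):
        for x in range(dim):
            ry, rx = y + dy, x + dx
            if 0 <= ry < dim and 0 <= rx < dim:
                src[y][x] = ref[ry][rx]
    return ref, src


ref, src = make_demo()
motion = estimate_motion(ref, src, x0=8, y0=8, size=16, search=4)
print(motion)
```

The two 1-D searches cost far less than a full 2-D block search over the same range, which fits the low overhead the commit message reports for its per-64x64-block search; the trade-off is that the projections only approximate the block when its content changes along the summed direction.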


249 Commits