generic-library/vpx

Author	SHA1	Message	Date
hkuang	e47811ef8f	Merge "Add some sse2 code for intra prediction."	2015-05-05 17:11:07 +00:00
James Zern	670b2c09ce	vp9_idct_intrin_sse2: cosmetics: reindent + fix some whitespace Change-Id: Id61b739282014288a7e5d3c17a9d6448d9d4cda2	2015-05-01 16:07:54 -07:00
James Zern	c77b1f5acd	vp9: RECON_AND_STORE4X4: remove dest offset offsetting by a variable stride prevents instruction reordering, resulting in poor assembly Change-Id: Id62d6b3299cdd23f8c44f97b630abf4fea241446	2015-04-30 19:14:17 -07:00
James Zern	778845da05	vp9_idct_intrin_*: RECON_AND_STORE: remove dest offset offsetting by a variable stride prevents instruction reordering, resulting in poor assembly. additionally reroll 16x16/32x32 loops to reduce register spill with this new format Change-Id: I0635b8ba21ecdb88116e927dbdab53acdf256e11	2015-04-30 19:14:17 -07:00
Yaowu Xu	2061359fcf	Merge "Remove vp9_idct16x16_10_add_ssse3()"	2015-04-30 23:13:33 +00:00
hkuang	493a8579f1	Add some sse2 code for intra prediction. Change-Id: I16c0a62e52dab62837c547345df31e7518620ed4	2015-04-30 15:42:57 -07:00
Yaowu Xu	47767609fe	Remove vp9_idct16x16_10_add_ssse3() The rotation computation using 2X of cos(pi/16) has the potential to overflow 32 bits; this commit disables the function to allow further investigation and optimization. Change-Id: I4a9803bc71303d459cb1ec5bbd7c4aaf8968e5cf	2015-04-30 09:07:30 -07:00
Parag Salasakar	95cb130f32	Merge "mips msa vp9 copy and avg convolve optimization"	2015-04-30 04:39:13 +00:00
Yaowu Xu	d45870be8d	Merge "Disable ssse3 version idct16x16_256_add()"	2015-04-30 03:09:23 +00:00
Yaowu Xu	486a73a9ce	Disable ssse3 version idct16x16_256_add() The version currently produces different results from the C version for some inputs. Disable it for now to allow time to investigate the source of the mismatch. Change-Id: Id039455494ee531db4886a9f1fa4761174ef6df3	2015-04-29 16:58:59 -07:00
Parag Salasakar	2301d10f73	mips msa vp9 copy and avg convolve optimization average improvement ~3x-5x Change-Id: I422e4c33ea7e6d6783ba40029438ccf21b0e76bb	2015-04-29 12:28:17 +05:30
James Zern	f58011ada5	vpx_mem: remove vpx_memset vestigial. replace instances with memset() which they already were being defined to. Change-Id: Ie030cfaaa3e890dd92cf1a995fcb1927ba175201	2015-04-28 20:00:59 -07:00
James Zern	f274c2199b	vpx_mem: remove vpx_memcpy vestigial. replace instances with memcpy() which they already were being defined to. Change-Id: Icfd1b0bc5d95b70efab91b9ae777ace1e81d2d7c	2015-04-28 19:59:41 -07:00
Frank Galligan	2be50a1c9c	Merge "WIP: Use LUT for y_dequant/uv_dequant"	2015-04-28 16:12:10 +00:00
Scott LaVarnway	afcb62b414	WIP: Use LUT for y_dequant/uv_dequant instead of calculating every block. Change-Id: Ib19ff2546be8441f8755ae971ba2910f29412029	2015-04-28 07:52:06 -07:00
Yunqing Wang	297b2b99de	Fix debugmodes file to print modes and MVs correctly This patch fixed the issues in debugmodes file because of the recent changes in MODE_INFO struct. Change-Id: I4df83379ecc887c1f009d4a8329c9809c5b299d6	2015-04-27 17:09:38 -07:00
Parag Salasakar	1c9af9833d	Merge "mips msa vp9 convolve8 horiz optimization"	2015-04-21 22:08:25 -07:00
Johann	931c0a954f	Merge "Rename neon convolve avg file"	2015-04-21 15:45:29 -07:00
Johann	66b9933b8d	Rename neon convolve avg file Some build systems use just the basename for object files. Change-Id: I333e1107ee866f3906cc46476ef8d04c6200a8a0	2015-04-21 14:18:17 -07:00
Scott LaVarnway	8b17f7f4eb	Revert "Remove mi_grid_* structures." (see I3a05cf1610679fed26e0b2eadd315a9ae91afdd6) For the test clip used, the decoder performance improved by ~2%. This is also an intermediate step towards adding back the mode_info streams. Change-Id: Idddc4a3f46e4180fbebddc156c4bbf177d5c2e0d	2015-04-21 11:16:45 -07:00
Parag Salasakar	ca90d4fd96	mips msa vp9 convolve8 horiz optimization average improvement ~6x-8x Change-Id: I7c91eec41aada3b0a5231dda7869b3b968f3ad18	2015-04-21 12:31:26 +05:30
Parag Salasakar	ef51c1ab5b	mips msa vp9 convolve8 hv optimization average improvement ~5x-8x Change-Id: I3214734cb3716e742907ce0d2d7a042d953df82b	2015-04-21 09:17:49 +05:30
Parag Salasakar	2e36149ccd	Merge "mips msa vp9 convolve8 vert optimization"	2015-04-18 23:39:25 -07:00
Parag Salasakar	27d083c1b9	mips msa vp9 convolve8 vert optimization average improvement ~6x-10x Change-Id: Ie3f3ab3a9005be84935919701e56b404e420affa	2015-04-18 08:13:04 +05:30
Marco Paniconi	f76ccce5bc	Revert "Revert "Force_split on 16x16 blocks in variance partition."" This reverts commit 004b9d83e37d355f590a6976a27b7b845d19a869 Change-Id: I2f2d0bdb9368c2c07f1d29a69cd461267a3a8743	2015-04-16 17:52:13 -07:00
Johann	14ef4aeafb	Reorganize *_rtcd() calling conventions Change-Id: Ib1e17d8aae9b713b87f560ab5e49952ee2bfdcc2	2015-04-15 11:12:05 -04:00
Yunqing Wang	004b9d83e3	Revert "Force_split on 16x16 blocks in variance partition." This reverts commit eb8c667570aa83134c7db0690de9dbdde4d90291. The patch caused mismatch while using multi-threads. Change-Id: Icd646340af25b5d91e32f03ed3ea212e00e3e0be	2015-04-14 15:19:31 -07:00
Marco	eb8c667570	Force_split on 16x16 blocks in variance partition. Force split on 16x16 block (to 8x8) based on the minmax over the 8x8 sub-blocks. Also increase variance threshold for 32x32, and add exit condition in choose_partition (with very safe threshold) based on sad used to select reference frame. Some visual improvement near moving boundaries. Average gain in psnr/ssim: ~0.6%, some clips go up ~1 or 2%. Encoding time increase (due to more 8x8 blocks) from ~1-4%, depending on clip. Change-Id: I4759bb181251ac41517cd45e326ce2997dadb577	2015-04-13 12:05:07 -07:00
Parag Salasakar	2f693be8f8	Merge "mips msa vp9 common headers added"	2015-04-09 21:50:15 -07:00
Jingning Han	93d9c50419	Merge "SSSE3 assembly implementation of 8x8 Hadamard transform"	2015-04-09 11:16:11 -07:00
Parag Salasakar	481fb7640c	mips msa vp9 common headers added Change-Id: Ia31ada59172eb1818e1eb91009f83cbb1f581223	2015-04-09 15:35:12 +05:30
Jingning Han	7f629dfca4	SSSE3 assembly implementation of 8x8 Hadamard transform It uses about 10% less CPU cycles than the SSE2 intrinsic implementation. Change-Id: I91017c0c068679a214b98cdd4cff3a6facfb7499	2015-04-04 09:59:37 -07:00
James Zern	44e3640923	Merge "vp9: enable sse4 sad functions"	2015-04-03 14:57:52 -07:00
James Zern	b644384bb5	Merge "vp9: fix high-bitdepth NEON build"	2015-04-01 23:36:17 -07:00
Yaowu Xu	54210f706c	Merge "use MAX_MB_PLANE consistently"	2015-04-01 18:24:39 -07:00
Yaowu Xu	f26b8c84f8	use MAX_MB_PLANE consistently Change-Id: Ic416a7f145001a88f5a7f70dde9b1edbc1b69381	2015-04-01 15:21:20 -07:00
Jingning Han	1470529f62	Refactor block_yrd function for RTC coding mode This commit separates Hadamard transform/quantization operations from rate and distortion computation in block_yrd. This allows one to skip SATD computation when all transform blocks are quantized to zero. It also uses a new block error function that skips repeated computation of sum of squared residuals. It reduces the CPU cycles spent on block error calculation in block_yrd by 40%. Change-Id: I726acb2454b44af1c3bd95385abecac209959b10	2015-04-01 12:00:43 -07:00
James Zern	14e24a1297	vp9: enable sse4 sad functions sse4 isn't set by configure or used in rtcd, correct the sad entries to use sse4_1 without changing the signatures for now. this was done in vp8 post-vp9 branch. Change-Id: Ia9f1fff9f2476fdfa53ed022778dd2f708caa271	2015-03-31 21:00:55 -07:00
James Zern	8845334097	vp9: fix high-bitdepth NEON build remove incorrect specializations in rtcd and update a configuration check in partial_idct_test.cc Change-Id: I20f551f38ce502092b476fb16d3ca0969dba56f0	2015-03-31 17:45:25 -07:00
hui su	d4f2f1dd5b	Merge "Move vp9_coef_con_tree to common/"	2015-03-31 10:51:10 -07:00
Jingning Han	db5ec37edc	Merge "Enable 16x16 Hadamard transform in SATD based mode decision"	2015-03-31 09:55:41 -07:00
hui su	302e24cb3e	Move vp9_coef_con_tree to common/ This tree should be defined in common/, as it is needed for both encoder and decoder. Change-Id: I4f5cbc80025cf2ced14182c98f7c82dc7d0f87db	2015-03-31 09:20:46 -07:00
Jingning Han	26d3d3af6a	Enable 16x16 Hadamard transform in SATD based mode decision This commit replaces the 16x16 2D-DCT transform with Hadamard transform for RTC coding mode. It reduces the CPU cycles cost on 16x16 transform by 5X. Overall it makes the speed -6 encoding speed 1.5% faster without compromise on compression performance. Change-Id: If6c993831dc4c678d841edc804ff395ed37f2a1b	2015-03-30 15:43:31 -07:00
Jingning Han	f0ac5aaa08	Merge "Hadamard transform based coding mode decision process"	2015-03-30 15:43:15 -07:00
Jingning Han	8c411f74e0	Hadamard transform based coding mode decision process This commit uses Hadamard transform based rate-distortion cost estimate for rtc coding mode decision. It improves the compression performance of speed -6 for many hard clips at lower bit-rates. For example, 5.5% for jimredvga, 6.7% for mmmoving, 6.1% for niklas720p. This will introduce extra encoding cycle costs at this point. Change-Id: Iaf70634fa2417a705ee29f2456175b981db3d375	2015-03-30 14:46:05 -07:00
jackychen	68610ae568	vp9_postproc.c: eliminate -Wshadow build warnings. Change-Id: I6df525a9ad1ae3cfbba8710d21db8fee76e64dbb	2015-03-27 20:27:30 -07:00
Alex Converse	a1e20ec58f	Refactor fast loop filter code to handle 444. Change-Id: I921b1ebabdf617049f8fa26fbe462c3ff115c1ce	2015-03-24 11:17:50 -07:00
hkuang	9f4f98fdbd	Merge "Optimize the intra frame decode to skip some unnecessary copy."	2015-03-23 16:50:37 -07:00
hkuang	85107641a4	Optimize the intra frame decode to skip some unnecessary copy. This speeds up a normal YT style 1080P clip decode by ~1% on nexus 7. Change-Id: Ied7fa0d8bc941b2adb4db9382f549ee4d5654f3a	2015-03-23 10:11:49 -07:00
hkuang	b88dac8938	Safely free all the frame buffers after all the workers finish the work. Issue: 978 Change-Id: Ia7aa809095008f6819a44d7ecb0329def79b1117	2015-03-19 12:21:00 -07:00

2769 Commits