Commit Graph

2294 Commits

Author SHA1 Message Date
Dmitry Kovalev
724fefb4cf Cleaning up vp9_get_pred_context_single_ref_p1().
Change-Id: I279343b474d7ff41afcf8f1493b6fbf716b51823
2014-02-05 11:48:01 -08:00
Dmitry Kovalev
a536237228 Merge "Cleaning up vp9_get_pred_context_single_ref_p2()." 2014-02-05 11:37:17 -08:00
Martin Storsjo
03bc491721 arm: Consistently use braces around doubleword arguments to vld
This isn't strictly necessary, but makes the file more consistent
with the other arm assembly source files.

Change-Id: I245c9677d89e0ab3f31991e473764858af35b180
2014-02-05 13:24:25 +02:00
Martin Storsjo
c2bb1aa544 arm: Use {} around quadword arguments to vld
This fixes building for iOS.

Change-Id: Ice082648c02a3faf93891f7ddc122875e2bdc9cb
2014-02-05 13:24:17 +02:00
James Zern
d89f861f4b vp9_systemdependent.h: relocate system includes
avoid wrapping msvc includes with extern "C"; this breaks some visual
studio builds of the (c++) tests.

Change-Id: Ie8062d55d4f4c049f6cd360a36da6a67607df132
2014-02-04 18:28:45 -08:00
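The include ordering this commit describes can be sketched as below. This is a minimal illustration, not the contents of vp9_systemdependent.h: the guard name and the helper `example_strnlen` are hypothetical, and the point is only that system headers sit outside the `extern "C"` block.

```c
#ifndef EXAMPLE_SYSTEMDEPENDENT_H_
#define EXAMPLE_SYSTEMDEPENDENT_H_

/* System headers come first, outside any extern "C" block, so a C++
 * compiler sees their declarations with the linkage they expect. */
#include <stddef.h>
#include <string.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical helper: only the project's own declarations are wrapped. */
static inline size_t example_strnlen(const char *s, size_t max_len) {
  const char *end = (const char *)memchr(s, '\0', max_len);
  return end ? (size_t)(end - s) : max_len;
}

#ifdef __cplusplus
}  /* extern "C" */
#endif

#endif  /* EXAMPLE_SYSTEMDEPENDENT_H_ */
```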
Dmitry Kovalev
c31cf0d647 Merge "Moving x1 & y1 calculation under if condition." 2014-02-04 14:50:25 -08:00
hkuang
b0fec6ab4a With on demand border extension, clamping the MV
is no longer needed.

Change-Id: I40c37ef18c67ab27fc336694dfca3c43a87c47ca
2014-02-04 13:57:40 -08:00
Yunqing Wang
d1961e6fbf Optimize bilinear sub-pixel filters in ssse3
This patch added ssse3 optimization of bilinear sub-pixel filters.
The real-time encoder was sped up by ~1%.

Change-Id: Ie82e98976f411183cb8c61ab8d2ba0276e55a338
2014-02-04 08:01:55 -08:00
James Zern
2b7338aca4 Merge "vp9_filter.h: rename interp_kernel type" 2014-02-03 23:12:28 -08:00
Dmitry Kovalev
5daaff527e Moving x1 & y1 calculation under if condition.
Change-Id: Iae787d491f7cfe24855ef8f2d04e2c6c19350378
2014-02-03 18:03:17 -08:00
Dmitry Kovalev
64cca45c1d Cleaning up vp9_get_pred_context_single_ref_p2().
Change-Id: I294075acd3073c41e153079ff4462816898b3778
2014-02-03 17:46:34 -08:00
James Zern
cca4276dac vp9_filter.h: rename interp_kernel type
-> InterpKernel
avoids conflicts in variable names, fixing the build with various
toolchains.

broken since:
8691565 Removing subpix_fn_table struct.

Change-Id: Ib5f6fdbcb494a97b62c75b99d4d826ff25d4c981
2014-02-03 16:48:38 -08:00
Alex Converse
be1b41673f Merge "INLINE and reimplement get_unsigned_bits()." 2014-02-03 16:26:33 -08:00
Dmitry Kovalev
220b8f8644 Encoder quantization cleanup.
Change-Id: I633205c95f0e81ce0589580501d0be4425a3cb8e
2014-02-03 14:57:28 -08:00
Dmitry Kovalev
282f36adc4 Merge "Removing "_short" suffix from arm transform file names." 2014-02-03 14:28:47 -08:00
Alex Converse
ffd3d4834b INLINE and reimplement get_unsigned_bits().
The new implementation disagrees when the argument is equal to 2**n but
that is never called in practice and based on how it is used the new
implementation is correct in that case.

Change-Id: Ifbac4ad87d459fe6bd2fd0f400c0340f96617342
2014-02-03 12:16:22 -08:00
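The disagreement at powers of two can be illustrated with a pair of implementations consistent with the commit message. Both function bodies and all three names are assumptions for illustration, not code taken from the repository: one counts the bits needed for `num_values - 1` with a loop, the other takes the position of the most significant bit of `num_values` itself.

```c
#include <assert.h>

/* Index of the most significant set bit; n must be non-zero. */
static int msb_index(unsigned int n) {
  int log = 0;
  assert(n != 0);
  while ((n >>= 1) != 0) ++log;
  return log;
}

/* Loop-based form (assumed): bits needed to represent num_values - 1. */
static int unsigned_bits_loop(unsigned int num_values) {
  int cat = 0;
  if (num_values <= 1) return 0;
  num_values--;
  while (num_values > 0) {
    cat++;
    num_values >>= 1;
  }
  return cat;
}

/* MSB-based form (assumed): bits needed to represent num_values itself. */
static int unsigned_bits_msb(unsigned int num_values) {
  return num_values > 0 ? msb_index(num_values) + 1 : 0;
}
```

Under these assumptions the two forms agree for arguments such as 3, 5, 6 and 7, and differ for powers of two such as 4, where the loop form gives 2 and the MSB form gives 3.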
Yunqing Wang
2488cb34bc Optimize bilinear sub-pixel filters in sse2
Using bilinear filters could speed up the codec in real-time mode.
This patch added sse2 optimizations of bilinear filters that
operate on different-sized blocks.

Tests showed that the real-time encoder was sped up by 3%.

Change-Id: If99a7ee4385fcc225c3ee7445d962d5752e57c3f
2014-02-03 10:34:45 -08:00
Marco Paniconi
6be2b750b8 Layer based rate control for CBR mode.
This patch adds a buffer-based rate control for temporal layers,
under CBR mode.

Added vpx_temporal_scalable_patterns.c encoder for testing temporal
layers, for both vp9 and vp8 (replaces the old vp8_scalable_patterns).

Updated datarate unittest with tests for temporal layer rate-targeting.

Change-Id: I9cb6cce2494390ae6096ee17774af7fb9308bde7
2014-02-02 14:30:43 -08:00
Jim Bankoski
9dec7712ab static function convert to inline or global vp9_blockd.h
Change-Id: Ifdd951f24932839f06d1c700371662511dde6ebe
2014-01-31 19:50:40 -08:00
Yunqing Wang
7c6a49bada Merge "Rename a loopfilter parameter" 2014-01-31 18:33:33 -08:00
Dmitry Kovalev
c2ca97caaf Merge "Cleaning up motion compensation code." 2014-01-31 17:33:40 -08:00
Dmitry Kovalev
c49b08c9a1 Removing "_short" suffix from arm transform file names.
Change-Id: Iefe118f61a335e88821a21a9f50fb919212c1507
2014-01-31 17:19:02 -08:00
Dmitry Kovalev
6e4a03e844 Removing "_1d" suffix from mips transform code.
Unifying transform function names across libvpx; 1d is a redundant suffix.

Change-Id: I077c19f3bc7d4842ed7ca5814d77b3dce1728e13
2014-01-31 17:05:03 -08:00
Yunqing Wang
11a9366e3b Rename a loopfilter parameter
As pointed out by Dmitry and James, "partial" is a Microsoft-specific
C++ keyword, so the parameter is renamed.

Change-Id: Ia0fc11ceb89e54b3195287f89f7e26edbbe9beb8
2014-01-31 16:30:04 -08:00
Dmitry Kovalev
88340b173b Merge "Combining fb_idx_ref_cnt[] and yv12_fb[] arrays." 2014-01-31 15:55:04 -08:00
Dmitry Kovalev
a8a2f22958 Merge "Renaming "mbskip" to "skip"." 2014-01-31 15:52:35 -08:00
Yunqing Wang
903801f1ef vp9 decoder: row-based multi-threaded loopfilter
Implemented parallel loopfiltering, which uses existing tile-
decoding threads. Each thread works on one row, and when that row
is loopfiltered, it moves to the next unattended row. To ensure the
correct filtering order, threads are synchronized and one
superblock is filtered only if the superblocks it depends on are
filtered already.

To reduce synchronization overhead and speed up the decoder, we use
nsync > 1 for high resolution.

Performance tests:
1. on desktop:
8-tile 4k video using 8 threads, speedup: 70% - 80%
4-tile HD video using 4 threads, speedup: ~35%
2. on mobile device(Nexus 7):
4-tile 1080p video using 4 threads, speedup: 18% - 25%
4-tile 1080p video using 2 threads, speedup: 10% - 15%

Change-Id: If54b4a11960dd706c22d5ad145ad94156031f36a
2014-01-31 14:44:53 -08:00
Yaowu Xu
96dc80da61 Merge "create super fast rtc mode" 2014-01-29 16:36:20 -08:00
Dmitry Kovalev
b107f2c470 Renaming "mbskip" to "skip".
Change-Id: I27a30b43eae026a77f92958e2238d02d9cdf7832
2014-01-29 14:48:42 -08:00
Dmitry Kovalev
5670f1e2a8 Merge "Finally removing vp9_setup_interp_filters() function." 2014-01-29 12:54:21 -08:00
Dmitry Kovalev
6332063475 Combining fb_idx_ref_cnt[] and yv12_fb[] arrays.
Adding new RefCntBuffer struct which contains reference counter and image
buffer.

Change-Id: I71c1f532faa13442c32c43fc03ec45b6f88fb844
2014-01-29 12:48:01 -08:00
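The idea of pairing each image buffer with its reference counter can be sketched as follows. Only the name RefCntBuffer and its two roles (counter plus image buffer) come from the commit message; the `ImageBuffer` stand-in, the field names, and both helper functions are hypothetical.

```c
#include <stddef.h>

/* Stand-in for the real image buffer type (hypothetical, simplified). */
typedef struct {
  unsigned char *data;
  int width;
  int height;
} ImageBuffer;

/* One pool entry: the reference counter travels with its image buffer,
 * instead of living in a separate parallel array. */
typedef struct {
  int ref_count;
  ImageBuffer buf;
} RefCntBuffer;

/* Hypothetical helper: return the index of a free entry and take a
 * reference on it, or -1 if every entry is in use. */
static int get_free_buffer(RefCntBuffer *pool, int pool_size) {
  int i;
  for (i = 0; i < pool_size; ++i) {
    if (pool[i].ref_count == 0) {
      pool[i].ref_count = 1;
      return i;
    }
  }
  return -1;
}

/* Hypothetical helper: drop one reference if any are held. */
static void release_buffer(RefCntBuffer *pool, int idx) {
  if (pool[idx].ref_count > 0) --pool[idx].ref_count;
}
```

With a single array of such entries, a lookup by index yields both the counter and the image, so the two can no longer drift out of step.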
Dmitry Kovalev
b00eb5c464 Finally removing vp9_setup_interp_filters() function.
Change-Id: If446225afbb49f6033c2a4516a37c377de6f70f7
2014-01-29 11:29:34 -08:00
Jim Bankoski
ea8aaf15b5 create super fast rtc mode
This patch only works if the video width and height are both multiples
of 32. It sets every partition to 16x16, and does INTRADC only on the
first frame and ZEROMV on every other frame. It always does the largest
possible transform, and the loop filter level is set to 4.

Was ~20% faster than speed -5 of vp8

Now 20% slower but adds motion search (every block), nearest, near
and zeromv

The SVC test was changed because - while this realtime mode produces
bad quality albeit quickly, it isn't obeying all the rules it should
about which frames are available.

Change-Id: I235c0b22573957986d41497dfb84568ec1dec8c7
2014-01-29 08:39:39 -08:00
Yunqing Wang
3c29cbffbf Add macros for convolve functions
Added macros to reduce the code duplication.

Change-Id: I1916aa5a386ea07d961d4ec439ab09bb8c45487d
2014-01-28 18:40:23 -08:00
Dmitry Kovalev
b098c04290 Merge "Decoupling set_ref_ptrs() and vp9_setup_interp_filters()." 2014-01-28 10:37:58 -08:00
Dmitry Kovalev
4ce35d8f2d Merge "Removing _1d suffix from transform names." 2014-01-28 10:37:26 -08:00
hkuang
af87148a22 Merge "Add vp9_tm_predictor_32x32 neon implementation which is 7.8 times faster than C." 2014-01-28 09:57:08 -08:00
Dmitry Kovalev
ff41764920 Removing _1d suffix from transform names.
It is enough to specify (e.g.) idct16, it is obviously different from
idct16x16.

Change-Id: I6b408a37a945de3162429380b59a775b03b95db0
2014-01-27 16:15:36 -08:00
hkuang
770454f3a8 Add vp9_tm_predictor_32x32 neon implementation
which is 7.8 times faster than C.

Change-Id: I858ef4ec09202a07d445da8db702783d6d9d7321
2014-01-27 16:01:07 -08:00
Dmitry Kovalev
e5b31a1d8c Decoupling set_ref_ptrs() and vp9_setup_interp_filters().
Change-Id: I8d17867a4772554cbba2bd113cc5b4c99d50146d
2014-01-27 16:00:20 -08:00
Dmitry Kovalev
b2f0ae65c7 Merge "Removing subpix_fn_table struct." 2014-01-27 10:42:42 -08:00
hkuang
05d2081d38 Fix the vp9_tm_predictor_8x8_neon.
Change-Id: I832cf83871044bfee7b7e57dbd31bae05cbd53e9
2014-01-27 10:17:20 -08:00
Dmitry Kovalev
8691565441 Removing subpix_fn_table struct.
We don't use different filter kernels for x and y, it is always one kernel
for both directions.

Change-Id: Iefcbb02ec74bf46ea20d9dca672a3efd5d631517
2014-01-24 17:06:26 -08:00
Dmitry Kovalev
f9f936b82f Merge "Renaming INTERPOLATION_TYPE to INTERP_FILTER." 2014-01-24 16:52:10 -08:00
Frank Galligan
183361dadb Merge "Optimize vp9_tm_predictor_8x8_neon function" 2014-01-24 16:21:56 -08:00
Dmitry Kovalev
4264c93844 Renaming INTERPOLATION_TYPE to INTERP_FILTER.
Corresponding renames:
  subpel_kernel              => interp_kernel
  vp9_get_filter_kernel()    => vp9_get_interp_kernel()
  pred_filter_type           => pred_interp_filter
  adaptive_pred_filter_type  => adaptive_pred_interp_filter
  mcomp_filter_type          => interp_filter
  read_interp_filter_type()  => read_interp_filter()
  write_interp_filter_type() => write_interp_filter()
  fix_mcomp_filter_type()    => fix_interp_filter()

Change-Id: I1fa61fa1dc81ebbf043457c3ee2d8d4515bee6d3
2014-01-24 15:57:28 -08:00
Dmitry Kovalev
03eb63c114 Merge "Removing MODE_STATS." 2014-01-24 15:53:12 -08:00
Frank Galligan
c6d537155c Merge "Revert external frame buffer code." 2014-01-24 11:31:23 -08:00
Frank Galligan
56a8a0b54b Optimize vp9_tm_predictor_8x8_neon function
Change-Id: Ia12aae491202098ff66366145aa0c3da38dc97e5
2014-01-24 11:07:14 -08:00
hkuang
92ab96a7ae Merge "Add vp9_tm_predictor_16x16 neon implementation which is 3.5 times faster than C." 2014-01-24 10:48:44 -08:00
James Zern
26c88ec14e Merge changes I826655a7,I5164df72,Iba9b198c,Ide9a6846,I4f51ce85,I0e6aa00f,Ic334da9a,I252f5f8a,I7865db2d,I13b434b1
* changes:
  test/: remove unnecessary extern "C"s
  top-level: add extern "C" to headers
  vpx_ports: add extern "C" to headers
  vpx: add extern "C" to headers
  vp9/encoder: add extern "C" to headers
  vp9/decoder: add extern "C" to headers
  vp9/common: add extern "C" to headers
  vp8/encoder: add extern "C" to headers
  vp8/decoder: add extern "C" to headers
  vp8/common: add extern "C" to headers
2014-01-24 10:47:00 -08:00
hkuang
3633ffcbf7 Add vp9_tm_predictor_16x16 neon implementation
which is 3.5 times faster than C.

Change-Id: I24439ba7a2971829c11620f34848facf2c916678
2014-01-24 10:22:58 -08:00
Frank Galligan
b1c72b633e Revert external frame buffer code.
A future CL will add external frame buffers
differently.

Squash commit of four revert commits:
Revert "Increase required number of external frame buffers"

This reverts commit 9e41d569d7.

Revert "Add external constants."

This reverts commit bbf53047b0.

Revert "Add frame buffer lru cache."

This reverts commit fbada948fa.

Conflicts:
	vpxdec.c

Change-Id: I76fe42419923a6ea6c75d9997cbbf941d73d3005

Revert "Add support to pass in external frame buffers."

This reverts commit 10f891696b.

Conflicts:
	test/external_frame_buffer_test.cc
	vp9/common/vp9_alloccommon.c
	vp9/common/vp9_reconinter.c
	vp9/decoder/vp9_decodeframe.c
	vp9/encoder/vp9_onyx_if.c
	vp9/vp9_dx_iface.c
	vpx/vpx_decoder.h
	vpx/vpx_external_frame_buffer.h
	vpx_scale/generic/yv12config.c
	vpxdec.c

Change-Id: I7434cf590f1c852b38569980e4247fad0d939c2e
2014-01-24 10:10:20 -08:00
Adrian Grange
8b0537f631 Merge changes I24ad1f0f,I33be1366
* changes:
  Reorder functions to avoid forward declaration
  Rename set_scale_factors as set_ref_ptrs
2014-01-24 08:38:52 -08:00
Dmitry Kovalev
6c98df29e4 Cleaning up motion compensation code.
Change-Id: I74cf028e8c732cd0dbc070326152d3085b824a80
2014-01-23 17:15:30 -08:00
James Zern
0940c9cfde vp9/common: add extern "C" to headers
Change-Id: Ic334da9aee968e33762c2b25d9fbad24c844b411
2014-01-23 16:21:24 -08:00
Dmitry Kovalev
5f75fda9e9 Merge "Cleaning up vp9_refining_search_sad() function." 2014-01-22 17:15:22 -08:00
hkuang
97826df96b Add tm_predictor_8x8 neon implementation.
Change-Id: I76c2720546b737cb63018a8ab6a3ff62a291786d
2014-01-22 13:43:20 -08:00
Adrian Grange
e37eb0ade7 Rename set_scale_factors as set_ref_ptrs
New name better describes what the function does.

Change-Id: I33be1366a81f058a9854b804bcde211061187dc7
2014-01-22 13:04:30 -08:00
Johann
4e9dc6d45d Merge "Match vp9_coefband_trans_* declarations" 2014-01-22 11:10:51 -08:00
Johann
6c492fc2f9 Match vp9_coefband_trans_* declarations
VS2013 Chromium builds failed with:
warning C4742: 'vp9_coefband_trans_8x8plus' has different alignment in

https://code.google.com/p/chromium/issues/detail?id=336620

Change-Id: I865f72bc23ae958531eeb5f497002c12e9a36fcd
2014-01-21 17:07:23 -08:00
hkuang
437004c710 Separate the border size for encoder and decoder.
Encoder's border is still 160, while decoder's border will be 32.
With on-demand border extension and a separate buffer for it, the
decoder's border does not need to be 160 anymore.

Change-Id: I93d5aaff15a33a2213e9761eaa37c5f2870747db
2014-01-21 15:28:41 -08:00
Dmitry Kovalev
a001016996 Removing MODE_STATS.
Change-Id: I7520e1cc82b749187c9445356dd7b54f3f3826cc
2014-01-17 17:30:22 -08:00
Jingning Han
b461c0884e Deprecate best_mv from encoder
This commit deprecates the use of best_mv from encoding and bit-stream
writing stages. It hence removes the definition from MACROBLOCKD.

Change-Id: I8e5302775a2aa4a18900726df407bff881f2dfb1
2014-01-17 17:15:34 -08:00
hkuang
671df8486d Merge "Use a temp buffer for reconstruction when reference buffer is out of border." 2014-01-17 16:17:36 -08:00
hkuang
7459fee8c6 Use a temp buffer for reconstruction when
reference buffer is out of border.

Change-Id: Ic7ad136e54a4d68abe0fd4345146a86b0ba824e1
2014-01-17 16:15:54 -08:00
Dmitry Kovalev
d8bfe9e24c Cleaning up vp9_refining_search_sad() function.
Change-Id: I660b53da8ebf3049832ce8a10721051c4e0ebb00
2014-01-17 15:20:28 -08:00
Dmitry Kovalev
ac40c87f68 Removing unused vp9_yv12_copy_partial_frame() function.
Change-Id: I3149e562fe9500914f67b6f908283edcdc381ac6
2014-01-16 18:16:34 -08:00
Yunqing Wang
d2bb0c51d3 Revert "Revert "Revert "SSSE3 convolution optimization"""
This reverts commit f9404f2406.

This patch caused some ASAN error.

Change-Id: If15b7e581310e19061d111c69f2931809662ed19
2014-01-16 16:11:46 -08:00
hkuang
2a2d8c140f Merge "Add vp9_tm_predictor_4x4 neon implementation" 2014-01-16 10:18:12 -08:00
Dmitry Kovalev
67e4ca2a1a Merge "Cleaning up postproc code." 2014-01-15 16:23:54 -08:00
Yaowu Xu
056db03d17 Merge "Revert "Revert "SSSE3 convolution optimization""" 2014-01-15 15:03:25 -08:00
Deb Mukherjee
8ce5f68fe4 Merge "Rearranges the END_USAGE typedef" 2014-01-15 14:01:30 -08:00
hkuang
f2ef389256 Add vp9_tm_predictor_4x4 neon implementation
Change-Id: I10c423bde7ea5a3bac9f14f35c73b6bc31c8f3e3
2014-01-15 11:51:36 -08:00
Deb Mukherjee
f32106951a Rearranges the END_USAGE typedef
Rearranges the END_USAGE typedef to make it compatible with the
vpx user input.

Change-Id: Ic9fa9e9edbee7c0ad01e12e685b219582fcecd16
2014-01-15 10:10:23 -08:00
Adrian Grange
c3011e6f90 Delete outdated comment & tidy-up others
Change-Id: I83031180723ee59270ec8fb66b2f73c0796bee25
2014-01-15 09:53:03 -08:00
Dmitry Kovalev
a540f8a0b0 Cleaning up postproc code.
Change-Id: I7e53f6345a4cf89309262f50850c9ad08ed3c527
2014-01-14 15:49:19 -08:00
Yunqing Wang
f9404f2406 Revert "Revert "SSSE3 convolution optimization""
This reverts commit b645257121.

Change-Id: I60d1bf57ae8e9eb6127f42f2d5a780124ac51b45
2014-01-13 12:29:55 -08:00
James Zern
f83c12b540 Merge "cosmetics: vp9_reconinter.h: make some variables const" 2014-01-11 12:39:32 -08:00
Dmitry Kovalev
96be0a50ab Removing mi_height_log2_lookup table.
Change-Id: I1f0ae2edc3a96b33c0494d165ae756a8feba6184
2014-01-10 13:29:47 -08:00
Paul Wilkins
b645257121 Revert "SSSE3 convolution optimization"
This reverts commit 511d218c60.

In current form intrinsics break borg build.

Change-Id: Ied37936af841250ecff449802e69a3d3761c91b9
2014-01-10 13:38:26 +00:00
Jingning Han
a4c94a94cc Merge "Optimize inv 16x16 DCT with 10 non-zero coeffs - P2" 2014-01-09 18:17:25 -08:00
Jingning Han
faa2ba86cc Merge "Optimize inv 16x16 DCT with 10 non-zero coeffs - P1" 2014-01-09 18:17:12 -08:00
Dmitry Kovalev
c8e8d3a461 Merge "Renaming 'Sharpness' to 'sharpness'." 2014-01-09 13:42:55 -08:00
Jingning Han
af31b27aae Optimize inv 16x16 DCT with 10 non-zero coeffs - P2
This commit further optimizes SSE2 operations in the second 1-D
inverse 16x16 DCT, with (<10) non-zero coefficients. The average
runtime of this module goes down from 779 cycles -> 725 cycles.

Change-Id: Iac31b123640d9b1e8f906e770702936b71f0ba7f
2014-01-09 12:46:09 -08:00
Yunqing Wang
f3b9b97c0e Merge "SSSE3 convolution optimization" 2014-01-09 12:39:47 -08:00
levytamar82
511d218c60 SSSE3 convolution optimization
Optimizing all SSSE3 assembly for convolution:
1. vp9_filter_block1d4_h8_sse2
2. vp9_filter_block1d8_h8_sse2
3. vp9_filter_block1d16_h8_sse2
4. vp9_filter_block1d4_v8_sse2
5. vp9_filter_block1d8_v8_sse2
6. vp9_filter_block1d16_v8_sse2
My optimizations include:
- processing 2x8 elements in one 128-bit register instead of processing
  8 elements in one 128-bit register.
- removing unnecessary loads.
This optimization gives a 2.4% user-level gain for 480p input
and a 1.6% user-level gain for 720p.
This optimization is done only for 64-bit.

Change-Id: Icb586dc0c938b56699864fcee6c52fd43b36b969
2014-01-09 12:27:51 -07:00
Dmitry Kovalev
4fbe54d201 Merge "Renaming 'Mode' to 'mode'." 2014-01-08 16:29:29 -08:00
Jingning Han
ba6ab46cdc Optimize inv 16x16 DCT with 10 non-zero coeffs - P1
This commit is the first patch optimizing SSE2 implementation of inverse
16x16 DCT with <10 non-zero coefficients. It focused on the first 1-D (row)
transformation. It exploits the fact that only top-left 4x4 block contains
non-zero coefficients, in a 2-D inverse 16x16 DCT with <10 coefficients.

The average runtime of idct16x16_10 unit is reduced from
883 cycles -> 779 cycles (12% faster).

For pedestrian_area_1080p 300 frames at 4000 kbps, the speed 2 runtime goes
down from 310651 ms  -> 305910 ms. The decoding speed goes up from
80.37 fps -> 80.87 fps.

Change-Id: Ic6f3ac5a637a76c07ba73ddaafe318a699fea645
2014-01-08 15:36:45 -08:00
Alex Converse
8fcb74e6bb Merge "Add a C fallback for get_msb() and change inline to INLINE." 2014-01-08 14:43:46 -08:00
hkuang
5be0ed30dc Merge "Add initial intra frame neon optimization. 1~2% gain." 2014-01-08 14:41:43 -08:00
Dmitry Kovalev
962c8b241e Renaming 'Mode' to 'mode'.
Change-Id: I6cdd670d66288dbd66228f38bba6b30502d25362
2014-01-08 14:33:59 -08:00
Dmitry Kovalev
57be81369a Renaming 'Sharpness' to 'sharpness'.
Change-Id: I54513dc3b3321e0c0bb6b15ea5c34085ed80b4a4
2014-01-08 14:19:14 -08:00
Alex Converse
ce7ff3b63d Add a C fallback for get_msb() and change inline to INLINE.
For systems without __builtin_clz() or _BitScanReverse(), taken from libwebp

Change-Id: Iead257efc1772c466c79e1dc0356ed571d38d43e
2014-01-08 12:25:47 -08:00
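A portable fallback of the kind this commit describes can be sketched as a five-step binary search over the bit positions. This is an illustration under stated assumptions rather than the committed code: the names `msb_fallback` and `msb` are hypothetical, `unsigned int` is assumed to be 32 bits wide, and the argument must be non-zero.

```c
#include <assert.h>

/* Portable fallback: index of the most significant set bit of n.
 * Tries shifts of 16, 8, 4, 2 and 1; each successful shift adds to the
 * result. Assumes a 32-bit unsigned int and n != 0. */
static int msb_fallback(unsigned int n) {
  int log = 0;
  int i;
  assert(n != 0);
  for (i = 4; i >= 0; --i) {
    const int shift = 1 << i;
    const unsigned int x = n >> shift;
    if (x != 0) {
      n = x;
      log += shift;
    }
  }
  return log;
}

/* Use the compiler builtin where available, else the fallback above. */
static int msb(unsigned int n) {
#if defined(__GNUC__)
  assert(n != 0);
  return 31 ^ __builtin_clz(n);
#else
  return msb_fallback(n);
#endif
}
```

For example, `msb_fallback(255)` is 7 and `msb_fallback(256)` is 8, and on compilers that define `__GNUC__` the builtin path returns the same values.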
hkuang
691111aacf Add initial intra frame neon optimization. 1~2% gain.
More intra optimizations will be added.

Change-Id: I33ae8d93f6002bf7b64cc2669602d9e6bfa5a6e8
2014-01-08 11:58:42 -08:00
Yunqing Wang
a84029ad9c Merge "AVX2 Variance Optimization" 2014-01-08 11:33:42 -08:00
levytamar82
357b65369f AVX2 Variance Optimization
Optimizing the variance functions vp9_variance16x16, vp9_variance32x32,
vp9_variance64x64, vp9_variance32x16, vp9_variance64x32 and
vp9_mse16x16 by migrating to AVX2.
Some of the functions were optimized by processing 32 elements instead of 16.
Some of the functions were optimized by processing 2 loop strides of 16
elements in a single 256-bit register.
This optimization gives a 2.4% - 2.7% user-level performance gain
and a 42% function-level gain.

Change-Id: I265ae08a2b0196057a224a86450153ef3aebd85d
2014-01-08 12:05:53 -07:00
Alex Converse
f2ca665f1c Replace RD modeling with a fixed point approximation.
Change-Id: I44eb44eb3f36c05d916ef140ef42cc84f72f99ec
2014-01-08 10:37:24 -08:00
Dmitry Kovalev
bbb25e6a39 Merge "Adding RefBuffer struct." 2014-01-06 14:19:44 -08:00
Jingning Han
b49e9fb433 Merge "Tune IDCT8_1D macro function interface" 2014-01-06 09:38:19 -08:00
Dmitry Kovalev
0c5575fe57 Merge "Moving hev mask calculation into filter4() function." 2014-01-03 15:56:16 -08:00
Jingning Han
3e0c62b53f Tune IDCT8_1D macro function interface
This commit adds input/output ports for IDCT8_1D macro function to
provide more flexibility in variable use. This allows several
buffer swap operations to be skipped.

Change-Id: I21f3450509537322293043b3281bfd3949868677
2014-01-03 15:23:47 -08:00
Dmitry Kovalev
ba41e9d459 Adding RefBuffer struct.
Adding RefBuffer to simplify reference buffer management. The struct has a
pointer to image data and scale factors relative to the current frame.

Change-Id: If38eb1491ff687cc11428aee339f3e052e2c5d9e
2014-01-03 15:21:55 -08:00
Jingning Han
0b1a27135a Reduce num of buffer swap calls in idct8_1d_sse2
This commit merges the initial buffer swap operations in idct8_1d_sse2
into the array transpose step, hence reducing number of instructions
therein.

Change-Id: I219f6f50813390d2ec3ee37eecf2a4a2b44ae479
2014-01-03 12:12:03 -08:00
Jingning Han
1bb11781e2 Rework idct8x8_10 SSE2 implementation
This commit optimizes the SSE2 implementation of idct8x8_10. It exploits
the fact that only top-left 4x4 block contains non-zero coefficients,
and hence reduces the instructions needed.

The runtime of idct8x8_10_sse2 goes down from 216 to 198 CPU cycles,
estimated by averaging over 100000 runs. For pedestrian_area_1080p 300
frames coded at 4000kbps, the average decoding speed goes up from
79.3 fps to 79.7 fps.

Change-Id: I6d277bbaa3ec9e1562667906975bae06904cb180
2014-01-03 12:04:09 -08:00
Yaowu Xu
8458c8c450 Merge "Fix show existing frame" 2014-01-02 09:27:28 -08:00
Dmitry Kovalev
f3beca079c Merge "Calculating has_second_ref only once for single_ref context." 2013-12-26 13:41:02 -08:00
Dmitry Kovalev
1e8b5bf4ac Merge "Removing vp9_findnearmv.{h, c} files." 2013-12-26 13:38:38 -08:00
James Zern
44963dfd37 cosmetics: vp9_reconinter.h: make some variables const
Change-Id: If5cd0a1487e97c8e9d13dc2e078c6dceaf79de4f
2013-12-26 14:02:46 -05:00
Dmitry Kovalev
87440aeb82 Moving MAX_PROB constant to vp9_prob.h.
Change-Id: I07470ad1b7a0344d088911428ffab8ba9a0d8708
2013-12-20 15:56:59 -08:00
Dmitry Kovalev
b3b9f4a4d0 Merge "Using single struct to represent scale factors." 2013-12-20 11:22:02 -08:00
Yunqing Wang
b6a0ac11f0 Merge "Code clean up" 2013-12-20 08:46:11 -08:00
Dmitry Kovalev
987810ad95 Removing vp9_findnearmv.{h, c} files.
Moving all code from those files to vp9_mvref_common.{h, c}.

Change-Id: Ibc4afcb8cea6847166ff411130e93611ebe63b20
2013-12-19 17:39:57 -08:00
Dmitry Kovalev
a3fbcc88bb Using single struct to represent scale factors.
Moving back to scale_factors struct. We no longer need x_offset_q4 and
y_offset_q4 because both values are calculated locally inside the
vp9_scale_mv function.

Change-Id: I78a2122ba253c428a14558bda0e78ece738d2b5b
2013-12-19 16:06:33 -08:00
Dmitry Kovalev
c872d2be65 Call set_scaled_offsets() just before scale_mv() call.
Before mv scaling it is required to calculate x_offset_q4/y_offset_q4
by calling set_scaled_offsets(). Now offset configuration can not be
missed because it happens just before scale_mv().

Change-Id: I7dd1a85b85811a6cc67c46c9b01e6ccbbb06ce3a
2013-12-19 14:55:13 -08:00
Yunqing Wang
09faf55916 Code clean up
Removed unused filter coefficients.

Change-Id: Ib395a51305e23ff41ab69c1808d56946d25961cd
2013-12-19 11:09:23 -08:00
Dmitry Kovalev
c67ee5ea24 Merge "Converting vp9_treecoder.h to vp9_prob.{h, c}" 2013-12-19 11:03:30 -08:00
Marco Paniconi
02d5ebcfdc Merge "Updates for 1-pass CBR rate control." 2013-12-18 10:28:33 -08:00
Marco Paniconi
1b8b8b0d0d Updates for 1-pass CBR rate control.
Adjustments based on buffer level, frame dropper.

Change-Id: Iaa85b570493526a60c4b9fb7ded4c0226b1b3a33
2013-12-18 09:24:24 -08:00
Jim Bankoski
9d754dcca8 Merge "rename loop filter functions" 2013-12-17 18:56:09 -08:00
Jim Bankoski
b720ba165f rename loop filter functions
This renames all the loop filter functions so that they no
longer refer to mb

Change-Id: I8a58a8c7fd253d835cb619bde13913e896ece90b
2013-12-17 17:34:34 -08:00
Dmitry Kovalev
118c8fb3fb Calculating has_second_ref only once for single_ref context.
Change-Id: Ib1253e0606426850f53060a4c5303af86bf1c093
2013-12-17 17:02:24 -08:00
Dmitry Kovalev
c6a1ff223b Merge "Calling is_inter_block() only if mbmi is available." 2013-12-17 16:10:56 -08:00
Dmitry Kovalev
4821084b3f Moving hev mask calculation into filter4() function.
Change-Id: Ieccf2070b2b01b4135f4c5f9857667eb7825c761
2013-12-17 15:23:23 -08:00
Dmitry Kovalev
eb0c73b6e0 Merge "Converting mode_lf_lut struct member into static lookup table." 2013-12-17 15:20:05 -08:00
James Zern
bd9a388a06 vp9: normalize include guards
Change-Id: If4ddbdcfb3ab387cbca6910b42cf4df8111e6879
2013-12-16 19:40:49 -08:00
Yaowu Xu
3cce464342 Define POSITION to differentiate from MV
The MV struct was used to indicate the position of an MI_BLOCK with row
and col components. That usage was confusing, so this commit adds a
new structure "POSITION" with row and col components to better describe
the position of a mi_block.

Change-Id: I59fdd4b45010fe7d85a8db22a55503265c4f5b2b
2013-12-16 17:28:00 -08:00
Yaowu Xu
50ec6311e6 Move two functions to encoder
As they are used by encoder only.

Change-Id: I7b1e6955b218aba66fe156523521a8121c9a84a4
2013-12-16 17:27:48 -08:00
Dmitry Kovalev
bb7b4bad6d Merge "Getting rid of b_{width, height}_log2 calls in non-420 loop filter." 2013-12-16 15:10:25 -08:00
Dmitry Kovalev
865d5b83f2 Calling is_inter_block() only if mbmi is available.
Modifying vp9_get_intra_inter_context(), vp9_get_reference_mode_context(),
vp9_get_pred_context_single_ref_p1(), vp9_get_pred_context_single_ref_p2()
functions.

Change-Id: Ifaa2c3eb0c76a544ae8bd1fe3155aada266eae78
2013-12-16 15:09:33 -08:00
hkuang
fb53409d2a Merge "Remove border extension in intra frame prediction." 2013-12-16 14:48:54 -08:00
Dmitry Kovalev
b1d821704b Merge "Yet another vp9_pred_common.c cleanup." 2013-12-16 14:10:52 -08:00
hkuang
25e5552630 Remove border extension in intra frame prediction.
Change-Id: Id677df4d3dbbed6fdf7319ca6464f19cf32c8176
2013-12-16 14:05:58 -08:00
Dmitry Kovalev
b5c9261832 Converting vp9_treecoder.h to vp9_prob.{h, c}
Moving vp9_norm probability table from vp9_entropy.c to vp9_prob.c

Change-Id: Ie757b73860c6f43130790c332b292e2a1a81b788
2013-12-16 12:53:09 -08:00
Frank Galligan
fbada948fa Add frame buffer lru cache.
Add an option for libvpx to return the least recently used
frame buffer.

Change-Id: I886a96ffb94984f1c42de53086e0131922df3260
2013-12-15 19:57:42 -08:00
Frank Galligan
d0ee1fd797 Merge "Add support to pass in external frame buffers." 2013-12-15 19:18:25 -08:00
Frank Galligan
10f891696b Add support to pass in external frame buffers.
VP9 decoder can now use frame buffers passed in by the application.

Change-Id: I599527ec85c577f3f5552831d79a693884fafb73
2013-12-15 18:45:46 -08:00
Dmitry Kovalev
4d2d1591a3 Converting mode_lf_lut struct member into static lookup table.
Change-Id: I6e6c7cb5ff5b60fbe6a7c314daec5ccdc2cafcc3
2013-12-14 17:42:12 -08:00
Dmitry Kovalev
2aadc06e0d Yet another vp9_pred_common.c cleanup.
Change-Id: I617d6c610d181076773c5c3d6f3dbc6717b02580
2013-12-14 17:39:24 -08:00
Dmitry Kovalev
64cf398713 Merge "Using MV struct instead of int_mv union in encoder." 2013-12-13 16:42:54 -08:00
Dmitry Kovalev
33df4f0483 Merge "vp9_convolve.c cleanup." 2013-12-13 15:40:00 -08:00
Dmitry Kovalev
f54b515797 Merge "Cleaning up vp9_append_sub8x8_mvs_for_idx()." 2013-12-13 15:38:53 -08:00
Dmitry Kovalev
25da21b14e Using MV struct instead of int_mv union in encoder.
Change-Id: I8b81a3e4b4fa530a654c28d9c136afa0c1d379fd
2013-12-13 15:24:48 -08:00
Dmitry Kovalev
466cc94e7a Getting rid of b_{width, height}_log2 calls in non-420 loop filter.
Using num_{4x4, 8x8}_blocks_{wide, high}_lookup instead.

Change-Id: I66a7ab807fa57395253b2d0e636c2479fa8c4adf
2013-12-13 12:53:41 -08:00
James Zern
178db94cd6 vp9 asserts: fix compile warning
string literal to int within an assert

Change-Id: I0c889256b67a078e6e2a79577f0b7ae084243258
2013-12-12 19:49:19 -08:00
Dmitry Kovalev
629fb85f17 vp9_convolve.c cleanup.
Making overall logic more clear, moving "hacked" calculation of base filter
array pointer to get_filter_base() function.

Change-Id: Ibbd38a9f937e48d35bbbfef3ad933ab36664cccb
2013-12-12 11:14:06 -08:00
Deb Mukherjee
7edd5170b5 Merge "Changes interfaces to vp9_get_compressed_data fn" 2013-12-11 15:50:40 -08:00
Dmitry Kovalev
e79103166f Merge "Renames for consistency in vp9_pred_common.{c, h} files." 2013-12-11 14:30:44 -08:00
Deb Mukherjee
e33855cc47 Changes interfaces to vp9_get_compressed_data fn
Silences some lint warnings in previous patches

Change-Id: I04bf47ebe7e63a95fd322719a3154e589c115d78
2013-12-11 14:22:51 -08:00
hkuang
9460226acd Merge "Fix valgrind error." 2013-12-11 13:22:32 -08:00
hkuang
1339f3842c Fix valgrind error.
Temporarily change memcpy to memmove.

Change-Id: I700a197bc1ce496be1ddad7118429c5da465b0ca
2013-12-11 13:21:28 -08:00
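The commit does not say what valgrind reported. One common reason for such a change is that memcpy was being called with overlapping source and destination, which is undefined behaviour in C, while memmove is defined for that case. A minimal sketch under that assumption, with a hypothetical helper name:

```c
#include <string.h>

/* Hypothetical helper: drop the first `offset` bytes of buf by sliding the
 * remaining bytes to the front. Source and destination overlap whenever
 * offset < len - offset, so memmove is used; memcpy would be undefined
 * behaviour for overlapping regions. */
static void discard_prefix(unsigned char *buf, size_t len, size_t offset) {
  if (offset >= len) return;
  memmove(buf, buf + offset, len - offset);
}
```

Calling `discard_prefix` on the six bytes "abcdef" with an offset of 2 leaves "cdef" in the first four positions; the last two bytes are not touched.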
Dmitry Kovalev
3274fc30ee Renames for consistency in vp9_pred_common.{c, h} files.
Change-Id: Icba06e84ca55c419abbacedf5825eeb394a1b140
2013-12-10 18:31:46 -08:00
Dmitry Kovalev
098d13ba10 Cleaning up vp9_append_sub8x8_mvs_for_idx().
Replacing if-else with switch statement, reordering function arguments.

Change-Id: I4825d2ef311ba8999b6d4ceb0eef003587a13434
2013-12-10 17:56:53 -08:00
Dmitry Kovalev
2dd20e468a Cleaning up skip context calculation.
Renames:
  vp9_get_pred_context_mbskip => vp9_get_skip_context
  vp9_get_pred_prob_mbskip    => vp9_get_skip_prob

Change-Id: I2af499848ef73f3f5cd8cdb27852d0bcdfe31d09
2013-12-10 14:11:26 -08:00
Dmitry Kovalev
35b7b0b549 Merge "Removing unused vp9_get_pred_flag_mbskip() function." 2013-12-10 13:58:35 -08:00
hkuang
19bbe41c71 Merge "Refactor inter_predictor function." 2013-12-10 13:34:24 -08:00
Dmitry Kovalev
48088f210d Removing unused vp9_get_pred_flag_mbskip() function.
Change-Id: Ib46a97d8ff9f2915b9fa2abba3cd18b6711fcb0c
2013-12-10 12:53:17 -08:00
Dmitry Kovalev
e18eb7721e Merge "Renaming comp_pred_mode to reference_mode." 2013-12-10 10:52:34 -08:00
hkuang
6c9dcae532 Refactor inter_predictor function.
Change-Id: Ic429b2f16462e926f30efb3af4da3080026359d8
2013-12-10 10:36:44 -08:00
Dmitry Kovalev
d2dad31e79 Merge "Cleaning up vp9_get_pred_context_switchable_interp() function." 2013-12-09 17:34:30 -08:00
hkuang
d70a8c09c6 Merge "Implement on demand border extension. In place extend the border now. Next commit will totally remove the border." 2013-12-09 17:16:31 -08:00
Dmitry Kovalev
9edd4d4db7 Cleaning up vp9_get_pred_context_switchable_interp() function.
Change-Id: I67a45a41312ca0efd8fe00ccd8bdc0f97675d09f
2013-12-09 17:02:38 -08:00
hkuang
ff2c96be1f Implement on demand border extension. In place extend
the border now. Next commit will totally remove the border.

Change-Id: Ic1e1ca9cc34f81c688715b3948689b47df63a151
2013-12-09 16:44:08 -08:00
Jingning Han
f92b5842bf Merge "Full range motion search for regular block sizes" 2013-12-09 16:12:35 -08:00
Dmitry Kovalev
08c48ddc01 Renaming comp_pred_mode to reference_mode.
Change-Id: I83ffed2b1878a35ac35f07f9ee74309adc9c7b11
2013-12-09 15:13:34 -08:00
Dmitry Kovalev
347df4ce55 Merge "Renaming vp9_get_pred_context_tx_size() function." 2013-12-09 15:10:49 -08:00
Dmitry Kovalev
2c3120274a Removing max_uv_txsize_lookup lookup table.
Adding get_uv_tx_size_impl() with tx size selection logic, rewriting
get_uv_tx_size().

Change-Id: I3ecb108059a41be227a8c89a0710bd174f508951
2013-12-09 14:03:23 -08:00
Dmitry Kovalev
a19d694f09 Merge "Removing BLOCK_TYPES and adding PLANE_TYPES constant instead." 2013-12-07 02:20:41 -08:00
Dmitry Kovalev
cb92f4f042 Renaming vp9_get_pred_context_tx_size() function.
Change-Id: Ia6d6f4dfb1fd1ec0f8ba53796b59a802e9d7881d
2013-12-06 15:31:06 -08:00
Dmitry Kovalev
b6e5bb27c9 Merge "Renaming reference mode context calculation function." 2013-12-06 14:22:47 -08:00
Jingning Han
b295092b8f Full range motion search for regular block sizes
Add a full range motion search for regular block sizes. This runs an
exhaustive search within the given reference area. This commit further
optimizes the search process by combining four point tests into one
pipeline, which gives a 30% speed-up compared to testing one point at a
time.

This full range search serves as a best-possible motion search reference.
Replacing the diamond search with the full range search raises the speed 0
runtime of bus CIF at 2000 kbps from 153872 ms to 623051 ms. Compression
performance improves by 0.585% on the derf set relative to the speed 0
setting.

Change-Id: Ieef1225216b0b86b4ac4872fa7fb9e18bf2eabb3
2013-12-06 12:24:53 -08:00
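
The exhaustive search described in b295092b8f can be sketched as below. This is a minimal illustration, not the libvpx implementation: the names `block_sad`, `full_search`, and `mv_t` are hypothetical, the cost is a plain sum of absolute differences, and the 4-point pipelining mentioned in the commit is omitted.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical motion vector type, in whole pixels. */
typedef struct { int row, col; } mv_t;

/* Sum of absolute differences between a w x h source block and a candidate. */
static unsigned int block_sad(const unsigned char *src, int src_stride,
                              const unsigned char *ref, int ref_stride,
                              int w, int h) {
  unsigned int sad = 0;
  for (int r = 0; r < h; ++r)
    for (int c = 0; c < w; ++c)
      sad += (unsigned int)abs(src[r * src_stride + c] - ref[r * ref_stride + c]);
  return sad;
}

/* Tests every displacement in [-range, range] x [-range, range].
 * `ref` points at the co-located block; the caller must guarantee that
 * `range` pixels around it are readable (an assumption of this sketch). */
static unsigned int full_search(const unsigned char *src, int src_stride,
                                const unsigned char *ref, int ref_stride,
                                int w, int h, int range, mv_t *best) {
  unsigned int best_sad = UINT_MAX;
  assert(range >= 0);
  for (int dr = -range; dr <= range; ++dr) {
    for (int dc = -range; dc <= range; ++dc) {
      const unsigned char *cand = ref + dr * ref_stride + dc;
      const unsigned int sad = block_sad(src, src_stride, cand, ref_stride, w, h);
      if (sad < best_sad) {  /* keep the first candidate with the lowest cost */
        best_sad = sad;
        best->row = dr;
        best->col = dc;
      }
    }
  }
  return best_sad;
}
```

The cost grows with (2 * range + 1)^2 candidates per block, which is consistent with the roughly fourfold runtime increase the commit reports against the diamond search.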
Dmitry Kovalev
2da30a96d4 Merge "Removing duplicated C code from vp9_loopfilter_filters.c file." 2013-12-06 12:13:24 -08:00
Dmitry Kovalev
63963f51ef Renaming reference mode context calculation function.
Renames:
  vp9_get_pred_context_comp_inter_inter => vp9_get_reference_mode_context
  vp9_get_pred_prob_comp_inter_inter    => vp9_get_reference_mode_prob

Change-Id: I3bbb69481e6b0c848028667c9269f567f293d3bd
2013-12-06 11:23:01 -08:00
Dmitry Kovalev
d6b159d4a6 Removing BLOCK_TYPES and adding PLANE_TYPES constant instead.
Change-Id: Ic3bb862e93aedf6a489a33ea6f7e5097d96855ee
2013-12-06 10:54:00 -08:00
Dmitry Kovalev
cf4dfdc8e7 Merge "Moving vp9_tree_probs_from_distribution() to encoder." 2013-12-06 10:18:30 -08:00
Dmitry Kovalev
8eac2ca840 Merge "Renaming constants." 2013-12-06 09:55:02 -08:00
Dmitry Kovalev
5be34ba80f Merge "vp9_get_pred_context_intra_inter() clean up." 2013-12-06 09:14:36 -08:00
Adrian Grange
de2046275d Merge "Remove redundant calls to vp9_update_mode_info_border" 2013-12-06 08:59:47 -08:00
Dmitry Kovalev
4ac6a2552b Moving vp9_tree_probs_from_distribution() to encoder.
Writing custom coeff branch count calculation (which is much clearer) in
adapt_coef_probs() function. Removing vp9_treecoder.c file.

Change-Id: I8880fb7a39996c8bcf6cd0acf9898a8c712ba91f
2013-12-05 18:13:26 -08:00
Dmitry Kovalev
377fa8aff8 Renaming PREV_COEF_CONTEXTS to COEFF_CONTEXTS.
Also adding BAND_COEFF_CONTEXTS macro to simplify for loop logic.

Change-Id: I12a78a49cf1addf81e6b3fe2a3736ec2b79bd79e
2013-12-05 17:08:06 -08:00
Dmitry Kovalev
6fd71e1b09 vp9_get_pred_context_intra_inter() clean up.
Renaming:
 vp9_get_pred_context_intra_inter => vp9_get_intra_inter_context
 vp9_get_pred_prob_intra_inter    => vp9_get_intra_inter_prob

Change-Id: I2c1affea2e84f4e616137c6df82adb11c7845781
2013-12-05 17:01:03 -08:00
Dmitry Kovalev
f7396f3394 Merge "Removing vp9_default_coef_probs.h file." 2013-12-05 16:44:26 -08:00
Dmitry Kovalev
0d4b8d7e43 Renaming constants.
NUM_YV12_BUFFERS        => FRAME_BUFFERS
ALLOWED_REFS_PER_FRAME  => REFS_PER_FRAME
NUM_REF_FRAMES_LOG2     => REF_FRAMES_LOG2
NUM_REF_FRAMES          => REF_FRAMES
NUM_FRAME_CONTEXTS_LOG2 => FRAME_CONTEXTS_LOG2
NUM_FRAME_CONTEXTS      => FRAME_CONTEXTS

Change-Id: I4e1ada08f25d8fa30fdf03aebe1b1c9df0f87e63
2013-12-05 16:23:09 -08:00
Dmitry Kovalev
2b95a05bf6 Removing duplicated C code from vp9_loopfilter_filters.c file.
Change-Id: I299b621fca1c8ff5d296afde9698cdcccfecaf3f
2013-12-05 15:49:57 -08:00
Adrian Grange
93d8a3fd29 Remove redundant calls to vp9_update_mode_info_border
Removed calls to vp9_update_mode_info_border since
they immediately followed code that initialized the
entire buffer to 0.

Change-Id: Ife06794daa20439a0b607a83a87f88df59afac40
2013-12-05 15:02:32 -08:00
Dmitry Kovalev
6df9ec52a0 Merge "Cleaning up vp9_get_pred_context_tx_size() function." 2013-12-05 09:59:00 -08:00
Tero Rintaluoma
047b0b01bb Fix show existing frame
- Disable mode info update in case where current frame is coded
  as "show existing frame".
- Should fix issue 676.

Change-Id: Ibee681850eb307f982da6528d3e31cb94f881c08
2013-12-05 12:10:10 +02:00
Frank Galligan
7ecf3bc91c Fix ref count decrement code.
Buffer 0 would never be decremented, so it could only be used
once.

Change-Id: I605d99fa2a513eadae6a0e230161729880653282
2013-12-04 22:21:00 -08:00
Dmitry Kovalev
5eeffc9fc5 Cleaning up vp9_get_pred_context_tx_size() function.
Change-Id: Ia6ef876e3d1e66b2182a9c0bce3fd758691cd381
2013-12-04 21:35:30 -08:00
Dmitry Kovalev
a1123538a5 Moving vp9_token from common to encoder.
Change-Id: I40a070c353663e82c59e174d7c92eb84f72ed808
2013-12-04 19:36:58 -08:00
Frank Galligan
8363349b84 Merge "Fix the initial references to frame buffers." 2013-12-04 19:26:40 -08:00
Dmitry Kovalev
4afd141a05 Removing vp9_default_coef_probs.h file.
Moving all probability tables from removed file to vp9_entropy.c.

Change-Id: I12846f1da778c3016d96b82e53384d4634883430
2013-12-04 17:04:35 -08:00
Dmitry Kovalev
cf8e3d2c5c Merge "Cleaning up vp9_dec_build_inter_predictors_sb function." 2013-12-04 16:57:54 -08:00
Frank Galligan
9ed616a56c Fix the initial references to frame buffers.
The old code would start in a mixed state, where all the reference
frames were pointing to frame buffer 0, but the reference counts
were 0. This is why we needed special code for the first frame.

Change-Id: I734961012917654ff8c0c8b317aac00ab75ded1a
2013-12-04 16:53:18 -08:00
Dmitry Kovalev
3712b58c2f Merge "Cleaning up vp9_entropy.h file." 2013-12-04 16:46:41 -08:00
Dmitry Kovalev
c6ca5c5ad9 Compact formatting default_coef_probs_{4x4, 8x8, 16x16, 32x32}.
Change-Id: If40b930431766d5179b9769509b5e4ca1628e9cc
2013-12-04 15:45:28 -08:00
Dmitry Kovalev
da2da79012 Merge "Formatting vp9_pareto8_full array." 2013-12-04 12:22:50 -08:00
Dmitry Kovalev
beb35aba19 Cleaning up vp9_dec_build_inter_predictors_sb function.
Using get_plane_block_size() instead of manipulation with subsampling
values, calculating all required values only once without redundant calls
to b_width_log2().

Change-Id: I00303f2a0926f9c4cb17f34591adda60615f8919
2013-12-04 12:11:01 -08:00
Yunqing Wang
f6582d6928 Revert "Simplify mask checking in loop filters"
Jingning saw a bitstream change with this patch. It can happen
that (mask_16x16_0 & 1) is 1 but (mask_16x16_1 & 1) is 0 in some
edge cases.

This reverts commit 8f05e70340.

Change-Id: I0a529435ce816a1e14653eb510d5090de276070a
2013-12-04 11:31:19 -08:00
Dmitry Kovalev
1470789927 Merge "Moving eob array to the encoder." 2013-12-04 10:58:02 -08:00
Yunqing Wang
920a074e89 Merge "Improve idct16x16: _256_add_sse2(x1.107)&_10_add_sse2(x1.012)" 2013-12-04 08:50:51 -08:00
Dmitry Kovalev
ff6d6a9f07 Formatting vp9_pareto8_full array.
Change-Id: Ic7f47a8d233daf5e61e82092865837ea4eda4095
2013-12-03 18:49:19 -08:00
Dmitry Kovalev
f00d157c12 Moving eob array to the encoder.
In the decoder we don't need to save eobs; we can pass eob as an argument.
This change therefore removes the eob arrays from VP9Decompressor and
TileWorkerData, and moves the eob pointer from macroblockd_plane to
macroblock_plane.

Change-Id: I8eb919acc837acfb3abdd8319af63d1bbca8217a
2013-12-03 17:59:32 -08:00
Dmitry Kovalev
8e89e2f2e0 Cleaning up vp9_entropy.h file.
Renaming constants for consistency:
  DCT_VAL_CATEGORY1 => CATEGORY1_TOKEN
  DCT_VAL_CATEGORY2 => CATEGORY2_TOKEN
  DCT_VAL_CATEGORY3 => CATEGORY3_TOKEN
  DCT_VAL_CATEGORY4 => CATEGORY4_TOKEN
  DCT_VAL_CATEGORY5 => CATEGORY5_TOKEN
  DCT_VAL_CATEGORY6 => CATEGORY6_TOKEN
  DCT_EOB_TOKEN     => EOB_TOKEN
  DCT_EOB_MODEL_TOKEN => EOB_MODEL_TOKEN
  MAX_ENTROPY_TOKENS => ENTROPY_TOKENS

Moving constants:
  INTER_MODE_CONTEXTS from vp9_entropy.h to vp9_blockd.h.
  EOSB_TOKEN from vp9_entropy.h to vp9_tokenize.h

Change-Id: I5fcbf081318e1d365792b6d290a930c6cb0f3fc2
2013-12-03 17:23:03 -08:00
Dmitry Kovalev
09577b8c8d Merge "Removing dummy assignments." 2013-12-03 10:59:34 -08:00
Abo Talib Mahfoodh
e4419ab691 Improve idct16x16: _256_add_sse2(x1.107)&_10_add_sse2(x1.012)
The performance gain of idct16x16_10_add_sse2 function is not
noticeable. However since both functions use the IDCT16_1D,
idct16x16_10_add_sse2 should be modified as well.
Tested with: park_joy_420_720p50.y4m

Change-Id: I02b957e36fcf997c677d15baf496533895271bff
2013-12-02 21:08:56 -05:00
Yunqing Wang
8f182a1cac Merge "improve vp9_idct32x32_34(x1.472)&1024(x1.032)_add_sse2" 2013-12-02 15:10:05 -08:00
Yunqing Wang
37e68aba55 Merge "Simplify mask checking in loop filters" 2013-12-02 12:06:26 -08:00
Dmitry Kovalev
862c22cf7d Merge "Moving token-encoding related stuff from common to encoder." 2013-12-02 10:32:04 -08:00
Yunqing Wang
8f05e70340 Simplify mask checking in loop filters
Considering a horizontal edge, if mask_16x16 is 1 for an even-
indexed 8x8 block, then mask_16x16 is 1 for the next 8x8 block in
the same row. Similarly, for a vertical edge, if mask_16x16 is 1 for
an even-rowed 8x8 block, then mask_16x16 is 1 for the 8x8 block
right below it in the next row. Based on that, the mask_16x16 checking
can be simplified to save cycles. The corresponding 8-pixel
vp9_mb_lpf_horizontal_edge code can also be removed.

Change-Id: Ic3fe7a5674322239208cbe2731dc3216ce2084f3
2013-11-27 14:10:57 -08:00
Dmitry Kovalev
d83d61d942 Moving raster_block_offset{,_int16} from vp9_blockd.h to vp9_rdopt.h.
Change-Id: I5a5888d4639cc6b7eb266be47581dd15ba08c91e
2013-11-27 12:57:21 -08:00
Dmitry Kovalev
f9da823216 Moving token-encoding related stuff from common to encoder.
Change-Id: I0e59d320407b3bed0ba3622a7b29975f6fad7ebf
2013-11-27 11:27:57 -08:00
Dmitry Kovalev
e2f1d02eb3 Merge "Moving mode encodings from common to encoder + cleanup." 2013-11-27 11:00:54 -08:00
Yaowu Xu
e9c19617bf Merge "vp9_short_fdct32x32_rd vp9_short_fdct32x32 optimized for AVX2" 2013-11-27 10:27:32 -08:00
Dmitry Kovalev
d3a2e55af4 Removing qcoeff buffers from the decoder.
We only need qcoeff buffers in the encoder. Reducing TileWorkerData struct
and VP9Decompressor struct sizes by 24K.

Change-Id: Id148868461f7ffa3d3dd634b371503ae9c57e207
2013-11-26 18:52:10 -08:00
Dmitry Kovalev
fc3c3303f1 Removing dummy assignments.
Change-Id: I10d1a4bcac751a982d9dd135f019e3a4d92f8522
2013-11-26 15:35:11 -08:00
Dmitry Kovalev
f4bf712fbb Moving mode encodings from common to encoder + cleanup.
Change-Id: I248ccb1532e2cd95314d0b95108f2c2e71cf084f
2013-11-26 14:53:17 -08:00
Yaowu Xu
b60293e1ce Merge "Amended some comments for clarity" 2013-11-26 14:32:02 -08:00
Frank Galligan
b4874e2c82 Fix 16 wide neon horz loopfilter.
Multiply by 3 was on 8bit vectors when it should have been on
16bit vectors.

Change-Id: I248c1429b3134dfd171dfab0ebb109fd2437e1fc
2013-11-26 10:02:40 -08:00
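
The problem fixed in b4874e2c82 comes down to lane width: three times an 8-bit value does not always fit in 8 bits. The sketch below shows this in plain C rather than NEON; the function names are hypothetical, and whether the original code wrapped or saturated depends on the instruction it used, which the commit message does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when v is representable in a signed 8-bit lane. */
static int fits_int8(int v) {
  return v >= INT8_MIN && v <= INT8_MAX;
}

/* Multiply in a widened 16-bit lane: 3 * x always fits for any int8 x. */
static int16_t mul3_widened(int8_t x) {
  const int16_t r = (int16_t)(3 * x);
  assert(r == 3 * x);  /* no information is lost after widening */
  return r;
}

/* Multiply in an unsigned 8-bit lane: the result wraps modulo 256. */
static uint8_t mul3_wrapped(uint8_t x) {
  return (uint8_t)(3 * x);
}
```

For example, `mul3_widened(50)` yields 150, which `fits_int8` rejects, while `mul3_wrapped(100)` yields 44 instead of 300.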
Yunqing Wang
7a5fd6a1bf Merge "Do vertical loopfiltering in parallel" 2013-11-26 09:35:14 -08:00
Abo Talib Mahfoodh
f97d91ab67 improve vp9_idct32x32_34(x1.472)&1024(x1.032)_add_sse2
vp9_idct32x32_34_add_sse2:
speedup: 1.472
IDCT32_1D_34 and MULTIPLICATION_AND_ADD_2 are optimized
based on the fact that only the upper-left 8x8 block has
non-zero values.

vp9_idct32x32_1024_add_sse2:
speedup: 1.032

Tested with: park_joy_420_720p50.y4m

Change-Id: I8670ce547552b48695049de298e2fc46ce28dfbc
2013-11-26 12:28:26 -05:00
Dmitry Kovalev
5488da280d Merge "Moving mv entropy encodings calculation to the encoder side." 2013-11-25 19:15:21 -08:00
Dmitry Kovalev
56d048c412 Moving mv entropy encodings calculation to the encoder side.
Moved arrays:
  vp9_mv_joint_encodings
  vp9_mv_class_encodings
  vp9_mv_class0_encodings
  vp9_mv_fp_encodings

Change-Id: Iaf5008c579fcbd6d77fdd81d1aef8c71b5f308b7
2013-11-25 16:36:28 -08:00
Dmitry Kovalev
7ba7a5f817 Merge "Removing redundant call of vp9_init_mbmode_probs()." 2013-11-25 16:08:42 -08:00
Dmitry Kovalev
cfc1f91c9f Merge "Moving {left, right}_block_mode to vp9_blockd.h." 2013-11-25 10:59:24 -08:00
Dmitry Kovalev
e8af3db88a Merge "Renaming COMPPREDMODE_TYPE enum and its members." 2013-11-25 10:59:08 -08:00
Yaowu Xu
dd69337e6e Amended some comments for clarity
Change-Id: I31c3908ba394095deb5d3a5d7b7c9b2b5328c3e8
2013-11-25 10:55:01 -08:00
Yaowu Xu
cc1e05ca5f Merge "In frame Q adjustment experiment." 2013-11-25 10:52:22 -08:00
Jingning Han
f547fb8e07 Merge "Use separate inter predictors for enc/dec" 2013-11-25 10:29:07 -08:00
Paul Wilkins
644bd87e8e In frame Q adjustment experiment.
The idea here is to allow "in frame" adjustment of the final Q
value used to encode each SB64, using segmentation.

There is also adjustment of the rd mult in regions of overspend.

Activated using aq_mode=2

Change-Id: I2f140cd898c9f877c32cd6d2e667f5e11ada4b1c
2013-11-25 10:22:55 -08:00
Yaowu Xu
3183135dd3 Merge "Fix a build issue with visual c." 2013-11-25 10:20:53 -08:00
Jingning Han
ba8b5e8d6d Use separate inter predictors for enc/dec
The decoder will construct inter predictors using lazy border extension,
while the encoder, which runs motion search multiple times in the rate-
distortion optimization loop for each block, does border extension at
the frame level. This commit separates the inter predictors for the
encoder and the decoder.

Change-Id: Ieca2fecba3a7201a6d64ef9f219e5d91e50559c3
2013-11-25 09:43:34 -08:00
Jingning Han
12e5ec6aa8 Merge "Separate setup_scale_factor/extend_frame_borders" 2013-11-25 09:14:46 -08:00
Yaowu Xu
86368faca9 Fix a build issue with visual c.
Change-Id: Ic8fc16ee1734cfde0d12a2e3abb3e9299382f3b1
2013-11-25 08:11:35 -08:00
Dmitry Kovalev
9fe88870c5 Merge "Cleaning up vp9_append_sub8x8_mvs_for_idx." 2013-11-24 16:08:20 -08:00
Dmitry Kovalev
52b43a2876 Inlining and removing vp9_set_pred_flag_seg_id() function.
Change-Id: I0fd76937e847f78378a7ab3fa0af00a7c2c52b42
2013-11-22 17:32:11 -08:00
Dmitry Kovalev
fb9c19c62d Renaming COMPPREDMODE_TYPE enum and its members.
List of renames:
  COMPPREDMODE_TYPE      => REFERENCE_MODE
  SINGLE_PREDICTION_ONLY => SINGLE_REFERENCE
  COMP_PREDICTION_ONLY   => COMPOUND_REFERENCE
  HYBRID_PREDICTION      => REFERENCE_MODE_SELECT (like TX_MODE_SELECT)
  NB_PREDICTION_TYPES    => REFERENCE_MODES

Change-Id: If723dabe9435325d0165dcd028142a2c78b417b4
2013-11-22 16:35:37 -08:00
Dmitry Kovalev
350731e8f9 Organizing all scan tables into lookup table.
Change-Id: Ie829ee58a55157e6972c63cebe69a5d0a3221349
2013-11-22 16:20:45 -08:00
Dmitry Kovalev
52fa10a9a3 Cleaning up vp9_append_sub8x8_mvs_for_idx.
Change-Id: Ic92f15d82ff5cfa3df655d08e460335c2ef8a325
2013-11-22 15:28:32 -08:00
Jingning Han
86d2a9b978 Separate setup_scale_factor/extend_frame_borders
This commit takes out vp9_extend_frame_borders from
vp9_setup_scale_factors.

This refactoring prepares for the use of lazy border extension in the
decoder, which makes it necessary to handle border extension
separately in the encoder and the decoder. The use of
vp9_extend_frame_borders will be removed when lazy border extension
is ready.

Change-Id: Ia3baba3d179d5f11eee1634f19b3b319d2a59186
2013-11-22 12:02:08 -08:00
Dmitry Kovalev
e0ec61187e Merge "Removing txfrm_block_to_raster_xy() call from extend_for_intra()." 2013-11-22 10:51:38 -08:00
Yunqing Wang
ed36720b66 Do vertical loopfiltering in parallel
This patch follows the "Add filter_selectively_vert_row2 to enable
parallel loopfiltering" commit and adds x86 SSE2 optimization
to do 16-pixel filtering in parallel. For other optimizations
(neon and dspr2), the current 16-pixel functions call the
8-pixel functions twice; real 16-pixel functions could be added
later.

Decoder speedup:
tulip clip:     2% speed gain;
old_town_cross: 1.2% speed gain;
bus:            2% speed gain.

Change-Id: I4818a0c72f84b34f5fe678e496cf4a10238574b7
2013-11-22 10:04:51 -08:00
Dmitry Kovalev
7c8cac3c21 Removing txfrm_block_to_raster_xy() call from extend_for_intra().
Change-Id: I6a48d1f35ed5fe7a2c7499675b339994c9c3bdf2
2013-11-21 19:30:58 -08:00
Dmitry Kovalev
ad3333e2cd Merge "Removing plane_block_{width, height} functions." 2013-11-21 16:37:27 -08:00
levytamar82
8def766de2 vp9_short_fdct32x32_rd vp9_short_fdct32x32 optimized for AVX2
Change-Id: I6366e84490883b72362f762369d7e5bccb64f02f
2013-11-21 14:19:49 -08:00
Frank Galligan
97d1258375 Revert "Add 16 wide neon horz loopfilter."
The change caused mismatches with some test vectors on neon.

Original CL: https://gerrit.chromium.org/gerrit/#/c/67863/

Change-Id: I913891636d53783e93cb1865ca78ded1821dc4b0
2013-11-21 14:01:33 -08:00
Dmitry Kovalev
4896d5c7ef Moving {left, right}_block_mode to vp9_blockd.h.
Both functions have no relation to motion vectors, so moving them from
vp9_findnearmv.h to vp9_blockd.h.

Change-Id: I74f524267886ab0fff4a2da793a10c906ed0f43a
2013-11-21 11:43:53 -08:00
Yunqing Wang
e002bb99a8 Merge "Add filter_selectively_vert_row2 to enable parallel loopfiltering" 2013-11-21 11:25:55 -08:00
hkuang
370bf116a2 Merge "Remove unnecessary eob checking." 2013-11-21 11:24:02 -08:00
Frank Galligan
2dd77580c0 Merge "Add 16 wide neon horz loopfilter." 2013-11-21 10:29:30 -08:00