7261 Commits

Author SHA1 Message Date
Frank Galligan
cc2da09d42 Fix variance Neon intrinsics > 32x32
The 16 bit sum vector was overflowing.

Change-Id: I0fdf38e832ee99457ec8680a92691a6175ff8c3f
2015-01-17 10:31:48 -08:00
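The overflow described in "Fix variance Neon intrinsics > 32x32" can be illustrated with a little arithmetic. The sketch below is not the Neon code from the commit. It assumes, purely for illustration, that pixel differences in [-255, 255] are spread evenly over eight 16-bit accumulator lanes (the `LANES` constant and both function names are hypothetical), and checks whether the worst-case lane total still fits in an int16.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one 128-bit register viewed as 8 x int16 accumulators. */
#define LANES 8

/* Worst-case magnitude a single lane must hold for a w x h block when
 * every pixel difference is 255 and the differences are spread evenly
 * over LANES accumulators. Computed in 32 bits so it cannot wrap. */
static int32_t worst_case_lane_sum(int w, int h) {
  assert(w > 0 && h > 0 && (w * h) % LANES == 0);
  return (int32_t)(w * h / LANES) * 255;
}

/* Returns 1 if the worst case fits in a signed 16-bit lane, else 0. */
static int fits_in_int16(int w, int h) {
  return worst_case_lane_sum(w, h) <= INT16_MAX;
}
```

Under this assumption a 32x32 block peaks at 128 * 255 = 32640 per lane, just below INT16_MAX (32767), while 32x64 and larger blocks exceed it, which is consistent with the overflow appearing only above 32x32.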
Yunqing Wang
e76eaf05b1 vp9_ethread: add parallel loopfilter
1. Added row-based loopfilter in encoder;
2. Moved common multi-threaded loopfilter functions from decoder
   to common;
3. Merged multi-threaded loopfilter code, and made encoder/
   decoder call same function to reduce code duplication.

Encoder tests showed a 1% - 2% speedup for good-quality 2-pass mode
(at speed 3). In real-time mode (at speed 7), the speedup was 1% - 3%
using 2 threads and 4% - 6% using 4 threads.

Change-Id: I8a4ac51c2ad9bab9fa7b864e90743931c53ec1c4
2015-01-16 17:19:27 -08:00
Jingning Han
0220255fa0 Merge "Fix frame buffer swap in denoiser" 2015-01-16 16:58:37 -08:00
Jingning Han
dfda5cebc7 Fix frame buffer swap in denoiser
This commit fixes a bug in the denoiser reference frame buffer swap
that disabled the frame buffer update.

Change-Id: I39a9427180fd18f9692602064ad821f7af4714c0
2015-01-16 12:29:58 -08:00
Yaowu Xu
bc5d3fae5c Replace "colorspace" with "color_space"
This is to make the usage of the variable name consistent across
the code base.

Change-Id: I698739e55841c59358d1c6e5cc97c96088772943
2015-01-15 17:58:47 -08:00
Minghai Shang
220bc3a013 [two pass temporal svc] Fix crash issue in transcoder app caused by last fix.
Change-Id: I78ecc8ec3fa3ba5f69bb23813e68a5255d0534e1
2015-01-15 16:59:54 -08:00
Frank Galligan
6e7e1cf32f Add Neon intrinsics for vp9_avg_8x8_neon
On Nexus 7 speed -5, -6, -7, and -8 saw about a 1% increase
in perf for 480p. Speeds -5, -6, -7, and -8 saw about a 1.5%
increase in perf for 720p.

Tested on Nexus 7, built with ndk r10d, gcc 4.9.

Change-Id: Ibf17ebfd952a6aec941719bd8306df8ec4574bee
2015-01-15 15:32:40 -08:00
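For readers unfamiliar with what an 8x8 average function computes, here is a scalar sketch of the operation that "Add Neon intrinsics for vp9_avg_8x8_neon" accelerates. It assumes the function returns the rounded mean of 64 8-bit pixels; the name `avg_8x8_scalar` and the `stride` parameter are illustrative and are not taken from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounded mean of an 8x8 block of 8-bit pixels.
 * `stride` is the distance in bytes between the starts of two rows. */
static unsigned int avg_8x8_scalar(const uint8_t *src, int stride) {
  unsigned int sum = 0;
  int row, col;
  assert(src != NULL && stride >= 8);
  for (row = 0; row < 8; ++row) {
    for (col = 0; col < 8; ++col) sum += src[col];
    src += stride; /* advance to the next row */
  }
  /* Add half the divisor before shifting so the result rounds to
   * nearest: (sum + 32) >> 6 == round(sum / 64). */
  return (sum + 32) >> 6;
}
```

The largest possible sum is 64 * 255 = 16320, so it fits in 16 bits, which is one reason a reduction like this maps well onto SIMD lanes.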
Yunqing Wang
99b99831e4 Align thread data in vp9_ethread
On some platforms, such as 32-bit Windows and 32-bit Mac, allocated
memory isn't aligned automatically. This commit aligns the thread data
so that SIMD code can access it correctly.

Change-Id: I1108c145fe982ddbd3d9324952758297120e4806
2015-01-14 15:51:56 -08:00
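The alignment problem addressed by "Align thread data in vp9_ethread" can be sketched with a manual aligned allocator. This is one common technique, not necessarily the one the commit uses: over-allocate, round the address up, and stash the raw pointer just before the aligned address. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate `size` bytes aligned to `align`, which must be a power of two
 * and at least sizeof(void *). Returns NULL on allocation failure. */
static void *aligned_malloc_sketch(size_t size, size_t align) {
  void *raw;
  uintptr_t addr;
  assert(align >= sizeof(void *) && (align & (align - 1)) == 0);
  /* Extra room: worst-case padding plus one slot for the raw pointer. */
  raw = malloc(size + align - 1 + sizeof(void *));
  if (raw == NULL) return NULL;
  /* Skip the pointer slot, then round up to the next multiple of align. */
  addr = ((uintptr_t)raw + sizeof(void *) + align - 1) &
         ~(uintptr_t)(align - 1);
  ((void **)addr)[-1] = raw; /* remembered for aligned_free_sketch() */
  return (void *)addr;
}

/* Release memory obtained from aligned_malloc_sketch(). */
static void aligned_free_sketch(void *ptr) {
  if (ptr != NULL) free(((void **)ptr)[-1]);
}
```

A caller would request, for example, 32-byte alignment for data touched by SIMD loads and release it only through the matching free function, since the aligned pointer is not the one `malloc` returned.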
Yaowu Xu
829a01dbb7 Merge "Add encoder control for setting color space" 2015-01-14 14:14:34 -08:00
Frank Galligan
c7d6c0c5a8 Merge "Switch remaining Neon variance functions to shifts" 2015-01-14 12:17:42 -08:00
Frank Galligan
68224a6e87 Merge "Add 64x64 sub_pel_variance Neon function" 2015-01-14 12:17:20 -08:00
Yaowu Xu
e94b415c34 Add encoder control for setting color space
This commit adds encoder side control for vp9 to set color space info
in the output compressed bitstream.

It also amends the "vp9_encoder_params_get_to_decoder" test to verify
the correct color space information is passed from the encoder end to
decoder end.

Change-Id: Ibf5fba2edcb2a8dc37557f6fae5c7816efa52650
2015-01-14 10:17:14 -08:00
Yaowu Xu
afae733eed Merge "Enable decoder to pass through color space info" 2015-01-14 10:04:15 -08:00
Frank Galligan
ec1d8387e1 Add 64x64 sub_pel_variance Neon function
On Nexus 7 speed -5, -6, -7, and -8 saw about a 15% increase
in perf for 480p. Speeds -5, -6, -7, and -8 saw about a 10%
increase in perf for 720p.

Tested on Nexus 7, built with ndk r10d, gcc 4.9.

Change-Id: I2fa5315845e3021c9a6e2ea47e52e68b398d8334
2015-01-14 08:36:24 -08:00
Frank Galligan
588f74f8a6 Switch remaining Neon variance functions to shifts
Saves 5 instructions on 8x8 and 16x16 and 8 instructions
on 32x32, when compiled with gcc 4.9.

Change-Id: Id3da613a36a9d27d8c5169c59ba45d247c920c6c
2015-01-14 07:22:49 -08:00
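As background for "Switch remaining Neon variance functions to shifts": block variance is commonly computed as sse - sum^2 / N, and when N is a power of two the division can be written as a right shift. The scalar sketch below shows that identity only. The function name and the `log2_count` parameter are assumptions, and the commit's Neon code is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* variance = sse - sum^2 / N for a block of N = 2^log2_count pixels.
 * `sum` is the signed sum of pixel differences, `sse` the sum of their
 * squares. The square is formed in 64 bits because for a 64x64 block
 * sum^2 can exceed 32 bits. */
static uint32_t variance_from_sums(int64_t sum, uint64_t sse,
                                   int log2_count) {
  assert(log2_count >= 0 && log2_count <= 12);
  /* sum * sum is non-negative, so the shift is an exact floor division. */
  return (uint32_t)(sse - (uint64_t)((sum * sum) >> log2_count));
}
```

For an 8x8 block the shift is 6, for 16x16 it is 8, for 32x32 it is 10, and for 64x64 it is 12.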
Frank Galligan
bd3dbc588c Merge "Add 64x variance Neon functions" 2015-01-13 22:38:58 -08:00
Minghai Shang
a14415d171 [twopass temporal svc] Fix decoding error on seek.
Don't put a small empty frame in front of a key frame. The key
frame flag is set in the WebM container when there is a visible
key frame, but seeking to that point causes a decoding error if
a small empty frame, which is an inter frame, has been placed
in front of it.

Change-Id: Id50c2c1fd31da0405ff6faa7375cc2f49c55402d
2015-01-13 15:44:22 -08:00
Yaowu Xu
6b223fcb58 Enable decoder to pass through color space info
This commit adds a field to vpx_image_t for indicating color space;
the field is also added to YUV_BUFFER_CONFIG. This allows the color
space information to pass through the decoder from the input stream
to the output buffer.

The commit also updates the compare_img() function to verify that the
color space matches, ensuring that the color space information is
correctly passed from encoder to decoder in compressed vp9 streams.

Change-Id: I412776ec83defd8a09d76759aeb057b8fa690371
2015-01-13 15:13:19 -08:00
Frank Galligan
74d40cd507 Add 64x variance Neon functions
Add optimized Neon functions of:
vp9_variance32x64
vp9_variance64x32
vp9_variance64x64

On Nexus 7 speed -5 and -6 saw about a 4% increase in perf.
Speeds -7 and -8 saw about a 6% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.

Change-Id: I5a81f13c9897eb927fa39662530f5524a0f768fa
2015-01-13 15:08:13 -08:00
Yaowu Xu
6f6fbf9175 Merge "Added plumbing for setting color space" 2015-01-13 09:20:13 -08:00
Yaowu Xu
fe3f21099f Merge "Fix comments and color format" 2015-01-11 14:01:36 -08:00
Yaowu Xu
ce52b0f8d3 Added plumbing for setting color space
Change-Id: If64052cc6e404abc8a64a889f42930d14fad21d3
2015-01-09 10:54:25 -08:00
Yaowu Xu
ecbca31a1d Fix comments and color format
Replaced "color space" with "color format" in comments where the color
sampling format is concerned, so as to distinguish it from the concept
defined in COLOR_SPACE.

Change-Id: I8c935034c166b24307a99352dab1686531276bb8
2015-01-09 10:36:43 -08:00
Paul Wilkins
ccffe318ff Merge "Use 64 bit to accumulate frame sse." 2015-01-09 06:05:11 -08:00
Jingning Han
ae537c151b Merge "Refactor mc reference block fetch in denoiser" 2015-01-08 17:56:53 -08:00
James Zern
4d6838627d Merge "vp9: add per-tile longjmp error handling" 2015-01-08 15:53:37 -08:00
James Zern
44b55dada8 Merge "vp9: fix -Wclobbered (longjmp + local variables)" 2015-01-08 15:53:02 -08:00
Jingning Han
a1daf009be Merge "Use lookup table to find pixel numbers in block" 2015-01-08 13:58:35 -08:00
Johann
00bbe342c2 Merge "Disable vp9 _8_ loopfilters" 2015-01-08 12:47:52 -08:00
Jingning Han
a0be730eae Refactor mc reference block fetch in denoiser
This commit refactors the motion compensated reference block fetch
process in denoiser. It skips the stage that generates motion
compensated reference block if denoiser decides to use copy block
mode. For high motion clips, this could speed up the denoising
process by about 10%.

Change-Id: I8ef4fa5fe766a8c4529119b9ec01faefb3d4ef53
2015-01-08 12:43:08 -08:00
Jingning Han
e3f0b19f3f Use lookup table to find pixel numbers in block
This could save one multiplication in each threshold function
called by the denoiser per block.

Change-Id: I35f437e09999f0a087180878ef7805f0d86e5819
2015-01-08 12:32:28 -08:00
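The idea in "Use lookup table to find pixel numbers in block" can be sketched as follows: instead of multiplying width by height each time, index a constant table by block size. The enum, table, and function names below are hypothetical and cover only four square sizes for brevity.

```c
#include <assert.h>

/* Hypothetical block-size enum; a real codec would list more sizes. */
typedef enum {
  BLK_8X8,
  BLK_16X16,
  BLK_32X32,
  BLK_64X64,
  BLK_SIZES
} blk_size_t;

/* Pixel count per block size, precomputed once at compile time. */
static const int kNumPixels[BLK_SIZES] = { 64, 256, 1024, 4096 };

/* Table lookup replacing a width * height multiplication. */
static int num_pixels(blk_size_t bs) {
  assert((int)bs >= 0 && bs < BLK_SIZES);
  return kNumPixels[bs];
}
```

A threshold function that needs the pixel count can then call `num_pixels(bs)` rather than recomputing the product for every block.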
Jingning Han
e535ad5067 Merge "Refactor denoiser frame buffer update" 2015-01-08 11:16:14 -08:00
Jingning Han
97dc782635 Merge "Initialize zeromv_sse and newmv_sse in vp9_pick_inter_mode" 2015-01-08 10:55:03 -08:00
Jingning Han
f1866a5792 Merge "Use vp9_convolve_copy in denoiser output" 2015-01-08 09:59:10 -08:00
hkuang
9130e7ad2e Merge "Remove unnecessary init_macroblockd." 2015-01-08 09:15:32 -08:00
Jingning Han
ea061a885d Refactor denoiser frame buffer update
Use frame buffer pointer swap instead of memcpy when possible.
Together, these two CLs make the denoiser over 10% faster when
running on vidyo1 720p at speed -6.

Change-Id: I64fe8a2422cafca6787a50c7f4dfb961191c0a9d
2015-01-07 18:33:13 -08:00
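The technique named in "Refactor denoiser frame buffer update", swapping buffer pointers instead of copying pixel data, can be sketched as below. The `frame_buf_t` struct and `swap_frame_buffers` function are hypothetical stand-ins, not the denoiser's actual types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame buffer descriptor: a pointer plus its size. */
typedef struct {
  uint8_t *data;
  size_t size;
} frame_buf_t;

/* Exchange two frame buffers in O(1) by swapping their descriptors.
 * No pixel data is copied, unlike a memcpy of `size` bytes. */
static void swap_frame_buffers(frame_buf_t *a, frame_buf_t *b) {
  frame_buf_t tmp;
  assert(a != NULL && b != NULL);
  tmp = *a;
  *a = *b;
  *b = tmp;
}
```

A swap is only a valid substitute for a copy when the previous contents of the destination are no longer needed afterwards, which matches the commit's "when possible" qualifier.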
Jingning Han
29a5deb40c Use vp9_convolve_copy in denoiser output
Replace copy_block with vp9_convolve_copy to improve speed.

Change-Id: I3a08c4d01dff2253b6ee573efd02f65ccdc1b5a5
2015-01-07 18:23:17 -08:00
Zoe Liu
4cf636a60e Removed redundant local variables in the forward hybrid transforms.
Change-Id: I60f7ccbbc8dc624134e325bdce6042bc183075b6
2015-01-07 16:38:29 -08:00
Yaowu Xu
01eec75858 Merge "Refactor calculation of tile_cols" 2015-01-07 16:24:57 -08:00
Jingning Han
08055b639a Merge "Always check and free denoiser buffer memory space" 2015-01-07 15:54:06 -08:00
Jingning Han
e42b3ee765 Initialize zeromv_sse and newmv_sse in vp9_pick_inter_mode
These two parameters are used to control the denoiser cut-off
thresholds. They should be properly initialized when starting
mode search of a given block.

Change-Id: Iba8a25487026a0dbe0d350c347d7e4e4e237b637
2015-01-07 15:32:41 -08:00
JackyChen
1883c940b9 Merge "Use qdiff to adjust the threshold of sad and variance in MFQE." 2015-01-07 14:57:46 -08:00
Yaowu Xu
e9cf9b7dfe Refactor calculation of tile_cols
Change-Id: I2c38ea2bcf6d221a0b6b2fb9be4cebbee21006a3
2015-01-07 14:28:59 -08:00
Jingning Han
b208439b5a Merge "Fix best ref frame rd cost update in sub8x8 non-RD mode search" 2015-01-07 14:06:55 -08:00
Jingning Han
3e41563f33 Merge "Format fix in vp9_pick_inter_mode_sub8x8" 2015-01-07 14:06:06 -08:00
Jingning Han
802b798f67 Fix best ref frame rd cost update in sub8x8 non-RD mode search
This fixes an issue where sub8x8 inter blocks always ended up
with GOLDEN_FRAME.

Change-Id: Id0c25cbb9c2003f43b4dff8fb1572512c246e077
2015-01-07 12:00:02 -08:00
Jingning Han
c3fd9bbdaf Format fix in vp9_pick_inter_mode_sub8x8
Replace ref_frame++ with ++ref_frame.

Change-Id: Ic39793081156c314bf1b85d5ab76def97f3bff52
2015-01-07 11:50:36 -08:00
Jingning Han
59f29f5e3f Merge "Fix denoiser chroma component initialization" 2015-01-07 11:30:15 -08:00
Jingning Han
9a0e694182 Merge "Skip duplicate denoiser frame buffer allocation" 2015-01-07 11:30:07 -08:00
Johann
d12f1f907d Merge "Rearrange loopfilter functions" 2015-01-07 11:07:54 -08:00