2757 Commits

Author SHA1 Message Date
James Yu
08e38f06db VP8 for ARMv8 by using NEON intrinsics 14
Add sixtappredict_neon.c
- vp8_sixtap_predict16x16_neon
- vp8_sixtap_predict8x8_neon
- vp8_sixtap_predict8x4_neon
- vp8_sixtap_predict4x4_neon

Change-Id: I3b02fce48ae2e6c6099041ba5ddd7b090f1463b9
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 19:07:12 -07:00
James Yu
18e9caad47 VP8 for ARMv8 by using NEON intrinsics 13
Add shortidct4x4llm_neon.c
- vp8_short_idct4x4llm_neon

Change-Id: I5a734bbffca8dacf8633c2b0ff07b98aa2f438ba
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 19:07:05 -07:00
Johann
140262d39f Merge "VP8 for ARMv8 by using NEON intrinsics 12" 2014-05-03 19:06:55 -07:00
Johann
0b12a40296 Merge "VP8 for ARMv8 by using NEON intrinsics 11" 2014-05-03 19:05:26 -07:00
Johann
8c7e798c9b Merge "VP8 for ARMv8 by using NEON intrinsics 10" 2014-05-03 19:04:57 -07:00
Johann
c1ba686064 Merge "VP8 for ARMv8 by using NEON intrinsics 09" 2014-05-03 19:04:18 -07:00
James Yu
feaf766bd0 VP8 for ARMv8 by using NEON intrinsics 12
Add sad_neon.c
- vp8_sad16x16_neon
- vp8_sad16x8_neon
- vp8_sad8x8_neon
- vp8_sad8x16_neon
- vp8_sad4x4_neon

Change-Id: I08eaae49ec03fb91b394354660a5df0367cea311
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 04:54:39 -07:00
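
For readers unfamiliar with the SAD kernels added in the sad_neon.c commit above, the following is a minimal scalar sketch of the quantity they compute: the sum of absolute differences between a source block and a reference block. The function name `sad_block_c` and its parameter list are hypothetical, chosen for illustration; this is not the code from the commit.

```c
#include <stdlib.h>

/* Hypothetical scalar reference (not from libvpx): sum of absolute
 * differences over a width x height block. Each pointer advances by its
 * own stride after every row. */
unsigned int sad_block_c(const unsigned char *src, int src_stride,
                         const unsigned char *ref, int ref_stride,
                         int width, int height) {
  unsigned int sad = 0;
  for (int r = 0; r < height; ++r) {
    for (int c = 0; c < width; ++c) {
      sad += (unsigned int)abs(src[c] - ref[c]);
    }
    src += src_stride;
    ref += ref_stride;
  }
  return sad;
}
```

A 16x16 variant would call this with `width = 16, height = 16`; a NEON version processes many pixels per instruction but should return the same total.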
James Yu
4a8336fa9d VP8 for ARMv8 by using NEON intrinsics 11
Add mbloopfilter_neon.c
- vp8_mbloop_filter_horizontal_edge_y_neon
- vp8_mbloop_filter_horizontal_edge_uv_neon
- vp8_mbloop_filter_vertical_edge_y_neon
- vp8_mbloop_filter_vertical_edge_uv_neon

Change-Id: Ia9084e0892d4d49412d9cf2b165a0f719f2382d7
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 04:54:33 -07:00
Johann
1d65b3be2a Merge "Remove asm_offsets dependency in quantize_b_ssse3" 2014-05-03 04:21:16 -07:00
James Yu
c500fc22c1 VP8 for ARMv8 by using NEON intrinsics 10
Add loopfiltersimpleverticaledge_neon.c
- vp8_loop_filter_bvs_neon
- vp8_loop_filter_mbvs_neon

Change-Id: I7cf0a161ad4ae37c881b94cc0122f895d3baae79
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 04:11:00 -07:00
James Yu
55c95f2d2c VP8 for ARMv8 by using NEON intrinsics 09
Add loopfiltersimplehorizontaledge_neon.c
- vp8_loop_filter_bhs_neon
- vp8_loop_filter_mbhs_neon

Change-Id: I77f9721b20585da8bf3869a3850ff0ae4b4bfeea
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 04:10:45 -07:00
Johann
cf2262c44c Merge "VP8 for ARMv8 by using NEON intrinsics 08" 2014-05-03 04:10:18 -07:00
Johann
fe437bc8f8 Merge "VP8 for ARMv8 by using NEON intrinsics 07" 2014-05-03 04:08:54 -07:00
Scott LaVarnway
e516a42527 Remove struct params from vp8_denoiser_filter
This eliminates the asm_offsets dependency for future
all-assembly versions of this function.

Change-Id: I3227073ecfcb8ee6e593934fab941e9081abdda0
2014-05-02 10:31:52 -07:00
Scott LaVarnway
dea687f733 Merge "Improved intrinsic version of vp8_denoiser_filter_neon" 2014-05-02 09:59:59 -07:00
James Yu
a5d79f43b9 VP8 for ARMv8 by using NEON intrinsics 08
Add loopfilter_neon.c
- vp8_loop_filter_horizontal_edge_y_neon
- vp8_loop_filter_horizontal_edge_uv_neon
- vp8_loop_filter_vertical_edge_y_neon
- vp8_loop_filter_vertical_edge_uv_neon

Change-Id: I50b57dedabd42d2a3c183c1738cc5346f0e71ed8
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-02 09:32:11 -07:00
James Yu
930557be10 VP8 for ARMv8 by using NEON intrinsics 07
Add iwalsh_neon.c
- vp8_short_inv_walsh4x4_neon

Change-Id: I8beda6ce11ad8ce9e80cc0a38d40161938359162
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-02 09:24:54 -07:00
Johann
570d43c020 Remove asm_offsets dependency in quantize_b_ssse3
Replace it with some intrinsic code and inline assembly.

Change-Id: I81b4df146db3d01039059be7dae31083e2943b97
2014-05-02 08:00:16 -07:00
James Yu
81ad047ee5 VP8 for ARMv8 by using NEON intrinsics 06
Add idct_dequant_full_2x_neon.c
- idct_dequant_full_2x_neon

==== Summary of applying the VP8 decode patch series ====
Benchmark on Samsung Chromebook, Cortex-A15, 1.7GHz, Dual core
Toolchain: linaro-1.13.1-4.8-2014.01
Compile argument: CROSS=arm-linux-gnueabihf- ../libvpx/configure
                     --target=armv7-linux-gcc --prefix=$HOME/out
                     --enable-shared --cpu=cortex-a7
Test argument: vpxdec --summary --noblit ./tears_of_steel_1080p.webm

NEON assembly   46.68 (fps)
Apply patch 06  46.65, -0.03
Apply patch 07  46.86, +0.21
Apply patch 08  46.58, -0.28
Apply patch 09  46.57, -0.01
Apply patch 10  46.51, -0.06
Apply patch 11  46.13, -0.38
Apply patch 12  45.42, -0.71
Apply patch 13  46.06, +0.64
Apply patch 14  45.19, -0.87
Apply patch 15  45.93, +0.74
Apply patch 16  45.48, -0.45
Apply patch 17  45.84, +0.36
Apply patch 18  45.91, +0.07  <= With all NEON intrinsics patches
                 Total -0.77 fps, 1.65% performance regression

Change-Id: I77bfc9eaccfb97b8d401e949ceff8795e26ca6b7
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-02 11:57:47 +08:00
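
The totals quoted in the benchmark summary above can be rechecked with a few lines of C. This is only a sketch of the arithmetic: the fps values are copied from the table, while the array and function names are hypothetical.

```c
#include <stddef.h>

/* Frame rates (fps) copied from the table above: the NEON assembly
 * baseline first, then the result after applying patches 06 through 18. */
static const double kFps[] = {46.68, 46.65, 46.86, 46.58, 46.57,
                              46.51, 46.13, 45.42, 46.06, 45.19,
                              45.93, 45.48, 45.84, 45.91};

/* Net change in fps between the first and last measurement. */
double total_fps_change(const double *fps, size_t n) {
  return fps[n - 1] - fps[0];
}

/* Regression as a percentage of the baseline (positive means slower). */
double regression_percent(const double *fps, size_t n) {
  return 100.0 * (fps[0] - fps[n - 1]) / fps[0];
}
```

With the 14 values above, the net change is about -0.77 fps, which is about 1.65% of the 46.68 fps baseline, in agreement with the summary line.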
Scott LaVarnway
ff209de82b Improved intrinsic version of vp8_denoiser_filter_neon
Used horizontal add instructions instead of adding
byte lanes.  The encoder performance improved by
~4% for the test clip used.

Change-Id: Iaddd10403fcffb5b3f53b1f591ab2fe0ff002c08
2014-04-30 06:58:16 -07:00
Yunqing Wang
096eaba728 Remove VP8 save_reg_neon function
This patch did a cleanup following the commit "Save NEON registers
in VP8 NEON functions". The pushing/popping of callee-saved NEON
registers was moved into individual NEON functions. Therefore,
we don't need to save those registers at the beginning of the codec.
The related code was removed.

Change-Id: I5648166514fc9beffb780aa138495597731f49ea
2014-04-29 16:13:24 -07:00
Yunqing Wang
33df6d1fc1 Save NEON registers in VP8 NEON functions
Recent compilers can generate optimized code that uses NEON registers
for various operations besides floating-point operations. Therefore,
saving only the callee-saved registers d8 - d15 at the beginning of the
encoder/decoder is no longer enough. This patch added register saving
code in VP8 NEON functions that use those registers.

Change-Id: Ie9e44f5188cf410990c8aaaac68faceee9dffd31
2014-04-28 14:51:53 -07:00
Dmitry Kovalev
947748ed19 Fixing constant value used to calculate frame pts and duration.
Change-Id: Idbd017d1b42f7fdc7b1ce4e00370f5229800abd7
2014-04-25 17:39:48 -07:00
Joey Parrish
18c08607e0 Add VPXD_SET_DECRYPTOR support to the VP9 decoder.
Change-Id: I88f86c8ff9af34e0b6531028b691921b54c2fc48
2014-04-23 16:11:54 -07:00
Yunqing Wang
1893122e34 Fix dr memory VP8 encode/decode errors
This patch fixed errors reported in Issue 746: "dr memory VP8
encode errors" and Issue 745: "dr memory VP8 decode errors".
The "UNINITIALIZED READ" errors were fixed in x86 assembly
code. The list of files fixed is
vp8_intra_pred_uv_tm_sse2
vp8_intra_pred_uv_tm_ssse3

vp8_intra_pred_uv_ho_mmx2
vp8_intra_pred_uv_ho_ssse3

vp8_intra_pred_y_tm_sse2
vp8_intra_pred_y_tm_ssse3

vp8_intra_pred_y_ho_sse2

Change-Id: Ib6df7bf1d442077fe534edfd90e50ad16fadacdd
2014-04-21 17:04:05 -07:00
Yaowu Xu
99230aeb05 Prevent reading of uninitialized value
This commit added a check of the reference frame to make sure that the
pre buffer pointers are initialized only when necessary, and set them
to 0 if the ref frame is intra, since those buffers should never be used.

Change-Id: Ieb474fcd9feb759f02e2f9c282b7348a8fa31117
2014-04-16 13:00:13 -07:00
Adrian Grange
f7bd1274e3 Enable vpxenc to specify internal coded frame size
Added command line flags "resize-width" & "resize-height"
to allow the user to specify the frame size to encode at.

These two flags are ignored if the "resize-allowed" switch
is not set to 1.

All frames in the clip are then encoded at this size, which
must be smaller than the raw frame size.

Change-Id: I3d64bd9303d5c0bd678461a866a1ea621700d744
2014-04-14 10:54:19 -07:00
Dmitry Kovalev
8503d72e6a Removing legacy XMA code from vp8.
Change-Id: Ib9f7fd3fd56e304e5f587f790c97ac34a3077265
2014-04-10 23:30:17 -07:00
Sergey Ulanov
409f8da265 Fix onyx_if.c to not redefine M_LOG2_E if it's already defined.
This fixes a warning when compiling libvpx for PNaCl. PNaCl's version
of math.h defines M_LOG2_E.

Change-Id: Iba9450441538e9f82447ad2936bea94d21bafdf1
2014-04-10 08:54:30 -07:00
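
The kind of guard described in the M_LOG2_E commit above can be sketched as follows. This is an illustration under stated assumptions, not the actual change to onyx_if.c: the constant value (ln 2) and the helper `bits_to_nats` are assumptions made here for the example.

```c
#include <math.h>

/* Define M_LOG2_E only if the platform's math.h has not already done so,
 * which avoids a macro-redefinition warning on toolchains that provide it.
 * The value (ln 2) is an assumption for this sketch. */
#ifndef M_LOG2_E
#define M_LOG2_E 0.693147180559945309417
#endif

/* Hypothetical helper: convert a quantity in bits to nats. */
double bits_to_nats(double bits) {
  return bits * M_LOG2_E;
}
```

The `#ifndef` test runs after `math.h` is included, so a platform definition takes precedence when one exists.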
Jan Gerber
9848d67bb3 Remove an unused typedef
Change-Id: Ie0eb9ac4529db00a322511e5241a59b501c289b7
2014-04-04 08:47:52 -07:00
Yunqing Wang
c8773416fb Fix uninitialized read in postprocessing
This patch fixed WebRTC Issue 3020: "Uninit error at
vp8_mbpost_proc_down_xmm". The first 8 values in d were not initialized,
but were accessed. This patch fixed the C code as well as the MMX and
SSE2 code.

Change-Id: Iaa5b41a4ed3bea971b15fb826ce34b7ab4e36fb1
2014-03-24 14:54:25 -07:00
James Zern
7ae5954d35 Merge "tokenize: quiet -Warray-bounds warnings" 2014-03-18 15:09:41 -07:00
James Zern
aad7b55b40 Merge "rdopt: quiet -Warray-bounds warnings" 2014-03-18 15:08:54 -07:00
James Zern
90de3b0124 tokenize: quiet -Warray-bounds warnings
eob is limited by GetCoeffs

Change-Id: Ie5c0d024796fe6c9b2db0374892544e421bd5d09
2014-03-15 10:39:23 -07:00
James Zern
268f32db21 rdopt: quiet -Warray-bounds warnings
eob is limited by GetCoeffs

Change-Id: Id48a92e600375a1d4fb956757c93c91ebb5df59a
2014-03-15 10:37:49 -07:00
James Zern
2a19c96362 onyx_if: quiet -Warray-bounds warnings
'number_of_layers' is range checked before assignment from the user
config.

Change-Id: Idefdaceb8736f126fa7c647da2b047dafb56ea52
2014-03-15 10:36:27 -07:00
James Zern
805078a1bf build: convert rtcd.sh to perl
significantly speeds up file generation.

the goal of this change is to convert rtcd.sh to perl as directly as
possible to allow for simple comparison. future changes can make it more
perl-like.

---
Linux
    [CREATE] vpx_scale_rtcd.h
real    0m0.485s ->    0m0.022s
    [CREATE] vp8_rtcd.h
real    0m4.619s ->    0m0.060s
    [CREATE] vp9_rtcd.h
real    0m10.102s ->    0m0.087s

Windows
    [CREATE] vpx_scale_rtcd.h
real    0m8.360s ->    0m0.080s
    [CREATE] vp8_rtcd.h
real    1m8.083s ->    0m0.160s
    [CREATE] vp9_rtcd.h
real    2m6.489s ->    0m0.233s

Change-Id: Idfb71188206c91237d6a3c3a81dfe00d103f11ee
2014-03-03 14:47:11 -08:00
Minghai Shang
3a8deeb8b6 Merge "[svc] Add target bitrate settings for each layers." 2014-02-27 10:51:26 -08:00
Dmitry Kovalev
7d5bffc452 Adding vpx_sse_to_psnr() function.
Removing all copies of identical vp8_mse2psnr/vp9_mse2psnr functions.
Using vpx_sse_to_psnr() instead in all places.

Change-Id: I15beef9834d43d8fc8a8a7a2d1fc5de3d658fed8
2014-02-26 16:21:12 -08:00
Minghai Shang
8c196b27b3 [svc] Add target bitrate settings for each layers.
Change-Id: Ia7677fb436667bc4f76db71f65e4784f433f7826
2014-02-26 13:30:50 -08:00
James Yu
fb5d281bb6 VP8 for ARMv8 by using NEON intrinsics 05
Add dequantizeb_neon.c
- vp8_dequantize_b_loop_neon

vpxdec  --summary --noblit ../videos/tears_of_steel_1080p.webm
Before => After, 13.25 => 13.23 (fps)

Change-Id: Iebe3b0c6ed2359c778b0570763c5681ae25fef0c
Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-26 10:16:00 +08:00
James Yu
28b2f82f97 VP8 for ARMv8 by using NEON intrinsics 04
Add dequant_idct_neon.c
- vp8_dequant_idct_add_neon

vpxdec  --summary --noblit ../videos/tears_of_steel_1080p.webm
Before => After, 13.25 => 13.22 (fps)

Change-Id: Id48f39e1da58dd3d8d37658e94989411997f4f7c
Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-26 09:59:23 +08:00
James Yu
d749ab6221 VP8 for ARMv8 by using NEON intrinsics 03
Add dc_only_idct_add_neon.c
- vp8_dc_only_idct_add_neon

vpxdec  --summary --noblit ../videos/tears_of_steel_1080p.webm
Before => After, 13.25 => 13.24 (fps)

Change-Id: I5e9e277ec3a3ca67e13c8cc4c324a6fbe8a897fc
Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-26 09:28:29 +08:00
James Yu
300a3bfc73 VP8 for ARMv8 by using NEON intrinsics 02
Add copymem_neon.c
- vp8_copy_mem16x16_neon
- vp8_copy_mem8x8_neon
- vp8_copy_mem8x4_neon

vpxdec  --summary --noblit ../videos/tears_of_steel_1080p.webm
Before => After, 13.25 => 13.25 (fps)

Change-Id: Ib956b5a20522ff57dc8a580bf0aef7b252bddba6
Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-23 22:56:53 +08:00
Adrian Grange
b7be30eb36 Cleanup some comments.
Change-Id: I568861ba1d43620865ad9a98a97eef37a51fd856
2014-02-14 15:05:30 -08:00
Yaowu Xu
ecf392a155 Merge "minor spelling cleanup in comments" 2014-02-14 14:29:35 -08:00
Frank Galligan
a4f30a5023 Add VP9 decoder support for external frame buffers
Added support for external frame buffers to libvpx's VP9 decoder.
If the external frame buffer functions are set then libvpx will
call the get function whenever it needs a new frame buffer to
decode a frame into. And it will call the release function
whenever there are no more references to that buffer.

Change-Id: Id2934d005f606af6e052fb6db0d5b7c02f567522
2014-02-13 13:14:19 -08:00
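
The get/release pattern described in the external frame buffer commit above can be sketched with a small application-side buffer pool. Every type and function name below is hypothetical and deliberately differs from libvpx's real API; the sketch only shows the contract the commit message describes: a "get" call whenever a new buffer is needed, and a "release" call once nothing references the buffer any more.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_POOL_SIZE 4

/* Hypothetical pool entry; not libvpx's frame buffer type. */
typedef struct {
  unsigned char *data;
  size_t size;
  int in_use;
} SketchFrameBuffer;

typedef struct {
  SketchFrameBuffer slots[SKETCH_POOL_SIZE];
} SketchPool;

/* "get" callback: hand out a free slot of at least min_size bytes,
 * growing its storage if needed. Returns 0 on success, -1 on failure. */
int sketch_get_frame_buffer(SketchPool *pool, size_t min_size,
                            SketchFrameBuffer **out) {
  for (int i = 0; i < SKETCH_POOL_SIZE; ++i) {
    SketchFrameBuffer *fb = &pool->slots[i];
    if (fb->in_use) continue;
    if (fb->size < min_size) {
      unsigned char *grown = realloc(fb->data, min_size);
      if (grown == NULL) return -1;
      fb->data = grown;
      fb->size = min_size;
    }
    fb->in_use = 1;
    *out = fb;
    return 0;
  }
  return -1; /* every slot is still referenced */
}

/* "release" callback: mark the slot reusable; storage is kept for reuse. */
int sketch_release_frame_buffer(SketchFrameBuffer *fb) {
  fb->in_use = 0;
  return 0;
}
```

Keeping the allocation alive across release calls lets the application reuse memory between frames, which is one plausible reason to supply external buffers in the first place.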
Andrew Russell
549c31f8ae minor spelling cleanup in comments
Change-Id: Ia91c6c406273345b08505097ffe1af3896980f06
2014-02-12 16:32:51 -08:00
Tom Finegan
9ff89d9446 vp8/encoder: Silence MSVC warnings in firstpass.c.
Added some casts to int to silence MSVC warnings.

Change-Id: I72481ec2abd12110cf87a3d0da7a1cbe9ef2f47c
2014-02-06 17:02:02 -08:00
Adrian Grange
2554d5731a Remove delete_first_pass_file.
Change-Id: If46d93fb1c26e4629af1f492bfad7a82b4c4f778
2014-02-05 11:31:44 -08:00