The forward 32x32 2D-DCT functions are now placed in the vpx_dsp folder.
The vp9_dct.h file is effectively unused now.
Change-Id: Ie7946b6fdd784b8e91496242337bc9002c75c281
This commit factors the 4x4, 8x8, and 16x16 2D-DCT forward
transform operations into the vpx_dsp folder.
Change-Id: I084b117b79c0925edcbcabb93f62b9f4bf8dbe7d
The following quantization functions were moved:
vp9_quantize_b
vp9_quantize_b_32x32
vp9_highbd_quantize_b
vp9_highbd_quantize_b_32x32
vp9_quantize_dc
vp9_quantize_dc_32x32
vp9_highbd_quantize_dc
vp9_highbd_quantize_dc_32x32
The move allows these functions to be shared by multiple codecs.
Change-Id: Id8ab939f283353cdd07bd930d47db3d932a5d87f
This reverts commit a42df86c03.
This change causes MSA/VP9SubpelVarianceTest.Ref and
MSA/VP9SubpelVarianceTest.ExtremeRef failures under
mips32r5el-msa-linux-gnu and mips64r6el-msa-linux-gnu.
Change-Id: I40b71a0b774eaeb31f66f795733f95cf360909f7
This file shouldn't be built directly; it is included in vp9_dct_sse2.c
to create a non-high-bitdepth and a high-bitdepth version.
Silences missing-prototype warnings for the unused FDCT* functions.
Change-Id: Ide6ff8c24ab31bdb0f833260505ae33660a1ad5b
This file shouldn't be built directly; it is included in vp9_dct_sse2.c
to create a non-high-bitdepth and a high-bitdepth version.
Silences missing-prototype warnings for the unused FDCT32x32* functions.
Change-Id: I0e38f16dae5ea1728de184ee2c89287d48675c51
This file shouldn't be built directly; it is included in vp9_dct_avx2.c
to create a non-high-bitdepth and a high-bitdepth version.
Silences missing-prototype warnings for the unused FDCT32x32* functions.
Change-Id: I4c19935c0e035b393be513bde735e9a78064a494
Create a new component, vpx_dsp, for code that can be shared
between codecs. Move the SAD code into the component.
This reduces the size of vpxenc/dec by 36k on x86_64 builds.
Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a
PSNR HVS is a human-visual-system-weighted version of SNR that has
gained some popularity in academia and apparently better matches
MOS testing.
This code is borrowed from the Daala Project but uses our FDCT code.
Change-Id: Idd10fbc93129f7f4734946f6009f87d0f44cd2d7
Exclude files that only contain functions for non-high-bitdepth builds.
This removes some warnings related to missing prototypes.
Change-Id: Ic6642998c46a7b808c6c53b2f9c34bcd4d037abe
Simple skin detection, from vp8; works reasonably well on most of the
RTC clips, but can occasionally miss.
Added a debug flag to write out the skin map over the source input.
Change-Id: I2caea7592f1c459047aac46627eeb24a94946464
On Nexus 7, speed -6 saw a ~30% increase in performance.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
BUG=https://code.google.com/p/webm/issues/detail?id=908
Change-Id: Id12af7d1883243c23e6692e898aea82299633d58
On Nexus 7, speeds -5, -6, -7, and -8 saw about a 1% increase
in performance for 480p and about a 1.5% increase in performance
for 720p.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
Change-Id: Ibf17ebfd952a6aec941719bd8306df8ec4574bee
Currently, VP9 supports column-tile encoding, which allows a frame
to be encoded in multiple column tiles independently. The number of
column tiles is set by the encoder option "--tile-columns". This
provides a way to encode a frame in parallel.
Building on the previous set of patches, this patch implements the
tile-based multi-threaded encoder. Each thread processes one or more
tiles.
Usage:
For HD clips:
--tile-columns=2 --threads=1/2/3/4
With 4 threads, tests showed that the encoder achieved a
2.3X - 2.5X speedup at good-quality speed 3, and a 2X speedup at
realtime speed 5.
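
As a rough illustration of "each thread processes one or more tiles",
the sketch below distributes tile indices across workers in round-robin
order. The round-robin policy and the helper name tiles_for_worker are
assumptions made for this example; they are not taken from the patch
itself.

```c
#include <assert.h>

/* Hypothetical helper (not part of the patch): lists the tile indices
 * that worker `worker` would process when `num_tiles` tiles are handed
 * out round-robin to `num_workers` workers. Writes at most `out_cap`
 * indices into `out` and returns how many tiles the worker owns. */
static int tiles_for_worker(int worker, int num_workers, int num_tiles,
                            int *out, int out_cap) {
  int count = 0;
  int t;
  assert(num_workers > 0);
  assert(worker >= 0 && worker < num_workers);
  /* Worker w takes tiles w, w + num_workers, w + 2 * num_workers, ... */
  for (t = worker; t < num_tiles; t += num_workers) {
    if (count < out_cap) out[count] = t;
    ++count;
  }
  return count;
}
```

Under this scheme, with 4 tiles and 3 workers, worker 0 takes tiles 0
and 3 while workers 1 and 2 take one tile each, so no tile is left
unassigned when the thread count does not divide the tile count.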
Change-Id: Ied987f8f2618b1283a8643ad255e88341733c9d4
Also removes some spurious changes in common/vp9_blockd.h which
were introduced by a rebase issue between nextgen and master branches.
Change-Id: If359f0e9a71bca9c2ba685a87a355873536bb282
(cherry picked from commit 005d80cd05)
(cherry picked from commit 08d2f54800)
(cherry picked from commit 4230c2306c)
This change is made in preparation for a subsequent patch which adds
acceleration for the highbitdepth transform functions.
The highbitdepth transform functions attempt to use 16/32-bit SSE
instructions where possible, but fall back to the C implementations
if potential overflow is detected. For this reason the DCT routines
are made global so they can be called from the acceleration functions
in the subsequent patch.
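
The fallback idea described above can be sketched in plain C. This is
only an illustration under stated assumptions: a two-point butterfly
stands in for the real transform, butterfly_narrow stands in for the
16-bit SSE path, and all function names are hypothetical rather than
taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Reference path: 32-bit arithmetic, safe for any 16-bit-range input
 * and beyond (stands in for the global C DCT routine). */
static void butterfly_c(const int32_t *in, int32_t *out) {
  out[0] = in[0] + in[1];
  out[1] = in[0] - in[1];
}

/* Stand-in for the accelerated path: results must fit in 16-bit lanes. */
static void butterfly_narrow(const int32_t *in, int32_t *out) {
  const int16_t a = (int16_t)in[0];
  const int16_t b = (int16_t)in[1];
  out[0] = (int16_t)(a + b);
  out[1] = (int16_t)(a - b);
}

/* Conservative check: a sum or difference of two values stays within
 * int16_t range if each magnitude is at most INT16_MAX / 2. */
static int may_overflow16(const int32_t *in, int n) {
  int i;
  assert(n > 0);
  for (i = 0; i < n; ++i) {
    if (in[i] > INT16_MAX / 2 || in[i] < -(INT16_MAX / 2)) return 1;
  }
  return 0;
}

/* Dispatcher: returns 1 if the narrow path was used, 0 if it fell back
 * to the reference C path. */
static int butterfly(const int32_t *in, int32_t *out) {
  if (may_overflow16(in, 2)) {
    butterfly_c(in, out);
    return 0;
  }
  butterfly_narrow(in, out);
  return 1;
}
```

The dispatcher can only fall back if the reference routine is callable
from the accelerated code, which is the reason the message gives for
making the DCT routines global.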
Change-Id: Ia921f191bf6936ccba4f13e8461624b120c1f665
(cherry picked from commit 454342d4e7)