The previous patch "Fix issues in 32bit PIC enabled build" fixed
x86inc.asm for macho32. Now we can enable use_x86inc while
building libvpx for a 32-bit PIC-enabled Darwin target, which makes
the encoder a lot faster (>2x) in this case by turning on the
existing optimizations.
Change-Id: I5f5c7add428d73f50c935c48d0a70aed2b1eb7af
gives a better summary of what is enabled / disabled outside of the
automatic toolchain options.
fixes issue #936
Change-Id: I1bf27593a5512713aab1473cb606c58cf3084d62
1. reduce the size of temporary arrays on the stack
2. avoid build_tree_distribution for tx size that is not used at all.
Change-Id: I0f8d7124e16a3789d3c15ad24cf02c1c12789e2c
This patch fixes issue 924:
https://code.google.com/p/webm/issues/detail?id=924
The SECTION_RODATA macro was modified to support macho32 format.
The sub-pixel functions were modified to pass in 2 more parameters
to handle the global offsets for PIC build.
Change-Id: I3bfcd336bcae945edf300bca4ab40376a2628cd4
On Nexus 7 speed -6 saw ~18% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
BUG=https://code.google.com/p/webm/issues/detail?id=908
Change-Id: I70ccdea0326750552ed946fb004507d6efe02d5c
On Nexus 7 speed -6 saw ~15% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
BUG=https://code.google.com/p/webm/issues/detail?id=908
Change-Id: I4b2006b644c488f42bf06d8a22ef0e6120a96bf9
On Nexus 7 speed -6 saw ~30% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
BUG=https://code.google.com/p/webm/issues/detail?id=908
Change-Id: Id12af7d1883243c23e6692e898aea82299633d58
The current method doesn't work with Xcode 4 and up, since they no
longer have a $DEVELOPER_DIR/SDKs directory. Using xcrun and xcodebuild
works all the way back to Xcode 3 on OS X 10.6 Snow Leopard, if not
earlier.
Change-Id: I7126f2fb4a8f1d6e46f921e70bbd090f00ce3d36
Floating point is used in vp9_convert_qindex_to_q(), so the unit
test ActiveMapTest would sometimes hit a run-time error without a proper
call to clear_system_state to reset the register state.
Change-Id: I181e9395148c44a6ca8b97d6e109bd4a152143c6
Add distortion threshold condition to refresh state of a coding block,
and allow for qp adjustment also for some intra modes and non-zero motion modes.
Also some code cleanup (remove unused variables/code).
Change-Id: I735fa2b28bc64f60e0323976b82510577b074203
Currently disabled by default: enabled using
#define GROUP_ADAPTIVE_MAXQ
In this patch the active max Q is adjusted for each GF
group based on the vbr bit allocation and raw first pass
group error.
This will tend to give a lower q for easy sections
and a higher value for very hard sections. As such it is
expected to improve quality in some of the easier
sections where quality issues have been reported.
This change tends to hurt overall psnr but help
average psnr. SSIM also shows a small gain.
Average results for the derf, yt, std-hd and yt-hd test sets were
as follows (% change for average psnr, overall psnr and ssim):
derf    +0.291, -0.252, -0.021
yt      +6.466, -1.436, +0.552
std-hd  +0.490, +0.014, +0.380
yt-hd   +5.565, -1.573, +0.099
Change-Id: Icc015499cebbf2a45054a05e8e31f3dfb43f944a
On Nexus 7 speed -5 got ~2%, -6 got ~15%, -7 and -8 got ~30%
increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
Change-Id: I83246d63b96674d170098a572fa4fe28a05aaf51
the result should have both bits set; previously this was converted from
webp incorrectly and resulted in a boolean check...
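The difference between the two checks can be sketched as follows. This is a minimal illustration only: the bit values and the names FEATURE_BIT_A, FEATURE_BIT_B, any_bit_set and both_bits_set are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical flag bits, chosen only for illustration. */
#define FEATURE_BIT_A 0x2u
#define FEATURE_BIT_B 0x4u
#define FEATURE_MASK (FEATURE_BIT_A | FEATURE_BIT_B)

/* Boolean-style check: true as soon as either bit is set. */
static int any_bit_set(uint32_t value) {
  return (value & FEATURE_MASK) != 0;
}

/* Stricter check: true only when both bits are set. */
static int both_bits_set(uint32_t value) {
  return (value & FEATURE_MASK) == FEATURE_MASK;
}
```

With only one of the two bits set, any_bit_set() reports success while both_bits_set() does not, which is the kind of mismatch described above.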
Change-Id: I2a7c7f2b491945f3a536ab4fca02247eccc892b8
This commit replaces an integer divide with a table-lookup. It is
to improve decoding speed, and at the same time, to reduce possible
complications with a bug in AMD Family 12h processors:
"665 Integer Divide Instruction May Cause Unpredictable Behavior"
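As a rough sketch of the general technique (not the code in this commit), a divide by a fixed small divisor over a bounded input range can be replaced by a precomputed table. The divisor 3, the range 0..15, and the names kDiv3Table and div3_lookup are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical input range: callers must pass 0 <= n < DIV3_TABLE_SIZE. */
#define DIV3_TABLE_SIZE 16

/* Precomputed quotients n / 3 for n = 0..15. */
static const uint8_t kDiv3Table[DIV3_TABLE_SIZE] = {
  0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5
};

/* Returns n / 3 without executing an integer divide instruction. */
static int div3_lookup(int n) {
  return kDiv3Table[n];
}
```

The table costs a few bytes of read-only data and turns the divide into a single indexed load, which is why this kind of change can help speed as well as sidestep the erratum.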
Change-Id: I678b707a538798a923850bac467e66e847e6def7
In frame parallel decode, the libvpx decoder decodes several frames on all
CPUs in parallel. Unless it is flushed, it only returns a frame when all
the CPUs are busy. When it is flushed, it returns all the frames in the
decoder. Compared with the current serial decode mode, in which the libvpx
decoder is idle between decode calls, the frame parallel decoder stays busy
between decode calls. VP9 frame parallel decode is >30% faster than serial
decode with tile parallel threading, which will make it easier for devices
to play 1080P VP9 videos.
* frame-parallel:
Add error handling for frame parallel decode and unit test for that.
Fix a bug in frame parallel decode and add a unit test for that.
Add two test vectors to test frame parallel decode.
Add key frame seeking to webmdec and webm_video_source.
Implement frame parallel decode for VP9.
Increase the thread test range to cover 5, 6, 7, 8 threads.
Fix a bug in adding frame parallel unit test.
Add VP9 frame-parallel unit test.
Manually pick "Make the api behavior conform to api spec." from master branch.
Move vp9_dec_build_inter_predictors_* to decoder folder.
Add segmentation map array for current and last frame segmentation.
Include the right header for VP9 worker thread.
Move vp9_thread.* to common.
ctrl_get_reference does not need user_priv.
Separate the frame buffers from VP9 encoder/decoder structure.
Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:"""
Conflicts:
test/codec_factory.h
test/decode_test_driver.cc
test/decode_test_driver.h
test/invalid_file_test.cc
test/test-data.sha1
test/test.mk
test/test_vectors.cc
vp8/vp8_dx_iface.c
vp9/common/vp9_alloccommon.c
vp9/common/vp9_entropymode.c
vp9/common/vp9_loopfilter_thread.c
vp9/common/vp9_loopfilter_thread.h
vp9/common/vp9_mvref_common.c
vp9/common/vp9_onyxc_int.h
vp9/common/vp9_reconinter.c
vp9/decoder/vp9_decodeframe.c
vp9/decoder/vp9_decodeframe.h
vp9/decoder/vp9_decodemv.c
vp9/decoder/vp9_decoder.c
vp9/decoder/vp9_decoder.h
vp9/encoder/vp9_encoder.c
vp9/encoder/vp9_pickmode.c
vp9/encoder/vp9_rdopt.c
vp9/vp9_cx_iface.c
vp9/vp9_dx_iface.c
Change-Id: Ib92eb35851c172d0624970e312ed515054e5ca64
For low spatial resolutions: bias partition selection to smaller block sizes,
and base the variance computation on 4x4 down-sampling.
Also move the threshold computations into the choose_partitioning,
so they are computed once for each sb block.
On low-res clips (RTC_derf) PSNR/SSIM metrics increase by about 4-5%.
No change for resolutions above CIF.
Change-Id: I93f8ff742c8044786977bb6e31dcf8efda6dd1b0
Just before a forced key frame we often get a foreshortened
arf/gf group. In such a case, we do not want to update
rc->last_boosted_qindex, which is used to define the Q range
for the forced key frame itself.
This gives a small average metrics gain for the YT and YT-HD sets
(e.g. YT SSIM +0.141%).
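The guard described above might look roughly like the sketch below. Only the field name last_boosted_qindex comes from this message; the struct RateControlSketch, the function update_last_boosted_q and the group_is_foreshortened flag are hypothetical stand-ins, not the encoder's actual code.

```c
/* Hypothetical, cut-down stand-in for the encoder's rate control state. */
typedef struct {
  int last_boosted_qindex; /* Q reference used for the forced key frame */
} RateControlSketch;

/* Record the Q of a boosted arf/gf group, but skip the update when the
 * group was cut short by an upcoming forced key frame. */
static void update_last_boosted_q(RateControlSketch *rc, int group_qindex,
                                  int group_is_foreshortened) {
  if (!group_is_foreshortened) {
    rc->last_boosted_qindex = group_qindex;
  }
}
```

Skipping the update keeps the value from the last full-length group, so the Q range for the forced key frame is not skewed by the shortened group.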
Change-Id: Ie06698bc4f249e87183b8f8fb27ff8f3fde216d9