Commit Graph

1242 Commits

Author SHA1 Message Date
John Koleszar
664cd5ac91 Merge remote branch 'internal/upstream' into HEAD 2011-07-23 00:05:14 -04:00
Johann
773bcc300d Merge "fix sharpness bug and clean up" 2011-07-22 09:34:55 -07:00
Johann
a04ed0e8f3 fix sharpness bug and clean up
sharpness was not recalculated in vp8cx_pick_filter_level_fast

remove last_filter_type. all values are calculated, don't need to update
the lfi data when it changes.

always use cm->sharpness_level. the extra indirection was annoying.

don't track last frame_type or sharpness_level manually. frame type
only matters for motion search and sharpness_level is taken care of in
frame_init

move function declarations to their proper header

Change-Id: I7ef037bd4bf8cf5e37d2d36bd03b5e22a2ad91db
2011-07-22 12:33:57 -04:00
Yunqing Wang
829179e888 Merge "Preload reference area to an intermediate buffer in sub-pixel motion search" 2011-07-22 06:56:15 -07:00
Yunqing Wang
20bd1446c0 Preload reference area to an intermediate buffer in sub-pixel motion search
In sub-pixel motion search, the search range is small (+/- 3 pixels).
Preload the whole search area from the reference buffer into a 32-byte
aligned buffer. Then, during the search, load reference data from this
buffer instead. This keeps the data in cache and reduces the cache-line
crossing penalty. For the tulip clip, tests on an Intel Core2 Quad
machine (Linux) showed the following encoder speed improvements:
  3.4%   at --rt --cpu-used=-4
  2.8%   at --rt --cpu-used=-3
  2.3%   at --rt --cpu-used=-2
  2.2%   at --rt --cpu-used=-1

Tests on an Atom notebook showed only a 1.1% speed improvement (speed=-4).
Tests on a Xeon machine also showed less improvement, since unaligned
data access latency is greatly reduced in newer cores.

Next, I will apply a similar idea to the other 2 sub-pixel search
functions for encoding speed > 4.

Make this change exclusively for x86 platforms.

Change-Id: Ia7bb9f56169eac0f01009fe2b2f2ab5b61d2eb2f
2011-07-22 09:28:06 -04:00
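The preloading idea described in 20bd1446c0 can be sketched roughly as follows. This is an illustration only, not the code from the commit: the names preload_search_area and buffered_pixel, the 16x16 block size, and the 32-byte row pitch are assumptions. Only the +/- 3 pixel range and the 32-byte alignment of the buffer come from the message.

```c
#include <assert.h>
#include <string.h>

#define BLOCK      16                   /* assumed block size                     */
#define RANGE      3                    /* +/- 3 pixels, from the commit message  */
#define AREA       (BLOCK + 2 * RANGE)  /* 22 rows and columns cover all offsets  */
#define BUF_STRIDE 32                   /* assumed row pitch of the local buffer  */

/* Copy the AREA x AREA window around the block at (x, y) into buf.
 * buf must hold AREA * BUF_STRIDE bytes; the caller would declare it with
 * 32-byte alignment (for example with C11 _Alignas(32)). The caller must
 * also guarantee that the window lies inside the reference frame. */
static void preload_search_area(const unsigned char *ref, int ref_stride,
                                int x, int y, unsigned char *buf)
{
    const unsigned char *src;
    int r;

    assert(x >= RANGE && y >= RANGE);
    src = ref + (y - RANGE) * ref_stride + (x - RANGE);
    for (r = 0; r < AREA; r++)
        memcpy(buf + r * BUF_STRIDE, src + r * ref_stride, AREA);
}

/* Read the pixel at offset (dx, dy) from the block origin out of the
 * preloaded buffer instead of the reference frame. */
static unsigned char buffered_pixel(const unsigned char *buf, int dx, int dy)
{
    return buf[(dy + RANGE) * BUF_STRIDE + (dx + RANGE)];
}
```

Every candidate position in the search then reads from the small contiguous buffer, so rows that straddle cache lines in the reference frame are touched only once, during the copy.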
John Koleszar
7d44c805cf Merge remote branch 'internal/upstream' into HEAD 2011-07-22 00:05:06 -04:00
Yaowu Xu
f614661242 Merge "fix more merge issues" into experimental 2011-07-21 16:05:24 +00:00
Yaowu Xu
8c31484ea1 fix more merge issues
With this fix, the experimental branch now builds and encodes correctly
with the following two configure options respectively:
--enable-experimental --enable-t8x8
--enable-experimental

Change-Id: I3147c33c503fe713a85fd371e4f1a974805778bf
2011-07-21 09:01:53 -07:00
John Koleszar
2bdda84e37 Merge "Increase chroma row alignment to 16 bytes." 2011-07-21 07:32:39 -07:00
Yunqing Wang
c5fe641179 Merge "Add improvements made in good-quality mode to real-time mode" 2011-07-21 07:27:09 -07:00
John Koleszar
a53586d9d1 Merge remote branch 'internal/upstream' into HEAD 2011-07-21 00:05:05 -04:00
Yaowu Xu
1c24eb2b7b fixed a number of problems caused by auto merges
The auto merge process pulls and merges commits from the public git or
master branch. These automerges have worked well most of the time, but
they have created a few problems. This commit fixes several issues that
existed long before the latest 8x8 transform commit.

Change-Id: I895ca99713231b1aec521d57db5d9839f74aacfa
2011-07-20 12:45:35 -07:00
Timothy B. Terriberry
7d1b37cdac Increase chroma row alignment to 16 bytes.
This is done by expanding luma row to 32-byte alignment, since
 there is currently a bunch of code that assumes that
 uv_stride == y_stride/2 (see, for example, vp8/common/postproc.c,
 common/reconinter.c, common/arm/neon/recon16x16mb_neon.asm,
 encoder/temporal_filter.c, and possibly others; I haven't done a
 full audit).
It also replaces the hardcoded border of 16 in a number of
 encoder buffers with VP8BORDERINPIXELS (currently 32), as the
 chroma rows start at an offset of border/2.
Together, these two changes have the nice advantage that simply
 dumping the frame memory as a contiguous blob produces a valid,
 if padded, image.

Change-Id: Iaf5ea722ae5c82d5daa50f6e2dade9de753f1003
2011-07-20 10:20:31 -07:00
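The stride arithmetic behind 7d1b37cdac can be sketched as below. This is a minimal illustration under assumptions, not the allocation code from the commit: align_up, luma_stride, and chroma_stride are hypothetical helpers, and BORDER stands in for VP8BORDERINPIXELS, which the message gives as 32.

```c
#include <assert.h>

#define BORDER 32  /* stands in for VP8BORDERINPIXELS ("currently 32") */

/* Round value up to the next multiple of alignment (a power of two). */
static int align_up(int value, int alignment)
{
    assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Luma rows carry a border on both sides and are padded to 32 bytes. */
static int luma_stride(int width)
{
    return align_up(width + 2 * BORDER, 32);
}

/* Code that assumes uv_stride == y_stride / 2 keeps working, and half of a
 * multiple of 32 is always a multiple of 16. */
static int chroma_stride(int width)
{
    return luma_stride(width) / 2;
}
```

Aligning the luma stride to 32 is what lets the chroma stride reach 16-byte alignment without breaking the uv_stride == y_stride/2 assumption the message mentions.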
Deb Mukherjee
08f6471890 Add 8x8 transform to experimental branch
Please refer to previous commit messages for detailed info:
https://on2-git.corp.google.com/g/#change,5940
https://on2-git.corp.google.com/g/#change,6045

Change-Id: I8b16992f2f69c5a808ad40a3e32ef589cce7c59d
2011-07-20 09:49:22 -07:00
Attila Nagy
0afcc76971 encoder: don't set the fragment bit for the last partition
Change-Id: Icb4e4f0d7c3074a8507852178be87541a1cb5bac
2011-07-20 14:09:42 +03:00
John Koleszar
8e464cc4c2 Merge remote branch 'internal/upstream' into HEAD 2011-07-20 00:05:09 -04:00
Scott LaVarnway
b2d9700f53 Merge "Moved vp8_encode_bool into boolhuff.h" 2011-07-19 08:15:14 -07:00
Johann
6afafc313c remove old armv5 code
armv5 dequantizer is not referenced

Change-Id: Id1cc617dcee35ebd6a406816ec6aaa26e8bbc8ad
2011-07-19 09:20:38 -04:00
Scott LaVarnway
a25f6a9c88 Moved vp8_encode_bool into boolhuff.h
allowing the compiler to inline this function.  For real-time
encodes, this gave a boost of 1% to 2.5%, depending on the
speed setting.

Change-Id: I3929d176cca086b4261267b848419d5bcff21c02
2011-07-19 09:17:25 -04:00
John Koleszar
b3b34b0bc7 Merge remote branch 'internal/upstream' into HEAD 2011-07-19 00:05:05 -04:00
John Koleszar
b5ea2fbc2c Improved 1-pass CBR rate control
This patch attempts to improve the handling of CBR streams with
respect to the short term buffering requirements. The "buffer level"
is changed to be an average over the rc buffer, rather than a long
running average. Overshoot is also tracked over the same interval
and the golden frame targets suppressed accordingly to correct for
overly aggressive boosting.

Testing shows that this is fairly consistently positive in one
metric or another -- some clips that show significant decreases
in quality have better buffering characteristics, others show
improvements in both.

Change-Id: I924c89aa9bdb210271f2e03311e63de3f1f8f920
2011-07-18 11:48:05 -04:00
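The difference between a long running average and an average over a fixed interval, as contrasted in b5ea2fbc2c, can be illustrated with a small sliding-window mean. This is only a sketch of the general technique: the window_avg type, the WINDOW length of 4 frames, and the function names are hypothetical and are not taken from the rate control code.

```c
#include <assert.h>

#define WINDOW 4  /* hypothetical window length, in frames */

typedef struct {
    long samples[WINDOW];  /* most recent values, stored as a ring buffer */
    long sum;              /* sum of the values currently in the window   */
    int  count;            /* number of valid samples, at most WINDOW     */
    int  next;             /* slot that the next sample overwrites        */
} window_avg;

static void window_init(window_avg *w)
{
    int i;
    for (i = 0; i < WINDOW; i++)
        w->samples[i] = 0;
    w->sum = 0;
    w->count = 0;
    w->next = 0;
}

/* Add one sample and return the mean over the last WINDOW samples. */
static long window_push(window_avg *w, long value)
{
    if (w->count == WINDOW)
        w->sum -= w->samples[w->next];  /* drop the oldest sample */
    else
        w->count++;
    assert(w->count <= WINDOW);
    w->samples[w->next] = value;
    w->sum += value;
    w->next = (w->next + 1) % WINDOW;
    return w->sum / w->count;
}
```

Unlike a long running average, old samples leave the window entirely, so the result reflects only the most recent interval.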
John Koleszar
dc1c3f9024 Merge remote branch 'internal/upstream' into HEAD 2011-07-16 00:05:05 -04:00
Scott LaVarnway
e68894fa03 Merge "Tokenize MB optimized" 2011-07-15 07:54:14 -07:00
Tero Rintaluoma
4e82f01547 Tokenize MB optimized
Optimized C-code of the following functions:
 - vp8_tokenize_mb
 - tokenize1st_order_b
 - tokenize2nd_order_b
Gives ~1-5% speed-up for RT encoding on Cortex-A8/A9
depending on encoding parameters.

Change-Id: I6be86104a589a06dcbc9ed3318e8bf264ef4176c
2011-07-15 11:26:54 +03:00
John Koleszar
087b338d9e Merge remote branch 'internal/upstream' into HEAD 2011-07-15 00:05:04 -04:00
James Berry
6b6f367c3d bug fix vpx_copy_and_extend_frame size issue
vpx_copy_and_extend_frame could incorrectly
resize uv frames which could result in a crash.

Change-Id: Ie96f7078b1e328b3907a06eebeee44ca39a2e898
2011-07-14 15:58:15 -04:00
John Koleszar
04dce631a2 Remove unused speed features
min_fs_radius, max_fs_radius, full_freq were set but never read.

Change-Id: I82657f4e7f2ba2acc3cbc3faa5ec0de5b9c6ec74
2011-07-14 14:20:25 -04:00
John Koleszar
6901105e99 Merge remote branch 'internal/upstream' into HEAD 2011-07-14 00:05:04 -04:00
Yunqing Wang
f1f28535c3 Merge "Fix unnecessary casting of B_PREDICTION_MODE (issue 349)" 2011-07-13 13:32:57 -07:00
Yunqing Wang
139577f937 Fix unnecessary casting of B_PREDICTION_MODE (issue 349)
Minor fix.

Change-Id: Iaf93f6e47e882a33c479e57c7a0d0bf321e291c0
2011-07-13 15:52:07 -04:00
Yunqing Wang
0e9a6ed72a Add improvements made in good-quality mode to real-time mode
Several improvements we made in good-quality mode can be added
into real-time mode to speed up encoding in speed 1, 2, and 3
with small quality loss. Tests using tulip clip showed:

--rt --cpu-used=-1
(before change)
PSNR: 38.028
time: 1m33.195s
(after change)
PSNR: 38.014
time: 1m20.851s

--rt --cpu-used=-2
(before change)
PSNR: 37.773
time: 0m57.650s
(after change)
PSNR: 37.759
time: 0m54.594s

--rt --cpu-used=-3
(before change)
PSNR: 37.392
time: 0m42.865s
(after change)
PSNR: 37.375
time: 0m41.949s

Change-Id: I76ab2a38d72bc5efc91f6fe20d332c472f6510c9
2011-07-13 14:51:02 -04:00
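The timings quoted in 0e9a6ed72a can be turned into a relative saving with a small helper. The function names below are hypothetical and exist only to show the arithmetic.

```c
#include <assert.h>

/* Convert a "XmYY.YYYs" timing into seconds. */
static double to_seconds(int minutes, double seconds)
{
    return 60.0 * minutes + seconds;
}

/* Percentage of encode time saved: 100 * (before - after) / before. */
static double percent_time_saved(double before_s, double after_s)
{
    assert(before_s > 0.0);
    return 100.0 * (before_s - after_s) / before_s;
}
```

Applied to the figures above, this gives roughly 13.2%, 5.3%, and 2.1% less encode time at --cpu-used=-1, -2, and -3 respectively, for a PSNR drop of 0.014 to 0.017.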
Fritz Koenig
84c3cd79d1 Merge "Reduce motion vector search on alt-ref frame." 2011-07-13 10:07:30 -07:00
Johann
211694f67e Merge "update x86 asm for loopfilter" 2011-07-13 04:10:03 -07:00
Johann
8f910594bd Merge "Update armv6 loopfilter to new interface" 2011-07-13 04:09:55 -07:00
Johann
1a219c22b1 Merge "Update armv7 loopfilter to new interface" 2011-07-13 04:09:42 -07:00
Johann
d9b825cff2 Merge "New loop filter interface" 2011-07-13 04:09:26 -07:00
John Koleszar
791ad1bb37 Merge remote branch 'internal/upstream' into HEAD 2011-07-13 00:05:03 -04:00
Attila Nagy
c231b0175d Update armv6 loopfilter to new interface
Change-Id: I5fe581d797571a7a9432fbd17fc557591d0c1afa
2011-07-12 12:14:51 +03:00
Attila Nagy
283b0e25ac Update armv7 loopfilter to new interface
Change-Id: I65105a9c63832669237e6a6a7fcb4ea3ea683346
2011-07-12 12:12:25 +03:00
Fritz Koenig
ede0b15c9d Reduce motion vector search on alt-ref frame.
Clamp mv search to accommodate subpixel filtering
of UV mv.

Change-Id: Iab3ed405993ef6bf779ad7cf60863153068fb7d1
2011-07-11 09:05:43 -07:00
John Koleszar
c24479e870 Merge remote branch 'internal/upstream' into HEAD 2011-07-09 00:05:04 -04:00
Yunqing Wang
587ca06da9 Minor change in pick_inter_mode()
Scott suggested moving vp8_mv_pred() under "case NEWMV" to save
extra checks.

Change-Id: I09e69892f34a08dd425a4d81cfcc83674e344a20
2011-07-08 14:08:45 -04:00
Yunqing Wang
e83d36c053 Merge "Adjust full-pixel clamping and motion vector limit calculation" 2011-07-08 08:39:32 -07:00
Yunqing Wang
40991faeae Adjust full-pixel clamping and motion vector limit calculation
Do mvp clamping in full-pixel precision instead of 1/8-pixel
precision to avoid error caused by right shifting operation.
Also, further fixed the motion vector limit calculation in change:
b748045470

Change-Id: Ied88a4f7ddfb0476eb9f7afc6ceeddbf209fffd7
2011-07-08 11:34:28 -04:00
Johann
01433c5043 update x86 asm for loopfilter
Change-Id: I1ed739522db7c00c189851c7095c1b64ef6412ce
2011-07-08 09:23:38 -04:00
John Koleszar
ef7f489dc3 Merge remote branch 'internal/upstream-experimental' into HEAD 2011-07-08 08:57:03 -04:00
Johann
6ae12c415e Merge "clean up warnings when building arm with rtcd" 2011-07-08 05:16:09 -07:00
Attila Nagy
622958449b New loop filter interface
Separate simple filter with reduced no. of parameters.
MB filter level picking based on precalculated table. Level table updated for
each frame. Inside and edge limits precalculated and updated just when
sharpness changes. HEV threshold is constant.
ARM targets use scalars and others vectors.

Change works only with --target=generic-gnu
All other targets have to be updated!

Change-Id: I6b73aca6b525075b20129a371699b2561bd4d51c
2011-07-08 09:31:41 +03:00
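The "precalculated and updated just when sharpness changes" idea from 622958449b might look roughly like the sketch below. The limit_table type and both function names are hypothetical, and the interior limit rule is written from my reading of the VP8 bitstream specification (RFC 6386) rather than copied from this commit, so treat it as an assumption.

```c
#include <assert.h>

#define MAX_LEVEL 63  /* loop filter levels run from 0 to 63 */

typedef struct {
    unsigned char interior_limit[MAX_LEVEL + 1];
    int sharpness;  /* sharpness the table was built for, -1 if none yet */
} limit_table;

/* Interior limit for one filter level at a given sharpness. */
static unsigned char interior_limit_for(int level, int sharpness)
{
    int limit = level;

    if (sharpness) {
        limit >>= (sharpness > 4) ? 2 : 1;
        if (limit > 9 - sharpness)
            limit = 9 - sharpness;
    }
    if (limit < 1)
        limit = 1;
    return (unsigned char)limit;
}

/* Rebuild the table only when sharpness differs from the cached value.
 * Returns 1 if the table was rebuilt and 0 if it was already current. */
static int update_limits(limit_table *t, int sharpness)
{
    int level;

    assert(sharpness >= 0 && sharpness <= 7);
    if (t->sharpness == sharpness)
        return 0;
    for (level = 0; level <= MAX_LEVEL; level++)
        t->interior_limit[level] = interior_limit_for(level, sharpness);
    t->sharpness = sharpness;
    return 1;
}
```

With a table like this, per-macroblock filtering reduces to an array lookup by level, and the per-frame cost is paid only on frames where the sharpness actually changes.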
John Koleszar
2c3c54f747 Merge remote branch 'origin/master' into experimental
Change-Id: I9cead934ebea85d81aceaaec4674efc74367f984
2011-07-08 00:05:05 -04:00
John Koleszar
973a9c075d Merge "Set VPX_FRAME_IS_DROPPABLE" 2011-07-07 08:11:05 -07:00