Compare commits: v0.9.7-p1...sandbox/at (7 commits)

| Author | SHA1 | Date | |
|---|---|---|---|
| | 744a58bc1c | | |
| | 86b5556f5a | | |
| | 4375b4ac39 | | |
| | 71595edd47 | | |
| | 848dddee15 | | |
| | f1ba70e199 | | |
| | a22df2e29d | | |
1
.mailmap
@@ -2,4 +2,3 @@ Adrian Grange <agrange@google.com>
 Johann Koenig <johannkoenig@google.com>
 Tero Rintaluoma <teror@google.com> <tero.rintaluoma@on2.com>
 Tom Finegan <tomfinegan@google.com>
-Ralph Giles <giles@xiph.org> <giles@entropywave.com>
12
AUTHORS
@@ -4,11 +4,8 @@
|
|||||||
Aaron Watry <awatry@gmail.com>
|
Aaron Watry <awatry@gmail.com>
|
||||||
Adrian Grange <agrange@google.com>
|
Adrian Grange <agrange@google.com>
|
||||||
Alex Converse <alex.converse@gmail.com>
|
Alex Converse <alex.converse@gmail.com>
|
||||||
Alexis Ballier <aballier@gentoo.org>
|
|
||||||
Alok Ahuja <waveletcoeff@gmail.com>
|
|
||||||
Andoni Morales Alastruey <ylatuya@gmail.com>
|
Andoni Morales Alastruey <ylatuya@gmail.com>
|
||||||
Andres Mejia <mcitadel@gmail.com>
|
Andres Mejia <mcitadel@gmail.com>
|
||||||
Aron Rosenberg <arosenberg@logitech.com>
|
|
||||||
Attila Nagy <attilanagy@google.com>
|
Attila Nagy <attilanagy@google.com>
|
||||||
Fabio Pedretti <fabio.ped@libero.it>
|
Fabio Pedretti <fabio.ped@libero.it>
|
||||||
Frank Galligan <fgalligan@google.com>
|
Frank Galligan <fgalligan@google.com>
|
||||||
@@ -25,29 +22,20 @@ Jeff Muizelaar <jmuizelaar@mozilla.com>
|
|||||||
Jim Bankoski <jimbankoski@google.com>
|
Jim Bankoski <jimbankoski@google.com>
|
||||||
Johann Koenig <johannkoenig@google.com>
|
Johann Koenig <johannkoenig@google.com>
|
||||||
John Koleszar <jkoleszar@google.com>
|
John Koleszar <jkoleszar@google.com>
|
||||||
Joshua Bleecher Snyder <josh@treelinelabs.com>
|
|
||||||
Justin Clift <justin@salasaga.org>
|
Justin Clift <justin@salasaga.org>
|
||||||
Justin Lebar <justin.lebar@gmail.com>
|
Justin Lebar <justin.lebar@gmail.com>
|
||||||
Lou Quillio <louquillio@google.com>
|
|
||||||
Luca Barbato <lu_zero@gentoo.org>
|
Luca Barbato <lu_zero@gentoo.org>
|
||||||
Makoto Kato <makoto.kt@gmail.com>
|
Makoto Kato <makoto.kt@gmail.com>
|
||||||
Martin Ettl <ettl.martin78@googlemail.com>
|
Martin Ettl <ettl.martin78@googlemail.com>
|
||||||
Michael Kohler <michaelkohler@live.com>
|
Michael Kohler <michaelkohler@live.com>
|
||||||
Mike Hommey <mhommey@mozilla.com>
|
|
||||||
Mikhal Shemer <mikhal@google.com>
|
Mikhal Shemer <mikhal@google.com>
|
||||||
Pascal Massimino <pascal.massimino@gmail.com>
|
Pascal Massimino <pascal.massimino@gmail.com>
|
||||||
Patrik Westin <patrik.westin@gmail.com>
|
Patrik Westin <patrik.westin@gmail.com>
|
||||||
Paul Wilkins <paulwilkins@google.com>
|
Paul Wilkins <paulwilkins@google.com>
|
||||||
Pavol Rusnak <stick@gk2.sk>
|
Pavol Rusnak <stick@gk2.sk>
|
||||||
Philip Jägenstedt <philipj@opera.com>
|
Philip Jägenstedt <philipj@opera.com>
|
||||||
Rafael Ávila de Espíndola <rafael.espindola@gmail.com>
|
|
||||||
Ralph Giles <giles@xiph.org>
|
|
||||||
Ronald S. Bultje <rbultje@google.com>
|
|
||||||
Scott LaVarnway <slavarnway@google.com>
|
Scott LaVarnway <slavarnway@google.com>
|
||||||
Stefan Holmer <holmer@google.com>
|
|
||||||
Taekhyun Kim <takim@nvidia.com>
|
|
||||||
Tero Rintaluoma <teror@google.com>
|
Tero Rintaluoma <teror@google.com>
|
||||||
Thijs Vermeir <thijsvermeir@gmail.com>
|
|
||||||
Timothy B. Terriberry <tterribe@xiph.org>
|
Timothy B. Terriberry <tterribe@xiph.org>
|
||||||
Tom Finegan <tomfinegan@google.com>
|
Tom Finegan <tomfinegan@google.com>
|
||||||
Yaowu Xu <yaowu@google.com>
|
Yaowu Xu <yaowu@google.com>
|
||||||
|
|||||||
112
CHANGELOG
@@ -1,115 +1,3 @@
|
|||||||
2011-08-15 v0.9.7-p1 "Cayuga" patch 1
|
|
||||||
This is an incremental bugfix release against Cayuga. All users of that
|
|
||||||
release are strongly encouraged to upgrade.
|
|
||||||
|
|
||||||
- Fix potential OOB reads (cdae03a)
|
|
||||||
|
|
||||||
An unbounded out of bounds read was discovered when the
|
|
||||||
decoder was requested to perform error concealment (new in
|
|
||||||
Cayuga) given a frame with corrupt partition sizes.
|
|
||||||
|
|
||||||
A bounded out of bounds read was discovered affecting all
|
|
||||||
versions of libvpx. Given an multipartition input frame that
|
|
||||||
is truncated between the mode/mv partition and the first
|
|
||||||
residiual paritition (in the block of partition offsets), up
|
|
||||||
to 3 extra bytes could have been read from the source buffer.
|
|
||||||
The code will not take any action regardless of the contents
|
|
||||||
of these undefined bytes, as the truncated buffer is detected
|
|
||||||
immediately following the read based on the calculated
|
|
||||||
starting position of the coefficient partition.
|
|
||||||
|
|
||||||
- Fix potential error concealment crash when the very first frame
|
|
||||||
is missing or corrupt (a609be5)
|
|
||||||
|
|
||||||
- Fix significant artifacts in error concealment (a4c2211, 99d870a)
|
|
||||||
|
|
||||||
- Revert 1-pass CBR rate control changes (e961317)
|
|
||||||
Further testing showed this change produced undesirable visual
|
|
||||||
artifacts, rolling back for now.
|
|
||||||
|
|
||||||
|
|
||||||
2011-08-02 v0.9.7 "Cayuga"
|
|
||||||
Our third named release, focused on a faster, higher quality, encoder.
|
|
||||||
|
|
||||||
- Upgrading:
|
|
||||||
This release is backwards compatible with Aylesbury (v0.9.5) and
|
|
||||||
Bali (v0.9.6). Users of older releases should refer to the Upgrading
|
|
||||||
notes in this document for that release.
|
|
||||||
|
|
||||||
- Enhancements:
|
|
||||||
Stereo 3D format support for vpxenc
|
|
||||||
Runtime detection of available processor cores.
|
|
||||||
Allow specifying --end-usage by enum name
|
|
||||||
vpxdec: test for frame corruption
|
|
||||||
vpxenc: add quantizer histogram display
|
|
||||||
vpxenc: add rate histogram display
|
|
||||||
Set VPX_FRAME_IS_DROPPABLE
|
|
||||||
update configure for ios sdk 4.3
|
|
||||||
Avoid text relocations in ARM vp8 decoder
|
|
||||||
Generate a vpx.pc file for pkg-config.
|
|
||||||
New ways of passing encoded data between encoder and decoder.
|
|
||||||
|
|
||||||
- Speed:
|
|
||||||
This release includes across-the-board speed improvements to the
|
|
||||||
encoder. On x86, these measure at approximately 11.5% in Best mode,
|
|
||||||
21.5% in Good mode (speed 0), and 22.5% in Realtime mode (speed 6).
|
|
||||||
On ARM Cortex A9 with Neon extensions, real-time encoding of video
|
|
||||||
telephony content is 35% faster than Bali on single core and 48%
|
|
||||||
faster on multi-core. On the NVidia Tegra2 platform, real time
|
|
||||||
encoding is 40% faster than Bali.
|
|
||||||
|
|
||||||
Decoder speed was not a priority for this release, but improved
|
|
||||||
approximately 8.4% on x86.
|
|
||||||
|
|
||||||
Reduce motion vector search on alt-ref frame.
|
|
||||||
Encoder loopfilter running in its own thread
|
|
||||||
Reworked loopfilter to precalculate more parameters
|
|
||||||
SSE2/SSSE3 optimizations for build_predictors_mbuv{,_s}().
|
|
||||||
Make hor UV predict ~2x faster (73 vs 132 cycles) using SSSE3.
|
|
||||||
Removed redundant checks
|
|
||||||
Reduced structure sizes
|
|
||||||
utilize preload in ARMv6 MC/LPF/Copy routines
|
|
||||||
ARM optimized quantization, dfct, variance, subtract
|
|
||||||
Increase chrow row alignment to 16 bytes.
|
|
||||||
disable trellis optimization for first pass
|
|
||||||
Write SSSE3 sub-pixel filter function
|
|
||||||
Improve SSE2 half-pixel filter funtions
|
|
||||||
Add vp8_sub_pixel_variance16x8_ssse3 function
|
|
||||||
Reduce unnecessary distortion computation
|
|
||||||
Use diamond search to replace full search
|
|
||||||
Preload reference area in sub-pixel motion search (real-time mode)
|
|
||||||
|
|
||||||
- Quality:
|
|
||||||
This release focused primarily on one-pass use cases, including
|
|
||||||
video conferencing. Low latency data rate control was significantly
|
|
||||||
improved, improving streamability over bandwidth constrained links.
|
|
||||||
Added support for error concealment, allowing frames to maintain
|
|
||||||
visual quality in the presence of substantial packet loss.
|
|
||||||
|
|
||||||
Add rc_max_intra_bitrate_pct control
|
|
||||||
Limit size of initial keyframe in one-pass.
|
|
||||||
Improve framerate adaptation
|
|
||||||
Improved 1-pass CBR rate control
|
|
||||||
Improved KF insertion after fades to still.
|
|
||||||
Improved key frame detection.
|
|
||||||
Improved activity masking (lower PSNR impact for same SSIM boost)
|
|
||||||
Improved interaction between GF and ARFs
|
|
||||||
Adding error-concealment to the decoder.
|
|
||||||
Adding support for independent partitions
|
|
||||||
Adjusted rate-distortion constants
|
|
||||||
|
|
||||||
|
|
||||||
- Bug Fixes:
|
|
||||||
Removed firstpass motion map
|
|
||||||
Fix parallel make install
|
|
||||||
Fix multithreaded encoding for 1 MB wide frame
|
|
||||||
Fixed iwalsh_neon build problems with RVDS4.1
|
|
||||||
Fix semaphore emulation, spin-wait intrinsics on Windows
|
|
||||||
Fix build with xcode4 and simplify GLOBAL.
|
|
||||||
Mark ARM asm objects as allowing a non-executable stack.
|
|
||||||
Fix vpxenc encoding incorrect webm file header on big endian
|
|
||||||
|
|
||||||
|
|
||||||
2011-03-07 v0.9.6 "Bali"
|
2011-03-07 v0.9.6 "Bali"
|
||||||
Our second named release, focused on a faster, higher quality, encoder.
|
Our second named release, focused on a faster, higher quality, encoder.
|
||||||
|
|
||||||
|
|||||||
@@ -98,11 +98,11 @@ install::
|
|||||||
$(BUILD_PFX)%.c.d: %.c
|
$(BUILD_PFX)%.c.d: %.c
|
||||||
$(if $(quiet),@echo " [DEP] $@")
|
$(if $(quiet),@echo " [DEP] $@")
|
||||||
$(qexec)mkdir -p $(dir $@)
|
$(qexec)mkdir -p $(dir $@)
|
||||||
$(qexec)$(CC) $(INTERNAL_CFLAGS) $(CFLAGS) -M $< | $(fmt_deps) > $@
|
$(qexec)$(CC) $(CFLAGS) -M $< | $(fmt_deps) > $@
|
||||||
|
|
||||||
$(BUILD_PFX)%.c.o: %.c
|
$(BUILD_PFX)%.c.o: %.c
|
||||||
$(if $(quiet),@echo " [CC] $@")
|
$(if $(quiet),@echo " [CC] $@")
|
||||||
$(qexec)$(CC) $(INTERNAL_CFLAGS) $(CFLAGS) -c -o $@ $<
|
$(qexec)$(CC) $(CFLAGS) -c -o $@ $<
|
||||||
|
|
||||||
$(BUILD_PFX)%.asm.d: %.asm
|
$(BUILD_PFX)%.asm.d: %.asm
|
||||||
$(if $(quiet),@echo " [DEP] $@")
|
$(if $(quiet),@echo " [DEP] $@")
|
||||||
@@ -124,12 +124,6 @@ $(BUILD_PFX)%.s.o: %.s
|
|||||||
$(if $(quiet),@echo " [AS] $@")
|
$(if $(quiet),@echo " [AS] $@")
|
||||||
$(qexec)$(AS) $(ASFLAGS) -o $@ $<
|
$(qexec)$(AS) $(ASFLAGS) -o $@ $<
|
||||||
|
|
||||||
.PRECIOUS: %.c.S
|
|
||||||
%.c.S: CFLAGS += -DINLINE_ASM
|
|
||||||
$(BUILD_PFX)%.c.S: %.c
|
|
||||||
$(if $(quiet),@echo " [GEN] $@")
|
|
||||||
$(qexec)$(CC) -S $(CFLAGS) -o $@ $<
|
|
||||||
|
|
||||||
.PRECIOUS: %.asm.s
|
.PRECIOUS: %.asm.s
|
||||||
$(BUILD_PFX)%.asm.s: %.asm
|
$(BUILD_PFX)%.asm.s: %.asm
|
||||||
$(if $(quiet),@echo " [ASM CONVERSION] $@")
|
$(if $(quiet),@echo " [ASM CONVERSION] $@")
|
||||||
@@ -194,7 +188,7 @@ define linker_template
|
|||||||
$(1): $(filter-out -%,$(2))
|
$(1): $(filter-out -%,$(2))
|
||||||
$(1):
|
$(1):
|
||||||
$(if $(quiet),@echo " [LD] $$@")
|
$(if $(quiet),@echo " [LD] $$@")
|
||||||
$(qexec)$$(LD) $$(strip $$(INTERNAL_LDFLAGS) $$(LDFLAGS) -o $$@ $(2) $(3) $$(extralibs))
|
$(qexec)$$(LD) $$(strip $$(LDFLAGS) -o $$@ $(2) $(3) $$(extralibs))
|
||||||
endef
|
endef
|
||||||
# make-3.80 has a bug with expanding large input strings to the eval function,
|
# make-3.80 has a bug with expanding large input strings to the eval function,
|
||||||
# which was triggered in some cases by the following component of
|
# which was triggered in some cases by the following component of
|
||||||
@@ -336,10 +330,12 @@ ifneq ($(call enabled,DIST-SRCS),)
|
|||||||
DIST-SRCS-$(CONFIG_MSVS) += build/make/gen_msvs_proj.sh
|
DIST-SRCS-$(CONFIG_MSVS) += build/make/gen_msvs_proj.sh
|
||||||
DIST-SRCS-$(CONFIG_MSVS) += build/make/gen_msvs_sln.sh
|
DIST-SRCS-$(CONFIG_MSVS) += build/make/gen_msvs_sln.sh
|
||||||
DIST-SRCS-$(CONFIG_MSVS) += build/x86-msvs/yasm.rules
|
DIST-SRCS-$(CONFIG_MSVS) += build/x86-msvs/yasm.rules
|
||||||
DIST-SRCS-$(CONFIG_MSVS) += build/x86-msvs/obj_int_extract.bat
|
|
||||||
DIST-SRCS-$(CONFIG_RVCT) += build/make/armlink_adapter.sh
|
DIST-SRCS-$(CONFIG_RVCT) += build/make/armlink_adapter.sh
|
||||||
# Include obj_int_extract if we use offsets from asm_*_offsets
|
#
|
||||||
DIST-SRCS-$(ARCH_ARM)$(ARCH_X86)$(ARCH_X86_64) += build/make/obj_int_extract.c
|
# This isn't really ARCH_ARM dependent, it's dependent on whether we're
|
||||||
|
# using assembly code or not (CONFIG_OPTIMIZATIONS maybe). Just use
|
||||||
|
# this for now.
|
||||||
|
DIST-SRCS-$(ARCH_ARM) += build/make/obj_int_extract.c
|
||||||
DIST-SRCS-$(ARCH_ARM) += build/make/ads2gas.pl
|
DIST-SRCS-$(ARCH_ARM) += build/make/ads2gas.pl
|
||||||
DIST-SRCS-yes += $(target:-$(TOOLCHAIN)=).mk
|
DIST-SRCS-yes += $(target:-$(TOOLCHAIN)=).mk
|
||||||
endif
|
endif
|
||||||
|
|||||||
@@ -21,14 +21,8 @@ print "@ This file was created from a .asm file\n";
|
|||||||
print "@ using the ads2gas.pl script.\n";
|
print "@ using the ads2gas.pl script.\n";
|
||||||
print "\t.equ DO1STROUNDING, 0\n";
|
print "\t.equ DO1STROUNDING, 0\n";
|
||||||
|
|
||||||
# Stack of procedure names.
|
|
||||||
@proc_stack = ();
|
|
||||||
|
|
||||||
while (<STDIN>)
|
while (<STDIN>)
|
||||||
{
|
{
|
||||||
# Load and store alignment
|
|
||||||
s/@/,:/g;
|
|
||||||
|
|
||||||
# Comment character
|
# Comment character
|
||||||
s/;/@/g;
|
s/;/@/g;
|
||||||
|
|
||||||
@@ -85,10 +79,7 @@ while (<STDIN>)
|
|||||||
s/CODE([0-9][0-9])/.code $1/;
|
s/CODE([0-9][0-9])/.code $1/;
|
||||||
|
|
||||||
# No AREA required
|
# No AREA required
|
||||||
# But ALIGNs in AREA must be obeyed
|
s/^\s*AREA.*$/.text/;
|
||||||
s/^\s*AREA.*ALIGN=([0-9])$/.text\n.p2align $1/;
|
|
||||||
# If no ALIGN, strip the AREA and align to 4 bytes
|
|
||||||
s/^\s*AREA.*$/.text\n.p2align 2/;
|
|
||||||
|
|
||||||
# DCD to .word
|
# DCD to .word
|
||||||
# This one is for incoming symbols
|
# This one is for incoming symbols
|
||||||
@@ -123,8 +114,8 @@ while (<STDIN>)
|
|||||||
# put the colon at the end of the line in the macro
|
# put the colon at the end of the line in the macro
|
||||||
s/^([a-zA-Z_0-9\$]+)/$1:/ if !/EQU/;
|
s/^([a-zA-Z_0-9\$]+)/$1:/ if !/EQU/;
|
||||||
|
|
||||||
# ALIGN directive
|
# Strip ALIGN
|
||||||
s/ALIGN/.balign/g;
|
s/\sALIGN/@ ALIGN/g;
|
||||||
|
|
||||||
# Strip ARM
|
# Strip ARM
|
||||||
s/\sARM/@ ARM/g;
|
s/\sARM/@ ARM/g;
|
||||||
@@ -136,23 +127,9 @@ while (<STDIN>)
|
|||||||
# Strip PRESERVE8
|
# Strip PRESERVE8
|
||||||
s/\sPRESERVE8/@ PRESERVE8/g;
|
s/\sPRESERVE8/@ PRESERVE8/g;
|
||||||
|
|
||||||
# Use PROC and ENDP to give the symbols a .size directive.
|
# Strip PROC and ENDPROC
|
||||||
# This makes them show up properly in debugging tools like gdb and valgrind.
|
s/\sPROC/@/g;
|
||||||
if (/\bPROC\b/)
|
s/\sENDP/@/g;
|
||||||
{
|
|
||||||
my $proc;
|
|
||||||
/^_([\.0-9A-Z_a-z]\w+)\b/;
|
|
||||||
$proc = $1;
|
|
||||||
push(@proc_stack, $proc) if ($proc);
|
|
||||||
s/\bPROC\b/@ $&/;
|
|
||||||
}
|
|
||||||
if (/\bENDP\b/)
|
|
||||||
{
|
|
||||||
my $proc;
|
|
||||||
s/\bENDP\b/@ $&/;
|
|
||||||
$proc = pop(@proc_stack);
|
|
||||||
$_ = "\t.size $proc, .-$proc".$_ if ($proc);
|
|
||||||
}
|
|
||||||
|
|
||||||
# EQU directive
|
# EQU directive
|
||||||
s/(.*)EQU(.*)/.equ $1, $2/;
|
s/(.*)EQU(.*)/.equ $1, $2/;
|
||||||
@@ -171,6 +148,3 @@ while (<STDIN>)
|
|||||||
next if /^\s*END\s*$/;
|
next if /^\s*END\s*$/;
|
||||||
print;
|
print;
|
||||||
}
|
}
|
||||||
|
|
||||||
# Mark that this object doesn't need an executable stack.
|
|
||||||
printf ("\t.section\t.note.GNU-stack,\"\",\%\%progbits\n");
|
|
||||||
|
|||||||
@@ -41,9 +41,6 @@ sub trim($)
|
|||||||
|
|
||||||
while (<STDIN>)
|
while (<STDIN>)
|
||||||
{
|
{
|
||||||
# Load and store alignment
|
|
||||||
s/@/,:/g;
|
|
||||||
|
|
||||||
# Comment character
|
# Comment character
|
||||||
s/;/@/g;
|
s/;/@/g;
|
||||||
|
|
||||||
@@ -100,10 +97,7 @@ while (<STDIN>)
|
|||||||
s/CODE([0-9][0-9])/.code $1/;
|
s/CODE([0-9][0-9])/.code $1/;
|
||||||
|
|
||||||
# No AREA required
|
# No AREA required
|
||||||
# But ALIGNs in AREA must be obeyed
|
s/^\s*AREA.*$/.text/;
|
||||||
s/^\s*AREA.*ALIGN=([0-9])$/.text\n.p2align $1/;
|
|
||||||
# If no ALIGN, strip the AREA and align to 4 bytes
|
|
||||||
s/^\s*AREA.*$/.text\n.p2align 2/;
|
|
||||||
|
|
||||||
# DCD to .word
|
# DCD to .word
|
||||||
# This one is for incoming symbols
|
# This one is for incoming symbols
|
||||||
@@ -143,8 +137,8 @@ while (<STDIN>)
|
|||||||
# put the colon at the end of the line in the macro
|
# put the colon at the end of the line in the macro
|
||||||
s/^([a-zA-Z_0-9\$]+)/$1:/ if !/EQU/;
|
s/^([a-zA-Z_0-9\$]+)/$1:/ if !/EQU/;
|
||||||
|
|
||||||
# ALIGN directive
|
# Strip ALIGN
|
||||||
s/ALIGN/.balign/g;
|
s/\sALIGN/@ ALIGN/g;
|
||||||
|
|
||||||
# Strip ARM
|
# Strip ARM
|
||||||
s/\sARM/@ ARM/g;
|
s/\sARM/@ ARM/g;
|
||||||
|
|||||||
@@ -412,14 +412,11 @@ EOF
 write_common_target_config_h() {
 cat > ${TMP_H} << EOF
 /* This file automatically generated by configure. Do not edit! */
-#ifndef VPX_CONFIG_H
-#define VPX_CONFIG_H
 #define RESTRICT ${RESTRICT}
 EOF
 print_config_h ARCH "${TMP_H}" ${ARCH_LIST}
 print_config_h HAVE "${TMP_H}" ${HAVE_LIST}
 print_config_h CONFIG "${TMP_H}" ${CONFIG_LIST}
-echo "#endif /* VPX_CONFIG_H */" >> ${TMP_H}
 mkdir -p `dirname "$1"`
 cmp "$1" ${TMP_H} >/dev/null 2>&1 || mv ${TMP_H} "$1"
 }
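Review note: the `#ifndef VPX_CONFIG_H`, `#define VPX_CONFIG_H` and `#endif` lines in this hunk exist only on the v0.9.7-p1 side, where they wrap the generated header in an include guard. The unchanged `cmp ... || mv` line installs the temporary file only when its content differs from the existing header, presumably so that an unchanged header keeps its timestamp and make does not rebuild its dependents. A minimal sketch of both ideas follows. It assumes a scratch directory from `mktemp -d` and one hard-coded define in place of the `print_config_h` calls; the paths and the `write_config_h` name are illustrative and not taken from configure.

```shell
#!/bin/sh
# Sketch only: workdir, tmp_h, out_h and write_config_h are hypothetical names.
workdir=$(mktemp -d)
tmp_h="${workdir}/tmp.h"
out_h="${workdir}/out/vpx_config.h"

write_config_h() {
    # $1 = destination header
    {
        echo "/* This file automatically generated by configure. Do not edit! */"
        echo "#ifndef VPX_CONFIG_H"
        echo "#define VPX_CONFIG_H"
        echo "#define RESTRICT"
        echo "#endif /* VPX_CONFIG_H */"
    } > "${tmp_h}"
    mkdir -p "$(dirname "$1")"
    # Move the new file into place only when it differs from the old one
    # (or when the old one does not exist yet, in which case cmp fails).
    cmp "$1" "${tmp_h}" >/dev/null 2>&1 || mv "${tmp_h}" "$1"
}

write_config_h "${out_h}"
first=$(cat "${out_h}")
write_config_h "${out_h}"   # second run: identical content, destination untouched
second=$(cat "${out_h}")
# Cleanup of ${workdir} is omitted so the result can be inspected.
```

On the second call `cmp` succeeds, so the `mv` is skipped and the temporary file is simply left behind.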
@@ -629,7 +626,7 @@ process_common_toolchain() {
|
|||||||
case ${toolchain} in
|
case ${toolchain} in
|
||||||
sparc-solaris-*)
|
sparc-solaris-*)
|
||||||
add_extralibs -lposix4
|
add_extralibs -lposix4
|
||||||
disable fast_unaligned
|
add_cflags "-DMUST_BE_ALIGNED"
|
||||||
;;
|
;;
|
||||||
*-solaris-*)
|
*-solaris-*)
|
||||||
add_extralibs -lposix4
|
add_extralibs -lposix4
|
||||||
@@ -642,8 +639,8 @@ process_common_toolchain() {
 # on arm, isa versions are supersets
 enabled armv7a && soft_enable armv7 ### DEBUG
 enabled armv7 && soft_enable armv6
-enabled armv7 || enabled armv6 && soft_enable armv5te
-enabled armv7 || enabled armv6 && soft_enable fast_unaligned
+enabled armv6 && soft_enable armv5te
+enabled armv6 && soft_enable fast_unaligned
 enabled iwmmxt2 && soft_enable iwmmxt
 enabled iwmmxt && soft_enable armv5te
 
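Review note: the `enabled armv7 || enabled armv6 && soft_enable ...` lines on the v0.9.7-p1 side of this hunk depend on how the shell groups `||` and `&&`. In an AND-OR list the two operators have equal precedence and associate left to right, so `a || b && c` runs as `(a || b) && c`. A minimal sketch, using hypothetical stand-ins for configure's `enabled` and `soft_enable` helpers (the real definitions are not part of this diff):

```shell
#!/bin/sh
# Hypothetical stand-ins for configure's helpers; only the control flow matters.
enabled_list="armv7"
enabled() {
    case " ${enabled_list} " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
soft_enable() {
    enabled "$1" || enabled_list="${enabled_list} $1"
}

# Groups as: (enabled armv7 || enabled armv6) && soft_enable armv5te
enabled armv7 || enabled armv6 && soft_enable armv5te
first="${enabled_list}"

# With neither armv7 nor armv6 enabled, the list short-circuits and
# soft_enable is never called.
enabled_list="iwmmxt"
enabled armv7 || enabled armv6 && soft_enable armv5te
second="${enabled_list}"

echo "first:  ${first}"
echo "second: ${second}"
```

In the first case armv7 alone is enough to pull in armv5te; in the second case the list is left unchanged.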
@@ -692,7 +689,7 @@ process_common_toolchain() {
 if enabled armv7
 then
 check_add_cflags --cpu=Cortex-A8 --fpu=softvfp+vfpv3
-check_add_asflags --cpu=Cortex-A8 --fpu=softvfp+vfpv3
+check_add_asflags --cpu=Cortex-A8 --fpu=none
 else
 check_add_cflags --cpu=${tgt_isa##armv}
 check_add_asflags --cpu=${tgt_isa##armv}
@@ -732,18 +729,19 @@ process_common_toolchain() {
|
|||||||
add_cflags -arch ${tgt_isa}
|
add_cflags -arch ${tgt_isa}
|
||||||
add_ldflags -arch_only ${tgt_isa}
|
add_ldflags -arch_only ${tgt_isa}
|
||||||
|
|
||||||
add_cflags "-isysroot ${SDK_PATH}/SDKs/iPhoneOS4.3.sdk"
|
add_cflags "-isysroot /Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS4.2.sdk"
|
||||||
|
|
||||||
# This should be overridable
|
# This should be overridable
|
||||||
alt_libc=${SDK_PATH}/SDKs/iPhoneOS4.3.sdk
|
alt_libc=${SDK_PATH}/SDKs/iPhoneOS4.2.sdk
|
||||||
|
|
||||||
# Add the paths for the alternate libc
|
# Add the paths for the alternate libc
|
||||||
for d in usr/include usr/include/gcc/darwin/4.2/ usr/lib/gcc/arm-apple-darwin10/4.2.1/include/; do
|
# for d in usr/include usr/include/gcc/darwin/4.0/; do
|
||||||
|
for d in usr/include usr/include/gcc/darwin/4.0/ usr/lib/gcc/arm-apple-darwin10/4.2.1/include/; do
|
||||||
try_dir="${alt_libc}/${d}"
|
try_dir="${alt_libc}/${d}"
|
||||||
[ -d "${try_dir}" ] && add_cflags -I"${try_dir}"
|
[ -d "${try_dir}" ] && add_cflags -I"${try_dir}"
|
||||||
done
|
done
|
||||||
|
|
||||||
for d in lib usr/lib usr/lib/system; do
|
for d in lib usr/lib; do
|
||||||
try_dir="${alt_libc}/${d}"
|
try_dir="${alt_libc}/${d}"
|
||||||
[ -d "${try_dir}" ] && add_ldflags -L"${try_dir}"
|
[ -d "${try_dir}" ] && add_ldflags -L"${try_dir}"
|
||||||
done
|
done
|
||||||
@@ -754,24 +752,41 @@ process_common_toolchain() {
|
|||||||
linux*)
|
linux*)
|
||||||
enable linux
|
enable linux
|
||||||
if enabled rvct; then
|
if enabled rvct; then
|
||||||
# Check if we have CodeSourcery GCC in PATH. Needed for
|
# Compiling with RVCT requires an alternate libc (glibc) when
|
||||||
# libraries
|
# targetting linux.
|
||||||
hash arm-none-linux-gnueabi-gcc 2>&- || \
|
disabled builtin_libc \
|
||||||
die "Couldn't find CodeSourcery GCC from PATH"
|
|| die "Must supply --libc when targetting *-linux-rvct"
|
||||||
|
|
||||||
# Use armcc as a linker to enable translation of
|
# Set up compiler
|
||||||
# some gcc specific options such as -lm and -lpthread.
|
add_cflags --library_interface=aeabi_glibc
|
||||||
LD="armcc --translate_gcc"
|
add_cflags --no_hide_all
|
||||||
|
add_cflags --dwarf2
|
||||||
|
|
||||||
# create configuration file (uses path to CodeSourcery GCC)
|
# Set up linker
|
||||||
armcc --arm_linux_configure --arm_linux_config_file=arm_linux.cfg
|
add_ldflags --sysv --no_startup --no_ref_cpp_init
|
||||||
|
add_ldflags --entry=_start
|
||||||
|
add_ldflags --keep '"*(.init)"' --keep '"*(.fini)"'
|
||||||
|
add_ldflags --keep '"*(.init_array)"' --keep '"*(.fini_array)"'
|
||||||
|
add_ldflags --dynamiclinker=/lib/ld-linux.so.3
|
||||||
|
add_extralibs libc.so.6 -lc_nonshared crt1.o crti.o crtn.o
|
||||||
|
|
||||||
add_cflags --arm_linux_paths --arm_linux_config_file=arm_linux.cfg
|
# Add the paths for the alternate libc
|
||||||
add_asflags --no_hide_all --apcs=/interwork
|
for d in usr/include; do
|
||||||
add_ldflags --arm_linux_paths --arm_linux_config_file=arm_linux.cfg
|
try_dir="${alt_libc}/${d}"
|
||||||
enabled pic && add_cflags --apcs=/fpic
|
[ -d "${try_dir}" ] && add_cflags -J"${try_dir}"
|
||||||
enabled pic && add_asflags --apcs=/fpic
|
done
|
||||||
enabled shared && add_cflags --shared
|
add_cflags -J"${RVCT31INC}"
|
||||||
|
for d in lib usr/lib; do
|
||||||
|
try_dir="${alt_libc}/${d}"
|
||||||
|
[ -d "${try_dir}" ] && add_ldflags -L"${try_dir}"
|
||||||
|
done
|
||||||
|
|
||||||
|
|
||||||
|
# glibc has some struct members named __align, which is a
|
||||||
|
# storage modifier in RVCT. If we need to use this modifier,
|
||||||
|
# we'll have to #undef it in our code. Note that this must
|
||||||
|
# happen AFTER all libc inclues.
|
||||||
|
add_cflags -D__align=x_align_x
|
||||||
fi
|
fi
|
||||||
;;
|
;;
|
||||||
|
|
||||||
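Review note: the v0.9.7-p1 side of this hunk aborts when the CodeSourcery cross compiler is not on PATH, using `hash arm-none-linux-gnueabi-gcc 2>&- || die ...`. The `2>&-` form closes standard error instead of redirecting it. A hedged sketch of an equivalent check built on `command -v`, which POSIX also specifies, with output sent to `/dev/null`; `require_tool` is a hypothetical helper name and is not part of configure:

```shell
#!/bin/sh
# require_tool is a hypothetical helper; configure itself uses `hash ... || die`.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "Couldn't find $1 in PATH" >&2
    return 1
}

# `sh` is expected to exist; the second name is deliberately bogus.
require_tool sh && have_sh=yes || have_sh=no
require_tool no-such-tool-for-demo 2>/dev/null && have_fake=yes || have_fake=no

echo "sh: ${have_sh}, no-such-tool-for-demo: ${have_fake}"
```

A caller that wants configure's behaviour would follow the failing case with its own abort, for example `require_tool arm-none-linux-gnueabi-gcc || exit 1`.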
@@ -870,8 +885,6 @@ process_common_toolchain() {
|
|||||||
link_with_cc=gcc
|
link_with_cc=gcc
|
||||||
tune_cflags="-march="
|
tune_cflags="-march="
|
||||||
setup_gnu_toolchain
|
setup_gnu_toolchain
|
||||||
#for 32 bit x86 builds, -O3 did not turn on this flag
|
|
||||||
enabled optimizations && check_add_cflags -fomit-frame-pointer
|
|
||||||
;;
|
;;
|
||||||
esac
|
esac
|
||||||
|
|
||||||
@@ -939,23 +952,15 @@ process_common_toolchain() {
|
|||||||
enabled gcov &&
|
enabled gcov &&
|
||||||
check_add_cflags -fprofile-arcs -ftest-coverage &&
|
check_add_cflags -fprofile-arcs -ftest-coverage &&
|
||||||
check_add_ldflags -fprofile-arcs -ftest-coverage
|
check_add_ldflags -fprofile-arcs -ftest-coverage
|
||||||
|
|
||||||
if enabled optimizations; then
|
if enabled optimizations; then
|
||||||
if enabled rvct; then
|
enabled rvct && check_add_cflags -Otime
|
||||||
enabled small && check_add_cflags -Ospace || check_add_cflags -Otime
|
enabled small && check_add_cflags -O2 || check_add_cflags -O3
|
||||||
else
|
|
||||||
enabled small && check_add_cflags -O2 || check_add_cflags -O3
|
|
||||||
fi
|
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# Position Independent Code (PIC) support, for building relocatable
|
# Position Independent Code (PIC) support, for building relocatable
|
||||||
# shared objects
|
# shared objects
|
||||||
enabled gcc && enabled pic && check_add_cflags -fPIC
|
enabled gcc && enabled pic && check_add_cflags -fPIC
|
||||||
|
|
||||||
# Work around longjmp interception on glibc >= 2.11, to improve binary
|
|
||||||
# compatibility. See http://code.google.com/p/webm/issues/detail?id=166
|
|
||||||
enabled linux && check_add_cflags -D_FORTIFY_SOURCE=0
|
|
||||||
|
|
||||||
# Check for strip utility variant
|
# Check for strip utility variant
|
||||||
${STRIP} -V 2>/dev/null | grep GNU >/dev/null && enable gnu_strip
|
${STRIP} -V 2>/dev/null | grep GNU >/dev/null && enable gnu_strip
|
||||||
|
|
||||||
@@ -974,9 +979,6 @@ EOF
|
|||||||
esac
|
esac
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# for sysconf(3) and friends.
|
|
||||||
check_header unistd.h
|
|
||||||
|
|
||||||
# glibc needs these
|
# glibc needs these
|
||||||
if enabled linux; then
|
if enabled linux; then
|
||||||
add_cflags -D_LARGEFILE_SOURCE
|
add_cflags -D_LARGEFILE_SOURCE
|
||||||
|
|||||||
@@ -365,7 +365,7 @@ generate_vcproj() {
 DebugInformationFormat="1" \
 Detect64BitPortabilityProblems="true" \
 
-$uses_asm && tag Tool Name="YASM" IncludePaths="$incs" Debug="true"
+$uses_asm && tag Tool Name="YASM" IncludePaths="$incs" Debug="1"
 ;;
 *)
 tag Tool \
@@ -379,7 +379,7 @@ generate_vcproj() {
 DebugInformationFormat="1" \
 Detect64BitPortabilityProblems="true" \
 
-$uses_asm && tag Tool Name="YASM" IncludePaths="$incs" Debug="true"
+$uses_asm && tag Tool Name="YASM" IncludePaths="$incs" Debug="1"
 ;;
 esac
 ;;
@@ -447,8 +447,6 @@ generate_vcproj() {
|
|||||||
obj_int_extract)
|
obj_int_extract)
|
||||||
tag Tool \
|
tag Tool \
|
||||||
Name="VCCLCompilerTool" \
|
Name="VCCLCompilerTool" \
|
||||||
Optimization="2" \
|
|
||||||
FavorSizeorSpeed="1" \
|
|
||||||
AdditionalIncludeDirectories="$incs" \
|
AdditionalIncludeDirectories="$incs" \
|
||||||
PreprocessorDefinitions="WIN32;NDEBUG;_CONSOLE;_CRT_SECURE_NO_WARNINGS;_CRT_SECURE_NO_DEPRECATE" \
|
PreprocessorDefinitions="WIN32;NDEBUG;_CONSOLE;_CRT_SECURE_NO_WARNINGS;_CRT_SECURE_NO_DEPRECATE" \
|
||||||
RuntimeLibrary="$release_runtime" \
|
RuntimeLibrary="$release_runtime" \
|
||||||
@@ -464,8 +462,6 @@ generate_vcproj() {
|
|||||||
|
|
||||||
tag Tool \
|
tag Tool \
|
||||||
Name="VCCLCompilerTool" \
|
Name="VCCLCompilerTool" \
|
||||||
Optimization="2" \
|
|
||||||
FavorSizeorSpeed="1" \
|
|
||||||
AdditionalIncludeDirectories="$incs" \
|
AdditionalIncludeDirectories="$incs" \
|
||||||
PreprocessorDefinitions="WIN32;NDEBUG;_CRT_SECURE_NO_WARNINGS;_CRT_SECURE_NO_DEPRECATE;$defines" \
|
PreprocessorDefinitions="WIN32;NDEBUG;_CRT_SECURE_NO_WARNINGS;_CRT_SECURE_NO_DEPRECATE;$defines" \
|
||||||
RuntimeLibrary="$release_runtime" \
|
RuntimeLibrary="$release_runtime" \
|
||||||
@@ -480,8 +476,6 @@ generate_vcproj() {
|
|||||||
tag Tool \
|
tag Tool \
|
||||||
Name="VCCLCompilerTool" \
|
Name="VCCLCompilerTool" \
|
||||||
AdditionalIncludeDirectories="$incs" \
|
AdditionalIncludeDirectories="$incs" \
|
||||||
Optimization="2" \
|
|
||||||
FavorSizeorSpeed="1" \
|
|
||||||
PreprocessorDefinitions="WIN32;NDEBUG;_CRT_SECURE_NO_WARNINGS;_CRT_SECURE_NO_DEPRECATE;$defines" \
|
PreprocessorDefinitions="WIN32;NDEBUG;_CRT_SECURE_NO_WARNINGS;_CRT_SECURE_NO_DEPRECATE;$defines" \
|
||||||
RuntimeLibrary="$release_runtime" \
|
RuntimeLibrary="$release_runtime" \
|
||||||
UsePrecompiledHeader="0" \
|
UsePrecompiledHeader="0" \
|
||||||
|
|||||||
@@ -9,13 +9,25 @@
|
|||||||
*/
|
*/
|
||||||
|
|
||||||
|
|
||||||
#include <stdarg.h>
|
|
||||||
#include <stdio.h>
|
#include <stdio.h>
|
||||||
#include <stdlib.h>
|
#include <stdlib.h>
|
||||||
#include <string.h>
|
|
||||||
|
|
||||||
#include "vpx_config.h"
|
#include "vpx_config.h"
|
||||||
|
|
||||||
|
#if defined(_MSC_VER) || defined(__MINGW32__)
|
||||||
|
#include <io.h>
|
||||||
|
#include <share.h>
|
||||||
#include "vpx/vpx_integer.h"
|
#include "vpx/vpx_integer.h"
|
||||||
|
#else
|
||||||
|
#include <stdint.h>
|
||||||
|
#include <unistd.h>
|
||||||
|
#endif
|
||||||
|
|
||||||
|
#include <string.h>
|
||||||
|
#include <sys/types.h>
|
||||||
|
#include <sys/stat.h>
|
||||||
|
#include <fcntl.h>
|
||||||
|
#include <stdarg.h>
|
||||||
|
|
||||||
typedef enum
|
typedef enum
|
||||||
{
|
{
|
||||||
@@ -35,6 +47,7 @@ int log_msg(const char *fmt, ...)
|
|||||||
}
|
}
|
||||||
|
|
||||||
#if defined(__GNUC__) && __GNUC__
|
#if defined(__GNUC__) && __GNUC__
|
||||||
|
|
||||||
#if defined(__MACH__)
|
#if defined(__MACH__)
|
||||||
|
|
||||||
#include <mach-o/loader.h>
|
#include <mach-o/loader.h>
|
||||||
@@ -212,6 +225,73 @@ bail:
|
|||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
int main(int argc, char **argv)
|
||||||
|
{
|
||||||
|
int fd;
|
||||||
|
char *f;
|
||||||
|
struct stat stat_buf;
|
||||||
|
uint8_t *file_buf;
|
||||||
|
int res;
|
||||||
|
|
||||||
|
if (argc < 2 || argc > 3)
|
||||||
|
{
|
||||||
|
fprintf(stderr, "Usage: %s [output format] <obj file>\n\n", argv[0]);
|
||||||
|
fprintf(stderr, " <obj file>\tMachO format object file to parse\n");
|
||||||
|
fprintf(stderr, "Output Formats:\n");
|
||||||
|
fprintf(stderr, " gas - compatible with GNU assembler\n");
|
||||||
|
fprintf(stderr, " rvds - compatible with armasm\n");
|
||||||
|
goto bail;
|
||||||
|
}
|
||||||
|
|
||||||
|
f = argv[2];
|
||||||
|
|
||||||
|
if (!((!strcmp(argv[1], "rvds")) || (!strcmp(argv[1], "gas"))))
|
||||||
|
f = argv[1];
|
||||||
|
|
||||||
|
fd = open(f, O_RDONLY);
|
||||||
|
|
||||||
|
if (fd < 0)
|
||||||
|
{
|
||||||
|
perror("Unable to open file");
|
||||||
|
goto bail;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (fstat(fd, &stat_buf))
|
||||||
|
{
|
||||||
|
perror("stat");
|
||||||
|
goto bail;
|
||||||
|
}
|
||||||
|
|
||||||
|
file_buf = malloc(stat_buf.st_size);
|
||||||
|
|
||||||
|
if (!file_buf)
|
||||||
|
{
|
||||||
|
perror("malloc");
|
||||||
|
goto bail;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (read(fd, file_buf, stat_buf.st_size) != stat_buf.st_size)
|
||||||
|
{
|
||||||
|
perror("read");
|
||||||
|
goto bail;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (close(fd))
|
||||||
|
{
|
||||||
|
perror("close");
|
||||||
|
goto bail;
|
||||||
|
}
|
||||||
|
|
||||||
|
res = parse_macho(file_buf, stat_buf.st_size);
|
||||||
|
free(file_buf);
|
||||||
|
|
||||||
|
if (!res)
|
||||||
|
return EXIT_SUCCESS;
|
||||||
|
|
||||||
|
bail:
|
||||||
|
return EXIT_FAILURE;
|
||||||
|
}
|
||||||
|
|
||||||
#elif defined(__ELF__)
|
#elif defined(__ELF__)
|
||||||
#include "elf.h"
|
#include "elf.h"
|
||||||
|
|
||||||
@@ -660,24 +740,96 @@ bail:
|
|||||||
return 1;
|
return 1;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
int main(int argc, char **argv)
|
||||||
|
{
|
||||||
|
int fd;
|
||||||
|
output_fmt_t mode;
|
||||||
|
char *f;
|
||||||
|
struct stat stat_buf;
|
||||||
|
uint8_t *file_buf;
|
||||||
|
int res;
|
||||||
|
|
||||||
|
if (argc < 2 || argc > 3)
|
||||||
|
{
|
||||||
|
fprintf(stderr, "Usage: %s [output format] <obj file>\n\n", argv[0]);
|
||||||
|
fprintf(stderr, " <obj file>\tELF format object file to parse\n");
|
||||||
|
fprintf(stderr, "Output Formats:\n");
|
||||||
|
fprintf(stderr, " gas - compatible with GNU assembler\n");
|
||||||
|
fprintf(stderr, " rvds - compatible with armasm\n");
|
||||||
|
goto bail;
|
||||||
|
}
|
||||||
|
|
||||||
|
f = argv[2];
|
||||||
|
|
||||||
|
if (!strcmp(argv[1], "rvds"))
|
||||||
|
mode = OUTPUT_FMT_RVDS;
|
||||||
|
else if (!strcmp(argv[1], "gas"))
|
||||||
|
mode = OUTPUT_FMT_GAS;
|
||||||
|
else
|
||||||
|
f = argv[1];
|
||||||
|
|
||||||
|
|
||||||
|
fd = open(f, O_RDONLY);
|
||||||
|
|
||||||
|
if (fd < 0)
|
||||||
|
{
|
||||||
|
perror("Unable to open file");
|
||||||
|
goto bail;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (fstat(fd, &stat_buf))
|
||||||
|
{
|
||||||
|
perror("stat");
|
||||||
|
goto bail;
|
||||||
|
}
|
||||||
|
|
||||||
|
file_buf = malloc(stat_buf.st_size);
|
||||||
|
|
||||||
|
if (!file_buf)
|
||||||
|
{
|
||||||
|
perror("malloc");
|
||||||
|
goto bail;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (read(fd, file_buf, stat_buf.st_size) != stat_buf.st_size)
|
||||||
|
{
|
||||||
|
perror("read");
|
||||||
|
goto bail;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (close(fd))
|
||||||
|
{
|
||||||
|
perror("close");
|
||||||
|
goto bail;
|
||||||
|
}
|
||||||
|
|
||||||
|
res = parse_elf(file_buf, stat_buf.st_size, mode);
|
||||||
|
free(file_buf);
|
||||||
|
|
||||||
|
if (!res)
|
||||||
|
return EXIT_SUCCESS;
|
||||||
|
|
||||||
|
bail:
|
||||||
|
return EXIT_FAILURE;
|
||||||
|
}
|
||||||
|
#endif
|
||||||
#endif
|
#endif
|
||||||
#endif /* defined(__GNUC__) && __GNUC__ */
|
|
||||||
|
|
||||||
|
|
||||||
#if defined(_MSC_VER) || defined(__MINGW32__) || defined(__CYGWIN__)
|
#if defined(_MSC_VER) || defined(__MINGW32__)
|
||||||
/* See "Microsoft Portable Executable and Common Object File Format Specification"
|
/* See "Microsoft Portable Executable and Common Object File Format Specification"
|
||||||
for reference.
|
for reference.
|
||||||
*/
|
*/
|
||||||
#define get_le32(x) ((*(x)) | (*(x+1)) << 8 |(*(x+2)) << 16 | (*(x+3)) << 24 )
|
#define get_le32(x) ((*(x)) | (*(x+1)) << 8 |(*(x+2)) << 16 | (*(x+3)) << 24 )
|
||||||
#define get_le16(x) ((*(x)) | (*(x+1)) << 8)
|
#define get_le16(x) ((*(x)) | (*(x+1)) << 8)
|
||||||
|
|
||||||
int parse_coff(uint8_t *buf, size_t sz)
|
int parse_coff(unsigned __int8 *buf, size_t sz)
|
||||||
{
|
{
|
||||||
unsigned int nsections, symtab_ptr, symtab_sz, strtab_ptr;
|
unsigned int nsections, symtab_ptr, symtab_sz, strtab_ptr;
|
||||||
unsigned int sectionrawdata_ptr;
|
unsigned int sectionrawdata_ptr;
|
||||||
unsigned int i;
|
unsigned int i;
|
||||||
uint8_t *ptr;
|
unsigned __int8 *ptr;
|
||||||
uint32_t symoffset;
|
unsigned __int32 symoffset;
|
||||||
|
|
||||||
char **sectionlist; //this array holds all section names in their correct order.
|
char **sectionlist; //this array holds all section names in their correct order.
|
||||||
//it is used to check if the symbol is in .bss or .data section.
|
//it is used to check if the symbol is in .bss or .data section.
|
||||||
@@ -755,7 +907,7 @@ int parse_coff(uint8_t *buf, size_t sz)
|
|||||||
|
|
||||||
for (i = 0; i < symtab_sz; i++)
|
for (i = 0; i < symtab_sz; i++)
|
||||||
{
|
{
|
||||||
int16_t section = get_le16(ptr + 12); //section number
|
__int16 section = get_le16(ptr + 12); //section number
|
||||||
|
|
||||||
if (section > 0 && ptr[16] == 2)
|
if (section > 0 && ptr[16] == 2)
|
||||||
{
|
{
|
||||||
@@ -826,21 +978,20 @@ bail:
|
|||||||
|
|
||||||
return 1;
|
return 1;
|
||||||
}
|
}
|
||||||
#endif /* defined(_MSC_VER) || defined(__MINGW32__) || defined(__CYGWIN__) */
|
|
||||||
|
|
||||||
int main(int argc, char **argv)
|
int main(int argc, char **argv)
|
||||||
{
|
{
|
||||||
output_fmt_t mode = OUTPUT_FMT_PLAIN;
|
int fd;
|
||||||
|
output_fmt_t mode;
|
||||||
const char *f;
|
const char *f;
|
||||||
uint8_t *file_buf;
|
struct _stat stat_buf;
|
||||||
|
unsigned __int8 *file_buf;
|
||||||
int res;
|
int res;
|
||||||
FILE *fp;
|
|
||||||
long int file_size;
|
|
||||||
|
|
||||||
if (argc < 2 || argc > 3)
|
if (argc < 2 || argc > 3)
|
||||||
{
|
{
|
||||||
fprintf(stderr, "Usage: %s [output format] <obj file>\n\n", argv[0]);
|
fprintf(stderr, "Usage: %s [output format] <obj file>\n\n", argv[0]);
|
||||||
fprintf(stderr, " <obj file>\tobject file to parse\n");
|
fprintf(stderr, " <obj file>\tELF format object file to parse\n");
|
||||||
fprintf(stderr, "Output Formats:\n");
|
fprintf(stderr, "Output Formats:\n");
|
||||||
fprintf(stderr, " gas - compatible with GNU assembler\n");
|
fprintf(stderr, " gas - compatible with GNU assembler\n");
|
||||||
fprintf(stderr, " rvds - compatible with armasm\n");
|
fprintf(stderr, " rvds - compatible with armasm\n");
|
||||||
@@ -856,22 +1007,15 @@ int main(int argc, char **argv)
|
|||||||
else
|
else
|
||||||
f = argv[1];
|
f = argv[1];
|
||||||
|
|
||||||
fp = fopen(f, "rb");
|
fd = _sopen(f, _O_BINARY, _SH_DENYNO, _S_IREAD | _S_IWRITE);
|
||||||
|
|
||||||
if (!fp)
|
if (_fstat(fd, &stat_buf))
|
||||||
{
|
|
||||||
perror("Unable to open file");
|
|
||||||
goto bail;
|
|
||||||
}
|
|
||||||
|
|
||||||
if (fseek(fp, 0, SEEK_END))
|
|
||||||
{
|
{
|
||||||
perror("stat");
|
perror("stat");
|
||||||
goto bail;
|
goto bail;
|
||||||
}
|
}
|
||||||
|
|
||||||
file_size = ftell(fp);
|
file_buf = malloc(stat_buf.st_size);
|
||||||
file_buf = malloc(file_size);
|
|
||||||
|
|
||||||
if (!file_buf)
|
if (!file_buf)
|
||||||
{
|
{
|
||||||
@@ -879,30 +1023,19 @@ int main(int argc, char **argv)
|
|||||||
goto bail;
|
goto bail;
|
||||||
}
|
}
|
||||||
|
|
||||||
rewind(fp);
|
if (_read(fd, file_buf, stat_buf.st_size) != stat_buf.st_size)
|
||||||
|
|
||||||
if (fread(file_buf, sizeof(char), file_size, fp) != file_size)
|
|
||||||
{
|
{
|
||||||
perror("read");
|
perror("read");
|
||||||
goto bail;
|
goto bail;
|
||||||
}
|
}
|
||||||
|
|
||||||
if (fclose(fp))
|
if (_close(fd))
|
||||||
{
|
{
|
||||||
perror("close");
|
perror("close");
|
||||||
goto bail;
|
goto bail;
|
||||||
}
|
}
|
||||||
|
|
||||||
#if defined(__GNUC__) && __GNUC__
|
res = parse_coff(file_buf, stat_buf.st_size);
|
||||||
#if defined(__MACH__)
|
|
||||||
res = parse_macho(file_buf, file_size);
|
|
||||||
#elif defined(__ELF__)
|
|
||||||
res = parse_elf(file_buf, file_size, mode);
|
|
||||||
#endif
|
|
||||||
#endif
|
|
||||||
#if defined(_MSC_VER) || defined(__MINGW32__) || defined(__CYGWIN__)
|
|
||||||
res = parse_coff(file_buf, file_size);
|
|
||||||
#endif
|
|
||||||
|
|
||||||
free(file_buf);
|
free(file_buf);
|
||||||
|
|
||||||
@@ -912,3 +1045,4 @@ int main(int argc, char **argv)
|
|||||||
bail:
|
bail:
|
||||||
return EXIT_FAILURE;
|
return EXIT_FAILURE;
|
||||||
}
|
}
|
||||||
|
#endif
|
||||||
|
|||||||
21
configure
vendored
@@ -31,16 +31,14 @@ Advanced options:
|
|||||||
${toggle_md5} support for output of checksum data
|
${toggle_md5} support for output of checksum data
|
||||||
${toggle_static_msvcrt} use static MSVCRT (VS builds only)
|
${toggle_static_msvcrt} use static MSVCRT (VS builds only)
|
||||||
${toggle_vp8} VP8 codec support
|
${toggle_vp8} VP8 codec support
|
||||||
${toggle_internal_stats} output of encoder internal stats for debug, if supported (encoders)
|
${toggle_psnr} output of PSNR data, if supported (encoders)
|
||||||
${toggle_mem_tracker} track memory usage
|
${toggle_mem_tracker} track memory usage
|
||||||
${toggle_postproc} postprocessing
|
${toggle_postproc} postprocessing
|
||||||
${toggle_multithread} multithreaded encoding and decoding.
|
${toggle_multithread} multithreaded encoding and decoding.
|
||||||
${toggle_spatial_resampling} spatial sampling (scaling) support
|
${toggle_spatial_resampling} spatial sampling (scaling) support
|
||||||
${toggle_realtime_only} enable this option while building for real-time encoding
|
${toggle_realtime_only} enable this option while building for real-time encoding
|
||||||
${toggle_error_concealment} enable this option to get a decoder which is able to conceal losses
|
|
||||||
${toggle_runtime_cpu_detect} runtime cpu detection
|
${toggle_runtime_cpu_detect} runtime cpu detection
|
||||||
${toggle_shared} shared library support
|
${toggle_shared} shared library support
|
||||||
${toggle_static} static library support
|
|
||||||
${toggle_small} favor smaller size over speed
|
${toggle_small} favor smaller size over speed
|
||||||
${toggle_postproc_visualizer} macro block / block level visualizers
|
${toggle_postproc_visualizer} macro block / block level visualizers
|
||||||
|
|
||||||
@@ -154,7 +152,6 @@ enabled doxygen && php -v >/dev/null 2>&1 && enable install_docs
|
|||||||
enable install_bins
|
enable install_bins
|
||||||
enable install_libs
|
enable install_libs
|
||||||
|
|
||||||
enable static
|
|
||||||
enable optimizations
|
enable optimizations
|
||||||
enable fast_unaligned #allow unaligned accesses, if supported by hw
|
enable fast_unaligned #allow unaligned accesses, if supported by hw
|
||||||
enable md5
|
enable md5
|
||||||
@@ -214,7 +211,6 @@ HAVE_LIST="
|
|||||||
alt_tree_layout
|
alt_tree_layout
|
||||||
pthread_h
|
pthread_h
|
||||||
sys_mman_h
|
sys_mman_h
|
||||||
unistd_h
|
|
||||||
"
|
"
|
||||||
CONFIG_LIST="
|
CONFIG_LIST="
|
||||||
external_build
|
external_build
|
||||||
@@ -244,7 +240,7 @@ CONFIG_LIST="
|
|||||||
runtime_cpu_detect
|
runtime_cpu_detect
|
||||||
postproc
|
postproc
|
||||||
multithread
|
multithread
|
||||||
internal_stats
|
psnr
|
||||||
${CODECS}
|
${CODECS}
|
||||||
${CODEC_FAMILIES}
|
${CODEC_FAMILIES}
|
||||||
encoders
|
encoders
|
||||||
@@ -252,9 +248,7 @@ CONFIG_LIST="
|
|||||||
static_msvcrt
|
static_msvcrt
|
||||||
spatial_resampling
|
spatial_resampling
|
||||||
realtime_only
|
realtime_only
|
||||||
error_concealment
|
|
||||||
shared
|
shared
|
||||||
static
|
|
||||||
small
|
small
|
||||||
postproc_visualizer
|
postproc_visualizer
|
||||||
os_support
|
os_support
|
||||||
@@ -287,16 +281,14 @@ CMDLINE_SELECT="
|
|||||||
dc_recon
|
dc_recon
|
||||||
postproc
|
postproc
|
||||||
multithread
|
multithread
|
||||||
internal_stats
|
psnr
|
||||||
${CODECS}
|
${CODECS}
|
||||||
${CODEC_FAMILIES}
|
${CODEC_FAMILIES}
|
||||||
static_msvcrt
|
static_msvcrt
|
||||||
mem_tracker
|
mem_tracker
|
||||||
spatial_resampling
|
spatial_resampling
|
||||||
realtime_only
|
realtime_only
|
||||||
error_concealment
|
|
||||||
shared
|
shared
|
||||||
static
|
|
||||||
small
|
small
|
||||||
postproc_visualizer
|
postproc_visualizer
|
||||||
"
|
"
|
||||||
@@ -385,7 +377,6 @@ process_targets() {
|
|||||||
if [ -f "${source_path}/build/make/version.sh" ]; then
|
if [ -f "${source_path}/build/make/version.sh" ]; then
|
||||||
local ver=`"$source_path/build/make/version.sh" --bare $source_path`
|
local ver=`"$source_path/build/make/version.sh" --bare $source_path`
|
||||||
DIST_DIR="${DIST_DIR}-${ver}"
|
DIST_DIR="${DIST_DIR}-${ver}"
|
||||||
VERSION_STRING=${ver}
|
|
||||||
ver=${ver%%-*}
|
ver=${ver%%-*}
|
||||||
VERSION_PATCH=${ver##*.}
|
VERSION_PATCH=${ver##*.}
|
||||||
ver=${ver%.*}
|
ver=${ver%.*}
|
||||||
@@ -394,8 +385,6 @@ process_targets() {
|
|||||||
VERSION_MAJOR=${ver%.*}
|
VERSION_MAJOR=${ver%.*}
|
||||||
fi
|
fi
|
||||||
enabled child || cat <<EOF >> config.mk
|
enabled child || cat <<EOF >> config.mk
|
||||||
|
|
||||||
PREFIX=${prefix}
|
|
||||||
ifeq (\$(MAKECMDGOALS),dist)
|
ifeq (\$(MAKECMDGOALS),dist)
|
||||||
DIST_DIR?=${DIST_DIR}
|
DIST_DIR?=${DIST_DIR}
|
||||||
else
|
else
|
||||||
@@ -403,8 +392,6 @@ DIST_DIR?=\$(DESTDIR)${prefix}
|
|||||||
endif
|
endif
|
||||||
LIBSUBDIR=${libdir##${prefix}/}
|
LIBSUBDIR=${libdir##${prefix}/}
|
||||||
|
|
||||||
VERSION_STRING=${VERSION_STRING}
|
|
||||||
|
|
||||||
VERSION_MAJOR=${VERSION_MAJOR}
|
VERSION_MAJOR=${VERSION_MAJOR}
|
||||||
VERSION_MINOR=${VERSION_MINOR}
|
VERSION_MINOR=${VERSION_MINOR}
|
||||||
VERSION_PATCH=${VERSION_PATCH}
|
VERSION_PATCH=${VERSION_PATCH}
|
||||||
@@ -499,7 +486,7 @@ process_toolchain() {
|
|||||||
check_add_cflags -Wpointer-arith
|
check_add_cflags -Wpointer-arith
|
||||||
check_add_cflags -Wtype-limits
|
check_add_cflags -Wtype-limits
|
||||||
check_add_cflags -Wcast-qual
|
check_add_cflags -Wcast-qual
|
||||||
enabled extra_warnings || check_add_cflags -Wno-unused-function
|
enabled extra_warnings || check_add_cflags -Wno-unused
|
||||||
fi
|
fi
|
||||||
|
|
||||||
if enabled icc; then
|
if enabled icc; then
|
||||||
|
|||||||
16
examples.mk
@@ -77,11 +77,6 @@ GEN_EXAMPLES-$(CONFIG_ENCODERS) += decode_with_drops.c
|
|||||||
endif
|
endif
|
||||||
decode_with_drops.GUID = CE5C53C4-8DDA-438A-86ED-0DDD3CDB8D26
|
decode_with_drops.GUID = CE5C53C4-8DDA-438A-86ED-0DDD3CDB8D26
|
||||||
decode_with_drops.DESCRIPTION = Drops frames while decoding
|
decode_with_drops.DESCRIPTION = Drops frames while decoding
|
||||||
ifeq ($(CONFIG_DECODERS),yes)
|
|
||||||
GEN_EXAMPLES-$(CONFIG_ERROR_CONCEALMENT) += decode_with_partial_drops.c
|
|
||||||
endif
|
|
||||||
decode_with_partial_drops.GUID = 61C2D026-5754-46AC-916F-1343ECC5537E
|
|
||||||
decode_with_partial_drops.DESCRIPTION = Drops parts of frames while decoding
|
|
||||||
GEN_EXAMPLES-$(CONFIG_ENCODERS) += error_resilient.c
|
GEN_EXAMPLES-$(CONFIG_ENCODERS) += error_resilient.c
|
||||||
error_resilient.GUID = DF5837B9-4145-4F92-A031-44E4F832E00C
|
error_resilient.GUID = DF5837B9-4145-4F92-A031-44E4F832E00C
|
||||||
error_resilient.DESCRIPTION = Error Resiliency Feature
|
error_resilient.DESCRIPTION = Error Resiliency Feature
|
||||||
@@ -127,8 +122,8 @@ else
|
|||||||
LIB_PATH := $(call enabled,LIB_PATH)
|
LIB_PATH := $(call enabled,LIB_PATH)
|
||||||
INC_PATH := $(call enabled,INC_PATH)
|
INC_PATH := $(call enabled,INC_PATH)
|
||||||
endif
|
endif
|
||||||
INTERNAL_CFLAGS = $(addprefix -I,$(INC_PATH))
|
CFLAGS += $(addprefix -I,$(INC_PATH))
|
||||||
INTERNAL_LDFLAGS += $(addprefix -L,$(LIB_PATH))
|
LDFLAGS += $(addprefix -L,$(LIB_PATH))
|
||||||
|
|
||||||
|
|
||||||
# Expand list of selected examples to build (as specified above)
|
# Expand list of selected examples to build (as specified above)
|
||||||
@@ -167,10 +162,8 @@ BINS-$(NOT_MSVS) += $(addprefix $(BUILD_PFX),$(ALL_EXAMPLES:.c=))
|
|||||||
|
|
||||||
# Instantiate linker template for all examples.
|
# Instantiate linker template for all examples.
|
||||||
CODEC_LIB=$(if $(CONFIG_DEBUG_LIBS),vpx_g,vpx)
|
CODEC_LIB=$(if $(CONFIG_DEBUG_LIBS),vpx_g,vpx)
|
||||||
CODEC_LIB_SUF=$(if $(CONFIG_SHARED),.so,.a)
|
|
||||||
$(foreach bin,$(BINS-yes),\
|
$(foreach bin,$(BINS-yes),\
|
||||||
$(if $(BUILD_OBJS),$(eval $(bin):\
|
$(if $(BUILD_OBJS),$(eval $(bin): $(LIB_PATH)/lib$(CODEC_LIB).a))\
|
||||||
$(LIB_PATH)/lib$(CODEC_LIB)$(CODEC_LIB_SUF)))\
|
|
||||||
$(if $(BUILD_OBJS),$(eval $(call linker_template,$(bin),\
|
$(if $(BUILD_OBJS),$(eval $(call linker_template,$(bin),\
|
||||||
$(call objs,$($(notdir $(bin)).SRCS)) \
|
$(call objs,$($(notdir $(bin)).SRCS)) \
|
||||||
-l$(CODEC_LIB) $(addprefix -l,$(CODEC_EXTRA_LIBS))\
|
-l$(CODEC_LIB) $(addprefix -l,$(CODEC_EXTRA_LIBS))\
|
||||||
@@ -221,8 +214,7 @@ $(1): $($(1:.vcproj=).SRCS)
|
|||||||
--ver=$$(CONFIG_VS_VERSION)\
|
--ver=$$(CONFIG_VS_VERSION)\
|
||||||
--proj-guid=$$($$(@:.vcproj=).GUID)\
|
--proj-guid=$$($$(@:.vcproj=).GUID)\
|
||||||
$$(if $$(CONFIG_STATIC_MSVCRT),--static-crt) \
|
$$(if $$(CONFIG_STATIC_MSVCRT),--static-crt) \
|
||||||
--out=$$@ $$(INTERNAL_CFLAGS) $$(CFLAGS) \
|
--out=$$@ $$(CFLAGS) $$(LDFLAGS) -l$$(CODEC_LIB) -lwinmm $$^
|
||||||
$$(INTERNAL_LDFLAGS) $$(LDFLAGS) -l$$(CODEC_LIB) -lwinmm $$^
|
|
||||||
endef
|
endef
|
||||||
PROJECTS-$(CONFIG_MSVS) += $(ALL_EXAMPLES:.c=.vcproj)
|
PROJECTS-$(CONFIG_MSVS) += $(ALL_EXAMPLES:.c=.vcproj)
|
||||||
INSTALL-BINS-$(CONFIG_MSVS) += $(foreach p,$(VS_PLATFORMS),\
|
INSTALL-BINS-$(CONFIG_MSVS) += $(foreach p,$(VS_PLATFORMS),\
|
||||||
|
|||||||
@@ -1,238 +0,0 @@
|
|||||||
@TEMPLATE decoder_tmpl.c
|
|
||||||
Decode With Partial Drops Example
|
|
||||||
=========================
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ INTRODUCTION
|
|
||||||
This is an example utility which drops a series of frames (or parts of frames),
|
|
||||||
as specified on the command line. This is useful for observing the error
|
|
||||||
recovery features of the codec.
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ INTRODUCTION
|
|
||||||
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ EXTRA_INCLUDES
|
|
||||||
#include <time.h>
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ EXTRA_INCLUDES
|
|
||||||
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ HELPERS
|
|
||||||
struct parsed_header
|
|
||||||
{
|
|
||||||
char key_frame;
|
|
||||||
int version;
|
|
||||||
char show_frame;
|
|
||||||
int first_part_size;
|
|
||||||
};
|
|
||||||
|
|
||||||
int next_packet(struct parsed_header* hdr, int pos, int length, int mtu)
|
|
||||||
{
|
|
||||||
int size = 0;
|
|
||||||
int remaining = length - pos;
|
|
||||||
/* Uncompressed part is 3 bytes for P frames and 10 bytes for I frames */
|
|
||||||
int uncomp_part_size = (hdr->key_frame ? 10 : 3);
|
|
||||||
/* number of bytes yet to send from header and the first partition */
|
|
||||||
int remainFirst = uncomp_part_size + hdr->first_part_size - pos;
|
|
||||||
if (remainFirst > 0)
|
|
||||||
{
|
|
||||||
if (remainFirst <= mtu)
|
|
||||||
{
|
|
||||||
size = remainFirst;
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
size = mtu;
|
|
||||||
}
|
|
||||||
|
|
||||||
return size;
|
|
||||||
}
|
|
||||||
|
|
||||||
/* second partition; just slot it up according to MTU */
|
|
||||||
if (remaining <= mtu)
|
|
||||||
{
|
|
||||||
size = remaining;
|
|
||||||
return size;
|
|
||||||
}
|
|
||||||
return mtu;
|
|
||||||
}
|
|
||||||
|
|
||||||
void throw_packets(unsigned char* frame, int* size, int loss_rate,
|
|
||||||
int* thrown, int* kept)
|
|
||||||
{
|
|
||||||
unsigned char loss_frame[256*1024];
|
|
||||||
int pkg_size = 1;
|
|
||||||
int pos = 0;
|
|
||||||
int loss_pos = 0;
|
|
||||||
struct parsed_header hdr;
|
|
||||||
unsigned int tmp;
|
|
||||||
int mtu = 1500;
|
|
||||||
|
|
||||||
if (*size < 3)
|
|
||||||
{
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
putc('|', stdout);
|
|
||||||
/* parse uncompressed 3 bytes */
|
|
||||||
tmp = (frame[2] << 16) | (frame[1] << 8) | frame[0];
|
|
||||||
hdr.key_frame = !(tmp & 0x1); /* inverse logic */
|
|
||||||
hdr.version = (tmp >> 1) & 0x7;
|
|
||||||
hdr.show_frame = (tmp >> 4) & 0x1;
|
|
||||||
hdr.first_part_size = (tmp >> 5) & 0x7FFFF;
|
|
||||||
|
|
||||||
/* don't drop key frames */
|
|
||||||
if (hdr.key_frame)
|
|
||||||
{
|
|
||||||
int i;
|
|
||||||
*kept = *size/mtu + ((*size % mtu > 0) ? 1 : 0); /* approximate */
|
|
||||||
for (i=0; i < *kept; i++)
|
|
||||||
putc('.', stdout);
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
|
|
||||||
while ((pkg_size = next_packet(&hdr, pos, *size, mtu)) > 0)
|
|
||||||
{
|
|
||||||
int loss_event = ((rand() + 1.0)/(RAND_MAX + 1.0) < loss_rate/100.0);
|
|
||||||
if (*thrown == 0 && !loss_event)
|
|
||||||
{
|
|
||||||
memcpy(loss_frame + loss_pos, frame + pos, pkg_size);
|
|
||||||
loss_pos += pkg_size;
|
|
||||||
(*kept)++;
|
|
||||||
putc('.', stdout);
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
(*thrown)++;
|
|
||||||
putc('X', stdout);
|
|
||||||
}
|
|
||||||
pos += pkg_size;
|
|
||||||
}
|
|
||||||
memcpy(frame, loss_frame, loss_pos);
|
|
||||||
memset(frame + loss_pos, 0, *size - loss_pos);
|
|
||||||
*size = loss_pos;
|
|
||||||
}
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ HELPERS
|
|
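The 3-byte frame tag that `throw_packets` decodes above can be unpacked in isolation. The following is a minimal sketch that mirrors the bit layout used in the helper; the `frame_tag` struct and the `parse_frame_tag` function are hypothetical names introduced here for illustration and are not part of the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the fields read by throw_packets(). */
struct frame_tag {
    int key_frame;        /* 1 for a key frame (bit 0 is stored inverted) */
    int version;          /* bits 1..3 */
    int show_frame;       /* bit 4 */
    int first_part_size;  /* bits 5..23: size of the first partition in bytes */
};

/* Unpack the little-endian 3-byte tag at the start of a compressed frame.
 * The caller must supply at least 3 readable bytes. */
static struct frame_tag parse_frame_tag(const unsigned char *frame)
{
    struct frame_tag t;
    unsigned int tmp;

    assert(frame != NULL);
    tmp = ((unsigned int)frame[2] << 16) |
          ((unsigned int)frame[1] << 8) |
          (unsigned int)frame[0];

    t.key_frame = !(tmp & 0x1);   /* inverse logic, as in the helper above */
    t.version = (tmp >> 1) & 0x7;
    t.show_frame = (tmp >> 4) & 0x1;
    t.first_part_size = (tmp >> 5) & 0x7FFFF;
    return t;
}
```

With the made-up tag bytes `0x91 0x0C 0x00`, this reports a shown inter frame of version 0 whose first partition is 100 bytes long.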
||||||
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ DEC_INIT
|
|
||||||
/* Initialize codec */
|
|
||||||
flags = VPX_CODEC_USE_ERROR_CONCEALMENT;
|
|
||||||
res = vpx_codec_dec_init(&codec, interface, &dec_cfg, flags);
|
|
||||||
if(res)
|
|
||||||
die_codec(&codec, "Failed to initialize decoder");
|
|
||||||
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ DEC_INIT
|
|
||||||
|
|
||||||
Usage
|
|
||||||
-----
|
|
||||||
This example adds a single argument to the `simple_decoder` example,
|
|
||||||
which specifies the range or pattern of frames to drop. The parameter is
|
|
||||||
parsed as follows:
|
|
||||||
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ USAGE
|
|
||||||
if(argc < 4 || argc > 6)
|
|
||||||
die("Usage: %s <infile> <outfile> [-t <num threads>] <N-M|N/M|L,S>\n",
|
|
||||||
argv[0]);
|
|
||||||
{
|
|
||||||
char *nptr;
|
|
||||||
int arg_num = 3;
|
|
||||||
if (argc == 6 && strncmp(argv[arg_num++], "-t", 2) == 0)
|
|
||||||
dec_cfg.threads = strtol(argv[arg_num++], NULL, 0);
|
|
||||||
n = strtol(argv[arg_num], &nptr, 0);
|
|
||||||
mode = (*nptr == '\0' || *nptr == ',') ? 2 : (*nptr == '-') ? 1 : 0;
|
|
||||||
|
|
||||||
m = strtol(nptr+1, NULL, 0);
|
|
||||||
if((!n && !m) || (*nptr != '-' && *nptr != '/' &&
|
|
||||||
*nptr != '\0' && *nptr != ','))
|
|
||||||
die("Couldn't parse pattern %s\n", argv[3]);
|
|
||||||
}
|
|
||||||
seed = (m > 0) ? m : (unsigned int)time(NULL);
|
|
||||||
srand(seed);thrown_frame = 0;
|
|
||||||
printf("Seed: %u\n", seed);
|
|
||||||
printf("Threads: %d\n", dec_cfg.threads);
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ USAGE
|
|
||||||
|
|
||||||
|
|
||||||
Dropping A Range Of Frames
|
|
||||||
--------------------------
|
|
||||||
To drop a range of frames, specify the starting frame and the ending
|
|
||||||
frame to drop, separated by a dash. The following command will drop
|
|
||||||
frames 5 through 10 (base 1).
|
|
||||||
|
|
||||||
$ ./decode_with_partial_drops in.ivf out.i420 5-10
|
|
||||||
|
|
||||||
|
|
||||||
Dropping A Pattern Of Frames
|
|
||||||
----------------------------
|
|
||||||
To drop a pattern of frames, specify the number of frames to drop and
|
|
||||||
the number of frames after which to repeat the pattern, separated by
|
|
||||||
a forward-slash. The following command will drop 3 of 7 frames.
|
|
||||||
Specifically, it will decode 4 frames, then drop 3 frames, and then
|
|
||||||
repeat.
|
|
||||||
|
|
||||||
$ ./decode_with_partial_drops in.ivf out.i420 3/7
|
|
||||||
|
|
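The range and pattern rules described above reduce to two small predicates on the 1-based frame counter. This is a sketch only: the function names `drop_in_range` and `drop_in_pattern` are hypothetical, and the `m > 0` guard is an assumption added here. The expressions themselves are taken from this example's PRE_DECODE block.

```c
#include <assert.h>

/* Range mode (N-M): drop frames n through m, inclusive, counting from 1. */
static int drop_in_range(int frame_cnt, int n, int m)
{
    return frame_cnt >= n && frame_cnt <= m;
}

/* Pattern mode (N/M): within every group of m frames, keep the first
 * m - n frames and drop the last n. */
static int drop_in_pattern(int frame_cnt, int n, int m)
{
    assert(m > 0);  /* assumed guard against a zero modulus */
    return m - (frame_cnt - 1) % m <= n;
}
```

For `3/7`, frames 1 through 4 are decoded, frames 5 through 7 are dropped, and the cycle restarts at frame 8.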
||||||
Dropping Random Parts Of Frames
|
|
||||||
-------------------------------
|
|
||||||
A third argument tuple is available to split the frame into 1500-byte pieces
|
|
||||||
and randomly drop pieces rather than frames. The frame will be split at
|
|
||||||
partition boundaries where possible. The following example will seed the RNG
|
|
||||||
with the seed 123 and drop approximately 5% of the pieces. Pieces that
|
|
||||||
depend on an already dropped piece will also be dropped.
|
|
||||||
|
|
||||||
$ ./decode_with_partial_drops in.ivf out.i420 5,123
|
|
||||||
|
|
||||||
|
|
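To see how a frame is cut into pieces, the splitting rule can be restated on its own. The sketch below follows the logic of `next_packet` in the HELPERS block: the uncompressed header plus the first partition is sent first, cut at the MTU if necessary, and the rest of the frame is then sliced into MTU-sized pieces. The names `next_piece` and `count_pieces` are hypothetical, as is the convention of passing the header and first-partition size as a single value.

```c
#include <assert.h>

/* Size of the next piece starting at byte offset pos, or 0 once the frame
 * is exhausted. header_and_first is the uncompressed header size plus the
 * size of the first partition. */
static int next_piece(int header_and_first, int pos, int length, int mtu)
{
    int remaining = length - pos;
    int remain_first = header_and_first - pos;

    assert(mtu > 0);
    if (remain_first > 0)                  /* still inside header + first partition */
        return remain_first <= mtu ? remain_first : mtu;
    if (remaining <= mtu)                  /* last piece of the remaining data */
        return remaining;
    return mtu;
}

/* Number of pieces a frame of the given length is split into. */
static int count_pieces(int header_and_first, int length, int mtu)
{
    int pos = 0, pieces = 0, size;

    while ((size = next_piece(header_and_first, pos, length, mtu)) > 0) {
        pos += size;
        pieces++;
    }
    return pieces;
}
```

A 4000-byte inter frame with a 3-byte header and a 100-byte first partition is split into pieces of 103, 1500, 1500 and 897 bytes, so a piece boundary falls exactly at the end of the first partition.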
||||||
Extra Variables
|
|
||||||
---------------
|
|
||||||
This example maintains the pattern passed on the command line in the
|
|
||||||
`n`, `m`, and `mode` variables:
|
|
||||||
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ EXTRA_VARS
|
|
||||||
int n, m, mode;
|
|
||||||
unsigned int seed;
|
|
||||||
int thrown=0, kept=0;
|
|
||||||
int thrown_frame=0, kept_frame=0;
|
|
||||||
vpx_codec_dec_cfg_t dec_cfg = {0};
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ EXTRA_VARS
|
|
||||||
|
|
||||||
|
|
||||||
Making The Drop Decision
|
|
||||||
------------------------
|
|
||||||
The example decides whether to drop the frame based on the current
|
|
||||||
frame number, immediately before decoding the frame.
|
|
||||||
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ PRE_DECODE
|
|
||||||
/* Decide whether to throw parts of the frame or the whole frame
|
|
||||||
depending on the drop mode */
|
|
||||||
thrown_frame = 0;
|
|
||||||
kept_frame = 0;
|
|
||||||
switch (mode)
|
|
||||||
{
|
|
||||||
case 0:
|
|
||||||
if (m - (frame_cnt-1)%m <= n)
|
|
||||||
{
|
|
||||||
frame_sz = 0;
|
|
||||||
}
|
|
||||||
break;
|
|
||||||
case 1:
|
|
||||||
if (frame_cnt >= n && frame_cnt <= m)
|
|
||||||
{
|
|
||||||
frame_sz = 0;
|
|
||||||
}
|
|
||||||
break;
|
|
||||||
case 2:
|
|
||||||
throw_packets(frame, &frame_sz, n, &thrown_frame, &kept_frame);
|
|
||||||
break;
|
|
||||||
default: break;
|
|
||||||
}
|
|
||||||
if (mode < 2)
|
|
||||||
{
|
|
||||||
if (frame_sz == 0)
|
|
||||||
{
|
|
||||||
putc('X', stdout);
|
|
||||||
thrown_frame++;
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
putc('.', stdout);
|
|
||||||
kept_frame++;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
thrown += thrown_frame;
|
|
||||||
kept += kept_frame;
|
|
||||||
fflush(stdout);
|
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ PRE_DECODE
|
|
||||||
@@ -42,8 +42,6 @@ static void die(const char *fmt, ...) {
|
|||||||
|
|
||||||
@DIE_CODEC
|
@DIE_CODEC
|
||||||
|
|
||||||
@HELPERS
|
|
||||||
|
|
||||||
int main(int argc, char **argv) {
|
int main(int argc, char **argv) {
|
||||||
FILE *infile, *outfile;
|
FILE *infile, *outfile;
|
||||||
vpx_codec_ctx_t codec;
|
vpx_codec_ctx_t codec;
|
||||||
|
|||||||
@@ -111,6 +111,8 @@ int main(int argc, char **argv) {
|
|||||||
vpx_codec_ctx_t codec;
|
vpx_codec_ctx_t codec;
|
||||||
vpx_codec_enc_cfg_t cfg;
|
vpx_codec_enc_cfg_t cfg;
|
||||||
int frame_cnt = 0;
|
int frame_cnt = 0;
|
||||||
|
unsigned char file_hdr[IVF_FILE_HDR_SZ];
|
||||||
|
unsigned char frame_hdr[IVF_FRAME_HDR_SZ];
|
||||||
vpx_image_t raw;
|
vpx_image_t raw;
|
||||||
vpx_codec_err_t res;
|
vpx_codec_err_t res;
|
||||||
long width;
|
long width;
|
||||||
|
|||||||
@@ -21,7 +21,7 @@ res = vpx_codec_dec_init(&codec, interface, NULL,
|
|||||||
if(res == VPX_CODEC_INCAPABLE) {
|
if(res == VPX_CODEC_INCAPABLE) {
|
||||||
printf("NOTICE: Postproc not supported by %s\n",
|
printf("NOTICE: Postproc not supported by %s\n",
|
||||||
vpx_codec_iface_name(interface));
|
vpx_codec_iface_name(interface));
|
||||||
res = vpx_codec_dec_init(&codec, interface, NULL, flags);
|
res = vpx_codec_dec_init(&codec, interface, NULL, 0);
|
||||||
}
|
}
|
||||||
if(res)
|
if(res)
|
||||||
die_codec(&codec, "Failed to initialize decoder");
|
die_codec(&codec, "Failed to initialize decoder");
|
||||||
|
|||||||
@@ -120,7 +120,7 @@ enum mkv
|
|||||||
//video
|
//video
|
||||||
Video = 0xE0,
|
Video = 0xE0,
|
||||||
FlagInterlaced = 0x9A,
|
FlagInterlaced = 0x9A,
|
||||||
StereoMode = 0x53B8,
|
// StereoMode = 0x53B8,
|
||||||
PixelWidth = 0xB0,
|
PixelWidth = 0xB0,
|
||||||
PixelHeight = 0xBA,
|
PixelHeight = 0xBA,
|
||||||
PixelCropBottom = 0x54AA,
|
PixelCropBottom = 0x54AA,
|
||||||
|
|||||||
@@ -11,7 +11,6 @@
|
|||||||
#include <stdlib.h>
|
#include <stdlib.h>
|
||||||
#include <wchar.h>
|
#include <wchar.h>
|
||||||
#include <string.h>
|
#include <string.h>
|
||||||
#include <limits.h>
|
|
||||||
#if defined(_MSC_VER)
|
#if defined(_MSC_VER)
|
||||||
#define LITERALU64(n) n
|
#define LITERALU64(n) n
|
||||||
#else
|
#else
|
||||||
@@ -34,7 +33,7 @@ void Ebml_WriteLen(EbmlGlobal *glob, long long val)
|
|||||||
|
|
||||||
val |= (LITERALU64(0x000000000000080) << ((size - 1) * 7));
|
val |= (LITERALU64(0x000000000000080) << ((size - 1) * 7));
|
||||||
|
|
||||||
Ebml_Serialize(glob, (void *) &val, sizeof(val), size);
|
Ebml_Serialize(glob, (void *) &val, size);
|
||||||
}
|
}
|
||||||
|
|
||||||
void Ebml_WriteString(EbmlGlobal *glob, const char *str)
|
void Ebml_WriteString(EbmlGlobal *glob, const char *str)
|
||||||
@@ -61,26 +60,21 @@ void Ebml_WriteUTF8(EbmlGlobal *glob, const wchar_t *wstr)
|
|||||||
|
|
||||||
void Ebml_WriteID(EbmlGlobal *glob, unsigned long class_id)
|
void Ebml_WriteID(EbmlGlobal *glob, unsigned long class_id)
|
||||||
{
|
{
|
||||||
int len;
|
|
||||||
|
|
||||||
if (class_id >= 0x01000000)
|
if (class_id >= 0x01000000)
|
||||||
len = 4;
|
Ebml_Serialize(glob, (void *)&class_id, 4);
|
||||||
else if (class_id >= 0x00010000)
|
else if (class_id >= 0x00010000)
|
||||||
len = 3;
|
Ebml_Serialize(glob, (void *)&class_id, 3);
|
||||||
else if (class_id >= 0x00000100)
|
else if (class_id >= 0x00000100)
|
||||||
len = 2;
|
Ebml_Serialize(glob, (void *)&class_id, 2);
|
||||||
else
|
else
|
||||||
len = 1;
|
Ebml_Serialize(glob, (void *)&class_id, 1);
|
||||||
|
|
||||||
Ebml_Serialize(glob, (void *)&class_id, sizeof(class_id), len);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
void Ebml_SerializeUnsigned64(EbmlGlobal *glob, unsigned long class_id, uint64_t ui)
|
void Ebml_SerializeUnsigned64(EbmlGlobal *glob, unsigned long class_id, uint64_t ui)
|
||||||
{
|
{
|
||||||
unsigned char sizeSerialized = 8 | 0x80;
|
unsigned char sizeSerialized = 8 | 0x80;
|
||||||
Ebml_WriteID(glob, class_id);
|
Ebml_WriteID(glob, class_id);
|
||||||
Ebml_Serialize(glob, &sizeSerialized, sizeof(sizeSerialized), 1);
|
Ebml_Serialize(glob, &sizeSerialized, 1);
|
||||||
Ebml_Serialize(glob, &ui, sizeof(ui), 8);
|
Ebml_Serialize(glob, &ui, 8);
|
||||||
}
|
}
|
||||||
|
|
||||||
void Ebml_SerializeUnsigned(EbmlGlobal *glob, unsigned long class_id, unsigned long ui)
|
void Ebml_SerializeUnsigned(EbmlGlobal *glob, unsigned long class_id, unsigned long ui)
|
||||||
@@ -103,8 +97,8 @@ void Ebml_SerializeUnsigned(EbmlGlobal *glob, unsigned long class_id, unsigned l
|
|||||||
}
|
}
|
||||||
|
|
||||||
sizeSerialized = 0x80 | size;
|
sizeSerialized = 0x80 | size;
|
||||||
Ebml_Serialize(glob, &sizeSerialized, sizeof(sizeSerialized), 1);
|
Ebml_Serialize(glob, &sizeSerialized, 1);
|
||||||
Ebml_Serialize(glob, &ui, sizeof(ui), size);
|
Ebml_Serialize(glob, &ui, size);
|
||||||
}
|
}
|
||||||
//TODO: perhaps this is a poor name for this id serializer helper function
|
//TODO: perhaps this is a poor name for this id serializer helper function
|
||||||
void Ebml_SerializeBinary(EbmlGlobal *glob, unsigned long class_id, unsigned long bin)
|
void Ebml_SerializeBinary(EbmlGlobal *glob, unsigned long class_id, unsigned long bin)
|
||||||
@@ -125,14 +119,14 @@ void Ebml_SerializeFloat(EbmlGlobal *glob, unsigned long class_id, double d)
|
|||||||
unsigned char len = 0x88;
|
unsigned char len = 0x88;
|
||||||
|
|
||||||
Ebml_WriteID(glob, class_id);
|
Ebml_WriteID(glob, class_id);
|
||||||
Ebml_Serialize(glob, &len, sizeof(len), 1);
|
Ebml_Serialize(glob, &len, 1);
|
||||||
Ebml_Serialize(glob, &d, sizeof(d), 8);
|
Ebml_Serialize(glob, &d, 8);
|
||||||
}
|
}
|
||||||
|
|
||||||
void Ebml_WriteSigned16(EbmlGlobal *glob, short val)
|
void Ebml_WriteSigned16(EbmlGlobal *glob, short val)
|
||||||
{
|
{
|
||||||
signed long out = ((val & 0x003FFFFF) | 0x00200000) << 8;
|
signed long out = ((val & 0x003FFFFF) | 0x00200000) << 8;
|
||||||
Ebml_Serialize(glob, &out, sizeof(out), 3);
|
Ebml_Serialize(glob, &out, 3);
|
||||||
}
|
}
|
||||||
|
|
||||||
void Ebml_SerializeString(EbmlGlobal *glob, unsigned long class_id, const char *s)
|
void Ebml_SerializeString(EbmlGlobal *glob, unsigned long class_id, const char *s)
|
||||||
@@ -149,6 +143,7 @@ void Ebml_SerializeUTF8(EbmlGlobal *glob, unsigned long class_id, wchar_t *s)
|
|||||||
|
|
||||||
void Ebml_SerializeData(EbmlGlobal *glob, unsigned long class_id, unsigned char *data, unsigned long data_length)
|
void Ebml_SerializeData(EbmlGlobal *glob, unsigned long class_id, unsigned char *data, unsigned long data_length)
|
||||||
{
|
{
|
||||||
|
unsigned char size = 4;
|
||||||
Ebml_WriteID(glob, class_id);
|
Ebml_WriteID(glob, class_id);
|
||||||
Ebml_WriteLen(glob, data_length);
|
Ebml_WriteLen(glob, data_length);
|
||||||
Ebml_Write(glob, data, data_length);
|
Ebml_Write(glob, data, data_length);
|
||||||
|
|||||||
@@ -15,7 +15,7 @@
|
|||||||
#include "vpx/vpx_integer.h"
|
#include "vpx/vpx_integer.h"
|
||||||
|
|
||||||
typedef struct EbmlGlobal EbmlGlobal;
|
typedef struct EbmlGlobal EbmlGlobal;
|
||||||
void Ebml_Serialize(EbmlGlobal *glob, const void *, int, unsigned long);
|
void Ebml_Serialize(EbmlGlobal *glob, const void *, unsigned long);
|
||||||
void Ebml_Write(EbmlGlobal *glob, const void *, unsigned long);
|
void Ebml_Write(EbmlGlobal *glob, const void *, unsigned long);
|
||||||
/////
|
/////
|
||||||
|
|
||||||
|
|||||||
@@ -35,11 +35,11 @@ void writeSimpleBlock(EbmlGlobal *glob, unsigned char trackNumber, short timeCod
|
|||||||
Ebml_WriteID(glob, SimpleBlock);
|
Ebml_WriteID(glob, SimpleBlock);
|
||||||
unsigned long blockLength = 4 + dataLength;
|
unsigned long blockLength = 4 + dataLength;
|
||||||
blockLength |= 0x10000000; //TODO check length < 0x0FFFFFFFF
|
blockLength |= 0x10000000; //TODO check length < 0x0FFFFFFFF
|
||||||
Ebml_Serialize(glob, &blockLength, sizeof(blockLength), 4);
|
Ebml_Serialize(glob, &blockLength, 4);
|
||||||
trackNumber |= 0x80; //TODO check track number < 128
|
trackNumber |= 0x80; //TODO check track number < 128
|
||||||
Ebml_Write(glob, &trackNumber, 1);
|
Ebml_Write(glob, &trackNumber, 1);
|
||||||
//Ebml_WriteSigned16(glob, timeCode,2); //this is 3 bytes
|
//Ebml_WriteSigned16(glob, timeCode,2); //this is 3 bytes
|
||||||
Ebml_Serialize(glob, &timeCode, sizeof(timeCode), 2);
|
Ebml_Serialize(glob, &timeCode, 2);
|
||||||
unsigned char flags = 0x00 | (isKeyframe ? 0x80 : 0x00) | (lacingFlag << 1) | discardable;
|
unsigned char flags = 0x00 | (isKeyframe ? 0x80 : 0x00) | (lacingFlag << 1) | discardable;
|
||||||
Ebml_Write(glob, &flags, 1);
|
Ebml_Write(glob, &flags, 1);
|
||||||
Ebml_Write(glob, data, dataLength);
|
Ebml_Write(glob, data, dataLength);
|
||||||
|
|||||||
101
libs.mk
@@ -35,7 +35,6 @@ ifeq ($(CONFIG_VP8_ENCODER),yes)
|
|||||||
CODEC_SRCS-yes += $(addprefix $(VP8_PREFIX),$(call enabled,VP8_CX_SRCS))
|
CODEC_SRCS-yes += $(addprefix $(VP8_PREFIX),$(call enabled,VP8_CX_SRCS))
|
||||||
CODEC_EXPORTS-yes += $(addprefix $(VP8_PREFIX),$(VP8_CX_EXPORTS))
|
CODEC_EXPORTS-yes += $(addprefix $(VP8_PREFIX),$(VP8_CX_EXPORTS))
|
||||||
CODEC_SRCS-yes += $(VP8_PREFIX)vp8cx.mk vpx/vp8.h vpx/vp8cx.h vpx/vp8e.h
|
CODEC_SRCS-yes += $(VP8_PREFIX)vp8cx.mk vpx/vp8.h vpx/vp8cx.h vpx/vp8e.h
|
||||||
CODEC_SRCS-$(ARCH_ARM) += $(VP8_PREFIX)vp8cx_arm.mk
|
|
||||||
INSTALL-LIBS-yes += include/vpx/vp8.h include/vpx/vp8e.h include/vpx/vp8cx.h
|
INSTALL-LIBS-yes += include/vpx/vp8.h include/vpx/vp8e.h include/vpx/vp8cx.h
|
||||||
INSTALL_MAPS += include/vpx/% $(SRC_PATH_BARE)/$(VP8_PREFIX)/%
|
INSTALL_MAPS += include/vpx/% $(SRC_PATH_BARE)/$(VP8_PREFIX)/%
|
||||||
CODEC_DOC_SRCS += vpx/vp8.h vpx/vp8cx.h
|
CODEC_DOC_SRCS += vpx/vp8.h vpx/vp8cx.h
|
||||||
@@ -48,7 +47,6 @@ ifeq ($(CONFIG_VP8_DECODER),yes)
|
|||||||
CODEC_SRCS-yes += $(addprefix $(VP8_PREFIX),$(call enabled,VP8_DX_SRCS))
|
CODEC_SRCS-yes += $(addprefix $(VP8_PREFIX),$(call enabled,VP8_DX_SRCS))
|
||||||
CODEC_EXPORTS-yes += $(addprefix $(VP8_PREFIX),$(VP8_DX_EXPORTS))
|
CODEC_EXPORTS-yes += $(addprefix $(VP8_PREFIX),$(VP8_DX_EXPORTS))
|
||||||
CODEC_SRCS-yes += $(VP8_PREFIX)vp8dx.mk vpx/vp8.h vpx/vp8dx.h
|
CODEC_SRCS-yes += $(VP8_PREFIX)vp8dx.mk vpx/vp8.h vpx/vp8dx.h
|
||||||
CODEC_SRCS-$(ARCH_ARM) += $(VP8_PREFIX)vp8dx_arm.mk
|
|
||||||
INSTALL-LIBS-yes += include/vpx/vp8.h include/vpx/vp8dx.h
|
INSTALL-LIBS-yes += include/vpx/vp8.h include/vpx/vp8dx.h
|
||||||
INSTALL_MAPS += include/vpx/% $(SRC_PATH_BARE)/$(VP8_PREFIX)/%
|
INSTALL_MAPS += include/vpx/% $(SRC_PATH_BARE)/$(VP8_PREFIX)/%
|
||||||
CODEC_DOC_SRCS += vpx/vp8.h vpx/vp8dx.h
|
CODEC_DOC_SRCS += vpx/vp8.h vpx/vp8dx.h
|
||||||
@@ -91,7 +89,6 @@ $(eval $(if $(filter universal%,$(TOOLCHAIN)),LIPO_LIBVPX,BUILD_LIBVPX):=yes)
|
|||||||
|
|
||||||
CODEC_SRCS-$(BUILD_LIBVPX) += build/make/version.sh
|
CODEC_SRCS-$(BUILD_LIBVPX) += build/make/version.sh
|
||||||
CODEC_SRCS-$(BUILD_LIBVPX) += vpx/vpx_integer.h
|
CODEC_SRCS-$(BUILD_LIBVPX) += vpx/vpx_integer.h
|
||||||
CODEC_SRCS-$(BUILD_LIBVPX) += vpx_ports/asm_offsets.h
|
|
||||||
CODEC_SRCS-$(BUILD_LIBVPX) += vpx_ports/vpx_timer.h
|
CODEC_SRCS-$(BUILD_LIBVPX) += vpx_ports/vpx_timer.h
|
||||||
CODEC_SRCS-$(BUILD_LIBVPX) += vpx_ports/mem.h
|
CODEC_SRCS-$(BUILD_LIBVPX) += vpx_ports/mem.h
|
||||||
CODEC_SRCS-$(BUILD_LIBVPX) += $(BUILD_PFX)vpx_config.c
|
CODEC_SRCS-$(BUILD_LIBVPX) += $(BUILD_PFX)vpx_config.c
|
||||||
@@ -103,7 +100,7 @@ CODEC_SRCS-$(BUILD_LIBVPX) += vpx_ports/x86_abi_support.asm
|
|||||||
CODEC_SRCS-$(BUILD_LIBVPX) += vpx_ports/x86_cpuid.c
|
CODEC_SRCS-$(BUILD_LIBVPX) += vpx_ports/x86_cpuid.c
|
||||||
endif
|
endif
|
||||||
CODEC_SRCS-$(ARCH_ARM) += vpx_ports/arm_cpudetect.c
|
CODEC_SRCS-$(ARCH_ARM) += vpx_ports/arm_cpudetect.c
|
||||||
CODEC_SRCS-$(ARCH_ARM) += vpx_ports/arm.h
|
CODEC_SRCS-$(ARCH_ARM) += $(BUILD_PFX)vpx_config.asm
|
||||||
CODEC_EXPORTS-$(BUILD_LIBVPX) += vpx/exports_com
|
CODEC_EXPORTS-$(BUILD_LIBVPX) += vpx/exports_com
|
||||||
CODEC_EXPORTS-$(CONFIG_ENCODERS) += vpx/exports_enc
|
CODEC_EXPORTS-$(CONFIG_ENCODERS) += vpx/exports_enc
|
||||||
CODEC_EXPORTS-$(CONFIG_DECODERS) += vpx/exports_dec
|
CODEC_EXPORTS-$(CONFIG_DECODERS) += vpx/exports_dec
|
||||||
@@ -124,7 +121,7 @@ INSTALL-LIBS-$(CONFIG_SHARED) += $(foreach p,$(VS_PLATFORMS),$(LIBSUBDIR)/$(p)/v
|
|||||||
INSTALL-LIBS-$(CONFIG_SHARED) += $(foreach p,$(VS_PLATFORMS),$(LIBSUBDIR)/$(p)/vpx.exp)
|
INSTALL-LIBS-$(CONFIG_SHARED) += $(foreach p,$(VS_PLATFORMS),$(LIBSUBDIR)/$(p)/vpx.exp)
|
||||||
endif
|
endif
|
||||||
else
|
else
|
||||||
INSTALL-LIBS-$(CONFIG_STATIC) += $(LIBSUBDIR)/libvpx.a
|
INSTALL-LIBS-yes += $(LIBSUBDIR)/libvpx.a
|
||||||
INSTALL-LIBS-$(CONFIG_DEBUG_LIBS) += $(LIBSUBDIR)/libvpx_g.a
|
INSTALL-LIBS-$(CONFIG_DEBUG_LIBS) += $(LIBSUBDIR)/libvpx_g.a
|
||||||
endif
|
endif
|
||||||
|
|
||||||
@@ -132,14 +129,6 @@ CODEC_SRCS=$(call enabled,CODEC_SRCS)
|
|||||||
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += $(CODEC_SRCS)
|
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += $(CODEC_SRCS)
|
||||||
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += $(call enabled,CODEC_EXPORTS)
|
INSTALL-SRCS-$(CONFIG_CODEC_SRCS) += $(call enabled,CODEC_EXPORTS)
|
||||||
|
|
||||||
|
|
||||||
# Generate a list of all enabled sources, in particular for exporting to gyp
|
|
||||||
# based build systems.
|
|
||||||
libvpx_srcs.txt:
|
|
||||||
@echo " [CREATE] $@"
|
|
||||||
@echo $(CODEC_SRCS) | xargs -n1 echo | sort -u > $@
|
|
||||||
|
|
||||||
|
|
||||||
ifeq ($(CONFIG_EXTERNAL_BUILD),yes)
|
ifeq ($(CONFIG_EXTERNAL_BUILD),yes)
|
||||||
ifeq ($(CONFIG_MSVS),yes)
|
ifeq ($(CONFIG_MSVS),yes)
|
||||||
|
|
||||||
@@ -188,15 +177,14 @@ endif
|
|||||||
else
|
else
|
||||||
LIBVPX_OBJS=$(call objs,$(CODEC_SRCS))
|
LIBVPX_OBJS=$(call objs,$(CODEC_SRCS))
|
||||||
OBJS-$(BUILD_LIBVPX) += $(LIBVPX_OBJS)
|
OBJS-$(BUILD_LIBVPX) += $(LIBVPX_OBJS)
|
||||||
LIBS-$(if $(BUILD_LIBVPX),$(CONFIG_STATIC)) += $(BUILD_PFX)libvpx.a $(BUILD_PFX)libvpx_g.a
|
LIBS-$(BUILD_LIBVPX) += $(BUILD_PFX)libvpx.a $(BUILD_PFX)libvpx_g.a
|
||||||
$(BUILD_PFX)libvpx_g.a: $(LIBVPX_OBJS)
|
$(BUILD_PFX)libvpx_g.a: $(LIBVPX_OBJS)
|
||||||
|
|
||||||
BUILD_LIBVPX_SO := $(if $(BUILD_LIBVPX),$(CONFIG_SHARED))
|
BUILD_LIBVPX_SO := $(if $(BUILD_LIBVPX),$(CONFIG_SHARED))
|
||||||
LIBVPX_SO := libvpx.so.$(VERSION_MAJOR).$(VERSION_MINOR).$(VERSION_PATCH)
|
LIBVPX_SO := libvpx.so.$(VERSION_MAJOR).$(VERSION_MINOR).$(VERSION_PATCH)
|
||||||
LIBS-$(BUILD_LIBVPX_SO) += $(BUILD_PFX)$(LIBVPX_SO)\
|
LIBS-$(BUILD_LIBVPX_SO) += $(BUILD_PFX)$(LIBVPX_SO)
|
||||||
$(notdir $(LIBVPX_SO_SYMLINKS))
|
|
||||||
$(BUILD_PFX)$(LIBVPX_SO): $(LIBVPX_OBJS) libvpx.ver
|
$(BUILD_PFX)$(LIBVPX_SO): $(LIBVPX_OBJS) libvpx.ver
|
||||||
$(BUILD_PFX)$(LIBVPX_SO): extralibs += -lm
|
$(BUILD_PFX)$(LIBVPX_SO): extralibs += -lm -pthread
|
||||||
$(BUILD_PFX)$(LIBVPX_SO): SONAME = libvpx.so.$(VERSION_MAJOR)
|
$(BUILD_PFX)$(LIBVPX_SO): SONAME = libvpx.so.$(VERSION_MAJOR)
|
||||||
$(BUILD_PFX)$(LIBVPX_SO): SO_VERSION_SCRIPT = libvpx.ver
|
$(BUILD_PFX)$(LIBVPX_SO): SO_VERSION_SCRIPT = libvpx.ver
|
||||||
LIBVPX_SO_SYMLINKS := $(addprefix $(LIBSUBDIR)/, \
|
LIBVPX_SO_SYMLINKS := $(addprefix $(LIBSUBDIR)/, \
|
||||||
@@ -210,41 +198,12 @@ libvpx.ver: $(call enabled,CODEC_EXPORTS)
|
|||||||
$(qexec)echo "local: *; };" >> $@
|
$(qexec)echo "local: *; };" >> $@
|
||||||
CLEAN-OBJS += libvpx.ver
|
CLEAN-OBJS += libvpx.ver
|
||||||
|
|
||||||
define libvpx_symlink_template
|
$(addprefix $(DIST_DIR)/,$(LIBVPX_SO_SYMLINKS)):
|
||||||
$(1): $(2)
|
@echo " [LN] $@"
|
||||||
@echo " [LN] $$@"
|
$(qexec)ln -sf $(LIBVPX_SO) $@
|
||||||
$(qexec)ln -sf $(LIBVPX_SO) $$@
|
|
||||||
endef
|
|
||||||
|
|
||||||
$(eval $(call libvpx_symlink_template,\
|
|
||||||
$(addprefix $(BUILD_PFX),$(notdir $(LIBVPX_SO_SYMLINKS))),\
|
|
||||||
$(BUILD_PFX)$(LIBVPX_SO)))
|
|
||||||
$(eval $(call libvpx_symlink_template,\
|
|
||||||
$(addprefix $(DIST_DIR)/,$(LIBVPX_SO_SYMLINKS)),\
|
|
||||||
$(DIST_DIR)/$(LIBSUBDIR)/$(LIBVPX_SO)))
|
|
||||||
|
|
||||||
INSTALL-LIBS-$(CONFIG_SHARED) += $(LIBVPX_SO_SYMLINKS)
|
INSTALL-LIBS-$(CONFIG_SHARED) += $(LIBVPX_SO_SYMLINKS)
|
||||||
INSTALL-LIBS-$(CONFIG_SHARED) += $(LIBSUBDIR)/$(LIBVPX_SO)
|
INSTALL-LIBS-$(CONFIG_SHARED) += $(LIBSUBDIR)/$(LIBVPX_SO)
|
||||||
|
|
||||||
LIBS-$(BUILD_LIBVPX) += vpx.pc
|
|
||||||
vpx.pc: config.mk libs.mk
|
|
||||||
@echo " [CREATE] $@"
|
|
||||||
$(qexec)echo '# pkg-config file from libvpx $(VERSION_STRING)' > $@
|
|
||||||
$(qexec)echo 'prefix=$(PREFIX)' >> $@
|
|
||||||
$(qexec)echo 'exec_prefix=$${prefix}' >> $@
|
|
||||||
$(qexec)echo 'libdir=$${prefix}/lib' >> $@
|
|
||||||
$(qexec)echo 'includedir=$${prefix}/include' >> $@
|
|
||||||
$(qexec)echo '' >> $@
|
|
||||||
$(qexec)echo 'Name: vpx' >> $@
|
|
||||||
$(qexec)echo 'Description: WebM Project VPx codec implementation' >> $@
|
|
||||||
$(qexec)echo 'Version: $(VERSION_MAJOR).$(VERSION_MINOR).$(VERSION_PATCH)' >> $@
|
|
||||||
$(qexec)echo 'Requires:' >> $@
|
|
||||||
$(qexec)echo 'Conflicts:' >> $@
|
|
||||||
$(qexec)echo 'Libs: -L$${libdir} -lvpx' >> $@
|
|
||||||
$(qexec)echo 'Cflags: -I$${includedir}' >> $@
|
|
||||||
INSTALL-LIBS-yes += $(LIBSUBDIR)/pkgconfig/vpx.pc
|
|
||||||
INSTALL_MAPS += $(LIBSUBDIR)/pkgconfig/%.pc %.pc
|
|
||||||
CLEAN-OBJS += vpx.pc
|
|
||||||
endif
|
endif
|
||||||
|
|
||||||
LIBS-$(LIPO_LIBVPX) += libvpx.a
|
LIBS-$(LIPO_LIBVPX) += libvpx.a
|
||||||
@@ -278,24 +237,8 @@ $(filter %$(ASM).o,$(OBJS-yes)): $(BUILD_PFX)vpx_config.asm
|
|||||||
#
|
#
|
||||||
# Calculate platform- and compiler-specific offsets for hand coded assembly
|
# Calculate platform- and compiler-specific offsets for hand coded assembly
|
||||||
#
|
#
|
||||||
|
ifeq ($(CONFIG_EXTERNAL_BUILD),) # Visual Studio uses obj_int_extract.bat
|
||||||
ifeq ($(filter icc gcc,$(TGT_CC)), $(TGT_CC))
|
ifeq ($(ARCH_ARM), yes)
|
||||||
$(BUILD_PFX)asm_com_offsets.asm: $(BUILD_PFX)$(VP8_PREFIX)common/asm_com_offsets.c.S
|
|
||||||
grep EQU $< | tr -d '$$\#' $(ADS2GAS) > $@
|
|
||||||
$(BUILD_PFX)$(VP8_PREFIX)common/asm_com_offsets.c.S: $(VP8_PREFIX)common/asm_com_offsets.c
|
|
||||||
CLEAN-OBJS += $(BUILD_PFX)asm_com_offsets.asm $(BUILD_PFX)$(VP8_PREFIX)common/asm_com_offsets.c.S
|
|
||||||
|
|
||||||
$(BUILD_PFX)asm_enc_offsets.asm: $(BUILD_PFX)$(VP8_PREFIX)encoder/asm_enc_offsets.c.S
|
|
||||||
grep EQU $< | tr -d '$$\#' $(ADS2GAS) > $@
|
|
||||||
$(BUILD_PFX)$(VP8_PREFIX)encoder/asm_enc_offsets.c.S: $(VP8_PREFIX)encoder/asm_enc_offsets.c
|
|
||||||
CLEAN-OBJS += $(BUILD_PFX)asm_enc_offsets.asm $(BUILD_PFX)$(VP8_PREFIX)encoder/asm_enc_offsets.c.S
|
|
||||||
|
|
||||||
$(BUILD_PFX)asm_dec_offsets.asm: $(BUILD_PFX)$(VP8_PREFIX)decoder/asm_dec_offsets.c.S
|
|
||||||
grep EQU $< | tr -d '$$\#' $(ADS2GAS) > $@
|
|
||||||
$(BUILD_PFX)$(VP8_PREFIX)decoder/asm_dec_offsets.c.S: $(VP8_PREFIX)decoder/asm_dec_offsets.c
|
|
||||||
CLEAN-OBJS += $(BUILD_PFX)asm_dec_offsets.asm $(BUILD_PFX)$(VP8_PREFIX)decoder/asm_dec_offsets.c.S
|
|
||||||
else
|
|
||||||
ifeq ($(filter rvct,$(TGT_CC)), $(TGT_CC))
|
|
||||||
asm_com_offsets.asm: obj_int_extract
|
asm_com_offsets.asm: obj_int_extract
|
||||||
asm_com_offsets.asm: $(VP8_PREFIX)common/asm_com_offsets.c.o
|
asm_com_offsets.asm: $(VP8_PREFIX)common/asm_com_offsets.c.o
|
||||||
./obj_int_extract rvds $< $(ADS2GAS) > $@
|
./obj_int_extract rvds $< $(ADS2GAS) > $@
|
||||||
@@ -303,19 +246,23 @@ else
|
|||||||
CLEAN-OBJS += asm_com_offsets.asm
|
CLEAN-OBJS += asm_com_offsets.asm
|
||||||
$(filter %$(ASM).o,$(OBJS-yes)): $(BUILD_PFX)asm_com_offsets.asm
|
$(filter %$(ASM).o,$(OBJS-yes)): $(BUILD_PFX)asm_com_offsets.asm
|
||||||
|
|
||||||
asm_enc_offsets.asm: obj_int_extract
|
ifeq ($(CONFIG_VP8_ENCODER), yes)
|
||||||
asm_enc_offsets.asm: $(VP8_PREFIX)encoder/asm_enc_offsets.c.o
|
asm_enc_offsets.asm: obj_int_extract
|
||||||
|
asm_enc_offsets.asm: $(VP8_PREFIX)encoder/asm_enc_offsets.c.o
|
||||||
./obj_int_extract rvds $< $(ADS2GAS) > $@
|
./obj_int_extract rvds $< $(ADS2GAS) > $@
|
||||||
OBJS-yes += $(VP8_PREFIX)encoder/asm_enc_offsets.c.o
|
OBJS-yes += $(VP8_PREFIX)encoder/asm_enc_offsets.c.o
|
||||||
CLEAN-OBJS += asm_enc_offsets.asm
|
CLEAN-OBJS += asm_enc_offsets.asm
|
||||||
$(filter %$(ASM).o,$(OBJS-yes)): $(BUILD_PFX)asm_enc_offsets.asm
|
$(filter %$(ASM).o,$(OBJS-yes)): $(BUILD_PFX)asm_enc_offsets.asm
|
||||||
|
endif
|
||||||
|
|
||||||
asm_dec_offsets.asm: obj_int_extract
|
ifeq ($(CONFIG_VP8_DECODER), yes)
|
||||||
asm_dec_offsets.asm: $(VP8_PREFIX)decoder/asm_dec_offsets.c.o
|
asm_dec_offsets.asm: obj_int_extract
|
||||||
|
asm_dec_offsets.asm: $(VP8_PREFIX)decoder/asm_dec_offsets.c.o
|
||||||
./obj_int_extract rvds $< $(ADS2GAS) > $@
|
./obj_int_extract rvds $< $(ADS2GAS) > $@
|
||||||
OBJS-yes += $(VP8_PREFIX)decoder/asm_dec_offsets.c.o
|
OBJS-yes += $(VP8_PREFIX)decoder/asm_dec_offsets.c.o
|
||||||
CLEAN-OBJS += asm_dec_offsets.asm
|
CLEAN-OBJS += asm_dec_offsets.asm
|
||||||
$(filter %$(ASM).o,$(OBJS-yes)): $(BUILD_PFX)asm_dec_offsets.asm
|
$(filter %$(ASM).o,$(OBJS-yes)): $(BUILD_PFX)asm_dec_offsets.asm
|
||||||
|
endif
|
||||||
endif
|
endif
|
||||||
endif
|
endif
|
||||||
|
|
||||||
|
|||||||
@@ -27,9 +27,6 @@ static void update_mode_info_border(MODE_INFO *mi, int rows, int cols)
|
|||||||
|
|
||||||
for (i = 0; i < rows; i++)
|
for (i = 0; i < rows; i++)
|
||||||
{
|
{
|
||||||
/* TODO(holmer): Bug? This updates the last element of each row
|
|
||||||
* rather than the border element!
|
|
||||||
*/
|
|
||||||
vpx_memset(&mi[i*cols-1], 0, sizeof(MODE_INFO));
|
vpx_memset(&mi[i*cols-1], 0, sizeof(MODE_INFO));
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -46,11 +43,9 @@ void vp8_de_alloc_frame_buffers(VP8_COMMON *oci)
|
|||||||
|
|
||||||
vpx_free(oci->above_context);
|
vpx_free(oci->above_context);
|
||||||
vpx_free(oci->mip);
|
vpx_free(oci->mip);
|
||||||
vpx_free(oci->prev_mip);
|
|
||||||
|
|
||||||
oci->above_context = 0;
|
oci->above_context = 0;
|
||||||
oci->mip = 0;
|
oci->mip = 0;
|
||||||
oci->prev_mip = 0;
|
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -70,9 +65,9 @@ int vp8_alloc_frame_buffers(VP8_COMMON *oci, int width, int height)
|
|||||||
|
|
||||||
for (i = 0; i < NUM_YV12_BUFFERS; i++)
|
for (i = 0; i < NUM_YV12_BUFFERS; i++)
|
||||||
{
|
{
|
||||||
oci->fb_idx_ref_cnt[i] = 0;
|
oci->fb_idx_ref_cnt[0] = 0;
|
||||||
oci->yv12_fb[i].flags = 0;
|
|
||||||
if (vp8_yv12_alloc_frame_buffer(&oci->yv12_fb[i], width, height, VP8BORDERINPIXELS) < 0)
|
if (vp8_yv12_alloc_frame_buffer(&oci->yv12_fb[i], width, height, VP8BORDERINPIXELS) < 0)
|
||||||
{
|
{
|
||||||
vp8_de_alloc_frame_buffers(oci);
|
vp8_de_alloc_frame_buffers(oci);
|
||||||
return 1;
|
return 1;
|
||||||
@@ -115,21 +110,6 @@ int vp8_alloc_frame_buffers(VP8_COMMON *oci, int width, int height)
|
|||||||
|
|
||||||
oci->mi = oci->mip + oci->mode_info_stride + 1;
|
oci->mi = oci->mip + oci->mode_info_stride + 1;
|
||||||
|
|
||||||
/* allocate memory for last frame MODE_INFO array */
|
|
||||||
#if CONFIG_ERROR_CONCEALMENT
|
|
||||||
oci->prev_mip = vpx_calloc((oci->mb_cols + 1) * (oci->mb_rows + 1), sizeof(MODE_INFO));
|
|
||||||
|
|
||||||
if (!oci->prev_mip)
|
|
||||||
{
|
|
||||||
vp8_de_alloc_frame_buffers(oci);
|
|
||||||
return 1;
|
|
||||||
}
|
|
||||||
|
|
||||||
oci->prev_mi = oci->prev_mip + oci->mode_info_stride + 1;
|
|
||||||
#else
|
|
||||||
oci->prev_mip = NULL;
|
|
||||||
oci->prev_mi = NULL;
|
|
||||||
#endif
|
|
||||||
|
|
||||||
oci->above_context = vpx_calloc(sizeof(ENTROPY_CONTEXT_PLANES) * oci->mb_cols, 1);
|
oci->above_context = vpx_calloc(sizeof(ENTROPY_CONTEXT_PLANES) * oci->mb_cols, 1);
|
||||||
|
|
||||||
@@ -140,9 +120,6 @@ int vp8_alloc_frame_buffers(VP8_COMMON *oci, int width, int height)
|
|||||||
}
|
}
|
||||||
|
|
||||||
update_mode_info_border(oci->mi, oci->mb_rows, oci->mb_cols);
|
update_mode_info_border(oci->mi, oci->mb_rows, oci->mb_cols);
|
||||||
#if CONFIG_ERROR_CONCEALMENT
|
|
||||||
update_mode_info_border(oci->prev_mi, oci->mb_rows, oci->mb_cols);
|
|
||||||
#endif
|
|
||||||
|
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
@@ -152,32 +129,32 @@ void vp8_setup_version(VP8_COMMON *cm)
|
|||||||
{
|
{
|
||||||
case 0:
|
case 0:
|
||||||
cm->no_lpf = 0;
|
cm->no_lpf = 0;
|
||||||
cm->filter_type = NORMAL_LOOPFILTER;
|
cm->simpler_lpf = 0;
|
||||||
cm->use_bilinear_mc_filter = 0;
|
cm->use_bilinear_mc_filter = 0;
|
||||||
cm->full_pixel = 0;
|
cm->full_pixel = 0;
|
||||||
break;
|
break;
|
||||||
case 1:
|
case 1:
|
||||||
cm->no_lpf = 0;
|
cm->no_lpf = 0;
|
||||||
cm->filter_type = SIMPLE_LOOPFILTER;
|
cm->simpler_lpf = 1;
|
||||||
cm->use_bilinear_mc_filter = 1;
|
cm->use_bilinear_mc_filter = 1;
|
||||||
cm->full_pixel = 0;
|
cm->full_pixel = 0;
|
||||||
break;
|
break;
|
||||||
case 2:
|
case 2:
|
||||||
cm->no_lpf = 1;
|
cm->no_lpf = 1;
|
||||||
cm->filter_type = NORMAL_LOOPFILTER;
|
cm->simpler_lpf = 0;
|
||||||
cm->use_bilinear_mc_filter = 1;
|
cm->use_bilinear_mc_filter = 1;
|
||||||
cm->full_pixel = 0;
|
cm->full_pixel = 0;
|
||||||
break;
|
break;
|
||||||
case 3:
|
case 3:
|
||||||
cm->no_lpf = 1;
|
cm->no_lpf = 1;
|
||||||
cm->filter_type = SIMPLE_LOOPFILTER;
|
cm->simpler_lpf = 1;
|
||||||
cm->use_bilinear_mc_filter = 1;
|
cm->use_bilinear_mc_filter = 1;
|
||||||
cm->full_pixel = 1;
|
cm->full_pixel = 1;
|
||||||
break;
|
break;
|
||||||
default:
|
default:
|
||||||
/*4,5,6,7 are reserved for future use*/
|
/*4,5,6,7 are reserved for future use*/
|
||||||
cm->no_lpf = 0;
|
cm->no_lpf = 0;
|
||||||
cm->filter_type = NORMAL_LOOPFILTER;
|
cm->simpler_lpf = 0;
|
||||||
cm->use_bilinear_mc_filter = 0;
|
cm->use_bilinear_mc_filter = 0;
|
||||||
cm->full_pixel = 0;
|
cm->full_pixel = 0;
|
||||||
break;
|
break;
|
||||||
@@ -192,7 +169,7 @@ void vp8_create_common(VP8_COMMON *oci)
|
|||||||
|
|
||||||
oci->mb_no_coeff_skip = 1;
|
oci->mb_no_coeff_skip = 1;
|
||||||
oci->no_lpf = 0;
|
oci->no_lpf = 0;
|
||||||
oci->filter_type = NORMAL_LOOPFILTER;
|
oci->simpler_lpf = 0;
|
||||||
oci->use_bilinear_mc_filter = 0;
|
oci->use_bilinear_mc_filter = 0;
|
||||||
oci->full_pixel = 0;
|
oci->full_pixel = 0;
|
||||||
oci->multi_token_partition = ONE_PARTITION;
|
oci->multi_token_partition = ONE_PARTITION;
|
||||||
|
|||||||
@@ -24,17 +24,14 @@ void vp8_arch_arm_common_init(VP8_COMMON *ctx)
|
|||||||
#if CONFIG_RUNTIME_CPU_DETECT
|
#if CONFIG_RUNTIME_CPU_DETECT
|
||||||
VP8_COMMON_RTCD *rtcd = &ctx->rtcd;
|
VP8_COMMON_RTCD *rtcd = &ctx->rtcd;
|
||||||
int flags = arm_cpu_caps();
|
int flags = arm_cpu_caps();
|
||||||
|
int has_edsp = flags & HAS_EDSP;
|
||||||
|
int has_media = flags & HAS_MEDIA;
|
||||||
|
int has_neon = flags & HAS_NEON;
|
||||||
rtcd->flags = flags;
|
rtcd->flags = flags;
|
||||||
|
|
||||||
/* Override default functions with fastest ones for this CPU. */
|
/* Override default functions with fastest ones for this CPU. */
|
||||||
#if HAVE_ARMV5TE
|
|
||||||
if (flags & HAS_EDSP)
|
|
||||||
{
|
|
||||||
}
|
|
||||||
#endif
|
|
||||||
|
|
||||||
#if HAVE_ARMV6
|
#if HAVE_ARMV6
|
||||||
if (flags & HAS_MEDIA)
|
if (has_media)
|
||||||
{
|
{
|
||||||
rtcd->subpix.sixtap16x16 = vp8_sixtap_predict16x16_armv6;
|
rtcd->subpix.sixtap16x16 = vp8_sixtap_predict16x16_armv6;
|
||||||
rtcd->subpix.sixtap8x8 = vp8_sixtap_predict8x8_armv6;
|
rtcd->subpix.sixtap8x8 = vp8_sixtap_predict8x8_armv6;
|
||||||
@@ -54,11 +51,9 @@ void vp8_arch_arm_common_init(VP8_COMMON *ctx)
|
|||||||
rtcd->loopfilter.normal_b_v = vp8_loop_filter_bv_armv6;
|
rtcd->loopfilter.normal_b_v = vp8_loop_filter_bv_armv6;
|
||||||
rtcd->loopfilter.normal_mb_h = vp8_loop_filter_mbh_armv6;
|
rtcd->loopfilter.normal_mb_h = vp8_loop_filter_mbh_armv6;
|
||||||
rtcd->loopfilter.normal_b_h = vp8_loop_filter_bh_armv6;
|
rtcd->loopfilter.normal_b_h = vp8_loop_filter_bh_armv6;
|
||||||
rtcd->loopfilter.simple_mb_v =
|
rtcd->loopfilter.simple_mb_v = vp8_loop_filter_mbvs_armv6;
|
||||||
vp8_loop_filter_simple_vertical_edge_armv6;
|
|
||||||
rtcd->loopfilter.simple_b_v = vp8_loop_filter_bvs_armv6;
|
rtcd->loopfilter.simple_b_v = vp8_loop_filter_bvs_armv6;
|
||||||
rtcd->loopfilter.simple_mb_h =
|
rtcd->loopfilter.simple_mb_h = vp8_loop_filter_mbhs_armv6;
|
||||||
vp8_loop_filter_simple_horizontal_edge_armv6;
|
|
||||||
rtcd->loopfilter.simple_b_h = vp8_loop_filter_bhs_armv6;
|
rtcd->loopfilter.simple_b_h = vp8_loop_filter_bhs_armv6;
|
||||||
|
|
||||||
rtcd->recon.copy16x16 = vp8_copy_mem16x16_v6;
|
rtcd->recon.copy16x16 = vp8_copy_mem16x16_v6;
|
||||||
@@ -71,7 +66,7 @@ void vp8_arch_arm_common_init(VP8_COMMON *ctx)
|
|||||||
#endif
|
#endif
|
||||||
|
|
||||||
#if HAVE_ARMV7
|
#if HAVE_ARMV7
|
||||||
if (flags & HAS_NEON)
|
if (has_neon)
|
||||||
{
|
{
|
||||||
rtcd->subpix.sixtap16x16 = vp8_sixtap_predict16x16_neon;
|
rtcd->subpix.sixtap16x16 = vp8_sixtap_predict16x16_neon;
|
||||||
rtcd->subpix.sixtap8x8 = vp8_sixtap_predict8x8_neon;
|
rtcd->subpix.sixtap8x8 = vp8_sixtap_predict8x8_neon;
|
||||||
|
|||||||
@@ -30,12 +30,12 @@
|
|||||||
ldr r4, [sp, #36] ; width
|
ldr r4, [sp, #36] ; width
|
||||||
|
|
||||||
mov r12, r3 ; outer-loop counter
|
mov r12, r3 ; outer-loop counter
|
||||||
|
|
||||||
add r7, r2, r4 ; preload next row
|
|
||||||
pld [r0, r7]
|
|
||||||
|
|
||||||
sub r2, r2, r4 ; src increment for height loop
|
sub r2, r2, r4 ; src increment for height loop
|
||||||
|
|
||||||
|
;;IF ARCHITECTURE=6
|
||||||
|
pld [r0]
|
||||||
|
;;ENDIF
|
||||||
|
|
||||||
ldr r5, [r11] ; load up filter coefficients
|
ldr r5, [r11] ; load up filter coefficients
|
||||||
|
|
||||||
mov r3, r3, lsl #1 ; height*2
|
mov r3, r3, lsl #1 ; height*2
|
||||||
@@ -96,8 +96,9 @@
|
|||||||
add r0, r0, r2 ; move to next input row
|
add r0, r0, r2 ; move to next input row
|
||||||
subs r12, r12, #1
|
subs r12, r12, #1
|
||||||
|
|
||||||
add r9, r2, r4, lsl #1 ; adding back block width
|
;;IF ARCHITECTURE=6
|
||||||
pld [r0, r9] ; preload next row
|
pld [r0]
|
||||||
|
;;ENDIF
|
||||||
|
|
||||||
add r11, r11, #2 ; move over to next column
|
add r11, r11, #2 ; move over to next column
|
||||||
mov r1, r11
|
mov r1, r11
|
||||||
|
|||||||
@@ -22,7 +22,9 @@
|
|||||||
;push {r4-r7}
|
;push {r4-r7}
|
||||||
|
|
||||||
;preload
|
;preload
|
||||||
pld [r0, #31] ; preload for next 16x16 block
|
pld [r0]
|
||||||
|
pld [r0, r1]
|
||||||
|
pld [r0, r1, lsl #1]
|
||||||
|
|
||||||
ands r4, r0, #15
|
ands r4, r0, #15
|
||||||
beq copy_mem16x16_fast
|
beq copy_mem16x16_fast
|
||||||
@@ -88,8 +90,6 @@ copy_mem16x16_1_loop
|
|||||||
ldrneb r6, [r0, #2]
|
ldrneb r6, [r0, #2]
|
||||||
ldrneb r7, [r0, #3]
|
ldrneb r7, [r0, #3]
|
||||||
|
|
||||||
pld [r0, #31] ; preload for next 16x16 block
|
|
||||||
|
|
||||||
bne copy_mem16x16_1_loop
|
bne copy_mem16x16_1_loop
|
||||||
|
|
||||||
ldmia sp!, {r4 - r7}
|
ldmia sp!, {r4 - r7}
|
||||||
@@ -121,8 +121,6 @@ copy_mem16x16_4_loop
|
|||||||
ldrne r6, [r0, #8]
|
ldrne r6, [r0, #8]
|
||||||
ldrne r7, [r0, #12]
|
ldrne r7, [r0, #12]
|
||||||
|
|
||||||
pld [r0, #31] ; preload for next 16x16 block
|
|
||||||
|
|
||||||
bne copy_mem16x16_4_loop
|
bne copy_mem16x16_4_loop
|
||||||
|
|
||||||
ldmia sp!, {r4 - r7}
|
ldmia sp!, {r4 - r7}
|
||||||
@@ -150,7 +148,6 @@ copy_mem16x16_8_loop
|
|||||||
|
|
||||||
add r2, r2, r3
|
add r2, r2, r3
|
||||||
|
|
||||||
pld [r0, #31] ; preload for next 16x16 block
|
|
||||||
bne copy_mem16x16_8_loop
|
bne copy_mem16x16_8_loop
|
||||||
|
|
||||||
ldmia sp!, {r4 - r7}
|
ldmia sp!, {r4 - r7}
|
||||||
@@ -174,7 +171,6 @@ copy_mem16x16_fast_loop
|
|||||||
;stm r2, {r4-r7}
|
;stm r2, {r4-r7}
|
||||||
add r2, r2, r3
|
add r2, r2, r3
|
||||||
|
|
||||||
pld [r0, #31] ; preload for next 16x16 block
|
|
||||||
bne copy_mem16x16_fast_loop
|
bne copy_mem16x16_fast_loop
|
||||||
|
|
||||||
ldmia sp!, {r4 - r7}
|
ldmia sp!, {r4 - r7}
|
||||||
|
|||||||
@@ -10,8 +10,6 @@
|
|||||||
|
|
||||||
|
|
||||||
EXPORT |vp8_filter_block2d_first_pass_armv6|
|
EXPORT |vp8_filter_block2d_first_pass_armv6|
|
||||||
EXPORT |vp8_filter_block2d_first_pass_16x16_armv6|
|
|
||||||
EXPORT |vp8_filter_block2d_first_pass_8x8_armv6|
|
|
||||||
EXPORT |vp8_filter_block2d_second_pass_armv6|
|
EXPORT |vp8_filter_block2d_second_pass_armv6|
|
||||||
EXPORT |vp8_filter4_block2d_second_pass_armv6|
|
EXPORT |vp8_filter4_block2d_second_pass_armv6|
|
||||||
EXPORT |vp8_filter_block2d_first_pass_only_armv6|
|
EXPORT |vp8_filter_block2d_first_pass_only_armv6|
|
||||||
@@ -42,6 +40,11 @@
|
|||||||
add r12, r3, #16 ; square off the output
|
add r12, r3, #16 ; square off the output
|
||||||
sub sp, sp, #4
|
sub sp, sp, #4
|
||||||
|
|
||||||
|
;;IF ARCHITECTURE=6
|
||||||
|
;pld [r0, #-2]
|
||||||
|
;;pld [r0, #30]
|
||||||
|
;;ENDIF
|
||||||
|
|
||||||
ldr r4, [r11] ; load up packed filter coefficients
|
ldr r4, [r11] ; load up packed filter coefficients
|
||||||
ldr r5, [r11, #4]
|
ldr r5, [r11, #4]
|
||||||
ldr r6, [r11, #8]
|
ldr r6, [r11, #8]
|
||||||
@@ -98,10 +101,15 @@
|
|||||||
|
|
||||||
bne width_loop_1st_6
|
bne width_loop_1st_6
|
||||||
|
|
||||||
|
;;add r9, r2, #30 ; attempt to load 2 adjacent cache lines
|
||||||
|
;;IF ARCHITECTURE=6
|
||||||
|
;pld [r0, r2]
|
||||||
|
;;pld [r0, r9]
|
||||||
|
;;ENDIF
|
||||||
|
|
||||||
ldr r1, [sp] ; load and update dst address
|
ldr r1, [sp] ; load and update dst address
|
||||||
subs r7, r7, #0x10000
|
subs r7, r7, #0x10000
|
||||||
add r0, r0, r2 ; move to next input line
|
add r0, r0, r2 ; move to next input line
|
||||||
|
|
||||||
add r1, r1, #2 ; move over to next column
|
add r1, r1, #2 ; move over to next column
|
||||||
str r1, [sp]
|
str r1, [sp]
|
||||||
|
|
||||||
@@ -112,192 +120,6 @@
|
|||||||
|
|
||||||
ENDP
|
ENDP
|
||||||
|
|
||||||
; --------------------------
|
|
||||||
; 16x16 version
|
|
||||||
; -----------------------------
|
|
||||||
|vp8_filter_block2d_first_pass_16x16_armv6| PROC
|
|
||||||
stmdb sp!, {r4 - r11, lr}
|
|
||||||
|
|
||||||
ldr r11, [sp, #40] ; vp8_filter address
|
|
||||||
ldr r7, [sp, #36] ; output height
|
|
||||||
|
|
||||||
add r4, r2, #18 ; preload next row
|
|
||||||
pld [r0, r4]
|
|
||||||
|
|
||||||
sub r2, r2, r3 ; inside loop increments input array,
|
|
||||||
; so the height loop only needs to add
|
|
||||||
; r2 - width to the input pointer
|
|
||||||
|
|
||||||
mov r3, r3, lsl #1 ; multiply width by 2 because using shorts
|
|
||||||
add r12, r3, #16 ; square off the output
|
|
||||||
sub sp, sp, #4
|
|
||||||
|
|
||||||
ldr r4, [r11] ; load up packed filter coefficients
|
|
||||||
ldr r5, [r11, #4]
|
|
||||||
ldr r6, [r11, #8]
|
|
||||||
|
|
||||||
str r1, [sp] ; push destination to stack
|
|
||||||
mov r7, r7, lsl #16 ; height is top part of counter
|
|
||||||
|
|
||||||
; six tap filter
|
|
||||||
|height_loop_1st_16_6|
|
|
||||||
ldrb r8, [r0, #-2] ; load source data
|
|
||||||
ldrb r9, [r0, #-1]
|
|
||||||
ldrb r10, [r0], #2
|
|
||||||
orr r7, r7, r3, lsr #2 ; construct loop counter
|
|
||||||
|
|
||||||
|width_loop_1st_16_6|
|
|
||||||
ldrb r11, [r0, #-1]
|
|
||||||
|
|
||||||
pkhbt lr, r8, r9, lsl #16 ; r9 | r8
|
|
||||||
pkhbt r8, r9, r10, lsl #16 ; r10 | r9
|
|
||||||
|
|
||||||
ldrb r9, [r0]
|
|
||||||
|
|
||||||
smuad lr, lr, r4 ; apply the filter
|
|
||||||
pkhbt r10, r10, r11, lsl #16 ; r11 | r10
|
|
||||||
smuad r8, r8, r4
|
|
||||||
pkhbt r11, r11, r9, lsl #16 ; r9 | r11
|
|
||||||
|
|
||||||
smlad lr, r10, r5, lr
|
|
||||||
ldrb r10, [r0, #1]
|
|
||||||
smlad r8, r11, r5, r8
|
|
||||||
ldrb r11, [r0, #2]
|
|
||||||
|
|
||||||
sub r7, r7, #1
|
|
||||||
|
|
||||||
pkhbt r9, r9, r10, lsl #16 ; r10 | r9
|
|
||||||
pkhbt r10, r10, r11, lsl #16 ; r11 | r10
|
|
||||||
|
|
||||||
smlad lr, r9, r6, lr
|
|
||||||
smlad r11, r10, r6, r8
|
|
||||||
|
|
||||||
ands r10, r7, #0xff ; test loop counter
|
|
||||||
|
|
||||||
add lr, lr, #0x40 ; round_shift_and_clamp
|
|
||||||
ldrneb r8, [r0, #-2] ; load data for next loop
|
|
||||||
usat lr, #8, lr, asr #7
|
|
||||||
add r11, r11, #0x40
|
|
||||||
ldrneb r9, [r0, #-1]
|
|
||||||
usat r11, #8, r11, asr #7
|
|
||||||
|
|
||||||
strh lr, [r1], r12 ; result is transposed and stored, which
|
|
||||||
; will make second pass filtering easier.
|
|
||||||
ldrneb r10, [r0], #2
|
|
||||||
strh r11, [r1], r12
|
|
||||||
|
|
||||||
bne width_loop_1st_16_6
|
|
||||||
|
|
||||||
ldr r1, [sp] ; load and update dst address
|
|
||||||
subs r7, r7, #0x10000
|
|
||||||
add r0, r0, r2 ; move to next input line
|
|
||||||
|
|
||||||
add r11, r2, #34 ; adding back block width(=16)
|
|
||||||
pld [r0, r11] ; preload next row
|
|
||||||
|
|
||||||
add r1, r1, #2 ; move over to next column
|
|
||||||
str r1, [sp]
|
|
||||||
|
|
||||||
bne height_loop_1st_16_6
|
|
||||||
|
|
||||||
add sp, sp, #4
|
|
||||||
ldmia sp!, {r4 - r11, pc}
|
|
||||||
|
|
||||||
ENDP
|
|
||||||
|
|
||||||
; --------------------------
|
|
||||||
; 8x8 version
|
|
||||||
; -----------------------------
|
|
||||||
|vp8_filter_block2d_first_pass_8x8_armv6| PROC
|
|
||||||
stmdb sp!, {r4 - r11, lr}
|
|
||||||
|
|
||||||
ldr r11, [sp, #40] ; vp8_filter address
|
|
||||||
ldr r7, [sp, #36] ; output height
|
|
||||||
|
|
||||||
add r4, r2, #10 ; preload next row
|
|
||||||
pld [r0, r4]
|
|
||||||
|
|
||||||
sub r2, r2, r3 ; inside loop increments input array,
|
|
||||||
; so the height loop only needs to add
|
|
||||||
; r2 - width to the input pointer
|
|
||||||
|
|
||||||
mov r3, r3, lsl #1 ; multiply width by 2 because using shorts
|
|
||||||
add r12, r3, #16 ; square off the output
|
|
||||||
sub sp, sp, #4
|
|
||||||
|
|
||||||
ldr r4, [r11] ; load up packed filter coefficients
|
|
||||||
ldr r5, [r11, #4]
|
|
||||||
ldr r6, [r11, #8]
|
|
||||||
|
|
||||||
str r1, [sp] ; push destination to stack
|
|
||||||
mov r7, r7, lsl #16 ; height is top part of counter
|
|
||||||
|
|
||||||
; six tap filter
|
|
||||||
|height_loop_1st_8_6|
|
|
||||||
ldrb r8, [r0, #-2] ; load source data
|
|
||||||
ldrb r9, [r0, #-1]
|
|
||||||
ldrb r10, [r0], #2
|
|
||||||
orr r7, r7, r3, lsr #2 ; construct loop counter
|
|
||||||
|
|
||||||
|width_loop_1st_8_6|
|
|
||||||
ldrb r11, [r0, #-1]
|
|
||||||
|
|
||||||
pkhbt lr, r8, r9, lsl #16 ; r9 | r8
|
|
||||||
pkhbt r8, r9, r10, lsl #16 ; r10 | r9
|
|
||||||
|
|
||||||
ldrb r9, [r0]
|
|
||||||
|
|
||||||
smuad lr, lr, r4 ; apply the filter
|
|
||||||
pkhbt r10, r10, r11, lsl #16 ; r11 | r10
|
|
||||||
smuad r8, r8, r4
|
|
||||||
pkhbt r11, r11, r9, lsl #16 ; r9 | r11
|
|
||||||
|
|
||||||
smlad lr, r10, r5, lr
|
|
||||||
ldrb r10, [r0, #1]
|
|
||||||
smlad r8, r11, r5, r8
|
|
||||||
ldrb r11, [r0, #2]
|
|
||||||
|
|
||||||
sub r7, r7, #1
|
|
||||||
|
|
||||||
pkhbt r9, r9, r10, lsl #16 ; r10 | r9
|
|
||||||
pkhbt r10, r10, r11, lsl #16 ; r11 | r10
|
|
||||||
|
|
||||||
smlad lr, r9, r6, lr
|
|
||||||
smlad r11, r10, r6, r8
|
|
||||||
|
|
||||||
ands r10, r7, #0xff ; test loop counter
|
|
||||||
|
|
||||||
add lr, lr, #0x40 ; round_shift_and_clamp
|
|
||||||
ldrneb r8, [r0, #-2] ; load data for next loop
|
|
||||||
usat lr, #8, lr, asr #7
|
|
||||||
add r11, r11, #0x40
|
|
||||||
ldrneb r9, [r0, #-1]
|
|
||||||
usat r11, #8, r11, asr #7
|
|
||||||
|
|
||||||
strh lr, [r1], r12 ; result is transposed and stored, which
|
|
||||||
; will make second pass filtering easier.
|
|
||||||
ldrneb r10, [r0], #2
|
|
||||||
strh r11, [r1], r12
|
|
||||||
|
|
||||||
bne width_loop_1st_8_6
|
|
||||||
|
|
||||||
ldr r1, [sp] ; load and update dst address
|
|
||||||
subs r7, r7, #0x10000
|
|
||||||
add r0, r0, r2 ; move to next input line
|
|
||||||
|
|
||||||
add r11, r2, #18 ; adding back block width(=8)
|
|
||||||
pld [r0, r11] ; preload next low
|
|
||||||
|
|
||||||
add r1, r1, #2 ; move over to next column
|
|
||||||
str r1, [sp]
|
|
||||||
|
|
||||||
bne height_loop_1st_8_6
|
|
||||||
|
|
||||||
add sp, sp, #4
|
|
||||||
ldmia sp!, {r4 - r11, pc}
|
|
||||||
|
|
||||||
ENDP
|
|
||||||
|
|
||||||
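Reading aid, not part of the diff: the inner loop of the routine above multiplies six neighbouring source bytes by six packed 16-bit coefficients, adds 0x40, shifts right by 7 and saturates to 8 bits (`usat ..., #8, ..., asr #7`). A minimal C sketch of that arithmetic for a single output pixel follows. The names `clamp_u8` and `sixtap_pixel` are hypothetical and do not appear in the source; the sketch ignores the transposed store and the loop-counter packing.

```c
#include <stdint.h>

/* Clamp to the unsigned 8-bit range, mirroring what `usat rd, #8, ...` does. */
static uint8_t clamp_u8(int32_t v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (uint8_t)v;
}

/*
 * One output pixel of a horizontal six-tap filter.
 * `src` points at the pixel aligned with taps[2]; the filter reads
 * src[-2] .. src[3], matching the ldrb offsets in the assembly.
 * Assumption: `>>` on a negative int32_t is an arithmetic shift, as it is
 * on the compilers normally used for this code.
 */
static uint8_t sixtap_pixel(const uint8_t *src, const int16_t taps[6])
{
    int32_t sum = 0;
    for (int i = 0; i < 6; i++)
        sum += (int32_t)src[i - 2] * taps[i];
    return clamp_u8((sum + 0x40) >> 7); /* round, scale by 1/128, saturate */
}
```

With the taps `{0, 0, 128, 0, 0, 0}` the function returns the centre pixel unchanged, which is a quick sanity check on the rounding constant.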
;---------------------------------
|
;---------------------------------
|
||||||
; r0 short *src_ptr,
|
; r0 short *src_ptr,
|
||||||
; r1 unsigned char *output_ptr,
|
; r1 unsigned char *output_ptr,
|
||||||
@@ -440,10 +262,6 @@
|
|||||||
|vp8_filter_block2d_first_pass_only_armv6| PROC
|
|vp8_filter_block2d_first_pass_only_armv6| PROC
|
||||||
stmdb sp!, {r4 - r11, lr}
|
stmdb sp!, {r4 - r11, lr}
|
||||||
|
|
||||||
add r7, r2, r3 ; preload next low
|
|
||||||
add r7, r7, #2
|
|
||||||
pld [r0, r7]
|
|
||||||
|
|
||||||
ldr r4, [sp, #36] ; output pitch
|
ldr r4, [sp, #36] ; output pitch
|
||||||
ldr r11, [sp, #40] ; HFilter address
|
ldr r11, [sp, #40] ; HFilter address
|
||||||
sub sp, sp, #8
|
sub sp, sp, #8
|
||||||
@@ -512,15 +330,16 @@
|
|||||||
|
|
||||||
bne width_loop_1st_only_6
|
bne width_loop_1st_only_6
|
||||||
|
|
||||||
|
;;add r9, r2, #30 ; attempt to load 2 adjacent cache lines
|
||||||
|
;;IF ARCHITECTURE=6
|
||||||
|
;pld [r0, r2]
|
||||||
|
;;pld [r0, r9]
|
||||||
|
;;ENDIF
|
||||||
|
|
||||||
ldr lr, [sp] ; load back output pitch
|
ldr lr, [sp] ; load back output pitch
|
||||||
ldr r12, [sp, #4] ; load back output pitch
|
ldr r12, [sp, #4] ; load back output pitch
|
||||||
subs r7, r7, #1
|
subs r7, r7, #1
|
||||||
    add     r0, r0, r12             ; update src for next loop
|
    add     r0, r0, r12             ; update src for next loop
|
||||||
|
|
||||||
add r11, r12, r3 ; preload next low
|
|
||||||
add r11, r11, #2
|
|
||||||
pld [r0, r11]
|
|
||||||
|
|
||||||
add r1, r1, lr ; update dst for next loop
|
add r1, r1, lr ; update dst for next loop
|
||||||
|
|
||||||
bne height_loop_1st_only_6
|
bne height_loop_1st_only_6
|
||||||
|
|||||||
@@ -53,11 +53,14 @@ count RN r5
|
|||||||
|
|
||||||
;r0 unsigned char *src_ptr,
|
;r0 unsigned char *src_ptr,
|
||||||
;r1 int src_pixel_step,
|
;r1 int src_pixel_step,
|
||||||
;r2 const char *blimit,
|
;r2 const char *flimit,
|
||||||
;r3 const char *limit,
|
;r3 const char *limit,
|
||||||
;stack const char *thresh,
|
;stack const char *thresh,
|
||||||
;stack int count
|
;stack int count
|
||||||
|
|
||||||
|
;Note: All 16 elements in flimit are equal. So, in the code, only one load is needed
|
||||||
|
;for flimit. Same way applies to limit and thresh.
|
||||||
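Reading aid, not part of the diff: the two sides of the hunks below prepare the filter limits differently. The `blimit` side loads a single byte with `ldrb` and replicates it into all four byte lanes with two `orr` instructions; the `flimit` side loads a word whose four bytes are already equal and combines words with `uadd8`. The C sketch below models both operations under the assumption that only the lane values matter (the GE flags that `uadd8` also sets are ignored). The helper names are hypothetical.

```c
#include <stdint.h>

/* Replicate one byte into all four lanes of a 32-bit word:
 * the effect of `orr r, r, r, lsl #8` followed by `orr r, r, r, lsl #16`. */
static uint32_t splat_u8(uint8_t b)
{
    uint32_t w = b;
    w |= w << 8;
    w |= w << 16;
    return w;
}

/* Lane-wise unsigned byte addition, modulo 256 in each lane, with no carry
 * between lanes. This models the result register of ARMv6 `uadd8`. */
static uint32_t uadd8(uint32_t a, uint32_t b)
{
    uint32_t r = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t s = ((a >> (8 * lane)) & 0xffu) + ((b >> (8 * lane)) & 0xffu);
        r |= (s & 0xffu) << (8 * lane);
    }
    return r;
}

/* `flimit * 2 + limit` in every lane, as computed on the flimit side. */
static uint32_t edge_limit(uint32_t flimit4, uint32_t limit4)
{
    return uadd8(uadd8(flimit4, flimit4), limit4);
}
```

Because `uadd8` wraps rather than saturates, `flimit * 2 + limit` is only meaningful when the per-lane result stays below 256.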
|
|
||||||
;-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
;-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
||||||
|vp8_loop_filter_horizontal_edge_armv6| PROC
|
|vp8_loop_filter_horizontal_edge_armv6| PROC
|
||||||
;-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
;-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
||||||
@@ -69,18 +72,14 @@ count RN r5
|
|||||||
sub sp, sp, #16 ; create temp buffer
|
sub sp, sp, #16 ; create temp buffer
|
||||||
|
|
||||||
ldr r9, [src], pstep ; p3
|
ldr r9, [src], pstep ; p3
|
||||||
ldrb r4, [r2] ; blimit
|
ldr r4, [r2], #4 ; flimit
|
||||||
ldr r10, [src], pstep ; p2
|
ldr r10, [src], pstep ; p2
|
||||||
ldrb r2, [r3] ; limit
|
ldr r2, [r3], #4 ; limit
|
||||||
ldr r11, [src], pstep ; p1
|
ldr r11, [src], pstep ; p1
|
||||||
orr r4, r4, r4, lsl #8
|
uadd8 r4, r4, r4 ; flimit * 2
|
||||||
ldrb r3, [r6] ; thresh
|
ldr r3, [r6], #4 ; thresh
|
||||||
orr r2, r2, r2, lsl #8
|
|
||||||
mov count, count, lsl #1 ; 4-in-parallel
|
mov count, count, lsl #1 ; 4-in-parallel
|
||||||
orr r4, r4, r4, lsl #16
|
uadd8 r4, r4, r2 ; flimit * 2 + limit
|
||||||
orr r3, r3, r3, lsl #8
|
|
||||||
orr r2, r2, r2, lsl #16
|
|
||||||
orr r3, r3, r3, lsl #16
|
|
||||||
|
|
||||||
|Hnext8|
|
|Hnext8|
|
||||||
; vp8_filter_mask() function
|
; vp8_filter_mask() function
|
||||||
@@ -254,6 +253,12 @@ count RN r5
|
|||||||
|
|
||||||
subs count, count, #1
|
subs count, count, #1
|
||||||
|
|
||||||
|
;pld [src]
|
||||||
|
;pld [src, pstep]
|
||||||
|
;pld [src, pstep, lsl #1]
|
||||||
|
;pld [src, pstep, lsl #2]
|
||||||
|
;pld [src, pstep, lsl #3]
|
||||||
|
|
||||||
ldrne r9, [src], pstep ; p3
|
ldrne r9, [src], pstep ; p3
|
||||||
ldrne r10, [src], pstep ; p2
|
ldrne r10, [src], pstep ; p2
|
||||||
ldrne r11, [src], pstep ; p1
|
ldrne r11, [src], pstep ; p1
|
||||||
@@ -276,18 +281,14 @@ count RN r5
|
|||||||
sub sp, sp, #16 ; create temp buffer
|
sub sp, sp, #16 ; create temp buffer
|
||||||
|
|
||||||
ldr r9, [src], pstep ; p3
|
ldr r9, [src], pstep ; p3
|
||||||
ldrb r4, [r2] ; blimit
|
ldr r4, [r2], #4 ; flimit
|
||||||
ldr r10, [src], pstep ; p2
|
ldr r10, [src], pstep ; p2
|
||||||
ldrb r2, [r3] ; limit
|
ldr r2, [r3], #4 ; limit
|
||||||
ldr r11, [src], pstep ; p1
|
ldr r11, [src], pstep ; p1
|
||||||
orr r4, r4, r4, lsl #8
|
uadd8 r4, r4, r4 ; flimit * 2
|
||||||
ldrb r3, [r6] ; thresh
|
ldr r3, [r6], #4 ; thresh
|
||||||
orr r2, r2, r2, lsl #8
|
|
||||||
mov count, count, lsl #1 ; 4-in-parallel
|
mov count, count, lsl #1 ; 4-in-parallel
|
||||||
orr r4, r4, r4, lsl #16
|
uadd8 r4, r4, r2 ; flimit * 2 + limit
|
||||||
orr r3, r3, r3, lsl #8
|
|
||||||
orr r2, r2, r2, lsl #16
|
|
||||||
orr r3, r3, r3, lsl #16
|
|
||||||
|
|
||||||
|MBHnext8|
|
|MBHnext8|
|
||||||
|
|
||||||
@@ -589,19 +590,15 @@ count RN r5
|
|||||||
sub sp, sp, #16 ; create temp buffer
|
sub sp, sp, #16 ; create temp buffer
|
||||||
|
|
||||||
ldr r6, [src], pstep ; load source data
|
ldr r6, [src], pstep ; load source data
|
||||||
ldrb r4, [r2] ; blimit
|
ldr r4, [r2], #4 ; flimit
|
||||||
ldr r7, [src], pstep
|
ldr r7, [src], pstep
|
||||||
ldrb r2, [r3] ; limit
|
ldr r2, [r3], #4 ; limit
|
||||||
ldr r8, [src], pstep
|
ldr r8, [src], pstep
|
||||||
orr r4, r4, r4, lsl #8
|
uadd8 r4, r4, r4 ; flimit * 2
|
||||||
ldrb r3, [r12] ; thresh
|
ldr r3, [r12], #4 ; thresh
|
||||||
orr r2, r2, r2, lsl #8
|
|
||||||
ldr lr, [src], pstep
|
ldr lr, [src], pstep
|
||||||
mov count, count, lsl #1 ; 4-in-parallel
|
mov count, count, lsl #1 ; 4-in-parallel
|
||||||
orr r4, r4, r4, lsl #16
|
uadd8 r4, r4, r2 ; flimit * 2 + limit
|
||||||
orr r3, r3, r3, lsl #8
|
|
||||||
orr r2, r2, r2, lsl #16
|
|
||||||
orr r3, r3, r3, lsl #16
|
|
||||||
|
|
||||||
|Vnext8|
|
|Vnext8|
|
||||||
|
|
||||||
@@ -860,26 +857,18 @@ count RN r5
|
|||||||
sub src, src, #4 ; move src pointer down by 4
|
sub src, src, #4 ; move src pointer down by 4
|
||||||
ldr count, [sp, #40] ; count for 8-in-parallel
|
ldr count, [sp, #40] ; count for 8-in-parallel
|
||||||
ldr r12, [sp, #36] ; load thresh address
|
ldr r12, [sp, #36] ; load thresh address
|
||||||
pld [src, #23] ; preload for next block
|
|
||||||
sub sp, sp, #16 ; create temp buffer
|
sub sp, sp, #16 ; create temp buffer
|
||||||
|
|
||||||
ldr r6, [src], pstep ; load source data
|
ldr r6, [src], pstep ; load source data
|
||||||
ldrb r4, [r2] ; blimit
|
ldr r4, [r2], #4 ; flimit
|
||||||
pld [src, #23]
|
|
||||||
ldr r7, [src], pstep
|
ldr r7, [src], pstep
|
||||||
ldrb r2, [r3] ; limit
|
ldr r2, [r3], #4 ; limit
|
||||||
pld [src, #23]
|
|
||||||
ldr r8, [src], pstep
|
ldr r8, [src], pstep
|
||||||
orr r4, r4, r4, lsl #8
|
uadd8 r4, r4, r4 ; flimit * 2
|
||||||
ldrb r3, [r12] ; thresh
|
ldr r3, [r12], #4 ; thresh
|
||||||
orr r2, r2, r2, lsl #8
|
|
||||||
pld [src, #23]
|
|
||||||
ldr lr, [src], pstep
|
ldr lr, [src], pstep
|
||||||
mov count, count, lsl #1 ; 4-in-parallel
|
mov count, count, lsl #1 ; 4-in-parallel
|
||||||
orr r4, r4, r4, lsl #16
|
uadd8 r4, r4, r2 ; flimit * 2 + limit
|
||||||
orr r3, r3, r3, lsl #8
|
|
||||||
orr r2, r2, r2, lsl #16
|
|
||||||
orr r3, r3, r3, lsl #16
|
|
||||||
|
|
||||||
|MBVnext8|
|
|MBVnext8|
|
||||||
; vp8_filter_mask() function
|
; vp8_filter_mask() function
|
||||||
@@ -919,7 +908,6 @@ count RN r5
|
|||||||
str lr, [sp, #8]
|
str lr, [sp, #8]
|
||||||
ldr lr, [src], pstep
|
ldr lr, [src], pstep
|
||||||
|
|
||||||
|
|
||||||
TRANSPOSE_MATRIX r6, r7, r8, lr, r9, r10, r11, r12
|
TRANSPOSE_MATRIX r6, r7, r8, lr, r9, r10, r11, r12
|
||||||
|
|
||||||
ldr lr, [sp, #8] ; load back (f)limit accumulator
|
ldr lr, [sp, #8] ; load back (f)limit accumulator
|
||||||
@@ -968,7 +956,6 @@ count RN r5
|
|||||||
beq mbvskip_filter ; skip filtering
|
beq mbvskip_filter ; skip filtering
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
;vp8_hevmask() function
|
;vp8_hevmask() function
|
||||||
;calculate high edge variance
|
;calculate high edge variance
|
||||||
|
|
||||||
@@ -1136,7 +1123,6 @@ count RN r5
|
|||||||
smlabb r8, r6, lr, r7
|
smlabb r8, r6, lr, r7
|
||||||
smlatb r6, r6, lr, r7
|
smlatb r6, r6, lr, r7
|
||||||
smlabb r9, r10, lr, r7
|
smlabb r9, r10, lr, r7
|
||||||
|
|
||||||
smlatb r10, r10, lr, r7
|
smlatb r10, r10, lr, r7
|
||||||
ssat r8, #8, r8, asr #7
|
ssat r8, #8, r8, asr #7
|
||||||
ssat r6, #8, r6, asr #7
|
ssat r6, #8, r6, asr #7
|
||||||
@@ -1256,13 +1242,9 @@ count RN r5
|
|||||||
sub src, src, #4
|
sub src, src, #4
|
||||||
subs count, count, #1
|
subs count, count, #1
|
||||||
|
|
||||||
pld [src, #23] ; preload for next block
|
|
||||||
ldrne r6, [src], pstep ; load source data
|
ldrne r6, [src], pstep ; load source data
|
||||||
pld [src, #23]
|
|
||||||
ldrne r7, [src], pstep
|
ldrne r7, [src], pstep
|
||||||
pld [src, #23]
|
|
||||||
ldrne r8, [src], pstep
|
ldrne r8, [src], pstep
|
||||||
pld [src, #23]
|
|
||||||
ldrne lr, [src], pstep
|
ldrne lr, [src], pstep
|
||||||
|
|
||||||
bne MBVnext8
|
bne MBVnext8
|
||||||
|
|||||||
@@ -45,28 +45,35 @@
|
|||||||
MEND
|
MEND
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
src RN r0
|
src RN r0
|
||||||
pstep RN r1
|
pstep RN r1
|
||||||
|
|
||||||
;r0 unsigned char *src_ptr,
|
;r0 unsigned char *src_ptr,
|
||||||
;r1 int src_pixel_step,
|
;r1 int src_pixel_step,
|
||||||
;r2 const char *blimit
|
;r2 const char *flimit,
|
||||||
|
;r3 const char *limit,
|
||||||
|
;stack const char *thresh,
|
||||||
|
;stack int count
|
||||||
|
|
||||||
|
; All 16 elements in flimit are equal. So, in the code, only one load is needed
|
||||||
|
; for flimit. Same applies to limit. thresh is not used in simple loopfilter
|
||||||
|
|
||||||
;-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
;-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
||||||
|vp8_loop_filter_simple_horizontal_edge_armv6| PROC
|
|vp8_loop_filter_simple_horizontal_edge_armv6| PROC
|
||||||
;-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
;-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
||||||
stmdb sp!, {r4 - r11, lr}
|
stmdb sp!, {r4 - r11, lr}
|
||||||
|
|
||||||
ldrb r12, [r2] ; blimit
|
ldr r12, [r3] ; limit
|
||||||
ldr r3, [src, -pstep, lsl #1] ; p1
|
ldr r3, [src, -pstep, lsl #1] ; p1
|
||||||
ldr r4, [src, -pstep] ; p0
|
ldr r4, [src, -pstep] ; p0
|
||||||
ldr r5, [src] ; q0
|
ldr r5, [src] ; q0
|
||||||
ldr r6, [src, pstep] ; q1
|
ldr r6, [src, pstep] ; q1
|
||||||
orr r12, r12, r12, lsl #8 ; blimit
|
ldr r7, [r2] ; flimit
|
||||||
ldr r2, c0x80808080
|
ldr r2, c0x80808080
|
||||||
orr r12, r12, r12, lsl #16 ; blimit
|
ldr r9, [sp, #40] ; count for 8-in-parallel
|
||||||
mov r9, #4 ; double the count. we're doing 4 at a time
|
uadd8 r7, r7, r7 ; flimit * 2
|
||||||
|
mov r9, r9, lsl #1 ; double the count. we're doing 4 at a time
|
||||||
|
uadd8 r12, r7, r12 ; flimit * 2 + limit
|
||||||
mov lr, #0 ; need 0 in a couple places
|
mov lr, #0 ; need 0 in a couple places
|
||||||
|
|
||||||
|simple_hnext8|
|
|simple_hnext8|
|
||||||
@@ -141,32 +148,30 @@ pstep RN r1
|
|||||||
;-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
;-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
||||||
stmdb sp!, {r4 - r11, lr}
|
stmdb sp!, {r4 - r11, lr}
|
||||||
|
|
||||||
ldrb r12, [r2] ; r12: blimit
|
ldr r12, [r2] ; r12: flimit
|
||||||
ldr r2, c0x80808080
|
ldr r2, c0x80808080
|
||||||
orr r12, r12, r12, lsl #8
|
ldr r7, [r3] ; limit
|
||||||
|
|
||||||
    ; load source data to r7, r8, r9, r10
|
    ; load source data to r7, r8, r9, r10
|
||||||
ldrh r3, [src, #-2]
|
ldrh r3, [src, #-2]
|
||||||
pld [src, #23] ; preload for next block
|
|
||||||
ldrh r4, [src], pstep
|
ldrh r4, [src], pstep
|
||||||
orr r12, r12, r12, lsl #16
|
uadd8 r12, r12, r12 ; flimit * 2
|
||||||
|
|
||||||
ldrh r5, [src, #-2]
|
ldrh r5, [src, #-2]
|
||||||
pld [src, #23]
|
|
||||||
ldrh r6, [src], pstep
|
ldrh r6, [src], pstep
|
||||||
|
uadd8 r12, r12, r7 ; flimit * 2 + limit
|
||||||
|
|
||||||
pkhbt r7, r3, r4, lsl #16
|
pkhbt r7, r3, r4, lsl #16
|
||||||
|
|
||||||
ldrh r3, [src, #-2]
|
ldrh r3, [src, #-2]
|
||||||
pld [src, #23]
|
|
||||||
ldrh r4, [src], pstep
|
ldrh r4, [src], pstep
|
||||||
|
ldr r11, [sp, #40] ; count (r11) for 8-in-parallel
|
||||||
|
|
||||||
pkhbt r8, r5, r6, lsl #16
|
pkhbt r8, r5, r6, lsl #16
|
||||||
|
|
||||||
ldrh r5, [src, #-2]
|
ldrh r5, [src, #-2]
|
||||||
pld [src, #23]
|
|
||||||
ldrh r6, [src], pstep
|
ldrh r6, [src], pstep
|
||||||
mov r11, #4 ; double the count. we're doing 4 at a time
|
mov r11, r11, lsl #1 ; 4-in-parallel
|
||||||
|
|
||||||
|simple_vnext8|
|
|simple_vnext8|
|
||||||
; vp8_simple_filter_mask() function
|
; vp8_simple_filter_mask() function
|
||||||
@@ -254,23 +259,19 @@ pstep RN r1
|
|||||||
|
|
||||||
    ; load source data to r7, r8, r9, r10
|
    ; load source data to r7, r8, r9, r10
|
||||||
ldrneh r3, [src, #-2]
|
ldrneh r3, [src, #-2]
|
||||||
pld [src, #23] ; preload for next block
|
|
||||||
ldrneh r4, [src], pstep
|
ldrneh r4, [src], pstep
|
||||||
|
|
||||||
ldrneh r5, [src, #-2]
|
ldrneh r5, [src, #-2]
|
||||||
pld [src, #23]
|
|
||||||
ldrneh r6, [src], pstep
|
ldrneh r6, [src], pstep
|
||||||
|
|
||||||
pkhbt r7, r3, r4, lsl #16
|
pkhbt r7, r3, r4, lsl #16
|
||||||
|
|
||||||
ldrneh r3, [src, #-2]
|
ldrneh r3, [src, #-2]
|
||||||
pld [src, #23]
|
|
||||||
ldrneh r4, [src], pstep
|
ldrneh r4, [src], pstep
|
||||||
|
|
||||||
pkhbt r8, r5, r6, lsl #16
|
pkhbt r8, r5, r6, lsl #16
|
||||||
|
|
||||||
ldrneh r5, [src, #-2]
|
ldrneh r5, [src, #-2]
|
||||||
pld [src, #23]
|
|
||||||
ldrneh r6, [src], pstep
|
ldrneh r6, [src], pstep
|
||||||
|
|
||||||
bne simple_vnext8
|
bne simple_vnext8
|
||||||
|
|||||||
@@ -32,12 +32,9 @@
|
|||||||
beq skip_firstpass_filter
|
beq skip_firstpass_filter
|
||||||
|
|
||||||
;first-pass filter
|
;first-pass filter
|
||||||
adr r12, filter8_coeff
|
ldr r12, _filter8_coeff_
|
||||||
sub r0, r0, r1, lsl #1
|
sub r0, r0, r1, lsl #1
|
||||||
|
|
||||||
add r3, r1, #10 ; preload next low
|
|
||||||
pld [r0, r3]
|
|
||||||
|
|
||||||
add r2, r12, r2, lsl #4 ;calculate filter location
|
add r2, r12, r2, lsl #4 ;calculate filter location
|
||||||
    add         r0, r0, #3                      ;adjust src only for loading convenience
|
    add         r0, r0, #3                      ;adjust src only for loading convenience
|
||||||
|
|
||||||
@@ -113,9 +110,6 @@
|
|||||||
|
|
||||||
add r0, r0, r1 ; move to next input line
|
add r0, r0, r1 ; move to next input line
|
||||||
|
|
||||||
add r11, r1, #18 ; preload next low. adding back block width(=8), which is subtracted earlier
|
|
||||||
pld [r0, r11]
|
|
||||||
|
|
||||||
bne first_pass_hloop_v6
|
bne first_pass_hloop_v6
|
||||||
|
|
||||||
;second pass filter
|
;second pass filter
|
||||||
@@ -127,7 +121,7 @@ secondpass_filter
|
|||||||
cmp r3, #0
|
cmp r3, #0
|
||||||
beq skip_secondpass_filter
|
beq skip_secondpass_filter
|
||||||
|
|
||||||
adr r12, filter8_coeff
|
ldr r12, _filter8_coeff_
|
||||||
add lr, r12, r3, lsl #4 ;calculate filter location
|
add lr, r12, r3, lsl #4 ;calculate filter location
|
||||||
|
|
||||||
mov r2, #0x00080000
|
mov r2, #0x00080000
|
||||||
@@ -251,6 +245,8 @@ skip_secondpass_hloop
|
|||||||
;-----------------
|
;-----------------
|
||||||
;One word each is reserved. Label filter_coeff can be used to access the data.
|
;One word each is reserved. Label filter_coeff can be used to access the data.
|
||||||
;Data address: filter_coeff, filter_coeff+4, filter_coeff+8 ...
|
;Data address: filter_coeff, filter_coeff+4, filter_coeff+8 ...
|
||||||
|
_filter8_coeff_
|
||||||
|
DCD filter8_coeff
|
||||||
filter8_coeff
|
filter8_coeff
|
||||||
DCD 0x00000000, 0x00000080, 0x00000000, 0x00000000
|
DCD 0x00000000, 0x00000080, 0x00000000, 0x00000000
|
||||||
DCD 0xfffa0000, 0x000c007b, 0x0000ffff, 0x00000000
|
DCD 0xfffa0000, 0x000c007b, 0x0000ffff, 0x00000000
|
||||||
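Reading aid, not part of the diff: each row of `filter8_coeff` holds four 32-bit words, and the first three appear to pack six signed 16-bit taps, low halfword first. Read that way, the second row decodes to `{0, -6, 123, 12, -1, 0}`, which sums to 128 and so matches the `asr #7` scaling used by the filters. The sketch below shows that decoding; `unpack_taps` is a hypothetical helper, and the halfword order is an interpretation of the table rather than something the diff states.

```c
#include <stdint.h>

/*
 * Split three DCD words into six signed 16-bit taps,
 * taking the low halfword of each word first, then the high halfword.
 * Assumption: converting an out-of-range uint16_t to int16_t wraps
 * (two's complement), as it does on the usual targets.
 */
static void unpack_taps(const uint32_t words[3], int16_t taps[6])
{
    for (int i = 0; i < 3; i++) {
        taps[2 * i]     = (int16_t)(uint16_t)(words[i] & 0xffffu);
        taps[2 * i + 1] = (int16_t)(uint16_t)(words[i] >> 16);
    }
}
```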
|
|||||||
@@ -25,28 +25,6 @@ extern void vp8_filter_block2d_first_pass_armv6
|
|||||||
const short *vp8_filter
|
const short *vp8_filter
|
||||||
);
|
);
|
||||||
|
|
||||||
// 8x8
|
|
||||||
extern void vp8_filter_block2d_first_pass_8x8_armv6
|
|
||||||
(
|
|
||||||
unsigned char *src_ptr,
|
|
||||||
short *output_ptr,
|
|
||||||
unsigned int src_pixels_per_line,
|
|
||||||
unsigned int output_width,
|
|
||||||
unsigned int output_height,
|
|
||||||
const short *vp8_filter
|
|
||||||
);
|
|
||||||
|
|
||||||
// 16x16
|
|
||||||
extern void vp8_filter_block2d_first_pass_16x16_armv6
|
|
||||||
(
|
|
||||||
unsigned char *src_ptr,
|
|
||||||
short *output_ptr,
|
|
||||||
unsigned int src_pixels_per_line,
|
|
||||||
unsigned int output_width,
|
|
||||||
unsigned int output_height,
|
|
||||||
const short *vp8_filter
|
|
||||||
);
|
|
||||||
|
|
||||||
extern void vp8_filter_block2d_second_pass_armv6
|
extern void vp8_filter_block2d_second_pass_armv6
|
||||||
(
|
(
|
||||||
short *src_ptr,
|
short *src_ptr,
|
||||||
@@ -165,12 +143,12 @@ void vp8_sixtap_predict8x8_armv6
|
|||||||
{
|
{
|
||||||
if (yoffset & 0x1)
|
if (yoffset & 0x1)
|
||||||
{
|
{
|
||||||
vp8_filter_block2d_first_pass_8x8_armv6(src_ptr - src_pixels_per_line, FData + 1, src_pixels_per_line, 8, 11, HFilter);
|
vp8_filter_block2d_first_pass_armv6(src_ptr - src_pixels_per_line, FData + 1, src_pixels_per_line, 8, 11, HFilter);
|
||||||
vp8_filter4_block2d_second_pass_armv6(FData + 2, dst_ptr, dst_pitch, 8, VFilter);
|
vp8_filter4_block2d_second_pass_armv6(FData + 2, dst_ptr, dst_pitch, 8, VFilter);
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
vp8_filter_block2d_first_pass_8x8_armv6(src_ptr - (2 * src_pixels_per_line), FData, src_pixels_per_line, 8, 13, HFilter);
|
vp8_filter_block2d_first_pass_armv6(src_ptr - (2 * src_pixels_per_line), FData, src_pixels_per_line, 8, 13, HFilter);
|
||||||
vp8_filter_block2d_second_pass_armv6(FData + 2, dst_ptr, dst_pitch, 8, VFilter);
|
vp8_filter_block2d_second_pass_armv6(FData + 2, dst_ptr, dst_pitch, 8, VFilter);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -207,12 +185,12 @@ void vp8_sixtap_predict16x16_armv6
|
|||||||
{
|
{
|
||||||
if (yoffset & 0x1)
|
if (yoffset & 0x1)
|
||||||
{
|
{
|
||||||
vp8_filter_block2d_first_pass_16x16_armv6(src_ptr - src_pixels_per_line, FData + 1, src_pixels_per_line, 16, 19, HFilter);
|
vp8_filter_block2d_first_pass_armv6(src_ptr - src_pixels_per_line, FData + 1, src_pixels_per_line, 16, 19, HFilter);
|
||||||
vp8_filter4_block2d_second_pass_armv6(FData + 2, dst_ptr, dst_pitch, 16, VFilter);
|
vp8_filter4_block2d_second_pass_armv6(FData + 2, dst_ptr, dst_pitch, 16, VFilter);
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
vp8_filter_block2d_first_pass_16x16_armv6(src_ptr - (2 * src_pixels_per_line), FData, src_pixels_per_line, 16, 21, HFilter);
|
vp8_filter_block2d_first_pass_armv6(src_ptr - (2 * src_pixels_per_line), FData, src_pixels_per_line, 16, 21, HFilter);
|
||||||
vp8_filter_block2d_second_pass_armv6(FData + 2, dst_ptr, dst_pitch, 16, VFilter);
|
vp8_filter_block2d_second_pass_armv6(FData + 2, dst_ptr, dst_pitch, 16, VFilter);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
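Reading aid, not part of the diff: the height arguments 11, 13, 19 and 21 in the first-pass calls above follow from the length of the vertical filter that runs afterwards. An N-tap vertical filter needs N - 1 extra rows around the block, so the first pass produces `block_height + taps - 1` rows. The calls select the 4-tap second pass when `yoffset & 0x1` is set, presumably because the odd-position six-tap filters have zero outer taps. The helpers below are hypothetical and only reproduce that arithmetic.

```c
/* Tap count of the second (vertical) pass, following the branch in the
 * calls above: odd yoffset uses the 4-tap routine, otherwise 6 taps. */
static int second_pass_taps(int yoffset)
{
    return (yoffset & 0x1) ? 4 : 6;
}

/* Rows the first pass must produce so that a vertical filter with `taps`
 * taps can emit `block_height` output rows. */
static int first_pass_rows(int block_height, int taps)
{
    return block_height + taps - 1;
}

/* Rows above the block where the first pass starts reading:
 * one row for 4 taps, two rows for 6 taps. */
static int rows_above(int taps)
{
    return taps / 2 - 1;
}
```

For an 8x8 block this gives 11 rows starting one row above `src_ptr` in the 4-tap case and 13 rows starting two rows above in the 6-tap case, which agrees with the pointer offsets passed in the calls.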
|
|||||||
@@ -9,107 +9,135 @@
|
|||||||
*/
|
*/
|
||||||
|
|
||||||
|
|
||||||
#include "vpx_config.h"
|
#include "vpx_ports/config.h"
|
||||||
|
#include <math.h>
|
||||||
#include "vp8/common/loopfilter.h"
|
#include "vp8/common/loopfilter.h"
|
||||||
#include "vp8/common/onyxc_int.h"
|
#include "vp8/common/onyxc_int.h"
|
||||||
|
|
||||||
#if HAVE_ARMV6
|
|
||||||
extern prototype_loopfilter(vp8_loop_filter_horizontal_edge_armv6);
|
extern prototype_loopfilter(vp8_loop_filter_horizontal_edge_armv6);
|
||||||
extern prototype_loopfilter(vp8_loop_filter_vertical_edge_armv6);
|
extern prototype_loopfilter(vp8_loop_filter_vertical_edge_armv6);
|
||||||
extern prototype_loopfilter(vp8_mbloop_filter_horizontal_edge_armv6);
|
extern prototype_loopfilter(vp8_mbloop_filter_horizontal_edge_armv6);
|
||||||
extern prototype_loopfilter(vp8_mbloop_filter_vertical_edge_armv6);
|
extern prototype_loopfilter(vp8_mbloop_filter_vertical_edge_armv6);
|
||||||
#endif
|
extern prototype_loopfilter(vp8_loop_filter_simple_horizontal_edge_armv6);
|
||||||
|
extern prototype_loopfilter(vp8_loop_filter_simple_vertical_edge_armv6);
|
||||||
|
|
||||||
#if HAVE_ARMV7
|
extern prototype_loopfilter(vp8_loop_filter_horizontal_edge_y_neon);
|
||||||
typedef void loopfilter_y_neon(unsigned char *src, int pitch,
|
extern prototype_loopfilter(vp8_loop_filter_vertical_edge_y_neon);
|
||||||
unsigned char blimit, unsigned char limit, unsigned char thresh);
|
extern prototype_loopfilter(vp8_mbloop_filter_horizontal_edge_y_neon);
|
||||||
typedef void loopfilter_uv_neon(unsigned char *u, int pitch,
|
extern prototype_loopfilter(vp8_mbloop_filter_vertical_edge_y_neon);
|
||||||
unsigned char blimit, unsigned char limit, unsigned char thresh,
|
extern prototype_loopfilter(vp8_loop_filter_simple_horizontal_edge_neon);
|
||||||
unsigned char *v);
|
extern prototype_loopfilter(vp8_loop_filter_simple_vertical_edge_neon);
|
||||||
|
|
||||||
extern loopfilter_y_neon vp8_loop_filter_horizontal_edge_y_neon;
|
extern loop_filter_uvfunction vp8_loop_filter_horizontal_edge_uv_neon;
|
||||||
extern loopfilter_y_neon vp8_loop_filter_vertical_edge_y_neon;
|
extern loop_filter_uvfunction vp8_loop_filter_vertical_edge_uv_neon;
|
||||||
extern loopfilter_y_neon vp8_mbloop_filter_horizontal_edge_y_neon;
|
extern loop_filter_uvfunction vp8_mbloop_filter_horizontal_edge_uv_neon;
|
||||||
extern loopfilter_y_neon vp8_mbloop_filter_vertical_edge_y_neon;
|
extern loop_filter_uvfunction vp8_mbloop_filter_vertical_edge_uv_neon;
|
||||||
|
|
||||||
extern loopfilter_uv_neon vp8_loop_filter_horizontal_edge_uv_neon;
|
|
||||||
extern loopfilter_uv_neon vp8_loop_filter_vertical_edge_uv_neon;
|
|
||||||
extern loopfilter_uv_neon vp8_mbloop_filter_horizontal_edge_uv_neon;
|
|
||||||
extern loopfilter_uv_neon vp8_mbloop_filter_vertical_edge_uv_neon;
|
|
||||||
#endif
|
|
||||||
|
|
||||||
#if HAVE_ARMV6
|
#if HAVE_ARMV6
|
||||||
/*ARMV6 loopfilter functions*/
|
/*ARMV6 loopfilter functions*/
|
||||||
/* Horizontal MB filtering */
|
/* Horizontal MB filtering */
|
||||||
void vp8_loop_filter_mbh_armv6(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void vp8_loop_filter_mbh_armv6(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_mbloop_filter_horizontal_edge_armv6(y_ptr, y_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 2);
|
(void) simpler_lpf;
|
||||||
|
vp8_mbloop_filter_horizontal_edge_armv6(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_mbloop_filter_horizontal_edge_armv6(u_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
|
vp8_mbloop_filter_horizontal_edge_armv6(u_ptr, uv_stride, lfi->mbflim, lfi->lim, lfi->thr, 1);
|
||||||
|
|
||||||
if (v_ptr)
|
if (v_ptr)
|
||||||
vp8_mbloop_filter_horizontal_edge_armv6(v_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
|
vp8_mbloop_filter_horizontal_edge_armv6(v_ptr, uv_stride, lfi->mbflim, lfi->lim, lfi->thr, 1);
|
||||||
|
}
|
||||||
|
|
||||||
|
void vp8_loop_filter_mbhs_armv6(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
|
{
|
||||||
|
(void) u_ptr;
|
||||||
|
(void) v_ptr;
|
||||||
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_armv6(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Vertical MB Filtering */
|
/* Vertical MB Filtering */
|
||||||
void vp8_loop_filter_mbv_armv6(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void vp8_loop_filter_mbv_armv6(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_mbloop_filter_vertical_edge_armv6(y_ptr, y_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 2);
|
(void) simpler_lpf;
|
||||||
|
vp8_mbloop_filter_vertical_edge_armv6(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_mbloop_filter_vertical_edge_armv6(u_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
|
vp8_mbloop_filter_vertical_edge_armv6(u_ptr, uv_stride, lfi->mbflim, lfi->lim, lfi->thr, 1);
|
||||||
|
|
||||||
if (v_ptr)
|
if (v_ptr)
|
||||||
vp8_mbloop_filter_vertical_edge_armv6(v_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
|
vp8_mbloop_filter_vertical_edge_armv6(v_ptr, uv_stride, lfi->mbflim, lfi->lim, lfi->thr, 1);
|
||||||
|
}
|
||||||
|
|
||||||
|
void vp8_loop_filter_mbvs_armv6(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
|
{
|
||||||
|
(void) u_ptr;
|
||||||
|
(void) v_ptr;
|
||||||
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_vertical_edge_armv6(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Horizontal B Filtering */
|
/* Horizontal B Filtering */
|
||||||
void vp8_loop_filter_bh_armv6(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void vp8_loop_filter_bh_armv6(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_loop_filter_horizontal_edge_armv6(y_ptr + 4 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
(void) simpler_lpf;
|
||||||
vp8_loop_filter_horizontal_edge_armv6(y_ptr + 8 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
vp8_loop_filter_horizontal_edge_armv6(y_ptr + 4 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
vp8_loop_filter_horizontal_edge_armv6(y_ptr + 12 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
vp8_loop_filter_horizontal_edge_armv6(y_ptr + 8 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_horizontal_edge_armv6(y_ptr + 12 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_loop_filter_horizontal_edge_armv6(u_ptr + 4 * uv_stride, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
|
vp8_loop_filter_horizontal_edge_armv6(u_ptr + 4 * uv_stride, uv_stride, lfi->flim, lfi->lim, lfi->thr, 1);
|
||||||
|
|
||||||
if (v_ptr)
|
if (v_ptr)
|
||||||
vp8_loop_filter_horizontal_edge_armv6(v_ptr + 4 * uv_stride, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
|
vp8_loop_filter_horizontal_edge_armv6(v_ptr + 4 * uv_stride, uv_stride, lfi->flim, lfi->lim, lfi->thr, 1);
|
||||||
}
|
}
|
||||||
|
|
||||||
void vp8_loop_filter_bhs_armv6(unsigned char *y_ptr, int y_stride,
|
void vp8_loop_filter_bhs_armv6(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
const unsigned char *blimit)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_loop_filter_simple_horizontal_edge_armv6(y_ptr + 4 * y_stride, y_stride, blimit);
|
(void) u_ptr;
|
||||||
vp8_loop_filter_simple_horizontal_edge_armv6(y_ptr + 8 * y_stride, y_stride, blimit);
|
(void) v_ptr;
|
||||||
vp8_loop_filter_simple_horizontal_edge_armv6(y_ptr + 12 * y_stride, y_stride, blimit);
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_armv6(y_ptr + 4 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_armv6(y_ptr + 8 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_armv6(y_ptr + 12 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Vertical B Filtering */
|
/* Vertical B Filtering */
|
||||||
void vp8_loop_filter_bv_armv6(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void vp8_loop_filter_bv_armv6(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_loop_filter_vertical_edge_armv6(y_ptr + 4, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
(void) simpler_lpf;
|
||||||
vp8_loop_filter_vertical_edge_armv6(y_ptr + 8, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
vp8_loop_filter_vertical_edge_armv6(y_ptr + 4, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
vp8_loop_filter_vertical_edge_armv6(y_ptr + 12, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
vp8_loop_filter_vertical_edge_armv6(y_ptr + 8, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_vertical_edge_armv6(y_ptr + 12, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_loop_filter_vertical_edge_armv6(u_ptr + 4, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
|
vp8_loop_filter_vertical_edge_armv6(u_ptr + 4, uv_stride, lfi->flim, lfi->lim, lfi->thr, 1);
|
||||||
|
|
||||||
if (v_ptr)
|
if (v_ptr)
|
||||||
vp8_loop_filter_vertical_edge_armv6(v_ptr + 4, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
|
vp8_loop_filter_vertical_edge_armv6(v_ptr + 4, uv_stride, lfi->flim, lfi->lim, lfi->thr, 1);
|
||||||
}
|
}
|
||||||
|
|
||||||
void vp8_loop_filter_bvs_armv6(unsigned char *y_ptr, int y_stride,
|
void vp8_loop_filter_bvs_armv6(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
const unsigned char *blimit)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_loop_filter_simple_vertical_edge_armv6(y_ptr + 4, y_stride, blimit);
|
(void) u_ptr;
|
||||||
vp8_loop_filter_simple_vertical_edge_armv6(y_ptr + 8, y_stride, blimit);
|
(void) v_ptr;
|
||||||
vp8_loop_filter_simple_vertical_edge_armv6(y_ptr + 12, y_stride, blimit);
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_vertical_edge_armv6(y_ptr + 4, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_vertical_edge_armv6(y_ptr + 8, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_vertical_edge_armv6(y_ptr + 12, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
@@ -117,60 +145,93 @@ void vp8_loop_filter_bvs_armv6(unsigned char *y_ptr, int y_stride,
|
|||||||
/* NEON loopfilter functions */
|
/* NEON loopfilter functions */
|
||||||
/* Horizontal MB filtering */
|
/* Horizontal MB filtering */
|
||||||
void vp8_loop_filter_mbh_neon(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void vp8_loop_filter_mbh_neon(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
unsigned char mblim = *lfi->mblim;
|
(void) simpler_lpf;
|
||||||
unsigned char lim = *lfi->lim;
|
vp8_mbloop_filter_horizontal_edge_y_neon(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
unsigned char hev_thr = *lfi->hev_thr;
|
|
||||||
vp8_mbloop_filter_horizontal_edge_y_neon(y_ptr, y_stride, mblim, lim, hev_thr);
|
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_mbloop_filter_horizontal_edge_uv_neon(u_ptr, uv_stride, mblim, lim, hev_thr, v_ptr);
|
vp8_mbloop_filter_horizontal_edge_uv_neon(u_ptr, uv_stride, lfi->mbflim, lfi->lim, lfi->thr, v_ptr);
|
||||||
|
}
|
||||||
|
|
||||||
|
void vp8_loop_filter_mbhs_neon(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
|
{
|
||||||
|
(void) u_ptr;
|
||||||
|
(void) v_ptr;
|
||||||
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_neon(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Vertical MB Filtering */
|
/* Vertical MB Filtering */
|
||||||
void vp8_loop_filter_mbv_neon(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void vp8_loop_filter_mbv_neon(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
unsigned char mblim = *lfi->mblim;
|
(void) simpler_lpf;
|
||||||
unsigned char lim = *lfi->lim;
|
vp8_mbloop_filter_vertical_edge_y_neon(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
unsigned char hev_thr = *lfi->hev_thr;
|
|
||||||
|
|
||||||
vp8_mbloop_filter_vertical_edge_y_neon(y_ptr, y_stride, mblim, lim, hev_thr);
|
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_mbloop_filter_vertical_edge_uv_neon(u_ptr, uv_stride, mblim, lim, hev_thr, v_ptr);
|
vp8_mbloop_filter_vertical_edge_uv_neon(u_ptr, uv_stride, lfi->mbflim, lfi->lim, lfi->thr, v_ptr);
|
||||||
|
}
|
||||||
|
|
||||||
|
void vp8_loop_filter_mbvs_neon(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
|
{
|
||||||
|
(void) u_ptr;
|
||||||
|
(void) v_ptr;
|
||||||
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_vertical_edge_neon(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Horizontal B Filtering */
|
/* Horizontal B Filtering */
|
||||||
void vp8_loop_filter_bh_neon(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void vp8_loop_filter_bh_neon(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
unsigned char blim = *lfi->blim;
|
(void) simpler_lpf;
|
||||||
unsigned char lim = *lfi->lim;
|
vp8_loop_filter_horizontal_edge_y_neon(y_ptr + 4 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
unsigned char hev_thr = *lfi->hev_thr;
|
vp8_loop_filter_horizontal_edge_y_neon(y_ptr + 8 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_horizontal_edge_y_neon(y_ptr + 12 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
vp8_loop_filter_horizontal_edge_y_neon(y_ptr + 4 * y_stride, y_stride, blim, lim, hev_thr);
|
|
||||||
vp8_loop_filter_horizontal_edge_y_neon(y_ptr + 8 * y_stride, y_stride, blim, lim, hev_thr);
|
|
||||||
vp8_loop_filter_horizontal_edge_y_neon(y_ptr + 12 * y_stride, y_stride, blim, lim, hev_thr);
|
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_loop_filter_horizontal_edge_uv_neon(u_ptr + 4 * uv_stride, uv_stride, blim, lim, hev_thr, v_ptr + 4 * uv_stride);
|
vp8_loop_filter_horizontal_edge_uv_neon(u_ptr + 4 * uv_stride, uv_stride, lfi->flim, lfi->lim, lfi->thr, v_ptr + 4 * uv_stride);
|
||||||
|
}
|
||||||
|
|
||||||
|
void vp8_loop_filter_bhs_neon(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
|
{
|
||||||
|
(void) u_ptr;
|
||||||
|
(void) v_ptr;
|
||||||
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_neon(y_ptr + 4 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_neon(y_ptr + 8 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_neon(y_ptr + 12 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Vertical B Filtering */
|
/* Vertical B Filtering */
|
||||||
void vp8_loop_filter_bv_neon(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void vp8_loop_filter_bv_neon(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
unsigned char blim = *lfi->blim;
|
(void) simpler_lpf;
|
||||||
unsigned char lim = *lfi->lim;
|
vp8_loop_filter_vertical_edge_y_neon(y_ptr + 4, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
unsigned char hev_thr = *lfi->hev_thr;
|
vp8_loop_filter_vertical_edge_y_neon(y_ptr + 8, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_vertical_edge_y_neon(y_ptr + 12, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
vp8_loop_filter_vertical_edge_y_neon(y_ptr + 4, y_stride, blim, lim, hev_thr);
|
|
||||||
vp8_loop_filter_vertical_edge_y_neon(y_ptr + 8, y_stride, blim, lim, hev_thr);
|
|
||||||
vp8_loop_filter_vertical_edge_y_neon(y_ptr + 12, y_stride, blim, lim, hev_thr);
|
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_loop_filter_vertical_edge_uv_neon(u_ptr + 4, uv_stride, blim, lim, hev_thr, v_ptr + 4);
|
vp8_loop_filter_vertical_edge_uv_neon(u_ptr + 4, uv_stride, lfi->flim, lfi->lim, lfi->thr, v_ptr + 4);
|
||||||
|
}
|
||||||
|
|
||||||
|
void vp8_loop_filter_bvs_neon(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
|
{
|
||||||
|
(void) u_ptr;
|
||||||
|
(void) v_ptr;
|
||||||
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_vertical_edge_neon(y_ptr + 4, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_vertical_edge_neon(y_ptr + 8, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_vertical_edge_neon(y_ptr + 12, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
#endif
|
#endif
|
||||||
|
|||||||
@@ -12,17 +12,15 @@
|
|||||||
#ifndef LOOPFILTER_ARM_H
|
#ifndef LOOPFILTER_ARM_H
|
||||||
#define LOOPFILTER_ARM_H
|
#define LOOPFILTER_ARM_H
|
||||||
|
|
||||||
#include "vpx_config.h"
|
|
||||||
|
|
||||||
#if HAVE_ARMV6
|
#if HAVE_ARMV6
|
||||||
extern prototype_loopfilter_block(vp8_loop_filter_mbv_armv6);
|
extern prototype_loopfilter_block(vp8_loop_filter_mbv_armv6);
|
||||||
extern prototype_loopfilter_block(vp8_loop_filter_bv_armv6);
|
extern prototype_loopfilter_block(vp8_loop_filter_bv_armv6);
|
||||||
extern prototype_loopfilter_block(vp8_loop_filter_mbh_armv6);
|
extern prototype_loopfilter_block(vp8_loop_filter_mbh_armv6);
|
||||||
extern prototype_loopfilter_block(vp8_loop_filter_bh_armv6);
|
extern prototype_loopfilter_block(vp8_loop_filter_bh_armv6);
|
||||||
extern prototype_simple_loopfilter(vp8_loop_filter_bvs_armv6);
|
extern prototype_loopfilter_block(vp8_loop_filter_mbvs_armv6);
|
||||||
extern prototype_simple_loopfilter(vp8_loop_filter_bhs_armv6);
|
extern prototype_loopfilter_block(vp8_loop_filter_bvs_armv6);
|
||||||
extern prototype_simple_loopfilter(vp8_loop_filter_simple_horizontal_edge_armv6);
|
extern prototype_loopfilter_block(vp8_loop_filter_mbhs_armv6);
|
||||||
extern prototype_simple_loopfilter(vp8_loop_filter_simple_vertical_edge_armv6);
|
extern prototype_loopfilter_block(vp8_loop_filter_bhs_armv6);
|
||||||
|
|
||||||
#if !CONFIG_RUNTIME_CPU_DETECT
|
#if !CONFIG_RUNTIME_CPU_DETECT
|
||||||
#undef vp8_lf_normal_mb_v
|
#undef vp8_lf_normal_mb_v
|
||||||
@@ -38,29 +36,28 @@ extern prototype_simple_loopfilter(vp8_loop_filter_simple_vertical_edge_armv6);
|
|||||||
#define vp8_lf_normal_b_h vp8_loop_filter_bh_armv6
|
#define vp8_lf_normal_b_h vp8_loop_filter_bh_armv6
|
||||||
|
|
||||||
#undef vp8_lf_simple_mb_v
|
#undef vp8_lf_simple_mb_v
|
||||||
#define vp8_lf_simple_mb_v vp8_loop_filter_simple_vertical_edge_armv6
|
#define vp8_lf_simple_mb_v vp8_loop_filter_mbvs_armv6
|
||||||
|
|
||||||
#undef vp8_lf_simple_b_v
|
#undef vp8_lf_simple_b_v
|
||||||
#define vp8_lf_simple_b_v vp8_loop_filter_bvs_armv6
|
#define vp8_lf_simple_b_v vp8_loop_filter_bvs_armv6
|
||||||
|
|
||||||
#undef vp8_lf_simple_mb_h
|
#undef vp8_lf_simple_mb_h
|
||||||
#define vp8_lf_simple_mb_h vp8_loop_filter_simple_horizontal_edge_armv6
|
#define vp8_lf_simple_mb_h vp8_loop_filter_mbhs_armv6
|
||||||
|
|
||||||
#undef vp8_lf_simple_b_h
|
#undef vp8_lf_simple_b_h
|
||||||
#define vp8_lf_simple_b_h vp8_loop_filter_bhs_armv6
|
#define vp8_lf_simple_b_h vp8_loop_filter_bhs_armv6
|
||||||
#endif /* !CONFIG_RUNTIME_CPU_DETECT */
|
#endif
|
||||||
|
#endif
|
||||||
#endif /* HAVE_ARMV6 */
|
|
||||||
|
|
||||||
#if HAVE_ARMV7
|
#if HAVE_ARMV7
|
||||||
extern prototype_loopfilter_block(vp8_loop_filter_mbv_neon);
|
extern prototype_loopfilter_block(vp8_loop_filter_mbv_neon);
|
||||||
extern prototype_loopfilter_block(vp8_loop_filter_bv_neon);
|
extern prototype_loopfilter_block(vp8_loop_filter_bv_neon);
|
||||||
extern prototype_loopfilter_block(vp8_loop_filter_mbh_neon);
|
extern prototype_loopfilter_block(vp8_loop_filter_mbh_neon);
|
||||||
extern prototype_loopfilter_block(vp8_loop_filter_bh_neon);
|
extern prototype_loopfilter_block(vp8_loop_filter_bh_neon);
|
||||||
extern prototype_simple_loopfilter(vp8_loop_filter_mbvs_neon);
|
extern prototype_loopfilter_block(vp8_loop_filter_mbvs_neon);
|
||||||
extern prototype_simple_loopfilter(vp8_loop_filter_bvs_neon);
|
extern prototype_loopfilter_block(vp8_loop_filter_bvs_neon);
|
||||||
extern prototype_simple_loopfilter(vp8_loop_filter_mbhs_neon);
|
extern prototype_loopfilter_block(vp8_loop_filter_mbhs_neon);
|
||||||
extern prototype_simple_loopfilter(vp8_loop_filter_bhs_neon);
|
extern prototype_loopfilter_block(vp8_loop_filter_bhs_neon);
|
||||||
|
|
||||||
#if !CONFIG_RUNTIME_CPU_DETECT
|
#if !CONFIG_RUNTIME_CPU_DETECT
|
||||||
#undef vp8_lf_normal_mb_v
|
#undef vp8_lf_normal_mb_v
|
||||||
@@ -86,8 +83,7 @@ extern prototype_simple_loopfilter(vp8_loop_filter_bhs_neon);
|
|||||||
|
|
||||||
#undef vp8_lf_simple_b_h
|
#undef vp8_lf_simple_b_h
|
||||||
#define vp8_lf_simple_b_h vp8_loop_filter_bhs_neon
|
#define vp8_lf_simple_b_h vp8_loop_filter_bhs_neon
|
||||||
#endif /* !CONFIG_RUNTIME_CPU_DETECT */
|
#endif
|
||||||
|
#endif
|
||||||
|
|
||||||
#endif /* HAVE_ARMV7 */
|
#endif
|
||||||
|
|
||||||
#endif /* LOOPFILTER_ARM_H */
|
|
||||||
|
|||||||
@@ -25,7 +25,7 @@
|
|||||||
|vp8_bilinear_predict16x16_neon| PROC
|
|vp8_bilinear_predict16x16_neon| PROC
|
||||||
push {r4-r5, lr}
|
push {r4-r5, lr}
|
||||||
|
|
||||||
adr r12, bifilter16_coeff
|
ldr r12, _bifilter16_coeff_
|
||||||
ldr r4, [sp, #12] ;load parameters from stack
|
ldr r4, [sp, #12] ;load parameters from stack
|
||||||
ldr r5, [sp, #16] ;load parameters from stack
|
ldr r5, [sp, #16] ;load parameters from stack
|
||||||
|
|
||||||
@@ -351,6 +351,8 @@ filt_blk2d_spo16x16_loop_neon
|
|||||||
|
|
||||||
;-----------------
|
;-----------------
|
||||||
|
|
||||||
|
_bifilter16_coeff_
|
||||||
|
DCD bifilter16_coeff
|
||||||
bifilter16_coeff
|
bifilter16_coeff
|
||||||
DCD 128, 0, 112, 16, 96, 32, 80, 48, 64, 64, 48, 80, 32, 96, 16, 112
|
DCD 128, 0, 112, 16, 96, 32, 80, 48, 64, 64, 48, 80, 32, 96, 16, 112
|
||||||
|
|
||||||
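The `bifilter16_coeff` table in the hunk above holds eight tap pairs, each summing to 128. The following is a minimal C sketch of how one such pair blends two neighbouring pixels. It assumes 7-bit fixed point with round-to-nearest, `(sum + 64) >> 7`; that rounding is an assumption, not something shown in this diff. `bilinear_taps` and `bilinear_tap` are hypothetical names.

```c
/* Same tap pairs as the DCD table in the diff; each pair sums to 128. */
static const int bilinear_taps[8][2] = {
    {128,   0}, {112,  16}, { 96,  32}, { 80,  48},
    { 64,  64}, { 48,  80}, { 32,  96}, { 16, 112}
};

/*
 * Blend pixels a and b using the tap pair selected by a 1/8-pel offset.
 * Assumption: 7-bit fixed point, rounded to nearest via +64 before >> 7.
 */
static unsigned char bilinear_tap(unsigned char a, unsigned char b, int offset)
{
    const int *t = bilinear_taps[offset & 7];
    return (unsigned char)((a * t[0] + b * t[1] + 64) >> 7);
}
```

Under these assumptions, offset 0 returns `a` unchanged and offset 4 gives the average of `a` and `b`.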
|
|||||||
@@ -25,7 +25,7 @@
|
|||||||
|vp8_bilinear_predict4x4_neon| PROC
|
|vp8_bilinear_predict4x4_neon| PROC
|
||||||
push {r4, lr}
|
push {r4, lr}
|
||||||
|
|
||||||
adr r12, bifilter4_coeff
|
ldr r12, _bifilter4_coeff_
|
||||||
ldr r4, [sp, #8] ;load parameters from stack
|
ldr r4, [sp, #8] ;load parameters from stack
|
||||||
ldr lr, [sp, #12] ;load parameters from stack
|
ldr lr, [sp, #12] ;load parameters from stack
|
||||||
|
|
||||||
@@ -124,6 +124,8 @@ skip_secondpass_filter
|
|||||||
|
|
||||||
;-----------------
|
;-----------------
|
||||||
|
|
||||||
|
_bifilter4_coeff_
|
||||||
|
DCD bifilter4_coeff
|
||||||
bifilter4_coeff
|
bifilter4_coeff
|
||||||
DCD 128, 0, 112, 16, 96, 32, 80, 48, 64, 64, 48, 80, 32, 96, 16, 112
|
DCD 128, 0, 112, 16, 96, 32, 80, 48, 64, 64, 48, 80, 32, 96, 16, 112
|
||||||
|
|
||||||
|
|||||||
@@ -25,7 +25,7 @@
|
|||||||
|vp8_bilinear_predict8x4_neon| PROC
|
|vp8_bilinear_predict8x4_neon| PROC
|
||||||
push {r4, lr}
|
push {r4, lr}
|
||||||
|
|
||||||
adr r12, bifilter8x4_coeff
|
ldr r12, _bifilter8x4_coeff_
|
||||||
ldr r4, [sp, #8] ;load parameters from stack
|
ldr r4, [sp, #8] ;load parameters from stack
|
||||||
ldr lr, [sp, #12] ;load parameters from stack
|
ldr lr, [sp, #12] ;load parameters from stack
|
||||||
|
|
||||||
@@ -129,6 +129,8 @@ skip_secondpass_filter
|
|||||||
|
|
||||||
;-----------------
|
;-----------------
|
||||||
|
|
||||||
|
_bifilter8x4_coeff_
|
||||||
|
DCD bifilter8x4_coeff
|
||||||
bifilter8x4_coeff
|
bifilter8x4_coeff
|
||||||
DCD 128, 0, 112, 16, 96, 32, 80, 48, 64, 64, 48, 80, 32, 96, 16, 112
|
DCD 128, 0, 112, 16, 96, 32, 80, 48, 64, 64, 48, 80, 32, 96, 16, 112
|
||||||
|
|
||||||
|
|||||||
@@ -25,7 +25,7 @@
|
|||||||
|vp8_bilinear_predict8x8_neon| PROC
|
|vp8_bilinear_predict8x8_neon| PROC
|
||||||
push {r4, lr}
|
push {r4, lr}
|
||||||
|
|
||||||
adr r12, bifilter8_coeff
|
ldr r12, _bifilter8_coeff_
|
||||||
ldr r4, [sp, #8] ;load parameters from stack
|
ldr r4, [sp, #8] ;load parameters from stack
|
||||||
ldr lr, [sp, #12] ;load parameters from stack
|
ldr lr, [sp, #12] ;load parameters from stack
|
||||||
|
|
||||||
@@ -177,6 +177,8 @@ skip_secondpass_filter
|
|||||||
|
|
||||||
;-----------------
|
;-----------------
|
||||||
|
|
||||||
|
_bifilter8_coeff_
|
||||||
|
DCD bifilter8_coeff
|
||||||
bifilter8_coeff
|
bifilter8_coeff
|
||||||
DCD 128, 0, 112, 16, 96, 32, 80, 48, 64, 64, 48, 80, 32, 96, 16, 112
|
DCD 128, 0, 112, 16, 96, 32, 80, 48, 64, 64, 48, 80, 32, 96, 16, 112
|
||||||
|
|
||||||
|
|||||||
@@ -20,16 +20,19 @@
|
|||||||
|vp8_short_inv_walsh4x4_neon| PROC
|
|vp8_short_inv_walsh4x4_neon| PROC
|
||||||
|
|
||||||
; read in all four lines of values: d0->d3
|
; read in all four lines of values: d0->d3
|
||||||
vld1.i16 {q0-q1}, [r0@128]
|
vldm.64 r0, {q0, q1}
|
||||||
|
|
||||||
; first for loop
|
; first for loop
|
||||||
vadd.s16 d4, d0, d3 ;a = [0] + [12]
|
|
||||||
vadd.s16 d6, d1, d2 ;b = [4] + [8]
|
|
||||||
vsub.s16 d5, d0, d3 ;d = [0] - [12]
|
|
||||||
vsub.s16 d7, d1, d2 ;c = [4] - [8]
|
|
||||||
|
|
||||||
vadd.s16 q0, q2, q3 ; a+b d+c
|
vadd.s16 d4, d0, d3 ;a = [0] + [12]
|
||||||
vsub.s16 q1, q2, q3 ; a-b d-c
|
vadd.s16 d5, d1, d2 ;b = [4] + [8]
|
||||||
|
vsub.s16 d6, d1, d2 ;c = [4] - [8]
|
||||||
|
vsub.s16 d7, d0, d3 ;d = [0] - [12]
|
||||||
|
|
||||||
|
vadd.s16 d0, d4, d5 ;a + b
|
||||||
|
vadd.s16 d1, d6, d7 ;c + d
|
||||||
|
vsub.s16 d2, d4, d5 ;a - b
|
||||||
|
vsub.s16 d3, d7, d6 ;d - c
|
||||||
|
|
||||||
vtrn.32 d0, d2 ;d0: 0 1 8 9
|
vtrn.32 d0, d2 ;d0: 0 1 8 9
|
||||||
;d2: 2 3 10 11
|
;d2: 2 3 10 11
|
||||||
@@ -44,22 +47,29 @@
|
|||||||
; second for loop
|
; second for loop
|
||||||
|
|
||||||
vadd.s16 d4, d0, d3 ;a = [0] + [3]
|
vadd.s16 d4, d0, d3 ;a = [0] + [3]
|
||||||
vadd.s16 d6, d1, d2 ;b = [1] + [2]
|
vadd.s16 d5, d1, d2 ;b = [1] + [2]
|
||||||
vsub.s16 d5, d0, d3 ;d = [0] - [3]
|
vsub.s16 d6, d1, d2 ;c = [1] - [2]
|
||||||
vsub.s16 d7, d1, d2 ;c = [1] - [2]
|
vsub.s16 d7, d0, d3 ;d = [0] - [3]
|
||||||
|
|
||||||
vmov.i16 q8, #3
|
vadd.s16 d0, d4, d5 ;e = a + b
|
||||||
|
vadd.s16 d1, d6, d7 ;f = c + d
|
||||||
|
vsub.s16 d2, d4, d5 ;g = a - b
|
||||||
|
vsub.s16 d3, d7, d6 ;h = d - c
|
||||||
|
|
||||||
vadd.s16 q0, q2, q3 ; a+b d+c
|
vmov.i16 q2, #3
|
||||||
vsub.s16 q1, q2, q3 ; a-b d-c
|
vadd.i16 q0, q0, q2 ;e/f += 3
|
||||||
|
vadd.i16 q1, q1, q2 ;g/h += 3
|
||||||
vadd.i16 q0, q0, q8 ;e/f += 3
|
|
||||||
vadd.i16 q1, q1, q8 ;g/h += 3
|
|
||||||
|
|
||||||
vshr.s16 q0, q0, #3 ;e/f >> 3
|
vshr.s16 q0, q0, #3 ;e/f >> 3
|
||||||
vshr.s16 q1, q1, #3 ;g/h >> 3
|
vshr.s16 q1, q1, #3 ;g/h >> 3
|
||||||
|
|
||||||
vst4.i16 {d0,d1,d2,d3}, [r1@128]
|
vtrn.32 d0, d2
|
||||||
|
vtrn.32 d1, d3
|
||||||
|
vtrn.16 d0, d1
|
||||||
|
vtrn.16 d2, d3
|
||||||
|
|
||||||
|
vstmia.16 r1!, {q0}
|
||||||
|
vstmia.16 r1!, {q1}
|
||||||
|
|
||||||
bx lr
|
bx lr
|
||||||
ENDP ; |vp8_short_inv_walsh4x4_neon|
|
ENDP ; |vp8_short_inv_walsh4x4_neon|
|
||||||
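The comments in `vp8_short_inv_walsh4x4_neon` above describe a two-pass inverse Walsh-Hadamard transform: a column pass over elements `[0]`, `[4]`, `[8]`, `[12]`, then a row pass with `+3` rounding and `>> 3`. The following scalar C sketch follows those comments. `inv_walsh4x4_ref` is a hypothetical name, and the sketch assumes `>>` on a negative `int` is an arithmetic shift, as `vshr.s16` is.

```c
/* Scalar sketch of the 4x4 inverse Walsh-Hadamard transform (hypothetical name). */
static void inv_walsh4x4_ref(const short *input, short *output)
{
    int i;
    int a, b, c, d;
    const short *ip = input;
    short *op = output;

    /* First pass: operate down each column. */
    for (i = 0; i < 4; i++) {
        a = ip[0] + ip[12];
        b = ip[4] + ip[8];
        c = ip[4] - ip[8];
        d = ip[0] - ip[12];

        op[0]  = (short)(a + b);
        op[4]  = (short)(c + d);
        op[8]  = (short)(a - b);
        op[12] = (short)(d - c);
        ip++;
        op++;
    }

    /* Second pass: operate along each row, then round and scale. */
    op = output;
    for (i = 0; i < 4; i++) {
        a = op[0] + op[3];
        b = op[1] + op[2];
        c = op[1] - op[2];
        d = op[0] - op[3];

        op[0] = (short)((a + b + 3) >> 3);  /* e */
        op[1] = (short)((c + d + 3) >> 3);  /* f */
        op[2] = (short)((a - b + 3) >> 3);  /* g */
        op[3] = (short)((d - c + 3) >> 3);  /* h */
        op += 4;
    }
}
```

A scalar version like this is mainly useful as a cross-check when reordering the NEON register assignments, as this hunk does.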
@@ -67,13 +77,19 @@
|
|||||||
|
|
||||||
;short vp8_short_inv_walsh4x4_1_neon(short *input, short *output)
|
;short vp8_short_inv_walsh4x4_1_neon(short *input, short *output)
|
||||||
|vp8_short_inv_walsh4x4_1_neon| PROC
|
|vp8_short_inv_walsh4x4_1_neon| PROC
|
||||||
ldrsh r2, [r0] ; load input[0]
|
; load a full line into a neon register
|
||||||
add r3, r2, #3 ; add 3
|
vld1.16 {q0}, [r0]
|
||||||
add r2, r1, #16 ; base for last 8 output
|
; extract first element and replicate
|
||||||
asr r0, r3, #3 ; right shift 3
|
vdup.16 q1, d0[0]
|
||||||
vdup.16 q0, r0 ; load and duplicate
|
; add 3 to all values
|
||||||
vst1.16 {q0}, [r1@128] ; write back 8
|
vmov.i16 q2, #3
|
||||||
vst1.16 {q0}, [r2@128] ; write back last 8
|
vadd.i16 q3, q1, q2
|
||||||
|
; right shift
|
||||||
|
vshr.s16 q3, q3, #3
|
||||||
|
; write it back
|
||||||
|
vstmia.16 r1!, {q3}
|
||||||
|
vstmia.16 r1!, {q3}
|
||||||
|
|
||||||
bx lr
|
bx lr
|
||||||
ENDP ; |vp8_short_inv_walsh4x4_1_neon|
|
ENDP ; |vp8_short_inv_walsh4x4_1_neon|
|
||||||
|
|
||||||
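Both versions of `vp8_short_inv_walsh4x4_1_neon` above compute the same DC-only shortcut: take `input[0]`, add 3, shift right by 3, and write that value to all 16 outputs. A minimal scalar C sketch of that behaviour follows. `inv_walsh4x4_dc_only_ref` is a hypothetical name, and `>>` on a negative `int` is assumed to be an arithmetic shift.

```c
/* Scalar sketch of the DC-only inverse Walsh path (hypothetical name). */
static void inv_walsh4x4_dc_only_ref(const short *input, short *output)
{
    int i;
    /* Round and scale the single DC coefficient. */
    short dc = (short)((input[0] + 3) >> 3);

    /* Replicate it across the whole 4x4 output block. */
    for (i = 0; i < 16; i++)
        output[i] = dc;
}
```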
|
|||||||
@@ -14,97 +14,109 @@
|
|||||||
EXPORT |vp8_loop_filter_vertical_edge_y_neon|
|
EXPORT |vp8_loop_filter_vertical_edge_y_neon|
|
||||||
EXPORT |vp8_loop_filter_vertical_edge_uv_neon|
|
EXPORT |vp8_loop_filter_vertical_edge_uv_neon|
|
||||||
ARM
|
ARM
|
||||||
|
REQUIRE8
|
||||||
|
PRESERVE8
|
||||||
|
|
||||||
AREA ||.text||, CODE, READONLY, ALIGN=2
|
AREA ||.text||, CODE, READONLY, ALIGN=2
|
||||||
|
|
||||||
|
; flimit, limit, and thresh should be positive numbers.
|
||||||
|
; All 16 elements in these variables are equal.
|
||||||
|
|
||||||
|
; void vp8_loop_filter_horizontal_edge_y_neon(unsigned char *src, int pitch,
|
||||||
|
; const signed char *flimit,
|
||||||
|
; const signed char *limit,
|
||||||
|
; const signed char *thresh,
|
||||||
|
; int count)
|
||||||
; r0 unsigned char *src
|
; r0 unsigned char *src
|
||||||
; r1 int pitch
|
; r1 int pitch
|
||||||
; r2 unsigned char blimit
|
; r2 const signed char *flimit
|
||||||
; r3 unsigned char limit
|
; r3 const signed char *limit
|
||||||
; sp unsigned char thresh,
|
; sp const signed char *thresh,
|
||||||
|
; sp+4 int count (unused)
|
||||||
|vp8_loop_filter_horizontal_edge_y_neon| PROC
|
|vp8_loop_filter_horizontal_edge_y_neon| PROC
|
||||||
push {lr}
|
stmdb sp!, {lr}
|
||||||
vdup.u8 q0, r2 ; duplicate blimit
|
vld1.s8 {d0[], d1[]}, [r2] ; flimit
|
||||||
vdup.u8 q1, r3 ; duplicate limit
|
vld1.s8 {d2[], d3[]}, [r3] ; limit
|
||||||
sub r2, r0, r1, lsl #2 ; move src pointer down by 4 lines
|
sub r2, r0, r1, lsl #2 ; move src pointer down by 4 lines
|
||||||
ldr r3, [sp, #4] ; load thresh
|
ldr r12, [sp, #4] ; load thresh pointer
|
||||||
add r12, r2, r1
|
|
||||||
add r1, r1, r1
|
|
||||||
|
|
||||||
vdup.u8 q2, r3 ; duplicate thresh
|
vld1.u8 {q3}, [r2], r1 ; p3
|
||||||
|
vld1.u8 {q4}, [r2], r1 ; p2
|
||||||
vld1.u8 {q3}, [r2@128], r1 ; p3
|
vld1.u8 {q5}, [r2], r1 ; p1
|
||||||
vld1.u8 {q4}, [r12@128], r1 ; p2
|
vld1.u8 {q6}, [r2], r1 ; p0
|
||||||
vld1.u8 {q5}, [r2@128], r1 ; p1
|
vld1.u8 {q7}, [r2], r1 ; q0
|
||||||
vld1.u8 {q6}, [r12@128], r1 ; p0
|
vld1.u8 {q8}, [r2], r1 ; q1
|
||||||
vld1.u8 {q7}, [r2@128], r1 ; q0
|
vld1.u8 {q9}, [r2], r1 ; q2
|
||||||
vld1.u8 {q8}, [r12@128], r1 ; q1
|
vld1.u8 {q10}, [r2] ; q3
|
||||||
vld1.u8 {q9}, [r2@128] ; q2
|
vld1.s8 {d4[], d5[]}, [r12] ; thresh
|
||||||
vld1.u8 {q10}, [r12@128] ; q3
|
sub r0, r0, r1, lsl #1
|
||||||
|
|
||||||
sub r2, r2, r1, lsl #1
|
|
||||||
sub r12, r12, r1, lsl #1
|
|
||||||
|
|
||||||
bl vp8_loop_filter_neon
|
bl vp8_loop_filter_neon
|
||||||
|
|
||||||
vst1.u8 {q5}, [r2@128], r1 ; store op1
|
vst1.u8 {q5}, [r0], r1 ; store op1
|
||||||
vst1.u8 {q6}, [r12@128], r1 ; store op0
|
vst1.u8 {q6}, [r0], r1 ; store op0
|
||||||
vst1.u8 {q7}, [r2@128], r1 ; store oq0
|
vst1.u8 {q7}, [r0], r1 ; store oq0
|
||||||
vst1.u8 {q8}, [r12@128], r1 ; store oq1
|
vst1.u8 {q8}, [r0], r1 ; store oq1
|
||||||
|
|
||||||
pop {pc}
|
ldmia sp!, {pc}
|
||||||
ENDP ; |vp8_loop_filter_horizontal_edge_y_neon|
|
ENDP ; |vp8_loop_filter_horizontal_edge_y_neon|
|
||||||
|
|
||||||
|
; void vp8_loop_filter_horizontal_edge_uv_neon(unsigned char *u, int pitch
|
||||||
|
; const signed char *flimit,
|
||||||
|
; const signed char *limit,
|
||||||
|
; const signed char *thresh,
|
||||||
|
; unsigned char *v)
|
||||||
; r0 unsigned char *u,
|
; r0 unsigned char *u,
|
||||||
; r1 int pitch,
|
; r1 int pitch,
|
||||||
; r2 unsigned char blimit
|
; r2 const signed char *flimit,
|
||||||
; r3 unsigned char limit
|
; r3 const signed char *limit,
|
||||||
; sp unsigned char thresh,
|
; sp const signed char *thresh,
|
||||||
; sp+4 unsigned char *v
|
; sp+4 unsigned char *v
|
||||||
|vp8_loop_filter_horizontal_edge_uv_neon| PROC
|
|vp8_loop_filter_horizontal_edge_uv_neon| PROC
|
||||||
push {lr}
|
stmdb sp!, {lr}
|
||||||
vdup.u8 q0, r2 ; duplicate blimit
|
vld1.s8 {d0[], d1[]}, [r2] ; flimit
|
||||||
vdup.u8 q1, r3 ; duplicate limit
|
vld1.s8 {d2[], d3[]}, [r3] ; limit
|
||||||
ldr r12, [sp, #4] ; load thresh
|
|
||||||
ldr r2, [sp, #8] ; load v ptr
|
ldr r2, [sp, #8] ; load v ptr
|
||||||
vdup.u8 q2, r12 ; duplicate thresh
|
|
||||||
|
|
||||||
sub r3, r0, r1, lsl #2 ; move u pointer down by 4 lines
|
sub r3, r0, r1, lsl #2 ; move u pointer down by 4 lines
|
||||||
sub r12, r2, r1, lsl #2 ; move v pointer down by 4 lines
|
vld1.u8 {d6}, [r3], r1 ; p3
|
||||||
|
vld1.u8 {d8}, [r3], r1 ; p2
|
||||||
|
vld1.u8 {d10}, [r3], r1 ; p1
|
||||||
|
vld1.u8 {d12}, [r3], r1 ; p0
|
||||||
|
vld1.u8 {d14}, [r3], r1 ; q0
|
||||||
|
vld1.u8 {d16}, [r3], r1 ; q1
|
||||||
|
vld1.u8 {d18}, [r3], r1 ; q2
|
||||||
|
vld1.u8 {d20}, [r3] ; q3
|
||||||
|
|
||||||
vld1.u8 {d6}, [r3@64], r1 ; p3
|
ldr r3, [sp, #4] ; load thresh pointer
|
||||||
vld1.u8 {d7}, [r12@64], r1 ; p3
|
|
||||||
vld1.u8 {d8}, [r3@64], r1 ; p2
|
sub r12, r2, r1, lsl #2 ; move v pointer down by 4 lines
|
||||||
vld1.u8 {d9}, [r12@64], r1 ; p2
|
vld1.u8 {d7}, [r12], r1 ; p3
|
||||||
vld1.u8 {d10}, [r3@64], r1 ; p1
|
vld1.u8 {d9}, [r12], r1 ; p2
|
||||||
vld1.u8 {d11}, [r12@64], r1 ; p1
|
vld1.u8 {d11}, [r12], r1 ; p1
|
||||||
vld1.u8 {d12}, [r3@64], r1 ; p0
|
vld1.u8 {d13}, [r12], r1 ; p0
|
||||||
vld1.u8 {d13}, [r12@64], r1 ; p0
|
vld1.u8 {d15}, [r12], r1 ; q0
|
||||||
vld1.u8 {d14}, [r3@64], r1 ; q0
|
vld1.u8 {d17}, [r12], r1 ; q1
|
||||||
vld1.u8 {d15}, [r12@64], r1 ; q0
|
vld1.u8 {d19}, [r12], r1 ; q2
|
||||||
vld1.u8 {d16}, [r3@64], r1 ; q1
|
vld1.u8 {d21}, [r12] ; q3
|
||||||
vld1.u8 {d17}, [r12@64], r1 ; q1
|
|
||||||
vld1.u8 {d18}, [r3@64], r1 ; q2
|
vld1.s8 {d4[], d5[]}, [r3] ; thresh
|
||||||
vld1.u8 {d19}, [r12@64], r1 ; q2
|
|
||||||
vld1.u8 {d20}, [r3@64] ; q3
|
|
||||||
vld1.u8 {d21}, [r12@64] ; q3
|
|
||||||
|
|
||||||
bl vp8_loop_filter_neon
|
bl vp8_loop_filter_neon
|
||||||
|
|
||||||
sub r0, r0, r1, lsl #1
|
sub r0, r0, r1, lsl #1
|
||||||
sub r2, r2, r1, lsl #1
|
sub r2, r2, r1, lsl #1
|
||||||
|
|
||||||
vst1.u8 {d10}, [r0@64], r1 ; store u op1
|
vst1.u8 {d10}, [r0], r1 ; store u op1
|
||||||
vst1.u8 {d11}, [r2@64], r1 ; store v op1
|
vst1.u8 {d11}, [r2], r1 ; store v op1
|
||||||
vst1.u8 {d12}, [r0@64], r1 ; store u op0
|
vst1.u8 {d12}, [r0], r1 ; store u op0
|
||||||
vst1.u8 {d13}, [r2@64], r1 ; store v op0
|
vst1.u8 {d13}, [r2], r1 ; store v op0
|
||||||
vst1.u8 {d14}, [r0@64], r1 ; store u oq0
|
vst1.u8 {d14}, [r0], r1 ; store u oq0
|
||||||
vst1.u8 {d15}, [r2@64], r1 ; store v oq0
|
vst1.u8 {d15}, [r2], r1 ; store v oq0
|
||||||
vst1.u8 {d16}, [r0@64] ; store u oq1
|
vst1.u8 {d16}, [r0] ; store u oq1
|
||||||
vst1.u8 {d17}, [r2@64] ; store v oq1
|
vst1.u8 {d17}, [r2] ; store v oq1
|
||||||
|
|
||||||
pop {pc}
|
ldmia sp!, {pc}
|
||||||
ENDP ; |vp8_loop_filter_horizontal_edge_uv_neon|
|
ENDP ; |vp8_loop_filter_horizontal_edge_uv_neon|
|
||||||
|
|
||||||
; void vp8_loop_filter_vertical_edge_y_neon(unsigned char *src, int pitch,
|
; void vp8_loop_filter_vertical_edge_y_neon(unsigned char *src, int pitch,
|
||||||
@@ -112,38 +124,39 @@
|
|||||||
; const signed char *limit,
|
; const signed char *limit,
|
||||||
; const signed char *thresh,
|
; const signed char *thresh,
|
||||||
; int count)
|
; int count)
|
||||||
; r0 unsigned char *src
|
; r0 unsigned char *src,
|
||||||
; r1 int pitch
|
; r1 int pitch,
|
||||||
; r2 unsigned char blimit
|
; r2 const signed char *flimit,
|
||||||
; r3 unsigned char limit
|
; r3 const signed char *limit,
|
||||||
; sp unsigned char thresh,
|
; sp const signed char *thresh,
|
||||||
|
; sp+4 int count (unused)
|
||||||
|vp8_loop_filter_vertical_edge_y_neon| PROC
|
|vp8_loop_filter_vertical_edge_y_neon| PROC
|
||||||
push {lr}
|
stmdb sp!, {lr}
|
||||||
vdup.u8 q0, r2 ; duplicate blimit
|
vld1.s8 {d0[], d1[]}, [r2] ; flimit
|
||||||
vdup.u8 q1, r3 ; duplicate limit
|
vld1.s8 {d2[], d3[]}, [r3] ; limit
|
||||||
sub r2, r0, #4 ; src ptr down by 4 columns
|
sub r2, r0, #4 ; src ptr down by 4 columns
|
||||||
add r1, r1, r1
|
sub r0, r0, #2 ; dst ptr
|
||||||
ldr r3, [sp, #4] ; load thresh
|
ldr r12, [sp, #4] ; load thresh pointer
|
||||||
add r12, r2, r1, asr #1
|
|
||||||
|
|
||||||
vld1.u8 {d6}, [r2], r1
|
vld1.u8 {d6}, [r2], r1 ; load first 8-line src data
|
||||||
vld1.u8 {d8}, [r12], r1
|
vld1.u8 {d8}, [r2], r1
|
||||||
vld1.u8 {d10}, [r2], r1
|
vld1.u8 {d10}, [r2], r1
|
||||||
vld1.u8 {d12}, [r12], r1
|
vld1.u8 {d12}, [r2], r1
|
||||||
vld1.u8 {d14}, [r2], r1
|
vld1.u8 {d14}, [r2], r1
|
||||||
vld1.u8 {d16}, [r12], r1
|
vld1.u8 {d16}, [r2], r1
|
||||||
vld1.u8 {d18}, [r2], r1
|
vld1.u8 {d18}, [r2], r1
|
||||||
vld1.u8 {d20}, [r12], r1
|
vld1.u8 {d20}, [r2], r1
|
||||||
|
|
||||||
|
vld1.s8 {d4[], d5[]}, [r12] ; thresh
|
||||||
|
|
||||||
vld1.u8 {d7}, [r2], r1 ; load second 8-line src data
|
vld1.u8 {d7}, [r2], r1 ; load second 8-line src data
|
||||||
vld1.u8 {d9}, [r12], r1
|
vld1.u8 {d9}, [r2], r1
|
||||||
vld1.u8 {d11}, [r2], r1
|
vld1.u8 {d11}, [r2], r1
|
||||||
vld1.u8 {d13}, [r12], r1
|
vld1.u8 {d13}, [r2], r1
|
||||||
vld1.u8 {d15}, [r2], r1
|
vld1.u8 {d15}, [r2], r1
|
||||||
vld1.u8 {d17}, [r12], r1
|
vld1.u8 {d17}, [r2], r1
|
||||||
vld1.u8 {d19}, [r2]
|
vld1.u8 {d19}, [r2], r1
|
||||||
vld1.u8 {d21}, [r12]
|
vld1.u8 {d21}, [r2]
|
||||||
|
|
||||||
;transpose to 8x16 matrix
|
;transpose to 8x16 matrix
|
||||||
vtrn.32 q3, q7
|
vtrn.32 q3, q7
|
||||||
@@ -151,8 +164,6 @@
|
|||||||
vtrn.32 q5, q9
|
vtrn.32 q5, q9
|
||||||
vtrn.32 q6, q10
|
vtrn.32 q6, q10
|
||||||
|
|
||||||
vdup.u8 q2, r3 ; duplicate thresh
|
|
||||||
|
|
||||||
vtrn.16 q3, q5
|
vtrn.16 q3, q5
|
||||||
vtrn.16 q4, q6
|
vtrn.16 q4, q6
|
||||||
vtrn.16 q7, q9
|
vtrn.16 q7, q9
|
||||||
@@ -167,34 +178,28 @@
|
|||||||
|
|
||||||
vswp d12, d11
|
vswp d12, d11
|
||||||
vswp d16, d13
|
vswp d16, d13
|
||||||
|
|
||||||
sub r0, r0, #2 ; dst ptr
|
|
||||||
|
|
||||||
vswp d14, d12
|
vswp d14, d12
|
||||||
vswp d16, d15
|
vswp d16, d15
|
||||||
|
|
||||||
add r12, r0, r1, asr #1
|
|
||||||
|
|
||||||
;store op1, op0, oq0, oq1
|
;store op1, op0, oq0, oq1
|
||||||
vst4.8 {d10[0], d11[0], d12[0], d13[0]}, [r0], r1
|
vst4.8 {d10[0], d11[0], d12[0], d13[0]}, [r0], r1
|
||||||
vst4.8 {d10[1], d11[1], d12[1], d13[1]}, [r12], r1
|
vst4.8 {d10[1], d11[1], d12[1], d13[1]}, [r0], r1
|
||||||
vst4.8 {d10[2], d11[2], d12[2], d13[2]}, [r0], r1
|
vst4.8 {d10[2], d11[2], d12[2], d13[2]}, [r0], r1
|
||||||
vst4.8 {d10[3], d11[3], d12[3], d13[3]}, [r12], r1
|
vst4.8 {d10[3], d11[3], d12[3], d13[3]}, [r0], r1
|
||||||
vst4.8 {d10[4], d11[4], d12[4], d13[4]}, [r0], r1
|
vst4.8 {d10[4], d11[4], d12[4], d13[4]}, [r0], r1
|
||||||
vst4.8 {d10[5], d11[5], d12[5], d13[5]}, [r12], r1
|
vst4.8 {d10[5], d11[5], d12[5], d13[5]}, [r0], r1
|
||||||
vst4.8 {d10[6], d11[6], d12[6], d13[6]}, [r0], r1
|
vst4.8 {d10[6], d11[6], d12[6], d13[6]}, [r0], r1
|
||||||
vst4.8 {d10[7], d11[7], d12[7], d13[7]}, [r12], r1
|
vst4.8 {d10[7], d11[7], d12[7], d13[7]}, [r0], r1
|
||||||
|
|
||||||
vst4.8 {d14[0], d15[0], d16[0], d17[0]}, [r0], r1
|
vst4.8 {d14[0], d15[0], d16[0], d17[0]}, [r0], r1
|
||||||
vst4.8 {d14[1], d15[1], d16[1], d17[1]}, [r12], r1
|
vst4.8 {d14[1], d15[1], d16[1], d17[1]}, [r0], r1
|
||||||
vst4.8 {d14[2], d15[2], d16[2], d17[2]}, [r0], r1
|
vst4.8 {d14[2], d15[2], d16[2], d17[2]}, [r0], r1
|
||||||
vst4.8 {d14[3], d15[3], d16[3], d17[3]}, [r12], r1
|
vst4.8 {d14[3], d15[3], d16[3], d17[3]}, [r0], r1
|
||||||
vst4.8 {d14[4], d15[4], d16[4], d17[4]}, [r0], r1
|
vst4.8 {d14[4], d15[4], d16[4], d17[4]}, [r0], r1
|
||||||
vst4.8 {d14[5], d15[5], d16[5], d17[5]}, [r12], r1
|
vst4.8 {d14[5], d15[5], d16[5], d17[5]}, [r0], r1
|
||||||
vst4.8 {d14[6], d15[6], d16[6], d17[6]}, [r0]
|
vst4.8 {d14[6], d15[6], d16[6], d17[6]}, [r0], r1
|
||||||
vst4.8 {d14[7], d15[7], d16[7], d17[7]}, [r12]
|
vst4.8 {d14[7], d15[7], d16[7], d17[7]}, [r0]
|
||||||
|
|
||||||
pop {pc}
|
ldmia sp!, {pc}
|
||||||
ENDP ; |vp8_loop_filter_vertical_edge_y_neon|
|
ENDP ; |vp8_loop_filter_vertical_edge_y_neon|
|
||||||
|
|
||||||
; void vp8_loop_filter_vertical_edge_uv_neon(unsigned char *u, int pitch
|
; void vp8_loop_filter_vertical_edge_uv_neon(unsigned char *u, int pitch
|
||||||
@@ -204,36 +209,38 @@
|
|||||||
; unsigned char *v)
|
; unsigned char *v)
|
||||||
; r0 unsigned char *u,
|
; r0 unsigned char *u,
|
||||||
; r1 int pitch,
|
; r1 int pitch,
|
||||||
; r2 unsigned char blimit
|
; r2 const signed char *flimit,
|
||||||
; r3 unsigned char limit
|
; r3 const signed char *limit,
|
||||||
; sp unsigned char thresh,
|
; sp const signed char *thresh,
|
||||||
; sp+4 unsigned char *v
|
; sp+4 unsigned char *v
|
||||||
|vp8_loop_filter_vertical_edge_uv_neon| PROC
|
|vp8_loop_filter_vertical_edge_uv_neon| PROC
|
||||||
push {lr}
|
stmdb sp!, {lr}
|
||||||
vdup.u8 q0, r2 ; duplicate blimit
|
sub r12, r0, #4 ; move u pointer down by 4 columns
|
||||||
sub r12, r0, #4 ; move u pointer down by 4 columns
|
vld1.s8 {d0[], d1[]}, [r2] ; flimit
|
||||||
ldr r2, [sp, #8] ; load v ptr
|
vld1.s8 {d2[], d3[]}, [r3] ; limit
|
||||||
vdup.u8 q1, r3 ; duplicate limit
|
|
||||||
sub r3, r2, #4 ; move v pointer down by 4 columns
|
|
||||||
|
|
||||||
vld1.u8 {d6}, [r12], r1 ;load u data
|
ldr r2, [sp, #8] ; load v ptr
|
||||||
vld1.u8 {d7}, [r3], r1 ;load v data
|
|
||||||
|
vld1.u8 {d6}, [r12], r1 ;load u data
|
||||||
vld1.u8 {d8}, [r12], r1
|
vld1.u8 {d8}, [r12], r1
|
||||||
vld1.u8 {d9}, [r3], r1
|
|
||||||
vld1.u8 {d10}, [r12], r1
|
vld1.u8 {d10}, [r12], r1
|
||||||
vld1.u8 {d11}, [r3], r1
|
|
||||||
vld1.u8 {d12}, [r12], r1
|
vld1.u8 {d12}, [r12], r1
|
||||||
vld1.u8 {d13}, [r3], r1
|
|
||||||
vld1.u8 {d14}, [r12], r1
|
vld1.u8 {d14}, [r12], r1
|
||||||
vld1.u8 {d15}, [r3], r1
|
|
||||||
vld1.u8 {d16}, [r12], r1
|
vld1.u8 {d16}, [r12], r1
|
||||||
vld1.u8 {d17}, [r3], r1
|
|
||||||
vld1.u8 {d18}, [r12], r1
|
vld1.u8 {d18}, [r12], r1
|
||||||
vld1.u8 {d19}, [r3], r1
|
|
||||||
vld1.u8 {d20}, [r12]
|
vld1.u8 {d20}, [r12]
|
||||||
|
|
||||||
|
sub r3, r2, #4 ; move v pointer down by 4 columns
|
||||||
|
vld1.u8 {d7}, [r3], r1 ;load v data
|
||||||
|
vld1.u8 {d9}, [r3], r1
|
||||||
|
vld1.u8 {d11}, [r3], r1
|
||||||
|
vld1.u8 {d13}, [r3], r1
|
||||||
|
vld1.u8 {d15}, [r3], r1
|
||||||
|
vld1.u8 {d17}, [r3], r1
|
||||||
|
vld1.u8 {d19}, [r3], r1
|
||||||
vld1.u8 {d21}, [r3]
|
vld1.u8 {d21}, [r3]
|
||||||
|
|
||||||
ldr r12, [sp, #4] ; load thresh
|
ldr r12, [sp, #4] ; load thresh pointer
|
||||||
|
|
||||||
;transpose to 8x16 matrix
|
;transpose to 8x16 matrix
|
||||||
vtrn.32 q3, q7
|
vtrn.32 q3, q7
|
||||||
@@ -241,8 +248,6 @@
|
|||||||
vtrn.32 q5, q9
|
vtrn.32 q5, q9
|
||||||
vtrn.32 q6, q10
|
vtrn.32 q6, q10
|
||||||
|
|
||||||
vdup.u8 q2, r12 ; duplicate thresh
|
|
||||||
|
|
||||||
vtrn.16 q3, q5
|
vtrn.16 q3, q5
|
||||||
vtrn.16 q4, q6
|
vtrn.16 q4, q6
|
||||||
vtrn.16 q7, q9
|
vtrn.16 q7, q9
|
||||||
@@ -253,16 +258,18 @@
|
|||||||
vtrn.8 q7, q8
|
vtrn.8 q7, q8
|
||||||
vtrn.8 q9, q10
|
vtrn.8 q9, q10
|
||||||
|
|
||||||
|
vld1.s8 {d4[], d5[]}, [r12] ; thresh
|
||||||
|
|
||||||
bl vp8_loop_filter_neon
|
bl vp8_loop_filter_neon
|
||||||
|
|
||||||
|
sub r0, r0, #2
|
||||||
|
sub r2, r2, #2
|
||||||
|
|
||||||
vswp d12, d11
|
vswp d12, d11
|
||||||
vswp d16, d13
|
vswp d16, d13
|
||||||
vswp d14, d12
|
vswp d14, d12
|
||||||
vswp d16, d15
|
vswp d16, d15
|
||||||
|
|
||||||
sub r0, r0, #2
|
|
||||||
sub r2, r2, #2
|
|
||||||
|
|
||||||
;store op1, op0, oq0, oq1
|
;store op1, op0, oq0, oq1
|
||||||
vst4.8 {d10[0], d11[0], d12[0], d13[0]}, [r0], r1
|
vst4.8 {d10[0], d11[0], d12[0], d13[0]}, [r0], r1
|
||||||
vst4.8 {d14[0], d15[0], d16[0], d17[0]}, [r2], r1
|
vst4.8 {d14[0], d15[0], d16[0], d17[0]}, [r2], r1
|
||||||
@@ -281,7 +288,7 @@
|
|||||||
vst4.8 {d10[7], d11[7], d12[7], d13[7]}, [r0]
|
vst4.8 {d10[7], d11[7], d12[7], d13[7]}, [r0]
|
||||||
vst4.8 {d14[7], d15[7], d16[7], d17[7]}, [r2]
|
vst4.8 {d14[7], d15[7], d16[7], d17[7]}, [r2]
|
||||||
|
|
||||||
pop {pc}
|
ldmia sp!, {pc}
|
||||||
ENDP ; |vp8_loop_filter_vertical_edge_uv_neon|
|
ENDP ; |vp8_loop_filter_vertical_edge_uv_neon|
|
||||||
|
|
||||||
; void vp8_loop_filter_neon();
|
; void vp8_loop_filter_neon();
|
||||||
@@ -301,6 +308,7 @@
|
|||||||
; q9 q2
|
; q9 q2
|
||||||
; q10 q3
|
; q10 q3
|
||||||
|vp8_loop_filter_neon| PROC
|
|vp8_loop_filter_neon| PROC
|
||||||
|
ldr r12, _lf_coeff_
|
||||||
|
|
||||||
; vp8_filter_mask
|
; vp8_filter_mask
|
||||||
vabd.u8 q11, q3, q4 ; abs(p3 - p2)
|
vabd.u8 q11, q3, q4 ; abs(p3 - p2)
|
||||||
@@ -309,44 +317,42 @@
|
|||||||
vabd.u8 q14, q8, q7 ; abs(q1 - q0)
|
vabd.u8 q14, q8, q7 ; abs(q1 - q0)
|
||||||
vabd.u8 q3, q9, q8 ; abs(q2 - q1)
|
vabd.u8 q3, q9, q8 ; abs(q2 - q1)
|
||||||
vabd.u8 q4, q10, q9 ; abs(q3 - q2)
|
vabd.u8 q4, q10, q9 ; abs(q3 - q2)
|
||||||
|
vabd.u8 q9, q6, q7 ; abs(p0 - q0)
|
||||||
|
|
||||||
vmax.u8 q11, q11, q12
|
vmax.u8 q11, q11, q12
|
||||||
vmax.u8 q12, q13, q14
|
vmax.u8 q12, q13, q14
|
||||||
vmax.u8 q3, q3, q4
|
vmax.u8 q3, q3, q4
|
||||||
vmax.u8 q15, q11, q12
|
vmax.u8 q15, q11, q12
|
||||||
|
|
||||||
vabd.u8 q9, q6, q7 ; abs(p0 - q0)
|
|
||||||
|
|
||||||
; vp8_hevmask
|
; vp8_hevmask
|
||||||
vcgt.u8 q13, q13, q2 ; (abs(p1 - p0) > thresh)*-1
|
vcgt.u8 q13, q13, q2 ; (abs(p1 - p0) > thresh)*-1
|
||||||
vcgt.u8 q14, q14, q2 ; (abs(q1 - q0) > thresh)*-1
|
vcgt.u8 q14, q14, q2 ; (abs(q1 - q0) > thresh)*-1
|
||||||
vmax.u8 q15, q15, q3
|
vmax.u8 q15, q15, q3
|
||||||
|
|
||||||
vmov.u8 q10, #0x80 ; 0x80
|
vadd.u8 q0, q0, q0 ; flimit * 2
|
||||||
|
vadd.u8 q0, q0, q1 ; flimit * 2 + limit
|
||||||
|
vcge.u8 q15, q1, q15
|
||||||
|
|
||||||
vabd.u8 q2, q5, q8 ; a = abs(p1 - q1)
|
vabd.u8 q2, q5, q8 ; a = abs(p1 - q1)
|
||||||
vqadd.u8 q9, q9, q9 ; b = abs(p0 - q0) * 2
|
vqadd.u8 q9, q9, q9 ; b = abs(p0 - q0) * 2
|
||||||
|
vshr.u8 q2, q2, #1 ; a = a / 2
|
||||||
|
vqadd.u8 q9, q9, q2 ; a = b + a
|
||||||
|
vcge.u8 q9, q0, q9 ; (a <= flimit * 2 + limit) * -1
|
||||||
|
|
||||||
vcge.u8 q15, q1, q15
|
vld1.u8 {q0}, [r12]!
|
||||||
|
|
||||||
; vp8_filter() function
|
; vp8_filter() function
|
||||||
; convert to signed
|
; convert to signed
|
||||||
veor q7, q7, q10 ; qs0
|
veor q7, q7, q0 ; qs0
|
||||||
vshr.u8 q2, q2, #1 ; a = a / 2
|
veor q6, q6, q0 ; ps0
|
||||||
veor q6, q6, q10 ; ps0
|
veor q5, q5, q0 ; ps1
|
||||||
|
veor q8, q8, q0 ; qs1
|
||||||
|
|
||||||
veor q5, q5, q10 ; ps1
|
vld1.u8 {q10}, [r12]!
|
||||||
vqadd.u8 q9, q9, q2 ; a = b + a
|
|
||||||
|
|
||||||
veor q8, q8, q10 ; qs1
|
|
||||||
|
|
||||||
vmov.u8 q10, #3 ; #3
|
|
||||||
|
|
||||||
vsubl.s8 q2, d14, d12 ; ( qs0 - ps0)
|
vsubl.s8 q2, d14, d12 ; ( qs0 - ps0)
|
||||||
vsubl.s8 q11, d15, d13
|
vsubl.s8 q11, d15, d13
|
||||||
|
|
||||||
vcge.u8 q9, q0, q9 ; (a > flimit * 2 + limit) * -1
|
|
||||||
|
|
||||||
vmovl.u8 q4, d20
|
vmovl.u8 q4, d20
|
||||||
|
|
||||||
vqsub.s8 q1, q5, q8 ; vp8_filter = clamp(ps1-qs1)
|
vqsub.s8 q1, q5, q8 ; vp8_filter = clamp(ps1-qs1)
|
||||||
@@ -361,7 +367,7 @@
|
|||||||
vaddw.s8 q2, q2, d2
|
vaddw.s8 q2, q2, d2
|
||||||
vaddw.s8 q11, q11, d3
|
vaddw.s8 q11, q11, d3
|
||||||
|
|
||||||
vmov.u8 q9, #4 ; #4
|
vld1.u8 {q9}, [r12]!
|
||||||
|
|
||||||
; vp8_filter = clamp(vp8_filter + 3 * ( qs0 - ps0))
|
; vp8_filter = clamp(vp8_filter + 3 * ( qs0 - ps0))
|
||||||
vqmovn.s16 d2, q2
|
vqmovn.s16 d2, q2
|
||||||
@@ -373,20 +379,19 @@
|
|||||||
vshr.s8 q2, q2, #3 ; Filter2 >>= 3
|
vshr.s8 q2, q2, #3 ; Filter2 >>= 3
|
||||||
vshr.s8 q1, q1, #3 ; Filter1 >>= 3
|
vshr.s8 q1, q1, #3 ; Filter1 >>= 3
|
||||||
|
|
||||||
|
|
||||||
vqadd.s8 q11, q6, q2 ; u = clamp(ps0 + Filter2)
|
vqadd.s8 q11, q6, q2 ; u = clamp(ps0 + Filter2)
|
||||||
vqsub.s8 q10, q7, q1 ; u = clamp(qs0 - Filter1)
|
vqsub.s8 q10, q7, q1 ; u = clamp(qs0 - Filter1)
|
||||||
|
|
||||||
; outer tap adjustments: ++vp8_filter >> 1
|
; outer tap adjustments: ++vp8_filter >> 1
|
||||||
vrshr.s8 q1, q1, #1
|
vrshr.s8 q1, q1, #1
|
||||||
vbic q1, q1, q14 ; vp8_filter &= ~hev
|
vbic q1, q1, q14 ; vp8_filter &= ~hev
|
||||||
vmov.u8 q0, #0x80 ; 0x80
|
|
||||||
vqadd.s8 q13, q5, q1 ; u = clamp(ps1 + vp8_filter)
|
vqadd.s8 q13, q5, q1 ; u = clamp(ps1 + vp8_filter)
|
||||||
vqsub.s8 q12, q8, q1 ; u = clamp(qs1 - vp8_filter)
|
vqsub.s8 q12, q8, q1 ; u = clamp(qs1 - vp8_filter)
|
||||||
|
|
||||||
|
veor q5, q13, q0 ; *op1 = u^0x80
|
||||||
veor q6, q11, q0 ; *op0 = u^0x80
|
veor q6, q11, q0 ; *op0 = u^0x80
|
||||||
veor q7, q10, q0 ; *oq0 = u^0x80
|
veor q7, q10, q0 ; *oq0 = u^0x80
|
||||||
veor q5, q13, q0 ; *op1 = u^0x80
|
|
||||||
veor q8, q12, q0 ; *oq1 = u^0x80
|
veor q8, q12, q0 ; *oq1 = u^0x80
|
||||||
|
|
||||||
bx lr
|
bx lr
|
||||||
@@ -394,4 +399,12 @@
|
|||||||
|
|
||||||
;-----------------
|
;-----------------
|
||||||
|
|
||||||
|
_lf_coeff_
|
||||||
|
DCD lf_coeff
|
||||||
|
lf_coeff
|
||||||
|
DCD 0x80808080, 0x80808080, 0x80808080, 0x80808080
|
||||||
|
DCD 0x03030303, 0x03030303, 0x03030303, 0x03030303
|
||||||
|
DCD 0x04040404, 0x04040404, 0x04040404, 0x04040404
|
||||||
|
DCD 0x01010101, 0x01010101, 0x01010101, 0x01010101
|
||||||
|
|
||||||
END
|
END
|
||||||
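The mask comments in the hunks above (`vp8_filter_mask`, `vp8_hevmask`, `flimit * 2 + limit`) can be read as the following per-lane scalar model. This is only a sketch: the function names are hypothetical, the `abs(p2 - p1)` and `abs(p1 - p0)` terms and the final AND/OR combination sit in lines the diff elides, so they are assumptions here rather than something this diff confirms.

```python
def sat_u8(x):
    """Clamp to the unsigned 8-bit range, as a saturating u8 add does per lane."""
    return max(0, min(255, x))


def filter_mask(p3, p2, p1, p0, q0, q1, q2, q3, flimit, limit):
    """Return True when one pixel lane passes the edge filter mask (assumed combination)."""
    # Largest step between neighbouring pixels on either side of the edge
    # (the vabd.u8 / vmax.u8 chain).
    max_step = max(abs(p3 - p2), abs(p2 - p1), abs(p1 - p0),
                   abs(q1 - q0), abs(q2 - q1), abs(q3 - q2))
    # abs(p0 - q0) * 2 + abs(p1 - q1) / 2, built with saturating adds (vqadd.u8).
    b = sat_u8(abs(p0 - q0) * 2)
    a = abs(p1 - q1) >> 1
    edge = sat_u8(b + a)
    # vadd.u8 does not saturate, so the bound wraps modulo 256.
    bound = (flimit * 2 + limit) & 0xFF
    # vcge.u8 yields all-ones when the first operand is >= the second.
    return max_step <= limit and edge <= bound


def hev_mask(p1, p0, q0, q1, thresh):
    """High edge variance: either inner step exceeds thresh (OR is an assumption)."""
    return abs(p1 - p0) > thresh or abs(q1 - q0) > thresh
```

Because `vcge.u8` sets a lane when its first operand is greater than or equal to the second, the model uses `<=` against the bound, not `>`.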
|
|||||||
@@ -9,109 +9,107 @@
|
|||||||
;
|
;
|
||||||
|
|
||||||
|
|
||||||
;EXPORT |vp8_loop_filter_simple_horizontal_edge_neon|
|
EXPORT |vp8_loop_filter_simple_horizontal_edge_neon|
|
||||||
EXPORT |vp8_loop_filter_bhs_neon|
|
|
||||||
EXPORT |vp8_loop_filter_mbhs_neon|
|
|
||||||
ARM
|
ARM
|
||||||
|
REQUIRE8
|
||||||
PRESERVE8
|
PRESERVE8
|
||||||
|
|
||||||
AREA ||.text||, CODE, READONLY, ALIGN=2
|
AREA ||.text||, CODE, READONLY, ALIGN=2
|
||||||
|
;Note: flimit, limit, and thresh should be positive numbers. All 16 elements in flimit
|
||||||
; r0 unsigned char *s, PRESERVE
|
;are equal. So, in the code, only one load is needed
|
||||||
; r1 int p, PRESERVE
|
;for flimit. Same way applies to limit and thresh.
|
||||||
; q1 limit, PRESERVE
|
; r0 unsigned char *s,
|
||||||
|
; r1 int p, //pitch
|
||||||
|
; r2 const signed char *flimit,
|
||||||
|
; r3 const signed char *limit,
|
||||||
|
; stack(r4) const signed char *thresh,
|
||||||
|
; //stack(r5) int count --unused
|
||||||
|
|
||||||
|vp8_loop_filter_simple_horizontal_edge_neon| PROC
|
|vp8_loop_filter_simple_horizontal_edge_neon| PROC
|
||||||
|
sub r0, r0, r1, lsl #1 ; move src pointer down by 2 lines
|
||||||
|
|
||||||
sub r3, r0, r1, lsl #1 ; move src pointer down by 2 lines
|
ldr r12, _lfhy_coeff_
|
||||||
|
vld1.u8 {q5}, [r0], r1 ; p1
|
||||||
vld1.u8 {q7}, [r0@128], r1 ; q0
|
vld1.s8 {d2[], d3[]}, [r2] ; flimit
|
||||||
vld1.u8 {q5}, [r3@128], r1 ; p0
|
vld1.s8 {d26[], d27[]}, [r3] ; limit -> q13
|
||||||
vld1.u8 {q8}, [r0@128] ; q1
|
vld1.u8 {q6}, [r0], r1 ; p0
|
||||||
vld1.u8 {q6}, [r3@128] ; p1
|
vld1.u8 {q0}, [r12]! ; 0x80
|
||||||
|
vld1.u8 {q7}, [r0], r1 ; q0
|
||||||
|
vld1.u8 {q10}, [r12]! ; 0x03
|
||||||
|
vld1.u8 {q8}, [r0] ; q1
|
||||||
|
|
||||||
|
;vp8_filter_mask() function
|
||||||
vabd.u8 q15, q6, q7 ; abs(p0 - q0)
|
vabd.u8 q15, q6, q7 ; abs(p0 - q0)
|
||||||
vabd.u8 q14, q5, q8 ; abs(p1 - q1)
|
vabd.u8 q14, q5, q8 ; abs(p1 - q1)
|
||||||
|
|
||||||
vqadd.u8 q15, q15, q15 ; abs(p0 - q0) * 2
|
vqadd.u8 q15, q15, q15 ; abs(p0 - q0) * 2
|
||||||
vshr.u8 q14, q14, #1 ; abs(p1 - q1) / 2
|
vshr.u8 q14, q14, #1 ; abs(p1 - q1) / 2
|
||||||
vmov.u8 q0, #0x80 ; 0x80
|
|
||||||
vmov.s16 q13, #3
|
|
||||||
vqadd.u8 q15, q15, q14 ; abs(p0 - q0) * 2 + abs(p1 - q1) / 2
|
vqadd.u8 q15, q15, q14 ; abs(p0 - q0) * 2 + abs(p1 - q1) / 2
|
||||||
|
|
||||||
|
;vp8_filter() function
|
||||||
veor q7, q7, q0 ; qs0: q0 offset to convert to a signed value
|
veor q7, q7, q0 ; qs0: q0 offset to convert to a signed value
|
||||||
veor q6, q6, q0 ; ps0: p0 offset to convert to a signed value
|
veor q6, q6, q0 ; ps0: p0 offset to convert to a signed value
|
||||||
veor q5, q5, q0 ; ps1: p1 offset to convert to a signed value
|
veor q5, q5, q0 ; ps1: p1 offset to convert to a signed value
|
||||||
veor q8, q8, q0 ; qs1: q1 offset to convert to a signed value
|
veor q8, q8, q0 ; qs1: q1 offset to convert to a signed value
|
||||||
|
|
||||||
vcge.u8 q15, q1, q15 ; (abs(p0 - q0)*2 + abs(p1-q1)/2 > limit)*-1
|
vadd.u8 q1, q1, q1 ; flimit * 2
|
||||||
|
vadd.u8 q1, q1, q13 ; flimit * 2 + limit
|
||||||
|
vcge.u8 q15, q1, q15 ; (abs(p0 - q0)*2 + abs(p1-q1)/2 <= flimit*2 + limit)*-1
|
||||||
|
|
||||||
|
;;;;;;;;;;
|
||||||
|
;vqsub.s8 q2, q7, q6 ; ( qs0 - ps0)
|
||||||
vsubl.s8 q2, d14, d12 ; ( qs0 - ps0)
|
vsubl.s8 q2, d14, d12 ; ( qs0 - ps0)
|
||||||
vsubl.s8 q3, d15, d13
|
vsubl.s8 q3, d15, d13
|
||||||
|
|
||||||
vqsub.s8 q4, q5, q8 ; q4: vp8_filter = vp8_signed_char_clamp(ps1-qs1)
|
vqsub.s8 q4, q5, q8 ; q4: vp8_filter = vp8_signed_char_clamp(ps1-qs1)
|
||||||
|
|
||||||
vmul.s16 q2, q2, q13 ; 3 * ( qs0 - ps0)
|
;vmul.i8 q2, q2, q10 ; 3 * ( qs0 - ps0)
|
||||||
vmul.s16 q3, q3, q13
|
vadd.s16 q11, q2, q2 ; 3 * ( qs0 - ps0)
|
||||||
|
vadd.s16 q12, q3, q3
|
||||||
|
|
||||||
vmov.u8 q10, #0x03 ; 0x03
|
vld1.u8 {q9}, [r12]! ; 0x04
|
||||||
vmov.u8 q9, #0x04 ; 0x04
|
|
||||||
|
vadd.s16 q2, q2, q11
|
||||||
|
vadd.s16 q3, q3, q12
|
||||||
|
|
||||||
vaddw.s8 q2, q2, d8 ; vp8_filter + 3 * ( qs0 - ps0)
|
vaddw.s8 q2, q2, d8 ; vp8_filter + 3 * ( qs0 - ps0)
|
||||||
vaddw.s8 q3, q3, d9
|
vaddw.s8 q3, q3, d9
|
||||||
|
|
||||||
|
;vqadd.s8 q4, q4, q2 ; vp8_filter = vp8_signed_char_clamp(vp8_filter + 3 * ( qs0 - ps0))
|
||||||
vqmovn.s16 d8, q2 ; vp8_filter = vp8_signed_char_clamp(vp8_filter + 3 * ( qs0 - ps0))
|
vqmovn.s16 d8, q2 ; vp8_filter = vp8_signed_char_clamp(vp8_filter + 3 * ( qs0 - ps0))
|
||||||
vqmovn.s16 d9, q3
|
vqmovn.s16 d9, q3
|
||||||
|
;;;;;;;;;;;;;
|
||||||
|
|
||||||
vand q14, q4, q15 ; vp8_filter &= mask
|
vand q4, q4, q15 ; vp8_filter &= mask
|
||||||
|
|
||||||
vqadd.s8 q2, q14, q10 ; Filter2 = vp8_signed_char_clamp(vp8_filter+3)
|
vqadd.s8 q2, q4, q10 ; Filter2 = vp8_signed_char_clamp(vp8_filter+3)
|
||||||
vqadd.s8 q3, q14, q9 ; Filter1 = vp8_signed_char_clamp(vp8_filter+4)
|
vqadd.s8 q4, q4, q9 ; Filter1 = vp8_signed_char_clamp(vp8_filter+4)
|
||||||
vshr.s8 q2, q2, #3 ; Filter2 >>= 3
|
vshr.s8 q2, q2, #3 ; Filter2 >>= 3
|
||||||
vshr.s8 q4, q3, #3 ; Filter1 >>= 3
|
vshr.s8 q4, q4, #3 ; Filter1 >>= 3
|
||||||
|
|
||||||
sub r0, r0, r1
|
sub r0, r0, r1, lsl #1
|
||||||
|
|
||||||
;calculate output
|
;calculate output
|
||||||
vqadd.s8 q11, q6, q2 ; u = vp8_signed_char_clamp(ps0 + Filter2)
|
vqadd.s8 q11, q6, q2 ; u = vp8_signed_char_clamp(ps0 + Filter2)
|
||||||
vqsub.s8 q10, q7, q4 ; u = vp8_signed_char_clamp(qs0 - Filter1)
|
vqsub.s8 q10, q7, q4 ; u = vp8_signed_char_clamp(qs0 - Filter1)
|
||||||
|
|
||||||
|
add r3, r0, r1
|
||||||
|
|
||||||
veor q6, q11, q0 ; *op0 = u^0x80
|
veor q6, q11, q0 ; *op0 = u^0x80
|
||||||
veor q7, q10, q0 ; *oq0 = u^0x80
|
veor q7, q10, q0 ; *oq0 = u^0x80
|
||||||
|
|
||||||
vst1.u8 {q6}, [r3@128] ; store op0
|
vst1.u8 {q6}, [r0] ; store op0
|
||||||
vst1.u8 {q7}, [r0@128] ; store oq0
|
vst1.u8 {q7}, [r3] ; store oq0
|
||||||
|
|
||||||
bx lr
|
bx lr
|
||||||
ENDP ; |vp8_loop_filter_simple_horizontal_edge_neon|
|
ENDP ; |vp8_loop_filter_simple_horizontal_edge_neon|
|
||||||
|
|
||||||
; r0 unsigned char *y
|
;-----------------
|
||||||
; r1 int ystride
|
|
||||||
; r2 const unsigned char *blimit
|
|
||||||
|
|
||||||
|vp8_loop_filter_bhs_neon| PROC
|
_lfhy_coeff_
|
||||||
push {r4, lr}
|
DCD lfhy_coeff
|
||||||
ldrb r3, [r2] ; load blim from mem
|
lfhy_coeff
|
||||||
vdup.s8 q1, r3 ; duplicate blim
|
DCD 0x80808080, 0x80808080, 0x80808080, 0x80808080
|
||||||
|
DCD 0x03030303, 0x03030303, 0x03030303, 0x03030303
|
||||||
add r0, r0, r1, lsl #2 ; src = y_ptr + 4 * y_stride
|
DCD 0x04040404, 0x04040404, 0x04040404, 0x04040404
|
||||||
bl vp8_loop_filter_simple_horizontal_edge_neon
|
|
||||||
; vp8_loop_filter_simple_horizontal_edge_neon preserves r0, r1 and q1
|
|
||||||
add r0, r0, r1, lsl #2 ; src = y_ptr + 8* y_stride
|
|
||||||
bl vp8_loop_filter_simple_horizontal_edge_neon
|
|
||||||
add r0, r0, r1, lsl #2 ; src = y_ptr + 12 * y_stride
|
|
||||||
pop {r4, lr}
|
|
||||||
b vp8_loop_filter_simple_horizontal_edge_neon
|
|
||||||
ENDP ;|vp8_loop_filter_bhs_neon|
|
|
||||||
|
|
||||||
; r0 unsigned char *y
|
|
||||||
; r1 int ystride
|
|
||||||
; r2 const unsigned char *blimit
|
|
||||||
|
|
||||||
|vp8_loop_filter_mbhs_neon| PROC
|
|
||||||
ldrb r3, [r2] ; load blim from mem
|
|
||||||
vdup.s8 q1, r3 ; duplicate mblim
|
|
||||||
b vp8_loop_filter_simple_horizontal_edge_neon
|
|
||||||
ENDP ;|vp8_loop_filter_bhs_neon|
|
|
||||||
|
|
||||||
END
|
END
|
||||||
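For review purposes, the arithmetic that the comments in the simple horizontal filter describe can be modelled one pixel lane at a time. The sketch below follows those comments step by step; the helper names (`s8`, `clamp_s8`, `simple_filter`) are hypothetical and it is a reading aid under those assumptions, not the project's reference implementation.

```python
def s8(x):
    """Reinterpret the low 8 bits of x as a signed byte."""
    x &= 0xFF
    return x - 256 if x >= 128 else x


def clamp_s8(x):
    """vp8_signed_char_clamp: saturate to the signed 8-bit range."""
    return max(-128, min(127, x))


def simple_filter(p1, p0, q0, q1, flimit, limit):
    """Return (op0, oq0) for one lane of unsigned pixels p1, p0 | q0, q1."""
    # Mask: abs(p0 - q0) * 2 + abs(p1 - q1) / 2 against flimit * 2 + limit.
    edge = min(255, min(255, abs(p0 - q0) * 2) + (abs(p1 - q1) >> 1))
    mask = edge <= ((flimit * 2 + limit) & 0xFF)

    # XOR with 0x80 converts unsigned pixels to signed values.
    ps1, ps0, qs0, qs1 = (s8(v ^ 0x80) for v in (p1, p0, q0, q1))

    f = clamp_s8(ps1 - qs1)
    # The 3 * (qs0 - ps0) term is accumulated at 16-bit width, then narrowed
    # with saturation (vsubl.s8 / vaddw.s8 / vqmovn.s16).
    f = clamp_s8(f + 3 * (qs0 - ps0))
    if not mask:
        f = 0  # vp8_filter &= mask

    filter2 = clamp_s8(f + 3) >> 3  # arithmetic shift, like vshr.s8
    filter1 = clamp_s8(f + 4) >> 3

    op0 = (clamp_s8(ps0 + filter2) & 0xFF) ^ 0x80
    oq0 = (clamp_s8(qs0 - filter1) & 0xFF) ^ 0x80
    return op0, oq0
```

A model like this is handy when checking that replacing `vmul.s16` with two `vadd.s16` steps leaves the output unchanged: both compute `3 * (qs0 - ps0)` at 16-bit width before the saturating narrow.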
|
|||||||
@@ -9,54 +9,60 @@
|
|||||||
;
|
;
|
||||||
|
|
||||||
|
|
||||||
;EXPORT |vp8_loop_filter_simple_vertical_edge_neon|
|
EXPORT |vp8_loop_filter_simple_vertical_edge_neon|
|
||||||
EXPORT |vp8_loop_filter_bvs_neon|
|
|
||||||
EXPORT |vp8_loop_filter_mbvs_neon|
|
|
||||||
ARM
|
ARM
|
||||||
|
REQUIRE8
|
||||||
PRESERVE8
|
PRESERVE8
|
||||||
|
|
||||||
AREA ||.text||, CODE, READONLY, ALIGN=2
|
AREA ||.text||, CODE, READONLY, ALIGN=2
|
||||||
|
;Note: flimit, limit, and thresh should be positive numbers. All 16 elements in flimit
|
||||||
; r0 unsigned char *s, PRESERVE
|
;are equal. So, in the code, only one load is needed
|
||||||
; r1 int p, PRESERVE
|
;for flimit. Same way applies to limit and thresh.
|
||||||
; q1 limit, PRESERVE
|
; r0 unsigned char *s,
|
||||||
|
; r1 int p, //pitch
|
||||||
|
; r2 const signed char *flimit,
|
||||||
|
; r3 const signed char *limit,
|
||||||
|
; stack(r4) const signed char *thresh,
|
||||||
|
; //stack(r5) int count --unused
|
||||||
|
|
||||||
|vp8_loop_filter_simple_vertical_edge_neon| PROC
|
|vp8_loop_filter_simple_vertical_edge_neon| PROC
|
||||||
sub r0, r0, #2 ; move src pointer down by 2 columns
|
sub r0, r0, #2 ; move src pointer down by 2 columns
|
||||||
add r12, r1, r1
|
|
||||||
add r3, r0, r1
|
|
||||||
|
|
||||||
vld4.8 {d6[0], d7[0], d8[0], d9[0]}, [r0], r12
|
vld4.8 {d6[0], d7[0], d8[0], d9[0]}, [r0], r1
|
||||||
vld4.8 {d6[1], d7[1], d8[1], d9[1]}, [r3], r12
|
vld1.s8 {d2[], d3[]}, [r2] ; flimit
|
||||||
vld4.8 {d6[2], d7[2], d8[2], d9[2]}, [r0], r12
|
vld1.s8 {d26[], d27[]}, [r3] ; limit -> q13
|
||||||
vld4.8 {d6[3], d7[3], d8[3], d9[3]}, [r3], r12
|
vld4.8 {d6[1], d7[1], d8[1], d9[1]}, [r0], r1
|
||||||
vld4.8 {d6[4], d7[4], d8[4], d9[4]}, [r0], r12
|
ldr r12, _vlfy_coeff_
|
||||||
vld4.8 {d6[5], d7[5], d8[5], d9[5]}, [r3], r12
|
vld4.8 {d6[2], d7[2], d8[2], d9[2]}, [r0], r1
|
||||||
vld4.8 {d6[6], d7[6], d8[6], d9[6]}, [r0], r12
|
vld4.8 {d6[3], d7[3], d8[3], d9[3]}, [r0], r1
|
||||||
vld4.8 {d6[7], d7[7], d8[7], d9[7]}, [r3], r12
|
vld4.8 {d6[4], d7[4], d8[4], d9[4]}, [r0], r1
|
||||||
|
vld4.8 {d6[5], d7[5], d8[5], d9[5]}, [r0], r1
|
||||||
|
vld4.8 {d6[6], d7[6], d8[6], d9[6]}, [r0], r1
|
||||||
|
vld4.8 {d6[7], d7[7], d8[7], d9[7]}, [r0], r1
|
||||||
|
|
||||||
vld4.8 {d10[0], d11[0], d12[0], d13[0]}, [r0], r12
|
vld4.8 {d10[0], d11[0], d12[0], d13[0]}, [r0], r1
|
||||||
vld4.8 {d10[1], d11[1], d12[1], d13[1]}, [r3], r12
|
vld1.u8 {q0}, [r12]! ; 0x80
|
||||||
vld4.8 {d10[2], d11[2], d12[2], d13[2]}, [r0], r12
|
vld4.8 {d10[1], d11[1], d12[1], d13[1]}, [r0], r1
|
||||||
vld4.8 {d10[3], d11[3], d12[3], d13[3]}, [r3], r12
|
vld1.u8 {q11}, [r12]! ; 0x03
|
||||||
vld4.8 {d10[4], d11[4], d12[4], d13[4]}, [r0], r12
|
vld4.8 {d10[2], d11[2], d12[2], d13[2]}, [r0], r1
|
||||||
vld4.8 {d10[5], d11[5], d12[5], d13[5]}, [r3], r12
|
vld1.u8 {q12}, [r12]! ; 0x04
|
||||||
vld4.8 {d10[6], d11[6], d12[6], d13[6]}, [r0], r12
|
vld4.8 {d10[3], d11[3], d12[3], d13[3]}, [r0], r1
|
||||||
vld4.8 {d10[7], d11[7], d12[7], d13[7]}, [r3]
|
vld4.8 {d10[4], d11[4], d12[4], d13[4]}, [r0], r1
|
||||||
|
vld4.8 {d10[5], d11[5], d12[5], d13[5]}, [r0], r1
|
||||||
|
vld4.8 {d10[6], d11[6], d12[6], d13[6]}, [r0], r1
|
||||||
|
vld4.8 {d10[7], d11[7], d12[7], d13[7]}, [r0], r1
|
||||||
|
|
||||||
vswp d7, d10
|
vswp d7, d10
|
||||||
vswp d12, d9
|
vswp d12, d9
|
||||||
|
;vswp q4, q5 ; p1:q3, p0:q5, q0:q4, q1:q6
|
||||||
|
|
||||||
;vp8_filter_mask() function
|
;vp8_filter_mask() function
|
||||||
;vp8_hevmask() function
|
;vp8_hevmask() function
|
||||||
sub r0, r0, r1, lsl #4
|
sub r0, r0, r1, lsl #4
|
||||||
vabd.u8 q15, q5, q4 ; abs(p0 - q0)
|
vabd.u8 q15, q5, q4 ; abs(p0 - q0)
|
||||||
vabd.u8 q14, q3, q6 ; abs(p1 - q1)
|
vabd.u8 q14, q3, q6 ; abs(p1 - q1)
|
||||||
|
|
||||||
vqadd.u8 q15, q15, q15 ; abs(p0 - q0) * 2
|
vqadd.u8 q15, q15, q15 ; abs(p0 - q0) * 2
|
||||||
vshr.u8 q14, q14, #1 ; abs(p1 - q1) / 2
|
vshr.u8 q14, q14, #1 ; abs(p1 - q1) / 2
|
||||||
vmov.u8 q0, #0x80 ; 0x80
|
|
||||||
vmov.s16 q11, #3
|
|
||||||
vqadd.u8 q15, q15, q14 ; abs(p0 - q0) * 2 + abs(p1 - q1) / 2
|
vqadd.u8 q15, q15, q14 ; abs(p0 - q0) * 2 + abs(p1 - q1) / 2
|
||||||
|
|
||||||
veor q4, q4, q0 ; qs0: q0 offset to convert to a signed value
|
veor q4, q4, q0 ; qs0: q0 offset to convert to a signed value
|
||||||
@@ -64,91 +70,87 @@
|
|||||||
veor q3, q3, q0 ; ps1: p1 offset to convert to a signed value
|
veor q3, q3, q0 ; ps1: p1 offset to convert to a signed value
|
||||||
veor q6, q6, q0 ; qs1: q1 offset to convert to a signed value
|
veor q6, q6, q0 ; qs1: q1 offset to convert to a signed value
|
||||||
|
|
||||||
|
vadd.u8 q1, q1, q1 ; flimit * 2
|
||||||
|
vadd.u8 q1, q1, q13 ; flimit * 2 + limit
|
||||||
vcge.u8 q15, q1, q15 ; (abs(p0 - q0)*2 + abs(p1-q1)/2 <= flimit*2 + limit)*-1
|
vcge.u8 q15, q1, q15 ; (abs(p0 - q0)*2 + abs(p1-q1)/2 <= flimit*2 + limit)*-1
|
||||||
|
|
||||||
|
;vp8_filter() function
|
||||||
|
;;;;;;;;;;
|
||||||
|
;vqsub.s8 q2, q5, q4 ; ( qs0 - ps0)
|
||||||
vsubl.s8 q2, d8, d10 ; ( qs0 - ps0)
|
vsubl.s8 q2, d8, d10 ; ( qs0 - ps0)
|
||||||
vsubl.s8 q13, d9, d11
|
vsubl.s8 q13, d9, d11
|
||||||
|
|
||||||
vqsub.s8 q14, q3, q6 ; vp8_filter = vp8_signed_char_clamp(ps1-qs1)
|
vqsub.s8 q1, q3, q6 ; vp8_filter = vp8_signed_char_clamp(ps1-qs1)
|
||||||
|
|
||||||
vmul.s16 q2, q2, q11 ; 3 * ( qs0 - ps0)
|
;vmul.i8 q2, q2, q11 ; vp8_filter = vp8_signed_char_clamp(vp8_filter + 3 * ( qs0 - ps0))
|
||||||
vmul.s16 q13, q13, q11
|
vadd.s16 q10, q2, q2 ; 3 * ( qs0 - ps0)
|
||||||
|
vadd.s16 q14, q13, q13
|
||||||
|
vadd.s16 q2, q2, q10
|
||||||
|
vadd.s16 q13, q13, q14
|
||||||
|
|
||||||
vmov.u8 q11, #0x03 ; 0x03
|
;vqadd.s8 q1, q1, q2
|
||||||
vmov.u8 q12, #0x04 ; 0x04
|
vaddw.s8 q2, q2, d2 ; vp8_filter + 3 * ( qs0 - ps0)
|
||||||
|
vaddw.s8 q13, q13, d3
|
||||||
|
|
||||||
vaddw.s8 q2, q2, d28 ; vp8_filter + 3 * ( qs0 - ps0)
|
vqmovn.s16 d2, q2 ; vp8_filter = vp8_signed_char_clamp(vp8_filter + 3 * ( qs0 - ps0))
|
||||||
vaddw.s8 q13, q13, d29
|
vqmovn.s16 d3, q13
|
||||||
|
|
||||||
vqmovn.s16 d28, q2 ; vp8_filter = vp8_signed_char_clamp(vp8_filter + 3 * ( qs0 - ps0))
|
|
||||||
vqmovn.s16 d29, q13
|
|
||||||
|
|
||||||
add r0, r0, #1
|
add r0, r0, #1
|
||||||
add r3, r0, r1
|
add r2, r0, r1
|
||||||
|
;;;;;;;;;;;
|
||||||
|
|
||||||
vand q14, q14, q15 ; vp8_filter &= mask
|
vand q1, q1, q15 ; vp8_filter &= mask
|
||||||
|
|
||||||
vqadd.s8 q2, q14, q11 ; Filter2 = vp8_signed_char_clamp(vp8_filter+3)
|
vqadd.s8 q2, q1, q11 ; Filter2 = vp8_signed_char_clamp(vp8_filter+3)
|
||||||
vqadd.s8 q3, q14, q12 ; Filter1 = vp8_signed_char_clamp(vp8_filter+4)
|
vqadd.s8 q1, q1, q12 ; Filter1 = vp8_signed_char_clamp(vp8_filter+4)
|
||||||
vshr.s8 q2, q2, #3 ; Filter2 >>= 3
|
vshr.s8 q2, q2, #3 ; Filter2 >>= 3
|
||||||
vshr.s8 q14, q3, #3 ; Filter1 >>= 3
|
vshr.s8 q1, q1, #3 ; Filter1 >>= 3
|
||||||
|
|
||||||
;calculate output
|
;calculate output
|
||||||
|
vqsub.s8 q10, q4, q1 ; u = vp8_signed_char_clamp(qs0 - Filter1)
|
||||||
vqadd.s8 q11, q5, q2 ; u = vp8_signed_char_clamp(ps0 + Filter2)
|
vqadd.s8 q11, q5, q2 ; u = vp8_signed_char_clamp(ps0 + Filter2)
|
||||||
vqsub.s8 q10, q4, q14 ; u = vp8_signed_char_clamp(qs0 - Filter1)
|
|
||||||
|
|
||||||
veor q6, q11, q0 ; *op0 = u^0x80
|
|
||||||
veor q7, q10, q0 ; *oq0 = u^0x80
|
veor q7, q10, q0 ; *oq0 = u^0x80
|
||||||
add r12, r1, r1
|
veor q6, q11, q0 ; *op0 = u^0x80
|
||||||
|
|
||||||
|
add r3, r2, r1
|
||||||
vswp d13, d14
|
vswp d13, d14
|
||||||
|
add r12, r3, r1
|
||||||
|
|
||||||
;store op1, op0, oq0, oq1
|
;store op1, op0, oq0, oq1
|
||||||
vst2.8 {d12[0], d13[0]}, [r0], r12
|
vst2.8 {d12[0], d13[0]}, [r0]
|
||||||
vst2.8 {d12[1], d13[1]}, [r3], r12
|
vst2.8 {d12[1], d13[1]}, [r2]
|
||||||
vst2.8 {d12[2], d13[2]}, [r0], r12
|
vst2.8 {d12[2], d13[2]}, [r3]
|
||||||
vst2.8 {d12[3], d13[3]}, [r3], r12
|
vst2.8 {d12[3], d13[3]}, [r12], r1
|
||||||
vst2.8 {d12[4], d13[4]}, [r0], r12
|
add r0, r12, r1
|
||||||
vst2.8 {d12[5], d13[5]}, [r3], r12
|
vst2.8 {d12[4], d13[4]}, [r12]
|
||||||
vst2.8 {d12[6], d13[6]}, [r0], r12
|
vst2.8 {d12[5], d13[5]}, [r0], r1
|
||||||
vst2.8 {d12[7], d13[7]}, [r3], r12
|
add r2, r0, r1
|
||||||
vst2.8 {d14[0], d15[0]}, [r0], r12
|
vst2.8 {d12[6], d13[6]}, [r0]
|
||||||
vst2.8 {d14[1], d15[1]}, [r3], r12
|
vst2.8 {d12[7], d13[7]}, [r2], r1
|
||||||
vst2.8 {d14[2], d15[2]}, [r0], r12
|
add r3, r2, r1
|
||||||
vst2.8 {d14[3], d15[3]}, [r3], r12
|
vst2.8 {d14[0], d15[0]}, [r2]
|
||||||
vst2.8 {d14[4], d15[4]}, [r0], r12
|
vst2.8 {d14[1], d15[1]}, [r3], r1
|
||||||
vst2.8 {d14[5], d15[5]}, [r3], r12
|
add r12, r3, r1
|
||||||
vst2.8 {d14[6], d15[6]}, [r0], r12
|
vst2.8 {d14[2], d15[2]}, [r3]
|
||||||
vst2.8 {d14[7], d15[7]}, [r3]
|
vst2.8 {d14[3], d15[3]}, [r12], r1
|
||||||
|
add r0, r12, r1
|
||||||
|
vst2.8 {d14[4], d15[4]}, [r12]
|
||||||
|
vst2.8 {d14[5], d15[5]}, [r0], r1
|
||||||
|
add r2, r0, r1
|
||||||
|
vst2.8 {d14[6], d15[6]}, [r0]
|
||||||
|
vst2.8 {d14[7], d15[7]}, [r2]
|
||||||
|
|
||||||
bx lr
|
bx lr
|
||||||
ENDP ; |vp8_loop_filter_simple_vertical_edge_neon|
|
ENDP ; |vp8_loop_filter_simple_vertical_edge_neon|
|
||||||
|
|
||||||
; r0 unsigned char *y
|
;-----------------
|
||||||
; r1 int ystride
|
|
||||||
; r2 const unsigned char *blimit
|
|
||||||
|
|
||||||
|vp8_loop_filter_bvs_neon| PROC
|
_vlfy_coeff_
|
||||||
push {r4, lr}
|
DCD vlfy_coeff
|
||||||
ldrb r3, [r2] ; load blim from mem
|
vlfy_coeff
|
||||||
mov r4, r0
|
DCD 0x80808080, 0x80808080, 0x80808080, 0x80808080
|
||||||
add r0, r0, #4
|
DCD 0x03030303, 0x03030303, 0x03030303, 0x03030303
|
||||||
vdup.s8 q1, r3 ; duplicate blim
|
DCD 0x04040404, 0x04040404, 0x04040404, 0x04040404
|
||||||
bl vp8_loop_filter_simple_vertical_edge_neon
|
|
||||||
; vp8_loop_filter_simple_vertical_edge_neon preserves r1 and q1
|
|
||||||
add r0, r4, #8
|
|
||||||
bl vp8_loop_filter_simple_vertical_edge_neon
|
|
||||||
add r0, r4, #12
|
|
||||||
pop {r4, lr}
|
|
||||||
b vp8_loop_filter_simple_vertical_edge_neon
|
|
||||||
ENDP ;|vp8_loop_filter_bvs_neon|
|
|
||||||
|
|
||||||
; r0 unsigned char *y
|
|
||||||
; r1 int ystride
|
|
||||||
; r2 const unsigned char *blimit
|
|
||||||
|
|
||||||
|vp8_loop_filter_mbvs_neon| PROC
|
|
||||||
ldrb r3, [r2] ; load mblim from mem
|
|
||||||
vdup.s8 q1, r3 ; duplicate mblim
|
|
||||||
b vp8_loop_filter_simple_vertical_edge_neon
|
|
||||||
ENDP ;|vp8_loop_filter_bvs_neon|
|
|
||||||
END
|
END
|
||||||
|
|||||||
@@ -14,143 +14,155 @@
|
|||||||
EXPORT |vp8_mbloop_filter_vertical_edge_y_neon|
|
EXPORT |vp8_mbloop_filter_vertical_edge_y_neon|
|
||||||
EXPORT |vp8_mbloop_filter_vertical_edge_uv_neon|
|
EXPORT |vp8_mbloop_filter_vertical_edge_uv_neon|
|
||||||
ARM
|
ARM
|
||||||
|
REQUIRE8
|
||||||
|
PRESERVE8
|
||||||
|
|
||||||
AREA ||.text||, CODE, READONLY, ALIGN=2
|
AREA ||.text||, CODE, READONLY, ALIGN=2
|
||||||
|
|
||||||
|
; flimit, limit, and thresh should be positive numbers.
|
||||||
|
; All 16 elements in these variables are equal.
|
||||||
|
|
||||||
; void vp8_mbloop_filter_horizontal_edge_y_neon(unsigned char *src, int pitch,
|
; void vp8_mbloop_filter_horizontal_edge_y_neon(unsigned char *src, int pitch,
|
||||||
; const unsigned char *blimit,
|
; const signed char *flimit,
|
||||||
; const unsigned char *limit,
|
; const signed char *limit,
|
||||||
; const unsigned char *thresh)
|
; const signed char *thresh,
|
||||||
|
; int count)
|
||||||
; r0 unsigned char *src,
|
; r0 unsigned char *src,
|
||||||
; r1 int pitch,
|
; r1 int pitch,
|
||||||
; r2 unsigned char blimit
|
; r2 const signed char *flimit,
|
||||||
; r3 unsigned char limit
|
; r3 const signed char *limit,
|
||||||
; sp unsigned char thresh,
|
; sp const signed char *thresh,
|
||||||
|
; sp+4 int count (unused)
|
||||||
|vp8_mbloop_filter_horizontal_edge_y_neon| PROC
|
|vp8_mbloop_filter_horizontal_edge_y_neon| PROC
|
||||||
push {lr}
|
stmdb sp!, {lr}
|
||||||
add r1, r1, r1 ; double stride
|
sub r0, r0, r1, lsl #2 ; move src pointer down by 4 lines
|
||||||
ldr r12, [sp, #4] ; load thresh
|
ldr r12, [sp, #4] ; load thresh pointer
|
||||||
sub r0, r0, r1, lsl #1 ; move src pointer down by 4 lines
|
|
||||||
vdup.u8 q2, r12 ; thresh
|
|
||||||
add r12, r0, r1, lsr #1 ; move src pointer up by 1 line
|
|
||||||
|
|
||||||
vld1.u8 {q3}, [r0@128], r1 ; p3
|
vld1.u8 {q3}, [r0], r1 ; p3
|
||||||
vld1.u8 {q4}, [r12@128], r1 ; p2
|
vld1.s8 {d2[], d3[]}, [r3] ; limit
|
||||||
vld1.u8 {q5}, [r0@128], r1 ; p1
|
vld1.u8 {q4}, [r0], r1 ; p2
|
||||||
vld1.u8 {q6}, [r12@128], r1 ; p0
|
vld1.s8 {d4[], d5[]}, [r12] ; thresh
|
||||||
vld1.u8 {q7}, [r0@128], r1 ; q0
|
vld1.u8 {q5}, [r0], r1 ; p1
|
||||||
vld1.u8 {q8}, [r12@128], r1 ; q1
|
vld1.u8 {q6}, [r0], r1 ; p0
|
||||||
vld1.u8 {q9}, [r0@128], r1 ; q2
|
vld1.u8 {q7}, [r0], r1 ; q0
|
||||||
vld1.u8 {q10}, [r12@128], r1 ; q3
|
vld1.u8 {q8}, [r0], r1 ; q1
|
||||||
|
vld1.u8 {q9}, [r0], r1 ; q2
|
||||||
bl vp8_mbloop_filter_neon
|
vld1.u8 {q10}, [r0], r1 ; q3
|
||||||
|
|
||||||
sub r12, r12, r1, lsl #2
|
|
||||||
add r0, r12, r1, lsr #1
|
|
||||||
|
|
||||||
vst1.u8 {q4}, [r12@128],r1 ; store op2
|
|
||||||
vst1.u8 {q5}, [r0@128],r1 ; store op1
|
|
||||||
vst1.u8 {q6}, [r12@128], r1 ; store op0
|
|
||||||
vst1.u8 {q7}, [r0@128],r1 ; store oq0
|
|
||||||
vst1.u8 {q8}, [r12@128] ; store oq1
|
|
||||||
vst1.u8 {q9}, [r0@128] ; store oq2
|
|
||||||
|
|
||||||
pop {pc}
|
|
||||||
ENDP ; |vp8_mbloop_filter_horizontal_edge_y_neon|
|
|
||||||
|
|
||||||
; void vp8_mbloop_filter_horizontal_edge_uv_neon(unsigned char *u, int pitch,
|
|
||||||
; const unsigned char *blimit,
|
|
||||||
; const unsigned char *limit,
|
|
||||||
; const unsigned char *thresh,
|
|
||||||
; unsigned char *v)
|
|
||||||
; r0 unsigned char *u,
|
|
||||||
; r1 int pitch,
|
|
||||||
; r2 unsigned char blimit
|
|
||||||
; r3 unsigned char limit
|
|
||||||
; sp unsigned char thresh,
|
|
||||||
; sp+4 unsigned char *v
|
|
||||||
|
|
||||||
|vp8_mbloop_filter_horizontal_edge_uv_neon| PROC
|
|
||||||
push {lr}
|
|
||||||
ldr r12, [sp, #4] ; load thresh
|
|
||||||
sub r0, r0, r1, lsl #2 ; move u pointer down by 4 lines
|
|
||||||
vdup.u8 q2, r12 ; thresh
|
|
||||||
ldr r12, [sp, #8] ; load v ptr
|
|
||||||
sub r12, r12, r1, lsl #2 ; move v pointer down by 4 lines
|
|
||||||
|
|
||||||
vld1.u8 {d6}, [r0@64], r1 ; p3
|
|
||||||
vld1.u8 {d7}, [r12@64], r1 ; p3
|
|
||||||
vld1.u8 {d8}, [r0@64], r1 ; p2
|
|
||||||
vld1.u8 {d9}, [r12@64], r1 ; p2
|
|
||||||
vld1.u8 {d10}, [r0@64], r1 ; p1
|
|
||||||
vld1.u8 {d11}, [r12@64], r1 ; p1
|
|
||||||
vld1.u8 {d12}, [r0@64], r1 ; p0
|
|
||||||
vld1.u8 {d13}, [r12@64], r1 ; p0
|
|
||||||
vld1.u8 {d14}, [r0@64], r1 ; q0
|
|
||||||
vld1.u8 {d15}, [r12@64], r1 ; q0
|
|
||||||
vld1.u8 {d16}, [r0@64], r1 ; q1
|
|
||||||
vld1.u8 {d17}, [r12@64], r1 ; q1
|
|
||||||
vld1.u8 {d18}, [r0@64], r1 ; q2
|
|
||||||
vld1.u8 {d19}, [r12@64], r1 ; q2
|
|
||||||
vld1.u8 {d20}, [r0@64], r1 ; q3
|
|
||||||
vld1.u8 {d21}, [r12@64], r1 ; q3
|
|
||||||
|
|
||||||
bl vp8_mbloop_filter_neon
|
bl vp8_mbloop_filter_neon
|
||||||
|
|
||||||
sub r0, r0, r1, lsl #3
|
sub r0, r0, r1, lsl #3
|
||||||
sub r12, r12, r1, lsl #3
|
add r0, r0, r1
|
||||||
|
add r2, r0, r1
|
||||||
|
add r3, r2, r1
|
||||||
|
|
||||||
|
vst1.u8 {q4}, [r0] ; store op2
|
||||||
|
vst1.u8 {q5}, [r2] ; store op1
|
||||||
|
vst1.u8 {q6}, [r3], r1 ; store op0
|
||||||
|
add r12, r3, r1
|
||||||
|
vst1.u8 {q7}, [r3] ; store oq0
|
||||||
|
vst1.u8 {q8}, [r12], r1 ; store oq1
|
||||||
|
vst1.u8 {q9}, [r12] ; store oq2
|
||||||
|
|
||||||
|
ldmia sp!, {pc}
|
||||||
|
ENDP ; |vp8_mbloop_filter_horizontal_edge_y_neon|
|
||||||
|
|
||||||
|
; void vp8_mbloop_filter_horizontal_edge_uv_neon(unsigned char *u, int pitch,
|
||||||
|
; const signed char *flimit,
|
||||||
|
; const signed char *limit,
|
||||||
|
; const signed char *thresh,
|
||||||
|
; unsigned char *v)
|
||||||
|
; r0 unsigned char *u,
|
||||||
|
; r1 int pitch,
|
||||||
|
; r2 const signed char *flimit,
|
||||||
|
; r3 const signed char *limit,
|
||||||
|
; sp const signed char *thresh,
|
||||||
|
; sp+4 unsigned char *v
|
||||||
|
|vp8_mbloop_filter_horizontal_edge_uv_neon| PROC
|
||||||
|
stmdb sp!, {lr}
|
||||||
|
sub r0, r0, r1, lsl #2 ; move u pointer down by 4 lines
|
||||||
|
vld1.s8 {d2[], d3[]}, [r3] ; limit
|
||||||
|
ldr r3, [sp, #8] ; load v ptr
|
||||||
|
ldr r12, [sp, #4] ; load thresh pointer
|
||||||
|
sub r3, r3, r1, lsl #2 ; move v pointer down by 4 lines
|
||||||
|
|
||||||
|
vld1.u8 {d6}, [r0], r1 ; p3
|
||||||
|
vld1.u8 {d7}, [r3], r1 ; p3
|
||||||
|
vld1.u8 {d8}, [r0], r1 ; p2
|
||||||
|
vld1.u8 {d9}, [r3], r1 ; p2
|
||||||
|
vld1.u8 {d10}, [r0], r1 ; p1
|
||||||
|
vld1.u8 {d11}, [r3], r1 ; p1
|
||||||
|
vld1.u8 {d12}, [r0], r1 ; p0
|
||||||
|
vld1.u8 {d13}, [r3], r1 ; p0
|
||||||
|
vld1.u8 {d14}, [r0], r1 ; q0
|
||||||
|
vld1.u8 {d15}, [r3], r1 ; q0
|
||||||
|
vld1.u8 {d16}, [r0], r1 ; q1
|
||||||
|
vld1.u8 {d17}, [r3], r1 ; q1
|
||||||
|
vld1.u8 {d18}, [r0], r1 ; q2
|
||||||
|
vld1.u8 {d19}, [r3], r1 ; q2
|
||||||
|
vld1.u8 {d20}, [r0], r1 ; q3
|
||||||
|
vld1.u8 {d21}, [r3], r1 ; q3
|
||||||
|
|
||||||
|
vld1.s8 {d4[], d5[]}, [r12] ; thresh
|
||||||
|
|
||||||
|
bl vp8_mbloop_filter_neon
|
||||||
|
|
||||||
|
sub r0, r0, r1, lsl #3
|
||||||
|
sub r3, r3, r1, lsl #3
|
||||||
|
|
||||||
add r0, r0, r1
|
add r0, r0, r1
|
||||||
add r12, r12, r1
|
add r3, r3, r1
|
||||||
|
|
||||||
vst1.u8 {d8}, [r0@64], r1 ; store u op2
|
vst1.u8 {d8}, [r0], r1 ; store u op2
|
||||||
vst1.u8 {d9}, [r12@64], r1 ; store v op2
|
vst1.u8 {d9}, [r3], r1 ; store v op2
|
||||||
vst1.u8 {d10}, [r0@64], r1 ; store u op1
|
vst1.u8 {d10}, [r0], r1 ; store u op1
|
||||||
vst1.u8 {d11}, [r12@64], r1 ; store v op1
|
vst1.u8 {d11}, [r3], r1 ; store v op1
|
||||||
vst1.u8 {d12}, [r0@64], r1 ; store u op0
|
vst1.u8 {d12}, [r0], r1 ; store u op0
|
||||||
vst1.u8 {d13}, [r12@64], r1 ; store v op0
|
vst1.u8 {d13}, [r3], r1 ; store v op0
|
||||||
vst1.u8 {d14}, [r0@64], r1 ; store u oq0
|
vst1.u8 {d14}, [r0], r1 ; store u oq0
|
||||||
vst1.u8 {d15}, [r12@64], r1 ; store v oq0
|
vst1.u8 {d15}, [r3], r1 ; store v oq0
|
||||||
vst1.u8 {d16}, [r0@64], r1 ; store u oq1
|
vst1.u8 {d16}, [r0], r1 ; store u oq1
|
||||||
vst1.u8 {d17}, [r12@64], r1 ; store v oq1
|
vst1.u8 {d17}, [r3], r1 ; store v oq1
|
||||||
vst1.u8 {d18}, [r0@64], r1 ; store u oq2
|
vst1.u8 {d18}, [r0], r1 ; store u oq2
|
||||||
vst1.u8 {d19}, [r12@64], r1 ; store v oq2
|
vst1.u8 {d19}, [r3], r1 ; store v oq2
|
||||||
|
|
||||||
pop {pc}
|
ldmia sp!, {pc}
|
||||||
ENDP ; |vp8_mbloop_filter_horizontal_edge_uv_neon|
|
ENDP ; |vp8_mbloop_filter_horizontal_edge_uv_neon|
|
||||||
|
|
||||||
; void vp8_mbloop_filter_vertical_edge_y_neon(unsigned char *src, int pitch,
|
; void vp8_mbloop_filter_vertical_edge_y_neon(unsigned char *src, int pitch,
|
||||||
; const unsigned char *blimit,
|
; const signed char *flimit,
|
||||||
; const unsigned char *limit,
|
; const signed char *limit,
|
||||||
; const unsigned char *thresh)
|
; const signed char *thresh,
|
||||||
|
; int count)
|
||||||
; r0 unsigned char *src,
|
; r0 unsigned char *src,
|
||||||
; r1 int pitch,
|
; r1 int pitch,
|
||||||
; r2 unsigned char blimit
|
; r2 const signed char *flimit,
|
||||||
; r3 unsigned char limit
|
; r3 const signed char *limit,
|
||||||
; sp unsigned char thresh,
|
; sp const signed char *thresh,
|
||||||
|
; sp+4 int count (unused)
|
||||||
|vp8_mbloop_filter_vertical_edge_y_neon| PROC
|
|vp8_mbloop_filter_vertical_edge_y_neon| PROC
|
||||||
push {lr}
|
stmdb sp!, {lr}
|
||||||
ldr r12, [sp, #4] ; load thresh
|
|
||||||
sub r0, r0, #4 ; move src pointer down by 4 columns
|
sub r0, r0, #4 ; move src pointer down by 4 columns
|
||||||
vdup.s8 q2, r12 ; thresh
|
|
||||||
add r12, r0, r1, lsl #3 ; move src pointer down by 8 lines
|
|
||||||
|
|
||||||
vld1.u8 {d6}, [r0], r1 ; load first 8-line src data
|
vld1.u8 {d6}, [r0], r1 ; load first 8-line src data
|
||||||
vld1.u8 {d7}, [r12], r1 ; load second 8-line src data
|
ldr r12, [sp, #4] ; load thresh pointer
|
||||||
vld1.u8 {d8}, [r0], r1
|
vld1.u8 {d8}, [r0], r1
|
||||||
vld1.u8 {d9}, [r12], r1
|
sub sp, sp, #32
|
||||||
vld1.u8 {d10}, [r0], r1
|
vld1.u8 {d10}, [r0], r1
|
||||||
vld1.u8 {d11}, [r12], r1
|
|
||||||
vld1.u8 {d12}, [r0], r1
|
vld1.u8 {d12}, [r0], r1
|
||||||
vld1.u8 {d13}, [r12], r1
|
|
||||||
vld1.u8 {d14}, [r0], r1
|
vld1.u8 {d14}, [r0], r1
|
||||||
vld1.u8 {d15}, [r12], r1
|
|
||||||
vld1.u8 {d16}, [r0], r1
|
vld1.u8 {d16}, [r0], r1
|
||||||
vld1.u8 {d17}, [r12], r1
|
|
||||||
vld1.u8 {d18}, [r0], r1
|
vld1.u8 {d18}, [r0], r1
|
||||||
vld1.u8 {d19}, [r12], r1
|
|
||||||
vld1.u8 {d20}, [r0], r1
|
vld1.u8 {d20}, [r0], r1
|
||||||
vld1.u8 {d21}, [r12], r1
|
|
||||||
|
vld1.u8 {d7}, [r0], r1 ; load second 8-line src data
|
||||||
|
vld1.u8 {d9}, [r0], r1
|
||||||
|
vld1.u8 {d11}, [r0], r1
|
||||||
|
vld1.u8 {d13}, [r0], r1
|
||||||
|
vld1.u8 {d15}, [r0], r1
|
||||||
|
vld1.u8 {d17}, [r0], r1
|
||||||
|
vld1.u8 {d19}, [r0], r1
|
||||||
|
vld1.u8 {d21}, [r0], r1
|
||||||
|
|
||||||
;transpose to 8x16 matrix
|
;transpose to 8x16 matrix
|
||||||
vtrn.32 q3, q7
|
vtrn.32 q3, q7
|
||||||
@@ -168,17 +180,29 @@
|
|||||||
vtrn.8 q7, q8
|
vtrn.8 q7, q8
|
||||||
vtrn.8 q9, q10
|
vtrn.8 q9, q10
|
||||||
|
|
||||||
sub r0, r0, r1, lsl #3
|
vld1.s8 {d4[], d5[]}, [r12] ; thresh
|
||||||
|
vld1.s8 {d2[], d3[]}, [r3] ; limit
|
||||||
|
mov r12, sp
|
||||||
|
vst1.u8 {q3}, [r12]!
|
||||||
|
vst1.u8 {q10}, [r12]!
|
||||||
|
|
||||||
bl vp8_mbloop_filter_neon
|
bl vp8_mbloop_filter_neon
|
||||||
|
|
||||||
sub r12, r12, r1, lsl #3
|
sub r0, r0, r1, lsl #4
|
||||||
|
|
||||||
|
add r2, r0, r1
|
||||||
|
|
||||||
|
add r3, r2, r1
|
||||||
|
|
||||||
|
vld1.u8 {q3}, [sp]!
|
||||||
|
vld1.u8 {q10}, [sp]!
|
||||||
|
|
||||||
;transpose to 16x8 matrix
|
;transpose to 16x8 matrix
|
||||||
vtrn.32 q3, q7
|
vtrn.32 q3, q7
|
||||||
vtrn.32 q4, q8
|
vtrn.32 q4, q8
|
||||||
vtrn.32 q5, q9
|
vtrn.32 q5, q9
|
||||||
vtrn.32 q6, q10
|
vtrn.32 q6, q10
|
||||||
|
add r12, r3, r1
|
||||||
|
|
||||||
vtrn.16 q3, q5
|
vtrn.16 q3, q5
|
||||||
vtrn.16 q4, q6
|
vtrn.16 q4, q6
|
||||||
@@ -191,30 +215,36 @@
|
|||||||
vtrn.8 q9, q10
|
vtrn.8 q9, q10
|
||||||
|
|
||||||
;store op2, op1, op0, oq0, oq1, oq2
|
;store op2, op1, op0, oq0, oq1, oq2
|
||||||
vst1.8 {d6}, [r0], r1
|
vst1.8 {d6}, [r0]
|
||||||
vst1.8 {d7}, [r12], r1
|
vst1.8 {d8}, [r2]
|
||||||
vst1.8 {d8}, [r0], r1
|
vst1.8 {d10}, [r3]
|
||||||
vst1.8 {d9}, [r12], r1
|
vst1.8 {d12}, [r12], r1
|
||||||
vst1.8 {d10}, [r0], r1
|
add r0, r12, r1
|
||||||
vst1.8 {d11}, [r12], r1
|
vst1.8 {d14}, [r12]
|
||||||
vst1.8 {d12}, [r0], r1
|
|
||||||
vst1.8 {d13}, [r12], r1
|
|
||||||
vst1.8 {d14}, [r0], r1
|
|
||||||
vst1.8 {d15}, [r12], r1
|
|
||||||
vst1.8 {d16}, [r0], r1
|
vst1.8 {d16}, [r0], r1
|
||||||
vst1.8 {d17}, [r12], r1
|
add r2, r0, r1
|
||||||
vst1.8 {d18}, [r0], r1
|
vst1.8 {d18}, [r0]
|
||||||
vst1.8 {d19}, [r12], r1
|
vst1.8 {d20}, [r2], r1
|
||||||
vst1.8 {d20}, [r0]
|
add r3, r2, r1
|
||||||
vst1.8 {d21}, [r12]
|
vst1.8 {d7}, [r2]
|
||||||
|
vst1.8 {d9}, [r3], r1
|
||||||
|
add r12, r3, r1
|
||||||
|
vst1.8 {d11}, [r3]
|
||||||
|
vst1.8 {d13}, [r12], r1
|
||||||
|
add r0, r12, r1
|
||||||
|
vst1.8 {d15}, [r12]
|
||||||
|
vst1.8 {d17}, [r0], r1
|
||||||
|
add r2, r0, r1
|
||||||
|
vst1.8 {d19}, [r0]
|
||||||
|
vst1.8 {d21}, [r2]
|
||||||
|
|
||||||
pop {pc}
|
ldmia sp!, {pc}
|
||||||
ENDP ; |vp8_mbloop_filter_vertical_edge_y_neon|
|
ENDP ; |vp8_mbloop_filter_vertical_edge_y_neon|
|
||||||
|
|
||||||
; void vp8_mbloop_filter_vertical_edge_uv_neon(unsigned char *u, int pitch,
|
; void vp8_mbloop_filter_vertical_edge_uv_neon(unsigned char *u, int pitch,
|
||||||
; const unsigned char *blimit,
|
; const signed char *flimit,
|
||||||
; const unsigned char *limit,
|
; const signed char *limit,
|
||||||
; const unsigned char *thresh,
|
; const signed char *thresh,
|
||||||
; unsigned char *v)
|
; unsigned char *v)
|
||||||
; r0 unsigned char *u,
|
; r0 unsigned char *u,
|
||||||
; r1 int pitch,
|
; r1 int pitch,
|
||||||
@@ -223,29 +253,30 @@
|
|||||||
; sp const signed char *thresh,
|
; sp const signed char *thresh,
|
||||||
; sp+4 unsigned char *v
|
; sp+4 unsigned char *v
|
||||||
|vp8_mbloop_filter_vertical_edge_uv_neon| PROC
|
|vp8_mbloop_filter_vertical_edge_uv_neon| PROC
|
||||||
push {lr}
|
stmdb sp!, {lr}
|
||||||
ldr r12, [sp, #4] ; load thresh
|
sub r0, r0, #4 ; move src pointer down by 4 columns
|
||||||
sub r0, r0, #4 ; move u pointer down by 4 columns
|
vld1.s8 {d2[], d3[]}, [r3] ; limit
|
||||||
vdup.u8 q2, r12 ; thresh
|
ldr r3, [sp, #8] ; load v ptr
|
||||||
ldr r12, [sp, #8] ; load v ptr
|
ldr r12, [sp, #4] ; load thresh pointer
|
||||||
sub r12, r12, #4 ; move v pointer down by 4 columns
|
|
||||||
|
sub r3, r3, #4 ; move v pointer down by 4 columns
|
||||||
|
|
||||||
vld1.u8 {d6}, [r0], r1 ;load u data
|
vld1.u8 {d6}, [r0], r1 ;load u data
|
||||||
vld1.u8 {d7}, [r12], r1 ;load v data
|
vld1.u8 {d7}, [r3], r1 ;load v data
|
||||||
vld1.u8 {d8}, [r0], r1
|
vld1.u8 {d8}, [r0], r1
|
||||||
vld1.u8 {d9}, [r12], r1
|
vld1.u8 {d9}, [r3], r1
|
||||||
vld1.u8 {d10}, [r0], r1
|
vld1.u8 {d10}, [r0], r1
|
||||||
vld1.u8 {d11}, [r12], r1
|
vld1.u8 {d11}, [r3], r1
|
||||||
vld1.u8 {d12}, [r0], r1
|
vld1.u8 {d12}, [r0], r1
|
||||||
vld1.u8 {d13}, [r12], r1
|
vld1.u8 {d13}, [r3], r1
|
||||||
vld1.u8 {d14}, [r0], r1
|
vld1.u8 {d14}, [r0], r1
|
||||||
vld1.u8 {d15}, [r12], r1
|
vld1.u8 {d15}, [r3], r1
|
||||||
vld1.u8 {d16}, [r0], r1
|
vld1.u8 {d16}, [r0], r1
|
||||||
vld1.u8 {d17}, [r12], r1
|
vld1.u8 {d17}, [r3], r1
|
||||||
vld1.u8 {d18}, [r0], r1
|
vld1.u8 {d18}, [r0], r1
|
||||||
vld1.u8 {d19}, [r12], r1
|
vld1.u8 {d19}, [r3], r1
|
||||||
vld1.u8 {d20}, [r0], r1
|
vld1.u8 {d20}, [r0], r1
|
||||||
vld1.u8 {d21}, [r12], r1
|
vld1.u8 {d21}, [r3], r1
|
||||||
|
|
||||||
;transpose to 8x16 matrix
|
;transpose to 8x16 matrix
|
||||||
vtrn.32 q3, q7
|
vtrn.32 q3, q7
|
||||||
@@ -263,11 +294,19 @@
|
|||||||
vtrn.8 q7, q8
|
vtrn.8 q7, q8
|
||||||
vtrn.8 q9, q10
|
vtrn.8 q9, q10
|
||||||
|
|
||||||
sub r0, r0, r1, lsl #3
|
sub sp, sp, #32
|
||||||
|
vld1.s8 {d4[], d5[]}, [r12] ; thresh
|
||||||
|
mov r12, sp
|
||||||
|
vst1.u8 {q3}, [r12]!
|
||||||
|
vst1.u8 {q10}, [r12]!
|
||||||
|
|
||||||
bl vp8_mbloop_filter_neon
|
bl vp8_mbloop_filter_neon
|
||||||
|
|
||||||
sub r12, r12, r1, lsl #3
|
sub r0, r0, r1, lsl #3
|
||||||
|
sub r3, r3, r1, lsl #3
|
||||||
|
|
||||||
|
vld1.u8 {q3}, [sp]!
|
||||||
|
vld1.u8 {q10}, [sp]!
|
||||||
|
|
||||||
;transpose to 16x8 matrix
|
;transpose to 16x8 matrix
|
||||||
vtrn.32 q3, q7
|
vtrn.32 q3, q7
|
||||||
@@ -287,23 +326,23 @@
|
|||||||
|
|
||||||
;store op2, op1, op0, oq0, oq1, oq2
|
;store op2, op1, op0, oq0, oq1, oq2
|
||||||
vst1.8 {d6}, [r0], r1
|
vst1.8 {d6}, [r0], r1
|
||||||
vst1.8 {d7}, [r12], r1
|
vst1.8 {d7}, [r3], r1
|
||||||
vst1.8 {d8}, [r0], r1
|
vst1.8 {d8}, [r0], r1
|
||||||
vst1.8 {d9}, [r12], r1
|
vst1.8 {d9}, [r3], r1
|
||||||
vst1.8 {d10}, [r0], r1
|
vst1.8 {d10}, [r0], r1
|
||||||
vst1.8 {d11}, [r12], r1
|
vst1.8 {d11}, [r3], r1
|
||||||
vst1.8 {d12}, [r0], r1
|
vst1.8 {d12}, [r0], r1
|
||||||
vst1.8 {d13}, [r12], r1
|
vst1.8 {d13}, [r3], r1
|
||||||
vst1.8 {d14}, [r0], r1
|
vst1.8 {d14}, [r0], r1
|
||||||
vst1.8 {d15}, [r12], r1
|
vst1.8 {d15}, [r3], r1
|
||||||
vst1.8 {d16}, [r0], r1
|
vst1.8 {d16}, [r0], r1
|
||||||
vst1.8 {d17}, [r12], r1
|
vst1.8 {d17}, [r3], r1
|
||||||
vst1.8 {d18}, [r0], r1
|
vst1.8 {d18}, [r0], r1
|
||||||
vst1.8 {d19}, [r12], r1
|
vst1.8 {d19}, [r3], r1
|
||||||
vst1.8 {d20}, [r0]
|
vst1.8 {d20}, [r0], r1
|
||||||
vst1.8 {d21}, [r12]
|
vst1.8 {d21}, [r3], r1
|
||||||
|
|
||||||
pop {pc}
|
ldmia sp!, {pc}
|
||||||
ENDP ; |vp8_mbloop_filter_vertical_edge_uv_neon|
|
ENDP ; |vp8_mbloop_filter_vertical_edge_uv_neon|
|
||||||
|
|
||||||
; void vp8_mbloop_filter_neon()
|
; void vp8_mbloop_filter_neon()
|
||||||
@@ -311,33 +350,41 @@
|
|||||||
; functions do the necessary load, transpose (if necessary), preserve (if
|
; functions do the necessary load, transpose (if necessary), preserve (if
|
||||||
; necessary) and store.
|
; necessary) and store.
|
||||||
|
|
||||||
; r0,r1 PRESERVE
|
; TODO:
|
||||||
; r2 mblimit
|
; The vertical filter writes p3/q3 back out because two 4 element writes are
|
||||||
; r3 limit
|
; much simpler than ordering and writing two 3 element sets (or three 2 element
|
||||||
|
; sets, or whichever other combinations are possible).
|
||||||
|
; If we can preserve q3 and q10, the vertical filter will be able to avoid
|
||||||
|
; storing those values on the stack and reading them back after the filter.
|
||||||
|
|
||||||
|
; r0,r1 PRESERVE
|
||||||
|
; r2 flimit
|
||||||
|
; r3 PRESERVE
|
||||||
|
; q1 limit
|
||||||
; q2 thresh
|
; q2 thresh
|
||||||
; q3 p3 PRESERVE
|
; q3 p3
|
||||||
; q4 p2
|
; q4 p2
|
||||||
; q5 p1
|
; q5 p1
|
||||||
; q6 p0
|
; q6 p0
|
||||||
; q7 q0
|
; q7 q0
|
||||||
; q8 q1
|
; q8 q1
|
||||||
; q9 q2
|
; q9 q2
|
||||||
; q10 q3 PRESERVE
|
; q10 q3
|
||||||
|
|
||||||
|vp8_mbloop_filter_neon| PROC
|
|vp8_mbloop_filter_neon| PROC
|
||||||
|
ldr r12, _mblf_coeff_
|
||||||
|
|
||||||
; vp8_filter_mask
|
; vp8_filter_mask
|
||||||
vabd.u8 q11, q3, q4 ; abs(p3 - p2)
|
vabd.u8 q11, q3, q4 ; abs(p3 - p2)
|
||||||
vabd.u8 q12, q4, q5 ; abs(p2 - p1)
|
vabd.u8 q12, q4, q5 ; abs(p2 - p1)
|
||||||
vabd.u8 q13, q5, q6 ; abs(p1 - p0)
|
vabd.u8 q13, q5, q6 ; abs(p1 - p0)
|
||||||
vabd.u8 q14, q8, q7 ; abs(q1 - q0)
|
vabd.u8 q14, q8, q7 ; abs(q1 - q0)
|
||||||
vabd.u8 q1, q9, q8 ; abs(q2 - q1)
|
vabd.u8 q3, q9, q8 ; abs(q2 - q1)
|
||||||
vabd.u8 q0, q10, q9 ; abs(q3 - q2)
|
vabd.u8 q0, q10, q9 ; abs(q3 - q2)
|
||||||
|
|
||||||
vmax.u8 q11, q11, q12
|
vmax.u8 q11, q11, q12
|
||||||
vmax.u8 q12, q13, q14
|
vmax.u8 q12, q13, q14
|
||||||
vmax.u8 q1, q1, q0
|
vmax.u8 q3, q3, q0
|
||||||
vmax.u8 q15, q11, q12
|
vmax.u8 q15, q11, q12
|
||||||
|
|
||||||
vabd.u8 q12, q6, q7 ; abs(p0 - q0)
|
vabd.u8 q12, q6, q7 ; abs(p0 - q0)
|
||||||
@@ -345,53 +392,51 @@
|
|||||||
; vp8_hevmask
|
; vp8_hevmask
|
||||||
vcgt.u8 q13, q13, q2 ; (abs(p1 - p0) > thresh) * -1
|
vcgt.u8 q13, q13, q2 ; (abs(p1 - p0) > thresh) * -1
|
||||||
vcgt.u8 q14, q14, q2 ; (abs(q1 - q0) > thresh) * -1
|
vcgt.u8 q14, q14, q2 ; (abs(q1 - q0) > thresh) * -1
|
||||||
vmax.u8 q15, q15, q1
|
vmax.u8 q15, q15, q3
|
||||||
|
|
||||||
vdup.u8 q1, r3 ; limit
|
vld1.s8 {d4[], d5[]}, [r2] ; flimit
|
||||||
vdup.u8 q2, r2 ; mblimit
|
|
||||||
|
|
||||||
vmov.u8 q0, #0x80 ; 0x80
|
vld1.u8 {q0}, [r12]!
|
||||||
|
|
||||||
|
vadd.u8 q2, q2, q2 ; flimit * 2
|
||||||
|
vadd.u8 q2, q2, q1 ; flimit * 2 + limit
|
||||||
vcge.u8 q15, q1, q15
|
vcge.u8 q15, q1, q15
|
||||||
|
|
||||||
vabd.u8 q1, q5, q8 ; a = abs(p1 - q1)
|
vabd.u8 q1, q5, q8 ; a = abs(p1 - q1)
|
||||||
vqadd.u8 q12, q12, q12 ; b = abs(p0 - q0) * 2
|
vqadd.u8 q12, q12, q12 ; b = abs(p0 - q0) * 2
|
||||||
vmov.u16 q11, #3 ; #3
|
vshr.u8 q1, q1, #1 ; a = a / 2
|
||||||
|
vqadd.u8 q12, q12, q1 ; a = b + a
|
||||||
|
vcge.u8 q12, q2, q12 ; (a <= flimit * 2 + limit) * -1
|
||||||
|
|
||||||
; vp8_filter
|
; vp8_filter
|
||||||
; convert to signed
|
; convert to signed
|
||||||
veor q7, q7, q0 ; qs0
|
veor q7, q7, q0 ; qs0
|
||||||
vshr.u8 q1, q1, #1 ; a = a / 2
|
|
||||||
veor q6, q6, q0 ; ps0
|
veor q6, q6, q0 ; ps0
|
||||||
veor q5, q5, q0 ; ps1
|
veor q5, q5, q0 ; ps1
|
||||||
|
|
||||||
vqadd.u8 q12, q12, q1 ; a = b + a
|
|
||||||
|
|
||||||
veor q8, q8, q0 ; qs1
|
veor q8, q8, q0 ; qs1
|
||||||
veor q4, q4, q0 ; ps2
|
veor q4, q4, q0 ; ps2
|
||||||
veor q9, q9, q0 ; qs2
|
veor q9, q9, q0 ; qs2
|
||||||
|
|
||||||
vorr q14, q13, q14 ; vp8_hevmask
|
vorr q14, q13, q14 ; vp8_hevmask
|
||||||
|
|
||||||
vcge.u8 q12, q2, q12 ; (a <= flimit * 2 + limit) * -1
|
|
||||||
|
|
||||||
vsubl.s8 q2, d14, d12 ; qs0 - ps0
|
vsubl.s8 q2, d14, d12 ; qs0 - ps0
|
||||||
vsubl.s8 q13, d15, d13
|
vsubl.s8 q13, d15, d13
|
||||||
|
|
||||||
vqsub.s8 q1, q5, q8 ; vp8_filter = clamp(ps1-qs1)
|
vqsub.s8 q1, q5, q8 ; vp8_filter = clamp(ps1-qs1)
|
||||||
|
|
||||||
vmul.i16 q2, q2, q11 ; 3 * ( qs0 - ps0)
|
vadd.s16 q10, q2, q2 ; 3 * (qs0 - ps0)
|
||||||
|
vadd.s16 q11, q13, q13
|
||||||
vand q15, q15, q12 ; vp8_filter_mask
|
vand q15, q15, q12 ; vp8_filter_mask
|
||||||
|
|
||||||
vmul.i16 q13, q13, q11
|
vadd.s16 q2, q2, q10
|
||||||
|
vadd.s16 q13, q13, q11
|
||||||
|
|
||||||
vmov.u8 q12, #3 ; #3
|
vld1.u8 {q12}, [r12]! ; #3
|
||||||
|
|
||||||
vaddw.s8 q2, q2, d2 ; vp8_filter + 3 * ( qs0 - ps0)
|
vaddw.s8 q2, q2, d2 ; vp8_filter + 3 * ( qs0 - ps0)
|
||||||
vaddw.s8 q13, q13, d3
|
vaddw.s8 q13, q13, d3
|
||||||
|
|
||||||
vmov.u8 q11, #4 ; #4
|
vld1.u8 {q11}, [r12]! ; #4
|
||||||
|
|
||||||
; vp8_filter = clamp(vp8_filter + 3 * ( qs0 - ps0))
|
; vp8_filter = clamp(vp8_filter + 3 * ( qs0 - ps0))
|
||||||
vqmovn.s16 d2, q2
|
vqmovn.s16 d2, q2
|
||||||
@@ -399,23 +444,27 @@
|
|||||||
|
|
||||||
vand q1, q1, q15 ; vp8_filter &= mask
|
vand q1, q1, q15 ; vp8_filter &= mask
|
||||||
|
|
||||||
vmov.u16 q15, #63 ; #63
|
vld1.u8 {q15}, [r12]! ; #63
|
||||||
|
;
|
||||||
vand q13, q1, q14 ; Filter2 &= hev
|
vand q13, q1, q14 ; Filter2 &= hev
|
||||||
|
|
||||||
|
vld1.u8 {d7}, [r12]! ; #9
|
||||||
|
|
||||||
vqadd.s8 q2, q13, q11 ; Filter1 = clamp(Filter2+4)
|
vqadd.s8 q2, q13, q11 ; Filter1 = clamp(Filter2+4)
|
||||||
vqadd.s8 q13, q13, q12 ; Filter2 = clamp(Filter2+3)
|
vqadd.s8 q13, q13, q12 ; Filter2 = clamp(Filter2+3)
|
||||||
|
|
||||||
vmov q0, q15
|
vld1.u8 {d6}, [r12]! ; #18
|
||||||
|
|
||||||
vshr.s8 q2, q2, #3 ; Filter1 >>= 3
|
vshr.s8 q2, q2, #3 ; Filter1 >>= 3
|
||||||
vshr.s8 q13, q13, #3 ; Filter2 >>= 3
|
vshr.s8 q13, q13, #3 ; Filter2 >>= 3
|
||||||
|
|
||||||
vmov q11, q15
|
vmov q10, q15
|
||||||
vmov q12, q15
|
vmov q12, q15
|
||||||
|
|
||||||
vqsub.s8 q7, q7, q2 ; qs0 = clamp(qs0 - Filter1)
|
vqsub.s8 q7, q7, q2 ; qs0 = clamp(qs0 - Filter1)
|
||||||
|
|
||||||
|
vld1.u8 {d5}, [r12]! ; #27
|
||||||
|
|
||||||
vqadd.s8 q6, q6, q13 ; ps0 = clamp(ps0 + Filter2)
|
vqadd.s8 q6, q6, q13 ; ps0 = clamp(ps0 + Filter2)
|
||||||
|
|
||||||
vbic q1, q1, q14 ; vp8_filter &= ~hev
|
vbic q1, q1, q14 ; vp8_filter &= ~hev
|
||||||
@@ -423,47 +472,49 @@
|
|||||||
; roughly 1/7th difference across boundary
|
; roughly 1/7th difference across boundary
|
||||||
; roughly 2/7th difference across boundary
|
; roughly 2/7th difference across boundary
|
||||||
; roughly 3/7th difference across boundary
|
; roughly 3/7th difference across boundary
|
||||||
|
vmov q11, q15
|
||||||
vmov.u8 d5, #9 ; #9
|
|
||||||
vmov.u8 d4, #18 ; #18
|
|
||||||
|
|
||||||
vmov q13, q15
|
vmov q13, q15
|
||||||
vmov q14, q15
|
vmov q14, q15
|
||||||
|
|
||||||
vmlal.s8 q0, d2, d5 ; 63 + Filter2 * 9
|
vmlal.s8 q10, d2, d7 ; Filter2 * 9
|
||||||
vmlal.s8 q11, d3, d5
|
vmlal.s8 q11, d3, d7
|
||||||
vmov.u8 d5, #27 ; #27
|
vmlal.s8 q12, d2, d6 ; Filter2 * 18
|
||||||
vmlal.s8 q12, d2, d4 ; 63 + Filter2 * 18
|
vmlal.s8 q13, d3, d6
|
||||||
vmlal.s8 q13, d3, d4
|
vmlal.s8 q14, d2, d5 ; Filter2 * 27
|
||||||
vmlal.s8 q14, d2, d5 ; 63 + Filter2 * 27
|
|
||||||
vmlal.s8 q15, d3, d5
|
vmlal.s8 q15, d3, d5
|
||||||
|
vqshrn.s16 d20, q10, #7 ; u = clamp((63 + Filter2 * 9)>>7)
|
||||||
vqshrn.s16 d0, q0, #7 ; u = clamp((63 + Filter2 * 9)>>7)
|
vqshrn.s16 d21, q11, #7
|
||||||
vqshrn.s16 d1, q11, #7
|
|
||||||
vqshrn.s16 d24, q12, #7 ; u = clamp((63 + Filter2 * 18)>>7)
|
vqshrn.s16 d24, q12, #7 ; u = clamp((63 + Filter2 * 18)>>7)
|
||||||
vqshrn.s16 d25, q13, #7
|
vqshrn.s16 d25, q13, #7
|
||||||
vqshrn.s16 d28, q14, #7 ; u = clamp((63 + Filter2 * 27)>>7)
|
vqshrn.s16 d28, q14, #7 ; u = clamp((63 + Filter2 * 27)>>7)
|
||||||
vqshrn.s16 d29, q15, #7
|
vqshrn.s16 d29, q15, #7
|
||||||
|
|
||||||
vmov.u8 q1, #0x80 ; 0x80
|
vqsub.s8 q11, q9, q10 ; s = clamp(qs2 - u)
|
||||||
|
vqadd.s8 q10, q4, q10 ; s = clamp(ps2 + u)
|
||||||
vqsub.s8 q11, q9, q0 ; s = clamp(qs2 - u)
|
|
||||||
vqadd.s8 q0, q4, q0 ; s = clamp(ps2 + u)
|
|
||||||
vqsub.s8 q13, q8, q12 ; s = clamp(qs1 - u)
|
vqsub.s8 q13, q8, q12 ; s = clamp(qs1 - u)
|
||||||
vqadd.s8 q12, q5, q12 ; s = clamp(ps1 + u)
|
vqadd.s8 q12, q5, q12 ; s = clamp(ps1 + u)
|
||||||
vqsub.s8 q15, q7, q14 ; s = clamp(qs0 - u)
|
vqsub.s8 q15, q7, q14 ; s = clamp(qs0 - u)
|
||||||
vqadd.s8 q14, q6, q14 ; s = clamp(ps0 + u)
|
vqadd.s8 q14, q6, q14 ; s = clamp(ps0 + u)
|
||||||
|
veor q9, q11, q0 ; *oq2 = s^0x80
|
||||||
veor q9, q11, q1 ; *oq2 = s^0x80
|
veor q4, q10, q0 ; *op2 = s^0x80
|
||||||
veor q4, q0, q1 ; *op2 = s^0x80
|
veor q8, q13, q0 ; *oq1 = s^0x80
|
||||||
veor q8, q13, q1 ; *oq1 = s^0x80
|
veor q5, q12, q0 ; *op1 = s^0x80
|
||||||
veor q5, q12, q1 ; *op1 = s^0x80
|
veor q7, q15, q0 ; *oq0 = s^0x80
|
||||||
veor q7, q15, q1 ; *oq0 = s^0x80
|
veor q6, q14, q0 ; *op0 = s^0x80
|
||||||
veor q6, q14, q1 ; *op0 = s^0x80
|
|
||||||
|
|
||||||
bx lr
|
bx lr
|
||||||
ENDP ; |vp8_mbloop_filter_neon|
|
ENDP ; |vp8_mbloop_filter_neon|
|
||||||
|
|
||||||
;-----------------
|
;-----------------
|
||||||
|
|
||||||
|
_mblf_coeff_
|
||||||
|
DCD mblf_coeff
|
||||||
|
mblf_coeff
|
||||||
|
DCD 0x80808080, 0x80808080, 0x80808080, 0x80808080
|
||||||
|
DCD 0x03030303, 0x03030303, 0x03030303, 0x03030303
|
||||||
|
DCD 0x04040404, 0x04040404, 0x04040404, 0x04040404
|
||||||
|
DCD 0x003f003f, 0x003f003f, 0x003f003f, 0x003f003f
|
||||||
|
DCD 0x09090909, 0x09090909, 0x12121212, 0x12121212
|
||||||
|
DCD 0x1b1b1b1b, 0x1b1b1b1b
|
||||||
|
|
||||||
END
|
END
|
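The `u = clamp((63 + Filter2 * N) >> 7)` comments in `vp8_mbloop_filter_neon`, together with the 9/18/27 constants in `mblf_coeff`, describe a per-pixel computation that is easier to follow in scalar form. The following is a minimal sketch of only that final tap step, not the project's reference implementation; the names `clamp_s8`, `mbfilter_tap` and `mbfilter_apply` are hypothetical, and the sketch assumes `>>` on a negative `int` is an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

/* Saturate to the signed 8-bit range, mirroring the vqadd/vqsub/vqshrn ops. */
static int8_t clamp_s8(int v) {
    if (v < -128) return -128;
    if (v > 127) return 127;
    return (int8_t)v;
}

/* u = clamp((63 + w * tap) >> 7), with tap one of 27, 18 or 9.
 * Assumption: right-shifting a negative int is arithmetic. */
static int8_t mbfilter_tap(int8_t w, int tap) {
    assert(tap == 9 || tap == 18 || tap == 27);
    return clamp_s8((63 + w * tap) >> 7);
}

/* p[0..2] = p0, p1, p2 and q[0..2] = q0, q1, q2 as unsigned pixels.
 * w is the masked filter value (vp8_filter after "&= ~hev"). */
void mbfilter_apply(uint8_t p[3], uint8_t q[3], int8_t w) {
    static const int taps[3] = {27, 18, 9};
    for (int i = 0; i < 3; i++) {
        int8_t u = mbfilter_tap(w, taps[i]);
        /* x - 128 is the signed reading of x ^ 0x80 */
        int ps = (int)p[i] - 128;
        int qs = (int)q[i] - 128;
        p[i] = (uint8_t)(clamp_s8(ps + u) + 128); /* s ^ 0x80 */
        q[i] = (uint8_t)(clamp_s8(qs - u) + 128);
    }
}
```

The weights shrink with distance from the edge, so p0/q0 move the most and p2/q2 the least.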
||||||
|
|||||||
@@ -31,7 +31,7 @@
|
|||||||
;result of the multiplication that is needed in IDCT.
|
;result of the multiplication that is needed in IDCT.
|
||||||
|
|
||||||
|vp8_short_idct4x4llm_neon| PROC
|
|vp8_short_idct4x4llm_neon| PROC
|
||||||
adr r12, idct_coeff
|
ldr r12, _idct_coeff_
|
||||||
vld1.16 {q1, q2}, [r0]
|
vld1.16 {q1, q2}, [r0]
|
||||||
vld1.16 {d0}, [r12]
|
vld1.16 {d0}, [r12]
|
||||||
|
|
||||||
@@ -114,6 +114,8 @@
|
|||||||
|
|
||||||
;-----------------
|
;-----------------
|
||||||
|
|
||||||
|
_idct_coeff_
|
||||||
|
DCD idct_coeff
|
||||||
idct_coeff
|
idct_coeff
|
||||||
DCD 0x4e7b4e7b, 0x8a8c8a8c
|
DCD 0x4e7b4e7b, 0x8a8c8a8c
|
||||||
|
|
||||||
|
|||||||
@@ -15,17 +15,6 @@
|
|||||||
PRESERVE8
|
PRESERVE8
|
||||||
|
|
||||||
AREA ||.text||, CODE, READONLY, ALIGN=2
|
AREA ||.text||, CODE, READONLY, ALIGN=2
|
||||||
|
|
||||||
filter16_coeff
|
|
||||||
DCD 0, 0, 128, 0, 0, 0, 0, 0
|
|
||||||
DCD 0, -6, 123, 12, -1, 0, 0, 0
|
|
||||||
DCD 2, -11, 108, 36, -8, 1, 0, 0
|
|
||||||
DCD 0, -9, 93, 50, -6, 0, 0, 0
|
|
||||||
DCD 3, -16, 77, 77, -16, 3, 0, 0
|
|
||||||
DCD 0, -6, 50, 93, -9, 0, 0, 0
|
|
||||||
DCD 1, -8, 36, 108, -11, 2, 0, 0
|
|
||||||
DCD 0, -1, 12, 123, -6, 0, 0, 0
|
|
||||||
|
|
||||||
; r0 unsigned char *src_ptr,
|
; r0 unsigned char *src_ptr,
|
||||||
; r1 int src_pixels_per_line,
|
; r1 int src_pixels_per_line,
|
||||||
; r2 int xoffset,
|
; r2 int xoffset,
|
||||||
@@ -44,7 +33,7 @@ filter16_coeff
|
|||||||
|vp8_sixtap_predict16x16_neon| PROC
|
|vp8_sixtap_predict16x16_neon| PROC
|
||||||
push {r4-r5, lr}
|
push {r4-r5, lr}
|
||||||
|
|
||||||
adr r12, filter16_coeff
|
ldr r12, _filter16_coeff_
|
||||||
ldr r4, [sp, #12] ;load parameters from stack
|
ldr r4, [sp, #12] ;load parameters from stack
|
||||||
ldr r5, [sp, #16] ;load parameters from stack
|
ldr r5, [sp, #16] ;load parameters from stack
|
||||||
|
|
||||||
@@ -487,4 +476,17 @@ secondpass_only_inner_loop_neon
|
|||||||
ENDP
|
ENDP
|
||||||
|
|
||||||
;-----------------
|
;-----------------
|
||||||
|
|
||||||
|
_filter16_coeff_
|
||||||
|
DCD filter16_coeff
|
||||||
|
filter16_coeff
|
||||||
|
DCD 0, 0, 128, 0, 0, 0, 0, 0
|
||||||
|
DCD 0, -6, 123, 12, -1, 0, 0, 0
|
||||||
|
DCD 2, -11, 108, 36, -8, 1, 0, 0
|
||||||
|
DCD 0, -9, 93, 50, -6, 0, 0, 0
|
||||||
|
DCD 3, -16, 77, 77, -16, 3, 0, 0
|
||||||
|
DCD 0, -6, 50, 93, -9, 0, 0, 0
|
||||||
|
DCD 1, -8, 36, 108, -11, 2, 0, 0
|
||||||
|
DCD 0, -1, 12, 123, -6, 0, 0, 0
|
||||||
|
|
||||||
END
|
END
|
||||||
|
|||||||
@@ -15,17 +15,6 @@
|
|||||||
PRESERVE8
|
PRESERVE8
|
||||||
|
|
||||||
AREA ||.text||, CODE, READONLY, ALIGN=2
|
AREA ||.text||, CODE, READONLY, ALIGN=2
|
||||||
|
|
||||||
filter4_coeff
|
|
||||||
DCD 0, 0, 128, 0, 0, 0, 0, 0
|
|
||||||
DCD 0, -6, 123, 12, -1, 0, 0, 0
|
|
||||||
DCD 2, -11, 108, 36, -8, 1, 0, 0
|
|
||||||
DCD 0, -9, 93, 50, -6, 0, 0, 0
|
|
||||||
DCD 3, -16, 77, 77, -16, 3, 0, 0
|
|
||||||
DCD 0, -6, 50, 93, -9, 0, 0, 0
|
|
||||||
DCD 1, -8, 36, 108, -11, 2, 0, 0
|
|
||||||
DCD 0, -1, 12, 123, -6, 0, 0, 0
|
|
||||||
|
|
||||||
; r0 unsigned char *src_ptr,
|
; r0 unsigned char *src_ptr,
|
||||||
; r1 int src_pixels_per_line,
|
; r1 int src_pixels_per_line,
|
||||||
; r2 int xoffset,
|
; r2 int xoffset,
|
||||||
@@ -36,7 +25,7 @@ filter4_coeff
|
|||||||
|vp8_sixtap_predict_neon| PROC
|
|vp8_sixtap_predict_neon| PROC
|
||||||
push {r4, lr}
|
push {r4, lr}
|
||||||
|
|
||||||
adr r12, filter4_coeff
|
ldr r12, _filter4_coeff_
|
||||||
ldr r4, [sp, #8] ;load parameters from stack
|
ldr r4, [sp, #8] ;load parameters from stack
|
||||||
ldr lr, [sp, #12] ;load parameters from stack
|
ldr lr, [sp, #12] ;load parameters from stack
|
||||||
|
|
||||||
@@ -419,4 +408,16 @@ secondpass_filter4x4_only
|
|||||||
|
|
||||||
;-----------------
|
;-----------------
|
||||||
|
|
||||||
|
_filter4_coeff_
|
||||||
|
DCD filter4_coeff
|
||||||
|
filter4_coeff
|
||||||
|
DCD 0, 0, 128, 0, 0, 0, 0, 0
|
||||||
|
DCD 0, -6, 123, 12, -1, 0, 0, 0
|
||||||
|
DCD 2, -11, 108, 36, -8, 1, 0, 0
|
||||||
|
DCD 0, -9, 93, 50, -6, 0, 0, 0
|
||||||
|
DCD 3, -16, 77, 77, -16, 3, 0, 0
|
||||||
|
DCD 0, -6, 50, 93, -9, 0, 0, 0
|
||||||
|
DCD 1, -8, 36, 108, -11, 2, 0, 0
|
||||||
|
DCD 0, -1, 12, 123, -6, 0, 0, 0
|
||||||
|
|
||||||
END
|
END
|
||||||
|
|||||||
@@ -15,17 +15,6 @@
|
|||||||
PRESERVE8
|
PRESERVE8
|
||||||
|
|
||||||
AREA ||.text||, CODE, READONLY, ALIGN=2
|
AREA ||.text||, CODE, READONLY, ALIGN=2
|
||||||
|
|
||||||
filter8_coeff
|
|
||||||
DCD 0, 0, 128, 0, 0, 0, 0, 0
|
|
||||||
DCD 0, -6, 123, 12, -1, 0, 0, 0
|
|
||||||
DCD 2, -11, 108, 36, -8, 1, 0, 0
|
|
||||||
DCD 0, -9, 93, 50, -6, 0, 0, 0
|
|
||||||
DCD 3, -16, 77, 77, -16, 3, 0, 0
|
|
||||||
DCD 0, -6, 50, 93, -9, 0, 0, 0
|
|
||||||
DCD 1, -8, 36, 108, -11, 2, 0, 0
|
|
||||||
DCD 0, -1, 12, 123, -6, 0, 0, 0
|
|
||||||
|
|
||||||
; r0 unsigned char *src_ptr,
|
; r0 unsigned char *src_ptr,
|
||||||
; r1 int src_pixels_per_line,
|
; r1 int src_pixels_per_line,
|
||||||
; r2 int xoffset,
|
; r2 int xoffset,
|
||||||
@@ -36,7 +25,7 @@ filter8_coeff
|
|||||||
|vp8_sixtap_predict8x4_neon| PROC
|
|vp8_sixtap_predict8x4_neon| PROC
|
||||||
push {r4-r5, lr}
|
push {r4-r5, lr}
|
||||||
|
|
||||||
adr r12, filter8_coeff
|
ldr r12, _filter8_coeff_
|
||||||
ldr r4, [sp, #12] ;load parameters from stack
|
ldr r4, [sp, #12] ;load parameters from stack
|
||||||
ldr r5, [sp, #16] ;load parameters from stack
|
ldr r5, [sp, #16] ;load parameters from stack
|
||||||
|
|
||||||
@@ -470,4 +459,16 @@ secondpass_filter8x4_only
|
|||||||
|
|
||||||
;-----------------
|
;-----------------
|
||||||
|
|
||||||
|
_filter8_coeff_
|
||||||
|
DCD filter8_coeff
|
||||||
|
filter8_coeff
|
||||||
|
DCD 0, 0, 128, 0, 0, 0, 0, 0
|
||||||
|
DCD 0, -6, 123, 12, -1, 0, 0, 0
|
||||||
|
DCD 2, -11, 108, 36, -8, 1, 0, 0
|
||||||
|
DCD 0, -9, 93, 50, -6, 0, 0, 0
|
||||||
|
DCD 3, -16, 77, 77, -16, 3, 0, 0
|
||||||
|
DCD 0, -6, 50, 93, -9, 0, 0, 0
|
||||||
|
DCD 1, -8, 36, 108, -11, 2, 0, 0
|
||||||
|
DCD 0, -1, 12, 123, -6, 0, 0, 0
|
||||||
|
|
||||||
END
|
END
|
||||||
|
|||||||
@@ -15,17 +15,6 @@
|
|||||||
PRESERVE8
|
PRESERVE8
|
||||||
|
|
||||||
AREA ||.text||, CODE, READONLY, ALIGN=2
|
AREA ||.text||, CODE, READONLY, ALIGN=2
|
||||||
|
|
||||||
filter8_coeff
|
|
||||||
DCD 0, 0, 128, 0, 0, 0, 0, 0
|
|
||||||
DCD 0, -6, 123, 12, -1, 0, 0, 0
|
|
||||||
DCD 2, -11, 108, 36, -8, 1, 0, 0
|
|
||||||
DCD 0, -9, 93, 50, -6, 0, 0, 0
|
|
||||||
DCD 3, -16, 77, 77, -16, 3, 0, 0
|
|
||||||
DCD 0, -6, 50, 93, -9, 0, 0, 0
|
|
||||||
DCD 1, -8, 36, 108, -11, 2, 0, 0
|
|
||||||
DCD 0, -1, 12, 123, -6, 0, 0, 0
|
|
||||||
|
|
||||||
; r0 unsigned char *src_ptr,
|
; r0 unsigned char *src_ptr,
|
||||||
; r1 int src_pixels_per_line,
|
; r1 int src_pixels_per_line,
|
||||||
; r2 int xoffset,
|
; r2 int xoffset,
|
||||||
@@ -36,7 +25,7 @@ filter8_coeff
|
|||||||
|vp8_sixtap_predict8x8_neon| PROC
|
|vp8_sixtap_predict8x8_neon| PROC
|
||||||
push {r4-r5, lr}
|
push {r4-r5, lr}
|
||||||
|
|
||||||
adr r12, filter8_coeff
|
ldr r12, _filter8_coeff_
|
||||||
|
|
||||||
ldr r4, [sp, #12] ;load parameters from stack
|
ldr r4, [sp, #12] ;load parameters from stack
|
||||||
ldr r5, [sp, #16] ;load parameters from stack
|
ldr r5, [sp, #16] ;load parameters from stack
|
||||||
@@ -521,4 +510,16 @@ filt_blk2d_spo8x8_loop_neon
|
|||||||
|
|
||||||
;-----------------
|
;-----------------
|
||||||
|
|
||||||
|
_filter8_coeff_
|
||||||
|
DCD filter8_coeff
|
||||||
|
filter8_coeff
|
||||||
|
DCD 0, 0, 128, 0, 0, 0, 0, 0
|
||||||
|
DCD 0, -6, 123, 12, -1, 0, 0, 0
|
||||||
|
DCD 2, -11, 108, 36, -8, 1, 0, 0
|
||||||
|
DCD 0, -9, 93, 50, -6, 0, 0, 0
|
||||||
|
DCD 3, -16, 77, 77, -16, 3, 0, 0
|
||||||
|
DCD 0, -6, 50, 93, -9, 0, 0, 0
|
||||||
|
DCD 1, -8, 36, 108, -11, 2, 0, 0
|
||||||
|
DCD 0, -1, 12, 123, -6, 0, 0, 0
|
||||||
|
|
||||||
END
|
END
|
||||||
|
|||||||
@@ -9,14 +9,27 @@
|
|||||||
*/
|
*/
|
||||||
|
|
||||||
|
|
||||||
#include "vpx_config.h"
|
#include "vpx_ports/config.h"
|
||||||
#include "vpx/vpx_codec.h"
|
#include <stddef.h>
|
||||||
#include "vpx_ports/asm_offsets.h"
|
|
||||||
#include "vpx_scale/yv12config.h"
|
#include "vpx_scale/yv12config.h"
|
||||||
|
|
||||||
BEGIN
|
#define ct_assert(name,cond) \
|
||||||
|
static void assert_##name(void) UNUSED;\
|
||||||
|
static void assert_##name(void) {switch(0){case 0:case !!(cond):;}}
|
||||||
|
|
||||||
/* vpx_scale */
|
#define DEFINE(sym, val) int sym = val;
|
||||||
|
|
||||||
|
/*
|
||||||
|
#define BLANK() asm volatile("\n->" : : )
|
||||||
|
*/
|
||||||
|
|
||||||
|
/*
|
||||||
|
* int main(void)
|
||||||
|
* {
|
||||||
|
*/
|
||||||
|
|
||||||
|
//vpx_scale
|
||||||
DEFINE(yv12_buffer_config_y_width, offsetof(YV12_BUFFER_CONFIG, y_width));
|
DEFINE(yv12_buffer_config_y_width, offsetof(YV12_BUFFER_CONFIG, y_width));
|
||||||
DEFINE(yv12_buffer_config_y_height, offsetof(YV12_BUFFER_CONFIG, y_height));
|
DEFINE(yv12_buffer_config_y_height, offsetof(YV12_BUFFER_CONFIG, y_height));
|
||||||
DEFINE(yv12_buffer_config_y_stride, offsetof(YV12_BUFFER_CONFIG, y_stride));
|
DEFINE(yv12_buffer_config_y_stride, offsetof(YV12_BUFFER_CONFIG, y_stride));
|
||||||
@@ -27,14 +40,10 @@ DEFINE(yv12_buffer_config_y_buffer, offsetof(YV12_BUFFER_CONFIG, y_b
|
|||||||
DEFINE(yv12_buffer_config_u_buffer, offsetof(YV12_BUFFER_CONFIG, u_buffer));
|
DEFINE(yv12_buffer_config_u_buffer, offsetof(YV12_BUFFER_CONFIG, u_buffer));
|
||||||
DEFINE(yv12_buffer_config_v_buffer, offsetof(YV12_BUFFER_CONFIG, v_buffer));
|
DEFINE(yv12_buffer_config_v_buffer, offsetof(YV12_BUFFER_CONFIG, v_buffer));
|
||||||
DEFINE(yv12_buffer_config_border, offsetof(YV12_BUFFER_CONFIG, border));
|
DEFINE(yv12_buffer_config_border, offsetof(YV12_BUFFER_CONFIG, border));
|
||||||
DEFINE(VP8BORDERINPIXELS_VAL, VP8BORDERINPIXELS);
|
|
||||||
|
|
||||||
END
|
//add asserts for any offset that is not supported by assembly code
|
||||||
|
//add asserts for any size that is not supported by assembly code
|
||||||
/* add asserts for any offset that is not supported by assembly code */
|
/*
|
||||||
/* add asserts for any size that is not supported by assembly code */
|
* return 0;
|
||||||
|
* }
|
||||||
#if HAVE_ARMV7
|
*/
|
||||||
/* vp8_yv12_extend_frame_borders_neon makes several assumptions based on this */
|
|
||||||
ct_assert(VP8BORDERINPIXELS_VAL, VP8BORDERINPIXELS == 32)
|
|
||||||
#endif
|
|
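The `ct_assert` macro in this file relies on a C rule: a `switch` may not contain two identical `case` labels. When `cond` is false, `!!(cond)` evaluates to 0 and collides with `case 0`, so compilation fails. A minimal sketch of how it can be used follows; the `UNUSED` definition, `struct demo_config` and `demo_height_offset_value` are assumptions for illustration and are not taken from the tree.

```c
#include <stddef.h>

/* Assumed stand-in for the project's UNUSED attribute macro. */
#if defined(__GNUC__)
#define UNUSED __attribute__((unused))
#else
#define UNUSED
#endif

/* Same shape as the macro in the diff: a false condition yields a
 * duplicate "case 0" label, which the compiler rejects. */
#define ct_assert(name, cond) \
    static void assert_##name(void) UNUSED; \
    static void assert_##name(void) { switch (0) { case 0: case !!(cond):; } }

/* Hypothetical structure standing in for YV12_BUFFER_CONFIG. */
struct demo_config {
    int y_width;
    int y_height;
};

/* Compiles only if y_height sits directly after y_width. */
ct_assert(demo_height_offset,
          offsetof(struct demo_config, y_height) == sizeof(int))

/* Exposes the checked offset at run time, for illustration only. */
int demo_height_offset_value(void) {
    return (int)offsetof(struct demo_config, y_height);
}
```

Changing the condition to something false, for example `== 0`, should turn the check into a compile error rather than a run-time failure.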
||||||
|
|||||||
@@ -137,11 +137,16 @@ typedef enum
|
|||||||
modes for the Y blocks to the left and above us; for interframes, there
|
modes for the Y blocks to the left and above us; for interframes, there
|
||||||
is a single probability table. */
|
is a single probability table. */
|
||||||
|
|
||||||
union b_mode_info
|
typedef struct
|
||||||
{
|
{
|
||||||
B_PREDICTION_MODE as_mode;
|
B_PREDICTION_MODE mode;
|
||||||
int_mv mv;
|
union
|
||||||
};
|
{
|
||||||
|
int as_int;
|
||||||
|
MV as_mv;
|
||||||
|
} mv;
|
||||||
|
} B_MODE_INFO;
|
||||||
|
|
||||||
|
|
||||||
typedef enum
|
typedef enum
|
||||||
{
|
{
|
||||||
@@ -156,26 +161,38 @@ typedef struct
|
|||||||
{
|
{
|
||||||
MB_PREDICTION_MODE mode, uv_mode;
|
MB_PREDICTION_MODE mode, uv_mode;
|
||||||
MV_REFERENCE_FRAME ref_frame;
|
MV_REFERENCE_FRAME ref_frame;
|
||||||
int_mv mv;
|
union
|
||||||
|
{
|
||||||
|
int as_int;
|
||||||
|
MV as_mv;
|
||||||
|
} mv;
|
||||||
|
|
||||||
unsigned char partitioning;
|
unsigned char partitioning;
|
||||||
unsigned char mb_skip_coeff; /* does this mb have coefficients at all, 1=no coefficients, 0=need decode tokens */
|
unsigned char mb_skip_coeff; /* does this mb have coefficients at all, 1=no coefficients, 0=need decode tokens */
|
||||||
|
unsigned char dc_diff;
|
||||||
unsigned char need_to_clamp_mvs;
|
unsigned char need_to_clamp_mvs;
|
||||||
|
|
||||||
unsigned char segment_id; /* Which set of segmentation parameters should be used for this MB */
|
unsigned char segment_id; /* Which set of segmentation parameters should be used for this MB */
|
||||||
|
|
||||||
|
unsigned char force_no_skip; /* encoder only */
|
||||||
} MB_MODE_INFO;
|
} MB_MODE_INFO;
|
||||||
|
|
||||||
|
|
||||||
typedef struct
|
typedef struct
|
||||||
{
|
{
|
||||||
MB_MODE_INFO mbmi;
|
MB_MODE_INFO mbmi;
|
||||||
union b_mode_info bmi[16];
|
B_MODE_INFO bmi[16];
|
||||||
} MODE_INFO;
|
} MODE_INFO;
|
||||||
|
|
||||||
|
|
||||||
typedef struct
|
typedef struct
|
||||||
{
|
{
|
||||||
short *qcoeff;
|
short *qcoeff;
|
||||||
short *dqcoeff;
|
short *dqcoeff;
|
||||||
unsigned char *predictor;
|
unsigned char *predictor;
|
||||||
short *diff;
|
short *diff;
|
||||||
|
short *reference;
|
||||||
|
|
||||||
short *dequant;
|
short *dequant;
|
||||||
|
|
||||||
/* 16 Y blocks, 4 U blocks, 4 V blocks each with 16 entries */
|
/* 16 Y blocks, 4 U blocks, 4 V blocks each with 16 entries */
|
||||||
@@ -189,13 +206,15 @@ typedef struct
|
|||||||
|
|
||||||
int eob;
|
int eob;
|
||||||
|
|
||||||
union b_mode_info bmi;
|
B_MODE_INFO bmi;
|
||||||
|
|
||||||
} BLOCKD;
|
} BLOCKD;
|
||||||
|
|
||||||
typedef struct MacroBlockD
|
typedef struct
|
||||||
{
|
{
|
||||||
DECLARE_ALIGNED(16, short, diff[400]); /* from idct diff */
|
DECLARE_ALIGNED(16, short, diff[400]); /* from idct diff */
|
||||||
DECLARE_ALIGNED(16, unsigned char, predictor[384]);
|
DECLARE_ALIGNED(16, unsigned char, predictor[384]);
|
||||||
|
/* not used DECLARE_ALIGNED(16, short, reference[384]); */
|
||||||
DECLARE_ALIGNED(16, short, qcoeff[400]);
|
DECLARE_ALIGNED(16, short, qcoeff[400]);
|
||||||
DECLARE_ALIGNED(16, short, dqcoeff[400]);
|
DECLARE_ALIGNED(16, short, dqcoeff[400]);
|
||||||
DECLARE_ALIGNED(16, char, eobs[25]);
|
DECLARE_ALIGNED(16, char, eobs[25]);
|
||||||
@@ -252,9 +271,6 @@ typedef struct MacroBlockD
|
|||||||
int mb_to_top_edge;
|
int mb_to_top_edge;
|
||||||
int mb_to_bottom_edge;
|
int mb_to_bottom_edge;
|
||||||
|
|
||||||
int ref_frame_cost[MAX_REF_FRAMES];
|
|
||||||
|
|
||||||
|
|
||||||
unsigned int frames_since_golden;
|
unsigned int frames_since_golden;
|
||||||
unsigned int frames_till_alt_ref_frame;
|
unsigned int frames_till_alt_ref_frame;
|
||||||
vp8_subpix_fn_t subpixel_predict;
|
vp8_subpix_fn_t subpixel_predict;
|
||||||
@@ -266,14 +282,6 @@ typedef struct MacroBlockD
|
|||||||
|
|
||||||
int corrupted;
|
int corrupted;
|
||||||
|
|
||||||
#if ARCH_X86 || ARCH_X86_64
|
|
||||||
/* This is an intermediate buffer currently used in sub-pixel motion search
|
|
||||||
* to keep a copy of the reference area. This buffer can be used for other
|
|
||||||
* purposes.
|
|
||||||
*/
|
|
||||||
DECLARE_ALIGNED(32, unsigned char, y_buf[22*32]);
|
|
||||||
#endif
|
|
||||||
|
|
||||||
#if CONFIG_RUNTIME_CPU_DETECT
|
#if CONFIG_RUNTIME_CPU_DETECT
|
||||||
struct VP8_COMMON_RTCD *rtcd;
|
struct VP8_COMMON_RTCD *rtcd;
|
||||||
#endif
|
#endif
|
||||||
@@ -283,20 +291,4 @@ typedef struct MacroBlockD
|
|||||||
extern void vp8_build_block_doffsets(MACROBLOCKD *x);
|
extern void vp8_build_block_doffsets(MACROBLOCKD *x);
|
||||||
extern void vp8_setup_block_dptrs(MACROBLOCKD *x);
|
extern void vp8_setup_block_dptrs(MACROBLOCKD *x);
|
||||||
|
|
||||||
static void update_blockd_bmi(MACROBLOCKD *xd)
|
|
||||||
{
|
|
||||||
int i;
|
|
||||||
int is_4x4;
|
|
||||||
is_4x4 = (xd->mode_info_context->mbmi.mode == SPLITMV) ||
|
|
||||||
(xd->mode_info_context->mbmi.mode == B_PRED);
|
|
||||||
|
|
||||||
if (is_4x4)
|
|
||||||
{
|
|
||||||
for (i = 0; i < 16; i++)
|
|
||||||
{
|
|
||||||
xd->block[i].bmi = xd->mode_info_context->bmi[i];
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
#endif /* __INC_BLOCKD_H */
|
#endif /* __INC_BLOCKD_H */
|
||||||
|
|||||||
@@ -12,7 +12,7 @@
|
|||||||
/* Update probabilities for the nodes in the token entropy tree.
|
/* Update probabilities for the nodes in the token entropy tree.
|
||||||
Generated file included by entropy.c */
|
Generated file included by entropy.c */
|
||||||
|
|
||||||
const vp8_prob vp8_coef_update_probs [BLOCK_TYPES] [COEF_BANDS] [PREV_COEF_CONTEXTS] [ENTROPY_NODES] =
|
const vp8_prob vp8_coef_update_probs [BLOCK_TYPES] [COEF_BANDS] [PREV_COEF_CONTEXTS] [vp8_coef_tokens-1] =
|
||||||
{
|
{
|
||||||
{
|
{
|
||||||
{
|
{
|
||||||
|
|||||||
@@ -97,7 +97,7 @@ void vp8_print_modes_and_motion_vectors(MODE_INFO *mi, int rows, int cols, int f
|
|||||||
bindex = (b_row & 3) * 4 + (b_col & 3);
|
bindex = (b_row & 3) * 4 + (b_col & 3);
|
||||||
|
|
||||||
if (mi[mb_index].mbmi.mode == B_PRED)
|
if (mi[mb_index].mbmi.mode == B_PRED)
|
||||||
fprintf(mvs, "%2d ", mi[mb_index].bmi[bindex].as_mode);
|
fprintf(mvs, "%2d ", mi[mb_index].bmi[bindex].mode);
|
||||||
else
|
else
|
||||||
fprintf(mvs, "xx ");
|
fprintf(mvs, "xx ");
|
||||||
|
|
||||||
|
|||||||
@@ -1,225 +0,0 @@
|
|||||||
/*
|
|
||||||
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
|
|
||||||
*
|
|
||||||
* Use of this source code is governed by a BSD-style license
|
|
||||||
* that can be found in the LICENSE file in the root of the source
|
|
||||||
* tree. An additional intellectual property rights grant can be found
|
|
||||||
* in the file PATENTS. All contributing project authors may
|
|
||||||
* be found in the AUTHORS file in the root of the source tree.
|
|
||||||
*/
|
|
||||||
|
|
||||||
#include "defaultcoefcounts.h"
|
|
||||||
|
|
||||||
/* Generated file, included by entropy.c */
|
|
||||||
|
|
||||||
const unsigned int vp8_default_coef_counts[BLOCK_TYPES]
|
|
||||||
[COEF_BANDS]
|
|
||||||
[PREV_COEF_CONTEXTS]
|
|
||||||
[MAX_ENTROPY_TOKENS] =
|
|
||||||
{
|
|
||||||
|
|
||||||
{
|
|
||||||
/* Block Type ( 0 ) */
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 0 ) */
|
|
||||||
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 1 ) */
|
|
||||||
{30190, 26544, 225, 24, 4, 0, 0, 0, 0, 0, 0, 4171593,},
|
|
||||||
{26846, 25157, 1241, 130, 26, 6, 1, 0, 0, 0, 0, 149987,},
|
|
||||||
{10484, 9538, 1006, 160, 36, 18, 0, 0, 0, 0, 0, 15104,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 2 ) */
|
|
||||||
{25842, 40456, 1126, 83, 11, 2, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{9338, 8010, 512, 73, 7, 3, 2, 0, 0, 0, 0, 43294,},
|
|
||||||
{1047, 751, 149, 31, 13, 6, 1, 0, 0, 0, 0, 879,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 3 ) */
|
|
||||||
{26136, 9826, 252, 13, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{8134, 5574, 191, 14, 2, 0, 0, 0, 0, 0, 0, 35302,},
|
|
||||||
{ 605, 677, 116, 9, 1, 0, 0, 0, 0, 0, 0, 611,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 4 ) */
|
|
||||||
{10263, 15463, 283, 17, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{2773, 2191, 128, 9, 2, 2, 0, 0, 0, 0, 0, 10073,},
|
|
||||||
{ 134, 125, 32, 4, 0, 2, 0, 0, 0, 0, 0, 50,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 5 ) */
|
|
||||||
{10483, 2663, 23, 1, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{2137, 1251, 27, 1, 1, 0, 0, 0, 0, 0, 0, 14362,},
|
|
||||||
{ 116, 156, 14, 2, 1, 0, 0, 0, 0, 0, 0, 190,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 6 ) */
|
|
||||||
{40977, 27614, 412, 28, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{6113, 5213, 261, 22, 3, 0, 0, 0, 0, 0, 0, 26164,},
|
|
||||||
{ 382, 312, 50, 14, 2, 0, 0, 0, 0, 0, 0, 345,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 7 ) */
|
|
||||||
{ 0, 26, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{ 0, 13, 0, 0, 0, 0, 0, 0, 0, 0, 0, 319,},
|
|
||||||
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 8,},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Block Type ( 1 ) */
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 0 ) */
|
|
||||||
{3268, 19382, 1043, 250, 93, 82, 49, 26, 17, 8, 25, 82289,},
|
|
||||||
{8758, 32110, 5436, 1832, 827, 668, 420, 153, 24, 0, 3, 52914,},
|
|
||||||
{9337, 23725, 8487, 3954, 2107, 1836, 1069, 399, 59, 0, 0, 18620,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 1 ) */
|
|
||||||
{12419, 8420, 452, 62, 9, 1, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{11715, 8705, 693, 92, 15, 7, 2, 0, 0, 0, 0, 53988,},
|
|
||||||
{7603, 8585, 2306, 778, 270, 145, 39, 5, 0, 0, 0, 9136,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 2 ) */
|
|
||||||
{15938, 14335, 1207, 184, 55, 13, 4, 1, 0, 0, 0, 0,},
|
|
||||||
{7415, 6829, 1138, 244, 71, 26, 7, 0, 0, 0, 0, 9980,},
|
|
||||||
{1580, 1824, 655, 241, 89, 46, 10, 2, 0, 0, 0, 429,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 3 ) */
|
|
||||||
{19453, 5260, 201, 19, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{9173, 3758, 213, 22, 1, 1, 0, 0, 0, 0, 0, 9820,},
|
|
||||||
{1689, 1277, 276, 51, 17, 4, 0, 0, 0, 0, 0, 679,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 4 ) */
|
|
||||||
{12076, 10667, 620, 85, 19, 9, 5, 0, 0, 0, 0, 0,},
|
|
||||||
{4665, 3625, 423, 55, 19, 9, 0, 0, 0, 0, 0, 5127,},
|
|
||||||
{ 415, 440, 143, 34, 20, 7, 2, 0, 0, 0, 0, 101,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 5 ) */
|
|
||||||
{12183, 4846, 115, 11, 1, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{4226, 3149, 177, 21, 2, 0, 0, 0, 0, 0, 0, 7157,},
|
|
||||||
{ 375, 621, 189, 51, 11, 4, 1, 0, 0, 0, 0, 198,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 6 ) */
|
|
||||||
{61658, 37743, 1203, 94, 10, 3, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{15514, 11563, 903, 111, 14, 5, 0, 0, 0, 0, 0, 25195,},
|
|
||||||
{ 929, 1077, 291, 78, 14, 7, 1, 0, 0, 0, 0, 507,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 7 ) */
|
|
||||||
{ 0, 990, 15, 3, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{ 0, 412, 13, 0, 0, 0, 0, 0, 0, 0, 0, 1641,},
|
|
||||||
{ 0, 18, 7, 1, 0, 0, 0, 0, 0, 0, 0, 30,},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Block Type ( 2 ) */
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 0 ) */
|
|
||||||
{ 953, 24519, 628, 120, 28, 12, 4, 0, 0, 0, 0, 2248798,},
|
|
||||||
{1525, 25654, 2647, 617, 239, 143, 42, 5, 0, 0, 0, 66837,},
|
|
||||||
{1180, 11011, 3001, 1237, 532, 448, 239, 54, 5, 0, 0, 7122,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 1 ) */
|
|
||||||
{1356, 2220, 67, 10, 4, 1, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{1450, 2544, 102, 18, 4, 3, 0, 0, 0, 0, 0, 57063,},
|
|
||||||
{1182, 2110, 470, 130, 41, 21, 0, 0, 0, 0, 0, 6047,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 2 ) */
|
|
||||||
{ 370, 3378, 200, 30, 5, 4, 1, 0, 0, 0, 0, 0,},
|
|
||||||
{ 293, 1006, 131, 29, 11, 0, 0, 0, 0, 0, 0, 5404,},
|
|
||||||
{ 114, 387, 98, 23, 4, 8, 1, 0, 0, 0, 0, 236,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 3 ) */
|
|
||||||
{ 579, 194, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{ 395, 213, 5, 1, 0, 0, 0, 0, 0, 0, 0, 4157,},
|
|
||||||
{ 119, 122, 4, 0, 0, 0, 0, 0, 0, 0, 0, 300,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 4 ) */
|
|
||||||
{ 38, 557, 19, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{ 21, 114, 12, 1, 0, 0, 0, 0, 0, 0, 0, 427,},
|
|
||||||
{ 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 5 ) */
|
|
||||||
{ 52, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{ 18, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 652,},
|
|
||||||
{ 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 30,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 6 ) */
|
|
||||||
{ 640, 569, 10, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{ 25, 77, 2, 0, 0, 0, 0, 0, 0, 0, 0, 517,},
|
|
||||||
{ 4, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 7 ) */
|
|
||||||
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Block Type ( 3 ) */
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 0 ) */
|
|
||||||
{2506, 20161, 2707, 767, 261, 178, 107, 30, 14, 3, 0, 100694,},
|
|
||||||
{8806, 36478, 8817, 3268, 1280, 850, 401, 114, 42, 0, 0, 58572,},
|
|
||||||
{11003, 27214, 11798, 5716, 2482, 2072, 1048, 175, 32, 0, 0, 19284,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 1 ) */
|
|
||||||
{9738, 11313, 959, 205, 70, 18, 11, 1, 0, 0, 0, 0,},
|
|
||||||
{12628, 15085, 1507, 273, 52, 19, 9, 0, 0, 0, 0, 54280,},
|
|
||||||
{10701, 15846, 5561, 1926, 813, 570, 249, 36, 0, 0, 0, 6460,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 2 ) */
|
|
||||||
{6781, 22539, 2784, 634, 182, 123, 20, 4, 0, 0, 0, 0,},
|
|
||||||
{6263, 11544, 2649, 790, 259, 168, 27, 5, 0, 0, 0, 20539,},
|
|
||||||
{3109, 4075, 2031, 896, 457, 386, 158, 29, 0, 0, 0, 1138,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 3 ) */
|
|
||||||
{11515, 4079, 465, 73, 5, 14, 2, 0, 0, 0, 0, 0,},
|
|
||||||
{9361, 5834, 650, 96, 24, 8, 4, 0, 0, 0, 0, 22181,},
|
|
||||||
{4343, 3974, 1360, 415, 132, 96, 14, 1, 0, 0, 0, 1267,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 4 ) */
|
|
||||||
{4787, 9297, 823, 168, 44, 12, 4, 0, 0, 0, 0, 0,},
|
|
||||||
{3619, 4472, 719, 198, 60, 31, 3, 0, 0, 0, 0, 8401,},
|
|
||||||
{1157, 1175, 483, 182, 88, 31, 8, 0, 0, 0, 0, 268,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 5 ) */
|
|
||||||
{8299, 1226, 32, 5, 1, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{3502, 1568, 57, 4, 1, 1, 0, 0, 0, 0, 0, 9811,},
|
|
||||||
{1055, 1070, 166, 29, 6, 1, 0, 0, 0, 0, 0, 527,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 6 ) */
|
|
||||||
{27414, 27927, 1989, 347, 69, 26, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{5876, 10074, 1574, 341, 91, 24, 4, 0, 0, 0, 0, 21954,},
|
|
||||||
{1571, 2171, 778, 324, 124, 65, 16, 0, 0, 0, 0, 979,},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
/* Coeff Band ( 7 ) */
|
|
||||||
{ 0, 29, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
|
||||||
{ 0, 23, 0, 0, 0, 0, 0, 0, 0, 0, 0, 459,},
|
|
||||||
{ 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 13,},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
};
|
|
||||||
@@ -8,14 +8,214 @@
|
|||||||
* be found in the AUTHORS file in the root of the source tree.
|
* be found in the AUTHORS file in the root of the source tree.
|
||||||
*/
|
*/
|
||||||
|
|
||||||
#ifndef __DEFAULTCOEFCOUNTS_H
|
|
||||||
#define __DEFAULTCOEFCOUNTS_H
|
|
||||||
|
|
||||||
#include "entropy.h"
|
/* Generated file, included by entropy.c */
|
||||||
|
|
||||||
extern const unsigned int vp8_default_coef_counts[BLOCK_TYPES]
|
static const unsigned int default_coef_counts [BLOCK_TYPES] [COEF_BANDS] [PREV_COEF_CONTEXTS] [vp8_coef_tokens] =
|
||||||
[COEF_BANDS]
|
{
|
||||||
[PREV_COEF_CONTEXTS]
|
|
||||||
[MAX_ENTROPY_TOKENS];
|
|
||||||
|
|
||||||
#endif //__DEFAULTCOEFCOUNTS_H
|
{
|
||||||
|
/* Block Type ( 0 ) */
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 0 ) */
|
||||||
|
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 1 ) */
|
||||||
|
{30190, 26544, 225, 24, 4, 0, 0, 0, 0, 0, 0, 4171593,},
|
||||||
|
{26846, 25157, 1241, 130, 26, 6, 1, 0, 0, 0, 0, 149987,},
|
||||||
|
{10484, 9538, 1006, 160, 36, 18, 0, 0, 0, 0, 0, 15104,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 2 ) */
|
||||||
|
{25842, 40456, 1126, 83, 11, 2, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{9338, 8010, 512, 73, 7, 3, 2, 0, 0, 0, 0, 43294,},
|
||||||
|
{1047, 751, 149, 31, 13, 6, 1, 0, 0, 0, 0, 879,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 3 ) */
|
||||||
|
{26136, 9826, 252, 13, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{8134, 5574, 191, 14, 2, 0, 0, 0, 0, 0, 0, 35302,},
|
||||||
|
{ 605, 677, 116, 9, 1, 0, 0, 0, 0, 0, 0, 611,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 4 ) */
|
||||||
|
{10263, 15463, 283, 17, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{2773, 2191, 128, 9, 2, 2, 0, 0, 0, 0, 0, 10073,},
|
||||||
|
{ 134, 125, 32, 4, 0, 2, 0, 0, 0, 0, 0, 50,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 5 ) */
|
||||||
|
{10483, 2663, 23, 1, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{2137, 1251, 27, 1, 1, 0, 0, 0, 0, 0, 0, 14362,},
|
||||||
|
{ 116, 156, 14, 2, 1, 0, 0, 0, 0, 0, 0, 190,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 6 ) */
|
||||||
|
{40977, 27614, 412, 28, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{6113, 5213, 261, 22, 3, 0, 0, 0, 0, 0, 0, 26164,},
|
||||||
|
{ 382, 312, 50, 14, 2, 0, 0, 0, 0, 0, 0, 345,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 7 ) */
|
||||||
|
{ 0, 26, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{ 0, 13, 0, 0, 0, 0, 0, 0, 0, 0, 0, 319,},
|
||||||
|
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 8,},
|
||||||
|
},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Block Type ( 1 ) */
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 0 ) */
|
||||||
|
{3268, 19382, 1043, 250, 93, 82, 49, 26, 17, 8, 25, 82289,},
|
||||||
|
{8758, 32110, 5436, 1832, 827, 668, 420, 153, 24, 0, 3, 52914,},
|
||||||
|
{9337, 23725, 8487, 3954, 2107, 1836, 1069, 399, 59, 0, 0, 18620,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 1 ) */
|
||||||
|
{12419, 8420, 452, 62, 9, 1, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{11715, 8705, 693, 92, 15, 7, 2, 0, 0, 0, 0, 53988,},
|
||||||
|
{7603, 8585, 2306, 778, 270, 145, 39, 5, 0, 0, 0, 9136,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 2 ) */
|
||||||
|
{15938, 14335, 1207, 184, 55, 13, 4, 1, 0, 0, 0, 0,},
|
||||||
|
{7415, 6829, 1138, 244, 71, 26, 7, 0, 0, 0, 0, 9980,},
|
||||||
|
{1580, 1824, 655, 241, 89, 46, 10, 2, 0, 0, 0, 429,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 3 ) */
|
||||||
|
{19453, 5260, 201, 19, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{9173, 3758, 213, 22, 1, 1, 0, 0, 0, 0, 0, 9820,},
|
||||||
|
{1689, 1277, 276, 51, 17, 4, 0, 0, 0, 0, 0, 679,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 4 ) */
|
||||||
|
{12076, 10667, 620, 85, 19, 9, 5, 0, 0, 0, 0, 0,},
|
||||||
|
{4665, 3625, 423, 55, 19, 9, 0, 0, 0, 0, 0, 5127,},
|
||||||
|
{ 415, 440, 143, 34, 20, 7, 2, 0, 0, 0, 0, 101,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 5 ) */
|
||||||
|
{12183, 4846, 115, 11, 1, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{4226, 3149, 177, 21, 2, 0, 0, 0, 0, 0, 0, 7157,},
|
||||||
|
{ 375, 621, 189, 51, 11, 4, 1, 0, 0, 0, 0, 198,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 6 ) */
|
||||||
|
{61658, 37743, 1203, 94, 10, 3, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{15514, 11563, 903, 111, 14, 5, 0, 0, 0, 0, 0, 25195,},
|
||||||
|
{ 929, 1077, 291, 78, 14, 7, 1, 0, 0, 0, 0, 507,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 7 ) */
|
||||||
|
{ 0, 990, 15, 3, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{ 0, 412, 13, 0, 0, 0, 0, 0, 0, 0, 0, 1641,},
|
||||||
|
{ 0, 18, 7, 1, 0, 0, 0, 0, 0, 0, 0, 30,},
|
||||||
|
},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Block Type ( 2 ) */
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 0 ) */
|
||||||
|
{ 953, 24519, 628, 120, 28, 12, 4, 0, 0, 0, 0, 2248798,},
|
||||||
|
{1525, 25654, 2647, 617, 239, 143, 42, 5, 0, 0, 0, 66837,},
|
||||||
|
{1180, 11011, 3001, 1237, 532, 448, 239, 54, 5, 0, 0, 7122,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 1 ) */
|
||||||
|
{1356, 2220, 67, 10, 4, 1, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{1450, 2544, 102, 18, 4, 3, 0, 0, 0, 0, 0, 57063,},
|
||||||
|
{1182, 2110, 470, 130, 41, 21, 0, 0, 0, 0, 0, 6047,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 2 ) */
|
||||||
|
{ 370, 3378, 200, 30, 5, 4, 1, 0, 0, 0, 0, 0,},
|
||||||
|
{ 293, 1006, 131, 29, 11, 0, 0, 0, 0, 0, 0, 5404,},
|
||||||
|
{ 114, 387, 98, 23, 4, 8, 1, 0, 0, 0, 0, 236,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 3 ) */
|
||||||
|
{ 579, 194, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{ 395, 213, 5, 1, 0, 0, 0, 0, 0, 0, 0, 4157,},
|
||||||
|
{ 119, 122, 4, 0, 0, 0, 0, 0, 0, 0, 0, 300,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 4 ) */
|
||||||
|
{ 38, 557, 19, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{ 21, 114, 12, 1, 0, 0, 0, 0, 0, 0, 0, 427,},
|
||||||
|
{ 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 5 ) */
|
||||||
|
{ 52, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{ 18, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 652,},
|
||||||
|
{ 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 30,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 6 ) */
|
||||||
|
{ 640, 569, 10, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{ 25, 77, 2, 0, 0, 0, 0, 0, 0, 0, 0, 517,},
|
||||||
|
{ 4, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 7 ) */
|
||||||
|
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Block Type ( 3 ) */
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 0 ) */
|
||||||
|
{2506, 20161, 2707, 767, 261, 178, 107, 30, 14, 3, 0, 100694,},
|
||||||
|
{8806, 36478, 8817, 3268, 1280, 850, 401, 114, 42, 0, 0, 58572,},
|
||||||
|
{11003, 27214, 11798, 5716, 2482, 2072, 1048, 175, 32, 0, 0, 19284,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 1 ) */
|
||||||
|
{9738, 11313, 959, 205, 70, 18, 11, 1, 0, 0, 0, 0,},
|
||||||
|
{12628, 15085, 1507, 273, 52, 19, 9, 0, 0, 0, 0, 54280,},
|
||||||
|
{10701, 15846, 5561, 1926, 813, 570, 249, 36, 0, 0, 0, 6460,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 2 ) */
|
||||||
|
{6781, 22539, 2784, 634, 182, 123, 20, 4, 0, 0, 0, 0,},
|
||||||
|
{6263, 11544, 2649, 790, 259, 168, 27, 5, 0, 0, 0, 20539,},
|
||||||
|
{3109, 4075, 2031, 896, 457, 386, 158, 29, 0, 0, 0, 1138,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 3 ) */
|
||||||
|
{11515, 4079, 465, 73, 5, 14, 2, 0, 0, 0, 0, 0,},
|
||||||
|
{9361, 5834, 650, 96, 24, 8, 4, 0, 0, 0, 0, 22181,},
|
||||||
|
{4343, 3974, 1360, 415, 132, 96, 14, 1, 0, 0, 0, 1267,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 4 ) */
|
||||||
|
{4787, 9297, 823, 168, 44, 12, 4, 0, 0, 0, 0, 0,},
|
||||||
|
{3619, 4472, 719, 198, 60, 31, 3, 0, 0, 0, 0, 8401,},
|
||||||
|
{1157, 1175, 483, 182, 88, 31, 8, 0, 0, 0, 0, 268,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 5 ) */
|
||||||
|
{8299, 1226, 32, 5, 1, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{3502, 1568, 57, 4, 1, 1, 0, 0, 0, 0, 0, 9811,},
|
||||||
|
{1055, 1070, 166, 29, 6, 1, 0, 0, 0, 0, 0, 527,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 6 ) */
|
||||||
|
{27414, 27927, 1989, 347, 69, 26, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{5876, 10074, 1574, 341, 91, 24, 4, 0, 0, 0, 0, 21954,},
|
||||||
|
{1571, 2171, 778, 324, 124, 65, 16, 0, 0, 0, 0, 979,},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
/* Coeff Band ( 7 ) */
|
||||||
|
{ 0, 29, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
|
||||||
|
{ 0, 23, 0, 0, 0, 0, 0, 0, 0, 0, 0, 459,},
|
||||||
|
{ 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 13,},
|
||||||
|
},
|
||||||
|
},
|
||||||
|
};
|
||||||
|
|||||||
@@ -26,32 +26,8 @@ typedef vp8_prob Prob;
|
|||||||
|
|
||||||
#include "coefupdateprobs.h"
|
#include "coefupdateprobs.h"
|
||||||
|
|
||||||
DECLARE_ALIGNED(16, const unsigned char, vp8_norm[256]) =
|
DECLARE_ALIGNED(16, cuchar, vp8_coef_bands[16]) = { 0, 1, 2, 3, 6, 4, 5, 6, 6, 6, 6, 6, 6, 6, 6, 7};
|
||||||
{
|
DECLARE_ALIGNED(16, cuchar, vp8_prev_token_class[MAX_ENTROPY_TOKENS]) = { 0, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0};
|
||||||
0, 7, 6, 6, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4,
|
|
||||||
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
|
|
||||||
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
|
|
||||||
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
|
|
||||||
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
|
|
||||||
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
|
|
||||||
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
|
|
||||||
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
|
|
||||||
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
|
|
||||||
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
|
|
||||||
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
|
|
||||||
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
|
|
||||||
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
|
|
||||||
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
|
|
||||||
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
|
|
||||||
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
|
|
||||||
};
|
|
||||||
|
|
||||||
DECLARE_ALIGNED(16, cuchar, vp8_coef_bands[16]) =
|
|
||||||
{ 0, 1, 2, 3, 6, 4, 5, 6, 6, 6, 6, 6, 6, 6, 6, 7};
|
|
||||||
|
|
||||||
DECLARE_ALIGNED(16, cuchar, vp8_prev_token_class[MAX_ENTROPY_TOKENS]) =
|
|
||||||
{ 0, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0};
|
|
||||||
|
|
||||||
DECLARE_ALIGNED(16, const int, vp8_default_zig_zag1d[16]) =
|
DECLARE_ALIGNED(16, const int, vp8_default_zig_zag1d[16]) =
|
||||||
{
|
{
|
||||||
0, 1, 4, 8,
|
0, 1, 4, 8,
|
||||||
@@ -89,7 +65,7 @@ const vp8_tree_index vp8_coef_tree[ 22] = /* corresponding _CONTEXT_NODEs */
|
|||||||
-DCT_VAL_CATEGORY5, -DCT_VAL_CATEGORY6 /* 10 = CAT_FIVE */
|
-DCT_VAL_CATEGORY5, -DCT_VAL_CATEGORY6 /* 10 = CAT_FIVE */
|
||||||
};
|
};
|
||||||
|
|
||||||
struct vp8_token_struct vp8_coef_encodings[MAX_ENTROPY_TOKENS];
|
struct vp8_token_struct vp8_coef_encodings[vp8_coef_tokens];
|
||||||
|
|
||||||
/* Trees for extra bits. Probabilities are constant and
|
/* Trees for extra bits. Probabilities are constant and
|
||||||
do not depend on previously encoded bits */
|
do not depend on previously encoded bits */
|
||||||
@@ -169,12 +145,10 @@ void vp8_default_coef_probs(VP8_COMMON *pc)
|
|||||||
|
|
||||||
do
|
do
|
||||||
{
|
{
|
||||||
unsigned int branch_ct [ENTROPY_NODES] [2];
|
unsigned int branch_ct [vp8_coef_tokens-1] [2];
|
||||||
vp8_tree_probs_from_distribution(
|
vp8_tree_probs_from_distribution(
|
||||||
MAX_ENTROPY_TOKENS, vp8_coef_encodings, vp8_coef_tree,
|
vp8_coef_tokens, vp8_coef_encodings, vp8_coef_tree,
|
||||||
pc->fc.coef_probs[h][i][k],
|
pc->fc.coef_probs [h][i][k], branch_ct, default_coef_counts [h][i][k],
|
||||||
branch_ct,
|
|
||||||
vp8_default_coef_counts[h][i][k],
|
|
||||||
256, 1);
|
256, 1);
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -30,12 +30,13 @@
|
|||||||
#define DCT_VAL_CATEGORY6 10 /* 67+ Extra Bits 11+1 */
|
#define DCT_VAL_CATEGORY6 10 /* 67+ Extra Bits 11+1 */
|
||||||
#define DCT_EOB_TOKEN 11 /* EOB Extra Bits 0+0 */
|
#define DCT_EOB_TOKEN 11 /* EOB Extra Bits 0+0 */
|
||||||
|
|
||||||
#define MAX_ENTROPY_TOKENS 12
|
#define vp8_coef_tokens 12
|
||||||
|
#define MAX_ENTROPY_TOKENS vp8_coef_tokens
|
||||||
#define ENTROPY_NODES 11
|
#define ENTROPY_NODES 11
|
||||||
|
|
||||||
extern const vp8_tree_index vp8_coef_tree[];
|
extern const vp8_tree_index vp8_coef_tree[];
|
||||||
|
|
||||||
extern struct vp8_token_struct vp8_coef_encodings[MAX_ENTROPY_TOKENS];
|
extern struct vp8_token_struct vp8_coef_encodings[vp8_coef_tokens];
|
||||||
|
|
||||||
typedef struct
|
typedef struct
|
||||||
{
|
{
|
||||||
@@ -84,9 +85,9 @@ extern DECLARE_ALIGNED(16, const unsigned char, vp8_coef_bands[16]);
|
|||||||
/*# define DC_TOKEN_CONTEXTS 3*/ /* 00, 0!0, !0!0 */
|
/*# define DC_TOKEN_CONTEXTS 3*/ /* 00, 0!0, !0!0 */
|
||||||
# define PREV_COEF_CONTEXTS 3
|
# define PREV_COEF_CONTEXTS 3
|
||||||
|
|
||||||
extern DECLARE_ALIGNED(16, const unsigned char, vp8_prev_token_class[MAX_ENTROPY_TOKENS]);
|
extern DECLARE_ALIGNED(16, const unsigned char, vp8_prev_token_class[vp8_coef_tokens]);
|
||||||
|
|
||||||
extern const vp8_prob vp8_coef_update_probs [BLOCK_TYPES] [COEF_BANDS] [PREV_COEF_CONTEXTS] [ENTROPY_NODES];
|
extern const vp8_prob vp8_coef_update_probs [BLOCK_TYPES] [COEF_BANDS] [PREV_COEF_CONTEXTS] [vp8_coef_tokens-1];
|
||||||
|
|
||||||
|
|
||||||
struct VP8Common;
|
struct VP8Common;
|
||||||
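The entropy.h hunks above size the probability arrays either with `ENTROPY_NODES` (11) or with `vp8_coef_tokens-1` (12 - 1), and `vp8_coef_tree` is declared with 22 entries. These agree because a binary tree with N leaf tokens has N - 1 internal nodes, and each node stores two branch entries. The sketch below illustrates that relation on a toy 4-token tree. The names `TOY_TOKENS`, `TOY_NODES`, `toy_tree` and `toy_read_token` are hypothetical, and the index convention (positive entry = next node pair, non-positive entry = negated token) is an assumption about how such tree arrays are usually walked, not something shown in this diff.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_TOKENS 4                 /* hypothetical alphabet size */
#define TOY_NODES  (TOY_TOKENS - 1)  /* internal nodes of a full binary tree */

typedef signed char tree_index;

/* Each internal node owns two adjacent entries (bit 0 branch, bit 1 branch).
 * Assumed convention: a positive entry is the index of another node's pair,
 * a non-positive entry is a leaf holding the negated token value. */
static const tree_index toy_tree[2 * TOY_NODES] =
{
    -0, 2,   /* pair at index 0: bit 0 -> token 0, bit 1 -> pair at index 2 */
    -1, 4,   /* pair at index 2: bit 0 -> token 1, bit 1 -> pair at index 4 */
    -2, -3   /* pair at index 4: bit 0 -> token 2, bit 1 -> token 3 */
};

/* Walk the tree, consuming one bit per internal node visited. */
static int toy_read_token(const tree_index *tree, const int *bits, size_t nbits)
{
    tree_index i = 0;
    size_t used = 0;

    do
    {
        assert(used < nbits);          /* never read past the supplied bits */
        i = tree[i + bits[used++]];
    }
    while (i > 0);

    return -i;
}
```

In a real entropy coder each bit would be decoded with a probability attached to the node being visited, which is why a table with one probability per node needs tokens - 1 entries.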
|
|||||||
@@ -33,11 +33,11 @@ typedef enum
|
|||||||
SUBMVREF_LEFT_ABOVE_ZED
|
SUBMVREF_LEFT_ABOVE_ZED
|
||||||
} sumvfref_t;
|
} sumvfref_t;
|
||||||
|
|
||||||
int vp8_mv_cont(const int_mv *l, const int_mv *a)
|
int vp8_mv_cont(const MV *l, const MV *a)
|
||||||
{
|
{
|
||||||
int lez = (l->as_int == 0);
|
int lez = (l->row == 0 && l->col == 0);
|
||||||
int aez = (a->as_int == 0);
|
int aez = (a->row == 0 && a->col == 0);
|
||||||
int lea = (l->as_int == a->as_int);
|
int lea = (l->row == a->row && l->col == a->col);
|
||||||
|
|
||||||
if (lea && lez)
|
if (lea && lez)
|
||||||
return SUBMVREF_LEFT_ABOVE_ZED;
|
return SUBMVREF_LEFT_ABOVE_ZED;
|
||||||
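The two sides of the `vp8_mv_cont` hunk test the same conditions in different ways: one compares `as_int` on an `int_mv`, the other compares `row` and `col` on an `MV`. The sketch below shows why the two forms agree when the vector is a union of a 32-bit integer and a pair of 16-bit components. The exact field widths and the helper names (`make_mv`, `mv_is_zero_fields`, and so on) are assumptions made for illustration; the diff only shows that the union has `as_int` and `as_mv` members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: two 16-bit components, so the pair fits in 32 bits. */
typedef struct
{
    int16_t row;
    int16_t col;
} MV;

/* Hypothetical stand-in for the int_mv union referenced in the diff. */
typedef union
{
    uint32_t as_int;
    MV as_mv;
} int_mv;

/* Build a vector; as_int is cleared first so every byte is defined. */
static int_mv make_mv(int16_t row, int16_t col)
{
    int_mv v;
    v.as_int = 0;
    v.as_mv.row = row;
    v.as_mv.col = col;
    return v;
}

/* Field-by-field form (the MV * side of the diff). */
static int mv_is_zero_fields(const MV *v)
{
    assert(v != NULL);
    return v->row == 0 && v->col == 0;
}

/* Packed form (the int_mv * side): one 32-bit compare covers both fields. */
static int mv_is_zero_packed(const int_mv *v)
{
    assert(v != NULL);
    return v->as_int == 0;
}

static int mv_equal_fields(const MV *a, const MV *b)
{
    return a->row == b->row && a->col == b->col;
}

static int mv_equal_packed(const int_mv *a, const int_mv *b)
{
    return a->as_int == b->as_int;
}
```

The packed form trades two 16-bit comparisons for one 32-bit comparison. It relies on the struct having no padding, which holds for the layout assumed here.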
|
|||||||
@@ -25,7 +25,7 @@ extern const int vp8_mbsplit_count [VP8_NUMMBSPLITS]; /* # of subsets */
|
|||||||
|
|
||||||
extern const vp8_prob vp8_mbsplit_probs [VP8_NUMMBSPLITS-1];
|
extern const vp8_prob vp8_mbsplit_probs [VP8_NUMMBSPLITS-1];
|
||||||
|
|
||||||
extern int vp8_mv_cont(const int_mv *l, const int_mv *a);
|
extern int vp8_mv_cont(const MV *l, const MV *a);
|
||||||
#define SUBMVREF_COUNT 5
|
#define SUBMVREF_COUNT 5
|
||||||
extern const vp8_prob vp8_sub_mv_ref_prob2 [SUBMVREF_COUNT][VP8_SUBMVREFS-1];
|
extern const vp8_prob vp8_sub_mv_ref_prob2 [SUBMVREF_COUNT][VP8_SUBMVREFS-1];
|
||||||
|
|
||||||
|
|||||||
@@ -18,8 +18,6 @@ enum
|
|||||||
{
|
{
|
||||||
mv_max = 1023, /* max absolute value of a MV component */
|
mv_max = 1023, /* max absolute value of a MV component */
|
||||||
MVvals = (2 * mv_max) + 1, /* # possible values "" */
|
MVvals = (2 * mv_max) + 1, /* # possible values "" */
|
||||||
mvfp_max = 255, /* max absolute value of a full pixel MV component */
|
|
||||||
MVfpvals = (2 * mvfp_max) +1, /* # possible full pixel MV values */
|
|
||||||
|
|
||||||
mvlong_width = 10, /* Large MVs have 9 bit magnitudes */
|
mvlong_width = 10, /* Large MVs have 9 bit magnitudes */
|
||||||
mvnum_short = 8, /* magnitudes 0 through 7 */
|
mvnum_short = 8, /* magnitudes 0 through 7 */
|
||||||
|
|||||||
@@ -13,12 +13,10 @@
|
|||||||
#include "vpx_mem/vpx_mem.h"
|
#include "vpx_mem/vpx_mem.h"
|
||||||
|
|
||||||
|
|
||||||
static void copy_and_extend_plane
|
static void extend_plane_borders
|
||||||
(
|
(
|
||||||
unsigned char *s, /* source */
|
unsigned char *s, /* source */
|
||||||
int sp, /* source pitch */
|
int sp, /* pitch */
|
||||||
unsigned char *d, /* destination */
|
|
||||||
int dp, /* destination pitch */
|
|
||||||
int h, /* height */
|
int h, /* height */
|
||||||
int w, /* width */
|
int w, /* width */
|
||||||
int et, /* extend top border */
|
int et, /* extend top border */
|
||||||
@@ -27,6 +25,7 @@ static void copy_and_extend_plane
|
|||||||
int er /* extend right border */
|
int er /* extend right border */
|
||||||
)
|
)
|
||||||
{
|
{
|
||||||
|
|
||||||
int i;
|
int i;
|
||||||
unsigned char *src_ptr1, *src_ptr2;
|
unsigned char *src_ptr1, *src_ptr2;
|
||||||
unsigned char *dest_ptr1, *dest_ptr2;
|
unsigned char *dest_ptr1, *dest_ptr2;
|
||||||
@@ -35,73 +34,68 @@ static void copy_and_extend_plane
|
|||||||
/* copy the left and right most columns out */
|
/* copy the left and right most columns out */
|
||||||
src_ptr1 = s;
|
src_ptr1 = s;
|
||||||
src_ptr2 = s + w - 1;
|
src_ptr2 = s + w - 1;
|
||||||
dest_ptr1 = d - el;
|
dest_ptr1 = s - el;
|
||||||
dest_ptr2 = d + w;
|
dest_ptr2 = s + w;
|
||||||
|
|
||||||
for (i = 0; i < h; i++)
|
for (i = 0; i < h - 0 + 1; i++)
|
||||||
{
|
{
|
||||||
vpx_memset(dest_ptr1, src_ptr1[0], el);
|
/* Some linkers will complain if we call vpx_memset with el set to a
|
||||||
vpx_memcpy(dest_ptr1 + el, src_ptr1, w);
|
* constant 0.
|
||||||
|
*/
|
||||||
|
if (el)
|
||||||
|
vpx_memset(dest_ptr1, src_ptr1[0], el);
|
||||||
vpx_memset(dest_ptr2, src_ptr2[0], er);
|
vpx_memset(dest_ptr2, src_ptr2[0], er);
|
||||||
src_ptr1 += sp;
|
src_ptr1 += sp;
|
||||||
src_ptr2 += sp;
|
src_ptr2 += sp;
|
||||||
dest_ptr1 += dp;
|
dest_ptr1 += sp;
|
||||||
dest_ptr2 += dp;
|
dest_ptr2 += sp;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Now copy the top and bottom lines into each line of the respective
|
/* Now copy the top and bottom source lines into each line of the respective borders */
|
||||||
* borders
|
src_ptr1 = s - el;
|
||||||
*/
|
src_ptr2 = s + sp * (h - 1) - el;
|
||||||
src_ptr1 = d - el;
|
dest_ptr1 = s + sp * (-et) - el;
|
||||||
src_ptr2 = d + dp * (h - 1) - el;
|
dest_ptr2 = s + sp * (h) - el;
|
||||||
dest_ptr1 = d + dp * (-et) - el;
|
linesize = el + er + w + 1;
|
||||||
dest_ptr2 = d + dp * (h) - el;
|
|
||||||
linesize = el + er + w;
|
|
||||||
|
|
||||||
for (i = 0; i < et; i++)
|
for (i = 0; i < (int)et; i++)
|
||||||
{
|
{
|
||||||
vpx_memcpy(dest_ptr1, src_ptr1, linesize);
|
vpx_memcpy(dest_ptr1, src_ptr1, linesize);
|
||||||
dest_ptr1 += dp;
|
dest_ptr1 += sp;
|
||||||
}
|
}
|
||||||
|
|
||||||
for (i = 0; i < eb; i++)
|
for (i = 0; i < (int)eb; i++)
|
||||||
{
|
{
|
||||||
vpx_memcpy(dest_ptr2, src_ptr2, linesize);
|
vpx_memcpy(dest_ptr2, src_ptr2, linesize);
|
||||||
dest_ptr2 += dp;
|
dest_ptr2 += sp;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
void vp8_copy_and_extend_frame(YV12_BUFFER_CONFIG *src,
|
void vp8_extend_to_multiple_of16(YV12_BUFFER_CONFIG *ybf, int width, int height)
|
||||||
YV12_BUFFER_CONFIG *dst)
|
|
||||||
{
|
{
|
||||||
int et = dst->border;
|
int er = 0xf & (16 - (width & 0xf));
|
||||||
int el = dst->border;
|
int eb = 0xf & (16 - (height & 0xf));
|
||||||
int eb = dst->border + dst->y_height - src->y_height;
|
|
||||||
int er = dst->border + dst->y_width - src->y_width;
|
|
||||||
|
|
||||||
copy_and_extend_plane(src->y_buffer, src->y_stride,
|
/* check for non multiples of 16 */
|
||||||
dst->y_buffer, dst->y_stride,
|
if (er != 0 || eb != 0)
|
||||||
src->y_height, src->y_width,
|
{
|
||||||
et, el, eb, er);
|
extend_plane_borders(ybf->y_buffer, ybf->y_stride, height, width, 0, 0, eb, er);
|
||||||
|
|
||||||
et = dst->border >> 1;
|
/* adjust for uv */
|
||||||
el = dst->border >> 1;
|
height = (height + 1) >> 1;
|
||||||
eb = (dst->border >> 1) + dst->uv_height - src->uv_height;
|
width = (width + 1) >> 1;
|
||||||
er = (dst->border >> 1) + dst->uv_width - src->uv_width;
|
er = 0x7 & (8 - (width & 0x7));
|
||||||
|
eb = 0x7 & (8 - (height & 0x7));
|
||||||
|
|
||||||
copy_and_extend_plane(src->u_buffer, src->uv_stride,
|
if (er || eb)
|
||||||
dst->u_buffer, dst->uv_stride,
|
{
|
||||||
src->uv_height, src->uv_width,
|
extend_plane_borders(ybf->u_buffer, ybf->uv_stride, height, width, 0, 0, eb, er);
|
||||||
et, el, eb, er);
|
extend_plane_borders(ybf->v_buffer, ybf->uv_stride, height, width, 0, 0, eb, er);
|
||||||
|
}
|
||||||
copy_and_extend_plane(src->v_buffer, src->uv_stride,
|
}
|
||||||
dst->v_buffer, dst->uv_stride,
|
|
||||||
src->uv_height, src->uv_width,
|
|
||||||
et, el, eb, er);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
/* note the extension is only for the last row, for intra prediction purpose */
|
/* note the extension is only for the last row, for intra prediction purpose */
|
||||||
void vp8_extend_mb_row(YV12_BUFFER_CONFIG *ybf, unsigned char *YPtr, unsigned char *UPtr, unsigned char *VPtr)
|
void vp8_extend_mb_row(YV12_BUFFER_CONFIG *ybf, unsigned char *YPtr, unsigned char *UPtr, unsigned char *VPtr)
|
||||||
{
|
{
|
||||||
|
|||||||
@@ -14,8 +14,8 @@
|
|||||||
|
|
||||||
#include "vpx_scale/yv12config.h"
|
#include "vpx_scale/yv12config.h"
|
||||||
|
|
||||||
|
void Extend(YV12_BUFFER_CONFIG *ybf);
|
||||||
void vp8_extend_mb_row(YV12_BUFFER_CONFIG *ybf, unsigned char *YPtr, unsigned char *UPtr, unsigned char *VPtr);
|
void vp8_extend_mb_row(YV12_BUFFER_CONFIG *ybf, unsigned char *YPtr, unsigned char *UPtr, unsigned char *VPtr);
|
||||||
void vp8_copy_and_extend_frame(YV12_BUFFER_CONFIG *src,
|
void vp8_extend_to_multiple_of16(YV12_BUFFER_CONFIG *ybf, int width, int height);
|
||||||
YV12_BUFFER_CONFIG *dst);
|
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
|
|||||||
@@ -25,9 +25,9 @@ void vp8_find_near_mvs
|
|||||||
(
|
(
|
||||||
MACROBLOCKD *xd,
|
MACROBLOCKD *xd,
|
||||||
const MODE_INFO *here,
|
const MODE_INFO *here,
|
||||||
int_mv *nearest,
|
MV *nearest,
|
||||||
int_mv *nearby,
|
MV *nearby,
|
||||||
int_mv *best_mv,
|
MV *best_mv,
|
||||||
int cnt[4],
|
int cnt[4],
|
||||||
int refframe,
|
int refframe,
|
||||||
int *ref_frame_sign_bias
|
int *ref_frame_sign_bias
|
||||||
@@ -131,14 +131,13 @@ void vp8_find_near_mvs
|
|||||||
near_mvs[CNT_INTRA] = near_mvs[CNT_NEAREST];
|
near_mvs[CNT_INTRA] = near_mvs[CNT_NEAREST];
|
||||||
|
|
||||||
/* Set up return values */
|
/* Set up return values */
|
||||||
best_mv->as_int = near_mvs[0].as_int;
|
*best_mv = near_mvs[0].as_mv;
|
||||||
nearest->as_int = near_mvs[CNT_NEAREST].as_int;
|
*nearest = near_mvs[CNT_NEAREST].as_mv;
|
||||||
nearby->as_int = near_mvs[CNT_NEAR].as_int;
|
*nearby = near_mvs[CNT_NEAR].as_mv;
|
||||||
|
|
||||||
//TODO: move clamp outside findnearmv
|
vp8_clamp_mv(nearest, xd);
|
||||||
vp8_clamp_mv2(nearest, xd);
|
vp8_clamp_mv(nearby, xd);
|
||||||
vp8_clamp_mv2(nearby, xd);
|
vp8_clamp_mv(best_mv, xd); /*TODO: move this up before the copy*/
|
||||||
vp8_clamp_mv2(best_mv, xd);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
vp8_prob *vp8_mv_ref_probs(
|
vp8_prob *vp8_mv_ref_probs(
|
||||||
@@ -153,3 +152,26 @@ vp8_prob *vp8_mv_ref_probs(
|
|||||||
return p;
|
return p;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
const B_MODE_INFO *vp8_left_bmi(const MODE_INFO *cur_mb, int b)
|
||||||
|
{
|
||||||
|
if (!(b & 3))
|
||||||
|
{
|
||||||
|
/* On L edge, get from MB to left of us */
|
||||||
|
--cur_mb;
|
||||||
|
b += 4;
|
||||||
|
}
|
||||||
|
|
||||||
|
return cur_mb->bmi + b - 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
const B_MODE_INFO *vp8_above_bmi(const MODE_INFO *cur_mb, int b, int mi_stride)
|
||||||
|
{
|
||||||
|
if (!(b >> 2))
|
||||||
|
{
|
||||||
|
/* On top edge, get from MB above us */
|
||||||
|
cur_mb -= mi_stride;
|
||||||
|
b += 16;
|
||||||
|
}
|
||||||
|
|
||||||
|
return cur_mb->bmi + b - 4;
|
||||||
|
}
|
||||||
|
|||||||
@@ -17,6 +17,11 @@
|
|||||||
#include "modecont.h"
|
#include "modecont.h"
|
||||||
#include "treecoder.h"
|
#include "treecoder.h"
|
||||||
|
|
||||||
|
typedef union
|
||||||
|
{
|
||||||
|
unsigned int as_int;
|
||||||
|
MV as_mv;
|
||||||
|
} int_mv; /* facilitates rapid equality tests */
|
||||||
|
|
||||||
static void mv_bias(int refmb_ref_frame_sign_bias, int refframe, int_mv *mvp, const int *ref_frame_sign_bias)
|
static void mv_bias(int refmb_ref_frame_sign_bias, int refframe, int_mv *mvp, const int *ref_frame_sign_bias)
|
||||||
{
|
{
|
||||||
@@ -34,48 +39,24 @@ static void mv_bias(int refmb_ref_frame_sign_bias, int refframe, int_mv *mvp, co
|
|||||||
|
|
||||||
#define LEFT_TOP_MARGIN (16 << 3)
|
#define LEFT_TOP_MARGIN (16 << 3)
|
||||||
#define RIGHT_BOTTOM_MARGIN (16 << 3)
|
#define RIGHT_BOTTOM_MARGIN (16 << 3)
|
||||||
static void vp8_clamp_mv2(int_mv *mv, const MACROBLOCKD *xd)
|
static void vp8_clamp_mv(MV *mv, const MACROBLOCKD *xd)
|
||||||
{
|
{
|
||||||
if (mv->as_mv.col < (xd->mb_to_left_edge - LEFT_TOP_MARGIN))
|
if (mv->col < (xd->mb_to_left_edge - LEFT_TOP_MARGIN))
|
||||||
mv->as_mv.col = xd->mb_to_left_edge - LEFT_TOP_MARGIN;
|
mv->col = xd->mb_to_left_edge - LEFT_TOP_MARGIN;
|
||||||
else if (mv->as_mv.col > xd->mb_to_right_edge + RIGHT_BOTTOM_MARGIN)
|
else if (mv->col > xd->mb_to_right_edge + RIGHT_BOTTOM_MARGIN)
|
||||||
mv->as_mv.col = xd->mb_to_right_edge + RIGHT_BOTTOM_MARGIN;
|
mv->col = xd->mb_to_right_edge + RIGHT_BOTTOM_MARGIN;
|
||||||
|
|
||||||
if (mv->as_mv.row < (xd->mb_to_top_edge - LEFT_TOP_MARGIN))
|
if (mv->row < (xd->mb_to_top_edge - LEFT_TOP_MARGIN))
|
||||||
mv->as_mv.row = xd->mb_to_top_edge - LEFT_TOP_MARGIN;
|
mv->row = xd->mb_to_top_edge - LEFT_TOP_MARGIN;
|
||||||
else if (mv->as_mv.row > xd->mb_to_bottom_edge + RIGHT_BOTTOM_MARGIN)
|
else if (mv->row > xd->mb_to_bottom_edge + RIGHT_BOTTOM_MARGIN)
|
||||||
mv->as_mv.row = xd->mb_to_bottom_edge + RIGHT_BOTTOM_MARGIN;
|
mv->row = xd->mb_to_bottom_edge + RIGHT_BOTTOM_MARGIN;
|
||||||
}
|
|
||||||
|
|
||||||
static void vp8_clamp_mv(int_mv *mv, int mb_to_left_edge, int mb_to_right_edge,
|
|
||||||
int mb_to_top_edge, int mb_to_bottom_edge)
|
|
||||||
{
|
|
||||||
mv->as_mv.col = (mv->as_mv.col < mb_to_left_edge) ?
|
|
||||||
mb_to_left_edge : mv->as_mv.col;
|
|
||||||
mv->as_mv.col = (mv->as_mv.col > mb_to_right_edge) ?
|
|
||||||
mb_to_right_edge : mv->as_mv.col;
|
|
||||||
mv->as_mv.row = (mv->as_mv.row < mb_to_top_edge) ?
|
|
||||||
mb_to_top_edge : mv->as_mv.row;
|
|
||||||
mv->as_mv.row = (mv->as_mv.row > mb_to_bottom_edge) ?
|
|
||||||
mb_to_bottom_edge : mv->as_mv.row;
|
|
||||||
}
|
|
||||||
static unsigned int vp8_check_mv_bounds(int_mv *mv, int mb_to_left_edge,
|
|
||||||
int mb_to_right_edge, int mb_to_top_edge,
|
|
||||||
int mb_to_bottom_edge)
|
|
||||||
{
|
|
||||||
unsigned int need_to_clamp;
|
|
||||||
need_to_clamp = (mv->as_mv.col < mb_to_left_edge) ? 1 : 0;
|
|
||||||
need_to_clamp |= (mv->as_mv.col > mb_to_right_edge) ? 1 : 0;
|
|
||||||
need_to_clamp |= (mv->as_mv.row < mb_to_top_edge) ? 1 : 0;
|
|
||||||
need_to_clamp |= (mv->as_mv.row > mb_to_bottom_edge) ? 1 : 0;
|
|
||||||
return need_to_clamp;
|
|
||||||
}
|
}
|
||||||
|
|
||||||
void vp8_find_near_mvs
|
void vp8_find_near_mvs
|
||||||
(
|
(
|
||||||
MACROBLOCKD *xd,
|
MACROBLOCKD *xd,
|
||||||
const MODE_INFO *here,
|
const MODE_INFO *here,
|
||||||
int_mv *nearest, int_mv *nearby, int_mv *best,
|
MV *nearest, MV *nearby, MV *best,
|
||||||
int near_mv_ref_cts[4],
|
int near_mv_ref_cts[4],
|
||||||
int refframe,
|
int refframe,
|
||||||
int *ref_frame_sign_bias
|
int *ref_frame_sign_bias
|
||||||
@@ -85,89 +66,10 @@ vp8_prob *vp8_mv_ref_probs(
|
|||||||
vp8_prob p[VP8_MVREFS-1], const int near_mv_ref_ct[4]
|
vp8_prob p[VP8_MVREFS-1], const int near_mv_ref_ct[4]
|
||||||
);
|
);
|
||||||
|
|
||||||
|
const B_MODE_INFO *vp8_left_bmi(const MODE_INFO *cur_mb, int b);
|
||||||
|
|
||||||
|
const B_MODE_INFO *vp8_above_bmi(const MODE_INFO *cur_mb, int b, int mi_stride);
|
||||||
|
|
||||||
extern const unsigned char vp8_mbsplit_offset[4][16];
|
extern const unsigned char vp8_mbsplit_offset[4][16];
|
||||||
|
|
||||||
|
|
||||||
static int left_block_mv(const MODE_INFO *cur_mb, int b)
|
|
||||||
{
|
|
||||||
if (!(b & 3))
|
|
||||||
{
|
|
||||||
/* On L edge, get from MB to left of us */
|
|
||||||
--cur_mb;
|
|
||||||
|
|
||||||
if(cur_mb->mbmi.mode != SPLITMV)
|
|
||||||
return cur_mb->mbmi.mv.as_int;
|
|
||||||
b += 4;
|
|
||||||
}
|
|
||||||
|
|
||||||
return (cur_mb->bmi + b - 1)->mv.as_int;
|
|
||||||
}
|
|
||||||
|
|
||||||
static int above_block_mv(const MODE_INFO *cur_mb, int b, int mi_stride)
|
|
||||||
{
|
|
||||||
if (!(b >> 2))
|
|
||||||
{
|
|
||||||
/* On top edge, get from MB above us */
|
|
||||||
cur_mb -= mi_stride;
|
|
||||||
|
|
||||||
if(cur_mb->mbmi.mode != SPLITMV)
|
|
||||||
return cur_mb->mbmi.mv.as_int;
|
|
||||||
b += 16;
|
|
||||||
}
|
|
||||||
|
|
||||||
return (cur_mb->bmi + b - 4)->mv.as_int;
|
|
||||||
}
|
|
||||||
static B_PREDICTION_MODE left_block_mode(const MODE_INFO *cur_mb, int b)
|
|
||||||
{
|
|
||||||
if (!(b & 3))
|
|
||||||
{
|
|
||||||
/* On L edge, get from MB to left of us */
|
|
||||||
--cur_mb;
|
|
||||||
switch (cur_mb->mbmi.mode)
|
|
||||||
{
|
|
||||||
case B_PRED:
|
|
||||||
return (cur_mb->bmi + b + 3)->as_mode;
|
|
||||||
case DC_PRED:
|
|
||||||
return B_DC_PRED;
|
|
||||||
case V_PRED:
|
|
||||||
return B_VE_PRED;
|
|
||||||
case H_PRED:
|
|
||||||
return B_HE_PRED;
|
|
||||||
case TM_PRED:
|
|
||||||
return B_TM_PRED;
|
|
||||||
default:
|
|
||||||
return B_DC_PRED;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
return (cur_mb->bmi + b - 1)->as_mode;
|
|
||||||
}
|
|
||||||
|
|
||||||
static B_PREDICTION_MODE above_block_mode(const MODE_INFO *cur_mb, int b, int mi_stride)
|
|
||||||
{
|
|
||||||
if (!(b >> 2))
|
|
||||||
{
|
|
||||||
/* On top edge, get from MB above us */
|
|
||||||
cur_mb -= mi_stride;
|
|
||||||
|
|
||||||
switch (cur_mb->mbmi.mode)
|
|
||||||
{
|
|
||||||
case B_PRED:
|
|
||||||
return (cur_mb->bmi + b + 12)->as_mode;
|
|
||||||
case DC_PRED:
|
|
||||||
return B_DC_PRED;
|
|
||||||
case V_PRED:
|
|
||||||
return B_VE_PRED;
|
|
||||||
case H_PRED:
|
|
||||||
return B_HE_PRED;
|
|
||||||
case TM_PRED:
|
|
||||||
return B_TM_PRED;
|
|
||||||
default:
|
|
||||||
return B_DC_PRED;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
return (cur_mb->bmi + b - 4)->as_mode;
|
|
||||||
}
|
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
|
|||||||
@@ -17,54 +17,9 @@
|
|||||||
#include "vp8/common/idct.h"
|
#include "vp8/common/idct.h"
|
||||||
#include "vp8/common/onyxc_int.h"
|
#include "vp8/common/onyxc_int.h"
|
||||||
|
|
||||||
#if CONFIG_MULTITHREAD
|
|
||||||
#if HAVE_UNISTD_H
|
|
||||||
#include <unistd.h>
|
|
||||||
#elif defined(_WIN32)
|
|
||||||
#include <windows.h>
|
|
||||||
typedef void (WINAPI *PGNSI)(LPSYSTEM_INFO);
|
|
||||||
#endif
|
|
||||||
#endif
|
|
||||||
|
|
||||||
extern void vp8_arch_x86_common_init(VP8_COMMON *ctx);
|
extern void vp8_arch_x86_common_init(VP8_COMMON *ctx);
|
||||||
extern void vp8_arch_arm_common_init(VP8_COMMON *ctx);
|
extern void vp8_arch_arm_common_init(VP8_COMMON *ctx);
|
||||||
|
|
||||||
#if CONFIG_MULTITHREAD
|
|
||||||
static int get_cpu_count()
|
|
||||||
{
|
|
||||||
int core_count = 16;
|
|
||||||
|
|
||||||
#if HAVE_UNISTD_H
|
|
||||||
#if defined(_SC_NPROCESSORS_ONLN)
|
|
||||||
core_count = sysconf(_SC_NPROCESSORS_ONLN);
|
|
||||||
#elif defined(_SC_NPROC_ONLN)
|
|
||||||
core_count = sysconf(_SC_NPROC_ONLN);
|
|
||||||
#endif
|
|
||||||
#elif defined(_WIN32)
|
|
||||||
{
|
|
||||||
PGNSI pGNSI;
|
|
||||||
SYSTEM_INFO sysinfo;
|
|
||||||
|
|
||||||
/* Call GetNativeSystemInfo if supported or
|
|
||||||
* GetSystemInfo otherwise. */
|
|
||||||
|
|
||||||
pGNSI = (PGNSI) GetProcAddress(
|
|
||||||
GetModuleHandle(TEXT("kernel32.dll")), "GetNativeSystemInfo");
|
|
||||||
if (pGNSI != NULL)
|
|
||||||
pGNSI(&sysinfo);
|
|
||||||
else
|
|
||||||
GetSystemInfo(&sysinfo);
|
|
||||||
|
|
||||||
core_count = sysinfo.dwNumberOfProcessors;
|
|
||||||
}
|
|
||||||
#else
|
|
||||||
/* other platforms */
|
|
||||||
#endif
|
|
||||||
|
|
||||||
return core_count > 0 ? core_count : 1;
|
|
||||||
}
|
|
||||||
#endif
|
|
||||||
|
|
||||||
void vp8_machine_specific_config(VP8_COMMON *ctx)
|
void vp8_machine_specific_config(VP8_COMMON *ctx)
|
||||||
{
|
{
|
||||||
#if CONFIG_RUNTIME_CPU_DETECT
|
#if CONFIG_RUNTIME_CPU_DETECT
|
||||||
@@ -88,12 +43,6 @@ void vp8_machine_specific_config(VP8_COMMON *ctx)
|
|||||||
vp8_build_intra_predictors_mby;
|
vp8_build_intra_predictors_mby;
|
||||||
rtcd->recon.build_intra_predictors_mby_s =
|
rtcd->recon.build_intra_predictors_mby_s =
|
||||||
vp8_build_intra_predictors_mby_s;
|
vp8_build_intra_predictors_mby_s;
|
||||||
rtcd->recon.build_intra_predictors_mbuv =
|
|
||||||
vp8_build_intra_predictors_mbuv;
|
|
||||||
rtcd->recon.build_intra_predictors_mbuv_s =
|
|
||||||
vp8_build_intra_predictors_mbuv_s;
|
|
||||||
rtcd->recon.intra4x4_predict =
|
|
||||||
vp8_intra4x4_predict;
|
|
||||||
|
|
||||||
rtcd->subpix.sixtap16x16 = vp8_sixtap_predict16x16_c;
|
rtcd->subpix.sixtap16x16 = vp8_sixtap_predict16x16_c;
|
||||||
rtcd->subpix.sixtap8x8 = vp8_sixtap_predict8x8_c;
|
rtcd->subpix.sixtap8x8 = vp8_sixtap_predict8x8_c;
|
||||||
@@ -108,12 +57,12 @@ void vp8_machine_specific_config(VP8_COMMON *ctx)
|
|||||||
rtcd->loopfilter.normal_b_v = vp8_loop_filter_bv_c;
|
rtcd->loopfilter.normal_b_v = vp8_loop_filter_bv_c;
|
||||||
rtcd->loopfilter.normal_mb_h = vp8_loop_filter_mbh_c;
|
rtcd->loopfilter.normal_mb_h = vp8_loop_filter_mbh_c;
|
||||||
rtcd->loopfilter.normal_b_h = vp8_loop_filter_bh_c;
|
rtcd->loopfilter.normal_b_h = vp8_loop_filter_bh_c;
|
||||||
rtcd->loopfilter.simple_mb_v = vp8_loop_filter_simple_vertical_edge_c;
|
rtcd->loopfilter.simple_mb_v = vp8_loop_filter_mbvs_c;
|
||||||
rtcd->loopfilter.simple_b_v = vp8_loop_filter_bvs_c;
|
rtcd->loopfilter.simple_b_v = vp8_loop_filter_bvs_c;
|
||||||
rtcd->loopfilter.simple_mb_h = vp8_loop_filter_simple_horizontal_edge_c;
|
rtcd->loopfilter.simple_mb_h = vp8_loop_filter_mbhs_c;
|
||||||
rtcd->loopfilter.simple_b_h = vp8_loop_filter_bhs_c;
|
rtcd->loopfilter.simple_b_h = vp8_loop_filter_bhs_c;
|
||||||
|
|
||||||
#if CONFIG_POSTPROC || (CONFIG_VP8_ENCODER && CONFIG_INTERNAL_STATS)
|
#if CONFIG_POSTPROC || (CONFIG_VP8_ENCODER && CONFIG_PSNR)
|
||||||
rtcd->postproc.down = vp8_mbpost_proc_down_c;
|
rtcd->postproc.down = vp8_mbpost_proc_down_c;
|
||||||
rtcd->postproc.across = vp8_mbpost_proc_across_ip_c;
|
rtcd->postproc.across = vp8_mbpost_proc_across_ip_c;
|
||||||
rtcd->postproc.downacross = vp8_post_proc_down_and_across_c;
|
rtcd->postproc.downacross = vp8_post_proc_down_and_across_c;
|
||||||
@@ -133,7 +82,4 @@ void vp8_machine_specific_config(VP8_COMMON *ctx)
|
|||||||
vp8_arch_arm_common_init(ctx);
|
vp8_arch_arm_common_init(ctx);
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
#if CONFIG_MULTITHREAD
|
|
||||||
ctx->processor_core_count = get_cpu_count();
|
|
||||||
#endif /* CONFIG_MULTITHREAD */
|
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -9,149 +9,160 @@
|
|||||||
*/
|
*/
|
||||||
|
|
||||||
|
|
||||||
#include "vpx_config.h"
|
#include "vpx_ports/config.h"
|
||||||
#include "loopfilter.h"
|
#include "loopfilter.h"
|
||||||
#include "onyxc_int.h"
|
#include "onyxc_int.h"
|
||||||
#include "vpx_mem/vpx_mem.h"
|
|
||||||
|
|
||||||
typedef unsigned char uc;
|
typedef unsigned char uc;
|
||||||
|
|
||||||
|
|
||||||
prototype_loopfilter(vp8_loop_filter_horizontal_edge_c);
|
prototype_loopfilter(vp8_loop_filter_horizontal_edge_c);
|
||||||
prototype_loopfilter(vp8_loop_filter_vertical_edge_c);
|
prototype_loopfilter(vp8_loop_filter_vertical_edge_c);
|
||||||
prototype_loopfilter(vp8_mbloop_filter_horizontal_edge_c);
|
prototype_loopfilter(vp8_mbloop_filter_horizontal_edge_c);
|
||||||
prototype_loopfilter(vp8_mbloop_filter_vertical_edge_c);
|
prototype_loopfilter(vp8_mbloop_filter_vertical_edge_c);
|
||||||
|
prototype_loopfilter(vp8_loop_filter_simple_horizontal_edge_c);
|
||||||
prototype_simple_loopfilter(vp8_loop_filter_simple_horizontal_edge_c);
|
prototype_loopfilter(vp8_loop_filter_simple_vertical_edge_c);
|
||||||
prototype_simple_loopfilter(vp8_loop_filter_simple_vertical_edge_c);
|
|
||||||
|
|
||||||
/* Horizontal MB filtering */
|
/* Horizontal MB filtering */
|
||||||
void vp8_loop_filter_mbh_c(unsigned char *y_ptr, unsigned char *u_ptr,
|
void vp8_loop_filter_mbh_c(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
unsigned char *v_ptr, int y_stride, int uv_stride,
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
loop_filter_info *lfi)
|
|
||||||
{
|
{
|
||||||
vp8_mbloop_filter_horizontal_edge_c(y_ptr, y_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 2);
|
(void) simpler_lpf;
|
||||||
|
vp8_mbloop_filter_horizontal_edge_c(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_mbloop_filter_horizontal_edge_c(u_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
|
vp8_mbloop_filter_horizontal_edge_c(u_ptr, uv_stride, lfi->mbflim, lfi->lim, lfi->thr, 1);
|
||||||
|
|
||||||
if (v_ptr)
|
if (v_ptr)
|
||||||
vp8_mbloop_filter_horizontal_edge_c(v_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
|
vp8_mbloop_filter_horizontal_edge_c(v_ptr, uv_stride, lfi->mbflim, lfi->lim, lfi->thr, 1);
|
||||||
|
}
|
||||||
|
|
||||||
|
void vp8_loop_filter_mbhs_c(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
|
{
|
||||||
|
(void) u_ptr;
|
||||||
|
(void) v_ptr;
|
||||||
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_c(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Vertical MB Filtering */
|
/* Vertical MB Filtering */
|
||||||
void vp8_loop_filter_mbv_c(unsigned char *y_ptr, unsigned char *u_ptr,
|
void vp8_loop_filter_mbv_c(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
unsigned char *v_ptr, int y_stride, int uv_stride,
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
loop_filter_info *lfi)
|
|
||||||
{
|
{
|
||||||
vp8_mbloop_filter_vertical_edge_c(y_ptr, y_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 2);
|
(void) simpler_lpf;
|
||||||
|
vp8_mbloop_filter_vertical_edge_c(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_mbloop_filter_vertical_edge_c(u_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
|
vp8_mbloop_filter_vertical_edge_c(u_ptr, uv_stride, lfi->mbflim, lfi->lim, lfi->thr, 1);
|
||||||
|
|
||||||
if (v_ptr)
|
if (v_ptr)
|
||||||
vp8_mbloop_filter_vertical_edge_c(v_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
|
vp8_mbloop_filter_vertical_edge_c(v_ptr, uv_stride, lfi->mbflim, lfi->lim, lfi->thr, 1);
|
||||||
|
}
|
||||||
|
|
||||||
|
void vp8_loop_filter_mbvs_c(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
|
{
|
||||||
|
(void) u_ptr;
|
||||||
|
(void) v_ptr;
|
||||||
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_vertical_edge_c(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Horizontal B Filtering */
|
/* Horizontal B Filtering */
|
||||||
void vp8_loop_filter_bh_c(unsigned char *y_ptr, unsigned char *u_ptr,
|
void vp8_loop_filter_bh_c(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
unsigned char *v_ptr, int y_stride, int uv_stride,
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
loop_filter_info *lfi)
|
|
||||||
{
|
{
|
||||||
vp8_loop_filter_horizontal_edge_c(y_ptr + 4 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
(void) simpler_lpf;
|
||||||
vp8_loop_filter_horizontal_edge_c(y_ptr + 8 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
vp8_loop_filter_horizontal_edge_c(y_ptr + 4 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
vp8_loop_filter_horizontal_edge_c(y_ptr + 12 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
vp8_loop_filter_horizontal_edge_c(y_ptr + 8 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_horizontal_edge_c(y_ptr + 12 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_loop_filter_horizontal_edge_c(u_ptr + 4 * uv_stride, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
|
vp8_loop_filter_horizontal_edge_c(u_ptr + 4 * uv_stride, uv_stride, lfi->flim, lfi->lim, lfi->thr, 1);
|
||||||
|
|
||||||
if (v_ptr)
|
if (v_ptr)
|
||||||
vp8_loop_filter_horizontal_edge_c(v_ptr + 4 * uv_stride, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
|
vp8_loop_filter_horizontal_edge_c(v_ptr + 4 * uv_stride, uv_stride, lfi->flim, lfi->lim, lfi->thr, 1);
|
||||||
}
|
}
|
||||||
|
|
||||||
void vp8_loop_filter_bhs_c(unsigned char *y_ptr, int y_stride,
|
void vp8_loop_filter_bhs_c(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
const unsigned char *blimit)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_loop_filter_simple_horizontal_edge_c(y_ptr + 4 * y_stride, y_stride, blimit);
|
(void) u_ptr;
|
||||||
vp8_loop_filter_simple_horizontal_edge_c(y_ptr + 8 * y_stride, y_stride, blimit);
|
(void) v_ptr;
|
||||||
vp8_loop_filter_simple_horizontal_edge_c(y_ptr + 12 * y_stride, y_stride, blimit);
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_c(y_ptr + 4 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_c(y_ptr + 8 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_c(y_ptr + 12 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Vertical B Filtering */
|
/* Vertical B Filtering */
|
||||||
void vp8_loop_filter_bv_c(unsigned char *y_ptr, unsigned char *u_ptr,
|
void vp8_loop_filter_bv_c(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
unsigned char *v_ptr, int y_stride, int uv_stride,
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
loop_filter_info *lfi)
|
|
||||||
{
|
{
|
||||||
vp8_loop_filter_vertical_edge_c(y_ptr + 4, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
(void) simpler_lpf;
|
||||||
vp8_loop_filter_vertical_edge_c(y_ptr + 8, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
vp8_loop_filter_vertical_edge_c(y_ptr + 4, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
vp8_loop_filter_vertical_edge_c(y_ptr + 12, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
vp8_loop_filter_vertical_edge_c(y_ptr + 8, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_vertical_edge_c(y_ptr + 12, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_loop_filter_vertical_edge_c(u_ptr + 4, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
|
vp8_loop_filter_vertical_edge_c(u_ptr + 4, uv_stride, lfi->flim, lfi->lim, lfi->thr, 1);
|
||||||
|
|
||||||
if (v_ptr)
|
if (v_ptr)
|
||||||
vp8_loop_filter_vertical_edge_c(v_ptr + 4, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
|
vp8_loop_filter_vertical_edge_c(v_ptr + 4, uv_stride, lfi->flim, lfi->lim, lfi->thr, 1);
|
||||||
}
|
}
|
||||||
|
|
||||||
void vp8_loop_filter_bvs_c(unsigned char *y_ptr, int y_stride,
|
void vp8_loop_filter_bvs_c(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
const unsigned char *blimit)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_loop_filter_simple_vertical_edge_c(y_ptr + 4, y_stride, blimit);
|
(void) u_ptr;
|
||||||
vp8_loop_filter_simple_vertical_edge_c(y_ptr + 8, y_stride, blimit);
|
(void) v_ptr;
|
||||||
vp8_loop_filter_simple_vertical_edge_c(y_ptr + 12, y_stride, blimit);
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_vertical_edge_c(y_ptr + 4, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_vertical_edge_c(y_ptr + 8, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_vertical_edge_c(y_ptr + 12, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
|
|
||||||
static void lf_init_lut(loop_filter_info_n *lfi)
|
void vp8_init_loop_filter(VP8_COMMON *cm)
|
||||||
{
|
{
|
||||||
int filt_lvl;
|
loop_filter_info *lfi = cm->lf_info;
|
||||||
|
LOOPFILTERTYPE lft = cm->filter_type;
|
||||||
|
int sharpness_lvl = cm->sharpness_level;
|
||||||
|
int frame_type = cm->frame_type;
|
||||||
|
int i, j;
|
||||||
|
|
||||||
for (filt_lvl = 0; filt_lvl <= MAX_LOOP_FILTER; filt_lvl++)
|
int block_inside_limit = 0;
|
||||||
{
|
int HEVThresh;
|
||||||
if (filt_lvl >= 40)
|
|
||||||
{
|
|
||||||
lfi->hev_thr_lut[KEY_FRAME][filt_lvl] = 2;
|
|
||||||
lfi->hev_thr_lut[INTER_FRAME][filt_lvl] = 3;
|
|
||||||
}
|
|
||||||
else if (filt_lvl >= 20)
|
|
||||||
{
|
|
||||||
lfi->hev_thr_lut[KEY_FRAME][filt_lvl] = 1;
|
|
||||||
lfi->hev_thr_lut[INTER_FRAME][filt_lvl] = 2;
|
|
||||||
}
|
|
||||||
else if (filt_lvl >= 15)
|
|
||||||
{
|
|
||||||
lfi->hev_thr_lut[KEY_FRAME][filt_lvl] = 1;
|
|
||||||
lfi->hev_thr_lut[INTER_FRAME][filt_lvl] = 1;
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
lfi->hev_thr_lut[KEY_FRAME][filt_lvl] = 0;
|
|
||||||
lfi->hev_thr_lut[INTER_FRAME][filt_lvl] = 0;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
lfi->mode_lf_lut[DC_PRED] = 1;
|
/* For each possible value for the loop filter fill out a "loop_filter_info" entry. */
|
||||||
lfi->mode_lf_lut[V_PRED] = 1;
|
|
||||||
lfi->mode_lf_lut[H_PRED] = 1;
|
|
||||||
lfi->mode_lf_lut[TM_PRED] = 1;
|
|
||||||
lfi->mode_lf_lut[B_PRED] = 0;
|
|
||||||
|
|
||||||
lfi->mode_lf_lut[ZEROMV] = 1;
|
|
||||||
lfi->mode_lf_lut[NEARESTMV] = 2;
|
|
||||||
lfi->mode_lf_lut[NEARMV] = 2;
|
|
||||||
lfi->mode_lf_lut[NEWMV] = 2;
|
|
||||||
lfi->mode_lf_lut[SPLITMV] = 3;
|
|
||||||
|
|
||||||
}
|
|
||||||
|
|
||||||
void vp8_loop_filter_update_sharpness(loop_filter_info_n *lfi,
|
|
||||||
int sharpness_lvl)
|
|
||||||
{
|
|
||||||
int i;
|
|
||||||
|
|
||||||
/* For each possible value for the loop filter fill out limits */
|
|
||||||
for (i = 0; i <= MAX_LOOP_FILTER; i++)
|
for (i = 0; i <= MAX_LOOP_FILTER; i++)
|
||||||
{
|
{
|
||||||
int filt_lvl = i;
|
int filt_lvl = i;
|
||||||
int block_inside_limit = 0;
|
|
||||||
|
if (frame_type == KEY_FRAME)
|
||||||
|
{
|
||||||
|
if (filt_lvl >= 40)
|
||||||
|
HEVThresh = 2;
|
||||||
|
else if (filt_lvl >= 15)
|
||||||
|
HEVThresh = 1;
|
||||||
|
else
|
||||||
|
HEVThresh = 0;
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
if (filt_lvl >= 40)
|
||||||
|
HEVThresh = 3;
|
||||||
|
else if (filt_lvl >= 20)
|
||||||
|
HEVThresh = 2;
|
||||||
|
else if (filt_lvl >= 15)
|
||||||
|
HEVThresh = 1;
|
||||||
|
else
|
||||||
|
HEVThresh = 0;
|
||||||
|
}
|
||||||
|
|
||||||
/* Set loop filter parameters that control sharpness. */
|
/* Set loop filter parameters that control sharpness. */
|
||||||
block_inside_limit = filt_lvl >> (sharpness_lvl > 0);
|
block_inside_limit = filt_lvl >> (sharpness_lvl > 0);
|
||||||
@@ -166,143 +177,170 @@ void vp8_loop_filter_update_sharpness(loop_filter_info_n *lfi,
|
|||||||
if (block_inside_limit < 1)
|
if (block_inside_limit < 1)
|
||||||
block_inside_limit = 1;
|
block_inside_limit = 1;
|
||||||
|
|
||||||
vpx_memset(lfi->lim[i], block_inside_limit, SIMD_WIDTH);
|
for (j = 0; j < 16; j++)
|
||||||
vpx_memset(lfi->blim[i], (2 * filt_lvl + block_inside_limit),
|
{
|
||||||
SIMD_WIDTH);
|
lfi[i].lim[j] = block_inside_limit;
|
||||||
vpx_memset(lfi->mblim[i], (2 * (filt_lvl + 2) + block_inside_limit),
|
lfi[i].mbflim[j] = filt_lvl + 2;
|
||||||
SIMD_WIDTH);
|
lfi[i].flim[j] = filt_lvl;
|
||||||
|
lfi[i].thr[j] = HEVThresh;
|
||||||
|
}
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Set up the function pointers depending on the type of loop filtering selected */
|
||||||
|
if (lft == NORMAL_LOOPFILTER)
|
||||||
|
{
|
||||||
|
cm->lf_mbv = LF_INVOKE(&cm->rtcd.loopfilter, normal_mb_v);
|
||||||
|
cm->lf_bv = LF_INVOKE(&cm->rtcd.loopfilter, normal_b_v);
|
||||||
|
cm->lf_mbh = LF_INVOKE(&cm->rtcd.loopfilter, normal_mb_h);
|
||||||
|
cm->lf_bh = LF_INVOKE(&cm->rtcd.loopfilter, normal_b_h);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
cm->lf_mbv = LF_INVOKE(&cm->rtcd.loopfilter, simple_mb_v);
|
||||||
|
cm->lf_bv = LF_INVOKE(&cm->rtcd.loopfilter, simple_b_v);
|
||||||
|
cm->lf_mbh = LF_INVOKE(&cm->rtcd.loopfilter, simple_mb_h);
|
||||||
|
cm->lf_bh = LF_INVOKE(&cm->rtcd.loopfilter, simple_b_h);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
void vp8_loop_filter_init(VP8_COMMON *cm)
|
/* Put vp8_init_loop_filter() in vp8dx_create_decompressor(). Only call vp8_frame_init_loop_filter() while decoding
|
||||||
|
* each frame. Check last_frame_type to skip the function most of the time.
|
||||||
|
*/
|
||||||
|
void vp8_frame_init_loop_filter(loop_filter_info *lfi, int frame_type)
|
||||||
{
|
{
|
||||||
loop_filter_info_n *lfi = &cm->lf_info;
|
int HEVThresh;
|
||||||
int i;
|
int i, j;
|
||||||
|
|
||||||
/* init limits for given sharpness*/
|
/* For each possible value for the loop filter fill out a "loop_filter_info" entry. */
|
||||||
vp8_loop_filter_update_sharpness(lfi, cm->sharpness_level);
|
for (i = 0; i <= MAX_LOOP_FILTER; i++)
|
||||||
cm->last_sharpness_level = cm->sharpness_level;
|
|
||||||
|
|
||||||
/* init LUT for lvl and hev thr picking */
|
|
||||||
lf_init_lut(lfi);
|
|
||||||
|
|
||||||
/* init hev threshold const vectors */
|
|
||||||
for(i = 0; i < 4 ; i++)
|
|
||||||
{
|
{
|
||||||
vpx_memset(lfi->hev_thr[i], i, SIMD_WIDTH);
|
int filt_lvl = i;
|
||||||
|
|
||||||
|
if (frame_type == KEY_FRAME)
|
||||||
|
{
|
||||||
|
if (filt_lvl >= 40)
|
||||||
|
HEVThresh = 2;
|
||||||
|
else if (filt_lvl >= 15)
|
||||||
|
HEVThresh = 1;
|
||||||
|
else
|
||||||
|
HEVThresh = 0;
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
if (filt_lvl >= 40)
|
||||||
|
HEVThresh = 3;
|
||||||
|
else if (filt_lvl >= 20)
|
||||||
|
HEVThresh = 2;
|
||||||
|
else if (filt_lvl >= 15)
|
||||||
|
HEVThresh = 1;
|
||||||
|
else
|
||||||
|
HEVThresh = 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
for (j = 0; j < 16; j++)
|
||||||
|
{
|
||||||
|
/*lfi[i].lim[j] = block_inside_limit;
|
||||||
|
lfi[i].mbflim[j] = filt_lvl+2;*/
|
||||||
|
/*lfi[i].flim[j] = filt_lvl;*/
|
||||||
|
lfi[i].thr[j] = HEVThresh;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
void vp8_loop_filter_frame_init(VP8_COMMON *cm,
|
|
||||||
MACROBLOCKD *mbd,
|
int vp8_adjust_mb_lf_value(MACROBLOCKD *mbd, int filter_level)
|
||||||
int default_filt_lvl)
|
|
||||||
{
|
{
|
||||||
int seg, /* segment number */
|
MB_MODE_INFO *mbmi = &mbd->mode_info_context->mbmi;
|
||||||
ref, /* index in ref_lf_deltas */
|
|
||||||
mode; /* index in mode_lf_deltas */
|
|
||||||
|
|
||||||
loop_filter_info_n *lfi = &cm->lf_info;
|
if (mbd->mode_ref_lf_delta_enabled)
|
||||||
|
|
||||||
/* update limits if sharpness has changed */
|
|
||||||
if(cm->last_sharpness_level != cm->sharpness_level)
|
|
||||||
{
|
{
|
||||||
vp8_loop_filter_update_sharpness(lfi, cm->sharpness_level);
|
|
||||||
cm->last_sharpness_level = cm->sharpness_level;
|
|
||||||
}
|
|
||||||
|
|
||||||
for(seg = 0; seg < MAX_MB_SEGMENTS; seg++)
|
|
||||||
{
|
|
||||||
int lvl_seg = default_filt_lvl;
|
|
||||||
int lvl_ref, lvl_mode;
|
|
||||||
|
|
||||||
/* Note the baseline filter values for each segment */
|
|
||||||
if (mbd->segmentation_enabled)
|
|
||||||
{
|
|
||||||
/* Abs value */
|
|
||||||
if (mbd->mb_segement_abs_delta == SEGMENT_ABSDATA)
|
|
||||||
{
|
|
||||||
lvl_seg = mbd->segment_feature_data[MB_LVL_ALT_LF][seg];
|
|
||||||
}
|
|
||||||
else /* Delta Value */
|
|
||||||
{
|
|
||||||
lvl_seg += mbd->segment_feature_data[MB_LVL_ALT_LF][seg];
|
|
||||||
lvl_seg = (lvl_seg > 0) ? ((lvl_seg > 63) ? 63: lvl_seg) : 0;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
if (!mbd->mode_ref_lf_delta_enabled)
|
|
||||||
{
|
|
||||||
/* we could get rid of this if we assume that deltas are set to
|
|
||||||
* zero when not in use; encoder always uses deltas
|
|
||||||
*/
|
|
||||||
vpx_memset(lfi->lvl[seg][0], lvl_seg, 4 * 4 );
|
|
||||||
continue;
|
|
||||||
}
|
|
||||||
|
|
||||||
lvl_ref = lvl_seg;
|
|
||||||
|
|
||||||
/* INTRA_FRAME */
|
|
||||||
ref = INTRA_FRAME;
|
|
||||||
|
|
||||||
/* Apply delta for reference frame */
|
/* Apply delta for reference frame */
|
||||||
lvl_ref += mbd->ref_lf_deltas[ref];
|
filter_level += mbd->ref_lf_deltas[mbmi->ref_frame];
|
||||||
|
|
||||||
/* Apply delta for Intra modes */
|
/* Apply delta for mode */
|
||||||
mode = 0; /* B_PRED */
|
if (mbmi->ref_frame == INTRA_FRAME)
|
||||||
/* Only the split mode BPRED has a further special case */
|
|
||||||
lvl_mode = lvl_ref + mbd->mode_lf_deltas[mode];
|
|
||||||
lvl_mode = (lvl_mode > 0) ? (lvl_mode > 63 ? 63 : lvl_mode) : 0; /* clamp */
|
|
||||||
|
|
||||||
lfi->lvl[seg][ref][mode] = lvl_mode;
|
|
||||||
|
|
||||||
mode = 1; /* all the rest of Intra modes */
|
|
||||||
lvl_mode = (lvl_ref > 0) ? (lvl_ref > 63 ? 63 : lvl_ref) : 0; /* clamp */
|
|
||||||
lfi->lvl[seg][ref][mode] = lvl_mode;
|
|
||||||
|
|
||||||
/* LAST, GOLDEN, ALT */
|
|
||||||
for(ref = 1; ref < MAX_REF_FRAMES; ref++)
|
|
||||||
{
|
{
|
||||||
int lvl_ref = lvl_seg;
|
/* Only the split mode BPRED has a further special case */
|
||||||
|
if (mbmi->mode == B_PRED)
|
||||||
/* Apply delta for reference frame */
|
filter_level += mbd->mode_lf_deltas[0];
|
||||||
lvl_ref += mbd->ref_lf_deltas[ref];
|
|
||||||
|
|
||||||
/* Apply delta for Inter modes */
|
|
||||||
for (mode = 1; mode < 4; mode++)
|
|
||||||
{
|
|
||||||
lvl_mode = lvl_ref + mbd->mode_lf_deltas[mode];
|
|
||||||
lvl_mode = (lvl_mode > 0) ? (lvl_mode > 63 ? 63 : lvl_mode) : 0; /* clamp */
|
|
||||||
|
|
||||||
lfi->lvl[seg][ref][mode] = lvl_mode;
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
/* Zero motion mode */
|
||||||
|
if (mbmi->mode == ZEROMV)
|
||||||
|
filter_level += mbd->mode_lf_deltas[1];
|
||||||
|
|
||||||
|
/* Split MB motion mode */
|
||||||
|
else if (mbmi->mode == SPLITMV)
|
||||||
|
filter_level += mbd->mode_lf_deltas[3];
|
||||||
|
|
||||||
|
/* All other inter motion modes (Nearest, Near, New) */
|
||||||
|
else
|
||||||
|
filter_level += mbd->mode_lf_deltas[2];
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Range check */
|
||||||
|
if (filter_level > MAX_LOOP_FILTER)
|
||||||
|
filter_level = MAX_LOOP_FILTER;
|
||||||
|
else if (filter_level < 0)
|
||||||
|
filter_level = 0;
|
||||||
}
|
}
|
||||||
|
return filter_level;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
void vp8_loop_filter_frame
|
void vp8_loop_filter_frame
|
||||||
(
|
(
|
||||||
VP8_COMMON *cm,
|
VP8_COMMON *cm,
|
||||||
MACROBLOCKD *mbd
|
MACROBLOCKD *mbd,
|
||||||
|
int default_filt_lvl
|
||||||
)
|
)
|
||||||
{
|
{
|
||||||
YV12_BUFFER_CONFIG *post = cm->frame_to_show;
|
YV12_BUFFER_CONFIG *post = cm->frame_to_show;
|
||||||
loop_filter_info_n *lfi_n = &cm->lf_info;
|
loop_filter_info *lfi = cm->lf_info;
|
||||||
loop_filter_info lfi;
|
|
||||||
|
|
||||||
FRAME_TYPE frame_type = cm->frame_type;
|
FRAME_TYPE frame_type = cm->frame_type;
|
||||||
|
|
||||||
int mb_row;
|
int mb_row;
|
||||||
int mb_col;
|
int mb_col;
|
||||||
|
|
||||||
int filter_level;
|
|
||||||
|
|
||||||
|
int baseline_filter_level[MAX_MB_SEGMENTS];
|
||||||
|
int filter_level;
|
||||||
|
int alt_flt_enabled = mbd->segmentation_enabled;
|
||||||
|
|
||||||
|
int i;
|
||||||
unsigned char *y_ptr, *u_ptr, *v_ptr;
|
unsigned char *y_ptr, *u_ptr, *v_ptr;
|
||||||
|
|
||||||
/* Point at base of Mb MODE_INFO list */
|
mbd->mode_info_context = cm->mi; /* Point at base of Mb MODE_INFO list */
|
||||||
const MODE_INFO *mode_info_context = cm->mi;
|
|
||||||
|
/* Note the baseline filter values for each segment */
|
||||||
|
if (alt_flt_enabled)
|
||||||
|
{
|
||||||
|
for (i = 0; i < MAX_MB_SEGMENTS; i++)
|
||||||
|
{
|
||||||
|
/* Abs value */
|
||||||
|
if (mbd->mb_segement_abs_delta == SEGMENT_ABSDATA)
|
||||||
|
baseline_filter_level[i] = mbd->segment_feature_data[MB_LVL_ALT_LF][i];
|
||||||
|
/* Delta Value */
|
||||||
|
else
|
||||||
|
{
|
||||||
|
baseline_filter_level[i] = default_filt_lvl + mbd->segment_feature_data[MB_LVL_ALT_LF][i];
|
||||||
|
baseline_filter_level[i] = (baseline_filter_level[i] >= 0) ? ((baseline_filter_level[i] <= MAX_LOOP_FILTER) ? baseline_filter_level[i] : MAX_LOOP_FILTER) : 0; /* Clamp to valid range */
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
for (i = 0; i < MAX_MB_SEGMENTS; i++)
|
||||||
|
baseline_filter_level[i] = default_filt_lvl;
|
||||||
|
}
|
||||||
|
|
||||||
/* Initialize the loop filter for this frame. */
|
/* Initialize the loop filter for this frame. */
|
||||||
vp8_loop_filter_frame_init(cm, mbd, cm->filter_level);
|
if ((cm->last_filter_type != cm->filter_type) || (cm->last_sharpness_level != cm->sharpness_level))
|
||||||
|
vp8_init_loop_filter(cm);
|
||||||
|
else if (frame_type != cm->last_frame_type)
|
||||||
|
vp8_frame_init_loop_filter(lfi, frame_type);
|
||||||
|
|
||||||
/* Set up the buffer pointers */
|
/* Set up the buffer pointers */
|
||||||
y_ptr = post->y_buffer;
|
y_ptr = post->y_buffer;
|
||||||
@@ -314,108 +352,101 @@ void vp8_loop_filter_frame
|
|||||||
{
|
{
|
||||||
for (mb_col = 0; mb_col < cm->mb_cols; mb_col++)
|
for (mb_col = 0; mb_col < cm->mb_cols; mb_col++)
|
||||||
{
|
{
|
||||||
int skip_lf = (mode_info_context->mbmi.mode != B_PRED &&
|
int Segment = (alt_flt_enabled) ? mbd->mode_info_context->mbmi.segment_id : 0;
|
||||||
mode_info_context->mbmi.mode != SPLITMV &&
|
|
||||||
mode_info_context->mbmi.mb_skip_coeff);
|
|
||||||
|
|
||||||
const int mode_index = lfi_n->mode_lf_lut[mode_info_context->mbmi.mode];
|
filter_level = baseline_filter_level[Segment];
|
||||||
const int seg = mode_info_context->mbmi.segment_id;
|
|
||||||
const int ref_frame = mode_info_context->mbmi.ref_frame;
|
|
||||||
|
|
||||||
filter_level = lfi_n->lvl[seg][ref_frame][mode_index];
|
/* Distance of Mb to the various image edges.
|
||||||
|
* These specified to 8th pel as they are always compared to values that are in 1/8th pel units
|
||||||
|
* Apply any context driven MB level adjustment
|
||||||
|
*/
|
||||||
|
filter_level = vp8_adjust_mb_lf_value(mbd, filter_level);
|
||||||
|
|
||||||
if (filter_level)
|
if (filter_level)
|
||||||
{
|
{
|
||||||
if (cm->filter_type == NORMAL_LOOPFILTER)
|
if (mb_col > 0)
|
||||||
{
|
cm->lf_mbv(y_ptr, u_ptr, v_ptr, post->y_stride, post->uv_stride, &lfi[filter_level], cm->simpler_lpf);
|
||||||
const int hev_index = lfi_n->hev_thr_lut[frame_type][filter_level];
|
|
||||||
lfi.mblim = lfi_n->mblim[filter_level];
|
|
||||||
lfi.blim = lfi_n->blim[filter_level];
|
|
||||||
lfi.lim = lfi_n->lim[filter_level];
|
|
||||||
lfi.hev_thr = lfi_n->hev_thr[hev_index];
|
|
||||||
|
|
||||||
if (mb_col > 0)
|
if (mbd->mode_info_context->mbmi.dc_diff > 0)
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, normal_mb_v)
|
cm->lf_bv(y_ptr, u_ptr, v_ptr, post->y_stride, post->uv_stride, &lfi[filter_level], cm->simpler_lpf);
|
||||||
(y_ptr, u_ptr, v_ptr, post->y_stride, post->uv_stride, &lfi);
|
|
||||||
|
|
||||||
if (!skip_lf)
|
/* don't apply across umv border */
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, normal_b_v)
|
if (mb_row > 0)
|
||||||
(y_ptr, u_ptr, v_ptr, post->y_stride, post->uv_stride, &lfi);
|
cm->lf_mbh(y_ptr, u_ptr, v_ptr, post->y_stride, post->uv_stride, &lfi[filter_level], cm->simpler_lpf);
|
||||||
|
|
||||||
/* don't apply across umv border */
|
if (mbd->mode_info_context->mbmi.dc_diff > 0)
|
||||||
if (mb_row > 0)
|
cm->lf_bh(y_ptr, u_ptr, v_ptr, post->y_stride, post->uv_stride, &lfi[filter_level], cm->simpler_lpf);
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, normal_mb_h)
|
|
||||||
(y_ptr, u_ptr, v_ptr, post->y_stride, post->uv_stride, &lfi);
|
|
||||||
|
|
||||||
if (!skip_lf)
|
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, normal_b_h)
|
|
||||||
(y_ptr, u_ptr, v_ptr, post->y_stride, post->uv_stride, &lfi);
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
if (mb_col > 0)
|
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, simple_mb_v)
|
|
||||||
(y_ptr, post->y_stride, lfi_n->mblim[filter_level]);
|
|
||||||
|
|
||||||
if (!skip_lf)
|
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, simple_b_v)
|
|
||||||
(y_ptr, post->y_stride, lfi_n->blim[filter_level]);
|
|
||||||
|
|
||||||
/* don't apply across umv border */
|
|
||||||
if (mb_row > 0)
|
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, simple_mb_h)
|
|
||||||
(y_ptr, post->y_stride, lfi_n->mblim[filter_level]);
|
|
||||||
|
|
||||||
if (!skip_lf)
|
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, simple_b_h)
|
|
||||||
(y_ptr, post->y_stride, lfi_n->blim[filter_level]);
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
y_ptr += 16;
|
y_ptr += 16;
|
||||||
u_ptr += 8;
|
u_ptr += 8;
|
||||||
v_ptr += 8;
|
v_ptr += 8;
|
||||||
|
|
||||||
mode_info_context++; /* step to next MB */
|
mbd->mode_info_context++; /* step to next MB */
|
||||||
}
|
}
|
||||||
|
|
||||||
y_ptr += post->y_stride * 16 - post->y_width;
|
y_ptr += post->y_stride * 16 - post->y_width;
|
||||||
u_ptr += post->uv_stride * 8 - post->uv_width;
|
u_ptr += post->uv_stride * 8 - post->uv_width;
|
||||||
v_ptr += post->uv_stride * 8 - post->uv_width;
|
v_ptr += post->uv_stride * 8 - post->uv_width;
|
||||||
|
|
||||||
mode_info_context++; /* Skip border mb */
|
mbd->mode_info_context++; /* Skip border mb */
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
void vp8_loop_filter_frame_yonly
|
void vp8_loop_filter_frame_yonly
|
||||||
(
|
(
|
||||||
VP8_COMMON *cm,
|
VP8_COMMON *cm,
|
||||||
MACROBLOCKD *mbd,
|
MACROBLOCKD *mbd,
|
||||||
int default_filt_lvl
|
int default_filt_lvl,
|
||||||
|
int sharpness_lvl
|
||||||
)
|
)
|
||||||
{
|
{
|
||||||
YV12_BUFFER_CONFIG *post = cm->frame_to_show;
|
YV12_BUFFER_CONFIG *post = cm->frame_to_show;
|
||||||
|
|
||||||
|
int i;
|
||||||
unsigned char *y_ptr;
|
unsigned char *y_ptr;
|
||||||
int mb_row;
|
int mb_row;
|
||||||
int mb_col;
|
int mb_col;
|
||||||
|
|
||||||
loop_filter_info_n *lfi_n = &cm->lf_info;
|
loop_filter_info *lfi = cm->lf_info;
|
||||||
loop_filter_info lfi;
|
int baseline_filter_level[MAX_MB_SEGMENTS];
|
||||||
|
|
||||||
int filter_level;
|
int filter_level;
|
||||||
|
int alt_flt_enabled = mbd->segmentation_enabled;
|
||||||
FRAME_TYPE frame_type = cm->frame_type;
|
FRAME_TYPE frame_type = cm->frame_type;
|
||||||
|
|
||||||
/* Point at base of Mb MODE_INFO list */
|
(void) sharpness_lvl;
|
||||||
const MODE_INFO *mode_info_context = cm->mi;
|
|
||||||
|
|
||||||
#if 0
|
/*MODE_INFO * this_mb_mode_info = cm->mi;*/ /* Point at base of Mb MODE_INFO list */
|
||||||
if(default_filt_lvl == 0) /* no filter applied */
|
mbd->mode_info_context = cm->mi; /* Point at base of Mb MODE_INFO list */
|
||||||
return;
|
|
||||||
#endif
|
/* Note the baseline filter values for each segment */
|
||||||
|
if (alt_flt_enabled)
|
||||||
|
{
|
||||||
|
for (i = 0; i < MAX_MB_SEGMENTS; i++)
|
||||||
|
{
|
||||||
|
/* Abs value */
|
||||||
|
if (mbd->mb_segement_abs_delta == SEGMENT_ABSDATA)
|
||||||
|
baseline_filter_level[i] = mbd->segment_feature_data[MB_LVL_ALT_LF][i];
|
||||||
|
/* Delta Value */
|
||||||
|
else
|
||||||
|
{
|
||||||
|
baseline_filter_level[i] = default_filt_lvl + mbd->segment_feature_data[MB_LVL_ALT_LF][i];
|
||||||
|
baseline_filter_level[i] = (baseline_filter_level[i] >= 0) ? ((baseline_filter_level[i] <= MAX_LOOP_FILTER) ? baseline_filter_level[i] : MAX_LOOP_FILTER) : 0; /* Clamp to valid range */
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
for (i = 0; i < MAX_MB_SEGMENTS; i++)
|
||||||
|
baseline_filter_level[i] = default_filt_lvl;
|
||||||
|
}
|
||||||
|
|
||||||
/* Initialize the loop filter for this frame. */
|
/* Initialize the loop filter for this frame. */
|
||||||
vp8_loop_filter_frame_init( cm, mbd, default_filt_lvl);
|
if ((cm->last_filter_type != cm->filter_type) || (cm->last_sharpness_level != cm->sharpness_level))
|
||||||
|
vp8_init_loop_filter(cm);
|
||||||
|
else if (frame_type != cm->last_frame_type)
|
||||||
|
vp8_frame_init_loop_filter(lfi, frame_type);
|
||||||
|
|
||||||
/* Set up the buffer pointers */
|
/* Set up the buffer pointers */
|
||||||
y_ptr = post->y_buffer;
|
y_ptr = post->y_buffer;
|
||||||
@@ -425,106 +456,72 @@ void vp8_loop_filter_frame_yonly
|
|||||||
{
|
{
|
||||||
for (mb_col = 0; mb_col < cm->mb_cols; mb_col++)
|
for (mb_col = 0; mb_col < cm->mb_cols; mb_col++)
|
||||||
{
|
{
|
||||||
int skip_lf = (mode_info_context->mbmi.mode != B_PRED &&
|
int Segment = (alt_flt_enabled) ? mbd->mode_info_context->mbmi.segment_id : 0;
|
||||||
mode_info_context->mbmi.mode != SPLITMV &&
|
filter_level = baseline_filter_level[Segment];
|
||||||
mode_info_context->mbmi.mb_skip_coeff);
|
|
||||||
|
|
||||||
const int mode_index = lfi_n->mode_lf_lut[mode_info_context->mbmi.mode];
|
/* Apply any context driven MB level adjustment */
|
||||||
const int seg = mode_info_context->mbmi.segment_id;
|
filter_level = vp8_adjust_mb_lf_value(mbd, filter_level);
|
||||||
const int ref_frame = mode_info_context->mbmi.ref_frame;
|
|
||||||
|
|
||||||
filter_level = lfi_n->lvl[seg][ref_frame][mode_index];
|
|
||||||
|
|
||||||
if (filter_level)
|
if (filter_level)
|
||||||
{
|
{
|
||||||
if (cm->filter_type == NORMAL_LOOPFILTER)
|
if (mb_col > 0)
|
||||||
{
|
cm->lf_mbv(y_ptr, 0, 0, post->y_stride, 0, &lfi[filter_level], 0);
|
||||||
const int hev_index = lfi_n->hev_thr_lut[frame_type][filter_level];
|
|
||||||
lfi.mblim = lfi_n->mblim[filter_level];
|
|
||||||
lfi.blim = lfi_n->blim[filter_level];
|
|
||||||
lfi.lim = lfi_n->lim[filter_level];
|
|
||||||
lfi.hev_thr = lfi_n->hev_thr[hev_index];
|
|
||||||
|
|
||||||
if (mb_col > 0)
|
if (mbd->mode_info_context->mbmi.dc_diff > 0)
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, normal_mb_v)
|
cm->lf_bv(y_ptr, 0, 0, post->y_stride, 0, &lfi[filter_level], 0);
|
||||||
(y_ptr, 0, 0, post->y_stride, 0, &lfi);
|
|
||||||
|
|
||||||
if (!skip_lf)
|
/* don't apply across umv border */
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, normal_b_v)
|
if (mb_row > 0)
|
||||||
(y_ptr, 0, 0, post->y_stride, 0, &lfi);
|
cm->lf_mbh(y_ptr, 0, 0, post->y_stride, 0, &lfi[filter_level], 0);
|
||||||
|
|
||||||
/* don't apply across umv border */
|
if (mbd->mode_info_context->mbmi.dc_diff > 0)
|
||||||
if (mb_row > 0)
|
cm->lf_bh(y_ptr, 0, 0, post->y_stride, 0, &lfi[filter_level], 0);
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, normal_mb_h)
|
|
||||||
(y_ptr, 0, 0, post->y_stride, 0, &lfi);
|
|
||||||
|
|
||||||
if (!skip_lf)
|
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, normal_b_h)
|
|
||||||
(y_ptr, 0, 0, post->y_stride, 0, &lfi);
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
if (mb_col > 0)
|
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, simple_mb_v)
|
|
||||||
(y_ptr, post->y_stride, lfi_n->mblim[filter_level]);
|
|
||||||
|
|
||||||
if (!skip_lf)
|
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, simple_b_v)
|
|
||||||
(y_ptr, post->y_stride, lfi_n->blim[filter_level]);
|
|
||||||
|
|
||||||
/* don't apply across umv border */
|
|
||||||
if (mb_row > 0)
|
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, simple_mb_h)
|
|
||||||
(y_ptr, post->y_stride, lfi_n->mblim[filter_level]);
|
|
||||||
|
|
||||||
if (!skip_lf)
|
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, simple_b_h)
|
|
||||||
(y_ptr, post->y_stride, lfi_n->blim[filter_level]);
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
y_ptr += 16;
|
y_ptr += 16;
|
||||||
mode_info_context ++; /* step to next MB */
|
mbd->mode_info_context ++; /* step to next MB */
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
y_ptr += post->y_stride * 16 - post->y_width;
|
y_ptr += post->y_stride * 16 - post->y_width;
|
||||||
mode_info_context ++; /* Skip border mb */
|
mbd->mode_info_context ++; /* Skip border mb */
|
||||||
}
|
}
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
void vp8_loop_filter_partial_frame
|
void vp8_loop_filter_partial_frame
|
||||||
(
|
(
|
||||||
VP8_COMMON *cm,
|
VP8_COMMON *cm,
|
||||||
MACROBLOCKD *mbd,
|
MACROBLOCKD *mbd,
|
||||||
int default_filt_lvl
|
int default_filt_lvl,
|
||||||
|
int sharpness_lvl,
|
||||||
|
int Fraction
|
||||||
)
|
)
|
||||||
{
|
{
|
||||||
YV12_BUFFER_CONFIG *post = cm->frame_to_show;
|
YV12_BUFFER_CONFIG *post = cm->frame_to_show;
|
||||||
|
|
||||||
|
int i;
|
||||||
unsigned char *y_ptr;
|
unsigned char *y_ptr;
|
||||||
int mb_row;
|
int mb_row;
|
||||||
int mb_col;
|
int mb_col;
|
||||||
|
/*int mb_rows = post->y_height >> 4;*/
|
||||||
int mb_cols = post->y_width >> 4;
|
int mb_cols = post->y_width >> 4;
|
||||||
|
|
||||||
int linestocopy, i;
|
int linestocopy;
|
||||||
|
|
||||||
loop_filter_info_n *lfi_n = &cm->lf_info;
|
|
||||||
loop_filter_info lfi;
|
|
||||||
|
|
||||||
|
loop_filter_info *lfi = cm->lf_info;
|
||||||
|
int baseline_filter_level[MAX_MB_SEGMENTS];
|
||||||
int filter_level;
|
int filter_level;
|
||||||
int alt_flt_enabled = mbd->segmentation_enabled;
|
int alt_flt_enabled = mbd->segmentation_enabled;
|
||||||
FRAME_TYPE frame_type = cm->frame_type;
|
FRAME_TYPE frame_type = cm->frame_type;
|
||||||
|
|
||||||
const MODE_INFO *mode_info_context;
|
(void) sharpness_lvl;
|
||||||
|
|
||||||
int lvl_seg[MAX_MB_SEGMENTS];
|
/*MODE_INFO * this_mb_mode_info = cm->mi + (post->y_height>>5) * (mb_cols + 1);*/ /* Point at base of Mb MODE_INFO list */
|
||||||
|
mbd->mode_info_context = cm->mi + (post->y_height >> 5) * (mb_cols + 1); /* Point at base of Mb MODE_INFO list */
|
||||||
|
|
||||||
mode_info_context = cm->mi + (post->y_height >> 5) * (mb_cols + 1);
|
linestocopy = (post->y_height >> (4 + Fraction));
|
||||||
|
|
||||||
/* 3 is a magic number. 4 is probably magic too */
|
|
||||||
linestocopy = (post->y_height >> (4 + 3));
|
|
||||||
|
|
||||||
if (linestocopy < 1)
|
if (linestocopy < 1)
|
||||||
linestocopy = 1;
|
linestocopy = 1;
|
||||||
@@ -532,27 +529,32 @@ void vp8_loop_filter_partial_frame
|
|||||||
linestocopy <<= 4;
|
linestocopy <<= 4;
|
||||||
|
|
||||||
/* Note the baseline filter values for each segment */
|
/* Note the baseline filter values for each segment */
|
||||||
/* See vp8_loop_filter_frame_init. Rather than call that for each change
|
|
||||||
* to default_filt_lvl, copy the relevant calculation here.
|
|
||||||
*/
|
|
||||||
if (alt_flt_enabled)
|
if (alt_flt_enabled)
|
||||||
{
|
{
|
||||||
for (i = 0; i < MAX_MB_SEGMENTS; i++)
|
for (i = 0; i < MAX_MB_SEGMENTS; i++)
|
||||||
{ /* Abs value */
|
{
|
||||||
|
/* Abs value */
|
||||||
if (mbd->mb_segement_abs_delta == SEGMENT_ABSDATA)
|
if (mbd->mb_segement_abs_delta == SEGMENT_ABSDATA)
|
||||||
{
|
baseline_filter_level[i] = mbd->segment_feature_data[MB_LVL_ALT_LF][i];
|
||||||
lvl_seg[i] = mbd->segment_feature_data[MB_LVL_ALT_LF][i];
|
|
||||||
}
|
|
||||||
/* Delta Value */
|
/* Delta Value */
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
lvl_seg[i] = default_filt_lvl
|
baseline_filter_level[i] = default_filt_lvl + mbd->segment_feature_data[MB_LVL_ALT_LF][i];
|
||||||
+ mbd->segment_feature_data[MB_LVL_ALT_LF][i];
|
baseline_filter_level[i] = (baseline_filter_level[i] >= 0) ? ((baseline_filter_level[i] <= MAX_LOOP_FILTER) ? baseline_filter_level[i] : MAX_LOOP_FILTER) : 0; /* Clamp to valid range */
|
||||||
lvl_seg[i] = (lvl_seg[i] > 0) ?
|
|
||||||
((lvl_seg[i] > 63) ? 63: lvl_seg[i]) : 0;
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
for (i = 0; i < MAX_MB_SEGMENTS; i++)
|
||||||
|
baseline_filter_level[i] = default_filt_lvl;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Initialize the loop filter for this frame. */
|
||||||
|
if ((cm->last_filter_type != cm->filter_type) || (cm->last_sharpness_level != cm->sharpness_level))
|
||||||
|
vp8_init_loop_filter(cm);
|
||||||
|
else if (frame_type != cm->last_frame_type)
|
||||||
|
vp8_frame_init_loop_filter(lfi, frame_type);
|
||||||
|
|
||||||
/* Set up the buffer pointers */
|
/* Set up the buffer pointers */
|
||||||
y_ptr = post->y_buffer + (post->y_height >> 5) * 16 * post->y_stride;
|
y_ptr = post->y_buffer + (post->y_height >> 5) * 16 * post->y_stride;
|
||||||
@@ -562,64 +564,28 @@ void vp8_loop_filter_partial_frame
|
|||||||
{
|
{
|
||||||
for (mb_col = 0; mb_col < mb_cols; mb_col++)
|
for (mb_col = 0; mb_col < mb_cols; mb_col++)
|
||||||
{
|
{
|
||||||
int skip_lf = (mode_info_context->mbmi.mode != B_PRED &&
|
int Segment = (alt_flt_enabled) ? mbd->mode_info_context->mbmi.segment_id : 0;
|
||||||
mode_info_context->mbmi.mode != SPLITMV &&
|
filter_level = baseline_filter_level[Segment];
|
||||||
mode_info_context->mbmi.mb_skip_coeff);
|
|
||||||
|
|
||||||
if (alt_flt_enabled)
|
|
||||||
filter_level = lvl_seg[mode_info_context->mbmi.segment_id];
|
|
||||||
else
|
|
||||||
filter_level = default_filt_lvl;
|
|
||||||
|
|
||||||
if (filter_level)
|
if (filter_level)
|
||||||
{
|
{
|
||||||
if (cm->filter_type == NORMAL_LOOPFILTER)
|
if (mb_col > 0)
|
||||||
{
|
cm->lf_mbv(y_ptr, 0, 0, post->y_stride, 0, &lfi[filter_level], 0);
|
||||||
const int hev_index = lfi_n->hev_thr_lut[frame_type][filter_level];
|
|
||||||
lfi.mblim = lfi_n->mblim[filter_level];
|
|
||||||
lfi.blim = lfi_n->blim[filter_level];
|
|
||||||
lfi.lim = lfi_n->lim[filter_level];
|
|
||||||
lfi.hev_thr = lfi_n->hev_thr[hev_index];
|
|
||||||
|
|
||||||
if (mb_col > 0)
|
if (mbd->mode_info_context->mbmi.dc_diff > 0)
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, normal_mb_v)
|
cm->lf_bv(y_ptr, 0, 0, post->y_stride, 0, &lfi[filter_level], 0);
|
||||||
(y_ptr, 0, 0, post->y_stride, 0, &lfi);
|
|
||||||
|
|
||||||
if (!skip_lf)
|
cm->lf_mbh(y_ptr, 0, 0, post->y_stride, 0, &lfi[filter_level], 0);
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, normal_b_v)
|
|
||||||
(y_ptr, 0, 0, post->y_stride, 0, &lfi);
|
|
||||||
|
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, normal_mb_h)
|
if (mbd->mode_info_context->mbmi.dc_diff > 0)
|
||||||
(y_ptr, 0, 0, post->y_stride, 0, &lfi);
|
cm->lf_bh(y_ptr, 0, 0, post->y_stride, 0, &lfi[filter_level], 0);
|
||||||
|
|
||||||
if (!skip_lf)
|
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, normal_b_h)
|
|
||||||
(y_ptr, 0, 0, post->y_stride, 0, &lfi);
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
if (mb_col > 0)
|
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, simple_mb_v)
|
|
||||||
(y_ptr, post->y_stride, lfi_n->mblim[filter_level]);
|
|
||||||
|
|
||||||
if (!skip_lf)
|
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, simple_b_v)
|
|
||||||
(y_ptr, post->y_stride, lfi_n->blim[filter_level]);
|
|
||||||
|
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, simple_mb_h)
|
|
||||||
(y_ptr, post->y_stride, lfi_n->mblim[filter_level]);
|
|
||||||
|
|
||||||
if (!skip_lf)
|
|
||||||
LF_INVOKE(&cm->rtcd.loopfilter, simple_b_h)
|
|
||||||
(y_ptr, post->y_stride, lfi_n->blim[filter_level]);
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
y_ptr += 16;
|
y_ptr += 16;
|
||||||
mode_info_context += 1; /* step to next MB */
|
mbd->mode_info_context += 1; /* step to next MB */
|
||||||
}
|
}
|
||||||
|
|
||||||
y_ptr += post->y_stride * 16 - post->y_width;
|
y_ptr += post->y_stride * 16 - post->y_width;
|
||||||
mode_info_context += 1; /* Skip border mb */
|
mbd->mode_info_context += 1; /* Skip border mb */
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
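Review note: the three frame-level filter functions in this file diff each repeat the same per-segment baseline calculation. A minimal sketch of that calculation is given here for reference. The helper names `clamp_filter_level` and `segment_baseline_levels`, and the flattened parameter list, are hypothetical and are not part of either side of the diff; the constants mirror `MAX_LOOP_FILTER` (63, as defined in the header) and an assumed `MAX_MB_SEGMENTS` of 4.

```c
#define MAX_LOOP_FILTER 63
#define MAX_MB_SEGMENTS 4   /* assumed value, not shown in this diff */

/* Clamp a candidate loop filter level to [0, MAX_LOOP_FILTER]. */
static int clamp_filter_level(int level)
{
    if (level < 0)
        return 0;
    if (level > MAX_LOOP_FILTER)
        return MAX_LOOP_FILTER;
    return level;
}

/* Hypothetical helper: fill out[] with one baseline level per segment.
 *
 * segmentation_enabled  stands in for mbd->segmentation_enabled
 * abs_data              nonzero when segment data holds absolute values
 *                       (the SEGMENT_ABSDATA case in the diff)
 * seg_lf_data           per-segment loop filter values or deltas
 * default_filt_lvl      frame-level default
 */
static void segment_baseline_levels(int segmentation_enabled,
                                    int abs_data,
                                    const signed char *seg_lf_data,
                                    int default_filt_lvl,
                                    int out[MAX_MB_SEGMENTS])
{
    int i;

    for (i = 0; i < MAX_MB_SEGMENTS; i++)
    {
        if (!segmentation_enabled)
            out[i] = default_filt_lvl;          /* no per-segment override */
        else if (abs_data)
            out[i] = seg_lf_data[i];            /* absolute value, used as-is */
        else                                    /* delta: add, then clamp */
            out[i] = clamp_filter_level(default_filt_lvl + seg_lf_data[i]);
    }
}
```

As in the diffed code, only the delta path is clamped; an absolute segment value is taken as given.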
@@ -13,7 +13,6 @@
|
|||||||
#define loopfilter_h
|
#define loopfilter_h
|
||||||
|
|
||||||
#include "vpx_ports/mem.h"
|
#include "vpx_ports/mem.h"
|
||||||
#include "vpx_config.h"
|
|
||||||
|
|
||||||
#define MAX_LOOP_FILTER 63
|
#define MAX_LOOP_FILTER 63
|
||||||
|
|
||||||
@@ -23,45 +22,26 @@ typedef enum
|
|||||||
SIMPLE_LOOPFILTER = 1
|
SIMPLE_LOOPFILTER = 1
|
||||||
} LOOPFILTERTYPE;
|
} LOOPFILTERTYPE;
|
||||||
|
|
||||||
#if ARCH_ARM
|
/* FRK
|
||||||
#define SIMD_WIDTH 1
|
* Need to align this structure so when it is declared and
|
||||||
#else
|
|
||||||
#define SIMD_WIDTH 16
|
|
||||||
#endif
|
|
||||||
|
|
||||||
/* Need to align this structure so when it is declared and
|
|
||||||
* passed it can be loaded into vector registers.
|
* passed it can be loaded into vector registers.
|
||||||
*/
|
*/
|
||||||
typedef struct
|
typedef struct
|
||||||
{
|
{
|
||||||
DECLARE_ALIGNED(SIMD_WIDTH, unsigned char, mblim[MAX_LOOP_FILTER + 1][SIMD_WIDTH]);
|
DECLARE_ALIGNED(16, signed char, lim[16]);
|
||||||
DECLARE_ALIGNED(SIMD_WIDTH, unsigned char, blim[MAX_LOOP_FILTER + 1][SIMD_WIDTH]);
|
DECLARE_ALIGNED(16, signed char, flim[16]);
|
||||||
DECLARE_ALIGNED(SIMD_WIDTH, unsigned char, lim[MAX_LOOP_FILTER + 1][SIMD_WIDTH]);
|
DECLARE_ALIGNED(16, signed char, thr[16]);
|
||||||
DECLARE_ALIGNED(SIMD_WIDTH, unsigned char, hev_thr[4][SIMD_WIDTH]);
|
DECLARE_ALIGNED(16, signed char, mbflim[16]);
|
||||||
unsigned char lvl[4][4][4];
|
|
||||||
unsigned char hev_thr_lut[2][MAX_LOOP_FILTER + 1];
|
|
||||||
unsigned char mode_lf_lut[10];
|
|
||||||
} loop_filter_info_n;
|
|
||||||
|
|
||||||
typedef struct
|
|
||||||
{
|
|
||||||
const unsigned char * mblim;
|
|
||||||
const unsigned char * blim;
|
|
||||||
const unsigned char * lim;
|
|
||||||
const unsigned char * hev_thr;
|
|
||||||
} loop_filter_info;
|
} loop_filter_info;
|
||||||
|
|
||||||
|
|
||||||
#define prototype_loopfilter(sym) \
|
#define prototype_loopfilter(sym) \
|
||||||
void sym(unsigned char *src, int pitch, const unsigned char *blimit,\
|
void sym(unsigned char *src, int pitch, const signed char *flimit,\
|
||||||
const unsigned char *limit, const unsigned char *thresh, int count)
|
const signed char *limit, const signed char *thresh, int count)
|
||||||
|
|
||||||
#define prototype_loopfilter_block(sym) \
|
#define prototype_loopfilter_block(sym) \
|
||||||
void sym(unsigned char *y, unsigned char *u, unsigned char *v, \
|
void sym(unsigned char *y, unsigned char *u, unsigned char *v,\
|
||||||
int ystride, int uv_stride, loop_filter_info *lfi)
|
int ystride, int uv_stride, loop_filter_info *lfi, int simpler)
|
||||||
|
|
||||||
#define prototype_simple_loopfilter(sym) \
|
|
||||||
void sym(unsigned char *y, int ystride, const unsigned char *blimit)
|
|
||||||
|
|
||||||
#if ARCH_X86 || ARCH_X86_64
|
#if ARCH_X86 || ARCH_X86_64
|
||||||
#include "x86/loopfilter_x86.h"
|
#include "x86/loopfilter_x86.h"
|
||||||
@@ -91,39 +71,38 @@ extern prototype_loopfilter_block(vp8_lf_normal_mb_h);
|
|||||||
#endif
|
#endif
|
||||||
extern prototype_loopfilter_block(vp8_lf_normal_b_h);
|
extern prototype_loopfilter_block(vp8_lf_normal_b_h);
|
||||||
|
|
||||||
|
|
||||||
#ifndef vp8_lf_simple_mb_v
|
#ifndef vp8_lf_simple_mb_v
|
||||||
#define vp8_lf_simple_mb_v vp8_loop_filter_simple_vertical_edge_c
|
#define vp8_lf_simple_mb_v vp8_loop_filter_mbvs_c
|
||||||
#endif
|
#endif
|
||||||
extern prototype_simple_loopfilter(vp8_lf_simple_mb_v);
|
extern prototype_loopfilter_block(vp8_lf_simple_mb_v);
|
||||||
|
|
||||||
#ifndef vp8_lf_simple_b_v
|
#ifndef vp8_lf_simple_b_v
|
||||||
#define vp8_lf_simple_b_v vp8_loop_filter_bvs_c
|
#define vp8_lf_simple_b_v vp8_loop_filter_bvs_c
|
||||||
#endif
|
#endif
|
||||||
extern prototype_simple_loopfilter(vp8_lf_simple_b_v);
|
extern prototype_loopfilter_block(vp8_lf_simple_b_v);
|
||||||
|
|
||||||
#ifndef vp8_lf_simple_mb_h
|
#ifndef vp8_lf_simple_mb_h
|
||||||
#define vp8_lf_simple_mb_h vp8_loop_filter_simple_horizontal_edge_c
|
#define vp8_lf_simple_mb_h vp8_loop_filter_mbhs_c
|
||||||
#endif
|
#endif
|
||||||
extern prototype_simple_loopfilter(vp8_lf_simple_mb_h);
|
extern prototype_loopfilter_block(vp8_lf_simple_mb_h);
|
||||||
|
|
||||||
#ifndef vp8_lf_simple_b_h
|
#ifndef vp8_lf_simple_b_h
|
||||||
#define vp8_lf_simple_b_h vp8_loop_filter_bhs_c
|
#define vp8_lf_simple_b_h vp8_loop_filter_bhs_c
|
||||||
#endif
|
#endif
|
||||||
extern prototype_simple_loopfilter(vp8_lf_simple_b_h);
|
extern prototype_loopfilter_block(vp8_lf_simple_b_h);
|
||||||
|
|
||||||
typedef prototype_loopfilter_block((*vp8_lf_block_fn_t));
|
typedef prototype_loopfilter_block((*vp8_lf_block_fn_t));
|
||||||
typedef prototype_simple_loopfilter((*vp8_slf_block_fn_t));
|
|
||||||
|
|
||||||
typedef struct
|
typedef struct
|
||||||
{
|
{
|
||||||
vp8_lf_block_fn_t normal_mb_v;
|
vp8_lf_block_fn_t normal_mb_v;
|
||||||
vp8_lf_block_fn_t normal_b_v;
|
vp8_lf_block_fn_t normal_b_v;
|
||||||
vp8_lf_block_fn_t normal_mb_h;
|
vp8_lf_block_fn_t normal_mb_h;
|
||||||
vp8_lf_block_fn_t normal_b_h;
|
vp8_lf_block_fn_t normal_b_h;
|
||||||
vp8_slf_block_fn_t simple_mb_v;
|
vp8_lf_block_fn_t simple_mb_v;
|
||||||
vp8_slf_block_fn_t simple_b_v;
|
vp8_lf_block_fn_t simple_b_v;
|
||||||
vp8_slf_block_fn_t simple_mb_h;
|
vp8_lf_block_fn_t simple_mb_h;
|
||||||
vp8_slf_block_fn_t simple_b_h;
|
vp8_lf_block_fn_t simple_b_h;
|
||||||
} vp8_loopfilter_rtcd_vtable_t;
|
} vp8_loopfilter_rtcd_vtable_t;
|
||||||
|
|
||||||
#if CONFIG_RUNTIME_CPU_DETECT
|
#if CONFIG_RUNTIME_CPU_DETECT
|
||||||
@@ -136,33 +115,10 @@ typedef void loop_filter_uvfunction
|
|||||||
(
|
(
|
||||||
unsigned char *u, /* source pointer */
|
unsigned char *u, /* source pointer */
|
||||||
int p, /* pitch */
|
int p, /* pitch */
|
||||||
const unsigned char *blimit,
|
const signed char *flimit,
|
||||||
const unsigned char *limit,
|
const signed char *limit,
|
||||||
const unsigned char *thresh,
|
const signed char *thresh,
|
||||||
unsigned char *v
|
unsigned char *v
|
||||||
);
|
);
|
||||||
|
|
||||||
/* assorted loopfilter functions which get used elsewhere */
|
|
||||||
struct VP8Common;
|
|
||||||
struct MacroBlockD;
|
|
||||||
|
|
||||||
void vp8_loop_filter_init(struct VP8Common *cm);
|
|
||||||
|
|
||||||
void vp8_loop_filter_frame_init(struct VP8Common *cm,
|
|
||||||
struct MacroBlockD *mbd,
|
|
||||||
int default_filt_lvl);
|
|
||||||
|
|
||||||
void vp8_loop_filter_frame(struct VP8Common *cm, struct MacroBlockD *mbd);
|
|
||||||
|
|
||||||
void vp8_loop_filter_partial_frame(struct VP8Common *cm,
|
|
||||||
struct MacroBlockD *mbd,
|
|
||||||
int default_filt_lvl);
|
|
||||||
|
|
||||||
void vp8_loop_filter_frame_yonly(struct VP8Common *cm,
|
|
||||||
struct MacroBlockD *mbd,
|
|
||||||
int default_filt_lvl);
|
|
||||||
|
|
||||||
void vp8_loop_filter_update_sharpness(loop_filter_info_n *lfi,
|
|
||||||
int sharpness_lvl);
|
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
|
|||||||
@@ -24,9 +24,8 @@ static __inline signed char vp8_signed_char_clamp(int t)
|
|||||||
|
|
||||||
|
|
||||||
/* should we apply any filter at all ( 11111111 yes, 00000000 no) */
|
/* should we apply any filter at all ( 11111111 yes, 00000000 no) */
|
||||||
static __inline signed char vp8_filter_mask(uc limit, uc blimit,
|
static __inline signed char vp8_filter_mask(signed char limit, signed char flimit,
|
||||||
uc p3, uc p2, uc p1, uc p0,
|
uc p3, uc p2, uc p1, uc p0, uc q0, uc q1, uc q2, uc q3)
|
||||||
uc q0, uc q1, uc q2, uc q3)
|
|
||||||
{
|
{
|
||||||
signed char mask = 0;
|
signed char mask = 0;
|
||||||
mask |= (abs(p3 - p2) > limit) * -1;
|
mask |= (abs(p3 - p2) > limit) * -1;
|
||||||
@@ -35,13 +34,13 @@ static __inline signed char vp8_filter_mask(uc limit, uc blimit,
|
|||||||
mask |= (abs(q1 - q0) > limit) * -1;
|
mask |= (abs(q1 - q0) > limit) * -1;
|
||||||
mask |= (abs(q2 - q1) > limit) * -1;
|
mask |= (abs(q2 - q1) > limit) * -1;
|
||||||
mask |= (abs(q3 - q2) > limit) * -1;
|
mask |= (abs(q3 - q2) > limit) * -1;
|
||||||
mask |= (abs(p0 - q0) * 2 + abs(p1 - q1) / 2 > blimit) * -1;
|
mask |= (abs(p0 - q0) * 2 + abs(p1 - q1) / 2 > flimit * 2 + limit) * -1;
|
||||||
mask = ~mask;
|
mask = ~mask;
|
||||||
return mask;
|
return mask;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* is there high variance internal edge ( 11111111 yes, 00000000 no) */
|
/* is there high variance internal edge ( 11111111 yes, 00000000 no) */
|
||||||
static __inline signed char vp8_hevmask(uc thresh, uc p1, uc p0, uc q0, uc q1)
|
static __inline signed char vp8_hevmask(signed char thresh, uc p1, uc p0, uc q0, uc q1)
|
||||||
{
|
{
|
||||||
signed char hev = 0;
|
signed char hev = 0;
|
||||||
hev |= (abs(p1 - p0) > thresh) * -1;
|
hev |= (abs(p1 - p0) > thresh) * -1;
|
||||||
@@ -49,8 +48,7 @@ static __inline signed char vp8_hevmask(uc thresh, uc p1, uc p0, uc q0, uc q1)
|
|||||||
return hev;
|
return hev;
|
||||||
}
|
}
|
||||||
|
|
||||||
static __inline void vp8_filter(signed char mask, uc hev, uc *op1,
|
static __inline void vp8_filter(signed char mask, signed char hev, uc *op1, uc *op0, uc *oq0, uc *oq1)
|
||||||
uc *op0, uc *oq0, uc *oq1)
|
|
||||||
|
|
||||||
{
|
{
|
||||||
signed char ps0, qs0;
|
signed char ps0, qs0;
|
||||||
@@ -100,9 +98,9 @@ void vp8_loop_filter_horizontal_edge_c
|
|||||||
(
|
(
|
||||||
unsigned char *s,
|
unsigned char *s,
|
||||||
int p, /* pitch */
|
int p, /* pitch */
|
||||||
const unsigned char *blimit,
|
const signed char *flimit,
|
||||||
const unsigned char *limit,
|
const signed char *limit,
|
||||||
const unsigned char *thresh,
|
const signed char *thresh,
|
||||||
int count
|
int count
|
||||||
)
|
)
|
||||||
{
|
{
|
||||||
@@ -115,11 +113,11 @@ void vp8_loop_filter_horizontal_edge_c
|
|||||||
*/
|
*/
|
||||||
do
|
do
|
||||||
{
|
{
|
||||||
mask = vp8_filter_mask(limit[0], blimit[0],
|
mask = vp8_filter_mask(limit[i], flimit[i],
|
||||||
s[-4*p], s[-3*p], s[-2*p], s[-1*p],
|
s[-4*p], s[-3*p], s[-2*p], s[-1*p],
|
||||||
s[0*p], s[1*p], s[2*p], s[3*p]);
|
s[0*p], s[1*p], s[2*p], s[3*p]);
|
||||||
|
|
||||||
hev = vp8_hevmask(thresh[0], s[-2*p], s[-1*p], s[0*p], s[1*p]);
|
hev = vp8_hevmask(thresh[i], s[-2*p], s[-1*p], s[0*p], s[1*p]);
|
||||||
|
|
||||||
vp8_filter(mask, hev, s - 2 * p, s - 1 * p, s, s + 1 * p);
|
vp8_filter(mask, hev, s - 2 * p, s - 1 * p, s, s + 1 * p);
|
||||||
|
|
||||||
@@ -132,9 +130,9 @@ void vp8_loop_filter_vertical_edge_c
|
|||||||
(
|
(
|
||||||
unsigned char *s,
|
unsigned char *s,
|
||||||
int p,
|
int p,
|
||||||
const unsigned char *blimit,
|
const signed char *flimit,
|
||||||
const unsigned char *limit,
|
const signed char *limit,
|
||||||
const unsigned char *thresh,
|
const signed char *thresh,
|
||||||
int count
|
int count
|
||||||
)
|
)
|
||||||
{
|
{
|
||||||
@@ -147,10 +145,10 @@ void vp8_loop_filter_vertical_edge_c
|
|||||||
*/
|
*/
|
||||||
do
|
do
|
||||||
{
|
{
|
||||||
mask = vp8_filter_mask(limit[0], blimit[0],
|
mask = vp8_filter_mask(limit[i], flimit[i],
|
||||||
s[-4], s[-3], s[-2], s[-1], s[0], s[1], s[2], s[3]);
|
s[-4], s[-3], s[-2], s[-1], s[0], s[1], s[2], s[3]);
|
||||||
|
|
||||||
hev = vp8_hevmask(thresh[0], s[-2], s[-1], s[0], s[1]);
|
hev = vp8_hevmask(thresh[i], s[-2], s[-1], s[0], s[1]);
|
||||||
|
|
||||||
vp8_filter(mask, hev, s - 2, s - 1, s, s + 1);
|
vp8_filter(mask, hev, s - 2, s - 1, s, s + 1);
|
||||||
|
|
||||||
@@ -159,7 +157,7 @@ void vp8_loop_filter_vertical_edge_c
|
|||||||
while (++i < count * 8);
|
while (++i < count * 8);
|
||||||
}
|
}
|
||||||
|
|
||||||
static __inline void vp8_mbfilter(signed char mask, uc hev,
|
static __inline void vp8_mbfilter(signed char mask, signed char hev,
|
||||||
uc *op2, uc *op1, uc *op0, uc *oq0, uc *oq1, uc *oq2)
|
uc *op2, uc *op1, uc *op0, uc *oq0, uc *oq1, uc *oq2)
|
||||||
{
|
{
|
||||||
signed char s, u;
|
signed char s, u;
|
||||||
@@ -218,9 +216,9 @@ void vp8_mbloop_filter_horizontal_edge_c
|
|||||||
(
|
(
|
||||||
unsigned char *s,
|
unsigned char *s,
|
||||||
int p,
|
int p,
|
||||||
const unsigned char *blimit,
|
const signed char *flimit,
|
||||||
const unsigned char *limit,
|
const signed char *limit,
|
||||||
const unsigned char *thresh,
|
const signed char *thresh,
|
||||||
int count
|
int count
|
||||||
)
|
)
|
||||||
{
|
{
|
||||||
@@ -234,11 +232,11 @@ void vp8_mbloop_filter_horizontal_edge_c
|
|||||||
do
|
do
|
||||||
{
|
{
|
||||||
|
|
||||||
mask = vp8_filter_mask(limit[0], blimit[0],
|
mask = vp8_filter_mask(limit[i], flimit[i],
|
||||||
s[-4*p], s[-3*p], s[-2*p], s[-1*p],
|
s[-4*p], s[-3*p], s[-2*p], s[-1*p],
|
||||||
s[0*p], s[1*p], s[2*p], s[3*p]);
|
s[0*p], s[1*p], s[2*p], s[3*p]);
|
||||||
|
|
||||||
hev = vp8_hevmask(thresh[0], s[-2*p], s[-1*p], s[0*p], s[1*p]);
|
hev = vp8_hevmask(thresh[i], s[-2*p], s[-1*p], s[0*p], s[1*p]);
|
||||||
|
|
||||||
vp8_mbfilter(mask, hev, s - 3 * p, s - 2 * p, s - 1 * p, s, s + 1 * p, s + 2 * p);
|
vp8_mbfilter(mask, hev, s - 3 * p, s - 2 * p, s - 1 * p, s, s + 1 * p, s + 2 * p);
|
||||||
|
|
||||||
@@ -253,9 +251,9 @@ void vp8_mbloop_filter_vertical_edge_c
|
|||||||
(
|
(
|
||||||
unsigned char *s,
|
unsigned char *s,
|
||||||
int p,
|
int p,
|
||||||
const unsigned char *blimit,
|
const signed char *flimit,
|
||||||
const unsigned char *limit,
|
const signed char *limit,
|
||||||
const unsigned char *thresh,
|
const signed char *thresh,
|
||||||
int count
|
int count
|
||||||
)
|
)
|
||||||
{
|
{
|
||||||
@@ -266,10 +264,10 @@ void vp8_mbloop_filter_vertical_edge_c
|
|||||||
do
|
do
|
||||||
{
|
{
|
||||||
|
|
||||||
mask = vp8_filter_mask(limit[0], blimit[0],
|
mask = vp8_filter_mask(limit[i], flimit[i],
|
||||||
s[-4], s[-3], s[-2], s[-1], s[0], s[1], s[2], s[3]);
|
s[-4], s[-3], s[-2], s[-1], s[0], s[1], s[2], s[3]);
|
||||||
|
|
||||||
hev = vp8_hevmask(thresh[0], s[-2], s[-1], s[0], s[1]);
|
hev = vp8_hevmask(thresh[i], s[-2], s[-1], s[0], s[1]);
|
||||||
|
|
||||||
vp8_mbfilter(mask, hev, s - 3, s - 2, s - 1, s, s + 1, s + 2);
|
vp8_mbfilter(mask, hev, s - 3, s - 2, s - 1, s, s + 1, s + 2);
|
||||||
|
|
||||||
@@ -280,13 +278,13 @@ void vp8_mbloop_filter_vertical_edge_c
|
|||||||
}
|
}
|
||||||
|
|
||||||
/* should we apply any filter at all ( 11111111 yes, 00000000 no) */
|
/* should we apply any filter at all ( 11111111 yes, 00000000 no) */
|
||||||
static __inline signed char vp8_simple_filter_mask(uc blimit, uc p1, uc p0, uc q0, uc q1)
|
static __inline signed char vp8_simple_filter_mask(signed char limit, signed char flimit, uc p1, uc p0, uc q0, uc q1)
|
||||||
{
|
{
|
||||||
/* Why does this cause problems for win32?
|
/* Why does this cause problems for win32?
|
||||||
* error C2143: syntax error : missing ';' before 'type'
|
* error C2143: syntax error : missing ';' before 'type'
|
||||||
* (void) limit;
|
* (void) limit;
|
||||||
*/
|
*/
|
||||||
signed char mask = (abs(p0 - q0) * 2 + abs(p1 - q1) / 2 <= blimit) * -1;
|
signed char mask = (abs(p0 - q0) * 2 + abs(p1 - q1) / 2 <= flimit * 2 + limit) * -1;
|
||||||
return mask;
|
return mask;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -319,37 +317,47 @@ void vp8_loop_filter_simple_horizontal_edge_c
|
|||||||
(
|
(
|
||||||
unsigned char *s,
|
unsigned char *s,
|
||||||
int p,
|
int p,
|
||||||
const unsigned char *blimit
|
const signed char *flimit,
|
||||||
|
const signed char *limit,
|
||||||
|
const signed char *thresh,
|
||||||
|
int count
|
||||||
)
|
)
|
||||||
{
|
{
|
||||||
signed char mask = 0;
|
signed char mask = 0;
|
||||||
int i = 0;
|
int i = 0;
|
||||||
|
(void) thresh;
|
||||||
|
|
||||||
do
|
do
|
||||||
{
|
{
|
||||||
mask = vp8_simple_filter_mask(blimit[0], s[-2*p], s[-1*p], s[0*p], s[1*p]);
|
/*mask = vp8_simple_filter_mask( limit[i], flimit[i],s[-1*p],s[0*p]);*/
|
||||||
|
mask = vp8_simple_filter_mask(limit[i], flimit[i], s[-2*p], s[-1*p], s[0*p], s[1*p]);
|
||||||
vp8_simple_filter(mask, s - 2 * p, s - 1 * p, s, s + 1 * p);
|
vp8_simple_filter(mask, s - 2 * p, s - 1 * p, s, s + 1 * p);
|
||||||
++s;
|
++s;
|
||||||
}
|
}
|
||||||
while (++i < 16);
|
while (++i < count * 8);
|
||||||
}
|
}
|
||||||
|
|
||||||
void vp8_loop_filter_simple_vertical_edge_c
|
void vp8_loop_filter_simple_vertical_edge_c
|
||||||
(
|
(
|
||||||
unsigned char *s,
|
unsigned char *s,
|
||||||
int p,
|
int p,
|
||||||
const unsigned char *blimit
|
const signed char *flimit,
|
||||||
|
const signed char *limit,
|
||||||
|
const signed char *thresh,
|
||||||
|
int count
|
||||||
)
|
)
|
||||||
{
|
{
|
||||||
signed char mask = 0;
|
signed char mask = 0;
|
||||||
int i = 0;
|
int i = 0;
|
||||||
|
(void) thresh;
|
||||||
|
|
||||||
do
|
do
|
||||||
{
|
{
|
||||||
mask = vp8_simple_filter_mask(blimit[0], s[-2], s[-1], s[0], s[1]);
|
/*mask = vp8_simple_filter_mask( limit[i], flimit[i],s[-1],s[0]);*/
|
||||||
|
mask = vp8_simple_filter_mask(limit[i], flimit[i], s[-2], s[-1], s[0], s[1]);
|
||||||
vp8_simple_filter(mask, s - 2, s - 1, s, s + 1);
|
vp8_simple_filter(mask, s - 2, s - 1, s, s + 1);
|
||||||
s += p;
|
s += p;
|
||||||
}
|
}
|
||||||
while (++i < 16);
|
while (++i < count * 8);
|
||||||
|
|
||||||
}
|
}
|
||||||
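Note on the loop filter hunks above: one side of the diff compares the edge activity against a single `blimit` value, while the other side compares it against `flimit * 2 + limit`. A minimal sketch of how the two forms relate is given below. It assumes that `blimit` is simply the precomputed value of `flimit * 2 + limit`; the diff does not show where `blimit` is filled in, so that relation is an assumption. The names `interior_exceeds`, `mask_blimit`, `mask_flimit` and `filter_mask_self_check`, and the packing of the eight pixels into one array, are hypothetical and used only for illustration. The `p2 - p1` and `p1 - p0` tests lie in the elided part of the hunk and are assumed by symmetry with the visible `q` side tests.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned char uc;

/* px holds p3, p2, p1, p0, q0, q1, q2, q3 in that order (hypothetical layout).
 * Returns nonzero if any neighbouring pair on the same side of the edge
 * differs by more than limit. The p0/q0 pair (index 3 and 4) is the edge
 * itself and is handled separately. */
static int interior_exceeds(int limit, const uc px[8])
{
    int i;
    for (i = 0; i < 7; i++)
    {
        if (i == 3)
            continue;
        if (abs(px[i] - px[i + 1]) > limit)
            return 1;
    }
    return 0;
}

/* Form with a single precomputed edge limit. */
static signed char mask_blimit(uc limit, uc blimit, const uc px[8])
{
    int edge = abs(px[3] - px[4]) * 2 + abs(px[2] - px[5]) / 2;
    int reject = interior_exceeds(limit, px) || edge > blimit;
    return reject ? 0 : -1; /* -1 is 11111111: apply the filter */
}

/* Form that rebuilds the edge limit from flimit and limit on every call. */
static signed char mask_flimit(signed char limit, signed char flimit,
                               const uc px[8])
{
    int edge = abs(px[3] - px[4]) * 2 + abs(px[2] - px[5]) / 2;
    int reject = interior_exceeds(limit, px) || edge > flimit * 2 + limit;
    return reject ? 0 : -1;
}

/* Both forms should agree when blimit == flimit * 2 + limit. */
static void filter_mask_self_check(void)
{
    const uc smooth[8] = {100, 100, 100, 100, 104, 104, 104, 104};
    const uc sharp[8]  = {100, 100, 100, 100, 200, 200, 200, 200};
    const signed char limit = 10, flimit = 20;
    const uc blimit = (uc)(flimit * 2 + limit); /* assumed relation */

    assert(mask_blimit((uc)limit, blimit, smooth) == -1);
    assert(mask_flimit(limit, flimit, smooth) == -1);
    assert(mask_blimit((uc)limit, blimit, sharp) == 0);
    assert(mask_flimit(limit, flimit, sharp) == 0);
}
```

Under that assumption the single-limit form only moves the `flimit * 2 + limit` arithmetic out of the per-pixel loop. The decision for each pixel position is unchanged.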
|
|||||||
@@ -11,7 +11,6 @@
|
|||||||
|
|
||||||
#ifndef __INC_MV_H
|
#ifndef __INC_MV_H
|
||||||
#define __INC_MV_H
|
#define __INC_MV_H
|
||||||
#include "vpx/vpx_integer.h"
|
|
||||||
|
|
||||||
typedef struct
|
typedef struct
|
||||||
{
|
{
|
||||||
@@ -19,10 +18,4 @@ typedef struct
|
|||||||
short col;
|
short col;
|
||||||
} MV;
|
} MV;
|
||||||
|
|
||||||
typedef union
|
|
||||||
{
|
|
||||||
uint32_t as_int;
|
|
||||||
MV as_mv;
|
|
||||||
} int_mv; /* facilitates faster equality tests and copies */
|
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
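Note on the `int_mv` union in the hunk above: its comment says it "facilitates faster equality tests and copies". A minimal sketch of what that can look like follows. It assumes `MV` consists of exactly two 16-bit `short` fields, `row` and `col`. Only `col` is visible in the hunk, so the `row` field is an assumption taken from the elided context. The helpers `mv_equal` and `int_mv_self_check` are hypothetical names, not functions from the diff.

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
    short row; /* assumed: not visible in the hunk */
    short col;
} MV;

typedef union
{
    uint32_t as_int;
    MV as_mv;
} int_mv; /* facilitates faster equality tests and copies */

/* One 32-bit compare instead of comparing row and col separately. */
static int mv_equal(int_mv a, int_mv b)
{
    return a.as_int == b.as_int;
}

static void int_mv_self_check(void)
{
    int_mv a, b;

    /* The sketch relies on MV filling the 32-bit word exactly. */
    assert(sizeof(MV) == sizeof(uint32_t));

    a.as_int = 0;
    a.as_mv.row = 3;
    a.as_mv.col = -5;

    b.as_int = a.as_int; /* copy both components in one assignment */
    assert(mv_equal(a, b));

    b.as_mv.col = 4;
    assert(!mv_equal(a, b));
}
```

The comparison is only meaningful on platforms where `short` is 16 bits and the struct has no padding, which is why the self-check asserts the size first.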
|
|||||||
@@ -109,7 +109,6 @@ extern "C"
|
|||||||
int noise_sensitivity; // parameter used for applying pre processing blur: recommendation 0
|
int noise_sensitivity; // parameter used for applying pre processing blur: recommendation 0
|
||||||
int Sharpness; // parameter used for sharpening output: recommendation 0:
|
int Sharpness; // parameter used for sharpening output: recommendation 0:
|
||||||
int cpu_used;
|
int cpu_used;
|
||||||
unsigned int rc_max_intra_bitrate_pct;
|
|
||||||
|
|
||||||
// mode ->
|
// mode ->
|
||||||
//(0)=Realtime/Live Encoding. This mode is optimized for realtim encoding (for example, capturing
|
//(0)=Realtime/Live Encoding. This mode is optimized for realtim encoding (for example, capturing
|
||||||
@@ -140,9 +139,8 @@ extern "C"
|
|||||||
|
|
||||||
int end_usage; // vbr or cbr
|
int end_usage; // vbr or cbr
|
||||||
|
|
||||||
// buffer targeting aggressiveness
|
// shoot to keep buffer full at all times by undershooting a bit 95 recommended
|
||||||
int under_shoot_pct;
|
int under_shoot_pct;
|
||||||
int over_shoot_pct;
|
|
||||||
|
|
||||||
// buffering parameters
|
// buffering parameters
|
||||||
int starting_buffer_level; // in seconds
|
int starting_buffer_level; // in seconds
|
||||||
@@ -184,11 +182,8 @@ extern "C"
|
|||||||
int token_partitions; // how many token partitions to create for multi core decoding
|
int token_partitions; // how many token partitions to create for multi core decoding
|
||||||
int encode_breakout; // early breakout encode threshold : for video conf recommend 800
|
int encode_breakout; // early breakout encode threshold : for video conf recommend 800
|
||||||
|
|
||||||
unsigned int error_resilient_mode; // Bitfield defining the error
|
int error_resilient_mode; // if running over udp networks provides decodable frames after a
|
||||||
// resiliency features to enable. Can provide
|
// dropped packet
|
||||||
// decodable frames after losses in previous
|
|
||||||
// frames and decodable partitions after
|
|
||||||
// losses in the same frame.
|
|
||||||
|
|
||||||
int arnr_max_frames;
|
int arnr_max_frames;
|
||||||
int arnr_strength ;
|
int arnr_strength ;
|
||||||
@@ -211,8 +206,8 @@ extern "C"
|
|||||||
|
|
||||||
// receive a frames worth of data caller can assume that a copy of this frame is made
|
// receive a frames worth of data caller can assume that a copy of this frame is made
|
||||||
// and not just a copy of the pointer..
|
// and not just a copy of the pointer..
|
||||||
int vp8_receive_raw_frame(VP8_PTR comp, unsigned int frame_flags, YV12_BUFFER_CONFIG *sd, int64_t time_stamp, int64_t end_time_stamp);
|
int vp8_receive_raw_frame(VP8_PTR comp, unsigned int frame_flags, YV12_BUFFER_CONFIG *sd, INT64 time_stamp, INT64 end_time_stamp);
|
||||||
int vp8_get_compressed_data(VP8_PTR comp, unsigned int *frame_flags, unsigned long *size, unsigned char *dest, int64_t *time_stamp, int64_t *time_end, int flush);
|
int vp8_get_compressed_data(VP8_PTR comp, unsigned int *frame_flags, unsigned long *size, unsigned char *dest, INT64 *time_stamp, INT64 *time_end, int flush);
|
||||||
int vp8_get_preview_raw_frame(VP8_PTR comp, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t *flags);
|
int vp8_get_preview_raw_frame(VP8_PTR comp, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t *flags);
|
||||||
|
|
||||||
int vp8_use_as_reference(VP8_PTR comp, int ref_frame_flags);
|
int vp8_use_as_reference(VP8_PTR comp, int ref_frame_flags);
|
||||||
|
|||||||
@@ -19,9 +19,7 @@
|
|||||||
#include "entropy.h"
|
#include "entropy.h"
|
||||||
#include "idct.h"
|
#include "idct.h"
|
||||||
#include "recon.h"
|
#include "recon.h"
|
||||||
#if CONFIG_POSTPROC
|
|
||||||
#include "postproc.h"
|
#include "postproc.h"
|
||||||
#endif
|
|
||||||
|
|
||||||
/*#ifdef PACKET_TESTING*/
|
/*#ifdef PACKET_TESTING*/
|
||||||
#include "header.h"
|
#include "header.h"
|
||||||
@@ -37,15 +35,13 @@ void vp8_initialize_common(void);
|
|||||||
|
|
||||||
#define NUM_YV12_BUFFERS 4
|
#define NUM_YV12_BUFFERS 4
|
||||||
|
|
||||||
#define MAX_PARTITIONS 9
|
|
||||||
|
|
||||||
typedef struct frame_contexts
|
typedef struct frame_contexts
|
||||||
{
|
{
|
||||||
vp8_prob bmode_prob [VP8_BINTRAMODES-1];
|
vp8_prob bmode_prob [VP8_BINTRAMODES-1];
|
||||||
vp8_prob ymode_prob [VP8_YMODES-1]; /* interframe intra mode probs */
|
vp8_prob ymode_prob [VP8_YMODES-1]; /* interframe intra mode probs */
|
||||||
vp8_prob uv_mode_prob [VP8_UV_MODES-1];
|
vp8_prob uv_mode_prob [VP8_UV_MODES-1];
|
||||||
vp8_prob sub_mv_ref_prob [VP8_SUBMVREFS-1];
|
vp8_prob sub_mv_ref_prob [VP8_SUBMVREFS-1];
|
||||||
vp8_prob coef_probs [BLOCK_TYPES] [COEF_BANDS] [PREV_COEF_CONTEXTS] [ENTROPY_NODES];
|
vp8_prob coef_probs [BLOCK_TYPES] [COEF_BANDS] [PREV_COEF_CONTEXTS] [vp8_coef_tokens-1];
|
||||||
MV_CONTEXT mvc[2];
|
MV_CONTEXT mvc[2];
|
||||||
MV_CONTEXT pre_mvc[2]; /* not to caculate the mvcost for the frame if mvc doesn't change. */
|
MV_CONTEXT pre_mvc[2]; /* not to caculate the mvcost for the frame if mvc doesn't change. */
|
||||||
} FRAME_CONTEXT;
|
} FRAME_CONTEXT;
|
||||||
@@ -77,9 +73,7 @@ typedef struct VP8_COMMON_RTCD
|
|||||||
vp8_recon_rtcd_vtable_t recon;
|
vp8_recon_rtcd_vtable_t recon;
|
||||||
vp8_subpix_rtcd_vtable_t subpix;
|
vp8_subpix_rtcd_vtable_t subpix;
|
||||||
vp8_loopfilter_rtcd_vtable_t loopfilter;
|
vp8_loopfilter_rtcd_vtable_t loopfilter;
|
||||||
#if CONFIG_POSTPROC
|
|
||||||
vp8_postproc_rtcd_vtable_t postproc;
|
vp8_postproc_rtcd_vtable_t postproc;
|
||||||
#endif
|
|
||||||
int flags;
|
int flags;
|
||||||
#else
|
#else
|
||||||
int unused;
|
int unused;
|
||||||
@@ -87,7 +81,6 @@ typedef struct VP8_COMMON_RTCD
|
|||||||
} VP8_COMMON_RTCD;
|
} VP8_COMMON_RTCD;
|
||||||
|
|
||||||
typedef struct VP8Common
|
typedef struct VP8Common
|
||||||
|
|
||||||
{
|
{
|
||||||
struct vpx_internal_error_info error;
|
struct vpx_internal_error_info error;
|
||||||
|
|
||||||
@@ -112,8 +105,7 @@ typedef struct VP8Common
|
|||||||
YV12_BUFFER_CONFIG post_proc_buffer;
|
YV12_BUFFER_CONFIG post_proc_buffer;
|
||||||
YV12_BUFFER_CONFIG temp_scale_frame;
|
YV12_BUFFER_CONFIG temp_scale_frame;
|
||||||
|
|
||||||
|
FRAME_TYPE last_frame_type; /* Save last frame's frame type for loopfilter init checking and motion search. */
|
||||||
FRAME_TYPE last_frame_type; /* Save last frame's frame type for motion search. */
|
|
||||||
FRAME_TYPE frame_type;
|
FRAME_TYPE frame_type;
|
||||||
|
|
||||||
int show_frame;
|
int show_frame;
|
||||||
@@ -127,6 +119,7 @@ typedef struct VP8Common
|
|||||||
/* profile settings */
|
/* profile settings */
|
||||||
int mb_no_coeff_skip;
|
int mb_no_coeff_skip;
|
||||||
int no_lpf;
|
int no_lpf;
|
||||||
|
int simpler_lpf;
|
||||||
int use_bilinear_mc_filter;
|
int use_bilinear_mc_filter;
|
||||||
int full_pixel;
|
int full_pixel;
|
||||||
|
|
||||||
@@ -147,15 +140,16 @@ typedef struct VP8Common
|
|||||||
|
|
||||||
MODE_INFO *mip; /* Base of allocated array */
|
MODE_INFO *mip; /* Base of allocated array */
|
||||||
MODE_INFO *mi; /* Corresponds to upper left visible macroblock */
|
MODE_INFO *mi; /* Corresponds to upper left visible macroblock */
|
||||||
MODE_INFO *prev_mip; /* MODE_INFO array 'mip' from last decoded frame */
|
|
||||||
MODE_INFO *prev_mi; /* 'mi' from last frame (points into prev_mip) */
|
|
||||||
|
|
||||||
|
|
||||||
INTERPOLATIONFILTERTYPE mcomp_filter_type;
|
INTERPOLATIONFILTERTYPE mcomp_filter_type;
|
||||||
|
LOOPFILTERTYPE last_filter_type;
|
||||||
LOOPFILTERTYPE filter_type;
|
LOOPFILTERTYPE filter_type;
|
||||||
|
loop_filter_info lf_info[MAX_LOOP_FILTER+1];
|
||||||
loop_filter_info_n lf_info;
|
prototype_loopfilter_block((*lf_mbv));
|
||||||
|
prototype_loopfilter_block((*lf_mbh));
|
||||||
|
prototype_loopfilter_block((*lf_bv));
|
||||||
|
prototype_loopfilter_block((*lf_bh));
|
||||||
int filter_level;
|
int filter_level;
|
||||||
int last_sharpness_level;
|
int last_sharpness_level;
|
||||||
int sharpness_level;
|
int sharpness_level;
|
||||||
@@ -202,12 +196,13 @@ typedef struct VP8Common
|
|||||||
#if CONFIG_RUNTIME_CPU_DETECT
|
#if CONFIG_RUNTIME_CPU_DETECT
|
||||||
VP8_COMMON_RTCD rtcd;
|
VP8_COMMON_RTCD rtcd;
|
||||||
#endif
|
#endif
|
||||||
#if CONFIG_MULTITHREAD
|
|
||||||
int processor_core_count;
|
|
||||||
#endif
|
|
||||||
#if CONFIG_POSTPROC
|
|
||||||
struct postproc_state postproc_state;
|
struct postproc_state postproc_state;
|
||||||
#endif
|
|
||||||
} VP8_COMMON;
|
} VP8_COMMON;
|
||||||
|
|
||||||
|
|
||||||
|
int vp8_adjust_mb_lf_value(MACROBLOCKD *mbd, int filter_level);
|
||||||
|
void vp8_init_loop_filter(VP8_COMMON *cm);
|
||||||
|
void vp8_frame_init_loop_filter(loop_filter_info *lfi, int frame_type);
|
||||||
|
extern void vp8_loop_filter_frame(VP8_COMMON *cm, MACROBLOCKD *mbd, int filt_val);
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
|
|||||||
@@ -22,7 +22,6 @@ extern "C"
|
|||||||
#include "vpx_scale/yv12config.h"
|
#include "vpx_scale/yv12config.h"
|
||||||
#include "ppflags.h"
|
#include "ppflags.h"
|
||||||
#include "vpx_ports/mem.h"
|
#include "vpx_ports/mem.h"
|
||||||
#include "vpx/vpx_codec.h"
|
|
||||||
|
|
||||||
typedef void *VP8D_PTR;
|
typedef void *VP8D_PTR;
|
||||||
typedef struct
|
typedef struct
|
||||||
@@ -32,8 +31,6 @@ extern "C"
|
|||||||
int Version;
|
int Version;
|
||||||
int postprocess;
|
int postprocess;
|
||||||
int max_threads;
|
int max_threads;
|
||||||
int error_concealment;
|
|
||||||
int input_partition;
|
|
||||||
} VP8D_CONFIG;
|
} VP8D_CONFIG;
|
||||||
typedef enum
|
typedef enum
|
||||||
{
|
{
|
||||||
@@ -53,11 +50,11 @@ extern "C"
|
|||||||
|
|
||||||
int vp8dx_get_setting(VP8D_PTR comp, VP8D_SETTING oxst);
|
int vp8dx_get_setting(VP8D_PTR comp, VP8D_SETTING oxst);
|
||||||
|
|
||||||
int vp8dx_receive_compressed_data(VP8D_PTR comp, unsigned long size, const unsigned char *dest, int64_t time_stamp);
|
int vp8dx_receive_compressed_data(VP8D_PTR comp, unsigned long size, const unsigned char *dest, INT64 time_stamp);
|
||||||
int vp8dx_get_raw_frame(VP8D_PTR comp, YV12_BUFFER_CONFIG *sd, int64_t *time_stamp, int64_t *time_end_stamp, vp8_ppflags_t *flags);
|
int vp8dx_get_raw_frame(VP8D_PTR comp, YV12_BUFFER_CONFIG *sd, INT64 *time_stamp, INT64 *time_end_stamp, vp8_ppflags_t *flags);
|
||||||
|
|
||||||
vpx_codec_err_t vp8dx_get_reference(VP8D_PTR comp, VP8_REFFRAME ref_frame_flag, YV12_BUFFER_CONFIG *sd);
|
int vp8dx_get_reference(VP8D_PTR comp, VP8_REFFRAME ref_frame_flag, YV12_BUFFER_CONFIG *sd);
|
||||||
vpx_codec_err_t vp8dx_set_reference(VP8D_PTR comp, VP8_REFFRAME ref_frame_flag, YV12_BUFFER_CONFIG *sd);
|
int vp8dx_set_reference(VP8D_PTR comp, VP8_REFFRAME ref_frame_flag, YV12_BUFFER_CONFIG *sd);
|
||||||
|
|
||||||
VP8D_PTR vp8dx_create_decompressor(VP8D_CONFIG *oxcf);
|
VP8D_PTR vp8dx_create_decompressor(VP8D_CONFIG *oxcf);
|
||||||
|
|
||||||
|
|||||||
@@ -804,14 +804,11 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
|
|||||||
for (j = 0; j < mb_cols; j++)
|
for (j = 0; j < mb_cols; j++)
|
||||||
{
|
{
|
||||||
char zz[4];
|
char zz[4];
|
||||||
int dc_diff = !(mi[mb_index].mbmi.mode != B_PRED &&
|
|
||||||
mi[mb_index].mbmi.mode != SPLITMV &&
|
|
||||||
mi[mb_index].mbmi.mb_skip_coeff);
|
|
||||||
|
|
||||||
if (oci->frame_type == KEY_FRAME)
|
if (oci->frame_type == KEY_FRAME)
|
||||||
sprintf(zz, "a");
|
sprintf(zz, "a");
|
||||||
else
|
else
|
||||||
sprintf(zz, "%c", dc_diff + '0');
|
sprintf(zz, "%c", mi[mb_index].mbmi.dc_diff + '0');
|
||||||
|
|
||||||
vp8_blit_text(zz, y_ptr, post->y_stride);
|
vp8_blit_text(zz, y_ptr, post->y_stride);
|
||||||
mb_index ++;
|
mb_index ++;
|
||||||
@@ -837,6 +834,7 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
|
|||||||
YV12_BUFFER_CONFIG *post = &oci->post_proc_buffer;
|
YV12_BUFFER_CONFIG *post = &oci->post_proc_buffer;
|
||||||
int width = post->y_width;
|
int width = post->y_width;
|
||||||
int height = post->y_height;
|
int height = post->y_height;
|
||||||
|
int mb_cols = width >> 4;
|
||||||
unsigned char *y_buffer = oci->post_proc_buffer.y_buffer;
|
unsigned char *y_buffer = oci->post_proc_buffer.y_buffer;
|
||||||
int y_stride = oci->post_proc_buffer.y_stride;
|
int y_stride = oci->post_proc_buffer.y_stride;
|
||||||
MODE_INFO *mi = oci->mi;
|
MODE_INFO *mi = oci->mi;
|
||||||
@@ -860,7 +858,7 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
|
|||||||
{
|
{
|
||||||
case 0 : /* mv_top_bottom */
|
case 0 : /* mv_top_bottom */
|
||||||
{
|
{
|
||||||
union b_mode_info *bmi = &mi->bmi[0];
|
B_MODE_INFO *bmi = &mi->bmi[0];
|
||||||
MV *mv = &bmi->mv.as_mv;
|
MV *mv = &bmi->mv.as_mv;
|
||||||
|
|
||||||
x1 = x0 + 8 + (mv->col >> 3);
|
x1 = x0 + 8 + (mv->col >> 3);
|
||||||
@@ -881,7 +879,7 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
|
|||||||
}
|
}
|
||||||
case 1 : /* mv_left_right */
|
case 1 : /* mv_left_right */
|
||||||
{
|
{
|
||||||
union b_mode_info *bmi = &mi->bmi[0];
|
B_MODE_INFO *bmi = &mi->bmi[0];
|
||||||
MV *mv = &bmi->mv.as_mv;
|
MV *mv = &bmi->mv.as_mv;
|
||||||
|
|
||||||
x1 = x0 + 4 + (mv->col >> 3);
|
x1 = x0 + 4 + (mv->col >> 3);
|
||||||
@@ -902,7 +900,7 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
|
|||||||
}
|
}
|
||||||
case 2 : /* mv_quarters */
|
case 2 : /* mv_quarters */
|
||||||
{
|
{
|
||||||
union b_mode_info *bmi = &mi->bmi[0];
|
B_MODE_INFO *bmi = &mi->bmi[0];
|
||||||
MV *mv = &bmi->mv.as_mv;
|
MV *mv = &bmi->mv.as_mv;
|
||||||
|
|
||||||
x1 = x0 + 4 + (mv->col >> 3);
|
x1 = x0 + 4 + (mv->col >> 3);
|
||||||
@@ -938,7 +936,7 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
|
|||||||
}
|
}
|
||||||
default :
|
default :
|
||||||
{
|
{
|
||||||
union b_mode_info *bmi = mi->bmi;
|
B_MODE_INFO *bmi = mi->bmi;
|
||||||
int bx0, by0;
|
int bx0, by0;
|
||||||
|
|
||||||
for (by0 = y0; by0 < (y0+16); by0 += 4)
|
for (by0 = y0; by0 < (y0+16); by0 += 4)
|
||||||
@@ -1011,7 +1009,7 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
|
|||||||
{
|
{
|
||||||
int by, bx;
|
int by, bx;
|
||||||
unsigned char *yl, *ul, *vl;
|
unsigned char *yl, *ul, *vl;
|
||||||
union b_mode_info *bmi = mi->bmi;
|
B_MODE_INFO *bmi = mi->bmi;
|
||||||
|
|
||||||
yl = y_ptr + x;
|
yl = y_ptr + x;
|
||||||
ul = u_ptr + (x>>1);
|
ul = u_ptr + (x>>1);
|
||||||
@@ -1024,9 +1022,9 @@ int vp8_post_proc_frame(VP8_COMMON *oci, YV12_BUFFER_CONFIG *dest, vp8_ppflags_t
|
|||||||
if ((ppflags->display_b_modes_flag & (1<<mi->mbmi.mode))
|
if ((ppflags->display_b_modes_flag & (1<<mi->mbmi.mode))
|
||||||
|| (ppflags->display_mb_modes_flag & B_PRED))
|
|| (ppflags->display_mb_modes_flag & B_PRED))
|
||||||
{
|
{
|
||||||
Y = B_PREDICTION_MODE_colors[bmi->as_mode][0];
|
Y = B_PREDICTION_MODE_colors[bmi->mode][0];
|
||||||
U = B_PREDICTION_MODE_colors[bmi->as_mode][1];
|
U = B_PREDICTION_MODE_colors[bmi->mode][1];
|
||||||
V = B_PREDICTION_MODE_colors[bmi->as_mode][2];
|
V = B_PREDICTION_MODE_colors[bmi->mode][2];
|
||||||
|
|
||||||
POSTPROC_INVOKE(RTCD_VTABLE(oci), blend_b)
|
POSTPROC_INVOKE(RTCD_VTABLE(oci), blend_b)
|
||||||
(yl+bx, ul+(bx>>1), vl+(bx>>1), Y, U, V, 0xc000, y_stride);
|
(yl+bx, ul+(bx>>1), vl+(bx>>1), Y, U, V, 0xc000, y_stride);
|
||||||
|
|||||||
@@ -53,8 +53,9 @@ loop_filter_function_s_ppc loop_filter_simple_vertical_edge_ppc;
|
|||||||
|
|
||||||
// Horizontal MB filtering
|
// Horizontal MB filtering
|
||||||
void loop_filter_mbh_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void loop_filter_mbh_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
|
(void)simpler_lpf;
|
||||||
mbloop_filter_horizontal_edge_y_ppc(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr);
|
mbloop_filter_horizontal_edge_y_ppc(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
@@ -62,8 +63,9 @@ void loop_filter_mbh_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned ch
|
|||||||
}
|
}
|
||||||
|
|
||||||
void loop_filter_mbhs_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void loop_filter_mbhs_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
|
(void)simpler_lpf;
|
||||||
(void)u_ptr;
|
(void)u_ptr;
|
||||||
(void)v_ptr;
|
(void)v_ptr;
|
||||||
(void)uv_stride;
|
(void)uv_stride;
|
||||||
@@ -72,8 +74,9 @@ void loop_filter_mbhs_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned c
|
|||||||
|
|
||||||
// Vertical MB Filtering
|
// Vertical MB Filtering
|
||||||
void loop_filter_mbv_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void loop_filter_mbv_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
|
(void)simpler_lpf;
|
||||||
mbloop_filter_vertical_edge_y_ppc(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr);
|
mbloop_filter_vertical_edge_y_ppc(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
@@ -81,8 +84,9 @@ void loop_filter_mbv_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned ch
|
|||||||
}
|
}
|
||||||
|
|
||||||
void loop_filter_mbvs_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void loop_filter_mbvs_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
|
(void)simpler_lpf;
|
||||||
(void)u_ptr;
|
(void)u_ptr;
|
||||||
(void)v_ptr;
|
(void)v_ptr;
|
||||||
(void)uv_stride;
|
(void)uv_stride;
|
||||||
@@ -91,8 +95,9 @@ void loop_filter_mbvs_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned c
|
|||||||
|
|
||||||
// Horizontal B Filtering
|
// Horizontal B Filtering
|
||||||
void loop_filter_bh_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void loop_filter_bh_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
|
(void)simpler_lpf;
|
||||||
// These should all be done at once with one call, instead of 3
|
// These should all be done at once with one call, instead of 3
|
||||||
loop_filter_horizontal_edge_y_ppc(y_ptr + 4 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr);
|
loop_filter_horizontal_edge_y_ppc(y_ptr + 4 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr);
|
||||||
loop_filter_horizontal_edge_y_ppc(y_ptr + 8 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr);
|
loop_filter_horizontal_edge_y_ppc(y_ptr + 8 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr);
|
||||||
@@ -103,8 +108,9 @@ void loop_filter_bh_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned cha
|
|||||||
}
|
}
|
||||||
|
|
||||||
void loop_filter_bhs_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void loop_filter_bhs_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
|
(void)simpler_lpf;
|
||||||
(void)u_ptr;
|
(void)u_ptr;
|
||||||
(void)v_ptr;
|
(void)v_ptr;
|
||||||
(void)uv_stride;
|
(void)uv_stride;
|
||||||
@@ -115,8 +121,9 @@ void loop_filter_bhs_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned ch
|
|||||||
|
|
||||||
// Vertical B Filtering
|
// Vertical B Filtering
|
||||||
void loop_filter_bv_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void loop_filter_bv_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
|
(void)simpler_lpf;
|
||||||
loop_filter_vertical_edge_y_ppc(y_ptr, y_stride, lfi->flim, lfi->lim, lfi->thr);
|
loop_filter_vertical_edge_y_ppc(y_ptr, y_stride, lfi->flim, lfi->lim, lfi->thr);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
@@ -124,8 +131,9 @@ void loop_filter_bv_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned cha
|
|||||||
}
|
}
|
||||||
|
|
||||||
void loop_filter_bvs_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void loop_filter_bvs_ppc(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
|
(void)simpler_lpf;
|
||||||
(void)u_ptr;
|
(void)u_ptr;
|
||||||
(void)v_ptr;
|
(void)v_ptr;
|
||||||
(void)uv_stride;
|
(void)uv_stride;
|
||||||
|
|||||||
@@ -26,9 +26,6 @@
|
|||||||
#define prototype_build_intra_predictors(sym) \
|
#define prototype_build_intra_predictors(sym) \
|
||||||
void sym(MACROBLOCKD *x)
|
void sym(MACROBLOCKD *x)
|
||||||
|
|
||||||
#define prototype_intra4x4_predict(sym) \
|
|
||||||
void sym(BLOCKD *x, int b_mode, unsigned char *predictor)
|
|
||||||
|
|
||||||
struct vp8_recon_rtcd_vtable;
|
struct vp8_recon_rtcd_vtable;
|
||||||
|
|
||||||
#if ARCH_X86 || ARCH_X86_64
|
#if ARCH_X86 || ARCH_X86_64
|
||||||
@@ -91,30 +88,11 @@ extern prototype_build_intra_predictors\
|
|||||||
extern prototype_build_intra_predictors\
|
extern prototype_build_intra_predictors\
|
||||||
(vp8_recon_build_intra_predictors_mby_s);
|
(vp8_recon_build_intra_predictors_mby_s);
|
||||||
|
|
||||||
#ifndef vp8_recon_build_intra_predictors_mbuv
|
|
||||||
#define vp8_recon_build_intra_predictors_mbuv vp8_build_intra_predictors_mbuv
|
|
||||||
#endif
|
|
||||||
extern prototype_build_intra_predictors\
|
|
||||||
(vp8_recon_build_intra_predictors_mbuv);
|
|
||||||
|
|
||||||
#ifndef vp8_recon_build_intra_predictors_mbuv_s
|
|
||||||
#define vp8_recon_build_intra_predictors_mbuv_s vp8_build_intra_predictors_mbuv_s
|
|
||||||
#endif
|
|
||||||
extern prototype_build_intra_predictors\
|
|
||||||
(vp8_recon_build_intra_predictors_mbuv_s);
|
|
||||||
|
|
||||||
#ifndef vp8_recon_intra4x4_predict
|
|
||||||
#define vp8_recon_intra4x4_predict vp8_intra4x4_predict
|
|
||||||
#endif
|
|
||||||
extern prototype_intra4x4_predict\
|
|
||||||
(vp8_recon_intra4x4_predict);
|
|
||||||
|
|
||||||
|
|
||||||
typedef prototype_copy_block((*vp8_copy_block_fn_t));
|
typedef prototype_copy_block((*vp8_copy_block_fn_t));
|
||||||
typedef prototype_recon_block((*vp8_recon_fn_t));
|
typedef prototype_recon_block((*vp8_recon_fn_t));
|
||||||
typedef prototype_recon_macroblock((*vp8_recon_mb_fn_t));
|
typedef prototype_recon_macroblock((*vp8_recon_mb_fn_t));
|
||||||
typedef prototype_build_intra_predictors((*vp8_build_intra_pred_fn_t));
|
typedef prototype_build_intra_predictors((*vp8_build_intra_pred_fn_t));
|
||||||
typedef prototype_intra4x4_predict((*vp8_intra4x4_pred_fn_t));
|
|
||||||
typedef struct vp8_recon_rtcd_vtable
|
typedef struct vp8_recon_rtcd_vtable
|
||||||
{
|
{
|
||||||
vp8_copy_block_fn_t copy16x16;
|
vp8_copy_block_fn_t copy16x16;
|
||||||
@@ -127,9 +105,6 @@ typedef struct vp8_recon_rtcd_vtable
|
|||||||
vp8_recon_mb_fn_t recon_mby;
|
vp8_recon_mb_fn_t recon_mby;
|
||||||
vp8_build_intra_pred_fn_t build_intra_predictors_mby_s;
|
vp8_build_intra_pred_fn_t build_intra_predictors_mby_s;
|
||||||
vp8_build_intra_pred_fn_t build_intra_predictors_mby;
|
vp8_build_intra_pred_fn_t build_intra_predictors_mby;
|
||||||
vp8_build_intra_pred_fn_t build_intra_predictors_mbuv_s;
|
|
||||||
vp8_build_intra_pred_fn_t build_intra_predictors_mbuv;
|
|
||||||
vp8_intra4x4_pred_fn_t intra4x4_predict;
|
|
||||||
} vp8_recon_rtcd_vtable_t;
|
} vp8_recon_rtcd_vtable_t;
|
||||||
|
|
||||||
#if CONFIG_RUNTIME_CPU_DETECT
|
#if CONFIG_RUNTIME_CPU_DETECT
|
||||||
|
|||||||
@@ -10,7 +10,6 @@
|
|||||||
|
|
||||||
|
|
||||||
#include "vpx_ports/config.h"
|
#include "vpx_ports/config.h"
|
||||||
#include "vpx/vpx_integer.h"
|
|
||||||
#include "recon.h"
|
#include "recon.h"
|
||||||
#include "subpixel.h"
|
#include "subpixel.h"
|
||||||
#include "blockd.h"
|
#include "blockd.h"
|
||||||
@@ -19,6 +18,12 @@
|
|||||||
#include "onyxc_int.h"
|
#include "onyxc_int.h"
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
|
/* use this define on systems where unaligned int reads and writes are
|
||||||
|
* not allowed, i.e. ARM architectures
|
||||||
|
*/
|
||||||
|
/*#define MUST_BE_ALIGNED*/
|
||||||
|
|
||||||
|
|
||||||
static const int bbb[4] = {0, 2, 8, 10};
|
static const int bbb[4] = {0, 2, 8, 10};
|
||||||
|
|
||||||
|
|
||||||
@@ -34,7 +39,7 @@ void vp8_copy_mem16x16_c(
|
|||||||
|
|
||||||
for (r = 0; r < 16; r++)
|
for (r = 0; r < 16; r++)
|
||||||
{
|
{
|
||||||
#if !(CONFIG_FAST_UNALIGNED)
|
#ifdef MUST_BE_ALIGNED
|
||||||
dst[0] = src[0];
|
dst[0] = src[0];
|
||||||
dst[1] = src[1];
|
dst[1] = src[1];
|
||||||
dst[2] = src[2];
|
dst[2] = src[2];
|
||||||
@@ -53,10 +58,10 @@ void vp8_copy_mem16x16_c(
|
|||||||
dst[15] = src[15];
|
dst[15] = src[15];
|
||||||
|
|
||||||
#else
|
#else
|
||||||
((uint32_t *)dst)[0] = ((uint32_t *)src)[0] ;
|
((int *)dst)[0] = ((int *)src)[0] ;
|
||||||
((uint32_t *)dst)[1] = ((uint32_t *)src)[1] ;
|
((int *)dst)[1] = ((int *)src)[1] ;
|
||||||
((uint32_t *)dst)[2] = ((uint32_t *)src)[2] ;
|
((int *)dst)[2] = ((int *)src)[2] ;
|
||||||
((uint32_t *)dst)[3] = ((uint32_t *)src)[3] ;
|
((int *)dst)[3] = ((int *)src)[3] ;
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
src += src_stride;
|
src += src_stride;
|
||||||
@@ -76,7 +81,7 @@ void vp8_copy_mem8x8_c(
|
|||||||
|
|
||||||
for (r = 0; r < 8; r++)
|
for (r = 0; r < 8; r++)
|
||||||
{
|
{
|
||||||
#if !(CONFIG_FAST_UNALIGNED)
|
#ifdef MUST_BE_ALIGNED
|
||||||
dst[0] = src[0];
|
dst[0] = src[0];
|
||||||
dst[1] = src[1];
|
dst[1] = src[1];
|
||||||
dst[2] = src[2];
|
dst[2] = src[2];
|
||||||
@@ -86,8 +91,8 @@ void vp8_copy_mem8x8_c(
|
|||||||
dst[6] = src[6];
|
dst[6] = src[6];
|
||||||
dst[7] = src[7];
|
dst[7] = src[7];
|
||||||
#else
|
#else
|
||||||
((uint32_t *)dst)[0] = ((uint32_t *)src)[0] ;
|
((int *)dst)[0] = ((int *)src)[0] ;
|
||||||
((uint32_t *)dst)[1] = ((uint32_t *)src)[1] ;
|
((int *)dst)[1] = ((int *)src)[1] ;
|
||||||
#endif
|
#endif
|
||||||
src += src_stride;
|
src += src_stride;
|
||||||
dst += dst_stride;
|
dst += dst_stride;
|
||||||
@@ -106,7 +111,7 @@ void vp8_copy_mem8x4_c(
|
|||||||
|
|
||||||
for (r = 0; r < 4; r++)
|
for (r = 0; r < 4; r++)
|
||||||
{
|
{
|
||||||
#if !(CONFIG_FAST_UNALIGNED)
|
#ifdef MUST_BE_ALIGNED
|
||||||
dst[0] = src[0];
|
dst[0] = src[0];
|
||||||
dst[1] = src[1];
|
dst[1] = src[1];
|
||||||
dst[2] = src[2];
|
dst[2] = src[2];
|
||||||
@@ -116,8 +121,8 @@ void vp8_copy_mem8x4_c(
|
|||||||
dst[6] = src[6];
|
dst[6] = src[6];
|
||||||
dst[7] = src[7];
|
dst[7] = src[7];
|
||||||
#else
|
#else
|
||||||
((uint32_t *)dst)[0] = ((uint32_t *)src)[0] ;
|
((int *)dst)[0] = ((int *)src)[0] ;
|
||||||
((uint32_t *)dst)[1] = ((uint32_t *)src)[1] ;
|
((int *)dst)[1] = ((int *)src)[1] ;
|
||||||
#endif
|
#endif
|
||||||
src += src_stride;
|
src += src_stride;
|
||||||
dst += dst_stride;
|
dst += dst_stride;
|
||||||
@@ -149,13 +154,13 @@ void vp8_build_inter_predictors_b(BLOCKD *d, int pitch, vp8_subpix_fn_t sppf)
|
|||||||
|
|
||||||
for (r = 0; r < 4; r++)
|
for (r = 0; r < 4; r++)
|
||||||
{
|
{
|
||||||
#if !(CONFIG_FAST_UNALIGNED)
|
#ifdef MUST_BE_ALIGNED
|
||||||
pred_ptr[0] = ptr[0];
|
pred_ptr[0] = ptr[0];
|
||||||
pred_ptr[1] = ptr[1];
|
pred_ptr[1] = ptr[1];
|
||||||
pred_ptr[2] = ptr[2];
|
pred_ptr[2] = ptr[2];
|
||||||
pred_ptr[3] = ptr[3];
|
pred_ptr[3] = ptr[3];
|
||||||
#else
|
#else
|
||||||
*(uint32_t *)pred_ptr = *(uint32_t *)ptr ;
|
*(int *)pred_ptr = *(int *)ptr ;
|
||||||
#endif
|
#endif
|
||||||
pred_ptr += pitch;
|
pred_ptr += pitch;
|
||||||
ptr += d->pre_stride;
|
ptr += d->pre_stride;
|
||||||
@@ -202,12 +207,12 @@ static void build_inter_predictors2b(MACROBLOCKD *x, BLOCKD *d, int pitch)
|
|||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
/*encoder only*/
|
|
||||||
void vp8_build_inter_predictors_mbuv(MACROBLOCKD *x)
|
void vp8_build_inter_predictors_mbuv(MACROBLOCKD *x)
|
||||||
{
|
{
|
||||||
int i;
|
int i;
|
||||||
|
|
||||||
if (x->mode_info_context->mbmi.mode != SPLITMV)
|
if (x->mode_info_context->mbmi.ref_frame != INTRA_FRAME &&
|
||||||
|
x->mode_info_context->mbmi.mode != SPLITMV)
|
||||||
{
|
{
|
||||||
unsigned char *uptr, *vptr;
|
unsigned char *uptr, *vptr;
|
||||||
unsigned char *upred_ptr = &x->predictor[256];
|
unsigned char *upred_ptr = &x->predictor[256];
|
||||||
@@ -252,132 +257,158 @@ void vp8_build_inter_predictors_mbuv(MACROBLOCKD *x)
|
|||||||
}
|
}
|
||||||
|
|
||||||
/*encoder only*/
|
/*encoder only*/
|
||||||
void vp8_build_inter16x16_predictors_mby(MACROBLOCKD *x)
|
void vp8_build_inter_predictors_mby(MACROBLOCKD *x)
|
||||||
{
|
{
|
||||||
unsigned char *ptr_base;
|
|
||||||
unsigned char *ptr;
|
|
||||||
unsigned char *pred_ptr = x->predictor;
|
|
||||||
int mv_row = x->mode_info_context->mbmi.mv.as_mv.row;
|
|
||||||
int mv_col = x->mode_info_context->mbmi.mv.as_mv.col;
|
|
||||||
int pre_stride = x->block[0].pre_stride;
|
|
||||||
|
|
||||||
ptr_base = x->pre.y_buffer;
|
if (x->mode_info_context->mbmi.ref_frame != INTRA_FRAME &&
|
||||||
ptr = ptr_base + (mv_row >> 3) * pre_stride + (mv_col >> 3);
|
x->mode_info_context->mbmi.mode != SPLITMV)
|
||||||
|
|
||||||
if ((mv_row | mv_col) & 7)
|
|
||||||
{
|
{
|
||||||
x->subpixel_predict16x16(ptr, pre_stride, mv_col & 7, mv_row & 7, pred_ptr, 16);
|
unsigned char *ptr_base;
|
||||||
}
|
unsigned char *ptr;
|
||||||
else
|
unsigned char *pred_ptr = x->predictor;
|
||||||
{
|
int mv_row = x->mode_info_context->mbmi.mv.as_mv.row;
|
||||||
RECON_INVOKE(&x->rtcd->recon, copy16x16)(ptr, pre_stride, pred_ptr, 16);
|
int mv_col = x->mode_info_context->mbmi.mv.as_mv.col;
|
||||||
}
|
int pre_stride = x->block[0].pre_stride;
|
||||||
}
|
|
||||||
|
|
||||||
void vp8_build_inter16x16_predictors_mb(MACROBLOCKD *x,
|
ptr_base = x->pre.y_buffer;
|
||||||
unsigned char *dst_y,
|
ptr = ptr_base + (mv_row >> 3) * pre_stride + (mv_col >> 3);
|
||||||
unsigned char *dst_u,
|
|
||||||
unsigned char *dst_v,
|
|
||||||
int dst_ystride,
|
|
||||||
int dst_uvstride)
|
|
||||||
{
|
|
||||||
int offset;
|
|
||||||
unsigned char *ptr;
|
|
||||||
unsigned char *uptr, *vptr;
|
|
||||||
|
|
||||||
int mv_row = x->mode_info_context->mbmi.mv.as_mv.row;
|
if ((mv_row | mv_col) & 7)
|
||||||
int mv_col = x->mode_info_context->mbmi.mv.as_mv.col;
|
|
||||||
|
|
||||||
unsigned char *ptr_base = x->pre.y_buffer;
|
|
||||||
int pre_stride = x->block[0].pre_stride;
|
|
||||||
|
|
||||||
ptr = ptr_base + (mv_row >> 3) * pre_stride + (mv_col >> 3);
|
|
||||||
|
|
||||||
if ((mv_row | mv_col) & 7)
|
|
||||||
{
|
|
||||||
x->subpixel_predict16x16(ptr, pre_stride, mv_col & 7, mv_row & 7, dst_y, dst_ystride);
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
RECON_INVOKE(&x->rtcd->recon, copy16x16)(ptr, pre_stride, dst_y, dst_ystride);
|
|
||||||
}
|
|
||||||
|
|
||||||
mv_row = x->block[16].bmi.mv.as_mv.row;
|
|
||||||
mv_col = x->block[16].bmi.mv.as_mv.col;
|
|
||||||
pre_stride >>= 1;
|
|
||||||
offset = (mv_row >> 3) * pre_stride + (mv_col >> 3);
|
|
||||||
uptr = x->pre.u_buffer + offset;
|
|
||||||
vptr = x->pre.v_buffer + offset;
|
|
||||||
|
|
||||||
if ((mv_row | mv_col) & 7)
|
|
||||||
{
|
|
||||||
x->subpixel_predict8x8(uptr, pre_stride, mv_col & 7, mv_row & 7, dst_u, dst_uvstride);
|
|
||||||
x->subpixel_predict8x8(vptr, pre_stride, mv_col & 7, mv_row & 7, dst_v, dst_uvstride);
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
RECON_INVOKE(&x->rtcd->recon, copy8x8)(uptr, pre_stride, dst_u, dst_uvstride);
|
|
||||||
RECON_INVOKE(&x->rtcd->recon, copy8x8)(vptr, pre_stride, dst_v, dst_uvstride);
|
|
||||||
}
|
|
||||||
|
|
||||||
}
|
|
||||||
|
|
||||||
void vp8_build_inter4x4_predictors_mb(MACROBLOCKD *x)
|
|
||||||
{
|
|
||||||
int i;
|
|
||||||
|
|
||||||
if (x->mode_info_context->mbmi.partitioning < 3)
|
|
||||||
{
|
|
||||||
for (i = 0; i < 4; i++)
|
|
||||||
{
|
{
|
||||||
BLOCKD *d = &x->block[bbb[i]];
|
x->subpixel_predict16x16(ptr, pre_stride, mv_col & 7, mv_row & 7, pred_ptr, 16);
|
||||||
build_inter_predictors4b(x, d, 16);
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
RECON_INVOKE(&x->rtcd->recon, copy16x16)(ptr, pre_stride, pred_ptr, 16);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
for (i = 0; i < 16; i += 2)
|
int i;
|
||||||
{
|
|
||||||
BLOCKD *d0 = &x->block[i];
|
|
||||||
BLOCKD *d1 = &x->block[i+1];
|
|
||||||
|
|
||||||
if (d0->bmi.mv.as_int == d1->bmi.mv.as_int)
|
if (x->mode_info_context->mbmi.partitioning < 3)
|
||||||
build_inter_predictors2b(x, d0, 16);
|
{
|
||||||
else
|
for (i = 0; i < 4; i++)
|
||||||
{
|
{
|
||||||
vp8_build_inter_predictors_b(d0, 16, x->subpixel_predict);
|
BLOCKD *d = &x->block[bbb[i]];
|
||||||
vp8_build_inter_predictors_b(d1, 16, x->subpixel_predict);
|
build_inter_predictors4b(x, d, 16);
|
||||||
}
|
}
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
}
|
|
||||||
|
|
||||||
for (i = 16; i < 24; i += 2)
|
|
||||||
{
|
|
||||||
BLOCKD *d0 = &x->block[i];
|
|
||||||
BLOCKD *d1 = &x->block[i+1];
|
|
||||||
|
|
||||||
if (d0->bmi.mv.as_int == d1->bmi.mv.as_int)
|
|
||||||
build_inter_predictors2b(x, d0, 8);
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
vp8_build_inter_predictors_b(d0, 8, x->subpixel_predict);
|
for (i = 0; i < 16; i += 2)
|
||||||
vp8_build_inter_predictors_b(d1, 8, x->subpixel_predict);
|
{
|
||||||
|
BLOCKD *d0 = &x->block[i];
|
||||||
|
BLOCKD *d1 = &x->block[i+1];
|
||||||
|
|
||||||
|
if (d0->bmi.mv.as_int == d1->bmi.mv.as_int)
|
||||||
|
build_inter_predictors2b(x, d0, 16);
|
||||||
|
else
|
||||||
|
{
|
||||||
|
vp8_build_inter_predictors_b(d0, 16, x->subpixel_predict);
|
||||||
|
vp8_build_inter_predictors_b(d1, 16, x->subpixel_predict);
|
||||||
|
}
|
||||||
|
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
void vp8_build_inter_predictors_mb(MACROBLOCKD *x)
|
void vp8_build_inter_predictors_mb(MACROBLOCKD *x)
|
||||||
{
|
{
|
||||||
if (x->mode_info_context->mbmi.mode != SPLITMV)
|
|
||||||
|
if (x->mode_info_context->mbmi.ref_frame != INTRA_FRAME &&
|
||||||
|
x->mode_info_context->mbmi.mode != SPLITMV)
|
||||||
{
|
{
|
||||||
vp8_build_inter16x16_predictors_mb(x, x->predictor, &x->predictor[256],
|
int offset;
|
||||||
&x->predictor[320], 16, 8);
|
unsigned char *ptr_base;
|
||||||
|
unsigned char *ptr;
|
||||||
|
unsigned char *uptr, *vptr;
|
||||||
|
unsigned char *pred_ptr = x->predictor;
|
||||||
|
unsigned char *upred_ptr = &x->predictor[256];
|
||||||
|
unsigned char *vpred_ptr = &x->predictor[320];
|
||||||
|
|
||||||
|
int mv_row = x->mode_info_context->mbmi.mv.as_mv.row;
|
||||||
|
int mv_col = x->mode_info_context->mbmi.mv.as_mv.col;
|
||||||
|
int pre_stride = x->block[0].pre_stride;
|
||||||
|
|
||||||
|
ptr_base = x->pre.y_buffer;
|
||||||
|
ptr = ptr_base + (mv_row >> 3) * pre_stride + (mv_col >> 3);
|
||||||
|
|
||||||
|
if ((mv_row | mv_col) & 7)
|
||||||
|
{
|
||||||
|
x->subpixel_predict16x16(ptr, pre_stride, mv_col & 7, mv_row & 7, pred_ptr, 16);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
RECON_INVOKE(&x->rtcd->recon, copy16x16)(ptr, pre_stride, pred_ptr, 16);
|
||||||
|
}
|
||||||
|
|
||||||
|
mv_row = x->block[16].bmi.mv.as_mv.row;
|
||||||
|
mv_col = x->block[16].bmi.mv.as_mv.col;
|
||||||
|
pre_stride >>= 1;
|
||||||
|
offset = (mv_row >> 3) * pre_stride + (mv_col >> 3);
|
||||||
|
uptr = x->pre.u_buffer + offset;
|
||||||
|
vptr = x->pre.v_buffer + offset;
|
||||||
|
|
||||||
|
if ((mv_row | mv_col) & 7)
|
||||||
|
{
|
||||||
|
x->subpixel_predict8x8(uptr, pre_stride, mv_col & 7, mv_row & 7, upred_ptr, 8);
|
||||||
|
x->subpixel_predict8x8(vptr, pre_stride, mv_col & 7, mv_row & 7, vpred_ptr, 8);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
RECON_INVOKE(&x->rtcd->recon, copy8x8)(uptr, pre_stride, upred_ptr, 8);
|
||||||
|
RECON_INVOKE(&x->rtcd->recon, copy8x8)(vptr, pre_stride, vpred_ptr, 8);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
vp8_build_inter4x4_predictors_mb(x);
|
int i;
|
||||||
|
|
||||||
|
if (x->mode_info_context->mbmi.partitioning < 3)
|
||||||
|
{
|
||||||
|
for (i = 0; i < 4; i++)
|
||||||
|
{
|
||||||
|
BLOCKD *d = &x->block[bbb[i]];
|
||||||
|
build_inter_predictors4b(x, d, 16);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
for (i = 0; i < 16; i += 2)
|
||||||
|
{
|
||||||
|
BLOCKD *d0 = &x->block[i];
|
||||||
|
BLOCKD *d1 = &x->block[i+1];
|
||||||
|
|
||||||
|
if (d0->bmi.mv.as_int == d1->bmi.mv.as_int)
|
||||||
|
build_inter_predictors2b(x, d0, 16);
|
||||||
|
else
|
||||||
|
{
|
||||||
|
vp8_build_inter_predictors_b(d0, 16, x->subpixel_predict);
|
||||||
|
vp8_build_inter_predictors_b(d1, 16, x->subpixel_predict);
|
||||||
|
}
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
for (i = 16; i < 24; i += 2)
|
||||||
|
{
|
||||||
|
BLOCKD *d0 = &x->block[i];
|
||||||
|
BLOCKD *d1 = &x->block[i+1];
|
||||||
|
|
||||||
|
if (d0->bmi.mv.as_int == d1->bmi.mv.as_int)
|
||||||
|
build_inter_predictors2b(x, d0, 8);
|
||||||
|
else
|
||||||
|
{
|
||||||
|
vp8_build_inter_predictors_b(d0, 8, x->subpixel_predict);
|
||||||
|
vp8_build_inter_predictors_b(d1, 8, x->subpixel_predict);
|
||||||
|
}
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
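Review note: the inter-prediction paths in the hunk above repeatedly split a motion vector with `>> 3` and `& 7`, then pick either the sub-pixel filter or a plain block copy. The sketch below isolates that arithmetic so it can be checked on its own. `mv_split` and `split_mv` are hypothetical names introduced here for illustration; they are not part of the code under review.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the reviewed code): result of splitting a
 * motion vector whose low three bits carry the sub-pixel fraction. */
typedef struct {
    ptrdiff_t offset;  /* whole-pixel offset into the reference plane */
    int xfrac;         /* horizontal fraction, 0..7 */
    int yfrac;         /* vertical fraction, 0..7 */
    int needs_filter;  /* nonzero when either fraction is nonzero */
} mv_split;

static mv_split split_mv(int mv_row, int mv_col, int stride)
{
    mv_split s;

    assert(stride > 0);

    /* >> 3 drops the fractional bits. Like the reviewed code, this assumes
     * an arithmetic right shift for negative values, which C leaves
     * implementation-defined. */
    s.offset = (ptrdiff_t)(mv_row >> 3) * stride + (mv_col >> 3);

    /* & 7 keeps only the fractional bits. */
    s.xfrac = mv_col & 7;
    s.yfrac = mv_row & 7;

    /* Same test as `(mv_row | mv_col) & 7` in the diff. */
    s.needs_filter = ((mv_row | mv_col) & 7) != 0;
    return s;
}
```

A caller would add `offset` to the reference buffer pointer, then invoke the sub-pixel predictor with `xfrac` and `yfrac` when `needs_filter` is set, or the plain copy routine otherwise.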
@@ -461,5 +492,202 @@ void vp8_build_uvmvs(MACROBLOCKD *x, int fullpixel)
|
|||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
/* The following functions are written for skip_recon_mb() to call. Since there is no recon in this
|
||||||
|
* situation, we can write the result directly to dst buffer instead of writing it to predictor
|
||||||
|
* buffer and then copying it to dst buffer.
|
||||||
|
*/
|
||||||
|
static void vp8_build_inter_predictors_b_s(BLOCKD *d, unsigned char *dst_ptr, vp8_subpix_fn_t sppf)
|
||||||
|
{
|
||||||
|
int r;
|
||||||
|
unsigned char *ptr_base;
|
||||||
|
unsigned char *ptr;
|
||||||
|
/*unsigned char *pred_ptr = d->predictor;*/
|
||||||
|
int dst_stride = d->dst_stride;
|
||||||
|
int pre_stride = d->pre_stride;
|
||||||
|
|
||||||
|
ptr_base = *(d->base_pre);
|
||||||
|
|
||||||
|
if (d->bmi.mv.as_mv.row & 7 || d->bmi.mv.as_mv.col & 7)
|
||||||
|
{
|
||||||
|
ptr = ptr_base + d->pre + (d->bmi.mv.as_mv.row >> 3) * d->pre_stride + (d->bmi.mv.as_mv.col >> 3);
|
||||||
|
sppf(ptr, pre_stride, d->bmi.mv.as_mv.col & 7, d->bmi.mv.as_mv.row & 7, dst_ptr, dst_stride);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
ptr_base += d->pre + (d->bmi.mv.as_mv.row >> 3) * d->pre_stride + (d->bmi.mv.as_mv.col >> 3);
|
||||||
|
ptr = ptr_base;
|
||||||
|
|
||||||
|
for (r = 0; r < 4; r++)
|
||||||
|
{
|
||||||
|
#ifdef MUST_BE_ALIGNED
|
||||||
|
dst_ptr[0] = ptr[0];
|
||||||
|
dst_ptr[1] = ptr[1];
|
||||||
|
dst_ptr[2] = ptr[2];
|
||||||
|
dst_ptr[3] = ptr[3];
|
||||||
|
#else
|
||||||
|
*(int *)dst_ptr = *(int *)ptr ;
|
||||||
|
#endif
|
||||||
|
dst_ptr += dst_stride;
|
||||||
|
ptr += pre_stride;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
void vp8_build_inter_predictors_mb_s(MACROBLOCKD *x)
|
||||||
|
{
|
||||||
|
/*unsigned char *pred_ptr = x->block[0].predictor;
|
||||||
|
unsigned char *dst_ptr = *(x->block[0].base_dst) + x->block[0].dst;*/
|
||||||
|
unsigned char *pred_ptr = x->predictor;
|
||||||
|
unsigned char *dst_ptr = x->dst.y_buffer;
|
||||||
|
|
||||||
|
if (x->mode_info_context->mbmi.mode != SPLITMV)
|
||||||
|
{
|
||||||
|
int offset;
|
||||||
|
unsigned char *ptr_base;
|
||||||
|
unsigned char *ptr;
|
||||||
|
unsigned char *uptr, *vptr;
|
||||||
|
/*unsigned char *pred_ptr = x->predictor;
|
||||||
|
unsigned char *upred_ptr = &x->predictor[256];
|
||||||
|
unsigned char *vpred_ptr = &x->predictor[320];*/
|
||||||
|
unsigned char *udst_ptr = x->dst.u_buffer;
|
||||||
|
unsigned char *vdst_ptr = x->dst.v_buffer;
|
||||||
|
|
||||||
|
int mv_row = x->mode_info_context->mbmi.mv.as_mv.row;
|
||||||
|
int mv_col = x->mode_info_context->mbmi.mv.as_mv.col;
|
||||||
|
int pre_stride = x->dst.y_stride; /*x->block[0].pre_stride;*/
|
||||||
|
|
||||||
|
ptr_base = x->pre.y_buffer;
|
||||||
|
ptr = ptr_base + (mv_row >> 3) * pre_stride + (mv_col >> 3);
|
||||||
|
|
||||||
|
if ((mv_row | mv_col) & 7)
|
||||||
|
{
|
||||||
|
x->subpixel_predict16x16(ptr, pre_stride, mv_col & 7, mv_row & 7, dst_ptr, x->dst.y_stride); /*x->block[0].dst_stride);*/
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
RECON_INVOKE(&x->rtcd->recon, copy16x16)(ptr, pre_stride, dst_ptr, x->dst.y_stride); /*x->block[0].dst_stride);*/
|
||||||
|
}
|
||||||
|
|
||||||
|
mv_row = x->block[16].bmi.mv.as_mv.row;
|
||||||
|
mv_col = x->block[16].bmi.mv.as_mv.col;
|
||||||
|
pre_stride >>= 1;
|
||||||
|
offset = (mv_row >> 3) * pre_stride + (mv_col >> 3);
|
||||||
|
uptr = x->pre.u_buffer + offset;
|
||||||
|
vptr = x->pre.v_buffer + offset;
|
||||||
|
|
||||||
|
if ((mv_row | mv_col) & 7)
|
||||||
|
{
|
||||||
|
x->subpixel_predict8x8(uptr, pre_stride, mv_col & 7, mv_row & 7, udst_ptr, x->dst.uv_stride);
|
||||||
|
x->subpixel_predict8x8(vptr, pre_stride, mv_col & 7, mv_row & 7, vdst_ptr, x->dst.uv_stride);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
RECON_INVOKE(&x->rtcd->recon, copy8x8)(uptr, pre_stride, udst_ptr, x->dst.uv_stride);
|
||||||
|
RECON_INVOKE(&x->rtcd->recon, copy8x8)(vptr, pre_stride, vdst_ptr, x->dst.uv_stride);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
/* note: this whole ELSE part is not executed at all. So, no way to test the correctness of my modification. Later,
|
||||||
|
* if something is wrong, revert to the code in build_inter_predictors_mb.
|
||||||
|
*/
|
||||||
|
int i;
|
||||||
|
|
||||||
|
if (x->mode_info_context->mbmi.partitioning < 3)
|
||||||
|
{
|
||||||
|
for (i = 0; i < 4; i++)
|
||||||
|
{
|
||||||
|
BLOCKD *d = &x->block[bbb[i]];
|
||||||
|
/*build_inter_predictors4b(x, d, 16);*/
|
||||||
|
|
||||||
|
{
|
||||||
|
unsigned char *ptr_base;
|
||||||
|
unsigned char *ptr;
|
||||||
|
unsigned char *pred_ptr = d->predictor;
|
||||||
|
|
||||||
|
ptr_base = *(d->base_pre);
|
||||||
|
ptr = ptr_base + d->pre + (d->bmi.mv.as_mv.row >> 3) * d->pre_stride + (d->bmi.mv.as_mv.col >> 3);
|
||||||
|
|
||||||
|
if (d->bmi.mv.as_mv.row & 7 || d->bmi.mv.as_mv.col & 7)
|
||||||
|
{
|
||||||
|
x->subpixel_predict8x8(ptr, d->pre_stride, d->bmi.mv.as_mv.col & 7, d->bmi.mv.as_mv.row & 7, dst_ptr, x->dst.y_stride); /*x->block[0].dst_stride);*/
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
RECON_INVOKE(&x->rtcd->recon, copy8x8)(ptr, d->pre_stride, dst_ptr, x->dst.y_stride); /*x->block[0].dst_stride);*/
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
for (i = 0; i < 16; i += 2)
|
||||||
|
{
|
||||||
|
BLOCKD *d0 = &x->block[i];
|
||||||
|
BLOCKD *d1 = &x->block[i+1];
|
||||||
|
|
||||||
|
if (d0->bmi.mv.as_int == d1->bmi.mv.as_int)
|
||||||
|
{
|
||||||
|
/*build_inter_predictors2b(x, d0, 16);*/
|
||||||
|
unsigned char *ptr_base;
|
||||||
|
unsigned char *ptr;
|
||||||
|
unsigned char *pred_ptr = d0->predictor;
|
||||||
|
|
||||||
|
ptr_base = *(d0->base_pre);
|
||||||
|
ptr = ptr_base + d0->pre + (d0->bmi.mv.as_mv.row >> 3) * d0->pre_stride + (d0->bmi.mv.as_mv.col >> 3);
|
||||||
|
|
||||||
|
if (d0->bmi.mv.as_mv.row & 7 || d0->bmi.mv.as_mv.col & 7)
|
||||||
|
{
|
||||||
|
x->subpixel_predict8x4(ptr, d0->pre_stride, d0->bmi.mv.as_mv.col & 7, d0->bmi.mv.as_mv.row & 7, dst_ptr, x->dst.y_stride);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
RECON_INVOKE(&x->rtcd->recon, copy8x4)(ptr, d0->pre_stride, dst_ptr, x->dst.y_stride);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
vp8_build_inter_predictors_b_s(d0, dst_ptr, x->subpixel_predict);
|
||||||
|
vp8_build_inter_predictors_b_s(d1, dst_ptr, x->subpixel_predict);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
for (i = 16; i < 24; i += 2)
|
||||||
|
{
|
||||||
|
BLOCKD *d0 = &x->block[i];
|
||||||
|
BLOCKD *d1 = &x->block[i+1];
|
||||||
|
|
||||||
|
if (d0->bmi.mv.as_int == d1->bmi.mv.as_int)
|
||||||
|
{
|
||||||
|
/*build_inter_predictors2b(x, d0, 8);*/
|
||||||
|
unsigned char *ptr_base;
|
||||||
|
unsigned char *ptr;
|
||||||
|
unsigned char *pred_ptr = d0->predictor;
|
||||||
|
|
||||||
|
ptr_base = *(d0->base_pre);
|
||||||
|
ptr = ptr_base + d0->pre + (d0->bmi.mv.as_mv.row >> 3) * d0->pre_stride + (d0->bmi.mv.as_mv.col >> 3);
|
||||||
|
|
||||||
|
if (d0->bmi.mv.as_mv.row & 7 || d0->bmi.mv.as_mv.col & 7)
|
||||||
|
{
|
||||||
|
x->subpixel_predict8x4(ptr, d0->pre_stride,
|
||||||
|
d0->bmi.mv.as_mv.col & 7,
|
||||||
|
d0->bmi.mv.as_mv.row & 7,
|
||||||
|
dst_ptr, x->dst.uv_stride);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
RECON_INVOKE(&x->rtcd->recon, copy8x4)(ptr,
|
||||||
|
d0->pre_stride, dst_ptr, x->dst.uv_stride);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
vp8_build_inter_predictors_b_s(d0, dst_ptr, x->subpixel_predict);
|
||||||
|
vp8_build_inter_predictors_b_s(d1, dst_ptr, x->subpixel_predict);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
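Review note: both sides of this diff guard the word-sized copies (`*(int *)dst = *(int *)src`) behind a build-time switch, because those casts assume the pointers are suitably aligned. A possible alternative, shown only as a sketch, is a fixed-size `memcpy`, which is valid for any alignment and which compilers commonly lower to a single load and store where the target allows it. `copy_4xN` is a hypothetical helper name, not something this change introduces.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not part of the reviewed code): copy `rows` rows of
 * four bytes each, with independent source and destination strides.
 * memcpy makes no alignment assumption, so no MUST_BE_ALIGNED-style
 * switch is needed. */
static void copy_4xN(const unsigned char *src, int src_stride,
                     unsigned char *dst, int dst_stride, int rows)
{
    int r;

    assert(src && dst);

    for (r = 0; r < rows; r++)
    {
        memcpy(dst, src, 4);  /* one 4-byte row */
        src += src_stride;
        dst += dst_stride;
    }
}
```

Whether this matches the hand-written paths in speed depends on the compiler and target, so it would need measuring before replacing either branch.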
|
|||||||
@@ -13,15 +13,9 @@
|
|||||||
#define __INC_RECONINTER_H
|
#define __INC_RECONINTER_H
|
||||||
|
|
||||||
extern void vp8_build_inter_predictors_mb(MACROBLOCKD *x);
|
extern void vp8_build_inter_predictors_mb(MACROBLOCKD *x);
|
||||||
extern void vp8_build_inter16x16_predictors_mb(MACROBLOCKD *x,
|
extern void vp8_build_inter_predictors_mb_s(MACROBLOCKD *x);
|
||||||
unsigned char *dst_y,
|
|
||||||
unsigned char *dst_u,
|
|
||||||
unsigned char *dst_v,
|
|
||||||
int dst_ystride,
|
|
||||||
int dst_uvstride);
|
|
||||||
|
|
||||||
|
extern void vp8_build_inter_predictors_mby(MACROBLOCKD *x);
|
||||||
extern void vp8_build_inter16x16_predictors_mby(MACROBLOCKD *x);
|
|
||||||
extern void vp8_build_uvmvs(MACROBLOCKD *x, int fullpixel);
|
extern void vp8_build_uvmvs(MACROBLOCKD *x, int fullpixel);
|
||||||
extern void vp8_build_inter_predictors_b(BLOCKD *d, int pitch, vp8_subpix_fn_t sppf);
|
extern void vp8_build_inter_predictors_b(BLOCKD *d, int pitch, vp8_subpix_fn_t sppf);
|
||||||
extern void vp8_build_inter_predictors_mbuv(MACROBLOCKD *x);
|
extern void vp8_build_inter_predictors_mbuv(MACROBLOCKD *x);
|
||||||
|
|||||||
@@ -14,4 +14,9 @@
|
|||||||
|
|
||||||
extern void init_intra_left_above_pixels(MACROBLOCKD *x);
|
extern void init_intra_left_above_pixels(MACROBLOCKD *x);
|
||||||
|
|
||||||
|
extern void vp8_build_intra_predictors_mbuv(MACROBLOCKD *x);
|
||||||
|
extern void vp8_build_intra_predictors_mbuv_s(MACROBLOCKD *x);
|
||||||
|
|
||||||
|
extern void vp8_predict_intra4x4(BLOCKD *x, int b_mode, unsigned char *Predictor);
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
|
|||||||
@@ -14,7 +14,7 @@
|
|||||||
#include "vpx_mem/vpx_mem.h"
|
#include "vpx_mem/vpx_mem.h"
|
||||||
#include "reconintra.h"
|
#include "reconintra.h"
|
||||||
|
|
||||||
void vp8_intra4x4_predict(BLOCKD *x,
|
void vp8_predict_intra4x4(BLOCKD *x,
|
||||||
int b_mode,
|
int b_mode,
|
||||||
unsigned char *predictor)
|
unsigned char *predictor)
|
||||||
{
|
{
|
||||||
|
|||||||
@@ -12,6 +12,8 @@
|
|||||||
#ifndef _PTHREAD_EMULATION
|
#ifndef _PTHREAD_EMULATION
|
||||||
#define _PTHREAD_EMULATION
|
#define _PTHREAD_EMULATION
|
||||||
|
|
||||||
|
#define VPXINFINITE 10000 /* 10 seconds */
|
||||||
|
|
||||||
#if CONFIG_OS_SUPPORT && CONFIG_MULTITHREAD
|
#if CONFIG_OS_SUPPORT && CONFIG_MULTITHREAD
|
||||||
|
|
||||||
/* Thread management macros */
|
/* Thread management macros */
|
||||||
@@ -26,7 +28,7 @@
|
|||||||
#define pthread_t HANDLE
|
#define pthread_t HANDLE
|
||||||
#define pthread_attr_t DWORD
|
#define pthread_attr_t DWORD
|
||||||
#define pthread_create(thhandle,attr,thfunc,tharg) (int)((*thhandle=(HANDLE)_beginthreadex(NULL,0,(unsigned int (__stdcall *)(void *))thfunc,tharg,0,NULL))==NULL)
|
#define pthread_create(thhandle,attr,thfunc,tharg) (int)((*thhandle=(HANDLE)_beginthreadex(NULL,0,(unsigned int (__stdcall *)(void *))thfunc,tharg,0,NULL))==NULL)
|
||||||
#define pthread_join(thread, result) ((WaitForSingleObject((thread),INFINITE)!=WAIT_OBJECT_0) || !CloseHandle(thread))
|
#define pthread_join(thread, result) ((WaitForSingleObject((thread),VPXINFINITE)!=WAIT_OBJECT_0) || !CloseHandle(thread))
|
||||||
#define pthread_detach(thread) if(thread!=NULL)CloseHandle(thread)
|
#define pthread_detach(thread) if(thread!=NULL)CloseHandle(thread)
|
||||||
#define thread_sleep(nms) Sleep(nms)
|
#define thread_sleep(nms) Sleep(nms)
|
||||||
#define pthread_cancel(thread) terminate_thread(thread,0)
|
#define pthread_cancel(thread) terminate_thread(thread,0)
|
||||||
@@ -59,9 +61,9 @@
|
|||||||
#ifdef _WIN32
|
#ifdef _WIN32
|
||||||
#define sem_t HANDLE
|
#define sem_t HANDLE
|
||||||
#define pause(voidpara) __asm PAUSE
|
#define pause(voidpara) __asm PAUSE
|
||||||
#define sem_init(sem, sem_attr1, sem_init_value) (int)((*sem = CreateSemaphore(NULL,0,32768,NULL))==NULL)
|
#define sem_init(sem, sem_attr1, sem_init_value) (int)((*sem = CreateEvent(NULL,FALSE,FALSE,NULL))==NULL)
|
||||||
#define sem_wait(sem) (int)(WAIT_OBJECT_0 != WaitForSingleObject(*sem,INFINITE))
|
#define sem_wait(sem) (int)(WAIT_OBJECT_0 != WaitForSingleObject(*sem,VPXINFINITE))
|
||||||
#define sem_post(sem) ReleaseSemaphore(*sem,1,NULL)
|
#define sem_post(sem) SetEvent(*sem)
|
||||||
#define sem_destroy(sem) if(*sem)((int)(CloseHandle(*sem))==TRUE)
|
#define sem_destroy(sem) if(*sem)((int)(CloseHandle(*sem))==TRUE)
|
||||||
#define thread_sleep(nms) Sleep(nms)
|
#define thread_sleep(nms) Sleep(nms)
|
||||||
|
|
||||||
|
|||||||
494
vp8/common/x86/boolcoder.cxx
Normal file
@@ -0,0 +1,494 @@
|
|||||||
|
/*
|
||||||
|
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
|
||||||
|
*
|
||||||
|
* Use of this source code is governed by a BSD-style license
|
||||||
|
* that can be found in the LICENSE file in the root of the source
|
||||||
|
* tree. An additional intellectual property rights grant can be found
|
||||||
|
* in the file PATENTS. All contributing project authors may
|
||||||
|
* be found in the AUTHORS file in the root of the source tree.
|
||||||
|
*/
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
/* Arithmetic bool coder with largish probability range.
|
||||||
|
Timothy S Murphy 6 August 2004 */
|
||||||
|
|
||||||
|
#include <assert.h>
|
||||||
|
#include <math.h>
|
||||||
|
|
||||||
|
#include "bool_coder.h"
|
||||||
|
|
||||||
|
#if tim_vp8
|
||||||
|
extern "C" {
|
||||||
|
# include "VP8cx/treewriter.h"
|
||||||
|
}
|
||||||
|
#endif
|
||||||
|
|
||||||
|
int_types::~int_types() {}
|
||||||
|
|
||||||
|
void bool_coder_spec::check_prec() const {
|
||||||
|
assert( w && (r==Up || w > 1) && w < 24 && (ebias || w < 17));
|
||||||
|
}
|
||||||
|
|
||||||
|
bool bool_coder_spec::float_init( uint Ebits, uint Mbits) {
|
||||||
|
uint b = (ebits = Ebits) + (mbits = Mbits);
|
||||||
|
if( b) {
|
||||||
|
assert( ebits < 6 && w + mbits < 31);
|
||||||
|
assert( ebits + mbits < sizeof(Index) * 8);
|
||||||
|
ebias = (1 << ebits) + 1 + mbits;
|
||||||
|
mmask = (1 << mbits) - 1;
|
||||||
|
max_index = ( ( half_index = 1 << b ) << 1) - 1;
|
||||||
|
} else {
|
||||||
|
ebias = 0;
|
||||||
|
max_index = 255;
|
||||||
|
half_index = 128;
|
||||||
|
}
|
||||||
|
check_prec();
|
||||||
|
return b? 1:0;
|
||||||
|
}
|
||||||
|
|
||||||
|
void bool_coder_spec::cost_init()
|
||||||
|
{
|
||||||
|
static cdouble c = -(1 << 20)/log( 2.);
|
||||||
|
|
||||||
|
FILE *f = fopen( "costs.txt", "w");
|
||||||
|
assert( f);
|
||||||
|
|
||||||
|
assert( sizeof(int) >= 4); /* for C interface */
|
||||||
|
assert( max_index <= 255); /* size of Ctbl */
|
||||||
|
uint i = 0; do {
|
||||||
|
cdouble p = ( *this)( (Index) i);
|
||||||
|
Ctbl[i] = (uint32) ( log( p) * c);
|
||||||
|
fprintf(
|
||||||
|
f, "cost( %d -> %10.7f) = %10d = %12.5f bits\n",
|
||||||
|
i, p, Ctbl[i], (double) Ctbl[i] / (1<<20)
|
||||||
|
);
|
||||||
|
} while( ++i <= max_index);
|
||||||
|
fclose( f);
|
||||||
|
}
|
||||||
|
|
||||||
|
bool_coder_spec_explicit_table::bool_coder_spec_explicit_table(
|
||||||
|
cuint16 tbl[256], Rounding rr, uint prec
|
||||||
|
)
|
||||||
|
: bool_coder_spec( prec, rr)
|
||||||
|
{
|
||||||
|
check_prec();
|
||||||
|
uint i = 0;
|
||||||
|
if( tbl)
|
||||||
|
do { Ptbl[i] = tbl[i];} while( ++i < 256);
|
||||||
|
else
|
||||||
|
do { Ptbl[i] = i << 8;} while( ++i < 256);
|
||||||
|
cost_init();
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
bool_coder_spec_exponential_table::bool_coder_spec_exponential_table(
|
||||||
|
uint x, Rounding rr, uint prec
|
||||||
|
)
|
||||||
|
: bool_coder_spec( prec, rr)
|
||||||
|
{
|
||||||
|
assert( x > 1 && x <= 16);
|
||||||
|
check_prec();
|
||||||
|
Ptbl[128] = 32768u;
|
||||||
|
Ptbl[0] = (uint16) pow( 2., 16. - x);
|
||||||
|
--x;
|
||||||
|
int i=1; do {
|
||||||
|
cdouble d = pow( .5, 1. + (1. - i/128.)*x) * 65536.;
|
||||||
|
uint16 v = (uint16) d;
|
||||||
|
if( v < i)
|
||||||
|
v = i;
|
||||||
|
Ptbl[256-i] = (uint16) ( 65536U - (Ptbl[i] = v));
|
||||||
|
} while( ++i < 128);
|
||||||
|
cost_init();
|
||||||
|
}
|
||||||
|
|
||||||
|
bool_coder_spec::bool_coder_spec( FILE *fp) {
|
||||||
|
fscanf( fp, "%d", &w);
|
||||||
|
int v;
|
||||||
|
fscanf( fp, "%d", &v);
|
||||||
|
assert( 0 <= v && v <= 2);
|
||||||
|
r = (Rounding) v;
|
||||||
|
fscanf( fp, "%d", &ebits);
|
||||||
|
fscanf( fp, "%d", &mbits);
|
||||||
|
if( float_init( ebits, mbits))
|
||||||
|
return;
|
||||||
|
int i=0; do {
|
||||||
|
uint v;
|
||||||
|
fscanf( fp, "%d", &v);
|
||||||
|
assert( 0 <=v && v <= 65535U);
|
||||||
|
Ptbl[i] = v;
|
||||||
|
} while( ++i < 256);
|
||||||
|
cost_init();
|
||||||
|
}
|
||||||
|
|
||||||
|
void bool_coder_spec::dump( FILE *fp) const {
|
||||||
|
fprintf( fp, "%d %d %d %d\n", w, (int) r, ebits, mbits);
|
||||||
|
if( ebits || mbits)
|
||||||
|
return;
|
||||||
|
int i=0; do { fprintf( fp, "%d\n", Ptbl[i]);} while( ++i < 256);
|
||||||
|
}
|
||||||
|
|
||||||
|
vp8bc_index_t bool_coder_spec::operator()( double p) const
|
||||||
|
{
|
||||||
|
if( p <= 0.)
|
||||||
|
return 0;
|
||||||
|
if( p >= 1.)
|
||||||
|
return max_index;
|
||||||
|
if( ebias) {
|
||||||
|
if( p > .5)
|
||||||
|
return max_index - ( *this)( 1. - p);
|
||||||
|
int e;
|
||||||
|
uint m = (uint) ldexp( frexp( p, &e), mbits + 2);
|
||||||
|
uint x = 1 << (mbits + 1);
|
||||||
|
assert( x <= m && m < x<<1);
|
||||||
|
if( (m = (m >> 1) + (m & 1)) >= x) {
|
||||||
|
m = x >> 1;
|
||||||
|
++e;
|
||||||
|
}
|
||||||
|
int y = 1 << ebits;
|
||||||
|
if( (e += y) >= y)
|
||||||
|
return half_index - 1;
|
||||||
|
if( e < 0)
|
||||||
|
return 0;
|
||||||
|
return (Index) ( (e << mbits) + (m & mmask));
|
||||||
|
}
|
||||||
|
|
||||||
|
cuint16 v = (uint16) (p * 65536.);
|
||||||
|
int i = 128;
|
||||||
|
int j = 128;
|
||||||
|
uint16 w;
|
||||||
|
while( w = Ptbl[i], j >>= 1) {
|
||||||
|
if( w < v)
|
||||||
|
i += j;
|
||||||
|
else if( w == v)
|
||||||
|
return (uchar) i;
|
||||||
|
else
|
||||||
|
i -= j;
|
||||||
|
}
|
||||||
|
if( w > v) {
|
||||||
|
cuint16 x = Ptbl[i-1];
|
||||||
|
if( v <= x || w - v > v - x)
|
||||||
|
--i;
|
||||||
|
} else if( w < v && i < 255) {
|
||||||
|
cuint16 x = Ptbl[i+1];
|
||||||
|
if( x <= v || x - v < v - w)
|
||||||
|
++i;
|
||||||
|
}
|
||||||
|
return (Index) i;
|
||||||
|
}
|
||||||
|
|
||||||
|
double bool_coder_spec::operator()( Index i) const {
|
||||||
|
if( !ebias)
|
||||||
|
return Ptbl[i]/65536.;
|
||||||
|
if( i >= half_index)
|
||||||
|
return 1. - ( *this)( (Index) (max_index - i));
|
||||||
|
return ldexp( (double)mantissa( i), - (int) exponent( i));
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
void bool_writer::carry() {
|
||||||
|
uchar *p = B;
|
||||||
|
assert( p > Bstart);
|
||||||
|
while( *--p == 255) { assert( p > Bstart); *p = 0;}
|
||||||
|
++*p;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
bool_writer::bool_writer( c_spec& s, uchar *Dest, size_t Len)
|
||||||
|
: bool_coder( s),
|
||||||
|
Bstart( Dest),
|
||||||
|
Bend( Len? Dest+Len : 0),
|
||||||
|
B( Dest)
|
||||||
|
{
|
||||||
|
assert( Dest);
|
||||||
|
reset();
|
||||||
|
}
|
||||||
|
|
||||||
|
bool_writer::~bool_writer() { flush();}
|
||||||
|
|
||||||
|
#if 1
|
||||||
|
extern "C" { int bc_v = 0;}
|
||||||
|
#else
|
||||||
|
# define bc_v 0
|
||||||
|
#endif
|
||||||
|
|
||||||
|
|
||||||
|
void bool_writer::raw( bool value, uint32 s) {
|
||||||
|
uint32 L = Low;
|
||||||
|
|
||||||
|
assert( Range >= min_range && Range <= spec.max_range());
|
||||||
|
assert( !is_toast && s && s < Range);
|
||||||
|
|
||||||
|
if( bc_v) printf(
|
||||||
|
"Writing a %d, B %x Low %x Range %x s %x blag %d ...\n",
|
||||||
|
value? 1:0, B-Bstart, Low, Range, s, bit_lag
|
||||||
|
);
|
||||||
|
if( value) {
|
||||||
|
L += s;
|
||||||
|
s = Range - s;
|
||||||
|
} else
|
||||||
|
s -= rinc;
|
||||||
|
if( s < min_range) {
|
||||||
|
int ct = bit_lag; do {
|
||||||
|
if( !--ct) {
|
||||||
|
ct = 8;
|
||||||
|
if( L & (1 << 31))
|
||||||
|
carry();
|
||||||
|
assert( !Bend || B < Bend);
|
||||||
|
*B++ = (uchar) (L >> 23);
|
||||||
|
L &= (1<<23) - 1;
|
||||||
|
}
|
||||||
|
} while( L += L, (s += s + rinc) < min_range);
|
||||||
|
bit_lag = ct;
|
||||||
|
}
|
||||||
|
Low = L;
|
||||||
|
Range = s;
|
||||||
|
if( bc_v)
|
||||||
|
printf(
|
||||||
|
"...done, B %x Low %x Range %x blag %d \n",
|
||||||
|
B-Bstart, Low, Range, bit_lag
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
bool_writer& bool_writer::flush() {
|
||||||
|
if( is_toast)
|
||||||
|
return *this;
|
||||||
|
int b = bit_lag;
|
||||||
|
uint32 L = Low;
|
||||||
|
assert( b);
|
||||||
|
if( L & (1 << (32 - b)))
|
||||||
|
carry();
|
||||||
|
L <<= b & 7;
|
||||||
|
b >>= 3;
|
||||||
|
while( --b >= 0)
|
||||||
|
L <<= 8;
|
||||||
|
b = 4;
|
||||||
|
assert( !Bend || B + 4 <= Bend);
|
||||||
|
do {
|
||||||
|
*B++ = (uchar) (L >> 24);
|
||||||
|
L <<= 8;
|
||||||
|
} while( --b);
|
||||||
|
is_toast = 1;
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
bool_reader::bool_reader( c_spec& s, cuchar *src, size_t Len)
|
||||||
|
: bool_coder( s),
|
||||||
|
Bstart( src),
|
||||||
|
B( src),
|
||||||
|
Bend( Len? src+Len : 0),
|
||||||
|
shf( 32 - s.w),
|
||||||
|
bct( 8)
|
||||||
|
{
|
||||||
|
int i = 4; do { Low <<= 8; Low |= *B++;} while( --i);
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
bool bool_reader::raw( uint32 s) {
|
||||||
|
|
||||||
|
bool val = 0;
|
||||||
|
uint32 L = Low;
|
||||||
|
cuint32 S = s << shf;
|
||||||
|
|
||||||
|
assert( Range >= min_range && Range <= spec.max_range());
|
||||||
|
assert( s && s < Range && (L >> shf) < Range);
|
||||||
|
|
||||||
|
if( bc_v)
|
||||||
|
printf(
|
||||||
|
"Reading, B %x Low %x Range %x s %x bct %d ...\n",
|
||||||
|
B-Bstart, Low, Range, s, bct
|
||||||
|
);
|
||||||
|
|
||||||
|
if( L >= S) {
|
||||||
|
L -= S;
|
||||||
|
s = Range - s;
|
||||||
|
assert( L < (s << shf));
|
||||||
|
val = 1;
|
||||||
|
} else
|
||||||
|
s -= rinc;
|
||||||
|
if( s < min_range) {
|
||||||
|
int ct = bct;
|
||||||
|
do {
|
||||||
|
assert( ~L & (1 << 31));
|
||||||
|
L += L;
|
||||||
|
if( !--ct) {
|
||||||
|
ct = 8;
|
||||||
|
if( !Bend || B < Bend)
|
||||||
|
L |= *B++;
|
||||||
|
}
|
||||||
|
} while( (s += s + rinc) < min_range);
|
||||||
|
bct = ct;
|
||||||
|
}
|
||||||
|
Low = L;
|
||||||
|
Range = s;
|
||||||
|
if( bc_v)
|
||||||
|
printf(
|
||||||
|
"...done, val %d B %x Low %x Range %x bct %d\n",
|
||||||
|
val? 1:0, B-Bstart, Low, Range, bct
|
||||||
|
);
|
||||||
|
return val;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
/* C interfaces */
|
||||||
|
|
||||||
|
// spec interface
|
||||||
|
|
||||||
|
struct NS : bool_coder_namespace {
|
||||||
|
static Rounding r( vp8bc_c_prec *p, Rounding rr =down_full) {
|
||||||
|
return p? (Rounding) p->r : rr;
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
bool_coder_spec *vp8bc_vp6spec() {
|
||||||
|
return new bool_coder_spec_explicit_table( 0, bool_coder_namespace::Down, 8);
|
||||||
|
}
|
||||||
|
bool_coder_spec *vp8bc_float_spec(
|
||||||
|
unsigned int Ebits, unsigned int Mbits, vp8bc_c_prec *p
|
||||||
|
) {
|
||||||
|
return new bool_coder_spec_float( Ebits, Mbits, NS::r( p), p? p->prec : 12);
|
||||||
|
}
|
||||||
|
bool_coder_spec *vp8bc_literal_spec(
|
||||||
|
const unsigned short m[256], vp8bc_c_prec *p
|
||||||
|
) {
|
||||||
|
return new bool_coder_spec_explicit_table( m, NS::r( p), p? p->prec : 16);
|
||||||
|
}
|
||||||
|
bool_coder_spec *vp8bc_exponential_spec( unsigned int x, vp8bc_c_prec *p)
|
||||||
|
{
|
||||||
|
return new bool_coder_spec_exponential_table( x, NS::r( p), p? p->prec : 16);
|
||||||
|
}
|
||||||
|
bool_coder_spec *vp8bc_spec_from_file( FILE *fp) {
|
||||||
|
return new bool_coder_spec( fp);
|
||||||
|
}
|
||||||
|
void vp8bc_destroy_spec( c_bool_coder_spec *p) { delete p;}
|
||||||
|
|
||||||
|
void vp8bc_spec_to_file( c_bool_coder_spec *p, FILE *fp) { p->dump( fp);}
|
||||||
|
|
||||||
|
vp8bc_index_t vp8bc_index( c_bool_coder_spec *p, double x) {
|
||||||
|
return ( *p)( x);
|
||||||
|
}
|
||||||
|
|
||||||
|
vp8bc_index_t vp8bc_index_from_counts(
|
||||||
|
c_bool_coder_spec *p, unsigned int L, unsigned int R
|
||||||
|
) {
|
||||||
|
return ( *p)( (R += L)? (double) L/R : .5);
|
||||||
|
}
|
||||||
|
|
||||||
|
double vp8bc_probability( c_bool_coder_spec *p, vp8bc_index_t i) {
|
||||||
|
return ( *p)( i);
|
||||||
|
}
|
||||||
|
|
||||||
|
vp8bc_index_t vp8bc_complement( c_bool_coder_spec *p, vp8bc_index_t i) {
|
||||||
|
return p->complement( i);
|
||||||
|
}
|
||||||
|
unsigned int vp8bc_cost_zero( c_bool_coder_spec *p, vp8bc_index_t i) {
|
||||||
|
return p->cost_zero( i);
|
||||||
|
}
|
||||||
|
unsigned int vp8bc_cost_one( c_bool_coder_spec *p, vp8bc_index_t i) {
|
||||||
|
return p->cost_one( i);
|
||||||
|
}
|
||||||
|
unsigned int vp8bc_cost_bit( c_bool_coder_spec *p, vp8bc_index_t i, int v) {
|
||||||
|
return p->cost_bit( i, v);
|
||||||
|
}
|
||||||
|
|
||||||
|
#if tim_vp8
|
||||||
|
extern "C" int tok_verbose;
|
||||||
|
|
||||||
|
# define dbg_l 1000000
|
||||||
|
|
||||||
|
static vp8bc_index_t dbg_i [dbg_l];
|
||||||
|
static char dbg_v [dbg_l];
|
||||||
|
static size_t dbg_w = 0, dbg_r = 0;
|
||||||
|
#endif
|
||||||
|
|
||||||
|
// writer interface
|
||||||
|
|
||||||
|
bool_writer *vp8bc_create_writer(
|
||||||
|
c_bool_coder_spec *p, unsigned char *D, size_t L
|
||||||
|
) {
|
||||||
|
return new bool_writer( *p, D, L);
|
||||||
|
}
|
||||||
|
|
||||||
|
size_t vp8bc_destroy_writer( bool_writer *p) {
|
||||||
|
const size_t s = p->flush().bytes_written();
|
||||||
|
delete p;
|
||||||
|
return s;
|
||||||
|
}
|
||||||
|
|
||||||
|
void vp8bc_write_bool( bool_writer *p, int v, vp8bc_index_t i)
|
||||||
|
{
|
||||||
|
# if tim_vp8
|
||||||
|
// bc_v = dbg_w < 10;
|
||||||
|
if( bc_v = tok_verbose)
|
||||||
|
printf( " writing %d at prob %d\n", v? 1:0, i);
|
||||||
|
accum_entropy_bc( &p->Spec(), i, v);
|
||||||
|
|
||||||
|
( *p)( i, (bool) v);
|
||||||
|
|
||||||
|
if( dbg_w < dbg_l) {
|
||||||
|
dbg_i [dbg_w] = i;
|
||||||
|
dbg_v [dbg_w++] = v? 1:0;
|
||||||
|
}
|
||||||
|
# else
|
||||||
|
( *p)( i, (bool) v);
|
||||||
|
# endif
|
||||||
|
}
|
||||||
|
|
||||||
|
void vp8bc_write_bits( bool_writer *p, unsigned int v, int n)
|
||||||
|
{
|
||||||
|
# if tim_vp8
|
||||||
|
{
|
||||||
|
c_bool_coder_spec * const s = & p->Spec();
|
||||||
|
const vp8bc_index_t i = s->half_index();
|
||||||
|
int m = n;
|
||||||
|
while( --m >= 0)
|
||||||
|
accum_entropy_bc( s, i, (v>>m) & 1);
|
||||||
|
}
|
||||||
|
# endif
|
||||||
|
|
||||||
|
p->write_bits( n, v);
|
||||||
|
}
|
||||||
|
|
||||||
|
c_bool_coder_spec *vp8bc_writer_spec( c_bool_writer *w) { return & w->Spec();}
|
||||||
|
|
||||||
|
// reader interface
|
||||||
|
|
||||||
|
bool_reader *vp8bc_create_reader(
|
||||||
|
c_bool_coder_spec *p, const unsigned char *S, size_t L
|
||||||
|
) {
|
||||||
|
return new bool_reader( *p, S, L);
|
||||||
|
}
|
||||||
|
|
||||||
|
void vp8bc_destroy_reader( bool_reader * p) { delete p;}
|
||||||
|
|
||||||
|
int vp8bc_read_bool( bool_reader *p, vp8bc_index_t i)
|
||||||
|
{
|
||||||
|
# if tim_vp8
|
||||||
|
// bc_v = dbg_r < 10;
|
||||||
|
bc_v = tok_verbose;
|
||||||
|
const int v = ( *p)( i)? 1:0;
|
||||||
|
if( tok_verbose)
|
||||||
|
printf( " reading %d at prob %d\n", v, i);
|
||||||
|
if( dbg_r < dbg_l) {
|
||||||
|
assert( dbg_r <= dbg_w);
|
||||||
|
if( i != dbg_i[dbg_r] || v != dbg_v[dbg_r]) {
|
||||||
|
printf(
|
||||||
|
"Position %d: INCORRECTLY READING %d prob %d, wrote %d prob %d\n",
|
||||||
|
dbg_r, v, i, dbg_v[dbg_r], dbg_i[dbg_r]
|
||||||
|
);
|
||||||
|
}
|
||||||
|
++dbg_r;
|
||||||
|
}
|
||||||
|
return v;
|
||||||
|
# else
|
||||||
|
return ( *p)( i)? 1:0;
|
||||||
|
# endif
|
||||||
|
}
|
||||||
|
|
||||||
|
unsigned int vp8bc_read_bits( bool_reader *p, int n) { return p->read_bits( n);}
|
||||||
|
|
||||||
|
c_bool_coder_spec *vp8bc_reader_spec( c_bool_reader *r) { return & r->Spec();}
|
||||||
|
|
||||||
|
#undef bc_v
|
||||||
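Reviewer note: the carry step in `bool_writer::carry()` above can be sketched on its own. This is a minimal illustration, not the code from the diff: the name `propagate_carry` and the `std::vector` buffer are assumptions, whereas the original walks the writer's raw `B` and `Bstart` pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the diff): add 1 to the big-endian number
// formed by the first `written` bytes of `buf`, mirroring bool_writer::carry().
inline void propagate_carry(std::vector<std::uint8_t>& buf, std::size_t written) {
    assert(written > 0 && written <= buf.size());
    std::size_t i = written;
    // Trailing 0xFF bytes wrap to 0x00 and pass the carry one byte back.
    while (buf[--i] == 0xFF) {
        assert(i > 0);  // a carry out of the first byte would be a coder bug
        buf[i] = 0;
    }
    ++buf[i];  // the first non-0xFF byte absorbs the carry
}
```

For example, a buffer holding `12 FF FF` becomes `13 00 00`. Bytes past `written` are never touched, which matches the original's use of the current write position as the starting point.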
@@ -14,18 +14,18 @@
|
|||||||
; /****************************************************************************
|
; /****************************************************************************
|
||||||
; * Notes:
|
; * Notes:
|
||||||
; *
|
; *
|
||||||
; * This implementation makes use of 16 bit fixed point version of two multiply
|
; * This implementation makes use of 16 bit fixed point verio of two multiply
|
||||||
; * constants:
|
; * constants:
|
||||||
; * 1. sqrt(2) * cos (pi/8)
|
; * 1. sqrt(2) * cos (pi/8)
|
||||||
; * 2. sqrt(2) * sin (pi/8)
|
; * 2. sqrt(2) * sin (pi/8)
|
||||||
; * Because the first constant is bigger than 1, to maintain the same 16 bit
|
; * Becuase the first constant is bigger than 1, to maintain the same 16 bit
|
||||||
; * fixed point precision as the second one, we use a trick of
|
; * fixed point prrcision as the second one, we use a trick of
|
||||||
; * x * a = x + x*(a-1)
|
; * x * a = x + x*(a-1)
|
||||||
; * so
|
; * so
|
||||||
; * x * sqrt(2) * cos (pi/8) = x + x * (sqrt(2) *cos(pi/8)-1).
|
; * x * sqrt(2) * cos (pi/8) = x + x * (sqrt(2) *cos(pi/8)-1).
|
||||||
; *
|
; *
|
||||||
; * For the second constant, because of the 16bit version is 35468, which
|
; * For the second constant, becuase of the 16bit version is 35468, which
|
||||||
; * is bigger than 32768, in signed 16 bit multiply, it becomes a negative
|
; * is bigger than 32768, in signed 16 bit multiply, it become a negative
|
||||||
; * number.
|
; * number.
|
||||||
; * (x * (unsigned)35468 >> 16) = x * (signed)35468 >> 16 + x
|
; * (x * (unsigned)35468 >> 16) = x * (signed)35468 >> 16 + x
|
||||||
; *
|
; *
|
||||||
|
|||||||
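Reviewer note: the two fixed-point tricks described in the Notes comment above can be checked with a scalar model. This is a sketch under stated assumptions: `mulhi_s16` stands in for a signed high-half 16-bit multiply (the kind of operation `pmulhw` performs per lane), and all function names are hypothetical. Only the constant 35468 comes from the comment; the first constant is derived in code from `sqrt(2) * cos(pi/8) - 1`.

```cpp
#include <cmath>
#include <cstdint>

// floor(a * b / 65536) for signed 16-bit operands, i.e. the high half of a
// signed 16x16 multiply. Floor division is written out so the result does not
// depend on how >> treats negative values.
inline std::int32_t mulhi_s16(std::int16_t a, std::int16_t b) {
    const std::int32_t prod = static_cast<std::int32_t>(a) * b;
    return prod >= 0 ? prod / 65536 : -((-prod + 65535) / 65536);
}

// Second constant: sqrt(2) * sin(pi/8) in 16-bit fixed point (from the Notes).
constexpr std::uint16_t kSinPi8Sqrt2 = 35468;

// (x * 35468) >> 16 using only a signed multiply. Read as int16, 35468 becomes
// 35468 - 65536, so the high half is short by exactly x; adding x restores it.
inline std::int32_t mul_sinpi8sqrt2(std::int16_t x) {
    const auto c = static_cast<std::int16_t>(kSinPi8Sqrt2 - 65536);
    return mulhi_s16(x, c) + x;
}

// First constant: a = sqrt(2) * cos(pi/8) > 1, so use x * a = x + x * (a - 1),
// where (a - 1) fits in 16-bit fixed point below 32768.
inline std::int32_t mul_cospi8sqrt2(std::int16_t x) {
    const double a = std::sqrt(2.0) * std::cos(std::acos(-1.0) / 8.0);
    const auto c = static_cast<std::int16_t>(std::lround((a - 1.0) * 65536.0));
    return x + mulhi_s16(x, c);
}

// Reference for the second constant, computed in wide arithmetic.
inline std::int32_t ref_mul_sinpi8sqrt2(std::int16_t x) {
    const std::int64_t prod = static_cast<std::int64_t>(x) * 35468;
    return static_cast<std::int32_t>(
        prod >= 0 ? prod / 65536 : -((-prod + 65535) / 65536));
}
```

Comparing `mul_sinpi8sqrt2(x)` against `ref_mul_sinpi8sqrt2(x)` for positive and negative `x` confirms the identity in the last line of the Notes.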
@@ -32,6 +32,9 @@ sym(idct_dequant_0_2x_sse2):
|
|||||||
mov rdx, arg(1) ; dequant
|
mov rdx, arg(1) ; dequant
|
||||||
mov rax, arg(0) ; qcoeff
|
mov rax, arg(0) ; qcoeff
|
||||||
|
|
||||||
|
; Zero out xmm7, for use unpacking
|
||||||
|
pxor xmm7, xmm7
|
||||||
|
|
||||||
movd xmm4, [rax]
|
movd xmm4, [rax]
|
||||||
movd xmm5, [rdx]
|
movd xmm5, [rdx]
|
||||||
|
|
||||||
@@ -40,12 +43,9 @@ sym(idct_dequant_0_2x_sse2):
|
|||||||
|
|
||||||
pmullw xmm4, xmm5
|
pmullw xmm4, xmm5
|
||||||
|
|
||||||
; Zero out xmm5, for use unpacking
|
|
||||||
pxor xmm5, xmm5
|
|
||||||
|
|
||||||
; clear coeffs
|
; clear coeffs
|
||||||
movd [rax], xmm5
|
movd [rax], xmm7
|
||||||
movd [rax+32], xmm5
|
movd [rax+32], xmm7
|
||||||
;pshufb
|
;pshufb
|
||||||
pshuflw xmm4, xmm4, 00000000b
|
pshuflw xmm4, xmm4, 00000000b
|
||||||
pshufhw xmm4, xmm4, 00000000b
|
pshufhw xmm4, xmm4, 00000000b
|
||||||
@@ -62,10 +62,10 @@ sym(idct_dequant_0_2x_sse2):
|
|||||||
lea rcx, [3*rcx]
|
lea rcx, [3*rcx]
|
||||||
movq xmm3, [rax+rcx]
|
movq xmm3, [rax+rcx]
|
||||||
|
|
||||||
punpcklbw xmm0, xmm5
|
punpcklbw xmm0, xmm7
|
||||||
punpcklbw xmm1, xmm5
|
punpcklbw xmm1, xmm7
|
||||||
punpcklbw xmm2, xmm5
|
punpcklbw xmm2, xmm7
|
||||||
punpcklbw xmm3, xmm5
|
punpcklbw xmm3, xmm7
|
||||||
|
|
||||||
mov rax, arg(3) ; dst
|
mov rax, arg(3) ; dst
|
||||||
movsxd rdx, dword ptr arg(4) ; dst_stride
|
movsxd rdx, dword ptr arg(4) ; dst_stride
|
||||||
@@ -77,10 +77,10 @@ sym(idct_dequant_0_2x_sse2):
|
|||||||
paddw xmm3, xmm4
|
paddw xmm3, xmm4
|
||||||
|
|
||||||
; pack up before storing
|
; pack up before storing
|
||||||
packuswb xmm0, xmm5
|
packuswb xmm0, xmm7
|
||||||
packuswb xmm1, xmm5
|
packuswb xmm1, xmm7
|
||||||
packuswb xmm2, xmm5
|
packuswb xmm2, xmm7
|
||||||
packuswb xmm3, xmm5
|
packuswb xmm3, xmm7
|
||||||
|
|
||||||
; store blocks back out
|
; store blocks back out
|
||||||
movq [rax], xmm0
|
movq [rax], xmm0
|
||||||
@@ -102,7 +102,6 @@ sym(idct_dequant_full_2x_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 7
|
SHADOW_ARGS_TO_STACK 7
|
||||||
SAVE_XMM 7
|
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -348,7 +347,6 @@ sym(idct_dequant_full_2x_sse2):
|
|||||||
pop rdi
|
pop rdi
|
||||||
pop rsi
|
pop rsi
|
||||||
RESTORE_GOT
|
RESTORE_GOT
|
||||||
RESTORE_XMM
|
|
||||||
UNSHADOW_ARGS
|
UNSHADOW_ARGS
|
||||||
pop rbp
|
pop rbp
|
||||||
ret
|
ret
|
||||||
@@ -379,8 +377,8 @@ sym(idct_dequant_dc_0_2x_sse2):
|
|||||||
mov rdi, arg(3) ; dst
|
mov rdi, arg(3) ; dst
|
||||||
mov rdx, arg(5) ; dc
|
mov rdx, arg(5) ; dc
|
||||||
|
|
||||||
; Zero out xmm5, for use unpacking
|
; Zero out xmm7, for use unpacking
|
||||||
pxor xmm5, xmm5
|
pxor xmm7, xmm7
|
||||||
|
|
||||||
; load up 2 dc words here == 2*16 = doubleword
|
; load up 2 dc words here == 2*16 = doubleword
|
||||||
movd xmm4, [rdx]
|
movd xmm4, [rdx]
|
||||||
@@ -400,10 +398,10 @@ sym(idct_dequant_dc_0_2x_sse2):
|
|||||||
psraw xmm4, 3
|
psraw xmm4, 3
|
||||||
|
|
||||||
; Predict buffer needs to be expanded from bytes to words
|
; Predict buffer needs to be expanded from bytes to words
|
||||||
punpcklbw xmm0, xmm5
|
punpcklbw xmm0, xmm7
|
||||||
punpcklbw xmm1, xmm5
|
punpcklbw xmm1, xmm7
|
||||||
punpcklbw xmm2, xmm5
|
punpcklbw xmm2, xmm7
|
||||||
punpcklbw xmm3, xmm5
|
punpcklbw xmm3, xmm7
|
||||||
|
|
||||||
; Add to predict buffer
|
; Add to predict buffer
|
||||||
paddw xmm0, xmm4
|
paddw xmm0, xmm4
|
||||||
@@ -412,10 +410,10 @@ sym(idct_dequant_dc_0_2x_sse2):
|
|||||||
paddw xmm3, xmm4
|
paddw xmm3, xmm4
|
||||||
|
|
||||||
; pack up before storing
|
; pack up before storing
|
||||||
packuswb xmm0, xmm5
|
packuswb xmm0, xmm7
|
||||||
packuswb xmm1, xmm5
|
packuswb xmm1, xmm7
|
||||||
packuswb xmm2, xmm5
|
packuswb xmm2, xmm7
|
||||||
packuswb xmm3, xmm5
|
packuswb xmm3, xmm7
|
||||||
|
|
||||||
; Load destination stride before writing out,
|
; Load destination stride before writing out,
|
||||||
; doesn't need to persist
|
; doesn't need to persist
|
||||||
@@ -443,7 +441,6 @@ sym(idct_dequant_dc_full_2x_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 7
|
SHADOW_ARGS_TO_STACK 7
|
||||||
SAVE_XMM 7
|
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -695,7 +692,6 @@ sym(idct_dequant_dc_full_2x_sse2):
|
|||||||
pop rdi
|
pop rdi
|
||||||
pop rsi
|
pop rsi
|
||||||
RESTORE_GOT
|
RESTORE_GOT
|
||||||
RESTORE_XMM
|
|
||||||
UNSHADOW_ARGS
|
UNSHADOW_ARGS
|
||||||
pop rbp
|
pop rbp
|
||||||
ret
|
ret
|
||||||
|
|||||||
@@ -17,7 +17,7 @@ sym(vp8_short_inv_walsh4x4_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 2
|
SHADOW_ARGS_TO_STACK 2
|
||||||
SAVE_XMM 6
|
SAVE_XMM
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
; end prolog
|
; end prolog
|
||||||
@@ -41,7 +41,7 @@ sym(vp8_short_inv_walsh4x4_sse2):
|
|||||||
movdqa xmm4, xmm0
|
movdqa xmm4, xmm0
|
||||||
punpcklqdq xmm0, xmm3 ;d1 a1
|
punpcklqdq xmm0, xmm3 ;d1 a1
|
||||||
punpckhqdq xmm4, xmm3 ;c1 b1
|
punpckhqdq xmm4, xmm3 ;c1 b1
|
||||||
movd xmm6, eax
|
movd xmm7, eax
|
||||||
|
|
||||||
movdqa xmm1, xmm4 ;c1 b1
|
movdqa xmm1, xmm4 ;c1 b1
|
||||||
paddw xmm4, xmm0 ;dl+cl a1+b1 aka op[4] op[0]
|
paddw xmm4, xmm0 ;dl+cl a1+b1 aka op[4] op[0]
|
||||||
@@ -66,7 +66,7 @@ sym(vp8_short_inv_walsh4x4_sse2):
|
|||||||
pshufd xmm2, xmm1, 4eh ;ip[8] ip[12]
|
pshufd xmm2, xmm1, 4eh ;ip[8] ip[12]
|
||||||
movdqa xmm3, xmm4 ;ip[4] ip[0]
|
movdqa xmm3, xmm4 ;ip[4] ip[0]
|
||||||
|
|
||||||
pshufd xmm6, xmm6, 0 ;03 03 03 03 03 03 03 03
|
pshufd xmm7, xmm7, 0 ;03 03 03 03 03 03 03 03
|
||||||
|
|
||||||
paddw xmm4, xmm2 ;ip[4]+ip[8] ip[0]+ip[12] aka b1 a1
|
paddw xmm4, xmm2 ;ip[4]+ip[8] ip[0]+ip[12] aka b1 a1
|
||||||
psubw xmm3, xmm2 ;ip[4]-ip[8] ip[0]-ip[12] aka c1 d1
|
psubw xmm3, xmm2 ;ip[4]-ip[8] ip[0]-ip[12] aka c1 d1
|
||||||
@@ -90,8 +90,8 @@ sym(vp8_short_inv_walsh4x4_sse2):
|
|||||||
punpcklwd xmm5, xmm0 ; 31 21 11 01 30 20 10 00
|
punpcklwd xmm5, xmm0 ; 31 21 11 01 30 20 10 00
|
||||||
punpckhwd xmm1, xmm0 ; 33 23 13 03 32 22 12 02
|
punpckhwd xmm1, xmm0 ; 33 23 13 03 32 22 12 02
|
||||||
;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
paddw xmm5, xmm6
|
paddw xmm5, xmm7
|
||||||
paddw xmm1, xmm6
|
paddw xmm1, xmm7
|
||||||
|
|
||||||
psraw xmm5, 3
|
psraw xmm5, 3
|
||||||
psraw xmm1, 3
|
psraw xmm1, 3
|
||||||
|
|||||||
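Reviewer note: the last hunk above only changes which register holds the rounding constant (`xmm6` on one side, `xmm7` on the other); the arithmetic is the same on both sides. A scalar sketch of that final step, assuming each lane holds the value 3 as the `;03 03 ...` comment suggests. The function name is hypothetical, and the model ignores the 16-bit wraparound that `paddw` would apply to very large inputs.

```cpp
#include <cstdint>

// Hypothetical scalar model of "paddw with 3, then psraw by 3":
// add a bias of 3 and take the floor of the result divided by 8.
inline std::int16_t walsh_round_shift(std::int16_t v) {
    const std::int32_t biased = static_cast<std::int32_t>(v) + 3;
    // An arithmetic right shift by 3 equals floor division by 8.
    const std::int32_t q = biased >= 0 ? biased / 8 : -((-biased + 7) / 8);
    return static_cast<std::int16_t>(q);
}
```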
@@ -16,7 +16,7 @@
|
|||||||
;(
|
;(
|
||||||
; unsigned char *src_ptr,
|
; unsigned char *src_ptr,
|
||||||
; int src_pixel_step,
|
; int src_pixel_step,
|
||||||
; const char *blimit,
|
; const char *flimit,
|
||||||
; const char *limit,
|
; const char *limit,
|
||||||
; const char *thresh,
|
; const char *thresh,
|
||||||
; int count
|
; int count
|
||||||
@@ -122,10 +122,12 @@ next8_h:
|
|||||||
paddusb mm5, mm5 ; abs(p0-q0)*2
|
paddusb mm5, mm5 ; abs(p0-q0)*2
|
||||||
paddusb mm5, mm2 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
paddusb mm5, mm2 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
||||||
|
|
||||||
mov rdx, arg(2) ;blimit ; get blimit
|
mov rdx, arg(2) ;flimit ; get flimit
|
||||||
movq mm7, [rdx] ; blimit
|
movq mm2, [rdx] ; flimit mm2
|
||||||
|
paddb mm2, mm2 ; flimit*2 (less than 255)
|
||||||
|
paddb mm7, mm2 ; flimit * 2 + limit (less than 255)
|
||||||
|
|
||||||
psubusb mm5, mm7 ; abs (p0 - q0) *2 + abs(p1-q1)/2 > blimit
|
psubusb mm5, mm7 ; abs (p0 - q0) *2 + abs(p1-q1)/2 > flimit * 2 + limit
|
||||||
por mm1, mm5
|
por mm1, mm5
|
||||||
pxor mm5, mm5
|
pxor mm5, mm5
|
||||||
pcmpeqb mm1, mm5 ; mask mm1
|
pcmpeqb mm1, mm5 ; mask mm1
|
||||||
@@ -228,7 +230,7 @@ next8_h:
|
|||||||
;(
|
;(
|
||||||
; unsigned char *src_ptr,
|
; unsigned char *src_ptr,
|
||||||
; int src_pixel_step,
|
; int src_pixel_step,
|
||||||
; const char *blimit,
|
; const char *flimit,
|
||||||
; const char *limit,
|
; const char *limit,
|
||||||
; const char *thresh,
|
; const char *thresh,
|
||||||
; int count
|
; int count
|
||||||
@@ -404,9 +406,9 @@ next8_v:
|
|||||||
pand mm5, [GLOBAL(tfe)] ; set lsb of each byte to zero
|
pand mm5, [GLOBAL(tfe)] ; set lsb of each byte to zero
|
||||||
psrlw mm5, 1 ; abs(p1-q1)/2
|
psrlw mm5, 1 ; abs(p1-q1)/2
|
||||||
|
|
||||||
mov rdx, arg(2) ;blimit ;
|
mov rdx, arg(2) ;flimit ;
|
||||||
|
|
||||||
movq mm4, [rdx] ;blimit
|
movq mm2, [rdx] ;flimit mm2
|
||||||
movq mm1, mm3 ; mm1=mm3=p0
|
movq mm1, mm3 ; mm1=mm3=p0
|
||||||
|
|
||||||
movq mm7, mm6 ; mm7=mm6=q0
|
movq mm7, mm6 ; mm7=mm6=q0
|
||||||
@@ -417,7 +419,10 @@ next8_v:
|
|||||||
paddusb mm1, mm1 ; abs(q0-p0)*2
|
paddusb mm1, mm1 ; abs(q0-p0)*2
|
||||||
paddusb mm1, mm5 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
paddusb mm1, mm5 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
||||||
|
|
||||||
psubusb mm1, mm4 ; abs (p0 - q0) *2 + abs(p1-q1)/2 > blimit
|
paddb mm2, mm2 ; flimit*2 (less than 255)
|
||||||
|
paddb mm4, mm2 ; flimit * 2 + limit (less than 255)
|
||||||
|
|
||||||
|
psubusb mm1, mm4 ; abs (p0 - q0) *2 + abs(p1-q1)/2 > flimit * 2 + limit
|
||||||
por mm1, mm0; ; mask
|
por mm1, mm0; ; mask
|
||||||
|
|
||||||
pxor mm0, mm0
|
pxor mm0, mm0
|
||||||
@@ -598,7 +603,7 @@ next8_v:
|
|||||||
;(
|
;(
|
||||||
; unsigned char *src_ptr,
|
; unsigned char *src_ptr,
|
||||||
; int src_pixel_step,
|
; int src_pixel_step,
|
||||||
; const char *blimit,
|
; const char *flimit,
|
||||||
; const char *limit,
|
; const char *limit,
|
||||||
; const char *thresh,
|
; const char *thresh,
|
||||||
; int count
|
; int count
|
||||||
@@ -714,15 +719,17 @@ next8_mbh:
|
|||||||
paddusb mm5, mm5 ; abs(p0-q0)*2
|
paddusb mm5, mm5 ; abs(p0-q0)*2
|
||||||
paddusb mm5, mm2 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
paddusb mm5, mm2 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
||||||
|
|
||||||
mov rdx, arg(2) ;blimit ; get blimit
|
mov rdx, arg(2) ;flimit ; get flimit
|
||||||
movq mm7, [rdx] ; blimit
|
movq mm2, [rdx] ; flimit mm2
|
||||||
|
paddb mm2, mm2 ; flimit*2 (less than 255)
|
||||||
|
paddb mm7, mm2 ; flimit * 2 + limit (less than 255)
|
||||||
|
|
||||||
psubusb mm5, mm7 ; abs (p0 - q0) *2 + abs(p1-q1)/2 > blimit
|
psubusb mm5, mm7 ; abs (p0 - q0) *2 + abs(p1-q1)/2 > flimit * 2 + limit
|
||||||
por mm1, mm5
|
por mm1, mm5
|
||||||
pxor mm5, mm5
|
pxor mm5, mm5
|
||||||
pcmpeqb mm1, mm5 ; mask mm1
|
pcmpeqb mm1, mm5 ; mask mm1
|
||||||
|
|
||||||
; mm1 = mask, mm0=q0, mm7 = blimit, t0 = abs(q0-q1) t1 = abs(p1-p0)
|
; mm1 = mask, mm0=q0, mm7 = flimit, t0 = abs(q0-q1) t1 = abs(p1-p0)
|
||||||
; mm6 = p0,
|
; mm6 = p0,
|
||||||
|
|
||||||
; calculate high edge variance
|
; calculate high edge variance
|
||||||
@@ -915,7 +922,7 @@ next8_mbh:
|
|||||||
;(
|
;(
|
||||||
; unsigned char *src_ptr,
|
; unsigned char *src_ptr,
|
||||||
; int src_pixel_step,
|
; int src_pixel_step,
|
||||||
; const char *blimit,
|
; const char *flimit,
|
||||||
; const char *limit,
|
; const char *limit,
|
||||||
; const char *thresh,
|
; const char *thresh,
|
||||||
; int count
|
; int count
|
||||||
@@ -1101,9 +1108,9 @@ next8_mbv:
|
|||||||
pand mm5, [GLOBAL(tfe)] ; set lsb of each byte to zero
|
pand mm5, [GLOBAL(tfe)] ; set lsb of each byte to zero
|
||||||
psrlw mm5, 1 ; abs(p1-q1)/2
|
psrlw mm5, 1 ; abs(p1-q1)/2
|
||||||
|
|
||||||
mov rdx, arg(2) ;blimit ;
|
mov rdx, arg(2) ;flimit ;
|
||||||
|
|
||||||
movq mm4, [rdx] ;blimit
|
movq mm2, [rdx] ;flimit mm2
|
||||||
movq mm1, mm3 ; mm1=mm3=p0
|
movq mm1, mm3 ; mm1=mm3=p0
|
||||||
|
|
||||||
movq mm7, mm6 ; mm7=mm6=q0
|
movq mm7, mm6 ; mm7=mm6=q0
|
||||||
@@ -1114,7 +1121,10 @@ next8_mbv:
|
|||||||
paddusb mm1, mm1 ; abs(q0-p0)*2
|
paddusb mm1, mm1 ; abs(q0-p0)*2
|
||||||
paddusb mm1, mm5 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
paddusb mm1, mm5 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
||||||
|
|
||||||
psubusb mm1, mm4 ; abs (p0 - q0) *2 + abs(p1-q1)/2 > blimit
|
paddb mm2, mm2 ; flimit*2 (less than 255)
|
||||||
|
paddb mm4, mm2 ; flimit * 2 + limit (less than 255)
|
||||||
|
|
||||||
|
psubusb mm1, mm4 ; abs (p0 - q0) *2 + abs(p1-q1)/2 > flimit * 2 + limit
|
||||||
por mm1, mm0; ; mask
|
por mm1, mm0; ; mask
|
||||||
|
|
||||||
pxor mm0, mm0
|
pxor mm0, mm0
|
||||||
@@ -1382,13 +1392,16 @@ next8_mbv:
|
|||||||
;(
|
;(
|
||||||
; unsigned char *src_ptr,
|
; unsigned char *src_ptr,
|
||||||
; int src_pixel_step,
|
; int src_pixel_step,
|
||||||
; const char *blimit
|
; const char *flimit,
|
||||||
|
; const char *limit,
|
||||||
|
; const char *thresh,
|
||||||
|
; int count
|
||||||
;)
|
;)
|
||||||
global sym(vp8_loop_filter_simple_horizontal_edge_mmx)
|
global sym(vp8_loop_filter_simple_horizontal_edge_mmx)
|
||||||
sym(vp8_loop_filter_simple_horizontal_edge_mmx):
|
sym(vp8_loop_filter_simple_horizontal_edge_mmx):
|
||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 3
|
SHADOW_ARGS_TO_STACK 6
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -1397,10 +1410,14 @@ sym(vp8_loop_filter_simple_horizontal_edge_mmx):
|
|||||||
mov rsi, arg(0) ;src_ptr
|
mov rsi, arg(0) ;src_ptr
|
||||||
movsxd rax, dword ptr arg(1) ;src_pixel_step ; destination pitch?
|
movsxd rax, dword ptr arg(1) ;src_pixel_step ; destination pitch?
|
||||||
|
|
||||||
mov rcx, 2 ; count
|
movsxd rcx, dword ptr arg(5) ;count
|
||||||
nexts8_h:
|
nexts8_h:
|
||||||
mov rdx, arg(2) ;blimit ; get blimit
|
mov rdx, arg(3) ;limit
|
||||||
|
movq mm7, [rdx]
|
||||||
|
mov rdx, arg(2) ;flimit ; get flimit
|
||||||
movq mm3, [rdx] ;
|
movq mm3, [rdx] ;
|
||||||
|
paddb mm3, mm3 ; flimit*2 (less than 255)
|
||||||
|
paddb mm3, mm7 ; flimit * 2 + limit (less than 255)
|
||||||
|
|
||||||
mov rdi, rsi ; rdi points to row +1 for indirect addressing
|
mov rdi, rsi ; rdi points to row +1 for indirect addressing
|
||||||
add rdi, rax
|
add rdi, rax
|
||||||
@@ -1428,7 +1445,7 @@ nexts8_h:
|
|||||||
paddusb mm5, mm5 ; abs(p0-q0)*2
|
paddusb mm5, mm5 ; abs(p0-q0)*2
|
||||||
paddusb mm5, mm1 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
paddusb mm5, mm1 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
||||||
|
|
||||||
psubusb mm5, mm3 ; abs(p0 - q0) *2 + abs(p1-q1)/2 > blimit
|
psubusb mm5, mm3 ; abs(p0 - q0) *2 + abs(p1-q1)/2 > flimit * 2 + limit
|
||||||
pxor mm3, mm3
|
pxor mm3, mm3
|
||||||
pcmpeqb mm5, mm3
|
pcmpeqb mm5, mm3
|
||||||
|
|
||||||
@@ -1498,13 +1515,16 @@ nexts8_h:
|
|||||||
;(
|
;(
|
||||||
; unsigned char *src_ptr,
|
; unsigned char *src_ptr,
|
||||||
; int src_pixel_step,
|
; int src_pixel_step,
|
||||||
; const char *blimit
|
; const char *flimit,
|
||||||
|
; const char *limit,
|
||||||
|
; const char *thresh,
|
||||||
|
; int count
|
||||||
;)
|
;)
|
||||||
global sym(vp8_loop_filter_simple_vertical_edge_mmx)
|
global sym(vp8_loop_filter_simple_vertical_edge_mmx)
|
||||||
sym(vp8_loop_filter_simple_vertical_edge_mmx):
|
sym(vp8_loop_filter_simple_vertical_edge_mmx):
|
||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 3
|
SHADOW_ARGS_TO_STACK 6
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -1519,7 +1539,7 @@ sym(vp8_loop_filter_simple_vertical_edge_mmx):
|
|||||||
movsxd rax, dword ptr arg(1) ;src_pixel_step ; destination pitch?
|
movsxd rax, dword ptr arg(1) ;src_pixel_step ; destination pitch?
|
||||||
|
|
||||||
lea rsi, [rsi + rax*4- 2]; ;
|
lea rsi, [rsi + rax*4- 2]; ;
|
||||||
mov rcx, 2 ; count
|
movsxd rcx, dword ptr arg(5) ;count
|
||||||
nexts8_v:
|
nexts8_v:
|
||||||
|
|
||||||
lea rdi, [rsi + rax];
|
lea rdi, [rsi + rax];
|
||||||
@@ -1582,10 +1602,14 @@ nexts8_v:
|
|||||||
paddusb mm5, mm5 ; abs(p0-q0)*2
|
paddusb mm5, mm5 ; abs(p0-q0)*2
|
||||||
paddusb mm5, mm6 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
paddusb mm5, mm6 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
||||||
|
|
||||||
mov rdx, arg(2) ;blimit ; get blimit
|
mov rdx, arg(2) ;flimit ; get flimit
|
||||||
movq mm7, [rdx]
|
movq mm7, [rdx]
|
||||||
|
mov rdx, arg(3) ; get limit
|
||||||
|
movq mm6, [rdx]
|
||||||
|
paddb mm7, mm7 ; flimit*2 (less than 255)
|
||||||
|
paddb mm7, mm6 ; flimit * 2 + limit (less than 255)
|
||||||
|
|
||||||
psubusb mm5, mm7 ; abs(p0 - q0) *2 + abs(p1-q1)/2 > blimit
|
psubusb mm5, mm7 ; abs(p0 - q0) *2 + abs(p1-q1)/2 > flimit * 2 + limit
|
||||||
pxor mm7, mm7
|
pxor mm7, mm7
|
||||||
pcmpeqb mm5, mm7 ; mm5 = mask
|
pcmpeqb mm5, mm7 ; mm5 = mask
|
||||||
|
|
||||||
|
|||||||
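Reviewer note: the hunks above differ in how the edge threshold is obtained. The `blimit` side loads a single precomputed value, while the `flimit` side builds `flimit * 2 + limit` at run time with two `paddb` instructions. A scalar sketch of the comparison both sides feed into, under these assumptions: the function names are hypothetical, only this one term of the mask is modelled (the full filters OR in further limit checks), and the threshold sum stays below 256 as the assembly comments state.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>

// Saturating unsigned byte add, as paddusb does per lane.
inline std::uint8_t add_sat_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(std::min(255, a + b));
}

// Hypothetical scalar model: true when
//   abs(p0 - q0) * 2 + abs(p1 - q1) / 2 <= flimit * 2 + limit
inline bool edge_within_limit(std::uint8_t flimit, std::uint8_t limit,
                              std::uint8_t p1, std::uint8_t p0,
                              std::uint8_t q0, std::uint8_t q1) {
    const auto d0 = static_cast<std::uint8_t>(std::abs(p0 - q0));
    const auto d1 = static_cast<std::uint8_t>(std::abs(p1 - q1));
    const std::uint8_t lhs =
        add_sat_u8(add_sat_u8(d0, d0), static_cast<std::uint8_t>(d1 >> 1));
    // Run-time sum on the flimit side; the blimit side would load this
    // value directly instead of computing it per call.
    const auto thresh = static_cast<std::uint8_t>(flimit * 2 + limit);
    // psubusb leaves zero exactly when lhs <= thresh.
    return lhs <= thresh;
}
```

Precomputing the sum removes two instructions and one register from each filter invocation, which is consistent with the register changes visible in the hunks.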
@@ -110,7 +110,7 @@
|
|||||||
psubusb xmm6, xmm5 ; p1-=p0
|
psubusb xmm6, xmm5 ; p1-=p0
|
||||||
|
|
||||||
por xmm6, xmm4 ; abs(p1 - p0)
|
por xmm6, xmm4 ; abs(p1 - p0)
|
||||||
mov rdx, arg(2) ; get blimit
|
mov rdx, arg(2) ; get flimit
|
||||||
|
|
||||||
movdqa t1, xmm6 ; save to t1
|
movdqa t1, xmm6 ; save to t1
|
||||||
|
|
||||||
@@ -123,7 +123,7 @@
|
|||||||
psubusb xmm1, xmm7
|
psubusb xmm1, xmm7
|
||||||
por xmm2, xmm3 ; abs(p1-q1)
|
por xmm2, xmm3 ; abs(p1-q1)
|
||||||
|
|
||||||
movdqa xmm7, XMMWORD PTR [rdx] ; blimit
|
movdqa xmm4, XMMWORD PTR [rdx] ; flimit
|
||||||
|
|
||||||
movdqa xmm3, xmm0 ; q0
|
movdqa xmm3, xmm0 ; q0
|
||||||
pand xmm2, [GLOBAL(tfe)] ; set lsb of each byte to zero
|
pand xmm2, [GLOBAL(tfe)] ; set lsb of each byte to zero
|
||||||
@@ -134,11 +134,13 @@
|
|||||||
psrlw xmm2, 1 ; abs(p1-q1)/2
|
psrlw xmm2, 1 ; abs(p1-q1)/2
|
||||||
|
|
||||||
psubusb xmm5, xmm3 ; p0-=q0
|
psubusb xmm5, xmm3 ; p0-=q0
|
||||||
|
paddb xmm4, xmm4 ; flimit*2 (less than 255)
|
||||||
|
|
||||||
psubusb xmm3, xmm6 ; q0-=p0
|
psubusb xmm3, xmm6 ; q0-=p0
|
||||||
por xmm5, xmm3 ; abs(p0 - q0)
|
por xmm5, xmm3 ; abs(p0 - q0)
|
||||||
|
|
||||||
paddusb xmm5, xmm5 ; abs(p0-q0)*2
|
paddusb xmm5, xmm5 ; abs(p0-q0)*2
|
||||||
|
paddb xmm7, xmm4 ; flimit * 2 + limit (less than 255)
|
||||||
|
|
||||||
movdqa xmm4, t0 ; hev get abs (q1 - q0)
|
movdqa xmm4, t0 ; hev get abs (q1 - q0)
|
||||||
|
|
||||||
@@ -148,7 +150,7 @@
|
|||||||
|
|
||||||
movdqa xmm2, XMMWORD PTR [rdx] ; hev
|
movdqa xmm2, XMMWORD PTR [rdx] ; hev
|
||||||
|
|
||||||
psubusb xmm5, xmm7 ; abs (p0 - q0) *2 + abs(p1-q1)/2 > blimit
|
psubusb xmm5, xmm7 ; abs (p0 - q0) *2 + abs(p1-q1)/2 > flimit * 2 + limit
|
||||||
psubusb xmm4, xmm2 ; hev
|
psubusb xmm4, xmm2 ; hev
|
||||||
|
|
||||||
psubusb xmm3, xmm2 ; hev
|
psubusb xmm3, xmm2 ; hev
|
||||||
@@ -276,7 +278,7 @@
|
|||||||
;(
|
;(
|
||||||
; unsigned char *src_ptr,
|
; unsigned char *src_ptr,
|
||||||
; int src_pixel_step,
|
; int src_pixel_step,
|
||||||
; const char *blimit,
|
; const char *flimit,
|
||||||
; const char *limit,
|
; const char *limit,
|
||||||
; const char *thresh,
|
; const char *thresh,
|
||||||
; int count
|
; int count
|
||||||
@@ -286,7 +288,7 @@ sym(vp8_loop_filter_horizontal_edge_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -326,7 +328,7 @@ sym(vp8_loop_filter_horizontal_edge_sse2):
|
|||||||
;(
|
;(
|
||||||
; unsigned char *src_ptr,
|
; unsigned char *src_ptr,
|
||||||
; int src_pixel_step,
|
; int src_pixel_step,
|
||||||
; const char *blimit,
|
; const char *flimit,
|
||||||
; const char *limit,
|
; const char *limit,
|
||||||
; const char *thresh,
|
; const char *thresh,
|
||||||
; int count
|
; int count
|
||||||
@@ -336,7 +338,7 @@ sym(vp8_loop_filter_horizontal_edge_uv_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -572,7 +574,7 @@ sym(vp8_loop_filter_horizontal_edge_uv_sse2):
|
|||||||
;(
|
;(
|
||||||
; unsigned char *src_ptr,
|
; unsigned char *src_ptr,
|
||||||
; int src_pixel_step,
|
; int src_pixel_step,
|
||||||
; const char *blimit,
|
; const char *flimit,
|
||||||
; const char *limit,
|
; const char *limit,
|
||||||
; const char *thresh,
|
; const char *thresh,
|
||||||
; int count
|
; int count
|
||||||
@@ -582,7 +584,7 @@ sym(vp8_mbloop_filter_horizontal_edge_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -622,7 +624,7 @@ sym(vp8_mbloop_filter_horizontal_edge_sse2):
|
|||||||
;(
|
;(
|
||||||
; unsigned char *u,
|
; unsigned char *u,
|
||||||
; int src_pixel_step,
|
; int src_pixel_step,
|
||||||
; const char *blimit,
|
; const char *flimit,
|
||||||
; const char *limit,
|
; const char *limit,
|
||||||
; const char *thresh,
|
; const char *thresh,
|
||||||
; unsigned char *v
|
; unsigned char *v
|
||||||
@@ -632,7 +634,7 @@ sym(vp8_mbloop_filter_horizontal_edge_uv_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -902,7 +904,7 @@ sym(vp8_mbloop_filter_horizontal_edge_uv_sse2):
|
|||||||
movdqa xmm4, XMMWORD PTR [rdx]; limit
|
movdqa xmm4, XMMWORD PTR [rdx]; limit
|
||||||
|
|
||||||
pmaxub xmm0, xmm7
|
pmaxub xmm0, xmm7
|
||||||
mov rdx, arg(2) ; blimit
|
mov rdx, arg(2) ; flimit
|
||||||
|
|
||||||
psubusb xmm0, xmm4
|
psubusb xmm0, xmm4
|
||||||
movdqa xmm5, xmm2 ; q1
|
movdqa xmm5, xmm2 ; q1
|
||||||
@@ -919,11 +921,12 @@ sym(vp8_mbloop_filter_horizontal_edge_uv_sse2):
|
|||||||
psrlw xmm5, 1 ; abs(p1-q1)/2
|
psrlw xmm5, 1 ; abs(p1-q1)/2
|
||||||
psubusb xmm6, xmm3 ; q0-p0
|
psubusb xmm6, xmm3 ; q0-p0
|
||||||
|
|
||||||
movdqa xmm4, XMMWORD PTR [rdx]; blimit
|
movdqa xmm2, XMMWORD PTR [rdx]; flimit
|
||||||
|
|
||||||
mov rdx, arg(4) ; get thresh
|
mov rdx, arg(4) ; get thresh
|
||||||
|
|
||||||
por xmm1, xmm6 ; abs(q0-p0)
|
por xmm1, xmm6 ; abs(q0-p0)
|
||||||
|
paddb xmm2, xmm2 ; flimit*2 (less than 255)
|
||||||
|
|
||||||
movdqa xmm6, t0 ; get abs (q1 - q0)
|
movdqa xmm6, t0 ; get abs (q1 - q0)
|
||||||
|
|
||||||
@@ -936,9 +939,10 @@ sym(vp8_mbloop_filter_horizontal_edge_uv_sse2):
|
|||||||
paddusb xmm1, xmm5 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
paddusb xmm1, xmm5 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
||||||
psubusb xmm6, xmm7 ; abs(q1 - q0) > thresh
|
psubusb xmm6, xmm7 ; abs(q1 - q0) > thresh
|
||||||
|
|
||||||
|
paddb xmm4, xmm2 ; flimit * 2 + limit (less than 255)
|
||||||
psubusb xmm3, xmm7 ; abs(p1 - p0)> thresh
|
psubusb xmm3, xmm7 ; abs(p1 - p0)> thresh
|
||||||
|
|
||||||
psubusb xmm1, xmm4 ; abs (p0 - q0) *2 + abs(p1-q1)/2 > blimit
|
psubusb xmm1, xmm4 ; abs (p0 - q0) *2 + abs(p1-q1)/2 > flimit * 2 + limit
|
||||||
por xmm6, xmm3 ; abs(q1 - q0) > thresh || abs(p1 - p0) > thresh
|
por xmm6, xmm3 ; abs(q1 - q0) > thresh || abs(p1 - p0) > thresh
|
||||||
|
|
||||||
por xmm1, xmm0 ; mask
|
por xmm1, xmm0 ; mask
|
||||||
@@ -1010,7 +1014,7 @@ sym(vp8_mbloop_filter_horizontal_edge_uv_sse2):
|
|||||||
;(
|
;(
|
||||||
; unsigned char *src_ptr,
|
; unsigned char *src_ptr,
|
||||||
; int src_pixel_step,
|
; int src_pixel_step,
|
||||||
; const char *blimit,
|
; const char *flimit,
|
||||||
; const char *limit,
|
; const char *limit,
|
||||||
; const char *thresh,
|
; const char *thresh,
|
||||||
; int count
|
; int count
|
||||||
@@ -1020,7 +1024,7 @@ sym(vp8_loop_filter_vertical_edge_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -1077,7 +1081,7 @@ sym(vp8_loop_filter_vertical_edge_sse2):
|
|||||||
;(
|
;(
|
||||||
; unsigned char *u,
|
; unsigned char *u,
|
||||||
; int src_pixel_step,
|
; int src_pixel_step,
|
||||||
; const char *blimit,
|
; const char *flimit,
|
||||||
; const char *limit,
|
; const char *limit,
|
||||||
; const char *thresh,
|
; const char *thresh,
|
||||||
; unsigned char *v
|
; unsigned char *v
|
||||||
@@ -1087,7 +1091,7 @@ sym(vp8_loop_filter_vertical_edge_uv_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -1235,7 +1239,7 @@ sym(vp8_loop_filter_vertical_edge_uv_sse2):
|
|||||||
;(
|
;(
|
||||||
; unsigned char *src_ptr,
|
; unsigned char *src_ptr,
|
||||||
; int src_pixel_step,
|
; int src_pixel_step,
|
||||||
; const char *blimit,
|
; const char *flimit,
|
||||||
; const char *limit,
|
; const char *limit,
|
||||||
; const char *thresh,
|
; const char *thresh,
|
||||||
; int count
|
; int count
|
||||||
@@ -1245,7 +1249,7 @@ sym(vp8_mbloop_filter_vertical_edge_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -1304,7 +1308,7 @@ sym(vp8_mbloop_filter_vertical_edge_sse2):
|
|||||||
;(
|
;(
|
||||||
; unsigned char *u,
|
; unsigned char *u,
|
||||||
; int src_pixel_step,
|
; int src_pixel_step,
|
||||||
; const char *blimit,
|
; const char *flimit,
|
||||||
; const char *limit,
|
; const char *limit,
|
||||||
; const char *thresh,
|
; const char *thresh,
|
||||||
; unsigned char *v
|
; unsigned char *v
|
||||||
@@ -1314,7 +1318,7 @@ sym(vp8_mbloop_filter_vertical_edge_uv_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -1372,14 +1376,17 @@ sym(vp8_mbloop_filter_vertical_edge_uv_sse2):
|
|||||||
;(
|
;(
|
||||||
; unsigned char *src_ptr,
|
; unsigned char *src_ptr,
|
||||||
; int src_pixel_step,
|
; int src_pixel_step,
|
||||||
; const char *blimit,
|
; const char *flimit,
|
||||||
|
; const char *limit,
|
||||||
|
; const char *thresh,
|
||||||
|
; int count
|
||||||
;)
|
;)
|
||||||
global sym(vp8_loop_filter_simple_horizontal_edge_sse2)
|
global sym(vp8_loop_filter_simple_horizontal_edge_sse2)
|
||||||
sym(vp8_loop_filter_simple_horizontal_edge_sse2):
|
sym(vp8_loop_filter_simple_horizontal_edge_sse2):
|
||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 3
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -1387,8 +1394,13 @@ sym(vp8_loop_filter_simple_horizontal_edge_sse2):
|
|||||||
|
|
||||||
mov rsi, arg(0) ;src_ptr
|
mov rsi, arg(0) ;src_ptr
|
||||||
movsxd rax, dword ptr arg(1) ;src_pixel_step ; destination pitch?
|
movsxd rax, dword ptr arg(1) ;src_pixel_step ; destination pitch?
|
||||||
mov rdx, arg(2) ;blimit
|
mov rdx, arg(2) ;flimit ; get flimit
|
||||||
movdqa xmm3, XMMWORD PTR [rdx]
|
movdqa xmm3, XMMWORD PTR [rdx]
|
||||||
|
mov rdx, arg(3) ;limit
|
||||||
|
movdqa xmm7, XMMWORD PTR [rdx]
|
||||||
|
|
||||||
|
paddb xmm3, xmm3 ; flimit*2 (less than 255)
|
||||||
|
paddb xmm3, xmm7 ; flimit * 2 + limit (less than 255)
|
||||||
|
|
||||||
mov rdi, rsi ; rdi points to row +1 for indirect addressing
|
mov rdi, rsi ; rdi points to row +1 for indirect addressing
|
||||||
add rdi, rax
|
add rdi, rax
|
||||||
@@ -1416,7 +1428,7 @@ sym(vp8_loop_filter_simple_horizontal_edge_sse2):
|
|||||||
paddusb xmm5, xmm5 ; abs(p0-q0)*2
|
paddusb xmm5, xmm5 ; abs(p0-q0)*2
|
||||||
paddusb xmm5, xmm1 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
paddusb xmm5, xmm1 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
||||||
|
|
||||||
psubusb xmm5, xmm3 ; abs(p0 - q0) *2 + abs(p1-q1)/2 > blimit
|
psubusb xmm5, xmm3 ; abs(p0 - q0) *2 + abs(p1-q1)/2 > flimit * 2 + limit
|
||||||
pxor xmm3, xmm3
|
pxor xmm3, xmm3
|
||||||
pcmpeqb xmm5, xmm3
|
pcmpeqb xmm5, xmm3
|
||||||
|
|
||||||
@@ -1481,14 +1493,17 @@ sym(vp8_loop_filter_simple_horizontal_edge_sse2):
|
|||||||
;(
|
;(
|
||||||
; unsigned char *src_ptr,
|
; unsigned char *src_ptr,
|
||||||
; int src_pixel_step,
|
; int src_pixel_step,
|
||||||
; const char *blimit,
|
; const char *flimit,
|
||||||
|
; const char *limit,
|
||||||
|
; const char *thresh,
|
||||||
|
; int count
|
||||||
;)
|
;)
|
||||||
global sym(vp8_loop_filter_simple_vertical_edge_sse2)
|
global sym(vp8_loop_filter_simple_vertical_edge_sse2)
|
||||||
sym(vp8_loop_filter_simple_vertical_edge_sse2):
|
sym(vp8_loop_filter_simple_vertical_edge_sse2):
|
||||||
push rbp ; save old base pointer value.
|
push rbp ; save old base pointer value.
|
||||||
mov rbp, rsp ; set new base pointer value.
|
mov rbp, rsp ; set new base pointer value.
|
||||||
SHADOW_ARGS_TO_STACK 3
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx ; save callee-saved reg
|
GET_GOT rbx ; save callee-saved reg
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -1592,10 +1607,14 @@ sym(vp8_loop_filter_simple_vertical_edge_sse2):
|
|||||||
paddusb xmm5, xmm5 ; abs(p0-q0)*2
|
paddusb xmm5, xmm5 ; abs(p0-q0)*2
|
||||||
paddusb xmm5, xmm6 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
paddusb xmm5, xmm6 ; abs (p0 - q0) *2 + abs(p1-q1)/2
|
||||||
|
|
||||||
mov rdx, arg(2) ;blimit
|
mov rdx, arg(2) ;flimit
|
||||||
movdqa xmm7, XMMWORD PTR [rdx]
|
movdqa xmm7, XMMWORD PTR [rdx]
|
||||||
|
mov rdx, arg(3) ; get limit
|
||||||
|
movdqa xmm6, XMMWORD PTR [rdx]
|
||||||
|
paddb xmm7, xmm7 ; flimit*2 (less than 255)
|
||||||
|
paddb xmm7, xmm6 ; flimit * 2 + limit (less than 255)
|
||||||
|
|
||||||
psubusb xmm5, xmm7 ; abs(p0 - q0) *2 + abs(p1-q1)/2 > blimit
|
psubusb xmm5, xmm7 ; abs(p0 - q0) *2 + abs(p1-q1)/2 > flimit * 2 + limit
|
||||||
pxor xmm7, xmm7
|
pxor xmm7, xmm7
|
||||||
pcmpeqb xmm5, xmm7 ; mm5 = mask
|
pcmpeqb xmm5, xmm7 ; mm5 = mask
|
||||||
|
|
||||||
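The comment changes in the assembly hunks above suggest that the two sides differ in where the edge threshold comes from. The `blimit` side loads one precomputed vector. The `flimit` side loads `flimit` and `limit` and forms `flimit * 2 + limit` with two `paddb` instructions. The scalar C sketch below illustrates that test for one pixel position, assuming `blimit` holds the same value as `flimit * 2 + limit`. The helper names (`sat_add_u8`, `edge_activity`, `combined_limit`, `edge_mask`) are hypothetical and do not appear in the source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Saturating unsigned byte add, mimicking what paddusb does per lane. */
static uint8_t sat_add_u8(uint8_t a, uint8_t b)
{
    unsigned s = (unsigned)a + (unsigned)b;
    return (uint8_t)(s > 255u ? 255u : s);
}

/* Left-hand side of the test: abs(p0 - q0) * 2 + abs(p1 - q1) / 2. */
static uint8_t edge_activity(uint8_t p1, uint8_t p0, uint8_t q0, uint8_t q1)
{
    uint8_t d0 = (uint8_t)abs(p0 - q0);
    uint8_t d1 = (uint8_t)abs(p1 - q1);
    return sat_add_u8(sat_add_u8(d0, d0), (uint8_t)(d1 >> 1));
}

/* Threshold as formed on the flimit side: flimit * 2 + limit.
 * paddb does not saturate, so the sum is expected to fit in a byte,
 * as the "(less than 255)" comments in the diff indicate. */
static uint8_t combined_limit(uint8_t flimit, uint8_t limit)
{
    assert(2u * flimit + limit <= 255u);
    return (uint8_t)(2u * flimit + limit);
}

/* Returns 1 where the filter applies, i.e. where psubusb followed by
 * pcmpeqb against zero would set the mask byte: activity <= threshold. */
static int edge_mask(uint8_t p1, uint8_t p0, uint8_t q0, uint8_t q1,
                     uint8_t threshold)
{
    return edge_activity(p1, p0, q0, q1) <= threshold;
}
```

Under the stated assumption, passing a precomputed `blimit` as `threshold` and passing `combined_limit(flimit, limit)` give the same mask; the sketch only shows the comparison and does not model the 16-lane SSE2 layout.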
|
|||||||
@@ -9,18 +9,30 @@
|
|||||||
*/
|
*/
|
||||||
|
|
||||||
|
|
||||||
#include "vpx_config.h"
|
#include "vpx_ports/config.h"
|
||||||
#include "vp8/common/loopfilter.h"
|
#include "vp8/common/loopfilter.h"
|
||||||
|
|
||||||
|
prototype_loopfilter(vp8_loop_filter_horizontal_edge_c);
|
||||||
|
prototype_loopfilter(vp8_loop_filter_vertical_edge_c);
|
||||||
|
prototype_loopfilter(vp8_mbloop_filter_horizontal_edge_c);
|
||||||
|
prototype_loopfilter(vp8_mbloop_filter_vertical_edge_c);
|
||||||
|
prototype_loopfilter(vp8_loop_filter_simple_horizontal_edge_c);
|
||||||
|
prototype_loopfilter(vp8_loop_filter_simple_vertical_edge_c);
|
||||||
|
|
||||||
prototype_loopfilter(vp8_mbloop_filter_vertical_edge_mmx);
|
prototype_loopfilter(vp8_mbloop_filter_vertical_edge_mmx);
|
||||||
prototype_loopfilter(vp8_mbloop_filter_horizontal_edge_mmx);
|
prototype_loopfilter(vp8_mbloop_filter_horizontal_edge_mmx);
|
||||||
prototype_loopfilter(vp8_loop_filter_vertical_edge_mmx);
|
prototype_loopfilter(vp8_loop_filter_vertical_edge_mmx);
|
||||||
prototype_loopfilter(vp8_loop_filter_horizontal_edge_mmx);
|
prototype_loopfilter(vp8_loop_filter_horizontal_edge_mmx);
|
||||||
|
prototype_loopfilter(vp8_loop_filter_simple_vertical_edge_mmx);
|
||||||
|
prototype_loopfilter(vp8_loop_filter_simple_horizontal_edge_mmx);
|
||||||
|
|
||||||
prototype_loopfilter(vp8_loop_filter_vertical_edge_sse2);
|
prototype_loopfilter(vp8_loop_filter_vertical_edge_sse2);
|
||||||
prototype_loopfilter(vp8_loop_filter_horizontal_edge_sse2);
|
prototype_loopfilter(vp8_loop_filter_horizontal_edge_sse2);
|
||||||
prototype_loopfilter(vp8_mbloop_filter_vertical_edge_sse2);
|
prototype_loopfilter(vp8_mbloop_filter_vertical_edge_sse2);
|
||||||
prototype_loopfilter(vp8_mbloop_filter_horizontal_edge_sse2);
|
prototype_loopfilter(vp8_mbloop_filter_horizontal_edge_sse2);
|
||||||
|
prototype_loopfilter(vp8_loop_filter_simple_vertical_edge_sse2);
|
||||||
|
prototype_loopfilter(vp8_loop_filter_simple_horizontal_edge_sse2);
|
||||||
|
prototype_loopfilter(vp8_fast_loop_filter_vertical_edges_sse2);
|
||||||
|
|
||||||
extern loop_filter_uvfunction vp8_loop_filter_horizontal_edge_uv_sse2;
|
extern loop_filter_uvfunction vp8_loop_filter_horizontal_edge_uv_sse2;
|
||||||
extern loop_filter_uvfunction vp8_loop_filter_vertical_edge_uv_sse2;
|
extern loop_filter_uvfunction vp8_loop_filter_vertical_edge_uv_sse2;
|
||||||
@@ -30,77 +42,113 @@ extern loop_filter_uvfunction vp8_mbloop_filter_vertical_edge_uv_sse2;
|
|||||||
#if HAVE_MMX
|
#if HAVE_MMX
|
||||||
/* Horizontal MB filtering */
|
/* Horizontal MB filtering */
|
||||||
void vp8_loop_filter_mbh_mmx(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void vp8_loop_filter_mbh_mmx(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_mbloop_filter_horizontal_edge_mmx(y_ptr, y_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 2);
|
(void) simpler_lpf;
|
||||||
|
vp8_mbloop_filter_horizontal_edge_mmx(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_mbloop_filter_horizontal_edge_mmx(u_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
|
vp8_mbloop_filter_horizontal_edge_mmx(u_ptr, uv_stride, lfi->mbflim, lfi->lim, lfi->thr, 1);
|
||||||
|
|
||||||
if (v_ptr)
|
if (v_ptr)
|
||||||
vp8_mbloop_filter_horizontal_edge_mmx(v_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
|
vp8_mbloop_filter_horizontal_edge_mmx(v_ptr, uv_stride, lfi->mbflim, lfi->lim, lfi->thr, 1);
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
void vp8_loop_filter_mbhs_mmx(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
|
{
|
||||||
|
(void) u_ptr;
|
||||||
|
(void) v_ptr;
|
||||||
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_mmx(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
/* Vertical MB Filtering */
|
/* Vertical MB Filtering */
|
||||||
void vp8_loop_filter_mbv_mmx(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void vp8_loop_filter_mbv_mmx(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_mbloop_filter_vertical_edge_mmx(y_ptr, y_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 2);
|
(void) simpler_lpf;
|
||||||
|
vp8_mbloop_filter_vertical_edge_mmx(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_mbloop_filter_vertical_edge_mmx(u_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
|
vp8_mbloop_filter_vertical_edge_mmx(u_ptr, uv_stride, lfi->mbflim, lfi->lim, lfi->thr, 1);
|
||||||
|
|
||||||
if (v_ptr)
|
if (v_ptr)
|
||||||
vp8_mbloop_filter_vertical_edge_mmx(v_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 1);
|
vp8_mbloop_filter_vertical_edge_mmx(v_ptr, uv_stride, lfi->mbflim, lfi->lim, lfi->thr, 1);
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
void vp8_loop_filter_mbvs_mmx(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
|
{
|
||||||
|
(void) u_ptr;
|
||||||
|
(void) v_ptr;
|
||||||
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_vertical_edge_mmx(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
/* Horizontal B Filtering */
|
/* Horizontal B Filtering */
|
||||||
void vp8_loop_filter_bh_mmx(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void vp8_loop_filter_bh_mmx(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_loop_filter_horizontal_edge_mmx(y_ptr + 4 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
(void) simpler_lpf;
|
||||||
vp8_loop_filter_horizontal_edge_mmx(y_ptr + 8 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
vp8_loop_filter_horizontal_edge_mmx(y_ptr + 4 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
vp8_loop_filter_horizontal_edge_mmx(y_ptr + 12 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
vp8_loop_filter_horizontal_edge_mmx(y_ptr + 8 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_horizontal_edge_mmx(y_ptr + 12 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_loop_filter_horizontal_edge_mmx(u_ptr + 4 * uv_stride, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
|
vp8_loop_filter_horizontal_edge_mmx(u_ptr + 4 * uv_stride, uv_stride, lfi->flim, lfi->lim, lfi->thr, 1);
|
||||||
|
|
||||||
if (v_ptr)
|
if (v_ptr)
|
||||||
vp8_loop_filter_horizontal_edge_mmx(v_ptr + 4 * uv_stride, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
|
vp8_loop_filter_horizontal_edge_mmx(v_ptr + 4 * uv_stride, uv_stride, lfi->flim, lfi->lim, lfi->thr, 1);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
void vp8_loop_filter_bhs_mmx(unsigned char *y_ptr, int y_stride, const unsigned char *blimit)
|
void vp8_loop_filter_bhs_mmx(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_loop_filter_simple_horizontal_edge_mmx(y_ptr + 4 * y_stride, y_stride, blimit);
|
(void) u_ptr;
|
||||||
vp8_loop_filter_simple_horizontal_edge_mmx(y_ptr + 8 * y_stride, y_stride, blimit);
|
(void) v_ptr;
|
||||||
vp8_loop_filter_simple_horizontal_edge_mmx(y_ptr + 12 * y_stride, y_stride, blimit);
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_mmx(y_ptr + 4 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_mmx(y_ptr + 8 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_mmx(y_ptr + 12 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
/* Vertical B Filtering */
|
/* Vertical B Filtering */
|
||||||
void vp8_loop_filter_bv_mmx(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void vp8_loop_filter_bv_mmx(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_loop_filter_vertical_edge_mmx(y_ptr + 4, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
(void) simpler_lpf;
|
||||||
vp8_loop_filter_vertical_edge_mmx(y_ptr + 8, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
vp8_loop_filter_vertical_edge_mmx(y_ptr + 4, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
vp8_loop_filter_vertical_edge_mmx(y_ptr + 12, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
vp8_loop_filter_vertical_edge_mmx(y_ptr + 8, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_vertical_edge_mmx(y_ptr + 12, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_loop_filter_vertical_edge_mmx(u_ptr + 4, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
|
vp8_loop_filter_vertical_edge_mmx(u_ptr + 4, uv_stride, lfi->flim, lfi->lim, lfi->thr, 1);
|
||||||
|
|
||||||
if (v_ptr)
|
if (v_ptr)
|
||||||
vp8_loop_filter_vertical_edge_mmx(v_ptr + 4, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, 1);
|
vp8_loop_filter_vertical_edge_mmx(v_ptr + 4, uv_stride, lfi->flim, lfi->lim, lfi->thr, 1);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
void vp8_loop_filter_bvs_mmx(unsigned char *y_ptr, int y_stride, const unsigned char *blimit)
|
void vp8_loop_filter_bvs_mmx(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_loop_filter_simple_vertical_edge_mmx(y_ptr + 4, y_stride, blimit);
|
(void) u_ptr;
|
||||||
vp8_loop_filter_simple_vertical_edge_mmx(y_ptr + 8, y_stride, blimit);
|
(void) v_ptr;
|
||||||
vp8_loop_filter_simple_vertical_edge_mmx(y_ptr + 12, y_stride, blimit);
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_vertical_edge_mmx(y_ptr + 4, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_vertical_edge_mmx(y_ptr + 8, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_vertical_edge_mmx(y_ptr + 12, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
@@ -108,65 +156,113 @@ void vp8_loop_filter_bvs_mmx(unsigned char *y_ptr, int y_stride, const unsigned
|
|||||||
/* Horizontal MB filtering */
|
/* Horizontal MB filtering */
|
||||||
#if HAVE_SSE2
|
#if HAVE_SSE2
|
||||||
void vp8_loop_filter_mbh_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void vp8_loop_filter_mbh_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_mbloop_filter_horizontal_edge_sse2(y_ptr, y_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 2);
|
(void) simpler_lpf;
|
||||||
|
vp8_mbloop_filter_horizontal_edge_sse2(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_mbloop_filter_horizontal_edge_uv_sse2(u_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, v_ptr);
|
vp8_mbloop_filter_horizontal_edge_uv_sse2(u_ptr, uv_stride, lfi->mbflim, lfi->lim, lfi->thr, v_ptr);
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
void vp8_loop_filter_mbhs_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
|
{
|
||||||
|
(void) u_ptr;
|
||||||
|
(void) v_ptr;
|
||||||
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_sse2(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
/* Vertical MB Filtering */
|
/* Vertical MB Filtering */
|
||||||
void vp8_loop_filter_mbv_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void vp8_loop_filter_mbv_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_mbloop_filter_vertical_edge_sse2(y_ptr, y_stride, lfi->mblim, lfi->lim, lfi->hev_thr, 2);
|
(void) simpler_lpf;
|
||||||
|
vp8_mbloop_filter_vertical_edge_sse2(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_mbloop_filter_vertical_edge_uv_sse2(u_ptr, uv_stride, lfi->mblim, lfi->lim, lfi->hev_thr, v_ptr);
|
vp8_mbloop_filter_vertical_edge_uv_sse2(u_ptr, uv_stride, lfi->mbflim, lfi->lim, lfi->thr, v_ptr);
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
void vp8_loop_filter_mbvs_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
|
{
|
||||||
|
(void) u_ptr;
|
||||||
|
(void) v_ptr;
|
||||||
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_vertical_edge_sse2(y_ptr, y_stride, lfi->mbflim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
/* Horizontal B Filtering */
|
/* Horizontal B Filtering */
|
||||||
void vp8_loop_filter_bh_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void vp8_loop_filter_bh_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_loop_filter_horizontal_edge_sse2(y_ptr + 4 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
(void) simpler_lpf;
|
||||||
vp8_loop_filter_horizontal_edge_sse2(y_ptr + 8 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
vp8_loop_filter_horizontal_edge_sse2(y_ptr + 4 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
vp8_loop_filter_horizontal_edge_sse2(y_ptr + 12 * y_stride, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
vp8_loop_filter_horizontal_edge_sse2(y_ptr + 8 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_horizontal_edge_sse2(y_ptr + 12 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_loop_filter_horizontal_edge_uv_sse2(u_ptr + 4 * uv_stride, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, v_ptr + 4 * uv_stride);
|
vp8_loop_filter_horizontal_edge_uv_sse2(u_ptr + 4 * uv_stride, uv_stride, lfi->flim, lfi->lim, lfi->thr, v_ptr + 4 * uv_stride);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
void vp8_loop_filter_bhs_sse2(unsigned char *y_ptr, int y_stride, const unsigned char *blimit)
|
void vp8_loop_filter_bhs_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_loop_filter_simple_horizontal_edge_sse2(y_ptr + 4 * y_stride, y_stride, blimit);
|
(void) u_ptr;
|
||||||
vp8_loop_filter_simple_horizontal_edge_sse2(y_ptr + 8 * y_stride, y_stride, blimit);
|
(void) v_ptr;
|
||||||
vp8_loop_filter_simple_horizontal_edge_sse2(y_ptr + 12 * y_stride, y_stride, blimit);
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_sse2(y_ptr + 4 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_sse2(y_ptr + 8 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_horizontal_edge_sse2(y_ptr + 12 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
/* Vertical B Filtering */
|
/* Vertical B Filtering */
|
||||||
void vp8_loop_filter_bv_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
void vp8_loop_filter_bv_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
int y_stride, int uv_stride, loop_filter_info *lfi)
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_loop_filter_vertical_edge_sse2(y_ptr + 4, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
(void) simpler_lpf;
|
||||||
vp8_loop_filter_vertical_edge_sse2(y_ptr + 8, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
vp8_loop_filter_vertical_edge_sse2(y_ptr + 4, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
vp8_loop_filter_vertical_edge_sse2(y_ptr + 12, y_stride, lfi->blim, lfi->lim, lfi->hev_thr, 2);
|
vp8_loop_filter_vertical_edge_sse2(y_ptr + 8, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_vertical_edge_sse2(y_ptr + 12, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
|
||||||
if (u_ptr)
|
if (u_ptr)
|
||||||
vp8_loop_filter_vertical_edge_uv_sse2(u_ptr + 4, uv_stride, lfi->blim, lfi->lim, lfi->hev_thr, v_ptr + 4);
|
vp8_loop_filter_vertical_edge_uv_sse2(u_ptr + 4, uv_stride, lfi->flim, lfi->lim, lfi->thr, v_ptr + 4);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
void vp8_loop_filter_bvs_sse2(unsigned char *y_ptr, int y_stride, const unsigned char *blimit)
|
void vp8_loop_filter_bvs_sse2(unsigned char *y_ptr, unsigned char *u_ptr, unsigned char *v_ptr,
|
||||||
|
int y_stride, int uv_stride, loop_filter_info *lfi, int simpler_lpf)
|
||||||
{
|
{
|
||||||
vp8_loop_filter_simple_vertical_edge_sse2(y_ptr + 4, y_stride, blimit);
|
(void) u_ptr;
|
||||||
vp8_loop_filter_simple_vertical_edge_sse2(y_ptr + 8, y_stride, blimit);
|
(void) v_ptr;
|
||||||
vp8_loop_filter_simple_vertical_edge_sse2(y_ptr + 12, y_stride, blimit);
|
(void) uv_stride;
|
||||||
|
(void) simpler_lpf;
|
||||||
|
vp8_loop_filter_simple_vertical_edge_sse2(y_ptr + 4, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_vertical_edge_sse2(y_ptr + 8, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_vertical_edge_sse2(y_ptr + 12, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
}
|
}
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
|
#if 0
|
||||||
|
void vp8_fast_loop_filter_vertical_edges_sse(unsigned char *y_ptr,
|
||||||
|
int y_stride,
|
||||||
|
loop_filter_info *lfi)
|
||||||
|
{
|
||||||
|
|
||||||
|
vp8_loop_filter_simple_vertical_edge_sse2(y_ptr + 4, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_vertical_edge_sse2(y_ptr + 8, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
vp8_loop_filter_simple_vertical_edge_sse2(y_ptr + 12, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
|
||||||
|
}
|
||||||
|
#endif
|
||||||
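On the `flimit` side of the C hunks above, the simple-filter wrappers such as `vp8_loop_filter_bhs_sse2` take the same seven-argument block signature as the normal filters, cast the unused arguments to `void`, and call the edge function at rows 4, 8 and 12 of the luma block. The sketch below shows that wrapper shape with a stub in place of the real edge filter. `lf_info_sketch`, `simple_horizontal_edge_stub`, `loop_filter_bhs_sketch` and the recording globals are hypothetical names introduced for illustration, not part of the source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for loop_filter_info; only the fields used here. */
typedef struct {
    const signed char *flim;
    const signed char *lim;
    const signed char *thr;
} lf_info_sketch;

#define MAX_CALLS 8
static ptrdiff_t recorded_offsets[MAX_CALLS];
static int recorded_count = 0;
static unsigned char *plane_base = NULL;

/* Stub edge filter: records where it was called instead of filtering. */
static void simple_horizontal_edge_stub(unsigned char *src, int pitch,
                                        const signed char *flimit,
                                        const signed char *limit,
                                        const signed char *thresh,
                                        int count)
{
    (void) pitch;
    (void) flimit;
    (void) limit;
    (void) thresh;
    (void) count;
    assert(recorded_count < MAX_CALLS);
    recorded_offsets[recorded_count++] = src - plane_base;
}

/* Wrapper with the uniform block signature; chroma arguments are unused
 * because the simple filter touches only the luma plane. */
static void loop_filter_bhs_sketch(unsigned char *y_ptr, unsigned char *u_ptr,
                                   unsigned char *v_ptr, int y_stride,
                                   int uv_stride, lf_info_sketch *lfi,
                                   int simpler_lpf)
{
    (void) u_ptr;
    (void) v_ptr;
    (void) uv_stride;
    (void) simpler_lpf;
    simple_horizontal_edge_stub(y_ptr + 4 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
    simple_horizontal_edge_stub(y_ptr + 8 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
    simple_horizontal_edge_stub(y_ptr + 12 * y_stride, y_stride, lfi->flim, lfi->lim, lfi->thr, 2);
}
```

A uniform signature lets every block-level filter sit behind one function-pointer or macro type, at the cost of passing arguments the simple variants ignore; the `blimit` side of the diff instead gives the simple variants a shorter three-argument form.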
|
|||||||
@@ -24,10 +24,10 @@ extern prototype_loopfilter_block(vp8_loop_filter_mbv_mmx);
|
|||||||
extern prototype_loopfilter_block(vp8_loop_filter_bv_mmx);
|
extern prototype_loopfilter_block(vp8_loop_filter_bv_mmx);
|
||||||
extern prototype_loopfilter_block(vp8_loop_filter_mbh_mmx);
|
extern prototype_loopfilter_block(vp8_loop_filter_mbh_mmx);
|
||||||
extern prototype_loopfilter_block(vp8_loop_filter_bh_mmx);
|
extern prototype_loopfilter_block(vp8_loop_filter_bh_mmx);
|
||||||
extern prototype_simple_loopfilter(vp8_loop_filter_simple_vertical_edge_mmx);
|
extern prototype_loopfilter_block(vp8_loop_filter_mbvs_mmx);
|
||||||
extern prototype_simple_loopfilter(vp8_loop_filter_bvs_mmx);
|
extern prototype_loopfilter_block(vp8_loop_filter_bvs_mmx);
|
||||||
extern prototype_simple_loopfilter(vp8_loop_filter_simple_horizontal_edge_mmx);
|
extern prototype_loopfilter_block(vp8_loop_filter_mbhs_mmx);
|
||||||
extern prototype_simple_loopfilter(vp8_loop_filter_bhs_mmx);
|
extern prototype_loopfilter_block(vp8_loop_filter_bhs_mmx);
|
||||||
|
|
||||||
|
|
||||||
#if !CONFIG_RUNTIME_CPU_DETECT
|
#if !CONFIG_RUNTIME_CPU_DETECT
|
||||||
@@ -44,13 +44,13 @@ extern prototype_simple_loopfilter(vp8_loop_filter_bhs_mmx);
|
|||||||
#define vp8_lf_normal_b_h vp8_loop_filter_bh_mmx
|
#define vp8_lf_normal_b_h vp8_loop_filter_bh_mmx
|
||||||
|
|
||||||
#undef vp8_lf_simple_mb_v
|
#undef vp8_lf_simple_mb_v
|
||||||
#define vp8_lf_simple_mb_v vp8_loop_filter_simple_vertical_edge_mmx
|
#define vp8_lf_simple_mb_v vp8_loop_filter_mbvs_mmx
|
||||||
|
|
||||||
#undef vp8_lf_simple_b_v
|
#undef vp8_lf_simple_b_v
|
||||||
#define vp8_lf_simple_b_v vp8_loop_filter_bvs_mmx
|
#define vp8_lf_simple_b_v vp8_loop_filter_bvs_mmx
|
||||||
|
|
||||||
#undef vp8_lf_simple_mb_h
|
#undef vp8_lf_simple_mb_h
|
||||||
#define vp8_lf_simple_mb_h vp8_loop_filter_simple_horizontal_edge_mmx
|
#define vp8_lf_simple_mb_h vp8_loop_filter_mbhs_mmx
|
||||||
|
|
||||||
#undef vp8_lf_simple_b_h
|
#undef vp8_lf_simple_b_h
|
||||||
#define vp8_lf_simple_b_h vp8_loop_filter_bhs_mmx
|
#define vp8_lf_simple_b_h vp8_loop_filter_bhs_mmx
|
||||||
@@ -63,10 +63,10 @@ extern prototype_loopfilter_block(vp8_loop_filter_mbv_sse2);
|
|||||||
extern prototype_loopfilter_block(vp8_loop_filter_bv_sse2);
|
extern prototype_loopfilter_block(vp8_loop_filter_bv_sse2);
|
||||||
extern prototype_loopfilter_block(vp8_loop_filter_mbh_sse2);
|
extern prototype_loopfilter_block(vp8_loop_filter_mbh_sse2);
|
||||||
extern prototype_loopfilter_block(vp8_loop_filter_bh_sse2);
|
extern prototype_loopfilter_block(vp8_loop_filter_bh_sse2);
|
||||||
extern prototype_simple_loopfilter(vp8_loop_filter_simple_vertical_edge_sse2);
|
extern prototype_loopfilter_block(vp8_loop_filter_mbvs_sse2);
|
||||||
extern prototype_simple_loopfilter(vp8_loop_filter_bvs_sse2);
|
extern prototype_loopfilter_block(vp8_loop_filter_bvs_sse2);
|
||||||
extern prototype_simple_loopfilter(vp8_loop_filter_simple_horizontal_edge_sse2);
|
extern prototype_loopfilter_block(vp8_loop_filter_mbhs_sse2);
|
||||||
extern prototype_simple_loopfilter(vp8_loop_filter_bhs_sse2);
|
extern prototype_loopfilter_block(vp8_loop_filter_bhs_sse2);
|
||||||
|
|
||||||
|
|
||||||
#if !CONFIG_RUNTIME_CPU_DETECT
|
#if !CONFIG_RUNTIME_CPU_DETECT
|
||||||
@@ -83,13 +83,13 @@ extern prototype_simple_loopfilter(vp8_loop_filter_bhs_sse2);
|
|||||||
#define vp8_lf_normal_b_h vp8_loop_filter_bh_sse2
|
#define vp8_lf_normal_b_h vp8_loop_filter_bh_sse2
|
||||||
|
|
||||||
#undef vp8_lf_simple_mb_v
|
#undef vp8_lf_simple_mb_v
|
||||||
#define vp8_lf_simple_mb_v vp8_loop_filter_simple_vertical_edge_sse2
|
#define vp8_lf_simple_mb_v vp8_loop_filter_mbvs_sse2
|
||||||
|
|
||||||
#undef vp8_lf_simple_b_v
|
#undef vp8_lf_simple_b_v
|
||||||
#define vp8_lf_simple_b_v vp8_loop_filter_bvs_sse2
|
#define vp8_lf_simple_b_v vp8_loop_filter_bvs_sse2
|
||||||
|
|
||||||
#undef vp8_lf_simple_mb_h
|
#undef vp8_lf_simple_mb_h
|
||||||
#define vp8_lf_simple_mb_h vp8_loop_filter_simple_horizontal_edge_sse2
|
#define vp8_lf_simple_mb_h vp8_loop_filter_mbhs_sse2
|
||||||
|
|
||||||
#undef vp8_lf_simple_b_h
|
#undef vp8_lf_simple_b_h
|
||||||
#define vp8_lf_simple_b_h vp8_loop_filter_bhs_sse2
|
#define vp8_lf_simple_b_h vp8_loop_filter_bhs_sse2
|
||||||
|
|||||||
@@ -26,7 +26,7 @@ sym(vp8_post_proc_down_and_across_xmm):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 7
|
SHADOW_ARGS_TO_STACK 7
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -256,7 +256,7 @@ sym(vp8_mbpost_proc_down_xmm):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 5
|
SHADOW_ARGS_TO_STACK 5
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -456,7 +456,7 @@ sym(vp8_mbpost_proc_across_ip_xmm):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 5
|
SHADOW_ARGS_TO_STACK 5
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
|
|||||||
@@ -67,7 +67,7 @@ sym(vp8_recon4b_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 4
|
SHADOW_ARGS_TO_STACK 4
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
; end prolog
|
; end prolog
|
||||||
@@ -229,460 +229,3 @@ sym(vp8_copy_mem16x16_sse2):
|
|||||||
UNSHADOW_ARGS
|
UNSHADOW_ARGS
|
||||||
pop rbp
|
pop rbp
|
||||||
ret
|
ret
|
||||||
|
|
||||||
|
|
||||||
;void vp8_intra_pred_uv_dc_mmx2(
|
|
||||||
; unsigned char *dst,
|
|
||||||
; int dst_stride,
|
|
||||||
; unsigned char *src,
|
|
||||||
; int src_stride
|
|
||||||
; )
|
|
||||||
global sym(vp8_intra_pred_uv_dc_mmx2)
|
|
||||||
sym(vp8_intra_pred_uv_dc_mmx2):
|
|
||||||
push rbp
|
|
||||||
mov rbp, rsp
|
|
||||||
SHADOW_ARGS_TO_STACK 4
|
|
||||||
push rsi
|
|
||||||
push rdi
|
|
||||||
; end prolog
|
|
||||||
|
|
||||||
; from top
|
|
||||||
mov rsi, arg(2) ;src;
|
|
||||||
movsxd rax, dword ptr arg(3) ;src_stride;
|
|
||||||
sub rsi, rax
|
|
||||||
pxor mm0, mm0
|
|
||||||
movq mm1, [rsi]
|
|
||||||
psadbw mm1, mm0
|
|
||||||
|
|
||||||
; from left
|
|
||||||
dec rsi
|
|
||||||
lea rdi, [rax*3]
|
|
||||||
movzx ecx, byte [rsi+rax]
|
|
||||||
movzx edx, byte [rsi+rax*2]
|
|
||||||
add ecx, edx
|
|
||||||
movzx edx, byte [rsi+rdi]
|
|
||||||
add ecx, edx
|
|
||||||
lea rsi, [rsi+rax*4]
|
|
||||||
movzx edx, byte [rsi]
|
|
||||||
add ecx, edx
|
|
||||||
movzx edx, byte [rsi+rax]
|
|
||||||
add ecx, edx
|
|
||||||
movzx edx, byte [rsi+rax*2]
|
|
||||||
add ecx, edx
|
|
||||||
movzx edx, byte [rsi+rdi]
|
|
||||||
add ecx, edx
|
|
||||||
movzx edx, byte [rsi+rax*4]
|
|
||||||
add ecx, edx
|
|
||||||
|
|
||||||
; add up
|
|
||||||
pextrw edx, mm1, 0x0
|
|
||||||
lea edx, [edx+ecx+8]
|
|
||||||
sar edx, 4
|
|
||||||
movd mm1, edx
|
|
||||||
pshufw mm1, mm1, 0x0
|
|
||||||
packuswb mm1, mm1
|
|
||||||
|
|
||||||
; write out
|
|
||||||
mov rdi, arg(0) ;dst;
|
|
||||||
movsxd rcx, dword ptr arg(1) ;dst_stride
|
|
||||||
lea rax, [rcx*3]
|
|
||||||
|
|
||||||
movq [rdi ], mm1
|
|
||||||
movq [rdi+rcx ], mm1
|
|
||||||
movq [rdi+rcx*2], mm1
|
|
||||||
movq [rdi+rax ], mm1
|
|
||||||
lea rdi, [rdi+rcx*4]
|
|
||||||
movq [rdi ], mm1
|
|
||||||
movq [rdi+rcx ], mm1
|
|
||||||
movq [rdi+rcx*2], mm1
|
|
||||||
movq [rdi+rax ], mm1
|
|
||||||
|
|
||||||
; begin epilog
|
|
||||||
pop rdi
|
|
||||||
pop rsi
|
|
||||||
UNSHADOW_ARGS
|
|
||||||
pop rbp
|
|
||||||
ret
|
|
||||||
|
|
||||||
;void vp8_intra_pred_uv_dctop_mmx2(
|
|
||||||
; unsigned char *dst,
|
|
||||||
; int dst_stride,
|
|
||||||
; unsigned char *src,
|
|
||||||
; int src_stride
|
|
||||||
; )
|
|
||||||
global sym(vp8_intra_pred_uv_dctop_mmx2)
|
|
||||||
sym(vp8_intra_pred_uv_dctop_mmx2):
|
|
||||||
push rbp
|
|
||||||
mov rbp, rsp
|
|
||||||
SHADOW_ARGS_TO_STACK 4
|
|
||||||
GET_GOT rbx
|
|
||||||
push rsi
|
|
||||||
push rdi
|
|
||||||
; end prolog
|
|
||||||
|
|
||||||
; from top
|
|
||||||
mov rsi, arg(2) ;src;
|
|
||||||
movsxd rax, dword ptr arg(3) ;src_stride;
|
|
||||||
sub rsi, rax
|
|
||||||
pxor mm0, mm0
|
|
||||||
movq mm1, [rsi]
|
|
||||||
psadbw mm1, mm0
|
|
||||||
|
|
||||||
; add up
|
|
||||||
paddw mm1, [GLOBAL(dc_4)]
|
|
||||||
psraw mm1, 3
|
|
||||||
pshufw mm1, mm1, 0x0
|
|
||||||
packuswb mm1, mm1
|
|
||||||
|
|
||||||
; write out
|
|
||||||
mov rdi, arg(0) ;dst;
|
|
||||||
movsxd rcx, dword ptr arg(1) ;dst_stride
|
|
||||||
lea rax, [rcx*3]
|
|
||||||
|
|
||||||
movq [rdi ], mm1
|
|
||||||
movq [rdi+rcx ], mm1
|
|
||||||
movq [rdi+rcx*2], mm1
|
|
||||||
movq [rdi+rax ], mm1
|
|
||||||
lea rdi, [rdi+rcx*4]
|
|
||||||
movq [rdi ], mm1
|
|
||||||
movq [rdi+rcx ], mm1
|
|
||||||
movq [rdi+rcx*2], mm1
|
|
||||||
movq [rdi+rax ], mm1
|
|
||||||
|
|
||||||
; begin epilog
|
|
||||||
pop rdi
|
|
||||||
pop rsi
|
|
||||||
RESTORE_GOT
|
|
||||||
UNSHADOW_ARGS
|
|
||||||
pop rbp
|
|
||||||
ret
|
|
||||||
|
|
||||||
;void vp8_intra_pred_uv_dcleft_mmx2(
|
|
||||||
; unsigned char *dst,
|
|
||||||
; int dst_stride,
|
|
||||||
; unsigned char *src,
|
|
||||||
; int src_stride
|
|
||||||
; )
|
|
||||||
global sym(vp8_intra_pred_uv_dcleft_mmx2)
|
|
||||||
sym(vp8_intra_pred_uv_dcleft_mmx2):
|
|
||||||
push rbp
|
|
||||||
mov rbp, rsp
|
|
||||||
SHADOW_ARGS_TO_STACK 4
|
|
||||||
push rsi
|
|
||||||
push rdi
|
|
||||||
; end prolog
|
|
||||||
|
|
||||||
; from left
|
|
||||||
mov rsi, arg(2) ;src;
|
|
||||||
movsxd rax, dword ptr arg(3) ;src_stride;
|
|
||||||
dec rsi
|
|
||||||
lea rdi, [rax*3]
|
|
||||||
movzx ecx, byte [rsi]
|
|
||||||
movzx edx, byte [rsi+rax]
|
|
||||||
add ecx, edx
|
|
||||||
movzx edx, byte [rsi+rax*2]
|
|
||||||
add ecx, edx
|
|
||||||
movzx edx, byte [rsi+rdi]
|
|
||||||
add ecx, edx
|
|
||||||
lea rsi, [rsi+rax*4]
|
|
||||||
movzx edx, byte [rsi]
|
|
||||||
add ecx, edx
|
|
||||||
movzx edx, byte [rsi+rax]
|
|
||||||
add ecx, edx
|
|
||||||
movzx edx, byte [rsi+rax*2]
|
|
||||||
add ecx, edx
|
|
||||||
movzx edx, byte [rsi+rdi]
|
|
||||||
lea edx, [ecx+edx+4]
|
|
||||||
|
|
||||||
; add up
|
|
||||||
shr edx, 3
|
|
||||||
movd mm1, edx
|
|
||||||
pshufw mm1, mm1, 0x0
|
|
||||||
packuswb mm1, mm1
|
|
||||||
|
|
||||||
; write out
|
|
||||||
mov rdi, arg(0) ;dst;
|
|
||||||
movsxd rcx, dword ptr arg(1) ;dst_stride
|
|
||||||
lea rax, [rcx*3]
|
|
||||||
|
|
||||||
movq [rdi ], mm1
|
|
||||||
movq [rdi+rcx ], mm1
|
|
||||||
movq [rdi+rcx*2], mm1
|
|
||||||
movq [rdi+rax ], mm1
|
|
||||||
lea rdi, [rdi+rcx*4]
|
|
||||||
movq [rdi ], mm1
|
|
||||||
movq [rdi+rcx ], mm1
|
|
||||||
movq [rdi+rcx*2], mm1
|
|
||||||
movq [rdi+rax ], mm1
|
|
||||||
|
|
||||||
; begin epilog
|
|
||||||
pop rdi
|
|
||||||
pop rsi
|
|
||||||
UNSHADOW_ARGS
|
|
||||||
pop rbp
|
|
||||||
ret
|
|
||||||
|
|
||||||
;void vp8_intra_pred_uv_dc128_mmx(
|
|
||||||
; unsigned char *dst,
|
|
||||||
; int dst_stride,
|
|
||||||
; unsigned char *src,
|
|
||||||
; int src_stride
|
|
||||||
; )
|
|
||||||
global sym(vp8_intra_pred_uv_dc128_mmx)
|
|
||||||
sym(vp8_intra_pred_uv_dc128_mmx):
|
|
||||||
push rbp
|
|
||||||
mov rbp, rsp
|
|
||||||
SHADOW_ARGS_TO_STACK 4
|
|
||||||
GET_GOT rbx
|
|
||||||
; end prolog
|
|
||||||
|
|
||||||
; write out
|
|
||||||
movq mm1, [GLOBAL(dc_128)]
|
|
||||||
mov rax, arg(0) ;dst;
|
|
||||||
movsxd rdx, dword ptr arg(1) ;dst_stride
|
|
||||||
lea rcx, [rdx*3]
|
|
||||||
|
|
||||||
movq [rax ], mm1
|
|
||||||
movq [rax+rdx ], mm1
|
|
||||||
movq [rax+rdx*2], mm1
|
|
||||||
movq [rax+rcx ], mm1
|
|
||||||
lea rax, [rax+rdx*4]
|
|
||||||
movq [rax ], mm1
|
|
||||||
movq [rax+rdx ], mm1
|
|
||||||
movq [rax+rdx*2], mm1
|
|
||||||
movq [rax+rcx ], mm1
|
|
||||||
|
|
||||||
; begin epilog
|
|
||||||
RESTORE_GOT
|
|
||||||
UNSHADOW_ARGS
|
|
||||||
pop rbp
|
|
||||||
ret
|
|
||||||
|
|
||||||
;void vp8_intra_pred_uv_tm_sse2(
|
|
||||||
; unsigned char *dst,
|
|
||||||
; int dst_stride,
|
|
||||||
; unsigned char *src,
|
|
||||||
; int src_stride
|
|
||||||
; )
|
|
||||||
%macro vp8_intra_pred_uv_tm 1
|
|
||||||
global sym(vp8_intra_pred_uv_tm_%1)
|
|
||||||
sym(vp8_intra_pred_uv_tm_%1):
|
|
||||||
push rbp
|
|
||||||
mov rbp, rsp
|
|
||||||
SHADOW_ARGS_TO_STACK 4
|
|
||||||
GET_GOT rbx
|
|
||||||
push rsi
|
|
||||||
push rdi
|
|
||||||
; end prolog
|
|
||||||
|
|
||||||
; read top row
|
|
||||||
mov edx, 4
|
|
||||||
mov rsi, arg(2) ;src;
|
|
||||||
movsxd rax, dword ptr arg(3) ;src_stride;
|
|
||||||
sub rsi, rax
|
|
||||||
pxor xmm0, xmm0
|
|
||||||
%ifidn %1, ssse3
|
|
||||||
movdqa xmm2, [GLOBAL(dc_1024)]
|
|
||||||
%endif
|
|
||||||
movq xmm1, [rsi]
|
|
||||||
punpcklbw xmm1, xmm0
|
|
||||||
|
|
||||||
; set up left ptrs and subtract topleft
|
|
||||||
movd xmm3, [rsi-1]
|
|
||||||
lea rsi, [rsi+rax-1]
|
|
||||||
%ifidn %1, sse2
|
|
||||||
punpcklbw xmm3, xmm0
|
|
||||||
pshuflw xmm3, xmm3, 0x0
|
|
||||||
punpcklqdq xmm3, xmm3
|
|
||||||
%else
|
|
||||||
pshufb xmm3, xmm2
|
|
||||||
%endif
|
|
||||||
psubw xmm1, xmm3
|
|
||||||
|
|
||||||
; set up dest ptrs
|
|
||||||
mov rdi, arg(0) ;dst;
|
|
||||||
movsxd rcx, dword ptr arg(1) ;dst_stride
|
|
||||||
|
|
||||||
vp8_intra_pred_uv_tm_%1_loop:
|
|
||||||
movd xmm3, [rsi]
|
|
||||||
movd xmm5, [rsi+rax]
|
|
||||||
%ifidn %1, sse2
|
|
||||||
punpcklbw xmm3, xmm0
|
|
||||||
punpcklbw xmm5, xmm0
|
|
||||||
pshuflw xmm3, xmm3, 0x0
|
|
||||||
pshuflw xmm5, xmm5, 0x0
|
|
||||||
punpcklqdq xmm3, xmm3
|
|
||||||
punpcklqdq xmm5, xmm5
|
|
||||||
%else
|
|
||||||
pshufb xmm3, xmm2
|
|
||||||
pshufb xmm5, xmm2
|
|
||||||
%endif
|
|
||||||
paddw xmm3, xmm1
|
|
||||||
paddw xmm5, xmm1
|
|
||||||
packuswb xmm3, xmm5
|
|
||||||
movq [rdi ], xmm3
|
|
||||||
movhps[rdi+rcx], xmm3
|
|
||||||
lea rsi, [rsi+rax*2]
|
|
||||||
lea rdi, [rdi+rcx*2]
|
|
||||||
dec edx
|
|
||||||
jnz vp8_intra_pred_uv_tm_%1_loop
|
|
||||||
|
|
||||||
; begin epilog
|
|
||||||
pop rdi
|
|
||||||
pop rsi
|
|
||||||
RESTORE_GOT
|
|
||||||
UNSHADOW_ARGS
|
|
||||||
pop rbp
|
|
||||||
ret
|
|
||||||
%endmacro
|
|
||||||
|
|
||||||
vp8_intra_pred_uv_tm sse2
|
|
||||||
vp8_intra_pred_uv_tm ssse3
|
|
||||||
|
|
||||||
;void vp8_intra_pred_uv_ve_mmx(
|
|
||||||
; unsigned char *dst,
|
|
||||||
; int dst_stride,
|
|
||||||
; unsigned char *src,
|
|
||||||
; int src_stride
|
|
||||||
; )
|
|
||||||
global sym(vp8_intra_pred_uv_ve_mmx)
|
|
||||||
sym(vp8_intra_pred_uv_ve_mmx):
|
|
||||||
push rbp
|
|
||||||
mov rbp, rsp
|
|
||||||
SHADOW_ARGS_TO_STACK 4
|
|
||||||
; end prolog
|
|
||||||
|
|
||||||
; read from top
|
|
||||||
mov rax, arg(2) ;src;
|
|
||||||
movsxd rdx, dword ptr arg(3) ;src_stride;
|
|
||||||
sub rax, rdx
|
|
||||||
movq mm1, [rax]
|
|
||||||
|
|
||||||
; write out
|
|
||||||
mov rax, arg(0) ;dst;
|
|
||||||
movsxd rdx, dword ptr arg(1) ;dst_stride
|
|
||||||
lea rcx, [rdx*3]
|
|
||||||
|
|
||||||
movq [rax ], mm1
|
|
||||||
movq [rax+rdx ], mm1
|
|
||||||
movq [rax+rdx*2], mm1
|
|
||||||
movq [rax+rcx ], mm1
|
|
||||||
lea rax, [rax+rdx*4]
|
|
||||||
movq [rax ], mm1
|
|
||||||
movq [rax+rdx ], mm1
|
|
||||||
movq [rax+rdx*2], mm1
|
|
||||||
movq [rax+rcx ], mm1
|
|
||||||
|
|
||||||
; begin epilog
|
|
||||||
UNSHADOW_ARGS
|
|
||||||
pop rbp
|
|
||||||
ret
|
|
||||||
|
|
||||||
;void vp8_intra_pred_uv_ho_mmx2(
|
|
||||||
; unsigned char *dst,
|
|
||||||
; int dst_stride,
|
|
||||||
; unsigned char *src,
|
|
||||||
; int src_stride
|
|
||||||
; )
|
|
||||||
%macro vp8_intra_pred_uv_ho 1
|
|
||||||
global sym(vp8_intra_pred_uv_ho_%1)
|
|
||||||
sym(vp8_intra_pred_uv_ho_%1):
|
|
||||||
push rbp
|
|
||||||
mov rbp, rsp
|
|
||||||
SHADOW_ARGS_TO_STACK 4
|
|
||||||
push rsi
|
|
||||||
push rdi
|
|
||||||
%ifidn %1, ssse3
|
|
||||||
%ifndef GET_GOT_SAVE_ARG
|
|
||||||
push rbx
|
|
||||||
%endif
|
|
||||||
GET_GOT rbx
|
|
||||||
%endif
|
|
||||||
; end prolog
|
|
||||||
|
|
||||||
; read from left and write out
|
|
||||||
%ifidn %1, mmx2
|
|
||||||
mov edx, 4
|
|
||||||
%endif
|
|
||||||
mov rsi, arg(2) ;src;
|
|
||||||
movsxd rax, dword ptr arg(3) ;src_stride;
|
|
||||||
mov rdi, arg(0) ;dst;
|
|
||||||
movsxd rcx, dword ptr arg(1) ;dst_stride
|
|
||||||
%ifidn %1, ssse3
|
|
||||||
lea rdx, [rcx*3]
|
|
||||||
movdqa xmm2, [GLOBAL(dc_00001111)]
|
|
||||||
lea rbx, [rax*3]
|
|
||||||
%endif
|
|
||||||
dec rsi
|
|
||||||
%ifidn %1, mmx2
|
|
||||||
vp8_intra_pred_uv_ho_%1_loop:
|
|
||||||
movd mm0, [rsi]
|
|
||||||
movd mm1, [rsi+rax]
|
|
||||||
punpcklbw mm0, mm0
|
|
||||||
punpcklbw mm1, mm1
|
|
||||||
pshufw mm0, mm0, 0x0
|
|
||||||
pshufw mm1, mm1, 0x0
|
|
||||||
movq [rdi ], mm0
|
|
||||||
movq [rdi+rcx], mm1
|
|
||||||
lea rsi, [rsi+rax*2]
|
|
||||||
lea rdi, [rdi+rcx*2]
|
|
||||||
dec edx
|
|
||||||
jnz vp8_intra_pred_uv_ho_%1_loop
|
|
||||||
%else
|
|
||||||
movd xmm0, [rsi]
|
|
||||||
movd xmm3, [rsi+rax]
|
|
||||||
movd xmm1, [rsi+rax*2]
|
|
||||||
movd xmm4, [rsi+rbx]
|
|
||||||
punpcklbw xmm0, xmm3
|
|
||||||
punpcklbw xmm1, xmm4
|
|
||||||
pshufb xmm0, xmm2
|
|
||||||
pshufb xmm1, xmm2
|
|
||||||
movq [rdi ], xmm0
|
|
||||||
movhps [rdi+rcx], xmm0
|
|
||||||
movq [rdi+rcx*2], xmm1
|
|
||||||
movhps [rdi+rdx], xmm1
|
|
||||||
lea rsi, [rsi+rax*4]
|
|
||||||
lea rdi, [rdi+rcx*4]
|
|
||||||
movd xmm0, [rsi]
|
|
||||||
movd xmm3, [rsi+rax]
|
|
||||||
movd xmm1, [rsi+rax*2]
|
|
||||||
movd xmm4, [rsi+rbx]
|
|
||||||
punpcklbw xmm0, xmm3
|
|
||||||
punpcklbw xmm1, xmm4
|
|
||||||
pshufb xmm0, xmm2
|
|
||||||
pshufb xmm1, xmm2
|
|
||||||
movq [rdi ], xmm0
|
|
||||||
movhps [rdi+rcx], xmm0
|
|
||||||
movq [rdi+rcx*2], xmm1
|
|
||||||
movhps [rdi+rdx], xmm1
|
|
||||||
%endif
|
|
||||||
|
|
||||||
; begin epilog
|
|
||||||
%ifidn %1, ssse3
|
|
||||||
RESTORE_GOT
|
|
||||||
%ifndef GET_GOT_SAVE_ARG
|
|
||||||
pop rbx
|
|
||||||
%endif
|
|
||||||
%endif
|
|
||||||
pop rdi
|
|
||||||
pop rsi
|
|
||||||
UNSHADOW_ARGS
|
|
||||||
pop rbp
|
|
||||||
ret
|
|
||||||
%endmacro
|
|
||||||
|
|
||||||
vp8_intra_pred_uv_ho mmx2
|
|
||||||
vp8_intra_pred_uv_ho ssse3
|
|
||||||
|
|
||||||
SECTION_RODATA
|
|
||||||
dc_128:
|
|
||||||
times 8 db 128
|
|
||||||
dc_4:
|
|
||||||
times 4 dw 4
|
|
||||||
align 16
|
|
||||||
dc_1024:
|
|
||||||
times 8 dw 0x400
|
|
||||||
align 16
|
|
||||||
dc_00001111:
|
|
||||||
times 8 db 0
|
|
||||||
times 8 db 1
|
|
||||||
|
|||||||
@@ -1,96 +0,0 @@
|
|||||||
/*
|
|
||||||
* Copyright (c) 2010 The WebM project authors. All Rights Reserved.
|
|
||||||
*
|
|
||||||
* Use of this source code is governed by a BSD-style license
|
|
||||||
* that can be found in the LICENSE file in the root of the source
|
|
||||||
* tree. An additional intellectual property rights grant can be found
|
|
||||||
* in the file PATENTS. All contributing project authors may
|
|
||||||
* be found in the AUTHORS file in the root of the source tree.
|
|
||||||
*/
|
|
||||||
|
|
||||||
#include "vpx_ports/config.h"
|
|
||||||
#include "vp8/common/recon.h"
|
|
||||||
#include "recon_x86.h"
|
|
||||||
#include "vpx_mem/vpx_mem.h"
|
|
||||||
|
|
||||||
#define build_intra_predictors_mbuv_prototype(sym) \
|
|
||||||
void sym(unsigned char *dst, int dst_stride, \
|
|
||||||
const unsigned char *src, int src_stride)
|
|
||||||
typedef build_intra_predictors_mbuv_prototype((*build_intra_predictors_mbuv_fn_t));
|
|
||||||
|
|
||||||
extern build_intra_predictors_mbuv_prototype(vp8_intra_pred_uv_dc_mmx2);
|
|
||||||
extern build_intra_predictors_mbuv_prototype(vp8_intra_pred_uv_dctop_mmx2);
|
|
||||||
extern build_intra_predictors_mbuv_prototype(vp8_intra_pred_uv_dcleft_mmx2);
|
|
||||||
extern build_intra_predictors_mbuv_prototype(vp8_intra_pred_uv_dc128_mmx);
|
|
||||||
extern build_intra_predictors_mbuv_prototype(vp8_intra_pred_uv_ho_mmx2);
|
|
||||||
extern build_intra_predictors_mbuv_prototype(vp8_intra_pred_uv_ho_ssse3);
|
|
||||||
extern build_intra_predictors_mbuv_prototype(vp8_intra_pred_uv_ve_mmx);
|
|
||||||
extern build_intra_predictors_mbuv_prototype(vp8_intra_pred_uv_tm_sse2);
|
|
||||||
extern build_intra_predictors_mbuv_prototype(vp8_intra_pred_uv_tm_ssse3);
|
|
||||||
|
|
||||||
static void vp8_build_intra_predictors_mbuv_x86(MACROBLOCKD *x,
|
|
||||||
unsigned char *dst_u,
|
|
||||||
unsigned char *dst_v,
|
|
||||||
int dst_stride,
|
|
||||||
build_intra_predictors_mbuv_fn_t tm_func,
|
|
||||||
build_intra_predictors_mbuv_fn_t ho_func)
|
|
||||||
{
|
|
||||||
int mode = x->mode_info_context->mbmi.uv_mode;
|
|
||||||
build_intra_predictors_mbuv_fn_t fn;
|
|
||||||
int src_stride = x->dst.uv_stride;
|
|
||||||
|
|
||||||
switch (mode) {
|
|
||||||
case V_PRED: fn = vp8_intra_pred_uv_ve_mmx; break;
|
|
||||||
case H_PRED: fn = ho_func; break;
|
|
||||||
case TM_PRED: fn = tm_func; break;
|
|
||||||
case DC_PRED:
|
|
||||||
if (x->up_available) {
|
|
||||||
if (x->left_available) {
|
|
||||||
fn = vp8_intra_pred_uv_dc_mmx2; break;
|
|
||||||
} else {
|
|
||||||
fn = vp8_intra_pred_uv_dctop_mmx2; break;
|
|
||||||
}
|
|
||||||
} else if (x->left_available) {
|
|
||||||
fn = vp8_intra_pred_uv_dcleft_mmx2; break;
|
|
||||||
} else {
|
|
||||||
fn = vp8_intra_pred_uv_dc128_mmx; break;
|
|
||||||
}
|
|
||||||
break;
|
|
||||||
default: return;
|
|
||||||
}
|
|
||||||
|
|
||||||
fn(dst_u, dst_stride, x->dst.u_buffer, src_stride);
|
|
||||||
fn(dst_v, dst_stride, x->dst.v_buffer, src_stride);
|
|
||||||
}
|
|
||||||
|
|
||||||
void vp8_build_intra_predictors_mbuv_sse2(MACROBLOCKD *x)
|
|
||||||
{
|
|
||||||
vp8_build_intra_predictors_mbuv_x86(x, &x->predictor[256],
|
|
||||||
&x->predictor[320], 8,
|
|
||||||
vp8_intra_pred_uv_tm_sse2,
|
|
||||||
vp8_intra_pred_uv_ho_mmx2);
|
|
||||||
}
|
|
||||||
|
|
||||||
void vp8_build_intra_predictors_mbuv_ssse3(MACROBLOCKD *x)
|
|
||||||
{
|
|
||||||
vp8_build_intra_predictors_mbuv_x86(x, &x->predictor[256],
|
|
||||||
&x->predictor[320], 8,
|
|
||||||
vp8_intra_pred_uv_tm_ssse3,
|
|
||||||
vp8_intra_pred_uv_ho_ssse3);
|
|
||||||
}
|
|
||||||
|
|
||||||
void vp8_build_intra_predictors_mbuv_s_sse2(MACROBLOCKD *x)
|
|
||||||
{
|
|
||||||
vp8_build_intra_predictors_mbuv_x86(x, x->dst.u_buffer,
|
|
||||||
x->dst.v_buffer, x->dst.uv_stride,
|
|
||||||
vp8_intra_pred_uv_tm_sse2,
|
|
||||||
vp8_intra_pred_uv_ho_mmx2);
|
|
||||||
}
|
|
||||||
|
|
||||||
void vp8_build_intra_predictors_mbuv_s_ssse3(MACROBLOCKD *x)
|
|
||||||
{
|
|
||||||
vp8_build_intra_predictors_mbuv_x86(x, x->dst.u_buffer,
|
|
||||||
x->dst.v_buffer, x->dst.uv_stride,
|
|
||||||
vp8_intra_pred_uv_tm_ssse3,
|
|
||||||
vp8_intra_pred_uv_ho_ssse3);
|
|
||||||
}
|
|
||||||
@@ -46,8 +46,6 @@ extern prototype_copy_block(vp8_copy_mem16x16_mmx);
|
|||||||
extern prototype_recon_block(vp8_recon2b_sse2);
|
extern prototype_recon_block(vp8_recon2b_sse2);
|
||||||
extern prototype_recon_block(vp8_recon4b_sse2);
|
extern prototype_recon_block(vp8_recon4b_sse2);
|
||||||
extern prototype_copy_block(vp8_copy_mem16x16_sse2);
|
extern prototype_copy_block(vp8_copy_mem16x16_sse2);
|
||||||
extern prototype_build_intra_predictors(vp8_build_intra_predictors_mbuv_sse2);
|
|
||||||
extern prototype_build_intra_predictors(vp8_build_intra_predictors_mbuv_s_sse2);
|
|
||||||
|
|
||||||
#if !CONFIG_RUNTIME_CPU_DETECT
|
#if !CONFIG_RUNTIME_CPU_DETECT
|
||||||
#undef vp8_recon_recon2
|
#undef vp8_recon_recon2
|
||||||
@@ -59,26 +57,6 @@ extern prototype_build_intra_predictors(vp8_build_intra_predictors_mbuv_s_sse2);
|
|||||||
#undef vp8_recon_copy16x16
|
#undef vp8_recon_copy16x16
|
||||||
#define vp8_recon_copy16x16 vp8_copy_mem16x16_sse2
|
#define vp8_recon_copy16x16 vp8_copy_mem16x16_sse2
|
||||||
|
|
||||||
#undef vp8_recon_build_intra_predictors_mbuv
|
|
||||||
#define vp8_recon_build_intra_predictors_mbuv vp8_build_intra_predictors_mbuv_sse2
|
|
||||||
|
|
||||||
#undef vp8_recon_build_intra_predictors_mbuv_s
|
|
||||||
#define vp8_recon_build_intra_predictors_mbuv_s vp8_build_intra_predictors_mbuv_s_sse2
|
|
||||||
|
|
||||||
#endif
|
|
||||||
#endif
|
|
||||||
|
|
||||||
#if HAVE_SSSE3
|
|
||||||
extern prototype_build_intra_predictors(vp8_build_intra_predictors_mbuv_ssse3);
|
|
||||||
extern prototype_build_intra_predictors(vp8_build_intra_predictors_mbuv_s_ssse3);
|
|
||||||
|
|
||||||
#if !CONFIG_RUNTIME_CPU_DETECT
|
|
||||||
#undef vp8_recon_build_intra_predictors_mbuv
|
|
||||||
#define vp8_recon_build_intra_predictors_mbuv vp8_build_intra_predictors_mbuv_ssse3
|
|
||||||
|
|
||||||
#undef vp8_recon_build_intra_predictors_mbuv_s
|
|
||||||
#define vp8_recon_build_intra_predictors_mbuv_s vp8_build_intra_predictors_mbuv_s_ssse3
|
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
#endif
|
#endif
|
||||||
#endif
|
#endif
|
||||||
|
|||||||
@@ -37,7 +37,7 @@ sym(vp8_filter_block1d8_h6_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 7
|
SHADOW_ARGS_TO_STACK 7
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -157,7 +157,7 @@ sym(vp8_filter_block1d16_h6_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 7
|
SHADOW_ARGS_TO_STACK 7
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -333,7 +333,7 @@ sym(vp8_filter_block1d8_v6_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 8
|
SHADOW_ARGS_TO_STACK 8
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -428,7 +428,7 @@ sym(vp8_filter_block1d16_v6_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 8
|
SHADOW_ARGS_TO_STACK 8
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -538,7 +538,7 @@ sym(vp8_filter_block1d8_h6_only_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -651,7 +651,7 @@ sym(vp8_filter_block1d16_h6_only_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -816,7 +816,7 @@ sym(vp8_filter_block1d8_v6_only_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -908,6 +908,7 @@ sym(vp8_unpack_block1d16_h6_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 5
|
SHADOW_ARGS_TO_STACK 5
|
||||||
|
;SAVE_XMM ;xmm6, xmm7 are not used here.
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -947,6 +948,7 @@ unpack_block1d16_h6_sse2_rowloop:
|
|||||||
pop rdi
|
pop rdi
|
||||||
pop rsi
|
pop rsi
|
||||||
RESTORE_GOT
|
RESTORE_GOT
|
||||||
|
;RESTORE_XMM
|
||||||
UNSHADOW_ARGS
|
UNSHADOW_ARGS
|
||||||
pop rbp
|
pop rbp
|
||||||
ret
|
ret
|
||||||
@@ -967,7 +969,7 @@ sym(vp8_bilinear_predict16x16_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -1236,7 +1238,7 @@ sym(vp8_bilinear_predict8x8_sse2):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
|
|||||||
@@ -39,7 +39,6 @@ sym(vp8_filter_block1d8_h6_ssse3):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -108,7 +107,6 @@ filter_block1d8_h6_rowloop_ssse3:
|
|||||||
pop rdi
|
pop rdi
|
||||||
pop rsi
|
pop rsi
|
||||||
RESTORE_GOT
|
RESTORE_GOT
|
||||||
RESTORE_XMM
|
|
||||||
UNSHADOW_ARGS
|
UNSHADOW_ARGS
|
||||||
pop rbp
|
pop rbp
|
||||||
ret
|
ret
|
||||||
@@ -164,7 +162,6 @@ filter_block1d8_h4_rowloop_ssse3:
|
|||||||
pop rdi
|
pop rdi
|
||||||
pop rsi
|
pop rsi
|
||||||
RESTORE_GOT
|
RESTORE_GOT
|
||||||
RESTORE_XMM
|
|
||||||
UNSHADOW_ARGS
|
UNSHADOW_ARGS
|
||||||
pop rbp
|
pop rbp
|
||||||
ret
|
ret
|
||||||
@@ -182,7 +179,7 @@ sym(vp8_filter_block1d16_h6_ssse3):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -197,6 +194,10 @@ sym(vp8_filter_block1d16_h6_ssse3):
|
|||||||
|
|
||||||
mov rdi, arg(2) ;output_ptr
|
mov rdi, arg(2) ;output_ptr
|
||||||
|
|
||||||
|
;;
|
||||||
|
;; cmp esi, DWORD PTR [rax]
|
||||||
|
;; je vp8_filter_block1d16_h4_ssse3
|
||||||
|
|
||||||
mov rsi, arg(0) ;src_ptr
|
mov rsi, arg(0) ;src_ptr
|
||||||
|
|
||||||
movdqa xmm4, XMMWORD PTR [rax] ;k0_k5
|
movdqa xmm4, XMMWORD PTR [rax] ;k0_k5
|
||||||
@@ -270,7 +271,61 @@ filter_block1d16_h6_rowloop_ssse3:
|
|||||||
pop rdi
|
pop rdi
|
||||||
pop rsi
|
pop rsi
|
||||||
RESTORE_GOT
|
RESTORE_GOT
|
||||||
RESTORE_XMM
|
UNSHADOW_ARGS
|
||||||
|
pop rbp
|
||||||
|
ret
|
||||||
|
|
||||||
|
vp8_filter_block1d16_h4_ssse3:
|
||||||
|
movdqa xmm5, XMMWORD PTR [rax+256] ;k2_k4
|
||||||
|
movdqa xmm6, XMMWORD PTR [rax+128] ;k1_k3
|
||||||
|
|
||||||
|
mov rsi, arg(0) ;src_ptr
|
||||||
|
movsxd rax, dword ptr arg(1) ;src_pixels_per_line
|
||||||
|
movsxd rcx, dword ptr arg(4) ;output_height
|
||||||
|
movsxd rdx, dword ptr arg(3) ;output_pitch
|
||||||
|
|
||||||
|
filter_block1d16_h4_rowloop_ssse3:
|
||||||
|
movdqu xmm1, XMMWORD PTR [rsi - 2]
|
||||||
|
|
||||||
|
movdqa xmm2, xmm1
|
||||||
|
pshufb xmm1, [GLOBAL(shuf2b)]
|
||||||
|
pshufb xmm2, [GLOBAL(shuf3b)]
|
||||||
|
pmaddubsw xmm1, xmm5
|
||||||
|
|
||||||
|
movdqu xmm3, XMMWORD PTR [rsi + 6]
|
||||||
|
|
||||||
|
pmaddubsw xmm2, xmm6
|
||||||
|
movdqa xmm0, xmm3
|
||||||
|
pshufb xmm3, [GLOBAL(shuf3b)]
|
||||||
|
pshufb xmm0, [GLOBAL(shuf2b)]
|
||||||
|
|
||||||
|
paddsw xmm1, [GLOBAL(rd)]
|
||||||
|
paddsw xmm1, xmm2
|
||||||
|
|
||||||
|
pmaddubsw xmm0, xmm5
|
||||||
|
pmaddubsw xmm3, xmm6
|
||||||
|
|
||||||
|
psraw xmm1, 7
|
||||||
|
packuswb xmm1, xmm1
|
||||||
|
lea rsi, [rsi + rax]
|
||||||
|
paddsw xmm3, xmm0
|
||||||
|
paddsw xmm3, [GLOBAL(rd)]
|
||||||
|
psraw xmm3, 7
|
||||||
|
packuswb xmm3, xmm3
|
||||||
|
|
||||||
|
punpcklqdq xmm1, xmm3
|
||||||
|
|
||||||
|
movdqa XMMWORD Ptr [rdi], xmm1
|
||||||
|
|
||||||
|
add rdi, rdx
|
||||||
|
dec rcx
|
||||||
|
jnz filter_block1d16_h4_rowloop_ssse3
|
||||||
|
|
||||||
|
|
||||||
|
; begin epilog
|
||||||
|
pop rdi
|
||||||
|
pop rsi
|
||||||
|
RESTORE_GOT
|
||||||
UNSHADOW_ARGS
|
UNSHADOW_ARGS
|
||||||
pop rbp
|
pop rbp
|
||||||
ret
|
ret
|
||||||
@@ -289,7 +344,6 @@ sym(vp8_filter_block1d4_h6_ssse3):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -397,7 +451,6 @@ filter_block1d4_h4_rowloop_ssse3:
|
|||||||
pop rdi
|
pop rdi
|
||||||
pop rsi
|
pop rsi
|
||||||
RESTORE_GOT
|
RESTORE_GOT
|
||||||
RESTORE_XMM
|
|
||||||
UNSHADOW_ARGS
|
UNSHADOW_ARGS
|
||||||
pop rbp
|
pop rbp
|
||||||
ret
|
ret
|
||||||
@@ -418,7 +471,6 @@ sym(vp8_filter_block1d16_v6_ssse3):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -514,7 +566,6 @@ vp8_filter_block1d16_v6_ssse3_loop:
|
|||||||
pop rdi
|
pop rdi
|
||||||
pop rsi
|
pop rsi
|
||||||
RESTORE_GOT
|
RESTORE_GOT
|
||||||
RESTORE_XMM
|
|
||||||
UNSHADOW_ARGS
|
UNSHADOW_ARGS
|
||||||
pop rbp
|
pop rbp
|
||||||
ret
|
ret
|
||||||
@@ -587,7 +638,6 @@ vp8_filter_block1d16_v4_ssse3_loop:
|
|||||||
pop rdi
|
pop rdi
|
||||||
pop rsi
|
pop rsi
|
||||||
RESTORE_GOT
|
RESTORE_GOT
|
||||||
RESTORE_XMM
|
|
||||||
UNSHADOW_ARGS
|
UNSHADOW_ARGS
|
||||||
pop rbp
|
pop rbp
|
||||||
ret
|
ret
|
||||||
@@ -606,7 +656,6 @@ sym(vp8_filter_block1d8_v6_ssse3):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -679,7 +728,6 @@ vp8_filter_block1d8_v6_ssse3_loop:
|
|||||||
pop rdi
|
pop rdi
|
||||||
pop rsi
|
pop rsi
|
||||||
RESTORE_GOT
|
RESTORE_GOT
|
||||||
RESTORE_XMM
|
|
||||||
UNSHADOW_ARGS
|
UNSHADOW_ARGS
|
||||||
pop rbp
|
pop rbp
|
||||||
ret
|
ret
|
||||||
@@ -728,7 +776,6 @@ vp8_filter_block1d8_v4_ssse3_loop:
|
|||||||
pop rdi
|
pop rdi
|
||||||
pop rsi
|
pop rsi
|
||||||
RESTORE_GOT
|
RESTORE_GOT
|
||||||
RESTORE_XMM
|
|
||||||
UNSHADOW_ARGS
|
UNSHADOW_ARGS
|
||||||
pop rbp
|
pop rbp
|
||||||
ret
|
ret
|
||||||
@@ -885,7 +932,7 @@ sym(vp8_bilinear_predict16x16_ssse3):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
@@ -1148,7 +1195,7 @@ sym(vp8_bilinear_predict8x8_ssse3):
|
|||||||
push rbp
|
push rbp
|
||||||
mov rbp, rsp
|
mov rbp, rsp
|
||||||
SHADOW_ARGS_TO_STACK 6
|
SHADOW_ARGS_TO_STACK 6
|
||||||
SAVE_XMM 7
|
SAVE_XMM
|
||||||
GET_GOT rbx
|
GET_GOT rbx
|
||||||
push rsi
|
push rsi
|
||||||
push rdi
|
push rdi
|
||||||
|
|||||||
@@ -9,7 +9,7 @@
|
|||||||
*/
|
*/
|
||||||
|
|
||||||
|
|
||||||
#include "vpx_config.h"
|
#include "vpx_ports/config.h"
|
||||||
#include "vpx_ports/x86.h"
|
#include "vpx_ports/x86.h"
|
||||||
#include "vp8/common/g_common.h"
|
#include "vp8/common/g_common.h"
|
||||||
#include "vp8/common/subpixel.h"
|
#include "vp8/common/subpixel.h"
|
||||||
@@ -24,6 +24,10 @@ void vp8_arch_x86_common_init(VP8_COMMON *ctx)
|
|||||||
#if CONFIG_RUNTIME_CPU_DETECT
|
#if CONFIG_RUNTIME_CPU_DETECT
|
||||||
VP8_COMMON_RTCD *rtcd = &ctx->rtcd;
|
VP8_COMMON_RTCD *rtcd = &ctx->rtcd;
|
||||||
int flags = x86_simd_caps();
|
int flags = x86_simd_caps();
|
||||||
|
int mmx_enabled = flags & HAS_MMX;
|
||||||
|
int xmm_enabled = flags & HAS_SSE;
|
||||||
|
int wmt_enabled = flags & HAS_SSE2;
|
||||||
|
int SSSE3Enabled = flags & HAS_SSSE3;
|
||||||
|
|
||||||
/* Note:
|
/* Note:
|
||||||
*
|
*
|
||||||
@@ -35,7 +39,7 @@ void vp8_arch_x86_common_init(VP8_COMMON *ctx)
|
|||||||
/* Override default functions with fastest ones for this CPU. */
|
/* Override default functions with fastest ones for this CPU. */
|
||||||
#if HAVE_MMX
|
#if HAVE_MMX
|
||||||
|
|
||||||
if (flags & HAS_MMX)
|
if (mmx_enabled)
|
||||||
{
|
{
|
||||||
rtcd->idct.idct1 = vp8_short_idct4x4llm_1_mmx;
|
rtcd->idct.idct1 = vp8_short_idct4x4llm_1_mmx;
|
||||||
rtcd->idct.idct16 = vp8_short_idct4x4llm_mmx;
|
rtcd->idct.idct16 = vp8_short_idct4x4llm_mmx;
|
||||||
@@ -63,9 +67,9 @@ void vp8_arch_x86_common_init(VP8_COMMON *ctx)
|
|||||||
rtcd->loopfilter.normal_b_v = vp8_loop_filter_bv_mmx;
|
rtcd->loopfilter.normal_b_v = vp8_loop_filter_bv_mmx;
|
||||||
rtcd->loopfilter.normal_mb_h = vp8_loop_filter_mbh_mmx;
|
rtcd->loopfilter.normal_mb_h = vp8_loop_filter_mbh_mmx;
|
||||||
rtcd->loopfilter.normal_b_h = vp8_loop_filter_bh_mmx;
|
rtcd->loopfilter.normal_b_h = vp8_loop_filter_bh_mmx;
|
||||||
rtcd->loopfilter.simple_mb_v = vp8_loop_filter_simple_vertical_edge_mmx;
|
rtcd->loopfilter.simple_mb_v = vp8_loop_filter_mbvs_mmx;
|
||||||
rtcd->loopfilter.simple_b_v = vp8_loop_filter_bvs_mmx;
|
rtcd->loopfilter.simple_b_v = vp8_loop_filter_bvs_mmx;
|
||||||
rtcd->loopfilter.simple_mb_h = vp8_loop_filter_simple_horizontal_edge_mmx;
|
rtcd->loopfilter.simple_mb_h = vp8_loop_filter_mbhs_mmx;
|
||||||
rtcd->loopfilter.simple_b_h = vp8_loop_filter_bhs_mmx;
|
rtcd->loopfilter.simple_b_h = vp8_loop_filter_bhs_mmx;
|
||||||
|
|
||||||
#if CONFIG_POSTPROC
|
#if CONFIG_POSTPROC
|
||||||
@@ -79,15 +83,11 @@ void vp8_arch_x86_common_init(VP8_COMMON *ctx)
|
|||||||
#endif
|
#endif
|
||||||
#if HAVE_SSE2
|
#if HAVE_SSE2
|
||||||
|
|
||||||
if (flags & HAS_SSE2)
|
if (wmt_enabled)
|
||||||
{
|
{
|
||||||
rtcd->recon.recon2 = vp8_recon2b_sse2;
|
rtcd->recon.recon2 = vp8_recon2b_sse2;
|
||||||
rtcd->recon.recon4 = vp8_recon4b_sse2;
|
rtcd->recon.recon4 = vp8_recon4b_sse2;
|
||||||
rtcd->recon.copy16x16 = vp8_copy_mem16x16_sse2;
|
rtcd->recon.copy16x16 = vp8_copy_mem16x16_sse2;
|
||||||
rtcd->recon.build_intra_predictors_mbuv =
|
|
||||||
vp8_build_intra_predictors_mbuv_sse2;
|
|
||||||
rtcd->recon.build_intra_predictors_mbuv_s =
|
|
||||||
vp8_build_intra_predictors_mbuv_s_sse2;
|
|
||||||
|
|
||||||
rtcd->idct.iwalsh16 = vp8_short_inv_walsh4x4_sse2;
|
rtcd->idct.iwalsh16 = vp8_short_inv_walsh4x4_sse2;
|
||||||
|
|
||||||
@@ -101,9 +101,9 @@ void vp8_arch_x86_common_init(VP8_COMMON *ctx)
|
|||||||
rtcd->loopfilter.normal_b_v = vp8_loop_filter_bv_sse2;
|
rtcd->loopfilter.normal_b_v = vp8_loop_filter_bv_sse2;
|
||||||
rtcd->loopfilter.normal_mb_h = vp8_loop_filter_mbh_sse2;
|
rtcd->loopfilter.normal_mb_h = vp8_loop_filter_mbh_sse2;
|
||||||
rtcd->loopfilter.normal_b_h = vp8_loop_filter_bh_sse2;
|
rtcd->loopfilter.normal_b_h = vp8_loop_filter_bh_sse2;
|
||||||
rtcd->loopfilter.simple_mb_v = vp8_loop_filter_simple_vertical_edge_sse2;
|
rtcd->loopfilter.simple_mb_v = vp8_loop_filter_mbvs_sse2;
|
||||||
rtcd->loopfilter.simple_b_v = vp8_loop_filter_bvs_sse2;
|
rtcd->loopfilter.simple_b_v = vp8_loop_filter_bvs_sse2;
|
||||||
rtcd->loopfilter.simple_mb_h = vp8_loop_filter_simple_horizontal_edge_sse2;
|
rtcd->loopfilter.simple_mb_h = vp8_loop_filter_mbhs_sse2;
|
||||||
rtcd->loopfilter.simple_b_h = vp8_loop_filter_bhs_sse2;
|
rtcd->loopfilter.simple_b_h = vp8_loop_filter_bhs_sse2;
|
||||||
|
|
||||||
#if CONFIG_POSTPROC
|
#if CONFIG_POSTPROC
|
||||||
@@ -118,7 +118,7 @@ void vp8_arch_x86_common_init(VP8_COMMON *ctx)
|
|||||||
|
|
||||||
#if HAVE_SSSE3
|
#if HAVE_SSSE3
|
||||||
|
|
||||||
if (flags & HAS_SSSE3)
|
if (SSSE3Enabled)
|
||||||
{
|
{
|
||||||
rtcd->subpix.sixtap16x16 = vp8_sixtap_predict16x16_ssse3;
|
rtcd->subpix.sixtap16x16 = vp8_sixtap_predict16x16_ssse3;
|
||||||
rtcd->subpix.sixtap8x8 = vp8_sixtap_predict8x8_ssse3;
|
rtcd->subpix.sixtap8x8 = vp8_sixtap_predict8x8_ssse3;
|
||||||
@@ -126,11 +126,6 @@ void vp8_arch_x86_common_init(VP8_COMMON *ctx)
|
|||||||
rtcd->subpix.sixtap4x4 = vp8_sixtap_predict4x4_ssse3;
|
rtcd->subpix.sixtap4x4 = vp8_sixtap_predict4x4_ssse3;
|
||||||
rtcd->subpix.bilinear16x16 = vp8_bilinear_predict16x16_ssse3;
|
rtcd->subpix.bilinear16x16 = vp8_bilinear_predict16x16_ssse3;
|
||||||
rtcd->subpix.bilinear8x8 = vp8_bilinear_predict8x8_ssse3;
|
rtcd->subpix.bilinear8x8 = vp8_bilinear_predict8x8_ssse3;
|
||||||
|
|
||||||
rtcd->recon.build_intra_predictors_mbuv =
|
|
||||||
vp8_build_intra_predictors_mbuv_ssse3;
|
|
||||||
rtcd->recon.build_intra_predictors_mbuv_s =
|
|
||||||
vp8_build_intra_predictors_mbuv_s_ssse3;
|
|
||||||
}
|
}
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@
|
|||||||
#include "vpx_ports/arm.h"
|
#include "vpx_ports/arm.h"
|
||||||
#include "vp8/common/blockd.h"
|
#include "vp8/common/blockd.h"
|
||||||
#include "vp8/common/pragmas.h"
|
#include "vp8/common/pragmas.h"
|
||||||
|
#include "vp8/common/postproc.h"
|
||||||
#include "vp8/decoder/dequantize.h"
|
#include "vp8/decoder/dequantize.h"
|
||||||
#include "vp8/decoder/onyxd_int.h"
|
#include "vp8/decoder/onyxd_int.h"
|
||||||
|
|
||||||
@@ -20,15 +21,12 @@ void vp8_arch_arm_decode_init(VP8D_COMP *pbi)
|
|||||||
{
|
{
|
||||||
#if CONFIG_RUNTIME_CPU_DETECT
|
#if CONFIG_RUNTIME_CPU_DETECT
|
||||||
int flags = pbi->common.rtcd.flags;
|
int flags = pbi->common.rtcd.flags;
|
||||||
|
int has_edsp = flags & HAS_EDSP;
|
||||||
#if HAVE_ARMV5TE
|
int has_media = flags & HAS_MEDIA;
|
||||||
if (flags & HAS_EDSP)
|
int has_neon = flags & HAS_NEON;
|
||||||
{
|
|
||||||
}
|
|
||||||
#endif
|
|
||||||
|
|
||||||
#if HAVE_ARMV6
|
#if HAVE_ARMV6
|
||||||
if (flags & HAS_MEDIA)
|
if (has_media)
|
||||||
{
|
{
|
||||||
pbi->dequant.block = vp8_dequantize_b_v6;
|
pbi->dequant.block = vp8_dequantize_b_v6;
|
||||||
pbi->dequant.idct_add = vp8_dequant_idct_add_v6;
|
pbi->dequant.idct_add = vp8_dequant_idct_add_v6;
|
||||||
@@ -40,7 +38,7 @@ void vp8_arch_arm_decode_init(VP8D_COMP *pbi)
|
|||||||
#endif
|
#endif
|
||||||
|
|
||||||
#if HAVE_ARMV7
|
#if HAVE_ARMV7
|
||||||
if (flags & HAS_NEON)
|
if (has_neon)
|
||||||
{
|
{
|
||||||
pbi->dequant.block = vp8_dequantize_b_neon;
|
pbi->dequant.block = vp8_dequantize_b_neon;
|
||||||
pbi->dequant.idct_add = vp8_dequant_idct_add_neon;
|
pbi->dequant.idct_add = vp8_dequant_idct_add_neon;
|
||||||
|
|||||||
52
vp8/decoder/arm/armv5/dequantize_v5.asm
Normal file
@@ -0,0 +1,52 @@
|
|||||||
|
;
|
||||||
|
; Copyright (c) 2010 The WebM project authors. All Rights Reserved.
|
||||||
|
;
|
||||||
|
; Use of this source code is governed by a BSD-style license
|
||||||
|
; that can be found in the LICENSE file in the root of the source
|
||||||
|
; tree. An additional intellectual property rights grant can be found
|
||||||
|
; in the file PATENTS. All contributing project authors may
|
||||||
|
; be found in the AUTHORS file in the root of the source tree.
|
||||||
|
;
|
||||||
|
|
||||||
|
|
||||||
|
EXPORT |vp8_dequantize_b_armv5|
|
||||||
|
|
||||||
|
AREA |.text|, CODE, READONLY ; name this block of code
|
||||||
|
|
||||||
|
q RN r0
|
||||||
|
dqc RN r1
|
||||||
|
cnt RN r2
|
||||||
|
|
||||||
|
;void dequantize_b_armv5(short *Q, short *DQC)
|
||||||
|
|vp8_dequantize_b_armv5| PROC
|
||||||
|
stmdb sp!, {r4, lr}
|
||||||
|
ldr r3, [q]
|
||||||
|
ldr r4, [dqc], #8
|
||||||
|
|
||||||
|
mov cnt, #4
|
||||||
|
dequant_loop
|
||||||
|
smulbb lr, r3, r4
|
||||||
|
smultt r12, r3, r4
|
||||||
|
|
||||||
|
ldr r3, [q, #4]
|
||||||
|
ldr r4, [dqc, #-4]
|
||||||
|
|
||||||
|
strh lr, [q], #2
|
||||||
|
strh r12, [q], #2
|
||||||
|
|
||||||
|
smulbb lr, r3, r4
|
||||||
|
smultt r12, r3, r4
|
||||||
|
|
||||||
|
subs cnt, cnt, #1
|
||||||
|
ldrne r3, [q, #4]
|
||||||
|
ldrne r4, [dqc], #8
|
||||||
|
|
||||||
|
strh lr, [q], #2
|
||||||
|
strh r12, [q], #2
|
||||||
|
|
||||||
|
bne dequant_loop
|
||||||
|
|
||||||
|
ldmia sp!, {r4, pc}
|
||||||
|
ENDP ;|vp8_dequantize_b_arm|
|
||||||
|
|
||||||
|
END
|
||||||
@@ -26,6 +26,7 @@ extern void vp8_dequantize_b_loop_v6(short *Q, short *DQC, short *DQ);
|
|||||||
|
|
||||||
void vp8_dequantize_b_neon(BLOCKD *d)
|
void vp8_dequantize_b_neon(BLOCKD *d)
|
||||||
{
|
{
|
||||||
|
int i;
|
||||||
short *DQ = d->dqcoeff;
|
short *DQ = d->dqcoeff;
|
||||||
short *Q = d->qcoeff;
|
short *Q = d->qcoeff;
|
||||||
short *DQC = d->dequant;
|
short *DQC = d->dequant;
|
||||||
@@ -37,6 +38,7 @@ void vp8_dequantize_b_neon(BLOCKD *d)
|
|||||||
#if HAVE_ARMV6
|
#if HAVE_ARMV6
|
||||||
void vp8_dequantize_b_v6(BLOCKD *d)
|
void vp8_dequantize_b_v6(BLOCKD *d)
|
||||||
{
|
{
|
||||||
|
int i;
|
||||||
short *DQ = d->dqcoeff;
|
short *DQ = d->dqcoeff;
|
||||||
short *Q = d->qcoeff;
|
short *Q = d->qcoeff;
|
||||||
short *DQC = d->dequant;
|
short *DQC = d->dequant;
|
||||||
|
|||||||
@@ -35,7 +35,7 @@
|
|||||||
|
|
||||||
ldr r1, [sp, #4] ; stride
|
ldr r1, [sp, #4] ; stride
|
||||||
|
|
||||||
adr r12, cospi8sqrt2minus1 ; pointer to the first constant
|
ldr r12, _CONSTANTS_
|
||||||
|
|
||||||
vmul.i16 q1, q3, q5 ;input for short_idct4x4llm_neon
|
vmul.i16 q1, q3, q5 ;input for short_idct4x4llm_neon
|
||||||
vmul.i16 q2, q4, q6
|
vmul.i16 q2, q4, q6
|
||||||
@@ -123,6 +123,7 @@
|
|||||||
ENDP ; |vp8_dequant_idct_add_neon|
|
ENDP ; |vp8_dequant_idct_add_neon|
|
||||||
|
|
||||||
; Constant Pool
|
; Constant Pool
|
||||||
|
_CONSTANTS_ DCD cospi8sqrt2minus1
|
||||||
cospi8sqrt2minus1 DCD 0x4e7b4e7b
|
cospi8sqrt2minus1 DCD 0x4e7b4e7b
|
||||||
sinpi8sqrt2 DCD 0x8a8c8a8c
|
sinpi8sqrt2 DCD 0x8a8c8a8c
|
||||||
|
|
||||||
|
|||||||
@@ -41,7 +41,7 @@
|
|||||||
ldr r1, [sp, #4]
|
ldr r1, [sp, #4]
|
||||||
vld1.32 {d31[1]}, [r12]
|
vld1.32 {d31[1]}, [r12]
|
||||||
|
|
||||||
adr r2, cospi8sqrt2minus1 ; pointer to the first constant
|
ldr r2, _CONSTANTS_
|
||||||
|
|
||||||
ldrh r12, [r1], #2 ; lo *dc
|
ldrh r12, [r1], #2 ; lo *dc
|
||||||
ldrh r1, [r1] ; hi *dc
|
ldrh r1, [r1] ; hi *dc
|
||||||
@@ -198,6 +198,7 @@
|
|||||||
ENDP ; |idct_dequant_dc_full_2x_neon|
|
ENDP ; |idct_dequant_dc_full_2x_neon|
|
||||||
|
|
||||||
; Constant Pool
|
; Constant Pool
|
||||||
|
_CONSTANTS_ DCD cospi8sqrt2minus1
|
||||||
cospi8sqrt2minus1 DCD 0x4e7b
|
cospi8sqrt2minus1 DCD 0x4e7b
|
||||||
; because the lowest bit in 0x8a8c is 0, we can pre-shift this
|
; because the lowest bit in 0x8a8c is 0, we can pre-shift this
|
||||||
sinpi8sqrt2 DCD 0x4546
|
sinpi8sqrt2 DCD 0x4546
|
||||||
|
|||||||
@@ -40,7 +40,7 @@
|
|||||||
vld1.32 {d31[0]}, [r2]
|
vld1.32 {d31[0]}, [r2]
|
||||||
vld1.32 {d31[1]}, [r12]
|
vld1.32 {d31[1]}, [r12]
|
||||||
|
|
||||||
adr r2, cospi8sqrt2minus1 ; pointer to the first constant
|
ldr r2, _CONSTANTS_
|
||||||
|
|
||||||
; dequant: q[i] = q[i] * dq[i]
|
; dequant: q[i] = q[i] * dq[i]
|
||||||
vmul.i16 q2, q2, q0
|
vmul.i16 q2, q2, q0
|
||||||
@@ -190,6 +190,7 @@
|
|||||||
ENDP ; |idct_dequant_full_2x_neon|
|
ENDP ; |idct_dequant_full_2x_neon|
|
||||||
|
|
||||||
; Constant Pool
|
; Constant Pool
|
||||||
|
_CONSTANTS_ DCD cospi8sqrt2minus1
|
||||||
cospi8sqrt2minus1 DCD 0x4e7b
|
cospi8sqrt2minus1 DCD 0x4e7b
|
||||||
; because the lowest bit in 0x8a8c is 0, we can pre-shift this
|
; because the lowest bit in 0x8a8c is 0, we can pre-shift this
|
||||||
sinpi8sqrt2 DCD 0x4546
|
sinpi8sqrt2 DCD 0x4546
|
||||||
|
|||||||
@@ -9,14 +9,26 @@
|
|||||||
*/
|
*/
|
||||||
|
|
||||||
|
|
||||||
#include "vpx_ports/asm_offsets.h"
|
#include "vpx_ports/config.h"
|
||||||
|
#include <stddef.h>
|
||||||
|
|
||||||
#include "onyxd_int.h"
|
#include "onyxd_int.h"
|
||||||
|
|
||||||
BEGIN
|
#define DEFINE(sym, val) int sym = val;
|
||||||
|
|
||||||
|
/*
|
||||||
|
#define BLANK() asm volatile("\n->" : : )
|
||||||
|
*/
|
||||||
|
|
||||||
|
/*
|
||||||
|
* int main(void)
|
||||||
|
* {
|
||||||
|
*/
|
||||||
|
|
||||||
DEFINE(detok_scan, offsetof(DETOK, scan));
|
DEFINE(detok_scan, offsetof(DETOK, scan));
|
||||||
DEFINE(detok_ptr_block2leftabove, offsetof(DETOK, ptr_block2leftabove));
|
DEFINE(detok_ptr_block2leftabove, offsetof(DETOK, ptr_block2leftabove));
|
||||||
DEFINE(detok_coef_tree_ptr, offsetof(DETOK, vp8_coef_tree_ptr));
|
DEFINE(detok_coef_tree_ptr, offsetof(DETOK, vp8_coef_tree_ptr));
|
||||||
|
DEFINE(detok_teb_base_ptr, offsetof(DETOK, teb_base_ptr));
|
||||||
DEFINE(detok_norm_ptr, offsetof(DETOK, norm_ptr));
|
DEFINE(detok_norm_ptr, offsetof(DETOK, norm_ptr));
|
||||||
DEFINE(detok_ptr_coef_bands_x, offsetof(DETOK, ptr_coef_bands_x));
|
DEFINE(detok_ptr_coef_bands_x, offsetof(DETOK, ptr_coef_bands_x));
|
||||||
|
|
||||||
@@ -34,7 +46,12 @@ DEFINE(bool_decoder_value, offsetof(BOOL_DECODER, value));
|
|||||||
DEFINE(bool_decoder_count, offsetof(BOOL_DECODER, count));
|
DEFINE(bool_decoder_count, offsetof(BOOL_DECODER, count));
|
||||||
DEFINE(bool_decoder_range, offsetof(BOOL_DECODER, range));
|
DEFINE(bool_decoder_range, offsetof(BOOL_DECODER, range));
|
||||||
|
|
||||||
END
|
DEFINE(tokenextrabits_min_val, offsetof(TOKENEXTRABITS, min_val));
|
||||||
|
DEFINE(tokenextrabits_length, offsetof(TOKENEXTRABITS, Length));
|
||||||
|
|
||||||
/* add asserts for any offset that is not supported by assembly code */
|
//add asserts for any offset that is not supported by assembly code
|
||||||
/* add asserts for any size that is not supported by assembly code */
|
//add asserts for any size that is not supported by assembly code
|
||||||
|
/*
|
||||||
|
* return 0;
|
||||||
|
* }
|
||||||
|
*/
|
||||||
|
|||||||
@@ -13,6 +13,19 @@
|
|||||||
#include "vpx_ports/mem.h"
|
#include "vpx_ports/mem.h"
|
||||||
#include "vpx_mem/vpx_mem.h"
|
#include "vpx_mem/vpx_mem.h"
|
||||||
|
|
||||||
|
DECLARE_ALIGNED(16, const unsigned char, vp8dx_bitreader_norm[256]) =
|
||||||
|
{
|
||||||
|
0, 7, 6, 6, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
|
||||||
|
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
|
||||||
|
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
|
||||||
|
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
|
||||||
|
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
|
||||||
|
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
|
||||||
|
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
|
||||||
|
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
|
||||||
|
};
|
||||||
|
|
||||||
|
|
||||||
int vp8dx_start_decode(BOOL_DECODER *br,
|
int vp8dx_start_decode(BOOL_DECODER *br,
|
||||||
const unsigned char *source,
|
const unsigned char *source,
|
||||||
unsigned int source_sz)
|
unsigned int source_sz)
|
||||||
|
|||||||
@@ -34,7 +34,7 @@ typedef struct
|
|||||||
unsigned int range;
|
unsigned int range;
|
||||||
} BOOL_DECODER;
|
} BOOL_DECODER;
|
||||||
|
|
||||||
DECLARE_ALIGNED(16, extern const unsigned char, vp8_norm[256]);
|
DECLARE_ALIGNED(16, extern const unsigned char, vp8dx_bitreader_norm[256]);
|
||||||
|
|
||||||
int vp8dx_start_decode(BOOL_DECODER *br,
|
int vp8dx_start_decode(BOOL_DECODER *br,
|
||||||
const unsigned char *source,
|
const unsigned char *source,
|
||||||
@@ -51,26 +51,19 @@ void vp8dx_bool_decoder_fill(BOOL_DECODER *br);
|
|||||||
#define VP8DX_BOOL_DECODER_FILL(_count,_value,_bufptr,_bufend) \
|
#define VP8DX_BOOL_DECODER_FILL(_count,_value,_bufptr,_bufend) \
|
||||||
do \
|
do \
|
||||||
{ \
|
{ \
|
||||||
int shift = VP8_BD_VALUE_SIZE - 8 - ((_count) + 8); \
|
int shift; \
|
||||||
int loop_end, x; \
|
for(shift = VP8_BD_VALUE_SIZE - 8 - ((_count) + 8); shift >= 0; ) \
|
||||||
size_t bits_left = ((_bufend)-(_bufptr))*CHAR_BIT; \
|
|
||||||
\
|
|
||||||
x = shift + CHAR_BIT - bits_left; \
|
|
||||||
loop_end = 0; \
|
|
||||||
if(x >= 0) \
|
|
||||||
{ \
|
{ \
|
||||||
(_count) += VP8_LOTS_OF_BITS; \
|
if((_bufptr) >= (_bufend)) { \
|
||||||
loop_end = x; \
|
(_count) = VP8_LOTS_OF_BITS; \
|
||||||
if(!bits_left) break; \
|
break; \
|
||||||
} \
|
} \
|
||||||
while(shift >= loop_end) \
|
(_count) += 8; \
|
||||||
{ \
|
|
||||||
(_count) += CHAR_BIT; \
|
|
||||||
(_value) |= (VP8_BD_VALUE)*(_bufptr)++ << shift; \
|
(_value) |= (VP8_BD_VALUE)*(_bufptr)++ << shift; \
|
||||||
shift -= CHAR_BIT; \
|
shift -= 8; \
|
||||||
} \
|
} \
|
||||||
} \
|
} \
|
||||||
while(0) \
|
while(0)
|
||||||
|
|
||||||
|
|
||||||
static int vp8dx_decode_bool(BOOL_DECODER *br, int probability) {
|
static int vp8dx_decode_bool(BOOL_DECODER *br, int probability) {
|
||||||
@@ -81,14 +74,11 @@ static int vp8dx_decode_bool(BOOL_DECODER *br, int probability) {
|
|||||||
int count;
|
int count;
|
||||||
unsigned int range;
|
unsigned int range;
|
||||||
|
|
||||||
split = 1 + (((br->range - 1) * probability) >> 8);
|
|
||||||
|
|
||||||
if(br->count < 0)
|
|
||||||
vp8dx_bool_decoder_fill(br);
|
|
||||||
|
|
||||||
value = br->value;
|
value = br->value;
|
||||||
count = br->count;
|
count = br->count;
|
||||||
|
range = br->range;
|
||||||
|
|
||||||
|
split = 1 + (((range - 1) * probability) >> 8);
|
||||||
bigsplit = (VP8_BD_VALUE)split << (VP8_BD_VALUE_SIZE - 8);
|
bigsplit = (VP8_BD_VALUE)split << (VP8_BD_VALUE_SIZE - 8);
|
||||||
|
|
||||||
range = split;
|
range = split;
|
||||||
@@ -101,7 +91,7 @@ static int vp8dx_decode_bool(BOOL_DECODER *br, int probability) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
{
|
{
|
||||||
register unsigned int shift = vp8_norm[range];
|
register unsigned int shift = vp8dx_bitreader_norm[range];
|
||||||
range <<= shift;
|
range <<= shift;
|
||||||
value <<= shift;
|
value <<= shift;
|
||||||
count -= shift;
|
count -= shift;
|
||||||
@@ -109,7 +99,8 @@ static int vp8dx_decode_bool(BOOL_DECODER *br, int probability) {
|
|||||||
br->value = value;
|
br->value = value;
|
||||||
br->count = count;
|
br->count = count;
|
||||||
br->range = range;
|
br->range = range;
|
||||||
|
if(count < 0)
|
||||||
|
vp8dx_bool_decoder_fill(br);
|
||||||
return bit;
|
return bit;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -128,19 +119,18 @@ static int vp8_decode_value(BOOL_DECODER *br, int bits)
|
|||||||
|
|
||||||
static int vp8dx_bool_error(BOOL_DECODER *br)
|
static int vp8dx_bool_error(BOOL_DECODER *br)
|
||||||
{
|
{
|
||||||
/* Check if we have reached the end of the buffer.
|
/* Check if we have reached the end of the buffer.
|
||||||
*
|
*
|
||||||
* Variable 'count' stores the number of bits in the 'value' buffer, minus
|
* Variable 'count' stores the number of bits in the 'value' buffer,
|
||||||
* 8. The top byte is part of the algorithm, and the remainder is buffered
|
* minus 8. So if count == 8, there are 16 bits available to be read.
|
||||||
* to be shifted into it. So if count == 8, the top 16 bits of 'value' are
|
* Normally, count is filled with 8 and one byte is filled into the
|
||||||
* occupied, 8 for the algorithm and 8 in the buffer.
|
* value buffer. When we reach the end of the buffer, count is instead
|
||||||
*
|
* filled with VP8_LOTS_OF_BITS, 8 of which represent the last 8 real
|
||||||
* When reading a byte from the user's buffer, count is filled with 8 and
|
* bits from the bitstream. So the last bit in the bitstream will be
|
||||||
* one byte is filled into the value buffer. When we reach the end of the
|
* represented by count == VP8_LOTS_OF_BITS - 16.
|
||||||
* data, count is additionally filled with VP8_LOTS_OF_BITS. So when
|
*/
|
||||||
* count == VP8_LOTS_OF_BITS - 1, the user's data has been exhausted.
|
if ((br->count > VP8_BD_VALUE_SIZE)
|
||||||
*/
|
&& (br->count <= VP8_LOTS_OF_BITS - 16))
|
||||||
if ((br->count > VP8_BD_VALUE_SIZE) && (br->count < VP8_LOTS_OF_BITS))
|
|
||||||
{
|
{
|
||||||
/* We have tried to decode bits after the end of
|
/* We have tried to decode bits after the end of
|
||||||
* stream was encountered.
|
* stream was encountered.
|
||||||
|
|||||||
Some files were not shown because too many files have changed in this diff.