Martin Storsjö
23f57adaea
Do full register loads instead of single-lane loads in DeblockLumaEq4H_neon
...
Instead of loading the registers one lane at a time, load full
registers and then transpose them.
This is faster, reducing the runtime for the function from about
506 cycles to 434 cycles (tested on a Cortex A8).
This also avoids an issue which seems like a cpu bug, present
on Sony Xperia T (cpu implementer 0x51 architecture 7 variant 0x1
part 0x04d). On such a device, it seemed like the "vswp q9, q10"
could start executing before the previous
vld4.u8 {d20[x],d21[x],d22[x],d23[x]}, [r3], r1
had finished and written back their result. Changing the
"vswp q9, q10" into "vswp q10, q9", or into separate
"vswp d18, d20; vswp d19, d21" (or the other way around) seemed to
avoid the issue. This happened occasionally (a couple times per
100000 invocations or so).
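
A minimal sketch of why the two loading strategies are interchangeable, assuming a toy 4x4 block rather than the real block size used by DeblockLumaEq4H_neon. The helper names `load_by_lane` and `load_full_then_transpose` are hypothetical and only model the data movement, not the NEON instructions themselves.

```python
def load_by_lane(pixels, stride, rows=4, cols=4):
    """Model of lane-wise loads: each row read fills lane y of every register."""
    regs = [[0] * rows for _ in range(cols)]
    for y in range(rows):
        for x in range(cols):
            # Element x of row y lands in lane y of register x.
            regs[x][y] = pixels[y * stride + x]
    return regs


def load_full_then_transpose(pixels, stride, rows=4, cols=4):
    """Model of full-register loads followed by a transpose."""
    # Each "register" first holds one contiguous row of the picture.
    full = [pixels[y * stride:y * stride + cols] for y in range(rows)]
    # Transposing turns rows into columns, matching the lane-wise layout.
    return [list(col) for col in zip(*full)]


if __name__ == "__main__":
    picture = list(range(32))  # toy picture data, 4 rows with a stride of 8
    print(load_by_lane(picture, 8) == load_full_then_transpose(picture, 8))
```

Both helpers produce the same register contents; the commit above swaps the first pattern for the second because full loads were measured to be cheaper on the tested hardware.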
2014-04-28 10:12:16 +03:00
huili2
9d1af8c378
Merge pull request #751 from huili2/NewSeq_replace_I
...
use new seq instead of I slice
2014-04-28 13:40:15 +08:00
Licai Guo
669d704fac
refine pNzc set access
2014-04-26 16:36:16 -07:00
volvet
c5f04cfbd4
Merge pull request #750 from mstorsjo/deblocking-neon-cpu-features
...
Check for WELS_CPU_NEON before calling DeblockingBSCalcEnc_neon
2014-04-25 19:05:12 +08:00
volvet
84ff16c015
Merge pull request #749 from mstorsjo/dos-newlines
...
Remove dos newlines in the android java code
2014-04-25 18:18:11 +08:00
Martin Storsjö
00a724076b
Check for WELS_CPU_NEON before calling DeblockingBSCalcEnc_neon
...
Checking HAVE_NEON is not enough; e.g. android devices with
armeabi-v7a are not required to have NEON, so every use of such
functions should check WELS_CPU_NEON in the cpu features
as well.
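
The two-level check described above can be sketched roughly as follows. This is an illustration only: the bit value given to `WELS_CPU_NEON` is a placeholder, and `bs_calc_c`, `bs_calc_neon` and `select_bs_calc` are hypothetical stand-ins, not the project's actual functions.

```python
WELS_CPU_NEON = 0x4  # placeholder bit value, assumed for illustration only


def bs_calc_c():
    """Stand-in for the portable C implementation."""
    return "c"


def bs_calc_neon():
    """Stand-in for the NEON implementation."""
    return "neon"


def select_bs_calc(have_neon, cpu_flags):
    """Pick the NEON routine only if it was built AND the CPU reports NEON."""
    # have_neon models the build-time HAVE_NEON define;
    # cpu_flags models the feature bits detected at runtime.
    if have_neon and (cpu_flags & WELS_CPU_NEON):
        return bs_calc_neon
    return bs_calc_c


if __name__ == "__main__":
    # Built with NEON support, but running on a device without NEON.
    print(select_bs_calc(True, 0)())
```

With only the build-time check, the last call would have selected the NEON routine on a device that cannot execute it.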
2014-04-25 13:02:22 +03:00
sijchen
bd8d97dddb
Merge pull request #748 from huili2/newSeq_bugfix
...
fix bug of new seq check
2014-04-25 17:54:33 +08:00
huili2
0c544962d8
use new seq instead of I slice
2014-04-25 01:46:00 -07:00
Martin Storsjö
655d9c5dbf
Remove dos newlines in the android java code
2014-04-25 11:03:03 +03:00
huili2
c0d21a23f3
Merge pull request #745 from lyao2/scrollingUT
...
add scroll detection UT
2014-04-25 15:07:15 +08:00
huili2
c65d250817
fix bug of new seq check
2014-04-24 22:11:42 -07:00
volvet
c65e286036
Merge pull request #738 from mstorsjo/gnu-aarch64
...
Fix building the aarch64 assembly using gnu binutils
2014-04-25 09:07:43 +08:00
ruil2
f57bb5042a
Merge pull request #747 from ganyangbbl/resolution_issue
...
fix resolution setting issue
2014-04-25 09:03:01 +08:00
ganyang
8ee85918c8
fix resolution setting issue
2014-04-24 17:34:31 +08:00
Licai Guo
bcb76d383b
Merge pull request #746 from huili2/IDR_bugfix
...
fix typo of IDR loss
2014-04-24 17:30:04 +08:00
huili2
593e291d19
fix typo of IDR loss
2014-04-24 02:26:40 -07:00
Licai Guo
d905d6bdfd
Merge pull request #742 from huili2/ec_ut_copy
...
add slice/frame copy UT for EC
2014-04-24 17:08:41 +08:00
huili2
c4ab780d21
Merge pull request #732 from licaiguo/forJS
...
specify accurate align information for ST32
2014-04-24 16:50:00 +08:00
Licai Guo
fe23d53acc
Merge pull request #744 from huili2/ec_IdrLoss
...
enable the case for IDR loss
2014-04-24 16:47:36 +08:00
Licai Guo
41698901c1
Merge pull request #743 from huili2/ec_refidx_return
...
prevent from return if ref_idx is error
2014-04-24 16:47:20 +08:00
lyao2
34ad719cf2
Squashed commit of the following:
...
commit f73d6cf0fcae5f401fc2817ab736af996113ca09
Author: lyao2 <lyao2@LYAO2-WS01.cisco.com>
Date: Thu Apr 24 15:02:21 2014 +0800
remove comments
commit 75416c2cf6c1ebb7aabf9e8c52d8c7163a8009b7
Author: lyao2 <lyao2@LYAO2-WS01.cisco.com>
Date: Thu Apr 24 14:52:09 2014 +0800
for test
commit 7dfb65ce514edcff892bfb3919921cadcce1d055
Author: lyao2 <lyao2@LYAO2-WS01.cisco.com>
Date: Thu Apr 24 14:12:31 2014 +0800
for test
commit eff771645e8c349dc4e454ab1751530b3cef18ed
Author: lyao2 <lyao2@LYAO2-WS01.cisco.com>
Date: Thu Apr 24 10:51:34 2014 +0800
for test
commit 9c42b9a7a04068e70be94529941f549b58e63780
Author: lyao2 <lyao2@LYAO2-WS01.cisco.com>
Date: Wed Apr 23 17:46:59 2014 +0800
update cpu_flag
commit cce3fccc0a4249b82ab2e0e92fe53579ef942799
Author: lyao2 <lyao2@LYAO2-WS01.cisco.com>
Date: Wed Apr 23 17:26:56 2014 +0800
for test
commit 3d292995b3c4437a2674a687cc4e8da1b5fb83f5
Author: lyao2 <lyao2@LYAO2-WS01.cisco.com>
Date: Wed Apr 23 16:45:57 2014 +0800
remove space
commit c608c2ba7cf010f1dcf8c0344f68536c48e181cb
Author: lyao2 <lyao2@LYAO2-WS01.cisco.com>
Date: Wed Apr 23 16:42:43 2014 +0800
remove tabs
commit 3b769342a06e25ad23a2c86f23a94d0d7ca1a4c8
Author: lyao2 <lyao2@LYAO2-WS01.cisco.com>
Date: Wed Apr 23 16:33:55 2014 +0800
refine UT case
commit 89b869f0c8f8c9bbd61e9de32caa77877aeae064
Author: lyao2 <lyao2@LYAO2-WS01.cisco.com>
Date: Tue Apr 22 13:40:50 2014 +0800
Squashed commit of the following:
commit abe55494134ef8342ffe9566df4e1b3265fe21b6
Author: lyao2 <lyao2@LYAO2-WS01.cisco.com>
Date: Tue Apr 22 10:50:07 2014 +0800
set MV range
commit 8c7f70c351e50d945c29118bed8b3781c22b7dbc
Author: lyao2 <lyao2@LYAO2-WS01.cisco.com>
Date: Mon Apr 21 16:53:10 2014 +0800
refinement
commit bf35f19a7dc88743aacf8e89e681e0ef3302d40a
Author: lyao2 <lyao2@LYAO2-WS01.cisco.com>
Date: Fri Apr 18 17:24:31 2014 +0800
correct tabs
commit 130b7f895d7020bfc571d910966891da93150242
Author: lyao2 <lyao2@LYAO2-WS01.cisco.com>
Date: Fri Apr 18 17:17:06 2014 +0800
correct format
commit 0429703b0844363559dd2b3d44e45034232a9d8f
Author: lyao2 <lyao2@LYAO2-WS01.cisco.com>
Date: Fri Apr 18 15:12:44 2014 +0800
add scroll UT
2014-04-24 15:12:49 +08:00
huili2
314005435e
enable the case for IDR loss
2014-04-23 23:13:12 -07:00
ruil2
ba8e3f2967
Merge pull request #741 from varunbpatil/ref_frames
...
Fix calculation of num ref frames
2014-04-24 14:06:46 +08:00
Varun B Patil
422d1d1569
Fix calculation of num ref frames
2014-04-24 10:42:15 +05:30
huili2
4a6259cf74
add slice/frame copy UT for EC
2014-04-23 21:45:17 -07:00
huili2
9d5bc6fd74
Merge pull request #740 from licaiguo/test
...
fix bNewSeqBegin logic
2014-04-24 11:02:26 +08:00
Licai Guo
bada2d35bf
fix bNewSeqBegin logic
2014-04-23 18:50:11 -07:00
Licai Guo
f00d3ac15f
Merge pull request #737 from mstorsjo/make-aarch64
...
Add support for building the arm64 assembly with the make build system
2014-04-24 06:40:07 +08:00
Martin Storsjö
66f58e8357
Add macros for the non-standard mov.16b/mov.8b/ext.16b/ext.8b
...
This fixes building with gnu binutils, which don't support this
nonstandard form of the instructions.
Once Apple's tools support the proper standard form of the
instructions, the code should be updated to use that everywhere
instead, and these macros should be removed.
2014-04-23 11:47:12 +03:00
Martin Storsjö
7cd175d097
Use the correct ext syntax in the gnu version of macros
2014-04-23 11:47:12 +03:00
Martin Storsjö
b13a399ab5
Use a plain "ret" instead of "ret lr"
...
This fixes an issue with assembling with gnu binutils.
2014-04-23 11:47:12 +03:00
Martin Storsjö
f2642b308a
Add correct arguments to the gnu version of UNPACK_FILTER_SINGLE_TAG_16BITS
2014-04-23 11:47:12 +03:00
Martin Storsjö
90fad9fd98
Add \() to macro arguments to separate the argument from the following .8h or similar
2014-04-23 11:47:12 +03:00
Martin Storsjö
80bd541cbe
Remove .syntax unified from the aarch64 common header
...
This directive isn't available in aarch64 code, only in arm code.
2014-04-23 11:47:12 +03:00
Martin Storsjö
f1b2d51d86
Add support for building the arm64 assembly in platform-arch.mk
2014-04-23 11:44:47 +03:00
Martin Storsjö
3c2e9cd7bf
Regenerate makefiles to include the new arm64 assembly files
2014-04-23 11:44:47 +03:00
Martin Storsjö
84ff82ee24
Exclude the new arm64 include file
2014-04-23 11:44:47 +03:00
Martin Storsjö
c8901c7dcd
Add support for arm64 assembly source files in mktargets.py
...
Disambiguate between arm and arm64 sources by checking the directory
names.
The arm assembly sources can be assembled on arm64 and vice versa
without any effect since all of the implementation is hidden behind
the HAVE_NEON and HAVE_NEON_AARCH64 defines, but it still is cleaner
to not build extra empty object files than to build all *.S files
on all arm variants. (The iOS project files build all of the arm
assembly files, regardless of the target architecture, since
individual files can't easily be excluded based on the target
architecture there.)
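
A rough sketch of disambiguating by directory name, not the actual mktargets.py code. The function names and the example paths are hypothetical; the only assumption taken from the message above is that arm and arm64 sources live in separately named directories.

```python
from pathlib import PurePosixPath


def classify_asm(path):
    """Return 'arm64', 'arm' or None based on the directories in the path."""
    dirs = PurePosixPath(path).parts[:-1]  # ignore the file name itself
    if "arm64" in dirs:
        return "arm64"
    if "arm" in dirs:
        return "arm"
    return None


def split_sources(paths):
    """Group assembly sources into per-architecture lists."""
    groups = {"arm": [], "arm64": []}
    for p in paths:
        kind = classify_asm(p)
        if kind is not None:
            groups[kind].append(p)
    return groups


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(split_sources(["common/arm/a_neon.S", "common/arm64/a_aarch64_neon.S"]))
```

Comparing whole path components, rather than searching for the substring "arm", keeps an "arm64" directory from being misclassified as arm.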
2014-04-23 11:41:17 +03:00
Martin Storsjö
764f787dcb
Rename the makefile variable for arm assembly sources
...
This is in preparation for adding support for the aarch64 assembly
files as well.
2014-04-23 10:55:30 +03:00
Martin Storsjö
788b67cbde
Fix the indentation of a line in targets.mk
...
This would be avoided if the targets.mk files are updated by
rerunning mktargets.sh instead of manually updating them.
2014-04-23 10:55:30 +03:00
volvet
f1737cbec6
Merge pull request #736 from licaiguo/update-gitignore
...
update .gitignore to ignore *.orig
2014-04-23 15:40:56 +08:00
Licai Guo
5507c76e5d
update .gitignore to ignore *.orig
2014-04-23 00:20:08 -07:00
Licai Guo
021fff491b
Merge pull request #735 from mstorsjo/cleanup-mess
...
Clean up the mess left by merging the motion compensation arm64 neon code
2014-04-23 15:19:15 +08:00
Martin Storsjö
a842f14a3c
Remove .orig files left over from running astyle
2014-04-23 09:24:23 +03:00
Martin Storsjö
45aef90d26
Remove the executable bit from source files
2014-04-23 09:23:56 +03:00
Licai Guo
b6a765ad71
Merge pull request #734 from dongzha/MC_ARM64
...
Add Motion Compensation ARM64 Neon Code
2014-04-23 13:58:26 +08:00
huili2
13db7fea7b
prevent from return if ref_idx is error
2014-04-22 22:43:40 -07:00
dongzhang
ad9e2dab4f
Add Motion Compensation ARM64 Neon Code
2014-04-23 13:26:28 +08:00
Licai Guo
b47606a4ff
Merge pull request #733 from dongzha/ExpandPic_ARM64
...
Add expand picture support for ARM64 NEON
2014-04-23 09:57:39 +08:00
dongzhang
2444327a6c
Add expand picture support for ARM64 NEON
...
Remove duplicate MACROS
2014-04-23 09:14:32 +08:00