Commit Graph

780 Commits

Author SHA1 Message Date
Diego Biurrun
f54b55058a configure: Rename cmov processor capability to i686
The goal is to make the capapility slightly more general and have it
cover the availability of the nopl instruction in addition to cmov.
2013-05-12 21:23:38 +02:00
Christophe Gisquet
2c299d4165 x86: sbrdsp: implement SSE2 qmf_pre_shuffle
From 253 to 51 cycles on Arrandale and Win64.
44 cycles on SandyBridge.

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2013-05-10 09:31:27 +02:00
Diego Biurrun
f243bf7aa2 x86: dsputil: Remove unused argument from QPEL_OP macro 2013-05-08 18:18:58 +02:00
Diego Biurrun
3d40c1ee74 x86: dsputil: Move TRANSPOSE4 macro to the only place it is used 2013-05-08 18:18:23 +02:00
Diego Biurrun
71469f3b63 x86: dsputil: Move constant declarations into separate header 2013-05-08 18:18:23 +02:00
Diego Biurrun
ed880050ed x86: dsputil: Group all assembly constants together in constants.c 2013-05-08 01:04:04 +02:00
Diego Biurrun
8761466760 x86: dsputil: Move ff_pd assembly constants to the only place they are used 2013-05-08 01:04:04 +02:00
Diego Biurrun
1b343cedd7 x86: dsputil: Remove unused ff_pb_3F constant 2013-05-07 18:03:35 +02:00
Diego Biurrun
3334cbec0a x86: dsputil: Remove unused MOVQ_BONE macro 2013-05-07 18:03:35 +02:00
Diego Biurrun
63bac48f73 x86: dsputil: Move rv40-specific functions where they belong 2013-05-07 18:03:35 +02:00
Diego Biurrun
92f8e06ecb x86: dsputil hpeldsp: Move shared template functions into separate object 2013-05-07 18:03:34 +02:00
Diego Biurrun
7edaf4edb5 x86: rnd_template: Eliminate pointless OP_AVG macro indirection 2013-05-07 18:03:34 +02:00
Diego Biurrun
110796739a x86: hpeldsp: Move avg_pixels8_x2_mmx() out of hpeldsp_rnd_template.c
The function is only instantiated once, so there is no point
in keeping it in a template file.
2013-05-06 11:02:08 +02:00
Diego Biurrun
dc1b328d0d x86: hpeldsp: Only compile MMX hpeldsp code if MMX is enabled 2013-05-06 11:02:08 +02:00
Diego Biurrun
9e5e76ef9e x86: More specific ifdefs for dsputil/hpeldsp init functions 2013-05-06 11:02:07 +02:00
Diego Biurrun
6fee1b90ce avcodec: Add av_cold attributes to init functions missing them 2013-05-04 21:09:45 +02:00
Diego Biurrun
a5f8873620 silly typo fixes 2013-05-03 18:26:12 +02:00
Christophe Gisquet
5a97469a4f x86: sbrdsp: Implement SSE2 qmf_deint_bfly
Sandybridge: 47 cycles

Having a loop counter is a 7 cycle gain.
Unrolling is another 7 cycle gain.
Working in reverse scan is another 6 cycles.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-05-03 18:23:14 +02:00
Diego Biurrun
bf7c3c6b15 x86: dsputil: Move cavs and vc1-specific functions where they belong 2013-05-02 11:45:37 +02:00
Diego Biurrun
9328062321 x86: dsputil: Move avg_pixels16_mmx() out of rnd_template.c
The function does not do any rounding, so there is no point in
keeping it in a round template file.
2013-05-02 11:45:37 +02:00
Diego Biurrun
9c112a6158 x86: dsputil: Move avg_pixels8_mmx() out of rnd_template.c
The function is only instantiated once, so there is no point
in keeping it in a template file.
2013-05-02 11:45:37 +02:00
Diego Biurrun
9b3a04d306 x86: Move duplicated put_pixels{8|16}_mmx functions into their own file 2013-05-02 11:16:45 +02:00
Diego Biurrun
f2e9d44a57 x86: Drop unnecessary ff_ name prefixes from static functions 2013-04-30 16:02:03 +02:00
Diego Biurrun
643e433bf7 mpegaudiosp: More consistent names for ppc/x86 optimization files 2013-04-30 12:19:43 +02:00
Diego Biurrun
97c56ad796 x86: dsputil: Remove a set of pointless #ifs around function declarations 2013-04-30 01:42:32 +02:00
Diego Biurrun
85f2f82af6 x86: dsputil: cosmetics: Group ff_{avg|put}_pixels16_mmxext() declarations 2013-04-30 01:41:05 +02:00
Diego Biurrun
20784aa678 x86: hpeldsp: Remove unused macro definitions 2013-04-29 15:57:00 +02:00
Diego Biurrun
7c00e9d8ae x86: ac3dsp: Remove 3dnow version of ff_ac3_extract_exponents
The function requires increasing the fuzz factor for the ac3/eac3 encode
tests and even so makes fate fail. It only provides a slight encoding
speedup for legacy CPUs that do not support SS2. Thus its benefit is not
worth the trouble it creates and fixing it would be a waste of time.
2013-04-26 21:06:52 +02:00
Martin Storsjö
74685f6783 x86: Rename dsputil_rnd_template.c to rnd_template.c
This makes it less confusing when this template is shared both by
dsputil and by hpeldsp.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-25 23:03:09 +03:00
Martin Storsjö
486f76f029 x86: Get rid of duplication between *_rnd_template.c
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-23 23:30:17 +03:00
Martin Storsjö
6a8561dbd7 x86: Factorize duplicated inline assembly snippets
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-04-23 15:07:31 +02:00
Diego Biurrun
c1ad70c3cb x86: Move some conditional code around to avoid unused variable warnings 2013-04-22 17:50:02 +02:00
Diego Biurrun
b4ad7c54c8 x86: cavs: Refactor duplicate dspfunc macro 2013-04-22 12:05:09 +02:00
Diego Biurrun
78fa0bd0f7 x86: cavs: Put mmx-specific code into its own init function
Before, this code was labeled as mmxext and enabled both for the
3dnow and the mmxext case.
2013-04-22 10:42:50 +02:00
Diego Biurrun
311a592dfc x86: Remove some duplicate function declarations 2013-04-22 02:29:57 +02:00
Martin Storsjö
b71a0507b0 x86: Remove unused inline asm instruction defines
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-20 00:44:54 +03:00
Ronald S. Bultje
8db00081a3 x86: hpeldsp: Move half-pel assembly from dsputil to hpeldsp
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-19 23:18:53 +03:00
Ronald S. Bultje
015821229f vp3: Use full transpose for all IDCTs
This way, the special IDCT permutations are no longer needed. This
is similar to how H264 does it, and removes the dsputil dependency
imposed by the scantable code.

Also remove the unused type == 0 cases from the plain C version
of the idct.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-15 12:32:05 +03:00
Ronald S. Bultje
c46819f229 x86: Move constants to the only place where they are used
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-15 12:17:39 +03:00
Diego Biurrun
a3cb865310 x86: dsputil: Move some ifdefs to avoid unused variable warnings 2013-04-12 09:36:47 +02:00
Diego Biurrun
2004c7c8f7 x86: dsputil: cosmetics: Remove two pointless variable indirections 2013-04-12 09:36:47 +02:00
Diego Biurrun
c51a3a5bd9 x86: dsputil: Refactor some ff_{avg|put}_pixels function declarations 2013-04-12 09:36:46 +02:00
Diego Biurrun
e027032fc6 x86: dsputil: ff_h263_*_loop_filter declarations to a more suitable place 2013-04-12 09:36:46 +02:00
Diego Biurrun
a89c05500f x86: h264qpel: int --> ptrdiff_t for some line_size parameters 2013-04-12 09:30:12 +02:00
Diego Biurrun
ac9362c5d9 Move misplaced file author information where it belongs 2013-04-11 02:42:11 +02:00
Ronald S. Bultje
b93b27edb0 dsputil: Make dsputil selectable
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-10 11:04:05 +03:00
Ronald S. Bultje
62844c3fd6 h264: Integrate clear_blocks calls with IDCT
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-10 11:03:06 +03:00
Ronald S. Bultje
610b18e2e3 x86: qpel: Move fullpel and l2 functions to a separate file
This way, they can be shared between mpeg4qpel and h264qpel without
requiring either one to be compiled unconditionally.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-04-08 12:38:33 +03:00
Christophe Gisquet
f4b0d12f5b x86: sbrdsp: Implement SSE neg_odd_64
Timing on Arrandale:
        C   SSE
Win32:  57   44
Win64:  47   38
Unrolling and not storing mask both save some cycles.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2013-04-05 22:47:04 +02:00
Diego Biurrun
b6649ab503 cosmetics: Remove unnecessary extern keywords from function declarations 2013-03-27 14:21:45 +01:00