Commit Graph

15 Commits

Author SHA1 Message Date
Christophe Gisquet
7ece8b50b1 x86: simple_idct: 12bits versions
On 12 frames of a 444p 12 bits DNxHR sequence, _put function:
C:         78902 decicycles in idct,  262071 runs,     73 skips
avx:       32478 decicycles in idct,  262045 runs,     99 skips

Difference between the 2:
stddev:    0.39 PSNR:104.47 MAXDIFF:    2

This is unavoidable and due to the scale factors used in the x86
version, which cannot match the C ones.

In addition, the trick of adding an initial bias to the input of a
pass can overflow, as the input coefficients are already 15bits,
which is the maximum this function can handle.

Overall, however, the omse on 12 bits samples goes from 0.16916 to
0.16883. Reducing rowshift by 1 improves to 0.0908, but causes
overflows.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-13 15:34:32 +02:00
Christophe Gisquet
4369b9dc7b x86: simple_idct(_put): 10bits versions
Modeled from the prores version. Clips to [0;1023] and is bitexact.
Bitexactness requires to add offsets in different places compared to
prores or C, and makes the function approximately 2% slower.

For 16 frames of a DNxHD 4:2:2 10bits test sequence:

C:    60861 decicycles in idct, 1048205 runs,    371 skips
sse2: 27567 decicycles in idct, 1048216 runs,    360 skips
avx:  26272 decicycles in idct, 1048171 runs,    405 skips

The add version is not implemented, so the corresponding dsp
function is set to NULL to make it clear in a code executing it.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2015-10-13 13:32:21 +02:00
James Almer
4f4f08e6f0 x86/idctdsp: port {put,add}_pixels_clamped to yasm
Also add sse2 versions for both.
put_pixels_clamped port and sse2 version originally written by Timothy Gu.

Reviewed-by: Michael Niedermayer <michaelni@gmx.at>
Signed-off-by: James Almer <jamrial@gmail.com>
2014-09-24 21:52:13 -03:00
Michael Niedermayer
0dcebb9f63 Merge commit '84d173d3de97c753234ab0c0b50551d51413d663'
* commit '84d173d3de97c753234ab0c0b50551d51413d663':
  xvididct: Ensure that the scantable permutation is always set correctly

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-08 22:17:04 +02:00
Diego Biurrun
84d173d3de xvididct: Ensure that the scantable permutation is always set correctly
This fixes cases where the scantable permuation would get overwritten by
the general idctdsp initialization.
2014-08-08 11:13:29 -07:00
Michael Niedermayer
f54e01c24e Merge commit 'a786c8259dafeca9744252230b5d78f67810770c'
* commit 'a786c8259dafeca9744252230b5d78f67810770c':
  idct: Split off Xvid IDCT

Conflicts:
	libavcodec/Makefile
	libavcodec/mpeg4videodec.c
	libavcodec/x86/Makefile
	libavcodec/x86/idctdsp_init.c

This split is somewhat restructured leaving the xvid IDCT available
outside mpeg4 if manually selected.

The code also could not be merged unchanged as it conflicted with a
bugfix in FFmpeg

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-08-01 16:21:52 +02:00
Diego Biurrun
a786c8259d idct: Split off Xvid IDCT
The Xvid IDCT is only required to decode some Xvid-encoded MPEG-4 files,
so there is no point in having it as an unconditional part of idctdsp.
2014-08-01 01:25:18 -07:00
Michael Niedermayer
776647360d Merge commit '5dcc201505f71b1e73e9eef12ce89d4eed252ad0'
* commit '5dcc201505f71b1e73e9eef12ce89d4eed252ad0':
  simple_idct: Move x86-specific declarations to a header in the x86 directory

Conflicts:
	libavcodec/x86/simple_idct.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-19 13:56:29 +02:00
Diego Biurrun
5dcc201505 simple_idct: Move x86-specific declarations to a header in the x86 directory 2014-07-19 02:33:36 -07:00
Michael Niedermayer
521f569734 Merge commit '8b0dd4942aac320d1ca3c40fa7ea1be342c71273'
* commit '8b0dd4942aac320d1ca3c40fa7ea1be342c71273':
  idctdsp: prettyprinting cosmetics

Conflicts:
	libavcodec/idctdsp.c
	libavcodec/ppc/idctdsp.c
	libavcodec/x86/idctdsp_init.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-18 22:16:04 +02:00
Michael Niedermayer
42d326353c Merge commit 'b4987f72197e0c62cf2633bf835a9c32d2a445ae'
* commit 'b4987f72197e0c62cf2633bf835a9c32d2a445ae':
  idct: Convert IDCT permutation #defines to an enum

Conflicts:
	libavcodec/idctdsp.c
	libavcodec/x86/cavsdsp.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-18 22:01:17 +02:00
Diego Biurrun
8b0dd4942a idctdsp: prettyprinting cosmetics 2014-07-18 07:51:03 -07:00
Diego Biurrun
b4987f7219 idct: Convert IDCT permutation #defines to an enum
Also rename the enum values to be consistent with other DCT permutations.
2014-07-18 07:51:03 -07:00
Michael Niedermayer
581b5f0b9b Merge commit 'e3fcb14347466095839c2a3c47ebecff02da891e'
* commit 'e3fcb14347466095839c2a3c47ebecff02da891e':
  dsputil: Split off IDCT bits into their own context

Conflicts:
	configure
	libavcodec/aic.c
	libavcodec/arm/Makefile
	libavcodec/arm/dsputil_init_arm.c
	libavcodec/arm/dsputil_init_armv6.c
	libavcodec/asvdec.c
	libavcodec/dnxhdenc.c
	libavcodec/dsputil.c
	libavcodec/dvdec.c
	libavcodec/dxva2_mpeg2.c
	libavcodec/intrax8.c
	libavcodec/mdec.c
	libavcodec/mjpegdec.c
	libavcodec/mjpegenc_common.h
	libavcodec/mpegvideo.c
	libavcodec/ppc/dsputil_altivec.h
	libavcodec/ppc/dsputil_ppc.c
	libavcodec/ppc/idctdsp.c
	libavcodec/x86/Makefile
	libavcodec/x86/dsputil_init.c
	libavcodec/x86/dsputil_mmx.c
	libavcodec/x86/dsputil_x86.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-07-01 15:22:11 +02:00
Diego Biurrun
e3fcb14347 dsputil: Split off IDCT bits into their own context 2014-06-30 07:58:46 -07:00