Commit Graph

143 Commits

Author SHA1 Message Date
Michael Niedermayer
41d08ca575 avcodec/arm/cabac: fix inline cabac reader with the UNCHECKED bitstream reader
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-15 01:08:45 +01:00
Michael Niedermayer
2b8d28439b avcodec/h264_cabac: move the arm unchecked_bitstream reader special case closer to where the issue is
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-15 01:02:31 +01:00
Michael Niedermayer
669235e0b3 avcodec/h264_cabac: disable the unchecked bitstream reader for arm & aarch64
The newly added optimizations do not work with the unchecked reader

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-03-15 00:58:48 +01:00
Michael Niedermayer
965fa6b0d9 Merge commit 'fb0c9d41d685abb58575c5482ca33b8cd457c5ec'
* commit 'fb0c9d41d685abb58575c5482ca33b8cd457c5ec':
  avutil: remove timer.h include from internal.h

Conflicts:
	libavcodec/ffv1dec.c
	libavutil/internal.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-26 01:54:55 +01:00
Janne Grunau
fb0c9d41d6 avutil: remove timer.h include from internal.h
Added libavutil/timer.h include to all files with {START,STOP}_TIMER.
2014-01-25 21:50:20 +01:00
Michael Niedermayer
7853024071 avcodec/h264_cabac: Fix use with the checked bitstream-reader
Found-by: Dale Curtis <dalecurtis@google.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-01-24 00:05:17 +01:00
Michael Niedermayer
4fb1221e66 h264: reduce whitespace differences to libav
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-11-03 01:16:31 +01:00
Michael Niedermayer
21ba80214c Merge remote-tracking branch 'qatar/master'
* qatar/master:
  h264_cabac: Mark functions calling decode_cabac_residual_internal as noinline

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-08-25 12:26:14 +02:00
Diego Biurrun
ff9d57e7df h264_cabac: Mark functions calling decode_cabac_residual_internal as noinline
This ensures that decode_cabac_residual_internal actually does get inlined,
which it otherwise does not, even though it is marked as always_inline.
2013-08-24 16:14:15 +02:00
Diego Biurrun
2a61592573 avcodec: Remove some commented-out debug cruft 2013-08-20 19:59:50 +02:00
Michael Niedermayer
86d9d349cc Merge commit '23e85be58fc64b2e804e68b0034a08a6d257e523'
* commit '23e85be58fc64b2e804e68b0034a08a6d257e523':
  h264: add a parameter to the CHROMA444 macro.
  h264: add a parameter to the CHROMA422 macro.

Conflicts:
	libavcodec/h264.c
	libavcodec/h264.h
	libavcodec/h264_cavlc.c
	libavcodec/h264_loopfilter.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-03-21 13:06:14 +01:00
Michael Niedermayer
92656787cf Merge commit '6d2b6f21eb45ffbda1103c772060303648714832'
* commit '6d2b6f21eb45ffbda1103c772060303648714832':
  h264: add a parameter to the CABAC macro.
  h264: add a parameter to the FIELD_OR_MBAFF_PICTURE macro.

Conflicts:
	libavcodec/h264.c
	libavcodec/h264_cabac.c
	libavcodec/h264_cavlc.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-03-21 12:58:00 +01:00
Michael Niedermayer
bf4d0f8328 Merge commit '7fa00653a550c0d24b3951c0f9fed6350ecf5ce4'
* commit '7fa00653a550c0d24b3951c0f9fed6350ecf5ce4':
  h264: add a parameter to the FIELD_PICTURE macro.
  h264: add a parameter to the FRAME_MBAFF macro.

Conflicts:
	libavcodec/h264.c
	libavcodec/h264_loopfilter.c
	libavcodec/h264_refs.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-03-21 12:50:18 +01:00
Michael Niedermayer
e168b50816 Merge commit 'da6be8fcec16a94d8084bda8bb8a0a411a96bcf7'
* commit 'da6be8fcec16a94d8084bda8bb8a0a411a96bcf7':
  h264: add a parameter to the MB_FIELD macro.
  h264: add a parameter to the MB_MBAFF macro.

Conflicts:
	libavcodec/h264.c
	libavcodec/h264_cabac.c
	libavcodec/h264_cavlc.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-03-21 12:43:32 +01:00
Anton Khirnov
23e85be58f h264: add a parameter to the CHROMA444 macro.
This way it does not look like a constant.
2013-03-21 10:21:02 +01:00
Anton Khirnov
e962bd08ee h264: add a parameter to the CHROMA422 macro.
This way it does not look like a constant.
2013-03-21 10:20:58 +01:00
Anton Khirnov
6d2b6f21eb h264: add a parameter to the CABAC macro.
This way it does not look like a constant.
2013-03-21 10:20:52 +01:00
Anton Khirnov
7fa00653a5 h264: add a parameter to the FIELD_PICTURE macro.
This way it does not look like a constant.
2013-03-21 10:20:44 +01:00
Anton Khirnov
7bece9b22f h264: add a parameter to the FRAME_MBAFF macro.
This way it does not look like a constant.
2013-03-21 10:20:39 +01:00
Anton Khirnov
da6be8fcec h264: add a parameter to the MB_FIELD macro.
This way it does not look like a constant.
2013-03-21 10:20:35 +01:00
Anton Khirnov
82313eaa34 h264: add a parameter to the MB_MBAFF macro.
This way it does not look like a constant.
2013-03-21 10:20:30 +01:00
Michael Niedermayer
80e9e63c94 Merge commit '759001c534287a96dc96d1e274665feb7059145d'
* commit '759001c534287a96dc96d1e274665feb7059145d':
  lavc decoders: work with refcounted frames.

Anton Khirnov (1):
      lavc decoders: work with refcounted frames.

Clément Bœsch (47):
      lavc/ansi: reset file
      lavc/ansi: re-do refcounted frame changes from Anton
      fraps: reset file
      lavc/fraps: switch to refcounted frames
      gifdec: reset file
      lavc/gifdec: switch to refcounted frames
      dsicinav: resolve conflicts
      smc: resolve conflicts
      zmbv: resolve conflicts
      rpza: resolve conflicts
      vble: resolve conflicts
      xxan: resolve conflicts
      targa: resolve conflicts
      vmnc: resolve conflicts
      utvideodec: resolve conflicts
      tscc: resolve conflicts
      ulti: resolve conflicts
      ffv1dec: resolve conflicts
      dnxhddec: resolve conflicts
      v210dec: resolve conflicts
      vp3: resolve conflicts
      vcr1: resolve conflicts
      v210x: resolve conflicts
      wavpack: resolve conflicts
      pngdec: fix compilation
      roqvideodec: resolve conflicts
      pictordec: resolve conflicts
      mdec: resolve conflicts
      tiertexseqv: resolve conflicts
      smacker: resolve conflicts
      vb: resolve conflicts
      vqavideo: resolve conflicts
      xl: resolve conflicts
      tmv: resolve conflicts
      vmdav: resolve conflicts
      truemotion1: resolve conflicts
      truemotion2: resolve conflicts
      lcldec: fix compilation
      libcelt_dec: fix compilation
      qdrw: fix compilation
      r210dec: fix compilation
      rl2: fix compilation
      wnv1: fix compilation
      yop: fix compilation
      tiff: resolve conflicts
      interplayvideo: fix compilation
      qpeg: resolve conflicts (FIXME/TESTME).

Hendrik Leppkes (33):
      012v: convert to refcounted frames
      8bps: fix compilation
      8svx: resolve conflicts
      4xm: resolve conflicts
      aasc: resolve conflicts
      bfi: fix compilation
      aura: fix compilation
      alsdec: resolve conflicts
      avrndec: convert to refcounted frames
      avuidec: convert to refcounted frames
      bintext: convert to refcounted frames
      cavsdec: resolve conflicts
      brender_pix: convert to refcounted frames
      cinepak: resolve conflicts
      cinepak: avoid using AVFrame struct directly in private context
      cljr: fix compilation
      cpia: convert to refcounted frames
      cscd: resolve conflicts
      iff: resolve conflicts and do proper conversion to refcounted frames
      4xm: fix reference frame handling
      cyuv: fix compilation
      dxa: fix compilation
      eacmv: fix compilation
      eamad: fix compilation
      eatgv: fix compilation
      escape124: remove unused variable.
      escape130: convert to refcounted frames
      evrcdec: convert to refcounted frames
      exr: convert to refcounted frames
      mvcdec: convert to refcounted frames
      paf: properly free the frame data on decode close
      sgirle: convert to refcounted frames
      lavfi/moviesrc: use refcounted frames

Michael Niedermayer (56):
      Merge commit '759001c534287a96dc96d1e274665feb7059145d'
      resolve conflicts in headers
      motion_est: resolve conflict
      mpeg4videodec: fix conflicts
      dpcm conflict fix
      dpx: fix conflicts
      indeo3: resolve confilcts
      kmvc: resolve conflicts
      kmvc: resolve conflicts
      h264: resolve conflicts
      utils: resolve conflicts
      rawdec: resolve conflcits
      mpegvideo: resolve conflicts
      svq1enc: resolve conflicts
      mpegvideo: dont clear data, fix assertion failure on fate vsynth1 with threads
      pthreads: resolve conflicts
      frame_thread_encoder: simple compilefix not yet tested
      snow: update to buffer refs
      crytsalhd: fix compile
      dirac: switch to new API
      sonic: update to new API
      svq1: resolve conflict, update to new API
      ffwavesynth: update to new buffer API
      g729: update to new API
      indeo5: fix compile
      j2kdec: update to new buffer API
      linopencore-amr: fix compile
      libvorbisdec: update to new API
      loco: fix compile
      paf: update to new API
      proresdec: update to new API
      vp56: update to new api / resolve conflicts
      xface: convert to refcounted frames
      xan: fix compile&fate
      v408: update to ref counted buffers
      v308: update to ref counted buffers
      yuv4dec: update to ref counted buffers
      y41p: update to ref counted frames
      xbm: update to refcounted frames
      targa_y216: update to refcounted buffers
      qpeg: fix fate/crash
      cdxl: fix fate
      tscc: fix reget buffer useage
      targa_y216dec: fix style
      msmpeg4: fix fate
      h264: ref_picture() copy fields that have been lost too
      update_frame_pool: use channel field
      h264: Put code that prevents deadlocks back
      mpegvideo: dont allow last == current
      wmalossless: fix buffer ref messup
      ff_alloc_picture: free tables in case of dimension mismatches
      h264: fix null pointer dereference and assertion failure
      frame_thread_encoder: update to bufrefs
      ec: fix used arrays
      snowdec: fix off by 1 error in dimensions check
      h264: disallow single unpaired fields as references of frames

Paul B Mahol (2):
      lavc/vima: convert to refcounted frames
      sanm: convert to refcounted frames

Conflicts:
	libavcodec/4xm.c
	libavcodec/8bps.c
	libavcodec/8svx.c
	libavcodec/aasc.c
	libavcodec/alsdec.c
	libavcodec/anm.c
	libavcodec/ansi.c
	libavcodec/avs.c
	libavcodec/bethsoftvideo.c
	libavcodec/bfi.c
	libavcodec/c93.c
	libavcodec/cavsdec.c
	libavcodec/cdgraphics.c
	libavcodec/cinepak.c
	libavcodec/cljr.c
	libavcodec/cscd.c
	libavcodec/dnxhddec.c
	libavcodec/dpcm.c
	libavcodec/dpx.c
	libavcodec/dsicinav.c
	libavcodec/dvdec.c
	libavcodec/dxa.c
	libavcodec/eacmv.c
	libavcodec/eamad.c
	libavcodec/eatgq.c
	libavcodec/eatgv.c
	libavcodec/eatqi.c
	libavcodec/error_resilience.c
	libavcodec/escape124.c
	libavcodec/ffv1.h
	libavcodec/ffv1dec.c
	libavcodec/flicvideo.c
	libavcodec/fraps.c
	libavcodec/frwu.c
	libavcodec/g723_1.c
	libavcodec/gifdec.c
	libavcodec/h264.c
	libavcodec/h264.h
	libavcodec/h264_direct.c
	libavcodec/h264_loopfilter.c
	libavcodec/h264_refs.c
	libavcodec/huffyuvdec.c
	libavcodec/idcinvideo.c
	libavcodec/iff.c
	libavcodec/indeo2.c
	libavcodec/indeo3.c
	libavcodec/internal.h
	libavcodec/interplayvideo.c
	libavcodec/ivi_common.c
	libavcodec/jvdec.c
	libavcodec/kgv1dec.c
	libavcodec/kmvc.c
	libavcodec/lagarith.c
	libavcodec/libopenjpegdec.c
	libavcodec/mdec.c
	libavcodec/mimic.c
	libavcodec/mjpegbdec.c
	libavcodec/mjpegdec.c
	libavcodec/mmvideo.c
	libavcodec/motion_est.c
	libavcodec/motionpixels.c
	libavcodec/mpc7.c
	libavcodec/mpeg12.c
	libavcodec/mpeg4videodec.c
	libavcodec/mpegvideo.c
	libavcodec/mpegvideo.h
	libavcodec/msrle.c
	libavcodec/msvideo1.c
	libavcodec/nuv.c
	libavcodec/options_table.h
	libavcodec/pcx.c
	libavcodec/pictordec.c
	libavcodec/pngdec.c
	libavcodec/pnmdec.c
	libavcodec/pthread.c
	libavcodec/qpeg.c
	libavcodec/qtrle.c
	libavcodec/r210dec.c
	libavcodec/rawdec.c
	libavcodec/roqvideodec.c
	libavcodec/rpza.c
	libavcodec/smacker.c
	libavcodec/smc.c
	libavcodec/svq1dec.c
	libavcodec/svq1enc.c
	libavcodec/targa.c
	libavcodec/tiertexseqv.c
	libavcodec/tiff.c
	libavcodec/tmv.c
	libavcodec/truemotion1.c
	libavcodec/truemotion2.c
	libavcodec/tscc.c
	libavcodec/ulti.c
	libavcodec/utils.c
	libavcodec/utvideodec.c
	libavcodec/v210dec.c
	libavcodec/v210x.c
	libavcodec/vb.c
	libavcodec/vble.c
	libavcodec/vcr1.c
	libavcodec/vmdav.c
	libavcodec/vmnc.c
	libavcodec/vp3.c
	libavcodec/vp56.c
	libavcodec/vp56.h
	libavcodec/vp6.c
	libavcodec/vqavideo.c
	libavcodec/wavpack.c
	libavcodec/xl.c
	libavcodec/xxan.c
	libavcodec/zmbv.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-03-12 03:23:28 +01:00
Anton Khirnov
759001c534 lavc decoders: work with refcounted frames. 2013-03-08 07:38:30 +01:00
Michael Niedermayer
a984efd104 Merge commit 'c242bbd8b6939507a1a6fb64101b0553d92d303f'
* commit 'c242bbd8b6939507a1a6fb64101b0553d92d303f':
  Remove unnecessary dsputil.h #includes

Conflicts:
	libavcodec/ffv1.c
	libavcodec/h261dec.c
	libavcodec/h261enc.c
	libavcodec/h264pred.c
	libavcodec/lpc.h
	libavcodec/mjpegdec.c
	libavcodec/rectangle.h
	libavcodec/x86/idct_sse2_xvid.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-26 13:05:10 +01:00
Diego Biurrun
c242bbd8b6 Remove unnecessary dsputil.h #includes 2013-02-26 00:51:34 +01:00
Ronald S. Bultje
7ebfb466ae h264: Don't store intra pcm samples in h->mb
Instead, keep them in the bitstream buffer until we read them verbatim,
this saves a memcpy() and a subsequent clearing of the target buffer.
decode_cabac+decode_mb for a sample file (CAPM3_Sony_D.jsv) goes from
6121.4 to 6095.5 cycles, i.e. 26 cycles faster.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-02-19 22:34:14 +02:00
Ronald S. Bultje
c63f9fb37a h264: don't store intra pcm samples in h->mb.
Instead, keep them in the bitstream buffer until we read them verbatim,
this saves a memcpy() and a subsequent clearing of the target buffer.
decode_cabac+decode_mb for a sample file (CAPM3_Sony_D.jsv) goes from
6121.4 to 6095.5 cycles, i.e. 26 cycles faster.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-18 01:21:23 +01:00
Michael Niedermayer
b7fe35c9e5 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  h264: deMpegEncContextize

Conflicts:
	libavcodec/dxva2_h264.c
	libavcodec/h264.c
	libavcodec/h264.h
	libavcodec/h264_cabac.c
	libavcodec/h264_cavlc.c
	libavcodec/h264_loopfilter.c
	libavcodec/h264_mb_template.c
	libavcodec/h264_parser.c
	libavcodec/h264_ps.c
	libavcodec/h264_refs.c
	libavcodec/h264_sei.c
	libavcodec/svq3.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-02-16 23:30:09 +01:00
Anton Khirnov
2c54155407 h264: deMpegEncContextize
Most of the changes are just trivial are just trivial replacements of
fields from MpegEncContext with equivalent fields in H264Context.
Everything in h264* other than h264.c are those trivial changes.

The nontrivial parts are:
1) extracting a simplified version of the frame management code from
   mpegvideo.c. We don't need last/next_picture anymore, since h264 uses
   its own more complex system already and those were set only to appease
   the mpegvideo parts.
2) some tables that need to be allocated/freed in appropriate places.
3) hwaccels -- mostly trivial replacements.
   for dxva, the draw_horiz_band() call is moved from
   ff_dxva2_common_end_frame() to per-codec end_frame() callbacks,
   because it's now different for h264 and MpegEncContext-based
   decoders.
4) svq3 -- it does not use h264 complex reference system, so I just
   added some very simplistic frame management instead and dropped the
   use of ff_h264_frame_start(). Because of this I also had to move some
   initialization code to svq3.

Additional fixes for chroma format and bit depth changes by
Janne Grunau <janne-libav@jannau.net>

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2013-02-15 16:35:16 +01:00
Michael Niedermayer
cdf0877bc3 h264/cabac: check loop index
fix out of array read

Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-31 04:21:26 +01:00
Michael Niedermayer
ac8987591f Merge commit '88bd7fdc821aaa0cbcf44cf075c62aaa42121e3f'
* commit '88bd7fdc821aaa0cbcf44cf075c62aaa42121e3f':
  Drop DCTELEM typedef

Conflicts:
	libavcodec/alpha/dsputil_alpha.h
	libavcodec/alpha/motion_est_alpha.c
	libavcodec/arm/dsputil_init_armv6.c
	libavcodec/bfin/dsputil_bfin.h
	libavcodec/bfin/pixels_bfin.S
	libavcodec/cavs.c
	libavcodec/cavsdec.c
	libavcodec/dct-test.c
	libavcodec/dnxhdenc.c
	libavcodec/dsputil.c
	libavcodec/dsputil.h
	libavcodec/dsputil_template.c
	libavcodec/eamad.c
	libavcodec/h264_cavlc.c
	libavcodec/h264idct_template.c
	libavcodec/mpeg12.c
	libavcodec/mpegvideo.c
	libavcodec/mpegvideo.h
	libavcodec/mpegvideo_enc.c
	libavcodec/ppc/dsputil_altivec.c
	libavcodec/proresdsp.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-01-23 17:44:56 +01:00
Diego Biurrun
88bd7fdc82 Drop DCTELEM typedef
It does not help as an abstraction and adds dsputil dependencies.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2013-01-22 18:32:56 -08:00
Michael Niedermayer
8a03a60b4a h264: Check gray scale CBP, fix out of array accesses.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-11-18 23:02:46 +01:00
Ronald S. Bultje
2e59210edf lavc/h264: don't touch H264Context->ref_count[] during MB decoding.
The variable is copied to subsequent threads at the same time, so this
may cause wrong ref_count[] values to be copied to subsequent threads.

This bug was found using TSAN and Helgrind.

Original patch by Ronald, adapted with a local_ref_count by Clément,
following the suggestion of Michael Niedermayer.

Signed-off-by: Clément Bœsch <clement.boesch@smartjog.com>
2012-10-05 07:35:58 +02:00
Michael Niedermayer
31ab1575e5 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  avcodec: Convert some commented-out printf/av_log instances to av_dlog
  avcodec: Drop silly and/or broken printf debug output
  avcodec: Drop some silly commented-out av_log() invocations
  avformat: Convert some commented-out printf/av_log instances to av_dlog
  avformat: Remove non-compiling and/or silly commented-out printf/av_log statements
  Remove some silly disabled code.
  ac3dec: ensure get_buffer() gets a buffer for the correct number of channels

Conflicts:
	libavcodec/dnxhddec.c
	libavcodec/ffv1.c
	libavcodec/h264.c
	libavcodec/h264_parser.c
	libavcodec/mjpegdec.c
	libavcodec/motion_est_template.c
	libavcodec/mpegaudiodec.c
	libavcodec/mpegvideo_enc.c
	libavcodec/put_bits.h
	libavcodec/ratecontrol.c
	libavcodec/wmaenc.c
	libavdevice/timefilter.c
	libavformat/asfdec.c
	libavformat/avidec.c
	libavformat/avienc.c
	libavformat/flvenc.c
	libavformat/utils.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-10-01 16:12:38 +02:00
Diego Biurrun
1218777ffd avcodec: Convert some commented-out printf/av_log instances to av_dlog 2012-10-01 10:24:28 +02:00
Diego Biurrun
6f6b0311a3 avcodec: Drop some silly commented-out av_log() invocations 2012-10-01 10:24:28 +02:00
Michael Niedermayer
7a4e30f3b6 h264_cabac: switch to av_assert
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-07-26 16:33:17 +02:00
Michael Niedermayer
244682dd08 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  log: Only include unistd.h if configure found it
  ape: create audio stream before reading tags.
  mov: make a length variable larger.
  image2: Add "start_number" private option to the demuxer
  image2: Add "start_number" private option to the muxer
  avconv: remove a forgotten debugging printf.
  avconv: use more descriptive names for hardcoded filters.
  avconv: remove redundant handling of async.
  doc/filters: fix typo.
  h264: use asm cabac reader under a generic condition

Conflicts:
	ffmpeg.c
	libavformat/img2dec.c
	libavformat/img2enc.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-06-24 21:34:54 +02:00
Mans Rullgard
0b6f973635 h264: use asm cabac reader under a generic condition
This removes a dependency on implementation details from generic
code and allows easy addition of the equivalent optimisation for
other architectures than x86.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2012-06-23 22:14:21 +01:00
Roland Scheidegger
82c71913e4 h264: new assembly version of get_cabac for x86_64 with PIC
This adds a hand-optimized assembly version for get_cabac much like the
existing one, but it works if the table offsets are RIP-relative.
Compared to the non-RIP-relative version this adds 2 lea instructions
and it needs one extra register.
There is a surprisingly large performance improvement over the c version (more
so than the generated assembly seems to suggest) just in get_cabac, I measured
roughly 40% faster for get_cabac on a K8. However, overall the difference is
not that big, I measured roughly 5% on a test clip on a K8 and a Core2.
Hopefully it still compiles on x86 32bit...
Now that only one table is used, there's some chance even darwin as compiles
this (apparently the label arithmetic used previously doesn't work if it
involves symbols defined in a different file, thanks to Ronald S. Bultje for
helping me with this).

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-28 20:02:27 +02:00
Roland Scheidegger
7f668cd2b5 h264: use one table instead of several for cabac functions
The reason is this is easier for PIC code (in particular on darwin...).
Keep the old names as pointers (static in cabac_functions.h so gcc
knows these are just immediate offsets) so the c code can nicely stay the same
(alternatively could use offsets directly in the functions needing the
tables). This should produce the same code as before with non-pic and better
code (confirmed) with pic.

The assembly uses the new table but still won't work for PIC case.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-28 20:02:27 +02:00
Roland Scheidegger
9b9df1cdff h264: new assembly version of get_cabac for x86_64 with PIC
This adds a hand-optimized assembly version for get_cabac much like the
existing one, but it works if the table offsets are RIP-relative.
Compared to the non-RIP-relative version this adds 2 lea instructions
and it needs one extra register. get_cabac() gets about 40% faster, for
an overall speedup of about 5%.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-28 09:43:25 -07:00
Roland Scheidegger
14e9ffc1e4 h264: use one table instead of several for cabac functions
The reason is this is easier for PIC code (in particular on darwin...).
Keep the old names as pointers (static in cabac_functions.h so gcc
knows these are just immediate offsets) so the c code can nicely stay the same
(alternatively could use offsets directly in the functions needing the
tables). This should produce the same code as before with non-pic and better
code (confirmed) with pic.

The assembly uses the new table but still won't work for PIC case.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-04-28 08:26:12 -07:00
Michael Niedermayer
9849515214 Revert "h264: assembly version of get_cabac for x86_64 with PIC (v4)"
This broke compilation on darwin, revert until a better solution is found.

This reverts commit a812b599b5.
2012-04-21 02:09:27 +02:00
Roland Scheidegger
a812b599b5 h264: assembly version of get_cabac for x86_64 with PIC (v4)
This adds a hand-optimized assembly version for get_cabac much like the
existing one, but it works if the table offsets are RIP-relative.
Compared to the non-RIP-relative version this adds 2 lea instructions
and it needs one extra register.
There is a surprisingly large performance improvement over the c version (more
so than the generated assembly seems to suggest) just in get_cabac, I measured
roughly 40% faster for get_cabac on a K8. However, overall the difference is
not that big, I measured roughly 5% on a test clip on a K8 and a Core2.
Hopefully it still compiles on x86 32bit...
v2: incorporated feedback from Loren Merritt to avoid rip-relative movs
for every table, and got rid of unnecessary @GOTPCREL.
v3: apply similar fixes to the the decode_significance functions, and use
same macro arguments for non-pic case.
v4: prettify inline asm arguments, add a non-fast-cmov version (as I expect
the c code to be faster otherwise since both cmov and sbb suck hard on a
Prescott, even can't construct the mask with a 64bit shift as that's just as
terrible - it's quite difficult to find usable instructions on that chip...).
This is tested to work but not on a P4, in theory it _should_ be fast there.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-21 00:27:06 +02:00
Michael Niedermayer
2c5a2958e9 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  h264: Factorize declaration of mb_sizes array.
  vsrc_buffer: when no frame is available, return an error instead of segfaulting.
  configure: add dl to frei0r extralibs.
  dsputil x86: use SSE float instruction instead of SSE2 integer equivalent
  dsputil x86: remove deprecated parameter from scalarproduct_int16 prototype
  vp8dsp x86: perform rounding shift with a single instruction
  fate: add BMP tests.
  swscale: handle complete dimensions for monoblack/white.
  aacenc: Mark deinterleave_input_samples argument as const.
  vf_unsharp: Mark readonly variable as const.
  h264: fix 4:2:2 PCM-macroblocks decoding

Conflicts:
	configure
	libavcodec/h264.h
	libavcodec/x86/dsputil_mmx.c
	libavfilter/vf_unsharp.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-04-05 22:26:50 +02:00
Diego Biurrun
0becb07842 h264: Factorize declaration of mb_sizes array. 2012-04-05 17:17:22 +02:00
Ronald S. Bultje
63a1b481f6 h264: fix cabac-on-stack after safe cabac reader. 2012-03-28 16:35:42 -07:00
Michael Niedermayer
79ae084e9b Merge remote-tracking branch 'qatar/master'
* qatar/master: (58 commits)
  amrnbdec: check frame size before decoding.
  cscd: use negative error values to indicate decode_init() failures.
  h264: prevent overreads in intra PCM decoding.
  FATE: do not decode audio in the nuv test.
  dxa: set audio stream time base using the sample rate
  psx-str: do not allow seeking by bytes
  asfdec: Do not set AVCodecContext.frame_size
  vqf: set packet parameters after av_new_packet()
  mpegaudiodec: use DSPUtil.butterflies_float().
  FATE: add mp3 test for sample that exhibited false overreads
  fate: add cdxl test for bit line plane arrangement
  vmnc: return error on decode_init() failure.
  libvorbis: add/update error messages
  libvorbis: use AVFifoBuffer for output packet buffer
  libvorbis: remove unneeded e_o_s check
  libvorbis: check return values for functions that can return errors
  libvorbis: use float input instead of s16
  libvorbis: do not flush libvorbis analysis if dsp state was not initialized
  libvorbis: use VBR by default, with default quality of 3
  libvorbis: fix use of minrate/maxrate AVOptions
  ...

Conflicts:
	Changelog
	doc/APIchanges
	libavcodec/avcodec.h
	libavcodec/dpxenc.c
	libavcodec/libvorbis.c
	libavcodec/vmnc.c
	libavformat/asfdec.c
	libavformat/id3v2enc.c
	libavformat/internal.h
	libavformat/mp3enc.c
	libavformat/utils.c
	libavformat/version.h
	libswscale/utils.c
	tests/fate/video.mak
	tests/ref/fate/nuv
	tests/ref/fate/prores-alpha
	tests/ref/lavf/ffm
	tests/ref/vsynth1/prores
	tests/ref/vsynth2/prores

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2012-03-01 03:17:11 +01:00