avcodec/huffman: replace qsort with AV_QSORT

ff_huff_build_tree uses qsort underneath. AV_QSORT is substantially
faster due to the inlining of the comparison callback. Furthermore, this
code is reasonably performance critical, since in e.g. the fraps codec,
ff_huff_build_tree is called on every frame. This routine is also called
in vp6 on every frame in some circumstances.
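
A minimal sketch of why a macro-based sort lets the compiler inline the
comparison, while qsort has to call it through a function pointer. It
assumes nothing about AV_QSORT's internals: DemoNode, demo_cmp and
DEMO_SORT are hypothetical names used only for illustration, and the
macro body is a plain insertion sort chosen for brevity. As in this
change, the macro takes the element type where qsort takes sizeof(type).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the Node type sorted in ff_huff_build_tree. */
typedef struct DemoNode {
    int16_t  sym;
    uint32_t count;
} DemoNode;

/* qsort()-style comparator: ascending by count. */
static int demo_cmp(const void *a, const void *b)
{
    const DemoNode *x = a, *y = b;
    return (x->count > y->count) - (x->count < y->count);
}

/*
 * Illustrative macro, NOT the AV_QSORT implementation: an insertion sort
 * that is expanded at the call site. Because cmp is named directly in the
 * expansion, the compiler sees a direct call and is free to inline it.
 */
#define DEMO_SORT(p, num, type, cmp)                          \
    do {                                                      \
        type  *p_ = (p);                                      \
        size_t n_ = (num);                                    \
        for (size_t i_ = 1; i_ < n_; i_++) {                  \
            type   tmp_ = p_[i_];                             \
            size_t j_   = i_;                                 \
            while (j_ > 0 && cmp(&p_[j_ - 1], &tmp_) > 0) {   \
                p_[j_] = p_[j_ - 1];                          \
                j_--;                                         \
            }                                                 \
            p_[j_] = tmp_;                                    \
        }                                                     \
    } while (0)

/* Sorts the same data both ways and checks that the results agree. */
static int demo_run(void)
{
    DemoNode a[] = { {0, 40}, {1, 5}, {2, 17}, {3, 1}, {4, 23} };
    DemoNode b[] = { {0, 40}, {1, 5}, {2, 17}, {3, 1}, {4, 23} };
    size_t n = sizeof(a) / sizeof(a[0]);

    qsort(a, n, sizeof(DemoNode), demo_cmp); /* comparator reached via pointer */
    DEMO_SORT(b, n, DemoNode, demo_cmp);     /* comparator named in the expansion */

    for (size_t i = 0; i < n; i++) {
        assert(a[i].count == b[i].count);
        assert(a[i].sym == b[i].sym);
    }
    return 0;
}
```

The demo uses distinct counts so that both sorts must produce the same
order; with equal keys the results could legitimately differ, since
qsort makes no stability guarantee.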

Sample benchmark (x86-64, Haswell, GNU/Linux), vp6 from FATE:
vp6 (old):
  78930 decicycles in qsort,       1 runs,      0 skips
  45330 decicycles in qsort,       2 runs,      0 skips
  27825 decicycles in qsort,       4 runs,      0 skips
  17471 decicycles in qsort,       8 runs,      0 skips
  12296 decicycles in qsort,      16 runs,      0 skips
   9554 decicycles in qsort,      32 runs,      0 skips
   8404 decicycles in qsort,      64 runs,      0 skips
   7405 decicycles in qsort,     128 runs,      0 skips
   6740 decicycles in qsort,     256 runs,      0 skips
   7540 decicycles in qsort,     512 runs,      0 skips
   9498 decicycles in qsort,    1024 runs,      0 skips
   9938 decicycles in qsort,    2048 runs,      0 skips
   8043 decicycles in qsort,    4095 runs,      1 skips

vp6 (new):
  15880 decicycles in qsort,       1 runs,      0 skips
  10730 decicycles in qsort,       2 runs,      0 skips
  10155 decicycles in qsort,       4 runs,      0 skips
   7805 decicycles in qsort,       8 runs,      0 skips
   6883 decicycles in qsort,      16 runs,      0 skips
   6305 decicycles in qsort,      32 runs,      0 skips
   5854 decicycles in qsort,      64 runs,      0 skips
   5152 decicycles in qsort,     128 runs,      0 skips
   4452 decicycles in qsort,     256 runs,      0 skips
   4161 decicycles in qsort,     511 runs,      1 skips
   4081 decicycles in qsort,    1023 runs,      1 skips
   4072 decicycles in qsort,    2047 runs,      1 skips
   4004 decicycles in qsort,    4095 runs,      1 skips
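
For reading the two tables, a small helper that turns a pair of
measurements into a speedup ratio. The function names are hypothetical;
the inputs are the first and last rows of each table above. The last
rows (8043 vs 4004 decicycles) work out to roughly 2x.

```c
#include <stdio.h>

/* Ratio of old to new cost; values above 1.0 mean the new code is faster. */
static double speedup(double old_decicycles, double new_decicycles)
{
    return old_decicycles / new_decicycles;
}

/* Prints the ratios for the first and last rows of the benchmark tables. */
static void print_speedups(void)
{
    printf("1 run:     %.2fx\n", speedup(78930, 15880));
    printf("4095 runs: %.2fx\n", speedup(8043, 4004));
}
```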

Reviewed-by: Timothy Gu <timothygu99@gmail.com>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Ganesh Ajjanagadde 2015-10-22 19:54:31 -04:00
parent 104f8ea873
commit 9bc3d3355f

@@ -26,6 +26,7 @@
 #include <stdint.h>
+#include "libavutil/qsort.h"
 #include "avcodec.h"
 #include "get_bits.h"
 #include "huffman.h"
@@ -170,7 +171,7 @@ int ff_huff_build_tree(AVCodecContext *avctx, VLC *vlc, int nb_codes, int nb_bit
                "Tree construction is not possible\n");
         return -1;
     }
-    qsort(nodes, nb_codes, sizeof(Node), cmp);
+    AV_QSORT(nodes, nb_codes, Node, cmp);
     cur_node = nb_codes;
     nodes[nb_codes*2-1].count = 0;
     for (i = 0; i < nb_codes * 2 - 1; i += 2) {