avcodec/huffman: replace qsort with AV_QSORT
ff_huff_build_tree uses qsort underneath. AV_QSORT is substantially faster due to the inlining of the comparison callback. Furthermore, this code is reasonably performance critical, since in e.g the fraps codec, ff_huff_build_tree is called on every frame. This routine is also called in vp6 on every frame in some circumstances. Sample benchmark (x86-64, Haswell, GNU/Linux), vp6 from FATE: vp6 (old): 78930 decicycles in qsort, 1 runs, 0 skips 45330 decicycles in qsort, 2 runs, 0 skips 27825 decicycles in qsort, 4 runs, 0 skips 17471 decicycles in qsort, 8 runs, 0 skips 12296 decicycles in qsort, 16 runs, 0 skips 9554 decicycles in qsort, 32 runs, 0 skips 8404 decicycles in qsort, 64 runs, 0 skips 7405 decicycles in qsort, 128 runs, 0 skips 6740 decicycles in qsort, 256 runs, 0 skips 7540 decicycles in qsort, 512 runs, 0 skips 9498 decicycles in qsort, 1024 runs, 0 skips 9938 decicycles in qsort, 2048 runs, 0 skips 8043 decicycles in qsort, 4095 runs, 1 skips vp6 (new): 15880 decicycles in qsort, 1 runs, 0 skips 10730 decicycles in qsort, 2 runs, 0 skips 10155 decicycles in qsort, 4 runs, 0 skips 7805 decicycles in qsort, 8 runs, 0 skips 6883 decicycles in qsort, 16 runs, 0 skips 6305 decicycles in qsort, 32 runs, 0 skips 5854 decicycles in qsort, 64 runs, 0 skips 5152 decicycles in qsort, 128 runs, 0 skips 4452 decicycles in qsort, 256 runs, 0 skips 4161 decicycles in qsort, 511 runs, 1 skips 4081 decicycles in qsort, 1023 runs, 1 skips 4072 decicycles in qsort, 2047 runs, 1 skips 4004 decicycles in qsort, 4095 runs, 1 skips Reviewed-by: Timothy Gu <timothygu99@gmail.com> Reviewed-by: Michael Niedermayer <michael@niedermayer.cc> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
This commit is contained in:
parent
104f8ea873
commit
9bc3d3355f
@ -26,6 +26,7 @@
|
||||
|
||||
#include <stdint.h>
|
||||
|
||||
#include "libavutil/qsort.h"
|
||||
#include "avcodec.h"
|
||||
#include "get_bits.h"
|
||||
#include "huffman.h"
|
||||
@ -170,7 +171,7 @@ int ff_huff_build_tree(AVCodecContext *avctx, VLC *vlc, int nb_codes, int nb_bit
|
||||
"Tree construction is not possible\n");
|
||||
return -1;
|
||||
}
|
||||
qsort(nodes, nb_codes, sizeof(Node), cmp);
|
||||
AV_QSORT(nodes, nb_codes, Node, cmp);
|
||||
cur_node = nb_codes;
|
||||
nodes[nb_codes*2-1].count = 0;
|
||||
for (i = 0; i < nb_codes * 2 - 1; i += 2) {
|
||||
|
Loading…
x
Reference in New Issue
Block a user