1499 Commits

Author SHA1 Message Date
Ilya Lavrenov
19e77e4787 convertTo from 8u 2015-01-12 10:59:29 +03:00
Ilya Lavrenov
b758dbd384 convertTo AVX2 2015-01-12 10:59:29 +03:00
Ilya Lavrenov
3a78a22733 convertScaleAbs for s8, f64 2015-01-12 10:59:29 +03:00
Ilya Lavrenov
5578088983 countNonZero 2015-01-12 10:59:28 +03:00
Ilya Lavrenov
972ff1d0c4 polarToCart 2015-01-12 10:59:28 +03:00
Ilya Lavrenov
0a5c9cf145 magnitude 64f 2015-01-12 10:59:28 +03:00
Ilya Lavrenov
6ab928fb39 phase 64f 2015-01-12 10:59:28 +03:00
Ilya Lavrenov
a2a8ba17fc compare 2015-01-12 10:59:28 +03:00
Ilya Lavrenov
8d48632ebe avx2 2015-01-12 10:59:28 +03:00
Joe Howse
379de5708f Fix shadowed variable warning 2015-01-05 10:56:46 -04:00
Ilya Lavrenov
68962adc54 SSE mul 2014-12-31 17:58:54 +03:00
Ilya Lavrenov
60f2f7898a SSE4.1 addWeighted for 16u 2014-12-31 17:55:30 +03:00
Vadim Pisarevsky
2f6db4dfac Merge pull request #3547 from ilya-lavrenov:ocl_setto 2014-12-31 09:30:40 +00:00
Vadim Pisarevsky
f792fdc3e0 Merge pull request #3559 from ilya-lavrenov:sse_dot_s8 2014-12-31 08:06:06 +00:00
Vladislav Vinogradov
b4e7ee46c6 fix compilation without CUDA 2014-12-30 11:06:33 +03:00
Vladislav Vinogradov
00e7816c1b add auxiliary functions to work with Input/Output arrays:
they allow performing asynchronous upload/download into a temporary buffer
to get a valid GpuMat object
2014-12-30 11:06:32 +03:00
Ilya Lavrenov
f57136fd79 SSE2 cv::Mat::dot 2014-12-30 00:34:09 +03:00
Ilya Lavrenov
f6b3bc01e5 addWeighted 2014-12-29 22:01:53 +03:00
Ilya Lavrenov
1af7d397d4 optimization of UMat::setTo 2014-12-29 13:34:21 +03:00
Vadim Pisarevsky
0ff67253f7 Merge pull request #3531 from jet47:cuda-core-refactoring 2014-12-26 12:12:42 +00:00
Vladislav Vinogradov
f36546dbd2 improve error reporting in _InputArray methods 2014-12-26 12:03:25 +03:00
Vladislav Vinogradov
f054d6316a add cuda::HostMem::getAllocator method
it allows using cudaHostAlloc methods for cv::Mat objects
2014-12-23 17:42:49 +03:00
Vladislav Vinogradov
53862687d5 rename CudaMem -> HostMem to better reflect its purpose 2014-12-23 17:42:49 +03:00
Vladislav Vinogradov
9210d8e542 move allocMatFromBuf function to farneback.cpp:
* it is the only place, where it is used
* no need to make this function public
2014-12-23 17:42:49 +03:00
Vladislav Vinogradov
68e08bbecd fix null stream initialization for multi-gpu systems 2014-12-23 17:41:24 +03:00
Vladislav Vinogradov
05d40946f3 move StackAllocator to cpp file
it is an internal class, no need to export it
2014-12-23 17:41:24 +03:00
Vladislav Vinogradov
7ed38b97c3 fix cuda::BufferPool deinitialization
The deinitialization of BufferPool internal objects is controlled by a global
object, but it depends on other global objects, which leads to errors
caused by the undefined deinitialization order of global objects.

I merged the initialization of the global objects into a single class, which
performs initialization and deinitialization in the correct order.
2014-12-23 17:41:24 +03:00
Chuanbo Weng
2d8c89c40b Remove unnecessary kercn limitation of 4.
When accessing global memory by DWORD4, memory bandwidth
can be fully utilized on Intel platforms. This patch lets
more image formats (e.g. 8UC4) be processed in DWORD4
per work-item. After applying this patch, 3 subcases of
./opencv_perf_core --gtest_filter=OCL_RepeatFixture_Repeat.Repeat/*
are sped up on an HD4000 graphics card with Beignet:
OCL_RepeatFixture_Repeat.Repeat/2, 64% improvement.
OCL_RepeatFixture_Repeat.Repeat/6, 50% improvement.
OCL_RepeatFixture_Repeat.Repeat/8, 56% improvement.

Signed-off-by: Chuanbo Weng <chuanbo.weng@intel.com>
2014-12-04 11:15:13 +08:00
Dmitry-Me
4ff8a3ad92 Fix incorrect size computation 2014-11-26 12:24:53 +03:00
Alexander Alekhin
f50f249f80 Merge pull request #3138 from alalek:icv_update 2014-11-06 15:58:14 +00:00
Alexander Karsakov
462c3c25a9 Removed incorrect use of rootn() and powr() in ocl_pow 2014-11-06 16:23:02 +03:00
Alexander Alekhin
4eb16122c0 ocl: change processing of OpenCL failures
disable "unwanted" messages
2014-11-05 19:44:36 +03:00
Alexander Alekhin
1c9f590f0d IPPICV: disable NormDiff_L1_16s_C1R for IPP/ICV 8.2/8.2.1 2014-11-05 13:26:23 +03:00
Ilya Lavrenov
5ca25ab8f0 cv::pow (integer power) 2014-11-01 13:19:51 +03:00
Ilya Lavrenov
ccdc71286c cv::polarToCart 2014-11-01 13:19:51 +03:00
Ilya Lavrenov
d5f006eee5 cv::magnitude; cv::corner** 2014-11-01 13:19:51 +03:00
Ilya Lavrenov
fb97273b3c cv::phase; cv::cartToPolar 2014-11-01 13:19:51 +03:00
Alexander Alekhin
fd59551ff0 Merge pull request #3354 from vbystricky:oclopt_convertScaleAbs 2014-10-29 13:53:56 +00:00
ElenaGvozdeva
d88fdd0378 use LOCAL_SIZE+1 2014-10-28 15:18:31 +03:00
ElenaGvozdeva
65b8a1cb37 Some small fixes 2014-10-27 14:38:22 +03:00
Elena Gvozdeva
c5a2879ce0 use vectors 2014-10-27 14:38:22 +03:00
Elena Gvozdeva
2d89df1804 use local memory 2014-10-27 14:38:21 +03:00
Elena Gvozdeva
d78bc3c321 naive implementation 2014-10-27 14:38:21 +03:00
Alexander Alekhin
dee56598e9 Merge pull request #3369 from vbystricky:fix_scaleAdd 2014-10-27 10:03:29 +00:00
Alexander Alekhin
1f08d8cb6f Merge pull request #3367 from akarsakov:ocl_image2d 2014-10-24 16:01:21 +00:00
vbystricky
8466911ad0 Move _dst.create() to the beginning of scaleAdd function 2014-10-24 18:27:47 +04:00
Alexander Karsakov
237cb93143 Added extra checks to ocl::Image2D 2014-10-24 15:04:42 +03:00
Alexander Alekhin
579a7fff6d ocl: restore clFinish() in unmap() for AMD devices
This reverts commit 7d91b8efcd.
2014-10-24 14:29:38 +04:00
Alexander Karsakov
3a263c6326 Added tests for Image2D 2014-10-23 14:23:37 +03:00
vbystricky
a8aa6381d9 Optimize OpenCL version of convertScaleAbs function 2014-10-21 19:20:20 +04:00
ElenaGvozdeva
070e5ec042 Changed predictOptimalVectorWidth function; it is now possible to choose the vector size. 2014-10-21 13:13:15 +03:00
Vadim Pisarevsky
926b64fff7 Merge pull request #3292 from mshabunin:fix-ios-warnings 2014-10-20 06:41:51 +00:00
Vadim Pisarevsky
d2b9dc5530 quickly corrected the previous refactoring of features2d: moved from set(SOME_PROP, val) to setSomeProp(val) 2014-10-18 20:44:26 +04:00
Maksim Shabunin
ef3d02214b Fixing iOS clang warnings, part 2 2014-10-17 18:14:54 +04:00
Vadim Pisarevsky
01d3848f17 all the tests now pass except for MSER 2014-10-17 14:56:58 +04:00
Pavel Vlasov
45958eaabc Implementation detector and selector for IPP and OpenCL;
IPP can be switched on and off at runtime;

Optional implementation collector was added (switched off by default in CMake). It gathers data on the implementations used in functions and reports this info through the performance TS;

TS modifications for implementations control;
2014-10-15 14:24:41 +04:00
Vadim Pisarevsky
a798386660 Merge pull request #3326 from ilya-lavrenov:neon_canny 2014-10-11 17:58:24 +00:00
Vadim Pisarevsky
a3916113b9 Merge pull request #3254 from ilya-lavrenov:neon_scale_add 2014-10-10 14:26:14 +00:00
Ilya Lavrenov
5f23d99918 the remaining modes of cv::Mat::convertTo 2014-10-10 14:10:50 +00:00
Ilya Lavrenov
4babecf3b0 fixes for cv::addWeighted and cv::Mat::dot 2014-10-09 12:55:52 +00:00
vbystricky
1d280352f4 Add code to print errors from running OpenCL kernels 2014-10-09 13:59:38 +04:00
Ilya Lavrenov
00f16e9178 neon 2014-10-03 08:43:02 +00:00
Ilya Lavrenov
be3efdf274 cv::sum refactoring 2014-09-30 14:36:21 +00:00
Ilya Lavrenov
a3e56114d1 cv::multiply 2014-09-30 14:20:22 +00:00
Ilya Lavrenov
1c491c42cd fix for cornerHarris 2014-09-29 14:59:46 +00:00
Ilya Lavrenov
bbc161e1cb fix for cv::Mat::convertTo with scale 2014-09-28 14:51:30 -07:00
Ilya Lavrenov
f50f0ba63e cv::norm 2014-09-28 07:28:33 -07:00
Ilya Lavrenov
44ea50f1c4 cv::countNonZero 2014-09-28 07:06:53 -07:00
Ilya Lavrenov
34a571d37f cv::Mat::dot 2014-09-28 05:00:22 -07:00
Ilya Lavrenov
e46332a183 cv::Mat::convertTo with scale and shift 2014-09-28 03:49:56 -07:00
Ilya Lavrenov
74e60e44ad cv::compare 2014-09-28 02:41:08 -07:00
Ilya Lavrenov
857a2d5bfd cv::addWeighted 2014-09-28 01:11:07 -07:00
Maksim Shabunin
047abb0050 Merge pull request #3258 from ilya-lavrenov:neon_convert 2014-09-26 09:27:16 +00:00
Ilya Lavrenov
345b1369be correct neon rounding 2014-09-25 07:54:52 +00:00
Ilya Lavrenov
5d018c090f Neon optimization of cv::scaleAdd (CV_32F) 2014-09-23 21:16:29 +04:00
Ilya Lavrenov
4b3f2c1972 Neon optimization of Mat::convertTo 2014-09-23 15:06:17 +00:00
Vadim Pisarevsky
281ce441a8 Merge pull request #3250 from ilya-lavrenov:neon_convert_scale_abs 2014-09-23 07:15:24 +00:00
Ilya Lavrenov
515be70867 Neon optimization of cv::convertScaleAbs 2014-09-22 15:47:46 +00:00
Ilya Lavrenov
27b933ba5a Neon optimization of cv::sum 2014-09-22 09:22:03 +00:00
Vadim Pisarevsky
06e55ddf38 Merge pull request #2893 from ilya-lavrenov:tapi_vector_width_intel 2014-09-18 12:05:24 +00:00
Vadim Pisarevsky
4057e27539 Merge pull request #3126 from avdmitry:move_KDTree_to_ml 2014-09-14 18:57:23 +00:00
Alexander Karsakov
c942c6539a Remove mul24 since id can be larger than 2^23 2014-09-08 13:11:58 +04:00
Vadim Pisarevsky
26c284b225 Merge pull request #3167 from akarsakov:ocl_rm_clFinish 2014-09-04 17:00:10 +00:00
Vadim Pisarevsky
64a53de27d Merge pull request #3185 from ElenaGvozdeva:ocl_norm 2014-09-04 08:53:47 +00:00
Ilya Lavrenov
98e7d4ceec changed optimal vector width for Intel 2014-09-04 11:59:41 +04:00
Elena Gvozdeva
9fe11db7e2 disabled IPP acceleration for 3-channel norms and for CV_8S only for APPLE 2014-09-04 10:38:45 +04:00
Alexander Karsakov
7d91b8efcd Removed redundant clFinish() after clEnqueueUnmapMemObject()
sss
2014-09-03 14:54:05 +04:00
Alexander Karsakov
f57a4bf87b Disabled minMaxIdx for 32FC1 since it occasionally fails on AMD devices (e.g. A10-6800K) 2014-09-03 14:36:51 +04:00
Vadim Pisarevsky
0276cc90c2 Merge pull request #3184 from ilya-lavrenov:arm 2014-09-03 05:40:19 +00:00
Ilya Lavrenov
5d3a128cd3 NEON impl of cv::convertScaleAbs CV_32F 2014-09-01 17:04:36 +00:00
Vadim Pisarevsky
3bafe64666 Merge pull request #3170 from ElenaGvozdeva:ocl_fix 2014-09-01 10:40:02 +00:00
Vadim Pisarevsky
1f85ffa11b Merge pull request #3166 from akarsakov:ocl_native_sqrt 2014-09-01 10:36:50 +00:00
Alexander Alekhin
4d474d40e7 Merge pull request #3171 from akarsakov:amd_fft_fix 2014-08-29 16:28:31 +00:00
Ilya Lavrenov
71ec6144bd attempt to fix compilation of OpenCL cv::transpose for AMD 2014-08-29 16:59:30 +04:00
Alexander Karsakov
d4e6812be2 Added AmdFft version check to be sure that AmdFft binaries are available 2014-08-29 14:23:18 +04:00
Elena Gvozdeva
31ac73c315 fix for cv::memopTypeToStr 2014-08-29 14:18:52 +04:00
Alexander Alekhin
57fec2f2da OCL: enable clAmdFftGetVersion 2014-08-29 13:45:04 +04:00
Alexander Karsakov
491bf41356 Disabled native_sqrt for double, since it may not be implemented and gives a compilation error. 2014-08-28 17:01:49 +04:00
Alexander Alekhin
b332152bef Merge pull request #2956 from ilya-lavrenov:tapi_accumulate 2014-08-28 09:08:51 +00:00
Vadim Pisarevsky
4d9d7e6ded Merge pull request #3160 from akarsakov:ocl_dft_double_support 2014-08-27 10:06:34 +00:00