* gpu::LUT, uses device memory instead of host memory
* gpu::multiply, rounding mode for CV_8U depth
* moved TargetArchs and DeviceInfo to core
* fixed bug in GpuMat::copy with mask (incorrect index in function table)