From: Solving the maximum subsequence sum and related problems using BSP/CGM model and multi-GPU CUDA
Algorithm | Steps | Thrust / CUDA function |
---|---|---|
Algorithm 2 | (1) PSUM (2) SSUM | thrust::inclusive_scan |
Algorithm 2 | (1) PSUM (2) SSUM | thrust::for_each (to correct borders across multiple GPUs) |
Algorithm 2 | (3) SMAX (4) PMAX | thrust::inclusive_scan |
Algorithm 2 | (3) SMAX (4) PMAX | thrust::for_each (to correct borders across multiple GPUs) |
Algorithm 2 | (7) Compute array M | thrust::transform |
Algorithm 2 | (8) Maximum reduction | thrust::reduce |
Algorithm 4 | (2) Transformation | thrust::for_each |
Algorithm 4 | (7) Segmented scan | thrust::inclusive_scan_by_key |
Algorithm 4 | (8) Bitwise and operation | thrust::transform, thrust::find |
Algorithm 4 | (8) Bitwise AND operation | thrust::for_each (to correct borders across multiple GPUs) |
Algorithm 4 | (12) Find related solutions | findRelatedSolutions (Custom Kernel) |
– | Synchronization | __syncthreads (threads) |
– | Synchronization | cudaDeviceSynchronize (blocks and device-host) |
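
To make the Algorithm 2 rows concrete, the following is a minimal single-device sketch of how the scan, transform, and reduce steps might fit together. It is not the authors' code. It runs on the CPU and swaps in the sequential standard-library counterparts of the Thrust calls: `std::partial_sum` stands in for `thrust::inclusive_scan`, a plain loop for `thrust::transform`, and `std::max_element` for a maximum `thrust::reduce`. The function name `max_subsequence_sum` and the formula for `M` are assumptions based on the usual prefix/suffix formulation of the problem, since the table names the steps but does not define them. The multi-GPU border correction (`thrust::for_each`) is omitted.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: maximum sum over all non-empty contiguous subsequences.
// Assumed definitions (not given in the table):
//   PSUM[i] = a[0] + ... + a[i]          SSUM[i] = a[i] + ... + a[n-1]
//   SMAX[i] = max(PSUM[i..n-1])          PMAX[i] = max(SSUM[0..i])
//   M[i]    = PMAX[i] - SSUM[i] + SMAX[i] - PSUM[i] + a[i]
// M[i] is then the best sum among subsequences that contain element i.
long long max_subsequence_sum(const std::vector<long long>& a) {
    const std::size_t n = a.size();
    if (n == 0) return 0;  // assumption: an empty input yields 0

    std::vector<long long> psum(n), ssum(n), smax(n), pmax(n), m(n);
    auto max_op = [](long long x, long long y) { return std::max(x, y); };

    // (1) PSUM: inclusive scan with addition, left to right.
    std::partial_sum(a.begin(), a.end(), psum.begin());
    // (2) SSUM: the same scan over reversed iterators gives suffix sums.
    std::partial_sum(a.rbegin(), a.rend(), ssum.rbegin());
    // (3) SMAX: inclusive scan with max over PSUM, right to left.
    std::partial_sum(psum.rbegin(), psum.rend(), smax.rbegin(), max_op);
    // (4) PMAX: inclusive scan with max over SSUM, left to right.
    std::partial_sum(ssum.begin(), ssum.end(), pmax.begin(), max_op);

    // (7) Compute array M element-wise (a transform in the Thrust version).
    for (std::size_t i = 0; i < n; ++i) {
        m[i] = pmax[i] - ssum[i] + smax[i] - psum[i] + a[i];
    }

    // (8) Maximum reduction over M.
    return *std::max_element(m.begin(), m.end());
}
```

Every step is either a scan, an element-wise map, or a reduction, which is why the whole of Algorithm 2 can be expressed with the three Thrust primitives listed in the table. A GPU port of this sketch would presumably replace the `std::vector` buffers with `thrust::device_vector` and the loop with `thrust::transform` over a zipped or indexed range.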