Solving the maximum subsequence sum and related problems using BSP/CGM model and multi-GPU CUDA

Journal of the Brazilian Computer Society

Table 5 CS-4 running times (in milliseconds) of the MPI implementation

n	N16:P1	N16:P2	N16:P4	N16:P8	N32:P1	N32:P2	N32:P4	N32:P8
2²⁰	10.730	11.403	9.788	91.368	11.597	19.765	21.263	119.741
2²¹	21.403	15.286	12.880	83.876	16.562	21.004	18.488	79.335
2²²	99.957	23.938	18.761	16.064	24.395	25.461	21.225	23.279
2²³	82.786	40.277	29.675	21.406	45.941	33.450	26.846	28.399
2²⁴	175.851	75.340	51.073	33.305	82.706	52.493	38.760	33.242
2²⁵	357.882	139.970	90.844	53.800	146.140	93.536	59.839	46.379
2²⁶	748.831	263.430	161.765	100.714	263.629	164.268	93.531	66.719
2²⁷	1546.020	559.685	307.145	171.782	555.407	288.917	170.073	111.906
2²⁸	–	1331.303	2356.562	314.681	854.059	426.578	317.116	203.811
2²⁹	–	–	–	24301.072	–	4174.692	–	17766.883