Fast becomes Faster: A Full OpenCL rewrite of Corrfunc

CI: Dr. Manodeep Sinha

Correlation functions of galaxies have been used for decades to constrain the underlying matter density. A correlation function can answer a broad set of questions, ranging from the precise values of cosmological parameters to how galaxies populate dark matter halos. Computing one requires the pairwise separation for every pair of points drawn from two sets, so the cost scales as O(N^2). As galaxy surveys (or theoretical models) get bigger, the required computing time becomes a bottleneck for the analysis pipeline. This bottleneck already exists for current galaxy surveys such as the Dark Energy Survey (DES, hundreds of millions of galaxies) and will become acute for future surveys targeting billions of galaxies (e.g., LSST, SKA). To alleviate this issue, we are requesting 13 weeks of ADACS development time to create a new OpenCL-based GPU code to compute correlation functions for observed galaxies. Computing all pairwise separations is effectively a matrix-matrix multiplication, and because GPU Basic Linear Algebra Subprograms (BLAS) libraries massively outperform their CPU counterparts, we expect significant speedups from rewriting the correlation function code for GPUs. We will use our significant experience with the open-source code Corrfunc, and the lessons from a previous ADACS development allocation, to create this new high-performance GPU code.
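
The link between pair counting and matrix multiplication can be illustrated with a minimal sketch. It rests on the identity |p - q|^2 = |p|^2 + |q|^2 - 2 p.q: over all pairs, the cross term p.q is the matrix product of the two coordinate arrays, which is the part a BLAS routine would compute. The function name `pair_counts`, its arguments, and the half-open binning convention are assumptions made for illustration. This is neither Corrfunc's kernel nor the proposed OpenCL design, and in floating point the identity can lose precision for very close pairs.

```python
from bisect import bisect_right


def pair_counts(points1, points2, bin_edges):
    """Count pairs (one point from each set) per separation bin.

    Brute force: every point in points1 is compared with every point in
    points2, so the cost is O(N1 * N2).
    """
    # Compare squared separations with squared edges: no sqrt per pair.
    sq_edges = [edge * edge for edge in bin_edges]
    counts = [0] * (len(bin_edges) - 1)

    # Squared norms are computed once per point, not once per pair.
    norms1 = [sum(c * c for c in p) for p in points1]
    norms2 = [sum(c * c for c in q) for q in points2]

    for p, norm_p in zip(points1, norms1):
        for q, norm_q in zip(points2, norms2):
            # Cross term p . q: taken over all pairs, this is the matrix
            # product X1 @ X2^T.
            dot = sum(a * b for a, b in zip(p, q))
            # Clamp at zero to absorb floating-point round-off.
            sq_sep = max(norm_p + norm_q - 2.0 * dot, 0.0)
            # Bin i covers sq_edges[i] <= sq_sep < sq_edges[i + 1].
            i = bisect_right(sq_edges, sq_sep) - 1
            if 0 <= i < len(counts):
                counts[i] += 1
    return counts
```

For example, `pair_counts([(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (0, 2, 0)], [0.5, 1.5, 2.5])` compares four pairs. The zero-separation pair falls below the first edge and is not counted.
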
Previously, we tried to add a GPU kernel within the existing Corrfunc framework. Although the port was successful, the ported kernel was slower than the CPU kernel because of fundamental differences in optimisation strategies for CPUs and GPUs. One of the recommendations from the Project Report was a complete rewrite targeting GPU architecture from the start. By creating an entirely new code, we will be free to address the specific performance pitfalls identified in the Project Report without being constrained by compatibility requirements with the existing Corrfunc framework.