Parallelisation of the SoFiA source finder

Figure 10 from Koribalski et al. (2020;

The WALLABY survey will soon begin to map the entire sky south of a declination of +30° in the 21-cm line emission of neutral hydrogen (H I) using the ASKAP telescope in Western Australia. The aim of WALLABY is to study the H I gas content of the nearby Universe and its role in the formation and evolution of galaxies. WALLABY is expected to produce about 1 PB of H I imaging data and detect up to 500,000 galaxies out to a redshift of 0.26. A small team led by Dr. Tobias Westmeier has developed a fully automated source finding pipeline, SoFiA, to help with the epic task of detecting galaxies in these huge quantities of data and determining their basic physical properties.

While SoFiA’s algorithms are powerful enough to reliably detect galaxies in WALLABY data cubes, ADACS has now made an important contribution to ensuring that SoFiA is also fast enough to do so in real time. This has been achieved via an ADACS Software Support Project, through identification and multi-threading of the most time-consuming algorithms in SoFiA using OpenMP. This allows the software to utilise all available CPUs on a machine, pushing the overall processing time down to an acceptable level.
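To illustrate the general idea of multi-threading a hot loop with OpenMP, here is a minimal sketch in C. It is not taken from the SoFiA source code: the function name `channel_mean_square`, the flat channel-by-channel array layout and the choice of statistic are all assumptions made for the example. It relies on the fact that each spectral channel of a cube can be processed independently, so the outer loop can be shared among threads.

```c
#include <stddef.h>

/* Hypothetical helper (not part of SoFiA): mean of the squared pixel values
 * of each spectral channel. The cube is assumed to be stored channel by
 * channel in one flat array of n_chan * n_pix floats. */
void channel_mean_square(const float *cube, size_t n_chan, size_t n_pix,
                         double *out)
{
    /* Channels do not depend on each other, so OpenMP can hand different
     * iterations of the outer loop to different threads. If the compiler is
     * run without OpenMP support, the pragma is ignored and the loop simply
     * runs serially with the same result. */
    #pragma omp parallel for schedule(static)
    for (long c = 0; c < (long)n_chan; ++c) {
        const float *plane = cube + (size_t)c * n_pix;
        double sum_sq = 0.0;  /* private to each iteration, so no data race */

        for (size_t i = 0; i < n_pix; ++i) {
            sum_sq += (double)plane[i] * (double)plane[i];
        }

        /* Each thread writes only to its own channel's slot. */
        out[c] = (n_pix > 0) ? sum_sq / (double)n_pix : 0.0;
    }
}
```

With GCC or Clang, a sketch like this would typically be compiled with the `-fopenmp` flag, and the number of threads can be limited through the standard `OMP_NUM_THREADS` environment variable.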

Figure 1: Memory usage (in multiples of the data cube size) as a function of elapsed time for two runs of SoFiA on a 12.9 GB test data cube. The first run (teal) utilised just a single CPU and took more than half an hour to complete, while the second run (magenta) made use of multi-threading using 24 CPUs and completed in just 7 minutes.

Overall, ADACS has successfully reduced the execution time of SoFiA by a significant factor that scales with the number of CPUs available on the machine running SoFiA. Tests using 24 CPUs on a single node of the Hyades cluster at the International Centre for Radio Astronomy Research in Perth have resulted in a speed-up by almost a factor of 5 (see Fig. 1). This will allow the team to run the source finding pipeline on full-sized WALLABY data cubes in better than real time. As SoFiA is increasingly used by radio-astronomical survey teams around the world, the improved performance will also be of immediate benefit to other astronomy projects that share the burden of having to process large quantities of spectral-line data.
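As a rough plausibility check on these figures, and not a statement about how SoFiA was actually profiled, one can apply Amdahl's law. This assumes that the run splits cleanly into a fraction $p$ that parallelises perfectly and a serial remainder $1 - p$. Taking the measured speed-up as $S \approx 5$ on $N = 24$ CPUs:

```latex
S = \frac{1}{(1 - p) + \dfrac{p}{N}}
\quad\Longrightarrow\quad
p = \frac{1 - \dfrac{1}{S}}{1 - \dfrac{1}{N}}
  = \frac{1 - \tfrac{1}{5}}{1 - \tfrac{1}{24}}
  = \frac{0.8 \times 24}{23}
  \approx 0.83
```

Under that simplified model, roughly 83 per cent of the single-CPU run time would lie in the multi-threaded parts of the code, and adding further CPUs would push the speed-up towards an upper limit of about $1/(1 - p) \approx 6$.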

A paper describing the improved SoFiA pipeline is in preparation, and the source code has been made publicly available on GitHub at
