Visualisation for COMPAS – Dr. Robert Džudžar

For three months, from September to December 2020, I undertook an internship with ADACS. It fell in the period between successfully passing my PhD viva and finalising minor thesis amendments. The internship was slightly delayed by the COVID-19 situation in Victoria and meant working from home – going to the office wasn’t on the horizon. Working in a new environment with people I wouldn’t be able to talk to in person seemed quite daunting at first, but my outlook changed as soon as I started.

Placed in the team called ‘Baffled Boffins’ (yes, I had to Google what it means), the work began. Initially I started learning about Django, a Python-based open-source web framework used for developing secure and maintainable websites. The more I learned about Django, the more I got to know the team and the ongoing projects they were working on. This happened through daily Slack “stand-ups” and Zoom meetings. The positive and dynamic environment made me feel part of the team from the start, which had a great influence on my productivity.

After about two weeks, I was assigned a real task! The task was to build a dynamic and interactive visualisation for COMPAS, using Bokeh. COMPAS (Compact Object Mergers: Population Astrophysics and Statistics) is a publicly available rapid binary population synthesis code, and Bokeh is an interactive visualisation Python library for modern web browsers. Building visualisations is something that I love to do, and so the task was a perfect match for me. At the start, I had some prior knowledge of Bokeh, or so I thought – I soon realised how much I didn’t know. The learning curve was steep on all sides. Day after day, piece by piece, I made my first simple dynamic visualisation, where a user selects their desired “Group” along with “X” and “Y” properties, and they are displayed on the Figure. This mirrors the hierarchical structure of the COMPAS.hdf5 input file: the groups and properties on offer are read from the file itself. Almost immediately afterwards, I landed my very first ‘arc diff’, which marked my initial contribution to the project! And not only that! Since I was in charge of developing this part of the project myself, I was also showcasing progress and discussing future implementations with stakeholders!
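The Group/X/Y selection described above can be sketched without any plotting code. The snippet below is only an illustration under stated assumptions, not the code used in the project: it assumes the opened file behaves like a mapping of group names to mappings of property names (as an h5py file object does), and the helper names as well as the group and property names in the stand-in data are hypothetical.

```python
from typing import Dict, List, Mapping, Tuple


def list_options(h5_like: Mapping[str, Mapping[str, object]]) -> Dict[str, List[str]]:
    """Map each group name to the sorted names of the properties it holds."""
    return {group: sorted(h5_like[group].keys()) for group in h5_like}


def default_selection(options: Dict[str, List[str]], group: str) -> Tuple[str, str]:
    """Pick initial X and Y properties for a group.

    Uses the first two properties, or the first one twice if the
    group only has a single property.
    """
    props = options[group]
    if not props:
        raise ValueError(f"group {group!r} has no properties to plot")
    x = props[0]
    y = props[1] if len(props) > 1 else props[0]
    return x, y


# Hypothetical stand-in for an opened hierarchical file:
# group name -> property name -> column of values.
fake_file = {
    "GroupA": {"prop_b": [1.0, 2.0], "prop_a": [3.0, 4.0], "prop_c": [5.0, 6.0]},
    "GroupB": {"prop_z": [7.0]},
}

options = list_options(fake_file)

# When the user changes the "Group" dropdown, the "X" and "Y" dropdowns
# would be repopulated from options[group] and reset to a default pair.
x_default, y_default = default_selection(options, "GroupA")
```

In a dashboard, the callback attached to the “Group” dropdown would call something like `default_selection` and then redraw the Figure with the newly selected columns.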

Over time, the project grew, and you can see a final example in the “Gif”: the User selects properties, which are displayed as a hexbin plot. The Figure updates after every interaction. Next to the Figure, a Table shows how many values of the selected property are NaN or Inf. Radio buttons control custom filters; for example, a User can constrain “Core Mass” and see how the data changes. The User can also interact with the Figure itself, for example by zooming or by getting count statistics on hover. Buttons allow the User to i) refresh the current Figure setup, ii) reset the Figure and applied filters, or iii) change the colormap. The availability of the displayed filters depends on the input file, and even if a file lacks the properties needed for filtering, the rest of the dashboard still works. This is a great way to explore the data on the fly, and because it’s fast the User doesn’t have to wait for the Figure to update (the Gif shown has a loaded file of ~1.8 GB).
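The NaN/Inf Table and the range filters described above boil down to two small operations on a column of values. The sketch below uses only the Python standard library and is not the dashboard's actual code: the function names and sample numbers are hypothetical, and dropping non-finite values whenever a filter is active is an assumption made for the example.

```python
import math
from typing import Dict, List, Optional, Sequence


def count_non_finite(values: Sequence[float]) -> Dict[str, int]:
    """Count NaN, Inf and finite entries, as a summary table would report them."""
    counts = {"nan": 0, "inf": 0, "finite": 0}
    for v in values:
        if math.isnan(v):
            counts["nan"] += 1
        elif math.isinf(v):
            counts["inf"] += 1
        else:
            counts["finite"] += 1
    return counts


def range_mask(values: Sequence[float],
               low: Optional[float] = None,
               high: Optional[float] = None) -> List[bool]:
    """Return True for entries that are finite and lie within [low, high]."""
    mask = []
    for v in values:
        keep = math.isfinite(v)  # assumption: a filter drops NaN/Inf rows
        if keep and low is not None:
            keep = v >= low
        if keep and high is not None:
            keep = v <= high
        mask.append(keep)
    return mask


def apply_mask(values: Sequence[float], mask: Sequence[bool]) -> List[float]:
    """Keep only the entries whose mask value is True."""
    return [v for v, keep in zip(values, mask) if keep]


# Hypothetical sample columns; rows line up across columns.
core_mass = [1.2, float("nan"), 3.4, float("inf"), 0.5]
x_values = [10.0, 20.0, 30.0, 40.0, 50.0]

table = count_non_finite(core_mass)

# Constrain "Core Mass" to [1.0, 4.0] and filter another column by the same rows.
mask = range_mask(core_mass, low=1.0, high=4.0)
filtered_x = apply_mask(x_values, mask)
```

Building the mask from one property and applying it to every plotted column keeps the rows aligned, which is what lets a constraint on “Core Mass” change what appears on the X and Y axes.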

COMPAS demo

Besides the visualisation project, I was also involved in learning and setting up the basics for running a COMPAS job through the Django-Celery system. Celery is a task queue with a focus on real-time processing and task scheduling. This was challenging in many aspects, mostly because I had never come across anything like this before, but it was also very rewarding to learn something new.

Overall, the three months went by extremely fast. I worked with many amazing people, contributed to the project and learned heaps. I experienced how an Agile Scrum team functions, learned about Django, Celery, Phabricator, Poetry, WSL and COMPAS, and improved my Bokeh and Python skills. I also learned a lot from my teammates, to whom I am especially grateful for all their guidance and expertise. This internship with ADACS was an amazing experience and I recommend it to all PhD candidates who are keen to learn and apply new coding skills.
