Aire Progress Update - January
Patricia Ternes
This is a short update on the progress of Aire, the University’s new HPC system.
Aire Update
Since our last update in November, the remaining Aire hardware has been installed. Unfortunately, we’re struggling to replace the faulty cooling components that we’ve mentioned before. The new date for the repair is 23 January 2025. Once this is done, we can complete the hardware acceptance tests. We currently have 58% of Aire hardware in use, which offers as much CPU as ARC4 and 4 times more GPU chips than ARC4.
The Research Computing team, working closely with early access test users, has made significant progress in testing and refining Aire. We’ve improved the storage layout by introducing a new structure and balancing quotas, enabling better utilisation of the infrastructure and fostering a collaborative environment. Applications, environments, and new compilers have been tested, and we are making adjustments to enhance portability and developing documentation to support users. Additionally, we’ve gathered valuable feedback on user requirements and are working to implement solutions or develop models that address these needs within a shared framework.
Aire Future State
Development will not stop once Aire is installed and in service. We are committed to continuously improving its security, resilience, and usability while ensuring it evolves to meet the needs of our academic community. By Easter 2025, we’ll upgrade Aire’s network infrastructure with new ACI-capable switches that enhance reliability and allow for more precise control over security settings, ensuring the system remains robust in a rapidly changing technological landscape.
To further strengthen security, we will expand multifactor authentication (MFA) to all connections to Aire, including those made on campus.
Beyond security and infrastructure, we’ll continue refining Aire’s system configurations to support evolving academic needs. This includes developing models for private nodes, team-managed software stacks, streamlined data workflows and management, teaching resource management, and special job queues. This continuous work will ensure Aire remains flexible and capable of accommodating the diverse and growing demands of our academic community.
ARC3 Storage Removal
As you know, ARC3 was removed from service at the end of 2024, and its CPU and GPU nodes were replaced with Aire hardware. Everyone was asked to delete or move their data by 29 November 2024.
Once the remaining Aire hardware is in service, we will confirm the dates for the ARC3 storage to be removed from campus.
In the meantime, if you need to recover data from ARC3 storage, please raise a service desk ticket as normal. Please note: this should be for exceptional circumstances only.
Please add any comments or questions to the Teams Channel. We will continue to keep you updated.
If you are interested in more information about Aire, please read our previous blog posts:
Author
Patricia Ternes
Research Software Engineer Manager