The Universidad Carlos III de Madrid (UC3M) is coordinating the European research project ADMIRE, financed by the European High Performance Computing Joint Undertaking (EuroHPC JU), whose main objective is to promote faster and more efficient data-intensive applications. It will develop an adaptive storage system and a clearly defined programming interface for optimising machine learning and data-intensive applications.
One of the problems with today's data-intensive applications on high-performance computers is that they have poor input/output (I/O) performance, which affects the runtime of applications handling massive amounts of data.
The aim of this project is to develop new system software for future Exascale computers in order to optimise massive data handling operations so that applications can run faster and therefore perform better. "It is quite a ground-breaking project when you look at what has traditionally happened in high-performance computing. This is because, for the first time, we are dealing with the optimisation and readjustment of the I/O system while the applications are running," says Jesús Carretero, a lecturer in UC3M’s Department of Computer Science and head of the ADMIRE project at the University.
The main technological challenge facing the ADMIRE research team is creating software that optimises I/O management, i.e., information flows. This software must be able to adapt dynamically to computational requirements and to the ways in which information is stored. This requires a global software layer that interacts with all the components of the system in order to coordinate them. Essentially, it must be able to manage the available storage resources efficiently and dynamically. The project intends to achieve this by applying machine learning at all levels of the system hierarchy.
The project aims to streamline all the data movement in a computer, as well as to minimise this movement between the storage system and the compute nodes. This requires creating a new layer of system software and adapting applications. "High performance computing typically involves what's called offline running. In other words, users can leave their jobs running, for hours or days, and after a certain amount of time it is completed and the results are delivered. What our project proposes is that, while these jobs are running, the operating system itself can optimise the performance of the system so that they run faster," says Carretero.