HPC Software Engineer: cloud-based deployment for on-demand high-performance computing in Destination Earth

Job reference: VN21-58
Location: Reading, UK
Deadline for applications: 17/01/2022
Salary and Grade: Grade A2: £63,782.28, net annual basic salary + other benefits
Contract type: STF-PS
Department: Computing
Contract Duration: Two years

About ECMWF

ECMWF is the European Centre for Medium-Range Weather Forecasts. It is an intergovernmental organisation created in 1975 by a group of European nations and is today supported by 34 Member and Co-operating States, mostly in Europe. The Centre’s mission is to serve and support its Member and Co-operating States and the wider community by developing and providing world-leading global numerical weather prediction. ECMWF functions as a 24/7 research and operational centre with a focus on medium and long-range predictions and holds one of the largest meteorological archives in the world. The success of its activities relies primarily on the talent of its scientists, strong partnerships with its Member and Co-operating States and the international community, some of the most powerful supercomputers in the world, and the use of innovative technologies such as machine learning across its operations.

Over the years, ECMWF has also developed a strong partnership with the European Union, and for the past seven years has been an entrusted entity for the implementation and operation of the Climate and the Atmosphere Monitoring Services of the EU's Copernicus component of its Space Programme, as well as a contributor to the Copernicus Emergency Management Service. The collaboration does not stop there and includes other areas of work, including High Performance Computing and the development of digital tools that enable ECMWF to extend its provision of data and products covering weather, climate, air quality, fire and flood prediction and monitoring.

ECMWF has recently become a multi-site organisation, with its headquarters based since its creation in Reading, UK, its new data centre in Bologna, Italy, and new offices in Bonn, Germany.

For additional details, see www.ecmwf.int.

About DestinE

It is foreseen that ECMWF will be a major partner in the implementation of the Destination Earth (DestinE) initiative starting later in 2021, together with ESA and EUMETSAT as partners. The objective of the European Commission DestinE initiative is to deploy several highly accurate thematic digital replicas of the Earth, called Digital Twins.  The Digital Twins will help monitor and predict environmental change and human impact, in order to develop and test scenarios that would support sustainable development and corresponding European policies for the Green Deal. 

DestinE will thus contribute to revolutionising the European capability to monitor and predict our changing planet, complementing existing national and European efforts such as those provided by the national meteorological services and the Copernicus Services. It will be run in several phases, of which the first, the implementation phase, covers the period from end-2021 to mid-2024. Future phases are foreseen (subject to funding) that will operationalise the digital twins, scale up system production and add applications and new twin options.

DestinE covers several demanding digital technology aspects required to develop, implement and operate the two high-priority digital twins on weather induced and geophysical extremes and on climate change adaptation. ECMWF will be responsible for the delivery of these digital twins, which will rely on complex Earth-system simulation models, data assimilation methods for fusing simulations and observations through inverse modelling and the integration of observations and models from sectors such as water and food management, renewable energies and socio-economic risk and disaster management. 

These science components require advanced digital technology solutions to maximise the efficiency of computing and data handling on extreme-scale infrastructures, and to adapt and operate these infrastructures across different heterogeneous architectures within a federated framework. This federated framework includes the Core Platform and Data Lake developed, deployed and operated by ESA and EUMETSAT respectively.

The DestinE developments take forward the long-term investments of the ECMWF Member States in building a unique European prediction capability and will support the further advancement of Member States' services and Copernicus Services.

For more information on DestinE, see https://ec.europa.eu/digital-single-market/en/destination-earth-destine and https://www.ecmwf.int/en/about/what-we-do/environmental-services/destination-earth

Summary of the role

ECMWF has an exciting opportunity to help build the required digital infrastructure for DestinE in collaboration with partners throughout Europe. This role will support the delivery of containerised HPC workflows on third-party cloud-based compute platforms.

DestinE is a distributed system of autonomous service providers tied together by a specified set of interfaces. The anticipated work of this position explores novel pathways for HPC deployment and data production. It is envisaged that containerised HPC workflows will need to be developed and deployed on EuroHPC infrastructure.

The HPC applications team is part of the High-Performance Computing and Storage Section in ECMWF’s Computing Department. The team provides in-depth knowledge and expertise to support ECMWF developers, advising and assisting in writing, maintaining, debugging, and optimising the suites of demanding scientific codes used by ECMWF. The other two teams in the section are responsible for maintaining ECMWF’s petascale high-performance computing systems, helping developers in achieving the most efficient use of these systems and providing the online and archive data storage systems.  

Whilst this position will be based at the ECMWF HQ in Reading, United Kingdom, there will be strong collaboration with staff working on DestinE in Bonn, Germany and the Platform & Services teams in Bologna, Italy, and it is anticipated that visits to both sites will be required.

Main duties and key responsibilities

  • Supporting the adaptation of existing HPC codes and workflows on cloud-based systems
  • Designing and developing prototypes for containerised workflows (including complex MPI/hybrid intensive compute applications) on a range of HPC platforms (including EuroHPC systems)
  • Deploying, supporting and benchmarking the workflows on HPC and multi-cloud environments
  • Liaising closely with DestinE partners ESA and EUMETSAT as well as other partners and users to identify and resolve performance and functional issues with the HPC workflows
  • Developing tooling to ease the deployment of HPC applications and workflows across different cloud infrastructures

Personal attributes

  • Excellent interpersonal and communication skills
  • Ability and willingness to collaborate with internal and external experts
  • Strong analytical and problem-solving skills, with a proactive continuous improvement approach
  • Self-motivated, and able to work with minimal supervision
  • Ability to maintain effective communication and documentation with the rest of the team and a distributed project partner community
  • Dedication, passion, and enthusiasm to succeed both individually and within a team
  • Highly organised with the capacity to work on a diverse range of tasks to tight deadlines in a matrix management environment

Education, experience, knowledge and skills (including language)

Education:

  • A university degree (EQF Level 6) or equivalent industry experience

Demonstrable experience in some of the following is required:

  • Experience with HPC environments and the deployment of large-scale parallel applications
  • Experience in designing and developing applications in an operational Linux Cloud environment
  • Experience with developing and deploying to HPC clusters in a Cloud environment (e.g., AWS Parallel Cluster, Azure CycleCloud)
  • Good understanding of cloud architectures and cloud platforms
  • Experience with running scientific workloads on Cloud platforms
  • Proven track record in software engineering
  • Experience with batch schedulers (e.g., SLURM, PBS), parallel filesystems (e.g., Lustre, GPFS, BeeGFS), and parallel programming and profiling tools would be an advantage
  • Candidates must be able to work effectively in English and interviews will be conducted in English
  • Good knowledge of one of the Centre’s other working languages (French or German) is not required but would be an advantage

Demonstrable knowledge and skills in some of the following is required:

  • Cloud Native (e.g., Kubernetes, Docker, Singularity)
  • Cloud IaaS (e.g., Amazon, Google, Microsoft, Oracle)
  • Good programming and scripting skills (any higher-level language, e.g., C/C++, Fortran, Java, Python, Julia, Bash)

Other information

Grade remuneration

The successful candidate will be recruited at the A2 grade, according to the scales of the Co-ordinated Organisations, and the annual basic salary will be £63,782.28 net of tax. ECMWF also offers a generous benefits package, including a flexible teleworking policy. The position is assigned to the employment category STF-PS as defined in the ECMWF Staff Regulations. Full details of salary scales and allowances are available on the ECMWF website at www.ecmwf.int/en/about/jobs, including the ECMWF Staff Regulations and the terms and conditions of employment.

Starting date: As soon as possible.

Length of contract: The contract duration is expected to be two years. The DestinE Contribution Agreement is likely to be divided into phases, the first of which will last approximately two years. There may be the possibility of further contract extensions in the future depending on requirements and funding availability.

Location: The position will be located at ECMWF's Headquarters in Reading, UK.

As a multi-site organisation, ECMWF has adopted a hybrid organisation model which gives staff the flexibility to mix office working and teleworking.

Successful applicants and members of their family forming part of their households will be exempt from immigration restrictions.

Interviews by videoconference (via Teams) are expected to take place in February 2022.

Who can apply

Applicants are invited to complete the online application form by clicking on the apply button below.

At ECMWF, we consider an inclusive environment as key for our success. We are dedicated to ensuring a workplace that embraces diversity and provides equal opportunities for all, without distinction as to race, gender, age, marital status, social status, disability, sexual orientation, religion, personality, ethnicity and culture. We value the benefits derived from a diverse workforce and are committed to having staff that reflect the diversity of the countries that are part of our community, in an environment that nurtures equality and inclusion.

Applications are invited from nationals of ECMWF Member States and Co-operating States, as well as from nationals of all EU Member States.

ECMWF Member and Co-operating States are: Austria, Belgium, Bulgaria, Croatia, Czech Republic, Denmark, Estonia, Finland, France, Georgia, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Latvia, Lithuania, Luxembourg, Montenegro, Morocco, the Netherlands, Norway, North Macedonia, Portugal, Romania, Serbia, Slovakia, Slovenia, Spain, Sweden, Switzerland, Turkey and the United Kingdom.

Applications from nationals of other countries may be considered in exceptional cases.

Published: 16 December 2021

Please mention EARTHWORKS when responding to this advertisement.