HPC System Administrator
Job Title: System Administrator – HPC Managed Services
Location: Singapore
Working Hours: 08:30 – 18:00, Monday to Friday (excluding Public Holidays)
Role Overview
We are seeking a skilled and motivated System Administrator to join our High-Performance Computing (HPC) Managed Services team. The successful candidate will be responsible for day-to-day operations, system maintenance, and lifecycle management of HPC environments, ensuring reliability, performance, and seamless service delivery for our users.
Key Responsibilities
- Manage the full OS lifecycle, including installation, configuration, patching, and upgrades for HPC systems.
- Support and maintain configuration management (CM) processes and automation frameworks.
- Provide operational monitoring, troubleshooting, and performance tuning across login nodes, compute nodes, and hyperconverged infrastructure (HCI).
- Collaborate with internal and external teams to handle incidents and resolve system-level issues promptly.
- Participate in the 24×7 on-call rotation to support critical (P1/P2) incident escalations.
- Ensure compliance with operational standards, documentation practices, and ITIL processes.
Requirements
- At least three (3) years of relevant experience in HPC, system administration, or infrastructure operations.
- Hands-on experience with Linux-based operating systems (preferably Red Hat Enterprise Linux).
- Strong technical background in cluster management, job scheduling, and resource monitoring tools.
- Proficiency in scripting (Bash, Python, or similar) for automation and maintenance tasks.
- Familiarity with configuration management tools (e.g., Ansible, Puppet, Chef).
- Excellent problem-solving, communication, and teamwork skills.
Minimum Certification Requirements
- ITIL Foundation, or an equivalent or higher certification.
- Red Hat Certified System Administrator (RHCSA), or an equivalent or higher certification.