Staff HPC Engineer - Singapore - ILLUMINA SINGAPORE PTE. LTD.


    2 weeks ago

    Description
    Roles & Responsibilities

    Position Summary / Role Description

    We are seeking a highly skilled and experienced Staff IT Engineer with a strong background in managing compute and storage resources in data center and cloud environments. The role includes managing Linux-based physical, cloud, and virtual servers, maintaining file and object storage, and working with High-Performance Computing (HPC) technology. This is an exciting opportunity to work with a range of cutting-edge technologies within our HPC team and gain exposure to cloud (AWS/Google Cloud), Big Data Analytics, and AI and Machine Learning infrastructure.

    Responsibilities include but are not limited to:

    • Collaborating closely with business users to gather detailed system and storage requirements.
    • Designing and implementing storage and compute solutions that meet business requirements and align with defined strategies and compliance requirements.
    • Providing expert technical support for server and storage platforms.
    • Conducting performance analysis, installation, and tuning of infrastructure including storage, compute, and operating systems.
    • Creating detailed documentation for implementations, guides, and training.
    • Developing server automation solutions using Ansible, Terraform, and other automation tools.
    • Maintaining best practices and global standards for managing systems and services across all environments.
    • Performing periodic performance reporting to support capacity planning.
    • Installing and maintaining security patches and firmware updates on operational and development systems.
    • Adhering to corporate change management policies and practices.
    • Participating in 24x7 on-call responsibilities in a global follow-the-sun model.
    • Working with management on individual and team-wide progress and improvements.

    Role Requirements:

    • 7+ years of experience supporting enterprise compute on Red Hat- and Debian-based Linux distributions, with expert-level troubleshooting and administration skills.
    • Experience in implementing, troubleshooting, and supporting scale-out storage systems such as NAS/file storage and S3 object storage.
    • 5+ years of experience working with a global team.
    • 2+ years of experience working within a data center or network operations center environment.
    • Knowledge of HPC and job schedulers with an emphasis on SGE or Slurm.
    • Strong scripting/programming skills, with fluency in Python and Bash.
    • Experience troubleshooting and diagnosing computer and server hardware.
    • Experience with Docker or Kubernetes is a plus.
    • Experience in one of the following: cloud migration, data center migration, disaster recovery, or application/server assessment and discovery.
    • Strong communication and interpersonal skills; comfortable presenting ideas, solutions, and concepts to others.
    • Self-sufficient and results-oriented, with the ability to take ownership of IT projects and deliver them on time and within budget.
    • Ability to work in a fast-paced environment, learn quickly, and master diverse technologies.

    Education:

    • Bachelor's degree in computer science or a related technical discipline.
    • AWS and/or GCP Associate certification is preferred.
    • Red Hat certification is preferred.
    Skills:

    Kubernetes
    Change Management
    Data Center
    Big Data Analytics
    Debian
    Computer Hardware
    Tuning
    RedHat
    Firmware
    Operating Systems
    Docker
    GCP
    Ansible
    S3
    Disaster Recovery
    Linux