Reliability Engineer - Singapore - NTUC Enterprise Nexus Co-operative Limited

Posted by: Wei Jie, beBee Recruiter


Description

COMPANY DESCRIPTION
NTUC Enterprise Co-operative Limited is the holding entity and single largest shareholder of the NTUC group of Social Enterprises.

We aim to create a greater social force to do good by harnessing the capabilities of the social enterprises to meet pressing social needs in areas like health and eldercare, childcare, daily essentials, cooked food, and financial services.

Serving over two million customers, NTUC Enterprise wants to enable and empower all in Singapore to live better and more meaningful lives.


The NTUC Enterprise Centre of Excellence for Data, Digitalisation and Technology leads the transformation of the NTUC Social Enterprises by leveraging digital technologies to become more nimble, adaptable, and innovative in today's digital age.

The NTUC Enterprise Centre of Excellence for Data, Digitalisation and Technology has been registered as NTUC Enterprise Nexus, a wholly owned subsidiary of NTUC Enterprise.


DESIGNATION
Reliability Engineer (AWS/GCP) (1-year contract)


RESPONSIBILITIES
NTUC Enterprise Nexus Co-operative Limited is currently hiring a Reliability Engineer to join the Digital Product Development organization.

The team combines software and system engineering to architect and run large-scale, distributed, and fault-tolerant systems.

The team's primary goal is to achieve product reliability sustainably through software engineering practices, architecture patterns, cultural adoption, process standardization, automation frameworks, education, and knowledge sharing.

The team practices industry reliability frameworks such as Service Level Objectives (SLOs) and Service Level Indicators (SLIs), release engineering, Infrastructure as Code (IaC), and operations automation.

The team empowers product developers throughout the Product Development Life Cycle to ensure product reliability. This includes, but is not limited to, building self-serve tools and processes and an infrastructure foundation that allows product teams to consistently deliver highly reliable systems.


As a Reliability Engineer, you will have the opportunity to manage the complex challenges of the Social Enterprise System that are unique to NTUC Enterprise Nexus Co-operative Limited, while using your expertise in coding, algorithms, complexity analysis, and large-scale system design.

You will be reporting to the Architecture & Reliability Lead.

  • Work with product developers to ensure that the software delivery pipeline is as reliable as possible.

  • Drive practices that ensure the reliability of the product.
  • Collaborate closely with product developers to ensure that the designed solution meets non-functional requirements such as availability, performance, security, and maintainability.

  • Responsible for availability, latency, performance, efficiency, monitoring, emergency response, and system capacity planning.

  • Improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.

  • Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, system capacity planning, and post-mortems.

  • Maintain services once they are launched by measuring and monitoring availability, latency, and overall system health.

  • Scale systems sustainably through mechanisms like automation; evolve systems by pushing for changes that improve reliability and velocity.

  • Practice sustainable incident response and blameless postmortems.
  • Document "tribal" knowledge.
  • Advocate for Reliability Engineering practices.

QUALIFICATIONS

  • Experience in analyzing and troubleshooting systems.
  • Understanding of infrastructure monitoring, logging, alerting, release, and configuration management.
  • Understanding of networking (e.g. TCP/IP, routing, network topology, load balancers, DNS, NTP).
  • Experience in one of the following: Python, Java, Go, Perl, Ruby, or shell scripting.
  • Experience in public cloud (AWS and/or GCP).
  • Experience with software deployment and/or orchestration technologies, e.g., Puppet, Chef, Salt, Ansible, Docker, Kubernetes, Terraform.
  • Experience in CI/CD tooling (e.g., JIRA, Git, Jenkins, Nexus).
  • Experience in standard IT security practices (e.g., encryption, certificates, key management).
  • Excellent communication and problem-solving skills, with strong attention to detail.
  • Flexibility to work non-business hours, which may include weekends and/or holidays.
  • Self-starter who is able to identify and perform tasks with minimal supervision.
