Site Reliability Engineer - Singapore - Morgan McKinley
No more applications are being accepted for this job.
Description
Our client, a global technology firm in Singapore, is looking for an experienced Site Reliability Engineer to join their dynamic team.
Responsibilities:
- Create and implement solutions to automate the technical operations of large systems, collaborating with teams to enhance stability throughout the software development process.
- Lead efforts to improve the stability of payment systems, including monitoring, log management, and creating diagnostic tools.
- Conduct regular drills and develop plans for quick service restoration during incidents, participating in on-call rotations as needed.
- Define metrics to evaluate system performance and runtime, improving observability. Plan system capacity to accommodate business growth and promotions.
- Analyze production incidents to establish best practices for a highly available payment architecture.
Requirements:
- At least 3 years of relevant work experience with large-scale systems.
- Bachelor's degree in Computer Science or a related technical discipline, or equivalent practical experience.
- Solid knowledge of computer science fundamentals, including the principles of operating systems, storage, and networking.
- Software development experience in at least one programming language, such as Java, Go, C++, C#, or Python.
- Familiarity with IaC, CI/CD, and observability concepts and tools is an added advantage.
- Experience with Redis, RocketMQ, MySQL, Nginx, Kubernetes, Docker, etc. is a plus.
Morgan McKinley Pte Ltd
Ho Ji Keat Joey (HE ZIJIE)
EA Licence No: 11C5502
EA Registration No. R1983255