Senior Site Reliability Engineer - Singapore - NodeFlair

    NodeFlair
    NodeFlair Singapore

    2 weeks ago

    Description

    We have been a global leader in cryptocurrency data monitoring since our establishment in 2014. We built the world's most extensive cryptocurrency data platform, which tracks over 10,000 tokens across 400+ exchanges and serves over 300 million page views in 100+ countries. We take pride in contributing significantly to the mainstream awareness, adoption, and education of cryptocurrency globally.

    Your Role and Responsibilities:

    As part of our team, you will:

    • Review architecture and software components with software engineers and architects, ensuring consistent best practices across all teams.
    • Own and ensure Service Level Objectives (SLOs) and Service Level Agreements (SLAs) are met, monitoring operational metrics and leading improvement plans.
    • Manage and audit security controls to meet enterprise requirements, collaborating with legal and compliance for risk management.
    • Conduct performance tests for large-scale events or critical releases.
    • Develop and implement Disaster Recovery plans and procedures, including data recovery and fault injection simulations on production replicas.
    • Lead incident response and post-mortems to resolve production issues, identify root causes, and prevent future occurrences.
    • Develop runbooks and other technical assets, completing periodic technical audits as required.
    • Perform day-to-day tasks, including access onboarding and offboarding and configuration and patch management.
    • Stay up to date with emerging trends, threats, and technologies to propose improvements and proofs of concept for technical roadmaps.
    • Collaborate with cross-functional teams to ensure smooth deployment and operation of software releases.
    • Provide feedback on the performance of junior staff and participate in people development initiatives.
    • Support any ad hoc tasks as required by the company.

    What We Look For in You:

    You should have:

    • Proven track record: 3 to 5 years of experience in managing software deployments and instrumentation in production environments with defined SLAs and SLOs.
    • Cloud Operations: Experience with cloud platforms (e.g., AWS, Cloudflare, GCP) and infrastructure-as-code tools (e.g., Terraform, CloudFormation). Strong programming and scripting skills, preferably in Python, Go, or Ruby.
    • Accreditation: Bachelor's degree in Computer Science, Information Security, or a similar field, or professional certificates (e.g., Certified DevOps Professional, Certified Solutions Architect Professional in AWS or GCP).
    • Scope of Work: Fully capable of taking substantial features from concept to shipping as a sole contributor. Works effectively on open-ended projects and is self-sufficient enough to dive deep and evaluate multiple solutions to a problem.
    • Problem Solving: Ability to solve hard problems with many constraints, using sound judgment to assess risks and present arguments in a well-structured, data-backed, written narrative.
    • Quick Thinking: Able to derive information, think critically, and make snap judgments based on measured data in high-pressure situations.
    • People Skills: Strong communicator who can build positive working relationships between teams and form relationships with key customers.

    Nice to have:

    • Experience working in an early-to-growth stage startup.
    • Experience building applications in different tech stacks.
    • Keen interest in decentralized technologies and their applications, including cryptocurrencies.

    Perks:

    • Remote Work Flexibility: Work wherever you feel most productive.
    • Flexible Working Hours: No 9-5 structure; work the hours you need to get your tasks done.
    • Comprehensive Insurance Coverage: Life, medical, and critical illness insurance provided.
    • Equity: Entitled to virtual options, subject to terms and conditions.
    • Transport Allowance: Monthly fixed allowance to ease the cost of traveling.
    • Flexible Claim Benefits: Quarterly budget allocation to subsidize meals and set up your work-from-home station.
    • Learning Allowance: Annual budget to support continuous learning for professional and personal development.
    • Social Activity Allowance: Subsidy for social activities organized by you and your colleagues.