Lead Data Engineer - Singapore - Transformhub Consulting


    2 weeks ago

    Full time
    Description

    Required Qualifications:

    • Bachelor's degree in computer science or another STEM (science, technology, engineering, or mathematics) related field.
    • At least 8 years of strong data warehousing experience using relational and non-relational databases.
    • At least 5 years of recent hands-on professional experience (actively coding) working as a data engineer (back-end software engineering experience considered).
    • Professional experience working in an agile, dynamic, and customer-facing environment is required.
    • Understanding of distributed systems and cloud technologies (AWS) is highly preferred.
    • Understanding of data streaming and scalable data processing is preferred.
    • Experience with large-scale datasets and data lake/data warehouse technologies at terabyte scale or above (ideally petabyte-scale datasets); Snowflake preferred.
    • At least 2 years of experience with ETL (AWS Glue), Amazon S3, Amazon Redshift, Amazon RDS, Amazon Kinesis, AWS Lambda, Apache Airflow, and AWS Step Functions.
    • At least 2 years of experience with Snowflake is highly preferred.
    • Strong knowledge of Python, UNIX shell scripting, and Spark is required.
    • Understanding of RDBMS concepts, data ingestion, data flows, data integration, etc.
    • Technical expertise with data models, data mining, and segmentation techniques.
    • Experience with the full SDLC and Lean or Agile development methodologies.
    • Knowledge of CI/CD and Git-based deployments.
    • Ability to work in a team within a diverse, multi-stakeholder environment.
    • Ability to communicate complex technology solutions to diverse audiences, namely technical, business, and management teams.

    Responsibilities:

    • Work with stakeholders to understand needs for data structure, availability, scalability, and accessibility.
    • Develop tools to improve data flows between internal/external systems and the data lake/warehouse.
    • Build robust and reproducible data ingest pipelines to collect, clean, harmonize, merge, and consolidate data sources.
    • Understand existing data applications and infrastructure architecture.
    • Build and support new data feeds for various data management layers and data lakes.
    • Evaluate business needs and requirements.
    • Support the migration of existing data transformation jobs from Oracle and MS SQL to Snowflake.
    • Lead the migration of existing data transformation jobs from Oracle, Hive, Impala, etc. to Spark and Python on Glue.
    • Document processes and steps.
    • Develop and maintain datasets.
    • Improve data quality and efficiency.
    • Lead business requirements gathering and deliver accordingly.
    • Collaborate with data scientists, architects, and teams on several projects.