Data Engineer - Singapore - THALES DIS (SINGAPORE) PTE. LTD.

    Description
    Roles & Responsibilities

    As a Data Engineer in AIR Lab, you should be someone who enjoys designing and discussing processing patterns such as data quality control, streaming SQL, data sources/sinks, data synchronization, streaming backfill and stream-to-stream joins. You should be someone who cares about the quality of the technical implementation as much as you care about the quality of the delivery. You should be someone who enjoys working in a team of diverse people with multiple ethnic and cultural backgrounds. You should be someone who enjoys diving into the technical details of a problem and communicating the solution back to the team so that its members can learn from it. You should be someone who loves learning new technologies, finds innovative ways to apply newfound knowledge, has the courage to encourage fellow team members to do the same, and enjoys participating in all aspects of engineering activities in the AIR Lab.

    Responsibilities:

    • Improve and maintain the DataLake cybersecurity posture with regard to data governance and cybersecurity standards by working with other stakeholders (e.g., Data Architect, Data Assessment Office, Cybersecurity Office).
    • Improve and maintain the DataLake service levels for reliable data flow and healthy infrastructure (i.e., compute and storage).
    • Improve and maintain the total cost of ownership of the DataLake; this includes raising efficiencies around FinOps and CloudOps.
    • Improve and maintain the architecture for transforming data between the DataLake and a distributed search and analytics engine (e.g., Elasticsearch).
    • Lead the technical evolution of the DataLake, including (non-exhaustively) exploring new methods, techniques and algorithms (e.g., data meshes, AI/MLOps infrastructure).
    • Improve and maintain the data model, data catalogue (e.g., event data, batched data, persisted, ephemeral).
    • Work with the Data Architect to drive best practices across the engineering organization.
    • Implement features by defining tests, developing the feature and its associated automated tests; where appropriate, implement security tests and load tests.
    • Write and review the necessary technical and functional documentation in documentation repositories (e.g., JIRA, READMEs).
    • Work in an agile, cross-functional multinational team, actively engaging to support the success of the team.

    Requirements:

    Education

    • Bachelor's degree in Computer Science or Information Technology
    • Master's degree in Computer Science or Data Science, if applicable

    Essential Skills/Experience

    • Proficiency in designing and implementing ETL data pipelines (with structured or unstructured data) using frameworks like Dataflow/Apache Beam and Apache Flink; proficient in deploying ETL pipelines into a Kubernetes cluster in the Azure cloud as either virtual machines or containerized workloads (a minimal pipeline sketch follows this list).
    • Proficiency in designing and implementing data lifecycle management using scalable object-storage systems like MinIO (e.g., tiering, object expiration, multi-node deployment).
    • Proficiency in programming languages such as Java and Kotlin, with a focus on designing and developing scalable applications using both microservice and monolith approaches.
    • Proficiency in developing performant abstract data structures (e.g., deterministic data lookups versus heuristic data lookups); able to conduct independent research of methods, techniques and algorithms.
    • Demonstrated experience working with Continuous Integration and/or Continuous Delivery models; you are expected to be familiar with Linux (e.g., shell commands).
    • Proficiency in distributed source code management tools like GitLab and GitHub, and practice of GitOps.
    • With respect to ETL pipelines, you are expected to demonstrate proficiency in the following:
    1. pipeline configuration using GitLab
    2. environment management using GitLab; it is a bonus if you have demonstrated experience in deployment management (e.g., canary, blue/green rollouts) using GitLab
    • Familiar with cloud deployment strategies for public clouds (e.g., Azure Cloud, AWS, GCP) and Kubernetes, using virtualized and containerized workloads (e.g., Kaniko, Docker, virtual machines)
    • Good communication skills in English
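
    As an illustration of the pipeline work described above, the sketch below uses the Apache Beam Java SDK to read line-delimited records, drop empty lines as a toy data-quality check, normalise each record and write the cleaned output. The file paths, transform names and validation rule are hypothetical placeholders, not details of the AIR Lab codebase.

    ```java
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.io.TextIO;
    import org.apache.beam.sdk.options.PipelineOptions;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.transforms.Filter;
    import org.apache.beam.sdk.transforms.MapElements;
    import org.apache.beam.sdk.transforms.SerializableFunction;
    import org.apache.beam.sdk.values.TypeDescriptors;

    // Minimal batch ETL sketch (hypothetical paths and rules, for illustration only):
    // read line-delimited records, drop empty lines as a toy data-quality check,
    // normalise each record and write the cleaned output.
    public class EtlPipelineSketch {
      public static void main(String[] args) {
        PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
        Pipeline pipeline = Pipeline.create(options);

        pipeline
            .apply("ReadRawRecords", TextIO.read().from("/data/raw/events-*.txt"))
            .apply("DropEmptyLines", Filter.by(
                (SerializableFunction<String, Boolean>) line -> !line.trim().isEmpty()))
            .apply("NormaliseRecords", MapElements
                .into(TypeDescriptors.strings())
                .via((String line) -> line.trim().toLowerCase()))
            .apply("WriteCleanRecords", TextIO.write().to("/data/clean/events"));

        pipeline.run().waitUntilFinish();
      }
    }
    ```

    The same transform code can be submitted to different runners (e.g., the Flink runner on a Kubernetes cluster); the runner is selected through the pipeline options rather than the transform code.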

    Desirable Skills/Experience

    • Working knowledge of designing applications with a "shift-left" cybersecurity approach.
    • Working knowledge of other languages (e.g., Python3, Scala2 or Scala3, Go, TypeScript, C, C++17, Java17)
    • Experience implementing event-driven processing pipelines using frameworks like Apache Kafka and Apache Samza (a minimal sketch follows this list)
    • Familiar with the cloud deployment models (e.g., public, private, community and hybrid)
    • Familiar with the main cloud service models: Software as a Service, Platform as a Service and Infrastructure as a Service.
    • Familiar with designing and/or implementing AI/MLOps pipelines in public cloud (e.g., Azure, AWS, GCP)
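
    As a small illustration of the event-driven processing mentioned above, the following sketch uses the plain Apache Kafka Java clients to consume events from one topic, apply a trivial transformation and publish the result to a downstream topic. The broker address, topic names and transformation are assumptions for illustration only.

    ```java
    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.kafka.common.serialization.StringSerializer;

    // Minimal event-driven sketch (hypothetical broker, topics and transformation):
    // consume raw events, apply a trivial transformation, publish downstream.
    public class EventPipelineSketch {
      public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "event-pipeline-sketch");
        consumerProps.put("key.deserializer", StringDeserializer.class.getName());
        consumerProps.put("value.deserializer", StringDeserializer.class.getName());

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", StringSerializer.class.getName());
        producerProps.put("value.serializer", StringSerializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {

          consumer.subscribe(Collections.singletonList("raw-events"));
          while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> record : records) {
              // Hypothetical transformation; a real pipeline would validate and enrich here.
              String enriched = record.value().trim().toUpperCase();
              producer.send(new ProducerRecord<>("clean-events", record.key(), enriched));
            }
          }
        }
      }
    }
    ```

    A production version would add error handling, offset management and schema validation; heavier streaming needs such as joins or backfill are usually better served by a framework like Flink or Samza on top of the same topics.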

    Essential / Desirable Traits

    • Possess learning agility, flexibility and proactivity
    • Comfortable with agile teamwork and user engagement
    Skills

    Kubernetes
    Azure
    Pipelines
    Data Structures
    Architect
    Kotlin
    ETL
    Data Quality
    Data Governance
    Data Engineering
    SQL
    Continuous Integration
    Docker
    Data Science
    Java
    Apache
    Linux