Data Engineer

Overview

Remote
Depends on Experience
Full Time
No Travel Required

Skills

Amazon Kinesis
Amazon Redshift
Amazon S3
Amazon Web Services
Analytics
Apache Airflow
Apache Kafka
Apache Spark
Cloud Computing
Collaboration
Communication
Continuous Delivery
Continuous Integration
Data Engineering
Data Flow
Data Governance
Data Integrity
Data Lake
Data Modeling
Data Quality
Data Warehouse
Data Warehouse Architecture
DevOps
ELT
Extract, Transform, Load
Google Cloud Platform
Mentorship
Microsoft Azure
Optimization
PySpark
Python
Real-time
Regulatory Compliance
Reporting
SQL
Streaming
Talend
Virtual Team
Workflow

Job Details

We are seeking an experienced and self-motivated Senior Data Engineer to join our remote team. The ideal candidate will have 10+ years of hands-on experience in data engineering, including the design, implementation, and optimization of large-scale data pipelines, data lakes, and data warehouses. This role requires expertise in ETL/ELT processes, distributed systems, cloud technologies (AWS/Azure/Google Cloud Platform), and modern data stack tools such as Spark, Kafka, and Airflow.


Key Responsibilities:

  • Design and build scalable data pipelines for batch and real-time data ingestion.

  • Develop and maintain data lake and data warehouse architecture for analytics and reporting.

  • Work with cross-functional teams including data analysts, data scientists, and software engineers to understand data needs.

  • Implement data quality checks, validation frameworks, and monitoring for pipeline health.

  • Optimize data workflows for performance and cost-efficiency in a cloud environment.

  • Collaborate on data governance and security practices to ensure compliance and data integrity.

  • Document data flows, pipeline architectures, and technical standards.

  • Mentor junior data engineers and contribute to engineering best practices.


Required Qualifications:

  • 10+ years of experience in Data Engineering or a related field.

  • Strong experience with Python, SQL, and Spark (PySpark preferred).

  • Deep understanding of ETL/ELT concepts and tools (e.g., Apache Airflow, dbt, Talend).

  • Proficiency with cloud platforms (AWS, Azure, or Google Cloud Platform), including services such as S3, Glue, Redshift, BigQuery, or Azure Synapse.

  • Hands-on experience with streaming technologies like Apache Kafka or Kinesis.

  • Solid understanding of data modeling, data warehousing, and data lake design.

  • Experience with CI/CD tools and DevOps for data infrastructure.

  • Strong communication skills and experience working in distributed teams.
