Data Engineer

Overview

Remote
Depends on Experience
Full Time
No Travel Required

Skills

Amazon Kinesis
Amazon Redshift
Amazon S3
Amazon Web Services
Analytics
Apache Airflow
Apache Kafka
Apache Spark
Cloud Computing
Collaboration
Communication
Continuous Delivery
Continuous Integration
Data Engineering
Data Flow
Data Governance
Data Integrity
Data Lake
Data Modeling
Data Quality
Data Warehouse
Data Warehouse Architecture
DevOps
ELT
Extract, Transform, Load
Google Cloud Platform
Mentorship
Microsoft Azure
Optimization
PySpark
Python
Real-time
Regulatory Compliance
Reporting
SQL
Streaming
Talend
Virtual Team
Workflow

Job Details

We are seeking an experienced and self-motivated Senior Data Engineer to join our remote team. The ideal candidate will have 10+ years of hands-on experience in data engineering, including the design, implementation, and optimization of large-scale data pipelines, data lakes, and data warehouses. This role requires expertise in ETL/ELT processes, distributed systems, cloud technologies (AWS/Azure/Google Cloud Platform), and modern data stack tools such as Spark, Kafka, and Airflow.


Key Responsibilities:

  • Design and build scalable data pipelines for batch and real-time data ingestion.

  • Develop and maintain data lake and data warehouse architecture for analytics and reporting.

  • Work with cross-functional teams including data analysts, data scientists, and software engineers to understand data needs.

  • Implement data quality checks, validation frameworks, and monitoring for pipeline health.

  • Optimize data workflows for performance and cost-efficiency in a cloud environment.

  • Collaborate on data governance and security practices to ensure compliance and data integrity.

  • Document data flows, pipeline architectures, and technical standards.

  • Mentor junior data engineers and contribute to engineering best practices.


Required Qualifications:

  • 10+ years of experience in Data Engineering or a related field.

  • Strong experience with Python, SQL, and Spark (PySpark preferred).

  • Deep understanding of ETL/ELT concepts and tools (e.g., Apache Airflow, dbt, Talend).

  • Proficiency with cloud platforms (AWS, Azure, or Google Cloud Platform), including services such as S3, Glue, Redshift, BigQuery, or Azure Synapse.

  • Hands-on experience with streaming technologies like Apache Kafka or Kinesis.

  • Solid understanding of data modeling, data warehousing, and data lake design.

  • Experience with CI/CD tools and DevOps for data infrastructure.

  • Strong communication skills and experience working in distributed teams.
