Job Details
Location/Remote: 100% remote within the United States; must be willing to work Mountain Time Zone hours
Employment Type: Permanent / Direct Hire / Full-time
Compensation: up to $120k base salary (depending on experience) + 15% annual bonus
Benefits:
- 100% of medical premiums covered for employees
- Dependent coverage for medical, dental, vision, life, and supplemental insurance (e.g., critical illness)
- Short- and Long-Term Disability (STD/LTD)
- HSA & FSA options
- Unlimited PTO
- Up to 12 weeks paid parental leave
- 401(k) with 5% company match
Position Overview
We are looking for a Data Engineer with strong expertise in building ETL pipelines using AWS Glue and other native AWS services. This role is ideal for someone passionate about automating and optimizing data workflows, with deep experience in Python, SQL, and cloud-first data architecture. You will be responsible for developing and maintaining scalable data pipelines that support business-critical analytics and operational use cases.
Key Responsibilities
- Design, develop, and maintain ETL pipelines using AWS Glue for batch processing of structured and semi-structured data (a brief illustrative sketch follows this list).
- Build event-driven workflows using AWS Lambda, Step Functions, and Amazon EventBridge (formerly CloudWatch Events).
- Leverage Amazon S3 for data lake storage and ingestion, implementing optimized partitioning and lifecycle policies.
- Use Amazon Redshift and Athena for querying and transforming large datasets.
- Manage access and infrastructure securely using IAM roles and policies.
- Monitor data workflows and job health using CloudWatch, CloudTrail, and custom alarms.
- Develop reusable Python modules to automate and streamline data processing tasks.
- Perform schema design, table creation, and stored procedure development in Redshift and PostgreSQL.
- Support data migrations and schema changes across Redshift clusters and other database systems.
- Collaborate closely with analytics and product teams to deliver high-quality, reliable data solutions.
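
For candidates less familiar with Glue, here is a minimal sketch of the kind of batch job this role involves: reading raw JSON from S3, flattening nested fields, and writing partitioned Parquet for downstream querying. This is illustrative only, not a description of our production pipelines; the bucket names, paths, and event_date partition column are hypothetical.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard Glue job boilerplate: resolve the job name and set up contexts.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glue_context = GlueContext(sc)
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read semi-structured JSON landed in the raw zone of a data lake.
# Bucket names and paths here are hypothetical.
raw = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://example-data-lake/raw/events/"]},
    format="json",
)

# relationalize() flattens nested structures into a collection of relational
# tables; selecting "root" keeps the flattened top-level records.
flattened = raw.relationalize("root", "s3://example-data-lake/tmp/").select("root")

# Write curated, partitioned Parquet for querying via Athena or Redshift
# Spectrum. Assumes an event_date column exists on the records.
glue_context.write_dynamic_frame.from_options(
    frame=flattened,
    connection_type="s3",
    connection_options={
        "path": "s3://example-data-lake/curated/events/",
        "partitionKeys": ["event_date"],
    },
    format="parquet",
)

job.commit()
```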
Required Skills & Experience
- 5+ years of experience developing with Python for ETL workflows, automation, and scripting
- 5+ years of experience working with SQL, including query optimization, stored procedure development, and data modeling
- 4+ years of hands-on experience working with AWS services including AWS Glue, Lambda, Step Functions, S3, Redshift, Athena, CloudWatch, and IAM
- Strong understanding of building, maintaining, and optimizing data pipelines end-to-end
- Experience working with semi-structured data (e.g., JSON, XML) and transforming it into structured formats (see the sketch following this list)
- Comfortable working in fast-paced environments, owning pipelines from design to deployment
- Proficient with version control tools such as Git, SVN, or similar
- Excellent communication and collaboration skills, with a strong focus on data quality and reliability
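
As a concrete illustration of the semi-structured-to-structured work described above, the sketch below flattens nested JSON records into a tabular frame with pandas. The record shape and field names are invented for the example.

```python
import pandas as pd

# Hypothetical nested records, e.g. claim or event payloads.
records = [
    {
        "claim_id": "C-1001",
        "member": {"id": "M-77", "state": "CO"},
        "lines": [
            {"code": "99213", "amount": 120.50},
            {"code": "85025", "amount": 32.00},
        ],
    },
]

# Produce one row per claim line, carrying claim- and member-level
# fields along as repeated columns.
df = pd.json_normalize(
    records,
    record_path="lines",
    meta=["claim_id", ["member", "id"], ["member", "state"]],
)

# Resulting columns: code, amount, claim_id, member.id, member.state
print(df)
```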
Preferred Qualifications
- Experience with healthcare data (e.g., HL7, medical claims, Rx claims)
- Experience with Apache Airflow or Amazon MWAA (a minimal DAG sketch follows this list)
- Familiarity with building custom data ingestion tools using Pandas or similar Python libraries
- Exposure to telemetry data or real-time event pipelines
- Experience in gaming, media, or other high-volume data environments
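
For orientation, a minimal Airflow DAG (assuming Airflow 2.x) looks like the following; the DAG id, task names, and callables are hypothetical placeholders, not part of our stack.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_ingest():
    # Placeholder for a hypothetical ingestion step.
    print("ingest raw files")


def run_transform():
    # Placeholder for a hypothetical transformation step.
    print("transform to curated tables")


with DAG(
    dag_id="example_daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest", python_callable=run_ingest)
    transform = PythonOperator(task_id="transform", python_callable=run_transform)

    # Run ingestion before transformation.
    ingest >> transform
```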
Education & Certifications
- Bachelor's degree in Computer Science, Engineering, Data Science, or a related technical field (or equivalent hands-on experience)
- Relevant industry certifications in AWS, data engineering, or cloud technologies are a plus