Job Details
Location/Remote: 100% remote within the United States; must be willing to work Mountain Time Zone hours
Employment Type: Permanent / Direct Hire / Full-time
Compensation: up to $120k base salary (depending on experience) + 15% annual bonus
Benefits:
- 100% of medical premiums covered for employees
- Dependent coverage for medical, dental, vision, life, and supplemental insurance (e.g., critical illness)
- Short- and Long-Term Disability (STD/LTD)
- HSA & FSA options
- Unlimited PTO
- Up to 12 weeks paid parental leave
- 401(k) with 5% company match
Position Overview
We are looking for a Data Engineer with strong expertise in building ETL pipelines using AWS Glue and other native AWS services. This role is ideal for someone passionate about automating and optimizing data workflows, with deep experience in Python, SQL, and cloud-first data architecture. You will be responsible for developing and maintaining scalable data pipelines that support business-critical analytics and operational use cases.
Key Responsibilities
- Design, develop, and maintain ETL pipelines using AWS Glue for batch processing of structured and semi-structured data (a brief illustrative sketch follows this list).
- Build event-driven workflows using AWS Lambda, Step Functions, and Amazon EventBridge (formerly CloudWatch Events).
- Leverage Amazon S3 for data lake storage and ingestion, implementing optimized partitioning and lifecycle policies.
- Use Amazon Redshift and Athena for querying and transforming large datasets.
- Manage access and infrastructure securely using IAM roles and policies.
- Monitor data workflows and job health using CloudWatch, CloudTrail, and custom alarms.
- Develop reusable Python modules to automate and streamline data processing tasks.
- Perform schema design, table creation, and stored procedure development in Redshift and PostgreSQL.
- Support data migrations and schema changes across Redshift clusters and other database systems.
- Collaborate closely with analytics and product teams to deliver high-quality, reliable data solutions.
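
For candidates less familiar with Glue, here is a minimal sketch of the kind of batch job this role involves: reading raw JSON from S3, flattening nested fields, and writing partitioned Parquet for downstream querying. This is illustrative only, not a description of our production pipelines; the bucket names, paths, and event_date partition column are hypothetical.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard Glue job boilerplate: resolve the job name and set up contexts.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glue_context = GlueContext(sc)
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read semi-structured JSON landed in the raw zone of a data lake.
# Bucket names and paths here are hypothetical.
raw = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://example-data-lake/raw/events/"]},
    format="json",
)

# relationalize() flattens nested structures into a collection of relational
# tables; selecting "root" keeps the flattened top-level records.
flattened = raw.relationalize("root", "s3://example-data-lake/tmp/").select("root")

# Write curated, partitioned Parquet for querying via Athena or Redshift
# Spectrum. Assumes an event_date column exists on the records.
glue_context.write_dynamic_frame.from_options(
    frame=flattened,
    connection_type="s3",
    connection_options={
        "path": "s3://example-data-lake/curated/events/",
        "partitionKeys": ["event_date"],
    },
    format="parquet",
)

job.commit()
```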
Required Skills & Experience
- 5+ years of experience developing with Python for ETL workflows, automation, and scripting
- 5+ years of experience working with SQL, including query optimization, stored procedure development, and data modeling
- 4+ years of hands-on experience working with AWS services including AWS Glue, Lambda, Step Functions, S3, Redshift, Athena, CloudWatch, and IAM
- Strong understanding of building, maintaining, and optimizing data pipelines end-to-end
- Experience working with semi-structured data (e.g., JSON, XML) and transforming it into structured formats (see the sketch following this list)
- Comfortable working in fast-paced environments, owning pipelines from design to deployment
- Proficient with version control tools such as Git, SVN, or similar
- Excellent communication and collaboration skills, with a strong focus on data quality and reliability
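
As a concrete illustration of the semi-structured-to-structured work described above, the sketch below flattens nested JSON records into a tabular frame with pandas. The record shape and field names are invented for the example.

```python
import pandas as pd

# Hypothetical nested records, e.g. claim or event payloads.
records = [
    {
        "claim_id": "C-1001",
        "member": {"id": "M-77", "state": "CO"},
        "lines": [
            {"code": "99213", "amount": 120.50},
            {"code": "85025", "amount": 32.00},
        ],
    },
]

# Produce one row per claim line, carrying claim- and member-level
# fields along as repeated columns.
df = pd.json_normalize(
    records,
    record_path="lines",
    meta=["claim_id", ["member", "id"], ["member", "state"]],
)

# Resulting columns: code, amount, claim_id, member.id, member.state
print(df)
```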
Preferred Qualifications
- Experience with healthcare data (e.g., HL7, medical claims, Rx claims)
- Experience with Apache Airflow or Amazon MWAA (a minimal DAG sketch follows this list)
- Familiarity with building custom data ingestion tools using Pandas or similar Python libraries
- Exposure to telemetry data or real-time event pipelines
- Experience in gaming, media, or other high-volume data environments
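
For orientation, a minimal Airflow DAG (assuming Airflow 2.x) looks like the following; the DAG id, task names, and callables are hypothetical placeholders, not part of our stack.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_ingest():
    # Placeholder for a hypothetical ingestion step.
    print("ingest raw files")


def run_transform():
    # Placeholder for a hypothetical transformation step.
    print("transform to curated tables")


with DAG(
    dag_id="example_daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest", python_callable=run_ingest)
    transform = PythonOperator(task_id="transform", python_callable=run_transform)

    # Run ingestion before transformation.
    ingest >> transform
```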
Education & Certifications
- Bachelor's degree in Computer Science, Engineering, Data Science, or a related technical field (or equivalent hands-on experience)
- Relevant industry certifications in AWS, data engineering, or cloud technologies are a plus