Lead Data Engineer with extensive experience in building data integration pipelines in a CI/CD model

Overview

On Site
Depends on Experience
Contract - W2

Skills

GCS
BigQuery
Streaming (Pub/Sub)
Dataproc and Dataflow
NiFi
Python
PySpark
Kafka
SQL
Shell scripting and stored procedures
Data warehousing
Distributed data platforms and data lakes
Database definition
Schema design
Looker views
Models
CI/CD pipelines

Job Details

Position: Lead Data Engineer with extensive experience in building data integration pipelines in a CI/CD model (on our W2)

Location: Austin, TX

Experience: 10-15 years (2-5+ years in a lead role)

  • Ability to design and develop a high-performance data pipeline framework from scratch
    • Data ingestion across systems
    • Data quality and curation
    • Data transformation and efficient data storage
    • Data reconciliation, monitoring and controls
    • Support reporting model and other downstream application needs
  • Skill in technical design documentation, data modeling, and application performance tuning
  • Lead and manage a team of data engineers, contribute to code reviews, and guide the team in designing and developing complex data pipelines that adhere to defined standards.
  • Be hands-on: perform POCs on open-source and licensed tools in the market and share recommendations.
  • Provide technical leadership and contribute to the definition, development, integration, testing, documentation, and support of solutions across multiple platforms (Google Cloud Platform, Python, HANA)
  • Establish a consistent project management framework and develop processes to deliver high-quality software, in rapid iterations, for business partners in multiple geographies
  • Participate in a team that designs, develops, troubleshoots, and debugs software for databases, applications, tools, etc.
  • Experience balancing production platform stability, feature delivery, and reduction of technical debt across a broad landscape of technologies
  • Skill in the following platforms, tools, and technologies:
    • Google Cloud Platform: GCS, BigQuery, Streaming (Pub/Sub), Dataproc and Dataflow, NiFi
    • Python, PySpark, Kafka, SQL, shell scripting, and stored procedures
    • Data warehousing, distributed data platforms, and data lakes
    • Database definition, schema design, Looker views and models
    • CI/CD pipelines
  • Proven track record of writing code in Python, PySpark, and SQL
  • Excellent structured thinking skills, with the ability to break down multi-dimensional problems
  • Ability to navigate ambiguity and work in a fast-moving environment with multiple stakeholders
  • Good communication skills and the ability to coordinate and work with cross-functional teams