Data Engineer (Databricks, Python, SQL, PySpark)

  • Jersey City, NJ
  • Posted 5 days ago | Updated 9 hours ago

Overview

Hybrid
Depends on Experience
Contract - W2
Contract - 12 Month(s)

Skills

Data Engineer
Databricks
Python
SQL
PySpark
Spark
workflows

Job Details

Job Title: Data Engineer (Databricks, Python, SQL, PySpark)
Location: Jersey City, NJ
Job Type: Long-term Contract

Job Summary:
We are seeking a highly skilled Data Engineer with hands-on experience in Databricks, Python, SQL, and PySpark to join our growing data engineering team. In this role, you'll build scalable data pipelines, work on big data processing, and collaborate across teams to deliver reliable, analytics-ready data. The ideal candidate has strong experience with cloud data platforms and is passionate about delivering data solutions with modern tools and technologies.

Key Responsibilities:
Design, build, and maintain large-scale data pipelines on Databricks using PySpark and SQL (a minimal sketch follows this list).
Develop efficient, reliable, and scalable ETL/ELT workflows to ingest and transform structured and unstructured data.
Collaborate with data scientists, analysts, and product teams to understand data needs and deliver actionable datasets.
Optimize data performance and resource usage within Databricks clusters.
Automate data validation and monitoring to ensure pipeline reliability and data quality.
Write clean, modular, and testable code in Python.
Implement best practices for data security, governance, and compliance.
Document data workflows, architecture, and technical decisions.
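
To make the pipeline responsibilities above concrete, here is a minimal, hypothetical PySpark ETL sketch of the kind this role involves; the mount point, schema, and table names are illustrative assumptions, not part of the posting.

```python
# Minimal ETL sketch for Databricks: ingest raw JSON, transform it, and
# write an analytics-ready Delta table. All paths, schemas, and table
# names are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # pre-created in Databricks notebooks

# Ingest: read semi-structured JSON landed by an upstream process.
raw = spark.read.json("/mnt/raw/orders/")  # hypothetical mount point

# Transform: normalize types, derive a partition column, drop bad rows.
orders = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount").isNotNull() & (F.col("amount") > 0))
)

# Load: persist as a partitioned Delta table for SQL consumers.
(orders.write
       .format("delta")
       .mode("overwrite")
       .partitionBy("order_date")
       .saveAsTable("analytics.orders_clean"))  # hypothetical table
```

Writing the output as a partitioned Delta table keeps it queryable with plain SQL, which is how downstream analysts typically consume data on Databricks.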

Required Skills & Qualifications:
5+ years of experience in data engineering or a related role.
Strong hands-on experience with Databricks and Apache Spark (PySpark).
Proficiency in Python and SQL for data manipulation and scripting.
Experience working with large datasets and building scalable data processing workflows.
Familiarity with cloud platforms (AWS, Azure, or Google Cloud Platform), especially cloud-native data solutions.
Understanding of data modeling, warehousing concepts, and performance tuning.
Experience with version control (Git) and CI/CD for data pipelines (a unit-test sketch follows this list).
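
As one illustration of the testing and CI/CD expectation above, a PySpark transformation can be unit-tested against a local SparkSession in pytest; the add_order_date function here is a hypothetical example, not code from the employer.

```python
# Hypothetical pytest unit test for a PySpark transformation, runnable
# in CI with a local SparkSession (no cluster needed).
import pytest
from pyspark.sql import SparkSession, functions as F


def add_order_date(df):
    # Example transformation under test: derive order_date from order_ts.
    return df.withColumn("order_date", F.to_date(F.to_timestamp("order_ts")))


@pytest.fixture(scope="module")
def spark():
    session = (SparkSession.builder
               .master("local[2]")  # small local session for tests
               .appName("pipeline-tests")
               .getOrCreate())
    yield session
    session.stop()


def test_add_order_date(spark):
    df = spark.createDataFrame([("2024-01-15 10:30:00",)], ["order_ts"])
    result = add_order_date(df).select("order_date").first()[0]
    assert str(result) == "2024-01-15"
```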

Preferred Qualifications:
Experience with Delta Lake and the Lakehouse architecture.
Exposure to orchestration and transformation tools such as Airflow, dbt, or Azure Data Factory.
Experience working in Agile/Scrum environments.
Knowledge of real-time data processing and streaming (e.g., Kafka, Structured Streaming) is a plus (see the streaming sketch after this list).
Certification in Databricks or relevant cloud technologies.
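
For the streaming and Delta Lake items above, a hedged sketch of a Structured Streaming job that consumes a Kafka topic and appends to a Delta table might look like the following; the broker address, topic, event schema, and paths are all hypothetical.

```python
# Hypothetical Structured Streaming sketch: consume a Kafka topic and
# append parsed events to a Delta table, with checkpointing for recovery.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("amount", DoubleType()),
])

events = (
    spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker1:9092")  # hypothetical broker
         .option("subscribe", "orders-events")                # hypothetical topic
         .load()
         # Kafka delivers raw bytes; decode and parse the JSON payload.
         .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
         .select("e.*")
)

(events.writeStream
       .format("delta")
       .option("checkpointLocation", "/mnt/checkpoints/orders")  # hypothetical path
       .outputMode("append")
       .toTable("analytics.orders_stream"))  # hypothetical table
```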


Note: If you refer a candidate who is successfully hired with us, you’ll receive a referral bonus.
