Job Details
Note: Open only to candidates local to NJ.
Minimum experience: 8+ years
Ideally 5+ years of hands-on experience in distributed processing using Databricks, Apache Spark, Python, and Kafka, and leveraging the Airflow scheduler/executor framework
Ideally 3+ years of hands-on programming experience in Scala (must have), Python, and Java (preferred)
Experience with monitoring solutions such as Spark cluster logs, Azure Logs, Application Insights, and Grafana to optimize pipelines, plus knowledge of Azure-capable languages: Python, Scala, or Java
Proficiency working with large and complex codebases in management systems such as GitHub/GitLab with Gitflow, as a project committer, at both the command-line and IDE level using tools such as IntelliJ/AzureStudio
Experience working with Agile development methodologies and delivering within Azure DevOps, including automated testing with tools that support CI and release management
Expertise in optimized dataset structures in Parquet and Delta Lake formats, with the ability to design and implement complex transformations between datasets
Expertise in optimizing Airflow DAGs and branching logic for tasks to implement complex pipelines and outcomes, and expertise in authoring both traditional SQL and NoSQL queries