Job Details
POSITION: ETL Developer (Matillion experience required)
LOCATION: Seattle, WA (Onsite)
EXPERIENCE: 10+ years
WORK TYPE: Full-time (W2 / C2C)
PETADATA is currently hiring an ETL / Spark Developer for one of its clients.
Roles & Responsibilities:
-
Experience with both streaming and batch workflows is essential to ensuring the efficient flow and processing of data to support our clients.
-
Collaborate with cross-functional teams to understand data requirements and design robust data architecture solutions.
-
Design, develop, and implement scalable data processing solutions using Apache Spark.
-
Keep projects well organized and structured.
-
Ensure data quality, integrity, and consistency throughout the ETL pipeline.
-
Integrate data from different systems and sources to provide a unified view for analytical purposes.
-
Collaborate with data analysts to implement solutions that meet their data integration needs.
-
Design and implement streaming workflows using PySpark Streaming or other relevant technologies.
-
Develop batch processing workflows for large-scale data processing and analysis.
-
Analyze business requirements to determine the volume of data extracted from different sources and the data models involved, and to ensure the quality of that data.
-
Determine the most suitable storage medium for the data warehouse.
-
Identify data storage needs and the amount of data that must be handled to meet the company's requirements.
-
Ensure data quality at the transformation stage by eliminating errors and fixing unstructured or disorganized extracted data.
-
Ensure that data is loaded into the warehouse system and meets business needs and standards.
-
Validate data flows, and build a secure data warehouse that meets the company's needs and standards.
-
Determine the storage needs of the business and the volume of data involved.
Required skills:
-
10+ years of experience implementing ETL processes that extract, transform, and load data from various sources while ensuring data quality, integrity, and consistency throughout the ETL pipeline.
-
Must have expertise in Python, PySpark, ETL processes, and CI/CD (Jenkins or GitHub).
-
Must have experience with the Matillion ETL tool.
-
Experience using Python and PySpark to develop efficient data processing and analysis scripts.
-
Ability to optimize code for performance and scalability while keeping up to date with current industry best practices.
-
Must be able to load data and be proficient in technical skills such as SQL, Java, XML, and DOM, among others.
-
Extensive knowledge of and experience with Spark and its related technologies.
-
Hands-on experience with the Apache Spark framework, including the Spark SQL module for querying databases.
-
Good knowledge of data analysis and design, and programming skills in JavaScript, SQL, XML, and DOM.
-
Familiarity with various programming languages, including those used in web development, such as HTML, CSS, JavaScript, Python, Java, Scala, or R.
-
Experience writing clean, bug-free code that other developers can reproduce.
-
Experience in managing SQL databases and organizing big data.
-
Hands-on experience with ETL tools and platforms such as MS SQL, SSIS (SQL Server Integration Services), Python / Perl, Oracle, and SQL Server / MySQL.
-
Solid understanding of data warehousing schemas, dimensional modeling, and implementing data storage solutions to support efficient data retrieval and analysis.
-
Must be an expert in debugging ETL processes, optimizing data flows, and ensuring that the data pipeline is robust and error-free.
-
Good verbal and written communication skills.
Educational Qualification:
Bachelor's / Master's degree in Computer Science, Engineering, or a related field.
We offer a professional work environment, and employees are given every opportunity to grow in the information technology world.
Note:
Candidates are required to attend phone, video call, or in-person interviews. After selection, the candidate must complete all background checks on education and experience.
Please email your resume to:
After carefully reviewing your experience and skills, one of our HR team members will contact you about the next steps.