Google Cloud Platform Data Engineer

  • Posted 10 days ago | Updated 10 days ago

Overview

Remote
Depends on Experience
Full Time
10% Travel

Skills

API
Apache Avro
Apache Flume
Apache HTTP Server
Apache Hadoop
Apache Hive
Apache Kafka
Apache Parquet
Apache Spark
Bash
Big data
Cloud computing
Cloud storage
Cloudera Impala
Confluence
Continuous delivery
Continuous integration
DLP
Dashboard
Data acquisition
Data architecture
Data collection
Data processing
Data flow
GitHub
File formats
Google Cloud Platform
HDFS
JIRA
JSON
Java
Jenkins
Linux
Python
Programming languages
Good Clinical Practice
RESTful
SOAP
SQL
Scala
Tableau
Terraform
Unix
Version control
Shell scripting
Extract, transform, load
Meta-data management
Testing
Systems analysis/design

Job Details

Key Responsibilities:
Design, develop, and implement scalable, high-performance data solutions on Google Cloud Platform.
Curate and manage a comprehensive data set detailing user permissions and group memberships.
Redesign the existing data pipeline to improve scalability and reduce processing time.
Ensure that changes to data access permissions are reflected in the Tableau dashboard within 24 hours.
Collaborate with technical and business users to share and manage data sets across multiple projects.
Utilize Google Cloud Platform tools and technologies to optimize data processing and storage.
Re-architect the data pipeline that builds the BigQuery dataset used for Google Cloud Platform IAM dashboards to make it more scalable.
Run and customize DLP scans.
Build bidirectional integrations between Google Cloud Platform and Collibra.
Explore and potentially implement Dataplex and custom format-preserving encryption for de-identifying data for developers in lower environments.
Required Skills:
5+ years of experience in an engineering role using Python, Java, Spark, and SQL.
5+ years of experience working as a Data Engineer in Google Cloud Platform.
Proficiency with Google's Identity and Access Management (IAM) API.
Strong Linux/Unix background and hands-on knowledge.
Experience with big data technologies such as HDFS, Spark, Impala, and Hive.
Experience with shell scripting and Bash.
Experience with version control platforms like GitHub.
Experience with unit testing code.
Experience with development ecosystems including Jenkins, Artifactory, CI/CD, and Terraform.
Demonstrated proficiency with Airflow.
Proficiency in multiple programming languages, frameworks, domains, and tools.
Coding skills in Scala.
Experience with Google Cloud Platform development tools such as Pub/Sub, Cloud Storage, Bigtable, BigQuery, Dataflow, Dataproc, and Composer.
Knowledge of Hadoop, cloud platforms, and their surrounding ecosystems.
Experience with web services and APIs (RESTful and SOAP).
Ability to document designs and concepts.
Experience with API orchestration and choreography for consumer apps.
Well-rounded technical expertise in Apache packages and hybrid cloud architectures.
Pipeline creation and automation for data acquisition.
Metadata extraction pipeline design and creation between raw and transformed datasets.
Quality control metrics data collection on data acquisition pipelines.
Ability to collaborate with scrum teams including scrum master, product owner, data analysts, Quality Assurance, business owners, and data architecture to produce the best possible end products.
Experience contributing to and leveraging Jira and Confluence.
Strong experience working with real-time streaming applications and batch-style large-scale distributed computing applications using tools like Spark, Kafka, Flume, Pub/Sub, and Airflow.
Ability to work with different file formats like Avro, Parquet, and JSON.
Managing and scheduling batch jobs.
Hands-on experience in Analysis, Design, Coding, and Testing phases of the Software Development Life Cycle (SDLC).

About Global CI