Software Engineer - Python

Overview

Remote
$75 - $80
Contract - W2
Contract - 6 Month(s)
No Travel Required

Skills

Python
Azure DevOps
Google Cloud
Data Validations
API
Data Warehouse

Job Details

**** Candidates must work on a W2 basis ****

Title: Software Engineer
Duration: Until 12/31/2025
Client: Mayo Clinic
Req ID: 35677646
Location: Remote

Ideal candidates will have strong experience with Azure DevOps, Google Cloud-based infrastructure, Python, and data validation scripts/APIs, along with a healthcare or life sciences background.

We need individuals who have worked specifically in data engineering for 5+ years, who can articulate how to architect a typical data warehouse, and who can show with examples that they've been successful working in that space.

Scope of Work: The resource will support an engineering team tasked with building out a research data platform that will ingest research-generated data and make it discoverable.

Data Engineering Skills & Experience:

-Create, verify, and maintain data replication scripts
-Create, verify, and maintain data validation, processing, and ingestion pipelines
-Deploy and automate the execution of data replication scripts and data pipelines in cloud infrastructure
-Create and maintain data catalogs that describe datasets and their contents (e.g. files, file types, tables/views, columns, fields, etc.)
-Create, verify, and maintain dashboards and reports that characterize ingested datasets
-Create, verify, and maintain data validation scripts/APIs that verify the production dataset contains the correct number of samples/records, expected values/fields/columns are populated, and values are of the correct data type, format, and range
-Deploy and automate the execution of data validation scripts/APIs
-Create and maintain user documentation (dataset descriptions, tutorials, code examples, etc.)
-Define entitlements, user groups, roles, and permissions utilized to grant access to datasets

Programming Languages:

The primary pipeline development language will be Python.
Some data types and formats may require the use of other languages (e.g. Java, R, etc.) because the libraries/frameworks/SDKs available to work with those data types and formats are not available in Python.

Operating Systems:

The primary operating system for data pipeline execution will be Linux, with data pipelines packaged, deployed, and run as containers.
Data source systems could be Windows- or Linux-based.

Infrastructure:

The primary data platform and data pipeline execution infrastructure will be hosted on Google Cloud Platform (GCP) utilizing cloud-native technologies (e.g. Google Cloud Storage, BigQuery, Google Batch, Dataflow, Cloud SQL, etc.).
Data will be replicated from various on-premises sources that include laboratory instruments, network shared drives, and Windows desktops attached to instruments.

Development Tools:

Sprints, features, and tasks will be managed in Azure DevOps.
Code will be managed and versioned in Azure DevOps-based Git repositories.
Code will be compiled, packaged, and deployed utilizing Azure DevOps build pipelines.
Data pipelines will be packaged, deployed, and run in Docker containers.
Docker container images will be stored and versioned in Google Cloud Artifact Registry repositories.
Veracode will be utilized to scan source code for vulnerabilities and Prisma Cloud will be utilized to scan containers.
The standard integrated development environment will be a JetBrains IDE (PyCharm, IntelliJ IDEA, etc.) or VS Code.

Preferred Candidates:

-Experience working on healthcare, life science, or scientific research projects
-A degree or domain knowledge in a life science related field (biochemistry, genetics, biology, etc.)
-Experience with Google Cloud Platform-based infrastructure and services

100% remote - Mayo will provide equipment.

Education: Bachelor's degree in Computer Science/Engineering or a related field with 5 years of experience as noted below; OR an Associate's degree in Computer Science/Engineering or a related field with 7 years of experience.
