System Analyst - Site Reliability Engineer II

Overview

On Site
Full Time

Skills

Incident Management
Disaster Recovery
Business Continuity Planning
Mentorship
Evaluation
Reliability Engineering
Scalability
Software Development
Python
Java
Version Control
Git
GitLab
GitHub
Critical Thinking
Conflict Resolution
Problem Solving
Customer Service
Application Development
Ansible
Orchestration
Docker
Continuous Integration
Continuous Delivery
Server Administration
Linux
Computer Networking
Firewall
Root Cause Analysis
Project Management
Agile
Scrum
Amazon Web Services
Google Cloud
Google Cloud Platform
SaaS
IaaS
PaaS
Enterprise Architecture
Artificial Intelligence
Machine Learning (ML)
Red Hat Linux
Microsoft Azure
DevOps
Kubernetes
Cloud Computing
Health Care
Leadership
Vulnerability Management
Technical Support
System Documentation
Information Security
Policies and Procedures
Presentations
Quality Assurance
Change Management
Documentation
Workflow
Research
Customer Relationship Management (CRM)
Management
Collaboration
Innovation
Recruiting

Job Details

At Duke Health, we're driven by a commitment to compassionate care that changes the lives of patients, their loved ones, and the greater community. No matter where your talents lie, join us and discover how we can advance health together.

Occupational Summary
The DHTS Systems Analyst-Site Reliability Engineer (SRE) is responsible for designing, implementing, and maintaining large-scale distributed systems with a focus on reliability, scalability, and performance.
The SRE collaborates with development teams to ensure that applications and services are designed and operated to meet reliability targets and scale efficiently. This role involves working with Kubernetes for on-premises environments and Azure Kubernetes Service (AKS) for cloud-based solutions.

Essential Tasks/Responsibilities
Level 2 (DHTS System Analyst 2)
Participate in on-call rotations to respond to system alerts and incidents.
Assist in troubleshooting and resolving system issues and outages across both on-premises and cloud environments.
Collaborate with development teams to improve system reliability and efficiency across on-premises and cloud infrastructures.
Independently design and implement monitoring solutions for complex systems in OpenShift and AKS environments.
Lead incident response efforts and coordinate with multiple teams during outages, considering the nuances of both on-premises and cloud infrastructures.
Develop and implement automation solutions to improve system reliability and efficiency across OpenShift and AKS platforms.
Conduct thorough root cause analysis for incidents and propose long-term solutions that align with the organization's hybrid infrastructure strategy.
Contribute to the design and implementation of disaster recovery and business continuity plans, leveraging both on-premises and cloud resources.
Mentor junior team members and provide technical guidance on OpenShift and AKS best practices.
Participate in the evaluation and implementation of new technologies and tools that complement OpenShift and AKS environments.
Collaborate with development teams to define and implement SLIs, SLOs, and SLAs across both platforms.
Contribute to the development of architectural improvements to enhance system reliability and scalability in a hybrid infrastructure model.
Required Qualifications at this Level

Education
Bachelor's degree in a related field is preferred, or equivalent work experience.
Experience
Level 2 (DHTS System Analyst 2): Minimum 5 years of experience in software development and/or IT solutions engineering.
Required Skills and Knowledge
Level 2 (DHTS System Analyst 2)
Familiarity with project management and Agile/SCRUM methodologies
Proficiency in at least one programming language (e.g., Python, Go, Java)
Familiarity with version control systems (e.g., Git)
Familiarity with CI/CD technologies like GitLab CI or GitHub Actions
Basic understanding of server administration (preferably Linux)
Understanding of networking topologies, firewall rules, and certificate management
Ability to analyze customer requirements and translate them into effective solutions
Critical thinking and problem-solving skills
Strong customer service orientation
Strong experience with Application Development Lifecycle, with a DevOps focus
Proficiency in script writing (e.g., Ansible Playbooks, Helm Charts)
Extensive experience with containerization and orchestration technologies (Docker, Kubernetes)
Strong experience with CI/CD technologies and practices
Advanced knowledge of server administration (preferably Linux)
Solid understanding of networking topologies, firewall rules, and certificate management
Proven ability to analyze complex customer requirements and translate them into effective solutions
Advanced troubleshooting and root cause analysis skills
Strong project management skills, including Agile/SCRUM experience
Experience with cloud platforms (AWS, Azure, Google Cloud Platform) and services (SaaS, IaaS, PaaS, FaaS)
Knowledge of Enterprise Architecture best practices
Familiarity with AI and ML concepts
Desired Skills
Red Hat OpenShift certifications
Azure DevOps and Infrastructure certifications
CKA (Certified Kubernetes Administrator) or CKAD (Certified Kubernetes Application Developer) certifications
Experience with multi-cloud environments
Knowledge of FHIR APIs and healthcare-specific technologies
Excellent time management, organizational, and task prioritization skills
Strong presentation skills
Ability to communicate effectively with non-technical staff and members of interdisciplinary teams
Ability to interact well and effectively communicate with all levels of leadership
Experience with data and system flow diagramming
Familiarity with vulnerability management and patching for application containers
Additional Responsibilities
Provide application system support for team apps, including rotating 24x7 support
Develop relationships with vendors to ensure customer needs are met in a timely manner
Author and update system documentation to share all knowledge acquired in the developer guide
Ensure systems conform to Duke Information Security Office policies and procedures
Assist in oral and written presentations to project teams, customers, and management
Coordinate and perform application testing
Follow established Change Management processes
Provide feedback on departmental processes and procedures and suggest improvements
Plan and coordinate system and application upgrades
Identify internal resources to build project teams as required
Perform detailed analysis and documentation of customer workflows
Collaborate with Administrative, Clinical, and Research customers to understand and meet needs
Develop relationships with key customer management representatives

Intent:
The intent of this job description is to provide a representative summary and level of the types of duties and responsibilities that will be required of positions given this title and shall not be construed as a declaration of the total of the specific duties and responsibilities of any particular position. Employees may be directed to perform job-related tasks other than those specifically presented in this description.

Equal Opportunity:
Duke University is an Affirmative Action/Equal Opportunity Employer committed to providing employment opportunity without regard to an individual's age, color, disability, gender, gender expression, gender identity, genetic information, national origin, race, religion, sex, sexual orientation, or veteran status.
Duke aspires to create a community built on collaboration, innovation, creativity, and belonging. Our collective success depends on the robust exchange of ideas, an exchange that is best when the rich diversity of our perspectives, backgrounds, and experiences flourishes. To achieve this exchange, it is essential that all members of the community feel secure and welcome, that the contributions of all individuals are respected, and that all voices are heard. All members of our community have a responsibility to uphold these values.
Essential Job Function:
Certain jobs at Duke University and Duke University Health System may include essential job functions that require specific physical and/or mental abilities. Additional information and provision for requests for reasonable accommodation will be provided by each hiring department.
