Overview
Remote
$50 - $70
Contract - W2
Contract - 3 Month(s)
No Travel Required
Skills
Site Reliability Engineer
DevOps
Java
C#
Python
Nutanix
Job Details
You will work closely with software engineering and operations teams to design, build, and maintain scalable and reliable infrastructure. The primary focus will be on designing and building infrastructure, automating processes, and improving system performance to ensure a seamless user experience.
Key Responsibilities:
- Automation: Develop and implement automated solutions for system provisioning, configuration management, and incident response to reduce manual intervention and improve efficiency.
- System Monitoring and Maintenance: Design and develop monitoring for the health and performance of our systems, identifying and resolving issues before they impact users.
- Incident Management: Respond to on-call incidents, troubleshoot issues, and implement solutions to restore service as quickly as possible.
- Performance Optimization: Analyze system performance and implement improvements to enhance scalability, reliability, and efficiency.
- Disaster Recovery: Design and build disaster recovery plans to ensure quick recovery from unexpected disruptions.
- Collaboration: Work closely with software engineering, operations, and other cross-functional teams to design and implement reliable and scalable systems.
- Documentation: Create and maintain comprehensive documentation for system architecture, processes, and procedures.
Qualifications:
- Education: Bachelor's degree in Computer Science, Engineering, or a related field.
- Experience: Proven experience as a Site Reliability Engineer, DevOps Engineer, or similar role.
- Technical Skills: Proficiency in programming languages such as Java, C#, Python, or JavaScript. Experience with cloud platforms (e.g., AWS, Azure, Google Cloud) and familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes), including container registries (e.g., Harbor).
- Monitoring Tools: Experience with monitoring and logging tools (e.g., Prometheus, Grafana, Fluent Bit, ELK stack).
- Problem-Solving: Strong analytical and problem-solving skills with the ability to troubleshoot complex issues.
- Communication: Excellent communication and collaboration skills to work effectively with cross-functional teams.
- Experience with hybrid multi-cloud platforms such as Nutanix.
- On-Call: Willingness to participate in on-call rotations and respond to incidents as needed.
- Experience with infrastructure as code (e.g., Terraform, Ansible).
- Knowledge of networking concepts and protocols.
- Experience with Nutanix preferred.
- Familiarity with CI/CD pipelines and tools (e.g., Jenkins, GitLab CI, Helm, Flux).
- Experience in developing clinical systems or working in the healthcare industry.