Job Details
Position Title: Site Reliability Engineer
Locations/Flexible Work (in order of preference): 1. Phoenix Hub 2. Pittsburg Hub
Hybrid: 4 days in office, 1 day remote
Summary
The primary focus of this role is to monitor, automate, and enhance the reliability, performance, and availability of software systems. The ideal candidate will bring strong technical expertise, a collaborative mindset, and a proactive approach to problem-solving in a fast-paced environment.
Roles and Responsibilities
Gather and analyze metrics from operating systems and applications to assist in performance tuning and fault finding.
Collaborate with development teams to enhance services through rigorous testing and release processes.
Participate in system design consulting, platform management, and capacity planning.
Develop sustainable systems and services by implementing automation and system enhancements.
Balance development speed and reliability using well-defined service-level objectives (SLOs).
Must-Have Technical Skills
Dynatrace: Expertise in application performance monitoring.
SQL or MySQL: Strong grasp of database concepts and the ability to write and execute queries.
Prometheus: Basic understanding of monitoring and alerting.
Full-Stack Engineering Experience: Competency across databases, services, and networks to troubleshoot effectively.
CI/CD Pipelines: Ability to navigate and manage continuous integration/continuous deployment processes.
GitHub and GitLab: Proficiency in version control and collaborative development workflows.
Flex Skills/Nice to Have
Grafana: Experience with dashboard creation and performance monitoring.
Soft Skills
Strong collaboration and teamwork capabilities.
Willingness to learn and adapt to new challenges.
Ability to remain composed under pressure.
Logical and solution-oriented mindset.
Education and Certifications
Preferred: Bachelor's degree in computer science or a related field.
Equivalent Work Experience: Considered more important than formal education.
Screening Questions
What made you interested in this role?
Why are you seeking a new position?
What do you know about our company?
Define "Observability" in 2-3 sentences.
Skills
Programming proficiency in one or more high-level languages (e.g., Python, Java, C/C++, Ruby, JavaScript).
Experience with distributed storage technologies.
Strong problem-solving skills for identifying performance bottlenecks and optimization opportunities.
Proven success in technical engineering and delivering robust solutions.
Advanced coding skills extending beyond simple scripting.
Education/Experience
Preferred: Bachelor's degree in computer science or a related field.
Required: 6+ years of relevant experience (Level 4).