Site Reliability Engineer Jobs in Remote or San Francisco, CA

1 - 20 of 151 Jobs

Site Reliability Engineer

LiveRamp

San Francisco, California, USA

Full-time

LiveRamp is the data collaboration platform of choice for the world's most innovative companies. A groundbreaking leader in consumer privacy, data ethics, and foundational identity, LiveRamp is setting the new standard for building a connected customer view with unmatched clarity and context while protecting precious brand and consumer trust. LiveRamp offers complete flexibility to collaborate wherever data lives to support the widest range of data collaboration use cases, within organizations, b

Site Reliability Engineer

Splunk Inc.

Remote or San Jose, California, USA

Full-time

Description Splunk, a Cisco company, is building a safer and more resilient digital world with an end-to-end full stack platform made for a hybrid, multi-cloud world. Leading enterprises use our unified security and observability platform to keep their digital systems secure and reliable. Our customers love our technology, but it's our caring employees that make Splunk stand out as an amazing career destination. No matter where in the world or what level of the organization, we approach our wor

Site Reliability Engineer

Photon

Remote or Mexico City, Mexico City, Mexico

Full-time

About the Role: We are seeking a highly skilled and passionate Site Reliability Engineer (SRE) with deep expertise in Azure Cloud to join our dynamic engineering team. In this role, you will be responsible for ensuring the reliability, availability, and performance of our critical applications and infrastructure hosted on Microsoft Azure. You will leverage your technical expertise and problem-solving skills to build and maintain scalable, resilient, and automated systems. Responsibilities: Relia

Site Reliability Engineer

iPeople Infosystems LLC

Remote

Contract, Third Party

Role: Site Reliability Engineer Location: Remote 100% Type: Contract (W2/1099) Key Responsibilities: Develop and maintain reliable, scalable, and secure systems in Java, Go, and Python. Design, implement, and manage Kubernetes clusters and associated microservices. Build automation and monitoring tools to enhance system reliability and operational efficiency. Utilize observability tools such as Splunk and AppDynamics for proactive incident detection and resolution. Collaborate with development and

Site Reliability Engineer

Iceberg

Remote

Full-time

Some roles are about keeping the lights on. This isn't one of them. This is about stepping into a high-growth SaaS company serving some of the most security-conscious industries in the world - financial services, healthcare, and insurance - and helping build the backbone of a mission-critical DevSecOps platform. I'm looking for a Senior Site Reliability / DevSecOps Engineer who knows what it takes to build secure, scalable, and resilient systems - someone who thrives where development, security,

SRE Consultant with Java Development

System Soft Technologies

Remote

Contract

System Soft Technologies is widely recognized for its professionalism, strong corporate morals, customer satisfaction, and effective business practices. We provide a full spectrum of business and IT services and solutions, including custom application development, enterprise solutions, systems integration, mobility solutions, and business information management. System Soft Technologies combines business domain knowledge with industry-specific practices and methodologies to offer unique solution

Senior Site Reliability Engineer, Infrastructure

Cisco Systems, Inc.

San Francisco, California, USA

Full-time

Who We Are Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network - even the ones they don't own. Powered by AI and an unmatched set of cloud, internet and enterprise network telemetry data, ThousandEyes enables IT teams to proactively detect, diagnose, and remediate issues - before they impact end-user experiences. ThousandEyes is deeply integrated across the entire Cisco technology portfolio and

SRE Consultant - Remote

Mudrasys

Remote

Contract, Third Party

Position: SRE Consultant Location: Remote Duration: 6-12 months Job Description: Self-Healing/Automated Repair framework to Automatically repair Batch Abnormal Ends due to non-ASCII values in Demographics data (Manual Fix 60 minutes, Automated Repair 2 minutes) Self-Healing for Consumer Alerting system preventing service blackout (Manual Fix 135 minutes, Automated Repair 2 minutes) Create Observability Aggregator framework and streamed Batch metrics into AppDynamics to show Real-Time Batch Ex

Site Reliability Engineer - W2 Contract / Visa Independence

Integrass

Remote

Contract

Will work closely with software engineering and operations teams to design, build, and maintain scalable and reliable infrastructure. Primary focus will be on designing and building infrastructure, automating processes, and improving system performance to ensure a seamless user experience. Key Responsibilities: Automation: Develop and implement automated solutions for system provisioning, configuration management, and incident response to reduce manual intervention and improve efficiency. System Monitoring an

Site Reliability Engineer

Leidos

Remote

Full-time

Come put your Site Reliability Engineer (SRE) skills into action! Leidos has openings for talented SREs to join our team and develop reusable solutions that support our customers in any environment. You will have the opportunity to contribute to the design and implementation of Continuous Integration and Continuous Delivery (CI/CD) pipelines that accelerate the secure delivery of software to production. You will automate the buildout of infrastructure in cloud and on-premises environments to ope

Senior Site Reliability Engineer (SRE)

Bossini Technologies

Remote

Contract

Senior Site Reliability Engineer (SRE) Location: United States (Remote) About the Role We are seeking a Senior Site Reliability Engineer (SRE) with proven experience in ensuring high availability, reliability, and performance across complex enterprise systems. The role centers on supporting a hybrid ecosystem involving SAP workloads, modern data pipelines, and AWS cloud infrastructure. This is an individual contributor role designed for someone who thrives on ownership, understands mission-critic

Automation Developer / Site Reliability Engineer

Princeton IT Services

MX

Contract

Job Title: Platform SRE Automation Developer / Site Reliability Engineer Job Location: Remote in Mexico Job Type: Full-time contract Job Summary: This team's engineers support the growing consumer credit card business. The platform is built on a microservice architecture on a modern technology stack hosted in AWS public cloud and uses state-of-the-art development practices and tooling for SDLC, with observability tools such as Datadog, Prometheus, Splunk, etc. Our engineers are responsible

Senior Site Reliability Engineer, Test Platform- REMOTE

Cisco Systems, Inc.

Remote or San Francisco, California, USA

Full-time

At Cisco Meraki, we create magic through the energy and passion of our employees, who shape our dynamic community and empower us to solve problems for our customers. This magic unfolds when technology becomes intuitive, functions as intended, and when every individual is valued. By providing our employees with the autonomy to make an impact, we strive to fulfill our mission of simplifying technology so our customers can focus on what matters most to them, whether it's their students, patients, cu

Lead Site Reliability Engineer II, Production Engineering

Cisco Systems, Inc.

San Francisco, California, USA

Full-time

Who We Are Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network - even the ones they don't own. Powered by AI and an unmatched set of cloud, internet and enterprise network telemetry data, ThousandEyes enables IT teams to proactively detect, diagnose, and remediate issues - before they impact end-user experiences. ThousandEyes is deeply integrated across the entire Cisco technology portfolio and

Site Reliability Engineer TS Clearance

Connexions Data Inc

Remote

Full-time

Site Reliability Engineer Start: Immediate Location: Remote Type: Full Time Hire Top Secret Clearance with SCI eligibility Objectives of this role Run the production environment by monitoring availability and taking a holistic view of system health Build software and systems to manage platform infrastructure and applications Improve reliability, quality, and time-to-market of our suite of software solutions Measure and optimize system performance, with an eye toward pushing our capabilities fo

Senior Site Reliability Engineer

Circles Inc.

Remote or San Francisco, California, USA

Full-time

Circle is a financial technology company at the epicenter of the emerging internet of money, where value can finally travel like other digital data - globally, nearly instantly and less expensively than legacy settlement systems. This ground-breaking new internet layer opens up previously unimaginable possibilities for payments, commerce and markets that can help raise global economic prosperity and enhance inclusion. Our infrastructure - including USDC, a blockchain-based dollar - helps busines

Principal Site Reliability Engineer, Datastores

Cisco Systems, Inc.

San Francisco, California, USA

Full-time

Who We Are Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network - even the ones they don't own. Powered by AI and an unmatched set of cloud, internet and enterprise network telemetry data, ThousandEyes enables IT teams to proactively detect, diagnose, and remediate issues - before they impact end-user experiences. ThousandEyes is deeply integrated across the entire Cisco technology portfolio and

Lead II Site Reliability Engineer - ThousandEyes

Cisco Systems, Inc.

San Francisco, California, USA

Full-time

Who We Are Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network - even the ones they don't own. Powered by AI and an unmatched set of cloud, internet and enterprise network telemetry data, ThousandEyes enables IT teams to proactively detect, diagnose, and remediate issues - before they impact end-user experiences. ThousandEyes is deeply integrated across the entire Cisco technology portfolio and be

Principal Software Engineer - Site Reliability Engineering

Roblox

San Mateo, California, USA

Full-time

Every day, tens of millions of people come to Roblox to explore, create, play, learn, and connect with friends in 3D immersive digital experiences, all created by our global community of developers and creators. At Roblox, we're building the tools and platform that empower our community to bring any experience that they can imagine to life. Our vision is to reimagine the way people come together, from anywhere in the world, and on any device. We're on a mission to connect a billion people with op

L3 Support SRE Engineer

Litmus7 Systems Consulting Inc.

San Ramon, California, USA

Full-time

Role - Sr L3 Support Engineer Location - San Ramon, CA. Working from Office. Should have good end-to-end knowledge of various Commerce subsystems, which include at least Storefront, Core Commerce back end, Post Purchase processing, OMS, Store / Warehouse Management processes, and Supply Chain and Logistics processes. Extensive backend development knowledge with core Java/J2EE and microservice-based event-driven architecture. Should be cognizant of key integrations undertaken in eCommerce and associated d