Job Details
Job Title: Sr. Observability Engineer (Grafana/Prometheus, Loki, Mimir, Tempo)
Required Skills: Azure, Google Cloud Platform, CI/CD, DevOps, Observability tools (Grafana/Prometheus, Loki, Mimir, Tempo), AKS, GKE, Proficiency in specific software or tools
Experience: 9+ years total; 5+ years in a similar role
- Grafana OSS Stack for observability (Mimir, Loki, Tempo, Grafana Alloy)
- Hands-on Azure/Google Cloud Platform experience, including pulling observability data from managed services
- Go/Python coding experience, or a solutioning background, with experience in SRE development and OpenTelemetry implementation
Job Description
We are seeking an experienced Engineer with 8+ years of experience to join our team. The ideal candidate will have strong technical skills in Google Cloud Platform, DevOps Engineering, Apache Spark, Microservices, Kubernetes, Docker, Prometheus, Grafana, and SRE. This is a hybrid work model with day shifts and no travel required.
Responsibilities:
1. Deploy, manage, and optimize an enterprise-level observability platform built on Grafana OSS products such as Mimir, Loki, and Tempo
2. Design and develop standard Grafana dashboards for critical metrics of various Azure/Google Cloud Platform services using the observability data
3. Research and develop new solutions for other pillars of observability (such as RUM, synthetic monitoring, network monitoring, and profiling)
4. Understand existing exporters in order to implement or enhance them, or create custom exporters, for pulling metrics from various Azure/Google Cloud Platform/SaaS services
5. Guide the offshore team on ingestion pipelines built with Python/Go or any other open-source technologies