Talent.com
Principal Site Reliability Engineer (AI-first SRE)
Groupon • Peru
7 days ago
Job description

Groupon is modernizing its global platform, and reliability is at the center of that transformation. We’re looking for a Principal Site Reliability Engineer to lead the evolution from reactive maintenance to predictive, AI‑driven resilience. You’ll design intelligent, self‑healing systems that prevent incidents before they happen, ensuring our customers enjoy fast, secure, and reliable experiences across millions of daily interactions.

Key Responsibilities

  • Architect and maintain self‑healing systems with 99.9%+ availability targets.
  • Use AI/ML to automate infrastructure governance and detect configuration or IaC anti‑patterns.
  • Implement adaptive SLIs/SLOs that evolve automatically from real‑time data.
  • Build AIOps‑based observability and auto‑remediation pipelines.
  • Apply predictive modeling to forecast failures before they impact users.
  • Lead chaos, performance, and resilience testing programs.
  • Map platform and service behavior to revenue impact and drive improved revenue resilience through better infrastructure performance.
  • Mentor engineers and drive reliability standards across teams.
  • Partner with platform, data, and product teams to ensure stability aligns with business goals.
  • Support major incident response and incident reviews, and participate in on‑call rotations.

Key Requirements

  • 10+ years in software/systems engineering, including 5+ years in SRE or platform reliability.
  • Strong experience with GCP (preferred) or AWS, Kubernetes, and Terraform.
  • Proficiency in Python or Go for automation and tooling.
  • Deep understanding of observability stacks (Prometheus, Grafana, OpenTelemetry) and service meshes (Istio, Envoy).
  • Hands‑on AIOps experience: anomaly detection, predictive analytics, ML‑assisted operations.
  • Strong communication and influencing skills — data over hierarchy.

Nice to Have

  • Experience with MLOps or large‑scale data infrastructure.
  • Exposure to FinOps or cloud cost optimization.
  • Previous leadership of global incident response or SRE transformation programs.

What Success Looks Like

  • 99.9%+ uptime sustained through predictive rather than reactive responses.
  • Faster MTTR via automated detection and auto‑remediation.
  • Reliability insights used in leadership decisions.
  • Mentorship leading to stronger reliability practices across teams.

We Are Interested In

  • Technologists who see reliability as a product, not just a metric.
  • Engineers who use AI/ML as a tool for scale and insight.
  • Leaders who can balance innovation speed with operational excellence.
  • Engineers who understand the entire e‑commerce stack and how it impacts revenue.

What We Offer

  • The opportunity to work with cutting‑edge technologies in a transformative environment.
  • A collaborative and innovative work culture that values your expertise and contributions.
  • Professional growth and leadership development pathways tailored to your aspirations.
  • A chance to leave a lasting impact by shaping the future of reliable and scalable systems.

Join us to push the boundaries of platform reliability and drive meaningful change in a fast‑evolving digital world!

Groupon is an AI‑First Company committed to building smarter, faster, and more innovative ways of working, and AI plays a key role in how we get there. We encourage candidates to leverage AI tools during the hiring process where it adds value, and we’re always keen to hear how technology improves the way you work.

Groupon’s purpose is to build strong communities through thriving small businesses. To learn more about the world’s largest local e‑commerce marketplace, click here. You can also find out more about us in the latest Groupon news as well as learning about our DEI approach.

Beware of Recruitment Fraud: Groupon follows a merit‑based recruitment process without charging job seekers any fees. We've noticed an increase in recruitment fraud, including fake job postings and fraudulent interviews and job offers aimed at stealing personal information or money. Be cautious of individuals falsely representing Groupon's Talent Acquisition team with fake job offers. If you encounter any suspicious job offers or interview calls demanding money, recognize these as scams. Groupon is not responsible for losses from such dealings. For legitimate job openings (and a sneak peek into life at Groupon), always check our official career website at Groupon Careers.
