HealthEdge

Data Operations Manager

Job Locations: US-Remote
ID: 2026-7952
Category: Data & Analytics
Position Type: Full-Time

Overview

HealthEdge is looking for a Data Operations Manager with a focus on data insights who thinks like an engineer, works like a scientist and communicates like a strategist. This hybrid role sits at the intersection of data platform engineering and ITSM-aligned Problem Management – purpose-built to expose the hidden toil, technical debt and repeat failure patterns that drain engineering capacity and impact payer customer availability across our integrated platform of solutions.

You will leverage AI – including Claude and its supporting functions – to ingest disparate operational data sources, model failure trends and surface prioritized insights that drive permanent resolution rather than perpetual remediation. 

WHY THIS ROLE EXISTS 

Production stability is a business imperative for health plans that depend on HealthEdge solutions to process claims, manage care and engage members. Repeat incidents, unresolved technical debt and untracked toil create invisible drag on engineering velocity – and visible risk to client uptime. This role exists to make the invisible visible, and to turn data-driven findings into engineering action. 

KEY RESPONSIBILITIES 

Operational Data Engineering 

  • Design, build and maintain pipelines that consolidate data from PagerDuty, Jira, ServiceNow, Datadog, Splunk and other operational sources into a unified analytical layer. 
  • Develop and curate data models that identify repeat incidents, known error patterns, chronic alert noise and engineering toil consuming disproportionate remediation cycles. 
  • Maintain data quality, lineage and governance standards across all ingested sources – ensuring findings are defensible when presented to senior leadership. 
  • Leverage AI and automation – including the Claude API and Claude-powered workflows – to accelerate pattern detection, root cause hypothesis generation and report synthesis across large operational datasets. 

Problem Management Practice 

  • Own and drive the Problem Management lifecycle across HealthEdge client-facing products. 
  • Translate incident patterns into structured Problem Records with defined scope, impact quantification, and recommended permanent fix strategies. 
  • Partner with Engineering, SRE, Platform and Product teams to embed problem-driven prioritization into sprint planning and tech debt roadmaps. 
  • Facilitate Problem Review sessions – leading cross-functional teams from data to decision.
  • Define and track KPIs that demonstrate Problem Management value: reduction in repeat SEV1/SEV2 incidents, MTTR improvement, tech debt resolution velocity and engineering hours reclaimed from toil. 

AI-Driven Insight & Visualization 

  • Build interactive, executive-ready dashboards and data visualizations that make hotspots, failure modes and technical debt load immediately comprehensible to both engineering and business stakeholders. 
  • Apply generative AI tooling to synthesize multi-source operational signals into clear, narrative-driven analysis – reducing time from data to decision. 
  • Develop automated reporting workflows that surface trending issues and emerging risk patterns without requiring manual aggregation cycles. 
  • Support monthly ceremonies by providing KPI and outcome trend reporting and highlighting the factors that influence those trends.

Stakeholder Communication & Business Translation 

  • Present operational intelligence findings and Problem Management outcomes to Engineering leadership, VP-level+ audiences and cross-functional stakeholders. 
  • Identify, from a strategic perspective, where the most urgent pockets of risk to platform availability exist, and influence prioritization accordingly.
  • Translate technical findings – infrastructure failure modes, code regression patterns, dependency risks – into business value framing that drives prioritization conversations. 
  • Author Problem Record summaries, trend analyses and executive briefings that are concise, evidence-based and action-oriented. 

REQUIRED QUALIFICATIONS 

Technical Skills 

  • 5+ years of data engineering experience with production-grade pipeline design, transformation logic and operational data modeling. 
  • Proficiency with Python or Scala for data processing; strong SQL for analytical querying against large, event-driven datasets. 
  • Hands-on experience with Jira and at least two of the following: PagerDuty, Datadog, Splunk, ServiceNow – ideally in an operational analytics or SRE context. 
  • Experience integrating large language model (LLM) APIs – including Anthropic Claude, OpenAI or similar – into data workflows, automated summarization pipelines or insight generation applications. 
  • Proficiency building interactive dashboards and data visualizations; experience with Amazon Quick Suite is a strong plus.

Operational & ITSM Knowledge 

  • Working knowledge of ITIL or equivalent ITSM frameworks – specifically Incident Management, Problem Management and Change Management process disciplines. 
  • Demonstrated ability to identify repeat failure patterns in incident or monitoring data and drive structured root cause analysis and resolution workflows. 
  • Familiarity with SRE principles – toil quantification, error budgets, SLO/SLA measurement – and how engineering teams use these to prioritize reliability work. 

Communication & Leadership 

  • Strong written and verbal communication skills, with demonstrated experience presenting technical analysis to VP or C-level audiences. 
  • Ability to translate complex, multi-variable findings into business impact narratives that drive prioritization decisions. 
  • Comfortable driving cross-functional alignment – navigating competing priorities across Engineering, Product, Operations and Leadership stakeholders. 
  • Self-directed and intellectually curious; you pursue root causes with the same rigor you bring to your data models. 

PREFERRED QUALIFICATIONS 

  • Experience in a healthcare SaaS environment or regulated platform with high availability requirements. 
  • Prior role embedded in an SRE, NOC, Platform Engineering or Operations function – particularly one that included formal Problem Management or post-incident review responsibilities. 
  • Experience building AI-powered operational tooling – such as automated incident summarization, intelligent alert correlation or AI-assisted root cause classification. 
  • Familiarity with HealthEdge products or the payer technology landscape is a meaningful plus. 
  • ITIL Foundation certification or equivalent. 

 

Geographic Responsibility:  Remote, US

Type of Employment: Full-time, permanent 

FLSA Classification (USA Only): Exempt 

Work Environment: The physical demands described here are representative of those that must be met by an employee to successfully perform the essential functions of this job:  

  • The employee is occasionally required to move around the office. Specific vision abilities required by this job include close vision, color vision, peripheral vision, depth perception, and ability to adjust focus.  
  • Work across multiple time zones in a hybrid or remote work environment. 
  • Long periods of time sitting and/or standing in front of a computer using video technology. 
  • May require travel dependent on company needs. 

 

The above statements are intended to describe the general nature and level of the job being performed by the individual(s) assigned to this position. They are not intended to be an exhaustive list of all duties, responsibilities, and skills required. HealthEdge reserves the right to modify, add, or remove duties and to assign other duties as necessary. In addition, reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions of this position in compliance with the Americans with Disabilities Act of 1990. Candidates may be required to go through a pre-employment criminal background check.

 

HealthEdge is an equal opportunity employer. We are committed to workforce diversity and actively encourage all qualified persons to seek employment with us, including, but not limited to, racial and ethnic minorities, women, veterans and persons with disabilities. 

 

#LI-Remote 

 

The annual US base salary range for this position is $130,000 to $140,000. This salary range may cover multiple career levels at HealthEdge. Final compensation will be determined during the interview process and is based on a combination of factors including, but not limited to, your skills, experience, qualifications and education.
