Senior Operations / Reliability Engineer
Job
Skill
Redmond, WA (In Person)
$88,400 Salary, Full-Time
Job Description
Senior Operations / Reliability Engineer
Skill
sick time
United States, Washington, Redmond
May 08, 2026
Overview
Placement Type: N/A
Salary: $40-45 Hourly, up to $45.00/hr
Start Date: May 18, 2026
Aquent, a leading talent solutions partner, is excited to collaborate with a pioneering technology company at the forefront of innovation. We are seeking a highly skilled and motivated individual to join a critical team that ensures the seamless operation and reliability of groundbreaking new hardware and software products. This is an unparalleled opportunity to directly influence the stability and success of cutting-edge technology, making a tangible impact on product quality and user experience from day one.
As a Senior Operations / Reliability Engineer, you will be instrumental in supporting live operations, release stability, and prototype device monitoring for a new hardware and software product. Your expertise will be crucial in maintaining the health and performance of systems that power innovative experiences. You will contribute to monitoring device health through telemetry dashboards, investigating issues, assigning bugs, and gathering logs, including hands-on device support, to ensure stability in production. This engineering-oriented operations role involves deep dives into logs, dashboards, alerts, and live system behavior, working closely with software engineers, QA, infrastructure teams, and product leadership. You will play a vital role in ensuring that innovative products reach users with exceptional quality and reliability.
Key Responsibilities:
- Monitor telemetry from services, applications, and prototype devices to assess operational health and identify anomalies, failures, or performance degradation.
- Analyze real-time metrics and logs to support troubleshooting across cloud, on-premises, and prototype device environments.
- Triage operational issues, communicate findings clearly to engineering and product teams, and provide actionable insights based on telemetry trends.
- Support software releases by validating deployments, monitoring live systems, and assessing post-deployment stability during rollouts and updates.
- Identify, debug, and help resolve live issues affecting services, devices, or internal users, partnering with engineering for mitigations and fixes.
- Assist with post-release verification and stabilization reporting, documenting observations, risks, and incidents.
- Support incident response by gathering data, summarizing impact, identifying suspected causes, and tracking mitigation progress.
- Participate in post-incident reviews, document lessons learned, and recommend improvements to monitoring, alerting, and operational procedures.
- Perform in-person troubleshooting for self-hosted systems, prototype devices, or test environments, including device configuration, deployment, and validation.
- Maintain documentation of hardware configurations, operational procedures, environment setup, and observed issues.
- Collaborate closely with software, QA, infrastructure, and product teams to support operational readiness and release reliability.
- Communicate operational status, risks, and technical findings clearly and promptly, providing concise summaries of system health and incident status.
Must-Have Qualifications:
- Bachelor's degree in Computer Science, Computer Engineering, Software Engineering, or a related technical field, or equivalent practical experience.
- 5-7 years of relevant experience in software engineering, DevOps, SRE, production operations, infrastructure, service reliability, or related technical operations roles.
- Experience monitoring live services, applications, infrastructure, or device environments.
- Proficiency in using dashboards, alerts, logs, metrics, and telemetry to diagnose system health and troubleshoot issues.
- Experience supporting software releases, deployments, production validation, or service rollouts.
- Ability to investigate technical issues, summarize findings, and communicate risks clearly to engineering and product teams.
- Experience documenting incidents, operational procedures, known issues, and troubleshooting steps.
- Familiarity with CI/CD workflows, cloud or hybrid infrastructure, release validation, and incident response practices.
- Strong problem-solving skills, communication skills, and the ability to work independently in a fast-moving engineering environment.
- Experience with mobile operating systems.
Nice-to-Have Qualifications:
- More than 7 years of experience in relevant technical operations roles.
About Aquent Talent:
- Aquent Talent connects the best talent in marketing, creative, and design with the world's biggest brands.