Site Reliability Engineer
Resolve Tech Solutions
📍 Irving, TX, United States
Job Description
About the Company
Role: SRE RunOps Engineer
Location: Irving, TX
Onsite job
About the Role
Responsibilities
Production Support & Incident Management
- Serve as a primary responder for production incidents, ensuring rapid triage, mitigation, and resolution.
- Lead root cause analysis (RCA) and drive long‑term corrective actions.
- Maintain and improve incident response processes, runbooks, and escalation paths.
- Collaborate with engineering, QA, and product teams to prevent recurrence of issues.
AWS Infrastructure Operations
- Support and optimize AWS services such as EC2, ECS/EKS, Lambda, S3, CloudWatch, IAM, RDS, and VPC networking.
- Monitor system health, performance, and capacity across c...