Team Lead Site Reliability Engineer
Sana Commerce
📍 Alexandria, Alexandria Governorate, Egypt
Job Description
What you'll be doing:
- Leading the SRE team, setting objectives, and guiding the team towards achieving high reliability while balancing cost and performance SLAs.
- Collaborating with platform & product engineering teams to embed reliability and operational best practices into the software development lifecycle.
- Developing and implementing SRE policies and practices, including service level objectives (SLOs), service level indicators (SLIs), and error budgets.
- Driving automation across operations to reduce toil, improve system performance, and ensure scalability, with a healthy aversion to repetitive manual work.
- Overseeing incident management, post-mortem analyses, and root cause investigations to prevent future outages and enhance system reliability.
- Facilitating capacity planning and scalability exercises to manage growth and ensure the efficient use of res...