Site Reliability Engineer
Broadridge Financial Solutions
📍 Toronto, ON, Canada
Job Description
Job Overview
The Sr. Site Reliability Engineer (SRE) will be responsible for the availability, performance, security, and scalability of Broadridge’s infrastructure and applications. The role works closely with development and operations teams to streamline the software development lifecycle, automate processes, and maintain reliable, scalable systems.
Key Responsibilities
- Monitor systems and lead incident response for production outages; develop and enhance monitoring built on tools such as Datadog.
- Design and maintain scalable infrastructure using IaC tools (Chef, Terraform, Ansible, CloudFormation).
- Ensure stability, performance, and scalability of Linux‑based infrastructure and services while applying SRE practices to meet reliability targets (SLAs, SLOs, SLIs).
- Build, manage, and maintain CI/CD pipelines for rapid and safe release cycles.
- Develop and implement scripts and tooling to automate repetitive operational task...