Site Reliability Engineer

Job description

SRE - Site Reliability Engineer - DevOps - AWS - Telecommunication - 9 Months - Paying up to £575 (Inside IR35) - Hybrid

This is an opportunity to work with one of the UK's most successful broadcast providers. It will be a chance to work with some of the latest technology on the market.

Candidates must have commercial experience with site reliability engineering and AWS.

What you'll do:

* Develop a telco-grade PaaS capability: design, document, and implement a PaaS solution to onboard and integrate vendor-provided or requested applications within the telecommunications infrastructure.
* Take part in an on-call rota to action symptoms before they become outages.
* As a senior SRE, be responsible for the engineering and support of production environments, including automation of patches, upgrades, and reliability and performance improvements.
* Own the lab facilities for Dev & Test activities of the PaaS.
* Develop assurance, monitoring, and management capabilities for PaaS infrastructure using Zabbix, Prometheus, Grafana, and ELK stack.
* Act as a technical escalation point for colleagues within the team.
* Act as a day-to-day technical point of contact for engineers in other teams.
* Lead creation of automated reports for various services and PaaS infrastructure.
* Manage the operational playbook for the PaaS infrastructure and the services running within it.
* Automate dashboards and reporting for the platform against SLOs, SLAs and KPIs.
* Support managers with inputs on resourcing as needed.
* Monitor and manage Linux VMs, containers and applications.
* Support and manage the lifecycle of various applications and services, including patching, upgrades, updates and troubleshooting.
* Plan and lead proactive disaster recovery testing.
* Work with suppliers to onboard their VNFs and CNFs.

What you'll bring:

* Experience working with public cloud, OpenStack, VMs and Linux hosts.
* Strong background in automating the configuration and management of large-scale platforms using Linux, Git and a scripting language such as Python, Go or Bash.
* Experience in database deployment and management (SQL and NoSQL), e.g. PostgreSQL and Couchbase.
* Linux system administration & configuration management, primarily with CentOS or Ubuntu.
* Experience of building and maintaining CI/CD pipelines.
* Experience with automation/orchestration with tools such as Ansible and Terraform.
* Knowledge of web servers such as nginx or Apache.

The interview process has begun, so if you believe you have the right experience, please get in touch ASAP!