We're working with a globally recognised financial markets organisation undergoing a significant evolution in how it monitors and understands its systems.
This is a hands-on contract role focused on improving and scaling an existing observability environment, with the opportunity to help shape and potentially build out a next-generation solution.
What You'll Be Doing
- Designing and improving telemetry pipelines (metrics, logs, traces) across distributed systems
- Enhancing monitoring, alerting, and observability standards
- Working closely with engineering teams to improve system visibility and reliability
- Integrating and optimising tools such as OpenTelemetry, Prometheus, Grafana, Kafka, Splunk, Elastic, Loki
- Driving better incident response, root cause analysis, and performance insights
- Embedding observability into the software development life cycle (CI/CD, testing, deployments)
What We're Looking For
- Strong experience in Observability, SRE, or Platform Engineering
- Hands-on expertise with modern telemetry and monitoring tooling
- Experience working in cloud-native environments (Kubernetes, microservices)
- Ability to define and improve SLIs, SLOs, and alerting strategies
- Proficiency in at least one programming language (Python, Go, or Java)
- Experience with infrastructure-as-code (Terraform, Ansible)
Nice to Have
- Experience in financial services or regulated environments
- Exposure to large-scale, complex distributed systems
Why Apply?
- High-impact role within a large-scale, globally recognised environment
- Opportunity to shape observability strategy and tooling direction
- Modern tech stack and engineering practices
- Strong chance of contract extension and long-term engagement
Location: Primarily remote, UK-based (preference for candidates near London)
Duration: Initial 6 months (strong likelihood of extension)
Start: ASAP
Rate: TBC (Inside IR35)
Please note: this role requires a high level of expertise, and we are only able to consider candidates with strong and directly relevant experience.