The World's Most Widely Used Push Messaging Platform

Site Reliability Engineer

OneSignal has grown rapidly: today we serve billions of HTTP requests and send more than 8 billion messages daily. We achieved this scale by leveraging bare-metal cloud and writing scale-sensitive components in languages like Rust and Go. This potent combination of high-performance, low-cost hardware and efficient resource utilization has given us an incredible competitive edge.

In collaboration with our UK partner Elements Global Services, we are hiring Site Reliability Engineers to help us continue to scale by operating and engineering the future of our infrastructure. We maintain 99.95% uptime today, and we are investing to ensure we sustain that as the business continues to grow and the product evolves.

Your primary task will be software engineering with a focus on infrastructure, operations, and automation. You'll be building systems to run our product, improving internal services, and advising product teams on architecture as it relates to the operability of the service.

The systems you'll be responsible for include all of the services that power our product. These range from off-the-shelf services like haproxy, nginx, Redis, PostgreSQL, and Kafka to our in-house services such as the Rails web app, various Rust backend services, and our high-performance API layer written in Go.

You'll be working with Kubernetes to automate our datacenter operations and writing operational services to automate database operations. One of the key challenges in this role is not only to understand systems well enough to operate them by hand, but also to understand them in sufficient detail to write software that automates those operations.

For some additional context on how we think about SRE, please see the introductory chapter of the Google SRE book.

Skills and experience:

  • At least 3 years of experience working as a software engineer
  • Experience operating reliable production systems at scale
  • Knowledge of Linux systems internals
  • Experience writing networking applications
  • Easily bored by running tasks by hand, and able to automate them
  • Experience with PostgreSQL

Preferred skills and experience:

  • Operational experience deploying and managing Kubernetes on bare metal
  • Experience writing Kubernetes controllers and operators
  • Recent experience writing Go and/or Rust
  • Past experience as an SRE
  • Experience working with Layers 1-3 of the OSI networking model
  • Experience with any of Redis, Kafka, etcd, ZooKeeper, nginx, haproxy

In keeping with our beliefs and goals, no employee or applicant will face discrimination or harassment based on race, color, ancestry, national origin, religion, age, gender, marital or domestic partner status, sexual orientation, gender identity, disability status, or veteran status. Above and beyond discrimination or harassment based on "protected categories," we also strive to prevent other, subtler forms of inappropriate behavior (e.g., stereotyping) from ever gaining a foothold in our office. Whether blatant or hidden, barriers to success have no place in our culture. This role is with our UK partner Elements Global Services, which is dedicated to supporting the OneSignal business and is seeking a Site Reliability Engineer to help scale OneSignal's engineering capabilities.

Location: London
Job type: Full-time
Visa sponsorship: Not available
Experience: 5+ years
Hiring contact: Lina Rizzo

Healthcare benefits

100% of health and dental insurance premiums paid for.

Retirement benefits

Parental leave

Equity benefits

Vacation policy

Unlimited vacation policy with a 3-week recommendation.

Company meals

Healthy catered lunch & dinner options every day.

Wellness benefits

Company events

Fun events like chocolate making, bike rides, game nights, and two company off-sites per year.

No-meeting Wednesday

We don't schedule internal meetings on Wednesdays.

Choice of workstation

Choose your own hardware and operating system.

OneSignal at a glance

OneSignal focuses on Mobile, Enterprise Software, Advertising Platforms, Apps, and Developer Tools. The company has offices in New York City and San Mateo and a mid-size team of 51-200 employees. To date, OneSignal has raised $34.47M of funding; its latest round closed in August 2019.

You can view their website at https://onesignal.com/ or find them on Twitter, Facebook, LinkedIn, and Product Hunt.
