A Globally Scalable HTAP Database

Site Reliability Engineer

PingCAP is the fastest-growing enterprise subscription company our investors have ever seen. And how are we growing so fast? By building TiDB, a globally scalable hybrid transactional and analytical processing (HTAP) database and one of the most popular open source databases in the world (don’t take our word for it, check it out: https://github.com/pingcap/tidb). TiDB enables companies to painlessly scale their business while keeping the underlying infrastructure simple. Our product has been trusted and verified by web-scale application leaders and adopted by over 1,000 users across industries. We’re led by the best in the space: our founders were the original creators of TiDB. We’re looking for talented and amazing team players who want to accelerate our growth while doing some of the best work of their careers. Join us as we build the database for the future!

About the Role:

  • The Site Reliability Engineer is responsible for building and managing the infrastructure for TiDB Cloud. You will design and implement systems capable of running large-scale, multi-tenant distributed databases smoothly across several cloud providers. You will also be responsible for the availability, monitoring, emergency response, and capacity planning of the service.

You Will:

  • Design, write, and deliver systems to improve the functionality and reliability of the infrastructure.
  • Design, implement, and operate the automation and monitoring of TiDB Cloud to maximize availability.
  • Manage the infrastructure of TiDB Cloud, including participating in a weekly on-call rotation.
  • Keep a complex system running and solve problems relating to mission-critical services.
  • Run and automate stability and disaster-recovery tests to help improve overall resilience to failures.

You Have:

  • Strong fundamentals in distributed systems design and operation.
  • Expertise in analyzing, monitoring, and troubleshooting large-scale distributed systems.
  • Excellent understanding of infrastructure, virtualization, containers, networking, storage, and databases.
  • Strong hands-on experience with at least one of the following programming languages: Go, TypeScript.
  • Expertise in working with major cloud providers such as AWS, Azure, and GCP, and with their cloud APIs.
  • Ability to distill a complex set of requirements into defined deliverables.
  • Strong communication skills and a results-driven attitude.

Our Benefits:

  • Competitive salary
  • Equity in a fast-growing enterprise startup
  • Awesome, supportive coworkers with a good sense of humor
  • Working with a globally distributed team of passionate (and compassionate) developers, hackers, and open-source fanatics
  • Remote friendly
  • Medical, dental, and vision insurance
  • 401(k) retirement plan
  • Flexible paid time off
  • In-office catered lunch, snacks, and drinks
  • Commuter benefits
  • Gym reimbursement
  • Employee referral bonus program

PingCAP is proud to be an Equal Opportunity Employer building a diverse and inclusive workforce.

San Mateo, Remote
Hires remotely in: San Mateo
Hiring contact

Shasha Li

PingCAP at a glance

PingCAP focuses on open source, databases, software, cloud data services, and big data analytics. The company has offices in San Mateo and a large team of between 201 and 500 employees. To date, PingCAP has raised $65M in funding; its latest round closed in September 2018.

You can view their website at https://pingcap.com/en/ or find them on Twitter, Facebook, and LinkedIn.
