Okta

Secure identity management

Principal Infrastructure Site Reliability Engineer (San Jose, CA) (Remote Eligible)

$165k – $210k AngelList Est.

We are looking for an experienced Principal Site Reliability Engineer to join our Technical Operations team. At Okta, we are "Always On." That starts with this team, which ensures that customers never have to worry about the Okta service. The team strives to build the most reliable and performant systems on the planet.

We are looking for a Principal Engineer who has experience with, and a passion for, designing and running complex large-scale services on one or more public cloud platforms. This role requires collaboration with the Okta Software Engineering and Site Reliability Engineering teams to provide solutions that improve their productivity as they build, manage, and run their teams' services on the Okta infrastructure with high availability, reliability, and performance. The ideal candidate is someone who welcomes the challenge and enjoys seeing their designs run at scale with automation, testing, and tuning. If you exemplify the ethic of "If you have to do something more than once, automate it," we want to hear from you!

What You'll Do:

  • Execute on initiatives to build Okta's production infrastructure with a focus on automation and scale for multiple public clouds
  • Promote and apply best practices for building scalable and reliable services across the team
  • Be a subject matter expert with public cloud infrastructure and how Okta services can run on them efficiently and at scale
  • Design, build, run and monitor Okta's production infrastructure
  • Drive initiatives to evolve our current platform to increase efficiency and keep it in line with current standards and best practices
  • Respond to production incidents and determine how we can prevent them in the future
  • Identify and automate manual processes
  • Develop and deliver solutions that serve as a model for others with regard to execution, quality, scalability, operability, and maintainability
  • Communicate and collaborate across levels, functions and engineering teams
  • Mentor and coach junior engineers to leverage their full potential

Qualifications for the role:

  • Track record of leading successful large-scale infrastructure projects
  • 8+ years of experience designing and running large-scale solutions on public cloud
  • 2+ years of experience with Docker, Kubernetes or cloud-managed Kubernetes, and service mesh
  • Knowledge of network and edge technologies
  • Demonstrate strong Linux fundamentals
  • 3+ years of experience automating systems and infrastructure with Terraform
  • Experience automating and running large-scale production services on public cloud providers
  • Ability to code to a good standard in a programming language, using standard software development practices such as unit testing and iterative development
  • Experience working with Agile methodologies
  • Champion excellent documentation and communication skills, with the ability to influence others

Education and Training:

  • BS in Computer Science (a plus) or relevant experience

Okta is rethinking the traditional work environment, providing our employees with the flexibility to be the most creative and successful versions of themselves, no matter where they are located. We enable a flexible approach to work, meaning you can work from the office or from home, regardless of where you live. Okta invests in the best technologies and provides flexible benefits and collaborative work environments and experiences, empowering employees to work productively in the setting that best suits their needs. Find your place at Okta: https://www.okta.com/company/careers/

Okta is an equal opportunity employer.


San Jose, Remote
Hires remotely
Job type
Visa sponsorship: Not Available

Healthcare benefits

Equity benefits

Generous vacation

Company meals

Frequent catered lunch in some offices

Wellness benefits

Volunteer opportunities

Okta at a glance

Okta focuses on Enterprise Software, Information Technology, Telecommunications, Security, and CRM. The company has offices in San Francisco and Chicago and employs between 1,001 and 5,000 people. To date, Okta has raised $229.25M in funding; its latest round closed in September 2015 at a valuation of $1.175B.

You can view their website at http://www.okta.com or find them on Twitter.
