Sr. Site Reliability Engineer

Published: 1 month ago

Frame.io

A cloud-based collaboration hub that keeps your teams focused on creating great content
Company Size
201-500
Company Type
Enterprise Software Company
Software

Job Location

Job Type

Full Time

Visa Sponsorship

Available

Hires remotely

Everywhere

Relocation

Allowed

Hiring contact

Janine Grillo

The Role

We’re looking for someone to join our Infrastructure team who can work closely with Backend Services to create more reliable and robust cloud infrastructure as we scale our product.

About Frame.io

Frame.io is changing the future of how videos are made by helping over 1 million creative professionals seamlessly collaborate from all over the world.

We’re backed by Accel, FirstMark, Insight Partners, SignalFire, Jared Leto, and a host of other amazing investors. Our market-leading product is used and loved by companies such as Turner, Disney, NASA, Snapchat, BBC, BuzzFeed, TED, Adobe, Udemy, and many more.

We’re in an exciting period of growth and are always seeking extremely talented and passionate individuals who share our vision for helping visual content creators produce their best work.

About the Role

As a senior member of the Site Reliability Engineering team at Frame.io, you will work to transform and perfect our Kubernetes platform, develop our multi-cloud strategy, reduce infrastructure cost, and make our infrastructure reliable, performant, and competitive. You will have the opportunity to work cross-functionally to transform and maintain monitorable and reliable software systems that serve millions of users every day. We’re looking for someone who has deep technical expertise and experience to join a fast-paced, growing team of SREs tackling challenging problems at scale.

Requirements

  • 8+ years of experience managing cloud infrastructure, including hands-on experience with AWS (or another public cloud), Kubernetes, GitOps, Terraform, Docker, and CI/CD
  • You have worked in multi-cloud environments and developed migration and deployment strategies around them
  • You have experience in setting up SLAs/SLOs/SLIs for key services and establishing the monitoring around them
  • You have deep experience in collaborating with engineering teams and developing tools and technologies for them
  • You have broad knowledge of cloud security and can facilitate close collaboration between our security and infrastructure teams
  • You are just as passionate about troubleshooting issues with distributed systems at scale as you are about automating, coding, and collaborating to solve problems
  • You have materially improved the operability of the systems you've run, through monitoring, service level management, lifecycle management, performance tuning, and documentation
  • You are passionate about reliable, scalable, observable software and have a strong sense of ownership
  • You have substantial experience with a programming language such as Python or Golang
  • You have good knowledge of a centralized configuration tool such as Chef, Puppet, or Ansible
  • Experience in storage technologies and developing cost-effective storage solutions is a plus

Responsibilities

  • Be a thought leader on the SRE team, generating new ideas for building next-generation, cost-efficient infrastructure to host Frame.io services
  • Develop a multi-cloud/storage provider strategy to increase availability and reduce cost
  • Identify and bridge gaps to ensure Frame.io cloud infrastructure is reliable, scalable and secure
  • Continue building, maintaining, and improving our Kubernetes and ECS platforms
  • Run ChaosDays to continuously iterate on how we handle and respond to failure
  • Ensure our platform's reliability by taking part in our periodic on-call duty
  • Partner with product and engineering teams on design, development, and capacity planning to ensure Frame.io continues to scale while maximizing availability and observability
  • Ensure sufficient logging, monitoring and alerting strategies around availability, latency and overall system health
  • Scale systems sustainably through automation, and evolve systems by pushing for changes that improve reliability and velocity
  • Continuously improve Incident Response policies, procedures, tools, automation, and implementation
  • Reduce waste in the infrastructure by leading initiatives to cut cost without compromising the reliability and security of cloud systems
  • Design and implement tools that let engineering interact with the infrastructure and deploy services easily
  • We stay active within the infrastructure and security communities by attending or speaking at industry events like KubeCon and AWS re:Invent, and would love for you to join in if you're interested

Benefits

  • Competitive salary and equity
  • Paid parental leave for primary or secondary caregivers
  • Unlimited PTO and designated paid time off for volunteering
  • Yearly stipend for learning and development
  • Medical, dental, and vision insurance, plus OneMedical membership
  • Flexible Spending Account
  • Monthly Work from Home Stipend
  • One paid company-wide holiday for each month of the calendar year
  • All-company week-long winter and summer breaks

Our Philosophy

Our philosophy is simple. At Frame.io, we believe that working with people of different backgrounds and perspectives allows us to elevate each other and helps us build a better product for our users.

We’re proud to be an equal opportunity employer, and are committed to providing all employees with a work environment that celebrates individuality and remains free from any form of discrimination and harassment. We base our employment decisions on the needs of our business, job requirements, and applicants' qualifications. In other words, we only care that you’re the best person for the job.

More about Frame.io

Perks and Benefits

Healthcare benefits
Parental leave
Generous vacation

Funding

Amount raised: $82.2M
Funded over: 4 rounds
Rounds: Series C (Nov 2019), $50,000,000

Founders

John Traver
Founder • 3 years
Emery Wells
Founder • 3 years
New York City
