Datadog

Modern monitoring & analytics. See inside any stack, any app, at any scale, anywhere

Software Engineer - Site Reliability

$110k – $160k AngelList Est.

About Datadog:

We're on a mission to build the best platform in the world for engineers to understand and scale their systems, applications, and teams. We operate at high scale (trillions of data points per day), providing always-on alerting, metrics visualization, logs, and application tracing for tens of thousands of companies. Our engineering culture values pragmatism, honesty, and simplicity to solve hard problems the right way.

The team:

The Site Reliability teams at Datadog are responsible for keeping our high-volume, low-latency environments performing around the clock. These teams collaborate closely with our product engineers so that Datadog can monitor millions of servers and containers and our customers always have dependable and actionable data at their fingertips. You’ll be responsible for shaping the infrastructure of our data-intensive, real-time services as we continue to grow at petabyte scale.

You will:

  • Keep our service reliable, available and fast as a member of the operations team.
  • Respond to, investigate and fix service issues, whether they be deep in the OS kernel or in the application code.
  • Design, build and maintain the infrastructure we need to support orders of magnitude more customers.

Requirements:

  • You have a BS/MS/PhD in a scientific field or equivalent experience
  • You have a track record as an engineer in the operations of a large site
  • You value correctness and efficiency; you leave no stone unturned when diagnosing production issues
  • You handle infrastructure with code because automation lets you focus on the more difficult and rewarding problems
  • You have production experience with distributed compute/storage tools, e.g. ZooKeeper, Cassandra, Postgres, Kafka, Elasticsearch, Redis

Bonus points:

  • You have submitted bug fixes to the aforementioned projects
  • You are fully fluent in Python, Ruby and Go

Is this you? Tell us why, and apply now. Include links to your GitHub, Stack Overflow or other online projects.

Location
Paris • Eugene • Remote
Hires remotely
Everywhere
Job type
Full-time
Visa sponsorship
Not Available

Medical insurance

Retirement savings plan

Open paid time off

Catered lunches

Snacks & drinks

Fitness fund

Commuter benefits

Outings & events

Referral bonus

Datadog at a glance

Datadog focuses on SaaS, Enterprise Software, Information Technology, Analytics, and Software. The company has offices in New York City, San Francisco, Boston, and Chicago. It has a large team of between 1,001 and 5,000 employees. To date, Datadog has raised $147.9M of funding; its latest round closed in September 2019 at a valuation of $11B.

You can view their website at https://www.datadoghq.com or find them on Twitter and LinkedIn.

More jobs at Datadog

Systems Reliability Engineer - Multicloud

Software Engineer - Site Reliability

Software Engineer - Site Reliability, Network Edge

Engineering Team Lead

Open-Source Software Engineer - .NET / C#
