Stripe

A new, comprehensive and clean standard for online payment processing

Data Infrastructure Engineer, Foundation

$105k – $140k AngelList Est.

As a platform company powering businesses all over the world, Stripe processes payments, runs marketplaces, detects fraud, and helps entrepreneurs start an internet business from anywhere in the world. Stripe's Data Infrastructure Engineers build the platform and run the data pipelines that manage that data for both internal and external users.

While we don't have as much data as Twitter or Facebook, we care a great deal about the quality of our data. Because every record in our data warehouse can be vitally important for the businesses that use Stripe, we're looking for people with a strong background in big data systems to help us scale while maintaining correct and complete data. You'll be working with a variety of internal teams, some engineering and some business, to help them solve their data needs. Your work will give teams visibility into how Stripe’s products are being used and where we can improve to serve our users’ needs better.

You will:

  • Work with teams to build and continue to evolve data models and data flows to enable data-driven decision-making
  • Design alerting and testing to ensure the accuracy and timeliness of these pipelines (e.g., improve instrumentation, optimize logging)
  • Build and maintain our core big data infrastructure systems (Hadoop, Presto, Airflow)
  • Identify shared data needs across Stripe, understand each team's specific requirements, and build efficient, scalable data pipelines that enable data-driven decisions

You might be a fit for this role if you:

  • Have a strong engineering background and are interested in data. You’ll be writing production Scala and Python code.
  • Have experience developing and maintaining distributed systems built with open source tools.
  • Have experience optimizing the end-to-end performance of distributed systems.
  • Have experience in writing and debugging ETL jobs using a distributed data framework (Hadoop, Spark, etc.)
  • Have experience managing and designing data pipelines
  • Can follow the flow of data through various pipelines to debug data issues


  • Have experience with Scalding or Spark
  • Have experience with Airflow or other similar scheduling tools

It’s not expected that you’ll have deep expertise in every dimension above, but you should be interested in learning any of the areas that are less familiar.

Some things you might work on:

  • Write a unified user data model that gives a complete view of our users across a varied set of products like Stripe Connect and Stripe Atlas
  • Continue to lower the latency and bridge the gap between our production systems and our data warehouse
  • Build tooling to load balance jobs between multiple Hadoop clusters to ensure performance and resiliency of our batch jobs.
  • Build a framework and tools to rearchitect recompute-the-world data pipelines to run incrementally.
  • Work on our customer support data pipeline to track our time to response and our total support ticket volume, so that we can staff our support team appropriately
  • Build a system to automatically set up sort keys and partition keys for Parquet files based on user query patterns
  • Embed with our billing team to create billing pipelines that enable more granular bills and help our users better understand their costs.

United States • Oregon • Remote
Hires remotely
Job type
Visa sponsorship: Not Available

Inclusive coverage

We offer comprehensive mental, physical, and medical health plans, support Stripes’ financial futures, and provide fertility benefits and parental leave.

A principled approach to food

Our food program works with local ingredients and grows a global team through sustainable food practices and minimal waste.

Growth by way of learning

We are voracious learners and teachers. Our Education team delivers an onboarding and product training curriculum for all new Stripes, and hosts expert-led courses on things like project management fundamentals and macroeconomics.

Stripe at a glance

Stripe focuses on Internet, SaaS, Payments, Developer APIs, and Software. The company has offices in New York City, San Francisco, Chicago, and Seattle, and a team of 1,001 to 5,000 employees. To date, Stripe has raised $1.638B of funding; its latest round closed in April 2020.

You can view their website at https://stripe.com or find them on Twitter, LinkedIn, and Product Hunt.
