Machine Learning Deployment Platform
Seldon is looking for a machine learning operations developer to join our team. We are focused on making it easy for machine learning models to be deployed and managed at scale in production. We provide Cloud Native products that run on top of Kubernetes and are open-core with several successful open source projects including Seldon Core, Alibi:Explain and Alibi:Detect. We also contribute to open source projects under the Kubeflow umbrella including KFServing.
About the role
Design and build scalable machine learning solutions on top of the open source and enterprise Seldon products.
Extend and contribute to the open source projects Seldon is working on.
Contribute to conferences and working groups within production machine learning deployment.
A degree or higher level academic background in a scientific or engineering subject.
Familiarity with Linux-based development.
At least 2 years of experience in industry or academia showing completed projects.
Core skills: (The role will focus on these skills, so we expect existing experience or a demonstrable desire to learn them)
Experience with Kubernetes and the ecosystem of Cloud Native tools.
Experience using machine learning tools in production.
Experience with Go.
Desired skills: (Any of these will be of great interest to us)
A record of open source contributions.
A broad understanding of data science and machine learning.
Familiarity with Kubeflow, MLflow or SageMaker.
Familiarity with Python tools for data science.
About our tech stack
Some of our high profile technical projects:
We are core authors and maintainers of Seldon Core, the most popular Open Source model serving solution in the Cloud Native (Kubernetes) ecosystem
We built and maintain the black box model explainability tool Alibi
We are co-founders of the KFServing project, and collaborate with Microsoft, Google, IBM and others on extending the project
We are core contributors to the Kubeflow project and meet weekly on several workstreams with Google, Microsoft, Red Hat and others
We are part of the SIG-MLOps Kubernetes open source working group, where we contribute through examples and prototypes around ML serving
We run the largest TensorFlow meetup in London
And much more 🚀
Some of the technologies we use in our day-to-day:
Go is our primary language for all things backend infrastructure, including our Kubernetes Operator and our new Go Microservice Orchestrator
Python is our primary language for machine learning, and powers our most popular Seldon Core Microservices wrapper, as well as our Explainability Toolbox Alibi
We leverage the Elastic Stack to provide full data provenance on inputs and outputs for thousands of models in production clusters
Metrics from our models are collected using Prometheus, with custom Grafana integrations for visualisation and monitoring
Our primary service mesh backend leverages the Envoy Proxy, fully integrated with Istio, but also with an option for Ambassador
We leverage gRPC protobufs to standardise our schemas and reach unprecedented processing speeds through complex inference graphs
We use React.js for all our enterprise user products and interfaces
We use Kubernetes and Docker to schedule and run our core cloud native technology stack
Location: London, Cambridge, New York or San Francisco.
Share options to align you with the long-term success of the company.
Exciting phase of fast-paced start-up challenges with an ambitious team and unlimited potential for professional growth.
Access to discounted lunches, gyms, shopping and cinema tickets.
Cycle To Work Scheme.
Our interview process normally consists of a phone interview, a coding task, and a final interview of 2-3 hours (carried out virtually). We promise not to ask you any brain teasers or trick questions. We might design a system together on a whiteboard, the same way we often work together, but we won’t make you write code on one. Our recruitment process takes 3 weeks on average.