Enabling collaboration for everyone through security, compliance and insight
Site Reliability Engineer (SRE) - Cloud Platform
$85k – $130k • 0.1% – 0.3%
We are looking for a highly motivated and experienced Site Reliability Engineer (SRE) to join our rapidly growing engineering team.
As a critical member of our Engineering team, you are responsible for building out and optimizing Aware SaaS production infrastructure, scaling and monitoring cloud services, and developing solutions to extend the platform with automation. You will work closely with both product and machine learning software engineers to design infrastructure and coordinate the production deployment of Aware platform assets. This position is full-time and based in our Columbus, OH office.
Responsible for the availability, latency, performance, efficiency, monitoring, and emergency response of Aware production cloud service(s).
Share on-call duty and rotation with Engineering team members
Demand forecasting and capacity planning for cloud infrastructure
50% of your time will be spent coding: building tools, solutions, and reusable script templates for the cloud service(s)
Build release management, configuration management, and monitoring systems with code automation
Proficient in one or more programming languages (e.g., C#, Java, Python, C++)
3+ years of experience in hands-on coding and deploying solutions on cloud platforms (e.g., Azure, AWS, Google Cloud)
3+ years of experience with code deployment and production configuration management
3+ years of experience with Linux
Experience working on a fast-paced, agile development team and a product team focused on customer satisfaction
Experience with on-call duty, production incident response, and performing root cause analysis (RCA)
Experience with log management tools and building monitoring dashboards
Experience managing SLAs and contributing to cross-functional teams
Linux, Docker, Kubernetes, Elasticsearch, PostgreSQL, Service Bus, NoSQL Storage, Git with integrated CI/CD systems.