Data Engineer (2+ years exp)
Keystone Strategy

Job Type: Full Time
Visa Sponsorship: Not Available
Relocation: Allowed
Hiring contact: Pedro Veintimilla

The Role
**Primary Responsibilities**
Develop and maintain data pipeline architecture for ETL and other processing: Build infrastructure required for extraction, transformation, and loading of data from a wide variety of data sources using cloud ‘big data’ technologies.
Translate large volumes of raw, unstructured data into highly visual and easily digestible formats
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Help create, maintain, and implement tools, libraries, and systems to increase the efficiency and scalability of the team
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
**Skills Required**
Python, Jupyter
AWS: S3, Redshift, SageMaker, IAM, ECS, Fargate, data products (e.g. Athena, Glue); Heroku
Databases: SQL, MongoDB
Workflow management engines (e.g. Airflow, AWS Step Functions, Luigi, Kubeflow)
DevOps: CircleCI, GitHub, Docker, Kubernetes
Experience improving efficiency, scalability, and stability of system resources
**Nice to have**
Experience with Machine Learning pipelines and deployment
Kubernetes, Papermill, Kafka, Dremio, Snowflake, Spark, Hadoop, and similar (e.g. MapReduce, YARN, HDFS, Hive, Presto, Pig, HBase, Parquet)
Experience with GCP (e.g. BigQuery, Google Cloud Composer) and Azure (Azure Data Factory, Databricks, ADLS)
Scrapy and web crawling
Send me a message or email at [email protected]