Data Engineer (Python and Spark)
Kinetic Technology (3+ years experience)
Job Type: Full Time
Visa Sponsorship: Not Available
Remote Work Policy: Onsite or remote
Preferred Timezones:
Relocation: Allowed
Skills:
The Role
We are seeking a skilled Data Engineer with 3 to 5 years of industry experience to join our dynamic team. You will design, develop, and maintain our data infrastructure and processing pipelines using Python and Spark, applying your data engineering expertise and strong programming skills to the efficient processing, analysis, and storage of large-scale datasets. This is an exciting opportunity to work on challenging projects and collaborate with cross-functional teams in a fast-paced environment.
Responsibilities:
- Design and develop scalable and robust data processing pipelines using Python and Spark, ensuring efficient data ingestion, transformation, and storage.
- Collaborate with data scientists, analysts, and stakeholders to understand data requirements and implement appropriate data engineering solutions.
- Optimize and tune data processing workflows to enhance performance, reliability, and scalability.
- Build and maintain data warehouses, data lakes, and other data storage systems to support analytics and reporting needs.
- Develop and maintain ETL (Extract, Transform, Load) processes to integrate data from various sources into a unified format for analysis.
- Monitor data infrastructure and pipelines, identify and resolve issues, and ensure data quality and integrity.
- Implement and maintain data security and privacy measures to protect sensitive information.
- Stay updated with industry trends and emerging technologies in data engineering and provide recommendations for process improvements and tool selection.
- Collaborate with cross-functional teams to define and implement data governance policies and best practices.
- Document data engineering processes, standards, and workflows.
Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 3 to 5 years of relevant industry experience as a Data Engineer.
- Strong programming skills in Python, including experience with data manipulation, data structures, and algorithmic problem-solving.
- Proficiency in Apache Spark for big data processing, including experience with Spark SQL, Spark Streaming, and Spark MLlib.
- Solid understanding of distributed computing principles and concepts.
- Experience with data modeling, data warehousing, and ETL processes.
- Familiarity with cloud platforms like AWS, Azure, or GCP, and their data services (e.g., Amazon EMR, Azure Databricks).
- Proficient in SQL and experience with relational databases (e.g., PostgreSQL, MySQL).
- Strong analytical and problem-solving skills with the ability to handle complex data scenarios.
- Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams.
- Attention to detail and a strong commitment to data quality, integrity, and security.
- Experience with version control systems (e.g., Git) and agile development methodologies is a plus.
If you are a passionate Data Engineer with a strong foundation in Python and Spark who enjoys challenging data engineering projects, we would love to hear from you. Join our team and help build cutting-edge data solutions that drive business insights and innovation.