Data Lake Architect (8+ years exp)
FreightVerify

Job Location:
Job Type: Full Time
Visa Sponsorship: Not Available
Relocation: Allowed

The Role
FreightVerify is a supply chain technology company that provides real-time transportation visibility and business intelligence for global enterprise clients. Using the latest IoT, big data, and AI technologies, FreightVerify provides a neutral platform that simplifies complex global supply chains through a common design language, leveraging new and emerging tracking technologies.
AI (our comprehensive term for data science, machine learning, and deep learning) is the key technology that gives our platform its unique edge over our competition. This role is your opportunity to help build and deploy new AI products from the ground up.
The Data Lake Architect will join the company's AI team, which is responsible for the system architecture, design, and development of a large, scaled-out, real-time, high-performance Hadoop infrastructure.
Responsibilities
- Architect, design, and build a big data infrastructure platform, primarily based on Hadoop technologies on the Azure cloud platform, ensuring the data lake meets defined availability and security thresholds.
- Actively collaborate with the Advanced Products and AI teams to define a big data platform that achieves company product and business objectives.
- Actively collaborate with other technology teams and architects to define and develop technology stack synergies and interactions across companies and functions.
- Actively collaborate with cross-functional teams to establish, document, implement, and reinforce disciplined software development processes and best practices.
- Be proficient in SQL and ETL processes, ETL and database performance tuning, table partitioning, and shell scripting; drive prototypes and proofs of concept.
- Set up and implement infrastructure for custom data visualization tools.
- Research and experiment with emerging technologies and tools related to big data. Share your findings with the Advanced Products and AI teams.
- Maintain and support the platform to agreed service standards on a day-to-day basis.
Qualifications
- Bachelor's degree from an accredited university or college in Business, Information Systems, Information Technology (IT), or Computer Science.
- Minimum of 8 years of experience as a technologist designing and developing data models.
- A minimum of 3 years specializing in big data architecture or data analytics (including technologies such as Hadoop, NoSQL, MapReduce, and other industry big data frameworks).
- Understanding of Apache Hadoop and the Hadoop ecosystem. Experience with one or more relevant tools such as Sqoop, Flume, Kafka, Oozie, ZooKeeper, or HCatalog.
- Experience in database and big data technologies, including MPP and NoSQL databases, data warehouse design, ETL, BI reporting, and dashboard development.
- Familiarity with one or more Hadoop technologies (Hive, Impala, Spark SQL, Presto).
- Experience developing software in one or more programming languages (Python, R, etc.).
- Knowledge of best practices related to data lakes, data lake governance, data security (e.g., HITRUST), and data integration and interoperability.
- Experience with agile development and DevOps.
We are an equal opportunity employer, hiring and developing individuals from diverse backgrounds and experiences to add to our collaborative culture. We do not discriminate in our recruiting, hiring, and promotion processes. Candidates and employees are treated with respect regardless of race, religion, sex, age, sexual orientation, gender identity and expression, national origin, veteran status, or disability.