Machine Learning Data Engineer I

San Diego
Position Type
Full Time

Mitek is looking for a Data Engineer to join our global technology organization. The Data Engineer will be a contributing member of a newly formed cross-functional document engineering team responsible for supporting our quality, research and development organizations to deliver best-in-class machine learning solutions and global document coverage for our Digital Identity Verification platforms. To do this, the Data Engineer will work as part of our Data Operations function to deliver solutions that collect, organize, standardize and process datasets to support ongoing development initiatives.

Mitek delivers a large-scale, globally distributed cloud platform that relies on large quantities of structured, semi-structured and unstructured data to deliver new capabilities and derive insights that drive all aspects of our platform delivery. For this reason, we need someone who is technically sharp, a creative problem solver, productive working independently or collaboratively, and has a data-driven mindset.

What you will do

  • Leverage data management principles and data engineering concepts to implement solutions to manage Mitek’s machine learning processes and data requirements
  • Implement solutions to ingest, label and manage Mitek’s structured, semi-structured, and unstructured data requirements
  • Execute team-defined processes to ensure the quality, consistency and versioning of datasets used in the delivery of Mitek products
  • Facilitate data labeling activities and ensure the quality of all data used to advance Mitek’s machine learning practice
  • Participate in data collection efforts to help teams collect and ingest data appropriately into Mitek’s data lake
  • Run data through test tools to create reports and identify data anomalies and patterns
  • Provide additional support as necessary to create and modify datasets, label data, and manipulate data for product improvement or development purposes

Who you are

  • Detail oriented, with a data-driven mindset
  • Strong problem-solving and troubleshooting skills, with an analytical yet creative and innovative approach
  • Self-starter, with relentless curiosity
  • Thrives in a fast-paced, team-focused start-up culture and adapts to a changing environment
  • Positive, people-oriented, and energetic attitude
  • Excellent verbal and written communication skills
  • Ability to summarize complex issues simply and effectively

What you need

  • Bachelor’s degree in Mathematics, Statistics, Computer Science or related field, accompanied by 0-2 years of relevant experience
  • Successful history of providing documentation to convey information and drive decision making
  • Working knowledge of scripting and development languages to manage and manipulate data to meet identified needs (Python and/or Go preferred)
  • Exposure to working with cloud-based delivery platforms (AWS, Azure, GCP)
  • Understanding of working with datasets used in the delivery of machine learning based solutions
  • Demonstrated ability to work with ambiguous requirements, adapt, and learn

What would be nice

  • Knowledge of data mining, machine learning, natural language processing, or information retrieval
  • Knowledge of Amazon Web Services and associated technologies
  • Exposure to Big Data platforms and technologies
  • Exposure to SQL and NoSQL databases and document stores such as MySQL, Aurora, Redshift, MongoDB, RavenDB, etc.
  • Experience processing large amounts of structured and unstructured data
  • Prior experience in secure practices of handling sensitive data and PII
  • 2-4 years of experience in a quantitative role or 1-2 years of experience in a software engineering role