Senior Machine Learning Data Engineer

San Diego
Position Type
Full Time

You will be a core member of a newly formed cross-functional document engineering team responsible for supporting our quality, research and development organizations to deliver best-in-class machine learning solutions and global document coverage for our Digital Identity Verification platforms. To do this, as Sr. Machine Learning Data Engineer, you will partner across all technology functions (R&D, Engineering, QA, and Product) to collect, organize, standardize and process datasets to support ongoing development initiatives.

What you will do

Working with a large-scale, globally distributed cloud platform that relies on large quantities of structured, semi-structured and unstructured data to deliver new capabilities and derive insights that drive all aspects of our platform delivery, you will:
  • Work with our Pachyderm data management platform to develop, version and maintain large datasets to support machine learning model development
  • Develop, test and maintain data processing pipelines that provide access to critical data and train the machine learning models central to Mitek’s core product
  • Develop and maintain processes to ensure the quality, consistency and versioning of datasets used in the delivery of Mitek products
  • Manage data labeling activities and ensure the quality of all data used to advance Mitek’s machine learning practice
  • Leverage Tableau for the development of analysis, dashboarding and reporting solutions to monitor key measures associated with the performance of data management initiatives
  • Design data collection methods, evaluate large amounts of data, decompose high-level information into details, abstract up from low-level information to a general understanding, analyze trends, and distinguish appropriate requirements
  • Run data through test tools to create reports and identify data anomalies and patterns
  • Provide additional support as necessary to create and modify datasets, label data, and manipulate data for product improvement or development purposes

Who you are

  • Meticulously detail-oriented, with a data-driven, strategic mindset and excellent, logical and creative troubleshooting and problem-solving skills. Able to summarize complex issues simply and effectively
  • Planning, organization, and facilitation skills
  • A self-starter, with relentless curiosity and the ability to thrive in a fast-paced, team-focused start-up culture within a changing environment
  • Analytical, creative, and innovative approach to solving problems
  • Demonstrated ability to work with ambiguous requirements, adapt, and learn
  • Positive and people-oriented, with an energetic attitude and excellent interpersonal and relationship management skills
  • Able to manage and influence others (both within and outside your own direct work-group)
  • Excellent verbal and written communicator

What you need

  • BA / BS degree, ideally in Mathematics, Statistics, or Computer Science
  • 8+ years of experience in software development or data engineering
  • Demonstrated success working with datasets used in the delivery of machine-learning-based solutions
  • Real-world implementation and usage experience with Pachyderm or equivalent data management platforms for building and maintaining datasets to support machine learning development activities
  • 2-4 years in positions providing data management and managing small- to medium-sized projects
  • Working knowledge of scripting and development languages to manage and manipulate data to meet identified needs (Python and/or Go preferred)
  • Experience creating dashboards and effective data visualization
  • Training and experience in statistics, data manipulation, visualization, and analysis
  • Successful history of providing documentation to convey information and drive decision making

What would be nice

  • Knowledge of data mining, machine learning, natural language processing, or information retrieval
  • 2-4 years of experience in a quantitative role
  • Experience using Tableau as a tool for the development of data analytics and decision support solutions
  • Experience with Amazon Web Services and associated technologies
  • Exposure to Big Data platforms and technologies
  • Exposure to SQL and NoSQL databases and document stores such as MySQL, Aurora, Redshift, MongoDB, RavenDB, etc.
  • Experience processing large amounts of structured and unstructured data
  • Prior experience in secure practices of handling sensitive data and PII