Sr. Data Engineer / Data Manager

San Diego
Position Type: Full Time

You will be a core member of a newly formed cross-functional document engineering team responsible for supporting our quality, research and development organizations to deliver best-in-class machine learning solutions and global document coverage for our Digital Identity Verification platforms. To do this, you will partner across all technology functions (R&D, Engineering, QA, and Product) to collect, organize, standardize and process datasets to support ongoing development initiatives.

Working with a large-scale, globally distributed cloud platform that relies on large quantities of structured, semi-structured and unstructured data to deliver new capabilities and derive insights that drive all aspects of our platform delivery, you will:

  • Work with our Pachyderm data management platform to develop, version and maintain large datasets to support machine learning model development
  • Develop, test and maintain data processing pipelines that provide access to critical data and train the machine learning models central to Mitek’s core product
  • Develop and maintain processes to ensure the quality, consistency and versioning of datasets used in the delivery of Mitek products
  • Manage data labeling activities and ensure the quality of all data used to advance Mitek’s machine learning practice
  • Leverage Tableau to develop analysis, dashboarding and reporting solutions that monitor key measures of the performance of data management initiatives
  • Design data collection methods, evaluate large amounts of data, decompose high-level information into details, abstract up from low-level information to a general understanding, analyze trends, and distinguish appropriate requirements
  • Run data through test tools to create reports and identify data anomalies and patterns
  • Provide additional support as necessary to create and modify datasets, label data, and manipulate data for product improvement or development purposes

Who you are

  • Meticulously detail-oriented, with a data-driven, strategic mindset and excellent, logical and creative troubleshooting and problem-solving skills. Able to summarize complex issues simply and effectively
  • Planning, organization, and facilitation skills
  • A self-starter, with relentless curiosity and the ability to thrive in a fast-paced, team-focused start-up culture within a changing environment
  • Analytical, creative, and innovative approach to solving problems
  • Demonstrated ability to work with ambiguous requirements, adapt, and learn
  • Positive and people-oriented, with an energetic attitude and excellent interpersonal and relationship management skills
  • Able to manage and influence others (both within and outside your own direct work group)
  • Excellent verbal and written communicator

What you need

  • BA / BS degree, ideally in Mathematics, Statistics, or Computer Science
  • 8+ years of experience in software development or data engineering
  • Demonstrated success working with datasets used in the delivery of machine-learning-based solutions
  • Real-world implementation and usage experience with Pachyderm or equivalent data management platforms for building and maintaining datasets to support machine learning development activities
  • 2-4 years in positions providing data management and managing small to medium-sized projects
  • Working knowledge of scripting and development languages to manage and manipulate data to meet identified needs (Python and/or Go preferred)
  • Experience creating dashboards and effective data visualization
  • Training and experience in statistics, data manipulation, visualization, and analysis
  • Successful history of providing documentation to convey information and drive decision making

What would be nice

  • Knowledge of data mining, machine learning, natural language processing, or information retrieval
  • 2-4 years of experience in a quantitative role
  • Experience using Tableau as a tool for the development of data analytics and decision support solutions
  • Experience with Amazon Web Services and associated technologies
  • Exposure to Big Data platforms and technologies
  • Exposure to SQL and NoSQL databases and document stores such as MySQL, Aurora, Redshift, MongoDB, and RavenDB
  • Experience processing large amounts of structured and unstructured data
  • Prior experience with secure practices for handling sensitive data and PII