We’re Mitek, a NASDAQ-listed global leader in mobile capture and digital identity verification solutions built on the latest advancements in AI and machine learning. Our Mobile Verify and Mobile Deposit products power and protect millions of identity evaluations and mobile deposits every day, around the world.
Mitek is committed to the health and safety of our employees and candidates. During this current pandemic, our global team of Mitekians is successfully and productively working full-time from home. Your experience with us - from introduction, to interview, to onboarding - will be a virtual one.
Mitek is seeking a Data Engineer to join our Data Engineering team based out of our San Diego headquarters. We are the analytic muscle behind our large-scale, globally distributed Digital Identity Verification cloud platforms. Our technology teams rely on us to deliver large quantities of structured, semi-structured and unstructured data that enable them to build new capabilities and derive insights that drive all aspects of our platform delivery. Our goal is to continually build and implement solutions that support the operationalization of data acquisition and pipelining.
As a Data Engineer at Mitek, you'll create data solutions that directly impact our machine learning capabilities and global document coverage. We're a big data environment, and you'll be building data sets, working with data sources and data pipelines, and potentially even owning our data modeling. You'll work directly with internal leaders in our Engineering, Product, R&D, and Testing groups to ensure our teams can measure the performance and success of their efforts.
You'll need to have strong skills in Python development, as well as SQL. Experience in a big data environment like Hadoop is critical. And if you know some Golang, that'll be helpful too!
What You'll Do:
- Develop, test and maintain data processing pipelines that provide access to critical data and train the machine learning models at the heart of Mitek’s core product
- Develop and maintain processes to ensure the quality, consistency and versioning of datasets used in the delivery of Mitek products
- Use Tableau to develop analysis, dashboarding and reporting solutions that monitor key measures of the performance of data management initiatives
- Design data collection methods, evaluate large amounts of data, decompose high-level information into details, abstract up from low-level information to a general understanding, analyze trends, and distinguish appropriate requirements
- Provide additional support as necessary to create and modify datasets, label data, and manipulate data for product improvement or development purposes
What You Bring:
- Bachelor's degree in Mathematics, Statistics, Computer Science, or a related discipline
- 3+ years of experience in data engineering or software engineering
- Experience using Python or Go/Golang to perform scripting/development, data management, and data manipulation
- Experience working with datasets used in the delivery of machine learning-based solutions
- Experience with distributed messaging and streaming technologies (RabbitMQ, Kinesis, Kafka)
- Experience creating dashboards and effective data visualization
- Training and experience in statistics, data manipulation, visualization, and analysis
- Successful history of creating and providing documentation to convey information and drive decision making
What Would Be Nice:
- Knowledge of data mining, machine learning, natural language processing, or information retrieval
- 2-4 years of experience in a quantitative role
- Experience using Tableau as a tool for the development of data analytics and decision support solutions
- Experience deploying software to a cloud platform environment (AWS, GCP, Azure)
- Exposure to Big Data platforms and technologies
- Exposure to SQL and NoSQL databases and document stores such as MySQL, Aurora, Redshift, MongoDB, RavenDB, etc.
- Experience processing large amounts of structured and unstructured data
- Prior experience in secure practices of handling sensitive data and PII