On the Data Automation team, we develop machine learning models and infrastructure to extract key information from all kinds of financial documents, such as analyst recommendations, corporate/municipal bond offerings and earnings releases. Our team has built some of the world's most sophisticated neural networks, which have beaten the performance of even the best analysts in the market on tasks that often demand 99.9% precision. The models we build enable our clients to get accurate answers fast. The work we do is critical, as a single prediction error can have a market-moving effect.
As part of our team, you will lead the research on cutting-edge ML/NLP techniques and design efforts for the most efficient and practical application of those techniques to complex business problems. You will utilize our automated ML suite, equipped with annotation platforms, for collecting training data and hyper-tuning models, as well as deploy your application on our scalable ETL infrastructure. If the idea of applying technology and information retrieval techniques to solve complex data problems excites you, keep reading.
In the upcoming year, you should expect to work on the following:
Replace our traditional Seq2Seq and BiLSTM annotation models with modern BERT/ELMo language models
Employ weakly supervised learning approaches for understanding structural associations embedded in various document layouts
Apply advanced NLP techniques for multi-entity disambiguation, and reinforcement learning to replace heuristic-based decision trees
You will also collaborate closely with financial domain experts to gain valuable insights and leverage their business expertise to increase accuracy in annotating training data. The right combination of cross-field ML techniques, a deep understanding of the business problem and high-quality training data is fundamental to our models beating the precision of most academic and industry standards, and it presents a challenging opportunity! If this sounds like a challenge you are up for, please apply below.
We'll trust you to:
Learn cutting-edge research in advanced ML & NLP topics and devise an efficient application for projects
Direct ML strategy for the team, work closely with the ML platform and ETL infrastructure teams, and guide truth-gathering efforts
Drive, design & develop projects as the principal point-of-contact, with the ability to determine suitable ML models, direct feature engineering processes and negotiate KPIs per business needs
Participate in technical conferences, publish papers and evaluate new technologies
You'll need to have:
A strong statistical background in ML, NLP and deep learning models, along with familiarity with probabilistic information retrieval and optimization methods
Professional experience building and deploying ML applications to production
2+ years of hands-on experience in Python/C/C++ development and knowledge of distributed, scalable architectures and CI/CD tooling
A solid understanding of data structures, algorithms and software design concepts
Strong communication skills and interest in learning financial product domains
BA, BS, MS or PhD in Computer Science, Data Science or a related technology field
We'd love to see:
Knowledge of advanced concepts such as weakly supervised learning, reinforcement learning and active learning
Familiarity with SQL and NoSQL data modeling and exploratory data visualization
Professional experience as a technology lead or architect
Authored research publications, participation in ML competitions, or working demos/repos