AI Research Scientist

Roots Automation

Software Engineering, Data Science
New York, NY, USA
Posted on Monday, August 28, 2023

Research Scientist, Multimodal NLP 

NYC-based hybrid role

Compensation: $140,000 - $180,000 + equity options + benefits

About Roots Automation: 

At Roots, our mission is to make work more human by creating AI-powered "Digital Coworkers" that automate tedious and repetitive tasks. We focus on core challenges in Natural Language Understanding and Computer Vision while developing an automation product built for the future of work.

We believe in democratizing automation by allowing anyone to create automations by defining tasks in plain English. Our platform aims to provide a robust environment for writing and executing such tasks, translating instructions into tangible outcomes while delivering enterprise-grade results and performance.

Our primary industry focus is Insurance, where success hinges on our customers' ability to read and understand various unstructured legal, medical, and financial documents. To solve this, we recently built a universal document understanding capability, InsurGPT, which leverages an industry-specific fine-tuned large language model.

The Role

As we advance the development of InsurGPT™, our next step is to incorporate multimodal functionality, combining computer vision and natural language understanding to deepen our document understanding capabilities. To accomplish this, we are searching for a passionate research scientist with expertise at the intersection of computer vision and LLMs. Your contributions will play a pivotal role in shaping the future of our automation product.


Responsibilities

  • Conduct foundational research at the intersection of vision and language to build a model that surpasses state-of-the-art results on benchmarks for visual Q&A on documents
  • Write production-grade code to translate research into product
  • Enhance document understanding capabilities of InsurGPT™
  • Collaborate with cross-functional teams, including Product, Engineering, Annotations, and Customer Success, to develop end-to-end applications


Qualifications

  • Ph.D. in Machine Learning or a related area
  • Experience in generative AI and computer vision, preferably in multimodal deep learning
  • A successful track record in research, preferably demonstrated by first-authored publications in top-tier AI conferences such as NeurIPS, ICML, and CVPR
  • Proficient in Python, PyTorch, and common ML modeling libraries
  • Good written and verbal communication skills
  • Contributions to large open-source ML projects (nice-to-have)

As a startup, Roots Automation offers a fast-paced environment with ample growth and learning opportunities across multiple disciplines. Equity ownership opportunities are available for the right candidate. We strongly prefer candidates who are willing to work from the office in New York, NY three days a week.