Hi there, I’m Sunit!

👨‍💻 I’m a Machine Learning Engineer with 7 years of full-time work experience in Machine Learning, Natural Language Processing and Data Science. I graduated with my Master’s in Computer Science from Ohio State University in 2024, where my research focused on conversational AI. My highlight projects from grad school were in Distributed Systems and Advanced Database Systems.

🔬 I have demonstrable experience in Natural Language Processing and Content Recommendation Systems. With the advent of LLMs and their use in resource-constrained environments, I focused on parameter-efficient fine-tuning (PEFT) techniques that adapt LLMs to downstream tasks in dialogue systems and representation learning. This made it possible to apply the domain-specific knowledge of LLMs efficiently in single-task and multi-task settings.

📚 Check out my Master’s Thesis on Task-Oriented Dialogue Systems here.

Selected Experience

🧑🏻‍💼 Machine Learning Engineer at Orbit Systems: Created the NER-Engine to classify and annotate news articles with entities indicating useful leads for JobsOhio to approach businesses looking to invest and expand operations. Designed and developed an asynchronous architecture using Apache Kafka to prompt a large language model (GPT-4o) in a parallelized event micro-batching approach, tagging a continuous stream of scraped articles.

🤖 Research Associate at Ohio State University: Co-led the university’s team in the Amazon Alexa Taskbot Challenge. We developed a versatile and personalizable dialogue agent, hosted on AWS and deployed as an Alexa Skill, to assist users with Cooking & DIY tasks. We published our approach at the ACL/SIGDIAL 2023 conference. Publication, GitHub

🏘️ Senior Machine Learning Engineer at Compass: Created the Similar Homes Recommendation System to recommend house listings to users based on their preferences. I developed a scorer model that was used in a Learning-to-Rank style pipeline to generate house recommendations.

🧑🏻‍💼 Senior Data Scientist at Salesken: Created a Sales Conversations Auto-Pilot tool to help sales agents drive sales conversations in a productive and streamlined manner. Blog

Projects

📚 Knowledge Distillation for Dialogue QA - Trained a specialist LLM using knowledge distillation from GPT-3.5 (ChatGPT) on a task-related QA dataset. Built RAG pipelines using Parent Document Retriever and Contextual Compression Retriever, and performed TruLens evaluation on the RAG Triad (context relevance, groundedness and answer relevance). GitHub

🤖 RLHF summary generation - Created a Reinforcement Learning from Human Feedback (RLHF) based text summarizer to remove toxic (offensive or biased) content from summaries. GitHub. Architected a novel cross-attention based network to generate summaries that focus on questions asked about the passage. GitHub

🏘️ LLM-Prompting/ElasticSearch based Recommendation System - Created an end-to-end property recommendation system that uses LLM prompting for dialogue with the user and ElasticSearch to generate recommendations from the preferences captured during that dialogue. GitHub

🗄️ Comprehensive Study of Modern Data Analytics and Storage Engines - Performed a detailed comparative study of 10 state-of-the-art data analytics and storage engines used as commercial big data warehousing and data processing solutions. Report

Courses and Certifications

Check out my certifications! I am always learning!!

📖 AWS Cloud Technical Essentials - AWS Certificate

📖 Generative AI with Large Language Models - DeepLearning.ai Certificate

📖 Introduction to Retrieval Augmented Generation (RAG) - Duke University Certificate

📖 Building and Evaluating Advanced RAG - Short course by DeepLearning.ai Accomplishment

📖 Natural Language Processing in TensorFlow - DeepLearning.ai Certificate

📖 Deep Learning Specialization - DeepLearning.ai Certificate

📖 Neural Networks for Machine Learning - University of Toronto Certificate

📖 Transfer Learning for NLP with TensorFlow Hub - Coursera Certificate

📖 Introduction to TensorFlow for Artificial Intelligence, Machine Learning, and Deep Learning - DeepLearning.ai Certificate

📖 Machine Learning Foundations: A Case Study Approach - University of Washington Certificate

📖 Siamese Network with Triplet Loss in Keras - Coursera Certificate

📖 Sentiment Analysis with Deep Learning using BERT - Coursera Certificate