Vatsal Thakkar
AI Research Leader | CTO at AgenQ | Pioneering Agentic AI Systems
Leading the development of next-generation AI agents and multimodal systems. Bridging cutting-edge research with practical applications in enterprise AI.
About Me
I am an AI research leader and entrepreneur with a passion for developing intelligent systems that can understand, reason, and interact with the world. As CTO at AgenQ, I lead the development of advanced agentic AI systems that are transforming how enterprises leverage artificial intelligence.
My research focuses on multimodal learning, vision-language models, and efficient AI architectures. I have published extensively in top-tier conferences and journals, with work spanning computer vision, natural language processing, and AI safety. I am particularly interested in building AI systems that are not only powerful but also interpretable, efficient, and aligned with human values.
Beyond research, I am committed to bridging the gap between academia and industry, translating cutting-edge research into practical solutions that create real-world impact.
Professional Experience
Chief Technology Officer
Leading the technical vision and development of next-generation agentic AI systems. Architecting scalable AI infrastructure and directing research initiatives in multimodal learning and autonomous agents.
- Spearheading development of enterprise-grade AI agent frameworks
- Building high-performance teams across research and engineering
- Establishing research partnerships with leading academic institutions
- Driving product strategy for agentic AI solutions
Assistant Principal Investigator
Leading AI research initiatives for government applications, focusing on computer vision and natural language processing solutions for public sector challenges.
- Directing research projects in AI for social good
- Collaborating with government agencies on AI policy and implementation
- Mentoring junior researchers and PhD students
- Publishing research in top-tier conferences and journals
Research Associate
Conducting advanced research in computer vision, multimodal learning, and efficient AI architectures. Collaborating with faculty and students on cutting-edge AI projects.
- Developing novel architectures for vision-language models
- Researching efficient training methods for large-scale models
- Publishing in premier AI conferences (CVPR, ICCV, NeurIPS)
- Mentoring undergraduate and graduate students
Machine Learning Engineer
Developed and deployed large language models and multimodal AI systems for Indian languages. Worked on model optimization, deployment, and evaluation frameworks.
- Built training pipelines for large-scale language models
- Optimized models for efficient inference and deployment
- Developed evaluation frameworks for multilingual models
- Contributed to open-source AI tools and libraries
Research Intern
Conducted research on video understanding, action recognition, and multimodal learning under Prof. Shanmuganathan Raman. Published multiple papers in top-tier conferences.
- Developed novel methods for video action recognition
- Worked on vision-language models for video understanding
- Published papers at CVPR, ECCV, and other premier venues
- Collaborated with international research teams
Featured Projects
Agentic AI Framework
Enterprise-grade framework for building and deploying autonomous AI agents with advanced reasoning and planning capabilities.
Multimodal Video Understanding
State-of-the-art models for video action recognition and temporal understanding using vision-language pretraining.
Efficient Model Compression
Novel techniques for compressing large language models while maintaining performance, enabling deployment on resource-constrained devices.
Vision-Language Models
Research on aligning visual and textual representations for improved cross-modal understanding and generation.
AI Safety & Alignment
Research on building safe and aligned AI systems through interpretability, robustness testing, and value learning.
Document Intelligence
Advanced document understanding system with OCR, layout analysis, and information extraction capabilities.
Multilingual LLMs
Development of large language models for Indian languages, with a focus on cultural context and linguistic diversity.
Real-time Object Detection
High-performance object detection system optimized for real-time inference on edge devices and embedded systems.
Conversational AI Platform
End-to-end platform for building contextual chatbots and virtual assistants with advanced dialogue management.
Selected Publications
Bridging the Gap: Learning Pace Synchronization for Open-World Semi-Supervised Learning
Neural Information Processing Systems (NeurIPS) 2024
ActionVOS: Actions as Prompts for Video Object Segmentation
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025
Appearance-Based Refinement for Object-Centric Motion Segmentation
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024
Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024
Frame-level Correlative Fusion Network for RGB-D Salient Object Detection
Computer Vision and Image Understanding 2024
Improving Few-Shot Learning with Self-Supervised Vision Transformers
British Machine Vision Conference (BMVC) 2023
Skills & Expertise
AI & Machine Learning
Frameworks & Tools
Programming Languages
Research & Leadership
Get In Touch
I'm always interested in discussing new opportunities, collaborations, or research ideas. Feel free to reach out!
vatsal.thakkar@iitgn.ac.in