Vatsal Thakkar

AI Research Leader | CTO at AgenQ | Pioneering Agentic AI Systems

Leading the development of next-generation AI agents and multimodal systems. Bridging cutting-edge research with practical applications in enterprise AI.

About Me

I am an AI research leader and entrepreneur with a passion for developing intelligent systems that can understand, reason, and interact with the world. As CTO at AgenQ, I lead the development of agentic AI systems for enterprise use.

My research focuses on multimodal learning, vision-language models, and efficient AI architectures. I have published extensively in top-tier conferences and journals, with work spanning computer vision, natural language processing, and AI safety. I am particularly interested in building AI systems that are not only powerful but also interpretable, efficient, and aligned with human values.

Beyond research, I am committed to bridging the gap between academia and industry, translating cutting-edge research into practical solutions that create real-world impact.

Professional Experience

Chief Technology Officer

AgenQ
May 2024 - Present

Leading the technical vision and development of next-generation agentic AI systems. Architecting scalable AI infrastructure and directing research initiatives in multimodal learning and autonomous agents.

  • Spearheading development of enterprise-grade AI agent frameworks
  • Building high-performance teams across research and engineering
  • Establishing research partnerships with leading academic institutions
  • Driving product strategy for agentic AI solutions

Assistant Principal Investigator

AI HUB, Government of Gujarat
December 2024 - Present

Leading AI research initiatives for government applications, focusing on computer vision and natural language processing solutions for public sector challenges.

  • Directing research projects in AI for social good
  • Collaborating with government agencies on AI policy and implementation
  • Mentoring junior researchers and PhD students
  • Publishing research in top-tier conferences and journals

Research Associate

Indian Institute of Technology Gandhinagar
May 2024 - Present

Conducting advanced research in computer vision, multimodal learning, and efficient AI architectures. Collaborating with faculty and students on cutting-edge AI projects.

  • Developing novel architectures for vision-language models
  • Researching efficient training methods for large-scale models
  • Publishing in premier AI conferences (CVPR, ICCV, NeurIPS)
  • Mentoring undergraduate and graduate students

Machine Learning Engineer

Sarvam AI
August 2023 - April 2024

Developed and deployed large language models and multimodal AI systems for Indian languages. Worked on model optimization, deployment, and evaluation frameworks.

  • Built training pipelines for large-scale language models
  • Optimized models for efficient inference and deployment
  • Developed evaluation frameworks for multilingual models
  • Contributed to open-source AI tools and libraries

Research Intern

Indian Institute of Technology Gandhinagar
May 2022 - July 2023

Conducted research on video understanding, action recognition, and multimodal learning under Prof. Shanmuganathan Raman. Published multiple papers in top-tier conferences.

  • Developed novel methods for video action recognition
  • Worked on vision-language models for video understanding
  • Published papers at CVPR, ECCV, and other premier venues
  • Collaborated with international research teams

Featured Projects

Agentic AI Framework

Enterprise-grade framework for building and deploying autonomous AI agents with advanced reasoning and planning capabilities.

LLMs, Agents, Python, RAG

Multimodal Video Understanding

State-of-the-art models for video action recognition and temporal understanding using vision-language pretraining.

Computer Vision, PyTorch, Transformers

Efficient Model Compression

Novel techniques for compressing large language models while maintaining performance, enabling deployment on resource-constrained devices.

Model Compression, Quantization, Pruning

Vision-Language Models

Research on aligning visual and textual representations for improved cross-modal understanding and generation.

CLIP, Multimodal, Deep Learning

AI Safety & Alignment

Research on building safe and aligned AI systems through interpretability, robustness testing, and value learning.

AI Safety, Interpretability, Ethics

Document Intelligence

Advanced document understanding system with OCR, layout analysis, and information extraction capabilities.

OCR, NLP, Layout Analysis

Multilingual LLMs

Development of large language models for Indian languages with focus on cultural context and linguistic diversity.

LLMs, Multilingual, NLP

Real-time Object Detection

High-performance object detection system optimized for real-time inference on edge devices and embedded systems.

Object Detection, YOLO, Edge AI

Conversational AI Platform

End-to-end platform for building contextual chatbots and virtual assistants with advanced dialogue management.

Chatbots, Dialogue, LLMs

Selected Publications

Bridging the Gap: Learning Pace Synchronization for Open-World Semi-Supervised Learning

Bo Ye, Kai Gan, Tong Wei, Vatsal Thakkar, et al.

Neural Information Processing Systems (NeurIPS) 2024

ActionVOS: Actions as Prompts for Video Object Segmentation

Vatsal Thakkar, et al.

IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025

Appearance-Based Refinement for Object-Centric Motion Segmentation

Vatsal Thakkar, Junyu Xie, Weidi Xie, Aniket Agarwal

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024

Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation

Yaolin Huang, Yihong Chen, Vatsal Thakkar, Qinglin Lu, et al.

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024

Frame-level Correlative Fusion Network for RGB-D Salient Object Detection

Xiao Hua, Vatsal Thakkar, et al.

Computer Vision and Image Understanding 2024

Improving Few-Shot Learning with Self-Supervised Vision Transformers

Vatsal Thakkar, et al.

British Machine Vision Conference (BMVC) 2023

Skills & Expertise

AI & Machine Learning

Deep Learning, Computer Vision, Natural Language Processing, Multimodal Learning, Reinforcement Learning, LLMs & Foundation Models, Agent Systems, Model Optimization

Frameworks & Tools

PyTorch, TensorFlow, Transformers, CUDA, Docker, Kubernetes, Git, MLflow

Programming Languages

Python, C++, Java, JavaScript, SQL, Bash

Research & Leadership

Research Design, Technical Writing, Team Leadership, Product Strategy, Mentoring, Public Speaking

Get In Touch

I'm always interested in discussing new opportunities, collaborations, or research ideas. Feel free to reach out!

Email

vatsal.thakkar@iitgn.ac.in