Pattern Recognition and Computer Vision Lab with AI - PeRCeiVe.AI Lab
About MLR: Multimodal Learning and Reasoning
Prof. Matteo Pennisi
Principal Investigator

Multimodal Learning and Reasoning (MLR) is a research group whose goal is to develop models capable of jointly learning from and reasoning over multiple data modalities, such as text, audio, images, and video. The group studies methods for effectively integrating heterogeneous sources and for building advanced multimodal representations, with particular attention to reasoning enabled by Large Language Models and to agent-based paradigms for interaction and decision-making. Its main research areas include audio-visual learning, video understanding, and cross-modal learning across text, audio, images, and video. An additional focus is interpretability and privacy preservation, with the aim of making models more transparent, reliable, and understandable, while ensuring the responsible use of data.

Members of MLR: Multimodal Learning and Reasoning Group
Simone Carnemolla
PhD Student
Leonardo Russo
PhD Student
Chiara Russo
PhD Student
Luca Palazzo
PhD Student