The Sony and IIT Hyderabad collaboration through the Sony Research Award Program represents a significant stride in Continual Learning

Research team led by Professor Srijith
As one of the world's most innovative and recognizable brands, Sony is committed to supporting university research and innovation in the U.S., Canada, India, and select European countries, while also fostering partnerships with university faculty and researchers. One of its many initiatives is the Sony Research Award Program, currently in its ninth year. A prime example is the fruitful collaboration between Professor P.K. Srijith of the Indian Institute of Technology, Hyderabad and Dr. Pankaj Wasnik of Sony. In this article, Professor Srijith delves into the work done alongside the Sony team and its outcome: a novel approach to continual learning in pre-trained large models (PLMs) that combines two popular Parameter-Efficient Fine-Tuning (PEFT) techniques and represents a significant stride in Continual Learning.
Krystelle M: How would you say your collaboration with Sony through the Sony Research Award Program has impacted your research?
Professor Srijith P.K.: I am very grateful to have received the Sony Research Award, which allowed me to pursue my research interests in machine learning and deep learning. The award provided me with generous funding to establish infrastructure, along with opportunities to collaborate with Sony researchers working in Artificial Intelligence. It also enables recipients to attend international conferences and workshops, obtain valuable feedback from experts on their work, and initiate international collaborations. The Sony Research Award has been a rewarding and enriching experience for me, and I highly recommend it to anyone who wants to advance their area of research.
I would also like to mention that our jointly written paper titled ‘AdaPrefix++: Integrating Adapters, Prefixes and Hypernetwork for Continual Learning’ has been accepted at the WACV Conference 2025.
Krystelle M: Tell us about your novel approach to performing continual learning in pre-trained large models (PLMs).
Professor Srijith P.K.: Machine Learning (ML) has long aspired to achieve human-like cognitive abilities. With the advent of deep learning (DL) models, we've seen remarkable improvements in tackling complex problems like image recognition and segmentation. However, a significant challenge remains: how can we create models that learn continuously without forgetting previous knowledge?
In real-world scenarios, AI models often encounter tasks sequentially without access to past information. Traditional DL models struggle with this, experiencing "catastrophic forgetting" – the loss of previously learned information when trained on new tasks. The continual learning (CL) paradigm aims to address catastrophic forgetting and enable models to learn continuously from a sequence of tasks. With the rising popularity of pre-trained large models (PLMs), using them for continual learning has become an attractive option. However, training PLMs from scratch is resource-intensive, requiring vast amounts of data and computational power.
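For intuition, here is a minimal PyTorch sketch of that sequential setting. The toy model and the hypothetical task_stream of per-task data loaders are illustrative assumptions; the point is only that once a task's data is gone, plain fine-tuning on the next task overwrites what was learned before.

```python
import torch
import torch.nn as nn

# Toy backbone and optimizer for illustration only.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()


def train_sequentially(task_stream, epochs=1):
    """task_stream yields one DataLoader per task; past loaders are never revisited."""
    for task_id, loader in enumerate(task_stream):
        for _ in range(epochs):
            for x, y in loader:
                opt.zero_grad()
                loss_fn(model(x), y).backward()
                opt.step()
        # Once a task ends, its data is gone; naive fine-tuning on the next task
        # overwrites the weights that encoded earlier tasks (catastrophic forgetting).
```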
To mitigate this cost, researchers have developed Parameter-Efficient Fine-Tuning (PEFT) methods, which allow PLMs to be fine-tuned with less data and fewer computing resources. Recent studies have shown that combining PEFT with CL yields promising results, offering scalability and strong performance. In this work, we present a novel approach to continual learning in PLMs that combines two popular PEFT techniques: Adapters and Prefixes. We propose two methods:
- AdaPrefix: Considers a combination of Adapters and Prefixes for CL.
- AdaPrefix++: A more parameter-efficient approach using a Hypernetwork to generate Prefixes and task-specific adapters for CL.
The AdaPrefix architecture incorporates two PEFT components:
- Prefixes: Added to the multi-head attention layer of a transformer block in the PLM.
- Adapter block: Inserted after the feed-forward layer of a transformer block in the PLM.
In a CL setting, AdaPrefix maintains task-specific adapters and prefixes. Given the task ID, we attach the corresponding adapters and prefixes to the PLM during inference to perform the downstream task.
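To make the layout concrete, below is a minimal PyTorch-style sketch of how a single transformer block could be augmented in this spirit. The class names, dimensions, and the way prefixes are prepended to the attention keys and values are illustrative assumptions, not the exact implementation from the paper.

```python
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Bottleneck adapter inserted after the feed-forward layer."""
    def __init__(self, d_model: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        self.act = nn.ReLU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))  # residual around the bottleneck


class AdaPrefixBlock(nn.Module):
    """A frozen transformer block augmented with task-specific prefixes and adapters."""
    def __init__(self, d_model: int = 768, n_heads: int = 12, prefix_len: int = 10):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)
        for p in self.parameters():          # backbone parameters stay frozen
            p.requires_grad_(False)
        self.prefix_len, self.d_model = prefix_len, d_model
        self.prefixes = nn.ParameterDict()   # task_id -> learned prefix tokens
        self.adapters = nn.ModuleDict()      # task_id -> adapter block

    def add_task(self, task_id: str):
        """Allocate new trainable PEFT parameters when a new task arrives."""
        self.prefixes[task_id] = nn.Parameter(
            torch.randn(self.prefix_len, self.d_model) * 0.02)
        self.adapters[task_id] = Adapter(self.d_model)

    def forward(self, x, task_id: str):
        # Prepend the task-specific prefix to the keys/values of multi-head attention.
        prefix = self.prefixes[task_id].unsqueeze(0).expand(x.size(0), -1, -1)
        kv = torch.cat([prefix, x], dim=1)
        h = self.norm1(x + self.attn(x, kv, kv, need_weights=False)[0])
        # Task-specific adapter applied after the feed-forward layer.
        return self.norm2(h + self.adapters[task_id](self.ffn(h)))


# Hypothetical usage: attach PEFT modules for "task_0" and run a forward pass.
block = AdaPrefixBlock()
block.add_task("task_0")
out = block(torch.randn(2, 16, 768), task_id="task_0")
```

Only the prefix and adapter parameters are trained for each task; the shared backbone remains frozen, which is what keeps the approach parameter-efficient.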
While AdaPrefix effectively addresses catastrophic forgetting, it doesn't facilitate knowledge transfer between tasks. To overcome this limitation, we developed AdaPrefix++, which introduces a Hypernetwork to generate prefix parameters. Additionally, we incorporate task-specific adapters for improved performance.
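As a rough illustration of the hypernetwork idea, the sketch below maps a learned task embedding to prefix parameters. The two-layer MLP, the embedding size, and the output shapes are assumptions for exposition, not the exact AdaPrefix++ architecture.

```python
import torch
import torch.nn as nn


class PrefixHypernetwork(nn.Module):
    """Generates prefix tokens for a given task from a learned task embedding."""
    def __init__(self, n_tasks: int, emb_dim: int = 64,
                 prefix_len: int = 10, d_model: int = 768, hidden: int = 256):
        super().__init__()
        self.task_emb = nn.Embedding(n_tasks, emb_dim)   # one embedding per task
        self.mlp = nn.Sequential(
            nn.Linear(emb_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, prefix_len * d_model))
        self.prefix_len, self.d_model = prefix_len, d_model

    def forward(self, task_id: torch.Tensor) -> torch.Tensor:
        # Map the task embedding to a flat vector, then reshape into prefix tokens.
        out = self.mlp(self.task_emb(task_id))
        return out.view(-1, self.prefix_len, self.d_model)


# Hypothetical usage: generate the prefix for task 2 and feed it to the frozen PLM.
hyper = PrefixHypernetwork(n_tasks=5)
prefix = hyper(torch.tensor([2]))   # shape: (1, prefix_len, d_model)
```

Because the hypernetwork weights are shared across all tasks, knowledge picked up while learning one task can influence the prefixes generated for the others.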
Krystelle M: What are the key features of the AdaPrefix++ model and the benefits of this approach?
Professor Srijith P.K.: The key features of AdaPrefix++ are:
- Shared Hypernetwork: Enables knowledge transfer across different tasks.
- Regularization technique: Prevents catastrophic forgetting in the Hypernetwork using knowledge distillation (see the sketch after this list).
- Improved performance: The Hypernetwork-generated prefixes provide knowledge transfer, leading to enhanced model performance.
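One way to picture the distillation-based regularization is the hedged sketch below (not the paper's exact loss): a frozen snapshot of the hypernetwork records the prefixes it produced for earlier tasks, and the current hypernetwork is penalized for drifting away from them while it learns a new task. The L2 penalty and the lambda weighting are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def hypernetwork_distillation_loss(hyper, old_hyper, seen_task_ids, lam=0.1):
    """Penalize drift in the prefixes generated for previously seen tasks."""
    loss = torch.tensor(0.0)
    for tid in seen_task_ids:
        tid = torch.tensor([tid])
        with torch.no_grad():
            target = old_hyper(tid)          # prefixes as they were before the new task
        loss = loss + F.mse_loss(hyper(tid), target)
    return lam * loss


# Before training task t: old_hyper = copy.deepcopy(hyper).eval()   (frozen snapshot)
# Total loss on task t:   task_loss + hypernetwork_distillation_loss(hyper, old_hyper, range(t))
```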
The benefits of our approach are:
- Addressing Catastrophic Forgetting: Both AdaPrefix and AdaPrefix++ effectively combat the issue of forgetting previously learned information.
- Knowledge Transfer: AdaPrefix++ facilitates knowledge transfer between tasks, potentially improving performance on new tasks.
- Scalability: Our methods are designed to work with large pre-trained models, making them suitable for various applications.
We have also achieved strong results when comparing our methods with existing approaches across backbone architectures on downstream classification tasks.

The above plot shows that our approach performs well compared with all the previous methods.

This plot shows how well our approach performs on each task, relative to the other methods, after training continually on all the tasks, and demonstrates the stability of our approach compared with the existing approaches.
As we refine these approaches, we're excited about the potential implications for various ML applications, from natural language processing to computer vision. The ability to learn continuously without forgetting is a crucial step toward more human-like AI systems. Our work with AdaPrefix and AdaPrefix++ represents a significant stride in Continual Learning. By combining efficient fine-tuning methods with innovative architectures, we're pushing the boundaries of what's possible in AI.

Professor Srijith P.K. of IIT Hyderabad