EMANUELE FRASCAROLI
PhD Student, Department of Engineering "Enzo Ferrari"
Publications
2024
- Semantic Residual Prompts for Continual Learning
[Conference Proceedings Paper]
Menabue, Martin; Frascaroli, Emanuele; Boschini, Matteo; Sangineto, Enver; Bonicelli, Lorenzo; Porrello, Angelo; Calderara, Simone
Abstract
Prompt-tuning methods for Continual Learning (CL) freeze a large pre-trained
model and train a few parameter vectors termed prompts. Most of these methods
organize these vectors in a pool of key-value pairs and use the input image as
a query to retrieve the prompts (values). However, as keys are learned while
tasks progress, the prompting selection strategy is itself subject to
catastrophic forgetting, an issue often overlooked by existing approaches. For
instance, prompts introduced to accommodate new tasks might end up interfering
with previously learned prompts. To make the selection strategy more stable, we
leverage a foundation model (CLIP) to select our prompts within a two-level
adaptation mechanism. Specifically, the first level leverages a standard
textual prompt pool for the CLIP textual encoder, leading to stable class
prototypes. The second level, instead, uses these prototypes along with the
query image as keys to index a second pool. The retrieved prompts serve to
adapt a pre-trained ViT, granting plasticity. In doing so, we also propose a
novel residual mechanism to transfer CLIP semantics to the ViT layers. Through
extensive analysis on established CL benchmarks, we show that our method
significantly outperforms both state-of-the-art CL approaches and the zero-shot
CLIP test. Notably, our findings hold true even for datasets with a substantial
domain gap w.r.t. the pre-training knowledge of the backbone model, as
showcased by experiments on satellite imagery and medical datasets. The
codebase is available at https://github.com/aimagelab/mammoth.
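The key-value prompt retrieval that the abstract describes can be sketched as follows. This is a minimal NumPy illustration of querying a prompt pool by key similarity, not the authors' implementation; all names, shapes, and the cosine-similarity scoring are illustrative assumptions.

```python
import numpy as np


def select_prompts(query, keys, prompts, top_k=2):
    """Retrieve the top_k prompts whose learned keys best match the query.

    query:   (d,) embedding of the input image (used as the query)
    keys:    (P, d) learned keys, one per prompt in the pool
    prompts: (P, L, d) prompt vectors (the values of the pool)
    """
    # Cosine similarity between the query and every key.
    q = query / np.linalg.norm(query)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    sims = k @ q
    # Indices of the most similar keys, highest similarity first.
    idx = np.argsort(sims)[::-1][:top_k]
    return prompts[idx], idx


# Toy example: pool of 4 prompts, embedding dim 8, prompt length 5.
rng = np.random.default_rng(0)
query = rng.normal(size=8)
keys = rng.normal(size=(4, 8))
prompts = rng.normal(size=(4, 5, 8))

selected, idx = select_prompts(query, keys, prompts, top_k=2)
print(selected.shape)  # (2, 5, 8)
```

Because the keys here are trained alongside the tasks, new tasks can shift them and corrupt earlier retrievals; this instability in the selection step is exactly what the paper's two-level CLIP-based mechanism is designed to mitigate.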