DecoupleCSS

Decoupling Continual Semantic Segmentation

Yifu Guo1,2,* Yuquan Lu1,2,* Wentao Zhang1 Zishan Xu2 Dexia Chen1 Siyu Zhang3 Yizhe Zhang4 Ruixuan Wang1,†

1Sun Yat-sen University, 2South China Normal University,
3Southwest University, 4Nanjing University of Science and Technology

*Equal contribution. †Corresponding author.

Abstract

Continual Semantic Segmentation (CSS) requires learning new classes without forgetting previously acquired knowledge, addressing the fundamental challenge of catastrophic forgetting in dense prediction tasks. However, existing CSS methods typically employ single-stage encoder-decoder architectures in which segmentation masks and class labels are tightly coupled, leading to interference between learning old and new classes and a suboptimal balance between retention and plasticity.

We introduce DecoupleCSS, a novel two-stage framework for CSS. By decoupling class-aware detection from class-agnostic segmentation, DecoupleCSS enables more effective continual learning, preserving past knowledge while learning new classes. The first stage leverages pre-trained text and image encoders, adapted using LoRA, to encode class-specific information and generate location-aware prompts. In the second stage, the Segment Anything Model (SAM) is employed to produce precise segmentation masks, ensuring that segmentation knowledge is shared across both new and previous classes.

Method Overview


Overview of the proposed method. (a) The overall architecture comprises language-driven task-aware class detection, segmentation prompt generation, and class-agnostic segmentation modules. (b) Representative results on challenging settings (2-2 and 4-2) for CSS on Pascal VOC 2012.

1 Class-Aware Detection

  • Text-Image Fusion: Leverages pre-trained text and image encoders to extract class-specific semantic information (sketched after this list)
  • LoRA Adaptation: Adapts image encoders for new tasks using Low-Rank Adaptation while preserving old knowledge
  • Prompt Generation: Generates class-aware and location-specific prompts for segmentation
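The text-image fusion step can be pictured as cross-attention in which each class text embedding queries the image tokens, yielding class-aware, location-aware features. The PyTorch sketch below is an illustrative assumption: the module names, dimensions, and single-attention-layer design are not taken from the paper.

import torch
import torch.nn as nn

class TextImageFusion(nn.Module):
    """Illustrative cross-attention: class text embeddings query image tokens."""
    def __init__(self, dim=256, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_emb, image_feats):
        # text_emb:    (B, num_classes, dim)  one query per class prompt
        # image_feats: (B, num_patches, dim)  flattened image-encoder tokens
        fused, _ = self.attn(query=text_emb, key=image_feats, value=image_feats)
        return self.norm(fused + text_emb)    # class-aware, location-aware features

fusion = TextImageFusion()
text_emb = torch.randn(2, 5, 256)             # 5 class prompts
image_feats = torch.randn(2, 1024, 256)       # 32x32 patch grid
prompt_feats = fusion(text_emb, image_feats)  # (2, 5, 256)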

2 Class-Agnostic Segmentation

  • SAM Integration: Employs the Segment Anything Model to produce precise segmentation masks (a usage sketch follows this list)
  • Knowledge Sharing: Segmentation knowledge is shared across both new and previous classes
  • Consistent Performance: Maintains high segmentation quality for all classes
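For the class-agnostic stage, the official segment-anything package exposes a predictor that turns box or point prompts into masks. The snippet below is a generic usage sketch with placeholder inputs, not the authors' released pipeline.

import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Checkpoint path and model size are placeholders.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

image = np.zeros((512, 512, 3), dtype=np.uint8)   # stand-in RGB image
predictor.set_image(image)                        # image is encoded once, reused per prompt

box = np.array([100, 100, 300, 300])              # one detected region (x1, y1, x2, y2)
masks, scores, _ = predictor.predict(box=box, multimask_output=False)
print(masks.shape, scores.shape)                  # (1, 512, 512), (1,)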

Architecture Details

Model Architecture

Detailed architecture of DecoupleCSS. The framework consists of class-aware detection (red components) and class-agnostic segmentation (gray components).

Pre-trained Text Encoder

Processes class-specific textual information to provide semantic guidance
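The page states only that a pre-trained text encoder is used; as one concrete, assumed choice, OpenAI's CLIP text encoder can embed the class names as follows.

import torch
import clip  # pip install git+https://github.com/openai/CLIP.git

model, _ = clip.load("ViT-B/32", device="cpu")
class_names = ["person", "car", "dog"]                      # classes seen so far
tokens = clip.tokenize([f"a photo of a {c}" for c in class_names])
with torch.no_grad():
    text_emb = model.encode_text(tokens)                    # (3, 512)
text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)   # unit-normalize for matching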

LoRA-adapted Image Encoder

Efficiently adapts to new tasks while preserving old knowledge through Low-Rank Adaptation
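A minimal LoRA wrapper, as a sketch of the idea rather than the released implementation: the pre-trained weight stays frozen, and only the low-rank factors A and B are trained for each new task, so the original encoder behavior is preserved.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update (illustrative)."""
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)       # old knowledge stays untouched
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.02)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: no drift at start
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scale

layer = LoRALinear(nn.Linear(768, 768), rank=4)
out = layer(torch.randn(2, 16, 768))                 # same output shape as the base layer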

Prompt Generator

Creates location-aware prompts for precise segmentation based on class-specific features
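One plausible, assumed parameterization of the prompt generator is a small head that turns each fused class-aware token into a normalized box plus a keep/drop score, which can then be handed to SAM as a prompt. The head design below is illustrative, not the paper's.

import torch
import torch.nn as nn

class PromptGenerator(nn.Module):
    """Maps fused class-aware tokens to box prompts and objectness scores (illustrative)."""
    def __init__(self, dim=256):
        super().__init__()
        self.box_head = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 4))
        self.obj_head = nn.Linear(dim, 1)

    def forward(self, fused):                      # fused: (B, num_queries, dim)
        boxes = self.box_head(fused).sigmoid()     # (B, Q, 4) normalized (cx, cy, w, h)
        scores = self.obj_head(fused).sigmoid()    # (B, Q, 1) keep/drop each prompt
        return boxes, scores

gen = PromptGenerator()
boxes, scores = gen(torch.randn(2, 5, 256))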

SAM Segmenter

Generates high-quality masks using class-agnostic foundation model capabilities
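Putting the components together for one incremental step, the decoupling suggests a setup in which SAM and the text encoder stay frozen while only the new task's LoRA factors and the prompt generator receive gradients. The parameter filter below is an assumption-level sketch using a stand-in model; attribute names are illustrative.

import torch
import torch.nn as nn

class DummyCSSModel(nn.Module):
    """Stand-in with the frozen/trainable split described above (names are illustrative)."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(256, 256)             # frozen pre-trained encoder / SAM stand-in
        self.lora_A = nn.Parameter(torch.randn(4, 256) * 0.02)
        self.lora_B = nn.Parameter(torch.zeros(256, 4))
        self.prompt_generator = nn.Linear(256, 4)

def configure_incremental_step(model):
    trainable = []
    for name, p in model.named_parameters():
        keep = ("lora" in name) or ("prompt_generator" in name)
        p.requires_grad_(keep)                          # everything else stays frozen
        if keep:
            trainable.append(p)
    return trainable

model = DummyCSSModel()
optimizer = torch.optim.AdamW(configure_incremental_step(model), lr=1e-4)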

Experimental Results

Main Results

Main experimental results. Our method achieves state-of-the-art performance across multiple CSS benchmarks.

Pascal VOC 2012

Significant improvements across the 19-1, 15-5, 10-1, 2-2, and 4-2 settings (base classes, followed by classes added per incremental step), with superior old-class retention and robust performance in challenging scenarios

ADE20K

Outstanding performance on complex multi-class scenarios in 100-50 and 150-50 settings

Challenging Scenarios

Robust performance on difficult 2-2 and 4-2 incremental learning settings

Key Contributions

Novel Framework

First to explicitly decouple class-aware detection from class-agnostic segmentation in CSS

Foundation Model Integration

Pioneering use of SAM for continual semantic segmentation tasks

Effective Knowledge Preservation

Achieves superior balance between retention and plasticity

State-of-the-Art Performance

Demonstrates significant improvements across diverse CSS benchmarks

Applications

Autonomous Driving

Incremental learning of new road elements and traffic scenarios

Medical Imaging

Progressive learning of new anatomical structures and pathologies

Remote Sensing

Continual adaptation to new geographical features and land use types

Citation

@misc{guo2025decouplingcontinualsemanticsegmentation,
      title={Decoupling Continual Semantic Segmentation}, 
      author={Yifu Guo and Yuquan Lu and Wentao Zhang and Zishan Xu and Dexia Chen and Siyu Zhang and Yizhe Zhang and Ruixuan Wang},
      year={2025},
      eprint={2508.05065},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2508.05065}, 
}

Contact