Paper

Knowledge-guided Causal Intervention for Weakly-supervised Object Localization

Previous weakly-supervised object localization (WSOL) methods aim to expand activation map discriminative areas to cover the whole objects, yet neglect two inherent challenges when relying solely on image-level labels. First, the ``entangled context'' issue arises from object-context co-occurrence (\eg, fish and water), making the model inspection hard to distinguish object boundaries clearly. Second, the ``C-L dilemma'' issue results from the information decay caused by the pooling layers, which struggle to retain both the semantic information for precise classification and those essential details for accurate localization, leading to a trade-off in performance. In this paper, we propose a knowledge-guided causal intervention method, dubbed KG-CI-CAM, to address these two under-explored issues in one go. More specifically, we tackle the co-occurrence context confounder problem via causal intervention, which explores the causalities among image features, contexts, and categories to eliminate the biased object-context entanglement in the class activation maps. Based on the disentangled object feature, we introduce a multi-source knowledge guidance framework to strike a balance between absorbing classification knowledge and localization knowledge during model training. Extensive experiments conducted on several benchmark datasets demonstrate the effectiveness of KG-CI-CAM in learning distinct object boundaries amidst confounding contexts and mitigating the dilemma between classification and localization performance.

Results in Papers With Code
(↓ scroll down to see all results)