<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="https://scholar.dgist.ac.kr/handle/20.500.11750/59204">
    <title>Repository Community</title>
    <link>https://scholar.dgist.ac.kr/handle/20.500.11750/59204</link>
    <description />
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="https://scholar.dgist.ac.kr/handle/20.500.11750/59905" />
        <rdf:li rdf:resource="https://scholar.dgist.ac.kr/handle/20.500.11750/59400" />
        <rdf:li rdf:resource="https://scholar.dgist.ac.kr/handle/20.500.11750/59295" />
        <rdf:li rdf:resource="https://scholar.dgist.ac.kr/handle/20.500.11750/59145" />
      </rdf:Seq>
    </items>
    <dc:date>2026-04-04T09:52:12Z</dc:date>
  </channel>
  <item rdf:about="https://scholar.dgist.ac.kr/handle/20.500.11750/59905">
    <title>Logical Anomaly Detection with Text-based Logic via Component-Aware Contrastive Language-Image Training</title>
    <link>https://scholar.dgist.ac.kr/handle/20.500.11750/59905</link>
    <description>Title: Logical Anomaly Detection with Text-based Logic via Component-Aware Contrastive Language-Image Training
Author(s): Lee, Seung-eon; Kim, Soopil; An, Sion; Lee, Sang-Chul; Park, Sang Hyun
Abstract: AI-based automatic visual inspection systems have been extensively researched to streamline the labor-intensive anomaly detection processes for various industrial products. Despite significant advancements, detecting logical anomalies remains challenging due to the multitude of rules governing how multiple components are assembled into a normal product. Existing methods have relied solely on image information for anomaly detection, resulting in limited accuracy because they fail to account for these diverse, complex rules. In contrast, humans detect anomalies by comparing the image with pre-defined logic that can be clearly expressed in natural language. Inspired by this human decision process, we propose a logical anomaly detection model that leverages text-based logic, much like human reasoning. Given user-defined rules (i.e., positive rules) and logically distinct negative rules, we train the model with component-aware contrastive learning, which increases the similarity between images and positive rules while decreasing their similarity with negative rules. However, accurately comparing textual and visual features is challenging because a single image contains multiple components, each governed by different rules. To address this, we developed a zero-shot related-region detection technique that guides the model&apos;s focus toward the components relevant to each rule. We evaluated the proposed model on three public datasets and achieved state-of-the-art results on a few-shot logical anomaly detection task. Our findings highlight the potential of integrating vision-language models and utilizing text-based logic to enhance logical anomaly detection in complex industrial settings.</description>
    <dc:date>2025-08-06T15:00:00Z</dc:date>
  </item>
  <item rdf:about="https://scholar.dgist.ac.kr/handle/20.500.11750/59400">
    <title>MOSInversion: Knowledge distillation-based incremental learning in organ segmentation using DeepInversion</title>
    <link>https://scholar.dgist.ac.kr/handle/20.500.11750/59400</link>
    <description>Title: MOSInversion: Knowledge distillation-based incremental learning in organ segmentation using DeepInversion
Author(s): Kim, Jihyeon; Lee, Gyeongmin; Shin, Seung Yeon; Kim, Soopil; Park, Sang Hyun
Abstract: Despite recent advancements in multi-organ segmentation (MOS) of medical images, existing models are limited in their ability to extend to unseen classes. Incremental learning has been proposed to enable models to learn new classes progressively, possibly using multiple datasets from different institutions. In this setting, models easily suffer performance degradation on previously learned classes, i.e., catastrophic forgetting. Although many methods have been proposed to mitigate this issue, applying them to medical imaging tasks such as multi-organ segmentation is difficult, either because of the large memory requirements of 3D medical data such as CT scans or because a generator must be additionally trained for image synthesis. In this paper, we propose an incremental learning framework that leverages diverse synthetic images to retain the knowledge learned from previously seen data. We design MOSInversion to generate these synthetic images using a pre-trained model from the previous step. MOSInversion generates diverse images conditioned on segmentation masks, allowing us to manipulate the shape, location, and size of organs. We evaluate our proposed method on three abdominal CT datasets (FLARE21, MSD, and KiTS19) and achieve state-of-the-art accuracy.</description>
    <dc:date>2025-11-30T15:00:00Z</dc:date>
  </item>
  <item rdf:about="https://scholar.dgist.ac.kr/handle/20.500.11750/59295">
    <title>Preprocessing Element Technologies for Industrial Data Anomaly Detection</title>
    <link>https://scholar.dgist.ac.kr/handle/20.500.11750/59295</link>
    <description>Title: Preprocessing Element Technologies for Industrial Data Anomaly Detection
Author(s): Kim, Soopil; Lee, Gyeongeun</description>
  </item>
  <item rdf:about="https://scholar.dgist.ac.kr/handle/20.500.11750/59145">
    <title>Revisiting Masked Image Modeling with Standardized Color Space for Domain Generalized Fundus Photography Classification</title>
    <link>https://scholar.dgist.ac.kr/handle/20.500.11750/59145</link>
    <description>Title: Revisiting Masked Image Modeling with Standardized Color Space for Domain Generalized Fundus Photography Classification
Author(s): Jang, Eojin; Kang, Myeongkyun; Kim, Soopil; Sagong, Min; Park, Sang Hyun
Abstract: Diabetic retinopathy (DR) is a serious complication of diabetes, requiring rapid and accurate assessment through computer-aided grading of fundus photography. To enhance the practical applicability of DR grading, domain generalization (DG) and foundation models have been proposed to improve accuracy on data from unseen domains. Despite recent advancements, foundation models trained in a self-supervised manner still exhibit limited DG capability, as self-supervised learning does not account for domain variations. In this paper, we revisit masked image modeling (MIM) in foundation models to advance DR grading for domain generalization. We introduce a MIM-based approach that transforms images into a standardized color representation shared across domains. By transforming images from various domains into this color space, the model can learn consistent representations even for unseen images, promoting domain-invariant feature learning. Additionally, we employ joint representation learning of both the original and transformed images, using cross-attention to integrate their respective strengths for DR classification. We show a performance improvement of up to nearly 4% across three datasets, positioning our method as a promising solution for domain-generalized medical image classification.</description>
    <dc:date>2025-09-24T15:00:00Z</dc:date>
  </item>
</rdf:RDF>

