PowerSAM: Edge-Efficient Segment Anything for Power Systems Through Visual Model Distillation

1Fudan University, 2State Grid Shanghai Municipal Electric Power Company, 3Bowdoin College

PowerSAM is proposed as a real-time semantic segmentation framework for edge devices, addressing the challenges of power system equipment inspection, including labor intensity, costs, and human error. By leveraging knowledge distillation from large models to compact backbones and integrating a bounding box prompt generator with a segmentation model, PowerSAM significantly reduces computational complexity while maintaining high segmentation accuracy.

Overview image.

Abstract

The inspection of power system equipment is critical for ensuring grid reliability and safety, but it is labor-intensive, costly, and prone to human error. Automating it remains challenging because of complex environmental conditions and the limited computational capacity of edge devices. In this work, we propose a real-time semantic segmentation framework designed for edge computing, leveraging knowledge distillation from large visual foundation models to compact backbones. Our method integrates a bounding box prompt generator with a segmentation model into a unified architecture, significantly reducing computational complexity while maintaining high segmentation accuracy.
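To make the distillation idea concrete, the following is a minimal sketch of a feature-matching loss in which a compact student backbone is trained to reproduce the embeddings of a large teacher. The abstract does not state which loss PowerSAM uses, so the mean squared error here, the function name `feature_distillation_loss`, and the flat-list representation of embeddings are all illustrative assumptions.

```python
def feature_distillation_loss(teacher_feats, student_feats):
    """Mean squared error between teacher and student embeddings.

    Assumption: embeddings are flattened to equal-length lists of floats.
    MSE is one common distillation objective, not necessarily the one
    used by PowerSAM.
    """
    if len(teacher_feats) != len(student_feats):
        raise ValueError("teacher and student features must have the same length")
    if not teacher_feats:
        raise ValueError("features must not be empty")
    squared_errors = [(t - s) ** 2 for t, s in zip(teacher_feats, student_feats)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical embeddings for a single image.
teacher = [0.0, 0.0]
student = [1.0, 3.0]
loss = feature_distillation_loss(teacher, student)  # (1 + 9) / 2 = 5.0
```

In a real training loop the teacher would be frozen and only the student's parameters would be updated to reduce this loss.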

A two-stage distillation strategy further optimizes the model for deployment on edge devices. Extensive evaluations on the Power System dataset demonstrate that our approach outperforms state-of-the-art methods, combining high efficiency (20.04 FPS on the NVIDIA Jetson Orin NX) with competitive accuracy (19.456 IoU on power system component segmentation), and offers a practical solution for real-time equipment monitoring and inspection in power systems.
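For readers unfamiliar with the accuracy metric, intersection over union (IoU) for a single binary mask can be sketched as below. The function name `mask_iou`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; the evaluation protocol behind the reported figure is not described in this section.

```python
def mask_iou(pred, target):
    """Intersection over union of two binary masks of equal shape.

    Masks are nested lists of 0/1 values (rows of pixels).
    Assumption: two empty masks count as a perfect match (IoU = 1.0).
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p, t = bool(p), bool(t)
            intersection += int(p and t)
            union += int(p or t)
    return intersection / union if union else 1.0


# Hypothetical 2x2 masks: one overlapping pixel, three pixels in the union.
pred_mask = [[1, 1], [0, 0]]
true_mask = [[1, 0], [1, 0]]
iou = mask_iou(pred_mask, true_mask)  # 1 / 3
```

This ratio lies between 0 and 1 for a single mask pair; benchmark figures are typically averaged over classes or images and may be reported on a different scale.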

Approach Overview

Approach overview image.