Inspecting power system equipment is critical for ensuring grid reliability and safety, but manual inspection is labor-intensive, costly, and prone to human error. Automating it remains challenging because of complex environmental conditions and the limited computational capacity of edge devices. In this work, we propose a real-time semantic segmentation framework designed for edge computing that distills knowledge from large visual foundation models into compact backbones. Our method integrates a bounding-box prompt generator and a segmentation model into a unified architecture, significantly reducing computational complexity while maintaining high segmentation accuracy.
A two-stage distillation strategy further optimizes the model for deployment on edge devices. Extensive evaluations on the Power System dataset demonstrate that our approach outperforms state-of-the-art methods, combining high efficiency (20.04 FPS on the NVIDIA Jetson Orin NX) with competitive accuracy (19.456 IoU on power system component segmentation), and offers a practical solution for real-time equipment monitoring and inspection in power systems.
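
For readers unfamiliar with the accuracy metric, intersection over union (IoU) compares a predicted mask with its ground-truth mask. The following is a minimal sketch of the standard definition, assuming binary masks stored as nested lists of 0/1; the name `mask_iou` is illustrative and not taken from this work, and how scores are aggregated over classes and images is not specified here.

```python
def mask_iou(pred, target):
    """Intersection over union of two equally sized binary masks.

    Both masks are nested lists of 0/1 values (an assumption made for
    this sketch; real pipelines usually operate on tensors).
    """
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0  # pixel set in both masks
            union += 1 if (p or t) else 0          # pixel set in either mask

    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Toy 2x3 example: 2 overlapping pixels, 4 pixels in the union.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(mask_iou(pred, target))  # 0.5
```

The function returns a fraction in [0, 1]; multiplying by 100 gives a percentage.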