Abstract: Vision-only vehicle-infrastructure cooperative perception holds significant promise for intelligent transportation systems. However, its accuracy degrades markedly under adverse weather conditions such as rain, snow, and fog, primarily because adverse-weather data are scarce and late-fusion strategies adapt poorly to changing conditions. To address these challenges, this paper proposes a cooperative detection framework that integrates style-transfer-based data augmentation with an enhanced late-fusion strategy. Specifically, stylized images are first generated via style transfer to augment the scarce adverse-weather training data. The effectiveness of style transfer is further enhanced by reinjecting shallow structural features and modeling multi-channel style correlations. During inference, a context-aware adaptive fusion mechanism dynamically adjusts the weights of detection results based on temporal stability, spatial correlation, and scene complexity. Experimental results show that, under adverse weather conditions, the proposed method achieves a relative improvement of more than 40% in detection accuracy over the baseline model, while maintaining performance comparable to existing methods under normal conditions. The proposed framework improves the stability and robustness of vehicle-infrastructure cooperative perception in complex scenarios and provides valuable insights for cooperative perception in intelligent transportation systems. Source code is available at: https://github.com/kkk261/V2X-ELF.
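
To make the idea of context-aware adaptive late fusion concrete, the following is a minimal illustrative sketch and not the implementation from the released code. It assumes that each source (vehicle or roadside camera) provides three cues normalized to [0, 1], that the cues are combined linearly with scene complexity acting as a penalty, and that the fusion weights are obtained with a softmax. The names `SourceContext`, `context_score`, `fusion_weights`, and `fuse_confidence`, as well as the coefficients and example values, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class SourceContext:
    """Hypothetical per-source cues, each assumed to be normalized to [0, 1]."""
    temporal_stability: float   # consistency of detections across recent frames
    spatial_correlation: float  # agreement with the other source's detections
    scene_complexity: float     # higher means a harder scene (clutter, occlusion)


def context_score(ctx: SourceContext, coeffs=(1.0, 1.0, 1.0)) -> float:
    """Combine the cues linearly; complexity lowers the score (an assumption)."""
    a, b, c = coeffs
    return (a * ctx.temporal_stability
            + b * ctx.spatial_correlation
            - c * ctx.scene_complexity)


def fusion_weights(contexts: list[SourceContext]) -> list[float]:
    """Softmax over context scores, so weights are positive and sum to 1."""
    scores = [context_score(ctx) for ctx in contexts]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_confidence(confidences: list[float], weights: list[float]) -> float:
    """Weighted average of the confidence scores of matched detections."""
    return sum(w * c for w, c in zip(weights, confidences))


# Illustrative values only: the vehicle camera is assumed to be degraded by fog.
vehicle = SourceContext(temporal_stability=0.4, spatial_correlation=0.5, scene_complexity=0.8)
roadside = SourceContext(temporal_stability=0.9, spatial_correlation=0.5, scene_complexity=0.3)

weights = fusion_weights([vehicle, roadside])
fused = fuse_confidence([0.35, 0.80], weights)
print(weights, fused)
```

In this sketch the source with the more stable detections and the simpler scene receives the larger weight, so the fused confidence moves toward that source's score. A fixed-weight late fusion would correspond to replacing `fusion_weights` with constants.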