An Improved Data-Efficiency Algorithm Based on Combining Isolation Forest and Mean Shift for Anomaly Data Filtering in Wind Power Curve

by: Wei Wang, Shiyou Yang, Yankun Yang

Format: Article
Published: MDPI AG 2022-07-01

Description

Wind turbines operating in harsh environments are prone to generating abnormal data. This article proposes an efficient data-cleaning algorithm for wind power curves that combines an Isolation Forest (I-Forest) with the mean-shift algorithm. After data preprocessing, the I-Forest detects local anomalies within each power and wind speed interval. The contamination parameter of the I-Forest can be adjusted flexibly according to the distribution of the wind turbine data. The remaining stacked data are then eliminated by the mean-shift algorithm. To verify the filtering performance of the combined method, five algorithms are applied to the wind power curves of a prototype wind farm for comparison: the quartile and k-means (QK) method, the quartile and density-based spatial clustering (QD) method, the mathematical morphology operation (MMO), the fast data cleaning algorithm (FA), and the proposed method. The numerical results confirm the reliability of the universal framework provided by the proposed algorithm.
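
As a rough illustration of the second stage only, the sketch below applies a one-dimensional, flat-kernel mean shift to the power readings of a single wind-speed interval and keeps the points that converge to the most populated mode. It is not the authors' implementation: the function names, the choice to cluster on power alone, the bandwidth, and the mode-merging rule are all assumptions made for this example, and the Isolation Forest stage is not shown.

```python
def mean_shift_1d(values, bandwidth, tol=1e-6, max_iter=200):
    """Move each value to the mean of its neighbours (flat kernel) until it settles.

    Hypothetical helper for illustration; returns one converged mode per input value.
    """
    modes = []
    for v in values:
        x = v
        for _ in range(max_iter):
            # Flat kernel: every point within `bandwidth` of x has equal weight.
            neighbours = [u for u in values if abs(u - x) <= bandwidth]
            new_x = sum(neighbours) / len(neighbours)
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(x)
    return modes


def keep_dominant_cluster(values, bandwidth):
    """Return a boolean mask that is True for points in the most populated cluster.

    Assumption: within one wind-speed interval, normal operation forms the
    largest cluster and stacked (e.g. curtailed) readings form smaller ones.
    """
    modes = mean_shift_1d(values, bandwidth)
    centers = []  # each entry is [mode position, number of points]
    labels = []
    for m in modes:
        for i, c in enumerate(centers):
            # Merge modes that converged to (nearly) the same position.
            if abs(m - c[0]) <= bandwidth / 2:
                c[1] += 1
                labels.append(i)
                break
        else:
            centers.append([m, 1])
            labels.append(len(centers) - 1)
    dominant = max(range(len(centers)), key=lambda i: centers[i][1])
    return [label == dominant for label in labels]


# Synthetic power readings (kW) for one wind-speed interval: 41 normal
# points near 1500 kW followed by 10 stacked points near 800 kW.
power = [1450 + 2.5 * i for i in range(41)] + [795 + i for i in range(10)]
mask = keep_dominant_cluster(power, bandwidth=100.0)
cleaned = [p for p, keep in zip(power, mask) if keep]
```

In practice the bandwidth would need tuning per interval, and a two-dimensional mean shift over (wind speed, power) may be closer to what the article describes; the abstract does not specify either detail.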