Indexed Open Access Databases
An online log template extraction method based on hierarchical clustering
by: Ruipeng Yang, Dan Qu, Yekui Qian, Yusheng Dai, Shaowei Zhu
Format: | Article |
---|---|
Published: | SpringerOpen 2019-05-01 |
Description
Abstract Raw log messages record rich dynamic runtime information about systems, networks, and applications, which makes them a good data source for anomaly detection. Log template extraction is an important prerequisite for log sequence anomaly detection. Most existing log template extraction methods are offline, and the few online methods achieve insufficient F1-scores on multi-source log data. To address these shortcomings, an online log template extraction method called LogOHC is proposed. First, the raw log messages are preprocessed, and a distributed word representation (word2vec) is used to vectorize the log messages online. Then, an online hierarchical clustering algorithm is applied, and finally, log templates are generated. Experimental analysis shows that LogOHC achieves a higher F1-score than existing log template extraction methods, is suitable for multi-source log data sets, and has a shorter single-step execution time, which can meet the requirements of online real-time processing.
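The pipeline named in the abstract (preprocess, vectorize, cluster online, generate templates) can be illustrated with a minimal sketch. It is not the LogOHC implementation: the record gives no algorithmic detail, so a hash-based token vector stands in for word2vec, and a flat, threshold-based online clustering stands in for the paper's online hierarchical clustering. The names `preprocess`, `OnlineTemplateExtractor`, the `<*>` wildcard, the equal-length rule, and the 0.7 similarity threshold are all assumptions made for illustration.

```python
import hashlib
import math
import re

DIM = 32  # vector size; arbitrary choice for this sketch


def preprocess(message):
    """Mask numeric fields (integers, dotted numbers such as IPs) and split on whitespace."""
    masked = re.sub(r"\b\d+(\.\d+)*\b", "<*>", message)
    return masked.split()


def token_vector(token):
    """Stand-in for a word2vec lookup: a deterministic hash-based vector (assumption)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [b / 255.0 - 0.5 for b in digest[:DIM]]


def message_vector(tokens):
    """Represent a message as the mean of its token vectors."""
    if not tokens:
        return [0.0] * DIM
    vecs = [token_vector(t) for t in tokens]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class OnlineTemplateExtractor:
    """Processes one message at a time; never revisits earlier messages."""

    def __init__(self, threshold=0.7):
        self.threshold = threshold  # hypothetical similarity cutoff
        self.clusters = []  # each: {"centroid", "template", "count"}

    def add(self, message):
        tokens = preprocess(message)
        vec = message_vector(tokens)
        best, best_sim = None, self.threshold
        for idx, cluster in enumerate(self.clusters):
            # Assumption: only messages with the same token count share a template.
            if len(cluster["template"]) != len(tokens):
                continue
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            # No sufficiently similar cluster: start a new one.
            self.clusters.append({"centroid": vec, "template": tokens, "count": 1})
            return len(self.clusters) - 1
        cluster = self.clusters[best]
        n = cluster["count"]
        # Update the running-mean centroid and generalize the template.
        cluster["centroid"] = [(c * n + v) / (n + 1)
                               for c, v in zip(cluster["centroid"], vec)]
        cluster["template"] = [a if a == b else "<*>"
                               for a, b in zip(cluster["template"], tokens)]
        cluster["count"] = n + 1
        return best

    def templates(self):
        return [" ".join(c["template"]) for c in self.clusters]
```

Feeding messages one by one through `add` and then calling `templates()` returns the current template of each cluster. Each step compares the new message only against existing cluster centroids, which is what keeps the per-message cost small in an online setting. A real system would swap in trained word2vec vectors and a hierarchical cluster structure.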