Enhancing Web Text Clustering Accuracy and Efficiency With a Maximum Entropy Function Model: Overcoming High-Dimensional and Directional Challenges
by: Xumin Zhao, Guojie Xie, Yi Luo, Fenghua Liu, Hongpeng Bai
| Format: | Article |
|---|---|
| Published: | IEEE 2024-01-01 |
Description
With the rapid development of large models such as ChatGPT, text clustering has become an important research topic in data mining. Traditional clustering algorithms struggle with text data because it is high-dimensional and directional, and research on web text mining in particular remains insufficient, so both the accuracy and the efficiency of clustering algorithms need to improve. To address these challenges, this paper proposes a maximum entropy function model and applies it to web text clustering. Unlike traditional clustering algorithms, the proposed algorithm avoids local minima and reaches the global minimum. In summary, the paper presents a novel text clustering method, MEMC, which uses the maximum entropy function model to handle high-dimensional and directional features. On international standard datasets, MEMC achieves purity 15% higher than the widely used k-means algorithm and 6% higher than the AP algorithm. The study is intended to strengthen web text mining and provide insights for future research.
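The abstract reports its results in terms of purity. As a minimal sketch, assuming the standard definition of clustering purity (each cluster is credited with its most frequent ground-truth class, and the credited counts are divided by the number of documents), the metric can be computed as follows. The function name `purity` and the toy labels are illustrative only; the record does not state which exact variant or datasets the paper uses.

```python
from collections import Counter


def purity(labels_true, labels_pred):
    """Clustering purity under the standard definition (an assumption here).

    labels_true: ground-truth class label per document
    labels_pred: cluster id assigned to each document
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    if len(labels_true) == 0:
        raise ValueError("label sequences must not be empty")

    # Count how often each true class occurs inside each predicted cluster.
    per_cluster = {}
    for true_label, cluster_id in zip(labels_true, labels_pred):
        per_cluster.setdefault(cluster_id, Counter())[true_label] += 1

    # Credit each cluster with the size of its majority class.
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(labels_true)


# Toy example: six documents, three classes, two clusters.
toy_true = ["sports", "sports", "tech", "tech", "tech", "health"]
toy_pred = [0, 0, 0, 1, 1, 1]
toy_score = purity(toy_true, toy_pred)
```

Purity ranges from 0 to 1 and rises trivially as the number of clusters grows, so it is usually compared across algorithms at the same cluster count.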