Indexed Open Access Databases
Data classification algorithm for data-intensive computing environments
by: Tiedong Chen, Shifeng Liu, Daqing Gong, Honghu Gao
Format: | Article |
---|---|
Published: | SpringerOpen 2017-12-01 |
Description
Abstract: Data-intensive computing has received substantial attention since the arrival of the big data era. Research on data mining in data-intensive computing environments is still at an early stage. This paper proposes MR-DIDC, a decision tree classification algorithm based on the MapReduce programming framework and the SPRINT algorithm. MR-DIDC inherits the advantages of MapReduce, which makes the algorithm more suitable for data-intensive computing applications. The performance of the algorithm is evaluated on an example. The experimental results show that MR-DIDC can shorten the operation time and improve accuracy in a big data environment.
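The abstract does not give the details of MR-DIDC, so the following is only a rough sketch of the general idea it names: evaluating a SPRINT-style Gini split for one numeric attribute with a map step and a reduce step. It is not the authors' implementation. All function names (`map_partition`, `reduce_histograms`, `best_split`) and the toy data are hypothetical, the map and reduce steps are simulated in plain Python rather than run on a MapReduce cluster, and thresholds are taken at observed values rather than at midpoints.

```python
from collections import Counter, defaultdict
from functools import reduce


def gini(counts):
    """Gini impurity of a class-count mapping."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def map_partition(records):
    """Map step (simulated): build per-value class histograms for one partition."""
    hist = defaultdict(Counter)
    for value, label in records:
        hist[value][label] += 1
    return hist


def reduce_histograms(a, b):
    """Reduce step (simulated): merge two partial histograms."""
    merged = defaultdict(Counter)
    for part in (a, b):
        for value, counts in part.items():
            merged[value].update(counts)
    return merged


def best_split(hist):
    """Return (threshold, weighted Gini) for the best split 'value <= threshold'."""
    values = sorted(hist)
    total = Counter()
    for v in values:
        total.update(hist[v])
    n = sum(total.values())

    left = Counter()
    best = (None, float("inf"))
    # The largest value is skipped because it would leave the right side empty.
    for v in values[:-1]:
        left.update(hist[v])
        right = total - left
        n_left = sum(left.values())
        score = (n_left / n) * gini(left) + ((n - n_left) / n) * gini(right)
        if score < best[1]:
            best = (v, score)
    return best


# Hypothetical toy data: (attribute value, class label), split into two partitions.
partitions = [
    [(1, "a"), (2, "a")],
    [(3, "b"), (4, "b")],
]
merged = reduce(reduce_histograms, map(map_partition, partitions))
threshold, score = best_split(merged)
```

In this sketch each partition is summarized independently, so only the compact histograms need to be combined. This is the kind of property that makes split evaluation a natural fit for a MapReduce-style framework.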