MBGM: A Graph-Mining Tool Based on MapReduce and BSP

Release Date: 2015-01-19 Author: Zhenjiang Dong, Lixia Liu, Bin Wu, and Yang Liu

[Abstract] This paper proposes an analytical mining tool for big graph data based on MapReduce and the bulk synchronous parallel (BSP) computing model. The tool is named the MapReduce and BSP based Graph-mining tool (MBGM). The core of this mining system is four sets of parallel graph-mining algorithms programmed in the BSP parallel model and one set of data extraction-transformation-loading (ETL) algorithms implemented in MapReduce. To invoke these algorithm sets, we designed a workflow engine that is optimized for cloud computing. Finally, a well-designed data management function enables users to view, delete, and input data in the Hadoop distributed file system (HDFS). Experiments on artificial data show that the graph-mining algorithm components in MBGM are efficient.

[Keywords] cloud computing; parallel algorithms; graph data analysis; data mining; social network analysis