Big Data: A Profound Transformation Under Way

Published: 2013-08-12 Authors: Liu Peng (刘鹏), Wu Zhaofeng (吴兆峰), Hu Guyu (胡谷雨)

[Abstract] This paper describes and compares the main technologies used worldwide for storing, managing, processing, and mining big data. The overall trend in big-data technology is to use distributed computing to resolve bottlenecks. Because the performance of the whole system cannot be raised solely by improving the performance of a single node, it must instead be raised by increasing the number of nodes in the system. Storage, processing, and analysis tasks can then be distributed across the nodes of the system to speed up data storage, processing, and analysis.

[Keywords] big data; new Moore's law; cloud computing; data mining; Hadoop platform