Saturday, August 17, 2013

Advantages of Hadoop compared to traditional clusters

I'm new to Hadoop. My own understanding of this question is that with Hadoop the machines work together to complete one task, whereas a traditional cluster receives a task and then hands it to one machine in the cluster, which completes it alone.
I'm not sure whether I understand it correctly?
------ Solution ---------------------------------------- ----
To discuss this question, we should first understand what a Hadoop cluster is. One source describes it like this:
A Hadoop cluster is a specific type of cluster designed for storing and analyzing vast amounts of unstructured data. Essentially, it is a computing cluster in which the data analysis work is distributed across multiple cluster nodes that process the data in parallel.
From this point of view, the original poster's understanding makes sense. Now let's look at the advantages:
One,
Suitable for big data processing. Big data is generally widely distributed and unstructured. Hadoop is well suited to this kind of data because it works by splitting the data into pieces and assigning each "slice" to a specific cluster node for analysis. The data does not have to be uniformly distributed, since each data slice is processed independently on a separate cluster node.
Two,
Flexible scalability. As with any other kind of data analysis, an important problem facing big data analysis is the ever-increasing volume of data. And the biggest advantage of big data lies in real-time or near-real-time analysis and processing. The parallel processing capability of a Hadoop cluster can speed up analysis, but as the amount of data to be analyzed increases, the processing capacity of the cluster may be affected. However, the cluster can be effectively extended by adding more cluster nodes.
Three,
Cost. Hadoop clusters are cheaper for two main reasons. The required software is open source, which reduces costs. In fact, you can freely download the Apache Hadoop release. Meanwhile, Hadoop clusters keep costs under control by supporting commodity hardware. You can build a powerful Hadoop cluster without having to buy server-class hardware. So a Hadoop cluster does turn out to be a cost-effective solution.
Four,
Fault tolerance. When a data slice is sent to a node for analysis, a copy of that data is also kept on other nodes in the cluster. That way, even if one node fails, extra copies of that node's data still exist elsewhere in the cluster, so the data can still be processed and analyzed.
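
Points one and four can be pictured with a small toy model in Python. This is only a sketch and not Hadoop's own API: threads stand in for cluster nodes, the node names and the helper functions (split_into_slices, place_replicas, pick_live_node, analyze) are made up for illustration, and keeping two copies of each slice is an assumption chosen to keep the example short.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

NODES = ["node-a", "node-b", "node-c"]  # hypothetical node names


def split_into_slices(records, num_slices):
    """Cut the input into num_slices pieces (round-robin)."""
    return [records[i::num_slices] for i in range(num_slices)]


def place_replicas(num_slices, nodes, copies=2):
    """Store each slice on `copies` different nodes (copies <= len(nodes))."""
    return {
        s: [nodes[(s + k) % len(nodes)] for k in range(copies)]
        for s in range(num_slices)
    }


def pick_live_node(slice_id, placement, failed_nodes):
    """Return the first node that holds the slice and is still up."""
    for node in placement[slice_id]:
        if node not in failed_nodes:
            return node
    raise RuntimeError(f"all copies of slice {slice_id} are lost")


def count_words(data_slice):
    """Analyze one slice on its own, without looking at the others."""
    counts = Counter()
    for line in data_slice:
        counts.update(line.split())
    return counts


def analyze(records, failed_nodes=frozenset()):
    """Word count over all slices; returns (totals, node used per slice)."""
    slices = split_into_slices(records, len(NODES))
    placement = place_replicas(len(slices), NODES)
    # Raises only if every copy of some slice sits on a failed node.
    assigned = [
        pick_live_node(s, placement, failed_nodes) for s in range(len(slices))
    ]
    # Threads play the role of the nodes working in parallel.
    with ThreadPoolExecutor(max_workers=len(NODES)) as pool:
        partials = list(pool.map(count_words, slices))
    total = Counter()
    for partial in partials:
        total.update(partial)  # merge the per-slice results
    return total, assigned
```

Calling analyze(records, failed_nodes={"node-b"}) gives the same word counts as a run with no failures; the only difference is that the slice whose first copy lived on node-b is read from its second copy instead.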





------ For reference only ---------------------------------- -----
Hadoop's outstanding advantages are, in fact, exactly where traditional BI falls short. Of course Hadoop is not perfect, and it has some drawbacks. One drawback
is that the cluster solution is based on the data being "separable" and on each piece being processed in parallel on a separate node. If the analysis does not fit a parallel processing environment, a Hadoop cluster is not the appropriate tool for the task.
Another is that when the amount of data is small, the effort of configuring, building, operating, maintaining and supporting a cluster does not pay off.
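
To make the "separable" point concrete, here is a small Python sketch (again a toy model with made-up function names, not Hadoop code). A mean can be rebuilt exactly from per-slice sums and counts, while a median cannot be rebuilt simply by taking the median of the per-slice medians.

```python
from statistics import median


def split_into_slices(values, num_slices):
    """Cut the list into contiguous slices of (almost) equal size."""
    size = -(-len(values) // num_slices)  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def mean_from_slices(slices):
    """Separable: each slice reports (sum, count); combining them is exact."""
    partials = [(sum(s), len(s)) for s in slices]
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total / count


def naive_median_from_slices(slices):
    """Not separable this way: the median of slice medians can be wrong."""
    return median([median(s) for s in slices])


values = [1, 2, 100, 3, 4, 200, 5, 6, 7]
slices = split_into_slices(values, 3)

print(mean_from_slices(slices) == sum(values) / len(values))  # True
print(median(values), naive_median_from_slices(slices))       # 5 4
```

An analysis of the second kind needs extra coordination between nodes or a different algorithm, which is the situation the drawback above describes.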

------ For reference only ---------------------------------- -----
This article is very good; it is comprehensive and in-depth and can give you a solid grasp of Hadoop:
http://os.51cto.com/art/201211/364374.htm
