Monday, August 5, 2013

Hadoop application scenarios

Recently the name Hadoop keeps showing up on all the major sites,
and I really want to look into it.
But before I start studying, I'd like to ask:
I'm a Java programmer. What can Hadoop do for me? In what scenarios do I need to use Hadoop?
That way my study will have a clear goal to work toward.

------ Solution --------------------------------------------
It can give you the architectural support you need for high concurrency and large-scale data storage.
------ Solution --------------------------------------------
I use it to run clustering algorithms in parallel.
In the Java world I have not found another open-source framework that can replace Hadoop MapReduce for parallel execution.
------ Solution --------------------------------------------
Simply put, it is for processing large amounts of data.
------ Solution --------------------------------------------
HDFS provides the file system,
MR provides parallel computing,
and NoSQL databases can be integrated on top to serve online/real-time workloads, to say nothing of big data.
Add in high scalability and fault tolerance, and operation and maintenance costs drop significantly.
Would you say that's useful enough? Heh heh.
------ Solution --------------------------------------------
Hadoop has two parts: the HDFS file system and the MapReduce computing framework.
HDFS only provides an interface for file access; you generally call the API in the Hadoop packages to write whatever files you need onto it.
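For example, a minimal sketch of writing a file onto HDFS through the Hadoop FileSystem API might look like the following; the path and file contents are made up, and the NameNode address is assumed to come from core-site.xml on the classpath.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        // fs.defaultFS (the NameNode address) is expected to be set in core-site.xml
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Write a small text file onto HDFS; the path is only an example
        Path file = new Path("/user/demo/hello.txt");
        FSDataOutputStream out = fs.create(file, true); // true = overwrite if it exists
        out.writeBytes("hello hdfs\n");
        out.close();

        // Read it back to confirm the round trip
        FSDataInputStream in = fs.open(file);
        IOUtils.copyBytes(in, System.out, 4096, false);
        in.close();

        fs.close();
    }
}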
MapReduce is the computing framework that processes the files sitting on HDFS; you generally develop your own JAR package around your business logic and run it to process the files that have already been uploaded.
As for the data acquisition and algorithms you mentioned, those you have to implement yourself in your business system on top of it.
The advantage: hundreds or thousands of machines can be treated as one file system, used as if they were a single hard disk, so you can store a huge amount of data. With that many computers running MapReduce to process the data in parallel, you can imagine that a few TB of data does not take very long.
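To make the "write your own JAR and run it against files already on HDFS" idea concrete, here is a minimal word-count sketch using the newer org.apache.hadoop.mapreduce API (Hadoop 2.x style); the input and output paths are just placeholders passed on the command line.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: emit (word, 1) for every word in an input line
    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (token.isEmpty()) continue;
                word.set(token);
                ctx.write(word, ONE);
            }
        }
    }

    // Reducer: sum the counts emitted for each word
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            ctx.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory on HDFS
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory on HDFS
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

You would package this as a JAR and submit it with the hadoop jar command, pointing the two arguments at directories on HDFS.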

Hive is a system that lets you drive the MapReduce computing framework with SQL. You write SQL, and it is automatically translated into N MapReduce tasks that are distributed across the cluster to run.
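As a rough illustration, such a query could be submitted from Java over JDBC like this; the table name access_log, the host, and the credentials are made-up placeholders, and it assumes a HiveServer2 instance with the hive-jdbc driver on the classpath.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQueryExample {
    public static void main(String[] args) throws Exception {
        // HiveServer2 JDBC driver; host, port, and database are placeholders
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hive-host:10000/default", "user", "");
        Statement stmt = conn.createStatement();

        // Hive translates this SQL into one or more MapReduce jobs on the cluster
        ResultSet rs = stmt.executeQuery(
                "SELECT page, COUNT(*) AS hits FROM access_log GROUP BY page");
        while (rs.next()) {
            System.out.println(rs.getString("page") + "\t" + rs.getLong("hits"));
        }

        rs.close();
        stmt.close();
        conn.close();
    }
}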

HBase is a NoSQL database built on top of Hadoop. Because Hadoop itself is only a file system and Hive queries are very slow, HBase emerged, built specifically for real-time lookups.
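A small sketch of such a real-time lookup with the HBase Java client (the 0.94-era HTable API) could look like this; the table name page_stats, the column family d, and the row key are illustrative names only, and the ZooKeeper quorum is assumed to come from hbase-site.xml on the classpath.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseLookupExample {
    public static void main(String[] args) throws Exception {
        // Reads hbase-site.xml from the classpath for the ZooKeeper quorum etc.
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "page_stats");

        // Write one cell: row key = page URL, column d:hits
        Put put = new Put(Bytes.toBytes("/index.html"));
        put.add(Bytes.toBytes("d"), Bytes.toBytes("hits"), Bytes.toBytes("12345"));
        table.put(put);

        // Random read by row key: the kind of real-time lookup HBase is built for
        Get get = new Get(Bytes.toBytes("/index.html"));
        Result result = table.get(get);
        byte[] hits = result.getValue(Bytes.toBytes("d"), Bytes.toBytes("hits"));
        System.out.println("hits = " + Bytes.toString(hits));

        table.close();
    }
}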
------ Solution --------------------------------------------
If you use Java, then learning Hadoop is relatively easy. Cloud computing is quite hot right now, and if you want to get into cloud computing, Hadoop is still a fairly good choice. Hadoop is mainly used for big data processing, and it also requires that the data items not be highly correlated with one another, i.e. diverse, heterogeneous data. Hadoop is commonly used today for log processing and search engines, but it has drawbacks as well; one of them is that it is not suitable for low-latency applications.
------ Solution --------------------------------------------

I would like to ask, what kind of log processing is Hadoop used for?
------ Solution --------------------------------------------

I would like to ask, what kind of log processing is Hadoop used for?


For example, computing which pages of a site get the most visits, the access times, the visitor IPs, and so on.
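As a rough sketch of that kind of job, a mapper could parse each Apache-style access log line and emit (page, 1), to be summed by a reducer just like the word-count example above; the log format and field positions assumed here are only an example, so adjust them to your own logs.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Parses one line of an Apache-style access log and emits (requested page, 1)
public class PageHitMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text page = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context ctx)
            throws IOException, InterruptedException {
        // Typical line: 127.0.0.1 - - [25/Apr/2013:13:49:23 +0800] "GET /index.html HTTP/1.1" 200 2326
        String[] parts = line.toString().split("\"");
        if (parts.length < 2) return;               // skip malformed lines
        String[] request = parts[1].split(" ");     // e.g. GET /index.html HTTP/1.1
        if (request.length < 2) return;
        page.set(request[1]);                       // the requested page
        ctx.write(page, ONE);
    }
}

Emitting the first field of the line instead would count hits per visitor IP in exactly the same way.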
------ For reference only ---------------------------------------

Could you explain that in more detail?
------ For reference only ---------------------------------------
This reply was deleted by an administrator at 2013-04-25 13:49:23.

------ For reference only ---------------------------------------
This reply was deleted by a moderator at 2013-05-12 22:21:17.

------ For reference only ---------------------------------------
This reply was deleted by an administrator at 2013-06-03 09:16:59.
