Friday, August 16, 2013

Why does the Hadoop map phase finish quickly, while reduce has been stuck for over ten minutes?

Why does this happen?
------ Solution --------------------------------------------

If it is data skew, rewriting the Partitioner works better; data skew is very common in practice.
You can also check on the web UI whether the job is still copying data (the shuffle); if that is the case, it may be a machine problem or a network problem.
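For the data-skew case, here is a minimal sketch of a skew-aware Partitioner using the new mapreduce API, assuming Text keys, IntWritable values, and a single known hot key; the class name and "HOT_KEY" are placeholders, not from the thread:

import java.util.Random;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Sketch only: spreads one known hot key across all reducers instead of
// letting the default hash send all of its records to a single reducer.
public class SkewAwarePartitioner extends Partitioner<Text, IntWritable> {
    private static final String HOT_KEY = "HOT_KEY";   // hypothetical skewed key
    private final Random random = new Random();

    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        if (key.toString().equals(HOT_KEY)) {
            // Scatter the hot key over all reducers.
            return random.nextInt(numPartitions);
        }
        // Default hash partitioning for everything else.
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}

It would be registered with job.setPartitionerClass(SkewAwarePartitioner.class). Note that scattering a hot key means its values no longer all reach one reducer, so this only fits jobs whose reduce logic is associative and whose partial results are merged in a follow-up step.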
------ Solution --------------------------------------------
The data may be skewed. Consider: 1. adding an appropriate Combiner; 2. rewriting the Partitioner. Of course, uploading files to HDFS at the same time can also make reduce very slow.
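On the Combiner suggestion, a minimal sketch assuming a sum-style job with Text keys and IntWritable counts (word-count-like semantics, which may not match the poster's actual job):

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sketch only: locally sums counts on the map side before the shuffle.
public class SumCombiner extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();   // partial sum computed locally per map task
        }
        result.set(sum);
        context.write(key, result);
    }
}

Registering it with job.setCombinerClass(SumCombiner.class) pre-aggregates map output before the shuffle, so far less data has to be pulled by the overloaded reducer.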
------ Solution --------------------------------------------
Does reduce get stuck even though it ran successfully before? The reduce code may be wrong... What operations do your map and reduce actually perform?
------ Solution --------------------------------------------
1. Data skew: rewrite the Partitioner.
2. The code is wrong: open jobtrack.jsp?jobid to check the execution progress and read the logs.
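As a complement to the JobTracker web page, a small client-side check with the old mapred JobClient API (Hadoop 1.x era) can print the map/reduce progress; the class name and the job id argument here are hypothetical:

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.RunningJob;

// Sketch only: prints how far map and reduce have progressed for one job.
public class ReduceProgressCheck {
    public static void main(String[] args) throws Exception {
        JobClient client = new JobClient(new JobConf());
        // args[0] is a job id such as job_201308160000_0001 (placeholder)
        RunningJob job = client.getJob(JobID.forName(args[0]));
        if (job == null) {
            System.err.println("Job not found: " + args[0]);
            return;
        }
        // mapProgress()/reduceProgress() return a fraction between 0.0 and 1.0
        System.out.printf("map %.0f%%  reduce %.0f%%%n",
                job.mapProgress() * 100, job.reduceProgress() * 100);
    }
}

If reduce sits at around 33% for a long time, it is usually still in the copy/shuffle phase, which points back to the network or data-skew explanations above.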
------ For reference only ---------------------------------------
It is data skew. The reduce phase is an aggregation process; if the data for a certain key is disproportionately large, that part of the data ends up being computed on a single reducer, which hurts overall performance.
------ For reference only ---------------------------------------

So this problem has to be solved according to the situation? If my data is randomly generated, wouldn't it be better for me to just regenerate the data?
