Friday, August 23, 2013

Connecting to HDFS from Eclipse reports an "Error: failure to login" error

I tried to connect to HDFS from Eclipse, but it reports an "Error: failure to login" error.
Here is a method I found online:
--------------------------------------------------------------
Copy the five jars commons-configuration-1.6.jar, commons-httpclient-3.0.1.jar, commons-lang-2.4.jar, jackson-core-asl-1.0.1.jar and jackson-mapper-asl-1.0.1.jar from the HADOOP_HOME/lib directory into the lib directory inside hadoop-eclipse-plugin-0.20.203.0.jar, then edit META-INF/MANIFEST.MF inside the jar and change the classpath entry to:
Bundle-ClassPath: classes/, lib/hadoop-core.jar, lib/commons-cli-1.2.jar, lib/commons-httpclient-3.0.1.jar, lib/jackson-core-asl-1.0.1.jar, lib/jackson-mapper-asl-1.0.1.jar, lib/commons-configuration-1.6.jar, lib/commons-lang-2.4.jar
--------------------------------------------------------------
I tried this method, but it still doesn't work.

------ Solution --------------------------------------------
First delete the original plugin jar from eclipse/plugins, start Eclipse, then close it; next copy the modified plugin into eclipse/plugins/ and start Eclipse again. That should do it. The plugin is probably being cached by Eclipse.
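A rough command-line sketch of that sequence (run from the directory containing the Eclipse installation; -clean is Eclipse's standard switch for discarding its cached plugin/bundle data, which is what the reply suspects is stale):

# 1. delete the original plugin jar, then start and close Eclipse once so the cached copy is dropped
rm eclipse/plugins/hadoop-eclipse-plugin-0.20.203.0.jar
./eclipse/eclipse -clean        # start Eclipse, then exit it

# 2. copy the modified plugin in and start Eclipse again
cp hadoop-eclipse-plugin-0.20.203.0.jar eclipse/plugins/
./eclipse/eclipse -clean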
------ For reference only ---------------------------------------
I'll keep trying, then.

Hadoop Streaming cannot be used to call a shell command

I use the command: hadoop jar contrib/streaming/hadoop-streaming-1.0.0.jar -input input -output output -mapper /bin/cat -file test.sh -reducer test.sh
The shell script test.sh is:
#!/bin/bash
hadoop fs -get input/1.avi ~/hadoop-1.0.0/tmp/1.avi
The error message is:
12/06/01 16:31:25 ERROR streaming.StreamJob: Job not successful. Error: # of failed Reduce Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201206011340_0024_r_000000
12/06/01 16:31:25 INFO streaming.StreamJob: killJob ...
Streaming Job Failed!

------ Solution --------------------------------------------
The reduce tasks died. You can see the specific failure stack trace on the JobTracker web console at http://jobmaster:port; the port is probably 50070 or 50030.
------ For reference only ---------------------------------------
Finally someone replied.
It's because the reducer script calls hadoop fs -get; if I change it to something like grep or cat, there is no problem.
The reduce task's exit code is not 0, and by default Hadoop retries four times; after four failures it reports the whole job as failed.
Does anyone have experience calling hadoop commands from a shell script? How can this be solved? Any advice appreciated.
------ For reference only ---------------------------------------
I looked at the error as you suggested, but I still can't tell where the problem is. I'm a Hadoop novice; can someone help me see what is wrong?
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 255
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:311)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:545)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:132)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
------ For reference only ---------------------------------------
After much suffering, solved: the problem was that the hadoop command was not recognized. You need to export the Hadoop environment variables in the shell script.
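A minimal sketch of what test.sh could look like with the environment exported; the install paths are examples and need to match where Hadoop and Java actually live on the task nodes:

#!/bin/bash
# make the hadoop command resolvable inside the streaming task's stripped-down environment
export JAVA_HOME=/usr/java/default                  # example path
export HADOOP_HOME=/home/hadoop/hadoop-1.0.0        # example path
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin

hadoop fs -get input/1.avi ~/hadoop-1.0.0/tmp/1.avi
# a non-zero exit status makes the task count as failed and triggers the retries described above
exit 0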
------ For reference only ---------------------------------------
That's good, then.

Thursday, August 22, 2013

Can the number of Hadoop TaskTrackers be controlled?

As the title says. Hadoop is already deployed and up and running. Can I set the number of TaskTrackers I need in code? For example, my cluster has 10 machines all in service, but I want a particular job to run on only 5 of them; how can that be done? And please don't tell me to change the configuration file and then shut down and restart...
Similarly, can the number of DataNodes be controlled?
------ Solution --------------------------------------------
Looking forward to an answer too.
------ Solution --------------------------------------------
Is CSDN too shallow for this?
------ Solution --------------------------------------------
It can be set; the system uses a default value, and it can be changed through a method call on some class.
------ Solution --------------------------------------------
I only know that it can be set; I don't know the specifics.
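The replies above don't give specifics. One mechanism for taking nodes out of service without restarting the whole cluster, not mentioned in this thread, is the host exclude files plus a refresh command. A rough sketch for Hadoop 1.x (file paths and hostnames are made-up examples, and the exclude properties must already be present in the configuration; verify the commands against your version):

# hdfs-site.xml:   dfs.hosts.exclude    -> /usr/local/hadoop/conf/excludes   (example path)
# mapred-site.xml: mapred.hosts.exclude -> /usr/local/hadoop/conf/excludes
echo slave06 >> /usr/local/hadoop/conf/excludes   # host to take out of service

hadoop dfsadmin -refreshNodes   # DataNodes on the listed hosts start decommissioning
hadoop mradmin -refreshNodes    # TaskTrackers on the listed hosts are excluded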
------ For reference only ---------------------------------------
How come nobody answers...

Under Windows, ./hadoop namenode -format always fails (pseudo-distributed)

likos@likos-PC /cygdrive/d/hadoop/run/bin
$ ./hadoop namenode -format
12/03/09 16:26:26 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = likos-PC/192.168.0.119
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.203.0
STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/common/branches/br
************************************************************/
Re-format filesystem in \tmp\hadoop-likos\dfs\name ? (Y or N) n
Format aborted in \tmp\hadoop-likos\dfs\name
12/03/09 16:27:02 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at likos-PC/192.168.0.119
************************************************************/



The above is copied verbatim. What is going on here? I'm trying to figure it out.
In this state, ./start-all.sh only starts the JobTracker; jps shows that the NameNode, DataNode, SecondaryNameNode and TaskTracker do not start.
------ Solution --------------------------------------------
If you are just learning casually, this is fine. If you want to do real testing, it's better to move to a cluster, or at least to Linux.

You can first set the tmp path in the configuration file, then create that folder, and re-format.
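A minimal sketch of that fix, assuming the property meant is hadoop.tmp.dir in conf/core-site.xml (the value is only an example). Note also that in the log above the re-format prompt was answered with "n", which aborts the format, and in this version the prompt generally only accepts an uppercase "Y":

<property>
  <name>hadoop.tmp.dir</name>
  <!-- example location; create this directory yourself before formatting -->
  <value>/hadoop/tmp</value>
</property>

Then run ./hadoop namenode -format again and answer Y at the prompt.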
------ For reference only ---------------------------------------
It's clearly the path that is wrong.
------ For reference only ---------------------------------------
Thank you, although the problem is still not resolved.

Wednesday, August 21, 2013

hbase configuration problems

I recently started learning hadoop.
I configured hadoop, hbase and zookeeper clusters on Red Hat. The hadoop cluster starts fine, and the zookeeper cluster also starts without problems.
But after hbase starts, running list in the hbase shell gives an org.apache.hadoop.hbase.MasterNotRunningException: null error.
I searched the Internet: my hadoop core-site.xml and the corresponding hbase settings match. When hbase starts, both the master and the slaves report that they started, with no error, but jps shows that HMaster is not actually running. The hbase log shows, on every start:
org.apache.hadoop.hbase.HMaster: unhandled exception
java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI
This error makes the start fail. It might be that reverse domain-name resolution cannot find the host, but from the host I can ping every machine's IP and hostname. My configuration files all use hostnames rather than IPs, and I have added the corresponding reverse resolution in /etc/hosts. I have also replaced the hadoop core jar in hbase's lib directory, so compatibility between hadoop and hbase should not be the problem either. I have set up multi-node hadoop before, and a single-node hbase still works. Where on earth is the problem? Hoping an expert can give some pointers.
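For reference, the /etc/hosts mapping the poster describes is just hostname lines like these on every node (addresses and names here are made-up examples):

192.168.1.101   hmaster
192.168.1.102   hslave1
192.168.1.103   hslave2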
------ Solution --------------------------------------------
The problem has been solved. It was a virtual machine issue: I was using VMware before, and after switching to VirtualBox the problem is gone and hbase works normally...
------ For reference only ---------------------------------------
I have the same problem; it's so frustrating!
------ For reference only ---------------------------------------
Is your problem solved? I've been busy with work lately and couldn't keep studying the hadoop cluster. I plan to reconfigure hbase as a single node while keeping hadoop and zookeeper as clusters, and try again.
------ For reference only ---------------------------------------
Thanks. I'm also using VMware; it looks like I should install VirtualBox and give it a try. VMware has worn me half to death.
------ For reference only ---------------------------------------
java.net.URISyntaxException: Relative path in absolute URI:

Pig delimiter issue on Hadoop

The column delimiter of the file test.txt is CTRL+A, i.e. \001.

I run: raw = LOAD '.../test.txt' USING PigStorage('\001') AS (a, b, c);

but PigStorage does not seem to recognize \001 and reports an error.

Questions:
1. Without replacing the file's delimiter, how can this problem be solved?
2. If the delimiter (not necessarily \001) is to be matched with a regex, how should the LOAD statement be written?
------ Solution --------------------------------------------
1. Use sed to replace '\001' with a delimiter that PigStorage can recognize (see the sketch after this reply).

2. raw = LOAD '.../test.txt' USING PigStorage('your regex') AS (a, b, c);
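A minimal command-line sketch of option 1, assuming a GNU userland and that a tab is an acceptable replacement delimiter:

# rewrite Ctrl-A (\001) as a tab so the file can be loaded with PigStorage('\t')
tr '\001' '\t' < test.txt > test_tab.txt
# equivalently, with GNU sed:
# sed 's/\x01/\t/g' test.txt > test_tab.txt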


------ For reference only ---------------------------------------
Please take a look at this problem for me, thanks.
Even if you don't know the answer, at least help bump the thread!!
------ For reference only ---------------------------------------
This reply was deleted by a moderator at 2012-02-02 10:39:00

------ For reference only ---------------------------------------
Does nobody know the answer to this question???.......
------ For reference only ---------------------------------------
Closing the thread; nobody answered, so I'll just close it.

Tuesday, August 20, 2013

About a data synchronization issue with a distributed cache (memcached)

Two computers: one is the memcached server, the other is the logic server I'm writing.

The logic server only performs logic operations; memcached caches the user data. When the logic server needs data, it fetches it from memcached and caches it locally for the moment, modifies the user data, then saves it back to the memcached server and clears the local copy.

So there is a problem: multiple threads on the logic server may modify the same user's data at the same time, and the resulting race can cause the data in memcached to be overwritten.

My idea is to add a per-user lock on the logic server side: to modify a user's data, first acquire that user's lock, then fetch the data from memcached, modify it, save it back to memcached, and finally release the lock.

I feel this approach may hurt efficiency. Is there a better way to achieve this? Please advise!
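A minimal Java sketch of the per-user lock idea described above (the store and data types here are hypothetical stand-ins for the real memcached client and user model):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

public class UserDataUpdater {

    // hypothetical stand-ins for the real memcached client and user data
    interface MemcachedStore { UserData get(String userId); void set(String userId, UserData data); }
    interface Mutation { void apply(UserData data); }
    static class UserData { /* user fields omitted */ }

    // one lock per user, created lazily and shared by all threads in this process
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    public void update(String userId, MemcachedStore store, Mutation change) {
        ReentrantLock lock = locks.computeIfAbsent(userId, id -> new ReentrantLock());
        lock.lock();
        try {
            UserData data = store.get(userId);   // fetch from memcached
            change.apply(data);                  // modify locally
            store.set(userId, data);             // write back
        } finally {
            lock.unlock();                       // always release, even on error
        }
    }
}

Note that this only serializes writers inside the one logic-server process, which matches the setup described above (a single logic server).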
------ Solution --------------------------------------------
Hash the user data by user name modulo the number of threads, and let each thread handle the users that map to it.
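A minimal Java sketch of that suggestion: route every update for a user to the same single-threaded executor, so updates for one user run strictly one after another without explicit locks (the update task itself would do the memcached get/modify/set):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PartitionedUpdater {

    private final ExecutorService[] workers;

    public PartitionedUpdater(int threadCount) {
        workers = new ExecutorService[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = Executors.newSingleThreadExecutor();   // one queue + one thread per partition
        }
    }

    // userName.hashCode() % threadCount picks the partition, so the same user always lands on the same thread
    public void submit(String userName, Runnable updateTask) {
        int slot = (userName.hashCode() & Integer.MAX_VALUE) % workers.length;
        workers[slot].submit(updateTask);
    }
}

Compared with per-user locks this avoids lock contention entirely, at the cost of tying each user's throughput to a single thread.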
------ For reference only ---------------------------------------
Addendum: the logic server is written in Java.
------ For reference only ---------------------------------------
What I want to solve is the data synchronization problem.
------ For reference only ---------------------------------------
There are different types of threads that may modify the same data.