2015-04-02

I have installed RHadoop on a Hortonworks VM. When I run a MapReduce job to verify the installation, it fails with the error "Streaming Command Failed!".

Note that I am running as the user rstudio (not root, but it has sudoer access).

Can someone help me understand this problem? I don't have much of an idea how to solve it.

Sys.setenv(HADOOP_HOME="/usr/hdp/2.2.0.0-2041/hadoop")
Sys.setenv(HADOOP_CMD="/usr/bin/hadoop")
Sys.setenv(HADOOP_STREAMING="/usr/hdp/2.2.0.0-2041/hadoop-mapreduce/hadoop-streaming.jar")
library(rhdfs)
hdfs.init()
library(rmr2)
ints = to.dfs(1:10)
calc = mapreduce(input = ints, map = function(k, v) cbind(v, 2*v))
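Since a "Streaming Command Failed!" error is frequently just a wrong path, one quick sanity check (a sketch; the paths below are copied from the Sys.setenv() calls above and are specific to this HDP 2.2 sandbox) is to confirm that each configured path actually exists before exporting it into R:

```shell
# Report whether each configured path actually exists on this machine.
check_path() {
  if [ -e "$1" ]; then
    echo "OK: $1"
  else
    echo "MISSING: $1"
  fi
}

# Paths taken from the Sys.setenv() calls above (Hortonworks 2.2 layout):
check_path /usr/hdp/2.2.0.0-2041/hadoop
check_path /usr/bin/hadoop
check_path /usr/hdp/2.2.0.0-2041/hadoop-mapreduce/hadoop-streaming.jar
```

If any line prints MISSING, fix that environment variable before re-running the job.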

Below is the full error I get in RHadoop:

Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, :
  hadoop streaming failed with error code 1

Traceback:
4: stop("hadoop streaming failed with error code ", retval, "\n")
3: mr(map = map, reduce = reduce, combine = combine, vectorized.reduce,
     in.folder = if (is.list(input)) { lapply(input, to.dfs.path) } else to.dfs.path(input),
     out.folder = to.dfs.path(output), ...
2: mapreduce(input = input, output = output, input.format = "text", map = map)
1: wordcount(hdfs.data, hdfs.out)



packageJobJar: [] [/usr/hdp/2.2.0.0-2041/hadoop-mapreduce/hadoop-streaming-2.6.0.2.2.0.0-2041.jar] /tmp/streamjob3075733686753367992.jar tmpDir=null 
15/04/07 21:43:10 INFO impl.TimelineClientImpl: Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/ 
15/04/07 21:43:10 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8050 
15/04/07 21:43:11 INFO impl.TimelineClientImpl: Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/ 
15/04/07 21:43:11 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8050 
15/04/07 21:43:11 INFO mapred.FileInputFormat: Total input paths to process : 1 
15/04/07 21:43:11 INFO mapreduce.JobSubmitter: number of splits:2 
15/04/07 21:43:12 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1428440418649_0006 
15/04/07 21:43:12 INFO impl.YarnClientImpl: Submitted application application_1428440418649_0006 
15/04/07 21:43:12 INFO mapreduce.Job: The url to track the job: http://sandbox.hortonworks.com:8088/proxy/application_1428440418649_0006/ 
15/04/07 21:43:12 INFO mapreduce.Job: Running job: job_1428440418649_0006 
15/04/07 21:43:19 INFO mapreduce.Job: Job job_1428440418649_0006 running in uber mode : false 
15/04/07 21:43:19 INFO mapreduce.Job: map 0% reduce 0% 
15/04/07 21:43:27 INFO mapreduce.Job: Task Id : attempt_1428440418649_0006_m_000001_0, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

Container killed by the ApplicationMaster. 
Container killed on request. Exit code is 143 
Container exited with a non-zero exit code 143 

15/04/07 21:43:27 INFO mapreduce.Job: Task Id : attempt_1428440418649_0006_m_000000_0, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

15/04/07 21:43:35 INFO mapreduce.Job: Task Id : attempt_1428440418649_0006_m_000001_1, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

15/04/07 21:43:35 INFO mapreduce.Job: Task Id : attempt_1428440418649_0006_m_000000_1, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

15/04/07 21:43:43 INFO mapreduce.Job: Task Id : attempt_1428440418649_0006_m_000001_2, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

15/04/07 21:43:44 INFO mapreduce.Job: Task Id : attempt_1428440418649_0006_m_000000_2, Status : FAILED 
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322) 
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535) 
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130) 
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) 
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34) 
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) 

15/04/07 21:43:52 INFO mapreduce.Job: map 100% reduce 0% 
15/04/07 21:43:53 INFO mapreduce.Job: Job job_1428440418649_0006 failed with state FAILED due to: Task failed task_1428440418649_0006_m_000001 
Job failed as tasks failed. failedMaps:1 failedReduces:0 

15/04/07 21:43:54 INFO mapreduce.Job: Counters: 13 
    Job Counters 
     Failed map tasks=7 
     Killed map tasks=1 
     Launched map tasks=8 
     Other local map tasks=6 
     Data-local map tasks=2 
     Total time spent by all maps in occupied slots (ms)=49670 
     Total time spent by all reduces in occupied slots (ms)=0 
     Total time spent by all map tasks (ms)=49670 
     Total vcore-seconds taken by all map tasks=49670 
     Total megabyte-seconds taken by all map tasks=12417500 
    Map-Reduce Framework 
     CPU time spent (ms)=0 
     Physical memory (bytes) snapshot=0 
     Virtual memory (bytes) snapshot=0 
15/04/07 21:43:54 ERROR streaming.StreamJob: Job not successful! 
Streaming Command Failed! 
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, : 
    hadoop streaming failed with error code 1 

Answer 1

Your current setup runs the code from RStudio. Can you try writing the code in a .R file and launching it from the command line instead?

Your exception in PipeMapRed.waitOutputThreads() is typically caused by an incorrect input/output path. Please double-check your paths.

This should work.
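As a sketch of that suggestion (the file name mapreduce_test.R is made up for illustration, and the call assumes Rscript is on the PATH), saving the snippet from the question to a file and running it outside RStudio looks like this:

```shell
# Write the RHadoop snippet to a standalone script (file name is illustrative).
cat > mapreduce_test.R <<'EOF'
Sys.setenv(HADOOP_CMD="/usr/bin/hadoop")
Sys.setenv(HADOOP_STREAMING="/usr/hdp/2.2.0.0-2041/hadoop-mapreduce/hadoop-streaming.jar")
library(rhdfs)
hdfs.init()
library(rmr2)
ints = to.dfs(1:10)
calc = mapreduce(input = ints, map = function(k, v) cbind(v, 2*v))
EOF

# Run it from the command line rather than inside the RStudio session;
# guarded in case Rscript is not installed on this machine.
command -v Rscript >/dev/null 2>&1 && Rscript mapreduce_test.R || echo "Rscript not found"
```

Running outside RStudio also rules out environment variables that only exist in the RStudio session.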

Answer 2

Your code worked fine for me once I changed HADOOP_CMD and HADOOP_STREAMING to match my system configuration (I am using Hadoop 2.4.0 on Ubuntu 14.04).

My suggestions are:

  • Make sure a working Hadoop instance is running, i.e. the jps command in your terminal should list the Hadoop daemons. [screenshot of jps output omitted]

  • Make sure the rJava package gets loaded when you load library(rhdfs).
  • Make sure you are referring to the correct streaming JAR file.

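For the first point, a healthy single-node Hadoop 2.x installation typically lists the HDFS and YARN daemons in jps output: roughly NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager. A sketch of the check (the exact daemon set depends on your setup; guarded in case jps is not installed):

```shell
# List running JVM processes; on a working single-node Hadoop 2.x setup this
# should include NameNode, DataNode, SecondaryNameNode, ResourceManager and
# NodeManager (plus Jps itself). Guarded in case the JDK's jps is missing.
if command -v jps >/dev/null 2>&1; then
  jps
else
  echo "jps not on PATH - is a JDK installed?"
fi
```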
Below is the R code and its output:

Sys.setenv("HADOOP_CMD"="/usr/local/hadoop/bin/hadoop") 
Sys.setenv("HADOOP_STREAMING"="/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.4.0.jar") 

library(rhdfs) 
# Loading required package: rJava 
# HADOOP_CMD=/usr/local/hadoop/bin/hadoop 
# Be sure to run hdfs.init() 

hdfs.init() 
library(rmr2) 
ints = to.dfs(1:10) 
calc = mapreduce(input = ints, map = function(k, v) cbind(v, 2*v)) 

Output:

15/04/07 05:18:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
15/04/07 05:18:45 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces 
packageJobJar: [/usr/local/hadoop/data/hadoop-unjar1328285833881826794/] [] /tmp/streamjob6167004817219806828.jar tmpDir=null 
15/04/07 05:18:47 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8050 
15/04/07 05:18:47 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8050 
15/04/07 05:18:48 INFO mapred.FileInputFormat: Total input paths to process : 1 
15/04/07 05:18:49 INFO mapreduce.JobSubmitter: number of splits:2 
15/04/07 05:18:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1428363713092_0002 
15/04/07 05:18:49 INFO impl.YarnClientImpl: Submitted application application_1428363713092_0002 
15/04/07 05:18:50 INFO mapreduce.Job: The url to track the job: http://manohar-dt:8088/proxy/application_1428363713092_0002/ 
15/04/07 05:18:50 INFO mapreduce.Job: Running job: job_1428363713092_0002 
15/04/07 05:19:00 INFO mapreduce.Job: Job job_1428363713092_0002 running in uber mode : false 
15/04/07 05:19:00 INFO mapreduce.Job: map 0% reduce 0% 
15/04/07 05:19:15 INFO mapreduce.Job: map 50% reduce 0% 
15/04/07 05:19:16 INFO mapreduce.Job: map 100% reduce 0% 
15/04/07 05:19:17 INFO mapreduce.Job: Job job_1428363713092_0002 completed successfully 
15/04/07 05:19:17 INFO mapreduce.Job: Counters: 30 
    File System Counters 
     FILE: Number of bytes read=0 
     FILE: Number of bytes written=194356 
     FILE: Number of read operations=0 
     FILE: Number of large read operations=0 
     FILE: Number of write operations=0 
     HDFS: Number of bytes read=979 
     HDFS: Number of bytes written=919 
     HDFS: Number of read operations=14 
     HDFS: Number of large read operations=0 
     HDFS: Number of write operations=4 
    Job Counters 
     Launched map tasks=2 
     Data-local map tasks=2 
    Total time spent by all maps in occupied slots (ms)=25803 
    Total time spent by all reduces in occupied slots (ms)=0 
    Total time spent by all map tasks (ms)=25803 
    Total vcore-seconds taken by all map tasks=25803 
    Total megabyte-seconds taken by all map tasks=26422272 
    Map-Reduce Framework 
    Map input records=3 
    Map output records=3 
    Input split bytes=186 
    Spilled Records=0 
    Failed Shuffles=0 
    Merged Map outputs=0 
    GC time elapsed (ms)=293 
    CPU time spent (ms)=3640 
    Physical memory (bytes) snapshot=322818048 
    Virtual memory (bytes) snapshot=2107604992 
    Total committed heap usage (bytes)=223346688 
    File Input Format Counters 
    Bytes Read=793 
    File Output Format Counters 
     Bytes Written=919 
15/04/07 05:19:17 INFO streaming.StreamJob: Output directory: /tmp/file11d247219866 

Hope this helps.

Comments:

Hi Manohar, I wrote the same thing, but the problem is that it will not run, and that is what I cannot figure out. I have tried many combinations to fix this problem, and I know there is no problem with the code itself. Unfortunately, this answer does not help me. – Aman


I am using Hortonworks, and I think the paths for HADOOP_CMD and HADOOP_STREAMING are correct. I do not see any other problem apart from this one. – Aman


Hi Aman, is it possible to paste the full text of your error output? –