Hadoop streaming job failed (not successful) in Python

So my scripts work perfectly when I run: cat England.txt | ./mapperEngl.py | sort | ./reducerEngl.py
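For reference, both scripts follow the usual Hadoop streaming contract: read lines from stdin, write tab-separated key/value lines to stdout. A minimal sketch of that shape (hypothetical word-count logic — my actual mapperEngl.py/reducerEngl.py are not shown here):

```python
#!/usr/bin/env python
# Hypothetical sketch of the streaming contract; not the actual
# mapperEngl.py / reducerEngl.py from the question.
import sys

def map_line(line):
    """Emit (key, value) pairs for one raw input line."""
    for word in line.strip().split():
        yield word, 1

def reduce_pairs(pairs):
    """Sum values per key; assumes input sorted by key, as Hadoop guarantees."""
    current, total = None, 0
    for key, value in pairs:
        if key != current:
            if current is not None:
                yield current, total
            current, total = key, 0
        total += value
    if current is not None:
        yield current, total

if __name__ == "__main__":
    # Mapper mode: a reducer script would instead parse "key\tvalue" lines.
    for line in sys.stdin:
        for key, value in map_line(line):
            print("%s\t%s" % (key, value))
```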
However, when I run:
/shared/hadoop/current/bin/hadoop jar /shared/hadoop/cur/share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar -file /home/hadoop/mapperEngl.py -mapper /home/hadoop/mapperEngl.py -file /home/hadoop/reducerEngl.py -reducer /home/hadoop/reducerEngl.py -input /datadir/England.txt -output /outputdir/climateresults3.txt
I get the following error message:
16/05/03 09:27:15 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
16/05/03 09:27:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
packageJobJar: [/home/hadoop/mapperEngl.py, /home/hadoop/reducerEngl.py, /tmp/hadoop-unjar6814867016081507297/] [] /tmp/streamjob1585723008278678599.jar tmpDir=null
16/05/03 09:27:16 INFO client.RMProxy: Connecting to ResourceManager at mgmt-florida-poly-eth0/10.200.209.10:8032
16/05/03 09:27:16 INFO client.RMProxy: Connecting to ResourceManager at mgmt-florida-poly-eth0/10.200.209.10:8032
16/05/03 09:27:17 INFO mapred.FileInputFormat: Total input paths to process : 1
16/05/03 09:27:17 INFO mapreduce.JobSubmitter: number of splits:2
16/05/03 09:27:17 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1459438007195_0006
16/05/03 09:27:17 INFO impl.YarnClientImpl: Submitted application application_1459438007195_0006
16/05/03 09:27:17 INFO mapreduce.Job: The url to track the job: http://mgmt-florida-poly-eth0:8088/proxy/application_1459438007195_0006/
16/05/03 09:27:17 INFO mapreduce.Job: Running job: job_1459438007195_0006
16/05/03 09:27:25 INFO mapreduce.Job: Job job_1459438007195_0006 running in uber mode : false
16/05/03 09:27:25 INFO mapreduce.Job: map 0% reduce 0%
16/05/03 09:27:31 INFO mapreduce.Job: map 50% reduce 0%
16/05/03 09:27:32 INFO mapreduce.Job: map 100% reduce 0%
16/05/03 09:27:38 INFO mapreduce.Job: Task Id : attempt_1459438007195_0006_r_000000_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:134)
at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:237)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
16/05/03 09:27:45 INFO mapreduce.Job: Task Id : attempt_1459438007195_0006_r_000000_1, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:134)
at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:237)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
16/05/03 09:27:51 INFO mapreduce.Job: Task Id : attempt_1459438007195_0006_r_000000_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:134)
at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:237)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
16/05/03 09:27:58 INFO mapreduce.Job: map 100% reduce 100%
16/05/03 09:27:58 INFO mapreduce.Job: Job job_1459438007195_0006 failed with state FAILED due to: Task failed task_1459438007195_0006_r_000000
Job failed as tasks failed. failedMaps:0 failedReduces:1
16/05/03 09:27:58 INFO mapreduce.Job: Counters: 37
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=228560
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=29265
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=0
    Job Counters
        Failed reduce tasks=4
        Launched map tasks=2
        Launched reduce tasks=4
        Rack-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=134880
        Total time spent by all reduces in occupied slots (ms)=242432
        Total time spent by all map tasks (ms)=8430
        Total time spent by all reduce tasks (ms)=15152
        Total vcore-seconds taken by all map tasks=8430
        Total vcore-seconds taken by all reduce tasks=15152
        Total megabyte-seconds taken by all map tasks=17264640
        Total megabyte-seconds taken by all reduce tasks=31031296
    Map-Reduce Framework
        Map input records=107
        Map output records=223
        Map output bytes=9014
        Map output materialized bytes=9472
        Input split bytes=202
        Combine input records=0
        Spilled Records=223
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=0
        CPU time spent (ms)=1540
        Physical memory (bytes) snapshot=1305165824
        Virtual memory (bytes) snapshot=5482422272
        Total committed heap usage (bytes)=2022440960
    File Input Format Counters
        Bytes Read=29063
16/05/03 09:27:58 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!
I have tried solutions from other questions, and they don't seem to work.
Yeah, just completely stuck here.
I haven't worked with hadoop in a while, but I think there should be a way to get the stderr of the failing task; it should help you understand why the job is unhappy. –
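The stderr the commenter mentions can typically be pulled with the YARN log-aggregation CLI, using the application id from the output above (a sketch — this assumes log aggregation is enabled on the cluster):

```shell
# Fetch aggregated container logs (including the reducer's stderr)
# for the failed job shown in the output above.
yarn logs -applicationId application_1459438007195_0006 | less
```

Alternatively, the task attempt logs are reachable through the tracking URL printed by the job (http://mgmt-florida-poly-eth0:8088/...).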
What is the input format & what does the mapper/reducer do? – Akarsh