2016-08-06

Where to physically find a replicated block

I have 2 data nodes + 1 name node in my cluster. I can see that my file emails.json is replicated on both data nodes. Based on the fsck output, the replicas are located here:

  • 192.168.99.1:50010 192.168.99.100:50010

I can find the physical location of the file on one of the servers.

Server # 1

[[email protected] ~]$ cat $HADOOP_CONF_DIR/hdfs-site.xml | grep datanode 
     <name>dfs.datanode.data.dir</name> 
     <value>/home/raviramadoss/datadir/datanode/dir1, /home/raviramadoss/datadir/datanode/dir2, /home/raviramadoss/datadir/datanode/dir3</value> 
[[email protected] ~]$ pwd 
/home/raviramadoss 
[[email protected] ~]$ find . | xargs grep 'his email should be filtered out' 2> /dev/null 
./datadir/datanode/dir1/current/BP-277552337-172.20.10.2-1470405150547/current/finalized/subdir0/subdir0/blk_1073741829:{"from":"[email protected]","to":"[email protected]","body":"This email should be filtered out"} 

But I cannot seem to find the replica on the second data node.

Server # 2

ravis-MacBook-Pro:datadir raviramadoss$ cat /Users/raviramadoss/Downloads/hadoop-2.7.2/etc/hadoop/hdfs-site.xml | grep datanode 
     <name>dfs.datanode.data.dir</name> 
     <value>/Users/raviramadoss/datadir/datanode/dir1, Users/raviramadoss/datadir/datanode/dir2, Users/raviramadoss/datadir/datanode/dir3</value> 
ravis-MacBook-Pro:datadir raviramadoss$ cat $HADOOP_CONF_DIR/hdfs-site.xml | grep datanode 
     <name>dfs.datanode.data.dir</name> 
     <value>/Users/raviramadoss/datadir/datanode/dir1, Users/raviramadoss/datadir/datanode/dir2, Users/raviramadoss/datadir/datanode/dir3</value> 
ravis-MacBook-Pro:datadir raviramadoss$ pwd 
/Users/raviramadoss/datadir 
ravis-MacBook-Pro:datadir raviramadoss$ find . | xargs grep 'his email should be filtered out' 2> /dev/null 

FSCK command output

hadoop fsck /users/raviramadoss/emails.json -locations -files -blocks 
DEPRECATED: Use of this script to execute hdfs command is deprecated. 
Instead use the hdfs command for it. 

16/08/06 16:41:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
Connecting to namenode via http://kirikou.worldofthe.com:50070/fsck?ugi=raviramadoss&locations=1&files=1&blocks=1&path=%2Fusers%2Fraviramadoss%2Femails.json 
FSCK started by raviramadoss (auth:SIMPLE) from /192.168.99.1 for path /users/raviramadoss/emails.json at Sat Aug 06 16:41:03 IST 2016 
/users/raviramadoss/emails.json 207 bytes, 1 block(s): Under replicated BP-277552337-172.20.10.2-1470405150547:blk_1073741829_1005. Target Replicas is 3 but found 2 replica(s). 
0. BP-277552337-172.20.10.2-1470405150547:blk_1073741829_1005 len=207 repl=2 [DatanodeInfoWithStorage[192.168.99.1:50010,DS-69e0ae16-85b8-4a7b-ae82-bd9e195aa946,DISK], DatanodeInfoWithStorage[192.168.99.100:50010,DS-2d2d3e94-58a9-465c-860d-90188838b675,DISK]] 

Status: HEALTHY 
Total size: 207 B 
Total dirs: 0 
Total files: 1 
Total symlinks:  0 
Total blocks (validated): 1 (avg. block size 207 B) 
Minimally replicated blocks: 1 (100.0 %) 
Over-replicated blocks: 0 (0.0 %) 
Under-replicated blocks: 1 (100.0 %) 
Mis-replicated blocks:  0 (0.0 %) 
Default replication factor: 2 
Average block replication: 2.0 
Corrupt blocks:  0 
Missing replicas:  1 (33.333332 %) 
Number of data-nodes:  2 
Number of racks:  1 
FSCK ended at Sat Aug 06 16:41:03 IST 2016 in 1 milliseconds 


The filesystem under path '/users/raviramadoss/emails.json' is HEALTHY 
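
The fsck output above already names the block, so a replica can also be looked up by file name instead of by grepping file contents. The following is only a minimal sketch under stated assumptions: the helper names `parse_block_line` and `find_block_files` are made up for illustration, the regular expressions are fitted to the one block line shown above, and the block file is assumed to be named after the block ID without the generation stamp (as in the `blk_1073741829` path found on server #1).

```python
import os
import re

# Fitted to the block entry printed by `fsck -files -blocks -locations`, e.g.
# "0. BP-...:blk_1073741829_1005 len=207 repl=2 [DatanodeInfoWithStorage[...], ...]"
BLOCK_RE = re.compile(r"(BP-[\w.\-]+):(blk_\d+)_(\d+)\s+len=(\d+)")
NODE_RE = re.compile(r"DatanodeInfoWithStorage\[([\d.]+:\d+),")


def parse_block_line(line):
    """Extract block pool, file names and datanode addresses from one fsck line.

    Hypothetical helper: returns None if the line is not a block entry.
    """
    match = BLOCK_RE.search(line)
    if match is None:
        return None
    pool, block, genstamp, length = match.groups()
    return {
        "pool": pool,
        "block_file": block,                       # data file on the datanode
        "meta_file": f"{block}_{genstamp}.meta",   # checksum file next to it
        "length": int(length),
        "datanodes": NODE_RE.findall(line),
    }


def find_block_files(data_dirs, block_file):
    """Walk each configured data directory and collect paths named block_file.

    Directories that do not exist are skipped silently by os.walk.
    """
    hits = []
    for root_dir in data_dirs:
        for dirpath, _dirnames, filenames in os.walk(root_dir):
            if block_file in filenames:
                hits.append(os.path.join(dirpath, block_file))
    return hits
```

On each datanode, `find_block_files` would be called with the directories listed in that node's `dfs.datanode.data.dir`. If it returns nothing although fsck reports a replica on that node, the configured directories are probably not the ones the datanode is actually writing to.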

Answer

I found out that I was missing a leading / in dfs.datanode.data.dir for server #2 (the dir2 and dir3 entries start with Users/ instead of /Users/). Once this was fixed, I was able to find the replica.
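
A check like this can be automated. The sketch below reads an hdfs-site.xml and reports `dfs.datanode.data.dir` entries that are not absolute. The function names are hypothetical, and the notion of "absolute" is a simplification: an entry counts as fine if it starts with `/` or carries a URI scheme such as `file:`, after an optional storage-type tag like `[DISK]` has been stripped.

```python
import re
import xml.etree.ElementTree as ET

STORAGE_TAG_RE = re.compile(r"^\[\w+\]")             # e.g. "[SSD]" prefix
URI_SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")  # e.g. "file:"


def data_dir_entries(hdfs_site_path, key="dfs.datanode.data.dir"):
    """Return the comma-separated entries configured for `key`, trimmed."""
    root = ET.parse(hdfs_site_path).getroot()
    for prop in root.iter("property"):
        if (prop.findtext("name", "") or "").strip() == key:
            value = prop.findtext("value", "") or ""
            return [e.strip() for e in value.split(",") if e.strip()]
    return []


def relative_entries(entries):
    """Return entries that are neither absolute paths nor URIs (simplified)."""
    suspicious = []
    for entry in entries:
        bare = STORAGE_TAG_RE.sub("", entry)
        if not (bare.startswith("/") or URI_SCHEME_RE.match(bare)):
            suspicious.append(entry)
    return suspicious
```

Run against the server #2 configuration shown in the question, `relative_entries` would flag the dir2 and dir3 entries, which are the ones missing the leading slash.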