Where to physically find a replicated block

I have 2 datanodes + 1 namenode in my cluster. I can see that my file emails.json is replicated on both datanodes. Based on the fsck output, I can see the locations here:
- 192.168.99.1:50010 192.168.99.100:50010
I am able to find the physical location of the file on one of the servers.
Server # 1
[[email protected] ~]$ cat $HADOOP_CONF_DIR/hdfs-site.xml | grep datanode
<name>dfs.datanode.data.dir</name>
<value>/home/raviramadoss/datadir/datanode/dir1, /home/raviramadoss/datadir/datanode/dir2, /home/raviramadoss/datadir/datanode/dir3</value>
[[email protected] ~]$ pwd
/home/raviramadoss
[[email protected] ~]$ find . | xargs grep 'his email should be filtered out' 2> /dev/null
./datadir/datanode/dir1/current/BP-277552337-172.20.10.2-1470405150547/current/finalized/subdir0/subdir0/blk_1073741829:{"from":"[email protected]","to":"[email protected]","body":"This email should be filtered out"}
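The search above walks the whole home directory and greps every file it meets. A narrower variant might look like the sketch below. It assumes that block data files are named `blk_<id>`, as the match above shows, and that each has a companion `.meta` checksum file that can be skipped. The function name `grep_blocks` is hypothetical, not part of Hadoop.

```shell
#!/bin/sh
# grep_blocks DIR PATTERN
# Hypothetical helper: print the block files under DIR whose content
# matches PATTERN.
grep_blocks() {
    dir=$1
    pattern=$2
    # Restrict the search to block data files (blk_<id>) and skip the
    # .meta checksum files, then let grep list the files that match.
    find "$dir" -type f -name 'blk_*' ! -name '*.meta' \
        -exec grep -l -- "$pattern" {} + 2>/dev/null
}

# Possible use on this server:
#   grep_blocks /home/raviramadoss/datadir/datanode 'his email should be filtered out'
```

Passing the paths through `find -exec ... {} +` instead of `xargs` also avoids trouble with file names that contain spaces.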
But I can't seem to find the replica on the second datanode.
Server # 2
ravis-MacBook-Pro:datadir raviramadoss$ cat /Users/raviramadoss/Downloads/hadoop-2.7.2/etc/hadoop/hdfs-site.xml | grep datanode
<name>dfs.datanode.data.dir</name>
<value>/Users/raviramadoss/datadir/datanode/dir1, Users/raviramadoss/datadir/datanode/dir2, Users/raviramadoss/datadir/datanode/dir3</value>
ravis-MacBook-Pro:datadir raviramadoss$ cat $HADOOP_CONF_DIR/hdfs-site.xml | grep datanode
<name>dfs.datanode.data.dir</name>
<value>/Users/raviramadoss/datadir/datanode/dir1, Users/raviramadoss/datadir/datanode/dir2, Users/raviramadoss/datadir/datanode/dir3</value>
ravis-MacBook-Pro:datadir raviramadoss$ pwd
/Users/raviramadoss/datadir
ravis-MacBook-Pro:datadir raviramadoss$ find . | xargs grep 'his email should be filtered out' 2> /dev/null
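Since the search on this machine prints nothing, it may help to confirm which directories the datanode is configured to write to. The sketch below splits a `dfs.datanode.data.dir` value on commas, trims the surrounding whitespace, and reports for each entry whether it is an absolute path and whether it exists. In the value printed above, the second and third entries appear without a leading slash. Whether that is a paste artifact or the real configuration, and how Hadoop resolves such entries, is not assumed here. `check_data_dirs` is a hypothetical name.

```shell
#!/bin/sh
# check_data_dirs VALUE
# Hypothetical helper: VALUE is the comma-separated text found inside
# <value>...</value> for dfs.datanode.data.dir.
# Prints one line per entry: <absolute|relative> <exists|missing> <path>
# URI-style entries (file:///...) are not handled by this sketch.
check_data_dirs() {
    # One entry per line, then strip leading and trailing whitespace.
    printf '%s\n' "$1" | tr ',' '\n' |
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
    while IFS= read -r entry; do
        [ -n "$entry" ] || continue
        case $entry in
            /*) kind=absolute ;;
            *)  kind=relative ;;
        esac
        if [ -d "$entry" ]; then state=exists; else state=missing; fi
        printf '%s\t%s\t%s\n' "$kind" "$state" "$entry"
    done
}

# Possible use, pasting the value shown above:
#   check_data_dirs '/Users/raviramadoss/datadir/datanode/dir1, Users/raviramadoss/datadir/datanode/dir2'
```

A relative entry is tested against the current working directory, so the result for such a line depends on where the function is run.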
FSCK command output
hadoop fsck /users/raviramadoss/emails.json -locations -files -blocks
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
16/08/06 16:41:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://kirikou.worldofthe.com:50070/fsck?ugi=raviramadoss&locations=1&files=1&blocks=1&path=%2Fusers%2Fraviramadoss%2Femails.json
FSCK started by raviramadoss (auth:SIMPLE) from /192.168.99.1 for path /users/raviramadoss/emails.json at Sat Aug 06 16:41:03 IST 2016
/users/raviramadoss/emails.json 207 bytes, 1 block(s): Under replicated BP-277552337-172.20.10.2-1470405150547:blk_1073741829_1005. Target Replicas is 3 but found 2 replica(s).
0. BP-277552337-172.20.10.2-1470405150547:blk_1073741829_1005 len=207 repl=2 [DatanodeInfoWithStorage[192.168.99.1:50010,DS-69e0ae16-85b8-4a7b-ae82-bd9e195aa946,DISK], DatanodeInfoWithStorage[192.168.99.100:50010,DS-2d2d3e94-58a9-465c-860d-90188838b675,DISK]]
Status: HEALTHY
Total size: 207 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 1 (avg. block size 207 B)
Minimally replicated blocks: 1 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 1 (100.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 1 (33.333332 %)
Number of data-nodes: 2
Number of racks: 1
FSCK ended at Sat Aug 06 16:41:03 IST 2016 in 1 milliseconds
The filesystem under path '/users/raviramadoss/emails.json' is HEALTHY
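The fsck output names the block (`blk_1073741829`) and the datanodes that hold it, so a replica can also be looked up by file name rather than by content. This is a minimal sketch, assuming the fsck output has been saved to a file and that block files on disk are named after the block ID, as on Server # 1. `block_ids` and `find_block` are hypothetical helper names.

```shell
#!/bin/sh
# block_ids FILE
# Print the distinct block IDs (blk_<number>) mentioned in saved fsck output.
# The generation stamp suffix (e.g. _1005) is not part of the match.
block_ids() {
    grep -oE 'blk_[0-9]+' "$1" | sort -u
}

# find_block ID DIR...
# Print the block data file named ID and its .meta companion, if present,
# under the given directories.
find_block() {
    id=$1
    shift
    find "$@" -type f \( -name "$id" -o -name "${id}_*.meta" \) 2>/dev/null
}

# Possible use on the second machine, for a single-block file:
#   hdfs fsck /users/raviramadoss/emails.json -files -blocks -locations > fsck.out
#   find_block "$(block_ids fsck.out)" /Users/raviramadoss/datadir/datanode
```

Searching by name does not depend on the block's content, so it still works for binary or compressed files where a text grep would find nothing.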