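Hadoop (Part 2): HDFS
These are study notes based on the 尚硅谷 (Atguigu) Big Data Hadoop 3.x video course.
HDFS Overview
Background and Definition of HDFS
Advantages and Disadvantages of HDFS
HDFS Architecture
HDFS Block Size (key interview topic)
HDFS Shell Operations
Basic Syntax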
[eitan@hadoop102 ~]$ hadoop fs
Common Commands in Practice
Preparation
Start the Hadoop cluster
-help: print usage information for a command
[eitan@hadoop102 ~]$ hadoop fs -help rm
-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ... :
  Delete all files that match the specified file pattern. Equivalent to the Unix
  command "rm <src>"

  -f          If the file does not exist, do not display a diagnostic message or
              modify the exit status to reflect an error.
  -[rR]       Recursively deletes directories.
  -skipTrash  option bypasses trash, if enabled, and immediately deletes <src>.
  -safely     option requires safety confirmation, if enabled, requires
              confirmation before deleting large directory with more than
              <hadoop.shell.delete.limit.num.files> files. Delay is expected when
              walking over large directory recursively to count the number of
              files to be deleted before the confirmation.
Create the /sanguo directory
[eitan@hadoop102 ~]$ hadoop fs -mkdir /sanguo
Upload
-moveFromLocal: cut and paste a file from the local file system to HDFS
[eitan@hadoop102 ~]$ echo shuguo >> shuguo.txt
[eitan@hadoop102 ~]$ hadoop fs -moveFromLocal shuguo.txt /sanguo
-copyFromLocal: copy a file from the local file system to an HDFS path
[eitan@hadoop102 ~]$ echo weiguo >> ./documents/txt/weiguo.txt
[eitan@hadoop102 ~]$ hadoop fs -copyFromLocal ./documents/txt/weiguo.txt /sanguo
-put: equivalent to -copyFromLocal; -put is the more common choice in production
[eitan@hadoop102 ~]$ echo wuguo >> ./documents/txt/wuguo.txt
[eitan@hadoop102 ~]$ hadoop fs -put ./documents/txt/wuguo.txt /sanguo
-appendToFile: append a local file to the end of a file that already exists in HDFS
[eitan@hadoop102 ~]$ hadoop fs -appendToFile ./documents/txt/liubei.txt /sanguo/shuguo.txt
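Re-running -put against a target that already exists fails with a "File exists" error unless -f is passed. The sketch below is one way to make an upload step safe to repeat: it checks the target with `hadoop fs -test -e` first. The helper name `upload_once` is hypothetical and not part of Hadoop; the paths in the usage note are the ones used in this post.

```shell
# Sketch: upload a local file only if it is not already present in HDFS.
# upload_once is a hypothetical helper name, not a Hadoop command.
upload_once() {
  src="$1"
  dest_dir="$2"
  # Build the full HDFS target path, avoiding a double slash.
  target="${dest_dir%/}/$(basename "$src")"
  # `hadoop fs -test -e` exits with status 0 when the path exists.
  if hadoop fs -test -e "$target"; then
    echo "skip: $target already exists"
  else
    hadoop fs -put "$src" "$dest_dir"
  fi
}
```

On the cluster used here it could be called as `upload_once ./documents/txt/wuguo.txt /sanguo`. Skipping rather than overwriting is a deliberate choice in this sketch; passing -f to -put would overwrite instead.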
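Download
-copyToLocal / -get: copy from HDFS to the local file system; -get is the more common choice in production. The two are interchangeable, so use one or the other:
[eitan@hadoop102 ~]$ hadoop fs -get /sanguo/shuguo.txt ./documents/txt/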
Direct HDFS Operations
-ls: list directory information
[eitan@hadoop102 ~]$ hadoop fs -ls /sanguo
Found 3 items
-rw-r--r-- 3 eitan supergroup 14 2022-05-09 19:42 /sanguo/shuguo.txt
-rw-r--r-- 3 eitan supergroup 7 2022-05-09 19:35 /sanguo/weiguo.txt
-rw-r--r-- 3 eitan supergroup 6 2022-05-09 19:37 /sanguo/wuguo.txt
-cat: display the contents of a file
[eitan@hadoop102 ~]$ hadoop fs -cat /sanguo/shuguo.txt
shuguo
liubei
-chgrp, -chmod, -chown: same usage as on a Linux file system; change a file's group, permissions, and owner
[eitan@hadoop102 ~]$ hadoop fs -chmod 666 /sanguo/shuguo.txt
[eitan@hadoop102 ~]$ hadoop fs -chown eitan:eitan /sanguo/shuguo.txt
-mkdir: create a directory
[eitan@hadoop102 ~]$ hadoop fs -mkdir /jinguo
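Going back to the -chmod 666 call earlier: each octal digit encodes the read (4), write (2), and execute (1) bits for the owner, the group, and others, in that order. As a minimal illustration, the pure-shell sketch below expands an octal mode into the rwx form that -ls prints. `mode_to_symbolic` is a hypothetical helper and needs no Hadoop installation.

```shell
# Sketch: expand an octal mode such as 666 into its rwx form.
# mode_to_symbolic is a hypothetical helper, not a Hadoop command.
mode_to_symbolic() {
  out=""
  # Split the mode into single digits: "666" becomes "6 6 6".
  for digit in $(printf '%s' "$1" | sed 's/./& /g'); do
    case "$digit" in
      0) out="${out}---" ;;
      1) out="${out}--x" ;;
      2) out="${out}-w-" ;;
      3) out="${out}-wx" ;;
      4) out="${out}r--" ;;
      5) out="${out}r-x" ;;
      6) out="${out}rw-" ;;
      7) out="${out}rwx" ;;
      *) echo "invalid octal digit: $digit" >&2; return 1 ;;
    esac
  done
  printf '%s\n' "$out"
}

mode_to_symbolic 666   # prints rw-rw-rw-
mode_to_symbolic 644   # prints rw-r--r--
```

The second call reproduces the permission string in the -ls listing above, minus the leading file-type character.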
-cp: copy from one HDFS path to another HDFS path
[eitan@hadoop102 ~]$ hadoop fs -cp /sanguo/shuguo.txt /jinguo
-mv: move files between HDFS directories
[eitan@hadoop102 ~]$ hadoop fs -mv /sanguo/wuguo.txt /jinguo
[eitan@hadoop102 ~]$ hadoop fs -mv /sanguo/weiguo.txt /jinguo
-tail: display the last 1 KB of a file
[eitan@hadoop102 ~]$ hadoop fs -tail /jinguo/shuguo.txt
shuguo
liubei
-rm: delete a file or directory
[eitan@hadoop102 ~]$ hadoop fs -rm /sanguo/shuguo.txt
Deleted /sanguo/shuguo.txt
-rm -r: recursively delete a directory and everything in it
[eitan@hadoop102 ~]$ hadoop fs -rm -r /sanguo
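Because -rm -r removes a whole tree, and the deletion is immediate when the trash feature is not enabled, a small guard in front of the command can help in scripts. The sketch below is a hypothetical wrapper, `guarded_rm_r`, that rejects an empty path, the root path, and relative paths (which HDFS resolves against the user's home directory), and otherwise only prints the command it would run. It is a dry run under those assumptions, not a replacement for the -safely option shown in the help text.

```shell
# Sketch: refuse obviously dangerous targets before a recursive delete.
# guarded_rm_r is a hypothetical wrapper; it prints the command instead of running it.
guarded_rm_r() {
  path="$1"
  case "$path" in
    ""|"/")
      echo "refusing to delete '$path'" >&2
      return 1 ;;
    /*)
      ;;  # absolute path other than the root: accepted
    *)
      echo "refusing relative path '$path'" >&2
      return 1 ;;
  esac
  echo "hadoop fs -rm -r $path"
}

guarded_rm_r /sanguo   # prints: hadoop fs -rm -r /sanguo
```

Replacing the final `echo` with the real command would turn the dry run into an actual delete.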
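-du: report size information for a directory. The first column is the file size and the second is the disk space consumed across all replicas.
[eitan@hadoop102 ~]$ hadoop fs -du /jinguo
14 42 /jinguo/shuguo.txt
7 21 /jinguo/weiguo.txt
6 18 /jinguo/wuguo.txt
[eitan@hadoop102 ~]$ hadoop fs -du -s /jinguo
27 179 /jinguo
-setrep: set the replication factor of a file in HDFS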
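[eitan@hadoop102 ~]$ hadoop fs -setrep 10 /jinguo/shuguo.txt
Replication 10 set: /jinguo/shuguo.txt
The replication factor set here is only recorded in the NameNode's metadata; whether that many replicas actually exist depends on the number of DataNodes. With only 3 machines at present, there can be at most 3 replicas. The replica count reaches 10 only once the cluster grows to 10 nodes.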
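The arithmetic behind this can be sketched in plain shell. The two helpers below are hypothetical and need no cluster. `effective_replicas` takes the smaller of the requested factor and the DataNode count, since HDFS keeps at most one replica of a block per DataNode. `space_consumed` assumes the second column of -du is the file size multiplied by the recorded replication factor, which is what the figures in this post suggest: the 179 in the -du -s output is not 3 × 27 = 81, but it does match shuguo.txt counted at a factor of 10, so that total was presumably captured after the -setrep call.

```shell
# Sketch: replica and space arithmetic, integers only.
# Both function names are hypothetical helpers, not Hadoop commands.

# Replicas that can physically exist: at most one per DataNode.
effective_replicas() {
  requested="$1"
  datanodes="$2"
  if [ "$requested" -lt "$datanodes" ]; then
    echo "$requested"
  else
    echo "$datanodes"
  fi
}

# Assumed meaning of the second -du column: size * recorded replication factor.
space_consumed() {
  echo $(( $1 * $2 ))
}

effective_replicas 10 3   # prints 3
space_consumed 14 3       # prints 42, the -du line for shuguo.txt
# shuguo.txt at factor 10, the other two files at factor 3:
echo $(( $(space_consumed 14 10) + $(space_consumed 7 3) + $(space_consumed 6 3) ))   # prints 179
```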