excel.onlineclasses@gmail.com
http://www.excelonlineclasses.co.nr/
HDFS
Nagarjuna K
HDFS
Distributed FS designed to run on commodity hardware
Provides
moving data
than
Hardware Failure
Assumptions & Goals
HDFS
Chances
HDFS
Emphasis on data throughput
Large Datasets
Assumptions & Goals
Computation-intensive programming
Data-intensive programming
Lots of small files
Multiple
Lots of small files
in HDFS Compress
All the metadata is stored in the NameNode's memory
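To make the small-files point concrete, here is a rough sketch of how metadata grows with file count. It assumes the commonly quoted rule of thumb of about 150 bytes of memory per file or block object; that figure and the helper name `estimate_metadata_bytes` are assumptions for illustration and do not come from the slides.

```python
import math

BYTES_PER_OBJECT = 150          # assumed rule of thumb, not a measured value
BLOCK_SIZE = 64 * 1024 * 1024   # 64 MB default block size


def estimate_metadata_bytes(num_files, file_size_bytes):
    """Rough metadata footprint: one object per file plus one per block."""
    blocks_per_file = max(1, math.ceil(file_size_bytes / BLOCK_SIZE))
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT


# The same 1 TiB of data stored two ways.
total = 1024 ** 4
small = estimate_metadata_bytes(total // 1024 ** 2, 1024 ** 2)  # 1 MiB files
large = estimate_metadata_bytes(total // 1024 ** 3, 1024 ** 3)  # 1 GiB files
print(small, large)
```

Under these assumptions the 1 MiB layout needs roughly 300 MB of metadata memory while the 1 GiB layout needs under 3 MB, even though the stored data is identical.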
Blocks
disc
read/write
512 bytes
FileSystem
Blocks
In
Newer
Blocks
HDFS
64 MB
Unlike
vs Latency
Seek time = 10 ms
Transfer rate (throughput) = 100 MB/s
make
Default is 64 MB
As the transfer rate increases, the block size can be increased
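The relationship between transfer rate and block size can be sketched with a small calculation. It takes the 10 ms figure as the disk seek time and assumes a target where seeking costs about 1% of the transfer time; the 1% target and the function name `block_size_for_seek_overhead` are assumptions for illustration, not values from the slides.

```python
def block_size_for_seek_overhead(seek_time_s, transfer_rate_bytes_per_s, seek_fraction):
    """Block size at which seek time is `seek_fraction` of the transfer time."""
    # If the seek should be only a fraction of the transfer time,
    # the transfer itself has to last seek_time / fraction seconds.
    transfer_time_s = seek_time_s / seek_fraction
    return transfer_rate_bytes_per_s * transfer_time_s


MB = 10 ** 6
size = block_size_for_seek_overhead(0.010, 100 * MB, 0.01)
faster = block_size_for_seek_overhead(0.010, 200 * MB, 0.01)
print(size / MB, faster / MB)
```

With these inputs the block size comes out at about 100 MB, the same order of magnitude as the 64 MB default, and doubling the transfer rate doubles the block size needed to keep the same seek overhead.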
hadoop
corrupt ?
etc.,
File Permissions on HDFS
Clients
identity determined
Sharing
Going forward: Kerberos authentication