
Sunday, November 3, 2019

HDFS Architecture concepts

Hive internally uses MR; it was developed at Facebook.
Pig is used for cleaning data; the same can be done with Spark.
Sqoop moves data in both directions between HDFS and an RDBMS.

HBase : a NoSQL DB similar to Mongo, used for random access.
It helps in doing quick lookups.

Spark : a general-purpose in-memory computing engine; an alternative to MR.
It is a plug-and-play framework that can work with HDFS or the S3 filesystem,
and can run on Mesos or YARN.
It is 10 to 100 times faster compared to MR.

The same code can be used for cleansing and transforming.

Spark supports batch and stream processing.


HDFS :
Hadoop Distributed File System.

A cluster can be of any size. Each machine might have a config like:
cpu  - 4 cores
ram  - 8 GB
disk - 1 TB HDD


Let's assume a 1 GB file that we need to persist with the name file1.
We could store it on a single machine or spread it across the cluster.

HDFS divides the big file into small parts (say 4, assuming a 4-node cluster) and pushes each part to a data node.

block, input split :
HDFS has a block size; by default it is 128 MB in HDP2.

In HDP1 the block size is 64 MB.

If the file size is 500 MB, then 3 blocks will be 128 MB each, and the
remaining block will be 500 - 3*128 = 116 MB; the unused space in that last
block is released (a block only occupies as much disk as the data it holds).
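The block math above can be sketched as a small Python example (`split_into_blocks` is an illustrative name, not a real HDFS API):

```python
# Sketch: how a file is split into HDFS-style blocks.
# Block size and file size are the examples from the notes (HDP2 default).
BLOCK_SIZE_MB = 128

def split_into_blocks(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Return the sizes of the blocks a file would occupy."""
    full, remainder = divmod(file_size_mb, block_size_mb)
    blocks = [block_size_mb] * full
    if remainder:
        blocks.append(remainder)  # the last block only uses what it needs
    return blocks

print(split_into_blocks(500))  # → [128, 128, 128, 116]
```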

If we are copying onto a 3-machine cluster, multiple blocks of the same file can land on the same node.


Every block location is registered with the NameNode.
file   block  node    replica
file1  b1     node1   node2
file1  b2     node2   node1
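The table above amounts to a nested mapping; a hypothetical sketch of that NameNode bookkeeping (the dict structure is illustrative, not the real metadata format):

```python
# Sketch: the NameNode's view of file -> block -> data nodes,
# mirroring the small table in the notes.
namenode_metadata = {
    "file1": {
        "b1": ["node1", "node2"],  # primary location, then replica
        "b2": ["node2", "node1"],
    }
}

def locate(file_name, block_id):
    """Return the data nodes holding a given block, as the NameNode would."""
    return namenode_metadata[file_name][block_id]

print(locate("file1", "b1"))  # → ['node1', 'node2']
```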


The master, or NameNode, holds all the metadata about the data nodes.
Analogy: in a book, the index page is the NameNode and the chapters are the data nodes' contents.


Commodity hardware : cheap hardware that can fail.
The NameNode, however, is not expected to fail, so it runs on better hardware.


To tolerate hardware failure, we have a replication factor.
By default we keep 3 copies of each block.


How does the NameNode know about a data node failure?


Every 3 seconds, each data node sends a heartbeat to the NameNode saying it is alive.
If the NameNode does not receive a heartbeat 10 consecutive times, it
assumes the data node is slow or dead.

The metadata about that data node is removed, and the blocks it held are
re-replicated from the surviving copies to other nodes.
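The heartbeat rule above (3-second interval, 10 misses) can be sketched as:

```python
import time

HEARTBEAT_INTERVAL_S = 3      # from the notes
MAX_MISSED_HEARTBEATS = 10    # from the notes

def is_node_dead(last_heartbeat_ts, now=None):
    """A node is presumed dead once 10 heartbeat intervals pass with no signal."""
    now = time.time() if now is None else now
    return (now - last_heartbeat_ts) > HEARTBEAT_INTERVAL_S * MAX_MISSED_HEARTBEATS

print(is_node_dead(last_heartbeat_ts=0, now=31))  # → True  (31s > 30s window)
print(is_node_dead(last_heartbeat_ts=0, now=15))  # → False (still within window)
```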

(To check: is the data node then removed from the cluster?)

A typical data node config:
64 GB RAM
3 to 4 TB hard disk
16 CPU cores


NameNode : will have more RAM (the metadata is kept in memory).

What if the NameNode fails?
That is a NameNode failure. For this we have a Secondary NameNode (note: it
checkpoints metadata to aid recovery; it is not an automatic hot standby).


fsimage and editlogs are the metadata files used by the NameNode.
They are stored in a shared location.

fsimage : a fixed snapshot of the namespace at a point in time.
Analogy : a cricket team's score at the 40th over.


editlogs : incremental changes go to the editlogs.
Analogy : the runs scored from the 40th over up to now (40.2 overs).

The editlogs are merged into the fsimage at some interval, and the editlogs
are then reset to empty.

Merging is a costly activity, so the Secondary NameNode does the merge
periodically (by default every hour, or after a configured number of
transactions) and puts the latest fsimage copy back in the shared location.
This process is called checkpointing.
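Checkpointing can be sketched in miniature as merging a log of edits into a snapshot (the dict/list structures here are illustrative, not the real fsimage/editlog formats):

```python
# Sketch: the checkpointing cycle in miniature.
# fsimage is a snapshot (path -> blocks); editlog is a list of pending changes.
fsimage = {"/file1": "b1,b2"}
editlog = [("add", "/file2", "b3"), ("delete", "/file1", None)]

def checkpoint(fsimage, editlog):
    """Merge the edit log into the fsimage, then reset the log to empty."""
    for op, path, blocks in editlog:
        if op == "add":
            fsimage[path] = blocks
        elif op == "delete":
            fsimage.pop(path, None)
    editlog.clear()
    return fsimage

checkpoint(fsimage, editlog)
print(fsimage)  # → {'/file2': 'b3'}
print(editlog)  # → []
```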


In Hadoop 1 the NameNode is a single point of failure. Hadoop 2 adds
NameNode High Availability (a standby NameNode) on top of the Secondary
NameNode's checkpointing.

If that node also fails, the Hadoop admin team has to bring a node back up
manually.

If the block size is large, we have reduced parallelism.
If the block size is small, we have increased parallelism, but it should not
be too small: many small blocks become a burden on the NameNode (more metadata).
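One map task typically works on one block, so the block count for a fixed file size sets the available parallelism. A quick sketch of the trade-off:

```python
import math

def num_blocks(file_size_mb, block_size_mb):
    """One map task per block, so the block count approximates parallelism."""
    return math.ceil(file_size_mb / block_size_mb)

# The same 1 GB file under different block sizes:
for bs in (256, 128, 64):
    print(bs, "MB blocks ->", num_blocks(1024, bs), "blocks")
# Smaller blocks give more parallelism, but also more block entries
# for the NameNode to track in memory.
```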


______________
Rack Awareness
______________

A rack is a group of machines. For disaster recovery we have multiple racks,
and racks can be geographically distributed.

Usually we store replicas in different racks to achieve fault-tolerant replication.

Hadoop stores the 3 copies across 2 racks; this is called the rack awareness
mechanism.

If we used 3 racks for 3 copies, the write bandwidth cost would be too high,
so 2 racks with 3 copies is the better approach.
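The "2 racks, 3 copies" placement can be sketched as follows (simplified; the real default policy also weighs writer locality and node load):

```python
# Sketch of the default "2 racks, 3 copies" placement from the notes:
# one replica on the writer's rack, the other two together on one remote rack.
def place_replicas(local_rack, other_racks):
    """Return (rack, replica_count) pairs for a 3-copy write."""
    remote = other_racks[0]  # pick one remote rack for the remaining copies
    return [(local_rack, 1), (remote, 2)]

print(place_replicas("rack1", ["rack2", "rack3"]))  # → [('rack1', 1), ('rack2', 2)]
```

Keeping two of the three copies on a single remote rack means a write crosses the rack boundary only once, which is why this beats spreading over three racks.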



_______________
Corrupted block
_______________

The block report tells the NameNode about corrupted blocks.
Replication takes care of re-creating them; no manual intervention is needed.

A block scanner runs on each data node and sends this info to the NameNode.

dfs.replication is the config; the default is 3.

In a local setup / pseudo-distributed mode it will be 1 on the local machine.
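These defaults map to hdfs-site.xml; a minimal fragment (the property names dfs.replication and dfs.blocksize are the real Hadoop 2 ones, the values are the defaults discussed above):

```xml
<!-- hdfs-site.xml: replication and block size defaults from the notes -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>  <!-- set to 1 for a local / pseudo-distributed setup -->
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>  <!-- 128 MB in bytes -->
  </property>
</configuration>
```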

Ideally we have one Secondary NameNode.

Federation : with HDFS Federation we can have multiple NameNodes, each
managing a slice of the namespace.

Checkpointing can happen based on elapsed time or on the number of transactions.
