Anatomy of File Write in HDFS:
Consider writing a file sample.csv by HDFS client program running on R1N1’s JVM.First
the HDFS client program calls the method create() on a Java class DistributedFileSystem (subclass of FileSystem).DFS makes a RPC call to name node to create a new file in the file system's namespace. No blocks are associated to the file at this stage .Name node performs various checks and ensures that the file doesn't exists. Also check whether the user has the right permissions to create the file. Then name node creates a record for the new file. Then DFS creates FSDataOutputStream for the client to write data to. FSDOS wraps a DFSOutputStream, which handles communication between DN and NN. Finally,In response to ‘FileSystem.create()’, HDFS Client receives this FSDataOutputStream.
From now on HDFS Client deals with FSDataOutputStream. HDFS Client invokes write() on the stream.
Following are the important components involved in a file write;
· Data Queue: When client writes data, DFSOS splits into packets and writes into this internal queue.
· DataStreamer: The data queue is consumed by this component, which also communicates with name node for block allocation.
· Ack Queue: Packets consumed by DataStreamer are temporarily saved in an this internal queue.
DataStreamer communicates with NN to allocate new blocks by picking a list of suitable DNs to store the replicas. NN uses ‘Replica Placement’ as a strategy to pick DNs for a block.The list of DNs form a pipeline. Since the replication factor is assumed as 3, there are 3 nodes picked by NN.
DataStreamer consumes few packets from data queue. A copy of the consumed data is stored in ‘ack queue’.DataStreamer streams the packet to the first node in pipeline. Once the data is written in DN1, the data is forwarded to next DN. This repeats till last DN.Once the packet is written to the last DN, an acknowledgement is sent from each DN to DFSOS. The packet P1 is removed from Ack Queue. The whole process continues till a block is filled. After that, the pipeline is closed and DataStreamer asks NN for fresh set of DNs for next block. And the cycle repeats.
HDFS Client calls the close() method once the write is finished. This would flush all the remaining packets to the pipeline & waits for ack before informing the NN that the write is complete.
DataNode Write Error:
A normal write begins with a write() method call from HDFS client on the stream. And let’s say an error occurred while writing to R2N1.The pipeline will be closed. Packets in ack queue are moved to front data queue. The current block on good DNs are given a new identity and its communicated to NN, so that the partial block on the failed DN will be deleted if the failed DN recovers later. The failed data node is removed from pipeline and the remaining data is written to the remaining two DNs.NN notices that the block is under-replicated, and it arranges for further replica to be created on another node.