What happens if you increase or decrease the default block size in HDFS?
Anonymous
HDFS is designed to provide efficient streaming access to large files. The ideal block size depends on several factors: cluster size, average input file size, and the map task capacity of the cluster. The general recommendation is to start with a block size of 128 MB.

Issues with a small block size:
1) The NameNode keeps the metadata for every block in memory, so a small block size means more blocks per file and more metadata. With enough blocks, the NameNode can run out of memory.
2) A block size that is too small produces many unnecessary input splits and therefore many map tasks, possibly more than the cluster can run efficiently.

Issues with a large block size:
1) There are fewer splits and therefore fewer map tasks, so part of the cluster may sit idle and the job can run slower.
2) A large block size reduces parallelism, since each map task has to process more data.
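To put rough numbers on the trade-off described above, here is a minimal Python sketch. It is not part of Hadoop. The function names, the 1 TiB example file, and the figure of about 150 bytes of NameNode memory per block are assumptions used for illustration (the last is a commonly quoted rule of thumb, not a measured value). The sketch also assumes one map task per block, which holds only for splittable input with default split settings.

```python
MB = 1024 * 1024
GB = 1024 * MB
TB = 1024 * GB

# Assumed rule of thumb: roughly 150 bytes of NameNode heap per block object.
BYTES_PER_BLOCK_OBJECT = 150


def block_count(file_size_bytes: int, block_size_bytes: int) -> int:
    """Number of blocks needed for one file (the last block may be partial)."""
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    # Integer ceiling division, avoids floating-point rounding.
    return (file_size_bytes + block_size_bytes - 1) // block_size_bytes


def namenode_bytes_for_blocks(num_blocks: int) -> int:
    """Rough NameNode memory used by block metadata, under the assumption above."""
    return num_blocks * BYTES_PER_BLOCK_OBJECT


if __name__ == "__main__":
    file_size = 1 * TB  # hypothetical input file
    for block_size in (16 * MB, 128 * MB, 1 * GB):
        blocks = block_count(file_size, block_size)
        mem = namenode_bytes_for_blocks(blocks)
        # With one split per block, blocks is also the approximate map task count.
        print(f"block size {block_size // MB:>5} MB: "
              f"{blocks:>6} blocks / map tasks, ~{mem} bytes of block metadata")
```

For the same 1 TiB file, a 16 MB block size yields 65536 blocks, 128 MB yields 8192, and 1 GB yields only 1024. The first case multiplies NameNode metadata and task count, while the last leaves fewer tasks to spread across the cluster. In a real cluster the default is controlled by the `dfs.blocksize` property.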