Modern clustered storage systems have increasingly adopted erasure coding to reduce the storage overhead of traditional 3-way replication. However, it remains an open issue how to customize the data analytics paradigm for erasure-coded storage, especially when the storage system operates in failure mode. This paper proposes degraded-first scheduling, a new MapReduce scheduling scheme that improves MapReduce performance in erasure-coded clustered storage systems in failure mode. Its main idea is to schedule some degraded tasks at earlier stages of a MapReduce job so that they can download data first using otherwise unused network resources. We conduct mathematical analysis and discrete event simulation to show the performance gain of degraded-first scheduling over Hadoop’s default locality-first scheduling.
Keywords: Degraded-First Scheduling Algorithm, Mathematical Analysis and Discrete Event Simulation, Erasure-Coded Storage.
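
The contrast between the two scheduling orders can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the task labels, and the proportional spreading rule (launch a degraded task whenever the fraction of degraded tasks launched so far does not exceed the fraction of all map tasks launched so far) are assumptions made for illustration, and the longest run of consecutive degraded tasks is used only as a rough proxy for network contention.

```python
def locality_first_order(num_local, num_degraded):
    """Launch every local task before any degraded task."""
    return ["local"] * num_local + ["degraded"] * num_degraded


def degraded_first_order(num_local, num_degraded):
    """Spread degraded tasks over the whole map phase.

    Assumed rule (illustrative only): launch a degraded task when the
    fraction of degraded tasks launched so far is no larger than the
    fraction of all map tasks launched so far.
    """
    total = num_local + num_degraded
    launched = 0           # map tasks launched so far
    launched_degraded = 0  # degraded tasks launched so far
    order = []
    while launched < total:
        local_left = (launched - launched_degraded) < num_local
        degraded_left = launched_degraded < num_degraded
        # launched_degraded / num_degraded <= launched / total,
        # cross-multiplied to stay in integer arithmetic.
        on_schedule = launched_degraded * total <= launched * num_degraded
        if degraded_left and (on_schedule or not local_left):
            order.append("degraded")
            launched_degraded += 1
        else:
            order.append("local")
        launched += 1
    return order


def longest_degraded_burst(order):
    """Length of the longest run of back-to-back degraded tasks."""
    longest = current = 0
    for task in order:
        current = current + 1 if task == "degraded" else 0
        longest = max(longest, current)
    return longest


if __name__ == "__main__":
    for name, scheduler in [("locality-first", locality_first_order),
                            ("degraded-first", degraded_first_order)]:
        order = scheduler(9, 3)
        print(name, order, "longest burst:", longest_degraded_burst(order))
```

With 9 local and 3 degraded tasks, the locality-first order places all three degraded tasks at the end, where they would compete for the network at the same time, while the degraded-first order interleaves them with local tasks.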