The Second Workshop on Distributed Storage Systems and Coding for Big Data


Introduction to workshop

Mass storage is critical in the era of Big Data. Dispersing a huge data file across a large-scale distributed storage system is necessary to enhance reliability and availability. By introducing redundancy into the system, we can protect data integrity against node failures. Because node failures occur frequently in large-scale storage systems, a considerable volume of Internet traffic is dedicated to the repair of failed storage nodes. Several classes of distributed storage codes, such as regenerating codes and locally repairable codes, have recently been introduced to reduce this overhead and the disk input/output cost. Nevertheless, substantial research work remains to advance distributed storage coding and systems in both theory and applications.

This workshop will provide an excellent platform for computer systems researchers and data scientists to exchange ideas and experience on what coding techniques and distributed storage systems can offer to big data applications, and to understand the challenges that we need to tackle to realize their full potential.


Contributions devoted to the evaluation, optimization, or enhancement of distributed storage systems and cloud systems, as well as solutions for mass storage, are solicited. Topics of interest include but are not limited to:


The full manuscript should be at most SIX pages in the two-column IEEE format. Additional pages will incur an additional fee.

Papers MUST be submitted in PDF format and only through the online submission system:



The authors of accepted papers must guarantee that their papers will be presented at the conference. At least one author of each accepted paper must register for the conference in order for the paper to be included in the IEEE Xplore Digital Library.

Authors of accepted papers will be invited to submit a revised and extended version of their paper (with at least 30% additional material) after the workshop to a related special issue of ZTE Communications.


Time Workshop Schedule
09:00-09:10 Plenary
09:10-09:40 Keynote: A New Zigzag MDS Code with Optimal Encoding and Efficient Decoding
Prof. Hui Li (Peking University, China)
09:40-10:00 Parity Declustering for Fault-Tolerant Storage Systems via t-designs
Son Hoang Dau (Singapore University of Technology and Design, Singapore),
Yan Jia, Chao Jin, Weiya Xi, and Kheong Sann Chan (Data Storage Institute, Singapore)
10:00-10:20 Coffee Break
10:20-10:40 A C Library of Repair-Efficient Erasure Codes for Distributed Data Storage Systems
Chao Tian (University of Tennessee at Knoxville, United States)
10:40-11:00 STORE: Data Recovery with Approximate Minimum Network Bandwidth and Disk I/O in Distributed Storage Systems
Tai Zhou, Hui Li, Bing Zhu, Yumeng Zhang, Hanxu Hou, and Jun Chen (Peking University Shenzhen Graduate School, China)
11:00-11:20 ReCT: Improving MapReduce Performance under Failures with Resilient Checkpointing Tactics
Hao Wang, Haopeng Chen, and Fei Hu (Shanghai Jiao Tong University, China)
11:20-11:40 An Efficient Scheme to Ensure Data Availability for a Cloud Service Provider
Seungmin Kang (National University of Singapore, Singapore), Bharadwaj Veeravalli, Khin Mi Mi Aung, and Chao Jin (Data Storage Institute, Singapore)

Venue: Diplomat, Hyatt Regency Bethesda, Washington DC, USA

2014 IEEE International Conference on Big Data (IEEE BigData 2014)