A COMPARATIVE STUDY OF THE ARCHITECTURES AND APPLICATIONS OF SCALABLE HIGH-PERFORMANCE DISTRIBUTED FILE SYSTEMS
Abstract
Distributed File Systems have enabled the efficient and scalable sharing of data across networks. These systems were designed to address technical problems associated with networked data, such as the reliability and availability of data, the scalability of the infrastructure supporting storage, the high cost of gaining access to data, and the cost of maintenance and expansion. In this paper, we compare the key technical building blocks that form the mainstay of Distributed File Systems such as Hadoop FS, Google FS, Lustre FS, Ceph FS, Gluster FS, Oracle Cluster FS and TidyFS. The paper aims to elucidate the basic concepts and techniques employed in these file systems, and it explains their different architectures and applications.
Copyright (c) 2023 FUDMA JOURNAL OF SCIENCES
This work is licensed under a Creative Commons Attribution 4.0 International License.