DEVELOPMENT OF AN ENHANCED C4.5 DECISION TREE ALGORITHM USING A MEMOIZED MAPREDUCE MODEL

Authors

  • Florence Paul, Ahmadu Bello University
  • A. Afolayan Obiniyi
  • F. Armand Donfack-Kana
  • D. Elaoyi Paul

DOI:

https://doi.org/10.33003/fjs-2023-0705-1691

Keywords:

Data mining, C4.5 Algorithm, Memoization, MapReduce, Hadoop

Abstract

Classification in data mining concerns the prediction of categorical or discrete target variables. The classical C4.5 decision tree algorithm is designed for this task: it aims to produce a tree that accurately predicts the target variable for new, unseen data. However, its recursive nature becomes a limitation when large volumes of data are involved, making computation more complex and the implementation inefficient in terms of computing time, memory utilization and data complexity. Several studies have sought to address these limitations. One such improvement is the parallelization of the algorithm using the MapReduce model, which divides the large dataset into smaller units and distributes them across multiple computers for parallel processing. Even so, the recursive nature of the algorithm makes the cost of the many repeated calculations quite high, which is the concern of this work. This research aims to reduce computation time further by using a memoized MapReduce model, in which the results of previous calculations are saved in a cache; when the same calculation is encountered again, the cached result is returned and re-computation is avoided. Retrieving a cached result is considered cheaper than re-computing it.
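
The memoization idea described in the abstract can be illustrated with a minimal single-machine sketch in Python. This is not the authors' Hadoop implementation: the function names (`entropy`, `class_counts`, `gain_ratio`), the use of `functools.lru_cache` as the cache, and the choice of a sorted class-count tuple as the cache key are all assumptions made for illustration. It shows only how a repeated entropy calculation in the C4.5 gain-ratio criterion can be served from a cache instead of being recomputed.

```python
import math
from collections import Counter, defaultdict
from functools import lru_cache


@lru_cache(maxsize=None)  # assumed cache; stands in for the paper's memo store
def entropy(counts):
    """Shannon entropy in bits of a tuple of positive counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def class_counts(labels):
    """Sorted count tuple, so equal distributions map to the same cache key."""
    return tuple(sorted(Counter(labels).values()))


def gain_ratio(rows, attr_index, label_index=-1):
    """C4.5 gain ratio of one categorical attribute over a list of rows."""
    labels = [r[label_index] for r in rows]
    base = entropy(class_counts(labels))

    # Group the class labels by the value of the candidate attribute.
    partitions = defaultdict(list)
    for r in rows:
        partitions[r[attr_index]].append(r[label_index])

    n = len(rows)
    remainder = sum(
        len(p) / n * entropy(class_counts(p)) for p in partitions.values()
    )
    # Split information is the entropy of the partition sizes.
    split_info = entropy(tuple(sorted(len(p) for p in partitions.values())))
    gain = base - remainder
    return gain / split_info if split_info > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical toy data: (outlook, play)
    rows = [("sunny", "no"), ("sunny", "no"), ("rain", "yes"), ("rain", "yes")]
    print(gain_ratio(rows, 0))
    print(entropy.cache_info())  # hits count the avoided re-computations
```

In a MapReduce setting the same principle would apply inside each map or reduce task, with the cache keyed on whatever intermediate result recurs across recursive calls; how the cache is shared across nodes is a design decision this sketch does not address.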

Published

2023-11-04

How to Cite

Paul, F., Obiniyi, A. A., Donfack-Kana, F. A., & Paul, D. E. (2023). DEVELOPMENT OF AN ENHANCED C4.5 DECISION TREE ALGORITHM USING A MEMOIZED MAPREDUCE MODEL. FUDMA JOURNAL OF SCIENCES, 7(5), 156 - 164. https://doi.org/10.33003/fjs-2023-0705-1691