COMPARATIVE ANALYSIS OF DATA MINING TECHNIQUES FOR MOVIE PREDICTION

Authors

  • Gbenga Ogunsanwo (TASUED)
  • Ayokunle A. Omotunde
  • Olubukola Adekola
  • Aaron Izang
  • Samuel B. Abel

DOI:

https://doi.org/10.33003/fjs-2022-0606-1117

Keywords:

Comparative Analysis, Movie, Support Vector Machine (SVM), Gradient Boosting Algorithm

Abstract

The rate of movie production is increasing exponentially, and because the investment that goes into making a movie runs into millions of dollars, it has become pertinent to ascertain a movie's likelihood of success. A number of data mining-based methods, ranging from Support Vector Machine (SVM) to logistic regression, have been proposed with varying levels of success, with SVM showing the most promising results. This paper carries out a comparative analysis of the performance of the Gradient Boosting and SVM algorithms in predicting movie success. The study developed a framework for the research methodology; the dataset used contained 33 movie attributes and 838 entries. The dataset was cleaned with six attributes, and features were identified and selected from the dataset using four methods. These methods include Analysis of Variance (ANOVA), Lasso Regularization, Combination of Lasso Regularization and Random Forest (RF). Model formulation was done using Support Vector Machine (SVM) and the Gradient Boosting algorithm, and the performance of the developed predictive models was evaluated using accuracy, precision and recall values. The results show that the accuracy of the Gradient Boosting algorithm is around 100%, SVM-Linear is 86%, SVM-Poly is 88%, SVM-RBF is 88% and SVM-Sigmoid is 72%. The study concluded that the Gradient Boosting algorithm is more robust in predicting movie success. It also recommended that comparisons be made with other machine learning techniques.
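The three evaluation measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names and the label lists below are hypothetical, chosen only to show how accuracy, precision and recall are derived from a binary success/failure prediction.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # predicted success, was a success
        elif pred == positive:
            fp += 1  # predicted success, was a failure
        elif truth == positive:
            fn += 1  # predicted failure, was a success
        else:
            tn += 1  # predicted failure, was a failure
    return tp, fp, fn, tn


def evaluate(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Compute accuracy, precision and recall; undefined ratios map to 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical labels (1 = successful movie, 0 = unsuccessful); not from the paper.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

metrics = evaluate(y_true, y_pred)
print(metrics)
```

Reporting precision and recall next to accuracy shows whether a high accuracy figure comes from predicting both classes well or mainly from the majority class.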

Published

2023-01-10

How to Cite

Ogunsanwo, G., Omotunde, A. A., Adekola, O., Izang, A., & Abel, S. B. (2023). COMPARATIVE ANALYSIS OF DATA MINING TECHNIQUES FOR MOVIE PREDICTION. FUDMA JOURNAL OF SCIENCES, 6(6), 224 - 228. https://doi.org/10.33003/fjs-2022-0606-1117
