
Nagwa Elmobark

Abstract

This paper systematically analyzes Apache Hadoop's technological evolution, tracing its transformation from a web crawling subsystem to a comprehensive enterprise computing platform. Beginning with its origins in Google's foundational papers on the Google File System (GFS) and MapReduce, we examine the critical architectural decisions and technical innovations that shaped Hadoop's development across its major releases. The study examines key technical milestones: the project's emergence from Nutch in 2006, Yahoo!'s production deployment in 2008, the stability-focused 1.0 release in 2011, the introduction of the groundbreaking YARN architecture in 2013, and the security-enhanced 3.0 release in 2017. Our analysis shows how each stage of development addressed a distinct distributed computing challenge while extending Hadoop's applicability beyond its web-crawling origins. We demonstrate how architectural changes in resource management, data storage efficiency, and processing flexibility transformed Hadoop from a specific MapReduce implementation into a flexible distributed computing framework capable of handling a wide range of enterprise workloads. The research provides valuable insights into the technical considerations that drive distributed system evolution and offers lessons for future large-scale computing platforms.


Keywords

Apache Hadoop, Distributed Computing, MapReduce, YARN Architecture, Big Data Processing, Enterprise Computing, Technical Evolution

Section
Articles
How to Cite
THE EVOLUTION OF APACHE HADOOP: A TECHNICAL JOURNEY FROM WEB CRAWLING TO ENTERPRISE COMPUTING. (2025). Journal of Science and Technology, 30(4). https://doi.org/10.20428/jst.v30i4.2615
