Dual ETL – Hadoop Cluster Auto Failover
Authors: Sainath Muvva
DOI: https://doi.org/10.5281/zenodo.14280339
Short DOI: https://doi.org/g8ttk9
Country: USA
Abstract: This paper examines the design of data infrastructure for high-speed delivery, focusing on the four V's of big data and the importance of geographically separated primary and Disaster Recovery (DR) clusters. It explores the complexities of the Hadoop cluster failover process, identifying challenges such as manual metadata updates and data quality checks. The research proposes automation solutions, including DistCp for data replication and Hive commands for metadata updates, aiming to strengthen data infrastructure resilience and reduce manual intervention during critical events.
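To make the proposed automation concrete, the sketch below shows how a DistCp replication step and a Hive metadata update might be chained together during a failover run. This is a minimal illustration under stated assumptions, not the paper's implementation: the cluster URIs, warehouse path, and table name are hypothetical, and a production failover would wrap these steps with the data quality checks the abstract mentions.

```python
#!/usr/bin/env python3
"""Minimal sketch of a DistCp + Hive failover step.

All cluster endpoints, paths, and table names below are illustrative
assumptions, not values taken from the paper.
"""
import subprocess

# Hypothetical NameNode endpoints and dataset layout.
PRIMARY_NN = "hdfs://primary-nn:8020"
DR_NN = "hdfs://dr-nn:8020"
DATA_PATH = "/warehouse/sales/orders"
HIVE_TABLE = "sales.orders"


def replicate_with_distcp() -> None:
    """Mirror the primary dataset onto the DR cluster.

    -update copies only files that differ from the destination;
    -delete removes DR-side files that no longer exist on the primary,
    keeping the two copies in sync.
    """
    subprocess.run(
        ["hadoop", "distcp", "-update", "-delete",
         f"{PRIMARY_NN}{DATA_PATH}", f"{DR_NN}{DATA_PATH}"],
        check=True,
    )


def repoint_hive_metadata() -> None:
    """Update Hive metadata so queries resolve against the DR copy,
    then refresh partition metadata for the relocated table."""
    ddl = (
        f"ALTER TABLE {HIVE_TABLE} SET LOCATION '{DR_NN}{DATA_PATH}'; "
        f"MSCK REPAIR TABLE {HIVE_TABLE};"
    )
    subprocess.run(["hive", "-e", ddl], check=True)


if __name__ == "__main__":
    replicate_with_distcp()
    repoint_hive_metadata()
```

Automating both steps in one script removes the two manual touchpoints the abstract identifies: the data copy itself and the metadata update that would otherwise be issued by hand during a failover event.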
Keywords: ETL, Hadoop, DistCp, Data Quality
Paper Id: 231748
Published On: 2019-09-03
Published In: Volume 7, Issue 5, September-October 2019
Cite This: Dual ETL – Hadoop Cluster Auto Failover - Sainath Muvva - IJIRMPS Volume 7, Issue 5, September-October 2019. DOI 10.5281/zenodo.14280339