Beyond Human Scrutiny: Unleashing Machine Learning on the Data Quality Frontier

Sainath Muvva

doi:10.5281/zenodo.14565883

Short DOI: https://doi.org/g8w63k

Country: USA

Abstract: In today's data-centric world, maintaining high-quality information is paramount, as flawed datasets can undermine analytical efforts, lead to misguided choices, and potentially destabilize entire systems. The field of machine learning (ML) presents sophisticated methodologies for identifying and rectifying a wide array of data quality challenges, including inaccuracies, gaps in information, inconsistent entries, and outliers. This research delves into the application of various ML paradigms - supervised, unsupervised, and hybrid approaches - in the pursuit of data excellence. We examine key strategies such as employing classification algorithms for error identification, utilizing regression techniques for filling data gaps, implementing clustering methods to pinpoint anomalies, and harnessing the power of deep learning for data transformation and enhancement. Additionally, our study addresses the practical hurdles and showcases real-world implementations where ML has significantly improved data quality management processes.

Keywords: Data Quality, Machine Learning, Data Imputation, Anomaly Detection, Data Cleaning, Supervised Learning, Unsupervised Learning, Regression Models, Clustering, Data Transformation, Record Deduplication, Missing Data, Entity Resolution, Data Consistency, Deep Learning, Data Quality Assurance, Data Quality Dimensions, Model Interpretability, Data Pipelines, Active Learning

Paper Id: 231927

Published On: 2024-10-08

Published In: Volume 12, Issue 5, September-October 2024

All research papers published in this journal and on this website are openly accessible and licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. Accordingly, any user can read, download, copy, distribute, print, search, or link to the full texts of the published articles, crawl them for indexing, pass them as data to any software, or use them for any other lawful purpose. The journal fulfills the DOAJ definition of open access.

International Journal of Innovative Research in Engineering & Multidisciplinary Physical Sciences
E-ISSN: 2349-7300 • Impact Factor - 9.907

A Widely Indexed Open Access Peer Reviewed Online Scholarly International Journal
