Self-Healing Metadata Catalog via Graph Neural Anomaly Detection
Keywords:
Graph Neural Networks, metadata catalog, schema drift, data lineage, anomaly detection, self-healing systems
Abstract
As data assets grow rapidly, comprehensive metadata management solutions that handle structural and data lineage issues become essential. This paper introduces a self-healing metadata catalog architecture that uses Graph Neural Networks (GNNs) for real-time anomaly detection. The anomalies considered are schema drift, unresolved references, and orphaned (parentless) tables in lineage graphs.
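To make the three anomaly types concrete, the following is a minimal rule-based sketch over a toy lineage graph. It is not the GNN-based detector the paper proposes; it only illustrates what each anomaly looks like structurally. The `Table` class, the `find_anomalies` function, the `is_source` flag, and all table names are hypothetical and introduced here for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Table:
    """A catalog entry: one node in the lineage graph (hypothetical structure)."""
    name: str
    columns: Set[str]
    upstream: List[str] = field(default_factory=list)  # names of parent tables
    is_source: bool = False  # raw ingestion tables legitimately have no parent


def find_anomalies(
    catalog: Dict[str, Table],
    previous_columns: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Flag schema drift, unresolved references, and orphaned tables."""
    report: Dict[str, List[str]] = {
        "schema_drift": [],
        "unresolved_reference": [],
        "orphaned_table": [],
    }
    for name, table in sorted(catalog.items()):
        # Schema drift: the column set differs from the last recorded snapshot.
        old = previous_columns.get(name)
        if old is not None and old != table.columns:
            report["schema_drift"].append(name)
        # Unresolved reference: a lineage edge points at a table not in the catalog.
        for parent in table.upstream:
            if parent not in catalog:
                report["unresolved_reference"].append(f"{name} -> {parent}")
        # Orphaned table: no parent at all, and not declared as a raw source.
        if not table.upstream and not table.is_source:
            report["orphaned_table"].append(name)
    return report


catalog = {
    "raw_orders": Table("raw_orders", {"id", "amount"}, is_source=True),
    "orders_clean": Table("orders_clean", {"id", "amount", "currency"}, ["raw_orders"]),
    "revenue": Table("revenue", {"day", "total"}, ["orders_clean", "fx_rates"]),
    "tmp_export": Table("tmp_export", {"id"}),
}
previous = {"orders_clean": {"id", "amount"}}

report = find_anomalies(catalog, previous)
print(report)
```

In this toy catalog, `orders_clean` gained a column since the last snapshot, `revenue` references a table that is missing from the catalog, and `tmp_export` has no lineage at all. A learned detector would presumably replace these fixed rules with scores derived from node and edge features, but the inputs it reasons over are the same kind of graph.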