Data Lineage for AI
- £4.99
Publisher Description
This is an Apple Books audiobook narrated by a digital voice based on a human narrator.
As artificial intelligence systems increasingly rely on large, complex, and externally sourced datasets, organizations are under growing pressure to prove where training data originated, how it was transformed, and whether it was used appropriately. Without defensible data lineage, AI systems become difficult to audit, explain, or regulate.
Data Lineage for AI is a technical and practical guide for engineers, compliance teams, and risk professionals responsible for managing and documenting the origins of AI training data. The book explains how lineage enables transparency, accountability, and regulatory compliance across the AI lifecycle.
This volume focuses on operational methods for capturing provenance and maintaining traceability from raw data ingestion through feature engineering and model training. It connects lineage practices directly to audit readiness, investigation support, and regulatory expectations.
Key areas covered include:
- What data lineage means in AI and machine learning contexts
- Capturing provenance across data pipelines and transformations
- Tooling and architectures for lineage logging and storage
- Linking datasets to specific models and training runs
- Using lineage as audit evidence for compliance reviews
- Supporting regulatory inquiries and incident investigations
Written for practitioners operating in regulated or high-risk environments, this book provides concrete techniques to make AI data usage transparent, defensible, and verifiable without slowing delivery teams.