Every data pipeline has to cope with invalid data. This article suggests a way of dealing with that data and of providing the features needed to manage it. The approach applies equally to streaming and batch data ingestion.

Classification of invalid data

There are many ways in which a pipeline…