Novel uses of provenance in data science applications
Professor Susan Davidson, Computer and Information Science, University of Pennsylvania
October 24, 2021, 12:00
Seminar Room 420, Checkpoint Building
Abstract:
Provenance, the “why” and “where” of data, has been extensively studied and has been used to understand the results of database queries, debug networks, and analyze data science workflows. In this talk I will discuss two novel uses of data provenance: creating fine-grained citations for data extracted from a database, and incrementally updating machine learning models after deletions have been made to the training set. I will also show how the latter can be used to efficiently clean label uncertainties in machine learning training sets.