Evolution is fascinating to watch. To me, it is the most interesting when one can observe the evolution of a single man.

Shana Alexander

A very small cause which escapes our notice determines a considerable effect that we cannot fail to see, and then we say that the effect is due to chance.

Jules H. Poincaré

 

With the abundance of computer storage, and with the growing complexity and pervasiveness of computing, there has been a trend toward explicitly recording semantically rich metadata in computational systems. A prime example of such metadata is provenance, which describes the steps by which data was derived. Provenance (also referred to as lineage, parentage, pedigree, genealogy, or filiation) is a pervasive form of metadata that is useful in enabling several next-generation applications. The management of provenance metadata forms the basis of this project.

Provenance is useful in several scenarios. Where the origins of data can be used to infer its quality, provenance supports more effective data sharing. Where the sources of a result, such as the experimental setup or inputs, are recorded, provenance can indicate why the result was obtained and how to choose the settings for subsequent experiments. Applications that use provenance metadata are numerous and include GIS, materials engineering, life sciences, astronomical sky surveys, and data warehousing. In health informatics, a novel application is Personal Health Records (PHRs), where the patient can actively participate in health care. We believe that provenance is critical for the success of PHRs, especially because information in a PHR can come from various sources: electronic health records (EHRs) maintained by one or more service providers, data entered by the patient, aggregated data obtained from a community, and other medical information sources. In such a setting, the quality of data becomes very important: Is a data item clinically valid, and can it be used by doctors? Why were these symptoms diagnosed as X, rather than the more common Y? The importance of provenance metadata, as well as the need to formalize its concepts, is further illustrated by the several workshops organized on this topic in the recent past.

The key insights of our project are: