Transparent and explainable models
In data science, models are usually created by experts in modeling (i.e., data analysts) and then used by domain experts (such as astronomers, chemists, and social scientists). This grant addresses two major challenges in this process. On the one hand, while many models are validated by computing their performance on given data, it often remains unclear exactly why they work. On the other hand, it is well known that there are many more people with modeling needs (who have data and a set of questions to be answered) than there are people with the necessary modeling skills. Hence, we are developing a set of guidelines on how to create tools that close this knowledge gap between model builders and model users. In this grant, we will not be able to address all types of modeling. We focus on two of the most difficult approaches: (a) clustering techniques, independent of the specific algorithm, and (b) deep neural networks, independent of the analysis need. In addition, we will develop a general-purpose approach, informed by a decade of experience in visual parameter space analysis. All three approaches are complementary and allow us to avoid the pitfalls of any single one of them. Our results will be validated at three levels: (1) using benchmark data sets, (2) working with specific applications in astronomy, medicine, and finance, and (3) through an iterative design process in collaboration with expert algorithmic modelers as well as domain scientists.