School of Informatics
Niclas Ståhl defends his thesis "Integrating Domain Knowledge into Deep Learning - Increasing Model Performance Through Human Expertise".
The defence is held at Insikten, Portalen, and is also live-streamed on Zoom. Because of current recommendations, attendees are advised to join the stream on Zoom.
Click on the link to see the dissertation.
The research in this thesis focuses on how deep learning models can be designed and implemented to better emulate and integrate the heuristic reasoning process of human experts. Several case studies, within the domains of steelmaking and drug discovery, are conducted to evaluate the performance of these models in comparison to models that do not consider human expert knowledge from the targeted domain. These case studies focus on separate problems in the targeted industries and, for example, deal with predicting the outcome of the melting of steel and the rolling of steel sheets. This thesis also deals with property prediction for molecules and the generation of new drug candidates.
The research in this thesis addresses three main questions. Firstly, the impact of different data representations is studied, especially representations that are close to how human experts represent the addressed problem. Secondly, focus is put on how the internal structure of data-driven algorithms can be designed to address problems in the same way as human experts. Finally, it is investigated how constraints specified by experts can be integrated into models, so that the models generalise better and avoid non-feasible extrapolation. These questions are addressed in empirical case studies, where the model development is steered by knowledge about the problem provided by domain experts. In the presented cases, the developed algorithms performed significantly better, under the performance metric used, than other commonly used machine learning models. Hence, this thesis shows that domain experts need to be involved in the design of high-performing models. This requires flexible models that can be designed to compute and reason in different ways and be applied to different data representations, corresponding to the mental representations that human experts use for the targeted problem.
The conclusion of this thesis is that such models can be designed using artificial neural networks, and that these models perform better than conventional machine learning models when applied to new data. Such models can be applied to complex representations of the data that represent the targeted problem better. More information about the problem can also be provided by designing the model so that the reasoning of human experts is emulated. This can be done either by emulating the expert behaviour in computational steps or by specifying known boundaries of the problem. The main conclusion is that models which are applied to data representations that capture the real-world problem well, which emulate the steps that human experts take to solve the problem, or which know the limitations around the problem, have more information about the problem and therefore have an edge over other models, ultimately resulting in better performance.
Keith L. Downing, Professor, Norwegian University of Science and Technology
Göran Falkman, Associate Professor, University of Skövde
Alexander Karlsson, Senior Lecturer, University of Skövde
Gunnar Mathiason, Senior Lecturer, University of Skövde
Jonas Boström, AstraZeneca
Panagiotis Papapetrou, Professor, Stockholm University
Rebecka Jörnsten, Professor, Chalmers University of Technology
Mats Granath, Senior Lecturer, University of Gothenburg