This page lists the abstracts of the peer-reviewed articles presented at the Monday session of SWIFT 2009. All abstracts are taken from the camera-ready submissions.
- "Experts’ Views on User Activities in Information Fusion System Development Processes", Maria Nilsson and Joeri van Laere
Abstract: The role of the user in information fusion has gained increasing attention in recent years. However, looking at the research performed within the community, it remains difficult to acquire a good understanding of the role of users, especially in terms of the information fusion system development process. To address this, a questionnaire was distributed during the FUSION 2009 conference which aimed to explore these issues. Here, we present an initial analysis of the responses obtained. A consensus was found regarding how users have been incorporated in the development process. Specifically, they are seen as part of the concept generation phase, the requirement gathering phase or the evaluation phase rather than of the design phase. Also, the typical activities performed with users were found to be interviews and using users as advisers. However, results from the questionnaire also indicated that a consensus regarding what information is needed from users for automating a manual information fusion process is still lacking. The reasons for, and implications of, this lack are discussed in the light of current research.
Keywords: HCI, Human Factors, Information Fusion, Methods, System Development Process, User study
- "Challenges in Tactical Support Functions for Fighter Aircraft", Tina Erlandsson, Sören Molander, Jens Alfredson and Per-Johan Nordlund
Abstract: This paper describes challenges for tactical support functions in fighter aircraft systems with one or several cooperating members. A short description of the domain of cooperating fighter aircraft is presented, which is linked to tactical support functions using an air-to-air scenario. The main rationale for developing an advanced tactical support system is to aid the pilot in handling complex time-critical scenarios and missions involving extensive cooperation. Important research directions include developing quality measures/metrics for situation awareness, methods for pro-active tactical support functions, and methods to automatically reduce the tactical information gap.
Keywords: Tactical support, discussion paper, fighter aircraft
- "Generating Comprehensible QSAR Models", Cecilia Sönströd, Ulf Johansson and Ulf Norinder
Abstract: This paper presents work in progress from the INFUSIS project and contains initial experimentation, using publicly available medicinal chemistry datasets, on obtaining comprehensible QSAR models. Three techniques are evaluated on both predictive performance, measured as accuracy, and comprehensibility, measured as model size. The chosen techniques are J48 decision trees and JRip and Chipper decision lists. The results show that J48 obtains superior accuracy and that Chipper performs best of the two decision list algorithms on accuracy. Furthermore, it is seen that, regarding accuracy, all techniques benefit from feature reduction, which almost always results in increased accuracy. Regarding comprehensibility, JRip obtains the smallest models, followed by Chipper, with J48 producing the largest models. For model size, feature reduction is not seen to be universally beneficial; only J48 produces smaller models for the reduced datasets, while both decision list algorithms actually produce larger models on average. The overall conclusion is that, for these datasets, there exists a definite tradeoff between accuracy and comprehensibility that needs to be investigated further.
Keywords: Classification, Comprehensibility, Concept Description, Data Mining, QSAR.
- "Evaluating Ensembles on QSAR Classification", Ulf Johansson, Tuve Löfström and Ulf Norinder
Abstract: Novel, often quite technical, algorithms for ensembling artificial neural networks are constantly suggested. Naturally, when presenting a novel algorithm, the authors, at least implicitly, claim that their algorithm, in some aspect, represents the state-of-the-art. Obviously, the most important criterion is predictive performance, normally measured using either accuracy or area under the ROC-curve (AUC). This paper presents a study where the predictive performance of two widely acknowledged ensemble techniques, GASEN and NegBagg, is compared to more straightforward alternatives like bagging. The somewhat surprising result of the experimentation using, in total, 32 publicly available data sets from the medical domain, was that both GASEN and NegBagg were clearly outperformed by several of the straightforward techniques. One particularly striking result was that not applying the GASEN technique, i.e., ensembling all available networks instead of using the subset suggested by GASEN, turned out to produce more accurate ensembles.
Keywords: Classification, Data Mining, Ensembles, QSAR.
- "Utilizing Information on Uncertainty for In Silico Modeling using Random Forests", Henrik Boström and Ulf Norinder
Abstract: Information on uncertainty of measurements or estimates of molecular properties is rarely utilized by in silico predictive models. In this study, different approaches to handling uncertain numerical features are explored when using the state-of-the-art random forest algorithm for generating predictive models. Two main approaches are considered: i) sampling from probability distributions prior to tree generation, which does not require any change to the underlying tree learning algorithm, and ii) adjusting the algorithm to allow for handling probability distributions, similar to how missing values typically are handled, i.e., partitions may include fractions of examples. An experiment with six datasets concerning the prediction of various chemical properties is presented, where 95% confidence intervals are included for one of the 92 numerical features. In total, five approaches to handling uncertain numeric features are compared: ignoring the uncertainty, sampling from distributions that are assumed to be uniform and normal respectively, and adjusting tree learning to handle probability distributions that are assumed to be uniform and normal respectively. The experimental results show that all approaches that utilize information on uncertainty indeed outperform the single approach ignoring this, both with respect to accuracy and area under ROC curve. A decomposition of the squared error of the constituent classification trees shows that the highest variance is obtained by ignoring the information on uncertainty, but that this also results in the highest mean squared error of the constituent trees.
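As an illustration of approach i) in the abstract above (sampling prior to tree generation), the following is a minimal sketch and not the authors' implementation. It assumes the uncertain feature is stored as a 95% interval `(low, high)`, that the uniform case samples over that interval, and that the normal case centres on the midpoint with a standard deviation of half the interval width divided by 1.96. The names `sample_feature` and `resample_dataset` are hypothetical.

```python
import random


def sample_feature(low, high, dist, rng):
    """Draw one concrete value for a feature given as a 95% interval [low, high].

    Assumption: 'uniform' samples over the interval itself; 'normal' uses the
    midpoint as mean and (high - low) / (2 * 1.96) as standard deviation.
    """
    if dist == "uniform":
        return rng.uniform(low, high)
    if dist == "normal":
        mean = (low + high) / 2.0
        sigma = (high - low) / (2 * 1.96)
        return rng.gauss(mean, sigma)
    raise ValueError("dist must be 'uniform' or 'normal'")


def resample_dataset(rows, uncertain_index, dist, rng):
    """Return a copy of rows with the uncertain feature replaced by a sample.

    Each row is a list of features; the feature at uncertain_index is a
    (low, high) tuple. Calling this once per tree gives every tree its own
    concrete dataset, so the tree learner itself needs no changes.
    """
    resampled = []
    for row in rows:
        new_row = list(row)
        low, high = row[uncertain_index]
        new_row[uncertain_index] = sample_feature(low, high, dist, rng)
        resampled.append(new_row)
    return resampled
```

A forest would call `resample_dataset` once per tree, for example with `rng = random.Random(0)`, and then train each tree on its own resampled copy with an unmodified learner.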
- "Names of chemical compounds within drug discovery context", Elzbieta Dura, Ola Engkvist and Sorel Muresan
Abstract: Drug discovery is costly and time-consuming, mainly due to very high attrition rates. To remedy this, the INFUSIS project tries to improve predictive modeling for ADMET by fusing information from various sources. Unstructured texts are among the most important sources and we use text corpus technology to uncover information on toxicity of specific compounds in articles on diabetes research. A selection of 200,000 abstracts on diabetes from PubMed was turned into a corpus. This allows extracting patterns of the actual use of chemical nomenclature in texts. The task of proper identification of chemical compounds in texts is not trivial despite the availability of large compound libraries. There is a significant difference in how terms are registered in lexicons and how they are actually used.
Keywords: text mining, chemical compound, drug discovery, named entity recognition
- "An empirical investigation of the static weapon-target allocation problem", Fredrik Johansson and Göran Falkman
Abstract: The allocation of weapons to targets (such as missiles and hostile aircraft) is a well-known resource allocation problem within the field of operations research. It has been proven that this problem, in general, is NP-complete. For this reason, optimal solutions to the static weapon-target allocation (WTA) problem cannot be obtained in real-time for large-scale problems. Through empirical experiments, we try to find the limit on how large a problem can be solved optimally in real-time by exhaustive search algorithms. We also propose a heuristic genetic algorithm for solving larger-scale problems.
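To make the exhaustive search mentioned in the abstract above concrete, here is a minimal sketch under stated assumptions. It uses a common target-based formulation of static WTA (each weapon is assigned to exactly one target, and the objective is the expected surviving target value), which is an assumption and not taken from the paper. The function names and the example data are hypothetical.

```python
from itertools import product


def expected_surviving_value(assignment, values, kill_prob):
    """Sum over targets of value times the probability that the target survives.

    assignment[i] is the target index for weapon i; kill_prob[i][j] is the
    assumed probability that weapon i destroys target j.
    """
    survive = [1.0] * len(values)
    for weapon, target in enumerate(assignment):
        survive[target] *= 1.0 - kill_prob[weapon][target]
    return sum(v * s for v, s in zip(values, survive))


def exhaustive_wta(values, kill_prob):
    """Try every assignment of weapons to targets and return the best one.

    With W weapons and T targets this evaluates T**W assignments, which is
    why the approach is only feasible for small problem sizes.
    """
    n_weapons, n_targets = len(kill_prob), len(values)
    best_assignment, best_cost = None, float("inf")
    for assignment in product(range(n_targets), repeat=n_weapons):
        cost = expected_surviving_value(assignment, values, kill_prob)
        if cost < best_cost:
            best_assignment, best_cost = assignment, cost
    return best_assignment, best_cost
```

For two weapons and two targets with values `[10, 5]` and kill probabilities `[[0.8, 0.5], [0.6, 0.9]]`, the search inspects four assignments and selects weapon 0 on target 0 and weapon 1 on target 1. The search space grows as T**W, so adding weapons quickly makes full enumeration impractical.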
- "Comparative study of evidential reasoning schemes for fusing ESM reports under varying sensor uncertainty and fusion unreliability", Pierre Valin, Pascal Djiknavorian, Eloi Bossé and Dominic Grenier
Abstract: We address the problem of fusing Electromagnetic Support Measures (ESM) reports by two evidential reasoning schemes, namely Dempster-Shafer theory and Dezert-Smarandache theory. These schemes provide results in different frames of discernment, but are able to fuse realistic ESM data. We discuss their advantages and disadvantages under varying conditions of sensor data certainty and fusion reliability, the latter coming from errors in the association process. A thresholded version of Dempster-Shafer theory is fine-tuned for performance across a wide range of values for certainty and reliability, allowing designers who wish to use this method to assess the expected performance. A compromise has to be achieved between stability under occasional misassociations, and latency under a real change of allegiance. The alternative way of reporting results through Dezert-Smarandache theory is studied under similar conditions, and shown to provide good results which are however more dependent on the unreliability, and less stable.
Keywords: Evidential reasoning, realistic data, Dempster-Shafer, Dezert-Smarandache
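For readers unfamiliar with Dempster-Shafer theory, the following minimal sketch shows the basic Dempster rule of combination for two mass functions. It does not implement the thresholded variant or the Dezert-Smarandache scheme studied in the paper. The representation (a dict from `frozenset` to mass) and the allegiance labels in the usage note are assumptions made for illustration.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass, with the
    masses summing to 1. Products whose focal sets do not intersect count as
    conflict and are removed by normalisation.
    """
    combined = {}
    conflict = 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            intersection = set_a & set_b
            product_mass = mass_a * mass_b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + product_mass
            else:
                conflict += product_mass
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    norm = 1.0 - conflict
    return {focal: mass / norm for focal, mass in combined.items()}
```

As a usage example, combine one report giving mass 0.6 to `frozenset({"friend"})` and 0.4 to the full frame `frozenset({"friend", "hostile"})` with a second report giving 0.5 to `frozenset({"hostile"})` and 0.5 to the full frame. The conflicting mass is 0.3, and after normalisation the fused masses are 3/7 for friend, 2/7 for hostile and 2/7 for the full frame.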