      Dissertation: Towards Privacy Preserving Micro-Data Analysis

      Date: 10 December
      Time: 15:00 - 18:00
      Location: Insikten, Portalen, University of Skövde

      Navoda Senavirathne defends her thesis "Towards Privacy Preserving Micro-Data Analysis: A Machine Learning Based Perspective under Prevailing Privacy Regulations".

      View on Zoom

      The dissertation takes place in Insikten, Portalen, but is also live-streamed on Zoom. Click the link below to watch the dissertation on Zoom.

      View the live stream on Zoom

      Abstract

      Machine learning (ML) has been employed in a wide variety of domains where micro-data (i.e., personal data) are used in the training process. Recent research has shown that ML models are vulnerable to privacy attacks that exploit their observable predictions and optimization information to extract sensitive information about the underlying data subjects. Models trained on micro-data therefore pose a distinct threat to the privacy of the data subjects. To mitigate these risks, privacy-preserving machine learning (PPML) techniques have been proposed in the literature. Existing PPML techniques are mainly based on differential privacy or on cryptographic methods. However, using these techniques for privacy preservation results in either poor predictive accuracy of the derived ML models or a high computational cost. They also operate under the assumption that raw data are available for training the ML models.
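
      To make the attack surface concrete, here is a minimal, hypothetical sketch of one of the simplest attacks of this kind, confidence-thresholding membership inference: an overfitted model tends to be more confident on its training records than on unseen ones. The dataset, model, and threshold are illustrative assumptions, not the setup studied in the thesis.

          # Minimal sketch of a confidence-thresholding membership inference
          # attack (a generic baseline, not the specific attacks studied in
          # the thesis). Dataset, model, and threshold are illustrative.
          import numpy as np
          from sklearn.datasets import load_breast_cancer
          from sklearn.ensemble import RandomForestClassifier
          from sklearn.model_selection import train_test_split

          X, y = load_breast_cancer(return_X_y=True)
          X_mem, X_non, y_mem, y_non = train_test_split(
              X, y, test_size=0.5, random_state=0)

          # Target model trained only on the "member" half of the data.
          target = RandomForestClassifier(n_estimators=100, random_state=0)
          target.fit(X_mem, y_mem)

          def true_label_confidence(model, X, y):
              # Probability the model assigns to each record's true label.
              proba = model.predict_proba(X)
              return proba[np.arange(len(y)), y]

          # The attacker observes only predictions: records on which the
          # model is unusually confident are guessed to be training members.
          threshold = 0.9  # a real attacker would calibrate this value
          guess_mem = true_label_confidence(target, X_mem, y_mem) > threshold
          guess_non = true_label_confidence(target, X_non, y_non) > threshold

          # Balanced attack accuracy: 0.5 means the attack learns nothing.
          attack_acc = (guess_mem.mean() + (1 - guess_non).mean()) / 2
          print(f"membership inference accuracy: {attack_acc:.2f}")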

      Due to stringent requirements on data protection and data publishing, it is plausible that micro-data are anonymized by the data controllers before being released for analysis. When anonymized data are available for ML model training, it is vital to understand the impact on ML utility and privacy. In the literature on data privacy, anonymization and PPML are often studied as two disconnected fields. We argue, however, that a natural synergy exists between the two that yields a myriad of benefits for data controllers as well as data subjects, in light of new privacy regulations, business requirements, and privacy risk factors. When anonymized data are used to train ML models, there is an intrinsic need to rethink the existing privacy-preserving mechanisms used in both data anonymization and PPML.
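
      For readers unfamiliar with anonymization, the following hedged sketch shows what a k-anonymity check on generalized micro-data might look like; the table, column names, and the value of k are invented for illustration. An ML model would then be trained on such a generalized table rather than on the raw records.

          # Minimal sketch of verifying k-anonymity on a toy micro-data
          # table. Column names, values, and k are illustrative assumptions.
          import pandas as pd

          df = pd.DataFrame({
              "age": ["30-39", "30-39", "40-49", "40-49", "40-49"],
              "zip": ["541**", "541**", "549**", "549**", "549**"],
              "diagnosis": ["A", "B", "A", "A", "B"],  # sensitive attribute
          })

          def is_k_anonymous(table, quasi_identifiers, k):
              # Each combination of quasi-identifier values must occur at
              # least k times, so every record is indistinguishable from
              # at least k - 1 other records.
              return table.groupby(quasi_identifiers).size().min() >= k

          print(is_k_anonymous(df, ["age", "zip"], k=2))  # True: groups of 2 and 3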

      One of the main contributions of this thesis is understanding the opportunities and challenges presented by data anonymization in an ML setting. During this exploration, we highlight how certain provisions of the General Data Protection Regulation (GDPR) can be in direct conflict with the interests of ML utility and privacy. Inspired by these findings, we then propose a novel anonymization technique based on probabilistic k-anonymity with characteristics amenable to both ML utility and privacy. Next, we introduce a privacy-preserving technique for ML model selection based on integral privacy that can inhibit the inferences adversaries draw about the training data, or its transformations over time, by selecting models with certain characteristics that increase the adversary's uncertainty. Moreover, we provide a rigorous characterization of a well-known privacy attack targeting ML models (i.e., membership inference) and identify limitations of existing methods, which can easily be manipulated to overstate or understate the privacy risk. Finally, we present a new membership inference attack model, based on activation-pattern anomaly detection, that overcomes these limitations while identifying membership with greater accuracy.
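
      The activation-pattern idea can be illustrated, very loosely, as follows. This is a hypothetical sketch of the general mechanics only (extract hidden-layer activations and score them with an off-the-shelf anomaly detector); it is not the attack model proposed in the thesis, and the network, detector, and dataset are all assumptions.

          # Loose sketch of membership inference via activation patterns:
          # extract hidden-layer activations and compare anomaly scores for
          # members vs. non-members. NOT the thesis's attack model; the
          # network, detector, and dataset are illustrative assumptions.
          import numpy as np
          from sklearn.datasets import load_breast_cancer
          from sklearn.ensemble import IsolationForest
          from sklearn.model_selection import train_test_split
          from sklearn.neural_network import MLPClassifier
          from sklearn.preprocessing import StandardScaler

          X, y = load_breast_cancer(return_X_y=True)
          X = StandardScaler().fit_transform(X)
          X_mem, X_non, y_mem, _ = train_test_split(
              X, y, test_size=0.5, random_state=0)

          # Target network trained only on the "member" half of the data.
          net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                              random_state=0).fit(X_mem, y_mem)

          def hidden_activations(model, X):
              # First hidden layer (ReLU), recomputed from fitted weights.
              return np.maximum(0, X @ model.coefs_[0] + model.intercepts_[0])

          # Model "typical" activations from records known to be outside the
          # training set, then compare how members score against that baseline.
          detector = IsolationForest(random_state=0)
          detector.fit(hidden_activations(net, X_non))
          print("mean score, members:    ",
                detector.score_samples(hidden_activations(net, X_mem)).mean())
          print("mean score, non-members:",
                detector.score_samples(hidden_activations(net, X_non)).mean())
          # Any gap between the two averages is the signal such an attack
          # would exploit; its size depends on how much the target overfits.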

      Together, we believe these contributions will broaden the research community's understanding, not only of the technical aspects of preserving privacy in ML but also of its interplay with existing privacy regulations such as the GDPR. It is hoped that these findings will shape our journey of knowledge discovery in the era of big data.

      Opponent

      Sonja Buchegger, Professor, KTH Royal Institute of Technology, Sweden

      Supervisors

      Vicenç Torra, Professor, Umeå University, Sweden
      Maria Riveiro, Professor, Jönköping University and University of Skövde, Sweden

      Committee

      Alina Campan, Associate Professor, Northern Kentucky University, USA
      Sébastien Gambs, Professor, Université du Québec à Montréal, Canada
      Simone Fischer-Hübner, Professor, Karlstad University, Sweden

      Contact

      PhD Student, Informatics

      Published: 11/10/2021
      Edited: 11/10/2021
      Responsible: webmaster@his.se