Machine learning (ML) is one of the most common methods for analysing personal data. It is used in domains such as e-commerce, health care and financial services. But research has shown that sensitive personal data risks being leaked in the process. Navoda Senavirathne, PhD student at the University of Skövde, aims with her thesis to broaden our understanding of the privacy vulnerabilities in machine learning, and proposes suitable mitigation strategies for them.
Publishing personal data without applying any data protection mechanism can violate the privacy of the individuals concerned. Data controllers therefore apply various data protection mechanisms, such as data anonymisation, to personal data before it is published. But it is equally important to ensure that the results of the data analysis methods applied to personal data preserve the privacy of the underlying individuals.
ML is one of the most widely used data analysis techniques: it trains computers to recognise and learn patterns from data in order to solve complex tasks. It is applied in a variety of domains where collected personal data is used to train the ML models. But recent research shows that ML models can, through their outputs, leak privacy-sensitive information about the underlying personal data.
This means that ML models can introduce privacy vulnerabilities that can be exploited to extract sensitive information about the individuals whose data was used for model training. It is therefore important to understand the privacy vulnerabilities in ML before personal data is used for model training.
Aims to broaden our understanding of vulnerabilities
Navoda Senavirathne has in her research aimed to broaden our understanding of the privacy vulnerabilities in ML, while proposing suitable mitigation strategies for them. She has developed a privacy attack model that significantly outperforms the existing attack models for exploiting the privacy vulnerabilities in ML. This shows that these nascent privacy risks are no longer merely theoretical, but practical.
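To give a sense of what such a privacy attack looks like in practice, the sketch below illustrates a generic membership inference attack, a well-known class of privacy attack against ML models. It is a minimal illustration under assumed details (scikit-learn, synthetic data, a simple confidence-threshold decision rule), not a reproduction of Senavirathne's attack model, which is described in the thesis.

```python
# A minimal sketch of a generic membership inference attack. This only
# illustrates the general idea, NOT the attack model from the thesis.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Train a target model on "private" records (synthetic data here).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
target = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# A simple confidence-threshold attack: models tend to be more confident
# on records they were trained on, so high confidence is taken as
# evidence that a record was a member of the training set.
def predict_membership(model, X, threshold=0.9):
    confidence = model.predict_proba(X).max(axis=1)
    return confidence >= threshold  # True = "was in the training data"

# Members should be flagged more often than non-members; the gap between
# the two rates indicates how much membership information the model leaks.
member_rate = predict_membership(target, X_train).mean()
non_member_rate = predict_membership(target, X_test).mean()
print(f"flagged as members: train={member_rate:.2f}, test={non_member_rate:.2f}")
```

The wider the gap between the two rates, the more the model's outputs reveal about which individuals' records it was trained on.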
“I have also studied data anonymisation as a potential mitigation strategy against existing privacy attack models, while emphasising the benefits of data anonymisation for both organisations and individuals. In addition, I highlight some areas of the GDPR that are vague and in conflict with the utility and privacy aspects of ML, and which legislators must therefore rethink,” says Navoda Senavirathne.
Helping those who handle personal data
In her thesis she also critically analyses the challenges of adapting the most common anonymisation methods to ML and proposes a refined data anonymisation method that works in the ML context. Through systematic experiments, she shows that existing data anonymisation methods reduce the privacy risks for ML models only under certain conditions.
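As a concrete point of reference, k-anonymity is among the most common anonymisation models for microdata. The sketch below checks whether a table satisfies k-anonymity for a hypothetical choice of quasi-identifier columns; it illustrates only this baseline idea, not the refined method proposed in the thesis.

```python
# A minimal sketch of a k-anonymity check, one of the most common data
# anonymisation models. The column names are hypothetical examples.
import pandas as pd

def is_k_anonymous(df, quasi_identifiers, k):
    """A table is k-anonymous if every combination of quasi-identifier
    values is shared by at least k records, so no individual can be
    singled out by those attributes alone."""
    group_sizes = df.groupby(quasi_identifiers).size()
    return bool((group_sizes >= k).all())

# Hypothetical example: generalised age and postcode act as quasi-identifiers.
records = pd.DataFrame({
    "age":       ["30-39", "30-39", "40-49", "40-49"],
    "postcode":  ["541**", "541**", "541**", "541**"],
    "diagnosis": ["A", "B", "A", "C"],  # sensitive attribute, not generalised
})
print(is_k_anonymous(records, ["age", "postcode"], k=2))  # True
```

In line with the thesis's findings, satisfying a check like this does not by itself guarantee that an ML model trained on the anonymised data is safe; the privacy risk is reduced only under certain conditions.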
“These findings inspired me to develop an approach for privacy-preserving ML model selection. I believe that my research will make it easier for those who work with personal data to train useful ML models for knowledge extraction purposes, while ensuring the privacy of the individuals,” concludes Navoda Senavirathne.
Navoda Senavirathne defends her thesis “Towards Privacy Preserving Micro-Data Analysis: A Machine Learning Based Perspective under Prevailing Privacy Regulations” on Friday 10 December at the University of Skövde.