Final Report Summary - PASS (Privacy architectures for system services)
Novice users choose from pre-existing privacy policies, but there should also be a mechanism for specifying their own. The system also distinguishes between different privacy requirements for different stages of the data life-cycle. For example, it might be unnecessary to scrub data files that are 'idle' on the user's computer. However, the moment a data file is used as a mail attachment, it is sanitised appropriately. Finally, intended use also affects how the privacy service works. We define different privacy levels for different uses. When the user communicates with a trusted service (e.g. their bank or a colleague), a different level of information scrubbing applies than when communicating with an untrusted service (e.g. an online forum or a search engine).
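The interplay between life-cycle stage and trust level described above can be illustrated with a minimal sketch. The names `Stage`, `Trust` and `scrub_level`, and the three level labels, are assumptions made for illustration only; they are not taken from the PASS implementation.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical data life-cycle stages."""
    IDLE = "idle"          # file resting on the user's computer
    ATTACHED = "attached"  # file about to leave as a mail attachment


class Trust(Enum):
    """Hypothetical trust classes for the receiving service."""
    TRUSTED = "trusted"      # e.g. the user's bank or a colleague
    UNTRUSTED = "untrusted"  # e.g. an online forum or a search engine


def scrub_level(stage: Stage, trust: Trust) -> str:
    """Return an illustrative scrubbing level for a file.

    Idle files are left alone; files that leave the machine are
    scrubbed more aggressively when the recipient is untrusted.
    """
    if stage is Stage.IDLE:
        return "none"
    if trust is Trust.TRUSTED:
        return "light"
    return "full"
```

A user-defined policy could replace `scrub_level` with a lookup table edited by the user, while novice users keep a pre-existing default such as the one sketched here.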
Work and results:
(1) Efficient detection of information leaks
We developed new algorithms for more efficient detection of information leaks. Our approach intelligently removes common phrases and non-sensitive phrases from the fingerprinting process. Non-sensitive phrases are identified by looking at the publicly available documents of the organisation that we want to protect from information leaks, and common phrases are identified with the help of a search engine. In this way, our solution both accelerates leak detection and increases the accuracy of the result.
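The idea of excluding common and non-sensitive phrases from fingerprinting can be sketched as follows. This is a simplified illustration, not the project's algorithm: it assumes word-level 3-grams as "phrases", SHA-256 as the hash, and a pre-computed set of common phrases standing in for the search-engine lookup. The function names `shingles`, `fingerprint` and `leak_score` are hypothetical.

```python
import hashlib


def shingles(text: str, n: int = 3) -> set:
    """Return the set of lower-cased word n-grams in the text."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def _digest(phrase: str) -> str:
    """Hash one phrase so that the fingerprint stores no plain text."""
    return hashlib.sha256(phrase.encode("utf-8")).hexdigest()


def fingerprint(sensitive_doc: str, public_docs: list,
                common_phrases: set, n: int = 3) -> set:
    """Fingerprint a sensitive document, skipping excluded phrases.

    Phrases seen in the organisation's public documents are treated as
    non-sensitive; `common_phrases` is assumed to have been produced
    beforehand (e.g. from search-engine hit counts).
    """
    excluded = set(common_phrases)
    for doc in public_docs:
        excluded |= shingles(doc, n)
    kept = shingles(sensitive_doc, n) - excluded
    return {_digest(p) for p in kept}


def leak_score(outgoing: str, fp: set, n: int = 3) -> float:
    """Fraction of fingerprint phrases that appear in outgoing text."""
    if not fp:
        return 0.0
    hashes = {_digest(p) for p in shingles(outgoing, n)}
    return len(hashes & fp) / len(fp)
```

Because excluded phrases never enter the fingerprint, the fingerprint is smaller (fewer comparisons) and outgoing text that only repeats public or common wording scores zero, which is one way both the speed and the accuracy benefits could arise.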
(2) High-speed string and regular expression matching
We developed a system that performs string and regular expression matching on graphics processing units (GPUs). The system exploits the highly parallel nature of GPUs to achieve unprecedented performance, in the order of tens of gigabits per second.
(3) Privacy-preserving systems
We are investigating security and privacy issues in cloud computing environments. Specifically, we developed a runtime system, along with cryptographic protocols, to enable secure delegation of access rights and privacy-preserving data sharing in biomedical data clouds.
(4) Global-scale data-mining for private data
We have conducted large-scale research into data leakage from documents on the World Wide Web. We have looked at 15 million online documents to explore what kind of data and meta-data is being leaked. We have also exploited the meta-data to create graphs of user communities that may otherwise be hard to detect.
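As an illustration of how document meta-data can reveal user communities, the sketch below links user names that co-occur in the meta-data of the same document and then groups them into connected components. The choice of meta-data fields, the co-occurrence rule and the function names `build_graph` and `communities` are assumptions for this example; the report does not specify the method actually used.

```python
from collections import defaultdict


def build_graph(docs):
    """Link every pair of names found in the same document's meta-data.

    `docs` is an iterable of sets of names (e.g. author and
    last-modified-by fields), one set per document.
    """
    graph = defaultdict(set)
    for names in docs:
        for a in names:
            graph[a]  # make sure isolated names still become nodes
            for b in names:
                if a != b:
                    graph[a].add(b)
    return graph


def communities(graph):
    """Return the connected components of the graph as a list of sets."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node] - seen)
        groups.append(group)
    return groups
```

Two users who never appear in the same document can still end up in one community through a shared collaborator, which is the kind of indirect link that is hard to spot by reading documents individually.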
(5) Crawling and identification of possibly private data on public sources
We have built crawlers that collect personally identifiable information present on public websites, such as social security numbers, tax IDs, phone numbers, etc.
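The identification step of such a crawler can be sketched with regular expressions applied to the text of a fetched page. The patterns below are deliberately simplified assumptions (US-style social security and phone number layouts only) and the names `PII_PATTERNS` and `find_pii` are hypothetical; a real deployment would need stricter validation to limit false positives.

```python
import re

# Simplified, illustrative patterns; not the project's actual rules.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def find_pii(text: str) -> list:
    """Return (label, matched_text) pairs for every pattern hit."""
    hits = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group()))
    return hits
```

A crawler would call `find_pii` on the extracted text of each page and record the page address together with the labels found, rather than storing the matched values themselves.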
(6) Forensic analysis tools
We have built tools to analyse data resident on the local disk (logs, events, raw data, etc.) as well as online activity on the websites of various services (Facebook, Gmail, etc.).
(7) Bibliographical survey
A bibliographical survey on issues of security and privacy has been ongoing for the entire duration of the project.
(8) Publicity, knowledge dissemination, and education and training
We have taken every chance to publish our work at internationally recognised conferences and in journals. Furthermore, we have given talks promoting the work done in the PASS project. We have also designed two new classes, an advanced undergraduate class and a graduate-level class, to teach university students about issues of security and privacy.
(9) Research in related topics in security and privacy
During the project, we have also taken the opportunity to look at a variety of other issues in the security and privacy of computer and network systems.