Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering Commons

Database

Research Collection School Of Computing and Information Systems

Articles 1 - 2 of 2

Full-Text Articles in Computer Engineering

Provable De-Anonymization Of Large Datasets With Sparse Dimensions, Anupam Datta, Divya Sharma, Arunesh Sinha Apr 2012

Research Collection School Of Computing and Information Systems

There is a significant body of empirical work on statistical de-anonymization attacks against databases containing micro-data about individuals, e.g., their preferences, movie ratings, or transaction data. Our goal is to analytically explain why such attacks work. Specifically, we analyze a variant of the Narayanan-Shmatikov algorithm that was used to effectively de-anonymize the Netflix database of movie ratings. We prove theorems characterizing mathematical properties of the database and the auxiliary information available to the adversary that enable two classes of privacy attacks. In the first attack, the adversary successfully identifies the individual about whom she possesses auxiliary information (an isolation attack). In the second attack, the adversary learns additional …


Splash: Systematic Proteomics Laboratory Analysis And Storage Hub, Siaw Ling Lo, You Tao, Qingsong Lin, Shashikant B. Joshi, Maxey Chung, Choy Leong Hew Mar 2006

Research Collection School Of Computing and Information Systems

In the field of proteomics, the increasing difficulty of unifying data formats, caused by differing platforms, instrumentation, and laboratory documentation systems, greatly hinders the verification, exchange, and comparison of experimental data. Therefore, it is essential to establish standard formats for every necessary aspect of proteomics data. One recently published data model is the proteomics experiment data repository [Taylor, C. F., Paton, N. W., Garwood, K. L., Kirby, P. D. et al., Nat. Biotechnol. 2003, 21, 247-254]. Compliant with this format, we developed the systematic proteomics laboratory analysis and storage hub (SPLASH) database system as an informatics infrastructure to support …