Nyström methods have been popular techniques for scalable kernel-based learning. They approximate explicit, low-dimensional feature mappings for kernel functions from pairwise comparisons with the training data. However, Nyström methods are generally applied without the supervision provided by the training labels in classification/regression problems. This leads to pairwise comparisons with randomly chosen training samples in the model. In contrast, this work studies a supervised Nyström method that chooses the subsets of samples that are critical for the success of the Machine Learning model. In particular, we select the Nyström support vectors via the negative margin criterion, and create explicit feature maps that are more suitable for the classification task on the data. Experimental results on six datasets show that, without increasing the complexity over unsupervised techniques, our method can significantly improve the classification performance achieved via kernel approximation methods and reduce the number of features needed to reach or exceed the performance of the full-dimensional kernel machines.
Traditionally, differential privacy mechanism design has been tailored for a scalar-valued query function. Although many mechanisms such as the Laplace and Gaussian mechanisms can be extended to a matrix-valued query function by adding i.i.d. noise to each element of the matrix, this method is often sub-optimal as it forfeits an opportunity to exploit the structural characteristics typically associated with matrix analysis. In this work, we consider the design of a differential privacy mechanism specifically for a matrix-valued query function. The proposed solution is to utilize matrix-variate noise, as opposed to the traditional scalar-valued noise. In particular, we propose a novel differential privacy mechanism called the Matrix-Variate Gaussian (MVG) mechanism, which adds matrix-valued noise drawn from a matrix-variate Gaussian distribution. We prove that the MVG mechanism preserves (ϵ,δ)-differential privacy, and show that it allows the structural characteristics of the matrix-valued query function to be exploited naturally. Furthermore, due to the multi-dimensional nature of the MVG mechanism and the matrix-valued query, we introduce the concept of directional noise, which can be utilized to mitigate the impact the noise has on the utility of the query. Finally, we demonstrate the performance of the MVG mechanism and the advantages of directional noise using three matrix-valued queries on three privacy-sensitive datasets. We find that the MVG mechanism notably outperforms four previous state-of-the-art approaches, and provides comparable utility to the non-private baseline. Our work thus presents a promising prospect for both future research and implementation of differential privacy for matrix-valued query functions.
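For contrast with the matrix-variate approach, the element-wise baseline mentioned above (i.i.d. noise added to each matrix entry) can be sketched as follows. This is a minimal illustration of the standard Laplace mechanism, not of the MVG mechanism itself; the function names and the `l1_sensitivity` parameter are assumptions introduced here for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism_matrix(query_output, l1_sensitivity, epsilon, seed=None):
    """Add i.i.d. Laplace(0, l1_sensitivity / epsilon) noise to every entry.

    `query_output` is a list of rows; `l1_sensitivity` is assumed to be the
    L1 sensitivity of the whole matrix-valued query (hypothetical input).
    """
    if epsilon <= 0 or l1_sensitivity <= 0:
        raise ValueError("epsilon and l1_sensitivity must be positive")
    rng = random.Random(seed)
    scale = l1_sensitivity / epsilon
    return [[x + laplace_noise(scale, rng) for x in row] for row in query_output]


# Example: perturb a 2x2 query output with epsilon = 0.5.
noisy = laplace_mechanism_matrix([[1.0, 2.0], [3.0, 4.0]], 1.0, 0.5, seed=0)
```

Every entry receives noise of the same scale regardless of the matrix structure, which is the limitation the abstract points to.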
The rapid rise of IoT and Big Data can facilitate the use of data to enhance our quality of life. However, the omnipresent and sensitive nature of data can simultaneously generate privacy concerns. Hence, there is a strong need to develop techniques that ensure the data serve the intended purposes without being usable for prying into one's sensitive information. We address this challenge via utility-maximizing lossy compression of data. Our techniques combine the mathematical rigor of Kernel Learning models with the structural richness of Deep Neural Networks, and lead to the novel Multi-Kernel Learning and Hybrid Learning models. We systematically construct the proposed models in progressive stages, as motivated by the cumulative improvement in the experimental results from the two previously non-intersecting regimes, namely, Kernel Learning and Deep Neural Networks. The final experimental results of the three proposed models on three mobile sensing datasets show that, not only are our methods able to improve the utility prediction accuracies, but they can also cause sensitive predictions to perform nearly as badly as random guessing, resulting in a win-win situation in terms of utility and privacy.
Pattern recognition on big data can be challenging for kernel machines as the complexity grows with the square of the number of training samples. In this work, we overcome this hurdle via a pre-processing step that removes outlying data samples. This approach removes less-informative data samples and trains the kernel machines only with the remaining data, and hence directly reduces the complexity by reducing the number of training samples. To enhance the classification performance, the outlier removal process is done such that the discriminant information of the data is mostly intact. This is achieved via the novel Outlier-Removal Discriminant Information (ORDI) metric, which measures the contribution of each sample toward the discriminant information of the dataset. Hence, the ORDI metric can be used together with the simple filter method to effectively remove insignificant outliers to both reduce the computational cost and enhance the classification performance. We experimentally show on two real-world datasets at the sample removal ratio of 0.2 that, with outlier removal via ORDI, we can simultaneously (1) improve the accuracy of the classifier by 1%, and (2) reduce the total running time by 1.5x and 2x on the two datasets. Hence, ORDI can provide a win-win situation in this performance-complexity tradeoff of the kernel machines for big data analysis.
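The filter step described above can be sketched generically: rank samples by a per-sample score and drop the lowest-ranked fraction. The ORDI metric itself is not defined in this abstract, so the scores below are hypothetical placeholders, and the convention that a lower score means a less informative sample is an assumption.

```python
def filter_by_score(samples, scores, removal_ratio):
    """Drop the `removal_ratio` fraction of samples with the lowest scores.

    `scores` stands in for a per-sample informativeness metric
    (hypothetical values; lower is assumed to mean less informative).
    """
    if len(samples) != len(scores):
        raise ValueError("samples and scores must have the same length")
    if not 0 <= removal_ratio < 1:
        raise ValueError("removal_ratio must be in [0, 1)")
    n_remove = int(len(samples) * removal_ratio)
    # Indices sorted from lowest to highest score.
    order = sorted(range(len(samples)), key=lambda i: scores[i])
    dropped = set(order[:n_remove])
    # Preserve the original ordering of the kept samples.
    return [s for i, s in enumerate(samples) if i not in dropped]


# Example with a removal ratio of 0.2, as in the experiments mentioned above.
kept = filter_by_score(list("abcde"), [0.5, 0.1, 0.9, 0.3, 0.7], 0.2)
```

The classifier would then be trained on `kept` only, which is where the reduction in training cost comes from.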
Differential privacy mechanism design has traditionally been tailored for a scalar-valued query function. Although many mechanisms such as the Laplace and Gaussian mechanisms can be extended to a matrix-valued query function by adding i.i.d. noise to each element of the matrix, this method is often suboptimal as it forfeits an opportunity to exploit the structural characteristics typically associated with matrix analysis. To address this challenge, we propose a novel differential privacy mechanism called the Matrix-Variate Gaussian (MVG) mechanism, which adds a matrix-valued noise drawn from a matrix-variate Gaussian distribution, and we rigorously prove that the MVG mechanism preserves (ϵ,δ)-differential privacy. Furthermore, we introduce the concept of directional noise made possible by the design of the MVG mechanism. Directional noise allows the impact of the noise on the utility of the matrix-valued query function to be moderated. Finally, we experimentally demonstrate the performance of our mechanism using three matrix-valued queries on three privacy-sensitive datasets. We find that the MVG mechanism notably outperforms four previous state-of-the-art approaches, and provides comparable utility to the non-private baseline. Our work thus presents a promising prospect for both future research and implementation of differential privacy for matrix-valued query functions.
A key challenge facing the design of differential privacy in the non-interactive setting is to maintain the utility of the released data. To overcome this challenge, we utilize the Diaconis-Freedman-Meckes (DFM) effect, which states that most projections of high-dimensional data are nearly Gaussian. Hence, we propose the RON-Gauss model, which leverages the novel combination of dimensionality reduction via random orthonormal (RON) projection and the Gaussian generative model for synthesizing differentially-private data. We analyze how RON-Gauss benefits from the DFM effect, and present multiple algorithms for a range of machine learning applications, including both unsupervised and supervised learning. Furthermore, we rigorously prove that (a) our algorithms satisfy the strong ϵ-differential privacy guarantee, and (b) RON projection can lower the level of perturbation required for differential privacy. Finally, we illustrate the effectiveness of RON-Gauss under three common machine learning applications -- clustering, classification, and regression -- on three large real-world datasets. Our empirical results show that (a) RON-Gauss outperforms previous approaches by up to an order of magnitude, and (b) the loss in utility compared to the non-private real data is small. Thus, RON-Gauss can serve as a key enabler for real-world deployment of privacy-preserving data release.
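As an illustration of the projection step only, a random orthonormal projection can be sketched by orthonormalizing Gaussian random vectors. The construction below (modified Gram-Schmidt on Gaussian draws) is one common way to obtain such a matrix and is an assumption here; it does not cover the Gaussian generative model or the privacy noise, and the function names are hypothetical.

```python
import math
import random


def random_orthonormal_rows(p, d, seed=None):
    """Return p orthonormal row vectors in R^d (p <= d)."""
    if not 0 < p <= d:
        raise ValueError("need 0 < p <= d")
    rng = random.Random(seed)
    basis = []
    while len(basis) < p:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Modified Gram-Schmidt: remove components along earlier rows.
        for b in basis:
            dot = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - dot * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-10:  # re-draw in the degenerate case
            basis.append([vi / norm for vi in v])
    return basis


def project(x, basis):
    """Project a d-dimensional sample onto the p orthonormal rows."""
    return [sum(bi * xi for bi, xi in zip(b, x)) for b in basis]


# Example: reduce a 10-dimensional sample to 3 dimensions.
W = random_orthonormal_rows(3, 10, seed=1)
z = project([1.0] * 10, W)
```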
In machine learning, feature engineering has been a pivotal stage in building a high-quality predictor. In particular, this work explores the multiple Kernel Discriminant Component Analysis (mKDCA) feature-map and its variants. However, seeking the right subset of kernels for the mKDCA feature-map can be challenging. Therefore, we consider the problem of kernel selection, and propose an algorithm based on Differential Mutual Information (DMI) and incremental forward search. DMI serves as an effective metric for selecting kernels, as is theoretically supported by mutual information and Fisher's discriminant analysis. Incremental forward search, in turn, plays a role in removing redundancy among kernels. Finally, we illustrate the potential of the method via an application in privacy-aware classification, and show on three mobile-sensing datasets that selecting an effective set of kernels for mKDCA feature-maps can enhance the utility classification performance while successfully preserving data privacy. Specifically, the results show that the proposed DMI forward search method can perform better than the state-of-the-art, and, with much smaller computational cost, can perform as well as the optimal, yet computationally expensive, exhaustive search.
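The incremental forward search described above can be sketched as a greedy loop that adds, at each step, the kernel that most improves a subset score, and stops when no addition helps. The scoring function below is a toy stand-in with made-up relevance and redundancy values; it is not the DMI metric, and all names are hypothetical.

```python
def forward_search(candidates, score, max_size=None):
    """Greedily grow a subset, adding the candidate that most improves `score`."""
    selected = []
    remaining = list(candidates)
    best = score(selected)
    limit = len(remaining) if max_size is None else max_size
    while remaining and len(selected) < limit:
        trials = [(score(selected + [c]), c) for c in remaining]
        new_best, choice = max(trials, key=lambda t: t[0])
        if new_best <= best:
            break  # no remaining candidate improves the score
        selected.append(choice)
        remaining.remove(choice)
        best = new_best
    return selected, best


# Toy score (illustrative values only): individual relevance, minus a
# pairwise redundancy penalty, minus a small cost per selected kernel.
relevance = {"rbf_1": 0.9, "rbf_2": 0.85, "poly_2": 0.6, "linear": 0.1}
redundancy = {frozenset(("rbf_1", "rbf_2")): 0.8}


def toy_score(subset):
    total = sum(relevance[k] for k in subset)
    for i, a in enumerate(subset):
        for b in subset[i + 1:]:
            total -= redundancy.get(frozenset((a, b)), 0.0)
    return total - 0.2 * len(subset)


selected, best = forward_search(relevance.keys(), toy_score)
```

In this toy setting the search keeps one of the two near-duplicate RBF kernels and skips the other, which mirrors the redundancy-removal role the abstract attributes to forward search. Each step evaluates only the remaining candidates, so the number of score evaluations grows quadratically with the number of kernels rather than exponentially as in exhaustive search.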
The quest for better data analysis and artificial intelligence has led to more and more data being collected and stored. As a consequence, more data are exposed to malicious entities. This paper examines the problem of privacy in machine learning for classification. We utilize the Ridge Discriminant Component Analysis (RDCA) to desensitize data with respect to a privacy label. Based on five experiments, we show that desensitization by RDCA can effectively protect privacy (i.e. low accuracy on the privacy label) with a small loss in utility. On the HAR and CMU Faces datasets, the use of desensitized data results in random-guess-level accuracies for privacy at a cost of average drops in utility accuracy of 5.14% and 0.04%, respectively. For the Semeion Handwritten Digit dataset, accuracies of the privacy-sensitive digits are almost zero, while the accuracies for the utility-relevant digits drop by 7.53% on average. This presents a promising solution to the problem of privacy in machine learning for classification.
As analytic tools become more powerful, and more data are generated on a daily basis, the issue of data privacy arises. This leads to the study of the design of privacy-preserving machine learning algorithms. Given two objectives, namely, utility maximization and privacy-loss minimization, this work is based on two previously non-intersecting regimes: Compressive Privacy and the multi-kernel method. Compressive Privacy is a privacy framework that employs a utility-preserving lossy-encoding scheme to protect the privacy of the data, while the multi-kernel method is a kernel-based machine learning regime that explores the idea of using multiple kernels for building better predictors. In relation to the neural-network architecture, the multi-kernel method can be described as a network with two hidden layers whose width is proportional to the number of kernels. The proposed compressive multi-kernel method consists of two stages: the compression stage and the multi-kernel stage. The compression stage follows the Compressive Privacy paradigm to provide the desired privacy protection. Each kernel matrix is compressed with a lossy projection matrix derived from the Discriminant Component Analysis (DCA). The multi-kernel stage uses the signal-to-noise ratio (SNR) score of each kernel to non-uniformly combine multiple compressive kernels. The proposed method is evaluated on two mobile-sensing datasets, MHEALTH and HAR, where activity recognition is defined as utility and person identification is defined as privacy. The results show that the compression regime is successful in privacy preservation, as the privacy classification accuracies are almost at the random-guess level in all experiments. On the other hand, the novel SNR-based multi-kernel method improves utility classification accuracy over the state-of-the-art on both datasets. These results indicate a promising direction for research in privacy-preserving machine learning.
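The non-uniform combination in the multi-kernel stage can be sketched as a weighted sum of kernel matrices. The abstract states only that SNR scores drive the weighting; making the weights directly proportional to the scores, as below, is an assumption for illustration, and the function name is hypothetical.

```python
def combine_kernels(kernels, snr_scores):
    """Weighted sum of equally sized square kernel matrices.

    Weights are the SNR scores normalized to sum to one (an assumed
    weighting rule; the exact rule is not given in the abstract).
    """
    if not kernels or len(kernels) != len(snr_scores):
        raise ValueError("need one score per kernel")
    if any(s < 0 for s in snr_scores) or sum(snr_scores) <= 0:
        raise ValueError("scores must be non-negative with a positive sum")
    total = sum(snr_scores)
    weights = [s / total for s in snr_scores]
    n = len(kernels[0])
    combined = [
        [sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]
    return combined, weights


# Example: two 2x2 kernel matrices with hypothetical SNR scores 3 and 1.
K1 = [[1.0, 0.0], [0.0, 1.0]]
K2 = [[1.0, 1.0], [1.0, 1.0]]
K, w = combine_kernels([K1, K2], [3.0, 1.0])
```

A non-negative weighted sum of positive semidefinite kernel matrices is itself positive semidefinite, so the combined matrix remains a valid kernel.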
In the internet era, the data being collected on consumers like us are growing exponentially and attacks on our privacy are becoming a real threat. To better assure our privacy, it is safer to let the data owner control the data to be uploaded to the network, as opposed to taking chances with the data servers or third parties. To this end, we propose a privacy-preserving technique, named Compressive Privacy (CP), to enable the data creator to compress data via collaborative learning, so that the compressed data uploaded onto the internet will be useful only for the intended utility and will not be easily diverted to malicious applications.
For data in a high-dimensional feature vector space, a common approach to data compression is dimension reduction or, equivalently, subspace projection. The most prominent tool is Principal Component Analysis (PCA). For unsupervised learning, PCA can best recover the original data given a specific reduced dimensionality. However, in a supervised learning environment, it is more effective to adopt a supervised PCA, known as the Discriminant Component Analysis (DCA), in order to maximize the discriminant capability.
The DCA subspace analysis embraces two different subspaces. The signal subspace components of DCA are associated with the discriminant distance/power (related to the classification effectiveness), while the noise subspace components of DCA are tightly coupled with the recoverability and/or privacy protection. This paper will present three DCA-related data compression methods useful for privacy-preserving applications.
Utility-driven DCA: Because the rank of the signal subspace is limited by the number of classes, DCA can effectively support classification using a relatively small dimensionality (i.e. high compression).
Desensitized PCA: Incorporating a signal-subspace ridge into DCA leads to a variant that is especially effective for extracting privacy-preserving components. In this case, the eigenvalues of the noise-space are made to become insensitive to the privacy labels and are ordered according to their corresponding component powers.
Desensitized K-means/SOM: Since the revelation of the K-means or SOM cluster structure could leak sensitive information, it will be safer to perform K-means or SOM clustering on the desensitized PCA subspace.
Over the past decades, face recognition has been a problem of critical interest in the machine learning and signal processing communities. However, conventional approaches such as eigenfaces do not protect the privacy of user data, which is emerging as an important design consideration in today's society. In this work, we leverage a supervised-learning subspace projection method called Discriminant Component Analysis (DCA) for privacy-preserving face recognition. By projecting the data onto the lower-dimensional signal subspace prescribed by DCA, high face recognition performance is achievable without compromising the privacy of the data owners. We evaluate our approach on three image datasets: Yale, Olivetti, and Glasses, the last of which is derived from the former two. Our approach can serve as a key enabler for real-world deployment of privacy-preserving face recognition applications, and provides a promising direction for researchers and the private sector.
Synthetic biology is facilitating novel methods and components to build in vivo and in vitro circuits to better understand and re-engineer biological networks. Circadian oscillators serve as molecular clocks that govern several important cellular processes such as cell division and apoptosis. Hence, successful demonstration of synthetic oscillators has become a primary design target for many synthetic biology endeavors. Recently, three synthetic transcriptional oscillators were demonstrated by Kim and Winfree utilizing a modular architecture of synthetic gene analogues and a few enzymes. However, the periods and amplitudes of the synthetic oscillators were sensitive to initial conditions and allowed limited tunability. In addition, because the system was closed, the oscillations were observed to die out after a certain period of time. To increase the tunability and robustness of synthetic biochemical oscillators in the face of disturbances and modeling uncertainties, a control-theoretic approach for real-time adjustment of oscillator behaviors would be required. In this paper, assuming an open system implementation is feasible, we demonstrate how dynamic inversion techniques can be used to synthesize the required controllers.
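The idea behind dynamic inversion can be illustrated on a toy scalar system of the form dx/dt = f(x) + g(x)u: the controller cancels the known dynamics and imposes a desired error behavior. The plant below (f(x) = -x^3, g(x) = 1 + x^2), the sinusoidal reference, and the gain are all assumptions chosen for illustration; this is not the biochemical oscillator model of the paper.

```python
import math


def simulate_dynamic_inversion(k=5.0, x0=1.0, dt=1e-3, t_end=5.0):
    """Track r(t) = sin(t) on the toy plant dx/dt = f(x) + g(x) * u."""

    def f(x):
        return -x ** 3  # hypothetical drift term

    def g(x):
        return 1.0 + x * x  # hypothetical input gain, never zero

    x, t = x0, 0.0
    for _ in range(int(round(t_end / dt))):
        r, r_dot = math.sin(t), math.cos(t)
        # Desired closed-loop dynamics: de/dt = -k * e, with e = x - r.
        v = r_dot - k * (x - r)
        # Invert the plant so that dx/dt equals v.
        u = (v - f(x)) / g(x)
        # Forward-Euler integration step.
        x += dt * (f(x) + g(x) * u)
        t += dt
    return x, math.sin(t)


# After 5 time units the state should closely follow the reference.
x_final, r_final = simulate_dynamic_inversion()
```

Because the inversion divides by g(x), the sketch relies on g being nonzero everywhere and on f and g being known exactly; handling modeling uncertainty would require additional robustness measures beyond this sketch.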