YOURPRIVACYPROTECTOR, A recommender system for privacy settings in social networks

Ensuring privacy of users of social networks is probably an unsolvable conundrum. At the same time, an informed use of the existing privacy options by the social network participants may alleviate, or even prevent, some of the more drastic privacy-averse incidents. Unfortunately, recent surveys show that an average user is either not aware of these options or does not use them, probably due to their perceived complexity. It is therefore reasonable to believe that tools assisting users with two tasks, 1) understanding their social network behavior in terms of their privacy settings and broad privacy categories, and 2) recommending reasonable privacy options, will be valuable for everyday privacy practice in a social network context. This paper presents YourPrivacyProtector, a recommender system that shows how simple machine learning techniques may provide useful assistance in these two tasks to Facebook users. We support our claim with empirical results from applying YourPrivacyProtector to two groups of Facebook users.


Greedy algorithms for prediction

In many prediction problems, it is not uncommon that the number of variables used to construct a forecast is of the same order of magnitude as the sample size, if not larger. We then face the problem of constructing a prediction in the presence of potentially large estimation error. Control of the estimation error is achieved either by selecting variables or by combining all the variables in some special way. This paper considers greedy algorithms to solve this problem. It is shown that the resulting estimators are consistent under weak conditions. In particular, the derived rates of convergence are either minimax or improve on the ones given in the literature, allowing for dependence and unbounded regressors. Some versions of the algorithms provide fast solutions to problems such as Lasso.
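
As a rough illustration of the variable-selection flavor of greedy prediction described above, the sketch below implements a generic pure greedy algorithm (componentwise L2 boosting) for linear regression: at each step it picks the single variable that most reduces the residual sum of squares and updates only that coefficient. It is not taken from the paper. The function name `pure_greedy`, the parameters `n_steps` and `shrinkage`, and the toy data are all assumptions made for illustration.

```python
def pure_greedy(X, y, n_steps=50, shrinkage=1.0):
    """Generic pure greedy regression (illustrative sketch, not the paper's code).

    X is a list of rows, y a list of targets. Returns one coefficient per column.
    """
    n, p = len(X), len(X[0])
    # Work column-wise: cols[j] is the j-th regressor across all observations.
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    norms = [sum(v * v for v in col) for col in cols]
    coef = [0.0] * p
    resid = list(y)
    for _ in range(n_steps):
        best_j, best_b, best_gain = None, 0.0, 0.0
        for j in range(p):
            if norms[j] == 0.0:
                continue  # an all-zero column cannot explain anything
            inner = sum(c * r for c, r in zip(cols[j], resid))
            gain = inner * inner / norms[j]  # drop in residual sum of squares
            if gain > best_gain:
                best_j, best_b, best_gain = j, inner / norms[j], gain
        if best_j is None:
            break  # no variable reduces the residual any further
        step = shrinkage * best_b
        coef[best_j] += step
        resid = [r - step * c for r, c in zip(resid, cols[best_j])]
    return coef


# Toy data (hypothetical): y = 2 * x0 - 1 * x2, with orthogonal columns.
X = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
y = [2, 2, 0, -1]
print(pure_greedy(X, y, n_steps=10))  # prints [2.0, 0.0, -1.0]
```

Setting `shrinkage` below 1 takes smaller steps and is one common way to trade speed for a more regularized fit; whether that corresponds to any of the variants analyzed in the paper is not stated in the abstract.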


Compressive Spectral Clustering

Spectral clustering has become a popular technique due to its high performance in many contexts. It comprises three main steps: create a similarity graph between N objects to cluster, compute the first k eigenvectors of its Laplacian matrix to define a feature vector for each object, and run k-means on these features to separate objects into k classes. Each of these three steps becomes computationally intensive for large N and/or k. We propose to speed up the last two steps based on recent results in the emerging field of graph signal processing: graph filtering of random signals, and random sampling of bandlimited graph signals. We prove that our method, with a gain in computation time that can reach several orders of magnitude, is in fact an approximation of spectral clustering, for which we are able to control the error. We test the performance of our method on artificial and real-world network data.
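
The three steps of standard spectral clustering listed above can be sketched as follows. This is only the uncompressed baseline, not the accelerated method the paper proposes. Several choices here are assumptions the abstract does not specify: the Gaussian similarity with bandwidth `sigma`, the normalized Laplacian with row normalization, the farthest-point initialization of k-means, and the function name `spectral_clustering` itself. The sketch also assumes every object has a nonzero degree in the similarity graph.

```python
import numpy as np


def spectral_clustering(points, k, sigma=1.0, n_iter=50):
    """Baseline spectral clustering (illustrative sketch under stated assumptions)."""
    X = np.asarray(points, dtype=float)
    n = X.shape[0]

    # Step 1: similarity graph, using a Gaussian kernel (assumed choice).
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Step 2: first k eigenvectors of the normalized Laplacian
    # L = I - D^{-1/2} W D^{-1/2}; each row of U is an object's feature vector.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    U = vecs[:, :k]
    U = U / np.linalg.norm(U, axis=1, keepdims=True)

    # Step 3: k-means (Lloyd iterations) with deterministic farthest-point init.
    chosen = [0]
    for _ in range(1, k):
        dist = ((U[:, None, :] - U[chosen][None, :, :]) ** 2).sum(axis=2).min(axis=1)
        chosen.append(int(dist.argmax()))
    centers = U[chosen]
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        dist = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = U[labels == c].mean(axis=0)
    return labels


# Toy data (hypothetical): two well-separated groups of three points each.
pts = [[0, 0], [0.1, 0], [0, 0.1], [5, 5], [5.1, 5], [5, 5.1]]
print(spectral_clustering(pts, k=2))
```

The dense N-by-N similarity matrix and the full eigendecomposition make this version cubic in N, which is the cost that motivates approximating the eigenvector and k-means steps for large N.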


Brane Brick Models and 2d (0,2) Triality

Quantum nonergodicity and fermion localization in a system with a single-particle mobility edge

On directional derivatives of Skorokhod maps in convex polyhedral domains

Scheduling of unit-length jobs with bipartite incompatibility graphs on four uniform machines

Discovering Neuronal Cell Types and Their Gene Expression Profiles Using a Spatial Point Process Mixture Model

Generate Image Descriptions based on Deep RNN and Memory Cells for Images Features

ASEP(q,j) converges to the KPZ equation

Fast Multiplier Methods to Optimize Non-exhaustive, Overlapping Clustering

Clustering action potential spikes: Insights on the use of overfitted finite mixture models and Dirichlet process mixture models

Characteristics of Visual Categorization of Long-Concatenated and Object-Directed Human Actions by a Multiple Spatio-Temporal Scales Recurrent Neural Network Model

Massively Multilingual Word Embeddings

Fantastic 4 system for NIST 2015 Language Recognition Evaluation

Locally stationary processes prediction by auto-regression

Exchangeable exogenous shock models

‘Almost-stable’ matchings in the Hospitals / Residents problem with Couples: An Integer Programming approach

Lifetime-Based Memory Management for Distributed Data Processing Systems

Wayfinding and cognitive maps for pedestrian models

Gysin maps, duality and Schubert classes

Counting configuration-free sets in groups

Counting spanning trees on fractal graphs and their asymptotic complexity

The Structure of $W_4$-Immersion-Free Graphs

Non-binary branching process and Non-Markovian exploration process

Computing with hardware neurons: spiking or classical? Perspectives of applied Spiking Neural Networks from the hardware side

Issues with the Smith-Wilson method

Graph parameters from symplectic group invariants

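Utilização de Grafos e Matriz de Similaridade na Sumarização Automática de Documentos Baseada em Extração de Frases [Using Graphs and a Similarity Matrix in Automatic Document Summarization Based on Sentence Extraction]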

Hierarchical expectation propagation for Bayesian aggregation of average data

Exponential extinction time of the contact process on rank-one inhomogeneous random graphs

From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification

Spectral properties of cographs

Compressive PCA on Graphs

The Wild Bootstrap for Multivariate Nelson-Aalen Estimators

Berry-Esseen’s bound and Cramér’s large deviation expansion for a supercritical branching process in a random environment

A short conceptual proof of Narayana’s path-counting formula

Region Based Approximation for High Dimensional Bayesian Network Models

Harmonic Grammar in a DisCo Model of Meaning

Large deviations for the Ornstein-Uhlenbeck process without tears

Variance-Reduced and Projection-Free Stochastic Optimization

The Spacey Random Walk: a Stochastic Process for Higher-order Data

A New Algorithm to Simulate the First Exit Times of a Vector of Brownian Motions, with an Application to Finance

Exchangeable Random Measures for Sparse and Modular Graphs with Overlapping Communities

Parallel Ordered Sets Using Join

Sequence Classification with Neural Conditional Random Fields

Mining Software Quality from Software Reviews: Research Trends and Open Issues

Reducing Runtime by Recycling Samples

Classification methods applied to credit scoring: A systematic review and overall comparison

Products of Differences in Prime Order Finite Fields

Oriented Book Embeddings

Exploiting the Structure: Stochastic Gradient Methods Using Raw Clusters