An almost sure KPZ relation for SLE and Brownian motion

Convex Regularization for High-Dimensional Tensor Regression

In this paper we present a general convex optimization approach for solving high-dimensional tensor regression problems under low-dimensional structural assumptions. We consider using convex and \emph{weakly decomposable} regularizers assuming that the underlying tensor lies in an unknown low-dimensional subspace. Within our framework, we derive general risk bounds of the resulting estimate under fairly general dependence structure among covariates. Our framework leads to upper bounds in terms of two very simple quantities, the Gaussian width of a convex set in tensor space and the intrinsic dimension of the low-dimensional tensor subspace. These general bounds provide useful upper bounds on rates of convergence for a number of fundamental statistical models of interest including multi-response regression, vector auto-regressive models, low-rank tensor models and pairwise interaction models. Moreover, in many of these settings we prove that the resulting estimates are minimax optimal.

An attractor neural network architecture with an ultra high information capacity: numerical results

The topology of large Open Connectome networks for the human brain

A note on the Borwein conjecture

Bridges of Markov counting processes: quantitative estimates

Building Memory with Concept Learning Capabilities from Large-scale Knowledge Base

We present a new perspective on neural knowledge base (KB) embeddings, from which we build a framework that can model symbolic knowledge in the KB together with its learning process. We show that this framework effectively regularizes previous neural KB embedding models, yielding superior performance in reasoning tasks, while also being able to deal with unseen entities: it learns their embeddings from natural language descriptions, much as humans learn semantic concepts.

Bayesian Uncertainty Management in Temporal Dependence of Extremes

The shape of random tanglegrams

A Feynman-Kac formula for differential forms on manifolds with boundary and applications

Codegree thresholds for covering 3-uniform hypergraphs

Kalman-based Stochastic Gradient Method with Stop Condition and Insensitivity to Conditioning

Proximal and stochastic gradient descent (SGD) methods are believed to efficiently minimize large composite objective functions, but such methods have two algorithmic challenges: (1) a lack of fast or justified stopping conditions, and (2) sensitivity to the problem’s conditioning. Second order SGD methods show promise in solving these problems, but they are (3) marred by the complexity of their analysis. In this work, we address these three issues on the limited, but important, linear regression problem by introducing and analyzing a second order proximal/SGD method based on Kalman Filtering (kSGD). Through our analysis, we develop a fast algorithm with a justified stopping condition, prove that kSGD is insensitive to the problem’s conditioning, and develop a unique approach for analyzing the complex second order dynamics. Our theoretical results are supported by numerical experiments on a large public use data set from the Center for Medicare and Medicaid. Byproducts of our analysis include, primarily, a foundation for extending kSGD to other problem types, parallel implementations with convergence guarantees and low memory applications, and, secondarily, extensions to Kalman Filtering theory.
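The connection between Kalman filtering and linear regression mentioned above can be illustrated with the textbook Kalman update for a static parameter vector (equivalently, recursive least squares). This is a minimal sketch under stated assumptions, not the paper's kSGD: the function name, the prior variance, the noise variance, and the trace-based stopping rule are all hypothetical choices made for illustration.

```python
from typing import Iterable, List, Sequence, Tuple


def kalman_regression(
    data: Iterable[Tuple[Sequence[float], float]],
    dim: int,
    noise_var: float = 1.0,   # assumed observation-noise variance
    prior_var: float = 1e6,   # assumed (diffuse) prior variance on each weight
    trace_tol: float = 0.0,   # hypothetical stopping threshold on trace(P)
) -> Tuple[List[float], int]:
    """Process (x, y) pairs one at a time with a static-state Kalman update.

    Returns the weight estimate and the number of observations consumed.
    """
    theta = [0.0] * dim
    # P is the current covariance estimate of theta, initialised to prior_var * I.
    P = [[prior_var if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    steps = 0
    for x, y in data:
        # Px = P @ x, and s = x^T P x + noise_var is the innovation variance.
        Px = [sum(P[i][j] * x[j] for j in range(dim)) for i in range(dim)]
        s = sum(x[i] * Px[i] for i in range(dim)) + noise_var
        residual = y - sum(x[i] * theta[i] for i in range(dim))
        for i in range(dim):
            # Gain is Px / s; move theta along the gain by the residual.
            theta[i] += Px[i] * residual / s
            for j in range(dim):
                # Rank-one covariance downdate: P <- P - (Px)(Px)^T / s.
                P[i][j] -= Px[i] * Px[j] / s
        steps += 1
        # Illustrative stopping rule (an assumption): stop once the total
        # remaining uncertainty, trace(P), falls below trace_tol.
        if sum(P[i][i] for i in range(dim)) < trace_tol:
            break
    return theta, steps


if __name__ == "__main__":
    # Noiseless observations of y = 2*x0 - 3*x1.
    samples = [((1.0, 0.0), 2.0), ((0.0, 1.0), -3.0),
               ((1.0, 1.0), -1.0), ((2.0, 1.0), 1.0)]
    print(kalman_regression(samples, dim=2))
```

Because each update rescales the step by the running covariance rather than by a fixed learning rate, the recursion needs no step-size tuning, which is the intuition behind the insensitivity-to-conditioning claim.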

Bandwidth in the Cloud

Deep Reinforcement Learning with Attention for Slate Markov Decision Processes with High-Dimensional States and Actions

Many real-world problems come with action spaces represented as feature vectors. Although high-dimensional control is a largely unsolved problem, there has recently been progress for modest dimensionalities. Here we report on a successful attempt at addressing problems of dimensionality as high as 2000, of a particular form. Motivated by important applications such as recommendation systems that do not fit the standard reinforcement learning frameworks, we introduce Slate Markov Decision Processes (slate-MDPs). A slate-MDP is an MDP with a combinatorial action space consisting of slates (tuples) of primitive actions, of which one is executed in an underlying MDP. The agent does not control the choice of this executed action, and the action might not even be from the slate, e.g., for recommendation systems for which all recommendations can be ignored. We use deep Q-learning based on feature representations of both the state and action to learn the value of whole slates. Unlike existing methods, we optimize for both the combinatorial and sequential aspects of our tasks. The new agent's superiority over agents that ignore either the combinatorial or the sequential long-term value aspect is demonstrated on a range of environments with dynamics from a real-world recommendation system. Further, we use deep deterministic policy gradients to learn a policy that, for each position of the slate, guides attention towards the part of the action space in which the value is the highest, and we only evaluate actions in this area. The attention is used within a sequentially greedy procedure leveraging submodularity. Finally, we show how introducing risk-seeking can dramatically improve the agent's performance and ability to discover more far-reaching strategies.
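The sequentially greedy procedure mentioned above can be sketched as filling the slate one position at a time, each time adding the candidate that maximizes the value of the partial slate. This is only an illustration: the function `greedy_slate`, the toy `coverage` value function, and the item names are hypothetical, and the learned Q-function and attention mechanism from the abstract are not modeled.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence


def greedy_slate(
    candidates: Sequence[str],
    slate_size: int,
    value: Callable[[List[str]], float],
) -> List[str]:
    """Build a slate position by position, greedily maximizing `value`."""
    slate: List[str] = []
    remaining = list(candidates)
    for _ in range(min(slate_size, len(remaining))):
        # Pick the action whose addition gives the best partial-slate value.
        best = max(remaining, key=lambda action: value(slate + [action]))
        slate.append(best)
        remaining.remove(best)
    return slate


# Toy submodular value (an assumption): number of distinct topics covered.
TOPICS: Dict[str, FrozenSet[str]] = {
    "news": frozenset({"politics", "economy"}),
    "markets": frozenset({"economy"}),
    "music": frozenset({"culture"}),
}


def coverage(slate: List[str]) -> float:
    covered = set()
    for item in slate:
        covered |= TOPICS[item]
    return float(len(covered))


if __name__ == "__main__":
    print(greedy_slate(list(TOPICS), 2, coverage))
```

For monotone submodular value functions such as this coverage example, greedy selection is a standard choice because it avoids enumerating all possible slates.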

Bayesian Matrix Completion via Adaptive Relaxed Spectral Regularization

Bayesian matrix completion has been studied based on a low-rank matrix factorization formulation with promising results. However, little work has been done on Bayesian matrix completion based on the more direct spectral regularization formulation. We fill this gap by presenting a novel Bayesian matrix completion method based on spectral regularization. In order to circumvent the difficulties of dealing with the orthonormality constraints of singular vectors, we derive a new equivalent form with relaxed constraints, which then leads us to design an adaptive version of spectral regularization feasible for Bayesian inference. Our Bayesian method requires no parameter tuning and can infer the number of latent factors automatically. Experiments on synthetic and real datasets demonstrate encouraging results on rank recovery and collaborative filtering, with notably good results for very sparse matrices.

On the Medianwidth of Graphs

Target-Dependent Sentiment Classification with Long Short Term Memory

Target-dependent sentiment classification remains a challenge: modeling the semantic relatedness of a target with its context words in a sentence. Different context words have different influences on determining the sentiment polarity of a sentence towards the target. Therefore, it is desirable to integrate the connections between the target word and context words when building a learning system. In this paper, we develop two target-dependent long short-term memory (LSTM) models, where target information is automatically taken into account. We evaluate our methods on a benchmark dataset from Twitter. Empirical results show that modeling sentence representation with a standard LSTM does not perform well. Incorporating target information into the LSTM can significantly boost the classification accuracy. The target-dependent LSTM models achieve state-of-the-art performance without using a syntactic parser or external sentiment lexicons.

Fast Weighted String Matching

A weighted string over an alphabet of size \sigma is a string in which a set of letters may occur at each position with respective occurrence probabilities. Weighted strings, also known as position weight matrices or uncertain sequences, naturally arise in many contexts. In this article, we study the problem of weighted string matching with a special focus on average-case analysis. Given a weighted pattern string x of length m, a text string y of length n>m, and a cumulative weight threshold 1/z, defined as the minimal probability of occurrence of factors in a weighted string, we present an algorithm requiring average-case search time o(n) for pattern matching for weight ratio \frac{z}{m} < \min\{\frac{1}{\log z},\frac{\log \sigma}{\log z (\log m + \log \log \sigma)}\}. For a pattern string x of length m, a weighted text string y of length n>m, and a cumulative weight threshold 1/z, we present an algorithm requiring average-case search time o(\sigma n) for the same weight ratio. The importance of these results lies in the fact that these algorithms work in average-case sublinear search time in the size of the text, and in linear preprocessing time and space in the size of the pattern, for these ratios.
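To make the problem statement concrete, the following is a naive baseline for the second setting (an ordinary pattern matched against a weighted text). It is not the sublinear average-case algorithm of the abstract. It assumes that an occurrence at a position means the product of the per-position letter probabilities is at least 1/z; the representation of a weighted string as a list of dictionaries and the function name are illustrative choices.

```python
from typing import Dict, List

# One dict per position, mapping each possible letter to its probability.
WeightedString = List[Dict[str, float]]


def occurrences(pattern: str, text: WeightedString, z: float) -> List[int]:
    """Return start positions where `pattern` occurs with probability >= 1/z.

    Naive O(n * m) scan, intended only to illustrate the definition.
    """
    threshold = 1.0 / z
    hits: List[int] = []
    for start in range(len(text) - len(pattern) + 1):
        prob = 1.0
        for offset, letter in enumerate(pattern):
            # Letters absent from a position have probability 0.
            prob *= text[start + offset].get(letter, 0.0)
            if prob < threshold:
                break  # the product can only shrink, so stop early
        else:
            hits.append(start)
    return hits


if __name__ == "__main__":
    y = [
        {"a": 1.0},
        {"a": 0.5, "b": 0.5},
        {"b": 1.0},
        {"a": 0.25, "b": 0.75},
    ]
    print(occurrences("ab", y, z=4))
```

The early exit relies on the fact that every factor of the product is at most 1, so once the running probability drops below the threshold it cannot recover.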

A class of graphs approaching Vizing’s conjecture

Correlated fluctuations in strongly-coupled binary networks beyond equilibrium

On the survival probability in the Matheron – De Marsily model

Variational Multiscale Nonparametric Regression: Smooth Functions

Relating $2$-Rainbow Domination to Roman domination

Incorporating social contact data in spatio-temporal models for infectious disease spread

Full Current Statistics for a Disordered Open Exclusion Process

Approaches for Sentiment Analysis on Twitter: A State-of-Art study

Microblogging is an extremely prevalent broadcast medium among Internet users these days. People share their opinions and sentiments about a variety of subjects, such as products, news, and institutions, every day on microblogging websites. Sentiment analysis plays a key role in prediction systems, opinion mining systems, etc. Twitter, one of the microblogging platforms, limits its users to 140 characters per message. This restriction encourages users to be very concise about their opinions and makes Twitter an ocean of sentiments to analyze. Twitter also provides a developer-friendly streaming API for data retrieval, allowing the analyst to search real-time tweets from various users. In this paper, we discuss the state of the art of work focused on Twitter, the online social network platform, for sentiment analysis. We survey various lexical, machine learning and hybrid approaches for sentiment analysis on Twitter.

Querying with Łukasiewicz logic

Green’s function for elliptic systems: moment bounds

Discrete Equilibrium Sampling with Arbitrary Nonequilibrium Processes

A statistical approach to crowdsourced smartphone-based earthquake early warning systems

Unbiased estimators and multilevel Monte Carlo

Bayesian Variable Selection and Estimation for Group Lasso

Some Ratio Monotonic Properties of a New Kind of Numbers introduced by Z.-W. Sun

A Bollobás-type theorem for affine subspaces

On Ratio Monotonicity of a New Kind of Numbers Conjectured by Z.-W. Sun

Orthogonal apartments in Hilbert Grassmannians

Shattering bounds for tuple systems

The Enumeration of Cyclic MNOLS

Fractal frontiers of bursts and cracks in a fiber bundle model of creep rupture

Bag Reference Vector for Multi-instance Learning

Multi-instance learning (MIL) has a wide range of applications due to its distinctive characteristics. Although many state-of-the-art algorithms have achieved decent performance, many existing methods solve the problem only at the instance level rather than exploiting relations among bags. In this paper, we propose an efficient algorithm that describes each bag by a corresponding feature vector obtained by comparing it with other bags. In other words, the crucial information of a bag is extracted from the similarity between that bag and other reference bags. In addition, we apply extensions of the Hausdorff distance to represent the similarity, which to a certain extent overcomes the key challenge of the MIL problem: the ambiguity of instances’ labels in positive bags. Experimental results on benchmarks and text categorization tasks show that the proposed method outperforms the previous state of the art by a large margin.
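The idea of describing a bag by its similarity to reference bags can be sketched as follows. The abstract refers to extensions of the Hausdorff distance without specifying them, so this sketch assumes the classical (maximal) Hausdorff distance with Euclidean instance distances; the function names and the toy bags are hypothetical.

```python
import math
from typing import List, Sequence

Instance = Sequence[float]
Bag = Sequence[Instance]


def directed_hausdorff(a: Bag, b: Bag) -> float:
    """Largest distance from an instance in `a` to its nearest instance in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a: Bag, b: Bag) -> float:
    """Classical symmetric Hausdorff distance between two non-empty bags."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def bag_reference_vector(bag: Bag, reference_bags: Sequence[Bag]) -> List[float]:
    """Represent `bag` by its distances to each reference bag."""
    return [hausdorff(bag, ref) for ref in reference_bags]


if __name__ == "__main__":
    bag = [(0.0, 0.0), (1.0, 0.0)]
    references = [
        [(0.0, 0.0)],
        [(0.0, 3.0), (0.0, 4.0)],
    ]
    print(bag_reference_vector(bag, references))
```

The resulting fixed-length vector can be passed to any standard single-instance classifier, which is what turns the bag-level problem into an ordinary supervised one.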

Fast Low-Rank Matrix Learning with Nonconvex Regularization

Low-rank modeling has many important applications in machine learning, computer vision and social network analysis. While the matrix rank is often approximated by the convex nuclear norm, the use of nonconvex low-rank regularizers has demonstrated better recovery performance. However, the resultant optimization problem is much more challenging. A very recent state-of-the-art approach is based on the proximal gradient algorithm. However, it requires an expensive full SVD in each proximal step. In this paper, we show that for many commonly used nonconvex low-rank regularizers, a cutoff can be derived to automatically threshold the singular values obtained from the proximal operator. This allows the use of the power method to approximate the SVD efficiently. Moreover, the proximal operator can be reduced to that of a much smaller matrix projected onto this leading subspace. Convergence, with a rate of O(1/T) where T is the number of iterations, can be guaranteed. Extensive experiments are performed on matrix completion and robust principal component analysis. The proposed method achieves a significant speedup over the state of the art. Moreover, the matrix solution obtained is more accurate and has a lower rank than that of the traditional nuclear norm regularizer.
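As a reminder of why the power method is cheaper than a full SVD, here is a minimal sketch that estimates only the leading singular value and right singular vector of a small dense matrix by power iteration on A^T A. It illustrates the basic building block only, not the paper's algorithm; the function name, the fixed iteration count, and the all-ones starting vector are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def leading_singular_value(A: Matrix, iters: int = 100) -> Tuple[float, List[float]]:
    """Estimate the largest singular value of A and its right singular vector."""
    rows, cols = len(A), len(A[0])
    # Start from a normalised all-ones vector (assumed not orthogonal to
    # the leading right singular vector).
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iters):
        Av = [sum(A[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        # w = A^T (A v): one step of power iteration on A^T A.
        w = [sum(A[i][j] * Av[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v  # A maps v to zero; nothing to iterate on
        v = [x / norm for x in w]
    Av = [sum(A[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in Av))  # ||A v|| for unit v
    return sigma, v


if __name__ == "__main__":
    sigma, v = leading_singular_value([[3.0, 0.0], [0.0, 1.0]])
    print(sigma, v)
```

Each iteration costs two matrix-vector products, so when only the few singular values above a cutoff are needed, this kind of iteration avoids the cost of a full decomposition.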

Bayesian non-parametric inference for $Λ$-coalescents: consistency and a parametric method

A Study on Artificial Intelligence IQ and Standard Intelligent Model

Currently, the potential threats of artificial intelligence (AI) to humans have triggered a large controversy in society, behind which lies the question of whether an AI system can be evaluated quantitatively. This article analyzes the challenges facing the evaluation of the AI development level, and argues that the evaluation methods for human intelligence tests and for AI systems are not uniform, the key reason being that no existing model can uniformly describe both AI systems and beings such as humans. To address this problem, this study establishes a standard intelligent system model that describes AI systems and beings such as humans uniformly. Based on the model, the article gives an abstract mathematical description and builds a mathematical model of the standard intelligent machine; extends the Von Neumann architecture and proposes the Liufeng – Shiyong architecture; defines the artificial intelligence IQ and establishes an artificial intelligence scale and evaluation method; and conducts tests on 50 search engines and three human subjects of different ages across the world, finally obtaining the 2014 rankings of absolute IQ and deviation IQ for artificial intelligence.

Sequential Bayesian Model Selection of Regular Vine Copulas

Posterior Belief Assessment: Extracting Meaningful Subjective Judgements from Bayesian Analyses with Complex Statistical Models

Neural Enquirer: Learning to Query Tables

Modeling Human Understanding of Complex Intentional Action with a Bayesian Nonparametric Subgoal Model

A New Infinite Family of Hemisystems of the Hermitian Surface

Triplet Spike Time Dependent Plasticity in a Floating-Gate Synapse

Large deviations for near-extreme eigenvalues in the beta-ensembles

A New Statistical Framework for Genetic Pleiotropic Analysis of High Dimensional Phenotype Data

A Combinatorial Problem Related to Sparse Systems of Equations

Probabilistic Integration

Mean-Field Inference in Gaussian Restricted Boltzmann Machine