• Adaptive Sequential Optimization with Applications to Machine Learning
A framework is introduced for solving a sequence of slowly changing optimization problems, including those arising in regression and classification applications, using optimization algorithms such as stochastic gradient descent (SGD). The optimization problems change slowly in the sense that the minimizers change at either a fixed or bounded rate. A method based on estimates of the change in the minimizers and properties of the optimization algorithm is introduced for adaptively selecting the number of samples needed from the distributions underlying each problem in order to ensure that the excess risk, i.e., the expected gap between the loss achieved by the approximate minimizer produced by the optimization algorithm and the exact minimizer, does not exceed a target level. Experiments with synthetic and real data are used to confirm that this approach performs well.
• Frequency Distribution of Error Messages
Which programming error messages are the most common? We investigate this question, motivated by writing error explanations for novices. We consider large data sets in Python and Java that include both syntax and run-time errors. In both data sets, after grouping essentially identical messages, the error message frequencies empirically resemble Zipf-Mandelbrot distributions. We use a maximum-likelihood approach to fit the distribution parameters. This gives one possible way to contrast languages or compilers quantitatively.
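The maximum-likelihood fit mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the likelihood is optimized, so the grid search, the function names `zm_log_likelihood` and `fit_zipf_mandelbrot`, and the synthetic counts are all hypothetical. The Zipf-Mandelbrot form assumed here is p(k) proportional to (k + q)^(-s) over ranks k = 1..N.

```python
import math


def zm_log_likelihood(counts, s, q):
    """Log-likelihood of rank-ordered counts under p(k) ~ (k + q)^(-s), k = 1..N."""
    n_ranks = len(counts)
    # Normalizing constant over the finite set of observed ranks.
    log_norm = math.log(sum((k + q) ** -s for k in range(1, n_ranks + 1)))
    total = sum(counts)
    weighted_logs = sum(c * math.log(k + q) for k, c in enumerate(counts, start=1))
    return -s * weighted_logs - total * log_norm


def fit_zipf_mandelbrot(counts, s_grid, q_grid):
    """Return the (s, q) grid point with the highest log-likelihood.

    A plain grid search is an assumption made for brevity; any numerical
    optimizer could be substituted.
    """
    ranked = sorted(counts, reverse=True)  # rank 1 = most frequent message
    candidates = ((s, q) for s in s_grid for q in q_grid)
    return max(candidates, key=lambda p: zm_log_likelihood(ranked, p[0], p[1]))


# Synthetic frequencies generated from known parameters (not real data).
true_s, true_q = 1.5, 2.0
counts = [1e6 * (k + true_q) ** -true_s for k in range(1, 51)]
s_hat, q_hat = fit_zipf_mandelbrot(
    counts, [0.5, 1.0, 1.5, 2.0, 2.5], [0.0, 1.0, 2.0, 3.0, 4.0]
)
print(s_hat, q_hat)
```

Fitting each language's message counts this way would yield one (s, q) pair per data set, which is the kind of quantitative contrast the abstract describes.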
• IllinoisSL: A JAVA Library for Structured Prediction
IllinoisSL is a Java library for learning structured prediction models. It supports structured Support Vector Machines and structured Perceptron. The library consists of a core learning module and several applications, which can be executed from the command line. Documentation is provided to guide users. In comparison to other structured learning libraries, IllinoisSL is efficient, general, and easy to use.
• Provable Smoothing Approach in High Dimensional Generalized Regression Model
The generalized regression model is an important semiparametric generalization of the linear regression model. It assumes there exist unknown monotone increasing link functions connecting the response to a single index of the explanatory variables. The generalized regression model covers many well-studied statistical models and is appealing in many applications where regression models are regularly employed. In low dimensions, rank-based M-estimators are recommended, giving root-n consistent estimators of the index parameter. However, their application to high dimensional data is questionable, mainly because the loss function is discontinuous: (i) computationally, the non-smoothness of the loss makes the optimization problem intractable; (ii) theoretically, the discontinuity makes analysis in high dimensions difficult. In contrast, this paper suggests a simple, yet powerful, smoothing approach for rank-based estimators. A family of smoothing functions is provided, and the amount of smoothing necessary for efficient inference is carefully calculated. We show the resulting estimators are scaling near-optimal, i.e., they are consistent estimators of the index parameter as long as the sample size, dimension, and sparsity degree are within a near-optimal range. These are the first such results in the literature. The power of the proposed approaches is further verified empirically.
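To make the smoothing idea concrete, here is a minimal sketch. It assumes the rank-based objective is of the maximum-rank-correlation type (the fraction of pairs with y_i > y_j whose index values are ordered the same way) and replaces the indicator with a logistic function of bandwidth `h`. The abstract does not specify its family of smoothing functions, so the logistic choice, the function names, and the toy data are assumptions for illustration only.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def index(x, beta):
    """Single index: inner product of covariates and coefficients."""
    return sum(xi * bi for xi, bi in zip(x, beta))


def rank_objective(xs, ys, beta):
    """Discontinuous objective: share of pairs with y_i > y_j and index_i > index_j."""
    pairs = [(i, j) for i in range(len(ys)) for j in range(len(ys)) if ys[i] > ys[j]]
    hits = sum(1.0 for i, j in pairs if index(xs[i], beta) > index(xs[j], beta))
    return hits / len(pairs)


def smoothed_rank_objective(xs, ys, beta, h):
    """Smooth surrogate: the indicator is replaced by sigmoid(difference / h)."""
    pairs = [(i, j) for i in range(len(ys)) for j in range(len(ys)) if ys[i] > ys[j]]
    total = sum(sigmoid((index(xs[i], beta) - index(xs[j], beta)) / h) for i, j in pairs)
    return total / len(pairs)


# Toy data: y is a monotone (exponential) link of the index with beta = (1, 0.5).
beta = (1.0, 0.5)
xs = [(0.0, 1.0), (1.0, 0.5), (2.0, -1.0), (3.0, 0.0)]
ys = [math.exp(index(x, beta)) for x in xs]

hard = rank_objective(xs, ys, beta)
smooth_small_h = smoothed_rank_objective(xs, ys, beta, h=0.01)
smooth_large_h = smoothed_rank_objective(xs, ys, beta, h=100.0)
print(hard, smooth_small_h, smooth_large_h)
```

A small `h` keeps the surrogate close to the discontinuous objective, while a large `h` flattens it toward 0.5. This is the trade-off behind choosing the amount of smoothing.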
• A hybrid sampler for Poisson-Kingman mixture models
• A note on convergence to stationarity of random processes with immigration
• A simplicial complex is uniquely determined by its set of discrete Morse functions
• Anderson localization of light in disordered superlattices containing graphene layers
• Bilingual Distributed Word Representations from Document-Aligned Comparable Data
• Bradley-Terry model in random environment: does the best always win?
• Channel Vector Subspace Estimation from Low-Dimensional Projections
• CRDT: Correlation Ratio Based Decision Tree Model for Healthcare Data Mining
• Crescent configurations
• Criticality and Chaos in Systems of Communities
• Curvature and transport inequalities for Markov chains in discrete spaces
• Davie’s type uniqueness for a class of SDEs with jumps
• Decompositions of the Boolean lattice into rank-symmetric chains
• Detecting changes in Hilbert space data based on ‘repeated’ and change-aligned principal components
• Deterministic Sparse Suffix Sorting on Rewritable Texts
• Exact confidence intervals for the average causal effect on a binary outcome
• Exploration and Exploitation of Victorian Science in Darwin’s Reading Notebooks
• From gap probabilities in random matrix theory to eigenvalue expansions
• Generalized Mittag Leffler distributions arising as limits in preferential attachment models
• High Dimensional Data Modeling Techniques for Detection of Chemical Plumes and Anomalies in Hyperspectral Images and Movies
• High-dimensional robust precision matrix estimation: Cellwise corruption under $ε$-contamination
• Integer Programming Models and Parameterized Algorithms for Controlling Palletizers
• Interactions between Ehrenfest’s urns arising from group actions
• Interplay of pair density waves and random field disorders in the pseudogap phase of cuprate superconductors
• Ising critical behavior of inhomogeneous Curie-Weiss and annealed random graphs
• Links as a Service (LaaS): Feeling Alone in the Shared Cloud
• Lower bounds on the dilation of plane spanners
• Mapping Generative Models onto Networks of Digital Spiking Neurons
• MM Algorithms for Variance Components Models
• Multi-Objective Weighted Sampling
• Noise-Robust ASR for the third ‘CHiME’ Challenge Exploiting Time-Frequency Masking based Multi-Channel Speech Enhancement and Recurrent Neural Network
• Nonlinear diffusion equations and curvature conditions in metric measure spaces
• On some properties and relations between restricted barred preferential arrangements, multi-poly-Bernoulli numbers and related numbers
• On the Edit Distance of Powers of Cycles
• On the spectral radius of simple digraphs with prescribed number of arcs
• Opinion mining from twitter data using evolutionary multinomial mixture models
• Orderings of weakly correlated random variables, and prime number races with many contestants
• Parameterized Algorithms for Min-Max Multiway Cut and List Digraph Homomorphism
• Poland-Scheraga DNA denaturation model with self-avoidance: numerical study in terms of pseudo-critical temperature settles the question of disorder relevance
• Provable approximation properties for deep neural networks
• Renewal Structure of the Brownian Taut String
• Scaling limit of the recurrent biased random walk on a Galton-Watson tree
• Semi-Equivelar Maps on the Torus and the Klein Bottle with few vertices
• Short cycle covers on cubic graphs using chosen 2-factor
• Sparsification Upper and Lower Bounds for Graph Problems and Not-All-Equal SAT
• Sparsity-based Correction of Exponential Artifacts
• Spatially Encoding Temporal Correlations to Classify Temporal Data Using Convolutional Neural Networks
• spTest: An R Package Implementing Nonparametric Tests of Isotropy
• Testing in high-dimensional spiked models
• The complex Brownian motion as a strong limit of processes constructed from a Poisson process
• The minimum size of graphs with given rainbow index
• TRONCO: an R package for the inference of cancer progression models from heterogeneous genomic data
• White noise perturbation of locally stable dynamical systems