• Valence Bonds in Random Quantum Magnets: Theory and Application to YbMgGaO4
• Individuals, Institutions, and Innovation in the Debates of the French Revolution
• Graph Embedding with Rich Information through Bipartite Heterogeneous Network
• The Fyodorov-Bouchaud formula and Liouville conformal field theory
• 3 List Coloring Graphs of Girth at least Five on Surfaces
• Edgeworth correction for the largest eigenvalue in a spiked PCA model
• On the zeros of random harmonic polynomials: the Weyl model
• Improved Bounds on the Symmetric Capacity of the Binary Input Channels
• Characterization of Gradient Dominance and Regularity Conditions for Neural Networks
• Computing reflection length in an affine Coxeter group
• VisDA: The Visual Domain Adaptation Challenge
• Unsupervised Object Discovery and Segmentation of RGBD-images
• Growth Mixture Modeling with Measurement Selection
• Exploring Neural Architectures for Multilingual Customer Feedback Analysis: IJCNLP 2017 Shared Task
• Providing Accurate Models across Private Partitioned Data: Secure Maximum Likelihood Estimation
• First-Person Perceptual Guidance Behavior Decomposition using Active Constraint Classification
• Interleaved Training and Training-Based Transmission Design for Hybrid Massive Antenna Downlink
• Asynchronous Decentralized Parallel Stochastic Gradient Descent
• Network Load Balancing Methods: Experimental Comparisons and Improvement
• Universal Convergence of Kriging
• Learning Differentially Private Language Models Without Losing Accuracy
• Importance sampling the union of rare events with an application to power systems analysis
• Butson-type complex Hadamard matrices and association schemes on Galois rings of characteristic 4
• Consequentialist conditional cooperation in social dilemmas with imperfect information
• Entropy Compression Method and Legitimate Colorings in Projective Planes
• Artificial-Noise-Aided Secure Channel with a Full-duplex Source
• On Affine and Conjugate Nonparametric Regression
• Operator limit of the circular beta ensemble
• Necessary and Sufficient Condition for Asymptotic Normality of Standardized Sample Means
• Improved Search in Hamming Space using Deep Multi-Index Hashing
• Learning Visual Features from Snapshots for Web Search
• On the Relationship between Conditional (CAR) and Simultaneous (SAR) Autoregressive Models
• Delocalization and Limiting Spectral Distribution of Erdős-Rényi Graphs with Constant Expected Degree
• Fractional Derivatives of Convex Lyapunov Functions and Control Problems in Fractional Order Systems
• Minimax Estimation of Bandable Precision Matrices
• Quarter-Turn Baxter Permutations
• Finite Model Approximations for Partially Observed Markov Decision Processes with Discounted Cost
• ProLanGO: Protein Function Prediction Using Neural Machine Translation Based on a Recurrent Neural Network
• Second Order Asymptotics for Communication under Strong Asynchronism
• Protein Folding Optimization using Differential Evolution Extended with Local Search and Component Reinitialization
• SLING: A framework for frame semantic parsing
• A Primal-Dual based Distributed Approximation Algorithm for Prize Collecting Steiner Tree
• DD-$α$AMG on QPACE 3
• Unsupervised Context-Sensitive Spelling Correction of English and Dutch Clinical Free-Text with Word and Character N-Gram Embeddings
• Early stopping for statistical inverse problems via truncated SVD estimation
• Reti bayesiane per lo studio del fenomeno degli incidenti stradali tra i giovani in Toscana
• Best Linear Approximation of Wiener Systems Using Multilevel Signals: Theory and Experiments
• Joint identification via deconvolution of the flux and energy relaxation kernels of the Gurtin-Pipkin model in thermodynamics with memory
• Asymptotic Stability of Empirical Processes and Related Functionals
• Preference Modeling by Exploiting Latent Components of Ratings
• A proof of the Delta Conjecture when $q=0$
• Emerging from Water: Underwater Image Color Correction Based on Weakly Supervised Color Transfer
• Spanning tree with lower bound on the degrees
• Skewed distributions as limits of a formal evolutionary process
• Discovering Patterns of Interest in IP Traffic Using Cliques in Bipartite Link Streams
• Nonlinear Supervised Dimensionality Reduction via Smooth Regular Embeddings
• On vertex coloring without monochromatic triangles
• Reaching a Target in the Plane with no Information
• A unified polynomial-time algorithm for Feedback Vertex Set on graphs of bounded mim-width
• Generalized Water-filling for Source-Aware Energy-Efficient SRAMs
• Comparison of statistical procedures for Gaussian graphical model selection
• Environmental contours based on kernel density estimation
• Hybrid Thermostatic Approximations of Junctions for some Optimal Control Problems on Networks
• Visual Speech Recognition Using PCA Networks and LSTMs in a Tandem GMM-HMM System
• Games on graphs with a public signal monitoring
• Exact Hausdorff and packing measures for random self-similar code-trees with necks
• Combining Multiple Views for Visual Speech Recognition
• The Geometry of Gaussoids
• Findings of the Second Shared Task on Multimodal Machine Translation and Multilingual Image Description
• Increasing Labelings, Generalized Promotion, and Rowmotion
• Paxos Made EPR: Decidable Reasoning about Distributed Protocols
• Block DCT filtering using vector processing
• Global performance metrics for synchronization of heterogeneously rated power systems: The role of machine models and inertia
• Dress like a Star: Retrieving Fashion Products from Videos
• LSMM: A statistical approach to integrating functional annotations with genome-wide association studies
• Efficient Robust Matrix Factorization with Nonconvex Penalties
• Directed Hamilton cycles in digraphs and matching alternating Hamilton cycles in bipartite graphs
• On subgraphs of random Cayley sum graphs
• Decomposition of Uncertainty for Active Learning and Reliable Reinforcement Learning in Stochastic Systems
• SERENADE: A Parallel Randomized Algorithm Suite for Crossbar Scheduling in Input-Queued Switches
• On the Geometry of Chemical Reaction Networks: Lyapunov Function and Large Deviations
• The geometric $R$-matrix for affine crystals of type $A$
• Partitioning the vertices of a torus into isomorphic subgraphs
• A density version of Cobham’s theorem
• Informational Neurobayesian Approach to Neural Networks Training. Opportunities and Prospects
• Simulating Quantum Spin Hall Effect in Topological Lieb Lattice of Linear Circuit
• FigureQA: An Annotated Figure Dataset for Visual Reasoning
• Interpretable Transformations with Encoder-Decoder Networks
• Power Plant Performance Modeling with Concept Drift
• Visual Integration of Data and Model Space in Ensemble Learning
• Scalable Gaussian Processes with Billions of Inducing Inputs via Tensor Train Decomposition
• Online Monotone Games
• Frequency Based Index Estimating the Subclusters’ Connection Strength
• Be Your Own Prada: Fashion Synthesis with Structural Coherence
• Unified sufficient conditions for uniform recovery of sparse signals via nonconvex minimizations
• Learning to Recognize Actions from Limited Training Examples Using a Recurrent Spiking Neural Model
• Synchronization Strings: Efficient and Fast Deterministic Constructions over Small Alphabets
• A Fast and Generic GPU-Based Parallel Reduction Implementation
• Go game formal revealing by Ising model
• Historical Document Image Segmentation with LDA-Initialized Deep Neural Networks
• Mutants and Residents with Different Connection Graphs in the Moran Process
• Convergence Analysis of the Frank-Wolfe Algorithm and Its Generalization in Banach Spaces
• SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud
• Correction Factor for Analysis of MIMO Wireless Networks With Highly Directional Beamforming
• Fault-tolerant parallel scheduling of different length jobs on a multiple-access channel
• Partially Coherent Ptychography by Gradient Decomposition of the Probe
• Batch Codes from Hamming and Reed-Müller Codes
• Multi-Task Learning for Speaker-Role Adaptation in Neural Conversation Models
• Superpixel Based Segmentation and Classification of Polyps in Wireless Capsule Endoscopy
• Linear-Time Algorithm in Bayesian Image Denoising based on Gaussian Markov Random Field
• Recognizing Explicit and Implicit Hate Speech Using a Weakly Supervised Two-path Bootstrapping Approach
• Detecting Online Hate Speech Using Context Aware Models
• Ligand Pose Optimization with Atomic Grid-Based Convolutional Neural Networks
• The Collatz-Wielandt quotient for some pairs of nonnegative operators
• First-order Methods Almost Always Avoid Saddle Points
• Complete Facial Reduction in One Step for Spectrahedra
• The Reliability Function of Lossy Source-Channel Coding of Variable-Length Codes with Feedback
• Transforming cumulative hazard estimates
• Differentially Private Empirical Risk Minimization with Input Perturbation
• More on the sixth coefficient of the matching polynomial in regular graphs
• A numerical study of Haar wavelet priors in an elliptical Bayesian inverse problem
• Biased halfspaces, noise sensitivity, and relative Chernoff inequalities (extended version)
• The saturation number, spectral radius, and family of $k$-edge-connected graphs
• Light-weight place recognition and loop detection using road markings
• A Semantically Motivated Approach to Compute ROUGE Scores
• A direct method for unfolding the resolution function from measurements of neutron induced reactions
• A Non-Stationary Ergodic Theorem with Applications to Averaging
• Minimax state estimates for abstract Neumann problems
• Finite-dimensional Gaussian approximation with linear inequality constraints
• Generalized Zero-Shot Learning for Action Recognition with Web-Scale Video Data
• Poset ideals of P-partitions and generalized letterplace and determinantal ideals
• Learning Wasserstein Embeddings
• The Importance of System-Level Information in Multiagent Systems Design: Cardinality and Covering Problems
• Tracking the gradients using the Hessian: A new look at variance reducing stochastic methods
• Nonparametric estimation of multivariate distribution function for truncated and censored lifetime data
• Second-order subdifferential. Extremal problems for operational inclusions
• Symmetric Gauss-Seidel Technique Based Alternating Direction Methods of Multipliers for Transform Invariant Low-Rank Textures Problem
• Planar 3-SAT with a Clause/Variable Cycle
• Anticipating Daily Intention using On-Wrist Motion Triggered Sensing
• HDR image reconstruction from a single exposure using deep CNNs
• A regularity structure for rough volatility
• Full Stability for a Class of Control Problems of Semilinear Elliptic Partial Differential Equations
• The satisfiability threshold for random linear equations
• MR to X-Ray Projection Image Synthesis
• Infinite monochromatic sumsets for colourings of the reals
• Resonance graphs of kinky benzenoid systems are daisy cubes
• Nearest-neighbour Markov point processes on graphs with Euclidean edges
• Local Word Vectors Guide Keyphrase Extraction
• Probabilistic Analysis of the Dual-Pivot Quicksort ‘Count’
• On the Steiner hyper-Wiener index of a graph
• Continuous groupoids on the symbolic space, quasi-invariant probabilities for Haar systems and the Haar-Ruelle operator
• On the exactness of Lasserre relaxations and pure states over real closed fields
• The cost number and the determining number of a graph
• Robustness of synchrony in complex networks and generalized Kirchhoff indices
• Hardened Paxos Through Consistency Validation
• Sum-perfect graphs
• Learning compressed representations of blood samples time series with missing data
• Spoken Language Biomarkers for Detecting Cognitive Impairment
• Real-time Convolutional Neural Networks for Emotion and Gender Classification
• Classification Driven Dynamic Image Enhancement
• SEGCloud: Semantic Segmentation of 3D Point Clouds
• Communication-free Massively Distributed Graph Generation
• An $\mathcal H_2$-Type Error Bound for Time-Limited Balanced Truncation
• Quantile Regression with Interval Data
• A Short Survey on Bounding the Union Probability using Partial Information
• The Maximum Colorful Arborescence problem parameterized by the structure of its color hierarchy graph
• Parallel Combining: Making Use of Free Cycles
• Compensation of Actuator Dynamics Governed by Quasilinear Hyperbolic PDEs
• Belief Propagation Min-Sum Algorithm for Generalized Min-Cost Network Flow
• Kernelization Lower Bounds for Finding Constant Size Subgraphs
• Translation-Invariant Gibbs States of Ising model: General Setting
• Asymptotically Optimal Resource Block Allocation With Limited Feedback
• Transparent Replication Using Metaprogramming in Cyan
• Understanding and Auto-Adjusting Performance-Related Configurations
Models for analyzing multivariate data sets with missing values require strong, often unassessable, assumptions. The most common of these is that the mechanism that created the missing data is ignorable, a twofold assumption dependent on the mode of inference. The first part, which is the focus here, under the Bayesian and direct likelihood paradigms, requires that the missing data are missing at random (MAR); in contrast, the frequentist-likelihood paradigm demands that the missing data mechanism always produces MAR data, a condition known as missing always at random (MAAR). Under certain regularity conditions, assuming MAAR leads to an assumption that can be tested using the observed data alone, namely, that the missing data indicators depend only on fully observed variables. Here, we propose three different diagnostic procedures that not only indicate when this assumption is invalid but also suggest which variables are the most likely culprits. Although MAAR is not a necessary condition to ensure validity under the Bayesian and direct likelihood paradigms, it is sufficient, and evidence for its violation should encourage the statistician to conduct a targeted sensitivity analysis.
This article proposes a Bayesian nonparametric method for forecasting, imputation, and clustering in sparsely observed, multivariate time series. The method is appropriate for jointly modeling hundreds of time series with widely varying, non-stationary dynamics. Given a collection of time series, the Bayesian model first partitions them into independent clusters using a Chinese restaurant process prior. Within a cluster, all time series are modeled jointly using a novel ‘temporally-coupled’ extension of the Chinese restaurant process mixture. Markov chain Monte Carlo techniques are used to obtain samples from the posterior distribution, which are then used to form predictive inferences. We apply the technique to challenging prediction and imputation tasks using seasonal flu data from the US Centers for Disease Control and Prevention, demonstrating competitive imputation performance and improved forecasting accuracy as compared to several state-of-the-art baselines. We also show that the model discovers interpretable clusters in datasets with hundreds of time series using macroeconomic data from the Gapminder Foundation.
Data-driven predictive analytics are in use today across a number of industrial applications, but further integration is hindered by the requirement of similarity between model training and test data distributions. This paper addresses the need to learn from possibly nonstationary data streams, that is, under concept drift, a phenomenon commonly seen in practical applications. A simple dual-learner ensemble strategy, the alternating learners framework, is proposed. A long-memory model learns stable concepts from a long relevant time window, while a short-memory model learns transient concepts from a small recent window. The difference in prediction performance of these two models is monitored and induces an alternating policy to select, update and reset the two models. The method features an online updating mechanism to maintain the ensemble accuracy, and a concept-dependent trigger to focus on relevant data. In empirical studies, the method demonstrates effective tracking and prediction when the streaming data carry abrupt and/or gradual changes.
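The long-memory/short-memory idea described above can be sketched in a few lines. This is only an illustration under strong assumptions, not the paper's algorithm: both learners are plain running means, the class name `AlternatingLearners`, the parameters `short_window`, `error_window` and `margin`, and the single reset rule are all hypothetical choices made here.

```python
from collections import deque


class AlternatingLearners:
    """Toy dual-learner ensemble: a long-memory mean and a short-memory mean.

    Hypothetical simplification: when the short-memory learner has recently
    been better than the long-memory learner by more than `margin`, drift is
    assumed and the long-memory learner is reset to the recent window.
    """

    def __init__(self, short_window=10, error_window=10, margin=0.5):
        self.long_data = []                          # everything since the last reset
        self.short_data = deque(maxlen=short_window) # only the most recent points
        self.long_err = deque(maxlen=error_window)   # recent absolute errors
        self.short_err = deque(maxlen=error_window)
        self.margin = margin
        self.resets = 0

    @staticmethod
    def _mean(values, default=0.0):
        return sum(values) / len(values) if values else default

    def predict(self):
        """Ensemble prediction: here simply the long-memory learner's mean."""
        return self._mean(self.long_data)

    def update(self, y):
        # Score both learners on the new observation before they see it.
        self.long_err.append(abs(y - self._mean(self.long_data)))
        self.short_err.append(abs(y - self._mean(self.short_data)))
        self.long_data.append(y)
        self.short_data.append(y)
        # Monitor the performance gap once the error window is full.
        window_full = len(self.long_err) == self.long_err.maxlen
        gap = self._mean(self.long_err) - self._mean(self.short_err)
        if window_full and gap > self.margin:
            self.long_data = list(self.short_data)   # reset to recent data
            self.long_err.clear()
            self.short_err.clear()
            self.resets += 1
```

Feeding a stream that jumps from one level to another through `update` should trigger at least one reset, after which `predict` follows the new level.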
We review recent advances in modal regression studies using kernel density estimation. Modal regression is an alternative approach for investigating the relationship between a response variable and its covariates. Specifically, modal regression summarizes the interactions between the response variable and covariates using the conditional mode or local modes. We first describe the underlying model of modal regression and its estimators based on kernel density estimation. We then review the asymptotic properties of the estimators and strategies for choosing the smoothing bandwidth. We also discuss useful algorithms and similar alternative approaches for modal regression, and propose future directions in this field.
We identify a strong equivalence between neural network based machine learning (ML) methods and the formulation of statistical data assimilation (DA), known to be a problem in statistical physics. DA, as used widely in physical and biological sciences, systematically transfers information in observations to a model of the processes producing the observations. The correspondence is that layer label in the ML setting is the analog of time in the data assimilation setting. Utilizing aspects of this equivalence we discuss how to establish the global minimum of the cost functions in the ML context, using a variational annealing method from DA. This provides a design method for optimal networks for ML applications and may serve as the basis for understanding the success of ‘deep learning’. Results from an ML example are presented. When the layer label is taken to be continuous, the Euler-Lagrange equation for the ML optimization problem is an ordinary differential equation, and we see that the problem being solved is a two point boundary value problem. The use of continuous layers is denoted ‘deepest learning’. The Hamiltonian version provides a direct rationale for back propagation as a solution method for the canonical momentum; however, it suggests other solution methods are to be preferred.
Generative adversarial networks (GANs) provide a way to learn deep representations without extensively annotated training data. They achieve this through deriving backpropagation signals through a competitive process involving a pair of networks. The representations that can be learned by GANs may be used in a variety of applications, including image synthesis, semantic image editing, style transfer, image super-resolution and classification. The aim of this review paper is to provide an overview of GANs for the signal processing community, drawing on familiar analogies and concepts where possible. In addition to identifying different methods for training and constructing GANs, we also point to remaining challenges in their theory and application.
Causal inference on multiple non-independent outcomes raises serious challenges, because multivariate techniques that properly account for the outcome’s dependence structure need to be considered. We focus on the case of binary outcomes framing our discussion in the potential outcome approach to causal inference. We define causal effects of treatment on joint outcomes introducing the notion of product outcomes. We also discuss a decomposition of the causal effect on product outcomes into intrinsic and extrinsic causal effects, which respectively provide information on treatment effect on the intrinsic (product) structure of the product outcomes and on the outcomes’ dependence structure. We propose a log-mean linear regression approach for modeling the distribution of the potential outcomes, which is particularly appealing because all the causal estimands of interest and the decomposition into intrinsic and extrinsic causal effects can be easily derived by model parameters. The method is illustrated in two randomized experiments concerning (i) the effect of the administration of oral pre-surgery morphine on pain intensity after surgery; and (ii) the effect of honey on nocturnal cough and sleep difficulty associated with childhood upper respiratory tract infections.
We use decision trees to build a helpdesk agent reference network to facilitate the on-the-job advising of junior or less experienced staff on how to better address telecommunication customer fault reports. Such reports generate field measurements and remote measurements which, when coupled with location data and client attributes, and fused with organization-level statistics, can produce models of how support should be provided. Beyond decision support, these models can help identify staff who can act as advisors, based on the quality, consistency and predictability of dealing with complex troubleshooting reports. Advisor staff models are then used to guide less experienced staff in their decision making; thus, we advocate the deployment of a simple mechanism which exploits the availability of staff with a sound track record at the helpdesk to act as dormant tutors.
This paper addresses the land cover classification task for remote sensing images by deep self-taught learning. Our self-taught learning approach learns suitable feature representations of the input data using sparse representation and undercomplete dictionary learning. We propose a deep learning framework which extracts representations in multiple layers and uses the output of the deepest layer as input to a classification algorithm. We evaluate our approach using a multispectral Landsat 5 TM image of a study area north of Novo Progresso (South America) and the Zurich Summer Data Set provided by the University of Zurich. Experiments indicate that features learned by a deep self-taught learning framework can be used for classification and improve the results compared to classification using the original feature representation.
Sea level change, one of the most dire impacts of anthropogenic global warming, will affect a large share of the world’s population. However, sea level change is not uniform in time and space, and the skill of conventional prediction methods is limited due to the ocean’s internal variability on timescales from weeks to decades. Here we study the potential of neural network methods, which have been used successfully in other applications but have rarely been applied to this task. We develop a combination of a convolutional neural network (CNN) and a recurrent neural network (RNN) to analyse both the spatial and the temporal evolution of sea level and to suggest an independent, accurate method to predict interannual sea level anomalies (SLA). We test our method for the northern and equatorial Pacific Ocean, using gridded altimeter-derived SLA data. We show that the network designs used outperform a simple regression and that adding a CNN improves the skill significantly. The predictions are stable over several years.
Deep learning typically requires training a very capable architecture using large datasets. However, many important learning problems demand an ability to draw valid inferences from small datasets, and such problems pose a particular challenge for deep learning. In this regard, research on ‘meta-learning’ is being actively conducted. Recent work has suggested a Memory Augmented Neural Network (MANN) for meta-learning. MANN is an implementation of a Neural Turing Machine (NTM) with the ability to rapidly assimilate new data in its memory, and use this data to make accurate predictions. In models such as MANN, the input data samples and their appropriate labels from the previous step are bound together in the same memory locations. This often leads to memory interference when performing a task, as these models have to retrieve a feature of an input from a certain memory location and read only the label information bound to that location. In this paper, we address this issue by presenting a more robust MANN. We revisit the idea of meta-learning and propose a new memory augmented neural network that explicitly splits the external memory into feature and label memories. The feature memory is used to store the features of input data samples and the label memory stores their labels. Hence, when predicting the label of a given input, our model uses its feature memory unit as a reference to extract the stored feature of the input, and based on that feature, it retrieves the label information of the input from the label memory unit. In order for the network to function in this framework, a new memory-writing module is designed to encode label information into the label memory in accordance with the meta-learning task structure. Here, we demonstrate that our model outperforms MANN by a large margin in supervised one-shot classification tasks using the Omniglot and MNIST datasets.
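The split between feature memory and label memory can be pictured with a small content-addressed store. The sketch below is an assumption-laden illustration, not the paper's network: it replaces the learned controller and memory-writing module with plain appends, and it reads with a softmax over cosine similarities. The class name `SplitMemory` and the `sharpness` parameter are hypothetical.

```python
import math


class SplitMemory:
    """Toy external memory with separate feature and label stores.

    Row i of `feature_memory` and row i of `label_memory` describe the same
    stored sample; addressing uses only the features.
    """

    def __init__(self, num_classes, sharpness=10.0):
        self.num_classes = num_classes
        self.sharpness = sharpness      # scales similarities before the softmax
        self.feature_memory = []
        self.label_memory = []

    def write(self, feature, label):
        """Store a feature vector and its one-hot label in matching slots."""
        self.feature_memory.append(list(feature))
        one_hot = [0.0] * self.num_classes
        one_hot[label] = 1.0
        self.label_memory.append(one_hot)

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm > 0 else 0.0

    def read(self, query):
        """Address by feature similarity, then read from the label memory."""
        if not self.feature_memory:
            raise ValueError("memory is empty")
        scores = [self.sharpness * self._cosine(query, f) for f in self.feature_memory]
        top = max(scores)                               # subtract max for stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted sum of one-hot labels gives a distribution over classes.
        return [
            sum(w * row[c] for w, row in zip(weights, self.label_memory))
            for c in range(self.num_classes)
        ]
```

Because the query is compared only against stored features, label content never enters the addressing step, which is the separation the abstract motivates.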
In this study, we present Swift Linked Data Miner, an interruptible algorithm that can directly mine an online Linked Data source (e.g., a SPARQL endpoint) for OWL 2 EL class expressions to extend an ontology with new SubClassOf: axioms. The algorithm works by downloading only a small part of the Linked Data source at a time, building a smart index in memory and swiftly iterating over the index to mine axioms. We propose a transformation function from mined axioms to RDF Data Shapes. We show, by means of a crowdsourcing experiment, that most of the axioms mined by Swift Linked Data Miner are correct and can be added to an ontology. We provide a ready-to-use Protégé plugin implementing the algorithm, to support ontology engineers in their daily modeling work.
Consider a polynomial optimisation problem, whose instances vary continuously over time. We propose to use a coordinate-descent algorithm for solving such time-varying optimisation problems. In particular, we focus on relaxations of transmission-constrained problems in power systems. On the example of the alternating-current optimal power flows (ACOPF), we bound the difference between the current approximate optimal cost generated by our algorithm and the optimal cost for a relaxation using the most recent data from above by a function of the properties of the instance and the rate of change to the instance over time. We also bound the number of floating-point operations that need to be performed between two updates in order to guarantee the error is bounded from above by a given constant.
Reducing labeling costs in supervised learning is a critical issue in many practical machine learning applications. In this paper, we consider positive-confidence (Pconf) classification, the problem of training a binary classifier only from positive data equipped with confidence. Pconf classification can be regarded as a discriminative extension of one-class classification (which is aimed at ‘describing’ the positive class), with ability to tune hyper-parameters for ‘classifying’ positive and negative samples. Pconf classification is also related to positive-unlabeled (PU) classification (which uses hard-labeled positive data and unlabeled data), allowing us to avoid estimating the class priors, which is a critical bottleneck in typical PU classification methods. For the Pconf classification problem, we provide a simple empirical risk minimization framework and give a formulation for linear-in-parameter models that can be implemented easily and computationally efficiently. We also theoretically establish the consistency and generalization error bounds for Pconf classification, and demonstrate the practical usefulness of the proposed method through experiments.
Recent advances in model compression have provided procedures for compressing large neural networks to a fraction of their original size while retaining most if not all of their accuracy. However, all of these approaches rely on access to the original training set, which might not always be possible if the network to be compressed was trained on a very large dataset, or on a dataset whose release poses privacy or safety concerns as may be the case for biometrics tasks. We present a method for data-free knowledge distillation, which is able to compress deep neural networks trained on large-scale datasets to a fraction of their size leveraging only some extra metadata to be provided with a pretrained model release. We also explore different kinds of metadata that can be used with our method, and discuss tradeoffs involved in using each of them.
In the change point detection task, the likelihood ratio test (LRT) is applied sequentially in a sliding-window procedure. High values of the test statistic indicate a change in the parametric distribution of the data sequence. Accordingly, the LRT values require a predefined bound on their maximum. This maximum has an unknown distribution, which may be calibrated with the multiplier bootstrap. The bootstrap procedure multiplies the independent components of the likelihood function by random weights, which makes it possible to estimate the LRT distribution empirically. For this empirical distribution of the LRT, we show convergence rates to the true distribution of the maximum.
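A minimal sketch of the sliding-window scan may help fix ideas. It assumes the simplest parametric family, a Gaussian mean shift with known variance, for which twice the log-likelihood ratio between the two halves of a window of length 2h is h(m_left - m_right)^2 / (2 sigma^2). The function names are hypothetical, and the bootstrap calibration of the threshold is not shown.

```python
import random


def sliding_lrt(data, h, sigma=1.0):
    """Return {t: 2*log-LR} for a mean shift at t, using h points on each side.

    Assumes independent Gaussian observations with known standard
    deviation `sigma`; the window for centre t is data[t-h:t+h].
    """
    prefix = [0.0]
    for x in data:                      # prefix sums give O(1) window means
        prefix.append(prefix[-1] + x)
    stats = {}
    for t in range(h, len(data) - h + 1):
        mean_left = (prefix[t] - prefix[t - h]) / h
        mean_right = (prefix[t + h] - prefix[t]) / h
        stats[t] = h * (mean_left - mean_right) ** 2 / (2.0 * sigma ** 2)
    return stats


def simulate_mean_shift(n, change_at, shift, seed=0):
    """Unit-variance Gaussian noise whose mean jumps by `shift` at `change_at`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) + (shift if i >= change_at else 0.0) for i in range(n)]
```

In practice the maximum of `stats.values()` would be compared with a critical value; the abstract's point is that this critical value has no closed form and is obtained by resampling.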
The importance of geo-spatial data in critical applications such as emergency response, transportation and agriculture has prompted the adoption of the recent GeoSPARQL standard in many RDF processing engines. In addition to large repositories of geo-spatial data (e.g., LinkedGeoData and OpenStreetMap), spatial data is also routinely found in automatically constructed knowledge bases such as Yago and WikiData. While there have been research efforts on efficient processing of spatial data in RDF/SPARQL, very little effort has gone into building end-to-end systems that can holistically handle complex SPARQL queries along with spatial filters. In this paper, we present Streak, an RDF data management system that is designed to support a wide range of queries with spatial filters, including complex joins, top-k, and higher-order relationships over spatially enriched databases. Streak introduces various novel features such as a careful identifier encoding strategy for spatial and non-spatial entities, the use of a semantics-aware Quad-tree index that allows for early termination, and a clever use of adaptive query processing with zero plan-switch cost. We show that Streak can scale to some of the largest publicly available semantic data resources, such as Yago3 and LinkedGeoData, which contain spatial entities and quantifiable predicates useful for result ranking. For experimental evaluations, we focus on top-k distance join queries and demonstrate that Streak outperforms popular spatial join algorithms as well as state-of-the-art end-to-end systems like Virtuoso and PostgreSQL.
Change point estimation in its offline version is traditionally performed by optimizing over the data set of interest, by considering each data point as the true location parameter and computing a data fit criterion. Subsequently, the data point that minimizes the criterion is declared as the change point estimate. For estimating multiple change points, the procedures are analogous in spirit, but significantly more involved in execution. Since change points are local discontinuities, only data points close to the actual change point provide useful information for estimation, while data points far away are superfluous, to the point where using only a few points close to the true parameter is just as precise as using the full data set. Leveraging this ‘locality principle’, we introduce a two-stage procedure for the problem at hand, which in the first stage uses a sparse subsample to obtain pilot estimates of the underlying change points, and in the second stage refines these estimates by sampling densely in appropriately defined neighborhoods around them. We establish that this method achieves the same rate of convergence and even virtually the same asymptotic distribution as the analysis of the full data, while reducing computational complexity to O(N^0.5) time (N being the length of the data set), as opposed to at least O(N) time for all current procedures, making it promising for the analysis of exceedingly long data sets with adequately spaced out change points. The main results are established under a signal plus noise model with independent and identically distributed error terms, but extensions to dependent data settings, as well as multiple stage (>2) procedures, are also provided. The performance of our procedure, which is coined ‘intelligent sampling’, is illustrated on both synthetic and real Internet data streams.
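The two-stage idea can be sketched for the simplest case of a single change in mean. This is an illustration under stated assumptions rather than the authors' procedure: the fit criterion is least squares, the subsample is an evenly spaced grid, and the names `best_split`, `two_stage_change_point`, `stride` and `radius` are hypothetical.

```python
import random


def best_split(values):
    """Least-squares single change point: return k so that values[:k] and
    values[k:] are best fitted by two separate means (1 <= k < len(values))."""
    n = len(values)
    total = sum(values)
    total_sq = sum(v * v for v in values)
    best_k, best_cost = 1, float("inf")
    left = 0.0
    for k in range(1, n):
        left += values[k - 1]
        right = total - left
        # Residual sum of squares of the two-mean fit.
        cost = total_sq - left * left / k - right * right / (n - k)
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


def two_stage_change_point(data, stride, radius):
    """Stage 1: pilot estimate from every `stride`-th point.
    Stage 2: dense search within `radius` of the pilot estimate."""
    coarse = data[::stride]
    pilot = best_split(coarse) * stride
    lo = max(0, pilot - radius)
    hi = min(len(data), pilot + radius)
    return lo + best_split(data[lo:hi])


def make_step_signal(n, change_at, shift, seed=0):
    """Unit-variance Gaussian noise whose mean jumps by `shift` at `change_at`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) + (shift if i >= change_at else 0.0) for i in range(n)]
```

With `stride` near the square root of N and `radius` a small multiple of `stride`, the two stages together touch on the order of sqrt(N) observations instead of all N, which is the source of the computational saving the abstract describes. Here `radius` should be at least `stride` so that the dense window covers the gap between neighbouring grid points.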
We propose a novel pooling strategy that learns how to adaptively rank deep convolutional features for selecting more informative representations. To this end, we exploit discriminative analysis to project the features onto a space whose dimensionality is determined by the number of classes in the dataset under study. This maps the notion of labels in the feature space into instances in the projected space. We employ these projected distances as a measure to rank the existing features with respect to their specific discriminant power for each individual class. We then apply multipartite ranking to score the separability of the instances and aggregate one-versus-all scores to compute an overall distinction score for each feature. For the pooling, we pick the features with the highest scores in a pooling window instead of using maximum, average, or stochastic random assignments. Our experiments on various benchmarks confirm that the proposed multipartite pooling strategy consistently improves the performance of deep convolutional networks through better generalization of the trained models to test-time data.
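The final pooling step mentioned in this abstract, picking the highest-scoring feature in each window rather than the largest activation, can be sketched as follows. The function name `score_guided_pool` is hypothetical, the per-feature scores are taken as a given input (the ranking procedure that would produce them is not reproduced), and non-overlapping square windows on a single 2-D map are an assumption.

```python
import numpy as np

def score_guided_pool(features, scores, size=2):
    """Per window, keep the activation whose score is highest.

    features, scores: 2-D arrays of the same shape.
    Rows and columns that do not fill a complete window are dropped.
    """
    features = np.asarray(features)
    scores = np.asarray(scores)
    h2, w2 = features.shape[0] // size, features.shape[1] // size

    def to_windows(a):
        # Reshape to (h2, w2, size*size): one flattened window per output cell.
        a = a[:h2 * size, :w2 * size].reshape(h2, size, w2, size)
        return a.transpose(0, 2, 1, 3).reshape(h2, w2, size * size)

    f, s = to_windows(features), to_windows(scores)
    pick = s.argmax(axis=-1)                  # position of the best score
    return np.take_along_axis(f, pick[..., None], axis=-1)[..., 0]
```

If the scores are set equal to the activations themselves, this reduces to ordinary max pooling, which makes it a convenient sanity check.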
Transfer learning is a popular practice in deep neural networks, but fine-tuning a large number of parameters is a hard task due to the complex wiring of neurons between splitting layers and the imbalanced distributions of data in the pretrained and transferred domains. Reconstructing the original wiring for the target domain is a heavy burden due to the number of interconnections across neurons. We propose a distributed scheme that tunes the convolutional filters individually while backpropagating them jointly by means of basic probability assignment. Some of the most recent advances in evidence theory show that, in a wide variety of imbalanced regimes, optimizing suitable objective functions derived from contingency matrices prevents biases towards high-prior class distributions. Therefore, the original filters are gradually transferred based on their individual contributions to the overall performance on the target domain. This largely reduces the expected complexity of transfer learning while substantially improving precision. Our experiments on standard benchmarks and scenarios confirm the consistent improvement of our distributed deep transfer learning strategy.
A common practice in most deep convolutional neural architectures is to employ fully-connected layers followed by Softmax activation to minimize cross-entropy loss for the sake of classification. Recent studies show that replacing the Softmax objective with, or adding it to, the cost functions of support vector machines or linear discriminant analysis is highly beneficial for the classification performance of hybrid neural networks. We propose a novel paradigm to link the optimization of several hybrid objectives through unified backpropagation. This greatly alleviates the burden of extensive boosting for independent objective functions or the complex formulation of multiobjective gradients. Hybrid loss functions are linked by basic probability assignment from evidence theory. We conduct experiments on a variety of scenarios and standard datasets to evaluate how well our proposed unification approach delivers consistent improvements in the classification performance of deep convolutional neural networks.
In this paper, we deal with the task of building a dynamic ensemble of chain classifiers for multi-label classification. To do so, we propose two classifier chain algorithms that are able to change the label order of the chain without rebuilding the entire model. Such models allow anticipating the instance-specific chain order without a significant increase in computational burden. The proposed chain models are built using the Naive Bayes classifier and the nearest neighbour approach as base single-label classifiers. To take advantage of the proposed algorithms, we developed a simple heuristic that allows the system to find a relatively good label order. The heuristic sorts labels according to the label-specific classification quality obtained during the validation phase, and it tries to minimise the phenomenon of error propagation in the chain. The experimental results showed that the proposed model based on the Naive Bayes classifier and the above-mentioned heuristic is an efficient tool for building dynamic chain classifiers.
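The label-ordering heuristic mentioned in this abstract can be illustrated with a short sketch. The abstract does not name the quality measure or the sort direction, so two assumptions are made here: quality is per-label accuracy on a validation set, and labels are ordered from best to worst so that the most reliable predictions come first in the chain. The function name is hypothetical.

```python
import numpy as np

def label_order_by_validation_quality(y_true, y_pred):
    """Order labels by per-label validation accuracy, best first.

    y_true, y_pred: binary arrays of shape (n_instances, n_labels).
    Returns (order, quality), where order is a list of label indices.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    quality = (y_true == y_pred).mean(axis=0)      # assumed quality measure
    # Stable sort keeps the original label order among ties.
    order = np.argsort(-quality, kind="stable")
    return order.tolist(), quality
```

Placing well-predicted labels early is one plausible way to limit error propagation, since later classifiers in a chain consume earlier predictions as inputs. Other quality measures, such as per-label F1, could be substituted without changing the structure.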
The optimization-based design of renewable energy systems is a computationally demanding task because of the high temporal fluctuation of supply and demand time series. To shorten these time series, aggregating them into typical operation periods has become common. The problem with this method is that the aggregated typical periods are modeled independently and cannot exchange energy. Therefore, seasonal storage cannot be adequately taken into account, although this will be necessary for energy systems with a high share of renewable generation. To address this issue, this paper proposes a novel mathematical description for storage inventories based on the superposition of inter-period and intra-period states. Inter-period states connect the typical periods and are able to account for their sequence. The approach has been applied to different energy system configurations. The results show that a significant reduction in the computational load can also be achieved for long-term storage-based energy system models, in comparison to optimization models based on the full annual time series.
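The superposition of inter-period and intra-period states can be illustrated numerically with a minimal sketch. It is not the paper's optimization model: the function name and arguments are hypothetical, charging is given as a fixed net-charge profile per typical period instead of being a decision variable, and self-discharge and conversion losses are ignored.

```python
import numpy as np

def superposed_storage_level(net_charge, period_sequence, initial_level=0.0):
    """Storage level as inter-period state plus intra-period state.

    net_charge: array (n_typical_periods, steps_per_period), net energy
        charged in each step of each typical period.
    period_sequence: typical-period index for each original period, in
        chronological order.
    Returns an array (n_original_periods, steps_per_period) holding the
    level at the end of every step.
    """
    net_charge = np.asarray(net_charge, dtype=float)
    seq = np.asarray(period_sequence)

    # Intra-period state: cumulative charge inside a typical period,
    # measured relative to the level at the start of that period.
    intra = np.cumsum(net_charge, axis=1)

    # Inter-period state: level at the start of each original period,
    # accumulated from the net change of the periods that preceded it.
    deltas = intra[seq, -1]
    inter = initial_level + np.concatenate(([0.0], np.cumsum(deltas)[:-1]))

    return inter[:, None] + intra[seq]
```

In this lossless setting the result matches a running sum over the fully expanded time series, while only the typical periods and their sequence need to be stored.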
Since the publication of ‘Complex Contagions and the Weakness of Long Ties’ in 2007, complex contagions have been studied across an enormous variety of social domains. In reviewing this decade of research, we discuss recent advancements in applied studies of complex contagions, particularly in the domains of health, innovation diffusion, social media, and politics. We also discuss how these empirical studies have spurred complementary advancements in the theoretical modeling of contagions, which concern the effects of network topology on diffusion, as well as the effects of individual-level attributes and thresholds. In synthesizing these developments, we suggest three main directions for future research. The first concerns the study of how multiple contagions interact within the same network and across networks, in what may be called an ecology of contagions. The second concerns the study of how the structure of thresholds and their behavioral consequences can vary by individual and social context. The third area concerns the roles of diversity and homophily in the dynamics of complex contagion, including both diversity of demographic profiles among local peers, and the broader notion of structural diversity within a network. Throughout this discussion, we make an effort to highlight the theoretical and empirical opportunities that lie ahead.