CrescendoNet: A Simple Deep Convolutional Neural Network with Ensemble Behavior
We introduce a new deep convolutional neural network, CrescendoNet, by stacking simple building blocks without residual connections. Each Crescendo block contains independent convolution paths with increased depths. The numbers of convolution layers and parameters are only increased linearly in Crescendo blocks. In experiments, CrescendoNet with only 15 layers outperforms almost all networks without residual connections on the benchmark datasets CIFAR10, CIFAR100, and SVHN. Given a sufficient amount of data, as in the SVHN dataset, CrescendoNet with 15 layers and 4.1M parameters can match the performance of DenseNet-BC with 250 layers and 15.3M parameters. CrescendoNet provides a new way to construct high-performance deep convolutional neural networks without residual connections. Moreover, through investigating the behavior and performance of subnetworks in CrescendoNet, we note that the high performance of CrescendoNet may come from its implicit ensemble behavior, which differs from that of FractalNet, another deep convolutional neural network without residual connections. Furthermore, the independence between paths in CrescendoNet allows us to introduce a new path-wise training procedure, which can reduce the memory needed for training.
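To make the block structure described above more concrete, here is a toy sketch of a block built from independent paths of increasing depth. It assumes the path depths are 1, 2, ..., n and that the path outputs are averaged; the abstract states neither, so both are illustrative assumptions. The "layers" are plain scaling functions standing in for convolutions, and all names (`make_path`, `make_block`, `run_block`) are hypothetical.

```python
from typing import Callable, List

Layer = Callable[[float], float]


def make_path(depth: int, scale: float) -> List[Layer]:
    # Stand-in layers: each one scales its input (a real block would use convolutions).
    return [lambda x, s=scale: s * x for _ in range(depth)]


def make_block(num_paths: int, scale: float = 0.5) -> List[List[Layer]]:
    # Assumption: path k has depth k, so depth grows linearly across paths.
    # Each path owns its own layers; nothing is shared between paths.
    return [make_path(depth, scale) for depth in range(1, num_paths + 1)]


def run_path(path: List[Layer], x: float) -> float:
    # Apply the layers of a single path in sequence.
    for layer in path:
        x = layer(x)
    return x


def run_block(block: List[List[Layer]], x: float) -> float:
    # Assumption: the block output is the average of the independent path outputs.
    outputs = [run_path(path, x) for path in block]
    return sum(outputs) / len(outputs)


block = make_block(3)
depths = [len(path) for path in block]   # [1, 2, 3]
y = run_block(block, 8.0)                # mean of 4.0, 2.0 and 1.0
deepest_only = run_path(block[2], 8.0)   # a single path evaluated on its own
```

Because no layer is shared, any single path (such as `block[2]` above) can be evaluated or updated in isolation, which is the property the abstract credits for making path-wise training possible.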
How Algorithmic Confounding in Recommendation Systems Increases Homogeneity and Decreases Utility
Recommendation systems occupy an expanding role in everyday decision making, from choice of movies and household goods to consequential medical and legal decisions. The data used to train and test these systems is algorithmically confounded in that it is the result of a feedback loop between human choices and an existing algorithmic recommendation system. Using simulations, we demonstrate that algorithmic confounding can disadvantage algorithms in training, bias held-out evaluation, and amplify homogenization of user behavior without gains in utility.
Machine Learning and Cognitive Technology for Intelligent Wireless Networks
The ability to dynamically and efficiently allocate resources to meet the need of growing diversity in services and user behavior marks the future of wireless networks, giving rise to intelligent processing, which aims at enabling the system to perceive and assess the available resources, to autonomously learn to adapt to the perceived wireless environment, and to reconfigure its operating mode to maximize the utility of the available resources. The perception capability and reconfigurability are the essential features of cognitive technology while modern machine learning techniques project effectiveness in system adaptation. In this paper, we discuss the development of the cognitive technology and machine learning techniques and emphasize their roles in improving both spectrum and energy efficiency of the future wireless networks. We describe in detail the state-of-the-art of cognitive technology, covering spectrum sensing and access approaches that may enhance spectrum utilization and curtail energy consumption. We discuss powerful machine learning algorithms that enable spectrum- and energy-efficient communications in dynamic wireless environments. We also present practical applications of these techniques to the existing and future wireless communication systems, such as heterogeneous networks and device-to-device communications, and identify some research opportunities and challenges in cognitive technology and machine learning as applied to future wireless networks.
Consistency of Generalized Dynamic Principal Components in Dynamic Factor Models
We study the theoretical properties of the generalized dynamic principal components introduced in Peña and Yohai (2016). In particular, we prove that when the data follows a dynamic factor model, the reconstruction provided by the procedure converges in mean square to the common part of the model as the number of series and periods diverge to infinity. The results of a simulation study support our findings.
Bayesian Learning of Random Graphs & Correlation Structure of Multivariate Data, with Distance between Graphs
We present a method for the simultaneous Bayesian learning of the correlation matrix and graphical model of a multivariate dataset, using Metropolis-within-Gibbs inference. Here, the data comprises measurements of a vector-valued observable that we model using a high-dimensional Gaussian Process (GP), such that the likelihood of the GP parameters given the data is Matrix-Normal, defined by a mean matrix and between-rows and between-columns covariance matrices. We marginalise over the between-row matrices to achieve a closed-form likelihood of the between-columns correlation matrix, given the data. This correlation matrix is updated in the first block of an iteration, given the data, and the (generalised Binomial) graph is updated in the second block, at the partial correlation matrix that is computed given the updated correlation. We also learn the 95% Highest Probability Density credible regions of the correlation matrix as well as the graphical model of the data. The difference made by acknowledging measurement errors in learning the graphical model is demonstrated on a small simulated dataset, while the large human disease-symptom network is learnt using real data. Data on the vino-chemical attributes of Portuguese red and white wine samples are employed to learn the correlation structure and graphical model of each dataset, and then to compute the distance between the learnt graphical models.
Generating Natural Adversarial Examples
Due to their complex nature, it is hard to characterize the ways in which machine learning models can misbehave or be exploited when deployed. Recent work on adversarial examples, i.e. inputs with minor perturbations that result in substantially different model predictions, is helpful in evaluating the robustness of these models by exposing the adversarial scenarios where they fail. However, these malicious perturbations are often unnatural, not semantically meaningful, and not applicable to complicated domains such as language. In this paper, we propose a framework to generate natural and legible adversarial examples by searching in semantic space of dense and continuous data representation, utilizing the recent advances in generative adversarial networks. We present generated adversaries to demonstrate the potential of the proposed approach for black-box classifiers in a wide range of applications such as image classification, textual entailment, and machine translation. We include experiments to show that the generated adversaries are natural, legible to humans, and useful in evaluating and analyzing black-box classifiers.
Tensor Regression Meets Gaussian Processes
Low-rank tensor regression, a new model class that learns high-order correlation from data, has recently received considerable attention. At the same time, Gaussian processes (GP) are well-studied machine learning models for structure learning. In this paper, we demonstrate interesting connections between the two, especially for multi-way data analysis. We show that low-rank tensor regression is essentially learning a multi-linear kernel in Gaussian processes, and the low-rank assumption translates to the constrained Bayesian inference problem. We prove the oracle inequality and derive the average case learning curve for the equivalent GP model. Our finding implies that low-rank tensor regression, though empirically successful, is highly dependent on the eigenvalues of covariance functions as well as variable correlations.
SemTK: An Ontology-first, Open Source Semantic Toolkit for Managing and Querying Knowledge Graphs
The relatively recent adoption of Knowledge Graphs as an enabling technology in multiple high-profile artificial intelligence and cognitive applications has led to growing interest in the Semantic Web technology stack. Many semantics-related tools, however, are focused on serving experts with a deep understanding of semantic technologies. For example, triplification of relational data is available but there is no open source tool that allows a user unfamiliar with OWL/RDF to import data into a semantic triple store in an intuitive manner. Further, many tools require users to have a working understanding of SPARQL to query data. Casual users interested in benefiting from the power of Knowledge Graphs have few tools available for exploring, querying, and managing semantic data. We present SemTK, the Semantics Toolkit, a user-friendly suite of tools that allow both expert and non-expert semantics users convenient ingestion of relational data, simplified query generation, and more. The exploration of ontologies and instance data is performed through SPARQLgraph, an intuitive web-based user interface in SemTK understandable and navigable by a lay user. The open source version of SemTK is available at http://semtk.research.ge.com.
TF Boosted Trees: A scalable TensorFlow based framework for gradient boosting
TF Boosted Trees (TFBT) is a new open-sourced framework for the distributed training of gradient boosted trees. It is based on TensorFlow, and its distinguishing features include a novel architecture, automatic loss differentiation, layer-by-layer boosting that results in smaller ensembles and faster prediction, principled multi-class handling, and a number of regularization techniques to prevent overfitting.
Partial Least Squares Random Forest Ensemble Regression as a Soft Sensor
Six simple, dynamic soft sensor methodologies with two update conditions were compared on two experimentally-obtained datasets and one simulated dataset. The soft sensors investigated were: moving window partial least squares regression (and a recursive variant), moving window random forest regression, feedforward neural networks, mean moving window, and a novel random forest partial least squares regression ensemble (RF-PLS). We found that, on two of the datasets studied, very small window sizes (4 samples) led to the lowest prediction errors. The RF-PLS method offered the lowest one-step-ahead prediction errors compared to those of the other methods, and demonstrated greater stability at larger time lags than moving window PLS alone. We found that this method most adequately modeled the datasets that did not feature purely monotonic increases in property values. In general, we observed that linear models deteriorated most rapidly at more delayed model update conditions while nonlinear methods tended to provide predictions that approached those from a simple mean moving window. Other data dependent findings are presented and discussed.
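As a rough illustration of the simplest baseline named above, the mean moving window, the sketch below predicts each next value as the mean of the most recent observations. The function names, the toy series, and the default window of 4 samples (chosen only to echo the small window size mentioned in the abstract) are assumptions for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def moving_window_mean_forecast(series: Sequence[float], window: int = 4) -> List[float]:
    """One-step-ahead forecasts: the prediction for index t is the mean of series[t-window:t]."""
    if window < 1:
        raise ValueError("window must be at least 1")
    forecasts = []
    for t in range(window, len(series)):
        recent = series[t - window:t]
        forecasts.append(sum(recent) / window)
    return forecasts


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute gap between paired actual and predicted values."""
    if not predicted:
        raise ValueError("no predictions to score")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(predicted)


# Toy, steadily increasing series (hypothetical data).
toy_series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
preds = moving_window_mean_forecast(toy_series, window=4)
mae = mean_absolute_error(toy_series[4:], preds)
```

On a steadily increasing series like this one, the window mean always lags behind the newest values, which is why such a baseline is mainly useful as a reference point for the other methods.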
Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm
Learning to learn is a powerful paradigm for enabling models to learn from data more effectively and efficiently. A popular approach to meta-learning is to train a recurrent model to read in a training dataset as input and output the parameters of a learned model, or output predictions for new test inputs. Alternatively, a more recent approach to meta-learning aims to acquire deep representations that can be effectively fine-tuned, via standard gradient descent, to new tasks. In this paper, we consider the meta-learning problem from the perspective of universality, formalizing the notion of learning algorithm approximation and comparing the expressive power of the aforementioned recurrent models to the more recent approaches that embed gradient descent into the meta-learner. In particular, we seek to answer the following question: does deep representation combined with standard gradient descent have sufficient capacity to approximate any learning algorithm? We find that this is indeed true, and further find, in our experiments, that gradient-based meta-learning consistently leads to learning strategies that generalize more widely compared to those represented by recurrent models.
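The fine-tuning step that gradient-based meta-learning relies on can be sketched as follows: start from a shared initialization and take standard gradient descent steps on a new task's data. This only illustrates the inner adaptation step, not meta-training or the paper's universality construction. The scalar model `y = w * x`, the initial weight, the learning rate, and the toy task are all hypothetical.

```python
from typing import Sequence


def mse(w: float, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Mean squared error of the scalar model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def adapt(w_init: float, xs: Sequence[float], ys: Sequence[float],
          lr: float = 0.1, steps: int = 1) -> float:
    """Fine-tune w from w_init with plain gradient descent on the MSE."""
    w = w_init
    n = len(xs)
    for _ in range(steps):
        # d/dw of the mean of (w*x - y)^2 is the mean of 2*(w*x - y)*x.
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w


# Hypothetical new task whose true slope is 3.
task_xs = [1.0, 2.0]
task_ys = [3.0, 6.0]

w0 = 1.0  # stands in for a meta-learned initialization (assumed, not learned here)
w1 = adapt(w0, task_xs, task_ys, lr=0.1, steps=1)
loss_before = mse(w0, task_xs, task_ys)
loss_after = mse(w1, task_xs, task_ys)
```

In a full gradient-based meta-learner, `w0` would itself be trained so that a few such steps work well across many tasks; here it is fixed by hand to keep the sketch short.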
• The Capacity of Private Computation
• (Quasi)Periodic revivals in periodically driven interacting quantum systems
• Analysis, Identification, and Validation of Discrete-Time Epidemic Processes
• A stochastic model for evolution with mass extinction on $\mathbb{T}_d^+$
• High efficiency compression for object detection
• Onsets and Frames: Dual-Objective Piano Transcription
• Creation of an Annotated Corpus of Spanish Radiology Reports
• Super-polynomial separations for quantum-enhanced reinforcement learning
• Indirect Supervision for Relation Extraction using Question-Answer Pairs
• Adjusted quantile residual for generalized linear models
• Sedentary quantum walks
• Sample-efficient Policy Optimization with Stein Control Variate
• VLSI Computational Architectures for the Arithmetic Cosine Transform
• Deep word embeddings for visual speech recognition
• Scaling Limits of Processes with Fast Nonlinear Mean Reversion
• Improve SAT-solving with Machine Learning
• Critical Points of Neural Networks: Analytical Forms and Landscape Properties
• Prophet Secretary for Combinatorial Auctions and Matroids
• Deep Learning and Conditional Random Fields-based Depth Estimation and Topographical Reconstruction from Conventional Endoscopy
• Location-adjusted Wald statistic for scalar parameters
• Fast and Scalable Learning of Sparse Changes in High-Dimensional Gaussian Graphical Model Structure
• Bibliometric-Enhanced Information Retrieval: 5th International BIR Workshop
• Prototype Matching Networks for Large-Scale Multi-label Genomic Sequence Classification
• Time-lagged autoencoders: Deep learning of slow collective variables for molecular kinetics
• Theoretical properties of the global optimizer of two layer neural network
• Integer polygons of given perimeter
• A Dynamic Hash Table for the GPU
• Learning Robust Rewards with Adversarial Inverse Reinforcement Learning
• Rock-Paper-Scissors, Differential Games and Biological Diversity
• Reachability Preservers: New Extremal Bounds and Approximation Algorithms
• Stochastic Variational Video Prediction
• Approximation Algorithms for $\ell_0$-Low Rank Approximation
• Adaptive Sampling Strategies for Stochastic Optimization
• Implicit Manifold Learning on Generative Adversarial Networks
• Some network conditions for positive recurrence of stochastically modeled reaction networks
• Theoretical and Computational Guarantees of Mean Field Variational Inference for Community Detection
• Empirical analysis of non-linear activation functions for Deep Neural Networks in classification tasks
• Adversarial Advantage Actor-Critic Model for Task-Completion Dialogue Policy Learning
• Approximating Continuous Functions by ReLU Nets of Minimal Width
• Notes on Cops and Robber game on graphs
• Macroeconomics and FinTech: Uncovering Latent Macroeconomic Effects on Peer-to-Peer Lending
• Optimal Control of Connected and Automated Vehicles at Roundabouts: An Investigation in a Mixed-Traffic Environment
• Sequential Adaptive Detection for In-Situ Transmission Electron Microscopy (TEM)
• Tensor Sketching: Sparsification and Rank-One Projection
• A generalized parsing framework for Abstract Grammars
• Stochastic Linear Quadratic Optimal Control with General Control Domain
• Algorithmic learning of probability distributions from random data in the limit
• Characterizing the structural diversity of complex networks across domains
• Critical behaviour of a probabilistic cellular automaton model for the dynamics of a population driven by logistic growth and weak Allee effect
• The Exact Solution to Rank-1 L1-norm TUCKER2 Decomposition
• Generalized Forward-Backward Splitting with Penalization for Monotone Inclusion Problems
• Tumor Classification and Segmentation of MR Brain Images
• An Innovations Approach to Viterbi Decoding of Convolutional Codes
• Deep Forward and Inverse Perceptual Models for Tracking and Prediction
• Rate-optimal Meta Learning of Classification Error
• Emergence and Relevance of Criticality in Deep Learning
• Gaussian Approximation of the Distribution of Strongly Repelling Particles on the Unit Circle
• A quenched variational principle for discrete random maps
• Improving Social Media Text Summarization by Learning Sentence Weight Distribution
• Shallow Discourse Parsing with Maximum Entropy Model
• Coarse-Graining Open Markov Processes
• A Sequential Matching Framework for Multi-turn Response Selection in Retrieval-based Chatbots
• Mildly context sensitive grammar induction and variational bayesian inference
• ChainerMN: Scalable Distributed Deep Learning Framework
• Variations of the cop and robber game on graphs
• Spatio-temporal interaction model for crowd video analysis
• Image Patch Matching Using Convolutional Descriptors with Euclidean Distance
• Capacity-Achieving PIR Schemes with Optimal Sub-Packetization
• Intermittent quasistatic dynamical systems: weak convergence of fluctuations
• A Computer Vision System to Localize and Classify Wastes on the Streets
• Latent Space Oddity: on the Curvature of Deep Generative Models
• Semantic Interpolation in Implicit Models
• Flexible Prior Distributions for Deep Generative Models
• Parametrizing filters of a CNN with a GAN
• Updating the VESICLE-CNN Synapse Detector
• Boolean convolutions and regular variation
• Reshaping Cellular Networks for the Sky: The Major Factors and Feasibility
• Continuum percolation for Cox point processes
• A Scaled Smart City for Experimental Validation of Connected and Automated Vehicles
• Improved Bounds for Online Dominating Sets of Trees
• Two extensions of the Erdős–Szekeres problem
• TreeQN and ATreeC: Differentiable Tree Planning for Deep Reinforcement Learning
• Joint Cooperative Computation and Interactive Communication for Relay-Assisted Mobile Edge Computing
• Reconnecting statistical physics and combinatorics beyond ensemble equivalence
• Regret Minimization for Partially Observable Deep Reinforcement Learning
• SVSGAN: Singing Voice Separation via Generative Adversarial Network
• Physics-guided Neural Networks (PGNN): An Application in Lake Temperature Modeling
• Stochastic Maximum Principle under Probability Distortion
• Learning Neural Representations of Human Cognition across Many fMRI Studies
• Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization
• Deep Hashing with Triplet Quantization Loss
• Clothing Retrieval with Visual Attention Model
• Breaking the Interference Barrier in Dense Wireless Networks with Interference Alignment
• Marginal false discovery rates for penalized likelihood methods
• Investigating the effect of social groups in uni-directional pedestrian flow
• Manipulation Strategies for the Rank Maximal Matching Problem
• Guarding Against Adversarial Domain Shifts with Counterfactual Regularization
• Immersion of transitive tournaments in digraphs with large minimum outdegree
• Optimal Control of Endo-Atmospheric Launch Vehicle Systems: Geometric and Computational Issues
• Asymptotically Distribution-Free Goodness-of-Fit Testing for Copulas
• A multi-layer network based on Sparse Ternary Codes for universal vector compression
• Designing RNA Secondary Structures is Hard
• On the List-Decodability of Random Linear Rank-Metric Codes
• Energy Efficiency of Multi-user Multi-antenna Random Cellular Networks with Minimum Distance Constraints
• Extracting Syntactic Patterns from Databases
• A 4D-Var Method with Flow-Dependent Background Covariances for the Shallow-Water Equations
• Parameter Estimation in Mean Reversion Processes with Periodic Functional Tendency
• Compact Multi-Class Boosted Trees
• Modelo de Tratamiento para Tumores en Presencia de Radiación
• Discussion of ‘Data-driven confounder selection via Markov and Bayesian networks’ by Jenny Häggström
• Deep Learning as a Mixed Convex-Combinatorial Optimization Problem
• Bypass rewiring and extreme robustness of Eulerian networks
• Learning Graph Convolution Filters from Data Manifold
• Energy-Aware Virtual Network Embedding Approach for Distributed Cloud
• On Learning Mixtures of Well-Separated Gaussians
• Universal Constraints on the Location of Extrema of Eigenfunctions of Non-Local Schrödinger Operators
• Multiple Instance Hybrid Estimator for Hyperspectral Target Characterization and Sub-pixel Target Detection
• Whodunnit? Crime Drama as a Case for Natural Language Understanding
• Lower Bounds for Finding Stationary Points I
• Quasisymmetric Power Sums
• Space-filling design for nonlinear models
• Cellular-Enabled UAV Communication: Trajectory Optimization Under Connectivity Constraint
• Delocalization of Polymers in Lower Tail Large Deviation