The Role of Conditional Independence in the Evolution of Intelligent Systems

Systems are typically made from simple components regardless of their complexity. While the function of each part is easily understood, higher order functions are emergent properties and are notoriously difficult to explain. In networked systems, both digital and biological, each component receives inputs, performs a simple computation, and creates an output. When these components have multiple outputs, we intuitively assume that the outputs are causally dependent on the inputs but are themselves independent of each other given the state of their shared input. However, this intuition can be violated for components with probabilistic logic, as these typically cannot be decomposed into separate logic gates with one output each. This violation of conditional independence on the past system state is equivalent to instantaneous interaction — the idea is that some information between the outputs is not coming from the inputs and thus must have been created instantaneously. Here we compare evolved artificial neural systems with and without instantaneous interaction across several task environments. We show that systems without instantaneous interactions evolve faster, to higher final levels of performance, and require fewer logic components to create a densely connected cognitive machinery.

Deep Neural Networks for Survival Analysis Based on a Multi-Task Framework

Survival analysis/time-to-event models are extremely useful as they can help companies predict when a customer will buy a product, churn or default on a loan, and therefore help them improve their ROI. In this paper, we introduce a new method to calculate survival functions using the Multi-Task Logistic Regression (MTLR) model as its base and a deep learning architecture as its core. Based on the Concordance index (C-index) and Brier score, this method outperforms the MTLR in all the experiments disclosed in this paper as well as the Cox Proportional Hazard (CoxPH) model when nonlinear dependencies are found.
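
The Concordance index named in the abstract can be illustrated with a minimal sketch. It uses the common pairwise (Harrell-style) definition and is not taken from the paper; the function name, argument names, and toy data are assumptions made for illustration.

```python
def concordance_index(times, events, risk_scores):
    """Pairwise C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i had an observed event
    and its time is strictly earlier than subject j's time. The pair is
    concordant when the model assigns i the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data: event flag 0 marks a censored observation.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.5, 0.1]
c_index = concordance_index(times, events, risks)
```

A value of 1 means every comparable pair is ordered correctly, 0.5 corresponds to random ordering, and 0 means every pair is reversed.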

Query2Vec: NLP Meets Databases for Generalized Workload Analytics

We propose methods for learning vector representations of SQL workloads to support a variety of administration tasks and application features, including query recommendation, workload summarization, index selection, identifying expensive queries, and predicting query reuse. We consider vector representations of both raw SQL text and optimized query plans under various assumptions and pre-processing strategies, and evaluate these methods on multiple real SQL workloads by comparing with results of task and application feature metrics in the literature. We find that simple algorithms based on these generic vector representations compete favorably with previous approaches that require a number of assumptions and task-specific heuristics. We then present a new embedding strategy specialized for queries based on tree-structured Long Short Term Memory (LSTM) network architectures that improves on the text-oriented embeddings for some tasks. We find that the general approach, when trained on a large corpus of SQL queries, provides a robust foundation for a variety of workload analysis tasks. We conclude by considering how workload embeddings can be deployed as a core database system feature to support database maintenance and novel applications.

Panel data analysis via mechanistic models

Panel data, also known as longitudinal data, consist of a collection of time series. Each time series, which could itself be multivariate, comprises a sequence of measurements taken on a distinct unit. Mechanistic modeling involves writing down scientifically motivated equations describing the collection of dynamic systems giving rise to the observations on each unit. A defining characteristic of panel systems is that the dynamic interaction between units should be negligible. Panel models therefore consist of a collection of independent stochastic processes, generally linked through shared parameters while also having unit-specific parameters. To give the scientist flexibility in model specification, we are motivated to develop a framework for inference on panel data permitting the consideration of arbitrary nonlinear, partially observed panel models. We build on iterated filtering techniques that provide likelihood-based inference on nonlinear partially observed Markov process models for time series data. Our methodology depends on the latent Markov process only through simulation; this plug-and-play property ensures applicability to a large class of models. We demonstrate our methodology on a toy example and two epidemiological case studies. We address inferential and computational issues arising for large panel datasets.

Toward Scalable Verification for Safety-Critical Deep Networks

The increasing use of deep neural networks for safety-critical applications, such as autonomous driving and flight control, raises concerns about their safety and reliability. Formal verification can address these concerns by guaranteeing that a deep learning system operates as intended, but the state-of-the-art is limited to small systems. In this work-in-progress report we give an overview of our work on mitigating this difficulty, by pursuing two complementary directions: devising scalable verification techniques, and identifying design choices that result in deep learning systems that are more amenable to verification.

Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs

The driving force behind the recent success of LSTMs has been their ability to learn complex and non-linear relationships. Consequently, our inability to describe these relationships has led to LSTMs being characterized as black boxes. To this end, we introduce contextual decomposition (CD), an interpretation algorithm for analysing individual predictions made by standard LSTMs, without any changes to the underlying model. By decomposing the output of an LSTM, CD captures the contributions of combinations of words or variables to the final prediction of an LSTM. On the task of sentiment analysis with the Yelp and SST data sets, we show that CD is able to reliably identify words and phrases of contrasting sentiment, and how they are combined to yield the LSTM’s final prediction. Using the phrase-level labels in SST, we also demonstrate that CD is able to successfully extract positive and negative negations from an LSTM, something which has not previously been done.

Testing Separability of Functional Time Series

We derive and study a significance test for determining if a panel of functional time series is separable. In the context of this paper, separability means that the covariance structure factors into the product of two functions, one depending only on time and the other depending only on the coordinates of the panel. Separability is a property which can dramatically improve computational efficiency by substantially reducing model complexity. It is especially useful for functional data as it implies that the functional principal components are the same for each member of the panel. However, such an assumption must be verified before proceeding with further inference. Our approach is based on functional norm differences and provides a test with well-controlled size and high power. We establish our procedure quite generally, allowing one to test separability of autocovariances as well. In addition to an asymptotic justification, our methodology is validated by a simulation study. It is applied to functional panels of particulate pollution and stock market data.

Reinforcement Learning based Recommender System using Biclustering Technique

A recommender system aims to recommend items that a user is interested in among many items. The need for recommender systems has been expanded by the information explosion. Various approaches have been suggested for providing meaningful recommendations to users. One of the proposed approaches is to consider a recommender system as a Markov decision process (MDP) problem and try to solve it using reinforcement learning (RL). However, existing RL-based methods have an obvious drawback. To solve an MDP in a recommender system, they encounter a problem with the large number of discrete actions, which brings RL to a larger class of problems. In this paper, we propose a novel RL-based recommender system. We formulate a recommender system as a gridworld game by using a biclustering technique that can reduce the state and action space significantly. Using biclustering not only reduces the space but also improves the recommendation quality, effectively handling the cold-start problem. In addition, our approach can provide users with some explanation of why the system recommends certain items. Lastly, we examine the proposed algorithm on a real-world dataset and achieve better performance than the widely used recommendation algorithm.

ALE: Additive Latent Effect Models for Grade Prediction

The past decade has seen a growth in the development and deployment of educational technologies for assisting college-going students in choosing majors, selecting courses and acquiring feedback based on past academic performance. Grade prediction methods seek to estimate a grade that a student may achieve in a course that she may take in the future (e.g., next term). Accurate and timely prediction of students’ academic grades is important for developing effective degree planners and early warning systems, and ultimately improving educational outcomes. Existing grade prediction methods mostly focus on modeling the knowledge components associated with each course and student, and often overlook other factors such as the difficulty of each knowledge component, course instructors, student interest, capabilities and effort. In this paper, we propose additive latent effect models that incorporate these factors to predict students’ next-term grades. Specifically, the proposed models take into account four factors: (i) student’s academic level, (ii) course instructors, (iii) student global latent factor, and (iv) latent knowledge factors. We compared the new models with several state-of-the-art methods on students of various characteristics (e.g., whether a student transferred in or not). The experimental results demonstrate that the proposed methods significantly outperform the baselines on the grade prediction problem. Moreover, we perform a thorough analysis on the importance of different factors and how these factors can practically assist students in course selection, and finally improve their academic performance.

Semi-supervised FusedGAN for Conditional Image Generation

We present FusedGAN, a deep network for conditional image synthesis with controllable sampling of diverse images. Fidelity, diversity and controllable sampling are the main quality measures of a good image generation model. Most existing models are insufficient in all three aspects. The FusedGAN can perform controllable sampling of diverse images with very high fidelity. We argue that controllability can be achieved by disentangling the generation process into various stages. In contrast to stacked GANs, where multiple stages of GANs are trained separately with full supervision of labeled intermediate images, the FusedGAN has a single stage pipeline with a built-in stacking of GANs. Unlike existing methods, which require full supervision with paired conditions and images, the FusedGAN can effectively leverage more abundant images without corresponding conditions in training, to produce more diverse samples with high fidelity. We achieve this by fusing two generators: one for unconditional image generation, and the other for conditional image generation, where the two partly share a common latent space, thereby disentangling the generation. We demonstrate the efficacy of the FusedGAN in fine-grained image generation tasks such as text-to-image and attribute-to-face generation.

Meta-Learning with Adaptive Layerwise Metric and Subspace

Recent advances in meta-learning demonstrate that deep representations combined with the gradient descent method have sufficient capacity to approximate any learning algorithm. A promising approach is the model-agnostic meta-learning (MAML) which embeds gradient descent into the meta-learner. It optimizes for the initial parameters of the learner to warm-start the gradient descent updates, such that new tasks can be solved using a small number of examples. In this paper we elaborate the gradient-based meta-learning, developing two new schemes. First, we present a feedforward neural network, referred to as T-net, where the linear transformation between two adjacent layers is decomposed as T W such that W is learned by task-specific learners and the transformation T, which is shared across tasks, is meta-learned to speed up the convergence of gradient updates for task-specific learners. Second, we present MT-net where gradient updates in the T-net are guided by a binary mask M that is meta-learned, restricting the updates to be performed in a subspace. Empirical results demonstrate that our method is less sensitive to the choice of initial learning rates than existing meta-learning methods, and achieves the state-of-the-art or comparable performance on few-shot classification and regression tasks.
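
The T W decomposition and the masked update described above can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: it uses a single layer with a squared loss, treats the mask M as a fixed elementwise 0/1 matrix over W, and keeps T frozen during the task-specific step. All function names and numeric values are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def forward(T, W, x):
    """Layer output y = T (W x), with T shared across tasks."""
    return matvec(T, matvec(W, x))


def squared_loss(y, target):
    return 0.5 * sum((a - b) ** 2 for a, b in zip(y, target))


def masked_update(T, W, M, x, target, lr):
    """One task-specific gradient step on W only, restricted by mask M."""
    y = forward(T, W, x)
    err = [a - b for a, b in zip(y, target)]
    # Backpropagate through T: dL/d(Wx) = T^T err
    back = [sum(T[k][i] * err[k] for k in range(len(T))) for i in range(len(W))]
    # dL/dW[i][j] = back[i] * x[j]; entries with M[i][j] == 0 stay frozen
    return [
        [W[i][j] - lr * M[i][j] * back[i] * x[j] for j in range(len(x))]
        for i in range(len(W))
    ]


# Assumed toy values: T would be meta-learned, M selects the update subspace.
T = [[2.0, 0.0], [0.0, 1.0]]
W = [[0.5, 0.0], [0.0, 0.5]]
M = [[1, 1], [0, 0]]  # only the first row of W may change
x, target = [1.0, 2.0], [0.0, 0.0]

loss_before = squared_loss(forward(T, W, x), target)
W_new = masked_update(T, W, M, x, target, lr=0.05)
loss_after = squared_loss(forward(T, W_new, x), target)
```

In this sketch the second row of W is untouched by the step, so the update moves only within the subspace the mask allows, while T reshapes the gradient that reaches W.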

Efficient Test Collection Construction via Active Learning

To create a new IR test collection at minimal cost, we must carefully select which documents merit human relevance judgments. Shared task campaigns such as NIST TREC determine this by pooling search results from many participating systems (and often interactive runs as well), thereby identifying the most likely relevant documents in a given collection. While effective, it would be preferable to be able to build a new test collection without needing to run an entire shared task. Toward this end, we investigate multiple active learning (AL) strategies which, without reliance on system rankings: 1) select which documents human assessors should judge; and 2) automatically classify the relevance of remaining unjudged documents. Because scarcity of relevant documents tends to yield highly imbalanced training data for model estimation, we investigate sampling strategies to mitigate class imbalance. We report experiments on four TREC collections with varying scarcity of relevant documents, reporting labeling accuracy achieved, as well as rank correlation when evaluating participant systems using these labels vs. NIST judgments. Results demonstrate the effectiveness of our approach, coupled with further analysis showing how varying relevance scarcity, within and across collections, impacts findings.

Innateness, AlphaZero, and Artificial Intelligence

The concept of innateness is rarely discussed in the context of artificial intelligence. When it is discussed, or hinted at, it is often in the context of trying to reduce the amount of innate machinery in a given system. In this paper, I consider as a test case a recent series of papers by Silver et al. (Silver et al., 2017a) on AlphaGo and its successors that have been presented as an argument that ‘even in the most challenging of domains: it is possible to train to superhuman level, without human examples or guidance’, ‘starting tabula rasa.’ I argue that these claims are overstated, for multiple reasons. I close by arguing that artificial intelligence needs greater attention to innateness, and I point to some proposals about what that innateness might look like.

Face Recognition via Centralized Coordinate Learning

Owing to the rapid development of deep neural network (DNN) techniques and the emergence of large-scale face databases, face recognition has achieved great success in recent years. During the training process of a DNN, the face features and classification vectors to be learned interact with each other, while the distribution of face features largely affects the convergence status of the network and the face similarity computation in the test stage. In this work, we jointly formulate the learning of face features and classification vectors, and propose a simple yet effective centralized coordinate learning (CCL) method, which enforces the features to be dispersedly spanned in the coordinate space while ensuring that the classification vectors lie on a hypersphere. An adaptive angular margin is further proposed to enhance the discrimination capability of face features. Extensive experiments are conducted on six face benchmarks, including those that have large age gaps and hard negative samples. Trained only on the small-scale CASIA Webface dataset with 460K face images from about 10K subjects, our CCL model demonstrates high effectiveness and generality, showing consistently competitive performance across all six benchmark databases.

A modified fuzzy C means algorithm for shading correction in craniofacial CBCT images

CBCT images suffer from acute shading artifacts primarily due to scatter. Numerous image-domain correction algorithms have been proposed in the literature that use patient-specific planning CT images to estimate shading contributions in CBCT images. However, in the context of radiosurgery applications such as gamma knife, planning images are often acquired through MRI which impedes the use of polynomial fitting approaches for shading correction. We present a new shading correction approach that is independent of planning CT images. Our algorithm is based on the assumption that true CBCT images follow a uniform volumetric intensity distribution per material, and scatter perturbs this uniform texture by contributing cupping and shading artifacts in the image domain. The framework is a combination of fuzzy C-means coupled with a neighborhood regularization term and Otsu’s method. Experimental results on artificially simulated craniofacial CBCT images are provided to demonstrate the effectiveness of our algorithm. Spatial non-uniformity is reduced from 16% to 7% in soft tissue and from 44% to 8% in bone regions. With shading-correction, thresholding based segmentation accuracy for bone pixels is improved from 85% to 91% when compared to thresholding without shading-correction. The proposed algorithm is thus practical and qualifies as a plug and play extension into any CBCT reconstruction software for shading correction.
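
For orientation, the standard fuzzy C-means iteration that the framework builds on can be sketched as below. This is the textbook algorithm applied to 1-D intensities only; the neighborhood regularization term and Otsu's method mentioned in the abstract are deliberately omitted, and the function names, fuzzifier value, and toy intensities are assumptions.

```python
def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership update for 1-D intensities.

    u[k][i] = 1 / sum_j (d_ki / d_kj) ** (2 / (m - 1)),
    where d_ki is the distance from point k to center i.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        distances = [abs(x - c) for c in centers]
        if 0.0 in distances:
            # The point coincides with a center: give it full membership there.
            row = [1.0 if d == 0.0 else 0.0 for d in distances]
            total = sum(row)
            row = [r / total for r in row]
        else:
            row = [
                1.0 / sum((d_i / d_j) ** exponent for d_j in distances)
                for d_i in distances
            ]
        memberships.append(row)
    return memberships


def fcm_centers(points, memberships, n_clusters, m=2.0):
    """Center update: mean of the points weighted by u ** m."""
    centers = []
    for i in range(n_clusters):
        weights = [row[i] ** m for row in memberships]
        centers.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
    return centers


# Hypothetical normalized intensities for two materials.
intensities = [0.0, 0.1, 0.9, 1.0]
centers = [0.2, 0.8]
u = fcm_memberships(intensities, centers)
new_centers = fcm_centers(intensities, u, n_clusters=2)
```

Alternating the two updates until the centers stop moving gives the usual FCM clustering; a shading-correction variant would add further terms to the objective being minimized.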

Deep Learning: An Introduction for Applied Mathematicians

Multilayered artificial neural networks are becoming a pervasive tool in a host of application fields. At the heart of this deep learning revolution are familiar concepts from applied and computational mathematics; notably, in calculus, approximation theory, optimization and linear algebra. This article provides a very brief introduction to the basic ideas that underlie deep learning from an applied mathematics perspective. Our target audience includes postgraduate and final year undergraduate students in mathematics who are keen to learn about the area. The article may also be useful for instructors in mathematics who wish to enliven their classes with references to the application of deep learning techniques. We focus on three fundamental questions: what is a deep neural network? how is a network trained? what is the stochastic gradient method? We illustrate the ideas with a short MATLAB code that sets up and trains a network. We also show the use of state-of-the-art software on a large scale image classification problem. We finish with references to the current literature.

Bayesian Estimation of Gaussian Graphical Models with Projection Predictive Selection

Gaussian graphical models are used for determining conditional relationships between variables. This is accomplished by identifying off-diagonal elements in the inverse-covariance matrix that are non-zero. When the ratio of variables (p) to observations (n) approaches one, the maximum likelihood estimator of the covariance matrix becomes unstable and requires shrinkage estimation. Whereas several classical (frequentist) methods have been introduced to address this issue, Bayesian methods remain relatively uncommon in practice and in the methodological literature. Here we introduce a Bayesian method for estimating sparse matrices, in which conditional relationships are determined with projection predictive selection. Through simulation and an applied example, we demonstrate that the proposed method often outperforms both classical and alternative Bayesian estimators with respect to frequentist risk and consistently makes the fewest false positives. We end by discussing limitations and future directions, as well as contributions to the Bayesian literature on the topic of sparsity.
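
The link between the inverse-covariance matrix and conditional relationships can be made concrete with a small sketch. It assumes the precision matrix is already known, so it does not show the Bayesian estimation or the projection predictive selection step; the function names and the example matrix are hypothetical.

```python
import math


def partial_correlations(precision):
    """Partial correlation matrix from a precision (inverse-covariance) matrix.

    rho[i][j] = -K[i][j] / sqrt(K[i][i] * K[j][j]) for i != j.
    """
    p = len(precision)
    rho = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i == j:
                rho[i][j] = 1.0
            else:
                rho[i][j] = -precision[i][j] / math.sqrt(
                    precision[i][i] * precision[j][j]
                )
    return rho


def conditional_edges(precision, tol=1e-12):
    """Pairs (i, j) whose off-diagonal precision entry is non-zero."""
    p = len(precision)
    return [
        (i, j)
        for i in range(p)
        for j in range(i + 1, p)
        if abs(precision[i][j]) > tol
    ]


# Assumed example: a chain 0 - 1 - 2, so variables 0 and 2 are
# conditionally independent given variable 1.
K = [
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
]
rho = partial_correlations(K)
edges = conditional_edges(K)
```

With an estimated rather than known matrix, entries are rarely exactly zero, which is why a selection step is needed to decide which edges to keep.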

In-network Neural Networks

We present N2Net, a system that implements binary neural networks using commodity switching chips deployed in network switches and routers. Our system shows that these devices can run simple neural network models, whose input is encoded in the network packets’ header, at packet processing speeds (billions of packets per second). Furthermore, our experience highlights that switching chips could support even more complex models, provided that some minor and cheap modifications to the chip’s design are applied. We believe N2Net provides an interesting building block for future end-to-end networked systems.

On I-Optimal Designs for Generalized Linear Models: An Efficient Algorithm via General Equivalence Theory

The generalized linear model plays an important role in statistical analysis and the related design issues are undoubtedly challenging. The state-of-the-art works mostly apply to design criteria on the estimates of regression coefficients. It is of importance to study optimal designs for generalized linear models, especially on the prediction aspects. In this work, we propose a prediction-oriented design criterion, I-optimality, and develop an efficient sequential algorithm for constructing I-optimal designs for generalized linear models. Through establishing the General Equivalence Theorem of the I-optimality for generalized linear models, we obtain an insightful understanding of how the proposed algorithm sequentially chooses the support points and updates the weights of the support points of the design. The proposed algorithm is computationally efficient with a guaranteed convergence property. Numerical examples are conducted to evaluate the feasibility and computational efficiency of the proposed algorithm.

Sparsely Connected Convolutional Networks

Residual learning with skip connections permits training ultra-deep neural networks and obtains superb performance. Building in this direction, DenseNets proposed a dense connection structure where each layer is directly connected to all of its predecessors. The densely connected structure leads to better information flow and feature reuse. However, the overly dense skip connections also bring about the problems of potential risk of overfitting, parameter redundancy and large memory consumption. In this work, we analyze the feature aggregation patterns of ResNets and DenseNets under a uniform aggregation view framework. We show that both structures densely gather features from previous layers in the network but combine them in their respective ways: summation (ResNets) or concatenation (DenseNets). We compare the strengths and drawbacks of these two aggregation methods and analyze their potential effects on the networks’ performance. Based on our analysis, we propose a new structure named SparseNets which achieves better performance with fewer parameters than DenseNets and ResNets.
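
The two aggregation styles contrasted above can be illustrated with a minimal sketch. Feature maps are represented as flat lists of numbers, which is a simplification, and the function names are assumptions; the SparseNets structure itself is not reproduced here.

```python
def aggregate_by_summation(feature_maps):
    """ResNet-style aggregation: elementwise sum, output width stays fixed."""
    return [sum(values) for values in zip(*feature_maps)]


def aggregate_by_concatenation(feature_maps):
    """DenseNet-style aggregation: concatenation, output width grows."""
    return [value for fmap in feature_maps for value in fmap]


# Hypothetical outputs of three earlier layers, each with two channels.
previous = [[1.0, 2.0], [0.5, 0.5], [3.0, -1.0]]
summed = aggregate_by_summation(previous)
concatenated = aggregate_by_concatenation(previous)
```

Summation keeps the input to the next layer at a constant size but mixes the sources together, whereas concatenation preserves each source at the cost of an input that grows with depth, which is where the parameter and memory overhead comes from.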

An Overview of Machine Teaching

In this paper we try to organize machine teaching as a coherent set of ideas. Each idea is presented as varying along a dimension. The collection of dimensions then forms the problem space of machine teaching, such that existing teaching problems can be characterized in this space. We hope this organization allows us to gain a deeper understanding of individual teaching problems, discover connections among them, and identify gaps in the field.

Layered TPOT: Speeding up Tree-based Pipeline Optimization

As the demand for machine learning increases, so does the demand for tools that make it easier to use. Automated machine learning (AutoML) tools have been developed to address this need, such as the Tree-Based Pipeline Optimization Tool (TPOT), which uses genetic programming to build optimal pipelines. We introduce Layered TPOT, a modification to TPOT which aims to create pipelines as good as the original's, but in significantly less time. This approach evaluates candidate pipelines on increasingly large subsets of the data according to their fitness, using a modified evolutionary algorithm to allow for separate competition between pipelines trained on different sample sizes. Empirical evaluation shows that, on sufficiently large datasets, Layered TPOT indeed finds better models faster.

Latitude: A Model for Mixed Linear-Tropical Matrix Factorization

Nonnegative matrix factorization (NMF) is one of the most frequently used matrix factorization models in data analysis. A significant reason for the popularity of NMF is its interpretability and the ‘parts of whole’ interpretation of its components. Recently, max-times, or subtropical, matrix factorization (SMF) has been introduced as an alternative model with an equally interpretable ‘winner takes it all’ interpretation. In this paper we propose a new mixed linear–tropical model, and a new algorithm, called Latitude, that combines NMF and SMF, being able to smoothly alternate between the two. In our model, the data is modeled using the latent factors and latent parameters that control whether the factors are interpreted as NMF or SMF features, or their mixtures. We present an algorithm for our novel matrix factorization. Our experiments show that our algorithm improves over both baselines, and can yield interpretable results that reveal more of the latent structure than either NMF or SMF alone.

Fine-tuned Language Models for Text Classification

Transfer learning has revolutionized computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch. We propose Fine-tuned Language Models (FitLaM), an effective transfer learning method that can be applied to any task in NLP, and introduce techniques that are key for fine-tuning a state-of-the-art language model. Our method significantly outperforms the state-of-the-art on five text classification tasks, reducing the error by 18-24% on the majority of datasets. We open-source our pretrained models and code to enable adoption by the community.

When Does Stochastic Gradient Algorithm Work Well?

In this paper, we consider a general stochastic optimization problem which is often at the core of supervised learning, such as deep learning and linear classification. We consider a standard stochastic gradient descent (SGD) method with a fixed, large step size and propose a novel assumption on the objective function, under which this method has improved convergence rates (to a neighborhood of the optimal solutions). We then empirically demonstrate that these assumptions hold for logistic regression and standard deep neural networks on classical data sets. Thus our analysis helps to explain when efficient behavior can be expected from the SGD method in training classification models and deep neural networks.

Contextual and Position-Aware Factorization Machines for Sentiment Classification

While existing machine learning models have achieved great success for sentiment classification, they typically do not explicitly capture sentiment-oriented word interaction, which can lead to poor results for fine-grained analysis at the snippet level (a phrase or sentence). Factorization Machines provide a possible approach to learning element-wise interaction for recommender systems, but they are not directly applicable to our task due to their inability to model contexts and word sequences. In this work, we develop two Position-aware Factorization Machines which consider word interaction, context and position information. Such information is jointly encoded in a set of sentiment-oriented word interaction (SWI) vectors. Compared to traditional word embeddings, SWI vectors explicitly capture sentiment-oriented word interaction and simplify the parameter learning. Experimental results show that while they have comparable performance with state-of-the-art methods for document-level classification, they benefit the snippet/sentence-level sentiment analysis.

Do the surface Fermi arcs in Weyl semimetals survive disorder?
ConvSRC: SmartPhone based Periocular Recognition using Deep Convolutional Neural Network and Sparsity Augmented Collaborative Representation
Deep Network for Simultaneous Decomposition and Classification in UWB-SAR Imagery
On a bimodal Birnbaum-Saunders distribution with applications to lifetime data
Minimum saturated families of sets
Belief Control Strategies for Interactions over Weak Graphs
Knudsen gas in flat tire
Pilot Contamination Mitigation with Reduced RF Chains
Longest Processing Time rule for identical parallel machines revisited
A group commutator involving the last distance matrix and dual distance matrix of a $Q$-polynomial distance-regular graph
Algorithmic aspects of graph-indexed random walks
Graph-indexed random walks on special classes of graphs
Cellular-Connected UAVs over 5G: Deep Reinforcement Learning for Interference Management
An operator that relates to semi-meander polynomials via a two-sided q-Wick formula
Automatic Classification of Music Genre using Masked Conditional Neural Networks
Event-triggered Control of Infinite-dimensional Systems
Coded Computing for Distributed Graph Analytics
Counting Borel Orbits in Symmetric Varieties of Types $BI$ and $CII$
Identification of Seed Cells in Multispectral Images for GrowCut Segmentation
Cahn–Hilliard inpainting with the double obstacle potential
Stable Phaseless Sampling and Reconstruction of Real-Valued Signals with Finite Rate of Innovations
Over-the-Air Implementation of Uplink NOMA
RED-Net: A Recurrent Encoder-Decoder Network for Video-based Face Alignment
Wiener-Hopf factorization for time-inhomogeneous Markov chains and its application
Fruit Quantity and Quality Estimation using a Robotic Vision System
Robust Modifications of U-statistics and Applications to Covariance Estimation Problems
An Empirical Analysis of Proximal Policy Optimization with Kronecker-factored Natural Gradients
Exploiting Diversity in Molecular Timing Channels via Order Statistics
Image Captioning using Deep Neural Architectures
Brenier approach for optimal transportation between a quasi-discrete measure and a discrete measure
Structure of eigenvectors of random regular digraphs
Circular law for sparse random regular digraphs
The rank of random regular digraphs of constant degree
Light-weight pixel context encoders for image inpainting
On the Proximal Gradient Algorithm with Alternated Inertia
Combinatorics of patience sorting monoids
Relation between combinatorial Ricci curvature and Lin-Lu-Yau’s Ricci Curvature on cell complexes
Additive Margin Softmax for Face Verification
Multi-View Stereo 3D Edge Reconstruction
Catalan numbers, Hankel determinants and Fibonacci polynomials
Unseen Class Discovery in Open-world Classification
A Taxonomy for Management and Optimization of Multiple Resources in Edge Computing
Automatic Detection of Cyberbullying in Social Media Text
Revealing In-Block Nestedness: detection and benchmarking
Stability for the mailing problem
On the Reduction of Biases in Big Data Sets for the Detection of Irregular Power Usage
Game-Theoretical Strategy of Robot in the Area with Dynamical Obstacles
On Global Existence and Blow-up for Damped Stochastic Nonlinear Schrödinger Equation
The Case for Automatic Database Administration using Deep Reinforcement Learning
A formal framework for deliberated judgment
Direct combined approach for expansion of multiple Stratonovich stochastic integrals of multiplicities 2 – 4, based on generalized multiple Fourier series
Rate-Distortion Performance of Sequential Massive Random Access to Gaussian Sources with Memory
Perfect synchronization in networks of phase-frustrated oscillators
A Randomized Exchange Algorithm for Computing Optimal Approximate Designs of Experiments
The scaling limit of the membrane model
On the origin of self-oscillations in large systems
Mixed Delay Constraints in Wyner’s Soft-Handoff Network
Eigenvector localization in the heavy-tailed random conductance model
Sampled-data implementation of derivative-dependent control using artificial delays
Imprecise Markov Models for Scalable and Robust Performance Evaluation of Flexi-Grid Spectrum Allocation Policies
Exact quantum query complexity of weight decision problems
The QLBS Q-Learner Goes NuQLear: Fitted Q Iteration, Inverse RL, and Option Portfolios
A compressed classical description of quantum states
A Markov Process Approach to the asymptotic Theory of abstract Cauchy Problems driven by Poisson Processes
Deep Convolutional Neural Networks for Eigenvalue Problems in Mechanics
Eliminating the effect of rating bias on reputation systems
Characterization of Time Series Via Rényi Complexity-Entropy Curves
TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation
On Gaussian random matrices coupled to the discrete Laplacian
Experience-driven Networking: A Deep Reinforcement Learning based Approach
Ranking Data with Continuous Labels through Oriented Recursive Partitions
Metastable transitions in inertial Langevin systems: what can be different from the overdamped case?
On the number of maximal paths in directed last-passage percolation
Machine learning action parameters in lattice quantum chromodynamics
Faster gaze prediction with dense networks and Fisher pruning
Pathwise Convergence of the Hard Spheres Kac Process
Interactive in-base street model edit: how common GIS software and a database can serve as a custom Graphical User Interface
Efficient Computation of the 8-point DCT via Summation by Parts
Performance Analysis of Joint Pairing and Mode Selection in D2D Communications with FD Radios
Random Construction of Partial MDS Codes
On the Limited Communication Analysis and Design for Decentralized Estimation
A Kotel’nikov Representation for Wavelets
Joint Service Caching and Task Offloading for Mobile Edge Computing in Dense Networks
Quantized Compressive Sensing with RIP Matrices: The Benefit of Dithering
Spontaneous emergence of self-replication in chemical reaction systems
Sparse Activity Detection for Massive Connectivity
Synchronization of electrically coupled resonate-and-fire neurons
A Pipeline for Post-Crisis Twitter Data Acquisition
Nonuniform Reductions and NP-Completeness
Nondeterministic Sublinear Time Has Measure 0 in P
Uniform Ergodicity for Brownian Motion in a Bounded Convex Set
Batch Auction Design For Cloud Container Services
On the cycle index and the weight enumerator
Unsupervised Hashtag Retrieval and Visualization for Crisis Informatics
How can social planners prevent disappointment in an election?
On the influence of Dice loss function in multi-class organ segmentation of abdominal CT using 3D fully convolutional networks
Variance Components Genetic Association Test for Zero-inflated Count Outcomes
Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network
Rate-Optimal Streaming Codes for Channels with Burst and Isolated Erasures
Operator Norm Moment and Exponential Inequalities for Matrix U-statistics
Graph Based Analysis for Gene Segment Organization In a Scrambled Genome
The Utility Cost of Robust Privacy Guarantees
Faster Algorithms for Large-scale Machine Learning using Simple Sampling Techniques
Condensation of non-reversible zero-range processes
Computation of the Maximum Likelihood estimator in low-rank Factor Analysis
Gradient Estimates and Ergodicity for SDEs Driven by Multiplicative Lévy Noises via Coupling
PTB-TIR: A Thermal Infrared Pedestrian Tracking Benchmark
Uplink Coverage Performance of an Underlay Drone Cell for Temporary Events
Quadratically Constrained Myopic Adversarial Channels
On a Generic Security Game Model
Formula for Calculating the Wiener Polarity Index
Complexity of Combinations of Qualitative Constraint Satisfaction Problems
3D CNN-based classification using sMRI and MD-DTI images for Alzheimer disease studies
A combinatorial approach to Rauzy-type dynamics II: the labelling method and a second proof of the KZB classification theorem
Doubling Algorithms for Stationary Distributions of Fluid Queues: A Probabilistic Interpretation
Prediction of the Optimal Threshold Value in DF Relay Selection Schemes Based on Artificial Neural Networks
Code Constructions for Distributed Storage With Low Repair Bandwidth and Low Repair Complexity
On-Chip CNN Accelerator for Image Super-Resolution
A single server queue with batch arrivals and semi-Markov services
Short walk adventures
Scattered classes of graphs
Convergence of the solutions of the discounted Hamilton-Jacobi equation: a counterexample
Thermodynamics of network model fitting with spectral entropies
Image Enhancement and Noise Reduction Using Modified Delay-Multiply-and-Sum Beamformer: Application to Medical Photoacoustic Imaging
Reconstruction Codes for DNA Sequences with Uniform Tandem-Duplication Errors
Natural Language Multitasking: Analyzing and Improving Syntactic Saliency of Hidden Representations
Synchronization transition in Sakaguchi-Kuramoto model on complex networks with partial degree-frequency correlation
Boolean degree 1 functions on some classical association schemes
Output Feedback Control Based on State and Disturbance Estimation
Graphical models for mediation analysis
Noncrossing partitions, Bruhat order and the cluster complex
Invariants of multidimensional time series based on their iterated-integral signature
Overcoming the vanishing gradient problem in plain recurrent networks
An Iterative Closest Point Method for Unsupervised Word Translation
A methodology for calculating the latency of GPS-probe data
TASEP fluctuations with soft-shock initial data
Upgrading from Gaussian Processes to Student’s-T Processes
Characterization of probability distribution convergence in Wasserstein distance by $L^{p}$-quantization error function
The number of $4$-cycles and the cyclomatic number of a finite simple graph
Private Information Retrieval Through Wiretap Channel II: Privacy Meets Security
Integrating planning for task-completion dialogue policy learning
Translationally invariant non-Fermi liquid metals with critical Fermi-surfaces: Solvable models