AI Reasoning Systems: PAC and Applied Methods

Learning and logic are distinct and remarkable approaches to prediction. Machine learning has experienced a surge in popularity because it is robust to noise and achieves high performance; however, ML experiences many issues with knowledge transfer and extrapolation. In contrast, logic is easily intepreted, and logical rules are easy to chain and transfer between systems; however, inductive logic is brittle to noise. We then explore the premise of combining learning with inductive logic into AI Reasoning Systems. Specifically, we summarize findings from PAC learning (conceptual graphs, robust logics, knowledge infusion) and deep learning (DSRL, \partialILP, DeepLogic) by reproducing proofs of tractability, presenting algorithms in pseudocode, highlighting results, and synthesizing between fields. We conclude with suggestions for integrated models by combining the modules listed above and with a list of unsolved (likely intractable) problems.


Tails and probabilities for extreme outliers

The task of estimation of the tails of probability distributions having small samples seems to be still opened and almost unsolvable. The paper tries to make a step in filling this gap. In 2017 Jordanova et al. introduce six new characteristics of the heaviness of the tails of theoretical distributions. They rely on the probability to observe {\color{blue}mild or} extreme outliers. The main their advantage is that they always exist. This work presents some new properties of these characteristics. Using them six distribution sensitive estimators of the extremal index are defined. A brief simulation study compares their quality with the quality of Hill, t-Hill, Pickands and Deckers-Einmahl-de Haan estimators.


How transferable are the datasets collected by active learners

Active learning is a widely-used training strategy for maximizing predictive performance subject to a fixed annotation budget. Between rounds of training, an active learner iteratively selects examples for annotation, typically based on some measure of the model’s uncertainty, coupling the acquired dataset with the underlying model. However, owing to the high cost of annotation and the rapid pace of model development, labeled datasets may remain valuable long after a particular model is surpassed by new technology. In this paper, we investigate the transferability of datasets collected with an acquisition model A to a distinct successor model S. We seek to characterize whether the benefits of active learning persist when A and S are different models. To this end, we consider two standard NLP tasks and associated datasets: text classification and sequence tagging. We find that training S on a dataset actively acquired with a (different) model A typically yields worse performance than when S is trained with ‘native’ data (i.e., acquired actively using S), and often performs worse than training on i.i.d. sampled data. These findings have implications for the use of active learning in practice,suggesting that it is better suited to cases where models are updated no more frequently than labeled data.


DP-GP-LVM: A Bayesian Non-Parametric Model for Learning Multivariate Dependency Structures

We present a non-parametric Bayesian latent variable model capable of learning dependency structures across dimensions in a multivariate setting. Our approach is based on flexible Gaussian process priors for the generative mappings and interchangeable Dirichlet process priors to learn the structure. The introduction of the Dirichlet process as a specific structural prior allows our model to circumvent issues associated with previous Gaussian process latent variable models. Inference is performed by deriving an efficient variational bound on the marginal log-likelihood on the model.


Modeling, Analysis, and Hard Real-time Scheduling of Adaptive Streaming Applications

In real-time systems, the application’s behavior has to be predictable at compile-time to guarantee timing constraints. However, modern streaming applications which exhibit adaptive behavior due to mode switching at run-time, may degrade system predictability due to unknown behavior of the application during mode transitions. Therefore, proper temporal analysis during mode transitions is imperative to preserve system predictability. To this end, in this paper, we initially introduce Mode Aware Data Flow (MADF) which is our new predictable Model of Computation (MoC) to efficiently capture the behavior of adaptive streaming applications. Then, as an important part of the operational semantics of MADF, we propose the Maximum-Overlap Offset (MOO) which is our novel protocol for mode transitions. The main advantage of this transition protocol is that, in contrast to self-timed transition protocols, it avoids timing interference between modes upon mode transitions. As a result, any mode transition can be analyzed independently from the mode transitions that occurred in the past. Based on this transition protocol, we propose a hard real-time analysis as well to guarantee timing constraints by avoiding processor overloading during mode transitions. Therefore, using this protocol, we can derive a lower bound and an upper bound on the earliest starting time of the tasks in the new mode during mode transitions in such a way that hard real-time constraints are respected.


Parameterized Distributed Algorithms

In this work, we initiate a thorough study of parameterized graph optimization problems in the distributed setting. In a parameterized problem, an algorithm decides whether a solution of size bounded by a \emph{parameter} k exists and if so, it finds one. We study fundamental problems, including Minimum Vertex Cover (MVC), Maximum Independent Set (MaxIS), Maximum Matching (MaxM), and many others, in both the LOCAL and CONGEST distributed computation models. We present lower bounds for the round complexity of solving parameterized problems in both models, together with optimal and near-optimal upper bounds. Our results extend beyond the scope of parameterized problems. We show that any LOCAL (1+\epsilon)-approximation algorithm for the above problems must take \Omega(\epsilon^{-1}) rounds. Joined with the algorithm of [GKM17] and the \Omega(\sqrt{\frac{\log n}{\log\log n}}) lower bound of [KMW16], this settles the complexity of (1+\epsilon)-approximating MVC, MaxM and MaxIS at (\epsilon^{-1}\log n)^{\Theta(1)}. We also show that our parameterized approach reduces the runtime of exact and approximate CONGEST algorithms for MVC and MaxM if the optimal solution is small, without knowing its size beforehand. Finally, we propose the first deterministic o(n^2) rounds CONGEST algorithms that approximate MVC and MaxM within a factor strictly smaller than 2.


Leveraging Catalog Knowledge Graphs for Query Attribute Identification in E-Commerce Sites

Millions of people use online e-commerce platforms to search and buy products. Identifying attributes in a query is a critical component in connecting users to relevant items. However, in many cases, the queries have multiple attributes, and some of them will be in conflict with each other. For example, the query ‘maroon 5 dvds’ has two candidate attributes, the color ‘maroon’ or the band ‘maroon 5’, where only one of the attributes can be present. In this paper, we address the problem of resolving conflicting attributes in e-commerce queries. A challenge in this problem is that knowledge bases like Wikipedia that are used to understand web queries are not focused on the e-commerce domain. E-commerce search engines, however, have access to the catalog which contains detailed information about the items and its attributes. We propose a framework that constructs knowledge graphs from catalog to resolve conflicting attributes in e-commerce queries. Our experiments on real-world queries on e-commerce platforms demonstrate that resolving conflicting attributes by leveraging catalog information significantly improves attribute identification, and also gives out more relevant search results.


Deep Learning in the Wild

Deep learning with neural networks is applied by an increasing number of people outside of classic research environments, due to the vast success of the methodology on a wide range of machine perception tasks. While this interest is fueled by beautiful success stories, practical work in deep learning on novel tasks without existing baselines remains challenging. This paper explores the specific challenges arising in the realm of real world tasks, based on case studies from research \& development in conjunction with industry, and extracts lessons learned from them. It thus fills a gap between the publication of latest algorithmic and methodical developments, and the usually omitted nitty-gritty of how to make them work. Specifically, we give insight into deep learning projects on face matching, print media monitoring, industrial quality control, music scanning, strategy game playing, and automated machine learning, thereby providing best practices for deep learning in practice.


Computing recommendations via a Knowledge Graph-aware Autoencoder

In the last years, deep learning has shown to be a game-changing technology in artificial intelligence thanks to the numerous successes it reached in diverse application fields. Among others, the use of deep learning for the recommendation problem, although new, looks quite promising due to its positive performances in terms of accuracy of recommendation results. In a recommendation setting, in order to predict user ratings on unknown items a possible configuration of a deep neural network is that of autoencoders tipically used to produce a lower dimensionality representation of the original data. In this paper we present KG-AUTOENCODER, an autoencoder that bases the structure of its neural network on the semanticsaware topology of a knowledge graph thus providing a label for neurons in the hidden layer that are eventually used to build a user profile and then compute recommendations. We show the effectiveness of KG-AUTOENCODER in terms of accuracy, diversity and novelty by comparing with state of the art recommendation algorithms.


Exploring Hierarchy-Aware Inverse Reinforcement Learning

We introduce a new generative model for human planning under the Bayesian Inverse Reinforcement Learning (BIRL) framework which takes into account the fact that humans often plan using hierarchical strategies. We describe the Bayesian Inverse Hierarchical RL (BIHRL) algorithm for inferring the values of hierarchical planners, and use an illustrative toy model to show that BIHRL retains accuracy where standard BIRL fails. Furthermore, BIHRL is able to accurately predict the goals of `Wikispeedia’ game players, with inclusion of hierarchical structure in the model resulting in a large boost in accuracy. We show that BIHRL is able to significantly outperform BIRL even when we only have a weak prior on the hierarchical structure of the plans available to the agent, and discuss the significant challenges that remain for scaling up this framework to more realistic settings.


CascadeCNN: Pushing the Performance Limits of Quantisation in Convolutional Neural Networks

This work presents CascadeCNN, an automated toolflow that pushes the quantisation limits of any given CNN model, aiming to perform high-throughput inference. A two-stage architecture tailored for any given CNN-FPGA pair is generated, consisting of a low- and high-precision unit in a cascade. A confidence evaluation unit is employed to identify misclassified cases from the excessively low-precision unit and forward them to the high-precision unit for re-processing. Experiments demonstrate that the proposed toolflow can achieve a performance boost up to 55% for VGG-16 and 48% for AlexNet over the baseline design for the same resource budget and accuracy, without the need of retraining the model or accessing the training data.


Optimal Algorithms for Right-Sizing Data Centers – Extended Version

Electricity cost is a dominant and rapidly growing expense in data centers. Unfortunately, much of the consumed energy is wasted because servers are idle for extended periods of time. We study a capacity management problem that dynamically right-sizes a data center, matching the number of active servers with the varying demand for computing capacity. We resort to a data-center optimization problem introduced by Lin, Wierman, Andrew and Thereska that, over a time horizon, minimizes a combined objective function consisting of operating cost, modeled by a sequence of convex functions, and server switching cost. All prior work addresses a continuous setting in which the number of active servers, at any time, may take a fractional value. In this paper, we investigate for the first time the discrete data-center optimization problem where the number of active servers, at any time, must be integer valued. Thereby we seek truly feasible solutions. First, we show that the offline problem can be solved in polynomial time. Our algorithm relies on a new, yet intuitive graph theoretic model of the optimization problem and performs binary search in a layered graph. Second, we study the online problem and extend the algorithm Lazy Capacity Provisioning (LCP) by Lin et al. to the discrete setting. We prove that LCP is 3-competitive. Moreover, we show that no deterministic online algorithm can achieve a competitive ratio smaller than 3. We develop a randomized online algorithm that is 2-competitive against an oblivious adversary and prove that 2 is a lower bound for the competitive ratio of randomized online algorithms. Finally, we address the continuous setting and give a lower bound of 2 on the best competitiveness of online algorithms. All lower bounds mentioned above also holds in a problem variant with more restricted operating cost functions, introduced by Lin et al.


Tune: A Research Platform for Distributed Model Selection and Training

Modern machine learning algorithms are increasingly computationally demanding, requiring specialized hardware and distributed computation to achieve high performance in a reasonable time frame. Many hyperparameter search algorithms have been proposed for improving the efficiency of model selection, however their adaptation to the distributed compute environment is often ad-hoc. We propose Tune, a unified framework for model selection and training that provides a narrow-waist interface between training scripts and search algorithms. We show that this interface meets the requirements for a broad range of hyperparameter search algorithms, allows straightforward scaling of search to large clusters, and simplifies algorithm implementation. We demonstrate the implementation of several state-of-the-art hyperparameter search algorithms in Tune. Tune is available at http://…/tune.html.


Low-Resource Text Classification using Domain-Adversarial Learning

Deep learning techniques have recently shown to be successful in many natural language processing tasks forming state-of-the-art systems. They require, however, a large amount of annotated data which is often missing. This paper explores the use of domain-adversarial learning as a regularizer to avoid overfitting when training domain invariant features for deep, complex neural network in low-resource and zero-resource settings in new target domains or languages. In the case of new languages, we show that monolingual word-vectors can be directly used for training without pre-alignment. Their projection into a common space can be learnt ad-hoc at training time reaching the final performance of pretrained multilingual word-vectors.


Surrogate-Based Inverse Modeling of the Hydrological System: An Adaptive Approach Considering Surrogate Structural Error
Regional Control of Probabilistic Cellular Automata
Counting the Identities of a Quantum State
Video-based Person Re-identification via 3D Convolutional Networks and Non-local Attention
Quantum Pontryagin Principle under Continuous Measurements and Feedback
Maximizing Invariant Data Perturbation with Stochastic Optimization
Dynamic Density Estimation in Heterogeneous Cell Populations
Metalearning with Hebbian Fast Weights
Bayesian Estimation Under Informative Sampling with Unattenuated Dependence
Differentially Private LQ Control
A critical strange metal from fluctuating gauge fields in a solvable random model
Improving on Q & A Recurrent Neural Networks Using Noun-Tagging
Transfer Operator Theoretic Framework for Monitoring Building Indoor Environment in Uncertain Operating Conditions
Recurrent Neural Networks in Linguistic Theory: Revisiting Pinker and Prince (1988) and the Past Tense Debate
Harborth Constants for Certain Classes of Metacyclic Groups
Efficient population transfer via non-ergodic extended states in quantum spin glass
Mean Field Game with Delay: a Toy Model
Hydranet: Data Augmentation for Regression Neural Networks
Feature Selection for Gender Classification in TUIK Life Satisfaction Survey
A Faster Algorithm for Minimum-Cost Bipartite Matching in Minor-Free Graphs
Approximating the number of maximal near perfect matchings in dense graphs
Algorithms for #BIS-hard problems on expander graphs
Learning-based Regularization for Cardiac Strain Analysis with Ability for Domain Adaptation
Three local actions in $6$-valent arc-transitive graphs
CT-GAN: Conditional Transformation Generative Adversarial Network for Image Attribute Modification
Enhanced C-V2X Mode-4 Subchannel Selection
CTAP: Complementary Temporal Action Proposal Generation
System Level Simulation of Scheduling Schemes for C-V2X Mode-3
Decentralized Multi-UAV Routing in the Presence of Disturbances
TDOA-based Localization via Stochastic Gradient Descent Variants
Fast Modular Subset Sum using Linear Sketching
The geometry and combinatorics of discrete line segment hypergraphs
Network-Assisted Resource Allocation with Quality and Conflict Constraints for V2V Communications
Impact of Quantized Side Information on Subchannel Scheduling for Cellular V2X
Hierarchical Growth is Necessary and (Sometimes) Sufficient to Self-Assemble Discrete Self-Similar Fractals
Optimal Strategies for Matching and Retrieval Problems by Comparing Covariates
Disjoint Mapping Network for Cross-modal Matching of Voices and Faces
A game-theoretic approach to timeline-based planning with uncertainty
An Approximate Message Passing Framework for Side Information
Time-dependent Pólya urn
Disordered fermionic quantum critical points
Modeling and Analysis of D2D Millimeter-Wave Networks with Poisson Cluster Processes
A New Millimeter Wave MIMO System for 5G Networks
Cache-enabled HetNets With Millimeter Wave Small Cells
Novel Method for Multi-Dimensional Mapping of Higher Order Modulations for BICM-ID Over Rayleigh Fading Channels
A feature agnostic approach for glaucoma detection in OCT volumes
CADDY Underwater Stereo-Vision Dataset for Human-Robot Interaction (HRI) in the Context of Diver Activities
Super Poincar’e inequality for a dynamic model for the two-parameter Dirichlet process
Large deviations and continuity estimates for the derivative of a random model of $\log |ζ|$ on the critical line
Hybrid Temporal Situation Calculus
Predictability of the imitative learning trajectories
Avoiding Latent Variable Collapse With Generative Skip Models
Adaptive modeling of urban dynamics during ephemeral event via mobile phone traces
Human Mobility Patterns Modelling using CDRs
Near-Epoch Dependence in Riesz Spaces
Extracting Contact and Motion from Manipulation Videos
Bregman Monotone Operator Splitting
The sepr-sets of sign patterns
Analysis of Consensus Networks Driven by Symmetric-Alpha-Stable Motion (Extended Version)
Effective Occlusion Handling for Fast Correlation Filter-based Trackers
Algorithms for metric learning via contrastive embeddings
Uniqueness of Viscosity Solutions of Stochastic Hamilton-Jacobi Equations
Probabilistic Re-aggregation Algorithm [First Draft]
Utilizing Smartphone-Based Machine Learning in Medical Monitor Data Collection: Seven Segment Digit Recognition
Detection and Mitigation of Classes of Attacks in Supervisory Control Systems
Optical Flow Based Real-time Moving Object Detection in Unconstrained Scenes
Computer Analysis of Architecture Using Automatic Image Understanding
Automatic segmentation of skin lesions using deep learning
TS2C: Tight Box Mining with Surrounding Segmentation Context for Weakly Supervised Object Detection
Analysis Dictionary Learning based Classification: Structure for Robustness
Estimation of the Distribution of Random Parameters in Discrete Time Abstract Parabolic Systems with Unbounded Input and Output: Approximation and Convergence
Ultra-Fine Entity Typing
Pairwise Independent Random Walks can be Slightly Unbounded
Want to bring a community together Create more sub-communities
Perceptrons from Memristors
Self-dual cyclic codes over $M_2(\mathbb{Z}_4)$
Performance of Image Registration Tools on High-Resolution 3D Brain Images
TequilaGAN: How to easily identify GAN samples
On the Complexity of Iterative Tropical Computation with Applications to Markov Decision Processes
Constraining Strong c-Wilf Equivalence Using Cluster Poset Asymptotics
The complexity of approximating the matching polynomial in the complex plane
Convexity Analysis of Optimization Framework of Attitude Determination from Vector Observations
Sequential sampling of Gaussian latent variable models
Non-Gaussian Component Analysis using Entropy Methods
The latest gossip on BFT consensus
Linear Pseudo-Polynomial Factor Algorithm for Automaton Constrained Tree Knapsack Problem
Strong Interference Alignment
A Sauer-Shelah-Perles Lemma for Lattices
Single Bitmap Block Truncation Coding of Color Images Using Hill Climbing Algorithm
No-regret algorithms for online $k$-submodular maximization
A tight Erdös-Pósa function for planar minors
Spectral Sparsification of Hypergraphs
Recognition in Terra Incognita
Hybrid CTC-Attention based End-to-End Speech Recognition using Subword Units
Zoom-Net: Mining Deep Feature Interactions for Visual Relationship Recognition
Generalized Simultaneous Component Analysis of Binary and Quantitative data
Tools for Analyzing Parallel I/O
Sharp phase transition for the continuum Widom-Rowlinson model
A Multi-sentiment-resource Enhanced Attention Network for Sentiment Classification
MAX for $k$-independence in multigraphs
Time-dependent matrix product ansatz for interacting reversible dynamics
Observability inequalities for transport equations through Carleman estimates
On the extremal number of subdivisions
Maintaning maximal matching with lookahead
Multi-task dialog act and sentiment recognition on Mastodon
Optimal Short-Circuit Resilient Formulas
Critical Regime in a Curie-Weiss Model with two Groups and Heterogeneous Coupling
Diophantine approximations on random fractals
Are generative deep models for novelty detection truly better
DNN’s Sharpest Directions Along the SGD Trajectory
Conditional Masking to Numerical Data
On-line size Ramsey number for monotone k-uniform ordered paths with uniform looseness
Representation and Verification of Offline Signatures with Dictionary Learning and Parsimonious Coding
Clumsy packings of graphs
Random Walks on Simplicial Complexes and the normalized Hodge Laplacian
Improved Methods for Making Inferences About Multiple Skipped Correlations
Many-body (de)localization in large quantum chains
A stability theorem on cube tessellations
Cyber-Physical System for Energy-Efficient Stadium Operation: Methodology and Experimental Validation
On Universal Sensor Registration
Random Matrix Ensemble for the Level Statistics of Many-Body Localization
Learning Graph Representations by Dendrograms
Deconvolving the Input to Random Abstract Parabolic Systems; A Population Model-Based Approach to Estimating Blood/Breath Alcohol Concentration from Transdermal Alcohol Biosensor Data
Friend-based Ranking
Global optimization test problems based on random field composition
The greedy strategy in optimizing the Perron eigenvalue
Unique Informations and Deficiencies
Extremal problems on ordered and convex geometric hypergraphs
Performance of Angle of Arrival Detection Using MUSIC Algorithm in Inter-Satellite Link
Newton-Krylov PDE-constrained LDDMM in the space of band-limited vector fields
Learning-based Natural Geometric Matching with Homography Prior
Eventually dendric subshifts
An $O(\log^{1.5}n \log\log n)$ Approximation Algorithm for Mean Isoperimetry and Robust $k$-means
At the Mercy of the Common Noise: Blow-ups in a Conditional McKean–Vlasov Problem
Hierarchical Losses and New Resources for Fine-grained Entity Typing and Linking
A neuromorphic systems approach to in-memory computing with non-ideal memristive devices: From mitigation to exploitation
The asymptotic spectrum of LOCC transformations
Augmented Generator Sub-transient Model Using Dynamic Phasor Measurements
Optimal Lower Bounds for Distributed and Streaming Spanning Forest Computation
A Tight Lower Bound for Clock Synchronization in Odd-Ary M-Toroids
Heterogeneous inputs to central pattern generators can shape insect gaits
A data-driven robust optimization approach to stochastic model predictive control
Information-Theoretic Limits of Strategic Communication
New/s/leak 2.0 – Multilingual Information Extraction and Visualization for Investigative Journalism
A combinatorial interpretation for Tsallis 2-entropy
Multi-Scale Convolutional-Stack Aggregation for Robust White Matter Hyperintensities Segmentation
Deep Enhanced Representation for Implicit Discourse Relation Recognition
Random matrices with exchangeable entries
Large-Scale Visual Speech Recognition
On the Number of Circuits in Regular Matroids (with Connections to Lattices and Codes)
Exchangeable coalescents, ultrametric spaces, nested interval-partitions: A unifying approach
Postselecting probabilistic finite state recognizers and verifiers
A collisionless singular Cucker-Smale model with decentralized formation control
Cluster categories from Grassmannians and root combinatorics
Model Reconstruction from Model Explanations
An Algorithmic Blend of LPs and Ring Equations for Promise CSPs
Artificial Intelligence for Long-Term Robot Autonomy: A Survey
Soap films with gravity and almost-minimal surfaces
Anticoncentration for subgraph statistics
Image Classification for Arabic: Assessing the Accuracy of Direct English to Arabic Translations
Parametric generation of conditional geological realizations using generative neural networks
Quantum Speedups for Exponential-Time Dynamic Programming Algorithms