DeFind: A Protege Plugin for Computing Concept Definitions in EL Ontologies

We introduce an extension to the Protege ontology editor that discovers concept definitions which are not explicitly present in axioms but are logically implied by an ontology. The plugin supports ontologies formulated in the Description Logic EL, which underpins the OWL 2 EL profile of the Web Ontology Language and, despite its limited expressiveness, captures most of the biomedical ontologies published on the Web. The tool lets a user verify whether a concept can be defined using a vocabulary of interest that the user specifies. In particular, it decides whether some vocabulary items can be omitted from the formulation of a complex concept. The corresponding definitions are presented to the user together with explanations generated by an ontology reasoner.

Parametrized Deep Q-Networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space

Most existing deep reinforcement learning (DRL) frameworks consider either discrete action space or continuous action space solely. Motivated by applications in computer games, we consider the scenario with discrete-continuous hybrid action space. To handle hybrid action space, previous works either approximate the hybrid space by discretization, or relax it into a continuous set. In this paper, we propose a parametrized deep Q-network (P-DQN) framework for the hybrid action space without approximation or relaxation. Our algorithm combines the spirits of both DQN (dealing with discrete action space) and DDPG (dealing with continuous action space) by seamlessly integrating them. Empirical results on a simulation example, scoring a goal in simulated RoboCup soccer and the solo mode in game King of Glory (KOG) validate the efficiency and effectiveness of our method.

Temporal Convolutional Memory Networks for Remaining Useful Life Estimation of Industrial Machinery

Accurately estimating the remaining useful life (RUL) of industrial machinery is beneficial in many real-world applications. Estimation techniques have mainly utilized linear models or neural network based approaches with a focus on short term time dependencies. This paper introduces a system model that incorporates temporal convolutions with both long term and short term time dependencies. The proposed network learns salient features and complex temporal variations in sensor values, and predicts the RUL. A data augmentation method is used for increased accuracy. The proposed method is compared with several state-of-the-art algorithms on publicly available datasets. It demonstrates promising results, with superior results for datasets obtained from complex environments.

Building Dynamic Knowledge Graphs from Text using Machine Reading Comprehension

We propose a neural machine-reading model that constructs dynamic knowledge graphs from procedural text. It builds these graphs recurrently for each step of the described procedure, and uses them to track the evolving states of participant entities. We harness and extend a recently proposed machine reading comprehension (MRC) model to query for entity states, since these states are generally communicated in spans of text and MRC models perform well in extracting entity-centric spans. The explicit, structured, and evolving knowledge graph representations that our model constructs can be used in downstream question answering tasks to improve machine comprehension of text, as we demonstrate empirically. On two comprehension tasks from the recently proposed PROPARA dataset (Dalvi et al., 2018), our model achieves state-of-the-art results. We further show that our model is competitive on the RECIPES dataset (Kiddon et al., 2015), suggesting it may be generally applicable. We present some evidence that the model’s knowledge graphs help it to impose commonsense constraints on its predictions.

Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms

Clustering non-Euclidean data is difficult, and one of the most used algorithms besides hierarchical clustering is the popular algorithm PAM, partitioning around medoids, also known as k-medoids. In Euclidean geometry the mean–as used in k-means–is a good estimator for the cluster center, but this does not hold for arbitrary dissimilarities. PAM uses the medoid instead, the object with the smallest dissimilarity to all others in the cluster. This notion of centrality can be used with any (dis-)similarity, and thus is of high relevance to many domains such as biology that require the use of Jaccard, Gower, or even more complex distances. A key issue with PAM is, however, its high run time cost. In this paper, we propose modifications to the PAM algorithm where at the cost of storing O(k) additional values, we can achieve an O(k)-fold speedup in the second (‘SWAP’) phase of the algorithm, but will still find the same results as the original PAM algorithm. If we slightly relax the choice of swaps performed (while retaining comparable quality), we can further accelerate the algorithm by performing up to k swaps in each iteration. We also show how the CLARA and CLARANS algorithms benefit from this modification. In experiments on real data with k=100, we observed a 200 fold speedup compared to the original PAM SWAP algorithm, making PAM applicable to larger data sets, and in particular to higher k.
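The medoid notion described above can be illustrated with a small sketch. It is not the authors' accelerated SWAP procedure; it only shows how a medoid is picked under an arbitrary dissimilarity, here Jaccard distance on sets. The function names `medoid` and `jaccard_distance` and the sample cluster are assumptions made for illustration.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets: 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)


def medoid(points, dissimilarity):
    """Return the member of `points` with the smallest total
    dissimilarity to all members of the cluster (hypothetical helper)."""
    return min(points, key=lambda c: sum(dissimilarity(c, p) for p in points))


# Illustrative cluster of sets; no mean exists for such data,
# but a medoid is always one of the cluster's own objects.
cluster = [
    frozenset({"a", "b"}),
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b", "c", "d"}),
    frozenset({"x"}),
]
center = medoid(cluster, jaccard_distance)
```

This brute-force selection needs a quadratic number of dissimilarity evaluations per cluster, which hints at why run time is the key issue the abstract addresses.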

Can evolution paths be explained by chance alone?

We propose a purely probabilistic model to explain the evolution path of a population maximum fitness. We show that after n births in the population there are about \ln n upwards jumps. This is true for any mutation probability and any fitness distribution and therefore suggests a general law for the number of upwards jumps. Simulations of our model show that a typical evolution path has first a steep rise followed by long plateaux. Moreover, independent runs show parallel paths. This is consistent with what was observed by Lenski and Travisano (1994) in their bacteria experiments.
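A simplified way to see where a logarithmic count can come from is the classical record problem: among n independent draws from a continuous distribution, the expected number of running maxima is the harmonic number H_n, which is approximately ln n + 0.5772. The sketch below simulates that simplified setting only; it is not the paper's model (it has no mutation probability), and the function names are hypothetical.

```python
import math
import random


def count_records(values):
    """Count how many times a new running maximum appears."""
    records = 0
    best = float("-inf")
    for v in values:
        if v > best:
            best = v
            records += 1
    return records


def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n, the expected number of records in n i.i.d. draws."""
    return sum(1.0 / i for i in range(1, n + 1))


def average_records(n, runs, seed=0):
    """Average record count over `runs` independent sequences of length n."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    total = 0
    for _ in range(runs):
        total += count_records(rng.random() for _ in range(n))
    return total / runs


if __name__ == "__main__":
    n = 1000
    print("simulated:", average_records(n, runs=200))
    print("H_n:", harmonic(n), "ln n:", math.log(n))
```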

One-Shot PIR: Refinement and Lifting

We study a class of private information retrieval (PIR) methods that we call one-shot schemes. The intuition behind one-shot schemes is the following. The user’s query is regarded as a dot product of a query vector and the message vector (database) stored at multiple servers. Privacy, in an information theoretic sense, is then achieved by encrypting the query vector using a secure linear code, such as secret sharing. Several PIR schemes in the literature, in addition to novel ones constructed here, fall into this class. One-shot schemes provide an insightful link between PIR and data security against eavesdropping. However, their download rate is not optimal, i.e., they do not achieve the PIR capacity. Our main contribution is two transformations of one-shot schemes, which we call refining and lifting. We show that refining and lifting one-shot schemes gives capacity-achieving schemes for the cases when the PIR capacity is known. In the other cases, when the PIR capacity is still unknown, refining and lifting one-shot schemes gives the best download rate so far.
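The dot-product intuition above can be sketched with the simplest possible instance: two non-colluding servers that each hold a full copy of the database, and a query vector hidden by additive secret sharing over a prime field. This is an illustrative assumption, not one of the paper's constructions; the modulus `P` and all function names are hypothetical.

```python
import secrets

P = 2_147_483_647  # prime modulus 2**31 - 1 (assumed; database entries must be < P)


def make_queries(index, n):
    """Split the unit vector e_index into two additive shares mod P.
    Each share alone is uniformly random and reveals nothing about index."""
    share_a = [secrets.randbelow(P) for _ in range(n)]
    share_b = [((1 if j == index else 0) - share_a[j]) % P for j in range(n)]
    return share_a, share_b


def server_answer(database, query):
    """Each server returns the dot product of its database copy with its query share."""
    return sum(d * q for d, q in zip(database, query)) % P


def retrieve(database, index):
    """User side: send one share to each server and add the two answers."""
    query_a, query_b = make_queries(index, len(database))
    answer_a = server_answer(database, query_a)  # computed by server A
    answer_b = server_answer(database, query_b)  # computed by server B
    return (answer_a + answer_b) % P


database = [10, 20, 30, 40]
wanted = retrieve(database, 2)
```

In this sketch the user downloads two field symbols to recover one, which illustrates the abstract's point that such schemes are simple but do not have an optimal download rate.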

Estimating Information Flow in Neural Networks

We study the flow of information and the evolution of internal representations during deep neural network (DNN) training, aiming to demystify the compression aspect of the information bottleneck theory. The theory suggests that DNN training comprises a rapid fitting phase followed by a slower compression phase, in which the mutual information I(X;T) between the input X and internal representations T decreases. Several papers observe compression of estimated mutual information on different DNN models, but the true I(X;T) over these networks is provably either constant (discrete X) or infinite (continuous X). This work explains the discrepancy between theory and experiments, and clarifies what was actually measured by these past works. To this end, we introduce an auxiliary (noisy) DNN framework for which I(X;T) is a meaningful quantity that depends on the network’s parameters. This noisy framework is shown to be a good proxy for the original (deterministic) DNN both in terms of performance and the learned representations. We then develop a rigorous estimator for I(X;T) in noisy DNNs and observe compression in various models. By relating I(X;T) in the noisy DNN to an information-theoretic communication problem, we show that compression is driven by the progressive clustering of hidden representations of inputs from the same class. Several methods to directly monitor clustering of hidden representations, both in noisy and deterministic DNNs, are used to show that meaningful clusters form in the T space. Finally, we return to the estimator of I(X;T) employed in past works, and demonstrate that while it fails to capture the true (vacuous) mutual information, it does serve as a measure for clustering. This clarifies the past observations of compression and isolates the geometric clustering of hidden representations as the true phenomenon of interest.

Unsupervised Neural Multi-document Abstractive Summarization

Abstractive summarization has been studied using neural sequence transduction methods with datasets of large, paired document-summary examples. However, such datasets are rare and the models trained from them do not generalize to other domains. Recently, some progress has been made in learning sequence-to-sequence mappings with only unpaired examples. In our work, we consider the setting where there are only documents and no summaries provided and propose an end-to-end, neural model architecture to perform unsupervised abstractive summarization. Our proposed model consists of an auto-encoder trained so that the mean of the representations of the input documents decodes to a reasonable summary. We consider variants of the proposed architecture and perform an ablation study to show the importance of specific components. We apply our model to the summarization of business and product reviews and show that the generated summaries are fluent, show relevancy in terms of word-overlap, representative of the average sentiment of the input documents, and are highly abstractive compared to baselines.

Explaining Black Boxes on Sequential Data using Weighted Automata

Understanding how a learned black box works is of crucial interest for the future of Machine Learning. In this paper, we pioneer the question of the global interpretability of learned black box models that assign numerical values to symbolic sequential data. To tackle that task, we propose a spectral algorithm for the extraction of weighted automata (WA) from such black boxes. This algorithm does not require the access to a dataset or to the inner representation of the black box: the inferred model can be obtained solely by querying the black box, feeding it with inputs and analyzing its outputs. Experiments using Recurrent Neural Networks (RNN) trained on a wide collection of 48 synthetic datasets and 2 real datasets show that the obtained approximation is of great quality.

Graph HyperNetworks for Neural Architecture Search

Neural architecture search (NAS) automatically finds the best task-specific neural network topology, outperforming many manual architecture designs. However, it can be prohibitively expensive as the search requires training thousands of different networks, while each can last for hours. In this work, we propose the Graph HyperNetwork (GHN) to amortize the search cost: given an architecture, it directly generates the weights by running inference on a graph neural network. GHNs model the topology of an architecture and therefore can predict network performance more accurately than regular hypernetworks and premature early stopping. To perform NAS, we randomly sample architectures and use the validation accuracy of networks with GHN generated weights as the surrogate search signal. GHNs are fast — they can search nearly 10 times faster than other random search methods on CIFAR-10 and ImageNet. GHNs can be further extended to the anytime prediction setting, where they have found networks with better speed-accuracy tradeoff than the state-of-the-art manual designs.

Measuring Swampiness: Quantifying Chaos in Large Heterogeneous Data Repositories

As scientific data repositories and filesystems grow in size and complexity, they become increasingly disorganized. The coupling of massive quantities of data with poor organization makes it challenging for scientists to locate and utilize relevant data, thus slowing the process of analyzing data of interest. To address these issues, we explore an automated clustering approach for quantifying the organization of data repositories. Our parallel pipeline processes heterogeneous filetypes (e.g., text and tabular data), automatically clusters files based on content and metadata similarities, and computes a novel ‘cleanliness’ score from the resulting clustering. We demonstrate the generation and accuracy of our cleanliness measure using both synthetic and real datasets, and conclude that it is more consistent than other potential cleanliness measures.

Mixture of Expert/Imitator Networks: Scalable Semi-supervised Learning Framework

The current success of deep neural networks (DNNs) in an increasingly broad range of artificial intelligence tasks strongly depends on the quality and quantity of labeled training data. In general, the scarcity of labeled data, which is often observed in many natural language processing tasks, is one of the most important issues to be addressed. Semi-supervised learning (SSL) is a promising approach to overcome this issue by incorporating a large amount of unlabeled data. In this paper, we propose a novel scalable method of SSL for text classification tasks. The unique property of our method, Mixture of Expert/Imitator Networks, is that imitator networks learn to ‘imitate’ the estimated label distribution of the expert network over the unlabeled data, which potentially contributes a set of features for the classification. Our experiments demonstrate that the proposed method consistently improves the performance of several types of baseline DNNs. We also demonstrate that our method has the ‘more data, better performance’ property, with promising scalability in the amount of unlabeled data.

Categorical Aspects of Parameter Learning

Parameter learning is the technique for obtaining the probabilistic parameters in conditional probability tables in Bayesian networks from tables with (observed) data, where it is assumed that the underlying graphical structure is known. There are basically two ways of doing so, referred to as maximal likelihood estimation (MLE) and as Bayesian learning. This paper provides a categorical analysis of these two techniques and describes them in terms of basic properties of the multiset monad M, the distribution monad D and the Giry monad G. In essence, learning is about the relationships between multisets (used for counting) on the one hand and probability distributions on the other. These relationships will be described as suitable natural transformations.

A Geometric Analysis of Time Series Leading to Information Encoding and a New Entropy Measure

A time series is uniquely represented by its geometric shape, which also carries information. A time series can be modelled as the trajectory of a particle moving in a force field with one degree of freedom. The force acting on the particle shapes the trajectory of its motion, which is made up of elementary shapes of infinitesimal neighborhoods of points in the trajectory. It has been proved that an infinitesimal neighborhood of a point in a continuous time series can have at least 29 different shapes or configurations. So information can be encoded in it in at least 29 different ways. A 3-point neighborhood (the smallest) in a discrete time series can have precisely 13 different shapes or configurations. In other words, a discrete time series can be expressed as a string over an alphabet of 13 symbols. Across diverse real as well as simulated data sets it has been observed that 6 of them occur more frequently and the remaining 7 occur less frequently. Based on the frequency distribution of these 13 configurations, that is, 13 different ways of encoding information, a novel entropy measure, called semantic entropy (E), has been defined. Following the notion of power in Newtonian mechanics for the moving particle whose trajectory is the time series, a notion of information power (P) has been introduced for time series. E/P turned out to be an important indicator of synchronous behaviour of time series as observed in epileptic EEG signals.

Ineffectiveness of Dictionary Coding to Infer Predictability Limits of Human Mobility
Neural Network based classification of bone metastasis by primary carcinoma
DeepWeeds: A Multiclass Weed Species Image Dataset for Deep Learning
Automatic Segmentation of Thoracic Aorta Segments in Low-Dose Chest CT
UOLO – automatic object detection and segmentation in biomedical images
Rate Distortion For Model Compression: From Theory To Practice
Is PGD-Adversarial Training Necessary? Alternative Training via a Soft-Quantization Network with Noisy-Natural Samples Only
Unpaired High-Resolution and Scalable Style Transfer Using Generative Adversarial Networks
CRH: A Simple Benchmark Approach to Continuous Hashing
Image Super-Resolution Using VDSR-ResNeXt and SRCGAN
Computational ghost imaging using a field-programmable gate array
A Novel Domain Adaptation Framework for Medical Image Segmentation
A Resource Allocation based Approach for Corporate Mobility as a Service
A Data-Driven Framework for Assessing Cold Load Pick-up Demand in Service Restoration
Learning Optimal Deep Projection of $^{18}$F-FDG PET Imaging for Early Differential Diagnosis of Parkinsonian Syndromes
InfiNet: Fully Convolutional Networks for Infant Brain MRI Segmentation
Bottom-up Attention, Models of
Inventory Balancing with Online Learning
Dirichlet conditions in Poincaré-Sobolev inequalities: the sub-homogeneous case
The highD Dataset: A Drone Dataset of Naturalistic Vehicle Trajectories on German Highways for Validation of Highly Automated Driving Systems
Mean field voter model on networks and multi-variate beta distribution
On the sensitivity analysis of energy quanto options
Subordinators which are infinitely divisible w.r.t. time: Construction, properties, and simulation of max-stable sequences and infinitely divisible laws
Wind Power Persistence is Governed by Superstatistic
Finite sample performance of linear least squares estimation
Thresholds quantifying proportionality criteria for election methods
Smart Load Node for Non-Smart Load under Smart Grid Paradigm
Regression Based Approach for Measurement of Current in Single-Phase Smart Energy Meter
SmartPM: Automatic Adaptation of Dynamic Processes at Run-Time
On mixture representations for the generalized Linnik distribution and their applications in limit theorems
BSDEs driven by $|z|^2/y$ and applications
Performance Analysis of Large Intelligence Surfaces (LISs): Asymptotic Data Rate and Channel Hardening Effects
Linear response and moderate deviations: hierarchical approach. IV
Spherical Regression under Mismatch Corruption with Application to Automated Knowledge Translation
Long-Duration Autonomy for Small Rotorcraft UAS including Recharging
Non vanishing of theta functions and sets of small multiplicative energy
Closing the Sim-to-Real Loop: Adapting Simulation Randomization with Real World Experience
Linear Program Reconstruction in Practice
Almost Complete Graphs and the Kruskal Katona Theorem
Improving Generalization of Sequence Encoder-Decoder Networks for Inverse Imaging of Cardiac Transmembrane Potential
Does Haze Removal Help CNN-based Image Classification?
Topology of Z_3 equivariant Hilbert schemes
Policy Transfer with Strategy Optimization
Global Convergence of EM Algorithm for Mixtures of Two Component Linear Regression
Relative compression of trajectories
A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification
Shell Tableaux: A set partition analogue of vacillating tableaux
Topological Inference of Manifolds with Boundary
A geometrically converging dual method for distributed optimization over time-varying graphs
Estimating Robot Strengths with Application to Selection of Alliance Members in FIRST Robotics Competitions
A Model for Auto-Programming for General Purposes
Hierarchical Game-Theoretic Planning for Autonomous Vehicles
$C_{2k}$-saturated graphs with no short odd cycles
CPNet: A Context Preserver Convolutional Neural Network for Detecting Shadows in Single RGB Images
Pose Estimation for Objects with Rotational Symmetry
Stabilization and manipulation of multi-spin states in quantum dot time crystals with Heisenberg interactions
Cloud Detection Algorithm for Remote Sensing Images Using Fully Convolutional Neural Networks
Learning to Globally Edit Images with Textual Description
Point Cloud GAN
Core Influence Mechanism on Vertex-Cover Problem through Leaf-Removal-Core Breaking
Deep learning based cloud detection for remote sensing images by the fusion of multi-scale convolutional features
On the null structure of bipartite graphs without cycles of length a multiple of 4
Towards Provably Safe Mixed Transportation Systems with Human-driven and Automated Vehicles
Extremes of branching Ornstein-Uhlenbeck processes
Efficient Multi-level Correlating for Visual Tracking
On the Rate of Convergence for a Characteristic of Multidimensional Birth-Death Process
Ultrafast cryptography with indefinitely switchable optical nanoantennas
Contagions in Social Networks: Effects of Monophilic Contagion, Friendship Paradox and Reactive Networks
Quantum simulation of clustered photosynthetic light harvesting in a superconducting quantum circuit
Diffusive spin-orbit torque at a surface of topological insulator
Approximating Pairwise Correlations in the Ising Model
Characterising epithelial tissues using persistent entropy
Time Synchronization in Wireless Sensor Networks based on Newtons Adaptive Algorithm
Delay Regulated Explosive Synchronization in Multiplex Networks
Error estimation at the information reconciliation stage of quantum key distribution
Optimal Temperature Spacing for Regionally Weight-preserving Tempering
Nesterov Acceleration of Alternating Least Squares for Canonical Tensor Decomposition
Overview of CAIL2018: Legal Judgment Prediction Competition
Exploiting Semantics in Adversarial Training for Image-Level Domain Adaptation
Using generalized estimating equations to estimate nonlinear models with spatial data
On Greedy and Strategic Evaders in Sequential Interdiction Settings with Incomplete Information
Equivalent Constraints for Two-View Geometry: Pose Solution/Pure Rotation Identification and 3D Reconstruction
Attention Driven Person Re-identification
Understanding Crosslingual Transfer Mechanisms in Probabilistic Topic Modeling
Hybrid Building/Floor Classification and Location Coordinates Regression Using A Single-Input and Multi-Output Deep Neural Network for Large-Scale Indoor Localization Based on Wi-Fi Fingerprinting
Generalized tensor equations with leading structured tensors
Linearizable Replicated State Machines with Lattice Agreement
Further study on tensor absolute value equations
Optimal Control of DERs in ADN under Spatial and Temporal Correlated Uncertainties
Embedded deep learning in ophthalmology: Making ophthalmic imaging smarter
A space-time pseudospectral discretization method for solving diffusion optimal control problems with two-sided fractional derivatives
Optimal Time Scheduling Scheme for Wireless Powered Ambient Backscatter Communication in IoT Network
A New [Combinatorial] Proof of the Commutativity of Matching Polynomials for Cycles
Resource Allocation in IoT networks using Wireless Power Transfer
Deep Learning-Based Channel Estimation
Power Flow as Intersection of Circles: A new Fixed Point Method
Towards Formal Definitions of Blameworthiness, Intention, and Moral Responsibility
Computing the partition function of the Sherrington-Kirkpatrick model is hard on average
Optimal Evidence Accumulation on Social Networks
Group Inverse of the Laplacian of Connections of Networks
Evacuation simulation considering action of the guard in an artificial attack
No-reference Image Denoising Quality Assessment
Two Can Play That Game: An Adversarial Evaluation of a Cyber-alert Inspection System
Porosity Amount Estimation in Stones Based on Combination of One Dimensional Local Binary Patterns and Image Normalization Technique
A Transformation-Proximal Bundle Algorithm for Solving Large-Scale Multistage Adaptive Robust Optimization Problems
Massively Parallel Hyperparameter Tuning
Uniform Convergence Rate of the Kernel Density Estimator Adaptive to Intrinsic Dimension
End-to-End Service Level Agreement Specification for IoT Applications
False Data Injection Cyber-Attack Detection
Enhanced Energy Management System with Corrective Transmission Switching Strategy – Part I: Methodology
Enhanced Energy Management System with Corrective Transmission Switching Strategy – Part II: Results and Discussion
Varifocal-Net: A Chromosome Classification Approach using Deep Convolutional Networks
Social Media Brand Engagement as a Proxy for E-commerce Activities: A Case Study of Sina Weibo and JD
Robust Model Predictive Control of Irrigation Systems with Active Uncertainty Learning and Data Analytics
Delay-Constrained Covert Communications with A Full-Duplex Receiver
The relationship between graphs and Nichols braided Lie algebras
Comparison Detector: A novel object detection method for small dataset
Approximating optimal transport with linear programs
Incorporating Diversity into Influential Node Mining
Rainbow triangles in arc-colored digraphs
Empirical determination of the optimum attack for fragmentation of modular networks
Perceptual Image Quality Assessment through Spectral Analysis of Error Representations
Efficient Reconstructions of Common Era Climate via Integrated Nested Laplace Approximations
DDSL: Efficient Subgraph Listing on Distributed and Dynamic Graphs
Sequential Change-point Detection for High-dimensional and non-Euclidean Data
Learning to Sketch with Deep Q Networks and Demonstrated Strokes
Finding Similar Medical Questions from Question Answering Websites
Kasteleyn operators from mirror symmetry
Theoretical Guarantees of Transfer Learning
Lung Structures Enhancement in Chest Radiographs via CT based FCNN Training
Convex Hull Approximation of Nearly Optimal Lasso Solutions
Modeling Multimodal Dynamic Spatiotemporal Graphs
BLEU is Not Suitable for the Evaluation of Text Simplification