Topic representation: finding more representative words in topic models

The top word list, i.e., the top-M words with highest marginal probability in a given topic, is the standard topic representation in topic models. Most recent automatic topic labeling algorithms and popular topic quality metrics are based on it. However, we find empirically that the words in this type of top word list are not always representative. The objective of this paper is to find more representative top word lists for topics. To achieve this, we rerank the words in a given topic by further considering the marginal probability of each word over every other topic. The reranked list of top-M words is used as a novel topic representation for topic models. We investigate three reranking methodologies, using (1) standard deviation weight, (2) standard deviation weight with topic size and (3) Chi-square (χ²) statistic selection. Experimental results on real world collections indicate that our representations can extract more representative words for topics, agreeing with human judgements.
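
The abstract does not give the weighting formula, so the following is only a rough sketch of one plausible reading of the standard deviation weight: a word's probability in the topic is multiplied by the standard deviation of that word's probability across all topics, so words that are equally likely everywhere drop down the list. The function name `rerank_top_words`, the scoring rule and the toy matrix `phi` are all assumptions made for illustration, not the authors' method.

```python
import numpy as np

def rerank_top_words(phi, topic, m):
    """Return indices of the top-m words of `topic` after reranking.

    phi: (num_topics, vocab_size) array of per-topic word probabilities.
    Score (an assumption, not taken from the paper):
        phi[topic, w] * std over topics of phi[:, w]
    """
    spread = phi.std(axis=0)          # how unevenly each word is spread over topics
    scores = phi[topic] * spread      # frequent AND topic-specific words score high
    return np.argsort(-scores)[:m].tolist()

# Toy example: word 0 is equally probable in every topic (e.g. a stop word).
phi = np.array([
    [0.4, 0.3, 0.2, 0.1],
    [0.4, 0.1, 0.1, 0.4],
    [0.4, 0.1, 0.3, 0.2],
])
by_probability = np.argsort(-phi[0])[:2].tolist()
by_reranking = rerank_top_words(phi, topic=0, m=2)
```

In this toy matrix the plain probability ranking for topic 0 starts with word 0, while the reranked list drops it because its probability is the same in every topic.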


The Faults in Our Pi Stars: Security Issues and Open Challenges in Deep Reinforcement Learning

Since the inception of Deep Reinforcement Learning (DRL) algorithms, there has been a growing interest in both research and industrial communities in the promising potential of this paradigm. The list of current and envisioned applications of deep RL ranges from autonomous navigation and robotics to control applications in the critical infrastructure, air traffic control, defense technologies, and cybersecurity. While the landscape of opportunities and the advantages of deep RL algorithms are justifiably vast, the security risks and issues in such algorithms remain largely unexplored. To facilitate and motivate further research on these critical challenges, this paper presents a foundational treatment of the security problem in DRL. We formulate the security requirements of DRL, and provide a high-level threat model through the classification and identification of vulnerabilities, attack vectors, and adversarial capabilities. Furthermore, we present a review of current literature on security of deep RL from both offensive and defensive perspectives. Lastly, we enumerate critical research avenues and open problems in mitigation and prevention of intentional attacks against deep RL as a roadmap for further research in this area.


Time-Aware and Corpus-Specific Entity Relatedness

Entity relatedness has emerged as an important feature in a plethora of applications such as information retrieval, entity recommendation and entity linking. Given an entity, for instance a person or an organization, entity relatedness measures can be exploited for generating a list of highly-related entities. However, the relation of an entity to some other entity depends on several factors, with time and context being two of the most important ones (where, in our case, context is determined by a particular corpus). For example, the entities related to the International Monetary Fund are different now compared to some years ago, and these entities may also differ considerably in the context of a USA news portal compared to a Greek news portal. In this paper, we propose a simple but flexible model for entity relatedness which considers time- and entity-aware word embeddings by exploiting the underlying corpus. The proposed model does not require external knowledge and is language independent, which makes it widely useful in a variety of applications.


Stochastic Substitute Training: A Gray-box Approach to Craft Adversarial Examples Against Gradient Obfuscation Defenses

It has been shown that adversaries can craft example inputs to neural networks which are similar to legitimate inputs but have been created to purposely cause the neural network to misclassify the input. These adversarial examples are crafted, for example, by calculating gradients of a carefully defined loss function with respect to the input. As a countermeasure, some researchers have tried to design robust models by blocking or obfuscating gradients, even in white-box settings. Another line of research proposes introducing a separate detector to attempt to detect adversarial examples. This approach also makes use of gradient obfuscation techniques, for example, to prevent the adversary from trying to fool the detector. In this paper, we introduce stochastic substitute training, a gray-box approach that can craft adversarial examples for defenses which obfuscate gradients. For those defenses that have tried to make models more robust, with our technique, an adversary can craft adversarial examples with no knowledge of the defense. For defenses that attempt to detect the adversarial examples, with our technique, an adversary only needs very limited information about the defense to craft adversarial examples. We demonstrate our technique by applying it against two defenses which make models more robust and two defenses which detect adversarial examples.


Some negative results for Neural Networks

We demonstrate some negative results for approximation of functions with neural networks.


A new approach of contextual recommendation based on the method of Hierarchical Analysis of Processes

Recommender systems estimate a user’s interest in a resource from information about similar users and about the properties of the resource. In this thesis, we introduce a new contextual recommendation approach based on the Analytic Hierarchy Process (AHP) method. The work includes a bibliographic study of existing recommender systems that use the context of users in the field of films. The goal is to design and develop a new approach to recommending movies based on user context. We rely on multi-criteria decision making (MCDM) methods, and more precisely on AHP, to integrate context into the recommendation process.
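
As a rough illustration of the AHP step mentioned above, the sketch below derives criterion weights from a pairwise comparison matrix using the standard principal-eigenvector method. It is a minimal sketch, not the authors' system: the function name `ahp_weights`, the three context criteria (time of day, companion, mood) and the comparison values are hypothetical.

```python
import numpy as np

def ahp_weights(pairwise):
    """Priority weights from an AHP pairwise comparison matrix.

    pairwise[i, j] states how much more important criterion i is than j,
    with pairwise[j, i] == 1 / pairwise[i, j]. The weights are the
    normalised principal eigenvector of the matrix.
    """
    values, vectors = np.linalg.eig(pairwise)
    principal = np.argmax(values.real)            # largest eigenvalue
    weights = np.abs(vectors[:, principal].real)  # fix the arbitrary sign
    return weights / weights.sum()

# Hypothetical context criteria: time of day, companion, mood.
pairwise = np.array([
    [1.0,  2.0, 4.0],
    [0.5,  1.0, 2.0],
    [0.25, 0.5, 1.0],
])
weights = ahp_weights(pairwise)
```

Because this example matrix is perfectly consistent, the weights are proportional to 4 : 2 : 1. With real judgements the matrix is usually only approximately consistent, and a consistency check is normally applied before the weights are used.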


Graph Laplacian mixture model

Graph learning methods have recently been receiving increasing interest as means to infer structure in datasets. Most of the recent approaches focus on different relationships between a graph and data sample distributions, mostly in settings where all available data relate to the same graph. This is, however, not always the case, as data is often available in mixed form, yielding the need for methods that are able to cope with mixture data and learn multiple graphs. We propose a novel generative model that explains a collection of distinct data naturally living on different graphs. We assume the mapping of data to graphs is not known and investigate the problem of jointly clustering a set of data and learning a graph for each of the clusters. Experiments on both synthetic and real-world datasets demonstrate promising performance both in terms of data clustering and in terms of multiple graph inference from mixture data.


NestDNN: Resource-Aware Multi-Tenant On-Device Deep Learning for Continuous Mobile Vision

Mobile vision systems such as smartphones, drones, and augmented-reality headsets are revolutionizing our lives. These systems usually run multiple applications concurrently and their available resources at runtime are dynamic due to events such as starting new applications, closing existing applications, and application priority changes. In this paper, we present NestDNN, a framework that takes the dynamics of runtime resources into account to enable resource-aware multi-tenant on-device deep learning for mobile vision systems. NestDNN enables each deep learning model to offer flexible resource-accuracy trade-offs. At runtime, it dynamically selects the optimal resource-accuracy trade-off for each deep learning model to fit the model’s resource demand to the system’s available runtime resources. In doing so, NestDNN efficiently utilizes the limited resources in mobile vision systems to jointly maximize the performance of all the concurrently running applications. Our experiments show that compared to the resource-agnostic status quo approach, NestDNN achieves as much as 4.2% increase in inference accuracy, 2.0x increase in video frame processing rate and 1.7x reduction on energy consumption.


Learning Representations in Model-Free Hierarchical Reinforcement Learning

Common approaches to Reinforcement Learning (RL) are seriously challenged by large-scale applications involving huge state spaces and sparse delayed reward feedback. Hierarchical Reinforcement Learning (HRL) methods attempt to address this scalability issue by learning action selection policies at multiple levels of temporal abstraction. Abstraction can be had by identifying a relatively small set of states that are likely to be useful as subgoals, in concert with the learning of corresponding skill policies to achieve those subgoals. Many approaches to subgoal discovery in HRL depend on the analysis of a model of the environment, but the need to learn such a model introduces its own problems of scale. Once subgoals are identified, skills may be learned through intrinsic motivation, introducing an internal reward signal marking subgoal attainment. In this paper, we present a novel model-free method for subgoal discovery using incremental unsupervised learning over a small memory of the most recent experiences of the agent. When combined with an intrinsic motivation learning mechanism, this method learns subgoals and skills together, based on experiences in the environment. Thus, we offer an original approach to HRL that does not require the acquisition of a model of the environment, suitable for large-scale applications. We demonstrate the efficiency of our method on two RL problems with sparse delayed feedback: a variant of the rooms environment and the ATARI 2600 game called Montezuma’s Revenge.


Autowarp: Learning a Warping Distance from Unlabeled Time Series Using Sequence Autoencoders

Measuring similarities between unlabeled time series trajectories is an important problem in domains as diverse as medicine, astronomy, finance, and computer vision. It is often unclear what is the appropriate metric to use because of the complex nature of noise in the trajectories (e.g. different sampling rates or outliers). Domain experts typically hand-craft or manually select a specific metric, such as dynamic time warping (DTW), to apply on their data. In this paper, we propose Autowarp, an end-to-end algorithm that optimizes and learns a good metric given unlabeled trajectories. We define a flexible and differentiable family of warping metrics, which encompasses common metrics such as DTW, Euclidean, and edit distance. Autowarp then leverages the representation power of sequence autoencoders to optimize for a member of this warping distance family. The output is a metric which is easy to interpret and can be robustly learned from relatively few trajectories. In systematic experiments across different domains, we show that Autowarp often outperforms hand-crafted trajectory similarity metrics.
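
For reference, the hand-crafted baseline named above, dynamic time warping, can be sketched in a few lines of dynamic programming. This is plain DTW over scalar sequences with absolute difference as the local cost, not Autowarp's learned metric; the function name `dtw_distance` is chosen here for illustration.

```python
import math

def dtw_distance(a, b):
    """Classic DTW cost between two scalar sequences a and b."""
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b[j-1] is reused
                cost[i][j - 1],      # b advances, a[i-1] is reused
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

The first pair differs only in sampling rate, so warping aligns it at zero cost, which is the kind of invariance the abstract refers to.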


PoPPy: A Point Process Toolbox Based on PyTorch

PoPPy is a Point Process toolbox based on PyTorch, which enables flexible design and efficient learning of point process models. It can be used for interpretable sequential data modeling and analysis, e.g., Granger causality analysis of multi-variate point processes, point process-based simulation and prediction of event sequences. In practice, the key points of point process-based sequential data modeling include: 1) How to design intensity functions to describe the mechanism behind observed data? 2) How to learn the proposed intensity functions from observed data? The goal of PoPPy is to provide a user-friendly solution to the key points above and to support large-scale point process-based sequential data analysis, simulation and prediction.


Area Attention

Existing attention mechanisms are mostly item-based in that a model is designed to attend to a single item in a collection of items (the memory). Intuitively, an area in the memory that may contain multiple items can be worth attending to as a whole. We propose area attention: a way to attend to an area of the memory, where each area contains a group of items that are either spatially adjacent when the memory has a 2-dimensional structure, such as images, or temporally adjacent for 1-dimensional memory, such as natural language sentences. Importantly, the size of an area, i.e., the number of items in an area, can vary depending on the learned coherence of the adjacent items. By giving the model the option to attend to an area of items, instead of only a single item, we hope attention mechanisms can better capture the nature of the task. Area attention can work alongside multi-head attention for attending to multiple areas in the memory. We evaluate area attention on two tasks: neural machine translation and image captioning, and improve upon strong (state-of-the-art) baselines in both cases. These improvements are obtainable with a basic form of area attention that is parameter free. In addition to proposing the novel concept of area attention, we contribute an efficient way of computing it by leveraging the technique of summed area tables.
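
The summed area table mentioned above is a standard technique for obtaining the sum over any rectangular region with four lookups. The sketch below shows only that building block on a 2-dimensional grid of scalars; it is not the authors' implementation, and the function names `summed_area_table` and `area_sum` are hypothetical.

```python
import numpy as np

def summed_area_table(grid):
    """Return table with table[i, j] = sum of grid[:i, :j].

    The extra zero row and column avoid special cases at the borders.
    """
    table = np.zeros((grid.shape[0] + 1, grid.shape[1] + 1), dtype=grid.dtype)
    table[1:, 1:] = grid.cumsum(axis=0).cumsum(axis=1)
    return table

def area_sum(table, top, left, bottom, right):
    """Sum of grid[top:bottom, left:right] via inclusion-exclusion."""
    return (table[bottom, right] - table[top, right]
            - table[bottom, left] + table[top, left])

grid = np.arange(12).reshape(3, 4)   # a toy 3 x 4 "memory"
table = summed_area_table(grid)
inner = area_sum(table, 1, 1, 3, 3)  # rows 1-2, columns 1-2
```

Once the table is built in a single pass, the sum (and therefore the mean) of every candidate area costs constant time, regardless of the area's size.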


The Unit-B Method — Refinement Guided by Progress Concerns

We present Unit-B, a formal method inspired by Event-B and UNITY. Unit-B aims at the stepwise design of software systems satisfying safety and liveness properties. The method features the novel notion of coarse and fine schedules, a generalisation of weak and strong fairness for specifying events’ scheduling assumptions. Based on event schedules, we propose proof rules to reason about progress properties and a refinement order preserving both liveness and safety properties. We illustrate our approach with an example showing that systems development can be driven not only by safety but also by liveness requirements.


FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation

We present a Few-Shot Relation Classification Dataset (FewRel), consisting of 70,000 sentences on 100 relations derived from Wikipedia and annotated by crowdworkers. The relation of each sentence is first recognized by distant supervision methods, and then filtered by crowdworkers. We adapt the most recent state-of-the-art few-shot learning methods for relation classification and conduct a thorough evaluation of these methods. Empirical results show that even the most competitive few-shot learning models struggle on this task, especially as compared with humans. We also show that a range of different reasoning skills are needed to solve our task. These results indicate that few-shot relation classification remains an open problem and still requires further research. Our detailed analysis points to multiple directions for future research. All details and resources about the dataset and baselines are released on http://…/fewrel.


Randomized Gradient Boosting Machine

Gradient Boosting Machine (GBM) introduced by Friedman is an extremely powerful supervised learning algorithm that is widely used in practice — it routinely features as a leading algorithm in machine learning competitions such as Kaggle and the KDDCup. In spite of the usefulness of GBM in practice, there is a big gap between its theoretical understanding and its success in practice. In this work, we propose Randomized Gradient Boosting Machine (RGBM) which leads to significant computational gains compared to GBM, by using a randomization scheme to reduce the search in the space of weak learners. Our analysis provides a formal justification of commonly used ad hoc heuristics employed by GBM implementations such as XGBoost, and suggests alternatives. In particular, we also provide a principled guideline towards better step-size selection in RGBM that does not require a line search. The analysis of RGBM is inspired by a special variant of coordinate descent that combines the benefits of randomized coordinate descent and greedy coordinate descent; and may be of independent interest as an optimization algorithm. As a special case, our results for RGBM lead to superior computational guarantees for GBM. Our computational guarantees depend upon a curious geometric quantity that we call Minimal Cosine Angle, which relates to the density of weak learners in the prediction space. We demonstrate the effectiveness of RGBM over GBM in terms of obtaining a model with good training/test data fidelity with a fraction of the computational cost, via numerical experiments on several real datasets.


Deep Learning with Long Short-Term Memory for Time Series Prediction

Time series prediction can be generalized as a process that extracts useful information from historical records and then determines future values. Learning long-range dependencies that are embedded in time series is often an obstacle for most algorithms, whereas Long Short-Term Memory (LSTM) solutions, as a specific kind of scheme in deep learning, promise to effectively overcome the problem. In this article, we first give a brief introduction to the structure and forward propagation mechanism of the LSTM model. Then, aiming at reducing the considerable computing cost of LSTM, we put forward the Random Connectivity LSTM (RCLSTM) model and test it by predicting traffic and user mobility in telecommunication networks. Compared to LSTM, RCLSTM is formed via stochastic connectivity between neurons, which achieves a significant breakthrough in the architecture formation of neural networks. In this way, the RCLSTM model exhibits a certain level of sparsity, which leads to an appealing decrease in the computational complexity and makes the RCLSTM model more applicable in latency-stringent application scenarios. In the field of telecommunication networks, the prediction of traffic series and mobility traces could directly benefit from this improvement, as we further demonstrate that the prediction accuracy of RCLSTM is comparable to that of the conventional LSTM no matter how we change the number of training samples or the length of input sequences.


Outcome-wide longitudinal designs for causal inference: a new template for empirical studies

In this paper we propose a new template for empirical studies intended to assess causal effects: the outcome-wide longitudinal design. The approach is an extension of what is often done to assess the causal effects of a treatment or exposure using confounding control, but now, over numerous outcomes. We discuss the temporal and confounding control principles for such outcome-wide studies, metrics to evaluate robustness or sensitivity to potential unmeasured confounding for each outcome, and approaches to handle multiple testing. We argue that the outcome-wide longitudinal design has numerous advantages over more traditional studies of single exposure-outcome relationships including results that are less subject to investigator bias, greater potential to report null effects, greater capacity to compare effect sizes, a tremendous gain in the efficiency for the research community, a greater policy relevance, and a more rapid advancement of knowledge. We discuss both the practical and theoretical justification for the outcome-wide longitudinal design and also the pragmatic details of its implementation.


Exploiting Partial Correlations in Distributionally Robust Optimization

In this paper, we identify partial correlation information structures that allow for simpler reformulations in evaluating the maximum expected value of mixed integer linear programs with random objective coefficients. To this end, assuming only the knowledge of the mean and the covariance matrix entries restricted to block-diagonal patterns, we develop a reduced semidefinite programming formulation, the complexity of solving which is related to characterizing a suitable projection of the convex hull of the set $\{(\mathbf{x}, \mathbf{x}\mathbf{x}'): \mathbf{x} \in \mathcal{X}\}$ where $\mathcal{X}$ is the feasible region. In some cases, this lends itself to efficient representations that result in polynomial-time solvable instances, most notably for the distributionally robust appointment scheduling problem with random job durations as well as for computing tight bounds in Project Evaluation and Review Technique (PERT) networks and linear assignment problems. To the best of our knowledge, this is the first example of a distributionally robust optimization formulation for appointment scheduling that permits a tight polynomial-time solvable semidefinite programming reformulation which explicitly captures partially known correlation information between uncertain processing times of the jobs to be scheduled.


Modified Multidimensional Scaling and High Dimensional Clustering

Multidimensional scaling is an important dimension reduction tool in statistics and machine learning. Yet few theoretical results characterizing its statistical performance exist, not to mention any in high dimensions. By considering a unified framework that includes low, moderate and high dimensions, we study multidimensional scaling in the setting of clustering noisy data. Our results suggest that, in order to achieve consistent estimation of the embedding scheme, the classical multidimensional scaling needs to be modified, especially when the noise level increases. To this end, we propose modified multidimensional scaling, which applies a nonlinear transformation to the sample eigenvalues. The nonlinear transformation depends on the dimensionality, sample size and unknown moment. We show that modified multidimensional scaling followed by various clustering algorithms can achieve exact recovery, i.e., all the cluster labels can be recovered correctly with probability tending to one. Numerical simulations and two real data applications lend strong support to our proposed methodology. As a byproduct, we unify and improve existing results on the $\ell_{\infty}$ bound for eigenvectors under only low bounded moment conditions. This can be of independent interest.


The Hellinger Correlation

In this paper, the defining properties of a valid measure of the dependence between two random variables are reviewed and complemented with two original ones, shown to be more fundamental than other usual postulates. While other popular choices are proved to violate some of these requirements, a class of dependence measures satisfying all of them is identified. One particular measure, which we call the Hellinger correlation, appears as a natural choice within that class due to both its theoretical and intuitive appeal. A simple and efficient nonparametric estimator for that quantity is proposed. Finally, synthetic and real-data examples illustrate the descriptive ability of the measure, which can also be used as a test statistic for exact independence testing.


Label Propagation for Learning with Label Proportions

Learning with Label Proportions (LLP) is the problem of recovering the underlying true labels given a dataset when the data is presented in the form of bags. This paradigm is particularly suitable in contexts where providing individual labels is expensive and label aggregates are more easily obtained. In the healthcare domain, it is a burden for a patient to keep a detailed diary of their daily routines, but often they will be amenable to providing higher level summaries of daily behavior. We present a novel and efficient graph-based algorithm that encourages local smoothness and exploits the global structure of the data, while preserving the ‘mass’ of each bag.


Why every GBDT speed benchmark is wrong

This article provides a comprehensive study of different ways to make speed benchmarks of gradient boosted decision tree algorithms. We show the main problems of several straightforward ways to make benchmarks, explain why speed benchmarking is a challenging task, and provide a set of reasonable requirements for a benchmark to be fair and useful.


HAKD: Hardware Aware Knowledge Distillation

Despite recent developments, deploying deep neural networks on resource constrained general purpose hardware remains a significant challenge. There has been much work in developing methods for reshaping neural networks, usually with a focus on minimising total parameter count. These methods are typically developed in a hardware-agnostic manner and do not exploit hardware behaviour. In this paper we propose a new approach, Hardware Aware Knowledge Distillation (HAKD) which uses empirical observations of hardware behaviour to design efficient student networks which are then trained with knowledge distillation. This allows the trade-off between accuracy and performance to be managed explicitly. We have applied this approach across three platforms and evaluated it on two networks, MobileNet and DenseNet, on CIFAR-10. We show that HAKD outperforms Deep Compression and Fisher pruning in terms of size, accuracy and performance.


Multi-Multi-View Learning: Multilingual and Multi-Representation Entity Typing

Knowledge bases (KBs) are paramount in NLP. We employ multiview learning for increasing accuracy and coverage of entity type information in KBs. We rely on two metaviews: language and representation. For language, we consider high-resource and low-resource languages from Wikipedia. For representation, we consider representations based on the context distribution of the entity (i.e., on its embedding), on the entity’s name (i.e., on its surface form) and on its description in Wikipedia. The two metaviews language and representation can be freely combined: each pair of language and representation (e.g., German embedding, English description, Spanish name) is a distinct view. Our experiments on entity typing with fine-grained classes demonstrate the effectiveness of multiview learning. We release MVET, a large multiview – and, in particular, multilingual – entity typing dataset we created. Mono- and multilingual fine-grained entity typing systems can be evaluated on this dataset.


Multi-scale Geometric Summaries for Similarity-based Sensor Fusion
Instance Segmentation and Object Detection with Bounding Shape Masks
Vehicle classification using ResNets, localisation and spatially-weighted pooling
Multi-Stage Reinforcement Learning For Object Detection
Hyper-Process Model: A Zero-Shot Learning algorithm for Regression Problems based on Shape Analysis
Bottleneck Supervised U-Net for Pixel-wise Liver and Tumor Segmentation
Finite-time Guarantees for Byzantine-Resilient Distributed State Estimation with Noisy Measurements
Downsampling leads to Image Memorization in Convolutional Autoencoders
Projecting Trouble: Light Based Adversarial Attacks on Deep Learning Classifiers
Coherence Constraints in Facial Expression Recognition
Strategies for Training Stain Invariant CNNs
Characterization of Brain Cortical Morphology Using Localized Topology-Encoding Graphs
A Proximal Zeroth-Order Algorithm for Nonconvex Nonsmooth Problems
A Case for Object Compositionality in Deep Generative Models of Images
Visions of a generalized probability theory
Predicting optical coherence tomography-derived diabetic macular edema grades from fundus photographs using deep learning
Stochastic temporal data upscaling using the generalized k-nearest neighbor algorithm
From Machine to Machine: An OCT-trained Deep Learning Algorithm for Objective Quantification of Glaucomatous Damage in Fundus Photographs
Differentiable Fine-grained Quantization for Deep Neural Network Compression
Machine Learning Methods for Track Classification in the AT-TPC
Dermatologist Level Dermoscopy Skin Cancer Classification Using Different Deep Learning Convolutional Neural Networks Algorithms
Block Matching Frame based Material Reconstruction for Spectral CT
Boosted Convolutional Neural Networks for Motor Imagery EEG Decoding with Multiwavelet-based Time-Frequency Conditional Granger Causality Analysis
Implicit Modeling with Uncertainty Estimation for Intravoxel Incoherent Motion Imaging
SOT-MRAM 300mm integration for low power and ultrafast embedded memories
Experimental Investigation of Programmed State Stability in OxRAM Resistive Memories
Fantom: A scalable framework for asynchronous distributed systems
Chord Recognition in Symbolic Music: A Segmental CRF Model, Segment-Level Features, and Comparative Evaluations on Classical and Popular Music
Effective Filtering for Multiscale Stochastic Dynamical Systems driven by Lévy processes
Computing control invariant sets in high dimension is easy
LoGAN: Generating Logos with a Generative Adversarial Neural Network Conditioned on color
TweetsKB: A Public and Large-Scale RDF Corpus of Annotated Tweets
Active Ranking with Subset-wise Preferences
Analysis of Strategy and Spread of Russia-sponsored Content in the US in 2017
Uncovering Complex Overlapping Pattern of Communities in Large-scale Social Networks
The cumulative mass profile of the Milky Way as determined by globular cluster kinematics from Gaia DR2
DeepLSR: Deep learning approach for laser speckle reduction
Semiparametric Analysis of Competing Risks Data Under Missing Cause of Failure
Efficiently measuring a quantum device using machine learning
A Ramsey-type Theorem on the Max-Cut Value of $d$-Regular Graphs
Language Modeling at Scale
Approximating the Quadratic Transportation Metric in Near-Linear Time
A Method to Construct $1$-Rotational Factorizations of Complete Graphs and Solutions to the Oberwolfach Problem
On the Root solution to the Skorokhod embedding problem given full marginals
Cyclic structure induced by load fluctuations in adaptive transportation networks
Implosion of a pure death process
Perturbation techniques for convergence analysis of proximal gradient method and other first-order algorithms via variational analysis
Ramsey subsets of the space of infinite block sequences of vectors
Explicit Boij-Soderberg theory of ideals from a graph isomorphism reduction
Recognition of basic hand movements using Electromyography
Change of variable formula for local time of continuous semimartingale
Statistical mechanics of low-rank tensor decomposition
A Fusion Approach for Multi-Frame Optical Flow Estimation
A Statistical Approach to Adult Census Income Level Prediction
A belief propagation algorithm based on domain decomposition
Arithmetic progressions in the trace of Brownian motion in space
Model Selection for Nonnegative Matrix Factorization by Support Union Recovery
A Continuous-Time View of Early Stopping for Least Squares Regression
Structured Domain Randomization: Bridging the Reality Gap by Context-Aware Synthetic Data
Novel Adaptive Algorithms for Estimating Betweenness, Coverage and k-path Centralities
Deblending galaxy superpositions with branched generative adversarial networks
Classical pattern distributions in $\mathcal{S}_{n}(132)$ and $\mathcal{S}_{n}(123)$
Comparative Evaluation of Tree-Based Ensemble Algorithms for Short-Term Travel Time Prediction
Fast Computation of Steady-State Response for Nonlinear Vibrations of High-Degree-of-Freedom Systems
Reproducing AmbientGAN: Generative models from lossy measurements
Resource-Constrained Simultaneous Detection and Labeling of Objects in High-Resolution Satellite Images
On the log-normality of the degree distribution in large homogeneous binary multiplicative attribute graph models
End-to-End Diagnosis and Segmentation Learning from Cardiac Magnetic Resonance Imaging
Interpreting Black Box Predictions using Fisher Kernels
Delocalization of uniform graph homomorphisms from $\mathbb{Z}^2$ to $\mathbb{Z}$
A Remark on the Arcsine Distribution and the Hilbert Transform
Smoothed Online Optimization for Regression and Control
Voltage Collapse Stabilization: A Game Theory Viewpoint
A Binary Optimization Approach for Constrained K-Means Clustering
Approximation of nonnegative systems by moving averages of fixed order
Local Homology of Word Embeddings
Bayesian Modeling of Nonlinear Poisson Regression with Artificial Neural Networks
Sojourn times of Gaussian processes with trend
A Deep-Learning-Based Fashion Attributes Detection Model
Quadratic Backward Stochastic Volterra Integral Equations
AUNet: Breast Mass Segmentation of Whole Mammograms
Background Subtraction using Compressed Low-resolution Images
Automatic Identification of Indicators of Compromise using Neural-Based Sequence Labelling
Size Ramsey numbers of paths
Resolving Referring Expressions in Images With Labeled Elements
Nonconvex and Nonsmooth Sparse Optimization via Adaptively Iterative Reweighted Methods
Data-driven Blockbuster Planning on Online Movie Knowledge Library
Text Embeddings for Retrieval From a Large Knowledge Base
Learned optimizers that outperform SGD on wall-clock and validation loss
Exploiting Deep Representations for Neural Machine Translation
Modeling Localness for Self-Attention Networks
Multi-Head Attention with Disagreement Regularization
Conjugate coupling induced symmetry breaking and quenched oscillations
Fault Area Detection in Leaf Diseases using k-means Clustering
Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks
Solving Poisson’s Equation using Deep Learning in Particle Simulation of PN Junction
Automated Evaluation of Semantic Segmentation Robustness for Autonomous Driving
Niji: Bitcoin Bridge Utilizing Payment Channels
Exact distance graphs of product graphs
On the well-posedness of a class of McKean Feynman-Kac equations
Solving Weakly-Convex-Weakly-Concave Saddle-Point Problems as Weakly-Monotone Variational Inequality
The Langevin diffusion as a continuous-time model of animal movement and habitat selection
Asymptotics of eigenstructure of sample correlation matrices for high-dimensional spiked models
Numerical methods for piecewise deterministic Markov processes with boundary
Incompatible double posets and double order polytopes
DSFD: Dual Shot Face Detector
Cross-Resolution Person Re-identification with Deep Antithetical Learning
Universal Language Model Fine-Tuning with Subword Tokenization for Polish
Volume Of Sub-level Sets Of Homogeneous Polynomials
Textually Guided Ranking Network for Attentional Image Retweet Modeling
Faster approximation algorithms for computing shortest cycles on weighted graphs
Multistep Speed Prediction on Traffic Networks: A Graph Convolutional Sequence-to-Sequence Learning Approach with Attention Mechanism
Extension of the Gradient Boosting Algorithm for Joint Modeling of Longitudinal and Time-to-Event data
History by Diversity: Helping Historians search News Archives
Discovering Entities with Just a Little Help from You
Designing Search Tasks for Archive Search
Learn to Code-Switch: Data Augmentation using Copy Mechanism on Language Modeling
Algebraic solution of weighted minimax single-facility constrained location problems
Hierarchical landscape of hard disk glasses
A Maximum Edge-Weight Clique Extraction Algorithm Based on Branch-and-Bound
Production facilities location computing the environmental pollution
Publish-and-Flourish: decentralized co-creation and curation of scholarly content
Uniform Exponential Stabilisation of Serially Connected Inhomogeneous Euler-Bernoulli Beams
Accurate and efficient explicit approximations of the Colebrook flow friction equation based on the Wright-Omega function
A Bag-of-Tasks Scheduler Tolerant to Temporal Failures in Clouds
Learning color space adaptation from synthetic to real images of cirrus clouds
Mask Propagation Network for Video Object Segmentation
Estimating abundance from multiple sampling capture-recapture data via a multi-state multi-period stopover model
Coarse-to-fine volumetric segmentation of teeth in Cone-Beam CT
A Proof-Theoretic Approach to Scope Ambiguity in Compositional Vector Space Models
Optimal Algorithm for Bayesian Incentive-Compatible
Dental pathology detection in 3D cone-beam CT
First and Second Order Shape Optimization based on Restricted Mesh Deformations
Learning to Discriminate Noises for Incorporating External Information in Neural Machine Translation
The MeMAD Submission to the IWSLT 2018 Speech Translation Task
Generative adversarial networks and adversarial methods in biomedical image analysis
Complexity, combinatorial positivity, and Newton polytopes
G-SMOTE: A GMM-based synthetic minority oversampling technique for imbalanced learning
Scalable Gaussian Processes on Discrete Domains
Functional Inequalities for Weighted Gamma Distributions on the Space of Finite (Signed) Measures
Multi-condition of stability for nonlinear stochastic non-autonomous delay differential equation
Image-based Natural Language Understanding Using 2D Convolutional Neural Networks
Simultaneous transmission of classical and quantum information under channel uncertainty and jamming attacks
Multi-Agent Reinforcement Learning Based Resource Allocation for UAV Networks
Effective extractive summarization using frequency-filtered entity relationship graphs
Faithful orthogonal representations of graphs from partition logics
A Deep Learning Mechanism for Efficient Information Dissemination in Vehicular Floating Content
Notes on asymptotics of sample eigenstructure for spiked covariance models with non-Gaussian data
Enumeration of $S$-omino towers and row-convex $k$-omino towers
A localization method in Hamiltonian graph theory
A Map Equation with Metadata: Varying the Role of Attributes in Community Detection
Entropy in Quantum Information Theory — Communication and Cryptography
Semi-supervised Target-level Sentiment Analysis via Variational Autoencoder
The UAVid Dataset for Video Semantic Segmentation
A recursively feasible and convergent Sequential Convex Programming procedure to solve non-convex problems with linear equality constraints
Semantic Neutral Drift
Boundary of the Range of a random walk and the Fölner property
Building and Querying Semantic Layers for Web Archives (Extended Version)
The coset and stability rings
Software Rejuvenation for Secure Tracking Control
Learning Negotiating Behavior Between Cars in Intersections using Deep Q-Learning
Multi-type branching processes with time dependent branching rates
Noisy Blackbox Optimization with Multi-Fidelity Queries: A Tree Search Approach
Design of Software Rejuvenation for CPS Security Using Invariant Sets
Precipitation Nowcasting: Leveraging bidirectional LSTM and 1D CNN
Statistical modeling of rates and trends in Holocene relative sea level
Forecasting Individualized Disease Trajectories using Interpretable Deep Learning
Posterior Convergence of Gaussian and General Stochastic Process Regression Under Possible Misspecifications
Communities as Well Separated Subgraphs With Cohesive Cores: Identification of Core-Periphery Structures in Link Communities
Sleep-like slow oscillations induce hierarchical memory association and synaptic homeostasis in thalamo-cortical simulations
A stochastic sewing lemma and applications
Neighbourhood Consensus Networks
Between-Ride Routing for Private Transportation Services
Spatiotemporal CNNs for Pornography Detection in Videos
Toward an AI Physicist for Unsupervised Learning