Stationarity in the Realizations of the Causal Rate-Distortion Function for One-Sided Stationary Sources

This paper derives novel results on the characterization of the causal information rate-distortion function (IRDF) R_{c}^{it}(D) for arbitrarily-distributed one-sided stationary \kappa-th order Markov source x(1),x(2),…. It is first shown that Gorbunov and Pinsker’s results on the stationarity of the realizations to the causal IRDF (stated for two-sided stationary sources) do not apply to the commonly used family of asymptotic average single-letter (AASL) distortion criteria. Moreover, we show that, in general, a reconstruction sequence cannot be both jointly stationary with a one-sided stationary source sequence and causally related to it. This implies that, in general, the causal IRDF for one-sided stationary sources cannot be realized by a stationary distribution. However, we prove that for an arbitrarily distributed one-sided stationary source and a large class of distortion criteria (including AASL), the search for R_{c}^{it}(D) can be restricted to distributions which yield the output sequence y(1), y(2),… jointly stationary with the source after \kappa samples. Finally, we improve the definition of the stationary causal IRDF \overline{R}_{c}^{it}(D) previously introduced by Derpich and {\O}stergaard for two-sided Markovian stationary sources and show that \overline{R}_{c}^{it}(D) for a two-sided source …,x(-1),x(0),x(1),… equals R_{c}^{it}(D) for the associated one-sided source x(1), x(2),…. This implies that, for the Gaussian quadratic case, the practical zero-delay encoder-decoder pairs proposed by Derpich and {\O}stergaard for approaching R_{c}^{it}(D) achieve an operational data rate which exceeds R_{c}^{it}(D) by less than 1+0.5 \log_2(2 \pi e /12) \simeq 1.254 bits per sample.
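
The rate gap quoted at the end of the abstract is plain arithmetic and can be checked numerically. The snippet below is only a sanity check of that constant; the function name `rate_gap_bits` is a hypothetical label introduced here, not something from the paper.

```python
import math


def rate_gap_bits() -> float:
    """Evaluate 1 + 0.5 * log2(2*pi*e / 12), the gap stated in the abstract."""
    return 1.0 + 0.5 * math.log2(2.0 * math.pi * math.e / 12.0)


gap = rate_gap_bits()
# Prints the gap in bits per sample, to four decimal places.
print(f"{gap:.4f}")
```

The value agrees with the abstract's figure of roughly 1.254 bits per sample.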

Estimating Time-Varying Graphical Models

In this paper, we study time-varying graphical models based on data measured over a temporal grid. Such models are motivated by the need to describe and understand evolving interacting relationships among a set of random variables in many real applications, for instance the study of how stocks interact with each other and how such interactions change over time. We propose a new model, LOcal Group Graphical Lasso Estimation (loggle), under the assumption that the graph topology changes gradually over time. Specifically, loggle uses a novel local group-lasso type penalty to efficiently incorporate information from neighboring time points and to impose structural smoothness of the graphs. We implement an ADMM-based algorithm to fit the loggle model. This algorithm utilizes blockwise fast computation and pseudo-likelihood approximation to improve computational efficiency. An R package loggle has also been developed. We evaluate the performance of loggle by simulation experiments. We also apply loggle to S&P 500 stock price data and demonstrate that loggle is able to reveal the interacting relationships among stocks and among industrial sectors in a time period that covers the recent global financial crisis.

Bayesian Semi-Supervised Tensor Decomposition using Natural Gradients for Anomaly Detection

Anomaly detection has several important applications. In this paper, our focus is on detecting anomalies in seller-reviewer data using tensor decomposition. While tensor decomposition is mostly unsupervised, we formulate Bayesian semi-supervised tensor decomposition to take advantage of sparse labeled data. In addition, we use Polya-Gamma data augmentation for the semi-supervised Bayesian tensor decomposition. Finally, we show that the Polya-Gamma formulation simplifies calculation of the Fisher information matrix for partial natural gradient learning. Our experimental results show that our semi-supervised approach outperforms state-of-the-art unsupervised baselines, and that partial natural gradient learning outperforms stochastic gradient learning and Online-EM with sufficient statistics.

Structural causal models for macro-variables in time-series

We consider a bivariate time series (X_t,Y_t) that is given by a simple linear autoregressive model. Assuming that the equations describing each variable as a linear combination of past values are considered structural equations, there is a clear meaning of how intervening on one particular X_t influences Y_{t'} at later times t'>t. In the present work, we describe conditions under which one can define a causal model between variables that are coarse-grained in time, thus admitting statements like ‘setting X to x changes Y in a certain way’ without referring to specific time instances. We show that particularly simple statements follow in the frequency domain, thus providing meaning to interventions on frequencies.

Gotta Learn Fast: A New Benchmark for Generalization in RL

In this report, we present a new reinforcement learning (RL) benchmark based on the Sonic the Hedgehog (TM) video game franchise. This benchmark is intended to measure the performance of transfer learning and few-shot learning algorithms in the RL domain. We also present and evaluate some baseline algorithms on the new benchmark.

Natural Language Statistical Features of LSTM-generated Texts

Long Short-Term Memory (LSTM) networks have recently shown remarkable performance in several tasks dealing with natural language generation, such as image captioning or poetry composition. Yet, only a few works have analyzed text generated by LSTMs in order to quantitatively evaluate to what extent such artificial texts resemble those generated by humans. We compared the statistical structure of LSTM-generated language to that of written natural language, and to those produced by Markov models of various orders. In particular, we characterized the statistical structure of language by assessing word-frequency statistics, long-range correlations, and entropy measures. Our main finding is that while both LSTM and Markov-generated texts can exhibit features similar to real ones in their word-frequency statistics and entropy measures, LSTM-texts are shown to reproduce long-range correlations at scales comparable to those found in natural language. Moreover, for LSTM networks a temperature-like parameter controlling the generation process shows an optimal value, the one for which the produced texts are closest to real language, that is consistent across all the different statistical features investigated.
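
Two of the measures named above, word-frequency statistics and entropy, can be illustrated on a toy string. This is a minimal sketch under simplifying assumptions (whitespace tokenization, unigram Shannon entropy); it is not the paper's pipeline, and the helper names `word_frequencies` and `unigram_entropy_bits` are hypothetical.

```python
import math
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count lower-cased, whitespace-separated tokens (a deliberately crude tokenizer)."""
    return Counter(text.lower().split())


def unigram_entropy_bits(counts: Counter) -> float:
    """Shannon entropy, in bits, of the empirical word distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


sample = "the cat sat on the mat and the dog sat too"
counts = word_frequencies(sample)
ranked = counts.most_common()        # rank-frequency list, most frequent word first
entropy = unigram_entropy_bits(counts)

print(ranked[:3])
print(round(entropy, 3))
```

Applying the same two functions to a generated text and to a human-written one gives a first, coarse comparison; long-range correlations need a separate analysis that this sketch does not attempt.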

Sentiment Transfer using Seq2Seq Adversarial Autoencoders

Expression in language is subjective. Everyone has a different style of reading and writing, and it apparently all boils down to the way their mind understands things (in a specific format). Language style transfer is a way to preserve the meaning of a text while changing the way it is expressed. Progress in language style transfer has lagged behind other domains, such as computer vision, mainly because of the lack of parallel data, use cases, and reliable evaluation metrics. In response to the challenge of lacking parallel data, we explore learning style transfer from non-parallel data. We propose a model combining seq2seq, autoencoders, and adversarial loss to achieve this goal. The key idea behind the proposed models is to learn separate content representations and style representations using adversarial networks. To address the problem of evaluating style transfer tasks, we frame the problem as sentiment transfer and evaluate it using a sentiment classifier to calculate how many sentiments the model was able to transfer. We report our results on several kinds of models.

The Generalized Matrix Chain Algorithm

In this paper, we present a generalized version of the matrix chain algorithm to generate efficient code for linear algebra problems, a task for which human experts often invest days or even weeks of work. The standard matrix chain problem consists in finding the parenthesization of a matrix product M := A_1 A_2 \cdots A_n that minimizes the number of scalar operations. In practical applications, however, one frequently encounters more complicated expressions, involving transposition, inversion, and matrix properties. Indeed, the computation of such expressions relies on a set of computational kernels that offer functionality well beyond the simple matrix product. The challenge then shifts from finding an optimal parenthesization to finding an optimal mapping of the input expression to the available kernels. Furthermore, it is often the case that a solution based on the minimization of scalar operations does not result in the optimal solution in terms of execution time. In our experiments, the generated code outperforms other libraries and languages on average by a factor of about 9. The motivation for this work comes from the fact that, despite great advances in the development of compilers, the task of mapping linear algebra problems to optimized kernels is still to be done manually. In order to relieve the user from this complex task, new techniques for the compilation of linear algebra expressions have to be developed.
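
For reference, the standard matrix chain problem mentioned above has a textbook dynamic-programming solution. The sketch below covers only that classical version, counting scalar multiplications; it is not the generalized algorithm of the paper, and the function names and example dimensions are illustrative assumptions.

```python
import math


def matrix_chain_order(dims):
    """Classic O(n^3) dynamic program for the standard matrix chain problem.

    Matrix A_i has shape dims[i-1] x dims[i]. Returns (cost, split), where
    cost[i][j] is the minimum number of scalar multiplications needed for
    A_{i+1} ... A_{j+1} and split[i][j] is the best split position.
    """
    n = len(dims) - 1
    cost = [[0] * n for _ in range(n)]
    split = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):            # length of the sub-chain
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = math.inf
            for k in range(i, j):             # try every split point
                c = cost[i][k] + cost[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                if c < cost[i][j]:
                    cost[i][j] = c
                    split[i][j] = k
    return cost, split


def parenthesize(split, i, j):
    """Rebuild the optimal parenthesization as a string."""
    if i == j:
        return f"A{i + 1}"
    k = split[i][j]
    return f"({parenthesize(split, i, k)} {parenthesize(split, k + 1, j)})"


# Illustrative shapes: A1 is 10x30, A2 is 30x5, A3 is 5x60.
cost, split = matrix_chain_order([10, 30, 5, 60])
best_cost = cost[0][2]
best_order = parenthesize(split, 0, 2)
print(best_cost, best_order)  # 4500 ((A1 A2) A3)
```

With these shapes, multiplying A1 and A2 first costs 4500 scalar multiplications, against 27000 for the other order. As the abstract notes, a minimal operation count does not necessarily give the shortest execution time.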

Deep Learning for Digital Text Analytics: Sentiment Analysis

In today’s scenario, imagining a world without negativity is very unrealistic, as bad news spreads more virally than good news. Though it seems impractical in real life, this could be implemented by building a system that uses Machine Learning and Natural Language Processing techniques to identify news items with a negative shade and filter them out, passing only the news with a positive shade (good news) to the end user. In this work, around two lakh (200,000) news items have been used for training and testing with a combination of rule-based and data-driven approaches. VADER, along with a filtration method, has been used as an annotation tool, followed by a statistical Machine Learning approach that uses a Document Term Matrix (representation) and a Support Vector Machine (classification). Deep Learning algorithms (Doc2Vec) were then brought in to make the system reliable, ending with a Convolutional Neural Network (CNN) that yielded better results than the other modules tested. It reached a training accuracy of 96%, while a test accuracy above 85% was obtained on internal and external news data.

Multimodal Sparse Bayesian Dictionary Learning

The purpose of this paper is to address the problem of learning dictionaries for multimodal datasets, i.e. datasets collected from multiple data sources. We present an algorithm called multimodal sparse Bayesian dictionary learning (MSBDL). The MSBDL algorithm is able to leverage information from all available data modalities through a joint sparsity constraint on each modality’s sparse codes without restricting the coefficients themselves to be equal. Our framework offers a considerable amount of flexibility to practitioners and addresses many of the shortcomings of existing multimodal dictionary learning approaches. Unlike existing approaches, MSBDL allows the dictionaries for each data modality to have different cardinality. In addition, MSBDL can be used in numerous scenarios, from small datasets to extensive datasets with large dimensionality. MSBDL can also be used in supervised settings and allows for learning multimodal dictionaries concurrently with classifiers for each modality.

CoT: Cooperative Training for Generative Modeling

We propose Cooperative Training (CoT) for training generative models that measure a tractable density function for target data. CoT coordinately trains a generator G and an auxiliary predictive mediator M. The training target of M is to estimate a mixture density of the learned distribution G and the target distribution P, and that of G is to minimize the Jensen-Shannon divergence estimated through M. CoT achieves independent success without the necessity of pre-training via Maximum Likelihood Estimation or involving high-variance algorithms like REINFORCE. This low-variance algorithm is theoretically proved to be unbiased for both generative and predictive tasks. We also theoretically and empirically show the superiority of CoT over most previous algorithms, in terms of generative quality and diversity, predictive generalization ability and computational cost.
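
The Jensen-Shannon divergence that G minimizes is defined through a mixture of the two distributions, which is the quantity the mediator M is said to estimate. The sketch below computes the standard JSD for small discrete distributions, assuming the usual equal-weight mixture; it says nothing about how CoT estimates or optimizes it, and the function names are hypothetical.

```python
import math


def kl_bits(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence_bits(p, g):
    """Jensen-Shannon divergence between discrete distributions p and g.

    m is the equal-weight mixture (p + g) / 2, so m_i > 0 wherever p_i or
    g_i is positive and both KL terms are finite.
    """
    m = [(a + b) / 2 for a, b in zip(p, g)]
    return 0.5 * kl_bits(p, m) + 0.5 * kl_bits(g, m)


target = [0.5, 0.5, 0.0]   # stand-in for the target distribution P
learned = [0.4, 0.4, 0.2]  # stand-in for the learned distribution G
jsd = js_divergence_bits(target, learned)
print(round(jsd, 4))
```

In bits, the divergence is 0 for identical distributions and 1 for distributions with disjoint support, which gives a bounded and symmetric training signal.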

Deep Learning For Computer Vision Tasks: A review

Deep learning has recently become one of the most popular sub-fields of machine learning owing to its distributed data representation with multiple levels of abstraction. A diverse range of deep learning algorithms are being employed to solve conventional artificial intelligence problems. This paper gives an overview of some of the most widely used deep learning algorithms applied in the field of computer vision. It first inspects the various approaches of deep learning algorithms, followed by a description of their applications in image classification, object identification, image extraction and semantic segmentation in the presence of noise. The paper concludes with the discussion of the future scope and challenges for construction and training of deep neural networks.

Detail-Preserving Pooling in Deep Networks

Most convolutional neural networks use some method for gradually downscaling the size of the hidden layers. This is commonly referred to as pooling, and is applied to reduce the number of parameters, improve invariance to certain distortions, and increase the receptive field size. Since pooling by nature is a lossy process, it is crucial that each such layer maintains the portion of the activations that is most important for the network’s discriminability. Yet, simple maximization or averaging over blocks, max or average pooling, or plain downsampling in the form of strided convolutions are the standard. In this paper, we aim to leverage recent results on image downscaling for the purposes of deep learning. Inspired by the human visual system, which focuses on local spatial changes, we propose detail-preserving pooling (DPP), an adaptive pooling method that magnifies spatial changes and preserves important structural detail. Importantly, its parameters can be learned jointly with the rest of the network. We analyze some of its theoretical properties and show its empirical benefits on several datasets and networks, where DPP consistently outperforms previous pooling approaches.
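
The baseline operations named above, max pooling and average pooling over blocks, can be written in a few lines. This is a plain-Python sketch of those two baselines only, assuming non-overlapping square blocks; it does not implement DPP, and `pool2d` and `mean` are hypothetical helpers.

```python
def pool2d(grid, size, reduce_fn):
    """Reduce each non-overlapping size x size block of a 2D list with reduce_fn."""
    rows, cols = len(grid), len(grid[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            block = [grid[r + dr][c + dc] for dr in range(size) for dc in range(size)]
            pooled_row.append(reduce_fn(block))
        pooled.append(pooled_row)
    return pooled


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


activations = [
    [1, 2, 0, 0],
    [3, 4, 0, 8],
    [5, 5, 1, 1],
    [5, 5, 1, 1],
]
max_pooled = pool2d(activations, 2, max)   # [[4, 8], [5, 1]]
avg_pooled = pool2d(activations, 2, mean)  # [[2.5, 2.0], [5.0, 1.0]]
print(max_pooled, avg_pooled)
```

In the top-right block, max pooling keeps the isolated value 8 while average pooling dilutes it to 2.0, which shows how the choice of reduction decides which detail survives downscaling.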

The Conversation: Deep Audio-Visual Speech Enhancement

Our goal is to isolate individual speakers from multi-talker simultaneous speech in videos. Existing works in this area have focussed on trying to separate utterances from known speakers in controlled environments. In this paper, we propose a deep audio-visual speech enhancement network that is able to separate a speaker’s voice given lip regions in the corresponding video, by predicting both the magnitude and the phase of the target signal. The method is applicable to speakers unheard and unseen during training, and for unconstrained environments. We demonstrate strong quantitative and qualitative results, isolating extremely challenging real-world examples.

A new transformative framework for data assimilation and calibration of physical ionosphere-thermosphere models
Open or Closed? Information Flow Decided by Transfer Operators and Forecastability Quality Metric
A Tamper-Free Semi-Universal Communication System for Deletion Channels
Non-Malleable Extractors and Non-Malleable Codes: Partially Optimal Constructions
Dynamic Sensor Subset Selection for Centralized Tracking a Time-Varying Stochastic Process
An Estimation of Favorite Value in Emotion Generating Calculation by Fuzzy Petri Net
Local Analysis of Loewner Equation
Characterization of the Pareto social choice correspondence
An information-theoretic, all-scales approach to comparing networks
Moment Inequalities in the Context of Simulated and Predicted Variables
Semi-supervised Adversarial Learning to Generate Photorealistic Face Images of New Identities from 3D Morphable Model
Counting graded lattices of rank three that have few coatoms
French Word Recognition through a Quick Survey on Recurrent Neural Networks Using Long-Short Term Memory RNN-LSTM
On the centrosymmetric permutations in a class
Jensen-type geometric shapes
Unsupervised and semi-supervised learning with Categorical Generative Adversarial Networks assisted by Wasserstein distance for dermoscopy image Classification
Catalan functions and $k$-Schur positivity
Finite-time scaling in local bifurcations
Report on the 7th International Workshop on Bibliometric-enhanced Information Retrieval (BIR 2018)
Model-based Quantile Regression for Discrete Data
Graph Matching with Anchor Nodes: A Learning Approach
Optimal pebbling number of graphs with given minimum degree
Dependence of exponents on text length versus finite-size scaling for word-frequency distributions
On the upper bound for the mathematical expectation of the norm of a vector uniformly distributed on the sphere and the phenomenon of concentration of uniform measure on the sphere
Tensor Robust Principal Component Analysis with A New Tensor Nuclear Norm
Conjectured lower bound for the clique number of a graph
Metastability of the contact process on Erdös-Rényi and configuration model graphs
Universal Successor Representations for Transfer Reinforcement Learning
Manipulation-resistant facility location mechanisms for ZV-line graphs
Derivative free optimization via repeated classification
Constraint Splitting and Projection Methods for Optimal Control of Double Integrator
Maximizing Hamming Distance in Contraction of Permutation Arrays
Smart Soft-RAN for 5G: Dynamic Resource Management in CoMP-NOMA Based Systems
Spatial regularity of semigroups generated by Lévy type operators
Nonlinear 3D Face Morphable Model
Multi-Scale Generalized Plane Match for Optical Flow
Differentially Private Confidence Intervals for Empirical Risk Minimization
Dynamic Multivariate Functional Data Modeling via Sparse Subspace Learning
Achieving Fluency and Coherency in Task-oriented Dialog
Decoupled Novel Object Captioner
A Blackbox Polynomial System Solver on Parallel Shared Memory Computers
Non-existence of perfect binary sequences
Efficient (nonrandom) construction and decoding of non-adaptive group testing
ExFuse: Enhancing Feature Fusion for Semantic Segmentation
Enumerating All Subgraphs without Forbidden Induced Subgraphs via Multivalued Decision Diagrams
Optimal Scalar Linear Index Codes for Three Classes of Two-Sender Unicast Index Coding Problem
Reference-less Measure of Faithfulness for Grammatical Error Correction
AFA-PredNet: The action modulation within predictive coding
Unsupervised Pathology Image Segmentation Using Representation Learning with Spherical K-means
Unsupervised Segmentation of 3D Medical Images Based on Clustering and Deep Representation Learning
Don’t cry to be the first! Symmetric fair division algorithms exist
Generating Clues for Gender based Occupation De-biasing in Text
OLCPM: An Online Framework for Detecting Overlapping Communities in Dynamic Social Networks
About classical solutions of the path-dependent heat equation
Attention Cropping: A Novel Data Augmentation Method for Real-world Plant Species Identification
Limit theorems for multivariate Bessel processes in the freezing regime
Combinatorics of explicit substitutions
MaskReID: A Mask Based Deep Ranking Neural Network for Person Re-identification
Hybrid Binary Networks: Optimizing for Accuracy, Efficiency and Memory
Weighted proper orientations of trees and graphs of bounded treewidth
Mobile Device Synchronisation with Central Database based on Data Relevance
Clustering Strategies for Multicast Precoding in Multi-Beam Satellite Systems
Large Scale Low Power Computing System – Status of Network Design in ExaNeSt and EuroExa Projects
Computing Shapley values in the plane
Multi-dimensional Optimal Trade Execution under Stochastic Resilience
Hybrid-IoT: Hybrid Blockchain Architecture for Internet of Things – PoW Sub-blockchains
Plaque Classification in Coronary Arteries from IVOCT Images Using Convolutional Neural Networks and Transfer Learning
Fusing Saliency Maps with Region Proposals for Unsupervised Object Localization
Discovering the Elite Hypervolume by Leveraging Interspecies Correlation
Von Neumann regularity, split epicness and elementary cellular automata
Generating Multilingual Parallel Corpus Using Subtitles
Weighted Poincaré inequalities, concentration inequalities and tail bounds related to the behavior of the Stein kernel in dimension one
A synopsis of comparative metrics for classifications
Cooperative Energy Efficient Power Allocation Algorithm for downlink massive MIMO
Offline Object Extraction from Dynamic Occupancy Grid Map Sequences
Measurement of exceptional motion in VR video contents for VR sickness assessment using deep convolutional autoencoder
A Test for Multivariate Location Parameter in Elliptical Model based on Forward Search Method
VR IQA NET: Deep Virtual Reality Image Quality Assessment using Adversarial Learning
Some Applications of $S$-restricted Set Partitions
A PTAS for Euclidean TSP with Hyperplane Neighborhoods
A Variable Neighborhood Search for Flying Sidekick Traveling Salesman Problem
Projection image-to-image translation in hybrid X-ray/MR imaging
Interdependent Gibbs Samplers
Making Deep Heatmaps Robust to Partial Occlusions for 3D Object Pose Estimation
Motor Unit Number Estimation via Sequential Monte Carlo
Incremental Predictive Process Monitoring: How to Deal with the Variability of Real Environments
A Variant of The Corners Theorem
Reasoning about Safety of Learning-Enabled Components in Autonomous Cyber-physical Systems
Every planar graph without adjacent cycles of length at most $8$ is $3$-choosable
Edge-based LBP description of surfaces with colorimetric patterns
Experimental similarity assessment for a collection of fragmented artifacts
Emergent Communication through Negotiation
Compressive Regularized Discriminant Analysis of High-Dimensional Data with Applications to Microarray Studies
Emergence of Linguistic Communication from Referential Games with Symbolic and Pixel Input
Stability results on vertex Turán problems in Kneser graphs
Spectral density and calculation of free energy
Attention U-Net: Learning Where to Look for the Pancreas
DORA The Explorer: Directed Outreaching Reinforcement Action-Selection
Bipartitioning Problems on Graphs with Bounded Tree-Width
Dynamic Multi-Scale Semantic Segmentation based on Dilated Convolutional Networks
Rapid mixing of Glauber dynamics for colorings below Vigoda’s $11/6$ threshold
Ergodic properties of quasi-Markovian generalized Langevin equations with configuration dependent noise
Flexible and Scalable Deep Learning with MMLSpark
Maximum likelihood estimation in hidden Markov models with inhomogeneous noise
Fully Dynamic Effective Resistances
Cost-Aware Learning and Optimization for Opportunistic Spectrum Access
On Geodesically Convex Formulations for the Brascamp-Lieb Constant
Weak convergence rates of splitting schemes for the stochastic Allen-Cahn equation
Learning to Extract a Video Sequence from a Single Motion-Blurred Image
Seed-Point Detection of Clumped Convex Objects by Short-Range Attractive Long-Range Repulsive Particle Clustering
A simple random matrix model for the vibrational spectrum of jammed packings
Subexponential-time Algorithms for Maximum Independent Set in $P_t$-free and Broom-free Graphs
Ranking Generative Adversarial Networks: Subjective Control over Semantic Image Attributes
Multi-Task Learning for Argumentation Mining
Mean and median bias reduction in generalized linear models
SHAPED: Shared-Private Encoder-Decoder for Text Style Adaptation
Predicting Twitter User Socioeconomic Attributes with Network and Language Information
End-to-end Deep Learning of Optical Fiber Communications
Fast Feasible and Unfeasible Matrix Multiplication
Stochastic Comparison of Parallel Systems with Log-Lindley Distributed Components under Random Shocks
LAN property for stochastic differential equations driven by fractional Brownian motion of Hurst parameter $H\in(1/4,1/2)$
Beamformed Fingerprint Learning for Accurate Millimeter Wave Positioning
Moments of random multiplicative functions, II: High moments
Personalized Dynamics Models for Adaptive Assistive Navigation Interfaces
Universality of high-dimensional spanning forests and sandpiles