Neural Networks Regularization Through Representation Learning

Neural network models, and deep models in particular, are among the leading state-of-the-art models in machine learning. The most successful deep neural models are those with many layers, which greatly increases their number of parameters. Training such models requires a large number of training samples, which are not always available. One of the fundamental issues in neural networks is overfitting, which is the issue tackled in this thesis. This problem often occurs when large models are trained using few training samples. Many approaches have been proposed to prevent the network from overfitting and to improve its generalization performance, such as data augmentation, early stopping, parameter sharing, unsupervised learning, dropout, and batch normalization. In this thesis, we tackle the neural network overfitting issue from a representation learning perspective by considering the situation where few training samples are available, which is the case in many real-world applications. We propose three contributions. The first, presented in chapter 2, deals with structured output problems, performing multivariate regression when the output variable y contains structural dependencies between its components. The second contribution, described in chapter 3, deals with the classification task, where we propose to exploit prior knowledge about the internal representation of the hidden layers in neural networks. Our last contribution, presented in chapter 4, shows the value of transfer learning in applications where only few samples are available. In this contribution, we provide an automatic system based on this learning scheme with an application to the medical domain. In this application, the task consists of localizing the third lumbar vertebra in a 3D CT scan. This work was done in collaboration with the Rouen Henri Becquerel Center clinic, which provided us with the data.

Generative Adversarial Privacy

We present a data-driven framework called generative adversarial privacy (GAP). Inspired by recent advancements in generative adversarial networks (GANs), GAP allows the data holder to learn the privatization mechanism directly from the data. Under GAP, finding the optimal privacy mechanism is formulated as a constrained minimax game between a privatizer and an adversary. We show that for appropriately chosen adversarial loss functions, GAP provides privacy guarantees against strong information-theoretic adversaries. We also evaluate the performance of GAP on multi-dimensional Gaussian mixture models and the GENKI face database.

Learning Causal Hazard Ratio with Endogeneity

Cox’s proportional hazards model is one of the most popular statistical models to evaluate associations of a binary exposure with a censored failure time outcome. When confounding factors are not fully observed, the exposure hazard ratio estimated using a Cox model is not causally interpretable. To address this, we propose novel approaches for identification and estimation of the causal hazard ratio in the presence of unmeasured confounding factors. Our approaches are based on a binary instrumental variable and an additional no-interaction assumption. We derive, to the best of our knowledge, the first consistent estimator of the population marginal causal hazard ratio within an instrumental variable framework. Our estimator admits a closed-form representation, and hence avoids the drawbacks of estimating equation based estimators. Our approach is illustrated via simulation studies and a data analysis.

LeFlow: Enabling Flexible FPGA High-Level Synthesis of Tensorflow Deep Neural Networks

Recent work has shown that Field-Programmable Gate Arrays (FPGAs) play an important role in the acceleration of Machine Learning applications. Initial specifications of machine learning applications are often written using a high-level Python-oriented framework such as Tensorflow, followed by a manual translation to either C or RTL for synthesis using vendor tools. This manual translation step is time-consuming and requires expertise, which limits the applicability of FPGAs in this important domain. In this paper, we present an open-source tool-flow that maps numerical computation models written in Tensorflow to synthesizable hardware. Unlike other tools, which are often constrained by a small number of inflexible templates, our flow uses Google’s XLA compiler, which emits LLVM code directly from a Tensorflow specification. This LLVM code can then be used with a high-level synthesis tool to automatically generate hardware. We show that our flow allows users to generate Deep Neural Networks with very few lines of Python code.

How Humans versus Bots React to Deceptive and Trusted News Sources: A Case Study of Active Users

Society’s reliance on social media as a primary source of news has spawned a renewed focus on the spread of misinformation. In this work, we identify the differences in how social media accounts identified as bots react to news sources of varying credibility, regardless of the veracity of the content those sources have shared. We analyze bot and human responses annotated using a fine-grained model that labels responses as being an answer, appreciation, agreement, disagreement, an elaboration, humor, or a negative reaction. We present key findings of our analysis into the prevalence of bots, the variety and speed of bot and human reactions, and the disparity in authorship of reaction tweets between these two sub-populations. We observe that bots are responsible for 9-15% of the reactions to sources of any given type but comprise only 7-10% of accounts responsible for reaction-tweets; trusted news sources have the highest proportion of humans who reacted; bots respond with significantly shorter delays than humans when posting answer-reactions in response to sources identified as propaganda. Finally, we report significantly different inequality levels in reaction rates for accounts identified as bots vs not.

Adversarially Learned Mixture Model

The Adversarially Learned Mixture Model (AMM) is a generative model for unsupervised or semi-supervised data clustering. The AMM is the first adversarially optimized method to model the conditional dependence between inferred continuous and categorical latent variables. Experiments on the MNIST and SVHN datasets show that the AMM allows for semantic separation of complex data when little or no labeled data is available. The AMM achieves a state-of-the-art unsupervised clustering error rate of 2.86% on the MNIST dataset. A semi-supervised extension of the AMM yields competitive results on the SVHN dataset.

ML-Schema: Exposing the Semantics of Machine Learning with Schemas and Ontologies

The ML-Schema, proposed by the W3C Machine Learning Schema Community Group, is a top-level ontology that provides a set of classes, properties, and restrictions for representing and interchanging information on machine learning algorithms, datasets, and experiments. It can be easily extended and specialized, and it is also mapped to other more domain-specific ontologies developed in the area of machine learning and data mining. In this paper we give an overview of existing state-of-the-art machine learning interchange formats and present the first release of ML-Schema, a canonical format resulting from more than seven years of experience among different research institutions. We argue that exposing the semantics of machine learning algorithms, models, and experiments through a canonical format may pave the way to better interpretability and to realistically achieving full interoperability of experiments regardless of platform or adopted workflow solution.

Beyond Data and Model Parallelism for Deep Neural Networks

The computational requirements for training deep neural networks (DNNs) have grown to the point that it is now standard practice to parallelize training. Existing deep learning systems commonly use data or model parallelism, but unfortunately, these strategies often result in suboptimal parallelization performance. In this paper, we define a more comprehensive search space of parallelization strategies for DNNs called SOAP, which includes strategies to parallelize a DNN in the Sample, Operation, Attribute, and Parameter dimensions. We also propose FlexFlow, a deep learning framework that uses guided randomized search of the SOAP space to find a fast parallelization strategy for a specific parallel machine. To accelerate this search, FlexFlow introduces a novel execution simulator that can accurately predict a parallelization strategy’s performance and is three orders of magnitude faster than prior approaches that have to execute each strategy. We evaluate FlexFlow with six real-world DNN benchmarks on two GPU clusters and show that FlexFlow can increase training throughput by up to 3.8x over state-of-the-art approaches, even when including its search time, and also improves scalability.

Hierarchical Reinforcement Learning Framework towards Multi-agent Navigation

This paper proposes a navigation algorithm for multi-agent dynamic environments. The algorithm is expressed as a hierarchical framework that contains a Hidden Markov Model (HMM) and Deep Reinforcement Learning (DRL). For simplicity, we term our method the Hierarchical Navigation Reinforcement Network (HNRN). In the high-level architecture, we train an HMM to evaluate the agent’s environment in order to obtain a score. According to this score, an adaptive control action is chosen. In the low-level architecture, two sub-systems are introduced: one is a differential target-driven system, which aims at heading to the target; the other is a collision avoidance DRL system, which is used for avoiding obstacles in the dynamic environment. The advantage of this hierarchical system is that it decouples the target-driven and collision avoidance tasks, leading to a model that is faster and easier to train. Experiments show that our algorithm learns faster and achieves a higher success rate than traditional Velocity Obstacle (VO) algorithms and the hybrid DRL method.

Boosting Combinatorial Problem Modeling with Machine Learning

In the past few years, the area of Machine Learning (ML) has witnessed tremendous advancements, becoming a pervasive technology in a wide range of applications. One area that can significantly benefit from the use of ML is Combinatorial Optimization. The three pillars of constraint satisfaction and optimization problem solving, i.e., modeling, search, and optimization, can exploit ML techniques to boost their accuracy, efficiency and effectiveness. In this survey we focus on the modeling component, whose effectiveness is crucial for solving the problem. The modeling activity has traditionally been shaped by optimization and domain experts, interacting to provide realistic results. Machine Learning techniques can tremendously ease the process, and exploit the available data to either create models or refine expert-designed ones. In this survey we cover approaches that have recently been proposed to enhance the modeling process by learning single constraints, objective functions, or the whole model. We highlight themes common to multiple approaches and draw connections with related fields of research.

Concept-Based Embeddings for Natural Language Processing

In this work, we focus on effectively leveraging and integrating information at the concept level as well as the word level by projecting concepts and words into a lower-dimensional space while retaining the most critical semantics. In the broad context of an opinion understanding system, we investigate the use of the fused embedding for several core NLP tasks: named entity detection and classification, automatic speech recognition reranking, and targeted sentiment analysis.

A Survey on Expert Recommendation in Community Question Answering

Community question answering (CQA) represents the type of Web application where people can exchange knowledge by asking and answering questions. One significant challenge of most real-world CQA systems is the lack of effective matching between questions and potential good answerers, which adversely affects efficient knowledge acquisition and circulation. On the one hand, a requester might receive many low-quality answers without getting a quality response in a brief time; on the other hand, an answerer might face numerous new questions without being able to quickly identify the questions of interest. In this situation, expert recommendation emerges as a promising technique to address the above issues. Instead of passively waiting for users to browse and find their questions of interest, an expert recommendation method actively and promptly draws users’ attention to the appropriate questions. The past few years have witnessed considerable efforts that address the expert recommendation problem from different perspectives. These methods all have issues that need to be resolved before the advantages of expert recommendation can be fully embraced. In this survey, we first present an overview of the research efforts and state-of-the-art techniques for expert recommendation in CQA. We next summarize and compare the existing methods with respect to their advantages and shortcomings, followed by a discussion of open issues and future research directions.

Semantic Search by Latent Ontological Features

Both named entities and keywords are important in defining the content of a text in which they occur. In particular, people often use named entities in information search. However, named entities have ontological features, namely, their aliases, classes, and identifiers, which are hidden from their textual appearance. We propose ontology-based extensions of the traditional Vector Space Model that explore different combinations of those latent ontological features with keywords for text retrieval. Our experiments on benchmark datasets show better search quality of the proposed models as compared to the purely keyword-based model, and their advantages for both text retrieval and representation of documents and queries.
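
As a rough illustration of the idea in this abstract, the sketch below places keywords and named-entity features in one vector space and ranks documents by cosine similarity. It is not the authors' model: the `kw:`/`ne:` prefixes, the raw-count weighting, and the toy documents are all assumptions made for the example.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def to_vector(keywords, entity_features):
    """Build one sparse vector over keywords plus entity features.

    Raw counts stand in for a real weighting scheme (an assumption);
    the prefixes keep keyword and entity-feature dimensions apart.
    """
    terms = [f"kw:{k}" for k in keywords]
    terms += [f"ne:{f}" for f in entity_features]
    return dict(Counter(terms))


# Hypothetical toy data: the entity class never appears in the text itself,
# so a purely keyword-based model could not tell these documents apart.
query = to_vector(["population"], ["class:City"])
docs = {
    "doc_a": to_vector(["population", "growth"], ["class:City", "id:Paris"]),
    "doc_b": to_vector(["population", "growth"], ["class:Country", "id:France"]),
}

# Rank documents by similarity to the query, best first.
ranked = sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)
```

Both toy documents share the same keywords, so any difference in their scores comes only from the entity-class dimension, which is the kind of latent feature the abstract describes.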

Discovering Latent Concepts and Exploiting Ontological Features for Semantic Text Search

Named entities and WordNet words are important in defining the content of a text in which they occur. Named entities have ontological features, namely, their aliases, classes, and identifiers. WordNet words also have ontological features, namely, their synonyms, hypernyms, hyponyms, and senses. Those features of concepts may be hidden from their textual appearance. Besides, there are related concepts that do not appear in a query but can bring out the meaning of the query if they are added. The traditional constrained spreading activation algorithms use all relations of a node in the network, which adds unsuitable information to the query. In contrast, we use only the relations represented in the query. We propose an ontology-based generalized Vector Space Model for semantic text search. It discovers relevant latent concepts in a query by relation-constrained spreading activation. Besides, to represent a word having more than one possible direct sense, it combines the most specific common hypernym of the remaining undisambiguated senses with the form of the word. Experiments on a benchmark dataset show that, in terms of the MAP measure of retrieval performance, our model is 41.9% and 29.3% better than the purely keyword-based model and the traditional constrained spreading activation model, respectively.
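
For readers unfamiliar with the MAP measure mentioned above, here is a minimal sketch of how mean average precision is usually computed from ranked result lists. The function names and toy rankings are illustrative assumptions and are not taken from the paper; the sketch also assumes each ranking contains no duplicate document ids.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked list against a set of relevant ids."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at the rank of each relevant document retrieved.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_ids, relevant_ids) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical rankings for two queries.
runs = [
    (["d1", "d2", "d3"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d2", "d1"], {"d1"}),              # AP = 1/2
]
map_score = mean_average_precision(runs)
```

A relative improvement such as the 41.9% reported in the abstract would then compare two such MAP scores obtained on the same set of queries.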

Ontology-Based Query Expansion with Latently Related Named Entities for Semantic Text Search

Traditional information retrieval systems represent documents and queries by keyword sets. However, the content of a document or a query is mainly defined by both the keywords and the named entities occurring in it. Named entities have ontological features, namely, their aliases, classes, and identifiers, which are hidden from their textual appearance. Besides, the meaning of a query may imply latent named entities that are related to the apparent ones in the query. We propose an ontology-based generalized vector space model for semantic text search. It exploits the ontological features of named entities and their latently related ones to reveal the semantics of documents and queries. We also propose a framework for combining different ontologies to exploit their complementary advantages for semantic annotation and searching. Experiments on a benchmark dataset show that our model achieves better search quality than the other models.

Lesion Analysis and Diagnosis with Mask-RCNN
Neural Chinese Word Segmentation with Dictionary Knowledge
A Fast-Converged Acoustic Modeling for Korean Speech Recognition: A Preliminary Study on Time Delay Neural Network
Why don’t the modules dominate – Investigating the Structure of a Well-Known Modularity-Inducing Problem Domain
A deep learning architecture to detect events in EEG signals during sleep
Wireless Energy Transmission Channel Modeling in Resonant Beam Charging for IoT Devices
Teaching Telecommunication Standards – Bridging the Gap Between Theory and Practice
Faster than Nyquist Transmission by Non-Orthogonal Time Division Multiplexing of Nyquist Sinc Sequences
Sequential Sampling for Optimal Bayesian Classification of Sequencing Count Data
Poster Abstract: Hierarchical Subchannel Allocation for Mode-3 Vehicle-to-Vehicle Sidelink Communications
Regularity properties of the solution to a stochastic heat equation driven by a fractional Gaussian noise on ${\mathbb{S}}^2$
Half-Duplex and Full-Duplex AF and DF Relaying with Energy-Harvesting in Log-Normal Fading
Towards Modeling the Interaction of Spatial-Associative Neural Network Representations for Multisensory Perception
Super edge-connectivity and matching preclusion of data center networks
Forecasting market states
Optimal designs for frequentist model averaging
irbasis: Open-source database and software for intermediate-representation basis functions of imaginary-time Green’s function
Approximation Algorithms for Clustering via Weighted Impurity Measures
Extending the D-Wave with support for Higher Precision Coefficients
Performance of Humans in Iris Recognition: The Impact of Iris Condition and Annotation-driven Verification
Derangements, Ehrhart Theory, and Local h-polynomials
Channel Charting: Locating Users within the Radio Environment using Channel State Information
Domain-Specific Human-Inspired Binarized Statistical Image Features for Iris Recognition
A one-variable bracket polynomial for some Turk’s head knots
QR2: A Third-party Query Reranking Service Over Web Databases
Stabilization and control for the biharmonic Schrödinger equation
Sparse semiparametric canonical correlation analysis for data of mixed types
When Are Two Gossips the Same Types of Communication in Epistemic Gossip Protocols
Survey on Deep Learning Techniques for Person Re-Identification Task
Transfer Learning for High-Precision Trajectory Tracking Through $\mathcal{L}_1$ Adaptive Feedback and Iterative Learning
Adaptive Model Predictive Control for High-Accuracy Trajectory Tracking in Changing Conditions
Markets Beyond Nash Welfare for Leontief Utilities
Weight distributions, zeta functions and Riemann hypothesis for linear and algebraic geometry codes
On Lusztig-Dupont homology of flag complexes
How Do Classifiers Induce Agents To Invest Effort Strategically
TabulaROSA: Tabular Operating System Architecture for Massively Parallel Heterogeneous Compute Engines
A matching-based heuristic algorithm for school bus routing problems
Gamma Spaces and Information
Token Sliding on Split Graphs
Generating Synthetic Data for Neural Keyword-to-Question Models
Fully Distributed Event-Triggered Protocols for Linear Multi-Agent Networks
On the Acceleration of L-BFGS with Second-Order Information and Stochastic Batches
Improving Photoplethysmographic Measurements under Motion Artifacts using Artificial Neural Network for Personal Healthcare
Real-Time Shape Tracking of Facial Landmarks
Another Approach to Consensus of Multi-agents
Generalization in quasi-periodic environments
Quantitative analysis of finite-difference approximations of free-discontinuity problems
Smart Grid Monitoring Using Power Line Modems: Anomaly Detection and Localization
Counting Integral Points in Polytopes via Numerical Analysis of Contour Integration
Recurrent Stacking of Layers for Compact Neural Machine Translation Models
Investigating Order Effects in Multidimensional Relevance Judgment using Query Logs
A Simple and Space Efficient Segment Tree Implementation
Characterizing Cryptocurrency market with Levy’s stable distributions
Non-local RoIs for Instance Segmentation
Energy-Based Control of Nonlinear Infinite-Dimensional Port-Hamiltonian Systems with Dissipation
A $c/μ$-Rule for Service Resource Allocation in Group-Server Queues
Adaptive Hierarchical Sensing for the Efficient Sampling of Sparse and Compressible Signals
Matching and MIS for Uniformly Sparse Graphs in the Low-Memory MPC Model
Frameworks with coordinated edge motions
SAT encodings for sorting networks, single-exception sorting networks and $ε$-halvers
3D Hand Pose Estimation using Simulation and Partial-Supervision with a Shared Latent Space
Classically Time-Controlled Quantum Automata
A conjugate gradient-based algorithm for large-scale quadratic programming problem with one quadratic constraint
3D human pose estimation from depth maps using a deep combination of poses
A survey on zeros of random holomorphic sections
Geometric ergodicity of the bouncy particle sampler
The conditional permutation test
Wireless Vital Signs Monitoring System using Visible Light Sensing (VLS)
About the lower bounds for the multiple testing problem
Sparse Relaxed Regularized Regression: SR3
ViLDAR – Visible Light Sensing Based Speed Estimation using Vehicle’s Headlamps
The Schröder case of the generalized Delta conjecture
Symmetric exclusion as a random environment: invariance principle
Stochastic Stability in Schelling’s Segregation Model with Markovian Asynchronous Update
Hölder continuity for the Parabolic Anderson Model with space-time homogeneous Gaussian noise
Piecewise Deterministic Markov Processes and their invariant measure
A Geo-Aware Server Assignment Problem for Mobile Edge Computing
The Inverse First Passage Time Problem for killed Brownian motion
Specular-to-Diffuse Translation for Multi-View Reconstruction
Exact Algorithms and Lower Bounds for Stable Instances of Euclidean k-Means
On the Identifiability of Finite Mixtures of Finite Product Measures
Multi-time-horizon Solar Forecasting Using Recurrent Neural Network
Phase Transitions for Optimality Gaps in Optimal Power Flows: A Study on the French Transmission Network
Sparse sum-of-squares (SOS) optimization: A bridge between DSOS/SDSOS and SOS optimization for sparse polynomials
Tractable Querying and Learning in Hybrid Domains via Sum-Product Networks
An improved dimensional threshold for the angle problem
Non-separable Nearest-Neighbor Gaussian Process Model for Antarctic Surface Mass Balance and Ice Core Site Selection
Idempotent means on free binary systems do not exist
Codes with hierarchical locality from covering maps of curves
Strong factorization and the braid arrangement fan
Nearly Optimal Pricing Algorithms for Production Constrained and Laminar Bayesian Selection
A salt and pepper noise image denoising method based on the generative classification
Near Real-time Hippocampus Segmentation Using Patch-based Canonical Neural Network
Punishment and inspection for governing the commons in a feedback-evolving game
The Globally Optimal Reparameterization Algorithm: an Alternative to Fast Dynamic Time Warping for Action Recognition in Video Sequences
Semi-Supervised Feature Learning for Off-Line Writer Identifications
Deep neural network ensemble by data augmentation and bagging for skin lesion classification
Fast and Robust High-Dimensional Sparse Representation Recovery Using Generalized SL0
Scalable Incremental Nonconvex Optimization Approach for Phase Retrieval from Minimal Measurements
More powerful logrank permutation tests for two-sample survival data
Diffeomorphic density registration
Multi-objective Non-cooperative Game Model for Cost-based Task Scheduling in Computational Grid
Adaptive Dimension Reduction to Accelerate Infinite-Dimensional Geometric Markov Chain Monte Carlo
Object Detection with Deep Learning: A Review
The Temporary Exchange Problem
Magnitude Bounded Matrix Factorisation for Recommender Systems
Syllabification by Phone Categorization
Deep Clustering for Unsupervised Learning of Visual Features
A Tight Upper Bound on Bit Error Rate of Joint OFDM and Multi-Carrier Index Keying
Learning Probabilistic Logic Programs in Continuous Domains
Online Submodular Maximization: Beating 1/2 Made Simple
Deterministic (1/2 + ε)-Approximation for Submodular Maximization over a Matroid
Modeling and Trade-off for Mobile Communication, Computing and Caching Networks
The method of codifferential descent for convex and global piecewise affine optimization
Maximizing Ergodic Throughput in Wireless Powered Communication Networks
$(2P_2,K_4)$-Free Graphs are 4-Colorable
Convergence of the Quantile Admission Process with Veto Power
On the Fundamental Limits of MIMO Massive Multiple Access Channels
A new lower bound for classic online bin packing
DeepInf: Social Influence Prediction with Deep Learning
Spatio-Temporal Structured Sparse Regression with Hierarchical Gaussian Process Priors
A unifying theory of exactness of linear penalty functions II: parametric penalty functions
Understanding the Twitter Usage of Humanities and Social Sciences Academic Journals
Burkholder-Davis-Gundy inequalities in UMD Banach spaces
WordNet-Based Information Retrieval Using Common Hypernyms and Combined Features
The Trailer of Blockchain Governance Game
On Zeroes of Random Polynomials and Applications to Unwinding