What's new on arXiv

Correction of AI systems by linear discriminants: Probabilistic foundations

Artificial Intelligence (AI) systems make errors, and will continue to make them from time to time. These errors are usually unexpected and can lead to dramatic consequences. The intensive development of AI and its practical applications makes the problem of errors more important. Total re-engineering of the systems can create new errors and is not always possible due to the resources involved. The important challenge is to develop fast methods to correct errors without damaging existing skills. We formulate the technical requirements for ‘ideal’ correctors. Such correctors include binary classifiers, which separate the situations with a high risk of error from the situations where the AI system works properly. Surprisingly, for essentially high-dimensional data such methods are possible: a simple linear Fisher discriminant can separate the situations with errors from correctly solved tasks, even for exponentially large samples. The paper presents the probabilistic basis for fast, non-destructive correction of AI systems. A series of new stochastic separation theorems is proven. These theorems provide new instruments for fast, non-iterative correction of errors in legacy AI systems. The new approaches become efficient in high dimensions, for the correction of high-dimensional systems in a high-dimensional world (i.e. for the processing of essentially high-dimensional data by large systems).
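
A minimal numerical sketch of the idea on hypothetical synthetic data (Gaussian clouds standing in for a legacy system's internal representations; all names, dimensions, and shifts below are illustrative, not the paper's experiments):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 200  # essentially high-dimensional data

# Hypothetical setup: feature vectors of tasks the legacy AI solved
# correctly, plus a small sample of error cases to be separated out.
correct = rng.normal(0.0, 1.0, size=(1000, d))
errors = rng.normal(0.3, 1.0, size=(10, d))  # small, slightly shifted cloud

# Fisher linear discriminant: w = S_w^{-1} (mu_err - mu_ok)
mu_ok, mu_err = correct.mean(axis=0), errors.mean(axis=0)
S_w = np.cov(correct, rowvar=False) + np.cov(errors, rowvar=False)
w = np.linalg.solve(S_w + 1e-3 * np.eye(d), mu_err - mu_ok)

# Decision threshold halfway between the projected class means.
b = 0.5 * (w @ mu_ok + w @ mu_err)

error_recall = np.mean(errors @ w > b)   # errors routed to the corrector
false_alarms = np.mean(correct @ w > b)  # correct tasks flagged by mistake
```

Although the per-coordinate shift is tiny, the high dimension makes the projected classes almost perfectly separable, which is the effect the stochastic separation theorems formalize.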

Learning From Positive and Unlabeled Data: A Survey

Learning from positive and unlabeled data, or PU learning, is the setting where a learner has access only to positive examples and unlabeled data. The assumption is that the unlabeled data can contain both positive and negative examples. This setting has attracted increasing interest within the machine learning literature, as this type of data naturally arises in applications such as medical diagnosis and knowledge base completion. This article provides a survey of the current state of the art in PU learning. It proposes seven key research questions that commonly arise in this field and provides a broad overview of how the field has tried to address them.
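
One of the best-known PU techniques such surveys cover is the Elkan-Noto correction under the "selected completely at random" assumption: p(s=1|x) = c · p(y=1|x), where c = p(s=1|y=1) is estimable as the average score on known labeled positives. A toy sketch using an analytically calibrated model in place of a trained classifier (the distributions and constants are illustrative):

```python
import numpy as np
rng = np.random.default_rng(1)

# Toy 1-D data: positives ~ N(2,1), negatives ~ N(-2,1), equal priors.
n = 20000
y = rng.integers(0, 2, n)
x = rng.normal(np.where(y == 1, 2.0, -2.0), 1.0)

c_true = 0.5                              # labeling frequency p(s=1 | y=1)
s = (y == 1) & (rng.random(n) < c_true)   # only some positives get labeled

def p_pos(x):                             # analytic p(y=1 | x) for this toy
    a = np.exp(-0.5 * (x - 2.0) ** 2)
    b = np.exp(-0.5 * (x + 2.0) ** 2)
    return a / (a + b)

# Under "selected completely at random": p(s=1|x) = c * p(y=1|x).
p_s = c_true * p_pos(x)                   # stands in for a trained model

# Elkan-Noto: estimate c as the mean score over the labeled positives,
# then recover the positive-vs-negative posterior by rescaling.
c_hat = p_s[s].mean()
p_y = np.clip(p_s / c_hat, 0.0, 1.0)
```

The rescaled scores `p_y` approximate the true positive-class posterior even though no negative labels were ever observed.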

Assessing biological models using topological data analysis

We use topological data analysis as a tool to analyze the fit of mathematical models to experimental data. This study is built on data obtained from motion tracking groups of aphids in [Nilsen et al., PLOS One, 2013] and two random walk models that were proposed to describe the data. One model incorporates social interactions between the insects, and the second model is a control model that excludes these interactions. We compare data from each model to data from experiment by performing statistical tests based on three different sets of measures. First, we use time series of order parameters commonly used in collective motion studies. These order parameters measure the overall polarization and angular momentum of the group, and do not rely on a priori knowledge of the models that produced the data. Second, we use order parameter time series that do rely on a priori knowledge, namely average distance to nearest neighbor and percentage of aphids moving. Third, we use computational persistent homology to calculate topological signatures of the data. Analysis of the a priori order parameters indicates that the interactive model better describes the experimental data than the control model does. The topological approach performs as well as these a priori order parameters and better than the other order parameters, suggesting the utility of the topological approach in the absence of specific knowledge of mechanisms underlying the data.
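
The first set of order parameters is straightforward to compute from tracked positions and velocities. A generic sketch of the standard definitions (not the paper's exact code; normalizations of angular momentum vary across the collective-motion literature):

```python
import numpy as np

def order_parameters(pos, vel):
    """Polarization and (normalized) angular momentum of a group."""
    speed = np.linalg.norm(vel, axis=1, keepdims=True)
    u = vel / np.where(speed > 0, speed, 1.0)       # unit heading vectors
    polarization = np.linalg.norm(u.mean(axis=0))   # 1 = perfectly aligned

    r = pos - pos.mean(axis=0)                      # offsets from the centroid
    cross = r[:, 0] * u[:, 1] - r[:, 1] * u[:, 0]   # 2-D cross products
    ang_mom = abs(cross.mean()) / max(np.linalg.norm(r, axis=1).mean(), 1e-12)
    return polarization, ang_mom

# 50 aphids at random positions, all moving in the same direction:
pos = np.random.default_rng(2).random((50, 2))
vel = np.tile([1.0, 0.0], (50, 1))
pol, mom = order_parameters(pos, vel)   # pol ~ 1, mom ~ 0
```

A perfectly aligned group gives polarization 1 and essentially zero angular momentum; a group milling around its centroid gives the opposite pattern.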

A Framework of Transfer Learning in Object Detection for Embedded Systems

Transfer learning is one of the subjects undergoing intense study in machine learning. In object recognition and object detection there are known experiments on the transferability of parameters, but not for neural networks suitable for real-time object detection on embedded systems, such as the SqueezeDet network. We use transfer learning to accelerate the training of SqueezeDet on a new group of classes. We also conduct experiments to study the transferability and co-adaptation phenomena introduced by the transfer learning process. To accelerate training, we propose a new implementation of SqueezeDet training that provides a faster pipeline for data processing and achieves a 1.8-times speedup over the initial implementation. Finally, we create a mechanism for automatic hyperparameter optimization using an empirical method.

A Perceptual Prediction Framework for Self Supervised Event Segmentation

Temporal segmentation of long videos is an important problem that has largely been tackled through supervised learning, often requiring large amounts of annotated training data. In this paper, we tackle self-supervised temporal segmentation of long videos, alleviating the need for any supervision. We introduce a self-supervised, predictive learning framework that draws inspiration from cognitive psychology to segment long, visually complex videos into individual, stable segments that share the same semantics. We also introduce a new adaptive learning paradigm that helps reduce the effect of catastrophic forgetting in recurrent neural networks. Extensive experiments on three publicly available datasets (Breakfast Actions, 50 Salads, and INRIA Instructional Videos) show the efficacy of the proposed approach. We show that the proposed approach outperforms weakly-supervised and other unsupervised learning approaches by up to 24% and achieves competitive performance compared to fully supervised approaches. We also show that the proposed approach learns highly discriminative features that help improve action recognition when used in a representation-learning paradigm.

Characterizing machine learning process: A maturity framework

Academic literature on machine learning modeling fails to address how to make machine learning models work for enterprises. For example, existing machine learning processes cannot address how to define business use cases for an AI application, how to convert business requirements from offering managers into data requirements for data scientists, how to continuously improve AI applications in terms of accuracy and fairness, or how to customize general-purpose machine learning models with industry-, domain-, and use-case-specific data to make them more accurate for specific situations. Making AI work for enterprises requires special considerations, tools, methods and processes. In this paper we present a maturity framework for machine learning model lifecycle management for enterprises. Our framework is a re-interpretation of the software Capability Maturity Model (CMM) for the machine learning model development process. We present a set of best practices from our personal experience of building large-scale, real-world machine learning models to help organizations achieve higher levels of maturity independent of their starting point.

The doctrinal paradox: ROC analysis in a probabilistic framework

The doctrinal paradox is analysed from a probabilistic point of view, assuming a simple parametric model for the committee’s behaviour. The well-known issue-by-issue and case-by-case majority rules are compared in this model by means of the concepts of false positive rate (FPR), false negative rate (FNR) and Receiver Operating Characteristic (ROC) space. We also introduce a new rule that we call path-by-path, which is in a sense halfway between the other two. Under our model assumptions, the issue-by-issue rule is shown to be the best of the three according to an optimality criterion based on ROC maps, for all values of the model parameters (committee size and competence of its members), when equal weight is given to FPR and FNR. For unequal weights, the relative goodness of the rules depends on the values of the competence and the weights, in a way which is precisely described. The results are illustrated with some numerical examples.
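
For readers unfamiliar with the doctrinal paradox itself, the classic three-member example shows how the two rules can disagree when the conclusion is the conjunction of two premises:

```python
# Three members vote on two premises; the conclusion is their conjunction.
votes = [(True, True), (True, False), (False, True)]

def majority(bits):
    return sum(bits) > len(bits) / 2

# Issue-by-issue (premise-based): take the majority on each premise first.
issue_by_issue = majority([p for p, q in votes]) and majority([q for p, q in votes])

# Case-by-case (conclusion-based): let each member conclude, then vote.
case_by_case = majority([p and q for p, q in votes])

# issue_by_issue is True, case_by_case is False: the doctrinal paradox.
```

Each premise enjoys a 2-to-1 majority, yet only one member individually accepts the conclusion, so the two aggregation rules reach opposite verdicts.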

TED: Teaching AI to Explain its Decisions

Artificial intelligence systems are being increasingly deployed due to their potential to increase the efficiency, scale, consistency, fairness, and accuracy of decisions. However, as many of these systems are opaque in their operation, there is a growing demand for such systems to provide explanations for their decisions. Conventional approaches to this problem attempt to expose or discover the inner workings of a machine learning model with the hope that the resulting explanations will be meaningful to the consumer. In contrast, this paper suggests a new approach to this problem. It introduces a simple, practical framework, called Teaching Explanations for Decisions (TED), that provides meaningful explanations that match the mental model of the consumer. We illustrate the generality and effectiveness of this approach with two different examples, resulting in highly accurate explanations with no loss of prediction accuracy for these two examples.

Learning Temporal Point Processes via Reinforcement Learning

Social goods, such as healthcare, smart cities, and information networks, often produce ordered event data in continuous time. The generative processes of these event data can be very complex, requiring flexible models to capture their dynamics. Temporal point processes offer an elegant framework for modeling event data without discretizing the time. However, the existing maximum-likelihood-estimation (MLE) learning paradigm requires hand-crafting the intensity function beforehand and cannot directly monitor the goodness-of-fit of the estimated model in the process of training. To alleviate the risk of model-misspecification in MLE, we propose to generate samples from the generative model and monitor the quality of the samples in the process of training until the samples and the real data are indistinguishable. We take inspiration from reinforcement learning (RL) and treat the generation of each event as the action taken by a stochastic policy. We parameterize the policy as a flexible recurrent neural network and gradually improve the policy to mimic the observed event distribution. Since the reward function is unknown in this setting, we uncover an analytic and nonparametric form of the reward function using an inverse reinforcement learning formulation. This new RL framework allows us to derive an efficient policy gradient algorithm for learning flexible point process models, and we show that it performs well on both synthetic and real data.

Dynamic Feature Scaling for K-Nearest Neighbor Algorithm

The Nearest Neighbors algorithm is a lazy learning algorithm, in which predictions are approximated with the help of similar existing vectors in the training dataset. The predictions made by the K-Nearest Neighbors algorithm are based on averaging the target values of the spatial neighbors. The selection of neighbors in the Hermitian space is done with distance metrics such as the Euclidean, Minkowski, and Mahalanobis distances. A majority of these metrics, such as the Euclidean distance, are scale-variant, meaning that the results can vary with the range of values used for the features. Standard techniques for normalizing scaling factors are feature scaling methods such as Z-score normalization and min-max scaling. These scaling methods uniformly assign equal weight to all features, which can result in a non-ideal situation. This paper proposes a novel method to assign weights to individual features with the help of out-of-bag errors obtained from constructing multiple decision tree models.
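
A rough sketch of this idea, with single-feature threshold stumps as a hypothetical stand-in for the paper's decision-tree out-of-bag errors and an illustrative mapping from OOB error to feature weight (both are assumptions, not the paper's exact procedure):

```python
import numpy as np
rng = np.random.default_rng(3)

# Toy data: feature 0 is informative, feature 1 is pure noise on a huge
# scale, so an unweighted Euclidean metric is dominated by the noise.
n = 400
y = rng.integers(0, 2, n)
X = np.c_[y + rng.normal(0.0, 0.3, n), rng.normal(0.0, 100.0, n)]

def stump_oob_error(x, y, rounds=50):
    """Out-of-bag error of single-feature threshold stumps over bootstrap
    samples -- a lightweight stand-in for the paper's tree OOB errors."""
    errs = []
    for _ in range(rounds):
        bag = rng.integers(0, len(x), len(x))
        oob = np.setdiff1d(np.arange(len(x)), bag)
        thr = np.median(x[bag])
        hi = x[bag] > thr
        label_hi = bool(y[bag][hi].mean() > 0.5) if hi.any() else False
        pred = np.where(x[oob] > thr, label_hi, not label_hi)
        errs.append(np.mean(pred != y[oob]))
    return float(np.mean(errs))

oob = np.array([stump_oob_error(X[:, j], y) for j in range(X.shape[1])])
weights = (2.0 * np.maximum(0.5 - oob, 0.0)) ** 2 + 1e-9  # illustrative map

def knn_predict(x_query, k=5):
    dist = np.linalg.norm((X - x_query) * weights, axis=1)
    return int(round(y[np.argsort(dist)[:k]].mean()))

accuracy = np.mean([knn_predict(X[i]) == y[i] for i in range(100)])
```

Because the noise feature's OOB error sits near chance, its weight collapses toward zero and the weighted metric effectively ignores it, recovering good KNN accuracy without any manual feature scaling.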

PanJoin: A Partition-based Adaptive Stream Join

In stream processing, stream join is one of the critical sources of performance bottlenecks. The sliding-window-based stream join provides a precise result but consumes considerable computational resources. Current solutions lack support for join predicates on large windows: these algorithms and their hardware accelerators are either limited to equi-joins or use a nested-loop join to process all requests. In this paper, we present a new algorithm called PanJoin, which achieves high throughput on large windows and supports both equi-join and non-equi-join. PanJoin implements three new data structures to reduce computation during the probing phase of stream join. We also implement the most hardware-friendly of these data structures, called BI-Sort, on an FPGA. Our evaluation shows that PanJoin outperforms several recently proposed stream join methods by more than 1000x, and that it also adapts well to highly skewed data.
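
For context, a baseline sliding-window equi-join (the kind of symmetric hash join PanJoin's data structures are designed to outperform) can be sketched as follows; the count-based window and the API are illustrative, not PanJoin's own:

```python
from collections import defaultdict, deque

class SlidingHashJoin:
    """Minimal sliding-window equi-join between two streams -- a baseline
    sketch of the problem PanJoin targets, not PanJoin's data structures."""
    def __init__(self, window):
        self.window = window
        self.buf = [deque(), deque()]                      # arrival order
        self.idx = [defaultdict(list), defaultdict(list)]  # key -> values

    def insert(self, side, key, value):
        # Probe the other stream's window, then store the new tuple.
        out = [(key, v, value) if side else (key, value, v)
               for v in self.idx[1 - side][key]]
        self.buf[side].append((key, value))
        self.idx[side][key].append(value)
        if len(self.buf[side]) > self.window:              # expire oldest
            k, v = self.buf[side].popleft()
            self.idx[side][k].remove(v)
        return out

j = SlidingHashJoin(window=2)
j.insert(0, "a", 1)
m1 = j.insert(1, "a", 10)        # -> [("a", 1, 10)]
j.insert(0, "b", 2)
j.insert(0, "c", 3)              # window=2: ("a", 1) expires from stream 0
matches = j.insert(1, "a", 20)   # -> [], since ("a", 1) is already gone
```

Each probe costs a hash lookup rather than a scan over the whole window, but extending this to non-equi-join predicates is exactly where such hash-based baselines break down.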

Theoretical Analysis of Adversarial Learning: A Minimax Approach

We propose a general theoretical method for analyzing the risk bound in the presence of adversaries. In particular, we try to fit the adversarial learning problem into the minimax framework. We first show that the original adversarial learning problem could be reduced to a minimax statistical learning problem by introducing a transport map between distributions. Then we prove a risk bound for this minimax problem in terms of covering numbers. In contrast to previous minimax bounds in \cite{lee,far}, our bound is informative when the radius of the ambiguity set is small. Our method could be applied to multi-class classification problems and commonly-used loss functions such as hinge loss and ramp loss. As two illustrative examples, we derive the adversarial risk bounds for kernel-SVM and deep neural networks. Our results indicate that a stronger adversary might have a negative impact on the complexity of the hypothesis class and the existence of margin could serve as a defense mechanism to counter adversarial attacks.

A Multi-layer LSTM-based Approach for Robot Command Interaction Modeling

As the first robotic platforms slowly approach our everyday life, we can imagine a near future where service robots will be easily accessible to non-expert users through vocal interfaces. The capability to manage natural language would indeed speed up the process of integrating such platforms into ordinary life. Semantic parsing is a fundamental task of the Natural Language Understanding process, as it allows extracting the meaning of a user utterance so that it can be used by a machine. In this paper, we present a preliminary study on semantically parsing user vocal commands for a house service robot, using a multi-layer Long Short-Term Memory neural network with an attention mechanism. The system is trained on the Human Robot Interaction Corpus and is preliminarily compared with previous approaches.

Anomaly Detection using Autoencoders in High Performance Computing Systems

Anomaly detection in supercomputers is a very difficult problem, due to the large scale of the systems and the high number of components. The current state of the art for automated anomaly detection employs machine learning methods or statistical regression models in a supervised fashion, meaning that the detection tool is trained to distinguish among a fixed set of behaviour classes (healthy and unhealthy states). We propose a novel approach for anomaly detection in High Performance Computing systems based on a (deep) machine learning technique, namely a type of neural network called an autoencoder. The key idea is to train a set of autoencoders to learn the normal (healthy) behaviour of the supercomputer nodes and, after training, use them to identify abnormal conditions. This differs from previous approaches, which were based on learning the abnormal condition, for which there are much smaller datasets (since abnormal conditions are very hard to identify to begin with). We test our approach on a real supercomputer equipped with a fine-grained, scalable monitoring infrastructure that can provide large amounts of data to characterize the system behaviour. The results are extremely promising: after the training phase to learn the normal system behaviour, our method is capable of detecting anomalies that have never been seen before with very good accuracy (values ranging between 88% and 96%).
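
The reconstruction-error idea can be illustrated with a linear autoencoder (equivalently, PCA) in place of the paper's neural networks; the telemetry dimensions, noise levels, and threshold quantile below are made up for the sketch:

```python
import numpy as np
rng = np.random.default_rng(4)

# "Healthy" node telemetry that lives near a low-dimensional subspace.
d, k = 20, 3
basis = rng.normal(size=(k, d))
healthy = rng.normal(size=(2000, k)) @ basis + 0.05 * rng.normal(size=(2000, d))

# Linear autoencoder fitted in closed form via SVD (i.e. PCA), trained
# on healthy data only -- a simplified stand-in for the paper's networks.
mu = healthy.mean(axis=0)
_, _, Vt = np.linalg.svd(healthy - mu, full_matrices=False)
W = Vt[:k]                                # shared encoder/decoder weights

def reconstruction_error(x):
    z = (x - mu) @ W.T                    # encode into k dimensions
    return np.linalg.norm((x - mu) - z @ W, axis=-1)  # decoding residual

# Threshold chosen from the healthy distribution (quantile is illustrative).
thr = np.quantile(reconstruction_error(healthy), 0.999)

anomaly = rng.normal(size=d) * 2.0        # a reading far off the subspace
is_anomaly = reconstruction_error(anomaly) > thr
```

Readings consistent with the learned healthy subspace reconstruct well and stay below the threshold; off-subspace readings produce a large residual and are flagged, with no abnormal examples needed at training time.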

Incentivising Participation in Liquid Democracy with Breadth First Delegation
Robustness of the Closest Unstable Equilibrium Point Along a P-V Curve
ADNet: A Deep Network for Detecting Adverts
The largest graphs with given order and diameter: A simple proof
On Stability Condition of Wireless Networked Control Systems under Joint Design of Control Policy and Network Scheduling Policy
Weak convergence of particle swarm optimization
Adaptive model selection method for a conditionally Gaussian semimartingale regression in continuous time
Temporal Graph Convolutional Network for Urban Traffic Flow Prediction Method
On Asymptotic Covariances of A Few Unrotated Factor Solutions
MMALFM: Explainable Recommendation by Leveraging Reviews and Images
Edge directionality properties in complex spherical networks
Distributionally Robust Semi-Supervised Learning for People-Centric Sensing
Clifford-like parallelisms
Learning data augmentation policies using augmented random search
Adaptive Target Recognition: A Case Study Involving Airport Baggage Screening
Syntax Helps ELMo Understand Semantics: Is Syntax Still Relevant in a Deep Neural Architecture for SRL?
Improving Generalization for Abstract Reasoning Tasks Using Disentangled Feature Representations
Localisation, chiral symmetry and confinement in QCD and related theories
Stationary Harmonic Measure as the Scaling Limit of Truncated Harmonic Measure
On the absolute continuity of random nodal volumes
Strong Equivalence for Epistemic Logic Programs Made Easy (Extended Version)
3s-Unification for Vehicular Headway Modeling
Subsequent Boundary Distance Regression and Pixelwise Classification Networks for Automatic Kidney Segmentation in Ultrasound Images
Large-deviation properties of the largest biconnected component for random graphs
A test case for application of convolutional neural networks to spatio-temporal climate data: Re-identifying clustered weather patterns
Understanding the boosted decision tree methods with the weak-learner approximation
Compliance in Real Time Multiset Rewriting Models
Circuit Depth Reductions
Scattering-free pulse propagation through invisible non-Hermitian disorder
Regularity results of the speed of biased random walks on Galton-Watson trees
Potential Game-Based Non-Myopic Sensor Network Planning for Multi-Target Tracking
Quantum-inspired sublinear classical algorithms for solving low-rank linear systems
On the practice of classification learning for clinical diagnosis and therapy advice in oncology
Generative Dual Adversarial Network for Generalized Zero-shot Learning
Bio-YODIE: A Named Entity Linking System for Biomedical Text
Measures of goodness of fit obtained by canonical transformations on Riemannian manifolds
Pareto-Optimal Allocation of Indivisible Goods with Connectivity Constraints
Comparing Spark vs MPI/OpenMP On Word Count MapReduce
CQASUMM: Building References for Community Question Answering Summarization Corpora
Triangular Ladders $P_{d,2}$ are $e$-positive
Focusing on the Big Picture: Insights into a Systems Approach to Deep Learning for Satellite Imagery
Segue: Overviewing Evolution Patterns of Egocentric Networks by Interactive Construction of Spatial Layouts
Multi-encoder multi-resolution framework for end-to-end speech recognition
Stream attention-based multi-array end-to-end speech recognition
Algorithmic models of human behavior and stochastic optimization
Deep Learning versus Classical Regression for Brain Tumor Patient Survival Prediction
Nonexistence of Bigeodesics in Integrable Models of Last Passage Percolation
Quantum-inspired low-rank stochastic regression with logarithmic dependence on the dimension
Boosting Model Performance through Differentially Private Model Aggregation
Online Timely Status Updates with Erasures for Energy Harvesting Sensors
Analytical Formulation of the Block-Constrained Configuration Model
Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers
Algebraic Many-Body Localization and its implications on information propagation
Modeling and Performance of Uplink Cache-Enabled Massive MIMO Heterogeneous Networks
Molecular computers
A simplifed static frequency converter model for electromechanical transient stability studies of 16$\frac{2}{3}$ Hz railways
The Impact of Timestamp Granularity in Optimistic Concurrency Control
PennyLane: Automatic differentiation of hybrid quantum-classical computations
Eliminating Latent Discrimination: Train Then Mask
p-regularity theory. Applications and developments
Unseen Word Representation by Aligning Heterogeneous Lexical Semantic Spaces
Generalized Ternary Connect: End-to-End Learning and Compression of Multiplication-Free Deep Neural Networks
Prediction of Alzheimer’s disease-associated genes by integration of GWAS summary data and expression data
A Generalized Framework for Approximate Control Variates
OriNet: A Fully Convolutional Network for 3D Human Pose Estimation
A new approach for pedestrian density estimation using moving sensors and computer vision
You Only Live Multiple Times: A Blackbox Solution for Reusing Crash-Stop Algorithms In Realistic Crash-Recovery Settings
Choosing to grow a graph: Modeling network formation as discrete choice
Coordinating Disaster Emergency Response with Heuristic Reinforcement Learning
Blindfold Baselines for Embodied QA
NeXtVLAD: An Efficient Neural Network to Aggregate Frame-level Features for Large-scale Video Classification
A Team-Formation Algorithm for Faultline Minimization
Improved Dynamic Memory Network for Dialogue Act Classification with Adversarial Training
Approximation Algorithms for Minimum Norm and Ordered Optimization Problems
Generating faces for affect analysis
LookinGood: Enhancing Performance Capture with Real-time Neural Re-Rendering
A Review of automatic differentiation and its efficient implementation
Shortcut Graphs and Groups
Finding All Bayesian Network Structures within a Factor of Optimal
Exploiting Local Feature Patterns for Unsupervised Domain Adaptation
Distributed Cooperative Spectrum Sharing in UAV Networks Using Multi-Agent Reinforcement Learning
Electrophysiological indicators of gesture perception
A unified algorithm for the non-convex penalized estimation: The ncpen package
SMERC: Social media event response clustering using textual and temporal information
Multiple-paths $SLE_κ$ in multiply connected domains
Shall I Compare Thee to a Machine-Written Sonnet? An Approach to Algorithmic Sonnet Generation
The first passage time density of Brownian motion and the heat equation with Dirichlet boundary condition in time dependent domains
Private Model Compression via Knowledge Distillation
Regularised Zero-Variance Control Variates
Learning from Binary Multiway Data: Probabilistic Tensor Decomposition and its Statistical Optimality
Task Graph Transformations for Latency Tolerance
A Unified Model for Opinion Target Extraction and Target Sentiment Prediction
Domain Agnostic Real-Valued Specificity Prediction
Nonsingular Gaussian Conditionally Markov Sequences
Parallel Stochastic Asynchronous Coordinate Descent: Tight Bounds on the Possible Parallelism
A General Method for Amortizing Variational Filtering
A SAT+CAS Approach to Finding Good Matrices: New Examples and Counterexamples
A Local Regret in Nonconvex Online Learning
Exploring RNN-Transducer for Chinese Speech Recognition
Balancing Relevance and Diversity in Online Bipartite Matching via Submodularity
Neuroimaging Modality Fusion in Alzheimer’s Classification Using Convolutional Neural Networks
Interpreting Models by Allowing to Ask
Towards the topological recursion for double Hurwitz numbers
A Variational Inference based Detection Method for Repetition Coded Generalized Spatial Modulation
Parametric Shortest Paths in Planar Graphs
Exploiting temporal and depth information for multi-frame face anti-spoofing
Modeling Local Dependence in Natural Language with Multi-channel Recurrent Neural Networks
Fundamental Limits of Exact Support Recovery in High Dimensions
Multi-unit Bilateral Trade
Sensitivity Analysis of a Stationary Point Set Map under Total Perturbations. Part 2: Robinson Stability
Community Exploration: From Offline Optimization to Online Learning
Multiscale Information Storage of Linear Long-Range Correlated Stochastic Processes
M Equilibrium: A dual theory of beliefs and choices in games
Amplitude-Aware Lossy Compression for Quantum Circuit Simulation
Co-Representation Learning For Classification and Novel Class Detection via Deep Networks
Spectral Efficiency Analysis in Presence of Correlated Gamma-Lognormal Desired and Interfering Signals
Sensitivity Analysis of a Stationary Point Set Map under Total Perturbations. Part 1: Lipschitzian Stability
Hate Speech Detection from Code-mixed Hindi-English Tweets Using Deep Learning Models
Application of Faster R-CNN model on Human Running Pattern Recognition
Fast HARQ over Finite Blocklength Codes: A Technique for Low-Latency Reliable Communication
User Demand Based Precoding for DSL Systems
Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits
On the Throughput of Large-but-Finite MIMO Networks using Schedulers
Protection Placement for State Estimation Measurement Data Integrity
Recurrent Multi-Graph Neural Networks for Travel Cost Prediction
Approximating minimum representations of key Horn functions
Vehicle Re-identification Using Quadruple Directional Deep Learning Features
On Lipschitz-like property for polyhedral moving sets
Nonparametric geometric outlier detection
Optimal extension to Sobolev rough paths
Child Gender Determination with Convolutional Neural Networks on Hand Radio-Graphs
Gradient Harmonized Single-stage Detector
Equilibrium measures on trees
Polynomial Schur’s theorem
On the Polarization Levels of Automorphic-Symmetric Channels
Relating local structures, energies, and occurrence probabilities in a two-dimensional silica network
FusionStitching: Deep Fusion and Code Generation for Tensorflow Computations on GPUs
Classical Access Structures of Ramp Secret Sharing Based on Quantum Stabilizer Codes
ImageNet/ResNet-50 Training in 224 Seconds
Applications of Littlewood-Richardson tableaux to computing generic extension of semisimple invariant subspaces of nilpotent linear operators
BAN: Focusing on Boundary Context for Object Detection
Interpretable Credit Application Predictions With Counterfactual Explanations
An Online Attention-based Model for Speech Recognition
Probing interacting two-level systems with rare-earth ions
Modular Networks: Learning to Decompose Neural Computation
Modality Attention for End-to-End Audio-visual Speech Recognition
SVM-Based Sea-Surface Small Target Detection: A False-Alarm-Rate-Controllable Approach
Image Captioning Based on a Hierarchical Attention Mechanism and Policy Gradient Optimization
Deep Neural Network Concepts for Background Subtraction: A Systematic Review and Comparative Evaluation
How Secure are Deep Learning Algorithms from Side-Channel based Reverse Engineering?
A conjugate prior for the Dirichlet distribution
Predicting Distresses using Deep Learning of Text Segments in Annual Reports
Towards the Design of Aerostat Wind Turbine Arrays through AI
Intelligent Drone Swarm for Search and Rescue Operations at Sea
Pose Invariant 3D Face Reconstruction
SAFE: Self-Attentive Function Embeddings for Binary Similarity
Genetic algorithm for optimal distribution in cities
Translating Natural Language to SQL using Pointer-Generator Networks and How Decoding Order Matters
Self-Supervised Learning of Depth and Camera Motion from 360° Videos
Improved Fourier Mellin Invariant for Robust Rotation Estimation with Omni-cameras
Detect or Track: Towards Cost-Effective Video Object Detection/Tracking
Spectral Deconfounding and Perturbed Sparse Linear Models
Iteratively Training Look-Up Tables for Network Quantization
Highly Efficient Stepped Wedge Designs for Clusters of Unequal Size
Personal Names Popularity Estimation and its Application to Record Linkage
Unsupervised Transfer Learning for Spoken Language Understanding in Intelligent Agents
Benchmarking datasets for Anomaly-based Network Intrusion Detection: KDD CUP 99 alternatives
Operator-Valued Matrices with Free or Exchangeable Entries
Comparison of Feature Extraction Methods and Predictors for Income Inference
Quantile regression approach to conditional mode estimation
Sorting out Lipschitz function approximation
On Finding Quantum Multi-collisions
Hallucinating Point Cloud into 3D Sculptural Object
Remarks on a fractional-time stochastic equation
Strong Approximation of Monotone Stochastic Partial Different Equations Driven by Multiplicative Noise
Estimation of urban traffic state with probe vehicles
Advances in sequential measurement and control of open quantum systems
Algorithms for Optimal AC Power Flow in the Presence of Renewable Sources
Embedding Electronic Health Records for Clinical Information Retrieval
Autonomic Intrusion Response in Distributed Computing using Big Data
Multi-task learning for Joint Language Understanding and Dialogue State Tracking
Estimating the Impact of Cyber-Attack Strategies for Stochastic Control Systems
Home Activity Monitoring using Low Resolution Infrared Sensor
Fast Human Pose Estimation
ABox Abduction via Forgetting in ALC (Long Version)
Quickest Detection of Time-Varying False Data Injection Attacks in Dynamic Linear Regression Models
On the Mean Order of Connected Induced Subgraphs of Block Graphs
Deep Object Centric Policies for Autonomous Driving
Robust H-infinity kinematic control of manipulator robots using dual quaternion algebra
Argumentation for Explainable Scheduling (Full Paper with Proofs)
Very Hard Electoral Control Problems
A survey of semidefinite programming approaches to the generalized problem of moments and their error analysis
Cyclic quasi-symmetric functions
Co-regularized Alignment for Unsupervised Domain Adaptation
Higher-Order Cone Programming
New fat-tail normality test based on conditional second moments with applications to finance


If you did not already know

automated CLAUse DETectEr (Claudette) google
Machine Learning Powered Analysis of Consumer Contracts and Privacy Policies. CLAUDETTE (‘automated CLAUse DETectEr’) is an interdisciplinary research project hosted at the Law Department of the European University Institute, led by professors Giovanni Sartor and Hans-W. Micklitz, in cooperation with engineers from the University of Bologna and the University of Modena and Reggio Emilia. The research objective is to test to what extent it is possible to automate the reading and legal assessment of online consumer contracts and privacy policies, evaluating their compliance with the EU's unfair contractual terms law and personal data protection law (GDPR) using machine learning and grammar-based approaches. The idea arose out of bewilderment. Having read dozens of terms of service and privacy policies of online platforms, we came to the conclusion that, despite the substantive law in place and despite enforcers' competence for abstract control, providers of online services still tend to use unfair and unlawful clauses in these documents. Hence the idea to automate parts of the enforcement process by delegating certain tasks to machines. On the one hand, we believe that relying on automation can increase the quality and effectiveness of enforcers' legal work. On the other, we want to empower consumers themselves, by giving them tools to quickly assess whether what they agree to online is fair and/or lawful. …

Dfuntest google
New ideas in distributed systems (algorithms or protocols) are commonly tested by simulation, because experimenting with a prototype deployed on a realistic platform is cumbersome. However, a prototype not only measures performance but also verifies assumptions about the underlying system. We developed dfuntest – a testing framework for distributed applications that defines abstractions and test structure, and automates experiments on distributed platforms. Dfuntest aims to be jUnit’s analogue for distributed applications; a framework that enables the programmer to write robust and flexible scenarios of experiments. Dfuntest requires minimal bindings that specify how to deploy and interact with the application. Dfuntest’s abstractions allow execution of a scenario on a single machine, a cluster, a cloud, or any other distributed infrastructure, e.g. on PlanetLab. A scenario is a procedure; thus, our framework can be used both for functional tests and for performance measurements. We show how to use dfuntest to deploy our DHT prototype on 60 PlanetLab nodes and verify whether the prototype maintains a correct topology. …

MapReduce for C (MR4C)
MR4C is an implementation framework that allows you to run native code within the Hadoop execution framework. Pairing the performance and flexibility of natively developed algorithms with the unfettered scalability and throughput inherent in Hadoop, MR4C enables large-scale deployment of advanced data processing applications. …

Deep Item-based Collaborative Filtering for Top-N Recommendation

Item-based Collaborative Filtering (ICF for short) has been widely adopted in industrial recommender systems, owing to its strength in user interest modeling and its ease of online personalization. By constructing a user's profile from the items that the user has consumed, ICF recommends items that are similar to that profile. With the prevalence of machine learning in recent years, significant progress has been made for ICF by learning item similarity (or representation) from data. Nevertheless, we argue that most existing works consider only linear and shallow relationships between items, which are insufficient to capture the complicated decision-making process of users. In this work, we propose a more expressive ICF solution by accounting for nonlinear and higher-order relationships among items. Going beyond modeling only the second-order interaction (e.g. similarity) between two items, we additionally consider the interactions among all interacted item pairs by using nonlinear neural networks. In this way, we can effectively model the higher-order relationships among items, capturing more complicated effects in user decision-making. For example, the model can differentiate which historical itemsets in a user's profile are more important in leading the user to make a purchase decision on an item. We treat this solution as a deep variant of ICF and thus term it DeepICF. To justify our proposal, we perform empirical studies on two public datasets from MovieLens and Pinterest. Extensive experiments verify the highly positive effect of higher-order item interaction modeling with nonlinear neural networks. Moreover, we demonstrate that with more fine-grained second-order interaction modeling via an attention network, the performance of our DeepICF method can be further improved.
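
The higher-order interaction idea can be sketched in a few lines: element-wise products of the embeddings of all interacted item pairs are passed through a nonlinear layer, pooled, and matched against a target item. The embeddings, weights, and dimensions below are made-up placeholders, not the paper's trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_items = 8, 20          # illustrative embedding dimension and catalogue size

# Hypothetical item embeddings and one nonlinear layer (randomly initialized).
item_emb = rng.normal(size=(n_items, d))
W1 = rng.normal(size=(d, d)) * 0.1
b1 = np.zeros(d)

def deep_icf_score(history, target):
    """Score a target item against a user's consumed items.

    Pairwise (second-order) interactions between history items are passed
    through a nonlinear layer and pooled, a simplified stand-in for the
    higher-order modeling described in the abstract."""
    h = item_emb[history]                          # (k, d)
    pair = (h[:, None, :] * h[None, :, :]).reshape(-1, d)  # all item pairs
    hidden = np.tanh(pair @ W1 + b1)               # nonlinear transformation
    pooled = hidden.mean(axis=0)                   # pool higher-order signal
    return float(pooled @ item_emb[target])

score = deep_icf_score([0, 3, 7], target=5)
```

A trained DeepICF additionally learns attention weights over the pairs; here the pooling is a plain mean for brevity.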

Gaussian-Induced Convolution for Graphs

Learning representations on graphs plays a crucial role in numerous pattern recognition tasks. Unlike grid-shaped images/videos, on which local convolution kernels can be defined as lattices, graphs are fully coordinate-free on vertices and edges. In this work, we propose a Gaussian-induced convolution (GIC) framework to conduct local convolution filtering on irregular graphs. Specifically, an edge-induced Gaussian mixture model is designed to encode variations of subgraph regions by integrating edge information into weighted Gaussian models, each of which implicitly characterizes one component of subgraph variation. To coarsen a graph, we derive a vertex-induced Gaussian mixture model that clusters vertices dynamically according to edge connections, which is approximately equivalent to a weighted graph cut. We evaluate our multi-layer graph convolution network on several public graph classification datasets. Extensive experiments demonstrate that GIC is effective and achieves state-of-the-art results.
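
A stripped-down version of the Gaussian-mixture aggregation can be sketched as follows. The component means are random placeholders and the edge weighting is omitted, both simplifying assumptions rather than the paper's design.

```python
import numpy as np

rng = np.random.default_rng(8)
d, C = 4, 3                         # feature dimension, number of Gaussian components

# Hypothetical Gaussian component means encoding subgraph variation.
means = rng.normal(size=(C, d))
var = 1.0

def soft_assign(neighbor_feats):
    """Soft-assign a vertex's neighbor features to Gaussian components and
    aggregate per-component residuals -- a simplified, edge-weight-free
    version of the edge-induced mixture in the abstract."""
    diff = neighbor_feats[:, None, :] - means[None, :, :]   # (k, C, d)
    logp = -0.5 * (diff ** 2).sum(-1) / var                 # Gaussian log-density
    w = np.exp(logp - logp.max(1, keepdims=True))
    w /= w.sum(1, keepdims=True)                            # responsibilities
    return (w[:, :, None] * diff).sum(0).reshape(-1)        # per-component stats

feat = soft_assign(rng.normal(size=(5, d)))   # 5 neighbors -> one C*d feature
```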

Fast Matrix Factorization with Non-Uniform Weights on Missing Data

Matrix factorization (MF) has been widely used to discover the low-rank structure of a data matrix and to predict its missing entries. In many real-world learning systems, the data matrix can be very high-dimensional but sparse. This poses an imbalanced learning problem, since the number of missing entries is usually much larger than the number of observed entries, yet the missing entries cannot be ignored because they carry a valuable negative signal. For efficiency, existing work typically applies a uniform weight on missing entries to allow a fast learning algorithm. However, this simplification decreases modeling fidelity, resulting in suboptimal performance for downstream applications. In this work, we weight the missing data non-uniformly and, more generally, allow any weighting strategy on the missing data. To address the efficiency challenge, we propose a fast learning method whose time complexity is determined by the number of observed entries in the data matrix rather than by the matrix size. The key idea is two-fold: 1) we apply truncated SVD on the weight matrix to get a more compact representation of the weights, and 2) we learn the MF parameters with element-wise alternating least squares (eALS), memoizing key intermediate variables to avoid unnecessary repeated computations. We conduct extensive experiments on two recommendation benchmarks, demonstrating the correctness, efficiency, and effectiveness of our fast eALS method.
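
The compactness argument behind the truncated-SVD step can be illustrated directly: when the weight matrix on missing entries is (approximately) low-rank, it factors into two small matrices whose sums eALS can cache, so the per-iteration cost no longer scales with the full matrix size. The popularity-driven weights below are an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, k = 50, 40, 3        # users, items, rank of the weight approximation

# A hypothetical non-uniform weight matrix on missing entries, e.g. weights
# driven by item popularity (an assumption for illustration; rank 1 here).
item_pop = rng.random(n)
W = np.outer(np.ones(m), item_pop)

# Truncated SVD gives a compact representation W ~= P @ Q.T
U, s, Vt = np.linalg.svd(W, full_matrices=False)
P = U[:, :k] * s[:k]                 # (m, k) -- small
Q = Vt[:k, :].T                      # (n, k) -- small
W_approx = P @ Q.T

# Relative approximation error; exactly low-rank weights are recovered exactly.
err = np.linalg.norm(W - W_approx) / np.linalg.norm(W)
```

With the compact form, the weighted sums over all mn missing entries that eALS needs reduce to O(k) cached terms per user/item instead of O(mn) work.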

An Optimal Control View of Adversarial Machine Learning

I describe an optimal control view of adversarial machine learning, where the dynamical system is the machine learner, the inputs are adversarial actions, and the control costs are defined by the adversary's goals to do harm and be hard to detect. This view encompasses many types of adversarial machine learning, including test-item attacks, training-data poisoning, and adversarial reward shaping. The view encourages adversarial machine learning researchers to utilize advances in control theory and reinforcement learning.

End-to-end Structure-Aware Convolutional Networks for Knowledge Base Completion

Knowledge graph embedding has been an active research topic for knowledge base completion, with progressive improvement from the initial TransE, TransH, DistMult, etc. to the current state-of-the-art ConvE. ConvE uses 2D convolution over embeddings and multiple layers of nonlinear features to model knowledge graphs. The model can be trained efficiently and scales to large knowledge graphs. However, there is no structure enforcement in ConvE's embedding space. The recent graph convolutional network (GCN) provides another way of learning graph node embeddings by successfully exploiting graph connectivity structure. In this work, we propose a novel end-to-end Structure-Aware Convolutional Network (SACN) that combines the benefits of GCN and ConvE. SACN consists of an encoder, a weighted graph convolutional network (WGCN), and a decoder, a convolutional network called Conv-TransE. The WGCN utilizes knowledge graph node structure, node attributes and relation types. It has learnable weights that collect an adaptive amount of information from neighboring graph nodes, resulting in more accurate node embeddings. In addition, node attributes are represented as additional nodes and are easily integrated into the WGCN. The decoder Conv-TransE extends ConvE to be translational between entities and relations while keeping ConvE's state-of-the-art performance. We demonstrate the effectiveness of the proposed SACN model on the standard FB15k-237 and WN18RR datasets, and report about a 10% relative improvement over the state-of-the-art ConvE in terms of HITS@1, HITS@3 and HITS@10.
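
A single WGCN-style layer can be sketched with one learnable scalar weight per relation type: neighbor features are aggregated per relation, scaled by that relation's weight, combined with a self-connection, and passed through a nonlinearity. The adjacency matrices, weights, and dimensions below are arbitrary stand-ins.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 5, 4                                   # nodes, feature dimension

# Hypothetical adjacency matrices, one per relation type, each with a
# learnable scalar weight (initialized arbitrarily here).
A_type = [np.triu(rng.random((n, n)) < 0.4, 1).astype(float) for _ in range(2)]
A_type = [a + a.T for a in A_type]            # undirected graphs
alpha = np.array([0.7, 0.3])                  # per-relation weights
H = rng.normal(size=(n, d))                   # node features
W = rng.normal(size=(d, d)) * 0.1             # layer projection

def wgcn_layer(H):
    # Aggregate neighbors with relation-specific weights, plus a self-loop.
    agg = sum(a * A for a, A in zip(alpha, A_type)) @ H + H
    return np.tanh(agg @ W)

H1 = wgcn_layer(H)                            # embeddings after one layer
```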

ReDecode Framework for Iterative Improvement in Paraphrase Generation

Generating paraphrases, that is, different variations of a sentence conveying the same meaning, is an important yet challenging task in NLP. Automatic paraphrase generation is useful in many NLP tasks, such as question answering, information retrieval, and conversational systems, to name a few. In this paper, we introduce iterative refinement of generated paraphrases within a VAE-based generation framework. Current sequence generation models lack the capability to (1) make improvements once a sentence is generated and (2) rectify errors made while decoding. We propose a technique to iteratively refine the output using multiple decoders, each one attending on the output sentence generated by the previous decoder. We improve on the current state-of-the-art results significantly, with over 9% and 28% absolute increases in METEOR scores on the Quora question pairs and MSCOCO datasets, respectively. We also show qualitatively through examples that our re-decoding approach generates better paraphrases than a single decoder by rectifying errors and making improvements in paraphrase structure, inducing variations, and introducing new but semantically coherent information.
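
The chained re-decoding scheme reduces, at its core, to each decoder refining its predecessor's output. The toy below replaces the real attention decoders with a simple contraction toward a fixed target representation, purely to illustrate how errors shrink across the decoder chain; every quantity is a stand-in.

```python
import numpy as np

rng = np.random.default_rng(3)
T, d = 6, 8
target = rng.normal(size=(T, d))          # stand-in for the ideal paraphrase
draft = target + rng.normal(size=(T, d))  # noisy first-pass decoding

def decoder(prev, step=0.5):
    """Stand-in for one attention decoder: it attends to the previous
    decoder's output and produces a refined sequence (here, simply a
    contraction toward the target)."""
    return prev + step * (target - prev)

outputs = [draft]
for _ in range(3):                        # chain of re-decoders
    outputs.append(decoder(outputs[-1]))

# Distance to the target shrinks at every re-decoding step.
errs = [np.linalg.norm(o - target) for o in outputs]
```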

Computational Complexity Analysis of Genetic Programming

Genetic Programming (GP) is an evolutionary computation technique for solving problems in an automated, domain-independent way. Rather than identifying the optimum of a function as in more traditional evolutionary optimization, the aim of GP is to evolve computer programs with a given functionality. A population of programs is evolved using variation operators inspired by Darwinian evolution (crossover and mutation) and natural selection principles to guide the search towards better programs. While many GP applications have produced human-competitive results, the theoretical understanding of which problem characteristics and algorithm properties allow GP to be effective is comparatively limited. Compared to traditional evolutionary algorithms for function optimization, GP applications are further complicated by two additional factors: the variable-length representation of candidate programs, and the difficulty of evaluating their quality efficiently. These difficulties considerably impact the runtime analysis of GP, where space complexity also comes into play. As a result, initial complexity analyses of GP focused on restricted settings such as evolving trees with given structures or estimating the quality of solutions using only a small polynomial number of input/output examples. However, the first runtime analyses of GP for evolving proper functions with defined input/output behavior have recently appeared. In this chapter, we present an overview of the state of the art.

RADS: Real-time Anomaly Detection System for Cloud Data Centres

Cybersecurity attacks in Cloud data centres are increasing alongside the growth of the Cloud services market. Existing research proposes a number of anomaly detection systems for detecting such attacks. However, these systems face a number of challenges, specifically due to the unknown behaviour of the attacks and the occurrence of genuine Cloud workload spikes, which must be distinguished from attacks. In this paper, we discuss these challenges and investigate the issues with existing Cloud anomaly detection approaches. We then propose a Real-time Anomaly Detection System (RADS) for Cloud data centres, which uses a one-class classification algorithm and window-based time series analysis to address the challenges. Specifically, RADS can detect VM-level anomalies occurring due to DDoS and cryptomining attacks. We evaluate the performance of RADS by running lab-based experiments and by using real-world Cloud workload traces. Evaluation results demonstrate that RADS can achieve 90-95% accuracy with a low false positive rate of 0-3%. The results further reveal that RADS experiences fewer false positives when using its window-based time series analysis than when using state-of-the-art average- or entropy-based analysis.
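
The window-based analysis can be sketched with a simple statistical envelope standing in for the one-class classifier: summarize windows of normal workload by a few features, then flag test windows whose features fall outside the learned envelope. The simulated utilisation traces, window size, and threshold below are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated VM CPU utilisation: normal workload, then a trace containing a
# cryptomining-like surge in its second half.
normal = rng.normal(30, 3, size=600)
attack = np.concatenate([rng.normal(30, 3, 300), rng.normal(90, 3, 300)])

W = 30  # window size (samples per window)

def window_features(x):
    xw = x[: len(x) // W * W].reshape(-1, W)
    return np.c_[xw.mean(1), xw.std(1)]      # per-window level and spread

# "Train" a one-class boundary on normal windows only (a z-score envelope
# standing in for the one-class classifier used in the paper).
F = window_features(normal)
mu, sd = F.mean(0), F.std(0)

def is_anomalous(window_feat, k=4.0):
    return bool(np.any(np.abs((window_feat - mu) / sd) > k))

flags = [is_anomalous(f) for f in window_features(attack)]
```

Windowing is what suppresses false positives from short genuine spikes: a single outlying sample barely moves a 30-sample window's mean, whereas a sustained attack moves it far outside the envelope.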

Anomaly Detection and Correction in Large Labeled Bipartite Graphs

Binary classification problems can be naturally modeled as bipartite graphs, where we attempt to classify right nodes based on their left adjacencies. We consider the case of labeled bipartite graphs in which some labels and edges are not trustworthy. Our goal is to reduce noise by identifying and fixing these labels and edges. We first propose a geometric technique for generating random graph instances with untrustworthy labels and analyze the resulting graph properties. We focus on generating graphs which reflect real-world data, where degree and label frequencies follow power law distributions. We review several algorithms for the problem of detection and correction, proposing novel extensions and making observations specific to the bipartite case. These algorithms range from math programming algorithms to discrete combinatorial algorithms to Bayesian approximation algorithms to machine learning algorithms. We compare the performance of all these algorithms using several metrics and, based on our observations, identify the relative strengths and weaknesses of each individual algorithm.

An Interpretable Generative Model for Handwritten Digit Image Synthesis

An interpretable generative model for handwritten digit synthesis is proposed in this work. Modern image generative models, such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), are trained by backpropagation (BP). The training process is complex and the underlying mechanism is difficult to explain. We propose an interpretable multi-stage PCA method to achieve the same goal and use handwritten digit image synthesis as an illustrative example. First, we derive principal-component-analysis-based (PCA-based) transform kernels at each stage from the covariance of its inputs. This yields a sequence of transforms that convert input images of correlated pixels into spectral vectors of uncorrelated components; in other words, a whitening process. We can then synthesize an image from a random vector and the multi-stage transform kernels through a coloring process. The generative model is a feedforward (FF) design, since no BP is used in model parameter determination. Its design complexity is significantly lower, and the whole design process is explainable. Finally, we design an FF generative model using the MNIST dataset, compare synthesis results with those obtained by state-of-the-art GAN and VAE methods, and show that the proposed generative model achieves comparable performance.
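
One whitening/coloring stage of such a pipeline can be reproduced with plain PCA: derive the transform kernels from the input covariance, whiten the data into uncorrelated components, then color fresh random vectors back into the data space to synthesize new samples. The toy "images" below are random correlated vectors, an assumption standing in for MNIST digits; the paper cascades several such stages.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy correlated "images": 100 samples of 16 correlated pixels.
mix = rng.normal(size=(16, 16))
X = rng.normal(size=(100, 16)) @ mix
mean = X.mean(0)

# PCA kernels derived from the covariance of the inputs (one stage shown).
cov = np.cov(X - mean, rowvar=False)
vals, vecs = np.linalg.eigh(cov)
vals, vecs = vals[::-1], vecs[:, ::-1]        # descending eigenvalue order

# Whitening: correlated pixels -> uncorrelated, unit-variance components.
Z = (X - mean) @ vecs / np.sqrt(vals)

# Coloring: synthesize new samples from random spectral vectors.
z_new = rng.normal(size=(5, 16))
X_new = z_new * np.sqrt(vals) @ vecs.T + mean
```

No backpropagation appears anywhere: the "model parameters" are just the eigenvectors and eigenvalues of the input covariance, which is what makes the design feedforward and explainable.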

A Model-Centric Analysis of Openness, Replication, and Reproducibility

The literature on the reproducibility crisis presents several putative causes for the proliferation of irreproducible results, including HARKing, p-hacking and publication bias. Without a theory of reproducibility, however, it is difficult to determine whether these putative causes can explain most irreproducible results. Drawing from an historically informed conception of science that is open and collaborative, we identify the components of an idealized experiment and analyze these components as a precursor to develop such a theory. Openness, we suggest, has long been intuitively proposed as a solution to irreproducibility. However, this intuition has not been validated in a theoretical framework. Our concern is that the under-theorizing of these concepts can lead to flawed inferences about the (in)validity of experimental results or integrity of individual scientists. We use probabilistic arguments and examine how openness of experimental components relates to reproducibility of results. We show that there are some impediments to obtaining reproducible results that precede many of the causes often cited in literature on the reproducibility crisis. For example, even if erroneous practices such as HARKing, p-hacking, and publication bias were absent at the individual and system level, reproducibility may still not be guaranteed.

Adversarial Learning-Based On-Line Anomaly Monitoring for Assured Autonomy

The paper proposes an on-line monitoring framework for continuous real-time safety/security assurance in learning-based control systems, specifically applied to an unmanned ground vehicle. We monitor the validity of mappings from sensor inputs to actuator commands with controller-focused anomaly detection (CFAM), and from actuator commands to sensor inputs with system-focused anomaly detection (SFAM). CFAM is an image-conditioned energy-based generative adversarial network (EBGAN) in which the energy-based discriminator distinguishes between proper and anomalous actuator commands. SFAM is based on an action-conditioned video prediction framework that detects anomalies between the predicted and observed temporal evolution of sensor data. We demonstrate the effectiveness of the approach on our autonomous ground vehicle for indoor environments and on the Udacity dataset for outdoor environments.

Explainable Reasoning over Knowledge Graphs for Recommendation

Incorporating knowledge graphs into recommender systems has attracted increasing attention in recent years. By exploring the interlinks within a knowledge graph, the connectivity between users and items can be discovered as paths, which provide rich and complementary information to user-item interactions. Such connectivity not only reveals the semantics of entities and relations, but also helps to comprehend a user's interests. However, existing efforts have not fully explored this connectivity to infer user preferences, especially in terms of modeling the sequential dependencies within, and the holistic semantics of, a path. In this paper, we contribute a new model named Knowledge-aware Path Recurrent Network (KPRN) that exploits the knowledge graph for recommendation. KPRN generates path representations by composing the semantics of both entities and relations. By leveraging the sequential dependencies within a path, we allow effective reasoning on paths to infer the underlying rationale of a user-item interaction. Furthermore, we design a new weighted pooling operation to discriminate the strengths of different paths in connecting a user with an item, endowing our model with a certain level of explainability. We conduct extensive experiments on two datasets about movies and music, demonstrating significant improvements over the state-of-the-art solutions Collaborative Knowledge Base Embedding and Neural Factorization Machine.
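
The weighted pooling over per-path scores can be sketched as a temperature-controlled log-sum-exp: a small temperature approaches max-pooling (one dominant path explains the recommendation), a large one approaches average-like pooling. The exact functional form and the scores below are assumptions consistent with the abstract's description, not necessarily the paper's.

```python
import numpy as np

def weighted_pool(path_scores, gamma=1.0):
    """Pool per-path scores into one user-item score via a smoothed
    log-sum-exp (computed stably by shifting out the max)."""
    s = np.asarray(path_scores, dtype=float) / gamma
    m = s.max()
    return float(gamma * (m + np.log(np.exp(s - m).sum())))

# Three hypothetical paths connecting a user and an item, already scored
# by the path encoder.
scores = [2.0, 0.5, -1.0]
pooled = weighted_pool(scores, gamma=0.1)   # ~max: the best path dominates
```

The per-path contribution to the pooled score is what gives the model its explainability: the softmax of `scores / gamma` tells us which path drove the recommendation.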

Recent Research Advances on Interactive Machine Learning

Interactive Machine Learning (IML) is an iterative learning process that tightly couples a human with a machine learner, and is widely used by researchers and practitioners to effectively solve a wide variety of real-world problems. Although recent years have witnessed the proliferation of IML in the field of visual analytics, most recent surveys either focus on a specific area of IML or summarize a visualization field that is too generic for IML. In this paper, we systematically review the recent literature on IML and classify it into a task-oriented taxonomy of our own construction. We conclude the survey with a discussion of open challenges and research opportunities that we believe are inspiring for future work in IML.

An Easy Implementation of CV-TMLE

In the world of targeted learning, the cross-validated targeted maximum likelihood estimator (CV-TMLE) \parencite{Zheng:2010aa} has a distinct advantage over TMLE \parencite{Laan:2006aa} in that one fewer condition is required of CV-TMLE in order to achieve asymptotic efficiency in nonparametric or semiparametric settings. CV-TMLE, as originally formulated, consists of averaging (usually) 10 parameter estimates (for 10-fold cross-validation), each of which is computed on a validation set separate from the one on which the initial fit was trained. The targeting step is usually performed as a pooled regression over all validation folds, but in each fold we separately evaluate any means as well as the parameter estimate. One nice thing about CV-TMLE is that we average 10 plug-in estimates, so the plug-in property of respecting the natural parameter bounds is preserved. Our adjustment of this procedure also preserves the plug-in characteristic and avoids the Donsker condition. The advantage of our procedure is that the implementation of the targeting is identical to that of a regular TMLE, once all the validation-set initial predictions have been formed. In short, we stack the validation-set predictions and proceed as if we had a regular TMLE, which is not necessarily quite a plug-in estimator on each fold but overall performs asymptotically the same and might have a slight advantage, a subject for future research. In the case of the average treatment effect, the treatment-specific mean, and the mean outcome under a stochastic intervention, the procedure coincides exactly with the originally formulated CV-TMLE with a pooled regression for the targeting.
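
The stacking idea is easy to demonstrate for the simplest parameter, a mean outcome: form cross-validated initial predictions fold by fold, stack all validation-set predictions, and run a single pooled targeting (fluctuation) step exactly as for a regular TMLE. The fold learner below is a trivial stand-in (the training-fold mean) and the intercept-only logistic fluctuation is a deliberate simplification of the clever-covariate regression.

```python
import numpy as np

rng = np.random.default_rng(7)
n, K = 200, 10
y = rng.binomial(1, 0.4, size=n).astype(float)
folds = np.arange(n) % K

# Cross-validated initial predictions: for each fold, "train" on the other
# folds (here just their mean, standing in for a real learner) and predict
# on the held-out validation fold.
pred = np.empty(n)
for k in range(K):
    pred[folds == k] = y[folds != k].mean()

# Stack all validation-set predictions and run one pooled targeting step,
# as for a regular TMLE: a one-parameter logistic fluctuation, fit by Newton.
logit = np.log(pred / (1 - pred))
eps = 0.0
for _ in range(50):
    p = 1 / (1 + np.exp(-(logit + eps)))
    grad, hess = (y - p).sum(), (p * (1 - p)).sum()
    eps += grad / hess

estimate = (1 / (1 + np.exp(-(logit + eps)))).mean()   # targeted plug-in mean
```

For this toy parameter the fluctuation solves the score equation sum(y - p) = 0, so the targeted plug-in estimate matches the empirical mean, as it should.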

Estimation of Dimensions Contributing to Detected Anomalies with Variational Autoencoders

Anomaly detection using dimensionality reduction has been an essential technique for monitoring multidimensional data. Although deep learning-based methods have been well studied for their remarkable detection performance, their interpretability is still a problem. In this paper, we propose a novel algorithm for estimating the dimensions contributing to detected anomalies by using variational autoencoders (VAEs). Our algorithm is based on an approximate probabilistic model that accounts for the presence of anomalies in the data; by maximizing the log-likelihood, we estimate which dimensions contribute to determining that a data point is an anomaly. Experimental results on benchmark datasets show that our algorithm extracts the contributing dimensions more accurately than baseline methods.
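
Per-dimension attribution from a reconstruction model can be sketched in its simplest form: score each dimension by its negative Gaussian log-likelihood under the decoder's output distribution and rank dimensions by that score. The decoder outputs below are assumed values standing in for a trained VAE, and this ranking is a simplification of the paper's likelihood-maximizing estimator.

```python
import numpy as np

# Stand-in for a trained VAE decoder's output on a test point: per-dimension
# reconstruction mean and variance (assumed values for illustration).
x = np.array([0.1, -0.2, 5.0, 0.0])     # dimension 2 is the anomalous one
recon_mu = np.zeros(4)
recon_var = np.full(4, 0.25)

# Per-dimension negative Gaussian log-likelihood: large values mark the
# dimensions that make the point look anomalous to the model.
nll = 0.5 * (np.log(2 * np.pi * recon_var) + (x - recon_mu) ** 2 / recon_var)
contributing = np.argsort(nll)[::-1]     # dimensions ranked by contribution
```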

Differentiating Concepts and Instances for Knowledge Graph Embedding

Concepts, which represent a group of different instances sharing common properties, are essential information in knowledge representation. Most conventional knowledge embedding methods encode both entities (concepts and instances) and relations as vectors in a low-dimensional semantic space equally, ignoring the difference between concepts and instances. In this paper, we propose a novel knowledge graph embedding model named TransC that differentiates concepts from instances. Specifically, TransC encodes each concept in a knowledge graph as a sphere and each instance as a vector in the same semantic space. We use the relative positions to model the relations between concepts and instances (i.e., instanceOf), and the relations between concepts and sub-concepts (i.e., subClassOf). We evaluate our model on both link prediction and triple classification tasks on a dataset based on YAGO. Experimental results show that TransC outperforms state-of-the-art methods and captures the semantic transitivity of the instanceOf and subClassOf relations. Our code and datasets can be obtained from h…/
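
The sphere-versus-vector geometry yields a particularly simple instanceOf test: an instance vector should lie inside the concept sphere, so the distance from the vector to the sphere center minus the radius serves as a score (nonpositive means the relation holds). The 3-d embeddings below are hypothetical, not trained TransC parameters.

```python
import numpy as np

def instance_of_score(v, center, radius):
    """TransC-style instanceOf score: distance of the instance vector to
    the concept center minus the sphere radius. Nonpositive => inside the
    sphere => the instanceOf relation holds."""
    return float(np.linalg.norm(v - center) - radius)

# Hypothetical embeddings: a concept sphere and two candidate instances.
animal_center, animal_radius = np.array([1.0, 0.0, 0.0]), 0.5
cat = np.array([1.1, 0.1, 0.0])    # close to the center -> inside
car = np.array([-2.0, 0.0, 0.0])   # far away -> outside
```

Semantic transitivity falls out of the geometry: subClassOf holds when one sphere is nested inside another, so a vector inside the inner sphere is automatically inside the outer one.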

A Review for Weighted MinHash Algorithms

Data similarity (or distance) computation is a fundamental research topic which underpins many high-level applications based on similarity measures in machine learning and data mining. However, in large-scale real-world scenarios, exact similarity computation has become daunting due to the ‘3V’ nature (volume, velocity and variety) of big data. In such cases, hashing techniques have been shown, in both theory and practice, to conduct similarity estimation efficiently. Currently, MinHash is a popular technique for efficiently estimating the Jaccard similarity of binary sets, and weighted MinHash generalizes it to estimate the generalized Jaccard similarity of weighted sets. This review focuses on categorizing and discussing the existing weighted MinHash algorithms. We mainly categorize them into quantization-based approaches, ‘active index’-based ones, and others, and show the evolution and inherent connections of the weighted MinHash algorithms, from the integer weighted MinHash algorithms to the real-valued ones (particularly the Consistent Weighted Sampling scheme). We have also developed a Python toolbox for the algorithms and released it on our GitHub. Using the toolbox, we conduct a comprehensive experimental comparison of the standard MinHash algorithm and the weighted MinHash algorithms.
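
The core unweighted MinHash estimator that the review builds on fits in a few lines: for each of k hash functions, record the minimum hash over a set's elements; the fraction of positions where two signatures agree is an unbiased estimate of the Jaccard similarity, since the probability that two sets share the same minimizing element equals |A∩B|/|A∪B|. The salted-tuple hashing below is an illustrative stand-in for a proper hash family.

```python
import random

def minhash_signature(s, n_hashes=200, seed=0):
    """MinHash signature of a set: one min-hash per (simulated) hash function."""
    rnd = random.Random(seed)
    # One hash function per signature row, simulated by salting tuple hashes
    # (an illustrative stand-in for a universal hash family).
    salts = [rnd.getrandbits(32) for _ in range(n_hashes)]
    return [min(hash((salt, x)) for x in s) for salt in salts]

def estimate_jaccard(sig_a, sig_b):
    # P(min-hash collision) equals the Jaccard similarity of the two sets.
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

A = set(range(0, 80))
B = set(range(40, 120))        # true Jaccard = 40 / 120 = 1/3
est = estimate_jaccard(minhash_signature(A), minhash_signature(B))
```

Weighted MinHash schemes such as Consistent Weighted Sampling replace the uniform minimum with draws whose collision probability matches the generalized Jaccard similarity of weighted sets.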

Adversarial Learning of Label Dependency: A Novel Framework for Multi-class Classification

Recent work has shown that exploiting relations between labels improves the performance of multi-label classification. We propose a novel framework based on generative adversarial networks (GANs) to model label dependency. The discriminator learns to model label dependency by discriminating between real and generated label sets. To fool the discriminator, the classifier, acting as the generator, learns to generate label sets with dependencies close to those of real data. Extensive experiments and comparisons on two large-scale image classification benchmark datasets (MS-COCO and NUS-WIDE) show that the discriminator improves the generalization ability of different kinds of models.

Gauges, Loops, and Polynomials for Partition Functions of Graphical Models

We suggest a new methodology for analysis and approximate computation of the Partition Function (PF) of Graphical Models (GM) in the Normal Factor Graph representation, combining the gauge transformation (GT) technique from (Chertkov, Chernyak 2006) with the technique developed in (Straszak, Vishnoi 2017) based on recent progress in the field of real stable polynomials. We show that GTs (while keeping the PF invariant) allow representation of the PF as a sum of polynomials of variables associated with edges of the graph. A special belief propagation (BP) gauge makes a singled-out term of the series least sensitive to variations, resulting in the loop series for the PF introduced in (Chertkov, Chernyak 2006). In addition to restating the known results in polynomial form, we also discover a new relation between the computationally tractable BP term (the singled-out term of the loop series evaluated at the BP gauge) and the PF: sequential application of differential operators, each associated with an edge of the graph, to the BP polynomial results in the PF. Each term in the sequence corresponds to the BP polynomial of a modified GM derived by contraction of an edge. Even though the complexity of computing factors in the derived GMs grows exponentially with the number of eliminated edges, the polynomials associated with the new factors remain real stable if the original factors have this property. Moreover, we show that the BP estimates of the PF do not decrease with edge eliminations, overall proving that the BP solution of the original GM gives a lower bound on the PF. The proof extends the results of (Straszak, Vishnoi 2017) from bipartite to general graphs; however, it is limited to the case when the BP solution is feasible.

NExUS: Bayesian simultaneous network estimation across unequal sample sizes
Relation of Web Service Orchestration, Abstract Process, Web Service and Choreography
Towards time-varying proximal dynamics in Multi-Agent Network Games
HSD-CNN: Hierarchically self decomposing CNN architecture using class specific filter sensitivity analysis
An Initial Attempt of Combining Visual Selective Attention with Deep Reinforcement Learning
Learning Groupwise Scoring Functions Using Deep Neural Networks
Optimal Spectral Initialization for Signal Recovery with Applications to Phase Retrieval
About the ordinances of the vectors of the $n$-dimensional Boolean cube in accordance with their weights
When Locally Linear Embedding Hits Boundary
Faster sublinear approximations of $k$-cliques for low arboricity graphs
Dynamics of the Kuramoto-Sakaguchi Oscillator Network with Asymmetric Order Parameter
Generating subgraphs in chordal graphs
Blockchain for Economically Sustainable Wireless Mesh Networks
Recognizing generating subgraphs revisited
A Progressively-trained Scale-invariant and Boundary-aware Deep Neural Network for the Automatic 3D Segmentation of Lung Lesions
Statistical modelling of conidial discharge of entomophthoralean fungi using a newly discovered Pandora species
Approximation Algorithms for Graph Burning
Multi-Source Neural Variational Inference
Deep Learning Framework for Pedestrian Collision Avoidance System (PeCAS)
Learning with tree-based tensor formats
A 3-D Projection Model for X-ray Dark-field Imaging
Time-interval balancing in multi-processor scheduling of composite modular jobs (preliminary description)
Three-dimensional double helical DNA structure directly revealed from its X-ray fiber diffraction pattern by iterative phase retrieval
Analysis vs Synthesis – An Investigation of (Co)sparse Signal Models on Graphs
Bridging Network Embedding and Graph Summarization
Machine Learning with Abstention for Automated Liver Disease Diagnosis
On constrained optimization problems solved using CDT
Simultaneous Ruin Probability for Two-Dimensional Brownian and Lévy Risk Models
Thompson Sampling for Pursuit-Evasion Problems
Capital Structure and Speed of Adjustment in U.S. Firms. A Comparative Study in Microeconomic and Macroeconomic Conditions – A Quantile Regression Approach
Managing App Install Ad Campaigns in RTB: A Q-Learning Approach
Unifying Gaussian LWF and AMP Chain Graphs to Model Interference
Semi-supervised Deep Representation Learning for Multi-View Problems
Multiple Subspace Alignment Improves Domain Adaptation
Massive MIMO-based Localization and Mapping Exploiting Phase Information of Multipath Components
Product Title Refinement via Multi-Modal Generative Adversarial Learning
Subsampling to Enhance Efficiency in Input Uncertainty Quantification
On a Pólya functional for rhombi, isosceles triangles, and thinning convex sets
SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient
External optimal control of nonlocal PDEs
Agent Embeddings: A Latent Representation for Pole-Balancing Networks
Constant payoff in zero-sum stochastic games
Deep Learning Based Transmitter Identification using Power Amplifier Nonlinearity
The Poisson random effect model for experience ratemaking: limitations and alternative solutions
Adaptive Hessian Estimation Based Extremum Localization
Robustness of link prediction under network attacks
Sequence-Level Knowledge Distillation for Model Compression of Attention-based Sequence-to-Sequence Speech Recognition
M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network
Road Damage Detection And Classification In Smartphone Captured Images Using Mask R-CNN
Identification of Internal Faults in Indirect Symmetrical Phase Shift Transformers Using Ensemble Learning
On the length of the longest consecutive switches
Variational Community Partition with Novel Network Structure Centrality Prior
Visual Saliency Maps Can Apply to Facial Expression Recognition
An efficient ADMM algorithm for high dimensional precision matrix estimation via penalized quadratic loss
Learning Latent Dynamics for Planning from Pixels
Tractability of Konig Edge Deletion Problems
Statistical Inference for Stable Distribution Using EM algorithm
Time-changed Poisson processes of order $k$
Vectorization of hypotheses and speech for faster beam search in encoder decoder-based speech recognition
On the Performance and Convergence of Distributed Stream Processing via Approximate Fault Tolerance
A differential game on Wasserstein space. Application to weak approachability with partial monitoring
Forecasting People’s Needs in Hurricane Events from Social Network
A central limit theorem for descents and major indices in fixed conjugacy classes of $S_n$
Navigating Assistance System for Quadcopter with Deep Reinforcement Learning
Holistic Multi-modal Memory Network for Movie Question Answering
MR-RePair: Grammar Compression based on Maximal Repeats
The Hidden Shape of Stories Reveals Positivity Bias and Gender Bias
New Theoretical Bounds and Constructions of Permutation Codes under Block Permutation Metric
Learning The Invisible: A Hybrid Deep Learning-Shearlet Framework for Limited Angle Computed Tomography
Learning Personalized End-to-End Goal-Oriented Dialog
Streaming Hardness of Unique Games
Matrix Product Operator Restricted Boltzmann Machines
140 Gbaud On-Off Keying Links in C-Band for Short-Reach Optical Interconnects
Subspace Packings
Different Power Adaption Methods on Fluctuating Two-Ray Fading Channels
Forming Probably Stable Communities with Limited Interactions
Depth Image Upsampling based on Guided Filter with Low Gradient Minimization
Fine-tuning of Language Models with Discriminator
Importance Weighted Evolution Strategies
Embedding partial Latin squares in Latin squares with many mutually orthogonal mates
Newton: A Language for Describing Physics
Parameterized Synthetic Image Data Set for Fisheye Lens
Another Note on Intervals in the Hales-Jewett Theorem
Reciprocal and Positive Real Balanced Truncations for Model Order Reduction of Descriptor Systems
Angry or Climbing Stairs? Towards Physiological Emotion Recognition in the Wild
Extending Pretrained Segmentation Networks with Additional Anatomical Structures
Massive MIMO with a Generalized Channel Model: Fundamental Aspects
Blind Over-the-Air Computation and Data Fusion via Provable Wirtinger Flow
Hallucinating very low-resolution and obscured face images
Global sensitivity analysis for optimization with variable selection
Combining Learned Lyrical Structures and Vocabulary for Improved Lyric Generation
Modeling Text Complexity using a Multi-Scale Probit
Not Just Depressed: Bipolar Disorder Prediction on Reddit
Surface area deviation between smooth convex bodies and polytopes
Properties of biclustering algorithms and a novel biclustering technique based on relative density
Detection of REM Sleep Behaviour Disorder by Automated Polysomnography Analysis
Design of Low Complexity GFDM Transceiver
Path integral Monte Carlo method for the quantum anharmonic oscillator
A Deep Ensemble Framework for Fake News Detection and Classification
Towards Adversarial Denoising of Radar Micro-Doppler Signatures
Learning Segmentation Masks with the Independence Prior
Bias Scheme Reducing Transient Currents and Speeding up Read Operations for 3-D Cross Point PCM
Joint Probability Distribution of Prediction Errors of ARIMA
A Generalization of the Matroid Polytope Theorem to Local Forest Greedoids
Pseudofiniteness in Hrushovski Constructions
Variational and Optimal Control Approaches for the Second-Order Herglotz Problem on Spheres
Classifying Patent Applications with Ensemble Methods
CUNI System for the WMT18 Multimodal Translation Task
The random walk penalised by its range in dimensions $d\geq 3$
Weyl-Mahonian Statistics for Weighted Flags of Type A-D
Analyzing deep CNN-based utterance embeddings for acoustic model adaptation
Inductively pierced codes and neural toric ideals
Input Combination Strategies for Multi-Source Transformer Decoder
End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification
Optimization of triangular networks with spatial constraints
On an Annihilation Number Conjecture
Universal Marginalizer for Amortised Inference and Embedding of Generative Models
Characterizing $(0,\pm 1)$-matrices with only even-rank principal submatrices in terms of skew-symmetry
Mutual Information of Wireless Channels and Block-Jacobi Ergodic Operators
Simple FPGA routing graph compression
Gaussian Auto-Encoder
Learning Representations of Missing Data for Predicting Patient Outcomes
Sliding Window Temporal Graph Coloring
Deep-learning the Latent Space of Light Transport
Markov Property in Generative Classifiers
The Equilibrium States of Large Networks of Erlang Queues

Magister Dixit

“Understanding correlation, multivariate regression and all aspects of massaging data together to look at it from different angles for use in predictive and prescriptive modeling is the backbone knowledge that’s really step one of revealing intelligence…. If you don’t have this, all the data collection and presentation polishing in the world is meaningless.” Mitchell A. Sanders ( August 27, 2013 )

Book Memo: “From Human Attention to Computational Attention”

A Multidisciplinary Approach
This both accessible and exhaustive book will help to improve modeling of attention and to inspire innovations in industry. It introduces the study of attention and focuses on attention modeling, addressing such themes as saliency models, signal detection and different types of signals, as well as real-life applications. The book is truly multi-disciplinary, collating work from psychology, neuroscience, engineering and computer science, amongst other disciplines. What is attention? We all pay attention every single moment of our lives. Attention is how the brain selects and prioritizes information. The study of attention has become incredibly complex and divided: this timely volume assists the reader by drawing together work on the computational aspects of attention from across the disciplines. Those working in the field as engineers will benefit from this book’s introduction to the psychological and biological approaches to attention, and neuroscientists can learn about engineering work on attention. The work features practical reviews and chapters that are quick and easy to read, as well as chapters which present deeper, more complex knowledge. Everyone whose work relates to human perception, to image, audio and video processing will find something of value in this book, from students to researchers and those in industry.

Document worth reading: “Deep Reinforcement Learning: An Overview”

In recent years, a specific machine learning method called deep learning has gained huge attention, as it has obtained astonishing results in broad applications such as pattern recognition, speech recognition, computer vision, and natural language processing. Recent research has also shown that deep learning techniques can be combined with reinforcement learning methods to learn useful representations for problems with high-dimensional raw data input. This chapter reviews the recent advances in deep reinforcement learning with a focus on the most used deep architectures, such as autoencoders, convolutional neural networks and recurrent neural networks, which have successfully been combined with the reinforcement learning framework. Deep Reinforcement Learning: An Overview

R Packages worth a look

Visualization of Subgroups for Decision Trees (visTree)
Provides a visualization for characterizing subgroups defined by a decision tree structure. The visualization simplifies the ability to interpret indiv …

A ‘ggplot2’-Plot of Composition of Solvency II SCR: SF and IM (ggsolvencyii)
An implementation of ‘ggplot2’-methods to present the composition of Solvency II Solvency Capital Requirement (SCR) as a series of concentric circle-pa …

Inference and Learning in Stochastic Automata (SAutomata)
Machine learning provides algorithms that can learn from data and make inferences or predictions. Stochastic automata is a class of input/output device …

Distilled News

29 Statistical Concepts Explained in Simple English – Part 3

This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, decision trees, ensembles, correlation, Python, R, Tensorflow, SVM, data reduction, feature selection, experimental design, cross-validation, model fitting, and many more.

Windows Clipboard Access with R

The Windows clipboard is a quick way to get data in and out of R. How can we exploit this feature to accomplish our basic data exploration needs, and when might its use be inappropriate? Read on.

Explaining Black-Box Machine Learning Models – Code Part 2: Text classification with LIME

This is code that will accompany an article that will appear in a special edition of a German IT magazine. The article is about explaining black-box machine learning models.

Building a Repository of Alpine-based Docker Images for R, Part II

In the first article of this series, I built an Alpine-based Docker image with R base packages from Alpine’s native repositories, as well as one image with R compiled from source code. The images are hosted on Docker Hub in the velaco/alpine-r repository. The next step was either to address the fatal errors I found while testing the installation of R or to proceed with building an image with Shiny Server. The logical choice would have been to pass all tests with R’s base packages before proceeding, but I was a bit impatient and wanted to go through the process of building a Shiny Server as soon as possible. After two weeks of trial and error, I finally have a container that can start the server and run Shiny apps.

Easy time-series prediction with R: a tutorial with air traffic data from Lux Airport

In this blog post, I will show you how you can quickly and easily forecast a univariate time series. I am going to use data from the EU Open Data Portal on air passenger transport. You can find the data here. I downloaded the data in the TSV format for Luxembourg Airport, but you could repeat the analysis for any airport.

AI for Good: slides and notebooks from the ODSC workshop

Last week at the ODSC West conference, I was thrilled with the interest in my Using AI for Good workshop: it was wonderful to find a room full of data scientists eager to learn how data science and artificial intelligence can be used to help people and the planet. The workshop was focused around projects from the Microsoft AI for Good program. I’ve included some details about the projects below, and you can also check out the workshop slides and the accompanying Jupyter Notebooks that demonstrate the underlying AI methods used in the projects.

Installing RStudio & Shiny Servers

I did a remote install of Ubuntu Server today. This was somewhat novel because it’s the first time that I have not had physical access to the machine I was installing on. The server install went very smoothly indeed.

Interchanging RMarkdown and ‘spinnable’ R

Behaviour Analysis using Graphext

Why do people act the way they do? Why do they buy products, quit their jobs, or change partners? Many of these motives can be inferred from people’s behaviour, and these behaviours are reflected in data. Companies have lots of data about their clients, employees, suppliers… Let’s put that data to work to do some smart data discovery and see what we can learn.

Job Title Analysis in python and NLTK

A job title indicates a lot about someone’s role and responsibilities. It says if they manage a team, if they control a budget, and their level of specialization. Knowing this is useful when automating business development or client outreach. For example, a company that sells voice recognition software may want to send messages to:
• CTOs and technical directors informing them of the price and benefits of using the voice recognition software.
• Potential investors or advisors, inviting them to see the company’s potential market size.
• Founders and engineers instructing them how to use the software.
Training software to classify job titles is a multi-class text classification problem. For this task, we can use the Python Natural Language Toolkit (NLTK) and Bayesian classification.
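As a sketch of the Bayesian approach, here is a minimal hand-rolled Naive Bayes classifier (rather than NLTK's own `NaiveBayesClassifier`); the titles and role labels are invented for illustration:

```python
import math
from collections import Counter, defaultdict

# Tiny hypothetical training set: job title -> role bucket.
train = [
    ("chief technology officer", "technical_director"),
    ("cto", "technical_director"),
    ("vp of engineering", "technical_director"),
    ("software engineer", "engineer"),
    ("machine learning engineer", "engineer"),
    ("founder", "founder"),
    ("co-founder and ceo", "founder"),
]

label_counts = Counter(label for _, label in train)
word_counts = defaultdict(Counter)  # label -> word -> count
vocab = set()
for title, label in train:
    for word in title.split():
        word_counts[label][word] += 1
        vocab.add(word)

def classify(title):
    """Naive Bayes over title words, with Laplace smoothing."""
    words = title.lower().split()
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # log P(label) + sum over words of log P(word | label)
        score = math.log(label_counts[label] / len(train))
        total = sum(word_counts[label].values())
        for w in words:
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

With a real dataset you would add better features (seniority keywords, department terms) rather than raw words alone.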

Doing Machine Learning the Uber Way: Five Lessons From the First Three Years of Michelangelo

Uber has been one of the most active contributors to open source machine learning technologies in the last few years. While companies like Google or Facebook have focused their contributions on new deep learning stacks like TensorFlow, Caffe2 or PyTorch, the Uber engineering team has really focused on tools and best practices for building machine learning at scale in the real world. Technologies such as Michelangelo, Horovod, PyML and Pyro are some examples of Uber’s contributions to the machine learning ecosystem. With only a small group of companies developing large-scale machine learning solutions, the lessons and guidance from Uber become even more valuable for machine learning practitioners (I certainly learned a lot and have regularly written about Uber’s efforts).

Before you start learning Python, choose the IDE that suits you the best. As Python is one of the leading programming languages, there is a multitude of IDEs available. So the question is, ‘Which is the best Python IDE for Data Science?’

Introduction to Image Recognition: Building a Simple Digit Detector

Digit recognition is not something difficult or advanced. It is a kind of ‘Hello world!’ program – not that cool, but it’s exactly where you start. So I decided to share my work and at the same time refresh my knowledge – it’s been a long time since I played with images.

The 2×2 Data Science Skills Matrix that Harvard Business Review got completely wrong!

Data Science is the current buzzword in the market. Every company at the moment is looking to hire data science professionals to solve data problems they themselves may not even be aware of yet. Machine learning has taken the industry by storm, and we have a bunch of self-taught data scientists in the market. Since data science is an altogether different universe, it is very difficult to set priorities on what to learn and what not to. So the Harvard Business Review published an article on what you, as a company or individual, should give importance to. Let’s have a look.

Decision Tree in Machine Learning

A decision tree is a flowchart-like structure in which each internal node represents a test on a feature (e.g. whether a coin flip comes up heads or tails), each leaf node represents a class label (the decision taken after computing all features), and branches represent conjunctions of features that lead to those class labels. The paths from root to leaf represent classification rules. The diagram below illustrates the basic flow of a decision tree for decision making, with labels Rain (Yes) and No Rain (No).
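The flowchart structure can be sketched as nested nodes; the weather features and thresholds below are invented purely for illustration:

```python
# A hand-built tree: internal nodes test one feature, leaves carry a label.
tree = {
    "feature": "humidity", "threshold": 75,
    "left":  {"label": "No Rain (No)"},                 # humidity <= 75
    "right": {"feature": "wind_kmh", "threshold": 20,   # humidity > 75
              "left":  {"label": "Rain (Yes)"},
              "right": {"label": "No Rain (No)"}},
}

def predict(node, sample):
    # Walk from the root, testing one feature per internal node,
    # until a leaf (class label) is reached.
    while "label" not in node:
        branch = "left" if sample[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]
```

Each root-to-leaf path read off this structure is one classification rule, e.g. "humidity > 75 and wind <= 20 km/h implies Rain (Yes)".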

Using Bash for Data Pipelines

Using bash scripts to create data pipelines is incredibly useful as a data scientist. The possibilities with these scripts are almost endless, but here, I will be going through a tutorial on a very basic bash script to download data and count the number of rows and columns in a dataset. Once you get the hang of using bash scripts, you have the basics for creating IoT devices and much, much more, as this all works with a Raspberry Pi. One cool project you could use this for is to download all of your Twitter messages using the Twitter API and then predict whether a message from a user is spam. It could run on a Raspberry Pi server from your room! That is a little out of the scope of this tutorial, though, so we will begin by looking at a dataset of car speeds in San Francisco!

From Scratch: Bayesian Inference, Markov Chain Monte Carlo and Metropolis Hastings, in python

I’ve been an avid reader of Medium/Towards Data Science for a while now, and I’ve enjoyed the diversity and openness of the subjects tackled by many authors. I wish to contribute to this awesome community by creating my own series of articles, ‘From Scratch’, where I explain and implement/build anything from scratch (not necessarily in data science – you need only propose!). Why do I want to do that? In the current state of things, we are in possession of such powerful libraries and tools that can do a lot of the work for us. Most experienced authors are well aware of the complexities of implementing such tools. As such, they make use of them to provide short, accessible and to-the-point reads to users from diverse backgrounds. But in many of the articles that I enjoyed, I failed to understand how this or that algorithm is implemented in practice. What are their limitations? Why were they invented? When should they be used?
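In the from-scratch spirit, here is a minimal random-walk Metropolis-Hastings sampler in plain Python (a sketch, not the article's own code), targeting a standard normal so the right answer is known in advance:

```python
import math
import random

random.seed(0)

def metropolis_hastings(log_target, n_samples, x0=0.0, step=1.0):
    """Random-walk Metropolis: propose x' ~ Normal(x, step) and accept
    with probability min(1, target(x') / target(x))."""
    samples, x = [], x0
    log_p = log_target(x)
    for _ in range(n_samples):
        proposal = x + random.gauss(0.0, step)
        log_p_new = log_target(proposal)
        # Accept/reject; working in log space avoids numerical underflow.
        if random.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
        samples.append(x)  # on rejection the current x is repeated
    return samples

# Target: standard normal (up to a constant), so true mean 0, variance 1.
draws = metropolis_hastings(lambda x: -0.5 * x * x, 20000)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
```

The chain only needs the target density up to a normalizing constant, which is exactly what makes it useful for Bayesian posteriors.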

Reverse Engineering Backpropagation

Sometimes starting with examples might be a faster way to learn something than going theory-first before getting into detailed examples. That’s what I will attempt to do here, using an example from the official PyTorch tutorial that implements backpropagation, and reverse engineer the math and subsequently the concept behind it.
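To make the chain-rule bookkeeping concrete before touching PyTorch, here is a sketch of the forward and backward passes for a single linear neuron with squared-error loss (the numbers are arbitrary), checked against a finite-difference gradient:

```python
# Forward pass for a single linear neuron y = w*x + b with squared error.
x, t = 2.0, 5.0   # input and target
w, b = 1.5, 0.5   # parameters

y = w * x + b          # forward: prediction
loss = (y - t) ** 2    # forward: loss

# Backward pass, applying the chain rule one step at a time:
dloss_dy = 2 * (y - t)   # d/dy (y - t)^2
dy_dw, dy_db = x, 1.0    # d/dw (w*x + b) and d/db (w*x + b)
grad_w = dloss_dy * dy_dw
grad_b = dloss_dy * dy_db

# Sanity check: numerically approximate dloss/dw with central differences.
eps = 1e-6
numeric_grad_w = (((w + eps) * x + b - t) ** 2
                  - ((w - eps) * x + b - t) ** 2) / (2 * eps)
```

This is exactly what `loss.backward()` automates in PyTorch: each operation records its local derivative, and the chain rule multiplies them back to the parameters.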

ML Intro 5: One hot Encoding, Cyclic Representations, Normalization

This post follows Machine Learning Introduction 4. In the previous post, we described Machine Learning for marketing attribution. In this post, we will illuminate some of the details we ignored in that section. We will inspect a dataset about Marketing Attribution, perform one-hot encoding of our brands, manipulate our one-hot encoding to learn custom business insights, normalize our features, inspect our model inputs once this is all done, and interpret our outputs in detail.
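The three transformations discussed can be sketched in plain Python (the category list and period values below are hypothetical):

```python
import math

def one_hot(value, categories):
    """One column per category; 1.0 in the matching slot, 0.0 elsewhere."""
    return [1.0 if value == c else 0.0 for c in categories]

def cyclic(value, period):
    """Encode a cyclic quantity (hour of day, month, ...) as a point on
    the unit circle, so that period-1 and 0 end up close together."""
    angle = 2 * math.pi * value / period
    return [math.sin(angle), math.cos(angle)]

def min_max_normalize(xs):
    """Rescale a feature column to the [0, 1] range."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]
```

For example, `one_hot("brand_b", ["brand_a", "brand_b", "brand_c"])` gives one indicator column per brand, while `cyclic(23, 24)` places 11 pm next to midnight instead of 23 units away.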

Machine Learning Bit by Bit – Multivariate Gradient Descent

In this post, we’re going to extend our understanding of gradient descent and apply it to a multivariate function.
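The update rule can be sketched on a two-variable bowl-shaped function (my own toy example, not the post's): both weights are updated simultaneously against their partial derivatives.

```python
# Minimize f(w1, w2) = (w1 - 3)^2 + (w2 + 1)^2; the minimum is at (3, -1).
def grad(w1, w2):
    # Partial derivatives of f with respect to w1 and w2.
    return 2 * (w1 - 3), 2 * (w2 + 1)

w1, w2, alpha = 0.0, 0.0, 0.1   # starting point and learning rate
for _ in range(200):
    g1, g2 = grad(w1, w2)
    w1 -= alpha * g1   # simultaneous update of both weights
    w2 -= alpha * g2
```

Each iteration shrinks the distance to the minimum by a constant factor here, which is why a couple of hundred steps suffice on this convex bowl.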

Machine Learning Bit by Bit – Univariate Gradient Descent

This series aims to share my own endeavour to understand, explore and experiment with topics in machine learning. Mathematical notation in this particular post and the next one on multivariate gradient descent will be mostly in line with that used in the Machine Learning course by Andrew Ng. Understanding and being able to play around with the maths behind machine learning is key to understanding it. It allows us to choose the most suitable algorithms and tailor them according to the problems we want to solve. However, I have encountered many tutorials and lectures where the equations used are simply impenetrable. All the symbols look cryptic and there seems to be a huge gap between what is being explained and those equations. I just can’t connect all the dots. Unfortunately, more often than not, maths hinders understanding when some knowledge is assumed and important steps are skipped. Therefore, wherever possible, I will expand the equations and avoid shortcuts, so that everyone can follow how we get from the left side of an equation to the right.

Distilled News

Imagining an Engineer: On GAN-Based Data Augmentation Perpetuating Biases

The use of synthetic data generated by Generative Adversarial Networks (GANs) has become quite a popular method to do data augmentation for many applications. While practitioners celebrate this as an economical way to get more synthetic data that can be used to train downstream classifiers, it is not clear that they recognize the inherent pitfalls of this technique. In this paper, we aim to exhort practitioners against deriving any false sense of security against data biases based on data augmentation. To drive this point home, we show that starting with a dataset consisting of head-shots of engineering researchers, GAN-based augmentation ‘imagines’ synthetic engineers, most of whom have masculine features and white skin color (inferred from a human subject study conducted on Amazon Mechanical Turk). This demonstrates how biases inherent in the training data are reinforced, and sometimes even amplified, by GAN-based data augmentation; it should serve as a cautionary tale for the lay practitioners.

Relation extraction with weakly supervised learning based on process-structure-property-performance reciprocity

In this study, we develop a computer-aided material design system to represent and extract knowledge related to material design from natural language texts. A machine learning model is trained on a text corpus weakly labeled by minimal annotated relationship data (~100 labeled relationships) to extract knowledge from scientific articles. The knowledge is represented by relationships between scientific concepts, such as {annealing, grain size, strength}. The extracted relationships are represented as a knowledge graph formatted according to design charts, inspired by the process-structure-property-performance (PSPP) reciprocity. The design chart provides an intuitive view of the effect of processes on properties and of prospective processes to achieve certain desired properties. Our system semantically searches the scientific literature and provides knowledge in the form of a design chart, and we hope it contributes to more efficient development of new materials.

Introducing vizscorer: a bot advisor to score and improve your ggplot plots

One of the most frustrating issues I face in my professional life is the plenitude of ineffective reports generated within my company. Everywhere I look there are plenty of junk charts, like barplots showing useless 3D effects or ambiguous and crowded pie charts.

Extended Isolation Forest

This is a simple package implementation for the Extended Isolation Forest method. It is an improvement on the original Isolation Forest algorithm, which is described (among other places) in this paper, for detecting anomalies and outliers from a data point distribution. The original algorithm suffers from an inconsistency in producing anomaly scores due to its slicing operations. Even though the slicing hyperplanes are selected at random, they are always parallel to the coordinate reference frame. The shortcoming can be seen in the score maps presented in the example notebooks in this repository. In order to improve the situation, we propose an extension which allows the hyperplanes to be taken at random angles. The way in which this is done gives rise to multiple levels of extension depending on the dimensionality of the problem. For an N-dimensional dataset, Extended Isolation Forest has N levels of extension, with 0 being identical to the standard Isolation Forest and N-1 being the fully extended version. Here we provide the source code for the algorithm as well as documented example notebooks to help get started. Various visualizations are provided, such as score distributions, score maps, aggregate slicing of the domain, and tree and whole-forest visualizations. Most examples are in 2D; we present one 3D example. However, the algorithm works readily with higher-dimensional data.
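The core difference from the standard algorithm can be sketched as a single split step: instead of cutting along one coordinate axis, draw a hyperplane with a random normal vector (a random angle) and a random intercept point. This is an illustrative sketch, not the package's code:

```python
import random

random.seed(1)

def random_slope_split(points):
    """Extended-style split: a hyperplane with random normal vector n
    through a random point p inside the data's bounding box.
    A point x goes left if (x - p) . n <= 0, otherwise right."""
    dim = len(points[0])
    n = [random.gauss(0.0, 1.0) for _ in range(dim)]  # random direction
    p = [random.uniform(min(x[d] for x in points),
                        max(x[d] for x in points)) for d in range(dim)]
    left = [x for x in points
            if sum((x[d] - p[d]) * n[d] for d in range(dim)) <= 0]
    right = [x for x in points
             if sum((x[d] - p[d]) * n[d] for d in range(dim)) > 0]
    return left, right
```

The standard Isolation Forest split is the special case where `n` has exactly one nonzero coordinate; the fully extended version draws all coordinates of `n` at random, as above.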

What is Hidden in the Hidden Markov Model?

Hidden Markov Models, or HMMs, are the most common models used for dealing with temporal data. They also frequently come up in data science interviews in different guises, usually without the word HMM written over them. In such a scenario it is necessary to discern the problem as an HMM problem by knowing the characteristics of HMMs.
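One such characteristic worth knowing cold is the forward algorithm, which computes the probability of an observation sequence by summing over all hidden-state paths. A toy weather model with made-up numbers:

```python
# Hidden states: Rainy/Sunny; observations: walk/shop/clean.
states = ["Rainy", "Sunny"]
start = {"Rainy": 0.6, "Sunny": 0.4}
trans = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
         "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
        "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

def forward(observations):
    """P(observations) under the HMM, via the forward recursion."""
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = {s: start[s] * emit[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {s: emit[s][obs] * sum(alpha[p] * trans[p][s] for p in states)
                 for s in states}
    return sum(alpha.values())  # marginalize over the final hidden state
```

For a length-1 sequence this reduces to a weighted emission probability, e.g. P(walk) = 0.6*0.1 + 0.4*0.6 = 0.30, and the probabilities of all length-1 sequences sum to 1.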

How to Ensure Safety and Security During Condition Monitoring and Predictive Maintenance : a Case Study

Effective communication from machines and embedded sensors and actuators in industry is crucial to achieving industrial digitalization. Efficient remote monitoring and maintenance methodologies help to transform existing industries into Smart Factories. Monitoring and maintenance lead to the aggregation of real-time data from sensors via different existing and new industrial communication protocols. Development of a user-friendly interface allows remote Condition Monitoring (CM). Context-aware analysis of real-time and historical data provides the capability to accomplish active Predictive Maintenance (PdM). Both CM and PdM need access to the machine process data, the industrial network and the communication layer. Furthermore, the data flow between individual Cyber-Physical System (CPS) components, from the actual machine to the database or analysis engine to the final visualization, is important. Security and safety aspects at the application, communication, network and data-flow levels should be considered. This thesis presents a case study on the benefits of PdM and CM, the security and safety aspects of the system, and the current challenges and improvements. Components of the CPS ecosystem are examined to further investigate the individual components which enable predictive maintenance and condition monitoring. Additionally, the safety and security aspects of each component are analyzed, as are the current challenges and possible improvements of the PdM and CM systems and their components. Finally, based on the research, possible improvements have been proposed and validated by the researcher. For the new digital era of secure and robust PdM 4.0, these improvements are vital references.

The ultimate guide to starting AI

A step-by-step overview of how to begin your project, including advice on how to craft a wise performance metric, setting up testing criteria to overcome human bias, and more.

Self-Service Analytics and Operationalization – Why You Need Both

Get the guidebook / whitepaper for a look at how today’s top data-driven companies scale their advanced analytics & machine learning efforts.

Managing risk in machine learning

In this post, I share slides and notes from a keynote I gave at the Strata Data Conference in New York last September. As the data community begins to deploy more machine learning (ML) models, I wanted to review some important considerations.

Preview my new book: Introduction to Reproducible Science in R

I’m pleased to share Part I of my new book ‘Introduction to Reproducible Science in R’. The purpose of this book is to approach model development and software development holistically to help make science and research more reproducible. The need for such a book arose from observing some of the challenges that I’ve seen teaching graduate courses in natural language processing and machine learning, as well as training my own staff to become effective data scientists. While quantitative reasoning and mathematics are important, often I found that the primary obstacle to good data science was reproducibility and repeatability: it’s difficult to quickly reproduce someone else’s results.

A Data Lake’s Worth of Audio Datasets

At Wonder Technologies, we have spent a lot of time building Deep learning systems that understand the world through audio. From deep learning based voice extraction to teaching computers how to read our emotions, we needed to use a wide set of data to deliver APIs that worked even in the craziest sound environments. Here is a list of datasets that I found pretty useful for our research and that I’ve personally used to make my audio related models perform much better in real-world environments.

Inferential Statistics basics

Statistics is one of the most important skills required by a data scientist. There is a lot of mathematics involved in statistics and it can be difficult to grasp. So in this tutorial we are going to go through some of the concepts of statistics to learn, understand and master inferential statistics.

Hybrid Fuzzy Name Matching

My workplace works with large-scale databases that, amongst many things, contain data about people. For each person in the DB we have a unique identifier, which is composed of the person’s first name, last name and zip code. We hold ~500MM people in our DB, which can essentially contain duplicates if there is a small change in a person’s name. For example, Rob Rosen and Robert Rosen (with the same zip code) will be treated as two different people. I want to note that if we get the same person an additional time, we just update the record’s timestamp, so there is no need for that sort of deduping. In addition, I would like to give credit to my co-worker Jonathan Harel, who assisted me in the research for this project.
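A minimal sketch of one such hybrid rule (not the production system): require an exact zip-code match, then compare names with a character-level fuzzy score from the standard library. The 0.8 threshold is an invented illustration.

```python
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Character-level similarity in [0, 1]; one possible fuzzy metric."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def same_person(rec1, rec2, threshold=0.8):
    """Hypothetical dedup rule: identical zip plus fuzzy-similar names."""
    return (rec1["zip"] == rec2["zip"]
            and name_similarity(rec1["name"], rec2["name"]) >= threshold)
```

On the example from the text, "Rob Rosen" vs "Robert Rosen" scores well above 0.8, so the two records would be merged; a real pipeline would also handle nicknames and blocking to avoid comparing all ~500MM pairs.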

Predicting Probability Distributions Using Neural Networks

If you’ve been following our tech blog lately, you might have noticed we’re using a special type of neural network called a Mixture Density Network (MDN). MDNs not only predict the expected value of a target, but also the underlying probability distribution. This blogpost will focus on how to implement such a model using Tensorflow, from the ground up, including explanations, diagrams and a Jupyter notebook with the entire source code.
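Independent of the Tensorflow implementation, the quantity an MDN optimizes can be sketched in a few lines: the negative log-likelihood of the target under a mixture of Gaussians whose parameters (weights, means, standard deviations) the network would output per input. Here they are fixed numbers for illustration:

```python
import math

def gaussian_pdf(y, mu, sigma):
    """Density of a normal distribution at y."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def mixture_pdf(y, weights, mus, sigmas):
    """Density an MDN head assigns to target y: weighted sum of Gaussians."""
    return sum(w * gaussian_pdf(y, m, s)
               for w, m, s in zip(weights, mus, sigmas))

def mdn_loss(y, weights, mus, sigmas):
    # MDNs are trained by minimizing the negative log-likelihood.
    return -math.log(mixture_pdf(y, weights, mus, sigmas))
```

In the full model, a softmax layer produces the weights and an exponential (or similar positive transform) produces the sigmas, and this loss is minimized over the training set.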

Why and how to Cross Validate a Model?

Once we are done with training our model, we can’t just assume that it is going to work well on data that it has not seen before. In other words, we can’t be sure that the model will have the desired accuracy and variance in a production environment. We need some kind of assurance of the accuracy of the predictions that our model is putting out. For this, we need to validate our model. This process of deciding whether the numerical results quantifying hypothesised relationships between variables are acceptable as descriptions of the data is known as validation.
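The most common validation scheme, k-fold cross-validation, can be sketched index-wise (a minimal version without shuffling): each fold serves once as the validation set while the remaining folds train the model.

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k folds; return (train, val) index
    pairs, one per fold held out for validation."""
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i in range(k):
        val = folds[i]
        train = [idx for j, f in enumerate(folds) if j != i for idx in f]
        splits.append((train, val))
    return splits
```

Averaging the validation score across the k splits gives a far more stable estimate of production performance than a single train/test split.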

From Eigendecomposition to Determinant: Fundamental Mathematics for Machine Learning with Intuitive Examples Part 3/3

For understanding the mathematics behind machine learning algorithms, especially deep learning algorithms, it is essential to build up the mathematical concepts from foundational to more advanced. Unfortunately, mathematical theories are too hard/abstract/dry to digest in many cases. Imagine you are eating a pizza: it is always easier and more fun to go with a coke. The purpose of this article is to provide intuitive examples for fundamental mathematical theories to make the learning experience more enjoyable and memorable: to serve chicken wings with beer, fries with ketchup, and rib-eye with wine.
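One intuitive fact linking the two topics in the title: the determinant equals the product of the eigenvalues. A quick 2x2 check in plain Python (the matrix entries are chosen arbitrarily):

```python
import math

# For a symmetric 2x2 matrix A = [[a, b], [b, d]], the eigenvalues solve
# lambda^2 - (a + d)*lambda + (a*d - b^2) = 0, so their product is det(A).
a, b, d = 4.0, 1.0, 3.0

trace = a + d
det = a * d - b * b                      # determinant computed directly
disc = math.sqrt(trace ** 2 - 4 * det)   # discriminant of the quadratic
lam1 = (trace + disc) / 2
lam2 = (trace - disc) / 2

product_of_eigenvalues = lam1 * lam2     # should equal det(A)
```

Geometrically, each eigenvalue scales space along its eigenvector, so their product is the total volume-scaling factor, which is exactly what the determinant measures.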

Automated Feature Engineering for Predictive Modeling

One of the main time investments I’ve seen data scientists make when building data products is manually performing feature engineering. While tools such as auto-sklearn and Auto-Keras have been able to automate much of the model fitting process when building a predictive model, determining which features to use as input to the fitting process is usually a manual process. I recently started using the FeatureTools library, which enables data scientists to also automate feature engineering.

Finding and managing research papers: a survey of tools and products

As researchers, especially in (overly) prolific fields like Deep Learning, we often find ourselves overwhelmed by the huge amount of papers to read and keep track of in our work. I think one big reason for this is insufficient use of existing tools and services that aim to make our life easier. Another reason is the lack of a really good product which meets all our needs under one interface, but that is a topic for another post. Lately I’ve been getting into a new subfield of ML and got extremely frustrated with the process of prioritizing, reading and managing the relevant papers… I ended up looking for tools to help me deal with this overload and want to share with you the products and services that I’ve found. The goal is to improve the workflow and quality of life of anyone who works with scientific papers.

A New Hyperbolic Tangent Based Activation Function for Neural Networks

In this article, I introduce a new hyperbolic tangent based activation function, the tangent linear unit (TaLU), for neural networks. The function was evaluated for performance using the CIFAR-10 and CIFAR-100 databases. The performance of the proposed activation function was on par with or better than other activation functions such as the standard rectified linear unit (ReLU), leaky rectified linear unit (Leaky ReLU), and exponential linear unit (ELU).
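For reference, the three baseline activations named in the comparison are standard and easy to state (the exact TaLU formula is defined in the article itself, so it is not reproduced here):

```python
import math

def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: a small slope alpha for negative inputs avoids dead units."""
    return x if x > 0 else alpha * x

def elu(x, alpha=1.0):
    """Exponential linear unit: smooth negative branch saturating at -alpha."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)
```

A tanh-based unit like TaLU differs from these mainly in how it shapes the negative branch; the bounded, smooth tanh saturation is what the article evaluates against the baselines above.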

Beginner tutorial: Build your own custom real-time object classifier

In this tutorial, we will learn how to build a custom real-time object classifier to detect any object of your choice! We will be using BeautifulSoup and Selenium to scrape training images from Shutterstock, Amazon’s Mechanical Turk (or BBox Label Tool) to label images with bounding boxes, and YOLOv3 to train our custom detection model.

Document worth reading: “Visions of a generalized probability theory”

In this Book we argue that the fruitful interaction of computer vision and belief calculus is capable of stimulating significant advances in both fields. From a methodological point of view, novel theoretical results concerning the geometric and algebraic properties of belief functions as mathematical objects are illustrated and discussed in Part II, with a focus on both a perspective ‘geometric approach’ to uncertainty and an algebraic solution to the issue of conflicting evidence. In Part III we show how these theoretical developments arise from important computer vision problems (such as articulated object tracking, data association and object pose estimation) to which, in turn, the evidential formalism is able to provide interesting new solutions. Finally, some initial steps towards a generalization of the notion of total probability to belief functions are taken, in the perspective of endowing the theory of evidence with a complete battery of estimation and inference tools to the benefit of all scientists and practitioners. Visions of a generalized probability theory