If you did not already know

automated CLAUse DETectEr (Claudette)
Machine Learning Powered Analysis of Consumer Contracts and Privacy Policies. CLAUDETTE – ‘automated CLAUse DETectEr’ – is an interdisciplinary research project hosted at the Law Department of the European University Institute, led by professors Giovanni Sartor and Hans-W. Micklitz, in cooperation with engineers from University of Bologna and University of Modena and Reggio Emilia. The research objective is to test to what extent it is possible to automate the reading and legal assessment of online consumer contracts and privacy policies, to evaluate their compliance with the EU's unfair contractual terms law and personal data protection law (GDPR), using machine learning and grammar-based approaches. The idea arose out of bewilderment. Having read dozens of terms of service and privacy policies of online platforms, we came to the conclusion that despite the substantive law in place, and despite enforcers' competence for abstract control, providers of online services still tend to use unfair and unlawful clauses in these documents. Hence the idea to automate parts of the enforcement process by delegating certain tasks to machines. On the one hand, we believe that relying on automation can increase the quality and effectiveness of the legal work of enforcers. On the other, we want to empower consumers themselves, by giving them tools to quickly assess whether what they agree to online is fair and/or lawful. …

Dfuntest
New ideas in distributed systems (algorithms or protocols) are commonly tested by simulation, because experimenting with a prototype deployed on a realistic platform is cumbersome. However, a prototype not only measures performance but also verifies assumptions about the underlying system. We developed dfuntest – a testing framework for distributed applications that defines abstractions and test structure, and automates experiments on distributed platforms. Dfuntest aims to be jUnit’s analogue for distributed applications; a framework that enables the programmer to write robust and flexible scenarios of experiments. Dfuntest requires minimal bindings that specify how to deploy and interact with the application. Dfuntest’s abstractions allow execution of a scenario on a single machine, a cluster, a cloud, or any other distributed infrastructure, e.g. on PlanetLab. A scenario is a procedure; thus, our framework can be used both for functional tests and for performance measurements. We show how to use dfuntest to deploy our DHT prototype on 60 PlanetLab nodes and verify whether the prototype maintains a correct topology. …

MapReduce for C (MR4C)
MR4C is an implementation framework that allows you to run native code within the Hadoop execution framework. Pairing the performance and flexibility of natively developed algorithms with the unfettered scalability and throughput inherent in Hadoop, MR4C enables large-scale deployment of advanced data processing applications. …

What's new on arXiv

Deep Item-based Collaborative Filtering for Top-N Recommendation

Item-based Collaborative Filtering (ICF) has been widely adopted in recommender systems in industry, owing to its strength in user interest modeling and ease in online personalization. By constructing a user’s profile with the items that the user has consumed, ICF recommends items that are similar to the user’s profile. With the prevalence of machine learning in recent years, significant progress has been made for ICF by learning item similarity (or representation) from data. Nevertheless, we argue that most existing works have only considered linear and shallow relationships between items, which are insufficient to capture the complicated decision-making process of users. In this work, we propose a more expressive ICF solution by accounting for the nonlinear and higher-order relationship among items. Going beyond modeling only the second-order interaction (e.g. similarity) between two items, we additionally consider the interaction among all interacted item pairs by using nonlinear neural networks. In this way, we can effectively model the higher-order relationship among items, capturing more complicated effects in user decision-making. For example, it can differentiate which historical itemsets in a user’s profile are more important in affecting the user to make a purchase decision on an item. We treat this solution as a deep variant of ICF, and thus term it DeepICF. To justify our proposal, we perform empirical studies on two public datasets from MovieLens and Pinterest. Extensive experiments verify the highly positive effect of higher-order item interaction modeling with nonlinear neural networks. Moreover, we demonstrate that by more fine-grained second-order interaction modeling with an attention network, the performance of our DeepICF method can be further improved.
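For readers new to the baseline that this abstract builds on, the classic (linear, second-order) form of ICF can be sketched as below. This is not DeepICF: it is a minimal illustration that assumes binary interactions and cosine similarity, and the function names and toy data are hypothetical.

```python
from math import sqrt


def item_cosine(interactions, a, b):
    """Cosine similarity between items a and b over binary user-item data."""
    users_a = {u for u, items in interactions.items() if a in items}
    users_b = {u for u, items in interactions.items() if b in items}
    if not users_a or not users_b:
        return 0.0
    return len(users_a & users_b) / sqrt(len(users_a) * len(users_b))


def recommend(interactions, user, top_n=2):
    """Score each unseen item by its summed similarity to the user's profile."""
    profile = interactions[user]
    all_items = {i for items in interactions.values() for i in items}
    candidates = all_items - profile
    scores = {
        c: sum(item_cosine(interactions, c, j) for j in profile)
        for c in candidates
    }
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_n]


# Hypothetical toy data: user -> set of consumed items.
interactions = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"B", "C"},
    "u4": {"D"},
}

print(recommend(interactions, "u1"))
```

Each candidate's score is a plain sum of pairwise similarities, which is the "linear and shallow" relationship the abstract argues is insufficient.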


Gaussian-Induced Convolution for Graphs

Learning representations on graphs plays a crucial role in numerous tasks of pattern recognition. Unlike grid-shaped images and videos, on which local convolution kernels can be lattices, graphs are fully coordinate-free on vertices and edges. In this work, we propose a Gaussian-induced convolution (GIC) framework to conduct local convolution filtering on irregular graphs. Specifically, an edge-induced Gaussian mixture model is designed to encode variations of a subgraph region by integrating edge information into weighted Gaussian models, each of which implicitly characterizes one component of subgraph variations. In order to coarsen a graph, we derive a vertex-induced Gaussian mixture model to cluster vertices dynamically according to the connection of edges, which is approximately equivalent to the weighted graph cut. We evaluate our multi-layer graph convolution network on several public datasets for graph classification. Extensive experiments demonstrate that our GIC is effective and can achieve state-of-the-art results.


Fast Matrix Factorization with Non-Uniform Weights on Missing Data

Matrix factorization (MF) has been widely used to discover the low-rank structure and to predict the missing entries of data matrix. In many real-world learning systems, the data matrix can be very high-dimensional but sparse. This poses an imbalanced learning problem, since the scale of missing entries is usually much larger than that of observed entries, but they cannot be ignored due to the valuable negative signal. For efficiency concern, existing work typically applies a uniform weight on missing entries to allow a fast learning algorithm. However, this simplification will decrease modeling fidelity, resulting in suboptimal performance for downstream applications. In this work, we weight the missing data non-uniformly, and more generically, we allow any weighting strategy on the missing data. To address the efficiency challenge, we propose a fast learning method, for which the time complexity is determined by the number of observed entries in the data matrix, rather than the matrix size. The key idea is two-fold: 1) we apply truncated SVD on the weight matrix to get a more compact representation of the weights, and 2) we learn MF parameters with element-wise alternating least squares (eALS) and memorize the key intermediate variables to avoid repeating computations that are unnecessary. We conduct extensive experiments on two recommendation benchmarks, demonstrating the correctness, efficiency, and effectiveness of our fast eALS method.


An Optimal Control View of Adversarial Machine Learning

I describe an optimal control view of adversarial machine learning, where the dynamical system is the machine learner, the inputs are adversarial actions, and the control costs are defined by the adversary’s goals to do harm and be hard to detect. This view encompasses many types of adversarial machine learning, including test-item attacks, training-data poisoning, and adversarial reward shaping. The view encourages adversarial machine learning researchers to utilize advances in control theory and reinforcement learning.


End-to-end Structure-Aware Convolutional Networks for Knowledge Base Completion

Knowledge graph embedding has been an active research topic for knowledge base completion, with progressive improvement from the initial TransE, TransH, DistMult, etc. to the current state-of-the-art ConvE. ConvE uses 2D convolution over embeddings and multiple layers of nonlinear features to model knowledge graphs. The model can be trained efficiently and scales to large knowledge graphs. However, there is no structure enforcement in the embedding space of ConvE. The recent graph convolutional network (GCN) provides another way of learning graph node embeddings by successfully utilizing graph connectivity structure. In this work, we propose a novel end-to-end Structure-Aware Convolutional Network (SACN) that combines the benefits of GCN and ConvE. SACN consists of an encoder, a weighted graph convolutional network (WGCN), and a decoder, a convolutional network called Conv-TransE. WGCN utilizes knowledge graph node structure, node attributes and relation types. It has learnable weights that collect an adaptive amount of information from neighboring graph nodes, resulting in more accurate embeddings of graph nodes. In addition, the node attributes are added as nodes and are easily integrated into the WGCN. The decoder Conv-TransE extends the state-of-the-art ConvE to be translational between entities and relations while keeping ConvE's state-of-the-art performance. We demonstrate the effectiveness of our proposed SACN model on standard FB15k-237 and WN18RR datasets, and present about 10% relative improvement over the state-of-the-art ConvE in terms of HITS@1, HITS@3 and HITS@10.


ReDecode Framework for Iterative Improvement in Paraphrase Generation

Generating paraphrases, that is, different variations of a sentence conveying the same meaning, is an important yet challenging task in NLP. Automatically generating paraphrases is useful in many NLP tasks, such as question answering, information retrieval, and conversational systems, to name a few. In this paper, we introduce iterative refinement of generated paraphrases within a VAE-based generation framework. Current sequence generation models lack the capability to (1) make improvements once the sentence is generated; (2) rectify errors made while decoding. We propose a technique to iteratively refine the output using multiple decoders, each one attending on the output sentence generated by the previous decoder. We improve current state-of-the-art results significantly – with over 9% and 28% absolute increase in METEOR scores on Quora question pairs and MSCOCO datasets respectively. We also show qualitatively through examples that our re-decoding approach generates better paraphrases compared to a single decoder by rectifying errors and making improvements in paraphrase structure, inducing variations and introducing new but semantically coherent information.


Computational Complexity Analysis of Genetic Programming

Genetic Programming (GP) is an evolutionary computation technique to solve problems in an automated, domain-independent way. Rather than identifying the optimum of a function as in more traditional evolutionary optimization, the aim of GP is to evolve computer programs with a given functionality. A population of programs is evolved using variation operators inspired by Darwinian evolution (crossover and mutation) and natural selection principles to guide the search process towards better programs. While many GP applications have produced human competitive results, the theoretical understanding of what problem characteristics and algorithm properties allow GP to be effective is comparatively limited. Compared to traditional evolutionary algorithms for function optimization, GP applications are further complicated by two additional factors: the variable length representation of candidate programs, and the difficulty of evaluating their quality efficiently. Such difficulties considerably impact the runtime analysis of GP where space complexity also comes into play. As a result initial complexity analyses of GP focused on restricted settings such as evolving trees with given structures or estimating the quality of solutions using only a small polynomial number of input/output examples. However, the first runtime analyses concerning GP applications for evolving proper functions with defined input/output behavior have recently appeared. In this chapter, we present an overview of the state-of-the-art.


RADS: Real-time Anomaly Detection System for Cloud Data Centres

Cybersecurity attacks in Cloud data centres are increasing alongside the growth of the Cloud services market. Existing research proposes a number of anomaly detection systems for detecting such attacks. However, these systems encounter a number of challenges, specifically due to the unknown behaviour of the attacks and the occurrence of genuine Cloud workload spikes, which must be distinguished from attacks. In this paper, we discuss these challenges and investigate the issues with the existing Cloud anomaly detection approaches. Then, we propose a Real-time Anomaly Detection System (RADS) for Cloud data centres, which uses a one class classification algorithm and a window-based time series analysis to address the challenges. Specifically, RADS can detect VM-level anomalies occurring due to DDoS and cryptomining attacks. We evaluate the performance of RADS by running lab-based experiments and by using real-world Cloud workload traces. Evaluation results demonstrate that RADS can achieve 90-95% accuracy with a low false positive rate of 0-3%. The results further reveal that RADS experiences fewer false positives when using its window-based time series analysis in comparison to using state-of-the-art average or entropy based analysis.


Anomaly Detection and Correction in Large Labeled Bipartite Graphs

Binary classification problems can be naturally modeled as bipartite graphs, where we attempt to classify right nodes based on their left adjacencies. We consider the case of labeled bipartite graphs in which some labels and edges are not trustworthy. Our goal is to reduce noise by identifying and fixing these labels and edges. We first propose a geometric technique for generating random graph instances with untrustworthy labels and analyze the resulting graph properties. We focus on generating graphs which reflect real-world data, where degree and label frequencies follow power law distributions. We review several algorithms for the problem of detection and correction, proposing novel extensions and making observations specific to the bipartite case. These algorithms range from math programming algorithms to discrete combinatorial algorithms to Bayesian approximation algorithms to machine learning algorithms. We compare the performance of all these algorithms using several metrics and, based on our observations, identify the relative strengths and weaknesses of each individual algorithm.


An Interpretable Generative Model for Handwritten Digit Image Synthesis

An interpretable generative model for handwritten digits synthesis is proposed in this work. Modern image generative models, such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), are trained by backpropagation (BP). The training process is complex and the underlying mechanism is difficult to explain. We propose an interpretable multi-stage PCA method to achieve the same goal and use handwritten digit images synthesis as an illustrative example. First, we derive principal-component-analysis-based (PCA-based) transform kernels at each stage based on the covariance of its inputs. This results in a sequence of transforms that convert input images of correlated pixels to spectral vectors of uncorrelated components. In other words, it is a whitening process. Then, we can synthesize an image based on random vectors and multi-stage transform kernels through a coloring process. The generative model is a feedforward (FF) design since no BP is used in model parameter determination. Its design complexity is significantly lower, and the whole design process is explainable. Finally, we design an FF generative model using the MNIST dataset, compare synthesis results with those obtained by state-of-the-art GAN and VAE methods, and show that the proposed generative model achieves comparable performance.
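The whitening and coloring steps named in this abstract can be illustrated with the textbook single-stage PCA transform. This is a sketch only, not the paper's multi-stage kernels: the symbols $\mu$ (input mean), $U$ (eigenvectors) and $\Lambda$ (eigenvalues, assumed strictly positive) are notation introduced here.

```latex
\begin{aligned}
C &= \mathbb{E}\big[(x-\mu)(x-\mu)^\top\big] = U \Lambda U^\top, \\
z &= \Lambda^{-1/2} U^\top (x - \mu)
   && \text{(whitening: } \operatorname{Cov}(z) = I \text{)}, \\
\hat{x} &= U \Lambda^{1/2} z + \mu
   && \text{(coloring: } \operatorname{Cov}(\hat{x}) = C \text{ when } \operatorname{Cov}(z) = I \text{)}.
\end{aligned}
```

Drawing a random $z$ with identity covariance and applying the coloring map yields a sample whose second-order statistics match the training data, which is the sense in which no backpropagation is needed to determine the parameters.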


A Model-Centric Analysis of Openness, Replication, and Reproducibility

The literature on the reproducibility crisis presents several putative causes for the proliferation of irreproducible results, including HARKing, p-hacking and publication bias. Without a theory of reproducibility, however, it is difficult to determine whether these putative causes can explain most irreproducible results. Drawing from an historically informed conception of science that is open and collaborative, we identify the components of an idealized experiment and analyze these components as a precursor to develop such a theory. Openness, we suggest, has long been intuitively proposed as a solution to irreproducibility. However, this intuition has not been validated in a theoretical framework. Our concern is that the under-theorizing of these concepts can lead to flawed inferences about the (in)validity of experimental results or integrity of individual scientists. We use probabilistic arguments and examine how openness of experimental components relates to reproducibility of results. We show that there are some impediments to obtaining reproducible results that precede many of the causes often cited in literature on the reproducibility crisis. For example, even if erroneous practices such as HARKing, p-hacking, and publication bias were absent at the individual and system level, reproducibility may still not be guaranteed.


Adversarial Learning-Based On-Line Anomaly Monitoring for Assured Autonomy

The paper proposes an on-line monitoring framework for continuous real-time safety/security in learning-based control systems (specifically, with application to an unmanned ground vehicle). We monitor the validity of mappings from sensor inputs to actuator commands, controller-focused anomaly detection (CFAM), and from actuator commands to sensor inputs, system-focused anomaly detection (SFAM). CFAM is an image-conditioned energy-based generative adversarial network (EBGAN) in which the energy-based discriminator distinguishes between proper and anomalous actuator commands. SFAM is based on an action-conditioned video prediction framework to detect anomalies between the predicted and observed temporal evolution of sensor data. We demonstrate the effectiveness of the approach on our autonomous ground vehicle for indoor environments and on the Udacity dataset for outdoor environments.


Explainable Reasoning over Knowledge Graphs for Recommendation

Incorporating knowledge graph into recommender systems has attracted increasing attention in recent years. By exploring the interlinks within a knowledge graph, the connectivity between users and items can be discovered as paths, which provide rich and complementary information to user-item interactions. Such connectivity not only reveals the semantics of entities and relations, but also helps to comprehend a user’s interest. However, existing efforts have not fully explored this connectivity to infer user preferences, especially in terms of modeling the sequential dependencies within and holistic semantics of a path. In this paper, we contribute a new model named Knowledge-aware Path Recurrent Network (KPRN) to exploit knowledge graph for recommendation. KPRN can generate path representations by composing the semantics of both entities and relations. By leveraging the sequential dependencies within a path, we allow effective reasoning on paths to infer the underlying rationale of a user-item interaction. Furthermore, we design a new weighted pooling operation to discriminate the strengths of different paths in connecting a user with an item, endowing our model with a certain level of explainability. We conduct extensive experiments on two datasets about movie and music, demonstrating significant improvements over state-of-the-art solutions Collaborative Knowledge Base Embedding and Neural Factorization Machine.


Recent Research Advances on Interactive Machine Learning

Interactive Machine Learning (IML) is an iterative learning process that tightly couples a human with a machine learner, which is widely used by researchers and practitioners to effectively solve a wide variety of real-world application problems. Although recent years have witnessed the proliferation of IML in the field of visual analytics, most recent surveys either focus on a specific area of IML or aim to summarize a visualization field that is too generic for IML. In this paper, we systematically review the recent literature on IML and classify them into a task-oriented taxonomy built by us. We conclude the survey with a discussion of open challenges and research opportunities that we believe are inspiring for future work in IML.


An Easy Implementation of CV-TMLE

In the world of targeted learning, cross-validated targeted maximum likelihood estimation, CV-TMLE [Zheng:2010aa], has a distinct advantage over TMLE [Laan:2006aa] in that one less condition is required of CV-TMLE in order to achieve asymptotic efficiency in the nonparametric or semiparametric settings. CV-TMLE, as originally formulated, consists of averaging usually 10 (for 10-fold cross-validation) parameter estimates, each of which is performed on a validation set separate from where the initial fit was trained. The targeting step is usually performed as a pooled regression over all validation folds, but in each fold we separately evaluate any means as well as the parameter estimate. One nice thing about CV-TMLE is that we average 10 plug-in estimates, so the plug-in quality of preserving the natural parameter bounds is respected. Our adjustment of this procedure also preserves the plug-in characteristic and avoids the Donsker condition. The advantage of our procedure is that the implementation of the targeting is identical to that of a regular TMLE, once all the validation set initial predictions have been formed. In short, we stack the validation set predictions and pretend as if we have a regular TMLE, which is not necessarily quite a plug-in estimator on each fold but overall will perform asymptotically the same and might have some slight advantage, a subject for future research. In the case of average treatment effect, treatment specific mean and mean outcome under a stochastic intervention, the procedure overlaps exactly with the originally formulated CV-TMLE with a pooled regression for the targeting.


Estimation of Dimensions Contributing to Detected Anomalies with Variational Autoencoders

Anomaly detection using dimensionality reduction has been an essential technique for monitoring multidimensional data. Although deep learning-based methods have been well studied for their remarkable detection performance, their interpretability is still a problem. In this paper, we propose a novel algorithm for estimating the dimensions contributing to the detected anomalies by using variational autoencoders (VAEs). Our algorithm is based on an approximative probabilistic model that considers the existence of anomalies in the data, and by maximizing the log-likelihood, we estimate which dimensions contribute to determining data as an anomaly. Experimental results on benchmark datasets show that our algorithm extracts the contributing dimensions more accurately than baseline methods.


Differentiating Concepts and Instances for Knowledge Graph Embedding

Concepts, which represent a group of different instances sharing common properties, are essential information in knowledge representation. Most conventional knowledge embedding methods encode both entities (concepts and instances) and relations as vectors in a low dimensional semantic space equally, ignoring the difference between concepts and instances. In this paper, we propose a novel knowledge graph embedding model named TransC by differentiating concepts and instances. Specifically, TransC encodes each concept in knowledge graph as a sphere and each instance as a vector in the same semantic space. We use the relative positions to model the relations between concepts and instances (i.e., instanceOf), and the relations between concepts and sub-concepts (i.e., subClassOf). We evaluate our model on both link prediction and triple classification tasks on the dataset based on YAGO. Experimental results show that TransC outperforms state-of-the-art methods, and captures the semantic transitivity for instanceOf and subClassOf relation. Our codes and datasets can be obtained from h…/ github.com/davidlvxin/TransC.
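The geometric idea in this abstract (concepts as spheres, instances as vectors, relations read off from relative positions) can be sketched as follows. This is one natural reading of the abstract, not the paper's actual scoring or loss functions; the function names, the 2-D embeddings and the containment rule are assumptions made for illustration.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def is_instance_of(instance, center, radius):
    """Assumed rule: an instance belongs to a concept if it lies inside its sphere."""
    return dist(instance, center) <= radius


def is_subclass_of(center_sub, radius_sub, center_sup, radius_sup):
    """Assumed rule: a sub-concept's sphere is fully contained in the super-concept's."""
    return dist(center_sub, center_sup) + radius_sub <= radius_sup


# Hypothetical embeddings, chosen by hand rather than learned.
animal = ((0.0, 0.0), 5.0)   # (center, radius)
dog = ((1.0, 0.0), 2.0)
rex = (1.5, 0.5)             # an instance vector

print(is_instance_of(rex, *dog))     # rex instanceOf dog
print(is_subclass_of(*dog, *animal)) # dog subClassOf animal
print(is_instance_of(rex, *animal))  # follows from sphere containment
```

Under this containment reading, transitivity comes for free: any point inside a sphere that is itself inside a larger sphere is also inside the larger one.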


A Review for Weighted MinHash Algorithms

Data similarity (or distance) computation is a fundamental research topic which underpins many high-level applications based on similarity measures in machine learning and data mining. However, in large-scale real-world scenarios, exact similarity computation has become daunting due to the ‘3V’ nature (volume, velocity and variety) of big data. In such cases, hashing techniques have been verified to efficiently conduct similarity estimation in terms of both theory and practice. Currently, MinHash is a popular technique for efficiently estimating the Jaccard similarity of binary sets and, furthermore, weighted MinHash is generalized to estimate the generalized Jaccard similarity of weighted sets. This review focuses on categorizing and discussing the existing works on weighted MinHash algorithms. In this review, we mainly categorize the weighted MinHash algorithms into quantization-based approaches, ‘active index’-based ones and others, and show the evolution and inherent connection of the weighted MinHash algorithms, from the integer weighted MinHash algorithms to real-valued weighted MinHash ones (particularly the Consistent Weighted Sampling scheme). Also, we have developed a Python toolbox for the algorithms and released it on GitHub. Based on the toolbox, we experimentally conduct a comprehensive comparative study of the standard MinHash algorithm and the weighted MinHash ones.
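The two quantities this abstract refers to can be sketched in a few lines. The example below is not the authors' toolbox: it is a minimal standard (unweighted) MinHash that assumes seeded SHA-1 as the hash family and non-empty input sets, plus the exact generalized Jaccard similarity that weighted MinHash schemes aim to estimate. All function names are hypothetical.

```python
import hashlib


def _hash(seed, element):
    """Seeded 64-bit hash; each seed acts as one hash function of the family."""
    data = f"{seed}:{element}".encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


def minhash_signature(items, num_hashes=256):
    """One minimum per hash function; items must be a non-empty set."""
    return [min(_hash(seed, x) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature slots estimates |A & B| / |A | B|."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


def exact_jaccard(a, b):
    return len(a & b) / len(a | b)


def generalized_jaccard(w_a, w_b):
    """Exact generalized Jaccard of weighted sets: sum of mins over sum of maxes."""
    keys = set(w_a) | set(w_b)
    num = sum(min(w_a.get(k, 0.0), w_b.get(k, 0.0)) for k in keys)
    den = sum(max(w_a.get(k, 0.0), w_b.get(k, 0.0)) for k in keys)
    return num / den if den else 0.0


a, b = set(range(0, 100)), set(range(50, 150))
print(exact_jaccard(a, b))
print(estimate_jaccard(minhash_signature(a), minhash_signature(b)))
print(generalized_jaccard({"x": 2, "y": 1}, {"x": 1, "z": 1}))
```

The estimate's standard error shrinks roughly as one over the square root of the number of hash functions, so `num_hashes` trades accuracy against signature size.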


Adversarial Learning of Label Dependency: A Novel Framework for Multi-class Classification

Recent work has shown that exploiting relations between labels improves the performance of multi-label classification. We propose a novel framework based on generative adversarial networks (GANs) to model label dependency. The discriminator learns to model label dependency by discriminating real and generated label sets. To fool the discriminator, the classifier, or generator, learns to generate label sets with dependencies close to real data. Extensive experiments and comparisons on two large-scale image classification benchmark datasets (MS-COCO and NUS-WIDE) show that the discriminator improves generalization ability for different kinds of models.


Gauges, Loops, and Polynomials for Partition Functions of Graphical Models

We suggest a new methodology for analysis and approximate computations of the Partition Functions (PF) of Graphical Models (GM) in the Normal Factor Graph representation that combines the gauge transformation (GT) technique from (Chertkov, Chernyak 2006) with the technique developed in (Straszak, Vishnoi 2017) based on the recent progress in the field of real stable polynomials. We show that GTs (while keeping PF invariant) allow representation of PF as a sum of polynomials of variables associated with edges of the graph. A special belief propagation (BP) gauge makes a single-out term of the series least sensitive to variations, resulting in the loop series for PF introduced in (Chertkov, Chernyak 2006). In addition to restating the known results in the polynomial form, we also discover a new relation between the computationally tractable BP term (the single-out term of the loop series evaluated at the BP gauge) and the PF: sequential application of differential operators, each associated with an edge of the graph, to the BP polynomial results in the PF. Each term in the sequence corresponds to a BP polynomial of a modified GM derived by contraction of an edge. Even though the complexity of computing factors in the derived GMs grows exponentially with the number of eliminated edges, polynomials associated with the new factors remain real stable if the original factors have this property. Moreover, we show that BP estimations for the PF do not decrease with eliminations, thus resulting overall in the proof that the BP solution of the original GM gives a lower bound for PF. The proof extends results of (Straszak, Vishnoi 2017) from bipartite to general graphs; however, it is limited to the case when the BP solution is feasible.


NExUS: Bayesian simultaneous network estimation across unequal sample sizes
Relation of Web Service Orchestration, Abstract Process, Web Service and Choreography
Towards time-varying proximal dynamics in Multi-Agent Network Games
HSD-CNN: Hierarchically self decomposing CNN architecture using class specific filter sensitivity analysis
An Initial Attempt of Combining Visual Selective Attention with Deep Reinforcement Learning
Learning Groupwise Scoring Functions Using Deep Neural Networks
Optimal Spectral Initialization for Signal Recovery with Applications to Phase Retrieval
About the ordinances of the vectors of the $n$-dimensional Boolean cube in accordance with their weights
When Locally Linear Embedding Hits Boundary
Faster sublinear approximations of $k$-cliques for low arboricity graphs
Dynamics of the Kuramoto-Sakaguchi Oscillator Network with Asymmetric Order Parameter
Generating subgraphs in chordal graphs
Blockchain for Economically Sustainable Wireless Mesh Networks
Recognizing generating subgraphs revisited
A Progressively-trained Scale-invariant and Boundary-aware Deep Neural Network for the Automatic 3D Segmentation of Lung Lesions
Statistical modelling of conidial discharge of entomophthoralean fungi using a newly discovered Pandora species
Approximation Algorithms for Graph Burning
Multi-Source Neural Variational Inference
Deep Learning Framework for Pedestrian Collision Avoidance System (PeCAS)
Learning with tree-based tensor formats
A 3-D Projection Model for X-ray Dark-field Imaging
Time-interval balancing in multi-processor scheduling of composite modular jobs (preliminary description)
Three-dimensional double helical DNA structure directly revealed from its X-ray fiber diffraction pattern by iterative phase retrieval
Analysis vs Synthesis – An Investigation of (Co)sparse Signal Models on Graphs
Bridging Network Embedding and Graph Summarization
Machine Learning with Abstention for Automated Liver Disease Diagnosis
On constrained optimization problems solved using CDT
Simultaneous Ruin Probability for Two-Dimensional Brownian and Lévy Risk Models
Thompson Sampling for Pursuit-Evasion Problems
Capital Structure and Speed of Adjustment in U.S. Firms. A Comparative Study in Microeconomic and Macroeconomic Conditions – A Quantille Regression Approach
Managing App Install Ad Campaigns in RTB: A Q-Learning Approach
Unifying Gaussian LWF and AMP Chain Graphs to Model Interference
Semi-supervised Deep Representation Learning for Multi-View Problems
Multiple Subspace Alignment Improves Domain Adaptation
Massive MIMO-based Localization and Mapping Exploiting Phase Information of Multipath Components
Product Title Refinement via Multi-Modal Generative Adversarial Learning
Subsampling to Enhance Efficiency in Input Uncertainty Quantification
On a Pólya functional for rhombi, isosceles triangles, and thinning convex sets
SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient
External optimal control of nonlocal PDEs
Agent Embeddings: A Latent Representation for Pole-Balancing Networks
Constant payoff in zero-sum stochastic games
Deep Learning Based Transmitter Identification using Power Amplifier Nonlinearity
The Poisson random effect model for experience ratemaking: limitations and alternative solutions
Adaptive Hessian Estimation Based Extremum Localization
Robustness of link prediction under network attacks
Sequence-Level Knowledge Distillation for Model Compression of Attention-based Sequence-to-Sequence Speech Recognition
M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network
Road Damage Detection And Classification In Smartphone Captured Images Using Mask R-CNN
Identification of Internal Faults in Indirect Symmetrical Phase Shift Transformers Using Ensemble Learning
On the length of the longest consecutive switches
Variational Community Partition with Novel Network Structure Centrality Prior
Visual Saliency Maps Can Apply to Facial Expression Recognition
An efficient ADMM algorithm for high dimensional precision matrix estimation via penalized quadratic loss
Learning Latent Dynamics for Planning from Pixels
Tractability of Konig Edge Deletion Problems
Statistical Inference for Stable Distribution Using EM algorithm
Time-changed Poisson processes of order $k$
Vectorization of hypotheses and speech for faster beam search in encoder decoder-based speech recognition
On the Performance and Convergence of Distributed Stream Processing via Approximate Fault Tolerance
A differential game on Wasserstein space. Application to weak approachability with partial monitoring
Forecasting People’s Needs in Hurricane Events from Social Network
A central limit theorem for descents and major indices in fixed conjugacy classes of $S_n$
Navigating Assistance System for Quadcopter with Deep Reinforcement Learning
Holistic Multi-modal Memory Network for Movie Question Answering
MR-RePair: Grammar Compression based on Maximal Repeats
The Hidden Shape of Stories Reveals Positivity Bias and Gender Bias
New Theoretical Bounds and Constructions of Permutation Codes under Block Permutation Metric
Learning The Invisible: A Hybrid Deep Learning-Shearlet Framework for Limited Angle Computed Tomography
Learning Personalized End-to-End Goal-Oriented Dialog
Streaming Hardness of Unique Games
Matrix Product Operator Restricted Boltzmann Machines
140 Gbaud On-Off Keying Links in C-Band for Short-Reach Optical Interconnects
Subspace Packings
Different Power Adaption Methods on Fluctuating Two-Ray Fading Channels
Forming Probably Stable Communities with Limited Interactions
Depth Image Upsampling based on Guided Filter with Low Gradient Minimization
Fine-tuning of Language Models with Discriminator
Importance Weighted Evolution Strategies
Embedding partial Latin squares in Latin squares with many mutually orthogonal mates
Newton: A Language for Describing Physics
Parameterized Synthetic Image Data Set for Fisheye Lens
Another Note on Intervals in the Hales-Jewett Theorem
Reciprocal and Positive Real Balanced Truncations for Model Order Reduction of Descriptor Systems
Angry or Climbing Stairs? Towards Physiological Emotion Recognition in the Wild
Extending Pretrained Segmentation Networks with Additional Anatomical Structures
Massive MIMO with a Generalized Channel Model: Fundamental Aspects
Blind Over-the-Air Computation and Data Fusion via Provable Wirtinger Flow
Hallucinating very low-resolution and obscured face images
Global sensitivity analysis for optimization with variable selection
Combining Learned Lyrical Structures and Vocabulary for Improved Lyric Generation
Modeling Text Complexity using a Multi-Scale Probit
Not Just Depressed: Bipolar Disorder Prediction on Reddit
Surface area deviation between smooth convex bodies and polytopes
Proprties of biclustering algorithms and a novel biclustering technique based on relative density
Detection of REM Sleep Behaviour Disorder by Automated Polysomnography Analysis
Design of Low Complexity GFDM Transceiver
Path integral Monte Carlo method for the quantum anharmonic oscillator
A Deep Ensemble Framework for Fake News Detection and Classification
Towards Adversarial Denoising of Radar Micro-Doppler Signatures
Learning Segmentation Masks with the Independence Prior
Bias Scheme Reducing Transient Currents and Speeding up Read Operations for 3-D Cross Point PCM
Joint Probability Distribution of Prediction Errors of ARIMA
A Generalization of the Matroid Polytope Theorem to Local Forest Greedoids
Pseudofiniteness in Hrushovski Constructions
Variational and Optimal Control Approaches for the Second-Order Herglotz Problem on Spheres
Classifying Patent Applications with Ensemble Methods
CUNI System for the WMT18 Multimodal Translation Task
The random walk penalised by its range in dimensions $d\geq 3$
Weyl-Mahonian Statistics for Weighted Flags of Type A-D
Analyzing deep CNN-based utterance embeddings for acoustic model adaptation
Inductively pierced codes and neural toric ideals
Input Combination Strategies for Multi-Source Transformer Decoder
End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification
Optimization of triangular networks with spatial constraints
On an Annihilation Number Conjecture
Universal Marginalizer for Amortised Inference and Embedding of Generative Models
Characterizing $(0,\pm 1)$-matrices with only even-rank principal submatrices in terms of skew-symmetry
Mutual Information of Wireless Channels and Block-Jacobi Ergodic Operators
Simple FPGA routing graph compression
Gaussian Auto-Encoder
Learning Representations of Missing Data for Predicting Patient Outcomes
Sliding Window Temporal Graph Coloring
Deep-learning the Latent Space of Light Transport
Markov Property in Generative Classifiers
The Equilibrium States of Large Networks of Erlang Queues

Magister Dixit

“Understanding correlation, multivariate regression and all aspects of massaging data together to look at it from different angles for use in predictive and prescriptive modeling is the backbone knowledge that’s really step one of revealing intelligence…. If you don’t have this, all the data collection and presentation polishing in the world is meaningless.” Mitchell A. Sanders ( August 27, 2013 )

Book Memo: “From Human Attention to Computational Attention”

A Multidisciplinary Approach
This accessible yet exhaustive book will help to improve modeling of attention and to inspire innovations in industry. It introduces the study of attention and focuses on attention modeling, addressing such themes as saliency models, signal detection and different types of signals, as well as real-life applications. The book is truly multi-disciplinary, collating work from psychology, neuroscience, engineering and computer science, amongst other disciplines. What is attention? We all pay attention every single moment of our lives. Attention is how the brain selects and prioritizes information. The study of attention has become incredibly complex and divided: this timely volume assists the reader by drawing together work on the computational aspects of attention from across the disciplines. Those working in the field as engineers will benefit from this book’s introduction to the psychological and biological approaches to attention, and neuroscientists can learn about engineering work on attention. The work features practical reviews and chapters that are quick and easy to read, as well as chapters which present deeper, more complex knowledge. Everyone whose work relates to human perception, to image, audio and video processing will find something of value in this book, from students to researchers and those in industry.

Document worth reading: “Deep Reinforcement Learning: An Overview”

In recent years, a specific machine learning method called deep learning has gained huge traction, as it has obtained astonishing results in broad applications such as pattern recognition, speech recognition, computer vision, and natural language processing. Recent research has also shown that deep learning techniques can be combined with reinforcement learning methods to learn useful representations for problems with high-dimensional raw data input. This chapter reviews the recent advances in deep reinforcement learning with a focus on the most used deep architectures such as autoencoders, convolutional neural networks and recurrent neural networks, which have been successfully combined with the reinforcement learning framework. Deep Reinforcement Learning: An Overview

R Packages worth a look

Visualization of Subgroups for Decision Trees (visTree)
Provides a visualization for characterizing subgroups defined by a decision tree structure. The visualization simplifies the ability to interpret indiv …

A ‘ggplot2’-Plot of Composition of Solvency II SCR: SF and IM (ggsolvencyii)
An implementation of ‘ggplot2’-methods to present the composition of Solvency II Solvency Capital Requirement (SCR) as a series of concentric circle-pa …

Inference and Learning in Stochastic Automata (SAutomata)
Machine learning provides algorithms that can learn from data and make inferences or predictions. Stochastic automata is a class of input/output device …

Distilled News

29 Statistical Concepts Explained in Simple English – Part 3

This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, decision trees, ensembles, correlation, Python, R, Tensorflow, SVM, data reduction, feature selection, experimental design, cross-validation, model fitting, and many more


Windows Clipboard Access with R

The windows clipboard is a quick way to get data in and out of R. How can we exploit this feature to accomplish our basic data exploration needs and when might its use be inappropriate? Read on.


Explaining Black-Box Machine Learning Models – Code Part 2: Text classification with LIME

This is code that will accompany an article that will appear in a special edition of a German IT magazine. The article is about explaining black-box machine learning models.


Building a Repository of Alpine-based Docker Images for R, Part II

In the first article of this series, I built an Alpine-based Docker image with R base packages from Alpine’s native repositories, as well as one image with R compiled from source code. The images are hosted on Docker Hub, velaco/alpine-r repository. The next step was either to address the fatal errors I found while testing the installation of R or to proceed building an image with Shiny Server. The logical choice would have been to pass all tests with R’s base packages before proceeding, but I was a bit impatient and wanted to go through the process of building a Shiny Server as soon as possible. After two weeks of trial and error, I finally have a container that can start the server and run Shiny apps.


Easy time-series prediction with R: a tutorial with air traffic data from Lux Airport

In this blog post, I will show you how you can quickly and easily forecast a univariate time series. I am going to use data from the EU Open Data Portal on air passenger transport. You can find the data here. I downloaded the data in the TSV format for Luxembourg Airport, but you could repeat the analysis for any airport.


AI for Good: slides and notebooks from the ODSC workshop

Last week at the ODSC West conference, I was thrilled with the interest in my Using AI for Good workshop: it was wonderful to find a room full of data scientists eager to learn how data science and artificial intelligence can be used to help people and the planet. The workshop was focused around projects from the Microsoft AI for Good program. I’ve included some details about the projects below, and you can also check out the workshop slides and the accompanying Jupyter Notebooks that demonstrate the underlying AI methods used in the projects.


Installing RStudio & Shiny Servers

I did a remote install of Ubuntu Server today. This was somewhat novel because it’s the first time that I have not had physical access to the machine I was installing on. The server install went very smoothly indeed.


Interchanging RMarkdown and ‘spinnable’ R



Behaviour Analysis using Graphext

Why do people act the way they do? Why do they buy products, quit their jobs, or change partners? Many of these motives can be inferred from people’s behaviour, and these behaviours are reflected in data. Companies have lots of data about their clients, employees, suppliers… Let’s put that data to work to do some smart data discovery and see what we could learn.


Job Title Analysis in python and NLTK

A job title indicates a lot about someone’s role and responsibilities. It says if they manage a team, if they control a budget, and their level of specialization. Knowing this is useful when automating business development or client outreach. For example, a company that sells voice recognition software may want to send messages to:
• CTOs and technical directors informing them of the price and benefits of using the voice recognition software.
• Potential investors or advisors, inviting them to see the company’s potential market size.
• Founders and engineers instructing them how to use the software.
Training software to classify job titles is a multi-class text classification problem. For this task, we can use the Python Natural Language Toolkit (NLTK) and Bayesian classification.
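The Bayesian classification idea can be sketched in a few lines. The example below is a from-scratch naive Bayes in plain Python rather than NLTK's own classifier, and the training titles, label names, and helper functions are all hypothetical; it only illustrates how word counts per class turn into a predicted audience for a job title.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data: (job title, audience label).
TRAINING = [
    ("chief technology officer", "technical_lead"),
    ("cto", "technical_lead"),
    ("technical director", "technical_lead"),
    ("angel investor", "investor"),
    ("venture partner", "investor"),
    ("board advisor", "investor"),
    ("software engineer", "builder"),
    ("founder", "builder"),
    ("senior engineer", "builder"),
]


def tokenize(title):
    """Lower-case a title and split it into words."""
    return title.lower().split()


def train(examples):
    """Count labels and per-label word occurrences."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for title, label in examples:
        label_counts[label] += 1
        for word in tokenize(title):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(title, model):
    """Return the label with the highest log posterior (add-one smoothing)."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(title):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING)
print(classify("Technical Director", model))
print(classify("Angel Investor", model))
```

With real data the feature extraction would likely need more than whitespace tokens (abbreviations, seniority prefixes), and a library classifier would replace the hand-written scoring loop.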


Doing Machine Learning the Uber Way: Five Lessons From the First Three Years of Michelangelo

Uber has been one of the most active contributors to open source machine learning technologies in the last few years. While companies like Google or Facebook have focused their contributions on new deep learning stacks like TensorFlow, Caffe2 or PyTorch, the Uber engineering team has really focused on tools and best practices for building machine learning at scale in the real world. Technologies such as Michelangelo, Horovod, PyML and Pyro are some examples of Uber’s contributions to the machine learning ecosystem. With only a small group of companies developing large scale machine learning solutions, the lessons and guidance from Uber become even more valuable for machine learning practitioners (I certainly learned a lot and have regularly written about Uber’s efforts).


https://www.kdnuggets.com/2018/11/best-python-ide-data-science.html

Before you start learning Python, choose the IDE that suits you the best. As Python is one of the leading programming languages, there is a multitude of IDEs available. So the question is, ‘Which is the best Python IDE for Data Science?’


Introduction to Image Recognition: Building a Simple Digit Detector

Digit recognition is not that difficult or advanced. It is a kind of ‘Hello world!’ program – not that cool, but this is exactly where you start. So I decided to share my work and at the same time refresh my knowledge – it has been a long time since I played with images.


The 2×2 Data Science Skills Matrix that Harvard Business Review got completely wrong!

Data Science is the current buzzword in the market. Every company at the moment is looking to hire Data Science Professionals to solve some Data problem that they themselves are not aware of currently. Machine Learning has taken over the industry by storm and we have a bunch of self taught Data Scientists in the market. Since this Data Science word is an altogether different universe, it is very difficult to set up priorities on what to learn and what not to. So in this case the Harvard Business Review published an article on what you as a company or individual should give importance to. Let’s have a look.


Decision Tree in Machine Learning

A decision tree is a flowchart-like structure in which each internal node represents a test on a feature (e.g. whether a coin flip comes up heads or tails), each leaf node represents a class label (the decision taken after computing all features) and branches represent conjunctions of features that lead to those class labels. The paths from root to leaf represent classification rules. The diagram below illustrates the basic flow of a decision tree for decision making with labels (Rain(Yes), No Rain(No)).
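The structure described above can be sketched as a nested dictionary. The tree below is hand-built and hypothetical (the feature names `cloudy` and `humidity_high` are assumptions, not taken from the article); it only shows how internal nodes test a feature, leaves carry a label, and the root-to-leaf path acts as a classification rule.

```python
# Hypothetical hand-built tree: internal nodes test a feature, leaves hold a label.
TREE = {
    "feature": "cloudy",
    "branches": {
        False: "No Rain",
        True: {
            "feature": "humidity_high",
            "branches": {True: "Rain", False: "No Rain"},
        },
    },
}


def classify(node, sample, path=()):
    """Walk from the root to a leaf; return the label and the rule that led to it."""
    if not isinstance(node, dict):  # reached a leaf
        return node, list(path)
    value = sample[node["feature"]]
    next_node = node["branches"][value]
    return classify(next_node, sample, path + ((node["feature"], value),))


label, rule = classify(TREE, {"cloudy": True, "humidity_high": True})
print(label, rule)
```

In practice the tree is learned from data rather than written by hand, but prediction still follows this same walk.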


Using Bash for Data Pipelines

Using bash scripts to create data pipelines is incredibly useful as a data scientist. The possibilities with these scripts are almost endless, but here, I will be going through a tutorial on a very basic bash script to download data and count the number of rows and cols in a dataset. Once you get the hang of using bash scripts, you can have the basics for creating IoT devices, and much much more as this all works with a Raspberry Pi. One cool project that you could use this for is to download all of your twitter messages using the twitter api and then predict whether or not a message from a user on Twitter is spam or not. It could run on a Raspberry Pi server from your room! That is a little out of the scope of this tutorial though, so we will begin by looking at a dataset for cars speed in San Francisco!


From Scratch: Bayesian Inference, Markov Chain Monte Carlo and Metropolis Hastings, in python

I’ve been an avid reader on medium/towards data science for a while now, and I’ve enjoyed the diversity and openness of the subjects tackled by many authors. I wish to contribute to this awesome community by creating my own series of articles ‘From Scratch’, where I explain and implement/build anything from scratch (not necessarily in data science, you need only propose!) Why do I want to do that? In the current state of things, we are in possession of such powerful libraries and tools that can do a lot of the work for us. Most experienced authors are well aware of the complexities of implementing such tools. As such, they make use of them to provide short, accessible and to the point reads to users from diverse backgrounds. In many of the articles that I enjoyed, I failed to understand how this or that algorithm is implemented in practise. What are their limitations? Why were they invented? When should they be used?


Reverse Engineering Backpropagation

Sometimes starting with examples might be a faster way to learn something rather than going theory first before getting into detailed examples. That’s what I will attempt to do here using an example from the official PyTorch tutorial that implements backpropogation and reverse engineer the math and subsequently the concept behind it.


ML Intro 5: One hot Encoding, Cyclic Representations, Normalization

This post follows Machine Learning Introduction 4. In the previous post, we described Machine Learning for marketing attribution. In this post, we will illuminate some of the details we ignored in that section. We will inspect a dataset about Marketing Attribution, perform one-hot encoding of our brands, manipulate our one-hot encoding to learn custom business insights, normalize our features, inspect our model inputs once this is all done, and interpret our outputs in detail.
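The three transformations named in the title can be sketched in plain Python. The function names and sample values below are illustrative assumptions, not code from the post: one-hot encoding turns a category into a 0/1 vector, a cyclic representation maps a periodic value such as hour of day onto a circle, and min-max normalization rescales a feature to the range 0 to 1.

```python
import math


def one_hot(value, categories):
    """Return a 0/1 vector with a single 1 at the position of `value`."""
    return [1.0 if value == category else 0.0 for category in categories]


def cyclic(value, period):
    """Map a periodic value onto the unit circle so the ends of the cycle meet."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def min_max(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


print(one_hot("b", ["a", "b", "c"]))
print(cyclic(23, 24), cyclic(0, 24))
print(min_max([10, 20, 30]))
```

The cyclic form matters because hour 23 and hour 0 are adjacent in time but far apart as raw integers; on the circle they end up close together.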


Machine Learning Bit by Bit – Multivariate Gradient Descent

In this post, we’re going to extend our understanding of gradient descent and apply it to a multivariate function.
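As a minimal sketch of the multivariate case, assuming a simple two-variable quadratic chosen only for illustration (it is not the function used in the post): every parameter is updated at once, each by its own partial derivative.

```python
def gradient_descent(grad, theta0, learning_rate=0.1, steps=200):
    """Repeatedly move every parameter against its partial derivative."""
    theta = list(theta0)
    for _ in range(steps):
        g = grad(theta)
        theta = [t - learning_rate * gi for t, gi in zip(theta, g)]
    return theta


def grad_f(theta):
    """Gradient of the illustrative function f(x, y) = (x - 1)^2 + 2 * (y + 2)^2."""
    x, y = theta
    return [2 * (x - 1), 4 * (y + 2)]


theta_min = gradient_descent(grad_f, [0.0, 0.0])
print(theta_min)
```

The learning rate and step count are arbitrary choices that happen to converge for this function; a step that is too large would make the updates diverge.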


Machine Learning Bit by Bit – Univariate Gradient Descent

This series aims to share my own endeavour to understand, explore and experiment on topics in machine learning. Mathematical notations in this particular post and the next one on multivariate gradient descent will be mostly in line with those used in the Machine Learning course by Andrew Ng. Understanding and being able to play around with the maths behind is the key in understanding machine learning. It allows us to choose the most suitable algorithms and tailor them according to the problems we want to solve. However, I have encountered many tutorials and lectures where equations used are simply impenetrable. All the symbols look cryptic and there seems to be a huge gap between what is being explained and those equations. I just can’t connect all the dots. Unfortunately, more often than not, maths hinders understanding when some knowledge is assumed and important steps are skipped. Therefore, wherever possible, I will expand the equations and avoid shortcuts, so that everyone can follow along how we reach from the left side of the equation to the right.
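The univariate update rule can be written out in a few lines. The function f(x) = (x - 3)^2 and the hyperparameters below are assumptions picked for illustration, not taken from the post: each step moves x against the derivative, scaled by the learning rate.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Apply x <- x - learning_rate * f'(x) for a fixed number of steps."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


# Illustrative function f(x) = (x - 3)^2, whose derivative is 2 * (x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)
```

For this function each step shrinks the distance to the minimum by a constant factor of 0.8, which is why a fixed number of steps is enough here.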

Distilled News

Imagining an Engineer: On GAN-Based Data Augmentation Perpetuating Biases

The use of synthetic data generated by Generative Adversarial Networks (GANs) has become quite a popular method to do data augmentation for many applications. While practitioners celebrate this as an economical way to get more synthetic data that can be used to train downstream classifiers, it is not clear that they recognize the inherent pitfalls of this technique. In this paper, we aim to exhort practitioners against deriving any false sense of security against data biases based on data augmentation. To drive this point home, we show that starting with a dataset consisting of head-shots of engineering researchers, GAN-based augmentation ‘imagines’ synthetic engineers, most of whom have masculine features and white skin color (inferred from a human subject study conducted on Amazon Mechanical Turk). This demonstrates how biases inherent in the training data are reinforced, and sometimes even amplified, by GAN-based data augmentation; it should serve as a cautionary tale for the lay practitioners.


Relation extraction with weakly supervised learning based on process-structure-property-performance reciprocity

In this study, we develop a computer-aided material design system to represent and extract knowledge related to material design from natural language texts. A machine learning model is trained on a text corpus weakly labeled by minimal annotated relationship data (~100 labeled relationships) to extract knowledge from scientific articles. The knowledge is represented by relationships between scientific concepts, such as {annealing, grain size, strength}. The extracted relationships are represented as a knowledge graph formatted according to design charts, inspired by the process-structure-property-performance (PSPP) reciprocity. The design chart provides an intuitive view of the effect of processes on properties and of prospective processes to achieve certain desired properties. Our system semantically searches the scientific literature and provides knowledge in the form of a design chart, and we hope it contributes to more efficient development of new materials.


Introducing vizscorer: a bot advisor to score and improve your ggplot plots

One of the most frustrating issues I face in my professional life is the plentitude of ineffective reports generated within my company. Wherever I look there are plenty of junk charts, like barplots showing useless 3D effects or ambiguous and crowded pie charts.


Extended Isolation Forest

This is a simple package implementation for the Extended Isolation Forest method. It is an improvement on the original algorithm Isolation Forest which is described (among other places) in this paper for detecting anomalies and outliers from a data point distribution. The original algorithm suffers from an inconsistency in producing anomaly scores due to slicing operations. Even though the slicing hyperplanes are selected at random, they are always parallel to the coordinate reference frame. The shortcoming can be seen in score maps as presented in the example notebooks in this repository. In order to improve the situation, we propose an extension which allows the hyperplanes to be taken at random angles. The way in which this is done gives rise to multiple levels of extension depending on the dimensionality of the problem. For an N dimensional dataset, Extended Isolation Forest has N levels of extension, with 0 being identical to the case of standard Isolation Forest, and N-1 being the fully extended version. Here we provide the source code for the algorithm as well as documented example notebooks to help get started. Various visualizations are provided such as score distributions, score maps, aggregate slicing of the domain, and tree and whole forest visualizations. Most examples are in 2D. We present one 3D example. However, the algorithm works readily with higher dimensional data.
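The difference between axis-parallel and randomly angled slicing can be sketched as a single split. This is an illustrative reading of the description above, not the package's code: the function name, the way the intercept is drawn, and the rule of zeroing N - level - 1 coordinates of the normal vector are assumptions chosen so that level 0 gives an axis-parallel cut and level N-1 a fully random one.

```python
import random


def random_split(points, extension_level, rng):
    """Split points with one random hyperplane at the given extension level."""
    dim = len(points[0])
    # Random normal vector; zeroing coordinates restricts the possible angles.
    normal = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for i in rng.sample(range(dim), dim - extension_level - 1):
        normal[i] = 0.0
    # Random intercept point inside the bounding box of the data.
    intercept = [
        rng.uniform(min(p[i] for p in points), max(p[i] for p in points))
        for i in range(dim)
    ]
    left, right = [], []
    for p in points:
        side = sum((p[i] - intercept[i]) * normal[i] for i in range(dim))
        (left if side <= 0 else right).append(p)
    return normal, left, right


points = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
axis_normal, _, _ = random_split(points, 0, random.Random(0))  # axis-parallel cut
free_normal, _, _ = random_split(points, 1, random.Random(0))  # arbitrary angle in 2D
print(axis_normal, free_normal)
```

A full forest would apply such splits recursively on random subsamples and score points by their average path length; that part is omitted here.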


What is Hidden in the Hidden Markov Model?

Hidden Markov Models or HMMs are the most common models used for dealing with temporal Data. They also frequently come up in different ways in a Data Science Interview usually without the word HMM written over it. In such a scenario it is necessary to discern the problem as an HMM problem by knowing characteristics of HMMs.


How to Ensure Safety and Security During Condition Monitoring and Predictive Maintenance : a Case Study

Effective communication from machines and embedded sensors and actuators in industry is crucial to achieve industrial digitalization. Efficient remote monitoring and maintenance methodologies help to transform existing industries into Smart Factories. Monitoring and maintenance lead to the aggregation of real-time data from sensors via different existing and new industrial communication protocols. Development of a user-friendly interface allows remote Condition Monitoring (CM). Context-aware analysis of real-time and historical data provides the capability to accomplish active Predictive Maintenance (PdM). Both CM and PdM need access to the machine process data, industrial network and communication layer. Furthermore, data flow between individual Cyber-Physical System (CPS) components, from the actual machine to the database or analysis engine to the visualization, is important. Security and safety aspects at the application, communication, network and data flow level should be considered. This thesis presents a case study on the benefits of PdM and CM, the security and safety aspects of the system and the current challenges and improvements. Components of the CPS ecosystem are taken into consideration to further investigate the individual components which enable predictive maintenance and condition monitoring. Additionally, safety and security aspects of each component are analyzed. Moreover, the current challenges and the possible improvements of the PdM and CM systems are analyzed. Also, challenges and improvements regarding the components are taken into consideration. Finally, based on the research, possible improvements have been proposed and validated by the researcher. For the new digital era of secure and robust PdM 4.0, the improvements are vital references.


The ultimate guide to starting AI

A step-by-step overview of how to begin your project, including advice on how to craft a wise performance metric, setting up testing criteria to overcome human bias, and more.


Self-Service Analytics and Operationalization – Why You Need Both

Get the guidebook / whitepaper for a look at how today’s top data-driven companies scale their advanced analytics & machine learning efforts.


Managing risk in machine learning

In this post, I share slides and notes from a keynote I gave at the Strata Data Conference in New York last September. As the data community begins to deploy more machine learning (ML) models, I wanted to review some important considerations.


Preview my new book: Introduction to Reproducible Science in R

I’m pleased to share Part I of my new book ‘Introduction to Reproducible Science in R’. The purpose of this book is to approach model development and software development holistically to help make science and research more reproducible. The need for such a book arose from observing some of the challenges that I’ve seen teaching graduate courses in natural language processing and machine learning, as well as training my own staff to become effective data scientists. While quantitative reasoning and mathematics are important, often I found that the primary obstacle to good data science was reproducibility and repeatability: it’s difficult to quickly reproduce someone else’s results.


A Data Lake’s Worth of Audio Datasets

At Wonder Technologies, we have spent a lot of time building Deep learning systems that understand the world through audio. From deep learning based voice extraction to teaching computers how to read our emotions, we needed to use a wide set of data to deliver APIs that worked even in the craziest sound environments. Here is a list of datasets that I found pretty useful for our research and that I’ve personally used to make my audio related models perform much better in real-world environments.


Inferential Statistics basics

Statistics is one of the most important skills required by a data scientist. There is a lot of mathematics involved in statistics and it can be difficult to grasp. So in this tutorial we are going to go through some of the concepts of statistics to learn, understand and master inferential statistics.


Hybrid Fuzzy Name Matching

My workplace works with large-scale databases that, amongst many things, contain data about people. For each person in the DB we have a unique identifier, which is composed of the person’s first name, last name and zip code. We hold ~500MM people in our DB, which can essentially have duplicates if there is a little change in the person’s name. For example, Rob Rosen and Robert Rosen (with the same zip code) will be treated as two different people. I want to note that if we get the same person an additional time, we just update the record’s timestamp, so there is no need for this sort of deduping. In addition, I would like to give credit to my co-worker Jonathan Harel who assisted me in the research for this project.
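The Rob Rosen / Robert Rosen case can be illustrated with a simple similarity score. This sketch uses `difflib.SequenceMatcher` from the Python standard library as a stand-in; it is not the hybrid method of the article, and the record layout, function names and 0.8 threshold are assumptions made for the example.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Similarity ratio in [0, 1] between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_same_person(record_a, record_b, threshold=0.8):
    """Treat two (name, zip_code) records as duplicates if zips match and names are close."""
    name_a, zip_a = record_a
    name_b, zip_b = record_b
    return zip_a == zip_b and name_similarity(name_a, name_b) >= threshold


print(is_same_person(("Rob Rosen", "10001"), ("Robert Rosen", "10001")))
print(is_same_person(("Rob Rosen", "10001"), ("Amy Rosen", "10001")))
```

Comparing every pair of records this way would not scale to hundreds of millions of rows, so a real system would first narrow candidates, for example by grouping on zip code.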


Predicting Probability Distributions Using Neural Networks

If you’ve been following our tech blog lately, you might have noticed we’re using a special type of neural network called a Mixture Density Network (MDN). MDNs do not only predict the expected value of a target, but also the underlying probability distribution. This blogpost will focus on how to implement such a model using Tensorflow, from the ground up, including explanations, diagrams and a Jupyter notebook with the entire source code.
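What it means to predict a distribution rather than a single value can be sketched with the loss such a model typically minimizes. The code below is plain Python, not the post's Tensorflow implementation, and it assumes a mixture of one-dimensional Gaussians: the network would output the weights, means and standard deviations, and training would minimize the negative log-likelihood of the observed target.

```python
import math


def gaussian_pdf(y, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def mixture_nll(y, weights, mus, sigmas):
    """Negative log-likelihood of y under a weighted mixture of Gaussians."""
    likelihood = sum(
        w * gaussian_pdf(y, mu, sigma) for w, mu, sigma in zip(weights, mus, sigmas)
    )
    return -math.log(likelihood)


# Hypothetical network outputs for one input: two components with equal weight.
weights, mus, sigmas = [0.5, 0.5], [0.0, 5.0], [1.0, 1.0]
print(mixture_nll(0.0, weights, mus, sigmas))  # target near a component mean
print(mixture_nll(2.5, weights, mus, sigmas))  # target between the two modes
```

A target close to one of the predicted modes yields a lower loss than one lying between them, which is how the model is pushed to place probability mass where the data actually falls.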


Why and how to Cross Validate a Model?

Once we are done with training our model, we can’t just assume that it is going to work well on data that it has not seen before. In other words, we can’t be sure that the model will have the desired accuracy and variance in a production environment. We need some kind of assurance of the accuracy of the predictions that our model is putting out. For this, we need to validate our model. This process of deciding whether the numerical results quantifying hypothesised relationships between variables are acceptable as descriptions of the data is known as validation.
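One common way to obtain that assurance is k-fold cross-validation, sketched below in plain Python. The helper name and the unshuffled, contiguous folds are simplifying assumptions for illustration: the data is cut into k parts, and each part serves once as held-out test data while the rest is used for training.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print(len(train), test)
```

A model would be fitted on each `train` split and scored on the matching `test` split; the spread of the k scores gives an estimate of how the model behaves on unseen data. Shuffling the indices first is usually advisable when the rows are ordered.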


From Eigendecomposition to Determinant: Fundamental Mathematics for Machine Learning with Intuitive Examples Part 3/3

For understanding the mathematics for machine learning algorithms, especially deep learning algorithms, it is essential to build up the mathematical concepts from foundational to more advanced. Unfortunately, Mathematical theories are too hard/abstract/dry to digest in many cases. Imagine you are eating a pizza, it is always easier and more fun to go with a coke. The purpose of this article is to provide intuitive examples for fundamental mathematical theories to make the learning experience more enjoyable and memorable, which is to serve chicken wings with beer, fries with ketchup, and rib-eye with wine.


Automated Feature Engineering for Predictive Modeling

One of the main time investments I’ve seen data scientists make when building data products is manually performing feature engineering. While tools such as auto-sklearn and Auto-Keras have been able to automate much of the model fitting process when building a predictive model, determining which features to use as input to the fitting process is usually a manual process. I recently started using the FeatureTools library, which enables data scientists to also automate feature engineering.


Finding and managing research papers: a survey of tools and products

As researchers, especially in (overly) prolific fields like Deep Learning, we often find ourselves overwhelmed by the huge number of papers to read and keep track of in our work. I think one big reason for this is insufficient use of existing tools and services that aim to make our lives easier. Another reason is the lack of a really good product which meets all our needs under one interface, but that is a topic for another post. Lately I’ve been getting into a new subfield of ML and got extremely frustrated with the process of prioritizing, reading and managing the relevant papers… I ended up looking for tools to help me deal with this overload and want to share with you the products and services that I’ve found. The goal is to improve the workflow and quality of life of anyone who works with scientific papers.


A New Hyperbolic Tangent Based Activation Function for Neural Networks

In this article, I introduce a new hyperbolic tangent based activation function, the tangent linear unit (TaLU), for neural networks. The function was evaluated on the CIFAR-10 and CIFAR-100 datasets. The performance of the proposed activation function was on par with or better than that of other activation functions such as the standard rectified linear unit (ReLU), leaky rectified linear unit (Leaky ReLU), and exponential linear unit (ELU).


Beginner tutorial: Build your own custom real-time object classifier

In this tutorial, we will learn how to build a custom real-time object classifier to detect any object of your choice! We will be using BeautifulSoup and Selenium to scrape training images from Shutterstock, Amazon’s Mechanical Turk (or BBox Label Tool) to label images with bounding boxes, and YOLOv3 to train our custom detection model.

Document worth reading: “Visions of a generalized probability theory”

In this book we argue that the fruitful interaction of computer vision and belief calculus is capable of stimulating significant advances in both fields. From a methodological point of view, novel theoretical results concerning the geometric and algebraic properties of belief functions as mathematical objects are illustrated and discussed in Part II, with a focus on both a perspective ‘geometric approach’ to uncertainty and an algebraic solution to the issue of conflicting evidence. In Part III we show how these theoretical developments arise from important computer vision problems (such as articulated object tracking, data association and object pose estimation) to which, in turn, the evidential formalism is able to provide interesting new solutions. Finally, some initial steps towards a generalization of the notion of total probability to belief functions are taken, in the perspective of endowing the theory of evidence with a complete battery of estimation and inference tools to the benefit of all scientists and practitioners.