Improving Decision Trees Using Tsallis Entropy

The construction of efficient and effective decision trees remains a key topic in machine learning because of their simplicity and flexibility. Many heuristic algorithms have been proposed to construct near-optimal decision trees. Most of them, however, are greedy algorithms, which have the drawback of obtaining only local optima. In addition, common split criteria, e.g. Shannon entropy, Gain Ratio and Gini index, are inflexible because they lack parameters that can be adjusted to the data set. To address these issues, we propose a series of novel methods using Tsallis entropy. First, a Tsallis Entropy Criterion (TEC) algorithm is proposed to unify Shannon entropy, Gain Ratio and Gini index, which generalizes the split criteria of decision trees. Second, we propose a Tsallis Entropy Information Metric (TEIM) algorithm for efficient construction of decision trees. The TEIM algorithm takes advantage of the adaptability of Tsallis conditional entropy and the ability of a two-stage approach to reduce greediness. Experimental results on UCI data sets indicate that the TEC algorithm achieves statistically significant improvement over the classical algorithms, and that the TEIM algorithm yields significantly better decision trees in both classification accuracy and tree complexity.
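As a rough illustration of how a single adjustable parameter can tie these criteria together, the sketch below uses the standard Tsallis entropy S_q(p) = (1 - sum_i p_i^q) / (q - 1), which tends to Shannon entropy (in nats) as q approaches 1 and equals the Gini index at q = 2. The function names and the weighted-average gain are assumptions made for this example only; this is not the TEC or TEIM algorithm from the abstract, and the Gain Ratio case is not covered.

```python
import math


def tsallis_entropy(probs, q):
    """Tsallis entropy S_q(p) = (1 - sum_i p_i^q) / (q - 1).

    At q = 1 the Shannon limit -sum_i p_i ln p_i is returned.
    """
    probs = [p for p in probs if p > 0]
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def class_distribution(labels):
    """Empirical class probabilities of a list of labels."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n = len(labels)
    return [c / n for c in counts.values()]


def split_gain(parent, left, right, q):
    """Impurity reduction of a binary split (hypothetical helper).

    Assumption: children are combined by a size-weighted average,
    as in classical information gain; the paper may weight differently.
    """
    n = len(parent)
    children = (
        len(left) / n * tsallis_entropy(class_distribution(left), q)
        + len(right) / n * tsallis_entropy(class_distribution(right), q)
    )
    return tsallis_entropy(class_distribution(parent), q) - children


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    for q in (1.0, 1.5, 2.0):
        print(q, split_gain(labels, [0, 0], [1, 1], q))
```

In such a setup q would be treated as a hyperparameter, chosen per data set (for example by cross-validation) rather than fixed in advance.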

A Roadmap towards Machine Intelligence

The development of intelligent machines is one of the biggest unsolved challenges in computer science. In this paper, we propose some fundamental properties these machines should have, focusing in particular on communication and learning. We discuss a simple environment that could be used to incrementally teach a machine the basics of natural-language-based communication, as a prerequisite to more complex interaction with human users. We also present some conjectures on the sort of algorithms the machine should support in order to profitably learn from the environment.

Learning with Memory Embeddings

Embedding learning, a.k.a. representation learning, has been shown to be able to model large-scale semantic knowledge graphs. A key concept is a mapping of the knowledge graph to a tensor representation whose entries are predicted by models using latent representations of generalized entities. In recent publications the embedding models were extended to also consider temporal evolutions, temporal patterns and subsymbolic representations. These extended models were used successfully to predict clinical events like procedures, lab measurements, and diagnoses. In this paper, we attempt to map these embedding models, which were developed purely as solutions to technical problems, to various cognitive memory functions, in particular to semantic and concept memory, episodic memory and sensory memory. We also draw an analogy between a predictive model, which uses entity representations derived in memory models, and working memory. Cognitive memory functions are typically classified as long-term or short-term memory, where long-term memory has the subcategories declarative memory and non-declarative memory, and short-term memory has the subcategories sensory memory and working memory. There is evidence that these main cognitive categories are partially dissociated from one another in the brain, as expressed in their differential sensitivity to brain damage. However, there is also evidence indicating that the different memory functions are not mutually independent. A hypothesis that arises out of this work is that mutual information exchange can be achieved by sharing or coupling of distributed latent representations of entities across different memory functions.

Towards Universal Paraphrastic Sentence Embeddings

A latent trawl process model for extreme values

Star-critical Ramsey number of $K_4$ versus $F_n$

Plan Explainability and Predictability for Cobots

Max-Cut under Graph Constraints

Non-Ergodic Complexity Management

Sizes of the extremal girth 5 graphs of orders from 40 to 49

Translative covering of space with slabs

L1 Regularized Least Squares for Support Recovery of High Dimensional Single Index Models with Gaussian Designs

Strategic Dialogue Management via Deep Reinforcement Learning

Domination polynomial of lexicographic product of specific graphs

Some recent results and open problems on sets of lengths of Krull monoids with finite class group

Desktop to Cloud Migration of Scientific Computing Experiments

Estimation and testing for multiple regulation of multivariate mixed outcomes

Remark on a result of Constantine

Relaxed Majorization-Minimization for Non-smooth and Non-convex Optimization

On the challenge of reconstructing level-1 phylogenetic networks from triplets and clusters

The excluded minors for isometric realizability in the plane

Learning to detect video events from zero or very few video examples

Restricted Markov uniqueness for the stochastic quantization of $P(Φ)_2$ and its applications

Penalized MM Regression Estimation with $L_{γ}$ Penalty: A Robust Version of Bridge Regression

Krylov-Veretennikov formula for functionals from the stopped Wiener process

Bases for cluster algebras from orbifolds

The G-convex Functions Based on the Nonlinear Expectations Defined by G-BSDEs

Bootstrap percolation in directed and inhomogeneous random graphs

$h$-perfect plane triangulations

Reordering GPU Kernel Launches to Enable Efficient Concurrent Execution

On a characterization of idempotent distributions on discrete fields and on the field of p-adic numbers

Information-theoretic neuro-correlates boost evolution of cognitive systems

MOOCs Meet Measurement Theory: A Topic-Modelling Approach

Distinct replication machinery causes an optimization–innovation trade-off in emergent digital replicators

Exploring Correlation between Labels to improve Multi-Label Classification

Learning Halfspaces and Neural Networks with Random Initialization

Maximum Likelihood Estimation for Single Linkage Hierarchical Clustering

Temporal Convolutional Neural Networks for Diagnosis from Lab Tests

Central limit theorem for the wave transport in disordered waveguides: a perturbative approach

The minimum rank problem for circulants

Refraction-reflection strategies in the dual model

Context-aware CNNs for person head detection

Natural Language Understanding with Distributed Representation

Flexible Design for $α$-Duplex Communications in Multi-Tier Cellular Networks

Performance Limits of Online Stochastic Sub-Gradient Learning

A Sample Path Large Deviation Principle for a Class of Population Processes

Private Posterior distributions from Variational approximations

rnn : Recurrent Library for Torch