Improving Decision Trees Using Tsallis Entropy
The construction of efficient and effective decision trees remains a key topic in machine learning because of their simplicity and flexibility. Many heuristic algorithms have been proposed to construct near-optimal decision trees. Most of them, however, are greedy algorithms, which have the drawback of finding only local optima. In addition, common split criteria, e.g. Shannon entropy, Gain Ratio and Gini index, are inflexible because they lack parameters that can be adjusted to the data set. To address these issues, we propose a series of novel methods using Tsallis entropy in this paper. Firstly, a Tsallis Entropy Criterion (TEC) algorithm is proposed to unify Shannon entropy, Gain Ratio and Gini index, which generalizes the split criteria of decision trees. Secondly, we propose a Tsallis Entropy Information Metric (TEIM) algorithm for efficient construction of decision trees. The TEIM algorithm takes advantage of the adaptability of Tsallis conditional entropy and the greediness-reducing ability of a two-stage approach. Experimental results on UCI data sets indicate that the TEC algorithm achieves statistically significant improvement over the classical algorithms, and that the TEIM algorithm yields significantly better decision trees in both classification accuracy and tree complexity.
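To make the unification claim concrete, here is a minimal Python sketch of the standard Tsallis entropy, S_q(p) = (1 - sum_i p_i^q) / (q - 1), used as a split impurity. It is not the authors' TEC or TEIM algorithm: the function names (`tsallis_entropy`, `class_probs`, `split_gain`) are hypothetical, q > 0 is assumed, and only the two well-known special cases are shown (q -> 1 recovers Shannon entropy in nats, q = 2 recovers the Gini index). How the paper covers Gain Ratio is not reproduced here.

```python
import math


def tsallis_entropy(probs, q):
    """Tsallis entropy S_q(p) = (1 - sum_i p_i**q) / (q - 1), assuming q > 0.

    The limit q -> 1 is Shannon entropy (natural log); q = 2 is the Gini index.
    """
    probs = [p for p in probs if p > 0]  # zero-probability classes contribute nothing
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def class_probs(labels):
    """Empirical class distribution of a list of labels."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n = len(labels)
    return [c / n for c in counts.values()]


def split_gain(parent, children, q):
    """Impurity reduction of a split: parent impurity minus the
    size-weighted impurity of the child nodes, both measured with S_q."""
    n = len(parent)
    weighted = sum(
        len(child) / n * tsallis_entropy(class_probs(child), q)
        for child in children
    )
    return tsallis_entropy(class_probs(parent), q) - weighted


if __name__ == "__main__":
    parent = ["a", "a", "b", "b"]
    children = [["a", "a"], ["b", "b"]]  # a perfectly separating split
    for q in (0.5, 1.0, 2.0):
        print(q, split_gain(parent, children, q))
```

In a sketch like this, q is the adjustable parameter the abstract refers to: it would be tuned per data set (for instance by cross-validation) rather than fixed at the Shannon or Gini value.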
A Roadmap towards Machine Intelligence
The development of intelligent machines is one of the biggest unsolved challenges in computer science. In this paper, we propose some fundamental properties these machines should have, focusing in particular on communication and learning. We discuss a simple environment that could be used to incrementally teach a machine the basics of natural-language-based communication, as a prerequisite to more complex interaction with human users. We also present some conjectures on the sort of algorithms the machine should support in order to profitably learn from the environment.
Learning with Memory Embeddings
Embedding learning, a.k.a. representation learning, has been shown to be able to model large-scale semantic knowledge graphs. A key concept is a mapping of the knowledge graph to a tensor representation whose entries are predicted by models using latent representations of generalized entities. In recent publications the embedding models were extended to also consider temporal evolutions, temporal patterns and subsymbolic representations. These extended models were used successfully to predict clinical events like procedures, lab measurements, and diagnoses. In this paper, we attempt to map these embedding models, which were developed purely as solutions to technical problems, to various cognitive memory functions, in particular to semantic and concept memory, episodic memory and sensory memory. We also draw an analogy between a predictive model, which uses entity representations derived in memory models, and working memory. Cognitive memory functions are typically classified as long-term or short-term memory, where long-term memory has the subcategories declarative memory and non-declarative memory, and short-term memory has the subcategories sensory memory and working memory. There is evidence that these main cognitive categories are partially dissociated from one another in the brain, as expressed in their differential sensitivity to brain damage. However, there is also evidence indicating that the different memory functions are not mutually independent. A hypothesis that arises out of this work is that mutual information exchange can be achieved by sharing or coupling of distributed latent representations of entities across different memory functions.
• Towards Universal Paraphrastic Sentence Embeddings
• A latent trawl process model for extreme values
• Star-critical Ramsey number of $K_4$ versus $F_n$
• Plan Explainability and Predictability for Cobots
• Max-Cut under Graph Constraints
• Non-Ergodic Complexity Management
• Sizes of the extremal girth 5 graphs of orders from 40 to 49
• Translative covering of space with slabs
• L1 Regularized Least Squares for Support Recovery of High Dimensional Single Index Models with Gaussian Designs
• Strategic Dialogue Management via Deep Reinforcement Learning
• Domination polynomial of lexicographic product of specific graphs
• Some recent results and open problems on sets of lengths of Krull monoids with finite class group
• Desktop to Cloud Migration of Scientific Computing Experiments
• Estimation and testing for multiple regulation of multivariate mixed outcomes
• Remark on a result of Constantine
• Relaxed Majorization-Minimization for Non-smooth and Non-convex Optimization
• On the challenge of reconstructing level-1 phylogenetic networks from triplets and clusters
• The excluded minors for isometric realizability in the plane
• Learning to detect video events from zero or very few video examples
• Restricted Markov uniqueness for the stochastic quantization of $P(Φ)_2$ and its applications
• Penalized MM Regression Estimation with $L_{γ}$ Penalty: A Robust Version of Bridge Regression
• Krylov-Veretennikov formula for functionals from the stopped Wiener process
• Bases for cluster algebras from orbifolds
• The G-convex Functions Based on the Nonlinear Expectations Defined by G-BSDEs
• Bootstrap percolation in directed and inhomogeneous random graphs
• $h$-perfect plane triangulations
• Reordering GPU Kernel Launches to Enable Efficient Concurrent Execution
• On a characterization of idempotent distributions on discrete fields and on the field of p-adic numbers
• Information-theoretic neuro-correlates boost evolution of cognitive systems
• MOOCs Meet Measurement Theory: A Topic-Modelling Approach
• Distinct replication machinery causes an optimization–innovation trade-off in emergent digital replicators
• Exploring Correlation between Labels to improve Multi-Label Classification
• Learning Halfspaces and Neural Networks with Random Initialization
• Maximum Likelihood Estimation for Single Linkage Hierarchical Clustering
• Temporal Convolutional Neural Networks for Diagnosis from Lab Tests
• Central limit theorem for the wave transport in disordered waveguides: a perturbative approach
• The minimum rank problem for circulants
• Refraction-reflection strategies in the dual model
• Context-aware CNNs for person head detection
• Natural Language Understanding with Distributed Representation
• Flexible Design for $α$-Duplex Communications in Multi-Tier Cellular Networks
• Performance Limits of Online Stochastic Sub-Gradient Learning
• A Sample Path Large Deviation Principle for a Class of Population Processes
• Private Posterior distributions from Variational approximations
• rnn : Recurrent Library for Torch