While the Cox proportional hazards model is a fundamental tool in survival analysis, its semi-parametric nature precludes the estimation of upper survival quantiles in the presence of heavy censoring. In contrast, fully parametric models do not suffer from this issue, though at the expense of additional modeling assumptions. In this article, we extend a popular family of parametric models that make the Accelerated Failure Time (AFT) assumption to account for heteroscedasticity in the log-survival times. This adds substantial modeling flexibility, and we show how to easily and rapidly compute maximum likelihood estimators for the proposed model in the presence of censoring. In an application to the analysis of a colon cancer study, we found that heteroscedastic modeling greatly diminished the significance of outliers, while even slightly decreasing the average size of prediction intervals.
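To make the role of censoring in the likelihood concrete, the following is a minimal sketch of a right-censored negative log-likelihood for a heteroscedastic AFT model. It rests on assumptions that the abstract does not state: a log-normal error distribution, a log-linear scale `sigma_i = exp(z_i . gamma)`, and the hypothetical names `neg_log_likelihood`, `beta`, `gamma`, `X`, `Z`, `log_t`, and `event`. It is not the authors' estimation procedure, only an illustration of how observed and censored times contribute differently.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def log_norm_pdf(r):
    """Log density of a standard normal at r."""
    return -0.5 * r * r - 0.5 * math.log(2.0 * math.pi)


def log_norm_sf(r):
    """Log survival function log(1 - Phi(r)) of a standard normal.

    Uses erfc directly, so it underflows for very large r; a production
    implementation would need a more careful tail approximation.
    """
    return math.log(0.5 * math.erfc(r / math.sqrt(2.0)))


def neg_log_likelihood(beta, gamma, X, Z, log_t, event):
    """Negative log-likelihood of log-survival times under right censoring.

    Assumed model (illustrative only):
        log T_i = x_i . beta + exp(z_i . gamma) * eps_i,  eps_i ~ N(0, 1)
    event[i] is truthy if the failure was observed, falsy if censored.
    """
    total = 0.0
    for x, z, y, d in zip(X, Z, log_t, event):
        mu = dot(x, beta)             # location of log T_i
        log_sigma = dot(z, gamma)     # log of the covariate-dependent scale
        r = (y - mu) / math.exp(log_sigma)
        if d:
            # Observed failure: density of the log-time.
            total += log_norm_pdf(r) - log_sigma
        else:
            # Censored: only know that the log-time exceeds y.
            total += log_norm_sf(r)
    return -total
```

Passing this function to a general-purpose optimizer over `(beta, gamma)` would give maximum likelihood estimates under the stated assumptions; setting every column of `Z` except an intercept to zero recovers the usual homoscedastic AFT model.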
Understanding entailment and contradiction is fundamental to understanding natural language, and inference about entailment and contradiction is a valuable testing ground for the development of semantic representations. However, machine learning research in this area has been dramatically limited by the lack of large-scale resources. To address this, we introduce the Stanford Natural Language Inference corpus, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning. At 570K pairs, it is two orders of magnitude larger than all other resources of its type. This increase in scale allows lexicalized classifiers to outperform some sophisticated existing entailment models, and it allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.
We propose a general framework for studying adaptive regret bounds in online learning, including model selection bounds and data-dependent bounds. Given a data- or model-dependent bound, we ask, ‘Does there exist some algorithm achieving this bound?’ We show that modifications to recently introduced sequential complexity measures can be used to answer this question by providing sufficient conditions under which adaptive rates can be achieved. In particular, each adaptive rate induces a set of so-called offset complexity measures, and obtaining small upper bounds on these quantities is sufficient to demonstrate achievability. A cornerstone of our analysis technique is the use of one-sided tail inequalities to bound suprema of offset random processes. Our framework recovers and improves a wide variety of adaptive bounds including quantile bounds, second-order data-dependent bounds, and small loss bounds. In addition, we derive a new type of adaptive bound for online linear optimization based on the spectral norm, as well as a new online PAC-Bayes theorem that holds for countably infinite sets.
We propose a method combining relational-logic representations with deep neural network learning. Domain-specific knowledge is described through relational rules, which may be handcrafted or learned. The relational rule-set serves as a template for unfolding possibly deep neural networks whose structures also reflect the structure of given training or testing examples. Different networks corresponding to different examples share their weights, which co-evolve during training by the stochastic gradient descent algorithm. Notable relational concepts can be discovered by interpreting shared hidden layer weights corresponding to the rules. Experiments on 78 relational learning benchmarks demonstrate the favorable performance of the method.
Text mining can be applied in many fields. One application is political sentiment analysis of digital newspapers. In this paper, sentiment analysis is applied to digital news articles to determine whether they express positive or negative sentiment toward a particular politician. This paper proposes a simple model for analyzing the sentiment polarity of digital newspaper articles using a naive Bayes classifier. The model starts from an initial data set, which is updated as new information appears. The model showed promising results when tested and could be applied to other sentiment analysis problems.
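As an illustration of a naive Bayes sentiment model that starts from initial data and is updated as new articles arrive, here is a minimal sketch. The abstract does not give implementation details, so everything here is an assumption: the multinomial formulation with Laplace smoothing, whitespace tokenization, the class name `NaiveBayesSentiment`, and the example sentences. It is not the paper's model.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSentiment:
    """Multinomial naive Bayes with add-alpha smoothing (illustrative sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # smoothing strength (assumed)
        self.doc_counts = Counter()              # documents seen per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()                       # all tokens seen so far

    def update(self, text, label):
        """Add one labeled article; the model is just running counts,
        so new information can be folded in at any time."""
        tokens = text.lower().split()
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def predict(self, text):
        """Return the label with the highest posterior log-score."""
        tokens = text.lower().split()
        n_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            total = sum(self.word_counts[label].values())
            denom = total + self.alpha * len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / n_docs)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical usage: seed with initial data, then keep updating.
clf = NaiveBayesSentiment()
clf.update("strong leadership praised by voters", "positive")
clf.update("corruption scandal angers voters", "negative")
print(clf.predict("voters praised leadership"))
```

Because the model stores only counts, updating it with a new labeled article is a constant-time operation per token, which matches the incremental use described in the abstract.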