Inference of Binary Regime Models with Jump Discontinuities

We have developed a statistical technique for testing the model assumption of a binary regime-switching extension of the geometric Lévy process (GLP) by proposing a new discriminating statistic. The statistic is sensitive to the transition kernel of the regime-switching model. Given time series data, one can use this statistic to test hypotheses on the nature of the regime switching. Furthermore, we have applied this statistic to test the regime-switching hypothesis on Indian sectoral indices and report the results here. The results clearly indicate the presence of multiple regimes in the data.

Development and evaluation of an open-source, machine learning-based average annual daily traffic estimation software

Traditionally, Departments of Transportation (DOTs) use a factor-based model to estimate Annual Average Daily Traffic (AADT) from short-term traffic counts. Expansion factors derived from permanent traffic count stations are applied to the short-term counts to estimate AADT. The inherent challenges of the factor-based method (i.e., grouping the count stations and applying proper expansion factors) introduce errors into the estimated AADT values. Based on a survey conducted by the authors, 97% of the 39 public transportation agencies surveyed use the factor-based AADT estimation model, and these agencies face the aforementioned challenges when using it. To derive a more accurate AADT, this paper presents ‘estimAADTion’, an open-source software tool that uses a machine learning method called support vector regression (SVR) to estimate AADT from 24-hour short-term count data. DOTs periodically conduct short-term counts at different locations. The software is designed to estimate AADT at a particular location from the short-term counts collected at that location. To do so, it trains the SVR model on data from permanent count stations. The performance of the ‘estimAADTion’ software is validated using short-term count data from South Carolina. The Mean Absolute Percentage Error (MAPE) of the AADT estimated by the software is 3%, while the factor-based method produces a MAPE of 6%.
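
The MAPE metric quoted above can be illustrated with a minimal sketch. The function name `mape` and the traffic volumes below are hypothetical and are not taken from the paper or from the ‘estimAADTion’ code; the sketch only shows how the metric is usually computed.

```python
def mape(actual, estimated):
    """Mean Absolute Percentage Error, returned in percent.

    `actual` holds reference AADT values (e.g. from permanent stations);
    `estimated` holds the model's AADT estimates for the same sites.
    """
    if len(actual) != len(estimated) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when a reference value is zero")
    total = sum(abs(a - e) / abs(a) for a, e in zip(actual, estimated))
    return 100.0 * total / len(actual)


# Hypothetical AADT values (vehicles per day), for illustration only.
reference_aadt = [1000, 2000, 4000]
estimated_aadt = [1030, 1940, 4120]  # each estimate is off by 3%
print(round(mape(reference_aadt, estimated_aadt), 6))
```

Because every error is divided by its own reference value, low-volume and high-volume sites contribute equally to the score.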

Feature Selection and Extraction for Graph Neural Networks

Graph Neural Networks (GNNs) have recently become a popular research topic in data science because they use graphs, a ubiquitous data structure, as the underlying elements for constructing and training neural networks. In a GNN, each node has numerous features associated with it. The overall task (for example, classification or clustering) uses the node features to make decisions at the node level or the graph level. In this paper, (1) we extend an existing Gumbel-Softmax-based feature selection algorithm to GNNs and conduct a series of experiments on it using the benchmark datasets Cora, Citeseer and Pubmed; (2) we implement a mechanism to rank the extracted features. We demonstrate the effectiveness of our algorithms for both feature selection and ranking. For the Cora dataset, (1) we use the algorithm to select 225 features out of 1433, and our experimental results demonstrate that the selected features remain effective for the same classification problem. (2) We extract features that are linear combinations of the original features, where the coefficients for each extracted feature are non-negative and sum to one. We propose an algorithm that ranks the extracted features such that, when they are used for the same classification problem, accuracy decreases gradually across the extracted features ranked 1 – 50, 51 – 100, 101 – 150, and 151 – 200.
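
As a rough illustration of why Gumbel Softmax gives coefficients that are non-negative and sum to one, the sketch below implements the generic Gumbel-Softmax relaxation in plain Python. It is not the paper's algorithm: the function names, logits, and temperature are assumptions chosen for the example.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Draw one relaxed sample: softmax((logits + Gumbel noise) / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    peak = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def extract_feature(features, weights):
    """An extracted feature as a weighted (convex) combination of the originals."""
    return sum(w * x for w, x in zip(weights, features))


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
weights = gumbel_softmax([0.0, 1.0, 2.0], temperature=0.5, rng=rng)
print(weights, sum(weights))
print(extract_feature([0.2, 0.7, 0.1], weights))
```

Lower temperatures push the weights toward a one-hot vector, which is what turns the soft combination into something close to a hard selection of a single feature.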

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new ‘Colossal Clean Crawled Corpus’, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.
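
The text-to-text idea can be sketched as a function that turns a task-specific example into an (input string, target string) pair. The helper name, the dictionary keys, and the exact prefix wording below are illustrative assumptions, not the released code.

```python
def to_text_to_text(task, example):
    """Cast a task-specific example into an (input text, target text) pair.

    The prefixes are illustrative; any fixed, task-identifying prefix
    would serve the same purpose in this sketch.
    """
    if task == "translation":
        source = (
            f"translate {example['source_lang']} to "
            f"{example['target_lang']}: {example['text']}"
        )
        return source, example["translation"]
    if task == "summarization":
        return "summarize: " + example["document"], example["summary"]
    if task == "classification":
        # Class labels are emitted as plain text, not as class indices.
        return f"{example['task_name']}: {example['text']}", example["label"]
    raise ValueError(f"unknown task: {task}")


pair = to_text_to_text(
    "classification",
    {"task_name": "sentiment", "text": "A delightful film.", "label": "positive"},
)
print(pair)
```

Since every task yields the same kind of pair, a single model, loss, and decoding procedure can in principle be shared across all of them.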

Functional Tensors for Probabilistic Programming

It is a significant challenge to design probabilistic programming systems that can accommodate a wide variety of inference strategies within a unified framework. Noting that the versatility of modern automatic differentiation frameworks is based in large part on the unifying concept of tensors, we describe a software abstraction, functional tensors, that captures many of the benefits of tensors, while also being able to describe continuous probability distributions. Moreover, functional tensors are a natural candidate for generalized variable elimination and parallel-scan filtering algorithms that enable parallel exact inference for a large family of tractable modeling motifs. We demonstrate the versatility of functional tensors by integrating them into the modeling frontend and inference backend of the Pyro programming language. In experiments we show that the resulting framework enables a large variety of inference strategies, including those that mix exact and approximate inference.

LUTNet: Learning FPGA Configurations for Highly Efficient Neural Network Inference

Research has shown that deep neural networks contain significant redundancy, and thus that high classification accuracy can be achieved even when weights and activations are quantised down to binary values. Network binarisation on FPGAs greatly increases area efficiency by replacing resource-hungry multipliers with lightweight XNOR gates. However, an FPGA’s fundamental building block, the K-LUT, is capable of implementing far more than an XNOR: it can perform any K-input Boolean operation. Inspired by this observation, we propose LUTNet, an end-to-end hardware-software framework for the construction of area-efficient FPGA-based neural network accelerators using the native LUTs as inference operators. We describe the realisation of both unrolled and tiled LUTNet architectures, with the latter facilitating smaller, less power-hungry deployment over the former while sacrificing area and energy efficiency along with throughput. For both varieties, we demonstrate that the exploitation of LUT flexibility allows for far heavier pruning than possible in prior works, resulting in significant area savings while achieving comparable accuracy. Against the state-of-the-art binarised neural network implementation, we achieve up to twice the area efficiency for several standard network models when inferencing popular datasets. We also demonstrate that even greater energy efficiency improvements are obtainable.
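
The observation that a K-LUT can implement any K-input Boolean operation, with XNOR as just one case, can be made concrete with a small software model. The `LUT` class below is a hypothetical illustration and is unrelated to the LUTNet codebase.

```python
from itertools import product


class LUT:
    """Software model of a K-input lookup table: a truth table with 2**K entries."""

    def __init__(self, k, table):
        if len(table) != 2 ** k:
            raise ValueError("a K-LUT needs exactly 2**K table entries")
        self.k = k
        self.table = list(table)

    def __call__(self, *bits):
        if len(bits) != self.k:
            raise ValueError(f"expected {self.k} input bits")
        index = 0
        for bit in bits:  # the first input is the most significant address bit
            index = (index << 1) | (1 if bit else 0)
        return self.table[index]


# XNOR is one of the 2**(2**2) = 16 possible two-input functions.
xnor = LUT(2, [1, 0, 0, 1])

# A three-input majority vote, a function a single XNOR gate cannot express.
majority = LUT(3, [int(a + b + c >= 2) for a, b, c in product((0, 1), repeat=3)])

print([xnor(a, b) for a, b in product((0, 1), repeat=2)])
print(majority(1, 0, 1))
```

Only the contents of the table distinguish one function from another, so choosing the operation amounts to choosing 2**K bits.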

GF + MMT = GLF — From Language to Semantics through LF

These days, vast amounts of knowledge are available online, most of it in written form. Search engines help us access this knowledge, but aggregating, relating and reasoning with it is still a predominantly human effort. One of the key challenges for automated reasoning based on natural-language texts is the need to extract meaning (semantics) from texts. Natural language understanding (NLU) systems describe the conversion from a set of natural language utterances to terms in a particular logic. Tools for the co-development of grammar and target logic are currently largely missing. We will describe the Grammatical Logical Framework (GLF), a combination of two existing frameworks, in which large parts of a symbolic, rule-based NLU system can be developed and implemented: the Grammatical Framework (GF) and MMT. GF is a tool for syntactic analysis, generation, and translation with complex natural language grammars and MMT can be used to specify logical systems and to represent knowledge in them. Combining these tools is possible, because they are based on compatible logical frameworks: Martin-Löf type theory and LF. The flexibility of logical frameworks is needed, as NLU research has not settled on a particular target logic for meaning representation. Instead, new logics are developed all the time to handle various language phenomena. GLF allows users to develop the logic and the language parsing components in parallel, and to connect them for experimentation with the entire pipeline.

Preventing Adversarial Use of Datasets through Fair Core-Set Construction

We propose improving the privacy properties of a dataset by publishing only a strategically chosen ‘core-set’ of the data containing a subset of the instances. The core-set allows strong performance on primary tasks, but forces poor performance on unwanted tasks. We give methods for both linear models and neural networks and demonstrate their efficacy on data.

ProLFA: Representative Prototype Selection for Local Feature Aggregation

Given a set of hand-crafted local features, acquiring a global representation via aggregation is a promising technique to boost computational efficiency and improve task performance. Existing feature aggregation (FA) approaches, including Bag of Words and Fisher Vectors, usually fail to capture the desired information due to their pipeline mode. In this paper, we propose a generic formulation that provides a systematic solution (named ProLFA) for aggregating local descriptors. It produces compact yet interpretable representations by selecting representative prototypes from numerous descriptors under a relaxed exclusivity constraint. Meanwhile, to strengthen the discriminability of the aggregated representation, we enforce the domain-invariant projection of bundled descriptors along a task-specific direction. Furthermore, ProLFA generalizes flexibly to both semi-supervised and fully supervised scenarios in local feature aggregation. Experimental results on various descriptors and tasks demonstrate that ProLFA is considerably superior to currently available feature aggregation alternatives.

Taxonomy of Real Faults in Deep Learning Systems

The growing application of deep neural networks in safety-critical domains makes the analysis of faults that occur in such systems of enormous importance. In this paper we introduce a large taxonomy of faults in deep learning (DL) systems. We have manually analysed 1059 artefacts gathered from GitHub commits and issues of projects that use the most popular DL frameworks (TensorFlow, Keras and PyTorch) and from related Stack Overflow posts. Structured interviews with 20 researchers and practitioners describing the problems they have encountered in their experience have enriched our taxonomy with a variety of additional faults that did not emerge from the other two sources. Our final taxonomy was validated with a survey involving an additional set of 21 developers, confirming that almost all fault categories (13/15) were experienced by at least 50% of the survey participants.

Simple Strategies in Multi-Objective MDPs (Technical Report)

We consider the verification of multiple expected reward objectives at once on Markov decision processes (MDPs). This enables a trade-off analysis among multiple objectives by obtaining the Pareto front. We focus on strategies that are easy to employ and implement, that is, strategies that are pure (no randomization) and have bounded memory. We show that checking whether a point is achievable by a pure stationary strategy is NP-complete, even for two objectives, and we provide an MILP encoding to solve the corresponding problem. The bounded memory case can be reduced to the stationary one by a product construction. Experimental results using Storm and Gurobi show the feasibility of our algorithms.
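
To make the notions of Pareto front and achievability concrete, the sketch below assumes the expected-reward vectors of a handful of pure stationary strategies are already known and filters them by brute force. The numbers and function names are hypothetical. The paper's MILP encoding is not reproduced here, and enumerating all strategies of a real MDP would grow exponentially with the number of states.

```python
def dominates(p, q):
    """True if p is at least as good as q everywhere and strictly better somewhere."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def pareto_front(points):
    """Keep the reward vectors that no other vector dominates (maximization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def achievable(target, points):
    """A target is achievable if a single strategy meets it in every objective."""
    return any(all(v >= t for v, t in zip(p, target)) for p in points)


# Hypothetical expected rewards (objective 1, objective 2), one pair per strategy.
strategy_rewards = [(10, 2), (6, 6), (2, 10), (5, 5), (1, 1)]

print(pareto_front(strategy_rewards))
print(achievable((5, 5), strategy_rewards))
# (7, 5) equals 0.25 * (10, 2) + 0.75 * (6, 6), so mixing two strategies would
# reach it, but no single pure strategy in this list does.
print(achievable((7, 5), strategy_rewards))
```

The last check shows why restricting to pure strategies changes the problem: the set of achievable points is no longer convex.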

U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging

Neural networks are becoming more and more popular for the analysis of physiological time-series. The most successful deep learning systems in this domain combine convolutional and recurrent layers to extract useful features to model temporal relations. Unfortunately, these recurrent models are difficult to tune and optimize. In our experience, they often require task-specific modifications, which makes them challenging to use for non-experts. We propose U-Time, a fully feed-forward deep learning approach to physiological time series segmentation developed for the analysis of sleep data. U-Time is a temporal fully convolutional network based on the U-Net architecture that was originally proposed for image segmentation. U-Time maps sequential inputs of arbitrary length to sequences of class labels on a freely chosen temporal scale. This is done by implicitly classifying every individual time-point of the input signal and aggregating these classifications over fixed intervals to form the final predictions. We evaluated U-Time for sleep stage classification on a large collection of sleep electroencephalography (EEG) datasets. In all cases, we found that U-Time reaches or outperforms current state-of-the-art deep learning models while being much more robust in the training process and without requiring architecture or hyperparameter adaptation across tasks.
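
The step of classifying every time-point and then aggregating over fixed intervals can be sketched as follows. The sketch assumes mean pooling of per-point class scores followed by an argmax, and it drops any trailing partial interval; these choices, the function name, and the toy scores are assumptions, not details confirmed by the abstract.

```python
def aggregate_segments(point_scores, segment_length):
    """Turn per-time-point class scores into one label per fixed-length segment.

    `point_scores` is a list with one entry per time-point, each entry being
    a list of class scores. Scores are averaged within each segment and the
    class with the highest mean is returned for that segment.
    """
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    labels = []
    for start in range(0, len(point_scores) - segment_length + 1, segment_length):
        chunk = point_scores[start:start + segment_length]
        n_classes = len(chunk[0])
        means = [sum(row[c] for row in chunk) / segment_length for c in range(n_classes)]
        labels.append(max(range(n_classes), key=means.__getitem__))
    return labels


# Toy scores for 6 time-points and 2 classes (hypothetical values).
scores = [
    [0.9, 0.1], [0.8, 0.2], [0.4, 0.6],  # first segment: class 0 on average
    [0.2, 0.8], [0.3, 0.7], [0.6, 0.4],  # second segment: class 1 on average
]
print(aggregate_segments(scores, segment_length=3))
```

Changing `segment_length` changes the temporal scale of the output without touching the per-point scores, which mirrors the freely chosen temporal scale described above.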

RoboNet: Large-Scale Multi-Robot Learning

Robot learning has emerged as a promising tool for taming the complexity and diversity of the real world. Methods based on high-capacity models, such as deep networks, hold the promise of providing effective generalization to a wide range of open-world environments. However, these same methods typically require large amounts of diverse training data to generalize effectively. In contrast, most robotic learning experiments are small-scale, single-domain, and single-robot. This leads to a frequent tension in robotic learning: how can we learn generalizable robotic controllers without having to collect impractically large amounts of data for each separate experiment? In this paper, we propose RoboNet, an open database for sharing robotic experience, which provides an initial pool of 15 million video frames, from 7 different robot platforms, and study how it can be used to learn generalizable models for vision-based robotic manipulation. We combine the dataset with two different learning algorithms: visual foresight, which uses forward video prediction models, and supervised inverse models. Our experiments test the learned algorithms’ ability to work across new objects, new tasks, new scenes, new camera viewpoints, new grippers, or even entirely new robots. In our final experiment, we find that by pre-training on RoboNet and fine-tuning on data from a held-out Franka or Kuka robot, we can exceed the performance of a robot-specific training approach that uses 4x-20x more data. For videos and data, see the project webpage: