Individualized Time-Series Segmentation for Mining Mobile Phone User Behavior

Mobile phones can record individuals’ daily behavioral data as a time-series. In this paper, we present an effective time-series segmentation technique that extracts optimal time segments of individuals’ similar behavioral characteristics utilizing their mobile phone data. One of the determinants of an individual’s behavior is the various activities undertaken at various times-of-the-day and days-of-the-week. In many cases, such behavior will follow temporal patterns. Currently, researchers use either equal or unequal interval-based segmentation of time for mining mobile phone users’ behavior. Most of them take into account static temporal coverage of 24-h-a-day and few of them take into account the number of incidences in time-series data. However, such segmentations do not necessarily map to the patterns of individual user activity and subsequent behavior because they do not take into account the diverse behaviors of individuals over time-of-the-week. Therefore, we propose a behavior-oriented time segmentation (BOTS) technique that dynamically takes into account not only the temporal coverage of the week but also the number of incidences of diverse behaviors, producing similar behavioral time segments over the week from time-series data. Experiments on real mobile phone datasets show that our proposed segmentation technique better captures the user’s dominant behavior at various times-of-the-day and days-of-the-week, enabling the generation of high-confidence temporal rules for mining individual mobile phone users’ behavior.

Analyzing Machine Learning Workloads Using a Detailed GPU Simulator

Most deep neural networks deployed today are trained using GPUs via high-level frameworks such as TensorFlow and PyTorch. This paper describes changes we made to the GPGPU-Sim simulator to enable it to run PyTorch by running PTX kernels included in NVIDIA’s cuDNN library. We use the resulting modified simulator, which we plan to make available publicly with this paper, to study some simple deep learning workloads. With our changes to GPGPU-Sim’s functional simulation model, we find that GPGPU-Sim’s performance model running a cuDNN-enabled implementation of LeNet for MNIST reports results within 30% of real hardware. Using GPGPU-Sim’s AerialVision performance analysis tool, we observe that cuDNN API calls contain many varying phases and appear to include potentially inefficient microarchitecture behaviour such as DRAM partition bank camping, at least when executed on GPGPU-Sim’s current performance model.

State of the Art in Fair ML: From Moral Philosophy and Legislation to Fair Classifiers

Machine learning is becoming an ever-present part of our lives as many decisions, e.g. whether to grant credit, are no longer made by humans but by machine learning algorithms. However, those decisions are often unfair and discriminate against individuals belonging to protected groups based on race or gender. With the recent General Data Protection Regulation (GDPR) coming into effect, new awareness has been raised for such issues, and with computer scientists having such a large impact on people’s lives, it is necessary that actions are taken to discover and prevent discrimination. This work aims to give an introduction to discrimination, legislative foundations to counter it, and strategies to detect and prevent machine learning algorithms from showing such behavior.

An Interpretable Model for Scene Graph Generation

We propose an efficient and interpretable scene graph generator. We consider three types of features: visual, spatial and semantic, and we use a late fusion strategy such that each feature’s contribution can be explicitly investigated. We study the key factors about these features that have the most impact on the performance, and also visualize the learned visual features for relationships and investigate the efficacy of our model. We won the OpenImages Visual Relationship Detection Challenge on Kaggle, where we outperformed the 2nd place by 5% (20% relatively). We believe an accurate scene graph generator is a fundamental stepping stone for higher-level vision-language tasks such as image captioning and visual QA, since it provides a semantic, structured comprehension of an image that is beyond pixels and objects.

Learning from Multiview Correlations in Open-Domain Videos

An increasing number of datasets contain multiple views, such as video, sound and automatic captions. A basic challenge in representation learning is how to leverage multiple views to learn better representations. This is further complicated by the existence of a latent alignment between views, such as between speech and its transcription, and by the multitude of choices for the learning objective. We explore an advanced, correlation-based representation learning method on a 4-way parallel, multimodal dataset, and assess the quality of the learned representations on retrieval-based tasks. We show that the proposed approach produces rich representations that capture most of the information shared across views. Our best models for speech and textual modalities achieve retrieval rates from 70.7% to 96.9% on open-domain, user-generated instructional videos. This shows it is possible to learn reliable representations across disparate, unaligned and noisy modalities, and encourages using the proposed approach on larger datasets.

Estimation of Individual Treatment Effect in Latent Confounder Models via Adversarial Learning

Estimating the individual treatment effect (ITE) from observational data is essential in medicine. A central challenge in estimating the ITE is handling confounders, which are factors that affect both an intervention and its outcome. Most previous work relies on the unconfoundedness assumption, which posits that all the confounders are measured in the observational data. However, if there are unmeasurable (latent) confounders, then confounding bias is introduced. Fortunately, noisy proxies for the latent confounders are often available and can be used to make an unbiased estimate of the ITE. In this paper, we develop a novel adversarial learning framework to make unbiased estimates of the ITE using noisy proxies.

Smoothed functional average variance estimation for dimension reduction

We propose an estimation method, which we call functional average variance estimation (FAVE), for estimating the EDR space in a functional semiparametric regression model, based on kernel estimates of density and regression. Consistency results are then established for the estimator of the interest operator and for the directions of the EDR space. A simulation study is presented which shows that the proposed approach performs as well as traditional ones.

Improving Grey-Box Fuzzing by Modeling Program Behavior

Grey-box fuzzers such as American Fuzzy Lop (AFL) are popular tools for finding bugs and potential vulnerabilities in programs. While these fuzzers have been able to find vulnerabilities in many widely used programs, they are not efficient; of the millions of inputs executed by AFL in a typical fuzzing run, only a handful discover unseen behavior or trigger a crash. The remaining inputs are redundant, exhibiting behavior that has already been observed. Here, we present an approach to increase the efficiency of fuzzers like AFL by applying machine learning to directly model how programs behave. We learn a forward prediction model that maps program inputs to execution traces, training on the thousands of inputs collected during standard fuzzing. This learned model guides exploration by focusing on fuzzing inputs on which our model is the most uncertain (measured via the entropy of the predicted execution trace distribution). By focusing on executing inputs our learned model is unsure about, and ignoring any input whose behavior our model is certain about, we show that we can significantly limit wasteful execution. Through testing our approach on a set of binaries released as part of the DARPA Cyber Grand Challenge, we show that our approach is able to find a set of inputs that result in more code coverage and discovered crashes than baseline fuzzers with significantly fewer executions.
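
The selection rule described above (spend executions on the inputs whose predicted behavior is most uncertain) can be illustrated with a minimal sketch. It assumes a model that returns a discrete probability distribution over possible traces for each input; `toy_predict`, `select_uncertain` and the `budget` parameter are hypothetical names introduced here for illustration and are not taken from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_uncertain(candidates, predict, budget):
    """Keep the `budget` candidates with the highest predictive entropy.

    `predict` maps an input to a list of probabilities over possible
    execution traces (a stand-in for a learned forward model).
    """
    ranked = sorted(candidates, key=lambda x: entropy(predict(x)), reverse=True)
    return ranked[:budget]


def toy_predict(inp):
    """Hypothetical model: confidence depends only on the input length."""
    p = min(0.99, max(0.01, len(inp) / 10.0))
    return [p, 1.0 - p]


inputs = [b"a", b"abcde", b"abcdefghi"]
chosen = select_uncertain(inputs, toy_predict, budget=1)
```

In this toy setting only the input whose predicted distribution is closest to uniform is kept for execution; a real fuzzer would replace `toy_predict` with the trained model and execute the selected inputs.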

HyperAdam: A Learnable Task-Adaptive Adam for Network Training

Deep neural networks are traditionally trained using human-designed stochastic optimization algorithms, such as SGD and Adam. Recently, the approach of learning to optimize network parameters has emerged as a promising research topic. However, these learned black-box optimizers sometimes do not fully utilize the experience embodied in human-designed optimizers and therefore have limited generalization ability. In this paper, a new optimizer, dubbed HyperAdam, is proposed that combines the idea of ‘learning to optimize’ with the traditional Adam optimizer. Given a network for training, the parameter update generated by HyperAdam in each iteration is an adaptive combination of multiple updates generated by Adam with varying decay rates. The combination weights and decay rates in HyperAdam are adaptively learned depending on the task. HyperAdam is modeled as a recurrent neural network with AdamCell, WeightCell and StateCell. It is shown to be state-of-the-art for training various networks, such as multilayer perceptrons, CNNs and LSTMs.
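
The core idea of combining several Adam updates computed with different decay rates can be sketched as follows. This is only an illustration under strong assumptions: the combination weights and decay rates are fixed by hand here, whereas the abstract states that HyperAdam learns them with a recurrent network. The names `adam_direction` and `combined_step` are hypothetical.

```python
import math


def adam_direction(m, v, grad, beta1, beta2, t, eps=1e-8):
    """One Adam moment update; returns new moments and the bias-corrected direction."""
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    return m, v, m_hat / (math.sqrt(v_hat) + eps)


def combined_step(theta, grad, states, betas, weights, t, lr=0.1):
    """Update a scalar parameter with a weighted sum of Adam directions.

    `states[i]` holds the (m, v) moments of the i-th Adam variant and is
    updated in place. The weights are fixed here (an assumption).
    """
    update = 0.0
    for i, (beta1, beta2) in enumerate(betas):
        m, v = states[i]
        m, v, direction = adam_direction(m, v, grad, beta1, beta2, t)
        states[i] = (m, v)
        update += weights[i] * direction
    return theta - lr * update


# Toy problem: minimise f(theta) = (theta - 3)^2 starting from theta = 0.
betas = [(0.5, 0.9), (0.9, 0.999), (0.99, 0.9999)]
weights = [0.2, 0.5, 0.3]
states = [(0.0, 0.0)] * len(betas)
theta = 0.0
for t in range(1, 51):
    grad = 2.0 * (theta - 3.0)
    theta = combined_step(theta, grad, states, betas, weights, t)
```

Because the weights sum to one and every bias-corrected Adam direction has magnitude close to one on the first step, the very first update moves the parameter by roughly `lr` towards the minimum.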

Towards Robust Neural Networks with Lipschitz Continuity

Deep neural networks have shown remarkable performance across a wide range of vision-based tasks, particularly due to the availability of large-scale datasets for training and better architectures. However, data seen in the real world are often affected by distortions that are not accounted for by the training datasets. In this paper, we address the challenge of robustness and stability of neural networks and propose a general training method that can be used to make existing neural network architectures more robust and stable to input visual perturbations while using only available datasets for training. The proposed training method is convenient to use as it does not require data augmentation or changes in the network architecture. We provide a theoretical proof as well as empirical evidence for the efficiency of the proposed training method by performing experiments with existing neural network architectures, and we demonstrate that the same architecture performs better when trained with the proposed training method than when trained with a conventional training approach in the presence of noisy datasets.

An Introduction to Krylov Subspace Methods

Nowadays, many fields of study have to deal with large, sparse data matrices, and the most important issue is finding the inverse of these matrices. Thankfully, Krylov subspace methods can be used to solve these types of problems. However, it is difficult to understand the mathematical principles behind these methods. In the first part of the article, Krylov methods are discussed in detail, so that readers equipped with a basic knowledge of linear algebra should be able to understand these methods. In this part, the knowledge of Krylov methods is also applied in some examples of simple implementations of a commonly known Krylov method, GMRES. In the second part, the article discusses CG iteration, a widely known method which is very similar to Krylov methods. By comparing CG iteration and Krylov methods, readers can gain a better comprehension of Krylov methods based on CG iteration. In the third part of the article, preconditioners are discussed with the aim of improving the efficiency of Krylov methods. In addition, restarted GMRES is briefly introduced in this part as a way to reduce the space consumption of Krylov methods.
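
As a small illustration of the CG iteration mentioned above, the following sketch solves A x = b for a symmetric positive-definite matrix without ever forming the inverse of A. It is a textbook-style, unpreconditioned version written with plain Python lists; it is not taken from the article, and the function names are chosen here for illustration.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive-definite A, starting from x = 0."""
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n
    x = [0.0] * n
    r = list(b)          # residual b - A x for the zero initial guess
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        # New direction: residual plus a multiple of the previous direction.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(A, b)
```

For this 2-by-2 system the exact solution is x = (1/11, 7/11). Each iterate is built only from products of A with vectors, i.e. from the subspace spanned by b, Ab, A²b, and so on, which is what ties CG to the Krylov viewpoint.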

Fog Computing Architecture: Survey and Challenges

Emerging technologies that generate a huge amount of data, such as Internet of Things (IoT) services, need latency-aware computing platforms to support time-critical applications. Due to the on-demand services and scalability features of cloud computing, Big Data application processing is done in the cloud infrastructure. Managing Big Data applications exclusively in the cloud is not an efficient solution for latency-sensitive applications related to smart transportation systems, healthcare solutions, emergency response systems and content delivery applications. Thus, the Fog computing paradigm, which allows applications to perform computing operations in-between the cloud and the end devices, has emerged. In Fog architecture, IoT devices and sensors are connected to the Fog devices, which are located in close proximity to the users and are also responsible for intermediate computation and storage. Most computations will be done on the edge by eliminating full dependencies on the cloud resources. In this chapter, we investigate and survey Fog computing architectures which have been proposed over the past few years. Moreover, we study the requirements of IoT applications and platforms, and the limitations faced by cloud systems when executing IoT applications. Finally, we review current research works that particularly focus on Big Data application execution on Fog and address several open challenges as well as future research directions.

PSICA: decision trees for probabilistic subgroup identification with categorical treatments

Personalized medicine aims at identifying the best treatments for a patient with given characteristics. It has been shown in the literature that these methods can lead to great improvements in medicine compared to traditional methods prescribing the same treatment to all patients. Subgroup identification is a branch of personalized medicine which aims at finding subgroups of patients with similar characteristics for which some of the investigated treatments have a better effect than the other treatments. A number of approaches based on decision trees have been proposed to identify such subgroups, but most of them focus on two-arm trials (control/treatment) while a few methods consider quantitative treatments (defined by the dose). However, no subgroup identification method exists that can predict the best treatments in a scenario with a categorical set of treatments. We propose a novel method for subgroup identification in categorical treatment scenarios. This method outputs a decision tree showing the probabilities of a given treatment being the best for a given group of patients as well as labels showing the possible best treatments. The method is implemented in an R package, psica, available at CRAN. In addition to numerical simulations based on artificial data, we present an analysis of a community-based nutrition intervention trial that justifies the validity of our method.

Learning in the Absence of Training Data — a Galactic Application

There are multiple real-world problems in which training data is unavailable, and still, the ambition is to learn values of the system parameters, at which test data on an observable is realised, subsequent to the learning of the functional relationship between these variables. We present a novel Bayesian method to deal with such a problem, in which we learn a system function of a stationary dynamical system, for which only test data on a vector-valued observable is available, and training data is unavailable. This exercise borrows heavily from the state space probability density function (pdf), which we also learn. As there is no training data available for either sought function, we cannot learn its correlation structure, and instead perform inference (using Metropolis-within-Gibbs) on the discretised form of the sought system function and of the pdf, where this pdf is constructed such that the unknown system parameters are embedded within its support. The likelihood of the unknowns given the available data is defined in terms of such a pdf. We make an application to the learning of the density of all gravitational matter in a real galaxy.

The Statistical Dictionary-based String Matching Problem

In the Dictionary-based String Matching (DSM) problem, a retrieval system has access to a source sequence and stores the position of a certain number of strings in a posting table. When a user inquires about the position of a string, the retrieval system, instead of searching in the source sequence directly, relies on the posting table to answer the query more efficiently. In this paper, the Statistical DSM problem is proposed as a statistical and information-theoretic formulation of the classic DSM problem in which both the source and the query have a statistical description while the strings stored in the posting sequence are described as a code. Through this formulation, we are able to define the efficiency of the retrieval system as the average cost of answering a user’s query in the limit of a sufficiently long source sequence. This formulation is used to study the retrieval performance for the cases in which the stored strings are (i) all the strings of a given length, referred to as k-grams, and (ii) prefix-free codes.
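
To make the role of the posting table concrete, here is a minimal sketch of the k-gram case: the table stores the positions of every length-k substring of the source, and a query is answered by looking up its first k-gram and verifying only those candidate positions. The function names and the fallback for short queries are assumptions made for this illustration and do not come from the paper.

```python
from collections import defaultdict


def build_posting_table(source, k):
    """Map every k-gram of `source` to the sorted list of its start positions."""
    table = defaultdict(list)
    for i in range(len(source) - k + 1):
        table[source[i:i + k]].append(i)
    return dict(table)


def find(query, source, table, k):
    """Return all start positions of `query` in `source`.

    Queries of length >= k are answered by checking only the positions
    listed for their first k-gram; shorter queries fall back to a scan.
    """
    if len(query) < k:
        return [i for i in range(len(source) - len(query) + 1)
                if source.startswith(query, i)]
    candidates = table.get(query[:k], [])
    return [i for i in candidates if source.startswith(query, i)]


source = "abracadabra"
table = build_posting_table(source, 2)
positions = find("abra", source, table, 2)
```

In this example the lookup for "abra" inspects only the two positions recorded for the 2-gram "ab" rather than scanning the whole source.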

AutoSense Model for Word Sense Induction

Word sense induction (WSI), or the task of automatically discovering multiple senses or meanings of a word, has three main challenges: domain adaptability, novel sense detection, and sense granularity flexibility. While current latent variable models are known to solve the first two challenges, they are not flexible to different word sense granularities, which differ very much among words, from aardvark with one sense, to play with over 50 senses. Current models require either hyperparameter tuning or nonparametric induction of the number of senses, both of which we find to be ineffective. Thus, we aim to eliminate these requirements and solve the sense granularity problem by proposing AutoSense, a latent variable model based on two observations: (1) senses are represented as a distribution over topics, and (2) senses generate pairings between the target word and its neighboring word. These observations alleviate the problem by (a) discarding garbage senses and (b) additionally inducing fine-grained word senses. Results show great improvements over the state-of-the-art models on popular WSI datasets. We also show that AutoSense is able to learn the appropriate sense granularity of a word. Finally, we apply AutoSense to the unsupervised author name disambiguation task, where the sense granularity problem is more evident, and show that AutoSense is evidently better than competing models. We share our data and code here: https://…/AutoSense.

Data Context Informed Data Wrangling

The process of preparing potentially large and complex data sets for further analysis or manual examination is often called data wrangling. In classical warehousing environments, the steps in such a process have been carried out using Extract-Transform-Load platforms, with significant manual involvement in specifying, configuring or tuning many of them. Cost-effective data wrangling processes need to ensure that data wrangling steps benefit from automation wherever possible. In this paper, we define a methodology to fully automate an end-to-end data wrangling process incorporating data context, which associates portions of a target schema with potentially spurious extensional data of types that are commonly available. Instance-based evidence together with data profiling paves the way to inform automation in several steps within the wrangling process, specifically, matching, mapping validation, value format transformation, and data repair. The approach is evaluated with real estate data showing substantial improvements in the results of automated wrangling.

Feature Selection for Survival Analysis with Competing Risks using Deep Learning

Deep learning models for survival analysis have gained significant attention in the literature, but they suffer from severe performance deficits when the dataset contains many irrelevant features. We give empirical evidence for this problem in real-world medical settings using the state-of-the-art model DeepHit. Furthermore, we develop methods to improve the deep learning model through novel approaches to feature selection in survival analysis. We propose filter methods for hard feature selection and a neural network architecture that weights features for soft feature selection. Our experiments on two real-world medical datasets demonstrate that substantial performance improvements against the original models are achievable.

Structured Pruning of Neural Networks with Budget-Aware Regularization

Pruning methods have been shown to be effective at reducing the size of deep neural networks while keeping accuracy almost intact. Among the most effective methods are those that prune a network while training it with a sparsity prior loss and learnable dropout parameters. A shortcoming of these approaches, however, is that neither the size nor the inference speed of the pruned network can be controlled directly; yet this is a key feature for targeting deployment of CNNs on low-power hardware. To overcome this, we introduce a budgeted regularized pruning framework for deep convolutional neural networks. Our approach naturally fits into traditional neural network training as it consists of a learnable masking layer, a novel budget-aware objective function, and the use of knowledge distillation. We also provide insights on how to prune a residual network and how this can lead to new architectures. Experimental results reveal that CNNs pruned with our method are more accurate and less compute-hungry than those obtained with state-of-the-art methods. Also, our approach is more effective at preventing accuracy collapse in case of severe pruning; this allows us to attain pruning factors up to 16x without significantly affecting the accuracy.

Multivariate Ensemble Forecast Framework for Demand Prediction of Anomalous Days

An accurate load forecast is always important for the power industry and energy players as it enables stakeholders to make critical decisions. In addition, its importance is further increased with growing uncertainties in the generation sector due to the high penetration of renewable energy and the introduction of demand side management strategies. An incremental improvement in grid-level demand forecast of anomalous days can potentially save millions of dollars. However, due to an increasing penetration of renewable energy resources and their dependency on several meteorological and exogenous variables, accurate load forecasting of anomalous days has now become very challenging. To improve the prediction accuracy of load forecasting, an ensemble forecast framework (ENFF) is proposed with a systematic combination of three predictors, namely Elman neural network (ELM), feedforward neural network (FNN) and radial basis function (RBF) neural network. These predictors are trained using global particle swarm optimization (GPSO) to improve their prediction capability in the ENFF. The outputs of the individual predictors are combined using a trim aggregation technique that removes forecasting anomalies. Real recorded data from the New England ISO grid is used for training and testing of the ENFF for anomalous days. The forecast results of the proposed ENFF indicate a significant improvement in prediction accuracy in comparison to autoregressive integrated moving average (ARIMA) and back-propagation neural network (BPNN) based benchmark models.
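
The aggregation step can be illustrated with a trimmed mean, which discards the most extreme predictor outputs at each time step before averaging. This is an assumption about what "trim aggregation" looks like in its simplest form; the abstract does not specify the exact trimming rule, and the forecast values below are made up for the example.

```python
def trimmed_mean(values, trim=1):
    """Average `values` after dropping the `trim` smallest and `trim` largest."""
    if len(values) <= 2 * trim:
        raise ValueError("need more than 2 * trim values")
    kept = sorted(values)[trim:len(values) - trim]
    return sum(kept) / len(kept)


def aggregate(forecasts, trim=1):
    """Combine per-predictor forecast series step by step with a trimmed mean.

    `forecasts` is a list of equally long series, one per predictor.
    """
    return [trimmed_mean(list(step), trim) for step in zip(*forecasts)]


# Hypothetical load forecasts (arbitrary units) from three predictors.
elman = [100.0, 105.0, 250.0]   # last value is an obvious outlier
fnn = [102.0, 104.0, 110.0]
rbf = [98.0, 140.0, 108.0]      # middle value is an outlier
combined = aggregate([elman, fnn, rbf])
```

With three predictors and `trim=1` the trimmed mean reduces to the median, so a single anomalous forecast at any time step cannot move the combined value.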

Protecting User Privacy: An Approach for Untraceable Web Browsing History and Unambiguous User Profiles

The overturning of the Internet Privacy Rules by the Federal Communications Commission (FCC) in late March 2017 allows Internet Service Providers (ISPs) to collect, share and sell their customers’ Web browsing data without their consent. With third-party trackers embedded on Web pages, this new rule has put user privacy at greater risk. The need arises for users to protect their Web browsing history from any potential adversaries on their own. Although some available solutions such as Tor, VPN, and HTTPS can help users conceal their online activities, their use can also significantly hamper personalized online services, i.e., degrade utility. In this paper, we design an effective Web browsing history anonymization scheme, PBooster, aiming to protect users’ privacy while retaining the utility of their Web browsing history. The proposed model pollutes users’ Web browsing history by automatically inferring how many and what links should be added to the history while addressing the utility-privacy trade-off challenge. We conduct experiments to validate the quality of the manipulated Web browsing history and examine the robustness of the proposed approach for user privacy protection.

Learning Grouped Convolution for Efficient Domain Adaptation

This paper presents Dokei, an effective supervised domain adaptation method to transform a pre-trained CNN model to one involving efficient grouped convolution. The basis of this approach is formalised as a novel optimisation problem constrained by group sparsity pattern (GSP), and a practical solution based on structured regularisation and maximal bipartite matching is provided. We show that it is vital to keep the connections specified by GSP when mapping pre-trained weights to grouped convolution. We evaluate Dokei on various domains and hardware platforms to demonstrate its effectiveness. The models resulting from Dokei are shown to be more accurate and slimmer than prior work targeting grouped convolution, and more regular and easier to deploy than other pruning techniques.

Revisiting Pre-training: An Efficient Training Method for Image Classification

The training method of repetitively feeding all samples into a pre-defined network for image classification has been widely adopted by current state-of-the-art methods. In this work, we provide a new method which can be leveraged to train classification networks in a more efficient way. Starting with a warm-up step, we propose to continually repeat a Drop-and-Pick (DaP) learning strategy. In particular, we drop easy samples to encourage the network to focus on studying hard ones. Meanwhile, by picking up all samples periodically during training, we aim to recall the memory of the networks to prevent catastrophic forgetting of previously learned knowledge. Our DaP learning method can recover 99.88%, 99.60%, and 99.83% top-1 accuracy on ImageNet for ResNet-50, DenseNet-121, and MobileNet-V1, respectively, but requires only 75% of the computation in training compared to the classic training schedule. Furthermore, our pre-trained models are equipped with strong knowledge transferability when used for downstream tasks, especially for hard cases. Extensive experiments on object detection, instance segmentation and pose estimation demonstrate the effectiveness of our DaP training method.
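
The alternation between dropping easy samples and periodically picking all samples up again can be sketched as a per-epoch sample selector. Everything specific in this sketch is an assumption made for illustration: using the per-sample loss as the measure of easiness, the warm-up length, the pick-up period and the drop fraction are not taken from the paper.

```python
def select_samples(losses, epoch, warmup=2, period=3, drop_fraction=0.25):
    """Return the indices of the samples to train on in the given epoch.

    During warm-up, and on every `period`-th epoch afterwards, all samples
    are used ("pick"). Otherwise the `drop_fraction` of samples with the
    lowest loss, treated here as the easy ones, is skipped ("drop").
    """
    n = len(losses)
    all_indices = list(range(n))
    pick_epoch = (epoch - warmup) % period == period - 1
    if epoch < warmup or pick_epoch:
        return all_indices
    n_drop = int(n * drop_fraction)
    by_loss = sorted(all_indices, key=lambda i: losses[i])
    return sorted(by_loss[n_drop:])


# Hypothetical per-sample losses from the previous epoch.
losses = [0.1, 2.0, 0.05, 1.5]
schedule = {epoch: select_samples(losses, epoch) for epoch in range(5)}
```

In this toy schedule, epochs 0 and 1 are warm-up, epochs 2 and 3 skip the lowest-loss sample, and epoch 4 revisits the full set.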

On the Importance of Strong Baselines in Bayesian Deep Learning

Like all sub-fields of machine learning, Bayesian Deep Learning is driven by empirical validation of its theoretical proposals. Given the many aspects of an experiment, it is always possible that minor or even major experimental flaws can slip by both authors and reviewers. One of the most popular experiments used to evaluate approximate inference techniques is the regression experiment on UCI datasets. However, in this experiment, models which have been trained to convergence have often been compared with baselines trained only for a fixed number of iterations. What we find is that if we take a well-established baseline and evaluate it under the same experimental settings, it shows significant improvements in performance. In fact, it outperforms or performs competitively with several methods that, when they were introduced, claimed to be superior to the very same baseline method. Hence, by exposing this flaw in experimental procedure, we highlight the importance of using identical experimental setups to evaluate, compare and benchmark methods in Bayesian Deep Learning.

Explicit Interaction Model towards Text Classification

Text classification is one of the fundamental tasks in natural language processing. Recently, deep neural networks have achieved promising performance in the text classification task compared to shallow models. Despite the significance of deep models, they ignore the fine-grained classification clues (matching signals between words and classes) since their classifications mainly rely on text-level representations. To address this problem, we introduce the interaction mechanism to incorporate word-level matching signals into the text classification task. In particular, we design a novel framework, EXplicit interAction Model (dubbed EXAM), equipped with the interaction mechanism. We evaluate the proposed approach on several benchmark datasets including both multi-label and multi-class text classification tasks. Extensive experimental results demonstrate the superiority of the proposed method. As a byproduct, we have released the code and parameter settings to facilitate other research.

Learning Multiple Defaults for Machine Learning Algorithms

The performance of modern machine learning methods highly depends on their hyperparameter configurations. One simple way of selecting a configuration is to use default settings, often proposed along with the publication and implementation of a new algorithm. Those default values are usually chosen in an ad-hoc manner to work well enough on a wide variety of datasets. To address this problem, different automatic hyperparameter configuration algorithms have been proposed, which select an optimal configuration per dataset. This principled approach usually improves performance, but adds algorithmic complexity and computational costs to the training procedure. As an alternative to this, we propose learning a set of complementary default values from a large database of prior empirical results. Selecting an appropriate configuration on a new dataset then requires only a simple, efficient and embarrassingly parallel search over this set. We demonstrate the effectiveness and efficiency of the approach we propose in comparison to random search and Bayesian Optimization.
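
One way to picture "a set of complementary defaults" is a greedy selection over a table of prior results: each new default is the configuration that most improves the best score already achievable on every previous dataset. The greedy rule, the function names and the score table below are assumptions made for this sketch; the abstract does not state how the set is actually learned.

```python
def greedy_defaults(perf, k):
    """Greedily pick up to `k` complementary configurations.

    `perf[config]` is a list of scores (higher is better), one per prior
    dataset. Each step adds the configuration that maximises the sum over
    datasets of the best score reachable with the chosen set.
    """
    n_datasets = len(next(iter(perf.values())))
    chosen = []
    best = [float("-inf")] * n_datasets
    for _ in range(min(k, len(perf))):
        def total_if_added(config):
            return sum(max(b, s) for b, s in zip(best, perf[config]))

        candidate = max((c for c in perf if c not in chosen), key=total_if_added)
        chosen.append(candidate)
        best = [max(b, s) for b, s in zip(best, perf[candidate])]
    return chosen


def pick_for_new_dataset(defaults, evaluate):
    """Evaluate every default on the new dataset and return the best one."""
    return max(defaults, key=evaluate)


# Hypothetical prior results: three configurations on three datasets.
perf = {
    "A": [0.90, 0.50, 0.50],
    "B": [0.80, 0.80, 0.60],
    "C": [0.50, 0.60, 0.95],
}
defaults = greedy_defaults(perf, 2)
```

Here the all-round configuration is chosen first and the specialist that covers its weakest dataset second. The evaluations inside `pick_for_new_dataset` are independent of each other, which is what makes the search over the set embarrassingly parallel.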

Joint Neural Architecture Search and Quantization

Designing neural architectures is a fundamental step in deep learning applications. As a partner technique, model compression of neural networks has been widely investigated to meet the need to run deep learning algorithms with the limited computational resources of mobile devices. Currently, both architecture design and model compression require expert tricks and tedious trials. In this paper, we integrate these two tasks into one unified framework, which enables the joint search of architectures and quantization (compression) policies for neural networks. This method is named JASQ. Our goal is to automatically find a compact neural network model with high performance that is suitable for mobile devices. Technically, a multi-objective evolutionary search algorithm is introduced to search for models that balance model size and accuracy. In experiments, we find that our approach outperforms the methods that search only for architectures or only for quantization policies. 1) Specifically, given existing networks, our approach can provide them with learning-based quantization policies, and outperforms their 2-bit, 4-bit, 8-bit, and 16-bit counterparts. It can yield higher accuracies than the float models, for example, over 1.02% higher accuracy on MobileNet-v1. 2) What is more, under the balance between model size and accuracy, two models are obtained with the joint search of architectures and quantization policies: a high-accuracy model, JASQNet, and a small model, JASQNet-Small, which achieves a 2.97% error rate with 0.9 MB on CIFAR-10.

Privacy-preserving Transfer Learning for Knowledge Sharing

In many practical machine-learning applications, it is critical to allow knowledge to be transferred from external domains while preserving user privacy. Unfortunately, existing transfer-learning works do not have a privacy guarantee. In this paper, for the first time, we propose a method that can simultaneously transfer knowledge from external datasets while offering an $\epsilon$-differential privacy guarantee. First, we show that a simple combination of hypothesis transfer learning and privacy-preserving logistic regression can address the problem. However, the performance of this approach can be poor as the sample size in the target domain may be small. To address this problem, we propose a new method which splits the feature set in source and target data into several subsets, and trains models on these subsets before finally aggregating the predictions by stacked generalization. Feature importance can also be incorporated into the proposed method to further improve performance. We prove that the proposed method has an $\epsilon$-differential privacy guarantee, and further analysis shows that its performance is better than that of the above simple combination given the same privacy budget. Finally, experiments on MNIST and real-world RUIJIN datasets show that our proposed method achieves state-of-the-art performance.
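
For reference, the standard definition of the privacy guarantee mentioned above can be written as follows. The symbols $\mathcal{M}$ (a randomized mechanism), $D, D'$ (datasets differing in one record) and $S$ (a set of outputs) are notation introduced here, not taken from the abstract.

```latex
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\epsilon}\, \Pr[\mathcal{M}(D') \in S]
\quad \text{for all neighboring datasets } D, D' \text{ and all measurable } S .
```

A smaller $\epsilon$ means the output distribution changes less when a single record changes, which is why $\epsilon$ is often called the privacy budget.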

The Error is the Feature: how to Forecast Lightning using a Model Prediction Error

Despite the progress throughout the last decades, weather forecasting is still a challenging and computationally expensive task. Most models currently operated by meteorological services around the world rely on numerical weather prediction, a system based on mathematical algorithms describing physical effects. Recent progress in artificial intelligence, however, demonstrates that machine learning can be successfully applied to many research fields, especially areas dealing with big data that can be used for training. Current approaches to predicting thunderstorms often focus on indices describing temperature differences in the atmosphere. If these indices reach a critical threshold, the forecast system emits a thunderstorm warning. Other meteorological systems such as radar and lightning detection systems are added for a more precise prediction. This paper describes a new approach to the prediction of lightning based on machine learning rather than complex numerical computations. The error of optical flow algorithms applied to images from meteorological satellites is interpreted as a sign of convection potentially leading to thunderstorms. These results are used as the basis for the feature generation phase, which incorporates different convolution steps. Tree classifier models are then trained to predict lightning within the next few hours (called nowcasting) based on these features. The evaluation section compares the predictive power of the different models and the impact of different features on the classification result.

Entropy and expansion

Shearer’s inequality bounds the sum of joint entropies of random variables in terms of the total joint entropy. We give another lower bound for the same sum in terms of the individual entropies when the variables are functions of independent random seeds. The inequality involves a constant characterizing the expansion properties of the system. Our results generalize to entropy inequalities used in recent work in invariant settings, including the edge-vertex inequality for factor-of-IID processes, Bowen’s entropy inequalities, and Bollobás’s entropy bounds in random regular graphs. The proof method yields inequalities for other measures of randomness, including covariance. As an application, we give upper bounds for independent sets in both finite and infinite graphs.
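
For orientation, the classical form of Shearer’s inequality referred to above can be stated as follows. The notation is introduced here and is not from the abstract: $X_1, \dots, X_n$ are random variables, $\mathcal{F}$ is a family of subsets of $\{1, \dots, n\}$ in which every index lies in at least $t$ members, and $H$ denotes Shannon entropy.

```latex
\sum_{S \in \mathcal{F}} H(X_S) \;\ge\; t \, H(X_1, \dots, X_n),
\qquad X_S = (X_i)_{i \in S}.
```

The abstract’s new bound concerns the same left-hand sum, but compares it with the individual entropies $H(X_i)$ rather than the total joint entropy.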

Note on universal algorithms for learning theory

We propose a general way of studying the universal estimator for the regression problem in learning theory considered in ‘Universal algorithms for learning theory Part I: piecewise constant functions’ and ‘Universal algorithms for learning theory Part II: piecewise constant functions’ written by Binev, P., Cohen, A., Dahmen, W., DeVore, R., Temlyakov, V. This new approach allows us to improve the results.

A Study of Language and Classifier-independent Feature Analysis for Vocal Emotion Recognition
Model-independent femtoscopic Levy imaging for elastic proton-proton scattering
Dispersing obnoxious facilities on a graph
Robust Active Learning for Electrocardiographic Signal Classification
A Comparative Study of Quality and Content-Based Spatial Pooling Strategies in Image Quality Assessment
MAC: Mining Activity Concepts for Language-based Temporal Localization
Instability Effect of PLL on Voltage-Source Converters during Grid Faults: Large-Signal Modeling and Design-Oriented Analysis
Generating Adaptive and Robust Filter Sets Using an Unsupervised Learning Framework
Self-Adversarially Learned Bayesian Sampling
On the Bound of Inverse Images of a Polynomial Map
Acceleration of Primal-Dual Methods by Preconditioning and Fixed Number of Inner Loops
Pneumonia Detection in Chest Radiographs
Performance Analysis of Analog Intermittently Nonlinear Filter in the Presence of Impulsive Noise
Nonlinearity Mitigation in WDM Systems: Models, Strategies, and Achievable Rates
MS-UNIQUE: Multi-model and Sharpness-weighted Unsupervised Image Quality Estimation
Integrating Task-Motion Planning with Reinforcement Learning for Robust Decision Making in Mobile Robots
On the almost sure central limit theorem for ARX processes in adaptive tracking
Multivariate Forecasting of Crude Oil Spot Prices using Neural Networks
Low-Resolution Face Recognition
Spread Divergences
A logarithmic bound for the chromatic number of the associahedron
The Price of Uncertain Priors in Source Coding
An Efficient Approach to Informative Feature Extraction from Multimodal Data
Polarity Loss for Zero-shot Object Detection
Risk Identification of Power Transmission System with Renewable Energy
On the Influence of Initial Qubit Placement During NISQ Circuit Compilation
Supervised Fitting of Geometric Primitives to 3D Point Clouds
Markov Chain Block Coordinate Descent
Separation Dimension and Degree
Robust Myopic Control for Systems with Imperfect Observations
Introducing Transformer Degradation in Distribution Locational Marginal Prices
Universal Approximation by a Slim Network with Sparse Shortcut Connections
Locational Marginal Value of Distributed Energy Resources as Non-Wires Alternatives
Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective
Multi-View Inpainting for RGB-D Sequence
An Off-policy Policy Gradient Theorem Using Emphatic Weightings
Proceedings 4th Workshop on Formal Integrated Development Environment
A Census of Small Transitive Groups and Vertex-Transitive Graphs
Penalized least squares approximation methods and their applications to stochastic processes
Fundamental Limits on Measuring the Rotational Constraint of Single Molecules using Fluorescence Microscopy
Scalable Label Propagation Algorithms for Heterogeneous Networks
Joint Face Hallucination and Deblurring via Structure Generation and Detail Enhancement
Distorting Neural Representations to Generate Highly Transferable Adversarial Examples
Three-dimensional Optical Coherence Tomography Image Denoising via Multi-input Fully-Convolutional Networks
Bandits with Temporal Stochastic Constraints
Approximate Multi-Matroid Intersection via Iterative Refinement
Data Augmentation using Random Image Cropping and Patching for Deep CNNs
KekuleScope: improved prediction of cancer cell line sensitivity using convolutional neural networks trained on compound images
Large deviations for local mass of branching Brownian motion
Super Diffusion for Salient Object Detection
Covering radius in the Hamming permutation space
Detecting Adversarial Perturbations Through Spatial Behavior in Activation Spaces
Tight Approximation for Unconstrained XOS Maximization
Enhanced Expressive Power and Fast Training of Neural Networks by Random Projections
Mask R-CNN with Pyramid Attention Network for Scene Text Detection
Response monitoring of breast cancer on DCE-MRI using convolutional neural network-generated seed points and constrained volume growing
Random knots in three-dimensional three-colour percolation: numerical results and conjectures
Online Collective Animal Movement Activity Recognition
Reduction-based exact solution of prize-collecting Steiner tree problems
On the use of supervised clustering in stochastic NMPC design
Feature-based groupwise registration of historical aerial images to present-day ortho-photo maps
Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning
Kleene stars of the plane, polylogarithms and symmetries
Rates in almost sure invariance principle for quickly mixing dynamical systems
Conditioning Optimization of Extreme Learning Machine by Multitask Beetle Antennae Swarm Algorithm
Failure of complex systems, cascading disasters, and the onset of disease
$k$-Sample problem based on generalized maximum mean discrepancy
Obstacle Avoidance Problem for Second Degree Nonholonomic Systems
Utilizing Dynamic Properties of Sharing Bits and Registers to Estimate User Cardinalities over Time
Driver Behavior Recognition via Interwoven Deep Convolutional Neural Nets with Multi-stream Inputs
BRDF Estimation of Complex Materials with Nested Learning
Uncalibrated Non-Rigid Factorisation by Independent Subspace Analysis
IEGAN: Multi-purpose Perceptual Quality Image Enhancement Using Generative Adversarial Network
REPT: A Streaming Algorithm of Approximating Global and Local Triangle Counts in Parallel
Construction of optimal locally recoverable codes and connection with hypergraph
Verifying C11 Programs Operationally
New estimates on the regularity of the pressure in density-constrained Mean Field Games
MGANet: A Robust Model for Quality Enhancement of Compressed Video
On the chromatic number of disjointness graphs of curves
Generalized Range Moves
Dual Reweighted Lp-Norm Minimization for Salt-and-pepper Noise Removal
Distributed Compression of Correlated Classical-Quantum Sources or: The Price of Ignorance
Object-oriented Targets for Visual Navigation using Rich Semantic Representations
Pilot-Aided Joint-Channel Carrier-Phase Estimation in Space-Division Multiplexed Multicore Fiber Transmission
Ergodicity analysis and antithetic integral control of a class of stochastic reaction networks with delays
Self Paced Adversarial Training for Multimodal Few-shot Learning
An embedded–hybridized discontinuous Galerkin finite element method for the Stokes equations
Using External Archive for Improved Performance in Multi-Objective Optimization
Kane-Fisher weak link physics in the clean scratched-XY model
Partition Recurrences
Sprinkling a few random edges doubles the power
Jacobi Fields in Optimal Control II: One-dimensional Variations
Quenched asymptotics for interacting diffusions on inhomogeneous random graphs
Verification of Planning Domain Models – Revisited
Better Bounds for Online Line Chasing
Copy the Old or Paint Anew? An Adversarial Framework for (non-) Parametric Image Stylization
Impedance-Based Stability Analysis for Interconnected Converter Systems with Open-Loop RHP Poles
Should we adjust for pupil background in school value-added models? A study of Progress 8 and school accountability in England
FAIM — A ConvNet Method for Unsupervised 3D Medical Image Registration
Automatic L3 slice detection in 3D CT images using fully-convolutional networks
TGANv2: Efficient Training of Large Models for Video Generation with Multiple Subsampling Layers
Oversight of Unsafe Systems via Dynamic Safety Envelopes
Distributed Gradient Descent with Coded Partial Gradient Computations
Recommending Users: Whom to Follow on Federated Social Networks
Strength in Numbers: Trading-off Robustness and Computation via Adversarially-Trained Ensembles
Image Quality Assessment and Color Difference
Second-Order Agents on Ring Digraphs
Fault Detection Using Color Blending and Color Transformations
Bayesian Alternatives to the Black-Litterman Model
Parametric Noise Injection: Trainable Randomness to Improve Deep Neural Network Robustness against Adversarial Attack
Solving Chance Constrained Optimization under Non-Parametric Uncertainty Through Hilbert Space Embedding
Estimation of Ornstein-Uhlenbeck Process Using Ultra-High-Frequency Data with Application to Intraday Pairs Trading Strategy
Applying FISTA to optimization problems (with or) without minimizers
How to find MH370?
On Profitability of Trailing Mining
A formula for the partition function that ‘counts’
An Ensemble Framework For Day-Ahead Forecast of PV Output Power in Smart Grids
Deep Neural Network Aided Scenario Identification in Wireless Multi-path Fading Channels
Predicting Diabetes Disease Evolution Using Financial Records and Recurrent Neural Networks
Unsupervised Word Discovery with Segmental Neural Language Models
A Sufficient Condition for Convergences of Adam and RMSProp
PRIN: Pointwise Rotation-Invariant Network
Words Can Shift: Dynamically Adjusting Word Representations Using Nonverbal Behaviors
Learning pronunciation from a foreign language in speech synthesis networks
Signal-Anticipating in Local Voltage Control in Distribution Systems
Fine Grained Classification of Personal Data Entities
High-Dimensional Robust Mean Estimation in Nearly-Linear Time
Online Learning for Network Constrained Demand Response Pricing in Distribution Systems
Kinetic Methods for Inverse Problems
Temporally Coherent GANs for Video Super-Resolution (TecoGAN)
Angles of the Gaussian simplex
Optimal Scheduling of Multi-Energy Systems with Flexible Electrical and Thermal Loads
MVPNet: Multi-View Point Regression Networks for 3D Object Reconstruction from A Single Image
Your Rugby Mates Don’t Need to Know your Colleagues: Triadic Closure with Edge Colors
Feedback based Mobility Control Algorithm for Maximizing Node Coverage by Drone Base Stations
Natural language understanding for task oriented dialog in the biomedical domain in a low resources context
On the Number of Real Zeros of Random Fewnomials
What is known about Vertex Cover Kernelization?
Phase I dose-escalation trials with more than one dosing regimen
Backdoor Decomposable Monotone Circuits and their Propagation Complete Encodings
A weight-bounded importance sampling method for variance reduction
An Evaluation of Coverage, Quality and Efficiency of Regional Health Services between 2010 and 2013
Semivariogram Hyper-Parameter Estimation for Whittle-Matérn Priors in Bayesian Inverse Problems
Rank-frequency distribution of natural languages: a difference of probabilities approach
Extraction of Azimuthal Asymmetries using Optimal Observables
Quantifying Filter Bubbles: Analyzing Surprise in Elections
Fast Object Class Labelling via Speech
Parallel sequential Monte Carlo for stochastic optimization
Defect Detection from UAV Images based on Region-Based CNNs
A class of linear codes with few weights
Kac-Lévy processes
LSD$_2$ – Joint Denoising and Deblurring of Short and Long Exposure Images with Convolutional Neural Networks
On finite-dimensional set-inclusive constraint systems: local analysis and related optimality conditions
Darcy law for yield stress fluid
MURAUER: Mapping Unlabeled Real Data for Label AUstERity
Large deviations for geodesic random walks
Monopulse beam synthesis using a sparse single-layer of weights
Generalized Pareto Copulas: A Key to Multivariate Extremes
Global solutions and random dynamical systems for rough evolution equations
Complementary Segmentation of Primary Video Objects with Reversible Flows
Sum-Rate Capacity for Symmetric Gaussian Multiple Access Channels with Feedback
Competency Questions and SPARQL-OWL Queries Dataset and Analysis
On three domination numbers in block graphs
A Game Model of Search and Pursuit
High Dimensional Classification through $\ell_0$-Penalized Empirical Risk Minimization
Quantum Control at the Boundary
Multifidelity Approximate Bayesian Computation
Fixation probabilities for the Moran process with three or more strategies: general and coupling results
Learning Attractor Dynamics for Generative Memory
Hyperdimensional Computing Nanosystem
Regret bounds for meta Bayesian optimization with an unknown Gaussian process prior
An Adaptive Approach for Automated Grapevine Phenotyping using VGG-based Convolutional Neural Networks
Contributions to Biclustering of Microarray Data Using Formal Concept Analysis
Do GAN Loss Functions Really Matter?
Kernel-Based Training of Generative Networks
A lower bound for online rectangle packing
A Hierarchical Neural Network for Sequence-to-Sequences Learning
An alternative approach to heavy-traffic limits for finite-pool queues
About the k-Error Linear Complexity over $\mathbb{F}_p$ of sequences of length 2$p$ with optimal three-level autocorrelation
Synchronization in time-varying random networks with vanishing connectivity
NEP-PACK: A Julia package for nonlinear eigenproblems – v0.2
Spectral Multigraph Networks for Discovering and Fusing Relationships in Molecules
Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses
Model-Based Reinforcement Learning for Sepsis Treatment