Prediction with Unpredictable Feature Evolution (PUFE)
The feature space can change or evolve when learning with streaming data, and several recent works have studied feature evolvable learning. They usually assume that features do not vanish or appear in an arbitrary way. For example, if the lifespan of the batteries powering a set of sensors is known, the old features and the new features, represented by the data those sensors gather, disappear and emerge at the same time as the sensors are exchanged simultaneously. In practice, however, different sensors have different lifespans, so the feature evolution can be unpredictable. In this paper, we propose a novel paradigm: Prediction with Unpredictable Feature Evolution (PUFE). We first complete the unpredictable overlapping period into an organized matrix and give a theoretical bound on the least number of observed entries required. We then learn a mapping from the completed matrix so that, when observing data from the new feature space, we can recover the corresponding data in the old feature space. By making predictions on the recovered data, our model exploits the advantage of the old feature space and is always comparable with any combination of the predictions on the current instance. Experiments on synthetic and real datasets validate the effectiveness of our method. …
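The core idea of the overlapping period can be illustrated with a minimal sketch. All names and shapes below are hypothetical, and the linear least-squares mapping stands in for whatever model the paper actually learns from the completed matrix: while instances are observed in both feature spaces, a mapping from new features to old features can be fitted, and after the old sensors vanish, old-space data can be recovered from new-space instances.

```python
import numpy as np

# Hypothetical overlap data: n instances observed in both feature spaces.
rng = np.random.default_rng(0)
n, d_old, d_new = 100, 5, 8
X_new = rng.standard_normal((n, d_new))
M_true = rng.standard_normal((d_new, d_old))
X_old = X_new @ M_true            # synthetic: old features are linear in new ones

# Least-squares estimate of the mapping new -> old from the overlap period.
M, *_ = np.linalg.lstsq(X_new, X_old, rcond=None)

# After the old sensors disappear, recover old-space data for a new instance
# and make predictions on the recovered representation.
x_new = rng.standard_normal(d_new)
x_old_recovered = x_new @ M
```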

Rank-1 Convolutional Neural Network
In this paper, we propose a convolutional neural network (CNN) with 3-D rank-1 filters, each composed as the outer product of 1-D filters. After training, the 3-D rank-1 filters can be decomposed back into 1-D filters at test time for fast inference. The reason we train 3-D rank-1 filters in the training stage instead of consecutive 1-D filters is that a better gradient flow is obtained with this setting, which makes training possible even in cases where a network with consecutive 1-D filters cannot be trained. The 3-D rank-1 filters are updated in every epoch both by the gradient flow and by the outer product of the 1-D filters: the gradient flow seeks a solution that minimizes the loss function, while the outer product operation projects the filter parameters back onto a rank-1 subspace. Furthermore, we show that convolution with the rank-1 filters results in low-rank outputs, constraining the final output of the CNN to live on a low-dimensional subspace as well. …
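The rank-1 construction itself is easy to sketch. The snippet below (filter sizes and names are assumed for illustration, and NumPy stands in for the CNN framework) builds a 3-D filter as the outer product of three 1-D filters and checks that any mode unfolding has rank 1, which is what permits the test-time decomposition into consecutive 1-D convolutions.

```python
import numpy as np

# Hypothetical filter dimensions (depth, height, width).
kd, kh, kw = 3, 3, 3
rng = np.random.default_rng(0)
u = rng.standard_normal(kd)   # 1-D filter along depth
v = rng.standard_normal(kh)   # 1-D filter along height
w = rng.standard_normal(kw)   # 1-D filter along width

# Outer product: W[i, j, k] = u[i] * v[j] * w[k].
W = np.einsum("i,j,k->ijk", u, v, w)

# Any unfolding of W is a rank-1 matrix, so convolving with W is equivalent
# to three consecutive 1-D convolutions at inference time.
assert np.linalg.matrix_rank(W.reshape(kd, kh * kw)) == 1
```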

Optuna
The package was, and still is, developed by the Japanese AI company Preferred Networks. In many ways, Optuna is similar to Hyperopt. So why should you bother? There are a few reasons:
• It’s possible to specify how long the optimization process should last
• Integration with Pandas DataFrame
• The algorithm uses pruning to discard low-quality trials early
• It’s a relatively new project, and developers continue to work on it
• It was easier to use than Hyperopt (at least for me)
How to make your model awesome with Optuna

Text-Driven Graph Embedding With Pairs Sampling (TGE-PS)
In graphs with rich text information, constructing expressive graph representations requires combining textual information with structural information. Graph embedding models are becoming increasingly popular for representing graphs, yet they face two issues: sampling efficiency and text utilization. By analyzing existing models, we find that their training objectives are composed of pairwise proximities and that Random Walk-based methods produce large amounts of redundant node pairs. Besides, inferring graph structures directly from texts (also known as the zero-shot scenario) is a problem that requires higher text utilization. To address these problems, we propose a novel Text-driven Graph Embedding with Pairs Sampling (TGE-PS) framework. TGE-PS uses Pairs Sampling (PS) to generate training samples, reducing the number of training samples by ~99% while remaining competitive with Random Walk sampling. TGE-PS uses Text-driven Graph Embedding (TGE), which adopts word- and character-level embeddings to generate node embeddings. We evaluate TGE-PS on several real-world datasets, and experimental results demonstrate that it achieves state-of-the-art results on traditional and zero-shot link prediction tasks. …
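The redundancy that motivates Pairs Sampling can be seen in a small sketch. This is an illustration of the problem, not the paper's algorithm: skip-gram pairs extracted from random walks on a small graph repeat the same (source, context) pairs many times, whereas sampling pairs directly could emit each distinct pair once.

```python
import random
from collections import Counter

random.seed(0)
# Tiny hypothetical undirected graph as an adjacency list.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

def random_walk_pairs(num_walks=10, walk_len=6, window=2):
    """Collect skip-gram (source, context) pairs from random walks."""
    pairs = []
    for _ in range(num_walks):
        node = random.choice(list(graph))
        walk = [node]
        for _ in range(walk_len - 1):
            node = random.choice(graph[node])
            walk.append(node)
        for i, u in enumerate(walk):
            for j in range(max(0, i - window), min(len(walk), i + window + 1)):
                if i != j:
                    pairs.append((u, walk[j]))
    return pairs

pairs = random_walk_pairs()
counts = Counter(pairs)
# Far fewer distinct pairs than sampled pairs: the surplus is redundant work.
print(len(pairs), len(counts))
```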