TEST google
Tracking developments in the highly dynamic data-technology landscape are vital to keeping up with novel technologies and tools, in the various areas of Artificial Intelligence (AI). However, It is difficult to keep track of all the relevant technology keywords. In this paper, we propose a novel system that addresses this problem. This tool is used to automatically detect the existence of new technologies and tools in text, and extract terms used to describe these new technologies. The extracted new terms can be logged as new AI technologies as they are found on-the-fly in the web. It can be subsequently classified into the relevant semantic labels and AI domains. Our proposed tool is based on a two-stage cascading model — the first stage classifies if the sentence contains a technology term or not; and the second stage identifies the technology keyword in the sentence. We obtain a competitive accuracy for both tasks of sentence classification and text identification. …

Self-Attention Aligner (SAA) google
Self-attention network, an attention-based feedforward neural network, has recently shown the potential to replace recurrent neural networks (RNNs) in a variety of NLP tasks. However, it is not clear if the self-attention network could be a good alternative of RNNs in automatic speech recognition (ASR), which processes the longer speech sequences and may have online recognition requirements. In this paper, we present a RNN-free end-to-end model: self-attention aligner (SAA), which applies the self-attention networks to a simplified recurrent neural aligner (RNA) framework. We also propose a chunk-hopping mechanism, which enables the SAA model to encode on segmented frame chunks one after another to support online recognition. Experiments on two Mandarin ASR datasets show the replacement of RNNs by the self-attention networks yields a 8.4%-10.2% relative character error rate (CER) reduction. In addition, the chunk-hopping mechanism allows the SAA to have only a 2.5% relative CER degradation with a 320ms latency. After jointly training with a self-attention network language model, our SAA model obtains further error rate reduction on multiple datasets. Especially, it achieves 24.12% CER on the Mandarin ASR benchmark (HKUST), exceeding the best end-to-end model by over 2% absolute CER. …

WarpLDA google
Developing efficient and scalable algorithms for Latent Dirichlet Allocation (LDA) is of wide interest for many applications. Previous work has developed an $O(1)$ Metropolis-Hastings sampling method for each token. However, the performance is far from being optimal due to random accesses to the parameter matrices and frequent cache misses. In this paper, we propose WarpLDA, a novel $O(1)$ sampling algorithm for LDA. WarpLDA is a Metropolis-Hastings based algorithm which is designed to optimize the cache hit rate. Advantages of WarpLDA include 1) Efficiency and scalability: WarpLDA has good locality and carefully designed partition method, and can be scaled to hundreds of machines; 2) Simplicity: WarpLDA does not have any complicated modules such as alias tables, hybrid data structures, or parameter servers, making it easy to understand and implement; 3) Robustness: WarpLDA is consistently faster than other algorithms, under various settings from small-scale to massive-scale dataset and model. WarpLDA is 5-15x faster than state-of-the-art LDA samplers, implying less cost of time and money. With WarpLDA users can learn up to one million topics from hundreds of millions of documents in a few hours, at the speed of 2G tokens per second, or learn topics from small-scale datasets in seconds. …

Physics-Informed Gaussian Process Regression (GPR) google
We present a physics-informed Gaussian Process Regression (GPR) model to predict the phase angle, angular speed, and wind mechanical power from a limited number of measurements. In the traditional data-driven GPR method, the form of the Gaussian Process covariance matrix is assumed and its parameters are found from measurements. In the physics-informed GPR, we treat unknown variables (including wind speed and mechanical power) as a random process and compute the covariance matrix from the resulting stochastic power grid equations. We demonstrate that the physics-informed GPR method is significantly more accurate than the standard data-driven one for immediate forecasting of generators’ angular velocity and phase angle. We also show that the physics-informed GPR provides accurate predictions of the unobserved wind mechanical power, phase angle, or angular velocity when measurements from only one of these variables are available. The immediate forecast of observed variables and predictions of unobserved variables can be used for effectively managing power grids (electricity market clearing, regulation actions) and early detection of abnormal behavior and faults. The physics-based GPR forecast time horizon depends on the combination of input (wind power, load, etc.) correlation time and characteristic (relaxation) time of the power grid and can be extended to short and medium-range times. …