Object Mining (OM) 
The goal of our work is to discover dominant objects without using any annotations. We focus on performing unsupervised object discovery and localization in a strictly general setting where only a single image is given. This is far more challenging than typical co-localization or weakly-supervised localization tasks. To tackle this problem, we propose a simple but effective pattern mining-based method, called Object Mining (OM), which exploits the advantages of data mining and feature representation of pre-trained convolutional neural networks (CNNs). Specifically, Object Mining first converts the feature maps from a pre-trained CNN model into a set of transactions, and then frequent patterns are discovered from the transaction database through pattern mining techniques. We observe that those discovered patterns, i.e., co-occurrence highlighted regions, typically hold appearance and spatial consistency. Motivated by this observation, we can easily discover and localize possible objects by merging relevant meaningful patterns in an unsupervised manner. Extensive experiments on a variety of benchmarks demonstrate that Object Mining achieves competitive performance compared with the state-of-the-art methods.
Object Oriented Data Analysis (OODA) 
Object oriented data analysis (OODA) is the statistical analysis of data sets of complex objects. The area is understood through consideration of the atom of the statistical analysis. In a first course in statistics, the atoms are numbers. Atoms are vectors in multivariate analysis. An interesting special case of OODA is functional data analysis, where atoms are curves; see Ramsay and Silverman for excellent overviews, as well as many interesting analyses, novel methodologies and detailed discussion. More general atoms have also been considered. Locantore et al. studied the case of images as atoms, and Pizer, Thall and Chen and Yushkevich et al. took the atoms to be shape objects in two- and three-dimensional space. A major goal of OODA is understanding the population structure of a data set. The usual first step is to find a centerpoint, for example, a mean or median, of the data set. The second step is to analyze the variation about the center. Principal component analysis (PCA) has been a workhorse method for this, especially when combined with new visualizations as done in functional data analysis. An important reason for this success to date is that the data naturally lie in Euclidean spaces, where standard vector space analyses have proven to be both insightful and effective.
Object Saliency Map  Deep reinforcement learning has become popular over recent years, showing superiority on different visual-input tasks such as playing Atari games and robot navigation. Although objects are important image elements, few works consider enhancing deep reinforcement learning with object characteristics. In this paper, we propose a novel method that can incorporate object recognition processing into deep reinforcement learning models. This approach can be adapted to any existing deep reinforcement learning framework. State-of-the-art results are shown in experiments on Atari games. We also propose a new approach called ‘object saliency maps’ to visually explain the actions made by deep reinforcement learning agents.
Object-Driven Attentive Generative Adversarial Network (Obj-GAN)
In this paper, we propose Object-driven Attentive Generative Adversarial Networks (Obj-GANs) that allow object-centered text-to-image synthesis for complex scenes. Following the two-step (layout-image) generation process, a novel object-driven attentive image generator is proposed to synthesize salient objects by paying attention to the most relevant words in the text description and the pre-generated semantic layout. In addition, a new Fast R-CNN based object-wise discriminator is proposed to provide rich object-wise discrimination signals on whether the synthesized object matches the text description and the pre-generated layout. The proposed Obj-GAN significantly outperforms the previous state of the art in various metrics on the large-scale COCO benchmark, increasing the Inception score by 27% and decreasing the FID score by 11%. A thorough comparison between the traditional grid attention and the new object-driven attention is provided through analyzing their mechanisms and visualizing their attention layers, showing insights into how the proposed model generates complex scenes in high quality.
Objective Function  A function that is to be optimized, i.e., whose numerical value is to be minimized or maximized depending on the particular task or problem. For example, the objective in a pattern classification task could be to minimize the error rate of a classifier.
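As a minimal sketch of the idea, the error rate of a one-dimensional threshold classifier can serve as the objective function and be minimized over a handful of candidate thresholds. The data, the candidate list and the function name `error_rate` are illustrative assumptions, not part of the definition above.

```python
def error_rate(threshold, xs, ys):
    """Objective function: fraction of points misclassified by the rule x >= threshold."""
    predictions = [1 if x >= threshold else 0 for x in xs]
    mistakes = sum(p != y for p, y in zip(predictions, ys))
    return mistakes / len(xs)

# Toy data (assumed for illustration): feature values and binary labels.
xs = [0.1, 0.4, 0.5, 0.7, 0.9]
ys = [0, 0, 1, 1, 1]

# Optimize the objective by brute force over a few candidate thresholds.
candidates = [0.0, 0.3, 0.45, 0.6, 1.0]
best = min(candidates, key=lambda t: error_rate(t, xs, ys))
```

Here the objective is minimized; a task that maximizes accuracy instead would use `max` over the same candidates.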
Objective-Reinforced Generative Adversarial Network (ORGAN)
In unsupervised data generation tasks, besides the generation of a sample based on previous observations, one would often like to give hints to the model in order to bias the generation towards desirable metrics. We propose a method that combines Generative Adversarial Networks (GANs) and reinforcement learning (RL) in order to accomplish exactly that. While RL biases the data generation process towards arbitrary metrics, the GAN component of the reward function ensures that the model still remembers information learned from data. We build upon previous results that incorporated GANs and RL in order to generate sequence data and test this model in several settings for the generation of molecules encoded as text sequences (SMILES) and in the context of music generation, showing for each case that we can effectively bias the generation process towards desired metrics. 
Object-oriented Neural Programming (OONP)
We propose Object-oriented Neural Programming (OONP), a framework for semantically parsing documents in specific domains. Basically, OONP reads a document and parses it into a pre-designed object-oriented data structure (referred to as ontology in this paper) that reflects the domain-specific semantics of the document. An OONP parser models semantic parsing as a decision process: a neural net-based Reader sequentially goes through the document, and during the process it builds and updates an intermediate ontology to summarize its partial understanding of the text it covers. OONP supports a rich family of operations (both symbolic and differentiable) for composing the ontology, and a wide variety of forms (both symbolic and differentiable) for representing the state and the document. An OONP parser can be trained with supervision of different forms and strength, including supervised learning (SL), reinforcement learning (RL) and a hybrid of the two. Our experiments on both synthetic and real-world document parsing tasks have shown that OONP can learn to handle fairly complicated ontologies with training data of modest size.
OBOE  Algorithm selection and hyperparameter tuning remain two of the most challenging tasks in machine learning. The number of machine learning applications is growing much faster than the number of machine learning experts, hence we see an increasing demand for efficient automation of learning processes. Here, we introduce OBOE, an algorithm for time-constrained model selection and hyperparameter tuning. Taking advantage of similarity between datasets, OBOE finds promising algorithm and hyperparameter configurations through collaborative filtering. Our system explores these models under time constraints, so that rapid initializations can be provided to warm-start more fine-grained optimization methods. One novel aspect of our approach is a new heuristic for active learning in time-constrained matrix completion based on optimal experiment design. Our experiments demonstrate that OBOE delivers state-of-the-art performance faster than competing approaches on a test bed of supervised learning problems.
Observable Operator Model (OOM) 
Observable Operator Models (OOMs) were introduced by Jaeger as a generalization of hidden Markov models (HMMs). The theory of OOMs makes use of both probabilistic and linear algebraic tools, which has an important advantage: using the tools of linear algebra a very simple and efficient learning algorithm can be developed for OOMs. This seems to be better than the known algorithms for HMMs.
A widely used class of models for stochastic systems is hidden Markov models. Systems that can be modeled by hidden Markov models are a proper subclass of linearly dependent processes, a class of stochastic systems known from mathematical investigations carried out over the past four decades. This article provides a novel, simple characterization of linearly dependent processes, called observable operator models. The mathematical properties of observable operator models lead to a constructive learning algorithm for the identification of linearly dependent processes. The core of the algorithm has a time complexity of O(N + nm^3), where N is the size of training data, n is the number of distinguishable outcomes of observations, and m is the model state-space dimension. A short introduction to observable operator models of discrete stochastic processes
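To make the relation between HMMs and OOMs concrete, the sketch below rewrites a small HMM as one linear operator per observation symbol and computes sequence probabilities by applying those operators in order. The construction used here (operator = transposed transition matrix times a diagonal matrix of emission probabilities) is the usual textbook one and is an assumption on our part, since the entry above does not spell it out; the learning algorithm is not shown, and all names and numbers are illustrative.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def hmm_to_oom(transition, emission):
    """Build one observable operator per symbol from an HMM.

    transition[i][j] = P(next state j | current state i)
    emission[i][a]   = P(symbol a | current state i)
    The operator for symbol a has entries op[j][i] = transition[i][j] * emission[i][a].
    """
    m = len(transition)
    n_symbols = len(emission[0])
    return [
        [[transition[i][j] * emission[i][a] for i in range(m)] for j in range(m)]
        for a in range(n_symbols)
    ]

def sequence_probability(operators, w0, sequence):
    """P(sequence): apply the operators in order to w0, then sum the components."""
    state = list(w0)
    for symbol in sequence:
        state = mat_vec(operators[symbol], state)
    return sum(state)

# Toy two-state, two-symbol HMM (values assumed for illustration).
transition = [[0.9, 0.1], [0.2, 0.8]]
emission = [[0.7, 0.3], [0.1, 0.9]]
w0 = [0.5, 0.5]  # initial state distribution
operators = hmm_to_oom(transition, emission)
p_first_symbol_0 = sequence_probability(operators, w0, [0])
```

Because the operators for all symbols sum to the transposed transition matrix, the probabilities of all sequences of a fixed length add up to one, which is a convenient sanity check.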
Observed-Hidden Merged Seeded Network (OHMS)
➚ “Graphical Inference in Observed-Hidden Variable Merged Seeded Network”
Ocean Tensor Library  Matrix and tensor operations form the basis of a wide range of fields and applications, and in many cases constitute a substantial part of the overall computational complexity. The ability of general-purpose GPUs to speed up many of these operations and enable others has resulted in a widespread adoption of these devices. In order for tensor operations to take full advantage of the computational power, specialized software is required, and currently there exist several packages (predominantly in the area of deep learning) that incorporate tensor operations on both CPU and GPU. Nevertheless, a stand-alone framework that supports general tensor operations is still missing. In this paper we fill this gap and propose the Ocean Tensor Library: a modular tensor-support package that is designed to serve as a foundational layer for applications that require dense tensor operations on a variety of device types. The API is carefully designed to be powerful, extensible, and at the same time easy to use. The package is available as open source.
OCKELM+  Kernel method-based one-class classifiers are mainly used for outlier or novelty detection. In this letter, the kernel ridge regression (KRR) based one-class classifier (KOC) has been extended for learning using privileged information (LUPI). The LUPI-based KOC method is referred to as KOC+. This privileged information is available as a feature with the dataset but only for training (not for testing). KOC+ utilizes the privileged information differently compared to normal feature information by using a so-called correction function. Privileged information helps KOC+ in achieving better generalization performance, which is exhibited in this letter by testing the classifiers with and without privileged information. Existing and proposed classifiers are evaluated on the datasets from the UCI machine learning repository and also on the MNIST dataset. Moreover, experimental results evince the advantage of KOC+ over KOC and support vector machine (SVM) based one-class classifiers.
Octave  GNU Octave is a high-level interpreted language, primarily intended for numerical computations. It provides capabilities for the numerical solution of linear and nonlinear problems, and for performing other numerical experiments. It also provides extensive graphics capabilities for data visualization and manipulation. Octave is normally used through its interactive command line interface, but it can also be used to write non-interactive programs. The Octave language is quite similar to Matlab so that most programs are easily portable. http://wiki.octave.org
OctNet  We present OctNet, a representation for deep learning with sparse 3D data. In contrast to existing models, our representation enables 3D convolutional networks which are both deep and high resolution. Towards this goal, we exploit the sparsity in the input data to hierarchically partition the space using a set of unbalanced octrees where each leaf node stores a pooled feature representation. This allows us to focus memory allocation and computation on the relevant dense regions and enables deeper networks without compromising resolution. We demonstrate the utility of our OctNet representation by analyzing the impact of resolution on several 3D tasks including 3D object classification, orientation estimation and point cloud labeling.
Octree Generating Networks  We present a deep convolutional decoder architecture that can generate volumetric 3D outputs in a compute- and memory-efficient manner by using an octree representation. The network learns to predict both the structure of the octree, and the occupancy values of individual cells. This makes it a particularly valuable technique for generating 3D shapes. In contrast to standard decoders acting on regular voxel grids, the architecture does not have cubic complexity. This allows representing much higher resolution outputs with a limited memory budget. We demonstrate this in several application domains, including 3D convolutional autoencoders, generation of objects and whole scenes from high-level representations, and shape from a single image.
Oddball SGD  Stochastic Gradient Descent (SGD) is arguably the most popular of the machine learning methods applied to training deep neural networks (DNN) today. It has recently been demonstrated that SGD can be statistically biased so that certain elements of the training set are learned more rapidly than others. In this article, we place SGD into a feedback loop whereby the probability of selection is proportional to error magnitude. This provides a novelty-driven oddball SGD process that learns more rapidly than traditional SGD by prioritising those elements of the training set with the largest novelty (error). In our DNN example, oddball SGD trains some 50x faster than regular SGD.
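The selection rule described above, probability of selection proportional to error magnitude, can be sketched as follows. This shows only the sampling step, not the training loop of the paper; the function names and the uniform fallback for all-zero errors are assumptions made for illustration.

```python
import random

def selection_probabilities(errors):
    """Selection probability of each training example, proportional to |error|."""
    magnitudes = [abs(e) for e in errors]
    total = sum(magnitudes)
    if total == 0:
        # Assumed fallback: sample uniformly when every error is zero.
        return [1.0 / len(errors)] * len(errors)
    return [m / total for m in magnitudes]

def pick_training_example(errors, rng):
    """Draw the index of the next example to train on."""
    weights = selection_probabilities(errors)
    return rng.choices(range(len(errors)), weights=weights, k=1)[0]

# Per-example errors from a hypothetical previous evaluation.
errors = [1.0, -3.0, 0.0]
rng = random.Random(0)
next_index = pick_training_example(errors, rng)
```

In a full training loop the errors would be refreshed as the network improves, so the sampling distribution keeps shifting towards whichever examples are currently learned worst.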
Odds  Odds are a numerical expression used in gambling and statistics to reflect the likelihood that a particular event will take place. Conventionally, they are expressed in the form “X to Y”, where X and Y are numbers. In gambling, odds represent the ratio between the amounts staked by parties to a wager or bet. Thus, odds of 6 to 1 mean the first party (normally a bookmaker) is staking six times the amount that the second party is. In statistics, odds represent the probability that an event will take place. Thus, odds of 6 to 1 mean that there are six possible outcomes in which the event will not take place to every one where it will. In other words, the probability that X will not happen is six times the probability that it will. The gambling and statistical uses of odds are closely interlinked. If a bet is a fair one, then the odds offered to the gamblers will perfectly reflect relative probabilities. If the odds being offered to the gamblers do not correspond to probability in this way then one of the parties to the bet has an advantage over the other. 
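The statistical reading of odds can be turned into a short calculation: odds of X to Y against an event correspond to a probability of Y / (X + Y) that the event occurs. The sketch below uses exact fractions; the function names are illustrative.

```python
from fractions import Fraction

def probability_from_odds_against(against, in_favour):
    """Probability that the event happens, given odds of `against` to `in_favour` against it."""
    return Fraction(in_favour, against + in_favour)

def odds_against_from_probability(p):
    """Ratio P(event does not happen) / P(event happens)."""
    p = Fraction(p)
    return (1 - p) / p

# Odds of 6 to 1 against: the event happens with probability 1/7.
p = probability_from_odds_against(6, 1)
ratio = odds_against_from_probability(p)
```

With 6 to 1, the probability of the event not happening is 6/7, which is six times 1/7, matching the description above.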
OD-test  In the real world, a learning system could receive an input that looks nothing like anything it has seen during training, and this can lead to unpredictable behaviour. We thus need to know whether any given input belongs to the population distribution of the training data to prevent unpredictable behaviour in deployed systems. A recent surge of interest in this problem has led to the development of sophisticated techniques in the deep learning literature. However, due to the absence of a standardized problem formulation or an exhaustive evaluation, it is not evident if we can rely on these methods in practice. What makes this problem different from a typical supervised learning setting is that we cannot model the diversity of out-of-distribution samples in practice. The distribution of outliers used in training may not be the same as the distribution of outliers encountered in the application. Therefore, classical approaches that learn inliers vs. outliers with only two datasets can yield optimistic results. We introduce OD-test, a three-dataset evaluation scheme as a practical and more reliable strategy to assess progress on this problem. The OD-test benchmark provides a straightforward means of comparison for methods that address the out-of-distribution sample detection problem. We present an exhaustive evaluation of a broad set of methods from related areas on image classification tasks. Furthermore, we show that for realistic applications of high-dimensional images, the existing methods have low accuracy. Our analysis reveals areas of strength and weakness of each method.
Offline Algorithm  In computer science, an online algorithm is one that can process its input piece-by-piece in a serial fashion, i.e., in the order that the input is fed to the algorithm, without having the entire input available from the start. In contrast, an offline algorithm is given the whole problem data from the beginning and is required to output an answer which solves the problem at hand.
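A small contrast may help: a running mean can be maintained online, one value at a time, whereas the sort-based median below needs the complete data set before it can answer and is therefore offline. Both functions are illustrative choices, not canonical examples from the definition.

```python
def running_mean(stream):
    """Online: consumes values one by one and yields the mean seen so far."""
    count, mean = 0, 0.0
    for value in stream:
        count += 1
        mean += (value - mean) / count
        yield mean

def median(data):
    """Offline: needs the complete data set, because it sorts all of it first."""
    ordered = sorted(data)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

means_so_far = list(running_mean([2, 4, 6]))
middle = median([5, 1, 3])
```

The online version produces a usable answer after every element, while the offline version returns a single answer once all the input is available.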
Offline Multi-Action Policy Learning  In many settings, a decision-maker wishes to learn a rule, or policy, that maps from observable characteristics of an individual to an action. Examples include selecting offers, prices, advertisements, or emails to send to consumers, as well as the problem of determining which medication to prescribe to a patient. While there is a growing body of literature devoted to this problem, most existing results are focused on the case where data comes from a randomized experiment, and further, there are only two possible actions, such as giving a drug to a patient or not. In this paper, we study the offline multi-action policy learning problem with observational data and where the policy may need to respect budget constraints or belong to a restricted policy class such as decision trees. We build on the theory of efficient semi-parametric inference in order to propose and implement a policy learning algorithm that achieves asymptotically minimax-optimal regret. To the best of our knowledge, this is the first result of this type in the multi-action setup, and it provides a substantial performance improvement over the existing learning algorithms. We then consider additional computational challenges that arise in implementing our method for the case where the policy is restricted to take the form of a decision tree. We propose two different approaches, one using a mixed integer program formulation and the other using a tree-search based algorithm.
Offline Resource Scheduling Algorithm (DeepRM_Off) 
With the rapid development of deep learning, deep reinforcement learning (DRL) began to appear in the field of resource scheduling in recent years. Based on the previous research on DRL in the literature, we introduce the online resource scheduling algorithm DeepRM2 and the offline resource scheduling algorithm DeepRM_Off. Compared with the state-of-the-art DRL algorithm DeepRM and heuristic algorithms, our proposed algorithms have faster convergence speed and better scheduling efficiency with regard to average slowdown time, job completion time and rewards.
OffsetNet  Navigating surgical tools in the dynamic and tortuous anatomy of the lung’s airways requires accurate, real-time localization of the tools with respect to the preoperative scan of the anatomy. Such localization can inform human operators or enable closed-loop control by autonomous agents, which would require accuracy not yet reported in the literature. In this paper, we introduce a deep learning architecture, called OffsetNet, to accurately localize a bronchoscope in the lung in real-time. After training on only 30 minutes of recorded camera images in conserved regions of a lung phantom, OffsetNet tracks the bronchoscope’s motion on a held-out recording through these same regions at an update rate of 47 Hz and an average position error of 1.4 mm. Because this model performs poorly in less conserved regions, we augment the training dataset with simulated images from these regions. To bridge the gap between camera and simulated domains, we implement domain randomization and a generative adversarial network (GAN). After training on simulated images, OffsetNet tracks the bronchoscope’s motion in less conserved regions at an average position error of 2.4 mm, which meets conservative thresholds required for successful tracking.
Off-the-Grid Model Based Deep Learning (O-MODL)
We introduce a model based off-the-grid image reconstruction algorithm using deep learned priors. The main difference between the proposed scheme and current deep learning strategies is the learning of non-linear annihilation relations in Fourier space. We rely on a model based framework, which allows us to use a significantly smaller deep network, compared to direct approaches that also learn how to invert the forward model. Preliminary comparisons against the image domain MoDL approach demonstrate the potential of the off-the-grid formulation. The main benefit of the proposed scheme compared to structured low-rank methods is the quite significant reduction in computational complexity.
OHIE  Blockchain protocols, originating from Bitcoin, have established a new model of trust through decentralization. However, the low transaction throughput of the first generation of blockchain consensus protocols has been a serious concern. Many new protocols have been proposed recently that scale the throughput of the blockchain with available bandwidth. However, these scalable consensus protocols are becoming increasingly complex, making it more and more difficult to verify their end safety and liveness guarantees. This encumbers adoption since blockchain protocols are difficult to upgrade, once deployed. We propose a new consensus protocol for permissionless blockchains, called OHIE, with an explicit goal of aiming for simplicity. OHIE composes as many parallel instances of Bitcoin’s original (and simple) backbone protocol as needed to achieve near-optimal throughput (i.e., utilizing within a constant factor of the available bandwidth). OHIE tolerates a Byzantine adversary with fraction f < 1/2 of the computation power. We formally prove safety and liveness properties of OHIE. Our proof invokes previously established properties of Bitcoin’s backbone protocol as a black box, given the modular design of OHIE. In our experimental evaluation with up to 50,000 nodes, OHIE achieves near-optimal throughput, and provides better decentralization of at least about 20x over prior works.
Oja Median  Consider p+1 points in R^p. These points form a simplex, which has a p-dimensional volume. For example, in R^3 four points form a tetrahedron, and in R^2 three points form a triangle whose area is its ‘2-dimensional volume’. Now consider a data set in R^p for which we seek the median. Oja proposed the following measure for a point X in R^p: · for every subset of p points from the data set, form a simplex with X. · sum together the volumes of each such simplex. · the Oja simplex median is any point X* in R^p for which this sum is minimum.
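For p = 2 the recipe above reduces to summing triangle areas over all pairs of data points. The sketch below evaluates that sum and, as a simplifying assumption, searches for the minimizer only among a supplied list of candidate points rather than over all of R^2; the function names and the data are illustrative.

```python
from itertools import combinations

def triangle_area(a, b, c):
    """Area of the triangle with vertices a, b, c in the plane."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    return abs(cross) / 2

def oja_objective(x, data):
    """Sum of the areas of the triangles formed by x and every pair of data points."""
    return sum(triangle_area(x, p, q) for p, q in combinations(data, 2))

def oja_median_among(candidates, data):
    """Candidate with the smallest Oja objective (restricted search, an assumption)."""
    return min(candidates, key=lambda x: oja_objective(x, data))

# Four corners of a square plus its centre.
data = [(0, 0), (2, 0), (0, 2), (2, 2), (1, 1)]
centre_value = oja_objective((1, 1), data)
corner_value = oja_objective((0, 0), data)
best_point = oja_median_among(data, data)
```

For this symmetric data set the centre point has the smallest sum among the data points, as one would expect of a median.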
Omegaml  omegaml – the data science platform that scales from laptop to enterprise. Batteries included. 
Omni-Scale Network (OSNet)
As an instance-level recognition problem, person re-identification (ReID) relies on discriminative features, which not only capture different spatial scales but also encapsulate an arbitrary combination of multiple scales. We call these features of both homogeneous and heterogeneous scales omni-scale features. In this paper, a novel deep CNN is designed, termed Omni-Scale Network (OSNet), for omni-scale feature learning in ReID. This is achieved by designing a residual block composed of multiple convolutional feature streams, each detecting features at a certain scale. Importantly, a novel unified aggregation gate is introduced to dynamically fuse multi-scale features with input-dependent channel-wise weights. To efficiently learn spatial-channel correlations and avoid overfitting, the building block uses both pointwise and depthwise convolutions. By stacking such blocks layer-by-layer, our OSNet is extremely lightweight and can be trained from scratch on existing ReID benchmarks. Despite its small model size, our OSNet achieves state-of-the-art performance on six person ReID datasets.
On2Vec  Populating ontology graphs represents a long-standing problem for the Semantic Web community. Recent advances in translation-based graph embedding methods for populating instance-level knowledge graphs lead to promising new approaches for the ontology population problem. However, unlike instance-level graphs, the majority of relation facts in ontology graphs come with comprehensive semantic relations, which often include the properties of transitivity and symmetry, as well as hierarchical relations. These comprehensive relations are often too complex for existing graph embedding methods, and direct application of such methods is not feasible. Hence, we propose On2Vec, a novel translation-based graph embedding method for ontology population. On2Vec integrates two model components that effectively characterize comprehensive relation facts in ontology graphs. The first is the Component-specific Model that encodes concepts and relations into low-dimensional embedding spaces without a loss of relational properties; the second is the Hierarchy Model that performs focused learning of hierarchical relation facts. Experiments on several well-known ontology graphs demonstrate the promising capabilities of On2Vec in predicting and verifying new relation facts. These promising results also make possible significant improvements in related methods.
On-Board/Off-Board Distributed Data Analytics (OODIDA)
OODIDA (On-board/Off-board Distributed Data Analytics) is a platform for distributing and executing concurrent data analysis tasks. It targets a fleet of reference vehicles in the automotive industry and has a particular focus on rapid prototyping. Its underlying message-passing infrastructure has been implemented in Erlang/OTP, but the external applications for user interaction and carrying out data analysis tasks use a language-independent JSON interface. These applications are primarily implemented in Python. A data analyst interacting with OODIDA uses a Python library. The bulk of the data analytics tasks are performed by clients (on-board), while a central server performs supplementary tasks (off-board). OODIDA can be automatically packaged and deployed, which necessitates restarting parts of the system, or all of it. This is potentially disruptive. To address this issue, we added the ability to execute user-defined Python modules on both the client and the server, which can be replaced without restarting any part of the system. Modules can even be swapped between iterations of an ongoing assignment. This facilitates use cases such as iterative A/B testing of machine learning algorithms or deploying experimental algorithms on-the-fly. Active-code replacement is a key feature of our system as well as an example of interoperability between a functional and a non-functional programming language.
On-Disk Data Processing (ODDP)
In this paper, we present a survey of ‘on-disk’ data processing (ODDP). ODDP, which is a form of near-data processing, refers to the computing arrangement where the secondary storage drives have the data processing capability. Proposed ODDP schemes vary widely in terms of the data processing capability, target applications, architecture and the kind of storage drive employed. Some ODDP schemes provide only a specific but heavily used operation like sort whereas some provide a full range of operations. Recently, with the advent of Solid State Drives, powerful and extensive ODDP solutions have been proposed. In this paper, we present a thorough review of architectures developed for different on-disk processing approaches along with current and future challenges and also identify the future directions which ODDP can take.
One-Class Adversarial Net (OCAN)
Many online applications, such as online social networks or knowledge bases, are often attacked by malicious users who commit different types of actions such as vandalism on Wikipedia or fraudulent reviews on eBay. Currently, most of the fraud detection approaches require a training dataset that contains records of both benign and malicious users. However, in practice, there are often no or very few records of malicious users. In this paper, we develop one-class adversarial nets (OCAN) for fraud detection using training data with only benign users. OCAN first uses LSTM-Autoencoder to learn the representations of benign users from their sequences of online activities. It then detects malicious users by training a discriminator with a complementary GAN model that is different from the regular GAN model. Experimental results show that our OCAN outperforms the state-of-the-art one-class classification models and achieves comparable performance with the latest multi-source LSTM model that requires both benign and malicious users in the training phase.
One-Class Classification  This paper presents a method called One-class Classification using Length statistics of Emerging Patterns Plus (OCLEP+).
One-Class Generative Adversarial Network (OCGAN)
We present a novel model called OCGAN for the classical problem of one-class novelty detection, where, given a set of examples from a particular class, the goal is to determine if a query example is from the same class. Our solution is based on learning latent representations of in-class examples using a denoising autoencoder network. The key contribution of our work is our proposal to explicitly constrain the latent space to exclusively represent the given class. In order to accomplish this goal, firstly, we force the latent space to have bounded support by introducing a tanh activation in the encoder’s output layer. Secondly, using a discriminator in the latent space that is trained adversarially, we ensure that encoded representations of in-class examples resemble uniform random samples drawn from the same bounded space. Thirdly, using a second adversarial discriminator in the input space, we ensure all randomly drawn latent samples generate examples that look real. Finally, we introduce a gradient-descent based sampling technique that explores points in the latent space that generate potential out-of-class examples, which are fed back to the network to further train it to generate in-class examples from those points. The effectiveness of the proposed method is measured across four publicly available datasets using two one-class novelty detection protocols where we achieve state-of-the-art results.
One-Class Support Vector Machine (OCSVM)
Traditionally, many classification problems try to solve the two- or multi-class situation. The goal of the machine learning application is to distinguish test data between a number of classes, using training data. But what if you only have data of one class and the goal is to test new data and find out whether it is like or unlike the training data? A method for this task, which has gained much popularity in the last two decades, is the One-Class Support Vector Machine. Estimating the Support of a High-Dimensional Distribution
One-Factor-At-a-Time (OFAT)
The one-factor-at-a-time method (or OFAT) is a method of designing experiments involving the testing of factors, or causes, one at a time instead of all simultaneously. Prominent text books and academic papers currently favor factorial experimental designs, a method pioneered by Sir Ronald A. Fisher, where multiple factors are changed at once. The reasons stated for favoring the use of factorial design over OFAT are: 1. OFAT requires more runs for the same precision in effect estimation 2. OFAT cannot estimate interactions 3. OFAT can miss optimal settings of factors. Despite these criticisms, some researchers have articulated a role for OFAT and shown that it can be more effective than fractional factorials under certain conditions (number of runs is limited, primary goal is to attain improvements in the system, and experimental error is not large compared to factor effects, which must be additive and independent of each other). Designed experiments remain nearly always preferred to OFAT with many types and methods available, in addition to fractional factorials which, though usually requiring more runs than OFAT, do address the three concerns above. One modern design over which OFAT has no advantage in number of runs is the Plackett-Burman which, by having all factors vary simultaneously (an important quality in experimental designs), gives generally greater precision in effect estimation. reval
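The second criticism, that OFAT cannot estimate interactions, can be illustrated with two factors coded as -1 and +1. In the sketch below the response is a pure interaction (an assumed toy model); the OFAT runs never visit the corner where both factors are changed, while the full factorial design does. All function names are illustrative.

```python
from itertools import product

def ofat_design(baseline, alternatives):
    """Baseline run plus one run per factor, changing only that factor."""
    runs = [tuple(baseline)]
    for i, alternative in enumerate(alternatives):
        run = list(baseline)
        run[i] = alternative
        runs.append(tuple(run))
    return runs

def full_factorial_design(levels):
    """Every combination of the given factor levels."""
    return list(product(*levels))

def interaction_contrast(runs, response):
    """Two-factor interaction contrast in coded (-1/+1) units, from a full 2x2 design."""
    return sum(a * b * response(a, b) for a, b in runs) / len(runs)

def response(a, b):
    """Assumed toy system in which the two factors only act jointly."""
    return a * b

ofat_runs = ofat_design((-1, -1), (1, 1))
factorial_runs = full_factorial_design([(-1, 1), (-1, 1)])
interaction = interaction_contrast(factorial_runs, response)
```

With only the three OFAT runs the corner (+1, +1) is missing, so the contrast above cannot be formed from them; the four factorial runs recover the interaction coefficient of the toy model.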
One-Pass Algorithm  In computing, a one-pass algorithm is one which reads its input exactly once, in order, without unbounded buffering. A one-pass algorithm generally requires O(n) time and less than O(n) storage (typically O(1)), where n is the size of the input. Applied to clustering, a basic one-pass algorithm operates as follows: (1) the object descriptions are processed serially; (2) the first object becomes the cluster representative of the first cluster; (3) each subsequent object is matched against all cluster representatives existing at its processing time; (4) a given object is assigned to one cluster (or more if overlap is allowed) according to some condition on the matching function; (5) when an object is assigned to a cluster the representative for that cluster is recomputed; (6) if an object fails a certain test it becomes the cluster representative of a new cluster. ➚ “Big O Notation” 
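The six clustering steps above can be sketched in a few lines of Python. This is a minimal illustration only, under assumptions that are not part of the entry: the matching function is Euclidean distance, the test is a fixed `threshold`, the representative is the running mean of the cluster, and overlap is not allowed. The names `one_pass_cluster` and `threshold` are hypothetical.

```python
from math import dist


def one_pass_cluster(points, threshold):
    """Single-pass clustering: every point is read exactly once, in order."""
    centroids = []  # current cluster representatives (running means)
    counts = []     # number of points absorbed by each cluster
    labels = []     # cluster index assigned to each point, in input order
    for p in points:
        # Step 3: match the point against all existing representatives.
        best = min(range(len(centroids)),
                   key=lambda i: dist(p, centroids[i]),
                   default=None)
        if best is not None and dist(p, centroids[best]) <= threshold:
            # Steps 4-5: assign the point, then recompute the representative
            # incrementally so that no past points need to be stored.
            counts[best] += 1
            c = centroids[best]
            centroids[best] = tuple(ck + (pk - ck) / counts[best]
                                    for ck, pk in zip(c, p))
        else:
            # Steps 2 and 6: the point founds a new cluster.
            centroids.append(tuple(float(x) for x in p))
            counts.append(1)
            best = len(centroids) - 1
        labels.append(best)
    return centroids, labels


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 10), (10, 11), (0, 0.5)]
    print(one_pass_cluster(pts, threshold=3.0))
```

Apart from the output labels, the sketch keeps only one representative and one count per cluster, which is what keeps the working storage below O(n). The result depends on the order in which objects arrive, a known property of single-pass schemes.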
One-Shot Conditional Object Detection Framework (OSCD) 
The current advances in object detection depend on large-scale datasets to get good performance. However, there may not always be sufficient samples in many scenarios, which has led to research on few-shot detection as well as its extreme variation, one-shot detection. In this paper, one-shot detection is formulated as a conditional probability problem. With this insight, a novel one-shot conditional object detection (OSCD) framework, referred to as Comparison Network (ComparisonNet), is proposed. Specifically, query and target image features are extracted through a Siamese network as mapped metrics of marginal probabilities. A two-stage detector for OSCD is introduced to compare the extracted query and target features with the learnable metric to approach the optimized non-linear conditional probability. Once trained, ComparisonNet can detect objects of both seen and unseen classes without further training, which brings the advantages of being class-agnostic, training-free for unseen classes, and free of catastrophic forgetting. Experiments show that the proposed approach achieves state-of-the-art performance on the proposed datasets of Fashion-MNIST and PASCAL VOC. 
One-Shot Federated Learning  We present one-shot federated learning, where a central server learns a global model over a network of federated devices in a single round of communication. Our approach, drawing on ensemble learning and knowledge aggregation, achieves an average relative gain of 51.5% in AUC over local baselines and comes within 90.1% of the (unattainable) global ideal. We discuss these methods and identify several promising directions of future work. 
One-Shot Imitation Learning  Imitation learning has been commonly applied to solve different tasks in isolation. This usually requires either careful feature engineering, or a significant number of samples. This is far from what we desire: ideally, robots should be able to learn from very few demonstrations of any given task, and instantly generalize to new situations of the same task, without requiring task-specific engineering. In this paper, we propose a meta-learning framework for achieving such capability, which we call one-shot imitation learning. Specifically, we consider the setting where there is a very large set of tasks, and each task has many instantiations. For example, a task could be to stack all blocks on a table into a single tower, another task could be to place all blocks on a table into two-block towers, etc. In each case, different instances of the task would consist of different sets of blocks with different initial states. At training time, our algorithm is presented with pairs of demonstrations for a subset of all tasks. A neural net is trained that takes as input one demonstration and the current state (which initially is the initial state of the other demonstration of the pair), and outputs an action with the goal that the resulting sequence of states and actions matches as closely as possible with the second demonstration. At test time, a demonstration of a single instance of a new task is presented, and the neural net is expected to perform well on new instances of this new task. The use of soft attention allows the model to generalize to conditions and tasks unseen in the training data. We anticipate that by training this model on a much greater variety of tasks and settings, we will obtain a general system that can turn any demonstrations into robust policies that can accomplish an overwhelming variety of tasks. Videos available at https://bit.ly/oneshotimitation. 
One-Shot Learning  One-shot learning is an object categorization problem in computer vision. Whereas most machine-learning-based object categorization algorithms require training on hundreds or thousands of images and very large datasets, one-shot learning aims to learn information about object categories from one, or only a few, training images. 
One-Shot Observation Learning  Observation learning is the process of learning a task by observing an expert demonstrator. We present a robust observation learning method for robotic systems. Our principal contributions are in introducing a one-shot learning method where only a single demonstration is needed for learning and in proposing a novel feature extraction method for extracting unique activity features from the demonstration. Reward values are then generated from these demonstrations. We use a learning algorithm with these rewards to learn the controls for a robotic manipulator to perform the demonstrated task. With simulation and real robot experiments, we show that the proposed method can be used to learn tasks from a single demonstration under varying conditions of viewpoints, object properties, morphology of manipulators and scene backgrounds. 
One-Sided Dynamic Principal Components  odpc 
One-Sided Preference Game With Reference Information (OSPGR) 
We often try to predict others’ actions by obtaining supporting information that shows a preference index of surrounding people. In order to reproduce these situations, we propose a game named ‘One-sided Preference Game with Reference Information (OSPGR).’ We conducted experiments in which players who have similar preferences compete for objects in OSPGR. In the experiment, we used three different types of objects: boxes, faces, and cars. Our results show that the most frequently selected object was not the most popular one. In order to gain deeper insights into the experiment’s results, we constructed a decision-making model based on two assumptions: (1) players are rational and (2) players are convinced that the other players’ preference orders are equivalent to the preference index for the group. Compared to the choice behavior of the model, the experiment’s results show that there was a tendency to take risks when the objects were faces, or when the priority of that particular player was low. 
One-Step Spectral Attack (OSSA) 
Many deep learning models are vulnerable to adversarial attacks, i.e., imperceptible but intentionally designed perturbations to the input can cause incorrect output of the networks. In this paper, using information geometry, we provide a reasonable explanation for the vulnerability of deep learning models. By considering the data space as a non-linear space with the Fisher information metric induced from a neural network, we first propose an adversarial attack algorithm termed one-step spectral attack (OSSA). The method is described by a constrained quadratic form of the Fisher information matrix, where the optimal adversarial perturbation is given by the first eigenvector, and the model vulnerability is reflected by the eigenvalues. The larger an eigenvalue is, the more vulnerable the model is to an attack along the corresponding eigenvector. Taking advantage of this property, we also propose an adversarial detection method with the eigenvalues serving as characteristics. Both our attack and detection algorithms are numerically optimized to work efficiently on large datasets. Our evaluations show superior performance compared with other methods, implying that the Fisher information is a promising approach to investigate adversarial attacks and defenses. 
Online Algorithm  In computer science, an online algorithm is one that can process its input piece by piece in a serial fashion, i.e., in the order that the input is fed to the algorithm, without having the entire input available from the start. In contrast, an offline algorithm is given the whole problem data from the beginning and is required to output an answer which solves the problem at hand. 
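As a small illustration of the online/offline distinction, the sketch below maintains a running mean and sample variance with Welford's update, consuming one value at a time and never storing the input. The choice of this particular statistic and the class name `RunningStats` are assumptions made for the example, not part of the entry.

```python
class RunningStats:
    """Online mean and sample variance (Welford's method)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        # Process one input item; only O(1) state is kept between items.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Sample variance; defined as 0.0 here when fewer than two items were seen.
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


if __name__ == "__main__":
    stats = RunningStats()
    for value in [2, 4, 4, 4, 5, 5, 7, 9]:
        stats.update(value)          # an answer is available after every item
        print(stats.n, stats.mean, stats.variance)
```

An offline version would collect the whole list first and then compute both quantities in one go; the online version can report a current answer at any point in the stream.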
Online Analytical Mining (OLAM) 
Online Analytical Processing (OLAP) technology is an essential element of the decision support system and permits decision makers to visualize huge operational data for quick, consistent, interactive and meaningful analysis. More recently, data mining techniques are also used together with OLAP to analyze large data sets which makes OLAP more useful and easier to apply in decision support systems. Several works in the past proved the likelihood and interest of integrating OLAP with data mining and as a result a new promising direction of Online Analytical Mining (OLAM) has emerged. OLAM provides a multidimensional view of its data and creates an interactive data mining environment whereby users can dynamically select data mining and OLAP functions, perform OLAP operations (such as drilling, slicing, dicing and pivoting on the data mining results), as well as perform mining operations on OLAP results, that is, mining different portions of data at multiple levels of abstraction. 
Online Analytical Processing (OLAP) 
In computing, online analytical processing, or OLAP, is an approach to answering multidimensional analytical (MDA) queries swiftly. OLAP is part of the broader category of business intelligence, which also encompasses relational database, report writing and data mining. Typical applications of OLAP include business reporting for sales, marketing, management reporting, business process management (BPM), budgeting and forecasting, financial reporting and similar areas, with new applications coming up, such as agriculture. The term OLAP was created as a slight modification of the traditional database term Online Transaction Processing (“OLTP”). 
Online Connected Dominating Set Leasing (OCDSL) 
We introduce the \emph{Online Connected Dominating Set Leasing} problem (OCDSL) in which we are given an undirected connected graph $G = (V, E)$, a set $\mathcal{L}$ of lease types each characterized by a duration and cost, and a sequence of subsets of $V$ arriving over time. A node can be leased using lease type $l$ for cost $c_l$ and remains active for time $d_l$. The adversary gives in each step $t$ a subset of nodes that need to be dominated by a connected subgraph consisting of nodes active at time $t$. The goal is to minimize the total leasing costs. OCDSL contains the \emph{Parking Permit Problem}~\cite{PPP} as a special subcase and generalizes the classical offline \emph{Connected Dominating Set} problem~\cite{Guha1998}. It has an $\Omega(\log ^2 n + \log |\mathcal{L}|)$ randomized lower bound resulting from lower bounds for the \emph{Parking Permit Problem} and the \emph{Online Set Cover} problem~\cite{Alon:2003:OSC:780542.780558,Korman}, where $|\mathcal{L}|$ is the number of available lease types and $n$ is the number of nodes in the input graph. We give a randomized $\mathcal{O}(\log ^2 n + \log |\mathcal{L}| \log n)$-competitive algorithm for OCDSL. We also give a deterministic algorithm for a variant of OCDSL in which the dominating subgraph need not be connected, the \emph{Online Dominating Set Leasing} problem. The latter is based on a simple primal-dual approach and has an $\mathcal{O}(|\mathcal{L}| \cdot \Delta)$ competitive ratio, where $\Delta$ is the maximum degree of the input graph. 
Online Convex Dictionary Learning  Dictionary learning is a dimensionality reduction technique widely used in data mining, machine learning and signal processing alike. Nevertheless, many dictionary learning algorithms such as variants of Matrix Factorization (MF) do not adequately scale with the size of available datasets. Furthermore, scalable dictionary learning methods lack interpretability of the derived dictionary matrix. To mitigate these two issues, we propose a novel low-complexity, batch online convex dictionary learning algorithm. The algorithm sequentially processes small batches of data maintained in a fixed amount of storage space, and produces meaningful dictionaries that satisfy convexity constraints. Our analytical results are twofold. First, we establish convergence guarantees for the proposed online learning scheme. Second, we show that a subsequence of the generated dictionaries converges to a stationary point of the approximation-error function. Experimental results on synthetic and real-world datasets demonstrate both the computational savings of the proposed online method with respect to convex non-negative MF, and performance guarantees comparable to those of online non-convex learning. 
Online Convex Ensemble StrongLy Adaptive Dynamic Learning (OCELAD) 
Recent work in distance metric learning has focused on learning transformations of data that best align with specified pairwise similarity and dissimilarity constraints, often supplied by a human observer. The learned transformations lead to improved retrieval, classification, and clustering algorithms due to the better adapted distance or similarity measures. Here, we address the problem of learning these transformations when the underlying constraint generation process is nonstationary. This nonstationarity can be due to changes in either the ground-truth clustering used to generate constraints or changes in the feature subspaces in which the class structure is apparent. We propose Online Convex Ensemble StrongLy Adaptive Dynamic Learning (OCELAD), a general adaptive, online approach for learning and tracking optimal metrics as they change over time that is highly robust to a variety of nonstationary behaviors in the changing metric. We apply the OCELAD framework to an ensemble of online learners. Specifically, we create a retro-initialized composite objective mirror descent (COMID) ensemble (RICE) consisting of a set of parallel COMID learners with different learning rates, demonstrate RICE-OCELAD on both real and synthetic data sets, and show significant performance improvements relative to previously proposed batch and online distance metric learning algorithms. 
Online Convex Optimization (OCO) 

Online Deep Metric Learning (ODML) 
Metric learning learns a metric function from training data to calculate the similarity or distance between samples. From the perspective of feature learning, metric learning essentially learns a new feature space by feature transformation (e.g., Mahalanobis distance metric). However, traditional metric learning algorithms are shallow: they learn just one metric space (feature transformation). Can we further learn a better metric space from the learnt metric space? In other words, can we learn a metric progressively and nonlinearly, as in deep learning, by just using the existing metric learning algorithms? To this end, we present a hierarchical metric learning scheme and implement an online deep metric learning framework, namely ODML. Specifically, we take one online metric learning algorithm as a metric layer, followed by a nonlinear layer (i.e., ReLU), and then stack these layers in the manner of deep learning. The proposed ODML enjoys some nice properties: it can indeed learn a metric progressively, and it performs better on some datasets. Various experiments with different settings have been conducted to verify these properties of the proposed ODML. 
Online Event-Detection Problem (OEDP) 
Given a stream $S = (s_1, s_2, …, s_N)$, a $\phi$-heavy hitter is an item $s_i$ that occurs at least $\phi N$ times in $S$. The problem of finding heavy hitters has been extensively studied in the database literature. In this paper, we study a related problem. We say that there is a $\phi$-event at time $t$ if $s_t$ occurs exactly $\phi N$ times in $(s_1, s_2, …, s_t)$. Thus, for each $\phi$-heavy hitter there is a single $\phi$-event which occurs when its count reaches the reporting threshold $\phi N$. We define the online event-detection problem (OEDP) as: given $\phi$ and a stream $S$, report all $\phi$-events as soon as they occur. Many real-world monitoring systems demand event detection where all events must be reported (no false negatives), in a timely manner, with no non-events reported (no false positives), and a low reporting threshold. As a result, the OEDP requires a large amount of space ($\Omega(N)$ words) and is not solvable in the streaming model or via standard sampling-based approaches. Since OEDP requires large space, we focus on cache-efficient algorithms in the external-memory model. We provide algorithms for the OEDP that are within a log factor of optimal. Our algorithms are tunable: their parameters can be set to allow for a bounded number of false positives and a bounded delay in reporting. None of our relaxations allow false negatives since reporting all events is a strict requirement of our applications. Finally, we show improved results when the count of items in the input stream follows a power-law distribution. 
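To make the definition of a $\phi$-event concrete, here is a naive in-memory sketch that keeps an exact count per item, which is the large-space baseline the entry refers to and not the cache-efficient external-memory algorithm of the paper. It assumes the stream length $N$ is known in advance and rounds the threshold $\phi N$ up to an integer; the name `phi_events` is hypothetical.

```python
from collections import Counter
from math import ceil


def phi_events(stream, phi):
    """Yield (t, item) the moment an item's count reaches ceil(phi * N)."""
    n = len(stream)                      # assumption: N is known up front
    threshold = max(1, ceil(phi * n))    # reporting threshold, at least 1
    counts = Counter()                   # exact counts: up to one entry per distinct item
    for t, item in enumerate(stream, start=1):
        counts[item] += 1
        if counts[item] == threshold:
            # Reported exactly once per heavy hitter, with no delay,
            # no false positives and no false negatives.
            yield t, item


if __name__ == "__main__":
    for event in phi_events(list("abacabda"), phi=0.25):
        print(event)
```

Because every distinct item needs its own counter, the memory footprint grows with the stream, which is consistent with the space lower bound stated in the entry.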
Online Failure Prediction  Online failure prediction aims to identify, during runtime, whether a failure will occur in the near future, based on an assessment of the monitored current system state. 
Online FAult Detection (FADO) 
This paper proposes and studies a detection technique for adversarial scenarios (dubbed deterministic detection). This technique provides an alternative detection methodology in case the usual stochastic methods are not applicable: this can be because the studied phenomenon does not follow a stochastic sampling scheme, samples are high-dimensional and subsequent multiple-testing corrections render results overly conservative, sample sizes are too low for asymptotic results (such as the central limit theorem) to kick in, or one cannot allow for the small probability of failure inherent to stochastic approaches. This paper instead designs a method based on insights from machine learning and online learning theory: this detection algorithm, named Online FAult Detection (FADO), comes with theoretical guarantees of its detection capabilities. A version of the margin is found to regulate the detection performance of FADO. A precise expression is derived for bounding the performance, and experimental results are presented assessing the influence of the quantities involved. A case study of scene detection is used to illustrate the approach. The technology is closely related to the linear perceptron rule, and inherits its computational attractiveness and flexibility towards various extensions. 
Online Generative Discriminative Restricted Boltzmann Machine (OGDRBM) 
We propose a novel online learning algorithm for Restricted Boltzmann Machines (RBM), namely, the Online Generative Discriminative Restricted Boltzmann Machine (OGDRBM), that provides the ability to build and adapt the network architecture of RBM according to the statistics of streaming data. The OGDRBM is trained in two phases: (1) an online generative phase for unsupervised feature representation at the hidden layer and (2) a discriminative phase for classification. The online generative training begins with zero neurons in the hidden layer, then adds and updates neurons to adapt to the statistics of streaming data in a single-pass unsupervised manner, resulting in a feature representation best suited to the data. The discriminative phase is based on stochastic gradient descent and associates the represented features to the class labels. We demonstrate the OGDRBM on a set of multi-category and binary classification problems for data sets having varying degrees of class imbalance. We first apply the OGDRBM algorithm on the multi-class MNIST dataset to characterize the network evolution. We demonstrate that the online generative phase converges to a stable, concise network architecture, wherein individual neurons are inherently discriminative to the class labels despite unsupervised training. We then benchmark OGDRBM performance against other machine learning, neural network and ClassRBM techniques for credit scoring applications using 3 public non-stationary two-class credit datasets with varying degrees of class imbalance. We report that OGDRBM improves accuracy by 2.53% over batch learning techniques while requiring at least 24%-70% fewer neurons and fewer training samples. This online generative training approach can be extended greedily to multiple layers for training Deep Belief Networks in non-stationary data mining applications without the need for a priori fixed architectures. 
Online Gradient Descent (OGD) 
In stochastic (or “online”) gradient descent, the true gradient of Q(w) is approximated by a gradient at a single example. … As the algorithm sweeps through the training set, it performs the above update for each training example. Several passes can be made over the training set until the algorithm converges. If this is done, the data can be shuffled for each pass to prevent cycles. Typical implementations may use an adaptive learning rate so that the algorithm converges. 
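The per-example update described above can be sketched for a concrete objective. The entry elides the update rule itself, so the sketch assumes the standard form $w \leftarrow w - \eta \nabla Q_i(w)$ with a squared-error loss $Q_i(w) = \frac{1}{2}(w x_i + b - y_i)^2$ on a one-dimensional linear model and a fixed learning rate; the function name `sgd_linear` and all parameter values are illustrative assumptions.

```python
import random


def sgd_linear(data, lr=0.05, epochs=200, seed=0):
    """Fit y ~ w*x + b by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0
    for _ in range(epochs):          # several passes over the training set
        rng.shuffle(data)            # reshuffle each pass to prevent cycles
        for x, y in data:
            err = (w * x + b) - y    # residual on this single example
            # Gradient of 0.5 * err**2 with respect to (w, b) is (err * x, err).
            w -= lr * err * x
            b -= lr * err
    return w, b


if __name__ == "__main__":
    samples = [(x, 2.0 * x + 1.0) for x in (-2, -1, 0, 1, 2)]
    print(sgd_linear(samples))
```

A fixed step size is enough here because the toy data are noise-free; with noisy data, a decaying or adaptive learning rate, as the entry mentions, is the usual way to obtain convergence.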
Online Gradient Descent With Expected Gradient (OGDEG) 
Online learning with limited information feedback (bandit) tries to solve the problem where an online learner receives partial feedback information from the environment in the course of learning. Under this setting, Flaxman et al. [2005] extend the classical Online Gradient Descent (OGD) algorithm of Zinkevich [2003] by proposing the Online Gradient Descent with Expected Gradient (OGDEG) algorithm. Specifically, it uses a simple trick to approximate the gradient of the loss function $f_t$ by evaluating it at a single point and bounds the expected regret as $\mathcal{O}(T^{5/6})$. It has been shown that, compared with first-order algorithms, second-order online learning algorithms such as Online Newton Step (ONS) [Hazan et al., 2007] can significantly accelerate the convergence rate in traditional online learning. Motivated by this, this paper aims to exploit second-order information to speed up the convergence of OGDEG. In particular, we extend the ONS algorithm with the trick of expected gradient and develop a novel second-order online learning algorithm, i.e., Online Newton Step with Expected Gradient (ONSEG). Theoretically, we show that the proposed ONSEG algorithm significantly reduces the expected regret of OGDEG from $\mathcal{O}(T^{5/6})$ to $\mathcal{O}(T^{2/3})$ in the bandit feedback scenario. Empirically, we demonstrate the advantages of the proposed algorithm on several real-world datasets. 
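For readers unfamiliar with the single-point trick, the identity usually attributed to Flaxman et al. can be written as follows. The notation is an assumption introduced here and does not appear in the entry: $d$ is the dimension, $\delta > 0$ a smoothing radius, $\mathbb{B}$ and $\mathbb{S}$ the unit ball and unit sphere, and $u_t$ a vector drawn uniformly from $\mathbb{S}$.

```latex
\hat{f}_t(x) = \mathbb{E}_{v \in \mathbb{B}}\bigl[ f_t(x + \delta v) \bigr],
\qquad
\nabla \hat{f}_t(x) = \frac{d}{\delta}\, \mathbb{E}_{u \in \mathbb{S}}\bigl[ f_t(x + \delta u)\, u \bigr],
\qquad
g_t = \frac{d}{\delta}\, f_t(x_t + \delta u_t)\, u_t .
```

Under these assumptions $g_t$ is an unbiased estimate of the gradient of the smoothed loss $\hat{f}_t$ at $x_t$, obtained from one function evaluation, and it can be plugged into a gradient-descent update in place of the true gradient.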
Online Hyperparameter Learning for Auto-Augmentation (OHL-Auto-Aug) 
Data augmentation is critical to the success of modern deep learning techniques. In this paper, we propose Online Hyperparameter Learning for Auto-Augmentation (OHL-Auto-Aug), an economical solution that learns the augmentation policy distribution along with network training. Unlike previous methods on auto-augmentation that search augmentation strategies in an offline manner, our method formulates the augmentation policy as a parameterized probability distribution, thus allowing its parameters to be optimized jointly with network parameters. Our proposed OHL-Auto-Aug eliminates the need for re-training and dramatically reduces the cost of the overall search process, while establishing significant accuracy improvements over baseline models. On both CIFAR-10 and ImageNet, our method is remarkably efficient in search, running 60x faster on CIFAR-10 and 24x faster on ImageNet, while maintaining competitive accuracies. 
Online ICA  Solving statistical learning problems often involves nonconvex optimization. Despite the empirical success of nonconvex statistical optimization methods, their global dynamics, especially convergence to the desirable local minima, remain less well understood in theory. In this paper, we propose a new analytic paradigm based on diffusion processes to characterize the global dynamics of nonconvex statistical optimization. As a concrete example, we study stochastic gradient descent (SGD) for the tensor decomposition formulation of independent component analysis. In particular, we cast different phases of SGD into diffusion processes, i.e., solutions to stochastic differential equations. Initialized from an unstable equilibrium, the global dynamics of SGD transit over three consecutive phases: (i) an unstable Ornstein-Uhlenbeck process slowly departing from the initialization, (ii) the solution to an ordinary differential equation, which quickly evolves towards the desirable local minimum, and (iii) a stable Ornstein-Uhlenbeck process oscillating around the desirable local minimum. Our proof techniques are based upon Stroock and Varadhan’s weak convergence of Markov chains to diffusion processes, which are of independent interest. 
Online Machine Learning  Online machine learning is a model of induction that learns one instance at a time. The goal in online learning is to predict labels for instances. For example, the instances could describe the current conditions of the stock market, and an online algorithm predicts tomorrow’s value of a particular stock. The key defining characteristic of online learning is that soon after the prediction is made, the true label of the instance is discovered. This information can then be used to refine the prediction hypothesis used by the algorithm. The goal of the algorithm is to make predictions that are close to the true labels. 
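The predict, observe, refine loop described above can be illustrated with a perceptron, chosen here only as a familiar example of an online learner and not named in the entry. Labels are assumed to be in {-1, +1}, and the function name `online_perceptron` is hypothetical.

```python
def online_perceptron(stream, dim):
    """Process (x, y) pairs one at a time; y must be -1 or +1."""
    w = [0.0] * dim
    mistakes = 0
    for x, y in stream:
        # 1. Predict a label for the instance using the current hypothesis.
        score = sum(wi * xi for wi, xi in zip(w, x))
        y_hat = 1 if score >= 0 else -1
        # 2. The true label y is revealed soon after the prediction.
        if y_hat != y:
            # 3. Refine the hypothesis only when the prediction was wrong.
            mistakes += 1
            w = [wi + y * xi for wi, xi in zip(w, x)]
    return w, mistakes


if __name__ == "__main__":
    examples = [((1, 1), 1), ((-1, -1), -1), ((2, 1), 1), ((-2, -1), -1)]
    print(online_perceptron(examples, dim=2))
```

The running count of mistakes is the natural performance measure in this setting, since every prediction is scored against a label that arrives immediately afterwards.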
Online Matching Problem  The online matching problem was introduced by Karp, Vazirani and Vazirani nearly three decades ago. In that seminal work, they studied this problem in bipartite graphs with vertices arriving only on one side, and presented optimal deterministic and randomized algorithms for this setting. In comparison, more general arrival models, such as edge arrivals and general vertex arrivals, have proven more challenging and positive results are known only for various relaxations of the problem. Online Matching with General Arrivals 
Online Maximum a Posterior Estimation (OPE) 
One of the core problems in statistical models is the estimation of a posterior distribution. For topic models, the problem of posterior inference for individual texts is particularly important, especially when dealing with data streams, but is often intractable in the worst case. As a consequence, existing methods for posterior inference are approximate and do not have any guarantee on either quality or convergence rate. In this paper, we introduce a provably fast algorithm, namely Online Maximum a Posterior Estimation (OPE), for posterior inference in topic models. OPE has more attractive properties than existing inference approaches, including theoretical guarantees on quality and a fast convergence rate. The discussion of OPE is very general, and hence OPE can be easily employed in a wide class of probabilistic models. Finally, we employ OPE to design three novel methods for learning Latent Dirichlet Allocation from text streams or large corpora. Extensive experiments demonstrate some superior behaviors of OPE and of our new learning methods. 
Online Meta-Learning  A central capability of intelligent systems is the ability to continuously build upon previous experiences to speed up and enhance learning of new tasks. Two distinct research paradigms have studied this question. Meta-learning views this problem as learning a prior over model parameters that is amenable to fast adaptation on a new task, but typically assumes the set of tasks is available together as a batch. In contrast, online (regret-based) learning considers a sequential setting in which problems are revealed one after the other, but conventionally trains only a single model without any task-specific adaptation. This work introduces an online meta-learning setting, which merges ideas from both the aforementioned paradigms to better capture the spirit and practice of continual lifelong learning. We propose the follow the meta leader algorithm, which extends the MAML algorithm to this setting. Theoretically, this work provides an $\mathcal{O}(\log T)$ regret guarantee with only one additional higher-order smoothness assumption in comparison to the standard online setting. Our experimental evaluation on three different large-scale tasks suggests that the proposed algorithm significantly outperforms alternatives based on traditional online learning approaches. 
Online Mirror Descent  
Online Multi-Armed Bandit  We introduce a novel variant of the multi-armed bandit problem, in which bandits are streamed one at a time to the player, and at each point, the player can either choose to pull the current bandit or move on to the next bandit. Once a player has moved on from a bandit, they may never visit it again, which is a crucial difference between our problem and classic multi-armed bandit problems. In this online context, we study Bernoulli bandits (bandits with payout Ber($p_i$) for some underlying mean $p_i$) with underlying means drawn i.i.d. from various distributions, including the uniform distribution, and in general, all distributions that have a CDF satisfying certain differentiability conditions near zero. In all cases, we suggest several strategies and investigate their expected performance. Furthermore, we bound the performance of any optimal strategy and show that the strategies we have suggested are indeed optimal up to a constant factor. We also investigate the case where the distribution from which the underlying means are drawn is not known ahead of time. Again, we are able to suggest algorithms that are optimal up to a constant factor for this case, given certain mild conditions on the universe of distributions. 
Online Multiple Kernel Classification (OMKC) 
Online learning and kernel learning are two active research topics in machine learning. Although each of them has been studied extensively, there has been limited effort in addressing their intersection. In this paper, we introduce a new research problem, termed Online Multiple Kernel Learning (OMKL), that aims to learn a kernel-based prediction function from a pool of predefined kernels in an online learning fashion. OMKL is generally more challenging than typical online learning because both the kernel classifiers and their linear combination weights must be learned simultaneously. In this work, we consider two setups for OMKL, i.e. combining binary predictions or real-valued outputs from multiple kernel classifiers, and we propose both deterministic and stochastic approaches in the two setups for OMKL. The deterministic approach updates all kernel classifiers for every misclassified example, while the stochastic approach randomly chooses one or more classifiers for updating according to some sampling strategies. Mistake bounds are derived for all the proposed OMKL algorithms. 
Online Network Optimization (ONO) 
Future 5G wireless networks will rely on agile and automated network management, where the usage of diverse resources must be jointly optimized with surgical accuracy. A number of key wireless network functionalities (e.g., traffic steering, energy savings) give rise to hard optimization problems. What is more, high spatio-temporal traffic variability, coupled with the need to satisfy strict per slice/service SLAs in modern networks, suggests that these problems must be constantly (re)solved to maintain close-to-optimal performance. To this end, in this paper we propose the framework of Online Network Optimization (ONO), which seeks to maintain both agile and efficient control over time, using an arsenal of data-driven, adaptive, and AI-based techniques. Since the mathematical tools and the studied regimes vary widely among these methodologies, a theoretical comparison is often out of reach. Therefore, the important question ‘what is the right ONO technique?’ remains open to date. In this paper, we discuss the pros and cons of each technique and further attempt a direct quantitative comparison for a specific use case, using real data. Our results suggest that carefully combining the insights of problem modeling with state-of-the-art AI techniques provides significant advantages at reasonable complexity. 
Online Portfolio Selection (OLPS) 
Online portfolio selection, which sequentially selects a portfolio over a set of assets in order to achieve certain targets, is a natural and important task for asset portfolio management. Aiming to maximize the cumulative wealth, several categories of algorithms have been proposed to solve this task. The first category, Follow-the-Winner, tries to asymptotically achieve the same growth rate (expected log return) as that of an optimal strategy, and is often based on Capital Growth Theory (CGT). The second category, Follow-the-Loser, transfers wealth from winning assets to losers, which seems to contradict common sense but empirically often achieves significantly better performance. Finally, the third category, Pattern-Matching-based approaches, tries to predict the next market distribution based on a sample of historical data and explicitly optimizes the portfolio based on the sampled distribution. Although these three categories each focus on a single strategy (class), other approaches, the Meta-Learning Algorithms (MLAs), focus on combining multiple strategies (classes). Book: Online Portfolio Selection
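The objective these algorithms share, cumulative wealth, can be sketched as follows. The sketch assumes the common convention that each period is described by a vector of price relatives (closing price divided by the previous closing price) and that a portfolio is a vector of non-negative weights summing to one. The function names and the uniform constant rebalanced baseline are illustrative choices, not part of the source.

```python
def cumulative_wealth(portfolios, price_relatives, initial=1.0):
    """Multiply wealth by the portfolio's return in every period.

    portfolios[t][i]      : fraction of wealth held in asset i during period t
    price_relatives[t][i] : price of asset i at the end of period t divided by
                            its price at the start of period t
    """
    wealth = initial
    for b, x in zip(portfolios, price_relatives):
        wealth *= sum(bi * xi for bi, xi in zip(b, x))
    return wealth


def uniform_crp(n_assets, n_periods):
    """Baseline: rebalance to equal weights at the start of every period."""
    return [[1.0 / n_assets] * n_assets for _ in range(n_periods)]
```

For instance, with one asset that stays flat and one that alternately doubles and halves, holding either asset alone ends each two-period cycle where it started, while the uniform rebalanced portfolio multiplies wealth by 1.5 and then by 0.75 over the same cycle.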
Online Principal Component Analysis (oPCA) 
In the online setting of the well-known Principal Component Analysis (PCA) problem, the vectors x_t are presented to the algorithm one by one. onlinePCA
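One classical way to process vectors one at a time is Oja's rule, which nudges a unit-length direction estimate towards each new vector. The sketch below is an illustration of that general idea only; it is not the algorithm of any particular package, and the learning rate, the synthetic data stream, and the function names are assumptions.

```python
import math
import random


def oja_update(w, x, lr=0.01):
    """One step of Oja's rule followed by renormalisation to unit length."""
    y = sum(wi * xi for wi, xi in zip(w, x))            # projection of x onto w
    w = [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [wi / norm for wi in w]


def synthetic_stream(n, seed=0):
    """Hypothetical 2-D data whose variance is concentrated on the first axis."""
    rng = random.Random(seed)
    for _ in range(n):
        yield [rng.gauss(0, 1), 0.1 * rng.gauss(0, 1)]


def leading_direction(stream, w0, lr=0.01):
    """Consume vectors one at a time and return the final direction estimate."""
    w = list(w0)
    for x in stream:
        w = oja_update(w, x, lr)
    return w
```

Only the current estimate is stored, so memory use does not grow with the number of vectors seen.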
Online Reputation Monitoring (ORM) 
Online Reputation Monitoring (ORM) is concerned with the use of computational tools to measure the reputation of entities online, such as politicians or companies. 
Online Resource Scheduling Algorithm (DeepRM2) 
With the rapid development of deep learning, deep reinforcement learning (DRL) has begun to appear in the field of resource scheduling in recent years. Based on previous research on DRL in the literature, we introduce the online resource scheduling algorithm DeepRM2 and the offline resource scheduling algorithm DeepRM_Off. Compared with the state-of-the-art DRL algorithm DeepRM and heuristic algorithms, our proposed algorithms have faster convergence speed and better scheduling efficiency with regard to average slowdown time, job completion time and rewards.
Online Soft Mining (OSM) 
Deep metric learning aims to learn a deep embedding that can capture the semantic similarity of data points. Given the availability of massive training samples, deep metric learning is known to suffer from slow convergence due to a large fraction of trivial samples. Therefore, most existing methods resort to sample mining strategies for selecting non-trivial samples to accelerate convergence and improve performance. In this work, we identify two critical limitations of the sample mining methods, and provide solutions for both of them. First, previous mining methods assign one binary score to each sample, i.e., dropping or keeping it, so they only select a subset of relevant samples in a mini-batch. Therefore, we propose a novel sample mining method, called Online Soft Mining (OSM), which assigns one continuous score to each sample to make use of all samples in the mini-batch. OSM learns extended manifolds that preserve useful intra-class variances by focusing on more similar positives. Second, the existing methods are easily influenced by outliers, as these are generally included in the mined subset. To address this, we introduce Class-Aware Attention (CAA), which assigns little attention to abnormal data samples. Furthermore, by combining OSM and CAA, we propose a novel weighted contrastive loss to learn discriminative embeddings. Extensive experiments on two fine-grained visual categorisation datasets and two video-based person re-identification benchmarks show that our method significantly outperforms the state of the art.
Online Transactional Processing (OLTP) 
Online transaction processing, or OLTP, is a class of information systems that facilitate and manage transaction-oriented applications, typically for data entry and retrieval transaction processing. The term is somewhat ambiguous; some understand a “transaction” in the context of computer or database transactions, while others (such as the Transaction Processing Performance Council) define it in terms of business or commercial transactions. OLTP has also been used to refer to processing in which the system responds immediately to user requests. An automated teller machine (ATM) for a bank is an example of a commercial transaction processing application. Online transaction processing applications are high-throughput and insert- or update-intensive in database management. These applications are used concurrently by hundreds of users. The key goals of OLTP applications are availability, speed, concurrency, and recoverability. Reduced paper trails and faster, more accurate forecasts for revenues and expenses are both examples of how OLTP makes things simpler for businesses. However, like many modern online information technology solutions, some systems require offline maintenance, which further affects the cost-benefit analysis of an online transaction processing system.
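To make the database sense of "transaction" concrete, the sketch below uses Python's built-in sqlite3 module to run a small funds transfer as one atomic unit: either both balance updates are committed or neither is. The `accounts` table, its columns, and the function names are hypothetical and chosen only for illustration.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return False if it is rejected."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")  # triggers the rollback
        return True
    except ValueError:
        return False
```

A rejected transfer leaves both balances exactly as they were, which is the recoverability property listed among the goals of OLTP applications.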
Online Variance Reduction with Mixtures (VRM) 
Adaptive importance sampling for stochastic optimization is a promising approach that offers improved convergence through variance reduction. In this work, we propose a new framework for variance reduction that enables the use of mixtures over predefined sampling distributions, which can naturally encode prior knowledge about the data. While these sampling distributions are fixed, the mixture weights are adapted during the optimization process. We propose VRM, a novel and efficient adaptive scheme that asymptotically recovers the best mixture weights in hindsight and can also accommodate sampling distributions over sets of points. We empirically demonstrate the versatility of VRM in a range of applications. 
onlineSPARC  Recent progress in logic programming (e.g., the development of the Answer Set Programming paradigm) has made it possible to teach it to general undergraduate and even middle/high school students. Given the limited exposure of these students to computer science, the complexity of downloading, installing and using tools for writing logic programs could be a major barrier to logic programming reaching a much wider audience. We developed onlineSPARC, an online answer set programming environment with a self-contained file system and a simple interface. It allows users to type/edit logic programs and perform several tasks over programs, including asking a query to a program, getting the answer sets of a program, and producing a drawing/animation based on the answer sets of a program.
Ontology  In computer science and information science, an ontology is a formal naming and definition of the types, properties, and interrelationships of the entities that really or fundamentally exist for a particular domain of discourse. An ontology compartmentalizes the variables needed for some set of computations and establishes the relationships between them. The fields of artificial intelligence, the Semantic Web, systems engineering, software engineering, biomedical informatics, library science, enterprise bookmarking, and information architecture all create ontologies to limit complexity and to organize information. The ontology can then be applied to problem solving. 
Ontology Based Data Access (OBDA) 
Ontology-based data access is concerned with querying incomplete data sources in the presence of domain-specific knowledge provided by an ontology. A central notion in this setting is that of an ontology-mediated query, which is a database query coupled with an ontology. Ontology-Based Data Access: A Study through Disjunctive Datalog, CSP, and MMSNP
Ontology Classification  Ontology classification, the computation of the subsumption hierarchies for classes and properties, is a core reasoning service provided by all OWL (Web Ontology Language) reasoners known to us. A popular algorithm for computing the class hierarchy is the so-called Enhanced Traversal (ET) algorithm.
Ontology Engineering  Ontology engineering in computer science and information science is a field which studies the methods and methodologies for building ontologies: formal representations of a set of concepts within a domain and the relationships between those concepts. A large-scale representation of abstract concepts such as actions, time, physical objects and beliefs would be an example of ontological engineering.
Ontology Generative Adversarial Network  The recent success of Generative Adversarial Networks (GAN) is a result of their ability to generate high quality images from a latent vector space. An important application is the generation of images from a text description, where the text description is encoded and further used in the conditioning of the generated image. Thus the generative network has to additionally learn a mapping from the text latent vector space to a highly complex and multimodal image data distribution, which makes the training of such models challenging. To handle the complexities of fashion images and metadata, we propose Ontology Generative Adversarial Networks (OGANs) for fashion image synthesis that is conditioned on a hierarchical fashion ontology in order to improve the image generation fidelity. We show that the incorporation of the ontology leads to better image quality as measured by Fréchet Inception Distance and Inception Score. Additionally, we show that the OGAN achieves better conditioning results evaluated by implicit similarity between the text and the generated image.
Ontology Learning  Manual construction of ontologies for the Semantic Web is a time-consuming task. In order to help humans, the ontology learning field tries to automate the construction of new ontologies. The amount of data generated by the success of the Internet demands methodologies and tools to automatically extract unknown and potentially useful knowledge from it, generating structured representations of that knowledge. Although ontological engineering tools have matured over the last decade, manual ontology acquisition remains a tedious, time-consuming, error-prone, and complex task that can easily result in a knowledge acquisition bottleneck. Besides, as information needs grow, the available ontologies need to be updated and enriched with new content. Research in the ontology learning field has made possible the development of several approaches that allow the partial automation of the ontology construction process, with the aim of reducing the time and effort it requires. Some methods and tools have been proposed in recent years to speed up the ontology building process, using different sources and several techniques. Computational linguistics techniques, information extraction, statistics, and machine learning are the most prominent paradigms applied until now. There is also a great variety of information sources used for ontology learning. Though Web pages, dictionaries, knowledge bases, semi-structured and structured sources can be used to learn an ontology, most of the methods only use textual sources for the learning process. All methods and tools have a strong relationship to the type of processing performed.
In summary, the ontology learning field brings together a number of research activities, which focus on different types of knowledge and information sources, but share the target of a common domain conceptualisation. Ontology learning is a complex multidisciplinary field that uses natural language processing, text and web data extraction, machine learning and ontology engineering.
Ontology-Based Global and Collective Motion Pattern (On_GCMP) 
In multi-person videos, especially team sport videos, a semantic event is usually represented as a confrontation between two teams of players, which can be represented as collective motion. In broadcast basketball videos, specific camera motions are used to present specific events. Therefore, a semantic event in broadcast basketball videos is closely related to both the global motion (camera motion) and the collective motion. A semantic event in basketball videos can be generally divided into three stages: pre-event, event occurrence (event-occ), and post-event. In this paper, we propose an ontology-based global and collective motion pattern (On_GCMP) algorithm for basketball event classification. First, a two-stage GCMP-based event classification scheme is proposed. The GCMP is extracted using optical flow. The two-stage scheme progressively combines a five-class event classification algorithm on event-occs and a two-class event classification algorithm on pre-events. Both algorithms utilize sequential convolutional neural networks (CNNs) and long short-term memory (LSTM) networks to extract the spatial and temporal features of GCMP for event classification. Second, we utilize post-event segments to predict success/failure using algorithms based on deep features of images in the video frames (RGB_DF_VF). Finally, the event classification results and success/failure classification results are integrated to obtain the final results. To evaluate the proposed scheme, we collected a new dataset called NCAA+, which is automatically obtained from the NCAA dataset by extending the fixed length of video clips forward and backward of the corresponding semantic events. The experimental results demonstrate that the proposed scheme achieves a mean average precision of 59.22% on NCAA+. This is 7.62% higher than the state of the art on NCAA.
OntoSeg  Text segmentation (TS) aims at dividing long text into coherent segments which reflect the subtopic structure of the text. It is beneficial to many natural language processing tasks, such as Information Retrieval (IR) and document summarisation. Current approaches to text segmentation are similar in that they all use word-frequency metrics to measure the similarity between two regions of text, so that a document is segmented based on the lexical cohesion between its words. Various NLP tasks are now moving towards the semantic web and ontologies, such as ontology-based IR systems, to capture the conceptualizations associated with user needs and contents. Text segmentation based on lexical cohesion between words is hence no longer sufficient for such tasks. This paper proposes OntoSeg, a novel approach to text segmentation based on the ontological similarity between text blocks. The proposed method uses ontological similarity to explore conceptual relations between text segments and a Hierarchical Agglomerative Clustering (HAC) algorithm to represent the text as a tree-like hierarchy that is conceptually structured. The rich structure of the created tree further allows the segmentation of text in a linear fashion at various levels of granularity. The proposed method was evaluated on a well-known dataset, and the results show that using ontological similarity in text segmentation is very promising. We also enhance the proposed method by combining ontological similarity with lexical similarity, and the results show an improvement in segmentation quality.
Open Category Detection  Open category detection is the problem of detecting ‘alien’ test instances that belong to categories or classes that were not present in the training data. In many applications, reliably detecting such aliens is central to ensuring the safety and accuracy of test set predictions. Unfortunately, there are no algorithms that provide theoretical guarantees on their ability to detect aliens under general assumptions. Further, while there are algorithms for open category detection, there are few empirical results that directly report alien detection rates. 
Open Data  Open data is the idea that certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control. The goals of the open data movement are similar to those of other “Open” movements such as open source, open hardware, open content, and open access. The philosophy behind open data has been long established (for example in the Mertonian tradition of science), but the term “open data” itself is recent, gaining popularity with the rise of the Internet and World Wide Web and, especially, with the launch of open-data government initiatives such as Data.gov and Data.gov.uk.
Open Data Center Alliance (ODCA) 
The Open Data Center Alliance is focused on the widespread adoption of enterprise cloud computing through best practice sharing and collaboration with the industry on availability of solution choice based on ODCA requirements. From its inception to today, the Alliance has seen a maturation of the cloud market place. To meet this new stage of enterprise cloud readiness, the ODCA has announced a new organizational charter. This new charter has driven the creation of the ODCA Cloud Expert Network and workgroups to deliver work focused on this charter. 
Open Data Platform (ODP) 
The Open Data Platform Initiative (ODP) is a shared industry effort focused on promoting and advancing the state of Apache Hadoop and Big Data technologies for the enterprise, enabling Big Data solutions to flourish atop a common core platform. The current ecosystem is challenged and slowed by fragmented and duplicated efforts. The ODP Core will take the guesswork out of the process and accelerate many use cases by running on a common platform, freeing up enterprises and ecosystem vendors to focus on building business-driven applications.
Open Domain INformer (ODIN) 
Rule-based information extraction (IE) has long enjoyed wide adoption throughout industry, though it has remained largely ignored in academia, in favor of machine learning (ML) methods (Chiticariu et al., 2013). However, rule-based systems have several advantages over pure ML systems, including: (a) the rules are interpretable and thus suitable for rapid development and/or domain transfer; and (b) humans and machines can contribute to the same model. Why then have such systems failed to hold the attention of the academic community? One argument raised by Chiticariu et al. is that, despite notable previous efforts (Appelt and Onyshkevych, 1998; Levy and Andrew, 2006; Hunter et al., 2008; Cunningham et al., 2011; Chang and Manning, 2014), there is not a standard language for this task, or a ‘standard way to express rules’, which raises the entry cost for new rule-based systems. ODIN aims to address these issues with a new language and framework. We follow the simplicity principles promoted by other natural language processing toolkits, such as Stanford’s CoreNLP, which aim to ‘avoid overdesign’, ‘do one thing well’, and have a user ‘up and running in ten minutes or less’ (Manning et al., 2014).
Open Information Extraction Corpus (OPIEC) 
Open information extraction (OIE) systems extract relations and their arguments from natural language text in an unsupervised manner. The resulting extractions are a valuable resource for downstream tasks such as knowledge base construction, open question answering, or event schema induction. In this paper, we release, describe, and analyze an OIE corpus called OPIEC, which was extracted from the text of English Wikipedia. OPIEC complements the available OIE resources: It is the largest OIE corpus publicly available to date (over 340M triples) and contains valuable metadata such as provenance information, confidence scores, linguistic annotations, and semantic annotations including spatial and temporal information. We analyze the OPIEC corpus by comparing its content with knowledge bases such as DBpedia or YAGO, which are also based on Wikipedia. We found that most of the facts between entities present in OPIEC cannot be found in DBpedia and/or YAGO, that OIE facts often differ in the level of specificity compared to knowledge base facts, and that OIE open relations are generally highly polysemous. We believe that the OPIEC corpus is a valuable resource for future research on automated knowledge base construction. 
Open Intent Discovery  Detecting and identifying user intent from text, both written and spoken, plays an important role in modelling and understanding dialogs. Existing research on intent discovery models it as a classification task with a predefined set of known categories. To generalize beyond these preexisting classes, we define a new task of open intent discovery. We investigate how intent can be generalized to those not seen during training. To this end, we propose a two-stage approach to this task: predicting whether an utterance contains an intent, and then tagging the intent in the input utterance. Our model consists of a bidirectional LSTM with a CRF on top to capture contextual semantics, subject to some constraints. Self-attention is used to learn long-distance dependencies. Further, we adapt an adversarial training approach to improve robustness and performance across domains. We also present a dataset of 25k real-life utterances that have been labelled via crowdsourcing. Our experiments across different domains and real-world datasets show the effectiveness of our approach, with less than 100 annotated examples needed per unique domain to recognize diverse intents. The approach outperforms state-of-the-art baselines by 5-15% F1 score points.
Open Neural Network Exchange (ONNX) 
ONNX is an open format to represent deep learning models. With ONNX, AI developers can more easily move models between state-of-the-art tools and choose the combination that is best for them. ONNX is developed and supported by a community of partners.
Open Set Recognition (OSR) 
In real-world recognition/classification tasks, limited by various objective factors, it is usually difficult to collect training samples that exhaust all classes when training a recognizer or classifier. A more realistic scenario is open set recognition (OSR), where incomplete knowledge of the world exists at training time, and unknown classes can be submitted to an algorithm during testing, requiring the classifiers not only to accurately classify the seen classes, but also to effectively deal with the unseen ones. This paper provides a comprehensive survey of existing open set recognition techniques, covering aspects ranging from related definitions and representations of models to datasets, experiment setup and evaluation metrics. Furthermore, we briefly analyze the relationships between OSR and its related tasks, including zero-shot and one-shot (few-shot) recognition/learning techniques, classification with reject option, and so forth. Additionally, we give an overview of open world recognition, which can be seen as a natural extension of OSR. Importantly, we highlight the limitations of existing approaches and point out some promising subsequent research directions in this field.
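The simplest relative of OSR mentioned above, classification with a reject option, can be sketched with a nearest-centroid classifier that refuses to label inputs lying too far from every known class. This is a toy illustration rather than a technique taken from the survey; the `UNKNOWN` label, the distance threshold, and the function names are assumptions.

```python
import math

UNKNOWN = "unknown"  # hypothetical label returned for rejected inputs


def fit_centroids(samples_by_class):
    """Average the training vectors of each known class."""
    return {
        label: [sum(col) / len(vectors) for col in zip(*vectors)]
        for label, vectors in samples_by_class.items()
    }


def predict_open_set(centroids, x, threshold):
    """Return the nearest known class, or UNKNOWN if it is farther than threshold."""
    label, distance = min(
        ((lbl, math.dist(c, x)) for lbl, c in centroids.items()),
        key=lambda pair: pair[1],
    )
    return label if distance <= threshold else UNKNOWN
```

A closed-set classifier would always return one of the training labels; the threshold is what lets the model answer "none of the classes I have seen".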
Open Speech and Music Interpretation by Large Space Extraction (openSMILE) 
The openSMILE feature extraction tool enables you to extract large audio feature spaces, and apply machine learning methods to classify and analyze your data in real time. It combines features from Music Information Retrieval and Speech Processing. SMILE is an acronym for Speech & Music Interpretation by Large-space Extraction. It is written in C++ and is available as both a standalone command-line executable and a dynamic library. The main features of openSMILE are its capability of online incremental processing and its modularity. Feature extractor components can be freely interconnected to create new and custom features, all via a simple configuration file. New components can be added to openSMILE via an easy binary plugin interface and a comprehensive API. http://…tureextractoratutorialforversion21 http://…/citation.cfm?id=1874246
Open Web Analytics (OWA) 
Open Web Analytics (OWA) is open source web analytics software that you can use to track and analyze how people use your websites and applications. OWA is licensed under GPL and provides website owners and developers with easy ways to add web analytics to their sites using simple JavaScript, PHP, or REST-based APIs. OWA also comes with built-in support for tracking websites made with popular content management frameworks such as WordPress and MediaWiki.
OpenAI  OpenAI is a nonprofit artificial intelligence research company. Our goal is to advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return. In the short term, we’re building on recent advances in AI research and working towards the next set of breakthroughs. 
OpenAI Gym  OpenAI Gym is a toolkit for reinforcement learning research. It includes a growing collection of benchmark problems that expose a common interface, and a website where people can share their results and compare the performance of algorithms. This whitepaper discusses the components of OpenAI Gym and the design decisions that went into the software. 
OpenBLAS  OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version. 
OpenCL  OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs), field-programmable gate arrays (FPGAs) and other processors or hardware accelerators. OpenCL specifies programming languages (based on C99 and C++11) for programming these devices and application programming interfaces (APIs) to control the platform and execute programs on the compute devices. OpenCL provides a standard interface for parallel computing using task- and data-based parallelism. OpenCL Performance Prediction using Architecture-Independent Features
OpenCLIPER  Medical image processing is often limited by the computational cost of the involved algorithms. Whereas dedicated computing devices (GPUs in particular) exist and do provide significant efficiency boosts, they have an extra cost of use in terms of housekeeping tasks (device selection and initialization, data streaming, synchronization with the CPU and others), which may hinder developers from using them. This paper describes an OpenCL-based framework that is capable of handling dedicated computing devices seamlessly and that allows the developer to concentrate on image processing tasks. The framework automatically handles device discovery and initialization, data transfers to and from the device and the file system, and kernel loading and compiling. Data structures need to be defined only once, independently of the computing device; consequently, the code is the same for every device, including the host CPU. Pinned memory/buffer mapping is used to achieve maximum performance in data transfers. Code fragments included in the paper show how the computing device is almost immediately and effortlessly available to the user's algorithms, so they can focus on productive work. Code required for device selection and initialization, data loading and streaming and kernel compilation is minimal and systematic. Algorithms can be thought of as mathematical operators (called processes), with input, output and parameters, and they may be chained one after another easily and efficiently. Also for efficiency, processes can have their initialization work split from their core workload, so process chains and loops do not incur performance penalties. Algorithm code is independent of the device type targeted.
openCV  OpenCV (Open Source Computer Vision) is a library of programming functions mainly aimed at real-time computer vision. It is free for use under the open source BSD license. The library is cross-platform. It focuses mainly on real-time image processing. OpenCV
Open-Domain Spoken Question Answering Dataset (ODSQA) 
Reading comprehension by machine has been widely studied, but machine comprehension of spoken content is still a less investigated problem. In this paper, we release the Open-Domain Spoken Question Answering Dataset (ODSQA) with more than three thousand questions. To the best of our knowledge, this is the largest real SQA dataset. On this dataset, we found that ASR errors have a catastrophic impact on SQA. To mitigate the effect of ASR errors, subword units are used, which brings consistent improvements over all the models. We further found that data augmentation on text-based QA training examples can improve SQA.
OpenFace  OpenFace is a Python and Torch implementation of face recognition with deep neural networks and is based on the CVPR 2015 paper FaceNet: A Unified Embedding for Face Recognition and Clustering by Florian Schroff, Dmitry Kalenichenko, and James Philbin at Google. Torch allows the network to be executed on a CPU or with CUDA. 
OpenGIS Simple Features Reference Implementation (OGR) 
OGR used to stand for OpenGIS Simple Features Reference Implementation. However, since OGR is not fully compliant with the OpenGIS Simple Feature specification and is not approved as a reference implementation of the spec the name was changed to OGR Simple Features Library. The only meaning of OGR in this name is historical. OGR is also the prefix used everywhere in the source of the library for class names, filenames, etc. 
OpenHowNet  In this paper, we present an open sememe-based lexical knowledge base, OpenHowNet. Based on the well-known HowNet, OpenHowNet comprises three components: core data, which is composed of more than 100 thousand senses annotated with sememes; OpenHowNet Web, which gives a brief introduction to OpenHowNet and provides an online exhibition of OpenHowNet information; and OpenHowNet API, which includes several useful APIs for tasks such as accessing OpenHowNet core data and drawing sememe tree structures of senses. In the main text, we first give some background, including the definition of sememe and details of HowNet. We then introduce some previous HowNet and sememe-based research works. Last but not least, we detail the constituents of OpenHowNet and their basic features and functionalities. Additionally, we give a brief summary and list some future work.
OpenKiwi  We introduce OpenKiwi, a PyTorch-based open source framework for translation quality estimation. OpenKiwi supports training and testing of word-level and sentence-level quality estimation systems, implementing the winning systems of the WMT 2015-18 quality estimation campaigns. We benchmark OpenKiwi on two datasets from WMT 2018 (English-German SMT and NMT), yielding state-of-the-art performance on the word-level tasks and near state-of-the-art in the sentence-level tasks.
OpenLava  OpenLava is an open source workload job scheduling software for a cluster of computers. OpenLava was derived from an early version of Platform LSF. Its configuration file syntax, API, and CLI have been kept unchanged. Therefore OpenLava is mostly compatible with Platform LSF. OpenLava was based on the Utopia research project at the University of Toronto. OpenLava is licensed under GNU General Public License v2. http://www.openlava.org 
Open-Loop Optimistic Planning (OLOP) 
We consider the problem of planning in a stochastic and discounted environment with a limited numerical budget. More precisely, we investigate strategies exploring the set of possible sequences of actions, so that, once all available numerical resources (e.g. CPU time, number of calls to a generative model) have been used, one returns a recommendation on the best possible immediate action to follow based on this exploration. The performance of a strategy is assessed in terms of its simple regret, that is, the loss in performance resulting from choosing the recommended action instead of an optimal one. We first provide a minimax lower bound for this problem, and show that a uniform planning strategy matches this minimax rate (up to a logarithmic factor). Then we propose a UCB (Upper Confidence Bounds)-based planning algorithm, called OLOP (Open-Loop Optimistic Planning), which is also minimax optimal, and prove that it enjoys much faster rates when there is a small proportion of near-optimal sequences of actions. Finally, we compare our results with the regret bounds one can derive for our setting with bandit algorithms designed for an infinite number of arms. Practical Open-Loop Optimistic Planning
OpenMarkov  OpenMarkov is a software tool for probabilistic graphical models (PGMs) developed by the Research Centre for Intelligent Decision-Support Systems of the UNED in Madrid, Spain. It has been designed for: · editing and evaluating several types of PGMs, such as Bayesian networks, influence diagrams, factored Markov models, etc.; · learning Bayesian networks from data interactively; · cost-effectiveness analysis. ➘ “Probabilistic Graphical Model” 
OPENMENDEL  Statistical methods for genome-wide association studies (GWAS) continue to improve. However, the increasing volume and variety of genetic and genomic data make computational speed and ease of data manipulation mandatory in future software. In our view, a collaborative effort of statistical geneticists is required to develop open source software targeted to genetic epidemiology. Our attempt to meet this need is called the OPENMENDEL project (https://openmendel.github.io). It aims to (1) enable interactive and reproducible analyses with informative intermediate results, (2) scale to big data analytics, (3) embrace parallel and distributed computing, (4) adapt to rapid hardware evolution, (5) allow cloud computing, (6) allow integration of varied genetic data types, and (7) foster easy communication between clinicians, geneticists, statisticians, and computer scientists. This article reviews and makes recommendations to the genetic epidemiology community in the context of the OPENMENDEL project. 
OpenMx  OpenMx is an open source program for extended structural equation modeling. It runs as a package under R and is cross-platform, running under Linux, Mac OS and Windows. OpenMx consists of an R library of functions and optimizers supporting the rapid and flexible implementation and estimation of SEM models. Models can be estimated based on either raw data (with FIML modelling) or on correlation or covariance matrices. Models can handle mixtures of continuous and ordinal data. The current version is OpenMx 2, and is available on CRAN. Path analysis, confirmatory factor analysis, latent growth modeling and mediation analysis are all implemented. Multiple group models are implemented readily. When a model is run, it returns a model, and models can be updated (adding and removing paths, adding constraints and equalities; giving parameters the same label equates them). An innovation is that labels can consist of the address of other parameters, allowing easy implementation of constraints on parameters by address. RAM models return standardized and raw estimates, as well as a range of fit indices (AIC, RMSEA, TLI, CFI etc.). Confidence intervals are estimated robustly. The program has parallel processing built in via links to parallel environments in R, and in general takes advantage of the R programming environment. Users can expand the package with functions. These have been used, for instance, to implement modification indices. Models can be written in either a ‘pathic’ or ‘matrix’ form. For those who think in terms of path models, paths are specified using mxPath(). For models that are better suited to description in terms of matrix algebra, this is done using similar functional extensions in the R environment, for instance mxMatrix and mxAlgebra. OpenMx,ifaTools 
OpenNMT  OpenNMT is an open-source toolkit for neural machine translation (NMT). The system prioritizes efficiency, modularity, and extensibility with the goal of supporting NMT research into model architectures, feature representations, and source modalities, while maintaining competitive performance and reasonable training requirements. The toolkit consists of modeling and translation support, as well as detailed pedagogical documentation about the underlying techniques. OpenNMT has been used in several production MT systems, modified for numerous research papers, and is implemented across several deep learning frameworks. 
OpenRefine  OpenRefine (formerly Google Refine) is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; extending it with web services; and linking it to databases like Freebase. Please note that since October 2nd, 2012, Google is not actively supporting this project, which has now been rebranded to OpenRefine. Project development, documentation and promotion are now fully supported by volunteers. Find out more about the history of OpenRefine and how you can help the community. rrefine 
Open-Source Toolkit for Neural Machine Translation (openNMT) 
We describe an open-source toolkit for neural machine translation (NMT). The toolkit prioritizes efficiency, modularity, and extensibility with the goal of supporting NMT research into model architectures, feature representations, and source modalities, while maintaining competitive performance and reasonable training requirements. The toolkit consists of modeling and translation support, as well as detailed pedagogical documentation about the underlying techniques. 
OpenStreetMap (OSM) 
OpenStreetMap (OSM) is a collaborative project to create a free editable map of the world. Two major driving forces behind the establishment and growth of OSM have been restrictions on use or availability of map information across much of the world and the advent of inexpensive portable satellite navigation devices. Created by Steve Coast in the UK in 2004, it was inspired by the success of Wikipedia and the preponderance of proprietary map data in the UK and elsewhere. Since then, it has grown to over 1.6 million registered users, who can collect data using manual survey, GPS devices, aerial photography, and other free sources. This crowdsourced data is then made available under the Open Database License. The site is supported by the OpenStreetMap Foundation, a nonprofit organization registered in England. Rather than the map itself, the data generated by the OpenStreetMap project is considered its primary output. This data is then available for use in both traditional applications, like its usage by Craigslist, OsmAnd, Geocaching, MapQuest Open, JMP statistical software, and Foursquare to replace Google Maps, and more unusual roles, like replacing default data included with GPS receivers. This data has been favourably compared with proprietary data sources, though data quality varies worldwide. http://…tmapvisualizationcasestudysamplecode 
OpenTSDB  OpenTSDB is a distributed, scalable Time Series Database (TSDB) written on top of HBase. OpenTSDB was written to address a common need: store, index and serve metrics collected from computer systems (network gear, operating systems, applications) at a large scale, and make this data easily accessible and graphable. Thanks to HBase’s scalability, OpenTSDB allows you to collect thousands of metrics from tens of thousands of hosts and applications, at a high rate (every few seconds). OpenTSDB will never delete or downsample data and can easily store hundreds of billions of data points. OpenTSDB is free software and is available under both LGPLv2.1+ and GPLv3+. Find more about OpenTSDB at http://opentsdb.net 
Operational Analytics  Operational analytics is a more specific term for a type of business analytics which focuses on improving existing operations. This type of business analytics, like others, involves the use of various data mining and data aggregation tools to get more transparent information for business planning. 
Operational Intelligence (OI) 
Operational intelligence (OI) is a category of real-time, dynamic business analytics that delivers visibility and insight into data, streaming events and business operations. Operational Intelligence solutions run queries against streaming data feeds and event data to deliver real-time analytic results as operational instructions. Operational Intelligence provides organizations the ability to make decisions and immediately act on these analytic insights, through manual or automated actions. 
Operational Modal Analysis (OMA) 
Ambient modal identification, also known as Operational Modal Analysis (OMA), aims at identifying the modal properties of a structure based on vibration data collected when the structure is under its operating conditions, i.e., no initial excitation or known artificial excitation. The modal properties of a structure include primarily the natural frequencies, damping ratios and mode shapes. In an ambient vibration test the subject structure can be under a variety of excitation sources which are not measured but are assumed to be ‘broadband random’. The latter is a notion that one needs to apply when developing an ambient identification method. The specific assumptions vary from one method to another. Regardless of the method used, however, proper modal identification requires that the spectral characteristics of the measured response reflect the properties of the modes rather than those of the excitation. 
Operation-Guided Attention-Based Sequence-to-Sequence Network (OpAtt) 
Recent neural models for data-to-text generation are mostly based on data-driven end-to-end training over encoder-decoder networks. Even though the generated texts are mostly fluent and informative, they often generate descriptions that are not consistent with the input structured data. This is a critical issue especially in domains that require inference or calculations over raw data. In this paper, we attempt to improve the fidelity of neural data-to-text generation by utilizing pre-executed symbolic operations. We propose a framework called Operation-guided Attention-based sequence-to-sequence network (OpAtt), with a specifically designed gating mechanism as well as a quantization module for operation results to utilize information from pre-executed operations. Experiments on two sports datasets show our proposed method clearly improves the fidelity of the generated texts to the input structured data. 
Operations Research (OR) 
Operations research, or operational research in British usage, is a discipline that deals with the application of advanced analytical methods to help make better decisions. It is often considered to be a subfield of mathematics. The terms management science and decision science are sometimes used as synonyms. Employing techniques from other mathematical sciences, such as mathematical modeling, statistical analysis, and mathematical optimization, operations research arrives at optimal or near-optimal solutions to complex decision-making problems. Because of its emphasis on human-technology interaction and because of its focus on practical applications, operations research has overlap with other disciplines, notably industrial engineering and operations management, and draws on psychology and organization science. Operations research is often concerned with determining the maximum (of profit, performance, or yield) or minimum (of loss, risk, or cost) of some real-world objective. Originating in military efforts before World War II, its techniques have grown to concern problems in a variety of industries. 
Opine  Online users are constantly seeking experiences, such as a hotel with clean rooms and a lively bar, or a restaurant for a romantic rendezvous. However, e-commerce search engines only support queries involving objective attributes such as location, price and cuisine, and any experiential data is relegated to text reviews. In order to support experiential queries, a database system needs to model subjective data and also be able to process queries where the user can express varied subjective experiences in words chosen by the user, in addition to specifying predicates involving objective attributes. This paper introduces Opine, a subjective database system that addresses these challenges. We introduce a data model for subjective databases. We describe how Opine translates subjective queries against the subjective database schema, which is done through matching the user query phrases to the underlying schema. We also show how the experiential conditions specified by the user can be combined and the results aggregated and ranked. We demonstrate that subjective databases satisfy user needs more effectively and accurately than alternative techniques through experiments with real data of hotel and restaurant reviews. 
Opinion Mining  ➘ “Sentiment Analysis” 
Opinion Pool  
Opportunistic Sensing  ➘ “Participatory Sensing” 
Opposite-Direction Feature Attack (ODFA) 
Adversarial examples in recent works target closed set recognition systems, in which the training and testing classes are identical. In real-world scenarios, however, the testing classes may have limited, if any, overlap with the training classes, a problem named open set recognition. To our knowledge, the community does not have a specific design of adversarial examples targeting this practical setting. Arguably, the new setting compromises traditional closed set attack methods in two aspects. First, closed set attack methods are based on classification and target classification as well, but the open set problem suggests a different task, i.e., retrieval. It is undesirable that the generation mechanism of closed set recognition is different from the aim of open set recognition. Second, given that the query image is usually of an unseen class, predicting its category from the training classes is not reasonable, which leads to an inferior adversarial gradient. In this work, we view open set recognition as a retrieval task and propose a new approach, Opposite-Direction Feature Attack (ODFA), to generate adversarial examples / queries. When using an attacked example as query, we aim that the true matches be ranked as low as possible. In addressing the two limitations of closed set attack methods, ODFA directly works on the features for retrieval. The idea is to push away the feature of the adversarial query in the opposite direction of the original feature. Albeit simple, ODFA leads to a larger drop in Recall@K and mAP than the closed set attack methods on two open set recognition datasets, i.e., Market-1501 and CUB-200-2011. We also demonstrate that the attack performance of ODFA is not evidently superior to the state-of-the-art methods under closed set recognition (CIFAR-10), suggesting its specificity for open set problems. 
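The "push the feature in the opposite direction" idea can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the entry: the feature extractor is a toy linear map f(x) = Wx, the loss is the squared distance between f(x_adv) and -f(x), and the update is a signed-gradient step clipped to an eps-ball around the original input. A real attack would differentiate through a deep retrieval network instead.

```python
import numpy as np

def odfa_attack(W, x, eps=0.1, step=0.02, iters=20):
    """Toy opposite-direction attack on a linear feature map f(x) = W @ x.

    Minimizes ||f(x_adv) + f(x)||^2, i.e. pulls the adversarial feature
    toward -f(x), while keeping x_adv within eps of x (L-infinity ball).
    """
    f_orig = W @ x
    x_adv = x.copy()
    for _ in range(iters):
        # Gradient of ||W x_adv + f_orig||^2 with respect to x_adv.
        grad = 2.0 * W.T @ (W @ x_adv + f_orig)
        x_adv = x_adv - step * np.sign(grad)       # signed-gradient descent step
        x_adv = np.clip(x_adv, x - eps, x + eps)   # stay inside the eps-ball
    return x_adv

def opposite_loss(W, x_adv, x):
    """Squared distance between the adversarial feature and -f(x)."""
    return float(np.sum((W @ x_adv + W @ x) ** 2))

# Hypothetical 3-dimensional feature extractor and 2-dimensional "image".
W = np.array([[1.0, 0.5], [-0.5, 1.0], [0.3, 0.3]])
x = np.array([1.0, 2.0])
x_adv = odfa_attack(W, x)
print(opposite_loss(W, x, x), opposite_loss(W, x_adv, x))
```

With such a small perturbation budget and a linear map the feature only moves slightly; the sketch is meant to show the direction of the update, not the retrieval results reported in the entry.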
OPTaaS  OPTaaS aims to make optimization efficient for complex and expensive problems. OPTaaS is a general-purpose Bayesian optimizer which provides optimal hyperparameter configurations via web services. It can handle any parameter type and does not need to know the underlying process, models, or data. 
OptComplete  We consider the problem of matrix completion with side information on an $n\times m$ matrix. We formulate the problem exactly as a sparse regression problem of selecting features and show that it can be reformulated as a binary convex optimization problem. We design OptComplete, based on a novel concept of stochastic cutting planes to enable efficient scaling of the algorithm up to matrices of sizes $n = 10^6$ and $m = 10^5$. We report experiments on both synthetic and real-world datasets that show that OptComplete outperforms current state-of-the-art methods both in terms of accuracy and scalability, while providing insight on the factors that affect the ratings. 
Optical Fringe Patterns Denoising Convolutional Neural Network (FPDCNN) 
Optical fringe patterns are often contaminated by speckle noise, making it difficult to accurately and robustly extract their phase fields. We therefore propose a filtering method based on deep learning, called optical fringe patterns denoising convolutional neural network (FPDCNN), for directly removing speckle from the input noisy fringe patterns. The FPDCNN method is divided into multiple stages; each stage consists of a set of convolutional layers along with batch normalization and a leaky rectified linear unit (Leaky ReLU) activation function. The end-to-end joint training is carried out using the Euclidean loss. Extensive experiments on simulated and experimental optical fringe patterns, especially finer ones with high density, show that the proposed method is superior to some state-of-the-art denoising techniques in spatial or transform domains, efficiently preserving the main features of the fringe at a fairly fast speed. 
Optical Neural Network (ONN) 
We develop a novel optical neural network (ONN) framework which introduces a degree of scalar invariance to image classification estimation. Taking a hint from the human eye, which has higher resolution near the center of the retina, images are broken out into multiple levels of varying zoom based on a focal point. Each level is passed through an identical convolutional neural network (CNN) in a Siamese fashion, and the results are recombined to produce a high accuracy estimate of the object class. ONNs act as a wrapper around existing CNNs, and can thus be applied to many existing algorithms to produce notable accuracy improvements without having to change the underlying architecture. 
Optimal Completion Distillation (OCD) 
We present Optimal Completion Distillation (OCD), a training procedure for optimizing sequence-to-sequence models based on edit distance. OCD is efficient, has no hyperparameters of its own, and does not require pretraining or joint optimization with conditional log-likelihood. Given a partial sequence generated by the model, we first identify the set of optimal suffixes that minimize the total edit distance, using an efficient dynamic programming algorithm. Then, for each position of the generated sequence, we use a target distribution that puts equal probability on the first token of all the optimal suffixes. OCD achieves state-of-the-art performance on end-to-end speech recognition, on both Wall Street Journal and Librispeech datasets, achieving $9.3\%$ WER and $4.5\%$ WER respectively. 
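The dynamic programming step described above can be sketched as follows. This is an illustrative reading of the entry, not the authors' code: the function names and the end-of-sequence marker `"</s>"` are assumptions. The sketch computes the edit distance between the generated prefix and every prefix of the reference; each reference position that attains the minimum distance marks where an optimal suffix starts, and its next token receives equal probability.

```python
def optimal_next_tokens(prefix, target, eos="</s>"):
    """First tokens of all suffixes that minimize the total edit distance."""
    # dist[j] = edit distance between the generated prefix and target[:j].
    dist = list(range(len(target) + 1))  # empty prefix: j insertions
    for tok in prefix:
        prev_diag, dist[0] = dist[0], dist[0] + 1
        for j in range(1, len(target) + 1):
            cur = min(
                dist[j] + 1,                         # drop tok from the prefix
                dist[j - 1] + 1,                     # insert target[j - 1]
                prev_diag + (tok != target[j - 1]),  # match or substitute
            )
            prev_diag, dist[j] = dist[j], cur
    best = min(dist)
    # Continuing with target[j] from any minimal-distance position is optimal;
    # if the whole reference is already consumed, the optimal token is EOS.
    return {
        target[j] if j < len(target) else eos
        for j, d in enumerate(dist)
        if d == best
    }

def target_distribution(prefix, target, eos="</s>"):
    """Uniform distribution over the optimal next tokens."""
    tokens = optimal_next_tokens(prefix, target, eos)
    return {tok: 1.0 / len(tokens) for tok in tokens}

print(target_distribution("SA", "SUNDAY"))  # 'U' and 'N' each get probability 0.5
```

For the prefix "SA" and reference "SUNDAY", both "UNDAY" and "NDAY" are optimal continuations, so the target distribution splits its mass between "U" and "N".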
Optimal Control Theory  Optimal control theory is a mature mathematical discipline with numerous applications in both science and engineering. It is emerging as the computational framework of choice for studying the neural control of movement, in much the same way that probabilistic inference is emerging as the computational framework of choice for studying sensory information processing. Despite the growing popularity of optimal control models, however, the elaborate mathematical machinery behind them is rarely exposed and the big picture is hard to grasp without reading a few technical books on the subject. 
Optimal Coordinate Ascent (OCA) 
In machine learning, Feature Selection (FS) is a major part of an efficient algorithm. It fuels the algorithm and is the starting block for our prediction. In this paper, we present a new method, called Optimal Coordinate Ascent (OCA), that allows us to select features among block and individual features. OCA relies on coordinate ascent to find an optimal solution for the score of gradient boosting methods (number of correctly classified samples). OCA takes into account the notion of dependencies between variables forming blocks in our optimization. The coordinate ascent optimization addresses the NP-hard original problem, where the number of combinations rapidly explodes, making a grid search infeasible. It considerably reduces the number of iterations, changing this NP-hard problem into a polynomial search. OCA brings substantial differences and improvements compared to a previous coordinate ascent feature selection method: we group variables into block and individual variables instead of a binary selection. Our initial guess is based on the k-best group variables, making our initial point more robust. We also introduce new stopping criteria, making our optimization faster. We compare these two methods on our data set and find that our method outperforms the initial one. We also compare our method to the Recursive Feature Elimination (RFE) method and find that OCA leads to the minimum feature set with the highest score. This is a nice byproduct of our method, as it empirically provides the most compact data set with optimal performance. 
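A generic coordinate ascent over feature blocks can be sketched as below. This is not the OCA algorithm itself: the function name, the all-blocks starting point, the stopping rule (no improvement in a full sweep) and the toy score are assumptions made for illustration. In the setting of the entry, `score` would be the number of correctly classified samples of a gradient boosting model trained on the selected blocks.

```python
def coordinate_ascent(n_blocks, score, start=None, max_sweeps=10):
    """Greedy coordinate ascent over a binary block-selection mask.

    Each sweep toggles one block at a time and keeps the change only if the
    score strictly improves, so a sweep costs n_blocks evaluations instead
    of the 2**n_blocks evaluations of an exhaustive search.
    """
    mask = list(start) if start is not None else [True] * n_blocks
    best = score(mask)
    for _ in range(max_sweeps):
        improved = False
        for i in range(n_blocks):
            mask[i] = not mask[i]          # try flipping block i
            candidate = score(mask)
            if candidate > best:
                best, improved = candidate, True
            else:
                mask[i] = not mask[i]      # revert: no gain
        if not improved:                   # stop when a full sweep brings no gain
            break
    return mask, best

# Hypothetical score: blocks 0 and 2 are informative, blocks 1 and 3 add noise.
def toy_score(mask):
    return sum(1.0 for i in (0, 2) if mask[i]) - 0.1 * sum(mask[i] for i in (1, 3))

selected, value = coordinate_ascent(4, toy_score)
print(selected, value)
```

Coordinate ascent of this kind finds a local optimum of the score; whether that optimum is global depends on the score function.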
Optimal Margin Distribution Network (mdNet) 
Recent research on margin theory has proved that maximizing the minimum margin, as support vector machines do, does not necessarily lead to better performance; instead, it is crucial to optimize the margin distribution. In the meantime, margin theory has been used to explain the empirical success of deep networks in recent studies. In this paper, we present mdNet (the Optimal Margin Distribution Network), a network which embeds a loss function in regard to the optimal margin distribution. We give a theoretical analysis of our method using the PAC-Bayesian framework, which confirms the significance of the margin distribution for classification within the framework of deep networks. In addition, empirical results show that the mdNet model consistently outperforms the baseline cross-entropy loss model across different regularization situations. Our mdNet model also outperforms the cross-entropy loss (Xent), hinge loss and soft hinge loss models in generalization tasks with limited training data. 
Optimal Matching Analysis (OMA) 
Optimal matching is a sequence analysis method used in social science, to assess the dissimilarity of ordered arrays of tokens that usually represent a time-ordered sequence of socio-economic states two individuals have experienced. Once such distances have been calculated for a set of observations (e.g. individuals in a cohort) classical tools (such as cluster analysis) can be used. The method was tailored to social sciences from a technique originally introduced to study molecular biology (protein or genetic) sequences. Optimal matching uses the Needleman-Wunsch algorithm. 
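A minimal sketch of the distance computation follows, written as the cost-minimizing form of the Needleman-Wunsch dynamic program. The function name, the constant insertion/deletion cost of 1 and the constant substitution cost of 2 are illustrative assumptions; in practice substitution costs are often state-specific.

```python
def om_distance(a, b, indel=1.0, sub=2.0):
    """Optimal matching distance between two state sequences.

    d[i][j] is the cheapest way to turn a[:i] into b[:j] using insertions
    and deletions (cost `indel` each) and substitutions (cost `sub`).
    """
    n, m = len(a), len(b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel               # delete all of a[:i]
    for j in range(1, m + 1):
        d[0][j] = j * indel               # insert all of b[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0.0 if a[i - 1] == b[j - 1] else sub
            d[i][j] = min(
                d[i - 1][j] + indel,      # delete a[i - 1]
                d[i][j - 1] + indel,      # insert b[j - 1]
                d[i - 1][j - 1] + cost,   # match or substitute
            )
    return d[n][m]

# Hypothetical yearly states: S = school, E = employed, U = unemployed.
print(om_distance("SSEEU", "SEEEU"))  # 2.0: one substitution (or one delete plus one insert)
```

The resulting pairwise distances can then be passed to a clustering routine, as the entry describes.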
Optimal Sparse Decision Tree  Decision tree algorithms have been among the most popular algorithms for interpretable (transparent) machine learning since the early 1980s. The problem that has plagued decision tree algorithms since their inception is their lack of optimality, or lack of guarantees of closeness to optimality: decision tree algorithms are often greedy or myopic, and sometimes produce unquestionably suboptimal models. Hardness of decision tree optimization is both a theoretical and practical obstacle, and even careful mathematical programming approaches have not been able to solve these problems efficiently. This work introduces the first practical algorithm for optimal decision trees for binary variables. The algorithm is a co-design of analytical bounds that reduce the search space and modern systems techniques, including data structures and a custom bit-vector library. We highlight possible steps to improving the scalability and speed of future generations of this algorithm based on insights from our theory and experiments. 
Optimal Transport Classifier (OTClassifier) 
Recent studies have demonstrated the vulnerability of deep convolutional neural networks against adversarial examples. Inspired by the observation that the intrinsic dimension of image data is much smaller than its pixel space dimension and the vulnerability of neural networks grows with the input dimension, we propose to embed high-dimensional input images into a low-dimensional space to perform classification. However, arbitrarily projecting the input images to a low-dimensional space without regularization will not improve the robustness of deep neural networks. Leveraging optimal transport theory, we propose a new framework, Optimal Transport Classifier (OTClassifier), and derive an objective that minimizes the discrepancy between the distribution of the true label and the distribution of the OTClassifier output. Experimental results on several benchmark datasets show that our proposed framework achieves state-of-the-art performance against strong adversarial attack methods. 
Optimism in the Face of Sensible Value Function (OFVF) 
Optimism about poorly understood states and actions is the main driving force of exploration for many provably efficient reinforcement learning algorithms. We propose optimism in the face of sensible value functions (OFVF), a novel data-driven Bayesian algorithm for constructing plausibility sets for MDPs in order to explore robustly, minimizing the worst-case exploration cost. The method computes policies with tighter optimistic estimates for exploration by introducing two new ideas. First, it is based on Bayesian posterior distributions rather than distribution-free bounds. Second, OFVF does not construct plausibility sets as simple confidence intervals. Confidence intervals as plausibility sets are a sufficient but not a necessary condition. OFVF uses the structure of the value function to optimize the location and shape of the plausibility set to guarantee upper bounds directly without necessarily enforcing the requirement for the set to be a confidence interval. OFVF proceeds in an episodic manner, where the duration of the episode is fixed and known. Our algorithm is inherently Bayesian and can leverage prior information. Our theoretical analysis shows the robustness of OFVF, and the empirical results demonstrate its practical promise. 
Optimistic Lower Bounds Optimization (OLBO) 
While model-based reinforcement learning has empirically been shown to significantly reduce the sample complexity that hinders model-free RL, the theoretical understanding of such methods has been rather limited. In this paper, we introduce a novel algorithmic framework for designing and analyzing model-based RL algorithms with theoretical guarantees, and a practical algorithm Optimistic Lower Bounds Optimization (OLBO). In particular, we derive a theoretical guarantee of monotone improvement for model-based RL with our framework. We iteratively build a lower bound of the expected reward based on the estimated dynamical model and sample trajectories, and maximize it jointly over the policy and the model. Assuming the optimization in each iteration succeeds, the expected reward is guaranteed to improve. The framework also incorporates an optimism-driven perspective, and reveals the intrinsic measure for the model prediction error. Preliminary simulations demonstrate that our approach outperforms the standard baselines on continuous control benchmark tasks. 
Optimistic Optimization  OOR 
Optimization  In mathematics, computer science, economics, or management science, mathematical optimization (alternatively, optimization or mathematical programming) is the selection of a best element (with regard to some criteria) from some set of available alternatives. In the simplest case, an optimization problem consists of maximizing or minimizing a real function by systematically choosing input values from within an allowed set and computing the value of the function. The generalization of optimization theory and techniques to other formulations comprises a large area of applied mathematics. More generally, optimization includes finding ‘best available’ values of some objective function given a defined domain (or a set of constraints), including a variety of different types of objective functions and different types of domains. 
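The simplest case described above, choosing the best element from an allowed set by evaluating the objective on each candidate, can be written in a few lines. The function name, the finite candidate grid and the example objective are assumptions chosen for illustration.

```python
def best_element(candidates, objective, feasible=lambda x: True):
    """Return the feasible candidate with the smallest objective value."""
    allowed = [x for x in candidates if feasible(x)]
    if not allowed:
        raise ValueError("no candidate satisfies the constraints")
    return min(allowed, key=objective)

# Minimize f(x) = (x - 2)^2 over the integers -5..5.
f = lambda x: (x - 2) ** 2
unconstrained = best_element(range(-5, 6), f)                       # 2
constrained = best_element(range(-5, 6), f, feasible=lambda x: x <= 0)  # 0
print(unconstrained, constrained)
```

Maximization is the same procedure applied to the negated objective; continuous or very large domains need search strategies other than exhaustive enumeration.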
Optimization via Network Evolution (ONE) 
There has been a gap between artificial intelligence and human intelligence. In this paper, we identify three key elements forming human intelligence, and suggest that abstraction learning combines these elements and is thus a way to bridge the gap. Prior research in artificial intelligence either specifies abstraction by human experts, or takes abstraction as a qualitative explanation for the model. This paper aims to learn abstraction directly. We tackle three main challenges: representation, objective function, and learning algorithm. Specifically, we propose a partition structure that contains pre-allocated abstraction neurons; we formulate abstraction learning as a constrained optimization problem, which integrates abstraction properties; we develop a network evolution algorithm to solve this problem. This complete framework is named ONE (Optimization via Network Evolution). In our experiments on MNIST, ONE shows elementary human-like intelligence, including low energy consumption, knowledge sharing, and lifelong learning. 
Optimized PAtchMatch Label fusion (OPAL) 
Automatic segmentation methods are important tools for quantitative analysis of Magnetic Resonance Images (MRI). Recently, patch-based label fusion approaches have demonstrated state-of-the-art segmentation accuracy. In this paper, we introduce a new patch-based label fusion framework to perform segmentation of anatomical structures. The proposed approach uses an Optimized PAtchMatch Label fusion (OPAL) strategy that drastically reduces the computation time required for the search of similar patches. The reduced computation time of OPAL opens the way for new strategies and facilitates processing on large databases. In this paper, we investigate new perspectives offered by OPAL, by introducing a new multi-scale and multi-feature framework. During our validation on hippocampus segmentation we use two datasets: young adults in the ICBM cohort and elderly adults in the EADC-ADNI dataset. For both, OPAL is compared to state-of-the-art methods. Results show that OPAL obtained the highest median Dice coefficient (89.9% for ICBM and 90.1% for EADC-ADNI). Moreover, in both cases, OPAL produced a segmentation accuracy similar to inter-expert variability. On the EADC-ADNI dataset, we compare the hippocampal volumes obtained by manual and automatic segmentation. The volumes appear to be highly correlated, which enables a more accurate separation of pathological populations. 
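The Dice coefficient used to report accuracy above is 2|A ∩ B| / (|A| + |B|) for two segmentations A and B. A minimal sketch, assuming each segmentation is given as a collection of labelled voxel coordinates (the function name and the toy voxels are illustrative):

```python
def dice_coefficient(seg_a, seg_b):
    """Dice overlap between two sets of labelled voxels (1.0 = identical)."""
    a, b = set(seg_a), set(seg_b)
    if not a and not b:
        return 1.0  # two empty segmentations agree trivially
    return 2.0 * len(a & b) / (len(a) + len(b))

# Hypothetical automatic and manual segmentations as voxel coordinates.
automatic = {(0, 0, 0), (0, 0, 1), (0, 1, 1)}
manual = {(0, 0, 1), (0, 1, 1), (1, 1, 1)}
print(dice_coefficient(automatic, manual))  # 2 * 2 / (3 + 3)
```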
Optimus  As data scientists, we care about extracting the best information out of our data. Data is the new soil: you have to get in and get your hands dirty, and without cleaning and preparing it, it is just useless. Data preparation accounts for about 80% of the work of data scientists, so it is really important to have a solution that connects to your database or file system, uses the most important framework for machine learning and data science at the moment (Apache Spark), and can handle lots of information, working either in a cluster in a parallelized fashion or locally on your laptop. Prepare, process and explore your Big Data with the fastest open source library on the planet using Apache Spark and Python (PySpark). Data Science with Optimus. Part 1: Intro. 
OptStream  A number of applications benefit from continuously releasing streams of personal data statistics. The process, however, poses significant privacy risks. Motivated by an application in energy systems, this paper presents OptStream, a novel algorithm for releasing differentially private data streams. OptStream is a 4-step procedure consisting of sampling, perturbation, reconstruction, and post-processing modules. The sampling module selects a small set of points to access privately in each period of interest, the perturbation module adds noise to the sampled data points to guarantee privacy, the reconstruction module reassembles the non-sampling data points from the perturbed sampled points, and the post-processing module uses convex optimization over the private output of the previous modules, as well as the private answers of additional queries on the data stream, to ensure consistency of the data’s salient features. OptStream is used to release a real data stream from the largest transmission operator in Europe. Experimental results show that OptStream not only improves the accuracy of the state-of-the-art by at least one order of magnitude on this application domain, but it is also able to ensure accurate load forecasting based on the private data. 
Optuna  The package was, and still is, developed by a Japanese AI company Preferred Networks. In many ways, Optuna is similar to Hyperopt. So why should you bother? There are a few reasons: • It’s possible to specify how long the optimization process should last • Integration with Pandas DataFrame • The algorithm uses pruning to discard low-quality trials early • It’s a relatively new project, and developers continue to work on it • It was easier to use than Hyperopt (at least for me) How to make your model awesome with Optuna 
Optunity  Optunity is a free software package for hyperparameter search in the context of machine learning developed at STADIUS. GitXiv 
OPUS Miner  OPUS Miner is an open source implementation of the OPUS Miner algorithm, which applies OPUS search for Filtered Top-k Association Discovery of Self-Sufficient Itemsets. OPUS Miner finds self-sufficient itemsets. These are an effective way of summarizing the key associations in high-dimensional data. opusminer 
Orbit  Orbit is a composable framework for orchestrating change processing, tracking, and synchronization across multiple data sources. Orbit is written in TypeScript and distributed on npm through the @orbit organization. Prebuilt distributions are provided in several module formats and ES language levels. Orbit is isomorphic – it can be run both in modern browsers and in the Node.js runtime. 
Orbital Petri Net (OPN) 
Petri nets are a very interesting tool for studying and simulating different behaviors of information systems. They can be used in different applications depending on the appropriate class of Petri nets, whether classical, colored or timed. In this paper we introduce a new class of Petri nets, called orbital Petri nets (OPN), for studying orbital rotating systems within a specific domain. The study investigates and analyzes OPN, using the space debris collision problem as a case study. The mathematical investigation of two OPN models shows that space debris collisions can be prevented by means of the new firing sequence method in OPN. As future work, new smart algorithms for mitigating the space debris collision problem can be implemented and simulated with orbital Petri nets. 
Order Robust Adaptive Continual LEarning (ORACLE) 
The order of the tasks a continual learning model encounters may have a large impact on the performance of each task, as well as on the task-average performance. This order-sensitivity may cause serious problems in real-world scenarios where fairness plays a critical role (e.g. medical diagnosis). To tackle this problem, we propose a novel order-robust continual learning method which, instead of learning a completely shared set of weights, represents the parameters for each task as a sum of task-shared parameters that capture generic representations and task-adaptive parameters that capture task-specific ones, where the latter are factorized into sparse low-rank matrices in order to minimize the capacity increase. With such a parameter decomposition, when training for a new task, the task-adaptive parameters for earlier tasks remain mostly unaffected; we update them only to reflect the changes made to the task-shared parameters. This prevents catastrophic forgetting for old tasks and at the same time makes the model less sensitive to the task arrival order. We validate our Order-Robust Adaptive Continual LEarning (ORACLE) method on multiple benchmark datasets against state-of-the-art continual learning methods, and the results show that it largely outperforms those strong baselines with a significantly smaller increase in capacity and training time, and also obtains a smaller performance disparity for each task across different order sequences. 
Order Statistic  In statistics, the kth order statistic of a statistical sample is equal to its kth-smallest value. Together with rank statistics, order statistics are among the most fundamental tools in non-parametric statistics and inference. Important special cases of the order statistics are the minimum and maximum value of a sample, and (with some qualifications) the sample median and other sample quantiles. When using probability theory to analyze order statistics of random samples from a continuous distribution, the cumulative distribution function is used to reduce the analysis to the case of order statistics of the uniform distribution. 
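As a minimal sketch of the definition above, the kth order statistic can be read off a sorted copy of the sample. The function name `order_statistic` and the example sample are illustrative assumptions, not part of any particular library.

```python
def order_statistic(sample, k):
    """Return the k-th order statistic (k-th smallest value, 1-indexed)."""
    if not 1 <= k <= len(sample):
        raise ValueError("k must be between 1 and len(sample)")
    return sorted(sample)[k - 1]


# Hypothetical example sample with an odd number of observations.
sample = [7, 2, 9, 4, 5]
minimum = order_statistic(sample, 1)                      # first order statistic
maximum = order_statistic(sample, len(sample))            # last order statistic
median = order_statistic(sample, (len(sample) + 1) // 2)  # middle one (odd n only)
```

For an even sample size the median is not a single order statistic, which is one of the qualifications the definition alludes to.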
Ordered Decision Diagram (ODD) 
A Symbolic Approach to Explaining Bayesian Network Classifiers 
Ordered Neurons LSTM (ON-LSTM) 
Recurrent neural network (RNN) models are widely used for processing sequential data governed by a latent tree structure. Previous work shows that RNN models (especially Long Short-Term Memory (LSTM) based models) could learn to exploit the underlying tree structure. However, their performance consistently lags behind that of tree-based models. This work proposes a new inductive bias, Ordered Neurons, which enforces an order of updating frequencies between hidden state neurons. We show that the ordered neurons could explicitly integrate the latent tree structure into recurrent models. To this end, we propose a new RNN unit, ON-LSTM, which achieves good performance on four different tasks: language modeling, unsupervised parsing, targeted syntactic evaluation, and logical inference. 
Ordered Weighted Averaging Aggregation Operator (OWA) 
In applied mathematics – specifically in fuzzy logic – the ordered weighted averaging (OWA) operators provide a parameterized class of mean type aggregation operators. They were introduced by Ronald R. Yager. Many notable mean operators such as the max, arithmetic average, median and min, are members of this class. They have been widely used in computational intelligence because of their ability to model linguistically expressed aggregation instructions. 
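The following sketch illustrates the standard OWA definition: the arguments are sorted in descending order and combined with a fixed weight vector that is non-negative and sums to one. The function name `owa` and the example values are assumptions made for illustration.

```python
def owa(values, weights):
    """Ordered weighted average: weights apply to ranks, not to positions."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values, reverse=True)  # largest argument first
    return sum(w * v for w, v in zip(weights, ordered))


values = [0.2, 0.9, 0.5]
owa_max = owa(values, [1.0, 0.0, 0.0])      # all weight on the largest value
owa_min = owa(values, [0.0, 0.0, 1.0])      # all weight on the smallest value
owa_median = owa(values, [0.0, 1.0, 0.0])   # all weight on the middle value
owa_mean = owa(values, [1 / 3, 1 / 3, 1 / 3])
```

Choosing the weight vector is what moves the operator between the max, median, mean and min members of the class.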
Ordering Points to Identify Cluster Structure (OPTICS) 
➘ “Ordering Points to Identify the Clustering Structure” 
Ordering Points to Identify the Clustering Structure (OPTICS) 
Ordering points to identify the clustering structure (OPTICS) is an algorithm for finding density-based clusters in spatial data. It was presented by Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel and Jörg Sander. Its basic idea is similar to DBSCAN, but it addresses one of DBSCAN’s major weaknesses: the problem of detecting meaningful clusters in data of varying density. In order to do so, the points of the database are (linearly) ordered such that points which are spatially closest become neighbors in the ordering. Additionally, a special distance is stored for each point that represents the density that needs to be accepted for a cluster in order to have both points belong to the same cluster. This is represented as a dendrogram. Clustering Using OPTICS 
Ordinal Data Clustering  ordinalClust 
Ordinal Forests (OF) 
Ordinal forests (OF) are a method for ordinal regression with high-dimensional and low-dimensional data that is able to predict the values of the ordinal target variable for new observations and at the same time estimate the relative widths of the classes of the ordinal target variable. Using a (permutation-based) variable importance measure, it is moreover possible to rank the importances of the covariates. ordinalForest 
Ordinal Monte Carlo Tree Search (Ordinal MCTS) 
In many problem settings, most notably in game playing, an agent receives a possibly delayed reward for its actions. Often, those rewards are handcrafted and not naturally given. Even simple terminal-only rewards, like winning equals 1 and losing equals -1, cannot be seen as an unbiased statement, since these values are chosen arbitrarily, and the behavior of the learner may change with different encodings, such as setting the value of a loss to -0.5, which is often done in practice to encourage learning. It is hard to argue about good rewards, and the performance of an agent often depends on the design of the reward signal. In particular, in domains where states by nature only have an ordinal ranking and where meaningful distance information between game state values is not available, a numerical reward signal is necessarily biased. In this paper, we take a look at Monte Carlo Tree Search (MCTS), a popular algorithm to solve MDPs, highlight a recurring problem concerning its use of rewards, and show that an ordinal treatment of the rewards overcomes this problem. Using the General Video Game Playing framework, we show a dominance of our newly proposed ordinal MCTS algorithm over preference-based MCTS, vanilla MCTS and various other MCTS variants. 
Ordinal Pooling Network (OPN) 
In the framework of convolutional neural networks that lie at the heart of deep learning, downsampling is often performed with a max-pooling operation, which however completely discards the information from the other activations in a pooling region. To address this issue, a novel pooling scheme, Ordinal Pooling Network (OPN), is introduced in this work. OPN rearranges all the elements of a pooling region into a sequence and assigns different weights to the elements based upon their order in the sequence, where the weights are learned via gradient-based optimisation. The results of our small-scale experiments on an image classification task on the MNIST database demonstrate that this scheme leads to a consistent improvement in accuracy over the max-pooling operation. This improvement is expected to increase in deeper networks, where several layers of pooling become necessary. 
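The pooling step described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the regions are non-overlapping, elements are sorted in descending order, and the rank weights are fixed by hand, whereas in the abstract they are learned by gradient-based optimisation. The name `ordinal_pool` is hypothetical.

```python
def ordinal_pool(feature_map, weights, size=2):
    """Pool non-overlapping size x size regions with rank-based weights."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            region = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            ranked = sorted(region, reverse=True)  # rank 0 = largest activation
            out_row.append(sum(w * v for w, v in zip(weights, ranked)))
        pooled.append(out_row)
    return pooled


fmap = [[1.0, 3.0, 0.0, 2.0],
        [4.0, 2.0, 1.0, 1.0],
        [0.0, 0.0, 5.0, 6.0],
        [1.0, 2.0, 7.0, 8.0]]

# Weight only on rank 0 reproduces max-pooling.
max_like = ordinal_pool(fmap, [1.0, 0.0, 0.0, 0.0])
# Equal weights on all ranks reproduce average pooling.
avg_like = ordinal_pool(fmap, [0.25, 0.25, 0.25, 0.25])
```

Max-pooling and average pooling appear as two special cases of the weight vector, so a learned weight vector can use information from the activations that max-pooling discards.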
Oriented Response Network (ORN) 
Deep Convolution Neural Networks (DCNNs) are capable of learning unprecedentedly effective image representations. However, their ability in handling significant local and global image rotations remains limited. In this paper, we propose Active Rotating Filters (ARFs) that actively rotate during convolution and produce feature maps with location and orientation explicitly encoded. An ARF acts as a virtual filter bank containing the filter itself and its multiple unmaterialised rotated versions. During back-propagation, an ARF is collectively updated using errors from all its rotated versions. DCNNs using ARFs, referred to as Oriented Response Networks (ORNs), can produce within-class rotation-invariant deep features while maintaining inter-class discrimination for classification tasks. The oriented response produced by ORNs can also be used for image and object orientation estimation tasks. Over multiple state-of-the-art DCNN architectures, such as VGG, ResNet, and STN, we consistently observe that replacing regular filters with the proposed ARFs leads to a significant reduction in the number of network parameters and an improvement in classification performance. We report the best results on several commonly used benchmarks. 
ORIGAMI  The memory bandwidth bottleneck is a major challenge in processing machine learning (ML) algorithms. In-memory acceleration has the potential to address this problem; however, it needs to address two challenges. First, an in-memory accelerator should be general enough to support a large set of different ML algorithms. Second, it should be efficient enough to utilize the bandwidth while meeting the limited power and area budgets of the logic layer of a 3D-stacked memory. We observe that previous work fails to address both challenges simultaneously. We propose ORIGAMI, a heterogeneous set of in-memory accelerators that supports the compute demands of different ML algorithms and also uses an off-the-shelf compute platform (e.g., FPGA, GPU, TPU) to utilize the bandwidth without violating strict area and power budgets. ORIGAMI offers a pattern-matching technique to identify similar computation patterns of ML algorithms and extracts a compute engine for each pattern. These compute engines constitute heterogeneous accelerators integrated on the logic layer of a 3D-stacked memory. Combinations of these compute engines can execute any type of ML algorithm. To utilize the available bandwidth without violating the area and power budgets of the logic layer, ORIGAMI comes with a computation-splitting compiler that divides an ML algorithm between in-memory accelerators and an out-of-the-memory platform in a balanced way and with minimum inter-communication. The combination of pattern matching and split execution offers a new design point for the acceleration of ML algorithms. Evaluation results across 12 popular ML algorithms show that ORIGAMI outperforms the state-of-the-art accelerator with 3D-stacked memory in terms of performance and energy-delay product (EDP) by 1.5x and 29x (up to 1.6x and 31x), respectively. Furthermore, the results are within a 1% margin of an ideal system that has unlimited compute resources on the logic layer of a 3D-stacked memory. 
Origraph  Data wrangling is widely acknowledged to be a critical part of the data analysis pipeline. Nevertheless, there are currently no techniques to efficiently wrangle network datasets. Here we introduce a set of interaction techniques that enable analysts to carry out complex network wrangling operations. These operations include deriving attributes across connected classes, converting nodes to edges and vice versa, and faceting nodes and edges based on attributes. We implement these operations in a web-based, open-source system, Origraph, which provides interfaces to execute the operations and investigate the results. Designed for wrangling, rather than analysis, Origraph can be used to load data in many forms, wrangle and transform the network, and export it in formats compatible with common network visualization tools. We demonstrate Origraph’s usefulness in a series of examples with different datasets from a variety of sources. 
Orthant Probabilities  
Orthogonal Array (OA) 
In mathematics, in the area of combinatorial designs, an orthogonal array is a ‘table’ (array) whose entries come from a fixed finite set of symbols (typically, {1,2,…,n}), arranged in such a way that there is an integer t so that for every selection of t columns of the table, all ordered t-tuples of the symbols, formed by taking the entries in each row restricted to these columns, appear the same number of times. The number t is called the strength of the orthogonal array. The Orthogonal Array Package oapackage 
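The defining property can be checked directly by counting t-tuples over every selection of t columns. The sketch below uses only the Python standard library; the helper name `has_strength` and the small two-symbol example array are illustrative assumptions and are unrelated to the oapackage API.

```python
from collections import Counter
from itertools import combinations, product


def has_strength(array, symbols, t):
    """Return True if every choice of t columns shows each t-tuple equally often."""
    n_cols = len(array[0])
    expected, remainder = divmod(len(array), len(symbols) ** t)
    if remainder:
        return False  # the row count cannot be split evenly over all t-tuples
    for cols in combinations(range(n_cols), t):
        counts = Counter(tuple(row[c] for c in cols) for row in array)
        if any(counts[tup] != expected for tup in product(symbols, repeat=t)):
            return False
    return True


# Four rows, three columns, symbols {0, 1}: a classic strength-2 example.
oa = [(0, 0, 0),
      (0, 1, 1),
      (1, 0, 1),
      (1, 1, 0)]

strength_2 = has_strength(oa, (0, 1), 2)
strength_3 = has_strength(oa, (0, 1), 3)
```

Here every pair of columns contains each of the four ordered pairs exactly once, while strength 3 is impossible because four rows cannot cover eight triples evenly.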
Orthogonal Deep Neural Network (OrthDNN) 
In this paper, we introduce the algorithms of Orthogonal Deep Neural Networks (OrthDNNs) to connect with the recent interest in spectrally regularized deep learning methods. OrthDNNs are theoretically motivated by generalization analysis of modern DNNs, with the aim of finding solution properties of network weights that guarantee better generalization. To this end, we first prove that DNNs are of local isometry on data distributions of practical interest; by using a new covering of the sample space and introducing the local isometry property of DNNs into generalization analysis, we establish a new generalization error bound that is both scale- and range-sensitive to the singular value spectrum of each of the network's weight matrices. We prove that the optimal bound w.r.t. the degree of isometry is attained when each weight matrix has a spectrum of equal singular values, among which an orthogonal weight matrix, or a non-square one with orthonormal rows or columns, is the most straightforward choice, suggesting the algorithms of OrthDNNs. We present algorithms for both strict and approximate OrthDNNs, and for the latter we propose a simple yet effective algorithm called Singular Value Bounding (SVB), which performs as well as strict OrthDNNs, but at a much lower computational cost. We also propose Bounded Batch Normalization (BBN) to make batch normalization compatible with OrthDNNs. We conduct extensive comparative studies using modern architectures on benchmark image classification. Experiments show the efficacy of OrthDNNs. 
Orthogonal Floating Search Framework  The present study proposes a new Orthogonal Floating Search framework for structure selection of nonlinear systems by adapting the existing floating search algorithms for feature selection. The proposed framework integrates the concept of orthogonal space and the consequent Error-Reduction-Ratio (ERR) metric with the existing floating search algorithms. On the basis of this framework, three well-known feature selection algorithms have been adapted: the classical Sequential Forward Floating Search (SFFS), Improved sequential Forward Floating Search (IFFS) and Oscillating Search (OS). This framework retains the simplicity of classical Orthogonal Forward Regression with ERR (OFR-ERR) and eliminates the nesting effect associated with OFR-ERR. The performance of the proposed framework has been demonstrated on several benchmark nonlinear systems. The results show that most of the existing feature selection methods can easily be tailored to identify the correct system structure of nonlinear systems. 
Orthogonal Generative Adversarial Network (O-GAN) 
In this paper, we propose Orthogonal Generative Adversarial Networks (O-GANs). We decompose the network of the discriminator orthogonally and add an extra loss to the objective of common GANs, which forces the discriminator to become an effective encoder. The same extra loss can be embedded into any kind of GAN with almost no increase in computation. Furthermore, we discuss the principle of our method, which relates to fully exploiting the remaining degrees of freedom of the discriminator. As far as we know, our solution is the simplest approach to training a generative adversarial network with auto-encoding ability. 
Orthogonal Matching Pursuit  In text classification, the problem of overfitting arises due to the high dimensionality, making regularization essential. Although classic regularizers provide sparsity, they fail to return highly accurate models. On the contrary, state-of-the-art group-lasso regularizers provide better results at the expense of low sparsity. In this paper, we apply a greedy variable selection algorithm, called Orthogonal Matching Pursuit (OMP), to the text classification task. We also extend standard group OMP (GOMP) by introducing overlapping group OMP to handle overlapping groups of features. Empirical analysis verifies that both OMP and overlapping GOMP constitute powerful regularizers, able to produce effective and super-sparse models. Code and data are available at: https://…tring_G0\string_0DlcGkq6tQb2zqAaca\string dl\string=0 . 
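To make the greedy selection concrete, here is a minimal sketch of textbook Orthogonal Matching Pursuit for a least-squares problem, written with NumPy. It is not the paper's group or overlapping-group variant, and the function name `omp`, the design matrix and the coefficients are assumptions chosen so that the result is easy to verify (the columns of `X` are orthonormal).

```python
import numpy as np


def omp(X, y, n_nonzero):
    """Greedy sparse least squares: pick n_nonzero columns of X to explain y."""
    selected = []
    coef = np.zeros(X.shape[1])
    residual = y.astype(float).copy()
    for _ in range(n_nonzero):
        # Pick the column most correlated with the current residual.
        correlations = np.abs(X.T @ residual)
        if selected:
            correlations[selected] = -np.inf  # never pick a column twice
        selected.append(int(np.argmax(correlations)))
        # Refit all selected coefficients jointly (the "orthogonal" step).
        sol, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        residual = y - X[:, selected] @ sol
    coef[selected] = sol
    return coef, selected


# Hypothetical design matrix with orthonormal columns (scaled Hadamard matrix).
X = np.array([[1, 1, 1, 1],
              [1, -1, 1, -1],
              [1, 1, -1, -1],
              [1, -1, -1, 1]], dtype=float) / 2.0
true_coef = np.array([0.0, 3.0, 0.0, -2.0])
y = X @ true_coef

coef, selected = omp(X, y, n_nonzero=2)
```

Refitting all selected coefficients after each pick is what distinguishes OMP from plain matching pursuit, and stopping after a fixed number of picks is what makes the resulting model sparse.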
Orthogonal Nonlinear LeastSquares Regression (ONLS) 
Orthogonal nonlinear least squares (ONLS) is a not so frequently applied and maybe overlooked regression technique that comes into consideration when one encounters an “error in variables” problem. While classical nonlinear least squares (NLS) aims to minimize the sum of squared vertical residuals, ONLS minimizes the sum of squared orthogonal residuals. The method is based on finding points on the fitted curve that are orthogonal to the data, by minimizing for each data point the Euclidean distance to some point on the fitted curve. onls 
Orthogonal Regression  Total least squares, also known as rigorous least squares and (in a special case) orthogonal regression, is a type of errors-in-variables regression, a least squares data modeling technique in which observational errors on both dependent and independent variables are taken into account. It is a generalization of Deming regression, and can be applied to both linear and nonlinear models. The total least squares approximation of the data is generically equivalent to the best, in the Frobenius norm, low-rank approximation of the data matrix. 
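The link to low-rank approximation can be illustrated for the simplest case, a straight line in the plane. In this sketch (an illustration under the assumption of equal error variance in both variables, with the hypothetical name `orthogonal_line_fit`), the centered data matrix is decomposed with NumPy's SVD, and the right singular vector with the smallest singular value gives the normal of the best-fitting line.

```python
import numpy as np


def orthogonal_line_fit(x, y):
    """Fit y = intercept + slope * x by minimizing squared perpendicular distances."""
    pts = np.column_stack([x, y]).astype(float)
    center = pts.mean(axis=0)
    # Rows of vt are right singular vectors; the last one belongs to the
    # smallest singular value and is normal to the best rank-1 direction.
    _, _, vt = np.linalg.svd(pts - center)
    nx, ny = vt[-1]
    if abs(ny) < 1e-12:
        raise ValueError("best-fitting line is vertical; slope is undefined")
    slope = -nx / ny
    intercept = center[1] - slope * center[0]
    return intercept, slope


# Hypothetical noise-free data on the line y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
intercept, slope = orthogonal_line_fit(x, y)
```

With noisy data the result generally differs from ordinary least squares, which only penalizes vertical residuals.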
OrthoNormal basis construction In cOnfounding factor Normalization (ONION) 
Statistical learning on biological data can be challenging due to confounding variables in sample collection and processing. Confounders can cause models to generalize poorly and result in inaccurate prediction performance metrics if models are not validated thoroughly. In this paper, we propose methods to control for confounding factors and further improve prediction performance. We introduce OrthoNormal basis construction In cOnfounding factor Normalization (ONION) to remove confounding covariates and use the Domain-Adversarial Neural Network (DANN) to penalize models for encoding confounder information. We apply the proposed methods to simulated and empirical patient data and show significant improvements in generalization. 
Oscillatory Neural Network  ➘ “Spiking Neural Network” 
Oscillatory Recurrent GAted Neural Integrator Circuits (ORGaNICs) 
Working memory is a cognitive process that is responsible for temporarily holding and manipulating information. Most of the empirical neuroscience research on working memory has focused on measuring sustained activity in prefrontal cortex (PFC) and/or parietal cortex during simple delayed-response tasks, and most of the models of working memory have been based on neural integrators. But working memory means much more than just holding a piece of information online. We describe a new theory of working memory, based on a recurrent neural circuit that we call ORGaNICs (Oscillatory Recurrent GAted Neural Integrator Circuits). ORGaNICs are a variety of Long Short-Term Memory units (LSTMs), imported from machine learning and artificial intelligence. ORGaNICs can be used to explain the complex dynamics of delay-period activity in PFC during a working memory task. The theory is analytically tractable so that we can characterize the dynamics, and the theory provides a means for reading out information from the dynamically varying responses at any point in time, in spite of the complex dynamics. ORGaNICs can be implemented with a biophysical (electrical circuit) model of pyramidal cells, combined with shunting inhibition via a thalamocortical loop. Although introduced as a computational theory of working memory, ORGaNICs are also applicable to models of sensory processing, motor preparation and motor control. ORGaNICs offer computational advantages compared to other varieties of LSTMs that are commonly used in AI applications. Consequently, ORGaNICs are a framework for canonical computation in brains and machines. 
OSEMN Process (OSEMN) 
We’ve variously heard it said that data science requires some command-line fu for data procurement and preprocessing, or that one needs to know some machine learning or stats, or that one should know how to ‘look at data’. All of these are partially true, so we thought it would be useful to propose one possible taxonomy – we call it the Snice* taxonomy – of what a data scientist does, in roughly chronological order: · Obtain · Scrub · Explore · Model · iNterpret (or, if you like, OSEMN, which rhymes with possum). Using the OSEMN Process to Work Through a Data Problem 
OTNSGA-II II  Two important characteristics of multi-objective evolutionary algorithms are distribution and convergency. As a classic multi-objective genetic algorithm, NSGA-II is widely used in multi-objective optimization fields. However, in NSGA-II, the random population initialization and the distance-based strategy of population maintenance cannot maintain the distribution or convergency of the population well. To address these two deficiencies, this paper proposes an improved algorithm, OTNSGA-II II, which performs better on distribution and convergency. The new algorithm adopts an orthogonal experiment, which selects individuals by means of a new discontinuing non-dominated sorting and crowding distance, to produce the initial population. In addition, a new pruning strategy based on clustering is proposed to self-adaptively prune individuals with similar features and poor performance in non-dominated sorting and crowding distance, or individuals that are far away from the Pareto Front, according to the degree of intra-class aggregation of the clustering results. The new pruning strategy makes the population converge to the Pareto Front more easily and maintains the distribution of the population. OTNSGA-II and NSGA-II are compared on various types of test functions to verify the improvement of OTNSGA-II in terms of distribution and convergency. 
Out of Memory (OOM) 
Out of memory (OOM) is an often undesired state of computer operation where no additional memory can be allocated for use by programs or the operating system. Such a system will be unable to load any additional programs, and since many programs may load additional data into memory during execution, these will cease to function correctly. This usually occurs because all available memory, including disk swap space, has been allocated. 
Outer Product-based Neural Collaborative Filtering (ONCF) 
In this work, we contribute a new multi-layer neural network architecture named ONCF to perform collaborative filtering. The idea is to use an outer product to explicitly model the pairwise correlations between the dimensions of the embedding space. In contrast to existing neural recommender models that combine user embedding and item embedding via a simple concatenation or element-wise product, our proposal of using an outer product above the embedding layer results in a two-dimensional interaction map that is more expressive and semantically plausible. Above the interaction map obtained by the outer product, we propose to employ a convolutional neural network to learn high-order correlations among embedding dimensions. Extensive experiments on two public implicit feedback datasets demonstrate the effectiveness of our proposed ONCF framework, in particular the positive effect of using an outer product to model the correlations between embedding dimensions in the low level of a multi-layer neural recommender model. The experiment codes are available at: https://…/ConvNCF 
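The interaction map itself is easy to sketch. The following plain-Python illustration (the embedding values and the name `interaction_map` are made up, and the convolutional layers stacked on top in ONCF are omitted) shows that the outer product holds every pairwise product of embedding dimensions, with the element-wise product sitting on its diagonal.

```python
def interaction_map(user_embedding, item_embedding):
    """Outer product: entry (i, j) is user_embedding[i] * item_embedding[j]."""
    return [[u * v for v in item_embedding] for u in user_embedding]


# Hypothetical 3-dimensional embeddings for one user and one item.
user = [1.0, 2.0, 3.0]
item = [0.5, -1.0, 2.0]

imap = interaction_map(user, item)
diagonal = [imap[i][i] for i in range(len(user))]
elementwise = [u * v for u, v in zip(user, item)]
```

The off-diagonal entries are the cross-dimension correlations that concatenation or an element-wise product would not expose directly.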
Outlier  In statistics, an outlier is an observation point that is distant from other observations. An outlier may be due to variability in the measurement or it may indicate experimental error; the latter are sometimes excluded from the data set. 
Outlier Aware Network Embedding Algorithm (ONE) 
Attributed network embedding has received much interest from the research community as most of the networks come with some content in each node, which is also known as node attributes. Existing attributed network approaches work well when the network is consistent in structure and attributes, and nodes behave as expected. But real-world networks often have anomalous nodes. Typically these outliers, being relatively unexplainable, affect the embeddings of other nodes in the network. Thus all the downstream network mining tasks fail miserably in the presence of such outliers. Hence an integrated approach to detect anomalies and reduce their overall effect on the network embedding is required. Towards this end, we propose an unsupervised outlier aware network embedding algorithm (ONE) for attributed networks, which minimizes the effect of the outlier nodes, and hence generates robust network embeddings. We align and jointly optimize the loss functions coming from structure and attributes of the network. To the best of our knowledge, this is the first generic network embedding approach which incorporates the effect of outliers for an attributed network without any supervision. We experimented on publicly available real networks and manually planted different types of outliers to check the performance of the proposed algorithm. Results demonstrate the superiority of our approach in detecting the network outliers compared to the state-of-the-art approaches. We also consider different downstream machine learning applications on networks to show the efficiency of ONE as a generic network embedding technique. The source code is made available at https://…/ONE. 
Outlier Exposure  It is important to detect and handle anomalous inputs when deploying machine learning systems. The use of larger and more complex inputs in deep learning magnifies the difficulty of distinguishing between anomalous and in-distribution examples. At the same time, diverse image and text data commonly used by deep learning systems are available in enormous quantities. We propose leveraging these data to improve deep anomaly detection by training anomaly detectors against an auxiliary dataset of outliers, an approach we call Outlier Exposure (OE). This approach enables anomaly detectors to generalize and detect unseen anomalies. In extensive experiments in vision and natural language processing settings, we find that Outlier Exposure significantly improves the detection performance. Our approach is even applicable to density estimation models and anomaly detectors for large-scale images. We also analyze the flexibility and robustness of Outlier Exposure, and identify characteristics of the auxiliary dataset that improve performance. 
Outline Generation  In this paper, we introduce and tackle the Outline Generation (OG) task, which aims to unveil the inherent content structure of a multi-paragraph document by identifying its potential sections and generating the corresponding section headings. Without loss of generality, the OG task can be viewed as a novel structured summarization task. To generate a sound outline, an ideal OG model should be able to capture three levels of coherence, namely the coherence between context paragraphs, that between a section and its heading, and that between context headings. The first one is the foundation for section identification, while the latter two are critical for consistent heading generation. In this work, we formulate the OG task as a hierarchical structured prediction problem, i.e., to first predict a sequence of section boundaries and then a sequence of section headings accordingly. We propose a novel hierarchical structured neural generation model, named HiStGen, for the task. Our model attempts to capture the three-level coherence in the following ways. First, we introduce a Markov paragraph dependency mechanism between context paragraphs for section identification. Second, we employ a section-aware attention mechanism to ensure the semantic coherence between a section and its heading. Finally, we leverage a Markov heading dependency mechanism and a review mechanism between context headings to improve the consistency and eliminate duplication between section headings. Besides, we build a novel WIKIOG dataset, a public collection which consists of over 1.75 million document-outline pairs for research on the OG task. Experimental results on our benchmark dataset demonstrate that our model can significantly outperform several state-of-the-art sequential generation models for the OG task. 
Outline of Knowledge  Knowledge – familiarity with someone or something, which can include facts, information, descriptions, and/or skills acquired through experience or education. It can refer to the theoretical or practical understanding of a subject. It can be implicit (as with practical skill or expertise) or explicit (as with the theoretical understanding of a subject); and it can be more or less formal or systematic. 
Out-of-Core Algorithm  Out-of-core or external memory algorithms are algorithms that are designed to process data that is too large to fit into a computer’s main memory at one time. Such algorithms must be optimized to efficiently fetch and access data stored in slow bulk memory such as hard drives or tape drives. A typical example is geographic information systems, especially digital elevation models, where the full data set easily exceeds several gigabytes or even terabytes of data. This notion naturally extends to a network connecting a data server to a processing or visualization workstation. Popular mass-of-data based web applications such as Google Maps or Google Earth fall under this topic. It also extends to GPU computing – utilizing powerful graphics cards with little memory (compared to CPU memory) and slow CPU-GPU memory transfer (compared to computation bandwidth). 
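A very small instance of the idea is a streaming reduction that holds only one fixed-size chunk in memory at a time. The sketch below uses only the Python standard library; the function name `streaming_mean`, the chunk size and the temporary file of one number per line are assumptions chosen for illustration.

```python
import os
import tempfile


def streaming_mean(path, chunk_lines=1000):
    """Average one number per line while holding at most chunk_lines values."""
    total, count = 0.0, 0
    chunk = []
    with open(path) as fh:
        for line in fh:
            chunk.append(float(line))
            if len(chunk) == chunk_lines:
                total += sum(chunk)   # fold the chunk into the running state
                count += len(chunk)
                chunk.clear()         # release the chunk before reading on
    total += sum(chunk)               # fold in the final partial chunk
    count += len(chunk)
    return total / count if count else float("nan")


# Write a stand-in "large" file, then process it chunk by chunk.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    for value in range(10_000):
        tmp.write(f"{value}\n")
    data_path = tmp.name

mean_value = streaming_mean(data_path, chunk_lines=256)
os.remove(data_path)
```

Only the running sum and count persist between chunks, so peak memory depends on the chunk size rather than on the file size.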
Out-of-Distribution (OOD) 
Deep learning has significantly improved the performance of machine learning systems in fields such as computer vision, natural language processing, and speech. In turn, these algorithms are integral in commercial applications such as autonomous driving, medical diagnosis, and web search. In these applications, it is critical to detect sensor failures, unusual environments, novel biological phenomena, and cyber attacks. To accomplish this, systems must be capable of detecting when inputs are anomalous or out-of-distribution (OOD). 
Out-of-Distribution Detector for Neural Networks (ODIN) 
We consider the problem of detecting out-of-distribution examples in neural networks. We propose ODIN, a simple and effective out-of-distribution detector for neural networks that does not require any change to a pre-trained model. Our method is based on the observation that using temperature scaling and adding small perturbations to the input can separate the softmax score distributions of in- and out-of-distribution samples, allowing for more effective detection. We show in a series of experiments that our approach is compatible with diverse network architectures and datasets. It consistently outperforms the baseline approach [1] by a large margin, establishing a new state-of-the-art performance on this task. For example, ODIN reduces the false positive rate from the baseline 34.7% to 4.3% on the DenseNet (applied to CIFAR-10) when the true positive rate is 95%. We theoretically analyze the method and prove that performance improvement is guaranteed under mild conditions on the image distributions. 
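The temperature-scaling half of the method can be sketched without a neural network, given a vector of logits. This is a partial illustration only: the input perturbation step needs gradients of a trained model and is omitted, and the logits, temperature and threshold below are hypothetical values rather than settings taken from the paper.

```python
import math


def max_softmax_score(logits, temperature=1.0):
    """Largest softmax probability after dividing the logits by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]  # subtract the max for stability
    return max(exps) / sum(exps)


def is_in_distribution(logits, temperature, threshold):
    """Flag an input as in-distribution when its score reaches the threshold."""
    return max_softmax_score(logits, temperature) >= threshold


logits = [10.0, 2.0, 1.0]  # hypothetical network outputs for one input
score_plain = max_softmax_score(logits, temperature=1.0)
score_scaled = max_softmax_score(logits, temperature=1000.0)
```

A high temperature flattens the softmax towards uniform, so a detection threshold has to be chosen on the scaled scores, typically with held-out data.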
Out-of-sample Testing
Output Masks  In this paper we propose a novel method for achieving average consensus in a multi-agent network without disclosing the initial states of the individual agents. In order to achieve privacy protection of the state variables, we introduce maps, called output masks, which alter the value of the states before publicly broadcasting them. These output masks are local (i.e., implemented independently by each agent), deterministic, time-varying and converging asymptotically to the true state. The resulting masked system is also time-varying and has the original (unmasked) system as its limit system. It is shown in the paper that the masked system has the original average consensus value as a global attractor. However, in order to preserve privacy, it cannot share an equilibrium point with the unmasked system, meaning that in the masked system the global attractor cannot also be stable. 
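To make the masking idea concrete, here is a minimal discrete-time sketch. It is not the construction from the paper. It assumes a hypothetical additive mask `offset * decay**t` that vanishes over time, a symmetric adjacency matrix, and a fixed step size; all names and defaults are illustrative.

```python
def masked_output(state, offset, decay, t):
    """Hypothetical additive mask: the term offset * decay**t vanishes as t grows."""
    return state + offset * decay ** t


def masked_consensus(initial, adjacency, offsets, decay=0.8, step=0.2, iterations=200):
    """Run consensus updates in which agents only ever see masked outputs."""
    states = list(initial)
    n = len(states)
    for t in range(iterations):
        # Each agent broadcasts a masked value instead of its true state.
        outputs = [masked_output(states[i], offsets[i], decay, t) for i in range(n)]
        # With symmetric weights the pairwise exchanges cancel in the sum,
        # so the average of the true states is preserved at every step.
        states = [
            states[i]
            + step * sum(adjacency[i][j] * (outputs[j] - outputs[i]) for j in range(n))
            for i in range(n)
        ]
    return states
```

For four fully connected agents with initial states `[1, 2, 3, 6]`, the states approach the average 3 even though the first broadcast values differ from the true initial states.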
Output Range Analysis  Deep neural networks (NN) are extensively used for machine learning tasks such as image classification, perception and control of autonomous systems. Increasingly, these deep NNs are also being deployed in high-assurance applications. Thus, there is a pressing need for developing techniques to verify neural networks to check whether certain user-expected properties are satisfied. In this paper, we study a specific verification problem of computing a guaranteed range for the output of a deep neural network given a set of inputs represented as a convex polyhedron. Range estimation is a key primitive for verifying deep NNs. We present an efficient range estimation algorithm that uses a combination of local search and linear programming problems to efficiently find the maximum and minimum values taken by the outputs of the NN over the given input set. In contrast to recently proposed ‘monolithic’ optimization approaches, we use local gradient descent to repeatedly find and eliminate local minima of the function. The final global optimum is certified using a mixed integer programming instance. We implement our approach and compare it with Reluplex, a recently proposed solver for deep neural networks. We demonstrate the effectiveness of the proposed approach for verification of NNs used in automated control as well as those used in classification. 
Output-Constrained BNN  Bayesian neural network (BNN) priors are defined in parameter space, making it hard to encode prior knowledge expressed in function space. We formulate a prior that incorporates functional constraints about what the output can or cannot be in regions of the input space. Output-Constrained BNNs (OC-BNN) represent an interpretable approach to enforcing a range of constraints, fully consistent with the Bayesian framework and amenable to black-box inference. We demonstrate how OC-BNNs improve model robustness and prevent the prediction of infeasible outputs in two real-world applications of healthcare and robotics. 
Outranking Methods (OM) 
A classical problem in the field of Multiple Criteria Decision Making (MCDM) is to build a preference relation on a set of multi-attributed alternatives on the basis of preferences expressed on each attribute and inter-attribute information such as weights. Based on this preference relation (or, more generally, on various relations obtained following a robustness analysis), a recommendation is elaborated (e.g. exhibiting a subset likely to contain the best alternatives). OutrankingTools 
Overconfidence Effect  The overconfidence effect is a well-established bias in which someone’s subjective confidence in their judgments is reliably greater than their objective accuracy, especially when confidence is relatively high. For example, in some quizzes, people rate their answers as “99% certain” but are wrong 40% of the time. It has been proposed that a metacognitive trait mediates the accuracy of confidence judgments, but this trait’s relationship to variations in cognitive ability and personality remains uncertain. Overconfidence is one example of a miscalibration of subjective probabilities. 
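The quiz example can be turned into a small calculation. The sketch below measures overconfidence as mean stated confidence minus observed accuracy. This is one common way to quantify miscalibration, and the function name is an illustrative assumption.

```python
def overconfidence(confidences, correct):
    """Mean stated confidence minus the fraction of answers that were correct."""
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("need equally many confidences and outcomes, at least one")
    mean_confidence = sum(confidences) / len(confidences)
    accuracy = sum(1 for c in correct if c) / len(correct)
    return mean_confidence - accuracy
```

For ten answers all rated 0.99 of which four are wrong, accuracy is 0.60 and the gap is 0.99 - 0.60 = 0.39. A perfectly calibrated respondent would have a gap near zero.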
Overdispersion  In statistics, overdispersion is the presence of greater variability (statistical dispersion) in a data set than would be expected based on a given statistical model. A common task in applied statistics is choosing a parametric model to fit a given set of empirical observations. This necessitates an assessment of the fit of the chosen model. It is usually possible to choose the model parameters in such a way that the theoretical population mean of the model is approximately equal to the sample mean. However, especially for simple models with few parameters, theoretical predictions may not match empirical observations for higher moments. When the observed variance is higher than the variance of a theoretical model, overdispersion has occurred. Conversely, underdispersion means that there was less variation in the data than predicted. Overdispersion is a very common feature in applied data analysis because in practice, populations are frequently heterogeneous (nonuniform) contrary to the assumptions implicit within widely used simple parametric models. 
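A quick check for count data is the variance-to-mean ratio. Under a Poisson model the theoretical variance equals the mean, so a ratio well above 1 suggests overdispersion and a ratio well below 1 suggests underdispersion. The sketch below assumes the Poisson reference model, and the function name is illustrative.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean (the variance-to-mean ratio)."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; the index is undefined")
    return variance(counts) / m
```

This is a descriptive diagnostic only. A formal assessment would compare the ratio against its sampling distribution under the chosen model.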
Overfitting  In statistics and machine learning, overfitting occurs when a statistical model describes random error or noise instead of the underlying relationship. Overfitting generally occurs when a model is excessively complex, such as having too many parameters relative to the number of observations. A model which has been overfit will generally have poor predictive performance, as it can exaggerate minor fluctuations in the data. 
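The following sketch contrasts a two-parameter line with a model that memorizes every training point (a 1-nearest-neighbour lookup). The data-generating relationship `y = 2x + 1`, the noise level, and all function names are assumptions made for illustration.

```python
import random


def make_data(n, rng, noise=1.0):
    """Noisy samples of the (assumed) true relationship y = 2x + 1."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (two parameters)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return lambda x: a * x + b


def fit_memorizer(xs, ys):
    """1-nearest-neighbour lookup: effectively one parameter per observation."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

The memorizer reproduces the training set exactly, noise included, so its training error is zero. On fresh data drawn from the same process, that memorized noise typically makes its error larger than the line's.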
Overfly Algorithm  In this paper we investigate the supervised backpropagation training of multilayer neural networks from a dynamical systems point of view. We discuss some links with the qualitative theory of differential equations and introduce the overfly algorithm to tackle the local minima problem. Our approach is based on the existence of first integrals of the generalised gradient system with built-in dissipation. 
Overlapping K-Means (OKM) 
Cleuziou, G. (2007) <doi:10.1109/icpr.2008.4761079> COveR 
OverSketch  We propose OverSketch, an approximate algorithm for distributed matrix multiplication in serverless computing. OverSketch leverages ideas from matrix sketching and high-performance computing to enable cost-efficient multiplication that is resilient to faults and straggling nodes pervasive in low-cost serverless architectures. We establish statistical guarantees on the accuracy of OverSketch and empirically validate our results by solving a large-scale linear program using interior-point methods, demonstrating a 34% reduction in compute time on AWS Lambda. 
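The general idea of sketched multiplication with spare workers can be illustrated in a few lines. This sketch is only loosely inspired by the description above and is not the published algorithm. It assumes count-sketch blocks, one block per worker, and simple averaging of the first `needed` results; all names and parameters are hypothetical.

```python
import random


def count_sketch(n, buckets, rng):
    """Random count-sketch: a bucket and a sign for each of the n inner indices."""
    return [(rng.randrange(buckets), rng.choice((-1.0, 1.0))) for _ in range(n)]


def block_product(a, b, sketch, buckets):
    """One worker's task: (A S)(S^T B) for a single sketch block S."""
    m, p = len(a), len(b[0])
    a_s = [[0.0] * buckets for _ in range(m)]
    s_b = [[0.0] * p for _ in range(buckets)]
    for k, (bucket, sgn) in enumerate(sketch):
        for i in range(m):
            a_s[i][bucket] += sgn * a[i][k]
        for j in range(p):
            s_b[bucket][j] += sgn * b[k][j]
    return [[sum(a_s[i][r] * s_b[r][j] for r in range(buckets)) for j in range(p)]
            for i in range(m)]


def oversketched_matmul(a, b, buckets, needed, extra, stragglers=(), seed=0):
    """Average the first `needed` block products from workers not in `stragglers`."""
    rng = random.Random(seed)
    n = len(b)
    sketches = [count_sketch(n, buckets, rng) for _ in range(needed + extra)]
    finished = [w for w in range(needed + extra) if w not in stragglers][:needed]
    if len(finished) < needed:
        raise RuntimeError("more stragglers than spare workers")
    m, p = len(a), len(b[0])
    result = [[0.0] * p for _ in range(m)]
    for w in finished:
        block = block_product(a, b, sketches[w], buckets)
        for i in range(m):
            for j in range(p):
                result[i][j] += block[i][j] / needed
    return result
```

Because `extra` blocks are computed beyond the `needed` ones, the result does not have to wait for the slowest workers. Each block product is an unbiased estimate of the true product, so accuracy depends on the number of buckets and blocks.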
OverSketched Newton  Motivated by recent developments in serverless systems for large-scale machine learning as well as improvements in scalable randomized matrix algorithms, we develop OverSketched Newton, a randomized Hessian-based optimization algorithm to solve large-scale smooth and strongly convex problems in serverless systems. OverSketched Newton leverages matrix sketching ideas from Randomized Numerical Linear Algebra to compute the Hessian approximately. These sketching methods lead to inbuilt resiliency against stragglers that are a characteristic of serverless architectures. We establish that OverSketched Newton has a linear-quadratic convergence rate, and we empirically validate our results by solving large-scale supervised learning problems on real-world datasets. Experiments demonstrate a reduction of ~50% in total running time on AWS Lambda, compared to state-of-the-art distributed optimization schemes. 
Owl  Owl is a new numerical library developed in the OCaml language. It focuses on providing a comprehensive set of high-level numerical functions so that developers can quickly build up data analytical applications. In this abstract, we will present Owl’s design, core components, and its key functionality. 
OWLAx  Once the conceptual overview, in terms of a somewhat informal class diagram, has been designed in the course of engineering an ontology, the process of adding many of the appropriate logical axioms is mostly a routine task. We provide a Protege plugin which supports this task, together with a visual user interface, based on established methods for ontology design pattern modeling. 