Query Autofiltering google
Query Autofiltering is autotagging of the incoming query where the knowledge source is the search index itself. What does this mean and why should we care? Content tagging processes are traditionally done at index time either manually or automatically by machine learning or knowledge based (taxonomy/ontology) approaches. To ‘tag’ a piece of content means to attach a piece of metadata that defines some attribute of that content (such as product type, color, price, date and so on). We use this now for faceted search – if I search for ‘shirts’, the search engine will bring back all records that have the token ‘shirts’ or the singular form ‘shirt’ (using a technique called stemming). At the same time, it will display all of the values of the various tags that we added to the content at index time under the field name or ‘category’ of these tags. We call these things facets. When the user clicks on a facet link, say color = red, we then generate a Solr filter query with the name / value pair of <field name> = <facet value> and add that to the original query. What this does is narrow the search result set to all records that have ‘shirt’ or ‘shirts’ and the ‘color’ facet value of ‘red’. Another benefit of faceting is that the user can see all of the colors that shirts come in, so they can also find blue shirts in the same way.
Query Autofiltering Revisited


Adaptive Sequential Machine Learning google
A framework previously introduced in [3] for solving a sequence of stochastic optimization problems with bounded changes in the minimizers is extended and applied to machine learning problems such as regression and classification. The stochastic optimization problems arising in these machine learning problems is solved using algorithms such as stochastic gradient descent (SGD). A method based on estimates of the change in the minimizers and properties of the optimization algorithm is introduced for adaptively selecting the number of samples at each time step to ensure that the excess risk, i.e., the expected gap between the loss achieved by the approximate minimizer produced by the optimization algorithm and the exact minimizer, does not exceed a target level. A bound is developed to show that the estimate of the change in the minimizers is non-trivial provided that the excess risk is small enough. Extensions relevant to the machine learning setting are considered, including a cost-based approach to select the number of samples with a cost budget over a fixed horizon, and an approach to applying cross-validation for model selection. Finally, experiments with synthetic and real data are used to validate the algorithms. …

AsymDPOP google
Asymmetric distributed constraint optimization problems (ADCOPs) are an emerging model for coordinating agents with personal preferences. However, the existing inference-based complete algorithms which use local eliminations cannot be applied to ADCOPs, as the parent agents are required to transfer their private functions to their children. Rather than disclosing private functions explicitly to facilitate local eliminations, we solve the problem by enforcing delayed eliminations and propose AsymDPOP, the first inference-based complete algorithm for ADCOPs. To solve the severe scalability problems incurred by delayed eliminations, we propose to reduce the memory consumption by propagating a set of smaller utility tables instead of a joint utility table, and to reduce the computation efforts by sequential optimizations instead of joint optimizations. The empirical evaluation indicates that AsymDPOP significantly outperforms the state-of-the-arts, as well as the vanilla DPOP with PEAV formulation. …

Correlated Variational Auto-Encoder (CVAE) google
Variational Auto-Encoders (VAEs) are capable of learning latent representations for high dimensional data. However, due to the i.i.d. assumption, VAEs only optimize the singleton variational distributions and fail to account for the correlations between data points, which might be crucial for learning latent representations from dataset where a priori we know correlations exist. We propose Correlated Variational Auto-Encoders (CVAEs) that can take the correlation structure into consideration when learning latent representations with VAEs. CVAEs apply a prior based on the correlation structure. To address the intractability introduced by the correlated prior, we develop an approximation by average of a set of tractable lower bounds over all maximal acyclic subgraphs of the undirected correlation graph. Experimental results on matching and link prediction on public benchmark rating datasets and spectral clustering on a synthetic dataset show the effectiveness of the proposed method over baseline algorithms. …