Crossbar-architecture-based devices have been widely adopted in neural network accelerators because of their high efficiency on vector-matrix multiplication (VMM) operations. However, in the case of convolutional neural networks (CNNs), the efficiency is compromised dramatically due to the large amounts of data reuse. Although some mapping methods have been designed to achieve a balance between the execution throughput and resource overhead, the resource consumption cost is still huge while maintaining the throughput. Network pruning is a promising and widely studied approach to shrinking the model size. However, previous work did not consider the crossbar architecture and the corresponding mapping method, so it cannot be directly utilized by crossbar-based neural network accelerators. Tightly combining the crossbar structure and its mapping, this paper proposes a crossbar-aware pruning framework based on a formulated L0-norm constrained optimization problem. Specifically, we design an L0-norm constrained gradient descent (LGD) with relaxant probabilistic projection (RPP) to solve this problem. Two grains of sparsity are successfully achieved: i) intuitive crossbar-grain sparsity and ii) column-grain sparsity with output recombination, based on which we further propose an input feature maps (FMs) reorder method to improve the model accuracy. We evaluate our crossbar-aware pruning framework on the medium-scale CIFAR10 dataset and the large-scale ImageNet dataset with VGG and ResNet models. Our method is able to reduce the crossbar overhead by 44%-72% with little accuracy degradation. This work greatly reduces the resource and related energy cost, providing a new co-design solution for mapping CNNs onto various crossbar devices with significantly higher efficiency.
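To make the idea of crossbar-grain sparsity concrete, the following is a minimal sketch under stated assumptions, not the paper's LGD/RPP procedure. It assumes the layer's weights are already unrolled into a 2D matrix that is tiled onto fixed-size crossbars, and it uses a plain deterministic top-k projection (keep the `keep` tiles with the largest squared Frobenius norm) in place of the relaxant probabilistic projection. The function name `crossbar_topk_projection` and all parameters are hypothetical.

```python
def crossbar_topk_projection(weights, block_rows, block_cols, keep):
    """Zero out all but the `keep` crossbar-sized tiles with the largest energy.

    Assumption: `weights` is a 2D list (rows x cols) already mapped so that each
    block_rows x block_cols tile occupies one crossbar. This is a hard top-k
    projection onto an L0 constraint over tiles, used here only as a stand-in.
    """
    rows, cols = len(weights), len(weights[0])

    # Score every tile by its squared Frobenius norm.
    scores = {}
    for r0 in range(0, rows, block_rows):
        for c0 in range(0, cols, block_cols):
            scores[(r0, c0)] = sum(
                weights[r][c] ** 2
                for r in range(r0, min(r0 + block_rows, rows))
                for c in range(c0, min(c0 + block_cols, cols))
            )

    # Keep the `keep` highest-scoring tiles; all other tiles become zero.
    kept = set(sorted(scores, key=scores.get, reverse=True)[:keep])
    pruned = [[0.0] * cols for _ in range(rows)]
    for r0, c0 in kept:
        for r in range(r0, min(r0 + block_rows, rows)):
            for c in range(c0, min(c0 + block_cols, cols)):
                pruned[r][c] = weights[r][c]
    return pruned, kept


# Hypothetical 4x4 weight matrix tiled onto 2x2 crossbars.
W = [
    [1.0, 1.0, 0.1, 0.1],
    [1.0, 1.0, 0.1, 0.1],
    [0.2, 0.2, 3.0, 3.0],
    [0.2, 0.2, 3.0, 3.0],
]
pruned_W, kept_tiles = crossbar_topk_projection(W, 2, 2, keep=2)
```

Each tile that is zeroed out corresponds to one crossbar that no longer needs to be allocated, which is the resource saving the abstract refers to.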
We introduce a theory-driven mechanism for learning a neural network model that performs generative topology design in one shot given a problem setting, circumventing the conventional iterative procedure that computational design tasks usually entail. The proposed mechanism can lead to machines that quickly respond to new design requirements based on their knowledge accumulated through past experiences of design generation. Achieving such a mechanism through supervised learning would require an impractically large amount of problem-solution pairs for training, due to the known limitation of deep neural networks in knowledge generalization. To this end, we introduce an interaction between a student (the neural network) and a teacher (the optimality conditions underlying topology optimization): The student learns from existing data and is tested on unseen problems. Deviation of the student’s solutions from the optimality conditions is quantified, and used to choose new data points for the student to learn from. We show through a compliance minimization problem that the proposed learning mechanism is significantly more data efficient than using a static dataset under the same computational budget.
This paper considers the problem of estimating a change point in the covariance matrix in a sequence of high-dimensional vectors, where the dimension is substantially larger than the sample size. A two-stage approach is proposed to efficiently estimate the location of the change point. The first step consists of a reduction of the dimension to identify elements of the covariance matrices corresponding to significant changes. In a second step we use the components after dimension reduction to determine the position of the change point. Theoretical properties are developed for both steps and numerical studies are conducted to support the new methodology.
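The two-stage idea can be illustrated with a generic CUSUM-based sketch. This is an assumption-laden stand-in, not the authors' estimator: it assumes zero-mean data, scores each covariance entry by the maximum of a CUSUM curve on the products x_i * x_j (stage 1, dimension reduction), and then locates the change point by maximizing the summed squared CUSUM over the retained entries (stage 2). The names `cusum_curve`, `two_stage_change_point`, and `n_keep` are hypothetical, and the toy data use a small dimension purely for readability.

```python
import math
import random


def cusum_curve(z):
    """CUSUM statistic of a scalar series for every split point t = 1..n-1."""
    n = len(z)
    total = sum(z)
    curve, running = [], 0.0
    for t in range(1, n):
        running += z[t - 1]
        mean_before = running / t
        mean_after = (total - running) / (n - t)
        curve.append(math.sqrt(t * (n - t) / n) * abs(mean_before - mean_after))
    return curve  # curve[t - 1] corresponds to a split after t observations


def two_stage_change_point(x, n_keep):
    """x: list of n zero-mean observations, each a list of length p."""
    n, p = len(x), len(x[0])

    # One CUSUM curve per covariance entry (i, j), i <= j.
    curves = {}
    for i in range(p):
        for j in range(i, p):
            curves[(i, j)] = cusum_curve([row[i] * row[j] for row in x])

    # Stage 1: keep the entries whose CUSUM curve peaks highest.
    selected = sorted(curves, key=lambda k: max(curves[k]), reverse=True)[:n_keep]

    # Stage 2: aggregate the retained curves and take the maximizing split.
    agg = [sum(curves[k][t] ** 2 for k in selected) for t in range(n - 1)]
    tau = max(range(n - 1), key=agg.__getitem__) + 1
    return selected, tau


# Toy data: the variance of coordinate 0 jumps after the first 100 observations.
rng = random.Random(0)
data = [
    [rng.gauss(0.0, 3.0 if (t >= 100 and i == 0) else 1.0) for i in range(5)]
    for t in range(200)
]
selected_entries, tau_hat = two_stage_change_point(data, n_keep=1)
```

Selecting a fixed number of entries instead of thresholding is a simplification made here to avoid choosing a threshold; the abstract does not say how the significant entries are identified.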
A public decision-making problem consists of a set of issues, each with multiple possible alternatives, and a set of competing agents, each with a preferred alternative for each issue. We study adaptations of market economies to this setting, focusing on binary issues. Issues have prices, and each agent is endowed with artificial currency that she can use to purchase probability for her preferred alternatives (we allow randomized outcomes). We first show that when each issue has a single price that is common to all agents, market equilibria can be arbitrarily bad. This negative result motivates a different approach. We present a novel technique called ‘pairwise issue expansion’, which transforms any public decision-making instance into an equivalent Fisher market, the simplest type of private goods market. This is done by expanding each issue into many goods: one for each pair of agents who disagree on that issue. We show that the equilibrium prices in the constructed Fisher market yield a ‘pairwise pricing equilibrium’ in the original public decision-making problem which maximizes Nash welfare. More broadly, pairwise issue expansion uncovers a powerful connection between the public decision-making and private goods settings; this immediately yields several interesting results about public decisions markets, and furthers the hope that we will be able to find a simple iterative voting protocol that leads to near-optimum decisions.
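The expansion step described above (one good per issue and per pair of agents who disagree on that issue) can be sketched as a simple enumeration. The sketch covers only the construction of the goods; budgets, utilities, and the equilibrium computation of the resulting Fisher market are not specified in the abstract and are left out. The function name and the 0/1 preference encoding are assumptions made for illustration.

```python
from itertools import combinations


def pairwise_issue_expansion(preferences):
    """Enumerate the goods created by expanding binary issues.

    preferences[a][i] is agent a's preferred alternative (0 or 1) on issue i.
    Returns one tuple (issue, agent_a, agent_b) for every pair of agents
    who disagree on that issue.
    """
    n_agents = len(preferences)
    n_issues = len(preferences[0])
    goods = []
    for issue in range(n_issues):
        for a, b in combinations(range(n_agents), 2):
            if preferences[a][issue] != preferences[b][issue]:
                goods.append((issue, a, b))
    return goods


# Hypothetical instance: three agents, two binary issues.
prefs = [
    [1, 0],
    [0, 0],
    [1, 1],
]
goods = pairwise_issue_expansion(prefs)
```

An issue on which all agents agree produces no goods at all, so the size of the constructed market grows with the amount of disagreement rather than with the number of issues alone.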
In this paper, we propose a new bivariate point process model to study the activity patterns of social media users. The proposed model is flexible enough to accommodate the complex behaviors of modern social media users and can also provide meaningful insight into them. A composite likelihood approach and a composite EM estimation procedure are developed to overcome the challenges that arise in parameter estimation. Furthermore, we show consistency and asymptotic normality of the resulting estimator. We apply our proposed method to Donald Trump’s Twitter data and study if and how his tweeting behavior evolved before, during and after the presidential campaign. Moreover, we apply our method to a large-scale social media dataset and find interesting subgroups of users with distinct behaviors. Additionally, we discuss the effect of social ties on a user’s online content generating behavior.
Over the last several years, the field of natural language processing has been propelled forward by an explosion in the use of deep learning models. This survey provides a brief introduction to the field and a quick overview of deep learning architectures and methods. It then sifts through the plethora of recent studies and summarizes a large assortment of relevant contributions. Analyzed research areas include several core linguistic processing issues in addition to a number of applications of computational linguistics. A discussion of the current state of the art is then provided along with recommendations for future research in the field.
Predicting transportation modes from GPS (Global Positioning System) records is a hot topic in the trajectory mining domain. Each GPS record is called a trajectory point and a trajectory is a sequence of these points. Trajectory mining has applications including but not limited to transportation mode detection, tourism, traffic congestion, smart cities management, animal behaviour analysis, environmental preservation, and traffic dynamics. Transportation modes prediction, as one of the tasks in human mobility and vehicle mobility applications, plays an important role in resource allocation, traffic management systems, tourism planning and accident detection. In this work, the framework proposed in Etemad et al. is extended to consider other aspects of the transportation modes prediction task. Wrapper search and information retrieval methods were investigated to find the best subset of trajectory features. After finding the best classifier and the best feature subset, the framework is compared against two related papers that applied deep learning methods. The results show that our framework achieved better performance. Moreover, ground truth noise removal improved the accuracy of the transportation modes prediction task; however, the assumption of having access to test set labels in the pre-processing task is invalid. Furthermore, cross validation approaches were investigated and the performance results show that the random cross validation method provides optimistic results.
Recently, Convolutional Neural Networks (CNNs) have dominated the field of computer vision. Their widespread success has been attributed to their representation learning capabilities. For classification tasks, CNNs have widely employed probabilistic output and have shown the significance of providing additional confidence for predictions. However, such probabilistic methodologies are not widely applicable for addressing regression problems using CNNs, as regression involves learning unconstrained continuous and, in many cases, multi-variate target variables. We propose a PRObabilistic Parametric rEgression Loss (PROPEL) that enables probabilistic regression using CNNs. PROPEL is fully differentiable and, hence, can be easily incorporated for end-to-end training of existing regressive CNN architectures. The proposed method is flexible as it learns complex unconstrained probabilities while being generalizable to higher dimensional multi-variate regression problems. We utilize a PROPEL-based CNN to address the problem of learning hand and head orientation from uncalibrated color images. Comprehensive experimental validation and comparisons with existing CNN regression loss functions are provided. Our experimental results indicate that PROPEL significantly improves the performance of a CNN, while reducing model parameters by 10x as compared to the existing state-of-the-art.
Bayesian models are naturally equipped to provide recursive inference because they can formally reconcile new data and existing scientific information. However, popular use of Bayesian methods often avoids priors that are based on exact posterior distributions resulting from former studies. Recursive Bayesian methods include two main approaches that we refer to as Prior- and Proposal-Recursive Bayes. Prior-Recursive Bayes uses Bayesian updating, fitting models to partitions of data sequentially, and provides a convenient way to accommodate new data as they become available. Prior-Recursive Bayes uses the posterior from the previous stage as the prior in the new stage based on the latest data. By contrast, Proposal-Recursive Bayes is intended for use with hierarchical Bayesian models and uses a set of transient priors in first stage independent analyses of the data partitions. The second stage of Proposal-Recursive Bayes uses the posterior distributions from the first stage as proposals in an MCMC algorithm to fit the full model. The second-stage recursive proposals simplify the Metropolis-Hastings ratio substantially and can lead to computational advantages for the Proposal-Recursive Bayes method. We combine Prior- and Proposal-Recursive concepts in a framework that can be used to fit any Bayesian model exactly, and often with computational improvements. We demonstrate our new method by fitting a geostatistical model to spatially-explicit data in a sequence of stages, leading to computational improvements by a factor of three in our example. While the method we propose provides exact inference, it can also be coupled with modern approximation methods leading to additional computational efficiency. Overall, our new approach has implications for big data, streaming data, and optimal adaptive design situations and can be modified to fit a broad class of Bayesian models to data.
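The Prior-Recursive idea, in which the posterior from one data partition becomes the prior for the next, can be checked on a conjugate toy model. The sketch below uses a Beta-Bernoulli model, chosen here only because its updates are available in closed form; it is not the geostatistical model from the abstract, and the data and function name are hypothetical.

```python
def beta_update(alpha, beta, data):
    """Conjugate Beta-Bernoulli update: returns the posterior (alpha, beta)."""
    successes = sum(data)
    return alpha + successes, beta + len(data) - successes


# Hypothetical binary observations split into two partitions.
batch1 = [1, 0, 1, 1]
batch2 = [0, 1, 1, 0, 1]

prior = (1, 1)  # uniform Beta(1, 1) prior

# Prior-Recursive Bayes: the stage-1 posterior is the stage-2 prior.
stage1 = beta_update(*prior, batch1)
stage2 = beta_update(*stage1, batch2)

# Fitting all data at once gives the same posterior.
full = beta_update(*prior, batch1 + batch2)
```

In this conjugate case the recursive and single-pass posteriors coincide exactly, which is the sense in which sequential updating loses nothing; for non-conjugate models the stage-wise posterior is typically only available as samples, which is where the proposal-based second stage described above comes in.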
The Internet of Things (IoT) integrates billions of smart devices that can communicate with one another with minimal human intervention. It is one of the fastest developing fields in the history of computing, with an estimated 50 billion devices by the end of 2020. On the one hand, IoT plays a crucial role in enhancing several real-life smart applications that can improve life quality. On the other hand, the crosscutting nature of IoT systems and the multidisciplinary components involved in the deployment of such systems introduce new security challenges. Implementing security measures, such as encryption, authentication, access control, network security and application security, for IoT devices and their inherent vulnerabilities is ineffective. Therefore, existing security methods should be enhanced to secure the IoT system effectively. Machine learning and deep learning (ML/DL) have advanced considerably over the last few years, and machine intelligence has transitioned from laboratory curiosity to practical machinery in several important applications. Consequently, ML/DL methods are important in transforming the security of IoT systems from merely facilitating secure communication between devices to security-based intelligence systems. The goal of this work is to provide a comprehensive survey of ML/DL methods that can be used to develop enhanced security methods for IoT systems. IoT security threats that are related to inherent or newly introduced threats are presented, and various potential IoT system attack surfaces and the possible threats related to each surface are discussed. We then thoroughly review ML/DL methods for IoT security and present the opportunities, advantages and shortcomings of each method. We discuss the opportunities and challenges involved in applying ML/DL to IoT security. These opportunities and challenges can serve as potential future research directions.
In most real-world systems, units are interconnected and can be represented as networks consisting of nodes and edges. For instance, in social systems individuals can have social ties, family or financial relationships. In settings where some units are exposed to a treatment and its effect spills over to connected units, estimating both the direct effect of the treatment and spillover effects presents several challenges. First, assumptions on the way and the extent to which spillover effects occur along the observed network are required. Second, in observational studies, where the treatment assignment is not under the control of the investigator, confounding and homophily are potential threats to the identification and estimation of causal effects on networks. Here, we make two structural assumptions: i) neighborhood interference, which assumes interference operates only through a function of the immediate neighbors’ treatments, and ii) unconfoundedness of the individual and neighborhood treatment, which rules out the presence of unmeasured confounding variables, including those driving homophily. Under these assumptions we develop a new covariate-adjustment estimator for treatment and spillover effects in observational studies on networks. Estimation is based on a generalized propensity score that balances individual and neighborhood covariates across units under different levels of individual treatment and of exposure to neighbors’ treatment. Adjustment for the propensity score is performed using a penalized spline regression. Inference capitalizes on a three-step Bayesian procedure which takes into account the uncertainty in the propensity score estimation and avoids model feedback. Finally, correlation of interacting units is taken into account using a community detection algorithm and incorporating random effects in the outcome model.
Nowadays, sampling-based Approximate Query Processing (AQP) is widely regarded as a promising way to achieve interactivity in big data analytics. To build such an AQP system, finding the minimal sample size for a query under given error constraints, called Sample Size Optimization (SSO), is an essential yet unsolved problem. Ideally, the goal of solving the SSO problem is to achieve statistical accuracy, computational efficiency and broad applicability all at the same time. Existing approaches either make idealistic assumptions on the statistical properties of the query, or completely disregard them. This may result in overemphasizing only one of the three goals while neglecting the others. To overcome these limitations, we first examine carefully the statistical properties shared by common analytical queries. Then, based on these properties, we propose a linear model describing the relationship between sample sizes and the approximation errors of a query, which is called the error model. Next, we propose a Model-guided Iterative Sample Selection (MISS) framework to solve the SSO problem generally. Afterwards, based on the MISS framework, we propose a concrete algorithm, called $L^2$Miss, to find optimal sample sizes under the $L^2$ norm error metric. Moreover, we extend the $L^2$Miss algorithm to handle other error metrics. Finally, we show theoretically and empirically that the $L^2$Miss algorithm and its extensions achieve satisfactory accuracy and efficiency for a considerably wide range of analytical queries.
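As a rough illustration of how an error model can guide the choice of sample size, the sketch below fits a straight line to (log sample size, log error) pairs and inverts it for a target error. The log-log form is an assumption made here; the abstract only states that the error model is linear and does not give its exact form, and this is not the $L^2$Miss algorithm. The function names and the synthetic measurements are hypothetical.

```python
import math


def fit_log_linear(sizes, errors):
    """Least-squares fit of log(error) = intercept + slope * log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_sample_size(intercept, slope, target_error):
    """Invert the fitted line to get the smallest size meeting the target."""
    log_n = (math.log(target_error) - intercept) / slope
    return math.ceil(math.exp(log_n))


# Synthetic measurements following error = 2 / sqrt(n).
sizes = [100, 400, 1600]
errors = [0.2, 0.1, 0.05]

intercept, slope = fit_log_linear(sizes, errors)
needed = predict_sample_size(intercept, slope, target_error=0.02)
```

In an iterative scheme, the predicted size would be used to draw a new sample, the error re-measured, and the model refit until the constraint is met; that loop is omitted here.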
The relational data model offers unrivaled rigor and precision in defining data structure and querying complex data. Yet the use of relational databases in scientific data pipelines is limited due to their perceived unwieldiness. We propose a simplified and conceptually refined relational data model named DataJoint. The model includes a language for schema definition, a language for data queries, and diagramming notation for visualizing entities and relationships among them. The model adheres to the principle of entity normalization, which requires that all data — both stored and derived — must be represented by well-formed entity sets. DataJoint’s data query language is an algebra on entity sets with five operators that provide matching capabilities to those of other relational query languages with greater clarity due to entity normalization. Practical implementations of DataJoint have been adopted in neuroscience labs for fluent interaction with scientific data pipelines.