TechKG
A knowledge graph is a valuable kind of knowledge base that benefits many AI-related applications. Many large-scale knowledge graphs have been built to date. However, most of them are non-Chinese and designed for general purposes. In this work, we introduce TechKG, a large-scale, technology-oriented Chinese knowledge graph. It is built automatically from a massive collection of technical papers published in Chinese academic journals across different research domains. Carefully designed heuristic rules are used to extract high-quality entities and relations. In total, it comprises over 260 million triplets built upon more than 52 million entities, which come from 38 research domains. Our preliminary experiments indicate that TechKG is highly adaptable and can be used as a dataset for many diverse AI-related applications. We released TechKG at:

Vertex-Diminished Random Walk (VDRW)
Imbalanced data is common in many high-impact applications. One example is air traffic control, where we aim to identify the leading indicators for each type of accident cause from historical records. Among the three types of accident causes, historical records with ‘personnel issues’ far outnumber the other two types (‘aircraft issues’ and ‘environmental issues’) combined. The resulting dataset is therefore highly imbalanced, and it can be naturally modeled as a network. Until now, most existing work on imbalanced data analysis has focused on the classification setting, and very little has been devoted to learning node representations from imbalanced networks. To address this problem, in this paper we propose Vertex-Diminished Random Walk (VDRW) for imbalanced network analysis. The key idea is to encourage the random particle to walk within the same class by adjusting the transition probabilities at each step. It resembles the existing Vertex Reinforced Random Walk in the dynamic nature of its transition probabilities, as well as in some convergence properties. However, it is more suitable for analyzing imbalanced networks because it leads to more separable node representations in the embedding space. Based on VDRW, we then propose a semi-supervised network representation learning framework named ImVerde for imbalanced networks, in which context sampling uses VDRW and the label information to create node-context pairs, and balanced-batch sampling adopts a simple under-sampling method to balance these pairs across classes. Experimental results demonstrate that ImVerde based on VDRW outperforms state-of-the-art algorithms for learning network representations from imbalanced data. …
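
The pipeline described above (a walk with step-dependent transition probabilities, node-context pairs, and under-sampled balanced batches) can be illustrated with a minimal Python sketch. The abstract does not state the actual VDRW update rule, so everything specific here is an assumption made for illustration: the function names, the `decay` and `cross_penalty` parameters, and the rule itself (a neighbour's weight shrinks by `decay` on each visit and is further scaled by `cross_penalty` when its known label differs from the start node's label). It should not be read as the authors' algorithm.

```python
import random
from collections import defaultdict


def diminished_walk(adj, labels, start, length, decay=0.5,
                    cross_penalty=0.1, rng=None):
    """Random walk whose transition weights are recomputed at every step.

    Assumed rule (not taken from the source): the weight of a neighbour is
    decay ** (number of visits so far), multiplied by cross_penalty when
    both the neighbour and the start node are labelled and the labels differ.
    """
    rng = rng or random.Random()
    visits = defaultdict(int)
    walk = [start]
    visits[start] += 1
    current = start
    for _ in range(length - 1):
        neighbours = adj[current]
        if not neighbours:
            break  # dead end: stop the walk early
        weights = []
        for n in neighbours:
            w = decay ** visits[n]
            same_known = labels.get(n) is None or labels.get(start) is None
            if not same_known and labels[n] != labels[start]:
                w *= cross_penalty  # discourage leaving the start class
            weights.append(w)
        current = rng.choices(neighbours, weights=weights, k=1)[0]
        visits[current] += 1
        walk.append(current)
    return walk


def context_pairs(walk, window=2):
    """Pair every node in the walk with the nodes within `window` steps."""
    pairs = []
    for i, node in enumerate(walk):
        lo, hi = max(0, i - window), min(len(walk), i + window + 1)
        pairs.extend((node, walk[j]) for j in range(lo, hi) if j != i)
    return pairs


def balanced_batch(pairs, labels, per_class, rng=None):
    """Under-sample labelled pairs so each class contributes equally."""
    rng = rng or random.Random()
    by_class = defaultdict(list)
    for node, ctx in pairs:
        if labels.get(node) is not None:
            by_class[labels[node]].append((node, ctx))
    if not by_class:
        return []
    # The rarest class caps how many pairs every class may contribute.
    k = min(per_class, min(len(v) for v in by_class.values()))
    batch = []
    for cls in sorted(by_class):
        batch.extend(rng.sample(by_class[cls], k))
    return batch
```

Here `adj` is assumed to map each node to a list of its neighbours and `labels` to map nodes to class labels (with unlabelled nodes absent or `None`). Under-sampling to the size of the rarest class is one simple way to balance the pairs; other balancing schemes would fit the same interface.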

Caffe
Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. Yangqing Jia created the project during his PhD at UC Berkeley. Caffe is released under the BSD 2-Clause license.