DocBERT
Pre-trained language representation models achieve state-of-the-art results across a wide range of tasks in natural language processing. One of the latest advancements is BERT, a deep pre-trained transformer that yields much better results than its predecessors. Despite its burgeoning popularity, however, BERT has not yet been applied to document classification. This task deserves attention because it has a few nuances. First, modeling syntactic structure matters less for document classification than for other problems, such as natural language inference and sentiment classification. Second, documents often have multiple labels across dozens of classes, which is uncharacteristic of the tasks that BERT explores. In this paper, we describe fine-tuning BERT for document classification. We are the first to demonstrate the success of BERT on this task, achieving state-of-the-art results across four popular datasets. …

Anonymous Information Delivery (AID)
We introduce the problem of anonymous information delivery (AID), comprising $K$ messages, a user, and $N$ servers (each holding $M$ messages) that wish to deliver one out of $K$ messages to the user anonymously, i.e., without revealing the delivered message index to the user. This AID problem may be viewed as the dual of the private information retrieval problem. The information-theoretic capacity of AID, $C$, is defined as the maximum number of bits of the desired message that can be anonymously delivered per bit of total communication to the user. For the AID problem with $K$ messages, $N$ servers, $M$ messages stored per server, and $N \geq \lceil \frac{K}{M} \rceil$, we provide an achievable scheme of rate $1/\lceil \frac{K}{M} \rceil$ and an information-theoretic converse of rate $M/K$, i.e., the AID capacity satisfies $1/\lceil \frac{K}{M} \rceil \leq C \leq M/K$. This settles the capacity of AID when $\frac{K}{M}$ is an integer. When $\frac{K}{M}$ is not an integer, we show that the converse rate of $M/K$ is achievable if $N \geq \frac{K}{\gcd(K,M)} - (\frac{M}{\gcd(K,M)}-1)(\lfloor \frac{K}{M} \rfloor -1)$, and the achievable rate of $1/\lceil \frac{K}{M} \rceil$ is optimal if $N = \lceil \frac{K}{M} \rceil$. Otherwise, if $\lceil \frac{K}{M} \rceil < N < \frac{K}{\gcd(K,M)} - (\frac{M}{\gcd(K,M)}-1)(\lfloor \frac{K}{M} \rfloor -1)$, we give an improved achievable scheme and prove its optimality for several small settings. …
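
The bounds and the server-count threshold stated in this abstract can be evaluated numerically. The following is a minimal sketch, not code from the paper: the function name `aid_bounds`, its return layout, and the assumption $1 \leq M \leq K$ are illustrative choices. It only evaluates the formulas quoted above and reports which regime a given $(K, M, N)$ falls into; it does not implement any delivery scheme.

```python
from fractions import Fraction
from math import gcd


def aid_bounds(K, M, N):
    """Evaluate the AID capacity bounds quoted in the abstract.

    Hypothetical helper (not from the paper). Returns a tuple
    (lower, upper, threshold, capacity), where capacity is None when
    the abstract gives no closed-form value for this (K, M, N).
    """
    if not (1 <= M <= K):
        # Assumption made for this sketch; the abstract does not state it.
        raise ValueError("expected 1 <= M <= K")
    ceil_km = -(-K // M)  # ceil(K / M) using integer arithmetic
    if N < ceil_km:
        raise ValueError("the stated results assume N >= ceil(K/M)")

    lower = Fraction(1, ceil_km)  # achievable rate 1 / ceil(K/M)
    upper = Fraction(M, K)        # converse bound M / K

    g = gcd(K, M)
    # Number of servers at or above which the rate M/K is achievable.
    threshold = K // g - (M // g - 1) * (K // M - 1)

    if K % M == 0 or N >= threshold:
        capacity = upper          # bounds meet, or M/K is achievable
    elif N == ceil_km:
        capacity = lower          # 1 / ceil(K/M) is optimal
    else:
        capacity = None           # ceil(K/M) < N < threshold: not settled in general
    return lower, upper, threshold, capacity


if __name__ == "__main__":
    for K, M, N in [(6, 2, 3), (5, 2, 3), (5, 2, 4), (7, 3, 4)]:
        print((K, M, N), aid_bounds(K, M, N))
```

For example, with $K = 7$ and $M = 3$ the threshold evaluates to $5$, so $N = 4$ falls in the intermediate regime where the abstract reports only an improved achievable scheme, and the sketch returns `None` for the capacity.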