Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 30 of 68
Full-Text Articles in Physical Sciences and Mathematics
Challenges And Best Practices In Real-Time Prediction Of Infectious Disease: A Case Study Of Dengue In Thailand, Nicholas Reich, Stephen Lauer, Krzysztof Sakrejda, Sopon Iamsirithaworn, Soawapak Hinjoy, Paphanij Suangtho, Suthanun Suthachana, Hannah Clapham, Henrik Salje, Derek Cummings, Justin Lessler
Nicholas G Reich
Epidemics of communicable diseases place a huge burden on public health infrastructures across the world. Producing accurate and actionable forecasts of infectious disease incidence at short and long time scales will improve public health response to outbreaks. However, scientists and public health officials face many obstacles in trying to create accurate and actionable real-time forecasts of infectious disease incidence. Dengue is a mosquito-borne virus that annually infects over 400 million people worldwide. We developed a real-time forecasting model for dengue hemorrhagic fever in the 77 provinces of Thailand. We created an operational and computational infrastructure that generated multi-step predictions of …
Interactions Between Serotypes Of Dengue Highlight Epidemiological Impact Of Cross-Immunity, Nicholas Reich, Sourya Shrestha, Aaron King, Pejman Rohani, Justin Lessler, Siripen Kalayanarooj, In-Kyu Yoon, Robert Gibbons, Donald Burke, Derek Cummings
Nicholas G Reich
Dengue, a mosquito-borne virus of humans, infects over 50 million people annually. Infection with any of the four dengue serotypes induces protective immunity to that serotype, but does not confer long-term protection against infection by other serotypes. The immunological interactions between serotypes are of central importance in understanding epidemiological dynamics and anticipating the impact of dengue vaccines. We analysed a 38-year time series with 12 197 serotyped dengue infections from a hospital in Bangkok, Thailand. Using novel mechanistic models to represent different hypothesized immune interactions between serotypes, we found strong evidence that infection with dengue provides substantial short-term …
Learning To Align From Scratch, Gary Huang, Marwan Mattar, Honglak Lee, Erik Learned-Miller
Erik G Learned-Miller
Unsupervised joint alignment of images has been demonstrated to improve performance on recognition tasks such as face verification. Such alignment reduces undesired variability due to factors such as pose, while only requiring weak supervision in the form of poorly aligned examples. However, prior work on unsupervised alignment of complex, real-world images has required the careful selection of feature representation based on hand-crafted image descriptors, in order to achieve an appropriate, smooth optimization landscape. In this paper, we instead propose a novel combination of unsupervised joint alignment with unsupervised feature learning. Specifically, we incorporate deep learning into the congealing alignment framework. …
Distribution Fields For Tracking, Erik Learned-Miller, Laura Lara
Erik G Learned-Miller
Visual tracking of general objects often relies on the assumption that gradient descent of the alignment function will reach the global optimum. A common technique to smooth the objective function is to blur the image. However, blurring the image destroys image information, which can cause the target to be lost. To address this problem we introduce a method for building an image descriptor using distribution fields (DFs), a representation that allows smoothing the objective function without destroying information about pixel values. We present experimental evidence on the superiority of the width of the basin of attraction around the global optimum …
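The idea of smoothing without discarding pixel values can be illustrated with a minimal sketch. It assumes a distribution field is built by splitting a grayscale image into one indicator layer per intensity bin and then blurring each layer spatially; the function names, the box blur, and the bin count are illustrative choices, not details taken from the abstract.

```python
def explode(image, num_bins, max_value=255):
    """Split a 2-D grayscale image into num_bins indicator layers (one per intensity bin)."""
    bin_width = (max_value + 1) / num_bins
    layers = [[[0.0] * len(row) for row in image] for _ in range(num_bins)]
    for i, row in enumerate(image):
        for j, value in enumerate(row):
            b = min(int(value / bin_width), num_bins - 1)
            layers[b][i][j] = 1.0
    return layers


def box_blur(layer, radius=1):
    """Average each cell with its in-bounds neighbours (hypothetical smoothing kernel)."""
    rows, cols = len(layer), len(layer[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            window = [layer[p][q]
                      for p in range(max(0, i - radius), min(rows, i + radius + 1))
                      for q in range(max(0, j - radius), min(cols, j + radius + 1))]
            out[i][j] = sum(window) / len(window)
    return out


def distribution_field(image, num_bins=4, radius=1):
    """Blur every intensity layer separately, so bins never mix with each other."""
    return [box_blur(layer, radius) for layer in explode(image, num_bins)]


image = [[0, 0, 255],
         [0, 255, 255],
         [255, 255, 255]]
df = distribution_field(image, num_bins=2)
```

Because each layer is blurred on its own, the values across layers at any pixel still sum to one, so every pixel keeps a distribution over intensities rather than a single averaged value.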
Topic Models For Taxonomies, Anton Bakalov, Andrew McCallum, Hanna Wallach, David Mimno
Hanna M. Wallach
Concept taxonomies such as MeSH, the ACM Computing Classification System, and the NY Times Subject Headings are frequently used to help organize data. They typically consist of a set of concept names organized in a hierarchy. However, these names and structure are often not sufficient to fully capture the intended meaning of a taxonomy node, and particularly non-experts may have difficulty navigating and placing data into the taxonomy. This paper introduces two semi-supervised topic models that automatically augment a given taxonomy with many additional keywords by leveraging a corpus of multi-labeled documents. Our experiments show that users find the topics …
A GPU-Based Approximate SVD Algorithm, Blake Foster, Sridhar Mahadevan, Rui Wang
Rui Wang
Approximation of matrices using the Singular Value Decomposition (SVD) plays a central role in many science and engineering applications. However, the computation cost of an exact SVD is prohibitively high for very large matrices. In this paper, we describe a GPU-based approximate SVD algorithm for large matrices. Our method is based on the QUIC-SVD introduced by [6], which exploits a tree-based structure to efficiently discover a subset of rows that spans the matrix space. We describe how to map QUIC-SVD onto the GPU, and improve its speed and stability using a blocked Gram-Schmidt orthogonalization method. Using a simple matrix partitioning …
Topic-Partitioned Multinetwork Embeddings, Peter Krafft, Juston Moore, Bruce Desmarais, Hanna Wallach
Hanna M. Wallach
We introduce a joint model of network content and context designed for exploratory analysis of email networks via visualization of topic-specific communication patterns. Our model is based on a novel extension of the latent space network model to the mixed-membership framework, and it uses latent Dirichlet allocation to model the text attributes of our data. To perform inference in this model, we use an approximate stochastic expectation-maximization algorithm. We validate the appropriateness of our model using a simulation study and a prediction task, and demonstrate its capabilities by investigating the communication patterns within a new government email dataset, the New …
Energy-Aware Load Balancing In Content Delivery Networks, Vimal Mathew, Ramesh Sitaraman, Prashant Shenoy
Ramesh Sitaraman
Internet-scale distributed systems such as content delivery networks (CDNs) operate hundreds of thousands of servers deployed in thousands of data center locations around the globe. Since the energy costs of operating such a large IT infrastructure are a significant fraction of the total operating costs, we argue for redesigning CDNs to incorporate energy optimizations as a first-order principle. We propose techniques to turn off CDN servers during periods of low load while seeking to balance three key design goals: maximize energy reduction, minimize the impact on client-perceived service availability (SLAs), and limit the frequency of on-off server transitions to reduce …
Message-Passing Algorithms For Quadratic Programming Formulations Of MAP Estimation, Akshat Kumar, Shlomo Zilberstein
Shlomo Zilberstein
Computing maximum a posteriori (MAP) estimation in graphical models is an important inference problem with many applications. We present message-passing algorithms for quadratic programming (QP) formulations of MAP estimation for pairwise Markov random fields. In particular, we use the concave-convex procedure (CCCP) to obtain a locally optimal algorithm for the non-convex QP formulation. A similar technique is used to derive a globally convergent algorithm for the convex QP relaxation of MAP. We also show that a recently developed expectation-maximization (EM) algorithm for the QP formulation of MAP can be derived from the CCCP perspective. Experiments on synthetic and real-world problems …
Algorithms For Optimizing The Bandwidth Cost Of Content Delivery, Micah Adler, Ramesh Sitaraman, Harish Venkataramani
Ramesh Sitaraman
Content Delivery Networks (CDNs) deliver web content to end-users from a large distributed platform of web servers hosted in data centers belonging to thousands of Internet Service Providers (ISPs) around the world. The bandwidth cost incurred by a CDN is the sum of the amounts it pays each ISP for routing traffic from its servers located in that ISP out to end-users. A large enterprise may also contract with multiple ISPs to provide redundant Internet access for its origin infrastructure using technologies such as multihoming and mirroring, thereby incurring a significant bandwidth cost across multiple ISPs. This paper initiates the …
Optimizing Linear Counting Queries Under Differential Privacy, Chao Li, Michael Hay, Vibhor Rastogi, Gerome Miklau, Andrew McGregor
Andrew McGregor
Differential privacy is a robust privacy standard that has been successfully applied to a range of data analysis tasks. But despite much recent work, optimal strategies for answering a collection of related queries are not known.
We propose the matrix mechanism, a new algorithm for answering a workload of predicate counting queries. Given a workload, the mechanism requests answers to a different set of queries, called a query strategy, which are answered using the standard Laplace mechanism. Noisy answers to the workload queries are then derived from the noisy answers to the strategy queries. This two stage process can result …
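The two-stage process can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's algorithm: the strategy is fixed to the identity (one counting query per histogram cell, which has sensitivity 1), the workload consists of range queries, and all function names and data are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def answer_strategy(histogram, epsilon, rng):
    """Stage 1: answer the identity strategy with the Laplace mechanism.

    Adding or removing one record changes exactly one cell by 1, so the
    L1 sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    scale = 1.0 / epsilon
    return [count + laplace_noise(scale, rng) for count in histogram]


def answer_workload(noisy_cells, workload):
    """Stage 2: derive each inclusive range query (lo, hi) from the noisy cells."""
    return [sum(noisy_cells[lo:hi + 1]) for lo, hi in workload]


rng = random.Random(0)
histogram = [12, 7, 30, 5]            # made-up cell counts
workload = [(0, 1), (2, 3), (0, 3)]   # made-up range queries
noisy_cells = answer_strategy(histogram, epsilon=0.5, rng=rng)
answers = answer_workload(noisy_cells, workload)
```

Since every workload answer is computed from the same noisy strategy answers, the results are mutually consistent: the answer for range (0, 3) equals the sum of the answers for (0, 1) and (2, 3). A different strategy would trade off error differently across the workload.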
Sample Size And Statistical Power Considerations In High-Dimensionality Data Settings: A Comparative Study Of Classification Algorithms, Yu Guo, Armin Garber, Raji Balasubramanian
Raji Balasubramanian
Background: Data generated using ‘omics’ technologies are characterized by high dimensionality, where the number of features measured per subject vastly exceeds the number of subjects in the study. In this paper, we consider issues relevant in the design of biomedical studies in which the goal is the discovery of a subset of features and an associated algorithm that can predict a binary outcome, such as disease status. We compare the performance of four commonly used classifiers (K-Nearest Neighbors, Prediction Analysis for Microarrays, Random Forests and Support Vector Machines) in high-dimensionality data settings. We evaluate the effects of varying levels of …
The Akamai Network: A Platform For High-Performance Internet Applications, Erik Nygren, Ramesh Sitaraman, Jennifer Sun
Ramesh Sitaraman
Comprising more than 61,000 servers located across nearly 1,000 networks in 70 countries worldwide, the Akamai platform delivers hundreds of billions of Internet interactions daily, helping thousands of enterprises boost the performance and reliability of their Internet applications. In this paper, we give an overview of the components and capabilities of this large-scale distributed computing platform, and offer some insight into its architecture, design principles, operation, and management.
Dynamic Computational Model Suggests That Cellular Citizenship Is Fundamental For Selective Tumor Apoptosis, Megan Olsen, Nava Siegelmann-Danieli, Hava Siegelmann
Hava Siegelmann
Computational models in the field of cancer research have focused primarily on estimates of biological events based on laboratory generated data. We introduce a novel in-silico technology that takes us to the next level of prediction models and facilitates innovative solutions through the mathematical system. The model's building blocks are cells defined phenotypically as normal or tumor, with biological processes translated into equations describing the life protocols of the cells in a quantitative and stochastic manner. The essentials of communication in a society composed of normal and tumor cells are explored to reveal “protocols” for selective tumor eradication. Results consistently …
Scalable Probabilistic Databases With Factor Graphs And MCMC, Michael Wick, Andrew McCallum, Gerome Miklau
Gerome Miklau
Probabilistic databases play a crucial role in the management and understanding of uncertain data. However, incorporating probabilities into the semantics of incomplete databases has posed many challenges, forcing systems to sacrifice modeling power, scalability, or restrict the class of relational algebra formula under which they are closed. We propose an alternative approach where the underlying relational database always represents a single world, and an external factor graph encodes a distribution over possible worlds; Markov chain Monte Carlo (MCMC) inference is then used to recover this uncertainty to a desired level of fidelity. Our approach allows the efficient evaluation of arbitrary …
FDDB: A Benchmark For Face Detection In Unconstrained Settings, Vidit Jain, Erik Learned-Miller
Erik G Learned-Miller
Despite the maturity of face detection research, it remains difficult to compare different algorithms for face detection. This is partly due to the lack of common evaluation schemes. Also, existing data sets for evaluating face detection algorithms do not capture some aspects of face appearances that are manifested in real-world scenarios. In this work, we address both of these issues. We present a new data set of face images with more faces and more accurate annotations for face regions than in previous data sets. We also propose two rigorous and precise methods for evaluating the …
Adapting BLSTM Neural Network Based Keyword Spotting Trained On Modern Data To Historical Documents, Volkmar Frinken, Andreas Fischer, Horst Bunke, R. Manmatha
R. Manmatha
Being able to search for words or phrases in historic handwritten documents is of paramount importance when preserving cultural heritage. Storing scanned pages of written text can save the information from degradation, but it does not make the textual information readily available. Automatic keyword spotting systems for handwritten historic documents can fill this gap. However, most such systems have trouble with the great variety of writing styles. It is not uncommon for handwriting processing systems to be built for just a single book. In this paper we show that neural network based keyword spotting systems are flexible enough to be …
Image Retrieval Using Markov Random Fields And Global Image Features, Ainhoa Llorente, R. Manmatha, Stefan Rüger
R. Manmatha
In this paper, we propose a direct image retrieval framework based on Markov Random Fields (MRFs) that exploits the semantic context dependencies of the image. The novelty of our approach lies in the use of different kernels in our non-parametric density estimation together with the utilization of configurations that explore semantic relationships among concepts at the same time as low-level features, instead of just focusing on correlation between image features as in previous formulations. Hence, we introduce several configurations and study which one achieves the best performance. Results are presented for two datasets, the usual benchmark Corel 5k and the …
Optimizing Semantic Coherence In Topic Models, D. Mimno, Hanna Wallach, E. Talley, M. Leenders, A. McCallum
Hanna M. Wallach
Large organizations often face the critical challenge of sharing information and maintaining connections between disparate subunits. Tools for automated analysis of document collections, such as topic models, can provide an important means for communication. The value of topic modeling is in its ability to discover interpretable, coherent themes from unstructured document sets, yet it is not unusual to find semantic mismatches that substantially reduce user confidence. In this paper, we first present an expert-driven topic annotation study, undertaken in order to obtain an annotated set of baseline topics and their distinguishing characteristics. We then present a metric for detecting poor-quality …
Learning The Structure Of Deep Sparse Graphical Models, Ryan Adams, Hanna Wallach, Zoubin Ghahramani
Hanna M. Wallach
Deep belief networks are a powerful way to model complex probability distributions. However, it is difficult to learn the structure of a belief network, particularly one with hidden units. The Indian buffet process has been used as a nonparametric Bayesian prior on the structure of a directed belief network with a single infinitely wide hidden layer. Here, we introduce the cascading Indian buffet process (CIBP), which provides a prior on the structure of a layered, directed belief network that is unbounded in both depth and width, yet allows tractable inference. We use the CIBP prior with the nonlinear Gaussian belief …
An Alternative Prior Process For Nonparametric Bayesian Clustering, Hanna Wallach, Shane Jensen, Lee Dicker, Katherine Heller
Hanna M. Wallach
Prior distributions play a crucial role in Bayesian approaches to clustering. Two commonly-used prior distributions are the Dirichlet and Pitman-Yor processes. In this paper, we investigate the predictive probabilities that underlie these processes, and the implicit "rich-get-richer" characteristic of the resulting partitions. We explore an alternative prior for nonparametric Bayesian clustering, the uniform process, for applications where the "rich-get-richer" property is undesirable. We also explore the cost of this process: partitions are no longer exchangeable with respect to the ordering of variables. We present new asymptotic and simulation-based results for the clustering characteristics of the uniform process and compare these with known …
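The "rich-get-richer" contrast can be made concrete by computing predictive probabilities for the next item given current cluster sizes. The sketch below uses the standard Chinese-restaurant form of the Dirichlet and Pitman-Yor predictives; the uniform-process form (each existing cluster equally likely, regardless of size) is an assumption about how that prior is defined, and the function names and example counts are illustrative.

```python
from fractions import Fraction


def pitman_yor_predictive(counts, concentration, discount=0):
    """Probabilities of joining each existing cluster, then of opening a new one.

    With discount=0 this reduces to the Dirichlet process predictive.
    """
    n, k = sum(counts), len(counts)
    denom = Fraction(n) + Fraction(concentration)
    existing = [(Fraction(c) - Fraction(discount)) / denom for c in counts]
    new = (Fraction(concentration) + Fraction(discount) * k) / denom
    return existing + [new]


def uniform_process_predictive(counts, concentration):
    """Assumed uniform-process form: cluster sizes do not affect the probabilities."""
    k = len(counts)
    denom = Fraction(k) + Fraction(concentration)
    return [Fraction(1) / denom] * k + [Fraction(concentration) / denom]


counts = [8, 1, 1]  # made-up cluster sizes
dp = pitman_yor_predictive(counts, concentration=1)
py = pitman_yor_predictive(counts, concentration=1, discount=Fraction(1, 2))
up = uniform_process_predictive(counts, concentration=1)
```

With these counts the Dirichlet process assigns the largest cluster probability 8/11, while under the assumed uniform process every option, including a new cluster, has probability 1/4.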
Anytime Planning For Decentralized POMDPs Using Expectation Maximization, Akshat Kumar, Shlomo Zilberstein
Shlomo Zilberstein
Decentralized POMDPs provide an expressive framework for multi-agent sequential decision making. While finite-horizon DEC-POMDPs have enjoyed significant success, progress remains slow for the infinite-horizon case mainly due to the inherent complexity of optimizing stochastic controllers representing agent policies. We present a promising new class of algorithms for the infinite-horizon case, which recasts the optimization problem as inference in a mixture of DBNs. An attractive feature of this approach is the straightforward adoption of existing inference techniques in DBNs for solving DEC-POMDPs and supporting richer representations such as factored or continuous states and actions. We also derive the Expectation Maximization (EM) …
Memory Reconsolidation For Natural Language Processing, Kum Tu, David Cooper, Hava Siegelmann
Hava Siegelmann
We propose a model of memory reconsolidation that can output new sentences with additional meaning after refining information from input sentences and integrating them with related prior experience. Our model uses available technology to first disambiguate the meanings of words and extracts information from the sentences into a structure that is an extension to semantic networks. Within our long-term memory we introduce an action relationships database reminiscent of the way symbols are associated in the brain, and propose an adaptive mechanism for linking these actions with the different scenarios. The model then fills in the implicit context of the input and …
Learning On The Fly: Font Free Approaches To Difficult OCR Problems, Andrew Kae, Erik Learned-Miller
Erik G Learned-Miller
Despite ubiquitous claims that optical character recognition (OCR) is a “solved problem,” many categories of documents continue to break modern OCR software, such as documents with moderate degradation or unusual fonts. Many approaches rely on pre-computed or stored character models, but these are vulnerable to cases when the font of a particular document was not part of the training set, or when there is so much noise in a document that the font model becomes weak. To address these difficult cases, we present a form of iterative contextual modeling that learns character models directly from the document it …
Robust Recognition Of Documents By Fusing Results Of Word Clusters, Venkat Rasagna, Anand Kumar, C. Jawahar, R. Manmatha
R. Manmatha
The word error rate of any optical character recognition system (OCR) is usually substantially below its component or character error rate. This is especially true of Indic languages in which a word consists of many components. Current OCRs recognize each character or word separately and do not take advantage of document level constraints. We propose a document level OCR which incorporates information from the entire document to reduce word error rates. Word images are first clustered using a locality sensitive hashing technique. Individual words are then recognized using a (regular) OCR. The OCR outputs of word images in a cluster …
A Hidden Markov Model For Alphabet-Soup Word Recognition, Shaolei Feng, Nicholas Howe, R. Manmatha
R. Manmatha
Recent work on the "alphabet soup" paradigm has demonstrated effective segmentation-free character-based recognition of cursive handwritten historical text documents. The approach first uses a joint boosting technique to detect potential characters - the alphabet soup. A second stage uses a dynamic programming algorithm to recover the correct sequence of characters. Despite experimental success, the ad hoc dynamic programming method previously lacked theoretical justification. This paper puts the method on a sounder footing by recasting the dynamic programming as inference on an ensemble of hidden Markov models (HMMs). Although some work has questioned the use of score outputs from classifiers like …
Distributed Image Search In Camera Sensor Networks, Tingxin Yan, Deepak Ganesan, R. Manmatha
R. Manmatha
Recent advances in sensor networks permit the use of a large number of relatively inexpensive distributed computational nodes with camera sensors linked in a network and possibly linked to one or more central servers. We argue that the full potential of such a distributed system can be realized if it is designed as a distributed search engine where images from different sensors can be captured, stored, searched and queried. However, unlike traditional image search engines that are focused on resource-rich situations, the resource limitations of camera sensor networks in terms of energy, bandwidth, computational power, and memory capacity present …
Bayesian Modeling Of Dependency Trees Using Hierarchical Pitman-Yor Priors, Hanna Wallach, Charles Sutton, Andrew McCallum
Hanna M. Wallach
In this paper, we introduce two hierarchical Bayesian dependency parsing models. First, we show that a classic dependency parser can be substantially improved by (a) using a hierarchical Pitman-Yor process prior over the distribution over dependents of a word, and (b) sampling the model hyperparameters. Second, we present a parsing model in which latent state variables mediate the relationships between words and their dependents. The model clusters dependencies into states using a similar approach to that used by Bayesian topic models when clustering words into topics. The inferred states have a syntactic character, and lead to modestly improved parse accuracy …
Intelligent Email: Aiding Users With AI, Mark Dredze, Hanna Wallach, Danny Puller, Tova Brooks, Josh Carroll, Joshua Magarick, John Blitzer, Fernando Pereira
Hanna M. Wallach
Email occupies a central role in the modern workplace. This has led to a vast increase in the number of email messages that users are expected to handle daily. Furthermore, email is no longer simply a tool for asynchronous online communication: it is now used for task management, personal archiving, as well as both synchronous and asynchronous online communication. This explosion can lead to "email overload": many users are overwhelmed by the large quantity of information in their mailboxes. In the human-computer interaction community, there has been much research on tackling email overload. Recently, similar efforts have emerged in the artificial intelligence (AI) …