Open Access. Powered by Scholars. Published by Universities.®
Social and Behavioral Sciences Commons™
Full-Text Articles in Social and Behavioral Sciences
Domain Specific Document Retrieval Framework For Real-Time Social Health Data, Swapnil Soni
Kno.e.sis Publications
With the advent of web search and microblogging, the percentage of Online Health Information Seekers (OHIS) using these online services to share and seek real-time health information has increased exponentially. OHIS use web search engines or microblogging search services to seek out the latest, relevant, and reliable health information. When OHIS turn to microblogging search services to search for real-time content, trends, breaking news, etc., the search results are not promising. Two major challenges in current microblogging search engines are that they rely on keyword-based techniques and that their results do not contain real-time information. To address these challenges, …
Mining Effective Multi-Segment Sliding Window For Pathogen Incidence Rate Prediction, Lei Duan, Changjie Tang, Xiasong Li, Guozhu Dong, Xianming Wang, Jie Zuo, Min Jiang, Zhongqi Li, Yongqing Zhang
Kno.e.sis Publications
Pathogen incidence rate prediction, which can be treated as a time series modeling problem, is an important task for infectious disease incidence rate prediction and for public health. This paper investigates the application of a genetic computation technique, namely GEP, to pathogen incidence rate prediction. To overcome the shortcomings of traditional sliding windows in GEP-based time series modeling, the paper introduces the problem of mining effective sliding windows, that is, discovering optimal sliding windows for building accurate prediction models. To utilize the periodic characteristics of pathogen incidence rates, a multi-segment sliding window consisting of several segments from different periodical intervals is proposed and …
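As a rough illustration of the multi-segment sliding window idea mentioned in this abstract, the following Python sketch builds model inputs from several segments taken at different lags before the value to be predicted. It is not the paper's method: the function names, the `(lag, length)` representation of a segment, and the example values are assumptions made here for illustration, and the mining of effective windows and the GEP model itself are not shown.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: each segment is (lag, length) and covers
# series[t - lag : t - lag + length], i.e. values strictly before time t.
Segment = Tuple[int, int]


def multi_segment_window(series: Sequence[float], t: int,
                         segments: Sequence[Segment]) -> List[float]:
    """Collect the input values for predicting series[t]."""
    features: List[float] = []
    for lag, length in segments:
        start = t - lag
        end = start + length
        if start < 0 or end > t:
            raise ValueError("segment must lie inside the history before t")
        features.extend(series[start:end])
    return features


def training_pairs(series: Sequence[float],
                   segments: Sequence[Segment]):
    """Build (inputs, target) pairs for every t with enough history."""
    max_lag = max(lag for lag, _ in segments)
    return [(multi_segment_window(series, t, segments), series[t])
            for t in range(max_lag, len(series))]


if __name__ == "__main__":
    monthly = list(range(20))          # placeholder data, not real incidence rates
    segs = [(12, 2), (3, 3)]           # one segment a "year" back, one recent segment
    print(multi_segment_window(monthly, 15, segs))
```

A traditional sliding window corresponds to the special case of a single segment ending just before `t`, for example `[(3, 3)]`.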
Pattern Space Maintenance For Data Updates And Interactive Mining, Mengling Feng, Guozhu Dong, Jinyan Li, Yap-Peng Tan, Limsoon Wong
Kno.e.sis Publications
This article addresses the incremental and decremental maintenance of the frequent pattern space. We conduct an in-depth investigation of how the frequent pattern space evolves under both incremental and decremental updates. Based on this evolution analysis, a new data structure, the Generator-Enumeration Tree (GE-tree), is developed to facilitate the maintenance of the frequent pattern space. Using the GE-tree, we propose two novel algorithms, Pattern Space Maintainer+ (PSM+) and Pattern Space Maintainer− (PSM−), for the incremental and decremental maintenance of frequent patterns. Experimental results demonstrate that the proposed algorithms, on average, outperform the representative state-of-the-art …
Efficient Computation Of Iceberg Cubes By Bounding Aggregate Functions, Xiuzhen Zhang, Pauline Lienhua Chou, Guozhu Dong
Kno.e.sis Publications
The iceberg cubing problem is to compute the multidimensional group-by partitions that satisfy given aggregation constraints. Pruning unproductive computation for iceberg cubing when nonantimonotone constraints are present is a great challenge because the aggregate functions do not increase or decrease monotonically along the subset relationship between partitions. In this paper, we propose a novel bound prune cubing (BP-Cubing) approach for iceberg cubing with nonantimonotone aggregation constraints. Given a cube over n dimensions, an aggregate for any group-by partition can be computed from aggregates for the most specific n-dimensional partitions (MSPs). The largest and smallest aggregate values computed this way become …
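To make the problem statement in this abstract concrete, the Python sketch below computes an iceberg cube naively: it enumerates every group-by over the dimensions and keeps the partitions whose average meets a threshold. This is a brute-force baseline with no pruning, not the BP-Cubing approach; the function name, the row format, and the choice of an average constraint (a common example of a nonantimonotone constraint) are assumptions made here.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Mapping, Sequence, Tuple


def naive_iceberg_cube(rows: Sequence[Mapping], dims: Sequence[str],
                       measure: str, min_avg: float) -> Dict[Tuple, float]:
    """Return {(group_dims, key): avg} for partitions with avg >= min_avg."""
    result: Dict[Tuple, float] = {}
    # Enumerate every subset of dimensions, from the empty group-by upward.
    for k in range(len(dims) + 1):
        for group_dims in combinations(dims, k):
            sums = defaultdict(float)
            counts = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in group_dims)
                sums[key] += row[measure]
                counts[key] += 1
            for key in sums:
                avg = sums[key] / counts[key]
                if avg >= min_avg:          # the iceberg (aggregation) constraint
                    result[(group_dims, key)] = avg
    return result


if __name__ == "__main__":
    sales = [  # made-up rows for illustration only
        {"city": "A", "year": 2020, "sales": 10},
        {"city": "A", "year": 2021, "sales": 2},
        {"city": "B", "year": 2020, "sales": 3},
    ]
    for cell, avg in naive_iceberg_cube(sales, ("city", "year"), "sales", 5).items():
        print(cell, avg)
```

In this made-up data the partition (year = 2020) passes with an average of 6.5 while its sub-partition (city = B, year = 2020) fails with 3, and (city = A, year = 2020) passes with 10, which exceeds the average of its parent (city = A). A failing or passing result at one level therefore says nothing definite about the levels below it, which is why simple threshold pruning does not apply to such constraints.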
Summarizing Data Sets For Classification, Christopher W. Kinzig, Krishnaprasad Thirunarayan, Gary B. Lamont, Robert E. Marmelstein
Kno.e.sis Publications
This paper describes our approach and experiences with implementing a data mining system using genetic algorithms in C++. In contrast with earlier classification algorithms that tended to “tile” the data sets using some pre-specified “shapes”, the proposed system is based on Marmelstein’s work on determining natural boundaries for class homogeneous regions. These boundaries are further refined to construct a compact set of simple data mining rules for classification.