Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

Categorical Data Analysis

2017

Full-Text Articles in Statistics and Probability

Using Data Analytics For Discovering Library Resource Insights: Case From Singapore Management University, Ning Lu, Rui Song, Dina Li Gwek Heng, Swapna Gottipati, Aaron Tay Dec 2017

Research Collection School Of Computing and Information Systems

Library resources are critical in supporting teaching, research and learning processes. Several universities have employed online platforms and infrastructure to provide online services to students, faculty and staff. To provide efficient services by understanding and predicting user needs, libraries are looking into the area of data analytics. Library analytics at Singapore Management University is a project committed to providing an interface for data-intensive project collaboration, while supporting one of the library’s key pillars: its commitment to collaborate on initiatives with SMU Communities and external groups. In this paper, we study the transaction logs for user behavior analysis that …


Data-Adaptive Kernel Support Vector Machine, Xin Liu Nov 2017

Electronic Thesis and Dissertation Repository

In this thesis, we propose the data-adaptive kernel Support Vector Machine (SVM), a new method with a data-driven scaling kernel function based on real data sets. This two-stage approach of kernel function scaling can enhance the accuracy of a support vector machine, especially when the data are imbalanced. Following the standard SVM procedure in the first stage, the proposed method locally adapts the kernel function to data locations based on the skewness of the class outcomes. In the second stage, the decision rule is constructed with the data-adaptive kernel function and is used as the classifier. This process enlarges …


Data Envelopment Analysis Using Glpkapi In R, Konrad Miziolek, Jordan Beary, Shreyas Vasanth, Surekha Chanamolu, Rudraxi Mitra Oct 2017

Engineering and Technology Management Student Projects

The work done here is primarily a wrapper function written to shield the end user from some of the more difficult-to-use glpkAPI functionality. When prompted, the user selects the .mod file configuration appropriate to the task (for example, output-oriented CRS) and the data file, as a .dat. The function then loads the required glpkAPI library and carries the model forward. It allocates the problem and workspace, reads the model file and data file the user selects, builds the problem, and solves it. The function returns primal values and, if dual = TRUE is selected, also returns dual weights.
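
As a rough illustration of what a CRS model of this kind computes, the sketch below assumes the simplest possible case: one input and one output per unit under constant returns to scale. In that case the linear program reduces to comparing each unit's output/input ratio with the best observed ratio. This is not the authors' glpkAPI wrapper; the function name `crs_efficiency` and the sample figures are hypothetical.

```python
def crs_efficiency(inputs, outputs):
    """Efficiency scores for single-input, single-output units under CRS.

    Assumption: with one input and one output, the CRS efficiency of a
    unit equals its output/input ratio divided by the best ratio observed.
    A score of 1.0 marks a unit on the efficient frontier.
    """
    if len(inputs) != len(outputs):
        raise ValueError("inputs and outputs must have the same length")
    if any(x <= 0 for x in inputs):
        raise ValueError("inputs must be strictly positive")

    # Productivity ratio of every unit.
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    if best <= 0:
        raise ValueError("at least one unit must have a positive output")

    # Scale every ratio against the best performer.
    return [r / best for r in ratios]


# Hypothetical data: three units, one input and one output each.
inputs = [2.0, 4.0, 5.0]
outputs = [4.0, 6.0, 5.0]
scores = crs_efficiency(inputs, outputs)
print(scores)
```

The output-oriented score is the reciprocal of each value. With several inputs or outputs the ratio shortcut no longer applies, and a linear program must be solved for each unit, which is the part the abstract delegates to glpkAPI.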


How Singapore Investors Can Profit From Unstructured Data, Clarence Goh Sep 2017

Research Collection School Of Accountancy

Data that is collected in the business environment can be structured or unstructured. In general, structured data refers to information which is highly organised and which can easily be stored in rows and columns within database systems. On the other hand, unstructured data does not have a strict data structure, and is also not organised in a pre-defined manner.


Improving The Accuracy For The Long-Term Hydrologic Impact Assessment (L-Thia) Model, Anqi Zhang, Lawrence Theller, Bernard A. Engel Aug 2017

The Summer Undergraduate Research Fellowship (SURF) Symposium

Urbanization increases runoff by changing land use types from less impervious to impervious covers. Improving the accuracy of a runoff assessment model, the Long-Term Hydrologic Impact Assessment (L-THIA) Model, can help us to better evaluate the potential uses of Low Impact Development (LID) practices aimed at reducing runoff, as well as to identify appropriate runoff and water quality mitigation methods. Several versions of the model have been built over time, and inconsistencies have been introduced between the models. To improve the accuracy and consistency of the model, the equations and parameters (primarily curve numbers in the case of this model) …


Integrating Apache Spark And R For Big Data Analytics On Solving Geographic Problems, Mengqi Zhang, Tin Seong Kam Aug 2017

Research Collection School Of Computing and Information Systems

With the advent of digital technology and smart devices, a flood of digital data is being generated every day. This huge amount of data not only records historical activities but also provides valuable future information for organizations and businesses. However, the true value of these data will not be fully appreciated until they have been processed and analyzed, and the analysis results have been communicated to decision makers in a business-friendly manner. In view of this need, big data has been one of the major research focuses in the academic research community, especially in the field of computer science, and the software vendor as well as the big data service …


Visualizing Lab And Phenotype Associations Using Phewas And Electronic Health Records, Brenda Emerson, Miriam Goldman, Sahiti Kolli Jul 2017

Honors Projects

As the digitization of patient health records becomes more common, we are given a great opportunity to analyze these records and hopefully make discoveries about diseases or medicines. Given large datasets of Electronic Health Records, two other students and I decided to look for novel phenotype associations with mean lab values, to see whether the presence of a lab had associations with a phenotype, and to create an interactive application to visualize the associations between labs and phenotypes.


Burden Of Atopic Dermatitis In The United States: Analysis Of Healthcare Claims Data In The Commercial, Medicare, And Medi-Cal Databases, Sulena Shrestha, Raymond Miao, Li Wang, Jingdong Chao, Huseyin Yuce, Wenhui Wei Jul 2017

Publications and Research

Comparative data on the burden of atopic dermatitis (AD) in adults relative to the general population are limited. We performed a large-scale evaluation of the burden of disease among US adults with AD relative to matched non-AD controls, encompassing comorbidities, healthcare resource utilization (HCRU), and costs, using healthcare claims data. The impact of AD disease severity on these outcomes was also evaluated.


Mining Diverse Consumer Preferences For Bundling And Recommendation, Ha Loc Do Jul 2017

Dissertations and Theses Collection

That consumers share similar tastes on some products does not guarantee their agreement on other products. Therefore, both similarity and difference should be taken into account for a more rounded view of consumer preferences. This manuscript focuses on mining this diversity of consumer preferences from two perspectives, namely 1) between consumers and 2) between products. Diversity of preferences between consumers is studied in the context of recommendation systems. In some preference models, measuring similarities in preferences between two consumers plays the key role. These approaches assume two consumers would share a certain degree of similarity on any products, ignoring the fact …


Mining Of Primary Healthcare Patient Data With Selective Multimorbid Diseases, Annette Megerdichian Azad May 2017

Electronic Thesis and Dissertation Repository

Despite a large volume of research on the prognosis, diagnosis and overall burden of multimorbidity, very little is known about socio-demographic characteristics of multimorbid patients. This thesis aims to analyze the socio-demographic characteristics of patients with multiple chronic conditions (multimorbidity), focusing on patient groups sharing the same combination of diseases. Several methods were explored to analyze the co-occurrence of multiple chronic diseases as well as the associations between socio-demographics and chronic conditions. These methods include disease pair distributions over gender, age groups and income level quintiles, Multimorbidity Coefficients for measuring the concurrence of disease pairs and triples, and k-modes clustering …
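
For readers unfamiliar with k-modes, one of the methods named in the abstract above, here is a minimal sketch of the general algorithm, not the thesis's implementation. It assumes categorical records compared by simple mismatch counting, and it takes the starting modes as an argument so the run is deterministic. The function names and the patient-like records are hypothetical.

```python
from collections import Counter


def mismatches(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(1 for u, v in zip(a, b) if u != v)


def k_modes(records, initial_modes, max_iter=20):
    """Cluster categorical records around modes (per-attribute most common values)."""
    modes = [tuple(m) for m in initial_modes]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each record joins its nearest mode.
        labels = [
            min(range(len(modes)), key=lambda j: mismatches(r, modes[j]))
            for r in records
        ]
        # Update step: recompute each mode attribute by attribute.
        new_modes = []
        for j, old in enumerate(modes):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if not members:
                new_modes.append(old)  # keep the mode of an empty cluster
                continue
            new_modes.append(
                tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
            )
        if new_modes == modes:  # converged: modes no longer change
            break
        modes = new_modes
    return labels, modes


# Hypothetical records: (sex, age group, income quintile).
records = [
    ("F", "65+", "Q1"),
    ("F", "65+", "Q2"),
    ("M", "18-44", "Q5"),
    ("M", "18-44", "Q4"),
    ("F", "65+", "Q1"),
]
labels, modes = k_modes(records, initial_modes=[records[0], records[2]])
print(labels, modes)
```

Unlike k-means, no averaging is involved, which is why the method suits socio-demographic attributes such as sex or income quintile that have no meaningful mean.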


Spatiotemporal Analyses Of Recycled Water Production, Jana E. Archer May 2017

Electronic Theses and Dissertations

Increased demands on water supplies caused by population expansion, saltwater intrusion, and drought have led to water shortages which may be addressed by use of recycled water as recycled water products. Study I investigated recycled water production in Florida and California during 2009 to detect gaps in distribution and identify areas for expansion. Gaps were detected along the panhandle and Miami, Florida, as well as the northern and southwestern regions in California. Study II examined gaps in distribution, identified temporal change, and located areas for expansion for Florida in 2009 and 2015. Production increased in the northern and southern regions …


Now You See It, Now You Don't! A Study Of Content Modification Behavior In Facebook, Fuxiang Chen, Ee-Peng Lim Apr 2017

Research Collection School Of Computing and Information Systems

Social media, as a major platform to disseminate information, has changed the way users and communities contribute content. In this paper, we aim to study content modifications on public Facebook pages operated by news media, community groups, and bloggers. We also study the possible reasons behind them, and their effects on user interaction. We conducted a detailed study of Content Censorship (CC) and Content Edit (CE) in Facebook using a detailed longitudinal dataset consisting of 57 public Facebook pages over 3 weeks covering 145,955 posts and 9,379,200 comments. We detected many CC and CE activities between 28% and 56% of …


Statistically Analyzing Assembly Line Processing Times Through Incorporation Of Product Variation, Kyle Rehr, Matthew Farr Mar 2017

Scholars Week

Timing methods and performance metrics are important in the heavily industrialized world we live in. Industrial plants use metrics to measure quality of production, help make decisions, and drive the strategy of the organization. However, many factors must be considered when measuring performance based on a metric; of these, we will analyze the importance of product variation. We will analyze assembly line timings, whilst controlling for product variance, to show how much differences between products affect one’s ability to predict performance. In addition, we will be analyzing the current “statistical” methods used by an industrial …


Are You Ready? Data Analytics Is Reshaping The Work Of Accountants, Clarence Goh Feb 2017

Research Collection School Of Accountancy

According to the 2016 State of Analytics and Data Science report published by data analytics firm Mu Sigma, 65% of senior business leaders surveyed in the United States believe that data analytics has influenced their business in a positive way.


On The Three Dimensional Interaction Between Flexible Fibers And Fluid Flow, Bogdan Nita, Ryan Allaire Jan 2017

Department of Mathematics Facuty Scholarship and Creative Works

In this paper we discuss the deformation of a flexible fiber clamped to a spherical body and immersed in a flow of fluid moving with a speed ranging between 0 and 50 cm/s by means of three dimensional numerical simulation developed in COMSOL. The effects of flow speed and initial configuration angle of the fiber relative to the flow are analyzed. A rigorous analysis of the numerical procedure is performed and our code is benchmarked against well established cases. The flow velocity and pressure are used to compute drag forces upon the fiber. Of particular interest is the behavior …


What’s Brewing? A Statistics Education Discovery Project, Marla A. Sole, Sharon L. Weinberg Jan 2017

Publications and Research

We believe that students learn best, are actively engaged, and are genuinely interested when working on real-world problems. This can be done by giving students the opportunity to work collaboratively on projects that investigate authentic, familiar problems. This article shares one such project that was used in an introductory statistics course. We describe the steps taken to investigate why customers are charged more for iced coffee than hot coffee, which included collecting data and using descriptive and inferential statistical analysis. Interspersed throughout the article, we describe strategies that can help teachers implement the project and scaffold material to assist students …


Alcohol Perceptions And Behavior In A Residential Peer Social Network, Shannon R. Kenney, Miles Q. Ott, Matthew Meisel, Nancy P. Barnett Jan 2017

Statistical and Data Sciences: Faculty Publications

Personalized normative feedback is a recommended component of alcohol interventions targeting college students. However, normative data are commonly collected through campus-based surveys, not through actual participant-referent relationships. In the present investigation, we examined how misperceptions of residence hall peers, both overall using a global question and those designated as important peers using person-specific questions, were related to students’ personal drinking behaviors. Participants were 108 students (88% freshman, 54% White, 51% female) residing in a single campus residence hall. Participants completed an online baseline survey in which they reported their own alcohol use and perceptions of peer alcohol use using both …


Curriculum Guidelines For Undergraduate Programs In Data Science, Richard D. De Veaux, Mahesh Agarwal, Maia Averett, Benjamin Baumer, Andrew Bray, Thomas C. Bressoud, Lance Bryant, Lei Z. Cheng, Amanda Francis, Robert Gould, Albert Y. Kim, Matt Kretchmar, Qin Lu, Ann Moskol, Deborah Nolan, Roberto Pelayo, Sean Raleigh, Ricky J. Sethi, Mutiara Sondjaja, Neelesh Tiruviluamala, Paul X. Uhlig, Talitha M. Washington, Curtis L. Wesley, David White, Ping Ye Jan 2017

Statistical and Data Sciences: Faculty Publications

The Park City Math Institute 2016 Summer Undergraduate Faculty Program met for the purpose of composing guidelines for undergraduate programs in data science. The group consisted of 25 undergraduate faculty from a variety of institutions in the United States, primarily from the disciplines of mathematics, statistics, and computer science. These guidelines are meant to provide some structure for institutions planning for or revising a major in data science.


Quantifying The Effect Of The Shift In Major League Baseball, Christopher John Hawke Jr. Jan 2017

Senior Projects Spring 2017

Baseball is a very strategic and abstract game, but the baseball world is strangely obsessed with statistics. Modern mainstream statisticians often study offensive data, such as batting average or on-base percentage, in order to evaluate player performance. However, this project observes the game from the opposite perspective: the defensive side of the game. In hopes of analyzing the game from a more concrete perspective, countless mathematicians - most famously, Bill James - have developed numerous statistical models based on real life data of Major League Baseball (MLB) players. Large numbers of metrics go into these models, but what this project …


Advance Care Planning As A Shared Endeavor: Completion Of Acp Documents In A Multidisciplinary Cancer Program, Melissa A. Clark, Miles Q. Ott, Michelle L. Rogers, Mary C. Politi, Susan C. Miller, Laura Moynihan, Katina Robison, Ashley Stuckey, Don Dizon Jan 2017

Statistical and Data Sciences: Faculty Publications

Objective—We examined the roles of oncology providers in advance care planning (ACP) delivery in the context of a multidisciplinary cancer program.

Methods—Semi-structured interviews were conducted with 200 women with recurrent and/or metastatic breast or gynecologic cancer. Participants were asked to name providers they deemed important in their cancer care and whether they had discussed and/or completed ACP documentation. Evidence of ACP documentation was obtained from chart reviews.

Results—Fifty percent of participants self-reported completing an advance directive (AD) and 48.5% had named a healthcare power of attorney (HPA), 38.5% had completed both, and 39.0% had completed neither document. Among women who …


The Value Of A Collegiate Far Part 141 Jeopardy-Crew Resource Management (Crm)-Simulation Event, Samuel M. Vance Jan 2017

Journal of Aviation/Aerospace Education & Research

This article explores the viability of using a FAR Part 141 collegiate crew resource management (CRM) flight simulator scenario event as a jeopardy event (a graded syllabus item) in an upper-level professional pilot curriculum course. Ultimately, the objective is to suggest this approach as a value-added curriculum consideration for other collegiate professional pilot programs. The course professor selected the four CRM criteria to be examined. Using these four criteria, the students assembled the grading rubric for their event. The simulator scenario placed students in airspace, geography and weather dissimilar to that in which they were training …