Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 8 of 8

Full-Text Articles in Physical Sciences and Mathematics

Gene Set Based Ensemble Methods For Cancer Classification, William Evans Duncan Jan 2013

LSU Doctoral Dissertations

Diagnosis of cancer very often depends on conclusions drawn after both clinical and microscopic examinations of tissues to study the manifestation of the disease in order to place tumors in known categories. One factor which determines the categorization of cancer is the tissue from which the tumor originates. Information gathered from clinical exams may be partial or not completely predictive of a specific category of cancer. Further complicating the problem of categorizing various tumors is that the histological classification of the cancer tissue and description of its course of development may be atypical. Gene expression data gleaned from micro-array analysis …


A Hybrid Framework Of Iterative MapReduce And MPI For Molecular Dynamics Applications, Shuju Bai Jan 2013

LSU Doctoral Dissertations

Developing platforms for large-scale data processing has been of great interest to scientists. Hadoop is a widely used computational platform that provides fault-tolerant distributed data storage through HDFS (Hadoop Distributed File System) and fault-tolerant parallel data processing through the MapReduce framework. Actual computations quite often require multiple MapReduce cycles, which in turn require chained MapReduce jobs. However, Hadoop's design is poorly suited to problems with iterative structures. In many iterative problems, some invariant data is required by every MapReduce cycle. The same data is uploaded to Hadoop file system in …


Study On The Performance Of TCP Over 10Gbps High Speed Networks, Cheng Cui Jan 2013

LSU Doctoral Dissertations

Internet traffic is expected to grow phenomenally over the next five to ten years. To cope with such large traffic volumes, high-speed networks are expected to scale to capacities of terabits per second and beyond. Increasing the role of optics for packet forwarding and transmission inside high-speed networks seems to be the most promising way to accomplish this capacity scaling. Unfortunately, unlike electronic memory, it remains a formidable challenge to build even a few dozen packets of integrated all-optical buffers. On the other hand, many high-speed networks depend on the TCP/IP protocol for reliability, which is typically implemented in software and …


On-The-Fly Tracing For Data-Centric Computing : Parallelization, Workflow And Applications, Lei Jiang Jan 2013

LSU Doctoral Dissertations

As data-centric computing becomes the trend in science and engineering, more and more hardware systems, as well as middleware frameworks, are emerging to handle the intensive computations associated with big data. At the programming level, it is crucial to have corresponding programming paradigms for dealing with big data. Although MapReduce is now a known programming model for data-centric computing, in which parallelization is completely replaced by partitioning the computing task through data, not all programs, particularly those using statistical computing and data mining algorithms with interdependence, can be re-factorized in such a fashion. On the other hand, many traditional automatic parallelization …


Program Analysis : Termination Proofs For Linear Simple Loops, Hongyi Chen Jan 2013

LSU Doctoral Dissertations

Termination proof synthesis for simple loops, i.e., loops with only conjoined constraints in the loop guard and variable updates in the loop body, is the building block of termination analysis, as well as liveness analysis, for large complex imperative systems. In particular, we consider a subclass of simple loops which contain only linear constraints in the loop guard and linear updates in the loop body. We call them Linear Simple Loops (LSLs). LSLs are particularly interesting because most loops in practice are indeed linear; more importantly, since we allow the update statements to handle nondeterminism, LSLs are expressive enough to …
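As a rough illustration of the loop class described in this abstract (not taken from the dissertation), the sketch below encodes one deterministic loop with a linear guard and linear updates, and tests a candidate linear ranking function on sample states. The names `guard`, `update`, `rank`, and `check_ranking`, and the particular loop, are hypothetical choices; sampling only gives evidence, not a termination proof, and the nondeterministic updates mentioned in the abstract are not modeled.

```python
from itertools import product

def guard(x, y):
    # Loop guard: a conjunction of linear constraints (hypothetical example).
    return x >= 0 and y >= 1

def update(x, y):
    # Loop body: linear updates of the variables (hypothetical example).
    return x - y, y

def rank(x, y):
    # Candidate linear ranking function, assumed for illustration.
    return x

def check_ranking(samples):
    """Return True if, on every sampled state satisfying the guard,
    rank is non-negative and decreases by at least 1 after one update."""
    for x, y in samples:
        if guard(x, y):
            nx, ny = update(x, y)
            if rank(x, y) < 0 or rank(nx, ny) > rank(x, y) - 1:
                return False
    return True

def run(x, y):
    # Execute the loop and count iterations until the guard fails.
    steps = 0
    while guard(x, y):
        x, y = update(x, y)
        steps += 1
    return steps

# Integer states in [-5, 5] x [-5, 5]; a sample, not an exhaustive proof.
samples = list(product(range(-5, 6), repeat=2))
print(check_ranking(samples))
print(run(5, 2))
```

Here the ranking condition holds because the guard forces `y >= 1`, so `x` drops by at least 1 per iteration while staying non-negative whenever the loop body runs.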


On Identifying Critical Nuggets Of Information During Classification Task, David Sathiaraj Jan 2013

LSU Doctoral Dissertations

In large databases, there may exist critical nuggets: small collections of records or instances that contain domain-specific important information. This information can be used for future decision making such as labeling of critical, unlabeled data records and improving classification results by reducing false positive and false negative errors. In recent years, data mining efforts have focussed on pattern and outlier detection methods. However, not much effort has been dedicated to finding critical nuggets within a data set. This work introduces the idea of critical nuggets, proposes an innovative domain-independent method to measure criticality, suggests a heuristic to reduce the …


Exploring The Learnability Of Numeric Datasets, Di Lin Jan 2013

LSU Doctoral Dissertations

When doing classification, it has often been observed that datasets exhibit different levels of difficulty with respect to how accurately they can be classified. That is, some datasets can be classified very accurately by many classification algorithms, while others cannot be classified with high accuracy by any classifier. Based on this observation, we try to address the following problems: a) what are the factors that make a dataset easy or difficult to classify accurately? b) how can such factors be used to predict the difficulty of unclassified datasets? and c) how to …


Toward Digitizing The Human Experience : A New Resource For Natural Language Processing, Jerry Scott Weltman Jan 2013

LSU Doctoral Dissertations

A long-standing goal of Artificial Intelligence is to program computers that understand natural language. A basic obstacle is that computers lack the common sense that even small children acquire simply by experiencing life, and no one has devised a way to program this experience into a computer. This dissertation presents a methodology and proof-of-concept software system that enables non-experts, with some training, to create simple experiences. For the purposes of this dissertation, an experience is a series of time-ordered comic frames, annotated with the changing intentional and physical states of the characters and objects in each frame. Each frame represents …