Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 7 of 7

Full-Text Articles in Physical Sciences and Mathematics

The Impact Of Overfitting And Overgeneralization On The Classification Accuracy In Data Mining, Huy Nguyen Anh Pham Jan 2010

LSU Doctoral Dissertations

Current classification approaches usually do not try to achieve a balance between fitting and generalization when they infer models from training data. Such approaches ignore the possibility of different penalty costs for the false-positive, false-negative, and unclassifiable types. Thus, their performance may not be optimal or may even be coincidental. This dissertation analyzes the above issues in depth. It also proposes two new approaches called the Homogeneity-Based Algorithm (HBA) and the Convexity-Based Algorithm (CBA) to address these issues. These new approaches aim at optimally balancing the data fitting and generalization behaviors of models when some traditional classification approaches are used. …


Choosing Between Remote I/O Versus Staging In Distributed Environments, Ibrahim Hakki Suslu Jan 2010

LSU Doctoral Dissertations

Today, scientific applications and experiments have become increasingly complex and more demanding in terms of their computational and data requirements. The amount of data generated and used has grown at a very rapid rate. Tens or hundreds of terabytes of data for a single application are very common today; petabytes and even exabytes of data will be very common in a few years. One of the major challenges in distributed computing environments is how to access these large datasets remotely over the network. Data staging and remote I/O are the most widely used data access methods for distributed applications. …


Model Based Analysis Of Some High Speed Network Issues, Suman Kumar Jan 2010

LSU Doctoral Dissertations

The study of complex problems in science and engineering today typically involves large-scale data, and a huge number of large-scale scientific breakthroughs critically depend on large multi-disciplinary and geographically dispersed research teams, for which the high-speed network becomes an integral part. To serve the ongoing bandwidth requirements and scalability of these networks, there has been a continuous evolution of different TCPs for high-speed networks. Testing these protocols on a real network would be expensive and time-consuming, and moreover such networks are not easily available to researchers worldwide. Network simulation is a well-accepted and widely used method for performance evaluation; it is well …


Improving Software Quality Using An Ontology-Based Approach, Yixin Luo Jan 2010

LSU Doctoral Dissertations

Ensuring quality in software development is a challenging process. The concepts of anti-patterns and bad code smells utilize the knowledge of recurring problems to improve the quality of current and future software development. Anti-patterns describe recurring bad design solutions, while bad code smells describe source code that is error-free but difficult to understand and maintain. Code refactoring aims to remove bad code smells without changing a program’s functionality while improving program quality. There are metrics-based tools to detect a few bad code smells from source code; however, the knowledge and understanding of these indicators of low-quality software are still …


Application-Level Optimization Of End-To-End Data Transfer Throughput, Esma Yildirim Jan 2010

LSU Doctoral Dissertations

For large-scale distributed applications, effective use of available network throughput and optimization of data transfer speed are crucial for end-to-end application performance. Today, many regional and national optical networking initiatives such as LONI, ESnet and TeraGrid provide high-speed network connectivity to their users. However, the majority of users fail to obtain even a fraction of the theoretical speeds promised by these networks due to issues such as sub-optimal protocol tuning, disk bottlenecks on the sending and/or receiving ends, and processor limitations. This implies that having high-speed networks in place is important but not sufficient for the improvement of …


Data Transfer Scheduling With Advance Reservation And Provisioning, Mehmet Balman Jan 2010

LSU Doctoral Dissertations

Over the years, scientific applications have become more complex and more data intensive. Although institutions and organizations gain access to the resources needed for their large-scale applications through the use of distributed resources, complex middleware is required to orchestrate the use of these storage and network resources between collaborating parties, and to manage the end-to-end processing of data. We present a new data scheduling paradigm with advance reservation and provisioning. Our methodology provides a basis for provisioning end-to-end high-performance data transfers that require integration between system, storage and network resources, and coordination between reservation managers and data transfer …


Augmented Breast Tumor Classification By Perfusion Analysis, Bruce Yu-Sun Lin Jan 2010

LSU Doctoral Dissertations

Magnetic resonance and computed tomography imaging aid in the diagnosis and analysis of pathologic conditions. Blood flow, or perfusion, through a region of tissue can be computed from a time series of contrast-enhanced images. Perfusion is an important set of physiological parameters that reflect angiogenesis. In cancer, heightened angiogenesis is a key process in the growth and spread of tumorous masses. An automatic classification technique using recovered perfusion may prove to be a highly accurate diagnostic tool. Such a classification system would supplement existing histopathological tests and help physicians choose the optimal treatment protocol. Perfusion is obtained through …