Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

LSU Doctoral Dissertations

Computer Sciences

Indexing

Articles 1 - 3 of 3

Full-Text Articles in Physical Sciences and Mathematics

Efficient Indexing For Structured And Unstructured Data, Manish Madhukar Patil, Jan 2014

LSU Doctoral Dissertations

The collection of digital data is growing at an exponential rate. Data originates from a wide range of sources such as text feeds, biological sequencers, internet traffic over routers, sensors, and many others. To mine intelligent information from these sources, users have to query the data. Indexing techniques aim to reduce query time by preprocessing the data. The diversity of data sources in the real world makes it imperative to develop application-specific indexing solutions based on the data to be queried. Data can be structured (i.e., relational tables) or unstructured (i.e., free text). Moreover, increasingly many applications need …


Performance Improvement Of Distributed Computing Framework And Scientific Big Data Analysis, Praveenkumar Kondikoppa, Jan 2014

LSU Doctoral Dissertations

The analysis of big data to gain better insights has been a focus of researchers in the recent past. Traditional desktop computers or database management systems may not be suitable for efficient and timely analysis, because such analysis requires massively parallel processing. Distributed computing frameworks are being explored as a viable solution. For example, Google proposed MapReduce, which is becoming a de facto computing architecture for big data solutions. However, scheduling in MapReduce is coarse-grained and remains a challenge. For the MapReduce scheduler configured over distributed clusters, we identify two issues: data locality disruption and …


Techniques To Explore Time-Related Correlation In Large Datasets, Sumeet Dua, Jan 2002

LSU Doctoral Dissertations

The next generation of database management and computing systems will be significantly more complex, with data distributed both in functionality and operation. The complexity arises, at least in part, from the data types involved and the types of information requests made by the database user. Time sequence databases are generated in many practical applications. Detecting similar sequences and subsequences within these databases is an important research area and has generated a lot of interest recently. Previous studies in this area have concentrated on calculating similitude between (sub)sequences of equal size. The question of unequal-sized (sub)sequence comparison to report similitude has been an …