Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Managing Large Data Sets Using Support Vector Machines, Ranjini Srinivas, Aug 2010

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Hundreds of terabytes of CMS (Compact Muon Solenoid) data are accumulating in storage day by day at the University of Nebraska-Lincoln, one of the eight US CMS Tier-2 sites. Managing this data means retaining useful CMS data sets and clearing storage space for newly arriving data by deleting less useful data sets. This important task is currently done manually and takes a large amount of time. The overall objective of this study was to develop a methodology to help identify the data sets to be deleted when there is a requirement for …


Cloud Based Scientific Workflow For NMR Data Analysis, Ashwin Manjunatha, Paul E. Anderson, Satya S. Sahoo, Ajith Harshana Ranabahu, Michael L. Raymer, Amit P. Sheth, Jul 2010

Kno.e.sis Publications

This work presents a service-oriented scientific workflow approach to NMR-based metabolomics data analysis. We demonstrate the effectiveness of this approach by implementing several common spectral processing techniques in the cloud using Hadoop, a parallel map-reduce framework.


A Study In Hadoop Streaming With MATLAB For NMR Data Processing, Kalpa Gunaratna, Paul E. Anderson, Ajith Harshana Ranabahu, Amit P. Sheth, Jan 2010

Kno.e.sis Publications

Applying Cloud computing techniques to the analysis of large data sets has shown promise in many data-driven scientific applications. Our approach is to use Cloud computing for Nuclear Magnetic Resonance (NMR) data analysis, which normally involves large amounts of data. Biologists often use third-party or commercial software for ease of use. Enabling the use of this kind of software in a Cloud would be highly advantageous in many ways. Scripting languages designed especially for clouds may not have the flexibility biologists need for their purposes. Although this is true, they are familiar with special software packages that allow …