Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 31 - 60 of 287

Full-Text Articles in Physical Sciences and Mathematics

Music Recommendation And Query-By-Content Using Self-Organizing Maps, Kyle B. Dickerson, Dan A. Ventura Jun 2009

Faculty Publications

The ever-increasing density of computer storage devices has allowed the average user to store enormous quantities of multimedia content, and a large amount of this content is usually music. Current search techniques for musical content rely on meta-data tags which describe artist, album, year, genre, etc. Query-by-content systems allow users to search based upon the acoustical content of the songs. Recent systems have mainly depended upon textual representations of the queries and targets in order to apply common string-matching algorithms. However, these methods lose much of the information content of the song and limit the ways in which a user …


Super-Resolution Via Recapture And Bayesian Effect Modeling, Bryan S. Morse, Kevin Seppi, Neil Toronto, Dan A. Ventura Jun 2009

Faculty Publications

This paper presents Bayesian edge inference (BEI), a single-frame super-resolution method explicitly grounded in Bayesian inference that addresses issues common to existing methods. Though the best give excellent results at modest magnification factors, they suffer from gradient stepping and boundary coherence problems by factors of 4x. Central to BEI is a causal framework that allows image capture and recapture to be modeled differently, a principled way of undoing downsampling blur, and a technique for incorporating Markov random field potentials arbitrarily into Bayesian networks. Besides addressing gradient and boundary issues, BEI is shown to be competitive with existing methods on published …


An Exploration Of Topologies And Communication In Large Particle Swarms, Matthew Gardner, Andrew Mcnabb, Kevin Seppi May 2009

Faculty Publications

Particle Swarm Optimization (PSO) has typically been used with small swarms of about 50 particles. However, PSO is more efficiently parallelized with large swarms. We formally describe existing topologies and identify variations which are better suited to large swarms in both sequential and parallel computing environments. We examine the performance of PSO for benchmark functions with respect to swarm size and topology. We develop and demonstrate a new PSO variant which leverages the unique strengths of large swarms. “Hearsay PSO” allows for information to flow quickly through the swarm, even with very loosely connected topologies. These loosely connected topologies are …


Test Case Generation Using Model Checking For Software Components Deployed Into New Environments, Tonglaga Bao, Michael D. Jones Apr 2009

Faculty Publications

In this paper, we show how to generate test cases for a component deployed into a new software environment. This problem is important for software engineers who need to deploy a component into a new environment. Most existing model based testing approaches generate models from high level specifications. This leaves a semantic gap between the high level specification and the actual implementation. Furthermore, the high level specification often needs to be manually translated into a model, which is a time consuming and error prone process. We propose generating the model automatically by abstracting the source code of the component using …


The 20-Minute Genealogist: A Context-Preservation Metaphor For Assisted Family History Research, Charles D. Knutson, Jonathan Krein Mar 2009

Faculty Publications

What can you possibly do to be productive as a family history researcher in 20 minutes per week? Our studies suggest that currently the answer is, “Nothing.” In 20 minutes a would-be researcher can’t even remember what happened last week, let alone what they were planning to do next. The 20-Minute Genealogist is a powerful metaphor within which software solutions must consider context preservation as the fundamental domain of the system, thus freeing the researcher to do research while the software manages the tasks that computers do best. Two survey-based studies were conducted that indicate a significant disconnect between the …


A Dynamic Attribute-Based Data Filtering And Recovery Scheme For Web Information Processing, Amit Ahuja, Yiu-Kai D. Ng Mar 2009

Faculty Publications

Transmitting excessive amounts of Web data over a network channel on the Internet causes data processing problems, which include selectively choosing useful information to be retained for various data applications. In this paper, we present an approach for filtering less-informative attribute data from a source Website. A scheme for filtering attributes, instead of tuples (records), from a Website becomes imperative, since filtering a complete tuple would lead to filtering some informative, as well as less-informative, attribute data in the tuple. Since filtered data at the source Website may be of interest to the user at the destination …


Spamed: A Spam Email Detection Approach Based On Phrase Similarity, Yiu-Kai D. Ng, Maria Soledad Pera Feb 2009

Faculty Publications

Emails are unquestionably one of the most popular communication media these days. Not only are they fast and reliable, but they are also generally free. Unfortunately, a significant number of emails received by email users on a daily basis are spam. This fact is annoying, since spam emails waste users’ time in reviewing and deleting them. In addition, spam emails consume resources, such as storage, bandwidth, and computer processing time. Many attempts have been made in the past to eradicate spam emails; however, none has proved highly effective. In this paper, we propose a spam-email detection …


Author Entropy Vs. File Size In The Gnome Suite Of Applications, Jason R. Casebolt, Daniel P. Delorey, Charles D. Knutson, Jonathan Krein, Alexander C. Maclean Jan 2009

Faculty Publications

We present the results of a study in which author entropy was used to characterize author contributions per file. Our analysis reveals three patterns: banding in the data, uneven distribution of data across bands, and file size dependent distributions within bands. Our results suggest that when two authors contribute to a file, large files are more likely to have a dominant author than smaller files.
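
As a rough illustration of the measure named above, the sketch below assumes that author entropy is the Shannon entropy of each author's share of lines in a file. The listing does not give the formula, so this definition, the function name `author_entropy`, and the per-line author list are all assumptions made for illustration.

```python
import math
from collections import Counter


def author_entropy(line_authors):
    """Shannon entropy (in bits) of per-author line shares in one file.

    line_authors is a hypothetical input: one author name per line of
    the file. This definition is an assumption, not the paper's formula.
    """
    counts = Counter(line_authors)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum -p * log2(p) over each author's share p of the file's lines.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Two authors with equal shares give the two-author maximum;
# a dominant author pulls the value toward zero.
balanced = author_entropy(["alice", "bob"])
dominated = author_entropy(["alice"] * 9 + ["bob"])
```

Under this assumed definition, a file with a dominant author scores lower than a file split evenly between two authors, which is the kind of contrast the abstract describes.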


Psoda: Open Source Phylogenetic Search And Dna Analysis, Mark J. Clement, Quinn O. Snell, Kenneth Sundberg Jan 2009

Faculty Publications

PSODA is an open source (GPL v2) sequence analysis package that implements sequence alignment using biochemical properties, phylogeny search with parsimony or maximum likelihood criteria and selection detection using biochemical properties (TreeSAAP). PSODA is compatible with PAUP and the search algorithms are competitive with those in PAUP. PSODA also adds a basic scripting language to the PAUP block, making it possible to easily create advanced meta-searches. Because PSODA is open-source, we have also been able to easily add in advanced search techniques and characterize the benefits of various optimizations. PSODA is available for Macintosh OS X, Windows, and Linux.


Hardware Accelerated Sequence Alignment With Traceback, Scott Lloyd, Quinn O. Snell Jan 2009

Faculty Publications

Biological sequence alignment is an essential tool used in molecular biology and biomedical applications. The growing volume of genetic data and the complexity of sequence alignment present a challenge in obtaining alignment results in a timely manner. Known methods to accelerate alignment on reconfigurable hardware only address sequence comparison, limit the sequence length, or exhibit memory and I/O bottlenecks. A space-efficient, global sequence alignment algorithm and architecture is presented that accelerates the forward scan and traceback in hardware without memory and I/O limitations. With 256 processing elements in FPGA technology, a performance gain over 300 times that of a desktop …
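
For readers unfamiliar with the two phases the abstract mentions, the sketch below shows a forward scan followed by a traceback using the textbook Needleman-Wunsch global alignment. It is not the space-efficient hardware design the paper presents: it stores the full score matrix in software, and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Textbook global alignment; returns (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Forward scan: fill the score matrix row by row.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Traceback: walk from the bottom-right cell back to the origin.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


best, top, bottom = needleman_wunsch("ACGT", "AGT")
```

The full matrix makes memory grow with the product of the sequence lengths, which is the kind of limitation the abstract says the proposed architecture avoids.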


Synthesizing Correlated Rss News Articles Based On A Fuzzy Equivalence Relation, Yiu-Kai D. Ng, Maria Soledad Pera Jan 2009

Faculty Publications

Tens of thousands of news articles are posted on-line each day, covering topics from politics to science to current events. To better cope with this overwhelming volume of information, RSS (news) feeds are used to categorize newly posted articles. Nonetheless, most RSS users must filter through many articles within the same or different RSS feeds to locate articles pertaining to their particular interests. Due to the large number of news articles in individual RSS feeds, there is a need for further organizing articles to aid users in locating non-redundant, informative, and related articles of interest quickly. In this paper, we …


Decision Tree Ensemble: Small Heterogeneous Is Better Than Large Homogeneous, Mike Gashler, Christophe G. Giraud-Carrier, Tony R. Martinez Dec 2008

Faculty Publications

Using decision trees that split on randomly selected attributes is one way to increase the diversity within an ensemble of decision trees. Another approach increases diversity by combining multiple tree algorithms. The random forest approach has become popular because it is simple and yields good results with common datasets. We present a technique that combines heterogeneous tree algorithms and contrast it with homogeneous forest algorithms. Our results indicate that random forests do poorly when faced with irrelevant attributes, while our heterogeneous technique handles them robustly. Further, we show that large ensembles of random trees are more susceptible to diminishing returns …


Learning-Based Fusion For Data Deduplication, Sabra Dinerstein, Parris K. Egbert, Stephen W. Clyde, Jared Dinerstein Dec 2008

Faculty Publications

Rule-based deduplication utilizes expert domain knowledge to identify and remove duplicate data records. Achieving high accuracy in a rule-based system requires the creation of rules containing a good combination of discriminatory clues. Unfortunately, accurate rule-based deduplication often requires significant manual tuning of both the rules and the corresponding thresholds. This need for manual tuning reduces the efficacy of rule-based deduplication and its applicability to real-world data sets. No adequate solution exists for this problem. We propose a novel technique for rule-based deduplication. We apply individual deduplication rules, and combine the resultant match scores via learning-based information fusion. We show empirically …


Nowhere To Hide: Finding Plagiarized Documents Based On Sentence Similarity, Nathaniel Gustafson, Yiu-Kai D. Ng, Maria Soledad Pera Dec 2008

Faculty Publications

Plagiarism is a serious problem: it infringes on copyrighted documents and materials, is unethical, and decreases the economic incentive received by authors (owners) of the original copies. Unfortunately, plagiarism is getting worse due to the increasing number of online publications on the Web, which facilitates locating and paraphrasing information. In solving this problem, we propose a novel plagiarism-detection method, called SimPaD, which (i) establishes the degree of resemblance between any two documents D1 and D2 based on their sentence-to-sentence similarity computed by using pre-defined word-correlation factors, and (ii) generates a graphical view of sentences that are similar (or the same) …


Sequence Alignment With Traceback On Reconfigurable Hardware, Scott Lloyd, Quinn O. Snell Dec 2008

Faculty Publications

Biological sequence alignment is an essential tool used in molecular biology and biomedical applications. The growing volume of genetic data and the complexity of sequence alignment present a challenge in obtaining alignment results in a timely manner. Known methods to accelerate alignment on reconfigurable hardware only address sequence comparison, limit the sequence length, or exhibit memory and I/O bottlenecks. A space-efficient, global sequence alignment algorithm and architecture is presented that accelerates the forward scan and traceback in hardware without memory and I/O limitations. With 256 processing elements in FPGA technology, a performance gain over 300 times that of a desktop …


Using Vagueness Measures To Re-Rank Documents Retrieved By A Fuzzy Set Information Retrieval Model, Stephen Lynn, Yiu-Kai D. Ng Oct 2008

Faculty Publications

Traditional information retrieval (IR) systems evaluate user queries and retrieve/rank documents based on matching keywords in user queries with words in documents. These exact word-matching and ranking approaches ignore too many relevant documents that do not contain the exact keywords as specified in a user query. Instead of considering these traditional approaches, we propose to retrieve documents using a fuzzy set IR model and rank retrieved documents for any vague query using the “vagueness score” of the documents based on the word senses as defined in WordNet. Using the vagueness scores, we rank the most “relevant” documents of a …


Enhancement Of Unusual Color In Aerial Video Sequences For Assisting Wilderness Search And Rescue, Bryan S. Morse, Nathan D. Rasmussen, Daniel Thornton Oct 2008

Faculty Publications

The use of aerial video for search and surveillance has been popularized by the increased use of camera-equipped unmanned aerial vehicles. For many search applications, objects may also be missed by observers due to their small size, brief visibility, or the inherent monotony of the scene. This paper presents a novel method for automatically emphasizing unusually colored objects to improve their detectability. We use a hue histogram and a local saliency measure to find unusually colored objects, then boost the saturation of these objects while desaturating more common colors, thus drawing the observer’s attention and facilitating video search.
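
The hue-histogram part of this idea can be sketched in a few lines. The version below is a simplification under stated assumptions: it treats a frame as a flat list of RGB pixels, omits the local saliency measure entirely, and uses hypothetical parameters (`bins`, `rare_fraction`, `boost`, `damp`) that do not come from the paper.

```python
import colorsys
from collections import Counter


def enhance_unusual_colors(pixels, bins=12, rare_fraction=0.05,
                           boost=1.5, damp=0.5):
    """Boost saturation of rare hues and desaturate common ones.

    pixels is a list of (r, g, b) floats in [0, 1]. All parameter
    values are illustrative assumptions, and no saliency test is used.
    """
    if not pixels:
        return []
    hsv = [colorsys.rgb_to_hsv(r, g, b) for r, g, b in pixels]

    def hue_bin(h):
        # Map a hue in [0, 1] to one of `bins` histogram buckets.
        return min(int(h * bins), bins - 1)

    # Hue histogram over the whole frame.
    hist = Counter(hue_bin(h) for h, _, _ in hsv)
    n = len(hsv)

    out = []
    for h, s, v in hsv:
        is_rare = hist[hue_bin(h)] / n < rare_fraction
        new_s = min(1.0, s * boost) if is_rare else s * damp
        out.append(colorsys.hsv_to_rgb(h, new_s, v))
    return out


# A mostly green frame with a single reddish pixel.
frame = [(0.2, 0.6, 0.2)] * 99 + [(0.8, 0.3, 0.3)]
enhanced = enhance_unusual_colors(frame)
```

In this toy frame the lone reddish pixel falls in a rare hue bucket and becomes more saturated, while the dominant green is muted. Gray pixels have an undefined hue that `colorsys` reports as 0, so a fuller version would need to handle them separately.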


Hop-By-Hop Multicast Transport For Mobile Ad Hoc Wireless Networks, Manoj Pandey, Daniel Zappala Oct 2008

Faculty Publications

Multicast transport is a challenging problem because the source must provide congestion control and reliability for a tree, rather than a single path. This problem is made even more difficult in mobile ad hoc networks due to problems caused by contention, spatial reuse, and mobility. In this paper, we design a hop-by-hop multicast transport protocol, which pushes transport functionality into the core of the network. Although this requires per-flow state, a hop-by-hop approach simplifies congestion control, enables local recovery of lost packets, and provides low delay and efficient use of wireless capacity. We use a simulation study to demonstrate the …


Scalable Multicast Routing For Ad Hoc Networks, Manoj Pandey, Daniel Zappala Oct 2008

Faculty Publications

Routing in a mobile ad hoc network is challenging because nodes can move at any time, invalidating a previously-discovered route. Multicast routing is even more challenging, because a source needs to maintain a route to potentially many group members simultaneously. Providing scalable solutions to this problem typically requires building a hierarchy or an overlay network to reduce the cost of route discovery and maintenance. In this paper, we show that a much simpler alternative is possible, by using source specific semantics and relying on the unicast routing protocol to find all routes. This separation of concerns enables the multicast routing …


Autonomous And Intelligent Radio Switching For Heterogeneous Wireless Networks, Qiuyi Duan, Charles D. Knutson, Lei Wang, Daniel Zappala Sep 2008

Faculty Publications

As wireless devices continue to become more prevalent, heterogeneous wireless networks - in which communicating devices have at their disposal multiple types of radios - will become the norm. Communication between nodes in these networks ought to be as simple as possible; they should be able to seamlessly switch between different radios and network stacks on the fly in order to better serve the user. To make this a possibility, we consider the challenging problems of when two communicating devices should decide to switch to a different radio, and which radio they should choose. We design an Autonomous and Intelligent …


Improving Live Sequence Chart To Automata Transformation For Verification, Rahul Kumar, Eric G. Mercer Aug 2008

Faculty Publications

This paper presents a Live Sequence Chart (LSC) to automata transformation algorithm that enables the verification of communication protocol implementations. Using this LSC to automata transformation a communication protocol implementation can be verified using a single verification run as opposed to previous techniques that rely on a three stage verification approach. The novelty and simplicity of the transformation algorithm lies in its placement of accept states in the automata generated from the LSC. We present in detail an example of the transformation as well as the transformation algorithm. Further, we present a detailed analysis and an empirical study comparing the …


Watertight Trimmed Nurbs, Thomas W. Sederberg, Xin Li, Hongwei Lin, Heather Ipson Aug 2008

Faculty Publications

This paper addresses the long-standing problem of the unavoidable gaps that arise when expressing the intersection of two NURBS surfaces using conventional trimmed-NURBS representation. The solution converts each trimmed NURBS into an untrimmed T-Spline, and then merges the untrimmed T-Splines into a single, watertight model. The solution enables watertight fillets of NURBS models, as well as arbitrary feature curves that do not have to follow isoparameter curves. The resulting T-Spline representation can be exported without error as a collection of NURBS surfaces.


Data-Driven Programming And Behavior For Autonomous Virtual Characters, Jonathan Dinerstein, Parris K. Egbert, Michael A. Goodrich, Dan A. Ventura Jul 2008

Faculty Publications

In the creation of autonomous virtual characters, two levels of autonomy are common. They are often called motion synthesis (low-level autonomy) and behavior synthesis (high-level autonomy), where an action (i.e. motion) achieves a short-term goal and a behavior is a sequence of actions that achieves a long-term goal. There exists a rich literature addressing many aspects of this general problem (and it is discussed in the full paper). In this paper we present a novel technique for behavior (high-level) autonomy and utilize existing motion synthesis techniques. Creating an autonomous virtual character with behavior synthesis abilities frequently includes three stages: forming …


Link Quality Prediction For Wireless Devices With Multiple Radios, Qiuyi Duan, Charles D. Knutson, Lei Wang, Daniel Zappala Jun 2008

Faculty Publications

Communication between wireless devices ought to be as simple as possible; they should be able to seamlessly switch between different radios and network stacks on the fly in order to better serve the user. To make this a possibility, we consider the challenging problem of predicting link quality in a changing mobile environment. In this paper we present an algorithm that uses Weighted Least Squares Regression to predict whether a given link can meet application requirements in terms of throughput, delay, and jitter. We use a simulation study to demonstrate that our algorithm is able to predict link quality accurately …
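
To make the regression step concrete, the sketch below fits a weighted least squares line to recent link measurements and extrapolates one step ahead. The truncated abstract does not describe the weighting scheme or the features used, so the exponential `decay` weights, the function name `wls_predict`, and the single-metric setup are all assumptions.

```python
def wls_predict(times, values, next_time, decay=0.8):
    """Weighted least squares line fit, extrapolated to next_time.

    Newer samples get larger weights through an assumed exponential
    decay; the paper's actual weighting is not given in this listing.
    """
    n = len(times)
    if n == 0:
        raise ValueError("need at least one sample")
    weights = [decay ** (n - 1 - i) for i in range(n)]
    w_sum = sum(weights)
    t_mean = sum(w * t for w, t in zip(weights, times)) / w_sum
    v_mean = sum(w * v for w, v in zip(weights, values)) / w_sum
    spread = sum(w * (t - t_mean) ** 2 for w, t in zip(weights, times))
    if spread == 0:
        # All samples share one timestamp: fall back to the weighted mean.
        return v_mean
    slope = sum(
        w * (t - t_mean) * (v - v_mean)
        for w, t, v in zip(weights, times, values)
    ) / spread
    return v_mean + slope * (next_time - t_mean)


# Hypothetical throughput samples (Mbit/s) and an assumed requirement.
predicted = wls_predict([0, 1, 2, 3], [5.0, 4.6, 4.1, 3.9], next_time=4)
meets_requirement = predicted >= 3.0
```

The same fit could be repeated for delay and jitter, with each prediction compared against the application's threshold for that metric.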


Or Best Offer: A Privacy Policy Negotiation Protocol, Eric G. Mercer, Kent E. Seamons, Daniel D. Walker Jun 2008

Faculty Publications

Privacy policy languages, such as P3P, allow websites to publish their privacy practices and policies in machine readable form. Currently, software agents designed to protect users’ privacy follow a “take it or leave it” approach that is inflexible and gives the server ultimate control. Privacy policy negotiation is one approach to leveling the playing field by allowing a client to negotiate with a server to determine how that server collects and uses the client’s data. We present a privacy policy negotiation protocol, “Or Best Offer”, that includes a formal model for specifying privacy preferences and reasoning about privacy policies. The …


Assessing The Costs Of Sampling Methods In Active Learning For Annotation, James Carroll, Robbie Haertel, Peter Mcclanahan, Eric K. Ringger, Kevin Seppi Jun 2008

Faculty Publications

Traditional Active Learning (AL) techniques assume that the annotation of each datum costs the same. This is not the case when annotating sequences; some sequences will take longer than others. We show that the AL technique which performs best depends on how cost is measured. Applying an hourly cost model based on the results of an annotation user study, we approximate the amount of time necessary to annotate a given sentence. This model allows us to evaluate the effectiveness of AL sampling methods in terms of time spent in annotation. We achieve a 77% reduction in hours from a random …


Application And Evaluation Of Spatiotemporal Enhancement Of Live Aerial Video Using Temporally Local Mosaics, Dennis Eggett, Cameron Engh, Damon Gerhardt, Michael A. Goodrich, Bryan S. Morse, Nathan Rasmussen, Daniel Thornton Jun 2008

Faculty Publications

Camera-equipped mini-UAVs are popular for many applications, including search and surveillance, but video from them is commonly plagued with distracting jittery motions and disorienting rotations that make it difficult for human viewers to detect objects of interest and infer spatial relationships. For time-critical search situations there are also inherent tradeoffs between detection and search speed. These problems make the use of dynamic mosaics to expand the spatiotemporal properties of the video appealing. However, for many applications it may not be necessary to maintain full mosaics of all of the video but to mosaic and retain only a number of recent …


Accelerating Corpus Annotation Through Active Learning, George Busby, Marc Carmen, James Carroll, Robbie Haertel, Deryle W. Lonsdale, Peter Mcclanahan, Eric K. Ringger, Kevin Seppi Mar 2008

Faculty Publications

PDF of Powerpoint Presentation on accelerating corpus annotation through active learning. This presentation was given at the Conference of the American Association for Corpus Linguistics in 2008.


Analysis Of Canonical Chinese Antonym Co-Occurrence, Eric K. Ringger, Guohui Liu, Shiping Liu, Xingfu Wang Mar 2008

Faculty Publications

PDF of Powerpoint Presentation on canonical Chinese antonym co-occurrence. This presentation was given at the Conference of the American Association for Corpus Linguistics in 2008.


Compiling And Annotating A Syriac Corpus, George Busby, James Carroll, Marc Carmen, Carl Griffin, Robbie Haertel, Kristian Heal, Joshua Heaton, Deryle W. Lonsdale, Peter Mcclanahan, Eric K. Ringger, Kevin Seppi, David Taylor Mar 2008

Faculty Publications

PDF of Powerpoint Presentation on compiling and annotating a Syriac corpus. This presentation was given at the Conference of the American Association for Corpus Linguistics in 2008.