Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

2013

Theses/Dissertations

Master's Projects

Articles 1 - 30 of 32

Full-Text Articles in Physical Sciences and Mathematics

Metamorphic Detection Using Singular Value Decomposition, Ranjith Kumar Jidigam Oct 2013

Master's Projects

Metamorphic malware changes its internal structure with each infection, while maintaining its original functionality. Such malware can be difficult to detect using static techniques, since there may be no common signature across infections. In this research we apply a score based on Singular Value Decomposition (SVD) to the problem of metamorphic detection. SVD is a linear algebraic technique which is applicable to a wide range of problems, including facial recognition. Previous research has shown that a similar facial recognition technique yields good results when applied to metamorphic malware detection. We present experimental results and we analyze the effectiveness and efficiency …


Pattern Recognition Of DNA Sequences Using Automata With Application To Species Distinction, Parnika P. Achrekar Oct 2013

Master's Projects

"Darwin wasn't just provocative in saying that we descend from the apes; he didn't go far enough, we are apes in every way, from our long arms and tailless bodies to our habits and temperament," said Frans de Waal, a primate scientist at Emory University in Atlanta, Georgia. Scientists have named and analyzed 1.3 million species. This project focuses on capturing nucleotide sequences of various species and determining the similarities and differences between them. Finite state automata are used to accomplish this. The automaton for a DNA genome is created using the Alergia algorithm and is used as …


Determining Spread Of Diseases Using Social Networking Data, Jiten P. Oswal Oct 2013

Master's Projects

“Hi, in a week there is a high possibility that you may get infected by the flu. Please go get a flu shot.” This is a sample warning to Twitter users about the spread of flu among people in a specific location. Twitter is already used to plan social lives, interact with celebrities, and communicate with friends. But data from social networking sites could now have a far more serious use: tracking diseases and learning about their spread. Using this kind of data, I will help health agencies personalize the prediction and warn the general …


Higher Order PWM For Modeling Transcription Factor Binding Sites, Dhivya Srinivasan Oct 2013

Master's Projects

Traditional Position Weight Matrices (PWMs) that are used to model Transcription Factor Binding Sites (TFBS) assume independence among different positions in the binding site. In reality, this may not be the case. A better way to model TFBS is to consider the distribution of dinucleotides or trinucleotides instead of just mononucleotides, thus taking neighboring nucleotides into account. We can therefore extend the single-nucleotide PWM to a dinucleotide PWM or an even higher-order PWM to correctly estimate the dependencies among the nucleotides in a given sequence. The purpose of this project is to develop an algorithm to implement higher-order …
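
The dinucleotide idea in the abstract above can be illustrated with a short Python sketch. It is not the project's algorithm, which the truncated abstract does not describe. The function names, the uniform background over the 16 dinucleotides, the pseudocount, and the toy sites are all assumptions made for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

def build_dinucleotide_pwm(sites, pseudocount=1.0):
    """Return one log-odds table per adjacent position pair (i, i+1)."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all binding sites must have the same length")
    background = 1.0 / 16  # assumption: uniform background over dinucleotides
    total = len(sites) + 16 * pseudocount
    pwm = []
    for i in range(length - 1):
        # Count the dinucleotide that starts at position i in every site.
        counts = Counter(s[i:i + 2] for s in sites)
        column = {}
        for a in ALPHABET:
            for b in ALPHABET:
                freq = (counts[a + b] + pseudocount) / total
                column[a + b] = math.log2(freq / background)
        pwm.append(column)
    return pwm

def score(pwm, sequence):
    """Sum the log-odds of each overlapping dinucleotide in the sequence."""
    if len(sequence) != len(pwm) + 1:
        raise ValueError("sequence length must match the model")
    return sum(col[sequence[i:i + 2]] for i, col in enumerate(pwm))

# Toy aligned sites, invented for this example (not real TFBS data).
sites = ["ACGT", "ACGA", "ACGT", "TCGT"]
pwm = build_dinucleotide_pwm(sites)
```

Calling `score(pwm, "ACGT")` gives a higher value than `score(pwm, "TTTT")`, because the first sequence reuses dinucleotides seen in the toy sites. A trinucleotide model would follow the same pattern with windows of three.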


Hidden Markov Models For Malware Classification, Chinmayee Annachhatre Oct 2013

Master's Projects

Malware is software developed with malicious intent, and it is a rapidly evolving threat to the computing community. Although many techniques for malware classification have been proposed, there is still a lack of a comprehensible and useful taxonomy to classify malware samples. Previous research has shown that hidden Markov model (HMM) analysis is useful for detecting certain types of malware. In this research, we consider the related problem of malware classification based on HMMs. We train HMMs for a variety of malware generators and a variety of compilers. More than 9000 malware samples are then scored against each …
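
Scoring a sample against an HMM usually means computing the likelihood of its observation sequence under that model. The Python sketch below shows this with the scaled forward algorithm. The abstract does not give the project's scoring details, so the integer-encoded observations, the hand-set two-state models, and the names `family_a` and `family_b` are hypothetical.

```python
import math

def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: return log P(obs | model).

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[i][o] : probability of emitting symbol o while in state i
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_prob += math.log(scale)
    return log_prob

def classify(obs, models):
    """Return the model name with the highest per-symbol log-likelihood."""
    return max(
        models,
        key=lambda name: log_likelihood(obs, *models[name]) / len(obs),
    )

# Hypothetical hand-set models over two symbols (0 and 1); in practice the
# parameters would be trained, for example with Baum-Welch.
models = {
    "family_a": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.8, 0.2]]),
    "family_b": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.2, 0.8]]),
}
```

Dividing by the sequence length keeps scores comparable across samples of different sizes. That normalization is a design choice of this sketch.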


Compression-Based Analysis Of Metamorphic Malware, Jared Lee Oct 2013

Master's Projects

Recent work has presented a technique based on structural entropy measurement as an effective way to detect metamorphic malware. The technique uses two steps, file segmentation and sequence comparison, to calculate file similarity. In another previous work, it was observed that similar malware samples have similar measures of Kolmogorov complexity. A proposed method of estimating Kolmogorov complexity was to calculate the compression ratio of a given malware sample, which could then be used to cluster the malicious software. Malware detection has also been attempted through the use of adaptive data compression, with promising results. In this paper, we attempt to combine …
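
The compression-ratio estimate mentioned above can be sketched in a few lines of Python. The choice of `zlib` is an assumption, since the abstract does not name a compressor, and the byte strings are toy stand-ins for file contents.

```python
import random
import zlib

def compression_ratio(data: bytes, level: int = 9) -> float:
    """Return compressed size / original size (lower means more redundancy)."""
    if not data:
        raise ValueError("cannot compute a ratio for empty input")
    return len(zlib.compress(data, level)) / len(data)

# Toy inputs standing in for file contents (not real malware samples).
repetitive = b"\x90" * 1000
rng = random.Random(0)  # fixed seed so the example is reproducible
noisy = bytes(rng.getrandbits(8) for _ in range(1000))

ratios = {
    "repetitive": compression_ratio(repetitive),
    "noisy": compression_ratio(noisy),
}
```

Highly repetitive data compresses to a small fraction of its size, while random-looking data barely compresses at all. Files with similar ratios could then be grouped by any standard clustering method.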


San José State University Building Editor, Viet Trinh Oct 2013

Master's Projects

The San José State University (SJSU) Building Editor is a graphics application that renders SJSU buildings in a multidimensional space and simulates the flow of people evacuating the buildings under different circumstances. For a given building, the goals of this application are to analyze the density of people on each floor, to predict bottlenecks in each structure, and to simulate an optimal evacuation plan in case of an emergency for selected SJSU buildings that are visualized in multidimensional models. This report describes the application's functionality in detail, studies key points in its implementation, and analyzes the application’s …


Pattern Discovery Of Sequential Symbolic Data Using Automata With An Application To Author Identification, Nikhil Kalantri Oct 2013

Master's Projects

Author identification is the process of analyzing a piece of text to ascertain whether it has an inherent writing style or pattern characteristic of a certain author. Almost all literary books can be attributed to a certain author because they have been signed. However, there also exists a plethora of unfinished books or manuscripts that could be attributed to a range of possible authors. For example, William Shakespeare wrote many plays that he did not sign. In order to assess the importance of such texts that do not bear the author's signature, it could be vital to …


Access Control In A Social Networking Environment, Mallika Perepa Oct 2013

Master's Projects

Collecting users into groups is a common activity on social networking sites such as Facebook, Google Groups, Yahoo Groups, and many other web applications. This project explores access control techniques for dynamically created groups. The starting point was Yioop [1], a PHP-based search engine. The ability to create social groups was added to Yioop. The grouping feature is enhanced with additional features such as blogs and pages for individual users as well as for groups of users. Access control is provided to every group and each user within a group based on the ownership of the group or …


Repetitive Component Based Motion Learning With Kinect, Govind Kalyankar Oct 2013

Master's Projects

Today’s world wants quick, smart, and cost-effective solutions to its problems. People want to learn everything online. They are interested in learning new techniques and every kind of art in a limited amount of time because they are busy with their own work and have very little time for in-class, instructor-led training. This project attempts to meet that need so that people can easily learn and master a new kind of art by themselves using Kinect. The focus of this project is to master Kung-Fu, an ancient form of Chinese martial arts. Kung-Fu …


Metamorphic Detection Using Function Call Graph Analysis, Prasad Deshpande Oct 2013

Master's Projects

Well-designed metamorphic malware can evade many commonly used malware detection techniques including signature scanning. In this research, we consider a score based on function call graph analysis. We test this score on several challenging classes of metamorphic malware and we show that the resulting detection rates yield an improvement over previous research.


Yioop Full Historical Indexing In Cache Navigation, Akshat Kukreti Apr 2013

Master's Projects

This project adds new cache-related features to Yioop, an Open Source, PHP-based search engine. Search engines often maintain caches of pages downloaded by their crawler. Commercial search engines like Google display a link to the cached version of a web page along with the search results for that particular web page. The first feature enables users to navigate through Yioop's entire cache. When a cached page is displayed along with its contents, links to cached pages saved in the past are also displayed. The feature also enables users to navigate cache history based on year and month. This feature is …


Cloud Based Complaint Management Service, Ajinkya Amrute Apr 2013

Master's Projects

Complaint management is important from both the customer's and the business's point of view. Complaints carry the direct voice of the customer and give companies a huge volume of data that can be used to improve the quality of the products they manufacture. Hence, organizations need to harness the data received via complaints. However, because this data is enormous and keeps expanding and multiplying, managing it is not an easy task. To implement this difficult task of complaint management, websites were built but …


Motion Learning With Biomechanics Principles, Jing Sun Apr 2013

Master's Projects

This project takes advantage of both biomechanical analysis and Kinect motion capture to develop a sports improvement solution with coaching evaluation. It focuses on sample movement patterns for analysis of data quantity and quality. By combining these with professional biomechanical principles, it is able to implement real-time motion tracking, coaching, and evaluation during motion capture. We calculate some basic but important parameters from the captured motion data, such as the rotation and translation of body segments, and then analyze the motion flaws hidden behind them. Thus a deterministic model for a specific movement pattern can be constructed as a …


Modular Approach To Big Data Using Neural Networks, Animesh Dutta Apr 2013

Master's Projects

Machine learning can be used to recognize patterns, classify data into classes, and make predictions. Neural networks are one of the many machine learning tools capable of performing these tasks. The greatest challenge we face in dealing with the IBM Watson dataset is its high dimensionality, both in the number of features the data has and in the number of rows of data we are dealing with. The aim of the project is to identify a course of action that can be chosen when dealing with similar problems. The project aims at setting …


Predicting Product Review Helpfulness Using Machine Learning And Specialized Classification Models, Scott Bolter Apr 2013

Master's Projects

In this paper we focus on automatically classifying product reviews as either helpful or unhelpful using machine learning techniques, namely, SVM classifiers. Using LIBSVM and a set of Amazon product reviews from 25 product categories, we train models for each category to determine if a review will be helpful or unhelpful. Previous work has focused on training one classifier for all reviews in the data set, but we hypothesize that a distinct model for each of the 25 product types available in the review dataset will improve the accuracy of classification. Furthermore, we develop a framework to inform authors …


MongoDB Performance In The Cloud, Tudor Matei Apr 2013

Master's Projects

Web applications are growing at a staggering rate every day. As web applications become more complex, their data storage requirements tend to grow exponentially. Databases play an important role in the way web applications store their information. MongoDB is a document-store database that does not have the strict schemas that RDBMSs require and can grow horizontally without performance degradation. MongoDB opens possibilities for different storage scenarios and allows programmers to use the database as storage that fits their needs, not the other way around. Scaling MongoDB horizontally requires tens to hundreds of servers, making it very difficult …


Big Data Classification Using Decision Trees On The Cloud, Chinmay Bhawe Apr 2013

Master's Projects

This writing project addresses the use of machine learning on very large data sets on cloud servers. The project consists of two phases. The first is to develop a machine learning system that learns from the data provided by IBM for the “IBM Watson Great minds Challenge SJSU Pilot” competition and provides the best possible results on the evaluation data set, also provided by the IBM Watson team. This will serve as a basis for the second phase of the project, in which the objective is to move the machine learning system onto a cloud server, …


Cloud Storage Performance And Security Analysis With Hadoop And GridFTP, Wei-Li Liu Apr 2013

Master's Projects

Even though cloud servers have been around for a few years, most web hosts today have not yet converted to the cloud. If the purpose of a cloud server is to distribute and store files on the internet, FTP servers were doing so long before the cloud. An FTP server is sufficient to distribute content on the internet. Is it therefore worth shifting from an FTP server to a cloud server? Cloud storage providers claim high durability and availability for their users, and the ability to scale up storage space easily could save users a great deal of money. However, does it …


Recommendation System For News Reader, Shweta Athalye Apr 2013

Master's Projects

Recommendation systems help users find information and make decisions where they lack the knowledge required to judge a particular product. Also, the available information dataset can be huge, and recommendation systems help filter this data according to users' needs. Recommendation systems can be used in various ways to help their users sort information effectively. For a person who loves reading, this paper presents the research and implementation of a recommendation system for a NewsReader application on the Android platform. The NewsReader application proactively recommends news articles as per the reading habits of the user, recorded over a …


HTTP Attack Detection Using N-Gram Analysis, Adityaram Oza Apr 2013

Master's Projects

Previous research has shown that byte-level analysis of HTTP traffic offers a practical solution to the problem of network intrusion detection and traffic analysis. Such an approach does not require any knowledge of applications running on web servers or any pre-processing of incoming data. In this project, we apply three n-gram based techniques to the problem of HTTP attack detection. The goal of such techniques is to provide a first line of defense by filtering out the vast majority of benign HTTP traffic. We analyze our techniques in terms of accuracy of attack detection and performance. We show …
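
The abstract does not say which three n-gram techniques the project uses. The Python sketch below shows only the general idea of byte-level n-gram analysis: build a profile of n-grams seen in benign requests, then score a new request by the fraction of its n-grams that are missing from the profile. The scoring rule, the function names, and the sample requests are assumptions made for illustration.

```python
from collections import Counter

def byte_ngrams(payload: bytes, n: int = 3):
    """Return every overlapping n-byte window of the payload."""
    return [payload[i:i + n] for i in range(len(payload) - n + 1)]

def train_profile(benign_payloads, n: int = 3) -> Counter:
    """Count the n-grams that occur in known-benign traffic."""
    profile = Counter()
    for payload in benign_payloads:
        profile.update(byte_ngrams(payload, n))
    return profile

def anomaly_score(payload: bytes, profile: Counter, n: int = 3) -> float:
    """Fraction of the payload's n-grams never seen in benign traffic."""
    grams = byte_ngrams(payload, n)
    if not grams:
        return 0.0
    unseen = sum(1 for g in grams if g not in profile)
    return unseen / len(grams)

# Invented request lines used only to exercise the functions.
benign = [b"GET /index.html HTTP/1.1", b"GET /about.html HTTP/1.1"]
profile = train_profile(benign)
suspicious = b"GET /index.php?id=1' OR '1'='1 HTTP/1.1"
```

A request that matches the benign training data scores 0.0, while the injection-style request scores well above it. A filter could pass low-scoring traffic and hand the rest to a slower, more precise detector.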


Automated RTL Generator, Rohit Kulkarni Apr 2013

Master's Projects

Code generation is a vast topic and has been discussed and implemented for quite a while now. It has also been a topic of debate as to what an ideal code generator is and how one can be created. The biggest challenge while creating a code generator is to maintain a balance between the amount of freedom given to the user and the restrictions imposed on the code generated. These two seemed to be very conflicting requirements while designing the Automated RTL Code Generator. If the code generator tries to be rigid and sticks to well-defined …


Analysis Of Parallel Montgomery Multiplication In CUDA, Yuheng Liu Apr 2013

Master's Projects

For a given level of security, elliptic curve cryptography (ECC) offers improved efficiency over classic public key implementations. Point multiplication is the most common operation in ECC and, consequently, any significant improvement in performance will likely require accelerating point multiplication. In ECC, the Montgomery algorithm is widely used for point multiplication. The primary purpose of this project is to implement and analyze a parallel implementation of the Montgomery algorithm as it is used in ECC. Specifically, the performance of CPU-based Montgomery multiplication and a GPU-based implementation in CUDA are compared.
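
One common reading of "the Montgomery algorithm" for point multiplication is the Montgomery ladder, although the abstract does not say which variant the project implements. The sequential Python sketch below shows the ladder on a toy curve. The curve y^2 = x^3 + 2x + 3 over F_97, the base point (3, 6), and all names are assumptions chosen for illustration. The field is far too small for real cryptography, and the sketch says nothing about the CUDA implementation.

```python
P_MOD = 97              # toy prime field (illustrative only)
A_COEF, B_COEF = 2, 3   # curve: y^2 = x^3 + 2x + 3 over F_97

def point_add(p, q):
    """Add two affine points; None represents the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    x1, y1 = p
    x2, y2 = q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # p + (-p); also covers doubling a point with y = 0
    if p == q:
        # Tangent slope for doubling; inverse via Fermat's little theorem.
        slope = (3 * x1 * x1 + A_COEF) * pow((2 * y1) % P_MOD, P_MOD - 2, P_MOD)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, P_MOD - 2, P_MOD)
    slope %= P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def montgomery_ladder(k, point):
    """Compute k * point with one addition and one doubling per scalar bit."""
    r0, r1 = None, point  # invariant: r1 == r0 + point
    for bit in bin(k)[2:]:
        if bit == "1":
            r0, r1 = point_add(r0, r1), point_add(r1, r1)
        else:
            r0, r1 = point_add(r0, r0), point_add(r0, r1)
    return r0

G = (3, 6)  # lies on the toy curve: 6^2 = 36 = 27 + 6 + 3 (mod 97)
```

Every scalar bit triggers the same pair of operations, which is why the ladder is often preferred when a uniform operation pattern matters. This plain Python version is not constant-time.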


Mobile Presentation Of Unstructured Information, Shailesh Benake Apr 2013

Master's Projects

Since the advent of online education in 1994 with CALCampus, many improvements have been made to the effectiveness of e-learning. Video and audio conferencing, synchronous education systems, and many other advances in multimedia communication have made online education more popular among the masses. However, with many online education websites competing to offer the same course, it’s important for a user to find a course structure that matches his interests. What makes it even more challenging for a learner is deciding how good the learning from a course provided by a particular site will be. For example, open online course sites like edx.org, canvas.net, coursera.org etc …


Using Social Networks For Assessing Company Sales And Marketing Programs, Vance Tomchalk Apr 2013

Master's Projects

During the course of an extended sales period for a company’s given product line, there are many events that affect the success of its sales. Some of these events include economic downturns, unforeseen shortages and delays that affect the supply chain for the product, and product quality issues that change the perception of the product as a safe and cost-effective choice. In many instances, these events can be tracked by analyzing the signals and messaging present in social networking media. This analysis requires careful consideration, to which the metrics provided by software tools and algorithms lend considerable aid.


Semantic Search Over Encrypted Data In Cloud Computing, Kam Ho Ho Apr 2013

Master's Projects

Cloud storage has become increasingly popular because it provides various benefits over traditional storage solutions. Along with these benefits, many security problems arise in cloud storage, which prevent enterprises from migrating their data to it. These security problems induce data owners to encrypt all their sensitive data, such as social security numbers (SSNs), credit card information, and personal tax information, before it can be stored in cloud storage. The encryption approach may have strengthened the data security of cloud data, but it degrades the data efficiency because the …


Application Of Secretary Algorithm To Dynamic Load Balancing In User-Space On Multicore Systems, Kyoung-Hwan Yun Apr 2013

Master's Projects

In recent years, multicore processors have become prevalent in many types of systems and are now widely used even in commodity devices for a wide range of applications. Although multicore processors are clearly a popular hardware solution to problems that could not be solved with traditional single-core processors, taking advantage of them inevitably brings software challenges. As Amdahl’s law puts it, the performance gain is limited by the percentage of the software that cannot be run in parallel on multiple cores. Even when an application is “embarrassingly” parallelized by a careful design of algorithm and implementation, load balancing of …
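
Amdahl's law, as cited in the abstract above, can be made concrete with a small Python sketch. The function name and the sample fractions are illustrative assumptions. The formula is the standard one: speedup = 1 / ((1 - p) + p / n), for a parallelizable fraction p running on n cores.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only part of a program runs in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

# With 90% of the work parallelizable, extra cores help less and less:
# the speedup can never exceed 1 / 0.1 = 10, however many cores are added.
speedups = {n: amdahl_speedup(0.9, n) for n in (1, 2, 4, 8, 1024)}
```

The bound assumes the parallel part is split perfectly evenly across cores. Uneven load makes the real speedup lower still.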


Entropy And State Visualization For Automation Design And Evaluation Prototyping Toolset, Rohit Deshmukh Apr 2013

Master's Projects

Automation Design and Evaluation Prototyping Toolset (ADEPT) is a plugin developed on the Eclipse Rich Client Platform (RCP). ADEPT can be used by domain expert designers to create and modify testable prototypes. The aim of the project is to enhance ADEPT by adding dynamic visualizations to the ADEPT user interface. Three types of visualizations are implemented in this project. The Table view helps to show the hierarchy and nesting of Logic Tables. The State visualization displays all the states in a selected Logic Table. The Entropy visualization is a subset of the State visualization and displays a limited number of states having the lowest Entropy …


Cloud Services For An Android Based Home Security System, Karthik Challa Apr 2013

Master's Projects

This report describes in detail an Android-based application designed for a home security system. The home security system is a tablet device developed using the Android framework. It makes use of sensors and a central device to secure an area. Currently the devices are standalone and require users to be physically present to operate them, with no interaction possible between two different devices. The system is also limited by its computational resources and storage capacity. For this project, I have developed a cloud-based client-server architecture to address these limitations and also …


User Profiling In GUI Based Windows Systems For Intrusion Detection, Arshi Agrawal Apr 2013

Master's Projects

Intrusion detection is the process of identifying any unauthorized access to a system. This process inspects user behavior to identify any possible attack or intrusion. There are two types of intrusion detection systems (IDSs): signature-based IDS and anomaly-based IDS. This project concentrates on the anomaly-based intrusion detection technique. This technique is based on the deviation of an intruder’s actions from the authenticated user’s actions. Much previous research has focused on the deviation of command line input in UNIX systems. However, these techniques fail to detect attacks on modern GUI-based systems, where typical user activities include mouse movements and keystrokes. Our …