Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 30 of 108
Full-Text Articles in Physical Sciences and Mathematics
Pattern Discovery In Dna Using Stochastic Automata, Shweta Shweta
Master's Projects
We consider the problem of identifying similarities between the DNA of different species. To do this, we infer a stochastic finite automaton from given training data and compare it with test data. The training and test data consist of DNA sequences of different species. Our method first identifies sentences in DNA. To identify sentences, we read the DNA sequence one character at a time; three characters form a codon, and codons form proteins (also known as amino acid chains). Each amino acid in a protein belongs to a group. In total we have five groups: polar, non-polar, acidic, basic, and stop codons. …
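The codon-reading step described in this abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names are hypothetical, and ending a "sentence" at a stop codon is an assumption made for the example, since the abstract does not say how sentences are delimited. The three stop codons are those of the standard genetic code.

```python
# Stop codons of the standard genetic code, written in the DNA alphabet.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(sequence):
    """Split a DNA string into consecutive 3-character codons.

    Any trailing 1 or 2 characters that do not fill a codon are dropped.
    """
    seq = sequence.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def split_sentences(sequence):
    """Group codons into 'sentences'.

    Assumption for illustration only: a sentence ends at a stop codon.
    """
    sentences, current = [], []
    for codon in split_codons(sequence):
        current.append(codon)
        if codon in STOP_CODONS:
            sentences.append(current)
            current = []
    if current:  # keep a final sentence that has no stop codon
        sentences.append(current)
    return sentences
```

Mapping each codon to one of the five groups named in the abstract would need a codon-to-amino-acid table, which is left out here.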
Logistic Regression Models To Predict Solvent Accessible Residues Using Sequence- And Homology-Based Qualitative And Quantitative Descriptors Applied To A Domain-Complete X-Ray Structure Learning Set, Reecha Nepal, Joanna Spencer, Guneet Bhogal, Amulya Nedunuri, Thomas Poelman, Thejas Kamath, Edwin Chung, Katherine Kantardjieff, Andrea Gottlieb, Brooke Lustig
Faculty Publications, Chemistry
A working example of relative solvent accessibility (RSA) prediction for proteins is presented. Novel logistic regression models with various qualitative descriptors that include amino acid type and quantitative descriptors that include 20- and six-term sequence entropy have been built and validated. A domain-complete learning set of over 1300 proteins is used to fit initial models with various sequence homology descriptors as well as query residue qualitative descriptors. Homology descriptors are derived from BLASTp sequence alignments, whereas the RSA values are determined directly from the crystal structure. The logistic regression models are fitted using dichotomous responses indicating buried or accessible solvent, …
Cooling Atomic Gases With Disorder, Ehsan Khatami, Thereza Paiva, Shuxiang Yang, Valéry Rousseau, Mark Jarrell, Juana Moreno, Randall Hulet, Richard Scalettar
Faculty Publications
Cold atomic gases have proven capable of emulating a number of fundamental condensed matter phenomena including Bose-Einstein condensation, the Mott transition, Fulde-Ferrell-Larkin-Ovchinnikov pairing, and the quantum Hall effect. Cooling to a low enough temperature to explore magnetism and exotic superconductivity in lattices of fermionic atoms remains a challenge. We propose a method to produce a low temperature gas by preparing it in a disordered potential and following a constant entropy trajectory to deliver the gas into a nondisordered state which exhibits these incompletely understood phases. We show, using quantum Monte Carlo simulations, that we can approach the Néel temperature of …
University Scholar Series: Michael Kaufman, Michael Kaufman
University Scholar Series
H2O in Interstellar Space: How the Universe Conspires to Make Water Everywhere
On October 28, 2015, Dr. Michael Kaufman spoke in the University Scholar Series hosted by Provost Andy Feinstein at the Dr. Martin Luther King, Jr. Library. His talk was titled “H2O in Interstellar Space: How the Universe Conspires to Make Water, Water Everywhere.” Dr. Kaufman's astrophysics research focuses on the interactions and feedback between newly formed stars and the interstellar medium—the raw material from which stars form. He constructs computational models of the radiative transfer, dynamics and chemistry that occur in regions of active star formation, …
Topography And Tropical Cyclone Structure Influence On Eyewall Evolution In Typhoon Sinlaku (2008), Cheng-Hsiang Chih, Kun-Hsuan Chou, Sen Chiao
Faculty Publications, Meteorology and Climate Science
Typhoon Sinlaku (2008) was a tropical system that affected many countries in East Asia. Besides the loss of life and economic damage, many scientific questions are associated with this system that need to be addressed. A series of numerical simulations were conducted in this study using V3.2 of the advanced research version of the Weather Research and Forecasting (WRF-ARW) model to examine the impacts of different terrain conditions and vortex structures on the eyewall evolution when Sinlaku was crossing Taiwan. The sensitivity experiments using different vortex structures show that a storm of the same intensity with a larger eyewall radius …
Implementing Type Inference In Jedi, Vaibhav Kamble
Master's Projects
This thesis begins with an overview of type systems: evolution, concepts, and problems. This survey is based on the type systems of modern languages like Scala and Haskell. Scala has a very sophisticated type system that includes generics, polymorphism, and closures. It has a built-in type inference mechanism that enables the programmer to omit certain type annotations. It is often not required in Scala to mention the type of a variable because the compiler can infer the type from the initialization of the variable. The study of such a type system is demonstrated by the implementation of the type system. A type system …
Extensible Authentication Protocol Vulnerabilities And Improvements, Akshay Baheti
Master's Projects
Extensible Authentication Protocol (EAP) is a widely used security protocol for wireless networks around the world. The project examines different security issues with the EAP-based protocols, the family of security protocols for wireless LAN. The project discovers an attack on the subscriber identity module (SIM) based extension of EAP. The attack is a Denial-of-Service attack that exploits the error handling mechanism in EAP protocols. The project further proposes countermeasures for detection and a defense against the discovered attack. The discovered attack can be prevented by changing the protocol to delay the processing of protocol error messages.
Comparative Analysis Of Two Clustering Algorithms: K-Means And Fsdp (Fast Search And Find Of Density Peaks), Li Miao
Master's Projects
With the overwhelming amount of data pouring into our lives, obtaining meaningful information from it is becoming essential. How can people mine for "gold" in this area? Or, what tools can they use to do that? Clustering has proved to be one of the best tools. In this project, two clustering algorithms are studied and numerically compared on various data sets. The first is K-means clustering, which starts with roughly guessed initial clusters, tries to classify data points into clusters, and repeats iteratively until it converges. The second algorithm is called Fast …
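The K-means loop outlined in this abstract (guess initial clusters, assign points, repeat until nothing changes) can be sketched in plain Python. This is a minimal illustration of the standard Lloyd iteration, not the project's code; the function name, the caller-supplied starting centroids, and the `max_iters` cap are assumptions made for the example.

```python
def kmeans(points, centroids, max_iters=100):
    """Run the standard K-means iteration on tuples of numbers.

    `centroids` holds the rough initial guesses; `max_iters` is a safety cap.
    Returns the final centroids and the cluster index of each point.
    """
    centroids = [tuple(c) for c in centroids]
    assignment = None
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        new_assignment = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignment
```

For example, `kmeans([(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)], [(0.0, 0.0), (10.0, 10.0)])` separates the two pairs of nearby points. The result depends on the initial guesses, which is one reason K-means is commonly compared against alternatives.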
Cryptanalysis Of The Purple Cipher Using Random Restarts, Aparna Shikhare
Master's Projects
Cryptanalysis is the process of analyzing ciphers, ciphertext, and cryptosystems to exploit any loopholes or weaknesses in the systems, leading to an understanding of the key used to encrypt the data. This project uses an Expectation Maximization (EM) approach with numerous restarts to attack decipherment problems such as the Purple cipher. In this research, we perform cryptanalysis of the Purple cipher using genetic algorithms and hidden Markov models (HMM). If the Purple cipher has a fixed plugboard, we show that genetic algorithms are successful in retrieving the plaintext from ciphertext with high accuracy. On …
Metamorphic Java Engine, Sailee Choudhary
Master's Projects
Malware is a software program designed to damage or perform other unwanted actions on a computer system. Metamorphic malware is a category of malicious software that has the ability to change its code as it propagates. A hidden Markov model (HMM) is a statistical model where the system is assumed to be a Markov process with unseen states. An HMM uses statistics to detect patterns, and is hence useful in metamorphic virus detection. Previous work has created morphing engines using the LLVM bytecode format. This project includes the creation of a morphing engine …
Improving The Accuracy And Robustness Of Self-Tuning Histograms By Subspace Clustering, Sai Kiran Padooru
Master's Projects
Self-tuning histograms are a popular type of histogram, as they allow the use of multidimensional datasets. Their main advantage is a low computational cost due to their capacity to understand the dataset. They also offer a better approach because they stay up to date and adapt to query patterns. For these reasons, many researchers have worked on improving the accuracy of this type of histogram, which has led to the use of subspace clustering methods as initialization values. Following this approach in this study, a self-tuning histogram code was developed with the …
Pattern-Aided Regression Modelling And Prediction Model Analysis, Naresh Avva
Master's Projects
In this research, we develop an application for generating a pattern aided regression (PXR) model, a new type of regression model designed to be an accurate and interpretable prediction model. Our goal is to generate a PXR model using the Contrast Pattern Aided Regression (CPXR) method and compare it with the multiple linear regression method. The PXR models built by CPXR are very accurate in general, often outperforming state-of-the-art regression methods by big margins. CPXR is especially effective for high-dimensional data. We use pruning to improve the classification accuracy and to remove outliers from the dataset. We provide implementation details and give …
Question Answering System For Yioop, Niravkumar Patel
Master's Projects
Yioop is an open source search engine developed and managed by Dr. Christopher Pollett. Currently, Yioop returns the search results of a query in the form of a list of URLs, just like other search engines (Google, Bing, DuckDuckGo, etc.). This project created a new module for Yioop. This new module, known as the Question-Answering (QA) System, takes search queries in the form of natural language questions and returns results in the form of a short answer that is appropriate to the question asked. This feature is achieved by implementing various functionalities of Natural Language Processing (NLP). By using NLP, …
Function Call Graph Score For Malware Detection, Deebiga Rajeswaran
Master's Projects
Metamorphic malware changes its internal structure with each infection, while maintaining its core functionality. Detecting such malware is a challenging research problem. Function call graph analysis has previously shown promise in detecting such malware. In this research, we analyze the robustness of a function call graph score with respect to various code morphing strategies. We also consider modifications of the score that make it more robust in the face of such morphing.
Predicting 'Attention Deficit Hyperactive Disorder' Using Large Scale Child Data Set, Arpi Shah
Master's Projects
Attention deficit hyperactivity disorder (ADHD) is a disorder found in children affecting about 9.5% of American children aged 13 years or more. Every year, the number of children diagnosed with ADHD is increasing. There is no single test that can diagnose ADHD. Instead, a health practitioner has to analyze the behavior of the child to determine if the child has ADHD. The practitioner has to gather information about the child and the child's behavior and environment. Because of all these problems in diagnosis, I propose to use Machine Learning techniques to predict ADHD by using a large-scale child data set. Machine …
Predicting Autism Over Large-Scale Child Dataset, Arpit Arya
Master's Projects
Data analytics and machine learning in healthcare are among the fastest-emerging and most needed fields at the current time. A lot of research has been performed and is still being done in this field. In healthcare, gone are the days when the doctor only examines and the patient only listens. Doctors now have many technologies that can assist them in accurately diagnosing the disease from which a patient is suffering. The backbone of such technologies is data analytics and machine learning, where we can draw many inferences from the large volume of patients' data already available. This …
Energy Efficiency And Quality Of Services In Virtualized Cloud Radio Access Network, Khushbu Mohta
Master's Projects
Cloud Radio Access Network (C-RAN) is being widely studied for soft and green fifth generation of Long Term Evolution - Advanced (LTE-A). Recent advances in network function virtualization (NFV) and software defined radio (SDR) have enabled virtualization of Baseband Units (BBU) and sharing of the underlying general purpose processing (GPP) infrastructure. Also, new innovations in optical transport network (OTN) such as Dark Fiber provide low latency and high bandwidth channels that can support C-RAN over a radius of more than forty kilometers. All these advancements make C-RAN feasible and practical. Several virtualization strategies and architectures are proposed for C-RAN and it has …
Designing A Programming Contract Library For Java, Neha Rajkumar
Master's Projects
Programmers are now developing large and complex software systems, so it’s important to have software that is consistent, efficient, and robust. Programming contracts allow developers to specify preconditions, postconditions, and invariants in order to more easily identify programming errors. The design by contract principle [1] was first used in the Eiffel programming language [2], and has since been extended to libraries in many other languages. The purpose of my project is to design a programming contract library for Java. The library supports a set of preconditions, postconditions, and invariants that are specified in Java annotations. It incorporates contract checking for …
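The idea of checking preconditions and postconditions around a call can be sketched briefly. The sketch below is in Python rather than Java and is not the library described in this abstract: the `contract` decorator, its `pre` and `post` parameters, and the `integer_sqrt` example are all hypothetical names chosen for illustration, and invariants are not covered.

```python
import functools


def contract(pre=None, post=None):
    """Wrap a function with an optional precondition and postcondition.

    `pre` receives the call's arguments; `post` receives the return value.
    Both are hypothetical parameters of this sketch.
    """
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Check the caller's obligation before running the body.
            if pre is not None and not pre(*args, **kwargs):
                raise AssertionError(f"precondition of {func.__name__} violated")
            result = func(*args, **kwargs)
            # Check the function's own guarantee after it returns.
            if post is not None and not post(result):
                raise AssertionError(f"postcondition of {func.__name__} violated")
            return result
        return wrapper
    return decorate


@contract(pre=lambda x: x >= 0, post=lambda r: r >= 0)
def integer_sqrt(x):
    """Return the largest integer r with r * r <= x."""
    r = 0
    while (r + 1) * (r + 1) <= x:
        r += 1
    return r
```

Calling `integer_sqrt(-1)` fails at the precondition, which points to the caller as the source of the error rather than to the function body.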
Bitfed, A Centralized Cryptocurrency With Distributed Miners, Shruti Sharma
Master's Projects
Bitcoin is a decentralized peer-to-peer electronic currency wherein all payments are sent directly from one transactor to another [1]. Financial institutions are not present in the protocol; hence, there are lower processing fees. The distributed nature provides resilience to Bitcoin transactions, and it operates on mathematical principles and cryptographic proofs. As per the Bitcoin generation algorithm, the number of bitcoins in existence will never surpass 21 million, which will lead to deflation and encourage hoarding. In this project, we have implemented a Bitcoin-like currency in order to mitigate the issue of deflation [7]. The idea for our protocol is based …
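The 21 million cap follows from a halving block reward. The arithmetic can be checked with a short sketch, assuming Bitcoin's published parameters, which the abstract itself does not state: an initial reward of 50 bitcoins per block, a halving every 210,000 blocks, and amounts counted in whole satoshis (1 bitcoin = 100,000,000 satoshis). The function name is hypothetical.

```python
SATOSHIS_PER_BITCOIN = 10**8


def total_supply_satoshis(initial_reward=50 * SATOSHIS_PER_BITCOIN,
                          halving_interval=210_000):
    """Sum the block rewards over all halving eras, in satoshis.

    The reward is halved with integer division, so it eventually reaches
    zero and the sum is finite.
    """
    total = 0
    reward = initial_reward
    while reward > 0:
        total += reward * halving_interval
        reward //= 2  # rewards are whole satoshis, so the remainder is dropped
    return total
```

Under these assumptions the total is 2,099,999,997,690,000 satoshis, slightly under 21 million bitcoins, because the rewards form a geometric series with ratio 1/2 that is truncated by integer rounding.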
Sharedwealth: A Cryptocurrency To Reward Miners Evenly, Siddiq Ahmed Syed
Master's Projects
Bitcoin [19] is a decentralized cryptocurrency that has recently gained popularity and has emerged as a popular medium of exchange. The total market capitalization is around 1.5 billion US dollars as of October 2013 [28]. All the operations of Bitcoin are maintained in a distributed public global ledger known as a block chain which consists of all the successful transactions that have ever taken place. The security of a block chain is maintained by a chain of cryptographic puzzles solved by participants called miners, who in return are rewarded with bitcoins. To be successful, the miner has to put in …
Neural Network Captcha Cracker, Geetika Garg
Master's Projects
A CAPTCHA (acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart") is a type of challenge-response test used to determine whether or not a user providing the response is human. In this project, we used a deep neural network framework for CAPTCHA recognition. The core idea of the project is to learn a model that breaks image-based CAPTCHAs. We used convolutional neural networks and recurrent neural networks instead of the conventional methods of CAPTCHA breaking based on segmenting and recognizing a CAPTCHA. Our models consist of two convolutional layers to learn image …
Interactive Phishing Filter, Rushikesh Joshi
Master's Projects
Phishing is one of the prevalent techniques used by attackers to breach security and steal private and confidential information. It has compromised millions of users’ data. Blacklisting websites and heuristic-based methods are common approaches to detect a phishing website. The blacklist method suffers from a window of vulnerability. Many heuristics were proposed in the past. Some of them have better accuracy but lower performance. A phishing filter should have both better accuracy and better performance. It should be able to detect fresh phishing websites. Jo et al. [2] present a list of attributes of the web page to find the disparity …
Metamorphic Code Generator Based On Bytecode Of Llvm Ir, Arjun Shah
Master's Projects
Metamorphic software is known for changing the internal structure of its code while keeping the functionality the same. Many malware writers have used metamorphism as a means to escape signature detection along with some advanced detection techniques. On the other hand, code morphing increases the diversity of software, which is considered to be a potential security advantage. In our paper, we have developed a metamorphic code generator based on the LLVM framework. The architecture of LLVM has a three-phase compiler design which includes the front end, the optimizer and the back end. It also gives assistance …
Pattern-Driven Programming In Scala, Huaxin Pang
Master's Projects
This is an experimental exploration of the pattern-driven programming paradigm: the sole use of pattern matching to determine the next instruction to execute. We define a pure pattern-driven programming language named PA-Scala by defining a subset of the Scala programming language, which restricts sequence control to the powerful pattern matching facilities in Scala. We use PA-Scala to explore the strengths and limitations of pattern-driven programming. By implementing a phrase structure grammar solver in PA-Scala, we show that pattern-driven programming can be used to solve general computation problems. We then implement a Prolog interpreter in PA-Scala, which demonstrates how resolution and unification …
Scalable Techniques For Similarity Search, Siddartha Reddy Nagireddy
Master's Projects
Document similarity is similar to the nearest neighbour problem and has applications in various domains. To determine the similarity or dissimilarity of documents, they first need to be converted into sets of shingles. Each document is converted into k-shingles, k being the length of each shingle. The similarity is calculated using the Jaccard distance between sets and output into a characteristic matrix; the cost of parsing this matrix is high, especially when the sets are large. In this project we explore various approaches such as Min hashing, LSH & Bloom Filter to decrease the matrix size and …
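The shingling and Jaccard steps described in this abstract, together with a small MinHash signature, can be sketched as follows. This is a minimal illustration and not the project's code: the function names are hypothetical, character-level shingles are an assumption, and seeded SHA-1 digests stand in for the random hash functions that MinHash requires. The Jaccard distance mentioned in the abstract is 1 minus the similarity computed here.

```python
import hashlib


def shingles(text, k):
    """Return the set of all length-k character substrings (k-shingles)."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def minhash_signature(items, num_hashes=64):
    """Build a MinHash signature for a non-empty set of strings.

    Each seed defines one hash function; the signature keeps the minimum
    hash value seen under each of them.
    """
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(hashlib.sha1(f"{seed}:{item}".encode()).digest()[:8], "big")
            for item in items
        ))
    return signature


def estimate_similarity(sig_a, sig_b):
    """Fraction of signature positions that agree; estimates Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)
```

Comparing two fixed-length signatures costs the same regardless of how many shingles the documents contain, which is the reason for replacing the full characteristic matrix with signatures.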
Recommendation System Using Collaborative Filtering, Yunkyoung Lee
Master's Projects
Collaborative filtering is one of the best-known and most extensively used techniques in recommendation systems. Its basic idea is to predict which items a user would be interested in based on their preferences. Recommendation systems using collaborative filtering are able to provide an accurate prediction when enough data is provided, because this technique is based on the user’s preference. User-based collaborative filtering has been very successful in the past in predicting the customer’s behavior, the most important part of the recommendation system. However, their widespread use has revealed some real challenges, such as data sparsity and data scalability, with …
Ssct Score For Malware Detection, Srividhya Srinivasan
Master's Projects
Metamorphic malware transforms its internal structure when it propagates, making detection of such malware a challenging research problem. Previous research considered a score based on simple substitution cryptanalysis, which was applied to the metamorphic detection problem. In this research, we analyze a new score based on a combined simple substitution and column transposition (SSCT) cryptanalysis. We show that this SSCT score significantly outperforms the simple substitution score, and other malware detection scores, in many cases.
On-The-Fly Map Generator For Openstreetmap Data Using Webgl, Sreenidhi Pundi Muralidharan
Master's Projects
This project describes an approach to creating an on-the-fly map generator for OpenStreetMap data using WebGL. The most common methods of generating online maps produce PNG overlay tile images from a wide range of data sources, such as GeoJSON, GeoTIFF, PostGIS, CSV, and SQLite, based on the coordinates and zoom level. This project aims to send vector data for the map to the browser and hence render maps on the fly using WebGL. We push all of the vector computation to the GPU. This means that less data needs to be sent to the browser. We have compared existing approaches to our method …
Author Recognition Using Locality Sensitive Hashing & Alergia (Stochastic Finite Automata), Prashanth Sandela
Master's Projects
In today’s world data grows very fast. It is difficult to answer questions like: 1) Is the content completely written by this author? 2) Did he take a few sentences or pages from another author? 3) Is there any way to identify the actual author? There are many plagiarism software packages available on the market that identify duplicate content. They do not understand the writing pattern involved. There is always a need to make an effort to find the original author. Locality sensitive hashing is one such standard for applying hashing to recognize an author's writing pattern.
Clustering Web Concepts Using Algebraic Topology, Harleen Kaur Ahuja
Master's Projects
In this world of the Internet, there is rapid growth in data in terms of both size and dimension. It consists of web pages that represent human thoughts. These thoughts involve concepts and associations which we can capture. Using mathematics, we can perform meaningful clustering of these pages. This project aims at providing a new problem solving paradigm known as algebraic topology in data science. Professor Vasant Dhar, Editor-In-Chief of Big Data (Professor at NYU), defines data science as a generalizable extraction of knowledge from data. The core concept of semantic based search engine project developed by my …