Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 25 of 25
Full-Text Articles in Physical Sciences and Mathematics
Reading Pdfs Using Adversarially Trained Convolutional Neural Network Based Optical Character Recognition, Michael B. Brewer, Michael Catalano, Yat Leung, David Stroud
SMU Data Science Review
A common problem that has plagued companies for years is digitizing documents and making use of the data contained within. Optical Character Recognition (OCR) technology has flooded the market, but companies still face challenges productionizing these solutions at scale. Although these technologies can identify and recognize the text on the page, they fail to classify the data to the appropriate datatype in an automated system that uses OCR technology as its data mining process. The research contained in this paper presents a novel framework for the identification of datapoints on check stub images by utilizing generative adversarial networks (GANs) to …
Applying The Data: Predictive Analytics In Sport, Anthony Teeter, Margo Bergman
Access*: Interdisciplinary Journal of Student Research and Scholarship
The history of wagering predictions and their impact on wide-reaching disciplines such as statistics and economics dates to at least the 1700s, if not before. Predicting the outcomes of sports is a multibillion-dollar business that capitalizes on these tools but is in constant development with the addition of big data analytics methods. Sportsline.com, a popular website for fantasy sports leagues, provides odds predictions in multiple sports, produces proprietary computer models of both winning and losing teams, and provides specific point estimates. To test likely candidates for inclusion in these prediction algorithms, the authors developed a computer model, and test …
An Analysis Of Technological Components In Relation To Privacy In A Smart City, Kayla Rutherford, Ben Lands, A. J. Stiles
James Madison Undergraduate Research Journal (JMURJ)
A smart city is an interconnection of technological components that store, process, and wirelessly transmit information to enhance the efficiency of applications and the individuals who use those applications. Over the course of the 21st century, it is expected that an overwhelming majority of the world’s population will live in urban areas and that the number of wireless devices will increase. The resulting increase in wireless data transmission means that the privacy of data will be increasingly at risk. This paper uses a holistic problem-solving approach to evaluate the security challenges posed by the technological components that make up a …
Cash Flow Forecasting Using Probabilistic Neural Networks, Marwan Ashour
Journal of the Arab American University مجلة الجامعة العربية الامريكية للبحوث
This paper compares modern methods of cash flow forecasting with traditional ones; specifically, the researcher compared Probabilistic Neural Networks with the Transfer Function. Cash flow forecasting is now very important because it helps upper management plan, control, assess performance, and make decisions. In this paper, artificial neural networks were used to diagnose the nature of the cash flow for the next period and then to forecast the cash flow. The experiment was conducted at the General Company for Electricity Distribution in Baghdad. …
Fall 2020
In The Loop
Studio CDM Documents Remote Initiatives; "Tom of Your Life" Film Release; Animation Jam Goes Virtual; DePaul Experimental Film Showcase 2020; Trackmania Soundtrack; Alumni Games at Pixel Pop; Alumnus Commemorates St. Vincent de Paul; Cybersecurity Champion Alina Kuzmenkova; Walking the Walk: Youth programs at CDM express DePaul’s Vincentian values; Fair Treatment: Three initiatives address racial inequity in health care; They've Got You Covered: A School of Design instructor leads a cottage industry of makers protecting essential workers from the novel coronavirus; Meet Would-Be Hot Topic Influencer Vera Drew; Data Detectives: CDM helps Chicago track the racial proportions of its COVID-19 cases
Extraction D’Information À Partir Des Sites Web En Arabe Basée Sur Une Méthode À Base Des Règles, Moustafa Alhajj, Amani Sabra
Al Jinan الجنان
This article describes a tool that uses language engineering to extract information from Arabic-language websites. This information will help Web archivists create archival records for the sites. An archival record is proposed, with the aim of filling in this record automatically. For the recognition and classification of text segments, the contextual exploration method proposed by Desclés is used; the linguistic markers and rules are defined on the basis of a synthetic study of the specific features of the Arabic language. A corpus of more than 1,300 Arabic-language websites has …
Data Is Personal: We Should Treat It As Such, Kaleb Dunn
Student Papers in Public Policy
The rise of the internet as a fact of daily life is the defining element of the modern age. Widespread use of the internet has fundamentally altered entire industries, and much of American life has migrated online. Dating is augmented by online dating; shopping by online shopping; television by internet streaming.
The digitization of American life has brought with it considerable benefits, including great convenience and innumerable efficiencies, but it has not come without a cost. Although there are many business models used by internet companies, many of the now-largest companies in the world have converged on one entity upon …
Implement Multi-Factor Authentication On All Federal Systems Now, Megan Walsh
Student Papers in Public Policy
The White House Office of Management and Budget recorded 31,107 information security incidents in fiscal year 2018. The most common attacks to gain access to a user’s login credentials were e-mail/phishing, web-based attack, and brute force entering of username/password combinations. Given this high number of incidents, strong reliance on computers for everyday business, and common attacks that target passwords, information security should be a priority for information technology administrators working in federal agencies.
Removing Racially Biased Algorithms In Policing, Andie Lee
Student Papers in Public Policy
Local police departments use algorithm-based programs to do police work and predict crime. Technology has created the police tactic of predictive crime prevention. Police work, however, requires social skills, assessment of the environment, and most importantly human interaction. Automated policing lacks these characteristics. Moreover, the algorithms used to make crime predictions and risk assessments have disproportionately affected minorities.
The Case For Online Ranked-Choice Voting, Rayyan Khan
Student Papers in Public Policy
Maine was the first to embrace ranked-choice voting on a statewide level in 2018, using it for all state and general elections. Maine voters will be the first to use ranked-choice voting in a presidential election in 2020. This system differs from traditional voting in that voters rank candidates rather than choose just one. Supporters of ranked-choice voting tout it as a better model for accurately representing the values of the voting population; however, a study conducted in San Francisco details a potential shortfall referred to as “ballot fatigue” that the theoretically-ideal system may face as it struggles to deal …
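The ranking mechanism mentioned in this abstract can be illustrated with a minimal instant-runoff tally. This is a sketch under stated assumptions, not the procedure used in Maine or described in the paper: the function name `instant_runoff`, the alphabetical tie-break, and the count of ballots left with no remaining ranked candidate are all hypothetical choices made for illustration.

```python
from collections import Counter


def instant_runoff(ballots):
    """Return (winner, exhausted_count) for ballots given as ranked lists.

    Assumption: ties for last place are broken alphabetically.
    """
    candidates = {c for ballot in ballots for c in ballot}
    eliminated = set()
    while True:
        tally = Counter()
        exhausted = 0
        for ballot in ballots:
            # Count each ballot for its highest-ranked candidate still in the race.
            top = next((c for c in ballot if c not in eliminated), None)
            if top is None:
                exhausted += 1  # every ranked candidate has been eliminated
            else:
                tally[top] += 1
        if not tally:
            return None, exhausted
        active = sum(tally.values())
        leader, votes = tally.most_common(1)[0]
        if votes * 2 > active:
            return leader, exhausted  # majority of the ballots still counting
        # Otherwise drop the candidate with the fewest votes and recount.
        remaining = candidates - eliminated
        loser = min(remaining, key=lambda c: (tally[c], c))
        eliminated.add(loser)


ballots = (
    [["A", "B"]] * 4
    + [["B", "C"]] * 3
    + [["C", "B"]] * 2
    + [["D"]]
)
winner, exhausted = instant_runoff(ballots)
print(winner, exhausted)
```

In this toy election no candidate has a first-round majority, so the lowest-ranked candidates are removed one at a time and their ballots transfer to the next ranked choice.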
Topic Modeling To Understand Technology Talent, Chad Madding, Allen Ansari, Chris Ballenger, Aswini Thota
SMU Data Science Review
Attracting technology talent in today’s hiring climate is more complicated than ever. Recruiting for technology talent in non-technology industries is even more challenging. This intense hiring landscape is motivating companies not only to attract the right talent but also to create a culture that can retain and grow that talent. In this paper, we developed algorithms and present insights that use data provided in reviews to glean information employers can use to address or even change their priorities to meet the demands of an ever-changing job market. The core of our research is to investigate and attribute the role of …
Cover Song Identification - A Novel Stem-Based Approach To Improve Song-To-Song Similarity Measurements, Lavonnia Newman, Dhyan Shah, Chandler Vaughn, Faizan Javed
SMU Data Science Review
Music is incorporated into our daily lives, whether intentionally or unintentionally. It evokes responses and behavior, so much so that there is an entire study dedicated to the psychology of music. Music creates the mood for dancing, exercising, creative thought or even relaxation. It is a powerful tool that can be used in various venues and through advertisements to influence and guide human reactions. Music is also often "borrowed" in the industry today. The practices of sampling and remixing music in the digital age have made cover song identification an active area of research. While most of this research is focused …
Time Series Analysis Of Offshore Buoy Light Detection And Ranging (Lidar) Windspeed Data, Aditya Garapati, Charles J. Henderson, Carl Walenciak, Brian T. Waite
SMU Data Science Review
In this paper, modeling techniques for the forecasting of wind speed using historical values observed by Light Detection and Ranging (LIDAR) sensors in an offshore context are described. Both univariate time series and multivariate time series modeling techniques leveraging meteorological data collected simultaneously with the LIDAR data are evaluated for potential contributions to predictive ability. Accurate and timely ability to predict wind values is essential to the effective integration of wind power into existing power grid systems. It allows for both the management of rapid ramp-up / down of base production capacity due to highly variable wind power inputs and …
Toxic Language Detection Using Robust Filters, Deepti Kunupudi, Shantanu Godbole, Pankaj Kumar, Suhas Pai
SMU Data Science Review
Social networks sometimes become a medium for threats, insults, and other types of cyberbullying. A large number of people are involved in online social networks. Hence, the protection of network users from anti-social behavior is a critical activity [19]. One of the significant tasks of such activity is the detection of toxic language. Abusive or toxic language in user-generated online content has become an issue of increasing importance in recent years. Most current commercial methods use blacklists and regular expressions; however, these measures fall short when contending with more subtle, lesser-known examples of hate speech, profanity, or swearing [6]. Abusive language classification has …
Reducing Age Bias In Machine Learning: An Algorithmic Approach, Adriana Solange Garcia De Alford, Steven K. Hayden, Nicole Wittlin, Amy Atwood
SMU Data Science Review
In this paper, we study the prevalence of bias in machine learning; we explore the life cycle phases where bias is potentially introduced into a machine learning model; and lastly, we present how adversarial learning can be leveraged to measure unwanted bias and unfair behavior from a machine learning algorithm. This study focuses particularly on the topics of age bias in predicting employee attrition and presents a practical approach for how adversarial learning can be successful in mitigating age bias. To measure bias, we calculate group fairness metrics across five-year age groups and evaluate fairness between a baseline predictive model …
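To make the idea of a group fairness metric over five-year age groups concrete, here is a minimal sketch. The abstract does not say which metrics the authors compute, so the choice of a demographic-parity gap (the spread in positive-prediction rates between groups) is an assumption, as are the helper names `age_group`, `positive_rate_by_group`, and `demographic_parity_gap` and the toy data.

```python
from collections import defaultdict


def age_group(age, width=5):
    """Map an age to a label such as '30-34' (five-year bins by default)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def positive_rate_by_group(ages, predictions):
    """Share of positive (1) predictions within each age group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for age, pred in zip(ages, predictions):
        group = age_group(age)
        totals[group] += 1
        positives[group] += pred
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_gap(rates):
    """Largest difference in positive rate between any two groups (assumed metric)."""
    return max(rates.values()) - min(rates.values())


# Toy data: 1 means the model predicts the employee will leave.
ages = [22, 24, 31, 33, 34, 58]
predictions = [1, 0, 0, 0, 1, 1]

rates = positive_rate_by_group(ages, predictions)
gap = demographic_parity_gap(rates)
print(rates, gap)
```

A gap near zero would mean the model flags attrition at similar rates across age groups; comparing the gap for two models is one simple way to see whether a mitigation step changed anything.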
Forecasting Spare Parts Sporadic Demand Using Traditional Methods And Machine Learning - A Comparative Study, Bhuvana Adur Kannan, Ganesh Kodi, Oscar Padilla, Dough Gray, Barry C. Smith
SMU Data Science Review
Sporadic demand presents a particular challenge to traditional time forecasting methods. In the past 50 years, there have been developments, such as the Croston Model [3], that have improved forecast performance. With the rise of Machine Learning (ML), there is abundant research on applying ML algorithms to predict sporadic demand [8][12][9]. However, most existing research has analyzed this problem from the demand side [17]. In this paper, we tackle this predictive analytics challenge from the supply side. We perform a comparative analysis utilizing a spare parts demand dataset from an Original Equipment Manufacturer (OEM). Since traditional measurements …
Floor Regularization And Investigation Of Transfer Learning Through Sharing Of Probability Distribution Parameters, Daniel Byrne, Stacey Smith, Joanna Duran, John Santerre
SMU Data Science Review
In this work we introduce a simple new regularization technique, aptly named Floor, which drops low weight connections on every forward pass whenever they fall below a specified event horizon threshold. We compare regular Dropout and Floor side by side on identical network architectures. We report similar or improved regularization with the Floor algorithm, whether used instead of regular Dropout or in concert with it.
In this paper we also describe our research into transfer learning by sharing of probability distribution parameters in which we investigated methods of transferring Gaussian prior parameters derived from the …
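The Floor idea described in this abstract (dropping low-weight connections on each forward pass) might be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the abstract does not say whether the threshold applies to the absolute value of a weight or whether the stored weights are modified, so this sketch compares absolute values and leaves the stored weights untouched. The names `floor_mask` and `forward` are hypothetical.

```python
def floor_mask(weights, threshold):
    """Zero every connection whose absolute weight is below the threshold.

    Assumption: 'low weight' is judged by absolute value.
    """
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def forward(inputs, weights, threshold):
    """One dense-layer forward pass (no bias or activation) with the mask applied.

    The mask is recomputed on every call, so a connection is dropped only
    while its weight stays below the threshold.
    """
    masked = floor_mask(weights, threshold)
    return [sum(x * w for x, w in zip(inputs, row)) for row in masked]


# Each row holds the incoming weights of one output unit.
weights = [
    [0.5, -0.01, 0.2],
    [0.03, -0.8, 0.0],
]
inputs = [1.0, 2.0, 3.0]

outputs = forward(inputs, weights, threshold=0.05)
print(outputs)
```

Unlike Dropout, which removes units at random, this masking is deterministic given the current weights, which is one plausible reading of why the abstract compares the two on identical architectures.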
The Transcript Profile Changes With Developmental Maturation Of Fetal Lung Type 2 Cells: An Analysis Of Rnaseq Data, Heber C. Nielsen, Volodymyr Orlov, Rebecca Holsapple, Monnie Mcgee
SMU Data Science Review
In this paper, we utilize next-generation sequencing (NGS) data from the LungMap project to identify and characterize the developmental RNA transcriptome in alveolar epithelial type II cells of embryonic mouse lungs of gestational ages embryonic days 16 (E16) and 18 (E18). Late gestation lung cellular maturation is necessary for survival at birth. Using R and the BioConductor packages for RNAseq analysis, we analyze changes in the mouse lung RNA transcriptome as this maturation process takes place. We particularly identify the cluster of genes whose expression changes markedly between immature (E16) and mature (E18) lungs which can be used to define …
Forecasting Power Consumption In Pennsylvania During The Covid-19 Pandemic: A Sarimax Model With External Covid-19 And Unemployment Variables, Jackson Au, Javier Saldaña Jr., Ben Spanswick, John Santerre
SMU Data Science Review
In this paper, we present how electrical consumption can reveal insight into the novel COVID-19 pandemic spread. We analyze electrical power consumption provided by PPL Electric Utilities, Department of Labor’s unemployment claims, and the COVID-19 cases/deaths for the State of Pennsylvania to study the impact of the pandemic on the infrastructure. Using a SARIMA model as our benchmark, we analyzed the use of a SARIMAX model to forecast the power consumption in Pennsylvania 14 days ahead. Our work quantifies and illuminates the effect that the strict legislation passed to minimize the spread of COVID-19 had on power consumption. …
Compressed Dna Representation For Efficient Amr Classification, John Partee, Robert Hazell, Anjli Solsi, John Santerre
SMU Data Science Review
In this paper, we explore a representation methodology for the compression of DNA isolates. Using lossless string compression via tokenization of frequently repeated segments of DNA, we reduce the length of the isolates to be counted as k-mers for classification. With this new representation, we apply a previously established feature sampling method to dramatically reduce the feature space. In understanding the genetic diversity, we also look at conserving biological function across these spaces. Using a random forest model we were able to predict the resistance or susceptibility of bacteria with 85-90% accuracy, with a 30-50% reduction in overall isolate length, …
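The pipeline outlined in this abstract (tokenize a frequently repeated segment, then count k-mers on the shorter string) can be illustrated with a toy example. Everything here is an assumption made for illustration: the paper's actual tokenization scheme, segment lengths, and value of k are not given in the abstract, and the function names are hypothetical.

```python
from collections import Counter


def most_common_segment(seq, length):
    """Return the most frequent substring of the given length."""
    counts = Counter(seq[i:i + length] for i in range(len(seq) - length + 1))
    return counts.most_common(1)[0][0]


def compress(seq, segment, token):
    """Replace each non-overlapping occurrence of segment with a one-character token.

    The token must not already occur in seq, otherwise the mapping is not lossless.
    """
    if token in seq:
        raise ValueError("token already occurs in the sequence")
    return seq.replace(segment, token)


def decompress(compressed, segment, token):
    """Invert compress() by expanding every token back to its segment."""
    return compressed.replace(token, segment)


def kmer_counts(seq, k):
    """Count every overlapping substring of length k."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


seq = "ACGTACGTTTACGT"
segment = most_common_segment(seq, 4)
compressed = compress(seq, segment, "X")
features = kmer_counts(compressed, 2)
print(segment, compressed, dict(features))
```

Because the token never appears in the raw sequence, `decompress` restores the input exactly, which is what makes the shortened representation lossless in this sketch.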
Spoken Language Recognition On Open-Source Datasets, Brady Arendale, Samira Zarandioon, Ryan Goodwin, Douglas Reynolds
SMU Data Science Review
The field of speaker and language recognition is constantly being researched and developed, but much of this research is done on private or expensive datasets, making the field more inaccessible than many other areas of machine learning. In addition, many papers make performance claims without comparing their models to other recent research. With the recent development of public multilingual speech corpora such as Mozilla's Common Voice as well as several single-language corpora, we now have the resources to attempt to address both of these problems. We construct an eight-language dataset from Common Voice and a Google Bengali corpus as well …
Predicting Attrition - A Driver For Creating Value, Realizing Strategy, And Refining Key Hr Processes, Kevin Mendonsa, Maureen Stolberg, Vivek Viswanathan, Scott Crum
SMU Data Science Review
Talent is the most important asset for every organization's success. While attrition (or churn) and turnover can refer to both employees and customers, this paper will focus on employee attrition only. Many organizations accept attrition as an inevitable cost of doing business and do nothing to adopt or implement mitigating strategies to combat it. World-class companies, on the other hand, take deliberate measures to understand, control and mitigate attrition (turnover) at every stage. Unmitigated attrition can have a devastating effect on an organization's bottom line and market value. In addition, the "invisible" costs of low employee morale, reduced employee …
An Effective Method For Attribute Subset Selection, Considering The Resource In Pattern Recognition, Bakhtiyorjon Bakirovich Akbaraliev
Chemical Technology, Control and Management
An analytical method for determining informative sets of features (INP) is developed, taking into account the resource for criteria based on the use of a measure of dispersion of classified objects. The areas of existence of the solution are defined. The statements and properties for the Fisher-type information criterion are proved, using which the proposed analytical method for determining the INP guarantees optimal results in the sense of maximizing the selected functional. The appropriateness of choosing this type of informative criterion is justified. A method for transforming attributes is proposed. The universality of the method in relation to the type …
Human Trafficking In Nepal: Can Big Data Help?, Shushant Khanal
Undergraduate Research Journal
This paper provides an overview of human trafficking in Nepal, identifies strategies implemented by the country's government to handle the problem, and explores the possibility of using big data as a solution to human trafficking in Nepal. Big data may be defined as the collection of a large volume of data from the past that is processed using machine learning and artificial intelligence to find a common pattern. The use of big data in tackling the problem of human trafficking is not new in developed countries like the United States but it is still a foreign idea …
Prediction Of Feed Utilization Performance In Clarias Gariepinus Using Multiple Linear Regression In Machine Learning, Adekunle Oluwatosin Familusi
Journal of Bioresource Management
Machine learning models can be used to make predictions about nutrient utilization performance indexes using available proximate analysis data on feed composition. Data from similar experiments on nutrient utilization performance were used to fit a multiple linear regression model for the prediction of four performance indexes. A relationship between the Specific Growth Rate and percentage inclusion, with a strength of 0.57, was noted along with a negative relationship between protein efficiency and protein content. A negative relationship between Nitrogen Free Extract (NFE) and Protein Efficiency Ratio (PER) at NFE content ≥25 % was observed. PER was predicted with 85 % accuracy, while Weight Gain …