Old Dominion University

Computer Science Faculty Publications

Articles 31 - 59 of 59

Full-Text Articles in Social and Behavioral Sciences

Understanding The Impact Of Encrypted DNS On Internet Censorship, Lin Jin, Shuai Hao, Haining Wang, Chase Cotton Jan 2021

DNS traffic is transmitted in plaintext, resulting in privacy leakage. To combat this problem, secure protocols have been used to encrypt DNS messages. Existing studies have investigated the performance overhead and privacy benefits of encrypted DNS communications, yet little has been done from the perspective of censorship. In this paper, we study the impact of encrypted DNS on Internet censorship in two aspects. On the one hand, we explore the severity of DNS manipulation, which could be leveraged for Internet censorship, given the use of encrypted DNS resolvers. In particular, we perform 7.4 million DNS lookup measurements on 3,813 DoT …
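
The kind of lookup such a measurement study issues can be sketched in a few lines. Below is a minimal, illustrative DNS-over-HTTPS query against Cloudflare's public JSON DoH endpoint; the resolver and domain are stand-ins, not the resolvers or test lists used in the paper.

```python
# Minimal DoH lookup sketch (illustrative resolver and domain, not the paper's setup).
import requests

def doh_lookup(domain, resolver="https://cloudflare-dns.com/dns-query"):
    # Cloudflare's JSON DoH API; the binary RFC 8484 wire format is also common.
    resp = requests.get(
        resolver,
        params={"name": domain, "type": "A"},
        headers={"Accept": "application/dns-json"},
        timeout=10,
    )
    resp.raise_for_status()
    return [answer["data"] for answer in resp.json().get("Answer", [])]

if __name__ == "__main__":
    print(doh_lookup("example.com"))  # resolved A records
```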


SSentiA: A Self-Supervised Sentiment Analyzer For Classification From Unlabeled Data, Salim Sazzed, Sampath Jayarathna Jan 2021

In recent years, supervised machine learning (ML) methods have realized remarkable performance gains for sentiment classification utilizing labeled data. However, labeled data are usually expensive to obtain and thus not always available. When annotated data are unavailable, unsupervised tools are used instead, which still lag behind the performance of supervised ML methods by a large margin. Therefore, in this work, we focus on improving the performance of sentiment classification from unlabeled data. We present a self-supervised hybrid methodology, SSentiA (Self-supervised Sentiment Analyzer), that couples an ML classifier with a lexicon-based method for sentiment classification from unlabeled data. We first introduce LRSentiA …
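
A minimal sketch of the hybrid idea follows, with VADER standing in for the paper's own lexicon-based component (LRSentiA) and scikit-learn for the supervised side: confidently scored documents are pseudo-labeled and used to train a classifier that then handles the documents the lexicon was unsure about. Thresholds and models here are illustrative.

```python
# Sketch of self-supervised sentiment classification (VADER stands in for the
# paper's lexicon-based component; thresholds and models are illustrative).
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def self_supervised_sentiment(docs, confidence=0.5):
    analyzer = SentimentIntensityAnalyzer()
    scores = [analyzer.polarity_scores(d)["compound"] for d in docs]

    # Pseudo-label documents the lexicon scores confidently; hold back the rest.
    confident = [(d, 1 if s > 0 else 0)
                 for d, s in zip(docs, scores) if abs(s) >= confidence]
    uncertain = [d for d, s in zip(docs, scores) if abs(s) < confidence]

    # Train a supervised classifier on the pseudo-labels (needs both classes present).
    vec = TfidfVectorizer()
    X = vec.fit_transform([d for d, _ in confident])
    clf = LogisticRegression(max_iter=1000).fit(X, [y for _, y in confident])

    # The trained classifier labels the documents the lexicon was unsure about.
    return clf.predict(vec.transform(uncertain)) if uncertain else []
```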


A Survey Of Enabling Technologies For Smart Communities, Amna Iqbal, Stephan Olariu Jan 2021

In 2016, the Japanese Government publicized an initiative and a call to action for the implementation of a "Super Smart Society" announced as Society 5.0. The stated goal of Society 5.0 is to meet the various needs of the members of society through the provisioning of goods and services to those who require them, when they are required and in the amount required, thus enabling the citizens to live an active and comfortable life. In spite of its genuine appeal, details of a feasible path to Society 5.0 are conspicuously missing. The first main goal of this survey is to …


SmartCiteCon: Implicit Citation Context Extraction From Academic Literature Using Unsupervised Learning, Chenrui Gao, Haoran Cui, Li Zhang, Jiamin Wang, Wei Lu, Jian Wu Jan 2020

We introduce SmartCiteCon (SCC), a Java API for extracting both explicit and implicit citation context from academic literature in English. The tool is built on a Support Vector Machine (SVM) model trained on a set of 7,058 manually annotated citation context sentences, curated from 34,000 papers in the ACL Anthology. The model with 19 features achieves F1=85.6%. SCC supports PDF, XML, and JSON files out of the box, provided that they conform to certain schemas. The API supports single document processing and batch processing in parallel. It takes about 12–45 seconds on average depending on the format to process a …
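
The core classification step can be approximated with scikit-learn. The sketch below trains an SVM to label sentences as implicit citation context or not; the TF-IDF features and the two training sentences are illustrative stand-ins for the tool's 19 hand-crafted features and its 7,058 annotated sentences.

```python
# Toy SVM for implicit-citation-context detection (features and data are
# illustrative simplifications of SmartCiteCon's feature set and training data).
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Hypothetical labels: 1 = implicit citation context, 0 = ordinary sentence.
train_sentences = [
    "Their parser further improves recall on the same benchmark.",
    "We collected tweets over a six-month period for our study.",
]
train_labels = [1, 0]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(train_sentences, train_labels)

print(model.predict(["This approach outperforms the earlier method."]))
```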


Acknowledgement Entity Recognition In CORD-19 Papers, Jian Wu, Pei Wang, Xin Wei, Sarah Rajtmajer, C. Lee Giles, Christopher Griffin Jan 2020

Acknowledgements are ubiquitous in scholarly papers. Existing acknowledgement entity recognition methods assume all named entities are acknowledged. Here, we examine the nuances between acknowledged and named entities by analyzing sentence structure. We develop an acknowledgement extraction system, AckExtract, based on open-source text-mining software, and evaluate our method using manually labeled data. AckExtract uses the PDF of a scholarly paper as input and outputs acknowledgement entities. Results show an overall performance of F1=0.92. We built a supplementary database by linking CORD-19 papers with acknowledgement entities extracted by AckExtract, including persons and organizations, and find that only up to …
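
A rough sketch of the entity-extraction step, using spaCy's off-the-shelf NER as a stand-in for the open-source text-mining stack the system builds on; the real system additionally analyzes sentence structure to decide whether an entity is actually being acknowledged. The passage is illustrative.

```python
# NER over an acknowledgements passage; spaCy's small English model is a
# stand-in (requires: python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")
ack_text = ("We thank Jane Doe for helpful comments. This work was supported "
            "in part by the National Science Foundation.")

doc = nlp(ack_text)
# Keep person and organization entities; deciding which of them are truly
# acknowledged requires the sentence-structure analysis described in the paper.
entities = [(ent.text, ent.label_) for ent in doc.ents
            if ent.label_ in {"PERSON", "ORG"}]
print(entities)
```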


Opening Books And The National Corpus Of Graduate Research, William A. Ingram, Edward A. Fox, Jian Wu Jan 2020

Virginia Tech University Libraries, in collaboration with Virginia Tech Department of Computer Science and Old Dominion University Department of Computer Science, request $505,214 in grant funding for a 3-year project, the goal of which is to bring computational access to book-length documents, demonstrating that with Electronic Theses and Dissertations (ETDs). The project is motivated by the following library and community needs. (1) Despite huge volumes of book-length documents in digital libraries, there is a lack of models offering effective and efficient computational access to these long documents. (2) Nationwide open access services for ETDs generally function at the metadata level. …


Rotate-And-Press: A Non-Visual Alternative To Point-And-Click, Hae-Na Lee, Vikas Ashok, I. V. Ramakrishnan Jan 2020

Most computer applications manifest visually rich and dense graphical user interfaces (GUIs) that are primarily tailored for easy and efficient sighted interaction using a combination of two default input modalities, namely the keyboard and the mouse/touchpad. However, blind screen-reader users rely predominantly on the keyboard alone, and therefore struggle to interact with these applications, since it is both arduous and tedious to perform visual 'point-and-click' tasks, such as accessing the various application commands/features, using just the keyboard shortcuts supported by screen readers.

In this paper, we investigate the suitability of a 'rotate-and-press' input modality as an effective non-visual substitute for the visual …


A Heuristic Baseline Method For Metadata Extraction From Scanned Electronic Theses And Dissertations, Muntabir H. Choudhury, Jian Wu, William A. Ingram, Edward A. Fox Jan 2020

Extracting metadata from scholarly papers is an important text mining problem. Widely used open-source tools such as GROBID are designed for born-digital scholarly papers but often fail for scanned documents, such as Electronic Theses and Dissertations (ETDs). Here we present a preliminary baseline work with a heuristic model to extract metadata from the cover pages of scanned ETDs. The process starts by converting scanned pages into images and then into text files by applying OCR tools. Then a series of carefully designed regular expressions is applied for each field, capturing patterns for seven metadata fields: titles, authors, years, degrees, academic programs, …
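
A simplified sketch of the heuristic idea, with illustrative patterns rather than the paper's actual rules, applied to a toy OCR'd cover page:

```python
# Heuristic field extraction from OCR'd cover-page text; the patterns and the
# sample text are illustrative, not the paper's actual rules.
import re

cover_text = """A STUDY OF SCANNED DOCUMENTS
by
Jane Q. Student
A Thesis Submitted to the Faculty of Old Dominion University
Master of Science
May 2019"""

patterns = {
    "author": re.compile(r"^by\s*\n(.+)$", re.MULTILINE | re.IGNORECASE),
    "degree": re.compile(r"((?:Master|Doctor) of [A-Za-z ]+)"),
    "year":   re.compile(r"\b((?:19|20)\d{2})\b"),
}

metadata = {field: (m.group(1).strip() if (m := rx.search(cover_text)) else None)
            for field, rx in patterns.items()}
print(metadata)  # {'author': 'Jane Q. Student', 'degree': 'Master of Science', 'year': '2019'}
```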


Repurposing Visual Input Modalities For Blind Users: A Case Study Of Word Processors, Hae-Na Lee, Vikas Ashok, I.V. Ramakrishnan Jan 2020

Visual 'point-and-click' interaction artifacts such as the mouse and touchpad are tangible input modalities that are essential for sighted users to conveniently interact with computer applications. In contrast, blind users are unable to leverage these visual input modalities and are thus limited to interacting with computers using a sequentially narrating screen-reader assistive technology coupled to the keyboard. As a consequence, blind users generally require significantly more time and effort to do even simple application tasks (e.g., applying a style to text in a word processor) using only the keyboard, compared to their sighted peers who can effortlessly accomplish the same tasks …


Towards Making Videos Accessible For Low Vision Screen Magnifier Users, Ali Selman Aydin, Shirin Feiz, Vikas Ashok, I. V. Ramakrishnan Jan 2020

People with low vision who use screen magnifiers to interact with computing devices find it very challenging to interact with dynamically changing digital content such as videos, since they do not have the luxury of time to manually move, i.e., pan the magnifier lens to different regions of interest (ROIs) or zoom into these ROIs before the content changes across frames.

In this paper, we present SViM, a first of its kind screen-magnifier interface for such users that leverages advances in computer vision, particularly video saliency models, to identify salient ROIs in videos. SViM's interface allows users to zoom in/out …
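
A rough sketch of the underlying idea, not SViM itself: compute a per-frame saliency map, take the bounding box of the most salient pixels, and magnify it. OpenCV's static spectral-residual saliency (from opencv-contrib-python) stands in here for the video saliency models the paper relies on; the threshold and zoom factor are illustrative.

```python
# Per-frame salient-ROI magnification sketch (OpenCV static saliency as a
# stand-in for video saliency models; requires opencv-contrib-python).
import cv2
import numpy as np

def magnified_salient_roi(frame, zoom=2.0):
    saliency = cv2.saliency.StaticSaliencySpectralResidual_create()
    ok, smap = saliency.computeSaliency(frame)
    if not ok:
        return frame
    # Threshold the saliency map and take the bounding box of salient pixels.
    mask = smap > smap.mean() + smap.std()
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return frame
    roi = frame[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    return cv2.resize(roi, None, fx=zoom, fy=zoom)  # the magnified view
```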


SaIL: Saliency-Driven Injection Of ARIA Landmarks, Ali Selman Aydin, Shirin Feiz, Vikas Ashok, I. V. Ramakrishnan Jan 2020

Navigating webpages with screen readers is a challenge even with recent improvements in screen reader technologies and the increased adoption of web standards for accessibility, namely ARIA. ARIA landmarks, an important aspect of ARIA, let screen reader users access different sections of a webpage quickly by enabling them to skip over blocks of irrelevant or redundant content. However, these landmarks are used sporadically and inconsistently by web developers and, in many cases, are absent from numerous web pages. Therefore, we propose SaIL, a scalable approach that automatically detects the important sections of a web page, and then injects ARIA landmarks …
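
The injection step can be illustrated in isolation. In the sketch below, a hard-coded selector stands in for the output of a saliency-based detector, and BeautifulSoup adds ARIA landmark attributes to the markup; this is not SaIL's actual injection mechanism, and the selector and label are illustrative.

```python
# ARIA landmark injection sketch; the selector stands in for detector output.
from bs4 import BeautifulSoup

html = """<html><body>
<div id="main-story"><h1>Headline</h1><p>Article text...</p></div>
<div id="ads">Advertisement</div>
</body></html>"""

salient_selectors = ["#main-story"]  # assumed output of the saliency detector

soup = BeautifulSoup(html, "html.parser")
for selector in salient_selectors:
    for node in soup.select(selector):
        node["role"] = "region"              # ARIA landmark role
        node["aria-label"] = "Main content"  # name announced by screen readers

print(soup.prettify())
```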


Web Archives At The Nexus Of Good Fakes And Flawed Originals, Michael L. Nelson Jan 2019

[Summary] The authenticity, integrity, and provenance of resources we encounter on the web are increasingly in question. While many people are inured to the possibility of altered images, the easy accessibility of powerful software tools that synthesize audio and video will unleash a torrent of convincing “deepfakes” into our social discourse. Archives will no longer be monopolized by a countable number of institutions such as governments and publishers, but will become a competitive space filled with social engineers, propagandists, conspiracy theorists, and aspiring Hollywood directors. While the historical record has never been singular nor unmalleable, current technologies empower an unprecedented …


Automatic Slide Generation For Scientific Papers, Athar Sefid, Jian Wu, Prasenjit Mitra, C. Lee Giles Jan 2019

We describe our approach for automatically generating presentation slides for scientific papers using deep neural networks. Such slides can give authors a starting point for their slide generation process. Extractive summarization techniques are applied to rank and select important sentences from the original document. Previous work identified important sentences based only on a limited number of features extracted from the position and structure of sentences in the paper. Our method extends previous work by (1) extracting a more comprehensive list of surface features, (2) considering the semantics or meaning of the sentence, and (3) using context around the …
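
As a bare-bones illustration of the extractive step, far simpler than the paper's surface, semantic, and context features, sentences can be scored by mean TF-IDF weight and the top-ranked ones kept as slide-bullet candidates:

```python
# Score sentences by mean TF-IDF weight and keep the top-k (toy extractive step).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = [
    "We propose a neural model for generating presentation slides.",
    "The conference venue was chosen for its convenient location.",
    "Experiments show a 10% improvement over the extractive baseline.",
]

tfidf = TfidfVectorizer().fit_transform(sentences)
scores = np.asarray(tfidf.mean(axis=1)).ravel()  # mean TF-IDF weight per sentence
for i in np.argsort(scores)[::-1][:2]:           # top-2 candidate bullet points
    print(sentences[i])
```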


A Survey Of Attention Deficit Hyperactivity Disorder Identification Using Psychophysiological Data, S. De Silva, S. Dayarathna, G. Ariyarathne, D. Meedeniya, Sampath Jayarathna Jan 2019

Attention Deficit Hyperactivity Disorder (ADHD) is one of the most common neurological disorders among children; it affects areas of the brain responsible for executing certain functions. This may lead to a variety of impairments, such as difficulty paying attention or focusing, controlling impulsive behaviours, and overreacting. The continuing symptoms may have a severe impact in the long term. This paper explores ADHD identification studies using eye movement data and functional Magnetic Resonance Imaging (fMRI). This study discusses different machine learning techniques and existing models, and analyses the existing literature. We have identified the current challenges and possible future directions …


Client-Assisted Memento Aggregation Using The Prefer Header, Mat Kelly, Sawood Alam, Michael L. Nelson, Michele C. Weigle Jan 2018

[First paragraph] Preservation of the Web ensures that future generations have a picture of how the web was. Web archives like Internet Archive's Wayback Machine, WebCite, and archive.is allow individuals to submit URIs to be archived, but the captures they preserve then reside at the archives. Traversing these captures in time as preserved by multiple archive sources (using Memento [8]) provides a more comprehensive picture of the past Web than relying on a single archive. Some content on the Web, such as content behind authentication, may be unsuitable or inaccessible for preservation by these organizations. Furthermore, this content may be …
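
A client-side request of the general shape discussed here might look like the following; the aggregator is a public MemGator instance, and the Prefer value is purely illustrative, not the exact syntax the paper proposes.

```python
# TimeMap request with a Prefer header (aggregator URL is a public MemGator
# instance; the Prefer value is illustrative only).
import requests

timemap = "https://memgator.cs.odu.edu/timemap/link/https://example.com/"
resp = requests.get(timemap,
                    headers={"Prefer": "include-archives=ia,archive.is"},
                    timeout=30)

print(resp.status_code)
print(resp.headers.get("Preference-Applied"))  # what the server honored, if anything
print(resp.text[:500])                         # start of the link-format TimeMap
```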


Swimming In A Sea Of JavaScript Or: How I Learned To Stop Worrying And Love High-Fidelity Replay, John A. Berlin, Michael L. Nelson, Michele C. Weigle Jan 2018

[First paragraph] Preserving and replaying modern web pages in high fidelity has become an increasingly difficult task due to the increased usage of JavaScript. Reliance on server-side rewriting alone results in live leakage and/or the inability to replay a page because the preserved JavaScript performs an action not permissible from the archive. The current state-of-the-art high-fidelity archival preservation and replay solutions rely on handcrafted client-side URL rewriting libraries specifically tailored for the archive, namely Webrecorder's and Pywb's wombat.js [12]. Web archives not utilizing client-side rewriting rely on server-side rewriting that misses URLs used in a manner not accounted for …


It Is Hard To Compute Fixity On Archived Web Pages, Mohamed Aturban, Michael L. Nelson, Michele C. Weigle Jan 2018

[Introduction] Checking fixity in web archives is performed to ensure archived resources, or mementos (denoted by URI-M) have remained unaltered since when they were captured. The final report of the PREMIS Working Group [2] defines information used for fixity as "information used to verify whether an object has been altered in an undocumented or unauthorized way." The common technique for checking fixity is to generate a current hash value (i.e., a message digest or a checksum) for a file using a cryptographic hash function (e.g., SHA-256) and compare it to the hash value generated originally. If they have different hash …
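
The basic operation is straightforward to sketch; as the paper argues, the difficulty lies in doing this repeatably for replayed mementos, which archives transform on the fly. The URI-M below is illustrative.

```python
# Fetch the unmodified archived bytes of a memento and hash them with SHA-256.
import hashlib
import requests

# The "id_" modifier asks the Wayback Machine for the original archived content
# rather than the rewritten replay page; this URI-M is illustrative.
uri_m = "https://web.archive.org/web/20180101000000id_/http://example.com/"
content = requests.get(uri_m, timeout=30).content

print(hashlib.sha256(content).hexdigest())
```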


A Survey Of Archival Replay Banners, Sawood Alam, Mat Kelly, Michele C. Weigle, Michael L. Nelson Jan 2018

We surveyed various archival systems to compare and contrast different techniques used to implement an archival replay banner. We found that inline plain HTML injection is the most common approach, but prone to style conflicts. Iframe-based banners are also very common and while they do not have style conflicts, they suffer from screen real estate wastage and limited design choices. Custom Elements-based banners are promising, but due to being a new web standard, these are not yet widely deployed.


205.3 The Many Shapes Of Archive-It, Shawn Jones, Michael L. Nelson, Alexander Nwala, Michele C. Weigle Jan 2018

Web archives, a key area of digital preservation, meet the needs of journalists, social scientists, historians, and government organizations. The use cases for these groups often require that they guide the archiving process themselves, selecting their own original resources, or seeds, and creating their own web archive collections. We focus on the collections within Archive-It, a subscription service started by the Internet Archive in 2005 for the purpose of allowing organizations to create their own collections of archived web pages, or mementos. Understanding these collections could be done via their user-supplied metadata or via text analysis, but the metadata is …


Avoiding Zombies In Archival Replay Using ServiceWorker, Sawood Alam, Mat Kelly, Michele C. Weigle, Michael L. Nelson Jan 2017

[First paragraph] A Composite Memento is an archived representation of a web page with all the page requisites such as images and stylesheets. All embedded resources have their own URIs, hence, they are archived independently. For a meaningful archival replay, it is important to load all the page requisites from the archive within the temporal neighborhood of the base HTML page. To achieve this goal, archival replay systems try to rewrite all the resource references to appropriate archived versions before serving HTML, CSS, or JS. However, an effective server-side URL rewriting is difficult when URLs are generated dynamically using JavaScript. …


Profiling Web Archives For Efficient Memento Query Routing, Sawood Alam, Michael L. Nelson, Herbert Van de Sompel, Lyudmila L. Balakireva, Harihar Shankar, David S. H. Rosenthal Jan 2015

No abstract provided.


Characteristics Of Social Media Stories, Yasmin AlNoamany, Michele C. Weigle, Michael L. Nelson Jan 2015

An emerging trend in social media is for users to create and publish "stories", or curated lists of web resources with the purpose of creating a particular narrative of interest to the user. While some stories on the web are automatically generated, such as Facebook’s "Year in Review", one of the most popular storytelling services is "Storify", which provides users with curation tools to select, arrange, and annotate stories with content from social media and the web at large. We would like to use tools like Storify to present automatically created summaries of archival collections. To support automatic story creation, …


Moved But Not Gone: An Evaluation Of Real-Time Methods For Discovering Replacement Web Pages, Martin Klein, Michael L. Nelson Jan 2014

Inaccessible Web pages and 404 “Page Not Found” responses are a common Web phenomenon and a detriment to the user’s browsing experience. The rediscovery of missing Web pages is, therefore, a relevant research topic in the digital preservation as well as in the Information Retrieval realm. In this article, we bring these two areas together by analyzing four content- and link-based methods to rediscover missing Web pages. We investigate the retrieval performance of the methods individually as well as their combinations and give an insight into how effective these methods are over time. As the main result of this work, …
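
A lexical signature, the page's most distinguishing terms issued as a search-engine query, is a typical content-based method of this kind. A toy sketch with scikit-learn's TF-IDF follows; the corpus and page text are stand-ins.

```python
# Derive a k-term lexical signature from a cached copy of the missing page
# (corpus and page text are toy stand-ins).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "digital preservation of web archives and mementos",
    "recipes for sourdough bread and pastry baking",
    "soccer league results and player transfer news",
]
missing_page = "web archives preserve mementos for digital preservation research"

vec = TfidfVectorizer().fit(corpus)
weights = vec.transform([missing_page]).toarray().ravel()
terms = vec.get_feature_names_out()
signature = [terms[i] for i in np.argsort(weights)[::-1][:5]]
print(" ".join(signature))  # candidate query for a search engine
```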


Demographic Prediction Of Mobile User From Phone Usage, Shahram Mohrehkesh, Shuiwang Ji, Tamer Nadeem, Michele C. Weigle Jan 2012

In this paper, we describe how we use the mobile phone usage of users to predict their demographic attributes. Using call logs, visited GSM cells, visited Bluetooth devices, visited wireless LAN devices, accelerometer data, and so on, we predict users' gender, age, marital status, job, and number of people in the household. The accuracy of the developed classifiers ranges from 45% to 87%, depending on the particular classification problem.


Everyone Is A Curator: Human-Assisted Preservation For ORE Aggregations, Frank McCown, Michael L. Nelson, Herbert Van de Sompel Jan 2009

The Open Archives Initiative (OAI) has recently created the Object Reuse and Exchange (ORE) project that defines Resource Maps (ReMs) for describing aggregations of web resources. These aggregations are susceptible to many of the same preservation challenges that face other web resources. In this paper, we investigate how the aggregations of web resources can be preserved outside of the typical repository environment and instead rely on the thousands of interactive users in the web community and the Web Infrastructure (the collection of web archives, search engines, and personal archiving services) to facilitate preservation. Inspired by Web 2.0 services such as …


Object Reuse And Exchange, Michael L. Nelson, Carl Lagoze, Herbert Van de Sompel, Pete Johnston, Robert Sanderson, Simeon Warner, Jürgen Sieck (Ed.), Michael A. Herzog (Ed.) Jan 2009

The Open Archives Object Reuse and Exchange (OAI-ORE) project defines standards for the description and exchange of aggregations of Web resources. The OAI-ORE abstract data model is conformant with the Architecture of the World Wide Web and leverages concepts from the Semantic Web, including RDF descriptions and Linked Data. In this paper we provide a brief review of a motivating example and its serialization in Atom.
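
A tiny Resource Map can be sketched with rdflib: a ReM that describes an Aggregation of two web resources. The URIs are illustrative, and RDF/XML is used here rather than the Atom serialization the paper reviews.

```python
# A Resource Map describing an Aggregation of two resources, built with rdflib.
from rdflib import Graph, Namespace, URIRef, RDF

ORE = Namespace("http://www.openarchives.org/ore/terms/")

rem = URIRef("http://example.org/rem/1")           # the Resource Map (ReM)
agg = URIRef("http://example.org/aggregation/1")   # the Aggregation it describes

g = Graph()
g.bind("ore", ORE)
g.add((rem, RDF.type, ORE.ResourceMap))
g.add((agg, RDF.type, ORE.Aggregation))
g.add((rem, ORE.describes, agg))
g.add((agg, ORE.aggregates, URIRef("http://example.org/page1.html")))
g.add((agg, ORE.aggregates, URIRef("http://example.org/image1.png")))

print(g.serialize(format="xml"))  # RDF/XML; Atom is another ORE serialization
```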


Factors Affecting Website Reconstruction From The Web Infrastructure, Frank McCown, Norou Diawara, Michael L. Nelson Jun 2007

When a website is suddenly lost without a backup, it may be reconstituted by probing web archives and search engine caches for missing content. In this paper we describe an experiment where we crawled and reconstructed 300 randomly selected websites on a weekly basis for 14 weeks. The reconstructions were performed using our web-repository crawler named Warrick which recovers missing resources from the Web Infrastructure (WI), the collective preservation effort of web archives and search engine caches. We examine several characteristics of the websites over time including birth rate, decay and age of resources. We evaluate the reconstructions when compared …
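
A minimal sketch of one recovery probe: asking the Internet Archive's public availability API whether a snapshot of a lost URL exists. Warrick itself queries several members of the Web Infrastructure; this checks only one, and the URL is illustrative.

```python
# Check the Internet Archive for a snapshot of a lost URL (illustrative URL).
import requests

def find_snapshot(url):
    api = "https://archive.org/wayback/available"
    data = requests.get(api, params={"url": url}, timeout=30).json()
    closest = data.get("archived_snapshots", {}).get("closest")
    return closest["url"] if closest else None

print(find_snapshot("http://example.com/lost-page.html"))
```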


iSTART 2: Improvements For Efficiency And Effectiveness, Irwin B. Levinstein, Chutima Boonthum, Srinivasa P. Pillarisetti, Courtney Bell, Danielle S. McNamara Jan 2007

iSTART (interactive strategy training for active reading and thinking) is a Web-based reading strategy trainer that develops students' ability to self-explain difficult text as a means to improving reading comprehension. Its curriculum consists of modules presented interactively by pedagogical agents: an introduction to the basics of using reading strategies in the context of self-explanation, a demonstration of self-explanation, and a practice module in which the trainee generates self-explanations with feedback on the quality of reading strategies contained in the self-explanations. We discuss the objectives that guided the development of the second version of iSTART toward the goals of increased efficiency …


Assessing The Format Of The Presentation Of Text In Developing A Reading Strategy Assessment Tool (R-SAT), Sara Gilliam, Joseph P. Magliano, Keith K. Millis, Irwin Levinstein, Chutima Boonthum Jan 2007

We are constructing a new computerized test of reading comprehension called the Reading Strategy Assessment Tool (R-SAT). R-SAT elicits and analyzes verbal protocols that readers generate in response to questions as they read texts. We examined whether the amount of information available to the reader when reading and answering questions influenced the extent to which R-SAT accounts for comprehension. We found that R-SAT was most predictive of comprehension when the readers did not have access to the text as they answered questions.