Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Singapore Management University

Articles 4261 - 4290 of 6848

Full-Text Articles in Physical Sciences and Mathematics

Understanding The Paradigm Shift To Computational Social Science In The Presence Of Big Data, Ray M. Chang, Robert J. Kauffman, Young Ok Kwon Jul 2014

Research Collection School Of Computing and Information Systems

The era of big data has created new opportunities for researchers to achieve high relevance and impact amid changes and transformations in how we study social science phenomena. With the emergence of new data collection technologies and advanced data mining and analytics support, there seem to be fundamental changes occurring in the research questions we can ask and the research methods we can apply. The contexts include social networks and blogs, political discourse, corporate announcements, digital journalism, mobile telephony, home entertainment, online gaming, financial services, online shopping, social advertising, and social commerce. The changing costs of data collection and …


Decentralized Stochastic Planning With Anonymity In Interactions, Pradeep Varakantham, Yossiri Adulyasak, Patrick Jaillet Jul 2014

Research Collection School Of Computing and Information Systems

In this paper, we solve cooperative decentralized stochastic planning problems, where the interactions between agents (specified using transition and reward functions) are dependent on the number of agents (and not on the identity of the individual agents) involved in the interaction. A collision of robots in a narrow corridor, defender teams coordinating patrol activities to secure a target, etc. are examples of such anonymous interactions. Formally, we consider problems that are a subset of the well known Decentralized MDP (DEC-MDP) model, where the anonymity in interactions is specified within the joint reward and transition functions. In this paper, not only …


Manifold Learning For Jointly Modeling Topic And Visualization, Tuan Minh Van Le, Hady W. Lauw Jul 2014

Research Collection School Of Computing and Information Systems

Classical approaches to visualization directly reduce a document's high-dimensional representation into visualizable two or three dimensions, using techniques such as multidimensional scaling. More recent approaches consider an intermediate representation in topic space, between word space and visualization space, which preserves the semantics by topic modeling. We call the latter the semantic visualization problem, as it seeks to jointly model topic and visualization. While previous approaches aim to preserve global consistency, they do not consider local consistency in terms of the intrinsic geometric structure of the document manifold. We therefore propose an unsupervised probabilistic model, called Semafore, which aims to …


Board Interlock Networks And The Use Of Relative Performance Evaluation, Qian Hao, Nan Hu, Ling Liu, Lee J. Yao Jul 2014

Research Collection School Of Computing and Information Systems

Purpose - The purpose of this paper is to explore how networks of boards of directors affect relative performance evaluation (RPE) in chief executive officer (CEO) compensation. Design/methodology/approach - In this study, the authors propose that an interlocking network is an important inter-corporate setting, which has a bearing on whether boards decide to use RPE in CEO compensation. They adopt four typical graph measures to depict the centrality/position of each board in the interlock network: degree, betweenness, eigenvector and closeness, and study their impacts on RPE use. Findings - The authors find that firms that have more connected board members …


Predicting The Popularity Of Web 2.0 Items Based On User Comments, Xiangnan He, Ming Gao, Min-Yen Kan, Yiqun Liu, Kazunari Sugiyama Jul 2014

Research Collection School Of Computing and Information Systems

In the current Web 2.0 era, the popularity of Web resources fluctuates ephemerally, based on trends and social interest. As a result, content-based relevance signals are insufficient to meet users' constantly evolving information needs in searching for Web 2.0 items. Incorporating future popularity into ranking is one way to counter this. However, predicting popularity as a third party (as in the case of general search engines) is difficult in practice, due to their limited access to item view histories. To enable popularity prediction externally without excessive crawling, we propose an alternative solution by leveraging user comments, which are more accessible …


A Simple Polynomial-Time Randomized Distributed Algorithm For Connected Row Convex Constraints, T. K. Satish Kumar, Nguyen Duc Thien, William Yeoh, Sven Koenig Jul 2014

Research Collection School Of Computing and Information Systems

In this paper, we describe a simple randomized algorithm that runs in polynomial time and solves connected row convex (CRC) constraints in distributed settings. CRC constraints generalize many known tractable classes of constraints like 2-SAT and implicational constraints. They can model problems in many domains including temporal reasoning and geometric reasoning, and generally speaking, play the role of "Gaussians" in the logical world. Our simple randomized algorithm for solving them in distributed settings, therefore, has a number of important applications. We support our claims through a theoretical analysis and empirical results.


FSpH: Fitted Spectral Hashing For Efficient Similarity Search, Yong-Dong Zhang, Yu Wang, Sheng Tang, Steven C. H. Hoi, Jin-Tao Li Jul 2014

Research Collection School Of Computing and Information Systems

Spectral hashing (SpH) is an efficient and simple binary hashing method, which assumes that data are sampled from a multidimensional uniform distribution. However, this assumption is too restrictive in practice. In this paper we propose an improved method, fitted spectral hashing (FSpH), to relax this distribution assumption. Our work is based on the fact that one-dimensional data of any distribution can be mapped to a uniform distribution without changing the local neighbor relations among data items. We have found that this mapping on each PCA direction has a certain regular pattern, and can be fitted well by an S-curve function (Sigmoid function). …
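The order-preserving mapping mentioned in the FSpH abstract can be illustrated with a small sketch. This is not the authors' implementation: `to_uniform` is a hypothetical helper that uses a plain empirical-CDF (rank) transform, and `sigmoid` with its parameters `a` and `b` is only an assumed form of the S-curve the abstract refers to. Both maps are monotone, so the ordering of points along one direction, and hence their neighbor relations on that axis, is unchanged.

```python
import math


def to_uniform(values):
    """Map 1-D values onto (0, 1) by rank (empirical CDF); order is preserved."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, i in enumerate(order):
        # Midpoint of the rank-th of n equal-width bins on (0, 1).
        out[i] = (rank + 0.5) / n
    return out


def sigmoid(x, a=1.0, b=0.0):
    """Assumed S-curve with hypothetical slope a and center b (not from the paper)."""
    return 1.0 / (1.0 + math.exp(-a * (x - b)))


# Heavily skewed input becomes evenly spread, with the same ordering.
skewed = [10.0, 0.1, 0.2, 1000.0]
spread = to_uniform(skewed)
```

The rank transform needs the whole sample, whereas a fitted parametric curve such as `sigmoid` can be applied to unseen points; that trade-off is one plausible reason to fit a curve rather than store ranks.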


CenKNN: A Scalable And Effective Text Classifier, Guansong Pang, Huidong Jin, Shengyi Jiang Jul 2014

Research Collection School Of Computing and Information Systems

A big challenge in text classification is to perform classification on a large-scale and high-dimensional text corpus in the presence of imbalanced class distributions and a large number of irrelevant or noisy term features. A number of techniques have been proposed to handle this challenge with varying degrees of success. In this paper, by combining the strengths of two widely used text classification techniques, K-Nearest-Neighbor (KNN) and centroid based (Centroid) classifiers, we propose a scalable and effective flat classifier, called CenKNN, to cope with this challenge. CenKNN projects high-dimensional (often hundreds of thousands) documents into a low-dimensional (normally a few …


Collaborative Error Reduction For Hierarchical Classification, Shiai Zhu, Xiao-Yong Wei, Chong-Wah Ngo Jul 2014

Research Collection School Of Computing and Information Systems

Hierarchical classification (HC) is a popular and efficient way for detecting the semantic concepts from the images. The conventional method always selects the branch with the highest classification response. This branch selection strategy has a risk of propagating classification errors from higher levels of the hierarchy to the lower levels. We argue that the local strategy is too arbitrary, because the candidate nodes are considered individually, which ignores the semantic and context relationships among concepts. In this paper, we first propose a novel method for HC, which is able to utilize the semantic relationship among candidate nodes and their children …


Integrating Self-Organizing Neural Network And Motivated Learning For Coordinated Multi-Agent Reinforcement Learning In Multi-Stage Stochastic Game, Teck-Hou Teng, Ah-Hwee Tan, Janusz A. Starzyk, Yuan-Sin Tan, Loo-Nin Teow Jul 2014

Research Collection School Of Computing and Information Systems

Most non-trivial problems require the coordinated performance of multiple goal-oriented and time-critical tasks. Coordinating the performance of the tasks is required due to the dependencies among the tasks and the sharing of resources. In this work, an agent learns to perform a task using reinforcement learning with a self-organizing neural network as the function approximator. We propose a novel coordination strategy integrating Motivated Learning (ML) and a self-organizing neural network for multi-agent reinforcement learning (MARL). Specifically, we adapt the ML idea of using pain signal to overcome the resource competition issue. Dependency among the agents is resolved using domain knowledge …


Click-Through-Based Cross-View Learning For Image Search, Yingwei Pan, Ting Yao, Tao Mei, Houqiang Li, Chong-Wah Ngo, Yong Rui Jul 2014

Research Collection School Of Computing and Information Systems

One of the fundamental problems in image search is to rank image documents according to a given textual query. Existing search engines highly depend on surrounding texts for ranking images, or leverage the query-image pairs annotated by human labelers to train a series of ranking functions. However, there are two major limitations: 1) the surrounding texts are often noisy or too few to accurately describe the image content, and 2) the human annotations are resourcefully expensive and thus cannot be scaled up. We demonstrate in this paper that the above two fundamental challenges can be mitigated by jointly exploring the …


A Novel Algorithm Based On Visual Saliency Attention For Localization And Segmentation In Rapidly-Stained Leukocyte Images, Xin Zheng, Yong Wang, Guoyou Wang, Zhong Chen Jul 2014

Research Collection School Of Computing and Information Systems

In this paper, we propose a fast hierarchical framework for leukocyte localization and segmentation in rapidly-stained leukocyte images (RSLI) with complex backgrounds and varying illumination. The proposed framework contains two main steps. First, a nucleus saliency model based on average absolute difference is built, which locates each leukocyte precisely while effectively removing dyeing impurities and erythrocyte fragments. Second, two different schemes are presented for segmenting the nuclei and cytoplasm respectively. For nuclei segmentation, to solve the overlap problem between leukocytes, we extract the nucleus lobes first and then group them. Lobe extraction is realized by the histogram-based contrast …


Building Algorithm Portfolios For Memetic Algorithms, Mustafa Misir, Stephanus Daniel Handoko, Hoong Chuin Lau Jul 2014

Research Collection School Of Computing and Information Systems

The present study introduces an automated mechanism to build algorithm portfolios for memetic algorithms. The objective is to determine an algorithm set involving combinations of crossover, mutation and local search operators based on their past performance. The past performance is used to cluster algorithm combinations. Top performing combinations are then considered as the members of the set. The set is expected to have algorithm combinations complementing each other with respect to their strengths in a portfolio setting. In other words, each algorithm combination should be good at solving a certain type of problem instances such that this set can be …


Reinforcement Learning For Adaptive Operator Selection In Memetic Search Applied To Quadratic Assignment Problem, Stephanus Daniel Handoko, Duc Thien Nguyen, Zhi Yuan, Hoong Chuin Lau Jul 2014

Research Collection School Of Computing and Information Systems

Memetic search is well known as one of the state-of-the-art metaheuristics for finding high-quality solutions to NP-hard problems. Its performance is often attributable to appropriate design, including the choice of its operators. In this paper, we propose a Markov Decision Process model for the selection of crossover operators in the course of the evolutionary search. We solve the proposed model by a Q-learning method. We experimentally verify the efficacy of our proposed approach on the benchmark instances of Quadratic Assignment Problem.
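As a rough illustration of the Q-learning idea in the abstract above, the sketch below keeps a table of values for (state, operator) pairs and updates it with the standard Q-learning rule. The abstract does not say how states, rewards, or operators are defined, so the class name `OperatorSelector`, the state label `"s"`, the operator names `"ox"` and `"pmx"`, and all parameter values are hypothetical.

```python
import random
from collections import defaultdict


class OperatorSelector:
    """Tabular Q-learning over (state, operator) pairs; a generic sketch."""

    def __init__(self, operators, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.operators = list(operators)
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.q = defaultdict(float)
        self.rng = random.Random(seed)

    def choose(self, state):
        # Epsilon-greedy: explore occasionally, otherwise pick the best-valued operator.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.operators)
        return max(self.operators, key=lambda op: self.q[(state, op)])

    def update(self, state, op, reward, next_state):
        # Standard Q-learning target: reward plus discounted best next value.
        best_next = max(self.q[(next_state, o)] for o in self.operators)
        target = reward + self.gamma * best_next
        self.q[(state, op)] += self.alpha * (target - self.q[(state, op)])


# Hypothetical usage: reward an operator that improved the current solution.
selector = OperatorSelector(["ox", "pmx"], alpha=0.5, gamma=0.0, epsilon=0.0)
selector.update("s", "pmx", reward=1.0, next_state="s")
chosen = selector.choose("s")
```

In a memetic search loop, the reward would presumably reflect the improvement an operator produced in the offspring, but that choice is an assumption here rather than something the abstract states.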


New A*STAR-SMU Centre Combines High-Powered Computing And Behavioural Sciences To Study People-Centric Issues, Singapore Management University Jun 2014

SMU Press Releases

The Agency for Science, Technology and Research (A*STAR) and the Singapore Management University (SMU) will establish a Centre for Technology and Social-Behavioural Insights (CTSBI) to tap on high performance computing technology, big data analytics and behavioural sciences to study people-centric issues and human behaviour including how people think, feel and act in different settings. Such information can be used to enhance planning and address issues in different areas such as retail, logistics, urban planning, education and community development.


On Modeling Brand Preferences In Item Adoptions, Minh Duc Luu, Ee Peng Lim, Freddy Chong-Tat Chua Jun 2014

Research Collection School Of Computing and Information Systems

In marketing and advertising, developing and managing brand value are core activities performed by companies. Successful brands attract buyers and adopters, which in turn increases the companies’ value. Given a set of user-item adoption data, can we infer brand effects from users adopting items? To answer this question, we develop the Brand Item Topic Model (BITM) that incorporates users’ brand preferences in the process of item adoption by the users. We evaluate our model using synthetic and two real world datasets against baseline models which do not consider brand effects. The results show that BITM can determine users who demonstrate brand preferences and predict item adoptions more accurately.


Hydra: Large-Scale Social Identity Linkage Via Heterogeneous Behavior Modeling, Siyuan Liu, Shuhui Wang, Feida Zhu, Jinbo Zhang, Ramayya Krishnan Jun 2014

Research Collection School Of Computing and Information Systems

We study the problem of large-scale social identity linkage across different social media platforms, which is of critical importance to business intelligence by gaining from social data a deeper understanding and more accurate profiling of users. This paper proposes HYDRA, a solution framework which consists of three key steps: (I) modeling heterogeneous behavior by long-term behavior distribution analysis and multi-resolution temporal information matching; (II) constructing structural consistency graph to measure the high-order structure consistency on users' core social structures across different platforms; and (III) learning the mapping function by multi-objective optimization composed of both the supervised learning on pair-wise ID …


Optimal Performance Trade-Offs In MAC For Wireless Sensor Networks Powered By Heterogeneous Ambient Energy Harvesting, Jin Yunye, Hwee-Pink Tan Jun 2014

Research Collection School Of Computing and Information Systems

In wireless sensor networks powered by ambient energy harvesting (WSNs-HEAP), sensor nodes' energy harvesting rates are spatially heterogeneous and temporally variant, which imposes difficulties for medium access control (MAC). In this paper, we first derive the necessary conditions under which channel utilization and fairness are optimal in a WSN-HEAP, respectively. Based on the analysis, we propose an earliest deadline first (EDF) polling MAC protocol, which regulates the transmission sequence of the sensor nodes based on the spatially heterogeneous energy harvesting rates. It also mitigates temporal variations in energy harvesting rates by a prediction and update mechanism. Simulation results verify the performance …


SEWordSim: Software-Specific Word Similarity Database, Yuan Tian, David Lo, Julia Lawall Jun 2014

Research Collection School Of Computing and Information Systems

Measuring the similarity of words is important in accurately representing and comparing documents, and thus improves the results of many natural language processing (NLP) tasks. The NLP community has proposed various measurements based on WordNet, a lexical database that contains relationships between many pairs of words. Recently, a number of techniques have been proposed to address software engineering issues such as code search and fault localization that require understanding natural language documents, and a measure of word similarity could improve their results. However, WordNet only contains information about word senses in general-purpose conversation, which often differ from word senses in …


Placing Videos On A Semantic Hierarchy For Search Result Navigation, Song Tan, Yu-Gang Jiang, Chong-Wah Ngo Jun 2014

Research Collection School Of Computing and Information Systems

Organizing video search results in a list view is widely adopted by current commercial search engines, which cannot support efficient browsing for complex search topics that have multiple semantic facets. In this article, we propose to organize video search results in a highly structured way. Specifically, videos are placed on a semantic hierarchy that accurately organizes various facets of a given search topic. To pick the most suitable videos for each node of the hierarchy, we define and utilize three important criteria: relevance, uniqueness, and diversity. Extensive evaluations on a large YouTube video dataset demonstrate the effectiveness of our approach.


Learning Euclidean-To-Riemannian Metric For Point-To-Set Classification, Zhiwu Huang, R. Wang, S. Shan, X. Chen Jun 2014

Research Collection School Of Computing and Information Systems

In this paper, we focus on the problem of point-to-set classification, where single points are matched against sets of correlated points. Since the points commonly lie in Euclidean space while the sets are typically modeled as elements on Riemannian manifold, they can be treated as Euclidean points and Riemannian points respectively. To learn a metric between the heterogeneous points, we propose a novel Euclidean-to-Riemannian metric learning framework. Specifically, by exploiting typical Riemannian metrics, the Riemannian manifold is first embedded into a high dimensional Hilbert space to reduce the gaps between the heterogeneous spaces and meanwhile respect the Riemannian geometry of …


Version History, Similar Report, And Structure: Putting Them Together For Improved Bug Localization, Shaowei Wang, David Lo Jun 2014

Research Collection School Of Computing and Information Systems

During the evolution of a software system, a large number of bug reports are submitted. Locating the source code files that need to be fixed to resolve the bugs is a challenging problem. Thus, there is a need for a technique that can automatically identify these buggy files. A number of bug localization solutions that take in a bug report and output a ranked list of files sorted by their likelihood of being buggy have been proposed in the literature. However, the accuracy of these tools still needs to be improved. In this paper, to address this need, …


Air Indexing For On-Demand XML Data Broadcast, Weiwei Sun, Rongrui Qin, Jinjin Wu, Baihua Zheng Jun 2014

Research Collection School Of Computing and Information Systems

XML data broadcast is an efficient way to disseminate semi-structured information in wireless mobile environments. In this paper, we propose a novel two-tier index structure to facilitate access to XML documents in an on-demand broadcast system. It provides the clients with an overall image of all the XML documents available at the server side and hence enables the clients to locate complete result sets accordingly. A pruning strategy is developed to cut down the index size and a two-tier structure is proposed to further remove any redundant information. In addition, two index distribution strategies, namely naive distribution and partial …


Self-Organizing Neural Networks Integrating Domain Knowledge And Reinforcement Learning, Teck-Hou Teng, Ah-Hwee Tan, Jacek M. Zurada Jun 2014

Research Collection School Of Computing and Information Systems

The use of domain knowledge in learning systems is expected to improve learning efficiency and reduce model complexity. However, due to its incompatibility with the knowledge structure of the learning systems and the real-time exploratory nature of reinforcement learning (RL), domain knowledge cannot be inserted directly. In this paper, we show how self-organizing neural networks designed for online and incremental adaptation can integrate domain knowledge and RL. Specifically, symbol-based domain knowledge is translated into numeric patterns before being inserted into the self-organizing neural networks. To ensure effective use of domain knowledge, we present an analysis of how the inserted knowledge is used by …


Interactive Two-Sided Transparent Displays: Designing For Collaboration, Jiannan Li, Saul Greenberg, Ehud Sharlin, Joaquim Jorge Jun 2014

Research Collection School Of Computing and Information Systems

Transparent displays can serve as an important collaborative medium supporting face-to-face interactions over a shared visual work surface. Such displays enhance workspace awareness: when a person is working on one side of a transparent display, the person on the other side can see the other's body, hand gestures, gaze and what he or she is actually manipulating on the shared screen. Even so, we argue that designing such transparent displays must go beyond current offerings if it is to support collaboration. First, both sides of the display must accept interactive input, preferably by at least touch and / or pen, …


Constructive Visualization, Samuel Huron, Sheelagh Carpendale, Alice Thudt, Anthony Tang, Michael Mauerer Jun 2014

Research Collection School Of Computing and Information Systems

If visualization is to be democratized, we need to provide means for non-experts to create visualizations that allow them to engage directly with datasets. We present constructive visualization, a new paradigm for the simple creation of flexible, dynamic visualizations. Constructive visualization is simple, in that the skills required to build and manipulate the visualizations are akin to kindergarten play; it is expressive, in that one can build within the constraints of the chosen environment; and it also supports dynamics, in that these constructed visualizations can be rebuilt and adjusted. We describe the conceptual components and processes underlying constructive visualization, and …


Socio-Physical Analytics: Challenges & Opportunities, Archan Misra, Kasthuri Jayarajah, Shriguru Nayak, Philips Kokoh Prasetyo, Ee-Peng Lim Jun 2014

Research Collection School Of Computing and Information Systems

In this paper, we argue for expanded research into an area called Socio-Physical Analytics, that focuses on combining the behavioral insight gained from mobile-sensing based monitoring of physical behavior with the inter-personal relationships and preferences deduced from online social networks. We highlight some of the research challenges in combining these heterogeneous data sources and then describe some examples of our ongoing work (based on real-world data being collected at SMU) that illustrate two aspects of socio-physical analytics: (a) how additional demographic and online analytics based attributes can potentially provide better insights into the preferences and behaviors of individuals or groups …


Global Immutable Region Computation, Jilian Zhang, Kyriakos Mouratidis, Hwee Hwa Pang Jun 2014

Research Collection School Of Computing and Information Systems

A top-k query shortlists the k records in a dataset that best match the user's preferences. To indicate her preferences, the user typically determines a numeric weight for each data dimension (i.e., attribute). We refer to these weights collectively as the query vector. Based on this vector, each data record is implicitly mapped to a score value (via a weighted sum function). The records with the k largest scores are reported as the result. In this paper we propose an auxiliary feature to standard top-k query processing. Specifically, we compute the maximal locus within which the query vector incurs no …
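The basic top-k query described in the abstract above (score each record by a weighted sum, then report the k highest) can be sketched as follows. The function name `top_k` and the sample records are hypothetical, and the sketch covers only standard top-k scoring, not the immutable-region computation the paper proposes.

```python
import heapq


def top_k(records, weights, k):
    """Return indices of the k records with the largest weighted-sum scores."""
    def score(record):
        # Weighted sum of the record's attribute values under the query vector.
        return sum(w * x for w, x in zip(weights, record))

    scored = [(score(r), i) for i, r in enumerate(records)]
    # heapq.nlargest returns the entries in descending score order.
    return [i for _, i in heapq.nlargest(k, scored)]


# Three records with two attributes each; the query vector shifts the ranking.
records = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9)]
favor_first = top_k(records, (0.8, 0.2), 2)   # → [0, 1]
favor_second = top_k(records, (0.2, 0.8), 2)  # → [2, 1]
```

Small changes to the weights often leave the reported result unchanged, which is the observation behind asking for the maximal region of query vectors that keeps the result the same.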


Evolving An Information Systems Capstone Course To Align With The Fast Changing Singapore Marketplace, Chris Boesch, Benjamin Kok Siew Gan Jun 2014

Research Collection School Of Computing and Information Systems

Every year, around fifty-five undergraduate teams of four to six students are required to complete a capstone course for the School of Information Systems at Singapore Management University. Each team spends approximately five months working with an industry sponsor using the latest tools and techniques. Students actively learn by implementing the system to solve a real world problem. In addition to delivering value to the local sponsor, our students learn specialized skills currently needed in the marketplace, which might not yet be incorporated into electives and core courses. In this paper, we discuss the tradeoffs of providing students and project …


Graph-Based Semi-Supervised Learning: Realizing Pointwise Smoothness Probabilistically, Yuan Fang, Kevin Chen-Chuan Chang, Hady W. Lauw Jun 2014

Research Collection School Of Computing and Information Systems

As the central notion in semi-supervised learning, smoothness is often realized on a graph representation of the data. In this paper, we study two complementary dimensions of smoothness: its pointwise nature and probabilistic modeling. While no existing graph-based work exploits them in conjunction, we encompass both in a novel framework of Probabilistic Graph-based Pointwise Smoothness (PGP), building upon two foundational models of data closeness and label coupling. This new form of smoothness axiomatizes a set of probability constraints, which ultimately enables class prediction. Theoretically, we provide an error and robustness analysis of PGP. Empirically, we conduct extensive experiments to show …