Open Access. Powered by Scholars. Published by Universities.®

Operations Research, Systems Engineering and Industrial Engineering Commons

Articles 1 - 12 of 12

Full-Text Articles in Operations Research, Systems Engineering and Industrial Engineering

On Step Sizes, Stochastic Shortest Paths, And Survival Probabilities In Reinforcement Learning, Abhijit Gosavi Dec 2008

Engineering Management and Systems Engineering Faculty Research & Creative Works

Reinforcement learning (RL) is a simulation-based technique useful in solving Markov decision processes if their transition probabilities are not easily obtainable or if the problems have a very large number of states. We present an empirical study of (i) the effect of step sizes (learning rules) on the convergence of RL algorithms, (ii) stochastic shortest paths in solving average reward problems via RL, and (iii) the notion of survival probabilities (downside risk) in RL. We also study the impact of step sizes when function approximation is combined with RL. Our experiments yield some interesting insights that will be useful in practice …
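
The full text is not reproduced here; as a rough illustration of the step-size question the abstract raises, the sketch below compares a constant step size with a 1/k decaying step size in a tabular Q-learning update on a toy Markov decision process. The toy MDP, the exploration scheme, and the specific step-size rules are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Illustrative sketch only: compares two common step-size (learning-rate) rules
# in a tabular Q-learning update on a toy 2-state, 2-action MDP. The MDP and
# the specific rules are assumptions for illustration, not from the paper.

rng = np.random.default_rng(0)
n_states, n_actions, gamma = 2, 2, 0.95

# Toy transition and reward tables: P[s, a] -> next state, R[s, a] -> reward.
P = np.array([[0, 1], [1, 0]])
R = np.array([[1.0, 0.0], [0.0, 2.0]])

def run_q_learning(step_size_rule, n_steps=5000):
    Q = np.zeros((n_states, n_actions))
    visits = np.zeros((n_states, n_actions))
    s = 0
    for _ in range(n_steps):
        # Epsilon-greedy exploration (assumed scheme).
        a = rng.integers(n_actions) if rng.random() < 0.1 else int(Q[s].argmax())
        visits[s, a] += 1
        s_next, r = P[s, a], R[s, a]
        alpha = step_size_rule(visits[s, a])
        # Standard Q-learning update with the chosen step size.
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
    return Q

Q_constant = run_q_learning(lambda k: 0.1)       # constant step size
Q_decaying = run_q_learning(lambda k: 1.0 / k)   # 1/k decaying step size
print("constant alpha:\n", Q_constant)
print("1/k alpha:\n", Q_decaying)
```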


Reinforcement Learning Based Dual-Control Methodology For Complex Nonlinear Discrete-Time Systems With Application To Spark Engine Egr Operation, Peter Shih, Brian C. Kaul, Jagannathan Sarangapani, J. A. Drallmeier Aug 2008

Electrical and Computer Engineering Faculty Research & Creative Works

A novel reinforcement-learning-based dual-control adaptive neural network (NN) controller is developed to deliver a desired tracking performance for a class of complex feedback nonlinear discrete-time systems, which consists of a second-order nonlinear discrete-time system in nonstrict feedback form and an affine nonlinear discrete-time system, in the presence of bounded and unknown disturbances. For example, the exhaust gas recirculation (EGR) operation of a spark ignition (SI) engine is modeled using such a complex nonlinear discrete-time system. A dual-controller approach is undertaken in which a primary adaptive critic NN controller is designed for the nonstrict feedback nonlinear discrete-time system, whereas the secondary …


Online Reinforcement Learning-Based Neural Network Controller Design For Affine Nonlinear Discrete-Time Systems, Qinmin Yang, Jagannathan Sarangapani Jul 2007

Electrical and Computer Engineering Faculty Research & Creative Works

In this paper, a novel reinforcement learning neural network (NN)-based controller, referred to as an adaptive critic controller, is proposed for general multi-input and multi-output affine unknown nonlinear discrete-time systems in the presence of bounded disturbances. Adaptive critic designs consist of two entities: an action network that produces an optimal solution and a critic that evaluates the performance of the action network. The critic is termed adaptive as it adapts itself to output the optimal cost-to-go function, and the action network is adapted simultaneously based on the information from the critic. In our online learning method, one NN is designated as the …
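
As a rough sketch of the two-network structure this abstract describes (an action network producing the control and a critic estimating the cost-to-go, both tuned online), the example below uses linear-in-feature networks on an assumed scalar plant. The plant model, cost function, and update rules are illustrative assumptions, not the authors' design.

```python
import numpy as np

# Minimal actor-critic (adaptive-critic) sketch with linear-in-feature networks.
# The scalar plant, cost weights, and update rules below are illustrative
# assumptions, not the controller design from the paper.

rng = np.random.default_rng(1)
gamma, alpha_critic, alpha_actor = 0.9, 0.05, 0.02
centers = np.linspace(-2.0, 2.0, 9)                 # RBF feature centers

def features(x):
    return np.exp(-(x - centers) ** 2)              # Gaussian RBF features

def plant(x, u):
    return 0.8 * np.sin(x) + u                      # assumed affine plant x' = f(x) + u

w_critic = np.zeros_like(centers)                   # critic: J(x) ~= w_critic . features(x)
w_actor = np.zeros_like(centers)                    # action network: u(x) = w_actor . features(x)

x = 1.5
for k in range(3000):
    phi = features(x)
    u = float(w_actor @ phi)
    x_next = plant(x, u)
    cost = x ** 2 + 0.1 * u ** 2                    # one-step utility to be minimized

    # Critic: temporal-difference update of the cost-to-go estimate.
    td_error = cost + gamma * (w_critic @ features(x_next)) - (w_critic @ phi)
    w_critic += alpha_critic * td_error * phi

    # Action network: nudge the control to reduce the critic's one-step-ahead
    # estimate, using a finite-difference gradient in u (an assumption here).
    eps = 1e-3
    q = lambda uu: x ** 2 + 0.1 * uu ** 2 + gamma * (w_critic @ features(plant(x, uu)))
    dq_du = (q(u + eps) - q(u - eps)) / (2 * eps)
    w_actor -= alpha_actor * dq_du * phi

    x = x_next if abs(x_next) < 5 else rng.uniform(-2, 2)   # reset if the state drifts

print("final state:", x, "final control:", float(w_actor @ features(x)))
```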


Reinforcement Learning Based Output-Feedback Control Of Nonlinear Nonstrict Feedback Discrete-Time Systems With Application To Engines, Peter Shih, Jonathan B. Vance, Brian C. Kaul, Jagannathan Sarangapani, J. A. Drallmeier Jul 2007

Electrical and Computer Engineering Faculty Research & Creative Works

A novel reinforcement-learning-based output-adaptive neural network (NN) controller, also referred to as the adaptive-critic NN controller, is developed to track a desired trajectory for a class of complex nonlinear discrete-time systems in the presence of bounded and unknown disturbances. The controller includes an observer for estimating the states and outputs, a critic, and two action NNs for generating the virtual and actual control inputs. The critic approximates a certain strategic utility function, and the action NNs are used to minimize both the strategic utility function and their outputs. All NN weights adapt online toward minimization of a performance index, utilizing a gradient-descent-based rule. …
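
The element that distinguishes this entry from the previous one is the observer, which reconstructs unmeasured states from the outputs. As a stand-in for the paper's NN observer, the sketch below uses a classical Luenberger-style observer on an assumed linear discrete-time plant; the plant matrices and observer gain are illustrative assumptions only.

```python
import numpy as np

# Stand-in sketch for the observer's role: a Luenberger-style observer
# reconstructs the full state of an assumed linear discrete-time plant from
# its measured output. The paper uses an NN observer on a nonlinear system;
# the matrices and gain below are illustrative assumptions.

A = np.array([[1.0, 0.1],
              [0.0, 0.9]])          # assumed plant dynamics x[k+1] = A x[k] + B u[k]
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])          # only the first state is measured
L = np.array([[0.5], [0.8]])        # observer gain (assumed, gives stable error dynamics)

x = np.array([[1.0], [-1.0]])       # true (unmeasured) state
x_hat = np.zeros((2, 1))            # observer estimate

for k in range(50):
    u = np.array([[np.sin(0.1 * k)]])       # arbitrary test input
    y = C @ x                               # measured output
    # Observer update: a copy of the plant corrected by the output error.
    x_hat = A @ x_hat + B @ u + L @ (y - C @ x_hat)
    x = A @ x + B @ u

print("true state:", x.ravel(), "estimate:", x_hat.ravel())
```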


Online Reinforcement Learning Neural Network Controller Design For Nanomanipulation, Qinmin Yang, Jagannathan Sarangapani Jan 2007

Electrical and Computer Engineering Faculty Research & Creative Works

In this paper, a novel reinforcement learning neural network (NN)-based controller, referred to as an adaptive critic controller, is proposed for affine nonlinear discrete-time systems with applications to nanomanipulation. In the online NN reinforcement learning method, one NN is designated as the critic NN, which approximates the long-term cost function by assuming that the states of the nonlinear system are available for measurement. An action NN is employed to derive an optimal control signal to track a desired system trajectory while minimizing the cost function. Online weight tuning schemes for these two NNs are also derived. By using the Lyapunov approach, …


Neural Network-Based Output Feedback Controller For Lean Operation Of Spark Ignition Engines, Brian C. Kaul, Jagannathan Sarangapani, J. A. Drallmeier, Jonathan B. Vance, Pingan He Jan 2006

Electrical and Computer Engineering Faculty Research & Creative Works

Spark ignition (SI) engines running at very lean conditions demonstrate significant nonlinear behavior by exhibiting cycle-to-cycle dispersion of heat release, even though such operation can significantly reduce NOx emissions and improve fuel efficiency by as much as 5-10%. A suite of neural network (NN) controllers, with and without reinforcement learning, employing output feedback has shown the ability to reduce the nonlinear cyclic dispersion observed under lean operating conditions. The neural network controllers consist of three NNs: a) an NN observer to estimate the states of the engine, such as total fuel and air; b) a second NN for generating a virtual input; …


Forecasting Series-Based Stock Price Data Using Direct Reinforcement Learning, H. Li, Cihan H. Dagli, David Lee Enke Jan 2004

Engineering Management and Systems Engineering Faculty Research & Creative Works

A significant amount of work has been done in the area of price series forecasting using soft computing techniques, most of which are based upon supervised learning. Unfortunately, there has been evidence that such models suffer from fundamental drawbacks. Given that the short-term performance of the financial forecasting architecture can be immediately measured, it is possible to integrate reinforcement learning into such applications. In this paper, we present a novel hybrid view of a financial series and a critic-adaptation stock price forecasting architecture using direct reinforcement. A new utility function called the policies-matching ratio is also proposed. The need for the …
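
As a rough illustration of the "direct reinforcement" idea (adjusting the policy to directly maximize a performance measure over the series, with no value function), the sketch below fits a simple trading policy to a synthetic price series by gradient ascent on cumulative return. The synthetic data, the cumulative-return utility, and the finite-difference gradient are assumptions for illustration; the paper's policies-matching ratio utility is not reproduced here.

```python
import numpy as np

# Direct reinforcement sketch on a price series: the policy parameters are
# tuned to directly maximize a trading performance measure. Data, utility,
# and gradient scheme below are illustrative assumptions.

rng = np.random.default_rng(2)
prices = 100 + np.cumsum(rng.normal(0, 1, 500))     # synthetic price series
returns = np.diff(prices)

def positions(w, lookback=5):
    """Policy: position in [-1, 1] from a linear score of recent returns."""
    pos = np.zeros(len(returns))
    for t in range(lookback, len(returns)):
        pos[t] = np.tanh(w @ returns[t - lookback:t])
    return pos

def utility(w, cost=0.01):
    """Cumulative trading return net of transaction costs (the direct objective)."""
    pos = positions(w)
    pnl = pos[:-1] * returns[1:] - cost * np.abs(np.diff(pos))
    return pnl.sum()

w = np.zeros(5)
lr, eps = 1e-3, 1e-4
for step in range(200):
    grad = np.zeros_like(w)
    for i in range(len(w)):                          # finite-difference gradient
        d = np.zeros_like(w); d[i] = eps
        grad[i] = (utility(w + d) - utility(w - d)) / (2 * eps)
    w += lr * grad                                   # ascend the utility directly

print("learned weights:", w, "utility:", utility(w))
```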


An Enhanced Least-Squares Approach For Reinforcement Learning, Hailin Li, Cihan H. Dagli Jan 2003

Engineering Management and Systems Engineering Faculty Research & Creative Works

This paper presents an enhanced least-squares approach for solving reinforcement learning control problems. The model-free least-squares policy iteration (LSPI) method has been successfully used in this learning domain. Although LSPI is a promising algorithm that uses a linear approximator architecture to achieve policy optimization in the spirit of Q-learning, it faces challenging issues in terms of the selection of basis functions and training samples. Inspired by the orthogonal least-squares regression (OLSR) method for selecting the centers of an RBF neural network, we propose a new hybrid learning method. The suggested approach combines the LSPI algorithm with the OLSR strategy and uses simulation as a tool to …
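
As a minimal illustration of the LSPI core this abstract refers to, the sketch below repeats an LSTD-Q weight solve over a fixed batch of samples on a toy chain MDP with tabular features. The MDP and features are assumptions for illustration; the paper's OLSR-based basis selection is not shown.

```python
import numpy as np

# LSPI sketch: from a fixed batch of samples, the Q-function weights are
# obtained by solving a linear system (LSTD-Q), the greedy policy is
# re-derived, and the solve is repeated. Toy MDP and indicator features are
# assumptions for illustration.

rng = np.random.default_rng(3)
n_states, n_actions, gamma = 5, 2, 0.9

def step(s, a):
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward

def phi(s, a):
    """Indicator (tabular) features over state-action pairs."""
    v = np.zeros(n_states * n_actions)
    v[s * n_actions + a] = 1.0
    return v

# Collect a batch of random-policy samples once; LSPI reuses the same batch.
samples, s = [], 0
for _ in range(2000):
    a = int(rng.integers(n_actions))
    s_next, r = step(s, a)
    samples.append((s, a, r, s_next))
    s = s_next

w = np.zeros(n_states * n_actions)
for _ in range(10):                                   # policy-iteration loop
    A = 1e-6 * np.eye(len(w))                         # small ridge term for invertibility
    b = np.zeros(len(w))
    for s, a, r, s_next in samples:
        a_greedy = int(np.argmax([phi(s_next, aa) @ w for aa in range(n_actions)]))
        f = phi(s, a)
        A += np.outer(f, f - gamma * phi(s_next, a_greedy))
        b += f * r
    w = np.linalg.solve(A, b)                         # LSTD-Q weight solve

policy = [int(np.argmax([phi(s, a) @ w for a in range(n_actions)])) for s in range(n_states)]
print("greedy policy per state:", policy)             # expected: move right (action 1)
```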


Combining Evolving Neural Network Classifiers Using Bagging, Sunghwan Sohn, Cihan H. Dagli Jan 2003

Engineering Management and Systems Engineering Faculty Research & Creative Works

The performance of a neural network classifier depends significantly on its architecture and generalization. It is usual to find the proper architecture by trial and error. This is time-consuming and may not always find the optimal network. For this reason, we apply genetic algorithms to the automatic generation of neural networks. Many researchers have shown that combining multiple classifiers improves generalization. One of the most effective combining methods is bagging. In bagging, training sets are selected by resampling from the original training set, and classifiers trained with these sets are combined by voting. We implement the bagging technique into …
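
A minimal bagging sketch is given below: each member is trained on a bootstrap resample of the training set and the ensemble predicts by majority vote. The dataset, network sizes, and ensemble size are assumptions for illustration; the genetic-algorithm architecture search described in the abstract is not shown.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Bagging sketch: bootstrap resamples + majority vote over NN classifiers.
# Dataset, network size, and ensemble size are illustrative assumptions.

rng = np.random.default_rng(4)
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_train, y_train = X[:400], y[:400]
X_test, y_test = X[400:], y[400:]

ensemble = []
for _ in range(7):
    idx = rng.integers(0, len(X_train), size=len(X_train))   # bootstrap resample
    clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000, random_state=0)
    clf.fit(X_train[idx], y_train[idx])
    ensemble.append(clf)

# Majority vote across the ensemble members.
votes = np.stack([clf.predict(X_test) for clf in ensemble])
y_pred = (votes.mean(axis=0) >= 0.5).astype(int)
print("bagged accuracy:", (y_pred == y_test).mean())
```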


Adaptive Critic-Based Neural Network Controller For Uncertain Nonlinear Systems With Unknown Deadzones, Pingan He, Jagannathan Sarangapani, S. N. Balakrishnan Jan 2002

Electrical and Computer Engineering Faculty Research & Creative Works

A multilayer neural network (NN) controller in discrete time is designed to deliver a desired tracking performance for a class of nonlinear systems with input deadzones. This multilayer NN controller has an adaptive critic NN architecture, with two NNs for compensating the deadzone nonlinearity and a third NN for approximating the dynamics of the nonlinear system. A reinforcement learning scheme in discrete time is proposed for the adaptive critic NN deadzone compensator, where the learning is performed based on a certain performance measure, which is supplied by a critic. The adaptive generating NN rejects the errors induced by the deadzone, whereas a …
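
For reference, the sketch below shows a deadzone nonlinearity and its ideal inverse, which is the mapping such a compensator has to approximate. The deadzone widths and test commands are assumptions for illustration; the paper learns the compensation online with NNs rather than assuming known widths.

```python
import numpy as np

# Deadzone nonlinearity and its ideal inverse (the target of the compensator).
# Widths and test signal are illustrative assumptions.

def deadzone(u, d_minus=-0.3, d_plus=0.5):
    """Actuator deadzone: inputs inside [d_minus, d_plus] produce no output."""
    if u > d_plus:
        return u - d_plus
    if u < d_minus:
        return u - d_minus
    return 0.0

def inverse_deadzone(v, d_minus=-0.3, d_plus=0.5):
    """Ideal pre-compensation: shift the command past the deadzone edges."""
    if v > 0:
        return v + d_plus
    if v < 0:
        return v + d_minus
    return 0.0

for v in np.linspace(-1, 1, 9):
    out_raw = deadzone(v)
    out_comp = deadzone(inverse_deadzone(v))
    print(f"command {v:+.2f}: uncompensated {out_raw:+.2f}, compensated {out_comp:+.2f}")
```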


Using A Neuro-Fuzzy-Genetic Data Mining Architecture To Determine A Marketing Strategy In A Charitable Organization's Donor Database, Korakot Hemsathapat, Cihan H. Dagli, David Lee Enke Jan 2001

Engineering Management and Systems Engineering Faculty Research & Creative Works

This paper describes the use of a neuro-fuzzy-genetic data mining architecture for finding hidden knowledge and modeling the data of the 1997 donation campaign of an American charitable organization. These data were used during the 1998 KDD Cup competition. In the architecture, all input variables are first preprocessed, and all continuous variables are fuzzified. Principal component analysis (PCA) is then applied to reduce the dimensions of the input variables by finding combinations of variables, or factors, that describe the major trends in the data. The reduced dimensions of the input variables are then used to train probabilistic neural networks (PNN) to …
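
As a rough sketch of two of the stages mentioned in the abstract, the example below applies PCA to reduce the input dimensions and then classifies the reduced factors with a probabilistic neural network implemented as a Parzen-window classifier. The dataset, number of retained components, and kernel width are assumptions for illustration; the fuzzification and genetic stages are not shown.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA

# PCA dimension reduction followed by a probabilistic neural network (PNN),
# here implemented as a Parzen-window classifier. Dataset, component count,
# and kernel width are illustrative assumptions.

X, y = make_classification(n_samples=400, n_features=20, n_informative=5, random_state=0)
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

pca = PCA(n_components=5).fit(X_train)               # keep the major factors
Z_train, Z_test = pca.transform(X_train), pca.transform(X_test)

def pnn_predict(z, Z_train, y_train, sigma=0.5):
    """PNN: pick the class whose training points give the largest summed Gaussian kernel."""
    scores = []
    for c in np.unique(y_train):
        diffs = Z_train[y_train == c] - z
        scores.append(np.exp(-(diffs ** 2).sum(axis=1) / (2 * sigma ** 2)).sum())
    return np.unique(y_train)[int(np.argmax(scores))]

y_pred = np.array([pnn_predict(z, Z_train, y_train) for z in Z_test])
print("PNN accuracy on PCA factors:", (y_pred == y_test).mean())
```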


Derivation Of Fuzzy Membership Functions Using One-Dimensional Self-Organizing Maps, Thomas E. Sandidge, Cihan H. Dagli Jan 1997

Engineering Management and Systems Engineering Faculty Research & Creative Works

This paper discusses a system of self-organizing maps that approximates the fuzzy membership functions for an arbitrary number of fuzzy classes. This is done through the ordering and clustering properties of one-dimensional self-organizing maps and the iterative approximation of the conditional probabilities that a node in one map is the winner given that a node in the other map is the winner. Application of this system reduces fuzzy membership function design time to that required to train the system of self-organizing maps.
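
A minimal sketch of a one-dimensional self-organizing map is given below, illustrating the ordering and clustering property this approach relies on: after training, the node weights are ordered along the data range and gather around dense regions. The data distribution, map size, and learning schedule are assumptions for illustration; the two-map conditional-probability step is not shown.

```python
import numpy as np

# One-dimensional self-organizing map sketch: node weights become ordered and
# cluster around dense regions of the data. Data distribution, map size, and
# learning schedule are illustrative assumptions.

rng = np.random.default_rng(5)
# Scalar data drawn from three overlapping clusters (assumed example data).
data = np.concatenate([rng.normal(0.2, 0.05, 300),
                       rng.normal(0.5, 0.05, 300),
                       rng.normal(0.8, 0.05, 300)])

n_nodes = 9
weights = rng.uniform(0, 1, n_nodes)                 # 1-D map of node weights

for epoch in range(30):
    lr = 0.5 * (1 - epoch / 30)                      # decaying learning rate
    radius = max(1.0, 3.0 * (1 - epoch / 30))        # shrinking neighborhood
    for x in rng.permutation(data):
        winner = int(np.argmin(np.abs(weights - x)))
        # Neighborhood update: nodes near the winner (in map index) move toward x.
        dist = np.abs(np.arange(n_nodes) - winner)
        h = np.exp(-(dist ** 2) / (2 * radius ** 2))
        weights += lr * h * (x - weights)

print("ordered node weights:", np.round(np.sort(weights), 3))
```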