Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Indian Statistical Institute

1994

Statistics

Full-Text Articles in Physical Sciences and Mathematics

Some Limit Theorem On Conditional U-Statistics And Censored Data Non Parametric Regression., Arusharka Sen Dr. Feb 1994

Doctoral Theses

In statistics, a classical problem is that of estimating the regression function, defined as $m(t) := E(Y \mid X = t)$, $t \in \mathbb{R}$, for two random variables $X$ and $Y$ such that $E|Y| < \infty$. The estimators are constructed based on a sample $\{(X_i, Y_i)\}$, $1 \le i \le n$, $n \ge 1$, from the distribution of $(X, Y)$. Throughout this thesis, we assume $X$ and $Y$ to be real-valued for the sake of convenience. The classical approach to this problem is to assume a parametrized, polynomial form for $m(\cdot)$, i.e., $m(t) := \beta_0 + \sum_{j=1}^{p} \beta_j t^j$, $p \ge 1$, and obtain estimates of the unknown parameters $\beta_0, \beta_j$, $1 \le j \le p$.

Later, with the development of techniques for non-parametric density estimation, it was sought to extend these techniques to regression estimation. Heuristically, the two problems can be seen to be related as follows: let $f_1(\cdot)$ be the marginal density of $X$ and note that

$$E\,1(X \le x) = \int_{-\infty}^{x} f_1(t)\,dt, \quad x \in \mathbb{R}, \qquad (1.0.1)$$

whereas

$$E\,Y\,1(X \le x) = \int_{-\infty}^{x} m(t) f_1(t)\,dt, \quad x \in \mathbb{R}. \qquad (1.0.2)$$

In other words, (1.0.1) can be looked upon as a special case of (1.0.2), with $Y \equiv 1$. (This similarity, as we shall see later on, has been the underlying theme in Chapters 2 and 4 of the present work.)

The following non-parametric regression estimator was proposed independently by Nadaraya (1964) and Watson (1964):

$$m_n^{NW}(t) := m_n(Y, t) / m_n(1, t), \quad t \in \mathbb{R}, \qquad (1.0.3)$$

where

$$m_n(Y, t) = (n a_n)^{-1} \sum_{i=1}^{n} Y_i K((t - X_i)/a_n), \qquad m_n(1, t) = (n a_n)^{-1} \sum_{i=1}^{n} K((t - X_i)/a_n). \qquad (1.0.4)$$

Here $K(\cdot)$, the so-called kernel function, is chosen to satisfy various analytical conditions (typically, $K(\cdot)$ is taken to be a density function), and $a_n \downarrow 0$ are the bandwidths, which go to zero sufficiently slowly (e.g., $n a_n \to \infty$ as $n \to \infty$) in order to ensure consistency of the estimator $m_n^{NW}(\cdot)$. The intuition behind such an estimator is that $m_n(Y, \cdot)$ is an estimator of $m(\cdot) f_1(\cdot)$, while $m_n(1, \cdot)$ estimates the density $f_1(\cdot)$. See Prakasa Rao (1983), Chapters 1-4, for an introduction to non-parametric density and regression estimation.

Now, $m(t)$ is a functional of the conditional distribution of $Y$, given $X = t$.
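As an aside, the ratio form of the Nadaraya-Watson estimator in (1.0.3)-(1.0.4) can be illustrated numerically. The following is only a minimal sketch, not taken from the thesis: the Gaussian kernel, the bandwidth choice $a_n = n^{-1/5}$, the synthetic data, and the function names `gaussian_kernel` and `nadaraya_watson` are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal density; one common (assumed) choice of kernel K."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def nadaraya_watson(t, x, y, a_n, kernel=gaussian_kernel):
    """Evaluate m_n(Y, t) / m_n(1, t) at each point of t, as in (1.0.3)-(1.0.4)."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    # weights[j, i] = K((t_j - X_i) / a_n)
    weights = kernel((t[:, None] - x[None, :]) / a_n)
    m_y = weights @ y / (n * a_n)          # m_n(Y, t): estimates m(t) * f_1(t)
    m_1 = weights.sum(axis=1) / (n * a_n)  # m_n(1, t): estimates f_1(t)
    return m_y / m_1


# Hypothetical synthetic data: m(t) = sin(t), X uniform on [-2, 2].
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-2.0, 2.0, size=n)
y = np.sin(x) + 0.1 * rng.standard_normal(n)

a_n = n ** (-1 / 5)  # assumed bandwidth: a_n -> 0 while n * a_n -> infinity
estimate = nadaraya_watson([0.0, 1.0], x, y, a_n)
```

The factor $(n a_n)^{-1}$ cancels in the ratio, but it is kept here so that `m_y` and `m_1` match the two quantities in (1.0.4) separately.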
A natural generalisation of the regression estimation problem seems to be the estimation of the following functionals:

$$m^h(t_1, \ldots, t_k) := E\{h(Y_1, \ldots, Y_k) \mid X_1 = t_1, \ldots, X_k = t_k\}, \quad (t_1, \ldots, t_k) \in \mathbb{R}^k, \ k \ge 1, \qquad (1.0.5)$$

where $h: \mathbb{R}^k \to \mathbb{R}$ is such that $E|h(Y_1, \ldots, Y_k)| < \infty$. A similar generalisation led Hoeffding (1948) from the sample mean to the theory of so-called U-statistics, in the unconditional set-up. The estimation of (1.0.5) was considered, for the first time in published form, in Stute (1991), where the following conditional U-statistics were proposed as estimators: [...] where $F_n(\cdot) := n^{-1} \sum_{i=1}^{n} 1(X_i \le \cdot)$ denotes the empirical distribution function (e.d.f.). [...] Bochynek discussed the asymptotic normality of conditional U- and V-statistics and performed simulation studies on them. Stute (1991) established weak and strong pointwise consistency and asymptotic normality of $U_n(t)$. Liero (1991) studied uniform strong consistency of conditional U-statistics and established asymptotic normality of the integrated squared error (ISE) statistic: [...] for suitable $A \subset \mathbb{R}^k$ and weight function $w(\cdot)$.

We quote the following examples to illustrate the possible use of conditional U-statistics. See Stute (1991) and Bochynek (1987) for other examples.

Throughout this thesis, our set-up will be as follows: $\{(X_n, Y_n)\}_{n \ge 1}$ is a bivariate i.i.d. sequence, with $(X_1, Y_1)$ having joint density $f(\cdot, \cdot)$ and $X_1$ having marginal density $f_1(\cdot)$. Consequently,