
From: "Verkuilen, Jay" <JVerkuilen@gc.cuny.edu>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: logistic tranformation, proportion variables
Date: Thu, 13 Dec 2007 15:33:15 -0500

Marck Bulter wrote:

<<I have a question that is not entirely related to Stata. Do hope that you forgive me. Assume the following model,

*ivreg* pstrmon price maturity age coupon pstrmonprev pstrprev intrest ivol compl (precmon = precmonprev)

Where pstrmon, pstrmonprev, precmon and precmonprev are all proportions.>>

Some points:

(1) There is a very large literature in geostatistics on proportions data, summarized in the excellent book by John Aitchison, Compositional Data Analysis. The zeros problem is, of course, a real issue, and this literature has dealt with it to some degree. There are a few articles by econometricians that might be of use to you. The references are in the 2003 edition of Aitchison's book (which was originally published in 1986).

http://en.wikipedia.org/wiki/Compositional_data

(Shameless self-promotion: The next two points involve work I'm currently engaged in.)

(2) Michael Smithson of ANU and I published an article last year in Psychological Methods (the APA's methodology journal) on using beta regression for these kinds of data. It can be found here:

http://psychology.anu.edu.au/people/smithson/details/betareg/betareg.html

We talk about the zeros problem quite a bit, though see below. Independently, Maarten Buis wrote some Stata software that estimates this model. If you need mixed model analysis, contact me: Mike and I have worked out the details and, indeed, use as a test case some data very much like your own, which Mike got from an economist friend of his at ANU. We are in the process of writing this paper up, but it's really not ready for readers or I'd send it to you.

(3) A friend from grad school, Clint Stober, and I are in the middle of writing a paper on the zeros problem in bounded data. It turns out that, depending on the distribution you use, the estimation problem can be affected horribly or hardly at all by what you do with the boundary observations. The actual characterization of distributions depends on a bunch of differential geometry which, fortunately, my friend understands very well and of which I have only a rudimentary "statistician's grasp". Essentially it comes down to characteristics of the log-likelihood function near the boundaries of the sample space, in particular the nature of its derivatives. In some cases, replacing an exact 0 with epsilon has little effect on the estimation; in other cases it causes major damage to standard errors and coefficient estimates. This can happen with only a few cases out of several hundred observations. Unfortunately, transformations of the normal distribution are badly affected by this phenomenon. I have a real example with six exact 0 observations out of about 300 cases, where analysis by the lognormal distribution fails horribly and analysis by the gamma distribution is not affected at all.

Someone else in the thread already noted that an exact 0 or 1 may be qualitatively different from epsilon. If this is the case, the analysis I just mentioned does not apply. It applies only to the (very common) case in which a 0 is due to limitations of the measurement device, rounding error, or the like.

If you have a good dataset involving this kind of problem, I would definitely be interested!

Jay

--
J. Verkuilen
Assistant Professor of Educational Psychology
City University of New York-Graduate Center
365 Fifth Ave.
New York, NY 10016
Email: jverkuilen@gc.cuny.edu
Office: (212) 817-8286
FAX: (212) 817-1516
Cell: (217) 390-4609

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
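The contrast in point (3) between the lognormal and the gamma can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the analysis or the dataset mentioned in the post: the data are simulated (294 gamma draws plus six exact zeros, mimicking the "six out of about 300" description), the function names are made up for illustration, and Python is used rather than Stata so the example is self-contained. For the gamma it tracks only the mean, whose maximum likelihood estimate is the sample mean whatever the shape parameter; it says nothing about how the shape estimate behaves.

```python
import math
import random


def lognormal_mle(ys):
    """MLE of (mu, sigma) for a lognormal sample: mean and SD of log(y)."""
    logs = [math.log(y) for y in ys]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma


def gamma_mean_mle(ys):
    """MLE of the gamma mean: the sample mean, for any shape parameter."""
    return sum(ys) / len(ys)


def replace_zeros(ys, eps):
    """Replace exact zeros with a small positive constant eps."""
    return [eps if y == 0.0 else y for y in ys]


# Simulated data (an assumption for illustration): 294 positive draws
# from a gamma with shape 2 and scale 1.5, plus six exact zeros.
random.seed(1)
data = [random.gammavariate(2.0, 1.5) for _ in range(294)] + [0.0] * 6

results = {}
for eps in (1e-2, 1e-4, 1e-8):
    filled = replace_zeros(data, eps)
    mu, sigma = lognormal_mle(filled)
    mean = gamma_mean_mle(filled)
    results[eps] = (mu, sigma, mean)
    print(f"eps={eps:g}  lognormal mu={mu:.3f} sigma={sigma:.3f}  "
          f"gamma mean={mean:.4f}")
```

The lognormal estimates depend on log(epsilon), which diverges as epsilon shrinks, so both mu and sigma drift with the arbitrary choice of epsilon. The gamma mean depends on the observations themselves, so six values of size epsilon among 300 barely move it.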

**Follow-Ups**:
- **Re: st: RE: logistic tranformation, proportion variables** *From:* Richard Goldstein <richgold@ix.netcom.com>
- **st: RE: RE: logistic tranformation, proportion variables** *From:* "Verkuilen, Jay" <JVerkuilen@gc.cuny.edu>

**References**:
- **st: logistic tranformation, proportion variables** *From:* Marck Bulter <177316mb@student.eur.nl>

