Thursday, April 26, 2012


I found out on Monday that my project, "Empirical Likelihood in Statsmodels," was accepted for GSoC. It was great news for me and I am eager to get started.

Submitting the patch as part of the application proved to be very helpful. Spending a couple of days just reading the current code really helped me determine exactly how I will organize my project. Going through the code also drilled in how classes work in Python, and I learned what class inheritance is and how to use it.

This week I will be in contact with my mentor(s) to discuss more of the structure of my project. I also hope to use the next couple of days to get more familiar with GitHub. I want to make sure that I have all the essentials tightened up before the official coding period begins.

Wednesday, April 4, 2012

It begins...kind of

As the application period comes to an end, I realize that I have put as much thought into my application as I would into a hefty class assignment. In fact, I would say that just putting the application together helped me better understand the material that I hope to code for statsmodels this summer, and taught me as much as, if not more than, a typical class project does.

Briefly, I proposed to implement empirical likelihood estimation in Python. Empirical likelihood is a nonparametric method of estimation that gives observed data a very loud voice. I am particularly drawn to this, and to nonparametric statistics in general, for two main reasons.

First, it frees the researcher from many of the distributional assumptions that are typically found in standard econometrics and statistics textbooks. Although there are countless reasons to remove these assumptions, one example is predicting movements in stock prices. Many classical models rely on assumptions of normality (or Brownian motion) when forecasting stock prices or pricing options. However, it has been shown that stock returns follow a distribution with heavy tails. Ignoring these fat tails (higher probabilities of large movements) can be problematic for researchers and practitioners alike.
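To make this first point a bit more concrete, here is a minimal sketch of empirical likelihood for a single mean, applied to simulated heavy-tailed data. It is only an illustration under my own assumptions, not the statsmodels API: the function name `el_ratio_mean`, the Student-t sample, and the seed are all made up for the example. The idea is to put a weight on each observation, maximize the product of the weights subject to the weighted mean equaling the hypothesized value, and compare the result to equal weights. No distribution for the data is ever specified.

```python
import numpy as np
from scipy import optimize, stats


def el_ratio_mean(x, mu):
    """Hypothetical helper: -2 log empirical likelihood ratio for H0: E[x] = mu.

    Returns the test statistic and the implied weights on each observation.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    z = x - mu
    # The constrained problem only has a solution if mu is inside the data range.
    if z.min() >= 0 or z.max() <= 0:
        raise ValueError("mu must lie strictly inside the range of the data")

    # First-order condition for the Lagrange multiplier lam.
    def score(lam):
        return np.sum(z / (1.0 + lam * z))

    # Bounds that keep every weight in (0, 1]; score is decreasing on this interval.
    lo = (1.0 / n - 1.0) / z.max()
    hi = (1.0 / n - 1.0) / z.min()
    lam = optimize.brentq(score, lo, hi)

    weights = 1.0 / (n * (1.0 + lam * z))
    stat = 2.0 * np.sum(np.log1p(lam * z))
    return stat, weights


# Simulated heavy-tailed data (Student-t with 3 degrees of freedom), true mean 0.
rng = np.random.default_rng(0)
sample = rng.standard_t(df=3, size=200)

stat, weights = el_ratio_mean(sample, mu=0.0)
# Under H0 the statistic is asymptotically chi-squared with 1 degree of freedom.
p_value = stats.chi2.sf(stat, df=1)
print(stat, p_value)
```

The weights sum to one and reproduce the hypothesized mean exactly, so the test statistic measures how far the data have to be "re-weighted" away from equal weights to agree with the hypothesis.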

The second reason I am attracted to empirical likelihood and nonparametric statistics is more pragmatic (or lazy, depending on how you look at it). Statisticians spend plenty of resources deriving complicated analytical solutions that only apply "in the limit" and can often be very misleading when used in finite samples by people who are unsure of their worth. While these are sometimes helpful for practitioners or policy makers who only want to "throw" a couple of variables into statistical software, it seems as though the derivation of these analytical solutions is somewhat of a misallocation of resources (the brain power of some very smart people), since specific questions can be answered much more easily through other computational methods. Undoubtedly, though, we will never stop developing these nice, pretty analytical solutions. It is in our nature.

"One reason comes from our wish, as theoreticians, to explore the source of the a priori practical principles that lie in our reason.  " -Immanuel Kant, Groundwork of the Metaphysics of Morals.