Running the algorithm (not my Excel or R model; this is an R function called normalmixEM(), from the mixtools package) came up with SPY being ~83% a normal distribution [EM dist 1] that has mu=.0166 and sd=.0333 and ~17% a normal distribution [EM dist 2] that has mu=-.0302 and sd=.0525. The original SPY was mu=.0085 and sd=.0414. These are monthly series, by the way. When you reconstruct it you get a Gaussian mix with mu=.0072 and sd=.0446. I didn't measure the other moments yet because there is something wonky with my understanding of the two competing R functions I was using to do that and I didn't want to get it wrong. On the other hand, visually it works out nicely, like this:
- blue is the normally distributed "EM dist 1" (high) - random return generation
- red is the normally distributed "EM dist 2" (low) - random return generation
- black dotted is the artificially/mathematically reconstructed non-normal Gaussian mix
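For anyone who wants to check the arithmetic without R, here is a minimal Python sketch of the same idea. It is not the code behind the chart: the names `COMPONENTS`, `mixture_moments` and `sample_mixture` are made up for illustration, and the weights and parameters are just the rounded figures quoted above. It computes the mean and sd of a two-component Gaussian mix in closed form, then draws random monthly returns from the mix to compare.

```python
import math
import random

# Rounded figures quoted in the post: (weight, mu, sd), monthly returns.
COMPONENTS = [
    (0.83, 0.0166, 0.0333),   # "EM dist 1" (high)
    (0.17, -0.0302, 0.0525),  # "EM dist 2" (low)
]


def mixture_moments(components):
    """Closed-form mean and sd of a Gaussian mixture."""
    mean = sum(w * mu for w, mu, _ in components)
    # E[X^2] of each component is sd^2 + mu^2; weight and sum them.
    second_moment = sum(w * (sd ** 2 + mu ** 2) for w, mu, sd in components)
    return mean, math.sqrt(second_moment - mean ** 2)


def sample_mixture(components, n, seed=0):
    """Draw n returns: pick a component by weight, then draw from its normal."""
    rng = random.Random(seed)
    weights = [w for w, _, _ in components]
    draws = []
    for _ in range(n):
        _, mu, sd = rng.choices(components, weights=weights)[0]
        draws.append(rng.gauss(mu, sd))
    return draws


mean, sd = mixture_moments(COMPONENTS)
print(f"closed form: mu={mean:.4f} sd={sd:.4f}")

sim = sample_mixture(COMPONENTS, 100_000)
sim_mean = sum(sim) / len(sim)
sim_sd = math.sqrt(sum((x - sim_mean) ** 2 for x in sim) / (len(sim) - 1))
print(f"simulated:   mu={sim_mean:.4f} sd={sim_sd:.4f}")
```

With these rounded inputs the closed-form values come out near mu=.0086 and sd=.0412, which is close to the original SPY figures. A reconstruction built from a short run of random draws will wander around those values.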
The reconstruction works pretty well (it's a smallish data series, so not perfect). I thought that was pretty slick. In fact, after I posted this I was thinking about it a little more. I am, for better or worse, an amateur, or perhaps a tourist visiting the land of retirement finance and probability theory. I have my camera and my ugly tourist shorts but that's about it; I don't speak the language and I don't live there. So, for me, while it is one thing to know some basic stats like the various moments of a distribution or how to generate a CDF or how to integrate a PDF, it's another thing altogether to look at a distribution and see multiple other distributions hidden inside trying to get out. I'll probably never look at a data distribution in quite the same way again. I'll call this whole exercise a "net add" to my trip.