
Leaderboard

Popular Content

Showing content with the highest reputation on 05/09/17 in all areas

  1. My 2 cents on this: I'm an engineer and an amateur user of statistics-based models in my day-to-day work, using past data to predict future data (and yes, it is probabilistic, as Kim mentioned). To me, this is most certainly "curve fitting," or if you don't like that term, he has come up with a model for profitably playing post-earnings events in TWTR by selling put spreads. The obvious parameters of the model are option expiration, short option delta, long option delta, entry date, and exit date - so 5 parameters. If he's training that 5-parameter model on 8 data points (2 years of earnings events), that's n-1 degrees of freedom, or 7 (n-1 because the sample doesn't include the entire population of earnings events). In analysis of variance (stats 101) for regression modeling, assuming your errors are normally distributed, you use up one degree of freedom for each parameter, and the rest are left over for estimating error (i.e., confidence intervals for each parameter and error in the overall model). So, with only 3 degrees of freedom left over for computing confidence intervals, it is very unlikely that the interval around any one parameter would be tight (see the degrees-of-freedom sketch after this list). Hence my skepticism that the confidence interval around the entry date is less than +/- 1 day. So basically, in my opinion, it's wise not to focus too much on the specifics of any one setting CML is predicting from such a small sample size (as a previous poster mentioned), and instead to focus on what's actually going on here: implied volatility in the options reverts toward its mean after the earnings announcement, at a level that still provides an edge over realized volatility in this time frame, and there's a slight bullish tilt to price (again, over a very small sample).
    5 points
  2. So, I am a "real statistician," or mathematician as it were. My application of machine learning to finance was endorsed by the head of Germany's artificial intelligence arm, and I am also published on SSRN. My background stems from my graduate work at Stanford University. I have also been a market maker on the NYSE ARCA and CBOE floors (remotely). I am, and have been, considered one of the pioneers of ML in finance, and I was among the earliest to note that neural networks worked better than the prevailing literature dictated, which, believe it or not, was controversial back then. I showed unquestionably that NNs outperform SVMs given enough data, which, to translate into another industry's "aha" moment, is like proving that battery-powered engines deliver greater torque and substantially more power than a combustion engine. It's obvious now, but I assure you it was not obvious then.
     I love cynicism! It brings intelligent discussion and an evolution of thought processes. I also know that back-and-forth discussions with non-scientists bring out a kind of cynic who is, in fact, not cynical but angry. I cannot, and choose not to, change an angry man's mind. The cynic who looks at the results and says, "hey, this feels like curve fitting," has a winning approach. The cynic who uses Stats 101 to prove a point assumes the counterparty is foolish. Don't be that person.
     First, for people who really want to learn: one example of a backtest is not what we do. We publish thousands, yes thousands, a day to Google News. It is the accumulation of tens of thousands of backtests that inspires and powers the facility to begin an analysis of backtests. A stat cynic with cognition would know that one backtest, in and of itself, even if it covered 1,000 years of data, is not sufficient to say very much at all; that entire backtest is one data point and thousands of data points at the same time. A trader inside the body of a mathematician would also note that going further back, say more than 2 years, is often less robust than a shorter time period. The pre-earnings trades and the post-earnings trades are an amalgamation of analyses that together deliver robustness. If one were to employ a trade strategy, the idea is to create a portfolio of trades, not just one. As Kim does with Steady Options, no month is based on one trade; it's based on a portfolio of trades.
     It's not in my nature to interfere with the learning process, so in that vein, I observe responses. I have read the forum broadly, not just this thread. I can see traders at various levels; it's really wonderful what Kim has built here. I chime in now not to change the conversation, but to reassure the cynics who are on their path to improvement: feel confident in your questions, but note that there are people here who are not appropriate additions to your knowledge base. Good luck to all, friends. A quant backtester is not a product for everyone. That's OK. I wish everyone success in trading. Our goal is to empower everyone with the tools and information the top 0.1% have, so we can break the information asymmetry that has benefited the few at the expense of the many for far too long.
    4 points
  3. @Ophir Gottlieb These 2 NFLX write-ups are perfect examples, albeit extreme ones, of why IMO the total gain/loss percentage should be calculated as an average across trade iterations rather than from the total dollar returns summed over all trade iterations (or allow the user to specify equal-weighted backtests). NFLX had a 7-for-1 stock split in July 2015, which means these 3-year backtests are heavily skewed toward the first year, when the NFLX stock price was much higher prior to the split. That first year is weighted more than the next 2 years combined, which can lead to very misleading data (see the weighting sketch after this list). If you are opposed to calculating the gain/loss percentage this way, at least include the gain/loss percentage of each trade iteration in the downloadable trade details, so people can see it per iteration without having to calculate it manually.
    2 points
  4. The answer most certainly depends on how much confidence you desire, or the "alpha level." Getting into this is beyond the scope of the thread, and of my expertise for sure. Just to throw out a broad-brush answer: in the research studies I've done, and if the sampling plan is unbiased (a huge assumption in this analysis, since it's only the last 8 events, not 8 randomly sampled from the entire population of earnings events), I've found that in order to test the significance of a single parameter (i.e., calculate a "p-value," in the vernacular), I've needed at least 20 samples per parameter in the model (see the sample-size sketch after this list). And this is just to check the significance of the main effect of each parameter, not interactions between those parameters, which add another level of complexity (say, if the entry date is set to one, does the significance and sensitivity of the optimum short delta change, etc.). Again, my main point was not to read too much into the precise settings of these studies and slap that particular trade on blindly, but to consider what's going on in IV, RV, and possibly technical analysis that is fundamentally driving these undoubtedly strong trends, which appear to have good edge. Tim
    1 point
  5. There is something else that I think many people here, and in other similar places, have taken so far out of context that it has reached the point of "abusing" the concept. Here is the most extreme example, but you will get my point: I bought an option for $0.10 and sold it the next day for $0.12. That is a 20% return, and if you "annualize" it, it comes out to (pick a number) 35,000% (see the annualization sketch after this list). There comes a point where $0.02 = $0.02!
    1 point
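
Degrees-of-freedom sketch (item 1). A minimal Python illustration of the point in item 1, using the hypothetical setup described there (8 earnings events, 5 tuned parameters); it is not CML's actual model, it just shows how the t critical value that scales a confidence interval blows up when few degrees of freedom are left for error.

    from scipy import stats

    n_events = 8   # post-earnings trades in the 2-year window
    n_params = 5   # expiration, short delta, long delta, entry date, exit date

    # Roughly n_events - n_params degrees of freedom remain for estimating error.
    print(f"residual degrees of freedom: {n_events - n_params}")

    # The multiplier on the standard error for a 95% confidence interval grows
    # rapidly as the residual degrees of freedom shrink, so any interval around a
    # single setting (e.g. entry date) is unlikely to be tighter than +/- 1 day.
    for df in (2, 3, 10, 30):
        print(f"df={df:>2}  95% CI multiplier: {stats.t.ppf(0.975, df):.2f}")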
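
Weighting sketch (item 3). A toy Python example of the dollar-weighted versus equal-weighted issue raised in item 3, using made-up capital-at-risk and P/L figures around a hypothetical 7-for-1 split; these are not actual NFLX backtest numbers.

    # One losing pre-split year at ~7x the notional, two winning post-split years.
    trades = [
        (7000, -280), (7000, -280), (7000, -280), (7000, -280),  # -4% each
        (1000, 50), (1000, 50), (1000, 50), (1000, 50),          # +5% each
        (1000, 50), (1000, 50), (1000, 50), (1000, 50),
    ]  # (capital at risk, dollar P/L) per trade iteration

    total_pnl = sum(pnl for _, pnl in trades)
    total_risk = sum(risk for risk, _ in trades)

    dollar_weighted = total_pnl / total_risk                                # skewed by the pre-split year
    equal_weighted = sum(pnl / risk for risk, pnl in trades) / len(trades)  # each iteration counts once

    print(f"dollar-weighted return: {dollar_weighted:.1%}")   # -2.0%
    print(f"equal-weighted return:  {equal_weighted:.1%}")    # +2.0%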
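
Sample-size sketch (item 4). A rough, illustrative calculation of how large a per-trade edge (in units of its standard deviation) an unbiased sample of n trades must show before a one-sample t-test can call it significant at alpha = 0.05; it is not a power analysis of any specific CML study.

    from math import sqrt
    from scipy import stats

    for n in (8, 20, 50, 100):
        t_crit = stats.t.ppf(0.975, n - 1)   # two-sided 5% critical value
        detectable = t_crit / sqrt(n)        # smallest edge (in SDs) at the threshold of significance
        print(f"n={n:>3}  edge must exceed ~{detectable:.2f} standard deviations for p < 0.05")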
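
Annualization sketch (item 5). A quick arithmetic check of the example in item 5: a one-day move from $0.10 to $0.12 annualizes to an eye-popping percentage, yet the absolute gain is still two cents per share ($2 on a standard 100-multiplier contract).

    buy, sell = 0.10, 0.12
    one_day = sell / buy - 1                    # 20% in one day

    print(f"one-day return:        {one_day:.0%}")
    print(f"simple annualized:     {one_day * 365:.0%}")            # ~7300% by naive extrapolation
    print(f"compounded annualized: {(1 + one_day) ** 365 - 1:.2e}")  # absurdly large
    print(f"dollars per contract:  ${(sell - buy) * 100:.2f}")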