Perry J. Kaufman. Smarter Trading. Improving Performance in Changing Markets
Making a Trading Strategy Robust

A trading strategy is robust if it is successful under many different conditions. It is especially good if it works under situations that are very different from those used in testing, for example, a more volatile move to new high prices.

Many users can blame the speed of the computer for trading systems that do not work. Combined with strategy-testing and statistical software, the computer has made it too easy to simulate thousands of trading rules and techniques. Preprogrammed strategies, countless indicators, and the ability to create your own variations often draw inexperienced users into an indiscriminate and unfocused approach to testing. In the end, the computer has tested too much and used too little as criteria for success. Often, the resulting trading programs appear to be remarkably profitable but in reality are complete failures.

Overfitting

A system that has been tailored to work on a specific period of historic data is called overfit. Everyone who develops a trading system will use past data to verify the results. It would be irresponsible to define a set of trading rules, open an account, and begin trading without knowing whether those rules would have worked in the past. The historic risk will give you an indication of the investment size needed to achieve your goal and survive the interim losses required to reach your objective.

A careful study of historic results will often point out an area of high risk. Sometimes a simple rule is all that is needed to reduce the risk to a comfortable level. For example,

- Reduce the size of the entry position as the market increases in volatility.

Further analysis could lead to other rules:

- Close out all positions if the S&P drops more than 1500 points in 3 days, or,
- Close out all long positions on Friday if the S&P has dropped more than 10% during the previous week.
These rules move from a general, logical risk control to very specific guidance intended to isolate one or two past problems. Where do these changes stop being reasonable and start being manipulative? The answer is rarely clear.

In this chapter, we will consider a system robust if it does not depend on a narrow set of conditions. A profitable 10-day moving average system will not be used if similar 8-day and 12-day systems generate losses. The best system is one that is profitable for a broad set of parameters, including trend speeds, risk control, profit-taking, and filters. The trader gains a more dependable program when nearly any choice of parameters is likely to give profits.

Separating Robustness from Parameter Selection

There is a clear separation between determining that a trading strategy is valid, and being able to use that method to produce profits in the future. Historic testing can verify a premise and show which sets of parameters, or variables, were successful in the past. But this does not mean the parameters that generated the most profits in the past will lead to future profits. And, when there are two tests with similar historic success, which will be the best?

Comprehensive testing, called optimization, results in some cases that show profits and others that have losses. The profits show that the logic behind the strategy is sound. The more cases that are profitable, the more confidence we have in the trading method. However, the parameters that gave profits in historic tests do not always generate profits during real trading. The ability to choose, in advance, which parameters will give future profits is a separate problem from defining robust trading rules. Most of this chapter will focus on how to build a robust trading system; the more robust it is, the less it will depend on picking the right parameters.
We can assume that an arbitrary choice of parameters will yield the average performance; therefore, we should be sure that the average is good.

Principles

The development of a trading model that is independent of parameter selection is always the ideal solution. But it is elusive. It is the opposite concept of an arbitrage, which takes advantage of distinct economic anomalies. This chapter will set down a procedure that will greatly improve the robustness of any trading model. It is not a simple process, but it can be implemented one part at a time. The more that is done, the better the results.

No model is without some limitations or restrictions. Each has a purpose that requires some definitions of its operating environment. It is certainly reasonable to exclude a 3-day moving average from a long-term trading system. Both trading frequency and risk can be narrowed to ranges that make sense for the strategy as well as for the type of business. Exploiting a price pattern is valid if the patterns can be identified in advance. It is within this framework that the program can be robust.

Some basic principles of parameter selection assure a choice that yields superior results. These include slower trends and fewer artificial risk controls. This chapter will draw on previous conclusions to suggest a set of rules and a testing approach that will give test results similar to real trading.

Example of Optimized Performance

Table 10-1 compares test results of a moving average system with a percentage stop-loss. The speed of the moving average (from 5 to 50 days) and the percentage stop-loss (from 0 percent to 2.0 percent) are the two parameters needed to trade. A TeleTrac optimization was used to find the returns for each combination of speed and stop for the Hang Seng Index during each of the calendar years 1991 and 1992.
If we had used the best results of the 1-year test on 1991 data to select the parameters to be traded in 1992, we would have tended toward the slower moving averages. The highest return of 17.4 percent was given by the 45-day trend. Speeds from 5 to 30 days showed erratic results and only a few small profits.

The 1992 results show nearly the opposite. Moving averages from 5 to 20 days had large returns, while speeds from 40 to 50 days had the worst results. Had we chosen the parameters that posted the highest profits in 1991, the 1992 performance would have been only 1.3 percent (if executed perfectly).

This simple moving average test shows the typical inconsistency in the performance of the "best" choices when a test uses only a small amount of data. The area of highest profits in 1991 produced the worst results in 1992.
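The 1991-to-1992 comparison above can be sketched in a few lines of Python. This is only an illustration of the selection step, not the TeleTrac procedure: the function names are hypothetical, and apart from the 45-day figures quoted in the text (17.4 percent in 1991, 1.3 percent in 1992) every return in the sample dictionaries is a made-up placeholder rather than a value from Table 10-1.

```python
from typing import Dict, Tuple


def best_parameter(returns_by_param: Dict[int, float]) -> int:
    """Return the parameter (trend speed in days) with the highest return."""
    return max(returns_by_param, key=lambda p: returns_by_param[p])


def carry_forward(in_sample: Dict[int, float],
                  next_period: Dict[int, float]) -> Tuple[int, float, float, int]:
    """Pick the in-sample winner and report how it did in the next period.

    Returns (chosen parameter, in-sample return, next-period return,
    rank of the chosen parameter in the next period, 1 = best).
    """
    chosen = best_parameter(in_sample)
    ranked = sorted(next_period, key=lambda p: next_period[p], reverse=True)
    rank = ranked.index(chosen) + 1
    return chosen, in_sample[chosen], next_period[chosen], rank


# Placeholder returns in percent, keyed by moving-average speed in days.
# Only the two 45-day values come from the text; the rest are invented.
returns_1991 = {5: -3.0, 20: 2.0, 45: 17.4}
returns_1992 = {5: 12.0, 20: 9.0, 45: 1.3}

print(carry_forward(returns_1991, returns_1992))
```

The point of the sketch is the rank in the last position: a parameter chosen only because it was the best cell in one year can sit at the bottom of the table in the next.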
The Underlying Method for Determining Robustness

We can improve the selection of parameters by focusing on systems that have the broadest success. If all test cases are profitable, we would have the perfect robust system and any selection of parameters should return profits.

To measure which strategies are better than others, we will define test procedures that measure the results using the average and standard deviation of all tests. The highest average alone is not enough. A smaller standard deviation shows the consistency of performance for all tests. A Best Choice Index combines both values by subtracting the standard deviation from the average:

Best Choice Index = Average of returns - 1 x Standard deviation of returns

Because one standard deviation on either side of the average contains 68 percent of all data, the remaining 32 percent is split equally between the two tails of the distribution. Only the left tail, 16 percent of the cases, falls below the Best Choice Index value, so the system gives an 84 percent chance of a return greater than or equal to the Best Choice Index value. Remember that the losses are only half of the excluded probability, on the left part of the distribution curve.

For example, if all combinations of test returns averaged a rate of return (ROR) of 14 percent with a standard deviation of 6 percent, we have

- An 84 percent chance that any choice will yield returns
- A 97.5 percent chance that any choice will yield returns
- A 99.5 percent chance that any choice will yield returns greater

The minimum test criterion should be an 84 percent chance of success, given by the Best Choice Index.

Testing Process

The total solution is the test process. It begins with conceptualization. It must be followed by clear steps that lead to a well-defined result. Experience shows that if you do not control the process, the process will control you. The test procedure can be separated into five parts:

1. Deciding what to test.
2. Deciding how to test it.
3. Evaluating the results.
4. Choosing the specific parameters to trade.
5. Trading and monitoring performance.

Each of these steps is critical to the success of the program.
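Returning to the Best Choice Index defined earlier in this section, a minimal Python sketch of the calculation follows. It assumes test returns are roughly normally distributed, which is what the 68 percent rule implies, and it uses the population standard deviation; both choices, along with the function names, are assumptions of this sketch rather than details given in the text.

```python
from statistics import NormalDist, mean, pstdev
from typing import Sequence


def best_choice_index(returns: Sequence[float], k: float = 1.0) -> float:
    """Average of all test returns minus k standard deviations."""
    return mean(returns) - k * pstdev(returns)


def threshold(average: float, std_dev: float, k: float) -> float:
    """Return level that lies k standard deviations below the average."""
    return average - k * std_dev


def chance_of_exceeding(k: float) -> float:
    """Probability, under a normal distribution, of landing above
    the level that is k standard deviations below the average."""
    return NormalDist().cdf(k)


# Worked example from the text: average ROR of 14 percent,
# standard deviation of 6 percent.
for k in (1, 2, 3):
    level = threshold(14.0, 6.0, k)
    print(f"{k} std dev: return >= {level:.1f}% "
          f"with probability {chance_of_exceeding(k):.3f}")
```

With one standard deviation the probability comes out at about 0.841, matching the 84 percent quoted above. The percentages quoted in the text for two and three standard deviations are rounded and differ slightly from the exact normal values.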
Setting up this process for the first time will take a lot of careful work, but most of it will only need to be done once. There will be many decisions to make with regard to the data, testing software, and the method of evaluation. Because the proper development of a trading strategy is so important to its success, these issues will be discussed in detail in this chapter. Box 10-1 provides a checklist that will serve as a reminder.

Part 1: Deciding What to Test

Before you begin testing, define the system and the test plan completely. You must tell the computer what to do, not allow the computer to tell you. Do not drift from one idea to another as you reach obstacles. Try to follow the original idea to completion and learn its advantages and disadvantages.

Step 1. Is the Strategy Logical?

Did you write the rules before you began testing? Where did you get your ideas? Successful trading programs are based on sound ideas such as economic relationships (e.g., arbitrage, seasonality, and the spreading of strong and weak economies) or valid technical strategies (e.g., breakout of a support or resistance level, selling volatility with options, or use of divergence). Letting the computer uncover an obscure short-term pattern, no matter how reliable it seems, is not a sound trading approach. Price patterns can always be found, but they have doubtful predictability and often change without notice.

When you develop your program, the strategy must make sense for the market and fit your own objectives, as in the following examples:

- For the stock market, you might want a long-term buying strategy with
- For the bond market, a long-term strategy that parallels slow-changing
- For foreign exchange, a short-term method that would buy or sell in the

Using Logical Ideas. A logical idea does not need to be based on fundamentals.
Years of watching price movement on Chicago's International Monetary Market (IMM) may give you the idea that dependable entry signals occur only during the three periods of high volume each day: at the open, at the close, and just after traders return from lunch. The low-volume periods between give less dependable indication of direction and require a more demanding price move to enter a trade.

Box 10-1

PART 1: Deciding What to Test
1. Is the strategy logical?
2. Can you program all the rules?
3. Does the strategy make sense only under certain conditions?
4. Take a guess as to the expected results.

PART 2: Deciding How to Test
5. Choose the testing tools and method.
6. Do you have enough of the "right" data?
7. Have you included realistic transaction costs?
8. Will you test a full range of parameters?
9. In what order will the parameters be tested?
11. Have you defined the evaluation criteria?
12. How will the output be presented?

PART 3: Evaluating the Results
13. Are the calculations correct?
14. Were there enough trades to be "significant"?
15. Does the trading system produce profits for most combinations of parameters?
16. Did logic changes improve overall test performance?
17. How did it perform on out-of-sample data?

PART 4: Choosing the Specific Parameters to Trade
18. Did the last test include the most recent data?
19. Did you choose from an area of broad success?
20. Are profits distributed relatively evenly over the tested history?
21. Are the profits per trade large enough to absorb errors?
22. Did the historic results show any large losses due to price shocks?
23. Have you risk-adjusted the returns to your acceptable risk level?

PART 5: Trading and Monitoring Performance
24. Are you following the same rules that were tested?
25. Are you trading the same data that was tested?
26. Are you monitoring the difference between the system and actual entries and exits?
The important rule is to know what you want to do, and then use the computer to verify your idea. You must control the process.

Starting with One Idea and Ending with Another. Be sure that computer feedback does not cause you to stray from your original idea. A logical strategy can evolve into meaningless patterns. There is a natural tendency to explain why a system must be fundamentally sound, simply because you have already seen that the test results are good.

Step 2. Can You Program All the Rules?

Can all the rules in the trading strategy be entered into the computer or a spreadsheet program? Have you assumed anything that was not programmed? A strategy that cannot be tested cannot be evaluated. If you assume that you would not have been caught in a price shock because the program does not trade overnight, then you leave yourself open to unexpected losses, undercapitalization, and justifiable criticism.

Writing clear trading rules is essential to testing. You must be certain that you can account for entry and exit conditions, risk control, types of orders, time of day, and other situations that completely describe your plan. Writing the rules will tell you the type of data needed for testing (whether it is only prices, the Producer Price Index, or API statistics) and the frequency and extent of the data (open, high, low, close, or 30-minute prices with tick volume). However careful you are, you will always need to add details later.

Intraday Breakout Example. Start with the most basic approach, omitting risk control, profit-taking, or qualified entries. If you believe that an intraday breakout system is a sound idea, then first test only the breakout entry and the basic exit signals. You might want to close out the trade at the end of each day; or, you might exit if prices reverse and break out in the opposite direction. It is important that you know whether the underlying idea works before adding profit-taking, risk control, and other more specialized features.
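As a rough illustration of "testing only the breakout entry and the basic exit," the sketch below holds a long position after a close above the highest high of the previous bars and reverses to short after a close below their lowest low. The lookback length, the use of closing prices to confirm the breakout, and the function name are all assumptions made for this example; the end-of-day exit mentioned above is not implemented.

```python
from typing import List, Sequence


def breakout_positions(highs: Sequence[float],
                       lows: Sequence[float],
                       closes: Sequence[float],
                       lookback: int) -> List[int]:
    """Return +1 (long), -1 (short) or 0 (flat) for each bar.

    A position is entered when the close breaks the high or low of the
    previous `lookback` bars, and is held until a breakout in the
    opposite direction reverses it. No stops, no profit-taking.
    """
    positions: List[int] = []
    current = 0
    for i in range(len(closes)):
        if i >= lookback:
            prior_high = max(highs[i - lookback:i])
            prior_low = min(lows[i - lookback:i])
            if closes[i] > prior_high:
                current = 1
            elif closes[i] < prior_low:
                current = -1
        positions.append(current)
    return positions


# Small made-up price series, used only to show the call.
highs = [10.0, 11.0, 10.0, 12.0, 9.0, 8.0]
lows = [9.0, 10.0, 9.0, 11.0, 7.0, 6.0]
closes = [9.5, 10.5, 9.5, 11.5, 8.0, 7.0]
print(breakout_positions(highs, lows, closes, lookback=2))
```

Keeping the rule this bare makes it easy to see whether the underlying idea has any merit before the lookback period, time of day, or risk controls are varied.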
Decide which parts of the system can vary. You know that a breakout early in the day allows more time to reach bigger profits during the rest of the trading session. Therefore, you will want to test the time of the breakout. You will not want to accept an entry signal late in the day, because of the limited potential for profits before the close.

If the program is fully computerized, you will want to look at the data no more often than every 5 minutes. Although you may be able to execute an order within 60 seconds of a breakout, it is not practical to assume good executions. Using 5-minute bars for testing, rather than 1-minute bars, will also reduce the time needed to test the strategy.

Before you start testing, you know that an intraday breakout system depends on the period over which the breakout is measured, the time of day for entry, the size of the profit-taking objective, and some risk control.

Trend System Example. All systems have common features: entry and exit rules, risk control, and possibly profit-taking situations. A trend system requires a trend speed. This can vary significantly with your application and objectives. Equities programs, with little leverage and higher transaction costs, require a range from 50 to 500 days. A futures trader, with margins of only 5%, will favor faster trends, from 5 to 30 days.

It is a mistake to use a smoothing approach on intraday data. As the time period between data observations gets shorter, the level of noise increases. Because of illiquid periods in all markets, prices can jump in either direction without indicating a true trend change. This causes frequent false signals that cannot be eliminated by using a longer trend based on the same intraday prices. The combination of intraday noise and trend lag will be a difficult obstacle to overcome.

Step 3. Does the Strategy Make Sense Only under Certain Conditions?

Decide, in advance, whether the strategy targets certain market movement, or a specific set of conditions.
The idea may only make sense for long or short time intervals. For example, a day-trading program using 15-minute data would not use a 200-period moving average, while a long-term investment program in stocks would not use a 3-day trend.

By defining the range over which the trading model will operate, you reduce the chance of being diverted from your objective. Write out the most reasonable test range for each of the parameters that are considered important to the strategy. The more you can define your expectations, the better the results.

Step 4. Take a Guess as to the Expected Results

Decide the expected rate of return, the percentage of profitable trades, and the size of the losses. The objective is to compare the test results with your expectations. Whether the results are much better or worse than planned, when you have a basis for evaluation it will be easier to correct and move forward with the development of the system. To say "something is wrong" with the test results, you must first decide what you expect.

Part 2: Deciding How to Test

Step 5. Choose the Testing Tools and Method

With more sophisticated strategy-testing software, it is no longer necessary to program the trading method in FORTRAN, BASIC, or C to test its success. In a few minutes, using a strategy-testing package such as TeleTrac, Omega's System Writer, or even a Lotus or Quattro spreadsheet program, you can have a good idea of the viability of the technique.

An increasing number of programmable graphics terminals and new strategy-testing software are available at very competitive prices. They all have the advantage of calculating profits and losses accurately, the flexibility of rule changes and data selection, and the ability to plot both data and profitability. In some cases, results can be read into spreadsheets for further evaluation. The time saved is well worth the price.
For the more sophisticated analysts, supplementary software such as Manugistics Statgraphics and Mathsoft Mathcad are impressive tools for evaluating complex statistical relationships and expressing mathematical formulas.

Long Test, Short Test, or "Step-Forward" Test?

The pattern in Table 10-1 is not unusual. Tests using a small amount of data give results showing that many combinations of parameters will work. The shorter the test period, the more profitable the system will appear.

Consider a bond market that has moved steadily up for 3 months. If there were only small retracements, then any moving average from 10 days and longer would have yielded the same results, which is the net move from the beginning to the end of the period (see Figure 10-1 (a)). When a short test interval has one or more price swings, the slower trends give back profits, while some of the faster ones are very successful. The size of the swing and the amount of market noise determine which trend speeds are best (see Figure 10-1 (b)).

In general, tests of small amounts of data give:

- Individual and average test results that are much higher
- Risk that is sometimes lower
- Profitable results for more models that trade faster
- Erratic forecasting ability

It is much more difficult to find a trading method that is good over longer test periods. The best tested performance (annualized rate of return) of a system tested over many years will never be as high as the rate of return of a similar system tested over a few months or a year. Using more data, you should expect:

- Much lower returns
- Larger risk when positions are held longer
- Difficulty in getting consistent profits from short-term trading
- Questions as to the relevance of older data
- Better forecasting ability

Therefore, test results using smaller amounts of data look better, but do not perform as expected; results based on longer tests look worse but perform closer to expectations.
You should not be disappointed in the results of a long test period when compared with shorter tests. It is only that the shorter tests are misleading.

Select a Long, Representative Test Period. When more data are tested, there is a greater variety of unusual situations, longer profitable price moves, sequences of losses, and price shocks. When longer periods are tested, both risk and reward increase; however, risk increases faster than returns. Testing shorter periods can show an unrealistically small risk, leading to undercapitalized trading and fatal results. A good rule is to be certain that the data contain two full cycles, that is, there should be two clear bull markets, two bear markets, and two prolonged sideways intervals.

Because results never look as good when the same strategy is tested over longer periods, you might argue that markets have changed and the old data are no longer representative; that globalization and regional alliances have changed the price relationships and patterns in many sectors, or that government controls will prevent an economic collapse. Saying that the market will continue to exhibit only the price patterns seen recently is unrealistic. It will evolve to new patterns; however, we have no way of knowing what they will be. The past contains the most accessible, practical, and realistic examples of changing situations.

Box 10-2 shows that performance drops but predictability increases with the use of more data. Short test periods produce unreasonable expectations of profits.

"Step-Forward Testing" versus One Long Test. The technique of "step-forward testing" seems to be a sensible approach to resolving some of the testing dilemmas. It works as follows:
1. Select a short data interval, called a "test window" (e.g., 2 years of
2. Test (optimize) a full set of parameters on the test window and select
3. Run the model on a short period of out-of-sample data, immediately
4. Collect performance data on the "out-of-sample" period, including a
5. Move the test window forward and repeat steps 2 through 4 until done.

The parameters that perform most consistently in the out-of-sample

Hidden Problems. Step-forward testing seems to duplicate the way we would operate a trading program. But there are hidden problems:

- Shorter test periods favor faster strategies that produce higher profits
- Short test periods do not represent long-term trading fairly. Each short
- Retesting the same system with modified rules means that the "out-

The step-forward process will usually select an inconsistent, fast-trading method over a better long-term system simply because the test window forces this result. Instead, use all the data in one long test to get continuous performance over as many changing patterns as possible.

Step 6. Do You Have Enough of the "Right" Data?

The more data you test, the more situations the program will experience. There must be at least two bull markets, two bear markets, and two sideways periods. Unless you can prove that the older data is misleading, or no longer valid, you should use as much data as possible. Put some data aside for out-of-sample validation after the final system has been selected. This is discussed further in Step 18.

Using more data produces more consistent and realistic results. Final results may show that risk is higher and profits are lower, but these figures are more likely to be achieved in trading. It is more difficult to find persistent short-term patterns in a longer data series; therefore, selections favor slower trading. Long-term solutions, in turn, include realistic equity fluctuations because they cannot be fine-tuned to avoid specific losing periods.
This performance profile shows higher risk and makes it necessary to have higher capitalization.

A simple test of the MATIF CAC-40 Index (Table 10-2) shows the predictive ability of tests based on 1, 2, 3, and 5 years of data. The system tested was

- An exponential moving average from 5 to 50 days, in increments
- A trend change criterion ("filter") from 0 to 10 points, in incre
- A buy signal that occurred when the trend turned up by the

The highest profits for each test determined the trend speed and entry filter that were to be used to evaluate the next 1 year of data. The averages for each test case were compared.

- 1-year test. The best trend speed and filter varied considerably
- 2-year test. Overall profits per year declined and the average best
- 3-year test. The performance pattern continued to improve. The 1-
- 5-year test. Improvement continued overall. Longer trends were

Average trend is slower. Average trend is slowest. In-sample profits are lower. In-sample profits are lowest. 1 year ahead is better. 1 year ahead is best. New highs in 1988 generated losses because they were not part of the sample data.

Are You Testing the Same Data That You Will Trade? Do not test one set of data, then trade another. Do not use a "continuation" series because either the gaps have been removed, or they cause windfall profits or losses that would not have happened in trading. A "perpetual" contract has prices that never existed and usually dampens any severe price move, causing the risks to look smaller.

Did You Verify the Accuracy of the Data? Data can be inaccurate even when prepared by a reliable vendor. Look for prices at the beginning or end of a contract that are completely different. Sometimes the data will have prices from another market that have not been erased, or an erroneous date one or two years earlier. Check for blank or zero entries. If you chart the data, you will easily see errors. The ones that are too small to see can be ignored.

Special Cases in Selecting Test Data.
It is not always possible to have enough data for testing. New markets or changing situations may render old data questionable. Or, you are looking to profit from a recent price pattern, without expectation of using the system for very long. The following sections offer some alternatives.

Selecting Similar Data Periods. A stock that has dropped to a very low level can have a very different performance pattern from a period of high prices and high volatility. Selecting similar historic periods, such as those following a prolonged decline, or after a sell-off of 10%, may be the only way to model your strategy.

Using Cash Markets to Model Futures. Cash markets are often used to test a system that will be traded as a new futures contract; however, a new contract can be illiquid. A good model will account for similar situations in other new markets, adapting to the change between the cash and futures. Because there are many examples of changing markets, this should be a successful exercise.

Stock and Futures Markets under Special Situations. All markets go through severe changes: a corporate scandal or mismanagement, sudden new competition or government regulation; a price shock in coffee or orange juice due to a freeze. These special situations must also be studied separately, rather than absorbing them into the flow of everyday price movement. Market reaction to special situations is often similar because of the human response, rather than the fundamentals of a company or commodity. Similar cases can be found in other markets. When the special situation is a "price shock," a new set of rules can be used. This is discussed in Chapter 7.

Structural Changes and Not Enough Data. The European Monetary System (EMS) imposed a structure on participating currencies that had little precedent. A previous period, under the Bretton Woods agreement, may not provide enough similarity or adequate data for modeling a trading strategy.
In this case, a fundamental analysis is the only course. Results based on small amounts of data are unreliable. A sound understanding of the fundamental interaction and the rules under which the new agreement operates may allow some confirmation by testing. To date, the EMS has proved to be unstable; therefore, a test of the 1 to 2 years of data would have led to poor results.

Creating More Data. For some markets, it is possible to create synthetic data. By studying volatility at different price levels, sequences of runs, variation in periodicity between highs and lows, and seasonality, it is possible to use random price generation to create data with the same statistical qualities as the one being evaluated. Synthetic data gives you the ability to test more situations and develop a more robust solution, but it is not the same as real data. It is best to use synthetic data first, before testing actual data.

Type of Data to Test. The data used for testing strategies should always be the same as the data to be traded. This is very straightforward for stocks, but becomes more difficult when you use foreign exchange or futures prices. The forex market will require adjustment for interest over the holding period, but the futures market presents the greatest problem. Although the nearest futures has the greatest liquidity, it may trade actively for as little as one month, and rarely more than three months. This frequent expiration makes testing inconvenient. The following sections will show how to fix this problem.

Original Data Series. For stocks, foreign exchange, interest rates, or other cash market data, a long series of original, unadjusted data is available for testing. When trading the cash market you will also need the spot interest rate to calculate a forward price. Treating the cash price as a valid entry and exit point avoids the need to roll the position forward daily. Each rollover has an implied transaction cost that eats away at profits.
Alternately, you can use spot prices for entry and exit, and calculate the net interest rate credit and debit when liquidating the position. Additional transaction costs must be included in testing each time the position is rolled over.

Futures Contracts. Original futures contract data can also be used for testing, without modification. Use the following steps:

- Read the futures contract data.
- Start the strategy calculations at the beginning of the series ("wind
- Begin taking positions in the new contract on a specific date, or on
- Exit any trade on a specific preset date before expiration. For interest

This method is inconvenient because results are usually given by contract. For a 10-year test of interest rate futures, 40 separate sets of results must be accumulated. In addition, it is difficult to assess maximum drawdown unless you can treat the segments of data as a continuous equity stream.

Continuous Data Series for Futures. Many data vendors provide a ... ate a new series (e.g., a 3-month price) calculated in a way that resembles the London Metal Exchange forward contracts. These choices are unacceptable for testing because they do not show the data that will be traded in a way that can duplicate a realistic trading environment. The constructed 3-month series, with interpolated carrying charges or interest, is fre- ... both the profit and risk.

Gap-Adjusted Series and Index Series. A gap-adjusted data series is a good alternative for most technical applications in futures. It puts the nearest-to-delivery segments together into a single price series by closing the gaps at the time one contract rolls into the next. By proceeding backward, the most recent futures or forward contract has today's prices, and the older contracts are adjusted up or down according to the gaps.

The gap-adjusted series works well for trend-following applications and strategies where the comparative price, rather than the actual price, is needed.
It does not work for chart analysis, economic studies (supply/demand/price relationships), and similar uses. One problem with gap-adjusting, where older prices are changed, is that the very old data can take on negative or unrealistic values. Because the prices are not real, the rate of return and risk measurements must refer back to the actual prices, rather than base their values on the gap-adjusted series.

With one additional step, indexing, the gap-adjusted series becomes more useful. Indexing is simply starting with the value of 100 (or 1000, depending on convenience), then adding or subtracting successive values as a percentage change. For example,

index = index[1] + (price - price[1])/price[1]

Today's new index value, index, is yesterday's value, index[1], plus the percentage change from yesterday's price. The notation [1] means the 1-day prior value. The index price represents a percentage change and allows simple comparisons between returns of different markets. It eliminates the need to reference the original price data to calculate risk and returns.

Building a Gap-Adjusted Series. If you are working with futures contracts, a continuous series can be very useful. There are three steps to follow: (1) creating a continuous series with duplicate entries on the day of the rollover, (2) gap-adjusting the series, and (3) indexing. Figure 10-3 shows a flowchart of this process beginning with Step 2. Use the prices in Table 10-3 to follow the flowchart.

For example, if the S&P 500 were being combined, Step 1 causes the June 93 contract to stop on the last day of May, and the September 93 contract to start on that day. Step 2 would gap-adjust the prices, working backward, whenever it identified a duplicate date. In Table 10-3 the June contract values are adjusted up by 12.00, equal to the roll-forward gap on May 31. Step 3 would assign 100 to the first value, then calculate the percentage changes for each successive entry.
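Step 2 of the procedure above, gap-adjusting backward from the most recent contract, can be sketched as follows. This is not the flowchart of Figure 10-3 or the book's own program; it is a minimal Python version that assumes the combined series is a list of (date, price) rows in date order, with the old contract's row listed before the new contract's row on each rollover date. The function name and the sample prices are hypothetical; only the 12.00 gap on May 31 comes from the text.

```python
from typing import List, Optional, Tuple

Row = Tuple[str, float]  # (date, price)


def gap_adjust(rows: List[Row]) -> List[Row]:
    """Close roll-forward gaps, working backward from the newest price.

    On a duplicate date the later row belongs to the new contract and
    the earlier row to the expiring one. The difference between them is
    added to every older price, and the expiring contract's duplicate
    row is dropped so each date appears once.
    """
    adjusted: List[Row] = []
    offset = 0.0
    next_date: Optional[str] = None
    next_raw = 0.0
    for date, price in reversed(rows):
        if date == next_date:
            # Rollover day: accumulate the gap (new minus old contract).
            offset += next_raw - price
        else:
            adjusted.append((date, price + offset))
        next_date, next_raw = date, price
    adjusted.reverse()
    return adjusted


# Made-up prices with a 12.00 gap on the rollover date.
combined = [
    ("05-27", 450.0),
    ("05-28", 451.0),
    ("05-31", 452.0),  # expiring contract
    ("05-31", 464.0),  # new contract
    ("06-01", 465.0),
]
for row in gap_adjust(combined):
    print(row)
```

The newest prices are left untouched and every price from the expiring contract is raised by the gap, so day-to-day price changes are preserved across the roll while the absolute levels of the older data are no longer real.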
Note: A clever analyst can eliminate Step 2 if an index is the only output.

Alternatives. The only remaining problem with gap-adjusting is that transaction costs cannot be posted at the time of the roll-forward, because that date can no longer be identified. It may be more difficult, but preferable, to write program logic around the continuation file, which contains the duplicate dates and data. When a duplicate date is encountered, the old trade is closed out and the new trade entered.

Shock-Adjusted Series. A FORTRAN program for removing price shocks, then restoring the continuity of the data by indexing, can be found in Chapter 7. It is a similar program to the one in Figure 10-3 and gives coding details.

Table 10-3. Sample S&P Prices Combined before Gap-Adjusting
S&P prices have been combined into a single series, and still show the original prices. A duplicate entry appears on May 31, which will be the date of the roll-forward where the gap is adjusted. The "Gap-Adj" and "Index" columns show the values after those steps have been completed.

Step 7. Have You Included Realistic Transaction Costs?

Transaction costs include brokerage and slippage. But other factors reduce performance.

Do You Expect Any Missed Trades? "Unables" have a great impact on results because they reduce only the profits and not the losses. If you overtrade the liquidity of the market, then unables become an important factor. Programs that trade intraday will face more problems than those that trade on the close. Part of a successful program is achieving actual trading results similar to expectations. A full discussion of slippage and unables can be found in Chapter 2.

Step 8. Will You Test a Full Range of Parameters?

Determine, in advance, the range of parameters that is sensible for this strategy. If you are trading stocks for an institutional portfolio, a moving average test range may be 50 to 400 days. Stop-losses must be equally large. However, do not prescan and remove very fast and slow ranges because they showed losses. That is the same as eliminating everything except the one set of parameters that was profitable. You cannot develop a robust model by looking at a narrow range that has been preselected to work.

Step 9. In What Order Will the Parameters Be Tested?

Test the most important variables first, the ones that cause the largest change in performance. That would be the number of days (the "period") in a moving average, Relative Strength Index (RSI), or stochastic; the time of day or number of days in a breakout system; or the deviation from the norm in a countertrend or arbitrage approach. These variables usually have the greatest effect on profits.
Tests of other rules should follow, in order of most impact on profits or most frequently applied. Testing the variables that are most important will speed up the test process. Rather than testing all combinations of all variables in one procedure, selecting the test range for one variable at a time can reduce the number of tests and the total time of the testing process.

In some cases, the most profitable combination of parameters occurs when the primary variable is "suboptimized." For example, profit-taking opportunities may be increased when the moving average is very fast; therefore you want high-momentum situations for very fast profit-taking objectives and a short holding time. If two features must work together, testing both the trending period and the profit-taking level simultaneously can work. It may also be that the profit-taking level is the most important variable, and the trending period is not as significant.

Step 10. Are the Parameters Distributed Properly?

Not only should the range of parameters be set in advance, but the distribution of those tests is important. Box 10-3 describes what needs to be done. This is a crucial step in preparing to see the whole test picture, which is essential for a robust system.

Because the final decision is based on the average of all tests, the distribution of parameters must not favor either the fast or slow strategies. They must be evenly distributed. When a moving average system is tested, it is generally thought that a test of 5, 10, 15, 20, ... days is a reasonable choice. Equal increments, however, favor the very slowest trading. Figure 10-4 shows how equal days have very unequal percentage changes from one test to the next. A change from a 5- to a 10-day moving average is a 100 percent change in the amount of data. A change from 10 to 15 is a 50 percent change, but a change from 95 to 100 is only about a 5.3 percent shift. An equal distribution of days will skew the results toward the slower tests.
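The arithmetic behind this point is easy to reproduce. The short sketch below (the helper name `pct_changes` is hypothetical) lists the percentage change from each test period to the next when periods are spaced in equal 5-day increments.

```python
from typing import List


def pct_changes(periods: List[int]) -> List[float]:
    """Percentage change in length from each test period to the next."""
    return [100.0 * (b - a) / a for a, b in zip(periods, periods[1:])]


periods = list(range(5, 101, 5))   # 5, 10, 15, ..., 100 days
changes = pct_changes(periods)

# Print the step-to-step change for every pair of consecutive tests.
for (a, b), change in zip(zip(periods, periods[1:]), changes):
    print(f"{a:3d} -> {b:3d} days: {change:6.2f}%")
```

The first step is a 100 percent change and the last is a little over 5 percent, so most of the equally spaced tests sit at the slow end, where one test differs little from the next.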
Visual Distribution. It is not necessary to use mathematics to decide the distribution of parameters for testing. A very effective visual method is best shown by the following example. If the fast end of the test shows 100 trades and the slow end has 10 trades, choose test periods so that 10 tests give results showing trades of 100, 90, 80, ..., 20, 10. In reality, a perfect distribution is impossible, but the goal is clear. Try to find the parameters that cause the number of trades to be evenly distributed across the full range of tests.

Step 11. Have You Defined the Evaluation Criteria?

What do you measure to decide which system is better? To evaluate results, it is necessary to produce a minimum number of statistics for each test. Selecting the test with the highest profits may not be as important as finding the one with the best return/risk ratio. Decide in advance how you will select the best strategy. Most often, you need a combination of statistics, including reward/risk ratio, profits per trade, and risk-adjusted returns.

• Return/risk ratio is the compounded, annualized rate of return divid[...]

Compounded annualized rate of return, CROR = (Ending value / Starting value) ^ (1/years) - 1
Standard deviation, SD = @STD(Monthly changes in equity)
Return/risk ratio, RR = CROR/SD

Calculations should use returns on cash to see the raw performance before deciding on the potential use of leverage. The importance and use of these three statistics are discussed thoroughly in Chapter 4.

• Profits per trade show how much room you have for unexpected prob[...]

The selection of which trend speeds to test will give a correct or distorted view of the potential of the system. If moving averages from 5 days to 100 days are tested, the total picture is skewed toward longer trends; that is, the results of trend periods from 55 to 100 days can be very similar, while those of 5 to 50 days may each show very different performance.
By viewing the percentage change in consecutive tests, it is evident that there should be fewer tests as the trend speed becomes longer. Table 10-4 shows (1) Days, equal test periods, in days, for an exponential moving average; (2) %Change, the percentage change in the length of the period; (3) ExpSC, the equivalent exponential smoothing constant; (4) Equal, an equal distribution of smoothing constants, calculated as

smoothing_constant = 2/(days + 1)

and (5) Days, the equivalent number of days corresponding to the smoothing constants in column (4). The averages are at the bottom of the columns.
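A table of this kind can be approximated with a short script. The sketch below is not the source of Table 10-4; the function names and the choice of 20 tests are assumptions for illustration. It spaces the smoothing constants evenly between the 5-day and 100-day values and converts each one back to its equivalent number of days by inverting the relationship above.

```python
from typing import List


def smoothing_constant(days: float) -> float:
    """Exponential smoothing constant equivalent to a given number of days."""
    return 2.0 / (days + 1.0)


def equivalent_days(constant: float) -> float:
    """Inverse of smoothing_constant: days = 2/constant - 1."""
    return 2.0 / constant - 1.0


def equal_constant_periods(fast_days: float, slow_days: float, n_tests: int) -> List[float]:
    """Test periods (in days) whose smoothing constants are equally spaced."""
    fastest = smoothing_constant(fast_days)   # largest constant
    slowest = smoothing_constant(slow_days)   # smallest constant
    step = (fastest - slowest) / (n_tests - 1)
    return [equivalent_days(fastest - k * step) for k in range(n_tests)]


test_periods = equal_constant_periods(5, 100, 20)
for days in test_periods:
    print(f"{days:7.2f} days  (constant {smoothing_constant(days):.4f})")
```

The resulting periods crowd together at the fast end and spread out at the slow end, which is the opposite of equal-day spacing.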
Figure 10-4. Trend distribution. (a) Equal test periods. Equal test period increments result in very different % changes. (b) Equal smoothing constants. Smoothing constants, which can be viewed as a percentage, show how the test periods, in days, are closer together for faster trends.

A series of tests in which the trend speeds change by an equal percentage gives a much better sample of overall performance than equally spaced periods. An exponential moving average is an easier choice for accomplishing this because an equal spacing of smoothing constants is the same as an equal percentage change. Column (4) has an equal distribution of smoothing constants, beginning and ending at the same values as in column (3). Column (5) gives the number of days approximately equal to the smoothing constants in column (4), converted using

days = (2/smoothing_constant) - 1

Figure 10-4(b) compares the pattern of the test periods in equal days with the pattern of equal percentages necessary to achieve an even distribution of performance.

• The number of trades will show whether there are enough trades to have sound results. A rough idea of the accuracy is given by

sample error = 1/@SQRT(number of trades)

• Maximum drawdown, on a day-to-day basis, measures the peak-to-valley decline in equity, and gives the minimum capital needed for trading. Although one test may have a smaller equity variation, measured by the standard deviation, the maximum drawdown can remain the same because both models were on the same side of a severe price shock. The model with the smaller standard deviation shows a more acceptable equity variation during normal markets, but both require the same investment from peak to valley. Maximum drawdown is often used for a worst-case scenario. Unfortunately, it is rarely the worst case.

• Risk-adjusted returns is the most important performance measurement. It compares standardized returns at the same risk level.
• Percentage of profitable trades gives an indication of the consistency of performance. More frequent profits normally translate into less equity fluctuation. A very low percentage shows dependence on a few large price moves. Each type of system, trend-following or countertrend, has a recognizable profile. Trend-following systems should have from 35 percent to 45 percent profitable trades, while countertrend programs should exceed 60 percent successful trades. Variations from these patterns should be examined closely.

• Time to recovery, although similar to risk, gives a different interpretation. It measures the time between new equity highs. From a practical view, a larger equity drop but a very fast recovery may be preferable to a smaller decline with a slow recovery.

Step 12. How Will the Output Be Presented?

If you only saw the most profitable result from a set of 500 historic tests of various parameter combinations, you would have no idea whether the strategy was robust. This chapter tries to stress that the combined performance of a wide range of parameters determines the level of confidence. Within this total picture, patterns of performance can be used for making the final parameter selection. For example, positions held longer will normally have a higher profit per trade; other tests that limit risk may show a better return/risk ratio. As the parameters that indicate trading frequency or risk control move from small to large values, performance should change in a continuous pattern.

The presentation of test results can make the final parameter selection a much simpler task. Tests are commonly presented line by line, giving the results of the first moving average speed and the incremented stop-loss, similar to the presentation in Table 10-1. By changing the form to a two- or three-dimensional chart, the results become much more useful.

A Two-Dimensional Display. A bar or line chart is a two-dimensional display.
It can show net profits or profits per trade versus trend speed. In Figure 10-5, line a shows that the profits per trade are erratic for a very small stop-loss and trend speeds under 20 days. Results become more consistent above 20 days. The center gray zone holds the best trends. Line b shows the profits per trade from the same trend speeds with a slightly larger stop-loss. Results improve uniformly, but the original pattern remains the same.

The line chart in Figure 10-5 works for this example, but becomes unreadable when many lines are drawn for each stop-loss tested. Instead, a contour map (Figure 10-6(a)) shows the patterns clearly. In Figure 10-6(b), which holds the values plotted in the contour map, the trend speed is the left scale and the stop-loss is along the bottom. The fastest strategy, combining the shortest trend and smallest stop-loss, shows profits per trade of 0.07 percent in the upper left corner. The slowest strategy and the largest stop-loss give a much larger profit per trade of 0.22 percent in the lower right corner. Clustered in the center are the peak results.

Part 3: Evaluating the Results

Using Averages and Maps

The average minus the standard deviation gives the Best Choice Index, which is simply the chance of picking a trading model that will produce an average result. The contour map display can help locate broad areas of success and prevent the selection of a trading model that targets a profit per trade too small for practical use. If the overall picture is good, the strategy is profitable, and results are smooth over most of the map, the chance of choosing a successful model is also good. The following questions will help qualify the results.

Step 13. Are the Calculations Correct?

Before going further, step back and ask yourself whether you have checked all the calculations. Did you manually verify a few lines in the spreadsheet?
Did you calculate, in advance, the exact entry and exit prices for a number of trades that used different rules? Do the answers look reasonable? Even the best analyst can make an error typing a formula. Do not waste time running hundreds of tests without verifying the results.

Step 14. Were There Enough Trades to Be "Significant?"

In Step 11, the sample error was given as sample error = 1/@SQRT(number of trades). Therefore, if there are only 16 trades, the error in the performance is ±25 percent. It requires 400 trades to keep the error to 5 percent, considered the minimum acceptable size, but few systems produce that many trades. The only alternative is to be sure that the underlying premise is sound, and to produce as many sample trades as possible.

Step 15. Does the Trading System Produce Profits for Most Combinations of Parameters?

What are the chances that any selection will be profitable? Are the patterns continuous? A robust system must be broadly successful. When you look at the test results, you should see mostly profits, and the Best Choice Index must be positive, giving an 84 percent chance of success. Use the average less 2 standard deviations to get approximately the 97.5 percent level, and the average less 3 standard deviations to exceed the 99.5 percent level. The higher the probability, the more robust. The contour map display should show continuous patterns, as in Figure 10-6(a). Jagged peaks and valleys may be caused by specific rules that work in one test case but not others.

Step 16. Did Logic Changes Improve Overall Test Performance?

When a new rule or calculation is added to the program, the results are robust if they improve the Best Choice Index. This assures that the change in logic was not pointed toward a specific event, but was a general improvement. A higher Best Choice Index occurs when the average of all tests increases while the standard deviation does not increase, or the average remains the same while the standard deviation decreases.
A smaller standard deviation indicates improved consistency and makes it easier to select successful parameters. These cases are shown in Figure 10-8.

Step 17. How Did It Perform on Out-of-Sample Data?

At least 10 percent of the test data should have been set aside. Even better, the 10 percent oldest and most recent data should not have been used for testing. Once the trading strategy has been finalized, test that data separately and compare the average of all tests against the average of the final tests of the longer set of historic data. Even in the best of cases, you can expect profits to be lower and risk higher; however, the pattern should be similar to the tested profile.
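The checks in Steps 14 through 17 reduce to a few lines of arithmetic. The following sketch is only an illustration: the function names are hypothetical, the Best Choice Index is computed with the sample standard deviation (the text does not say whether the sample or population form is intended), and the out-of-sample split simply reserves the oldest and most recent 10 percent of a chronologically ordered list.

```python
import math
import statistics
from typing import List, Sequence, Tuple


def sample_error(number_of_trades: int) -> float:
    """Rough accuracy of a test: 1 / sqrt(number of trades)."""
    return 1.0 / math.sqrt(number_of_trades)


def best_choice_index(test_results: Sequence[float]) -> float:
    """Average of all test results minus one standard deviation.

    Uses the sample standard deviation, which is an assumption of this sketch.
    """
    return statistics.mean(test_results) - statistics.stdev(test_results)


def reserve_out_of_sample(
    data: List[float], fraction: float = 0.10
) -> Tuple[List[float], List[float], List[float]]:
    """Split chronological data into (oldest reserve, test set, newest reserve)."""
    k = int(len(data) * fraction)
    return data[:k], data[k:len(data) - k], data[len(data) - k:]


print(sample_error(16))    # 0.25, the 25 percent error cited for 16 trades
print(sample_error(400))   # 0.05, the 5 percent error cited for 400 trades

# Hypothetical profits per trade from three parameter sets.
print(best_choice_index([1.0, 2.0, 3.0]))

oldest, in_sample, newest = reserve_out_of_sample(list(range(100)))
print(len(oldest), len(in_sample), len(newest))
```

Comparing `best_choice_index` before and after a rule change, over the same set of parameter combinations, gives the Step 16 test: the change is a general improvement only if the index rises.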
Figure 10-8. Selecting a robust system using the performance curve and Best Choice Index. (a) When the performance curve flattens and widens, the results get worse. The average returns remain the same, but the standard deviation gets larger, causing the Best Choice Index to drop. (b) When the average shifts to the right or left, the overall performance gets better or worse, as long as the standard deviation remains the same.

Results of the out-of-sample test that are very different from the other tests must be reviewed carefully. Poor results indicate that the strategy is not working. The use of a chi-square test (see Chapter 11) will show whether this failure is part of the long-term performance profile or indicates that something is wrong. You may have an error in the rules or calculations, but that should have been corrected long before this point. Or the test period might have been too short, resulting in unstable results.
Feedback Dilemma. Once you have used the out-of-sample data to verify the system, you can no longer use that data again. Inspecting the trades and adding rules may produce a valid improvement, but you have made it work in the "unseen" data; therefore, you have no way to check the results. You might include the new data and omit some other piece; however, the reliability of the results has dropped.

Part 4: Choosing the Specific Parameters to Trade

The final selection of a trading model is a combination of profits, risk, and personal preference. A program that holds trades for weeks may produce the highest profits per trade but may not meet the investor's short-term objectives. Even though individuals may choose differently, the most robust systems offer the best platform from which to select. This section asks questions that are important, regardless of your specific goals.

Generally, selecting from models that hold positions longer gives more dependable results. It is also more difficult to assess the expected returns from faster trading models. Figure 10-9 shows the hypothetical results of a trend system, where the fastest trading model is posted at the left. Performance is erratic, although a smoothed line can give a better idea of expectations. In actual trading, the 6-day trend may capture the next big profit, while the 4- and 8-day trends post losses. A comparison of fast and slow strategies shows that:

• Faster trading is more sensitive to current market patterns.
• Faster trading gives up a large percentage of profits and losses to [...]
• Faster trading may have the same large losses due to price shocks, but [...]

Regardless of the trading strategy, taking the long-term view is the more conservative, reliable approach. Although the long-term strategy may have larger absolute losses, it often has a better return/risk ratio than faster programs.

This does not mean that you cannot have a system that works well trading fast.
The performance must be high when you draw the smooth line through the irregular results. You must also expect real returns to be erratic. Tests plotted in Figure 10-9 show that results can vary significantly from expectations, especially with fast-trading methods. You should expect real returns to vary even more than the tests show.

Step 18. Did the Last Test Include the Most Recent Data?

Having reserved some data for out-of-sample testing (see Step 17), the program should be retested using all data. This is particularly important if the out-of-sample data is the most recent. Once the model is operational, retesting should be performed whenever 5 to 10 percent new data is available, or unique market patterns occur. The model may be adjusted by a small amount, but it will become ever so slightly more robust.

Step 19. Did You Choose from an Area of Broad Success? Was it the slow selection?

The contour map shows whether the performance of the strategy has a smooth or irregular pattern with respect to parameter changes. The areas of broad success show stability and are often associated with slower trading models. A choice of a faster strategy must be justified by a larger profit per trade and reasonably high reliability to compensate for inherent uncertainty. The worst-performance case in the neighborhood of the selection should still be acceptable. Figure 10-9 shows that erratic results associated with short-term trading should be considered as smoothed when selecting from this region.

Step 20. Are Profits Distributed Evenly over the Tested History?

Study the trades and equity of the final model to see whether profits and losses alternate in a reasonable pattern. A standard deviation of the equity changes, time to recovery, and other statistical measures give the
relative merits of one test against another, but only a visual study is good enough before you begin trading. It may be helpful to look at quarterly results to see consistency.

Figure 10-9. The typical results of a trend-following strategy optimization, with faster trading at the left and slower trading at the right. By selecting the peak profits, or return/risk ratio, results often favor isolated returns of short-term trends. The chance of repeating this performance in actual trading is very slim. The smoothed line is the most likely return.

Step 21. Are the Profits per Trade Large Enough to Absorb Errors?

When two tests have similar risks and returns, the best choice is the one with the largest profits per trade. Larger profits absorb unexpected problems (e.g., slippage in a fast market) that result in lost profits when an order cannot be executed. Establish a minimum acceptable profit per trade.

Step 22. Did the Historic Results Show Any Large Losses due to Price Shocks?

Price shocks are unpredictable events. Your program should show as many losses as profits from price shocks, although some may be controlled by a stop-loss. Check the obvious past price shocks against the system trades. If the system profited from all of them, or avoided the losses, the results are overfitted or just lucky. You cannot expect the program to profit from unpredictable events in the future. The danger of trading a system that has not shown losses from price shocks is that the risk appears unreasonably small. This leads to greater leverage and large losses.

Step 23. Have You Risk-Adjusted the Returns to Your Acceptable Risk Level?

The return/risk ratio turns absolute performance into relative returns and allows the fair comparison of each model. Traders, however, must establish their own acceptable risk level. Decide, for example, that you are willing to take a 1 percent chance of losing more than 10 percent during any month.
Then the system you trade must show a risk (measured as 1 standard deviation of the monthly equity changes) of less than 3 1/3 percent. Three standard deviations will be 10 percent. Remember that equity changes based on monthly data are already smoothed. You can expect larger mid-month equity fluctuations, sometimes as much as 50 percent greater.

Part 5: Trading and Monitoring Performance

No amount of testing can substitute for trading. As soon as the first position is set, you may realize that the transaction costs used in testing were too low, you cannot execute the full position in the cash market after the New York close, or that a breakout signal produced liquidity gaps. Monitoring the system signals against actual trading provides information that will continue to improve the testing process.

Step 24. Are You Following the Same Rules That Were Tested?

Real trading results often vary from test results because the rules used in testing are not followed. The size of the transaction costs or the liquidity of the market may also prevent you from executing the full position. Most often, it is the execution technique. When you wait until after the computer has given a trading signal, the trade price and the theoretical computer signal are far apart. This is solved by anticipating the computer signal. To be a successful system trader, you must execute at the same time the system is executing. Chapter 11 shows how to anticipate a computer signal.

Step 25. Are You Trading the Same Data That Was Tested?
Although it is convenient to test a strategy using a continuation or "perpetual" contract, the results will not be the same when you trade cash or futures contracts. Be sure that you are trading the same market that was tested, and that you tested the same market you are trading.

Step 26. Are You Monitoring the Difference between the System and Actual Entries and Exits?

Understanding how to test a strategy comes from identifying why testing and actual trading results are different. Monitor the theoretical signals, real executions, and the percentage of trades that cannot be executed, then retest the strategy with these improved values. In time, you will be able to show very realistic test results.

Other Important Practical Guidelines

Even the most careful, responsible testing cannot show how the system will perform when it is traded. From the preceding guidelines, experience shows that the following points should be highlighted:

• Slower systems, those using longer periods of evaluation, perform [...]
• Avoid systems that do not show downside risk. Absence of risk is an [...]
• View test results as a smoothed line. In a robust system, expect peak [...]
• Avoid systems with low reliability. They may indicate dependence [...]
• Avoid systems that have only a few trades. They may not yet show an [...]

More Data Give More Predictable Results

It is worth repeating the importance of using more data, rather than less data, for testing. More data contain more price patterns, sustained moves, and price shocks. Many people argue that old data lack relevance because markets have changed. In specific cases, and for some applications, that is true. It is safer to assume that there is more danger than benefit in using small amounts of data. A system tested over the past 3 years will not see the largest price shocks of the recent 10 years. Yet you must expect that even larger shocks will come. If you capitalize an investment according to recent risk, you will not survive for long.
The greatest failure in trading is undercapitalization, and this is the direct result of unrealistic expectations of risk. If recent data are best for maximizing profits, more data are best for risk evaluation. It is possible to test a strategy twice, once for parameter selection and once for risk.

Because tests of more years of data show lower profits and higher risk, they are not viewed as desirable. It is much more pleasing to choose from the high returns and low risk of shorter test intervals. But the reality is that the longer tests are more representative of real trading results. Choosing to ignore these results does not produce greater profits.

Start by Knowing the Answer

The best use of computer testing is to verify a theory. If your idea is good, then testing various time intervals, entry and exit criteria, and risk management parameters should show reasonably consistent returns. It may show that your theory is good for short-term patterns, but not for the longer view; however, it should verify your idea. A concept based on an understanding of the market, whether economic, statistical, or price patterns, is a valid, valuable basis for a system and the best way to begin the development of a trading program. Feeding a test package a multitude of indicators, rules, and price series, and letting it crunch away until it combines them into a profitable result, has a very low chance of producing a successful trading system.

Errors of Omission

"Survivor bias" and the failure to apply a worst-case scenario are two problems classified as errors of omission. Omissions constitute an unseen trap for analysts. It is far easier to account for odd patterns and price shocks than to consider situations that do not appear in the data.

Survivor Bias. The selection of certain stocks, funds, and investment managers for testing unconsciously omits the worst cases: those where the company or manager went out of business.
A classic case of survivor bias is in the review of investment managers. The one who generates the highest profits may have the highest risk. If you review only those managers currently reporting, you do not find out that all managers with comparably high risk were previously forced out of business by losses. These comparisons result in unrealistically low estimates of risk. Similarly, the selection of specific stock issues means that those firms have not seen the patterns that precede failure. Even the largest firms are no longer as secure as we once thought. Drexel Burnham, E.F. Hutton, Stotler, and the Pennsylvania Railroad (also the Penn-Central, with the most assets of any company in the United States) proved that mismanagement and litigation can ruin even the biggest. IBM, the auto giants, and insurance companies no longer look inviolate. It is difficult to assess risk properly if you only study the winners.

Worst-Case Scenarios. More difficult, yet just as important, is the ability to conceive "worst-case scenarios." What might cause a market to go to new high prices, fall to new lows, or become twice as volatile as the worst period in history? If this happens, what steps do you take to stabilize risk? Or, do you remove those markets from your portfolio? Will the trading strategy perform properly if prices move to levels not seen in historic data? Will previously uncorrelated markets move together?

These scenarios are critical to risk control. Often, there are no immediate answers to these hypothetical cases, but only a general confidence that the current strategy has the flexibility to adapt to market change. That is not always enough. A sharp drop in one market can force a need for capital, causing investors to liquidate unrelated assets to finance the losing ones. This results in a broad reversal in many investment areas.

Data Integrity

The assumption that a historic data series is correct can result in a tremendous loss of time.
All data should be scanned for gross errors before being used. Data received electronically or on disk from a reliable vendor may still have problems. Testing and evaluating a system takes time. To find a data error after weeks of work means that all the testing must be done again. A few fast steps can avoid that aggravation and cost.

1. Look at a price chart of all the data to be used. Any serious data problem will be obvious.
2. If you have strategy-testing software, identify opening, high, low, or ...
3. When the final model has been selected, look at the profits and loss ...

Patching the Problems

Trading strategies succeed by generalization. Most plans are profitable because they grind out larger profits than losses. The problem with a general or statistical solution is that it is blind to specific cases, but the trader is well aware of the reasons for big price moves. Each major move and price shock can be explained. By carefully studying the cause and patterns of larger trading losses, indicators and rules can be combined to control the losses, leaving a more profitable performance profile.

But the next big move is always different. Big moves can be explained in retrospect, but they rarely fit a prescribed pattern. Explaining each loss has intellectual satisfaction but falls short of reducing trading risk. Fixing each case based on its own features is still "overfitting."

Do Not Oversolve the Problem

A young analyst, trying to do his best, produces an answer to four decimal places, when each of the inputs had only two places of accuracy. You cannot create more accuracy than you have. Technical models, based on either price patterns or statistics, do not depend on one price move or a single trade. They succeed over a large number of events. Fine-tuning a moving average can be counterproductive because it moves away from the general solution. A specific trend speed that avoided a large loss has no way of avoiding similar losses in the future.
Oversolving or overtesting produces unrealistic expectations of system performance.

Accuracy and Test Time. For most system tests, there is a direct relationship between accuracy and calculation time. The more time it takes, the better the result. Is it better to test exponential smoothing constants in steps of .1, .01, or .001? There can be roughly 10, 100, or 1000 tests in the range .1 to .9, based on the test increment. But 1000 tests is wasted accuracy, just as testing stop-losses in $5 total investment increments is naive. Is it important if the one that was not tested showed twice the profit of the two adjacent tests? If you are still trying to find peak profits rather than the best system or contour, then you are wasting your time.

Trends are intended to smooth data. Fine-tuning a trend seems inconsistent with the concept of smoothing. If you select a 154-day moving average because a large loss was averted, while a 153-day average was caught, you have a basic misunderstanding about the implied accuracy of a system. New powerful computers with increased speed have made it painless to run large, meaningless tests. When computers were slower or resources limited, it was necessary to reason out the benefits of each hour spent on the machine. The "broad-brush" approach may still be the most efficient use of time and a way to prevent overfitting.

Summary

The method of finding a trading strategy can increase or decrease your chance for success. Using sound procedures and statistical methods is safe and conservative. This includes long data series that encompass as many unique situations as possible. In addition, global statistics, which average all the tests, are an excellent measure of a robust system and prevent the temptation to seek high-profit simulations. When using averages, it is clear which strategies and new techniques are best.
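As a rough sketch of what averaging all the tests means in practice, the example below compares two sets of test results. The profit figures and the `summarize` helper are assumptions made up for this illustration; they do not come from any actual test. Each set holds the net profit for five moving-average lengths, tested in broad steps.

```python
from statistics import mean

def summarize(results):
    """Return (average profit, peak profit, fraction of profitable tests)."""
    profits = list(results.values())
    average = mean(profits)
    peak = max(profits)
    profitable = sum(1 for p in profits if p > 0) / len(profits)
    return average, peak, profitable

# Hypothetical net profits keyed by moving-average length in days.
broad = {20: 900, 40: 1100, 60: 1000, 80: 950, 100: 1050}
spiky = {20: -800, 40: -500, 60: 4000, 80: -700, 100: -600}

for name, results in (("broad", broad), ("spiky", spiky)):
    average, peak, profitable = summarize(results)
    print(f"{name}: average={average}, peak={peak}, profitable={profitable:.0%}")
```

Judged by peak profit alone, the second set looks far better. Judged by the average of all tests and the share of profitable tests, the first set is the more robust one, because its result does not depend on choosing a single parameter value.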
©2007 Olesia