The Importance of Betting Early

We evaluate the impact of timing on decision outcome, when both the timing and the relevant decision are chosen under uncertainty. Betting markets provide the testing ground, as we exploit an original dataset containing more than one million online bets on games of the Italian Major Soccer League. We find that individuals perform systematically better when they place their bets farther away from the game day. The better performance of early bettors holds controlling for (time-invariant) unobservable ability, learning during the season, and timing of the odds. We attribute this result to the increase of noisy information on game day, which hampers the capacity of late (non-professional) bettors to use very simple prediction methods, such as team rankings or last game results. We also find that more successful bettors tend to bet in advance, focus on a smaller set of events, and prefer events associated with smaller betting odds.


Introduction
According to the empirical evidence, for the same bettor, the probability of making a correct forecast is higher when the bet is made on the days before the event; as opposed to bets on game day, the chance of winning increases by 1.3 percentage points (that is, by about 3% with respect to the average). The effect is larger when big teams or multiple bets are involved (about 5% in both cases). The relationship between betting early and winning is monotonic, as the probability of a correct forecast is larger the higher the number of days from the event, up to the maximum of 5 days. This evidence supports the hypothesis that information overload may occur; as the event becomes closer, individuals receive more information than they are able to properly digest, therefore increasing the probability of mistakes.
The estimated individual fixed effects show that successful (non-professional) bettors also tend to place their bets in advance. Furthermore, they are more selective, as they place a smaller number of bets in the same week, and tend to focus on events associated with lower betting odds, which are arguably easier to forecast.
The paper is organized as follows. Section 2 reviews the relevant literature. Section 3 describes our empirical strategy and data. The empirical results are presented and discussed in Section 4. Section 5 concludes.

Related literature
Since the 1970s, sports forecasting has been the object of extensive research motivated by two main reasons: (i) to ascertain if betting markets are informationally efficient and enable learning processes, and (ii) to check if experts make more accurate predictions than nonexperts. Both strands of the literature aim at analyzing the conditions under which the availability of comprehensive information and professional advice is fully discounted by market prices (that is, betting odds) and rules out observable biases that could allow speculators to make higher-than-average returns.
A large body of empirical evidence supports the view that bettors' behavior does not conform to the rational decision model and is affected by a number of cognitive biases (Diecidue et al. 2004). First, bettors show a clear tendency to under bet favorites and over bet longshots (Golec and Tamarkin 1995). Second, they exhibit decision biases such as confirmation, gambler's fallacy, and overconfidence related to inaccurate information processing (Blavatskyy 2009). Third, bettors adopt a series of heuristics whose suitability is context-dependent (Conlisk 1993). Finally, they are not effective enough in discounting the effect of noisy and redundant information and in reducing the impact of information overload (Bleichrodt and Schmidt 2002).
A major strand of research concerns horse-race betting, which is a naturally occurring asset market in which the transmission of information from informed to uninformed traders is not typically smooth. This betting market is efficient if it aggregates less-than-perfect information owned by all the participants and disseminates it to all the bettors, through the publicly available information given by track and bookmakers' odds and handicappers' picks. Figlewski (1979) investigates odds and forecasts of a number of bookmakers and experts, concluding that racetrack betting markets discount the available information quite well, although bettors exhibit different degrees of accuracy depending on whether they are on-track or off-track bettors. Snyder (1978), Hausch et al. (1981), Asch et al. (1984), and Ziegelmeyer et al. (2004) provide evidence on the tendency to under bet favorites and to over bet longshots relative to their winning probability.
Baseball, basketball, football, and soccer are sports in which the sources of insider information are less relevant than in racetrack betting. Pope and Peel (1989) analyze the fixed odds offered by bookmakers and the forecasts made by professional tipsters on UK soccer league games. They argue that betting markets are efficient in preventing bettors from gaining abnormal returns on the basis of public information, but odds do not fully reflect all the available information. This finding is confirmed by Forrest and Simmons (2000), who consider newspaper tipsters offering professional advice on English and Scottish soccer games. They conclude that tipsters show a clear inadequacy in discounting the information publicly available in the newspaper. Moreover, their performance in predicting games is less successful than following the very simple strategy of betting on home wins.
The fact that the condition of being experts is not necessarily associated with a high degree of forecasting accuracy is extensively discussed by Camerer and Johnson (1991) for various domains (medical, financial, academic). Their conclusion is that experts' superiority in processing information is not strictly related to performance superiority, which is crucially affected by the matching of experts' cognitive abilities with "environmental demands" (Camerer and Johnson 1991, p. 213). An interpretation of this finding can be traced back to the paper by Oskamp (1965), who argues that the extent of collected information cannot be directly related to predictive accuracy. While predictive ability reaches a ceiling once a limited amount of information has been collected, confidence in the ability to make accurate decisions continues to grow proportionally (Davis et al. 1994). This induces overconfidence in decision-makers, who become even more convinced of their understanding of the case at hand, independently of the quality of collected information (Angner 2006). Further exposure to sources of information is consequently distorted by the confirmation bias, according to which once decision-makers devise a strong hypothesis, they will tend to misinterpret or even misread new information unfavorable to this hypothesis (Kahneman and Tversky 1973).
Gigerenzer et al. (1999), Benartzi and Thaler (2001), Martignon and Hoffrage (2002), Rieskamp and Otto (2006), and Gigerenzer and Goldstein (2011) argue that decision making can be better explained by models of heuristics rather than by the standard rational decision model. Quoting Goldstein and Gigerenzer (2009, p. 3), "cognitive heuristics are strategies that humans and other animals use. We call them fast because they involve relatively little estimation and frugal because they ignore information. A heuristic is not either good or bad per se. Its performance is dictated by features of the information environment, such as low predictability, or high cue redundancy." Anderson et al. (2005) use the recognition heuristic to account for non-experts' performance in soccer betting. According to Newell and Shanks (2004), the recognition heuristic is assumed to demand little time, information, and cognitive effort, and exploits the relationship between a criterion value (e.g., success in home win) and its predictors (e.g., team rank position).
Heuristics perform quite well in environments affected by noisy and redundant information such as sports forecasting. Noisy information is defined as an information structure in which not only can one signal indicate several states, but also several signals can occur in the same state (Bichler and Butler 2007; Crawford and Sobel 1982). In Dieckmann and Rieskamp (2007), redundant information is defined as information composed of pieces that are highly correlated with each other and support the same prediction (positive redundancy), or that contradict each other and suggest incompatible predictions (negative redundancy).
By again quoting Oskamp (1965), if bettors are provided with a very rich source of information without activating a costly search process, confidence increases in relation to the beliefs that they had before, because they are able to find explanations for that. For example, Bettman et al. (1993) provide support for the notion that people also select strategies adaptively in response to information redundancy. They show that participants choosing between gambles search only for a subset of the available information when they encounter a redundant environment with positively correlated attributes. Negatively correlated attributes, in contrast, give rise to search patterns consistent with compensatory strategies that integrate more information. This cognitive bias is known as the illusion of knowledge, according to which beyond a threshold more information on the event increases self-confidence more than accuracy (Barber and Odean 2002).
This condition of "information overload" characterizes media information on Italian soccer, which provides the ground for our empirical analysis. The amount of information to be processed is greatly increased by the variety of communication systems on TV, the internet, and newspapers. Furthermore, much of the information is not original and watchers continuously process information received from other sources but differently presented. The introduction of online betting causes a further increase in the availability of information, which is also diffused by online betting sites. Our dataset, which is described in the next section, includes small bets, generally evenly distributed across individuals. Therefore, it can be safely assumed that the individuals contained in our dataset are non-expert bettors.

Empirical strategy and data
Based on the literature surveyed in the previous section and on the available data, we test the following behavioral hypothesis.
H1 (information overload): As the event approaches, the amount of noisy information available to bettors increases, therefore reducing their winning ability.
At the same time, we control for the following confounding hypothesis.
H2 (learning): Bettors improve their performance over time, as they get more acquainted with the environment and the relative strength of the teams.
We use a unique (large) dataset of online bets from a provider specialized in this field.
The company is located in Southern Italy, but bets are made from all over the country. Users have to register and then bet online through credit card payments. We were provided with bets on all games of 20 game weeks of the Italian Soccer Major League (Serie A), namely, the last 10 weeks of the 2004/05 season and the first 10 weeks of the 2005/06 season. Our dataset includes 1,205,597 single bets made by 7,093 registered users. Single bets may also be part of multiple bets including more than one event and may concern several events (e.g., which team wins, draw, goals scored, goals scored in the first half, and so forth). Multiple bets increase potential profits and are won only if all the events happen at the same time. In our analysis, we focus on the simplest events 1, X, 2, 12, 1X, and X2 (where 1 stands for home win, X for draw, and 2 for away win). These types of event account for 85% of all bets.
The occurrence that bettor j correctly forecasts event i at game week t ($W_{ijt}$) can be modeled as follows:
$$W_{ijt} = g(D_{ijt}) + f(t) + X_{ijt}'\beta + \gamma_j + \varepsilon_{ijt} \qquad (1)$$
where $g(\cdot)$ is a function of the betting distance $D_{ijt}$; $f(\cdot)$ is a function of game week $t$; $X_{ijt}$ is a vector of observable characteristics of the bet and the event; $\gamma_j$ are individual fixed effects (capturing all time-invariant characteristics of bettor j); and $\varepsilon_{ijt}$ is an idiosyncratic error clustered at the event level. To test H1 (information overload), we consider three specifications of $g(\cdot)$: a linear function of $D_{ijt}$ ("betting distance"); a dummy equal to one if the bet is placed before the game day and zero otherwise ("betting early"); and a non-parametric specification including a set of dummies for each value of the betting distance.
Table 1 reports the descriptive statistics of our variables. In our data, 45% of single bets are successful. This does not mean, however, that bettors have such a high winning rate, because single bets may be part of multiple bets (on average slightly more than 5 bets are made in each play, with considerable variability), and some of them may be wrong. Indeed, the winning rate in multiple bets is quite low: 5% on average.
Most bettors place their play on the same day of the event, while early bettors (i.e., those who play in the previous days) are about 32%. The average amount spent per bettor in a game week is 211 Euros, again with a large standard deviation. Almost 40% of bets are made on the main four teams. Table 2 provides information on the above variables and on bettors' socio-economic characteristics by betting distance. We also test whether means are different between bets placed on game day and bets placed before. Thanks to the large sample size, many differences are statistically significant, although most of the time economically small. Early bets tend to be placed on stronger teams, and to be associated with a larger number of multiple bets.
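As an illustration of how a specification like equation (1) can be estimated, the following is a minimal sketch of a linear probability model with bettor fixed effects, obtained by demeaning all variables within bettor (the "within" transformation). It is not the authors' code: the data are synthetic, the function name `within_ols` and every numerical value are hypothetical, only the linear specifications of betting distance and game week are included, and standard errors clustered at the event level are omitted.

```python
import numpy as np


def within_ols(y, X, groups):
    """OLS on group-demeaned data (the fixed-effects 'within' estimator)."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    _, idx = np.unique(groups, return_inverse=True)
    counts = np.bincount(idx)

    def demean(v):
        # Subtract each group's mean from its own observations.
        return v - (np.bincount(idx, weights=v) / counts)[idx]

    y_dm = demean(y)
    X_dm = np.column_stack([demean(X[:, k]) for k in range(X.shape[1])])
    beta, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return beta


# Synthetic data; every number below is hypothetical, not taken from the paper.
rng = np.random.default_rng(0)
n_bettors, bets_each = 2000, 50
bettor = np.repeat(np.arange(n_bettors), bets_each)
n = bettor.size
ability = rng.normal(0.0, 0.05, n_bettors)      # gamma_j (bettor fixed effect)
distance = rng.integers(0, 6, n).astype(float)  # D_ijt, 0 to 5 days
week = rng.integers(1, 21, n).astype(float)     # game week t, 1 to 20

# Assumed data-generating process for the probability of a correct forecast.
p_win = np.clip(0.45 + ability[bettor] + 0.008 * distance - 0.001 * week, 0.0, 1.0)
won = (rng.random(n) < p_win).astype(float)     # W_ijt

X = np.column_stack([distance, week])
beta = within_ols(won, X, bettor)
print("betting distance:", beta[0], "game week:", beta[1])
```

Because the bettor means are subtracted before the regression, any time-invariant bettor characteristic drops out, so the coefficient on betting distance is identified only from bets that the same individual places at different distances from the event.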

Empirical results and discussion
Tables 3, 4, and 5 report our baseline specifications as in equation (1). In the first three columns, we do not control for individual fixed effects, whereas this is done in the last three columns. The latter represent our preferred specifications, but it is instructive to compare results with and without fixed effects. As discussed above, to control for possible learning we use three specifications: linear trend in game week (columns 1 and 4); quadratic trend (columns 2 and 5); and full set of game week dummies (columns 3 and 6). The difference between the three tables concerns how we model betting distance: linearly in Table 3; with the dummy "betting early" in Table 4; and with a full set of dummies for each value of the betting distance, which is measured in days, in Table 5.
Table 3 shows very similar results across all specifications. The coefficient of betting distance is significantly positive and very stable: the farther away from the event date the bet is, the higher the probability of winning. On average and for the same bettor, betting one day earlier increases the chance of winning by about 0.8 percentage points, that is, by about 1.8% with respect to the average probability of a correct forecast. This provides evidence of possible information overload. As the season goes on, however, bettors' performance worsens, as highlighted by the significantly negative coefficients for the game week trend in both the linear and quadratic specifications.
Consistently with the previous literature, we find very strong effects for both home wins and strong wins (equal to 40.8% and 60.9%, respectively, with respect to the average outcome).
The probability of winning is positively and significantly affected by the monetary amount that each player bets, suggesting that bettors exert more effort when more money is involved, with a large effect with respect to the average outcome (37.4% for an increase of the amount bet equal to its standard deviation). Betting on the main teams gives a higher probability of winning.
Betting on more than one event also increases the probability of winning, although by only 0.8%. Columns 2 and 5 include the variable game week squared. We do not report its value since it is extremely small (in the order of four decimals); therefore the linear specification is fairly good. As we would also expect, higher odds are associated with a lower probability of winning (on average by -46.0% for an increase of odds equal to its standard deviation).
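The relative effects quoted alongside the percentage-point estimates can be checked with simple arithmetic. As a worked example, assuming that the "average" is the 45% success rate of single bets reported in the descriptive statistics, the 0.8 percentage-point effect of betting one day earlier and the 1.3 percentage-point effect of the "betting early" dummy translate into:

```latex
\[
\frac{0.008}{0.45} \approx 0.018 = 1.8\%,
\qquad
\frac{0.013}{0.45} \approx 0.029 = 2.9\%.
\]
```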
[Tables 3, 4, and 5 here]
In Table 4 the regressor of interest is the dummy "betting early," equal to one if the bet was placed on one of the 5 days preceding game day. This variable is significantly positive, meaning that the probability of making the correct forecast is higher when the bet is made in advance. On average and for the same bettor, the chance of winning increases by 1.3 percentage points (that is, by 2.9% with respect to the average). All the other variables confirm their behavior from a qualitative and a quantitative point of view.
We also address heterogeneity issues, that is, we assess whether the effect of betting distance is stronger in specific subsamples. This is meant to further evaluate our information-overload interpretation of the positive effect of betting early. Specifically, in Table 6, we distinguish between bets on one of the main teams and on all the other teams. In Table 7, we distinguish between bets placed on many events (that is, above the median number of events associated in multiple bets) and bets placed on fewer events. Table 8 distinguishes between "hard bets" (that is, bets whose amount is above the median value, where we consider the amount of the multiple bet made by the individual) and all the others. In the last row of each table, we report the p-value of the Wald test on the equality of the estimated coefficients of betting distance for each pair of subsamples. The subsample coefficients are statistically different from each other only in the case of "many events" and in some of the estimates for "main teams."
[Tables 6, 7, and 8 here]
In particular, in Table 6, betting distance is always significantly positive, but the size of its coefficient is about three times larger when only the main teams are involved in the bet.
This is consistent with our interpretation of the positive impact of betting early, because information overload on the event date is expected to be even more relevant for major teams.
Compared with the previous estimations, another relevant variable changes its behavior: game week is usually positive in the linear specification when the main teams are included, and negative otherwise. Therefore, we observe some positive learning when the main teams, which are usually in the newspaper spotlight, are involved.
In Table 7, interestingly, the effect of betting early is quantitatively larger for bets linked to other bets in a multiple play. Again, in these circumstances, information overload is likely to exacerbate fallacies in decision making and to reduce the probability of winning. In Table 8, instead, we do not detect statistically significant differences in the size of coefficients for "hard bets" versus the others. Interestingly, registered users that place the 50% of bets that we code as hard (1,352) are just one-fifth of all bettors (7,093). This means that only a fraction of sophisticated bettors place higher-than-median bets, but their behavior in terms of informational patterns is not significantly different from the behavior of the other, less sophisticated, bettors.
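The subsample comparisons above rest on a Wald test of the equality of two coefficients. A minimal sketch of such a test for a single coefficient is given below. It assumes that the two subsample estimates are independent, which the source does not state and which may not hold exactly when both subsamples share the same events; the function name `wald_equal_coefficients` and the numbers in the usage line are hypothetical and are not estimates from the paper.

```python
import math


def wald_equal_coefficients(b1, se1, b2, se2):
    """Wald test of H0: b1 == b2 for two independently estimated coefficients.

    Returns the Wald statistic, which is chi-square with 1 degree of freedom
    under H0, and its p-value.
    """
    w = (b1 - b2) ** 2 / (se1 ** 2 + se2 ** 2)
    # For 1 degree of freedom, P(chi2 > w) = P(|Z| > sqrt(w)) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value


# Hypothetical coefficients and standard errors, for illustration only.
w, p = wald_equal_coefficients(0.012, 0.003, 0.004, 0.002)
print(f"Wald statistic = {w:.3f}, p-value = {p:.4f}")
```

A small p-value indicates that the effect of betting distance differs between the two subsamples; a large one, as reported for "hard bets" versus the others, means that the difference cannot be distinguished from zero.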

[Figures 1, 2, 3, and 4 here]
Finally, the estimated individual fixed effects allow us to shed light on additional behavioral patterns in our data. Figure 1 shows that more successful bettors (that is, those with a larger fixed effect) also tend to bet in advance, from 3 to 5 days before the event takes place. This regularity, of course, does not affect the estimates discussed above, as they account for unobservable heterogeneity, but it is an interesting finding per se. More skilled bettors seem to anticipate information overload and place their bets in advance. They are also more selective, as they place a smaller number of bets (Figure 2) and focus on bets associated with smaller betting odds (Figure 3), which are arguably easier to forecast. There is no clear pattern of association between ability and age (Figure 4), or other observable bettors' characteristics (available upon request).

Conclusion
We find that betting timing matters. From the analysis of more than 1,200,000 online bets, we obtain an economically small but statistically very significant and stable difference in the winning probability of early versus late bettors. The estimated effect controls for time-invariant unobservable heterogeneity, learning, betting odds, and observable characteristics of the event. Therefore, when we refer to "late" versus "early" bettors we are comparing the same individual making bets at different distances from each event. The poorer forecasting performance of late bettors is attributed to an inefficient processing of information, also consistent with the heterogeneity results that we are able to disclose thanks to the richness of our data. The late bettors' decision process is affected by various cues that, unknown to the earlier bettors, have scarce relevance for predicting the outcomes. The excess of noisy information (especially severe if the same individual decides to bet on the main teams or on multiple events) reduces the possibility of using very simple prediction methods, such as team rankings or home team winning. The use of these criteria and cues greatly improves the possibility of placing a winning bet. Some skilled bettors partly anticipate the issue, as individuals with larger fixed effects tend to bet from 3 to 5 days in advance.
We acknowledge two main limitations of our results. First, they are based on small stakes and we cannot rule out that when stakes are higher information processing could become more efficient, therefore bringing about positive learning and lower confusion from several sources of information. Second, we cannot rule out the fulfillment of other emotional objectives rather than standard profit maximization.

Notes to the tables. Dependent variable: probability of correctly forecasting the single event (included in either a single or multiple bet). Estimation method: linear probability model as in equation (1); in the subsample tables, the model is estimated in separate subsamples (bets linked to a higher-than-median number of multiple bets vs. others; bets linked to a larger-than-median amount vs. others). Standard errors clustered at the event level are reported in parentheses. Significance at the 10% level is represented by *, at the 5% level by **, and at the 1% level by ***. Where reported in italics, p-values refer to Wald tests on the equality of the coefficients of the betting-distance dummies. In the subsample tables, the Wald test p-value captures the significance of the difference of the coefficients of betting distance in the two subsamples.