Rational macroeconomic learning in linear expectational models
MPhil Thesis, 2008
Munich Personal RePEc Archive
Holden, Tom Department of Economics, University of Oxford
01. May 2008
Online at http://mpra.ub.uni-muenchen.de/10872/ MPRA Paper No. 10872, posted 02. October 2008 / 14:29
Tom Holden 01/05/2008
Rational macroeconomic learning in linear expectational models
An analysis of the convergence properties of macroeconomic models under partial information rational expectations and Bayesian learning
Abstract: The partial information rational expectations solution to a general linear multivariate expectational macro-model is found when agents are uncertain about the true values of the model’s parameters. Necessary and sufficient conditions for convergence to the full information rational expectations solution are given, and the core of an algorithm for the Bayesian updating of beliefs is provided. In the course of this a new class of full information rational expectations equilibria is described and some of its desirable properties proven.
Keywords: Rational Expectations, Partial information, Bayesian learning, Generalized Schur decomposition, Sunspots, Indeterminacy, Feasible Rational Expectations Equilibria
JEL Classification: C11, C60, E00
Word count: Actual: 19898 words. Official: 365 words per page × 81 pages = 29565 words.
Post: Tom Holden, Balliol College, Oxford, OX1 3BJ Phone: +44 7815 067305 E-mail: thomas.holden@gmail.com
Acknowledgements: The author would particularly like to thank his primary supervisor David Vines for steering him towards this topic in its current form and his secondary supervisor Martin Ellison for his advice. Additional thanks for helpful comments are due to Simon Wren-Lewis, Clive Bowsher and Chris Bowdler.
Contents
1. Introduction
1.1. Expectations in macroeconomics
1.2. “Rational expectations”
1.2.1. Calculating rational expectations
1.2.2. Indeterminacy
1.2.3. Problems with “rational expectations”
1.3. Bounded rationality
1.3.1. Adaptive expectations
1.3.2. Statistical learning à la Evans and Honkapohja
1.3.3. Problems with Evans and Honkapohja’s work
1.4. Full rationality, limited information
1.5. The model
1.5.1. Core details
1.5.2. Canonical form
2. Full information solution
2.1. Information sets
2.2. The univariate special case
2.2.1. Stability analysis
2.2.2. Fully stable cases
2.2.3. Saddle-path stable cases
2.2.4. Proposition 1
2.3. Solution to the general canonical form
2.3.1. Set-up
2.3.2. Derivation of restrictions
2.3.3. Derivation of the stacked form solution
2.3.4. VARMAX form solution
2.3.5. FREE solutions
2.3.6. Proposition 2
3. Partial information solution
3.1. Expectation formation with exogenous beliefs
3.1.1. Set-up
3.1.2. Derivation of restrictions
3.1.3. Derivation of the stacked form solution
3.1.4. Solution for the off stable path term
3.1.5. Towards a FREE solution
3.2. Endogenous beliefs
3.2.1. Additional assumptions
3.2.2. Information sets
3.2.3. Application of the Martingale Convergence Theorem
3.2.4. Lemma 1
3.2.5. Additional restrictions under this information set
3.2.6. Conditions for almost sure convergence
3.2.7. Performance under full indeterminacy
3.2.8. Proposition 3
3.2.9. Beliefs and learning
3.3. Application to the univariate case
3.3.1. Fully stable cases
3.3.2. Saddle-path stable cases
3.3.3. Convergence conditions
3.3.4. Proposition 4
3.4. Bounded rationality approximations
4. Conclusion
5. Appendix A: Matrix quasi-geometric series
6. References
1. Introduction
In this thesis, we solve the problem of forming macroeconomic rational expectations under partial information about a model’s parameters. We find necessary and sufficient conditions for convergence to the full information solution and we develop the core of an algorithm for the updating of beliefs. This provides a fully rational alternative to the statistical learning literature, popularized by Evans and Honkapohja (2001), which has been influential in recent years. We begin with the motivation for this project.
1.1. Expectations in macroeconomics
Expectations are inextricably tied up with the optimising agent framework that underlies almost all modern economics. In choosing whether to invest in a stock, we consider whether the dividends we expect to get from it are more than adequate compensation for the price asked. More generally, whenever an agent is making a decision that will potentially deliver costs or rewards in the future, they must form expectations of what those costs or rewards might be. Consumers choose current consumption to maximise their expectations of lifetime utility. Firms make pricing and investment decisions to maximise the expected value of the stream of profits that will result. Central banks choose the interest rate to minimise the expected future deviation of inflation and output from their targets. Indeed, almost all economic decisions have a forward-looking aspect to them, and so require the formation of expectations.
What makes expectations particularly interesting to macroeconomists is the many macroeconomic variables that are affected by their own expectations. If a firm choosing a price for its product knows it may be constrained to stick to that price for several periods, then it will optimally choose its price taking into account not only its current marginal costs, but also its expectations about the marginal costs it may face in the future. With price a mark-up over marginal costs, such a set-up leads to current inflation depending on current expectations of future inflation (Calvo 1983; Walsh 2003: 234-40). Similarly, the optimization decisions of households lead current output to depend on households’ expectations of future output (Walsh 2003: 232-34). Many contemporary macroeconomic models take a dynamic stochastic general equilibrium (DSGE) approach in which the optimisation decisions of households, firms, investors and the central bank are combined, which leads to expectations of one macroeconomic variable having consequences for the path of virtually every other variable considered. Clearly then, precisely how these expectations are formed will have significant consequences for the path the economy actually takes.
Traditionally the literature has been divided between full information “rational expectations” on the one hand and various partial information, boundedly rational schemes on the other. Neither is entirely satisfactory. On the one hand, the knowledge and mental capacities ascribed to agents under rational expectations are surely infeasible in general; on the other hand, though, there are at least some agents in the economy, often those with most influence, who really could not be sensibly modelled as anything other than fully rational. Most boundedly rational schemes also suffer from exceptionally poor performance in certain specific settings, meaning that in some circumstances even the least rational agents in the economy may realise the flaws in the way they form expectations. It is also hard to interpret the predictions of partial information boundedly rational models, as until now there has been no partial information full rationality benchmark to compare them against. Finally, since there are so many ways in which an agent can fail to be fully rational, any boundedly rational scheme will always seem somewhat arbitrary unless sound reasons can be given for one form rather than another.
1.2. “Rational expectations”
1.2.1. Calculating rational expectations
If we have a model of some part of the economy and values for all the model’s parameters, and we take that model to be true, how should we rationally form expectations of the model’s variables? This is the question to which “rational expectations” were the answer, an answer first formulated by Muth (1961) and later popularized by Lucas (1972) and Sargent et al. (1973). Broadly, rational expectations are just mathematical expectations; complications arise, though, when these expectations directly affect the model’s variables. Consider as a first example an industry in which supply decisions must be taken a period prior to the realisation of demand, due to the time taken by production. If markets clear and we take a locally linear approximation of the supply and demand curves then we will have an equation of the form:
$$a_d - b_d p_t + \varepsilon_{d,t} = a_s + b_s E_{t-1} p_t + \varepsilon_{s,t},$$
where $d$ and $s$ subscripts denote demand and supply side parameters respectively, $p_t$ is the price level and $\varepsilon_{\cdot,t}$ are unpredictable shocks (i.e. $E_{t-1} \varepsilon_{\cdot,t} = 0$)¹. To find the rational expectations solution, we take expectations conditional on the $t-1$ information set of both sides, giving:
$$a_d - b_d E_{t-1} p_t = a_s + b_s E_{t-1} p_t \;\Rightarrow\; E_{t-1} p_t = \frac{a_d - a_s}{b_d + b_s}.$$
Substituting this back into the original equation gives us that:
$$p_t = \frac{a_d - a_s}{b_d + b_s} + \frac{\varepsilon_{d,t} - \varepsilon_{s,t}}{b_d}.$$
This then is the value $p_t$ would take if all agents in the economy had formed rational expectations with knowledge of the values of the parameters $a_d$, $b_d$, $a_s$ and $b_s$. Because “rational expectations” are only rational when everyone in the economy knows that everyone else is rational, it is important to note that strictly construed “rational expectations” are an equilibrium concept. Were it the case that almost everyone in the economy (irrationally) expected next period’s price to be zero, then the rational expectation of the next period price would instead approximately equal $\frac{a_d - a_s}{b_d}$. In light of this, we shall term a solution to a model under rational expectations a rational expectations equilibrium or REE².
¹ This is the Cobweb model considered by Muth (1961).
² This concept was introduced in Radner (1979).
The models we will chiefly be concerned with in this thesis will not admit such simple REE as the one just given for the Cobweb model. In particular, we will focus on models in which current expectations of future values influence the current value of those variables, rather than those in which only past expectations matter. Most DSGE and New Keynesian models take this “$t$-dated” form. The canonical example is asset pricing under risk neutrality, with a constant, non-stochastic real interest rate $r$. It is straightforward to see that in this situation, $p_t = (1+r)^{-1} E_t p_{t+1} + d_t$, where $p_t$ is the period $t$ asset price and $d_t$ is the dividend paid at the start of that period (so in particular $d_t$ is in the period $t$ information set). In general, this has many REE. For example, let $\eta_t$ be any white noise process, then we can impose $p_t = E_{t-1} p_t + \eta_t$ and still get a solution, since stacking these equations we have:
$$\begin{bmatrix} 1 & -(1+r)^{-1} \\ 1 & 0 \end{bmatrix} \begin{bmatrix} p_t \\ E_t p_{t+1} \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} p_{t-1} \\ E_{t-1} p_t \end{bmatrix} + \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} d_t \\ \eta_t \end{bmatrix}$$
i.e.
$$\begin{bmatrix} p_t \\ E_t p_{t+1} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 1+r \end{bmatrix} \begin{bmatrix} p_{t-1} \\ E_{t-1} p_t \end{bmatrix} + \begin{bmatrix} 0 & 1 \\ -(1+r) & 1+r \end{bmatrix} \begin{bmatrix} d_t \\ \eta_t \end{bmatrix}$$
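As a rough numerical illustration of the stacked asset pricing system above, the following Python sketch iterates it forward with an arbitrary white-noise sunspot and checks that the simulated path satisfies the pricing equation in every period. Here `r`, `d` and `eta` stand for the real interest rate, the dividend and the sunspot shock; the parameter values, shock distributions, starting expectation and function name are illustrative assumptions, not taken from the thesis.

```python
import numpy as np


def simulate_sunspot_path(r=0.05, periods=50, seed=0):
    """Iterate the stacked asset-pricing system with a white-noise sunspot.

    All numerical values are illustrative assumptions.
    Returns (p, Ep_next, d, eta): the price p_t, the expectation E_t p_{t+1},
    the dividend d_t and the sunspot shock eta_t for each simulated period.
    """
    rng = np.random.default_rng(seed)
    d = rng.normal(1.0, 0.1, periods)    # dividends d_t
    eta = rng.normal(0.0, 0.1, periods)  # sunspot shocks eta_t (any white noise)

    # (p_t, E_t p_{t+1})' = A (p_{t-1}, E_{t-1} p_t)' + B (d_t, eta_t)'
    A = np.array([[0.0, 1.0], [0.0, 1.0 + r]])
    B = np.array([[0.0, 1.0], [-(1.0 + r), 1.0 + r]])

    state = np.array([0.0, 1.0 / r])     # hypothetical starting (p_0, E_0 p_1)
    p = np.empty(periods)
    Ep_next = np.empty(periods)
    for t in range(periods):
        state = A @ state + B @ np.array([d[t], eta[t]])
        p[t], Ep_next[t] = state
    return p, Ep_next, d, eta


r = 0.05
p, Ep_next, d, eta = simulate_sunspot_path(r=r)

# Residual of p_t = (1 + r)^{-1} E_t p_{t+1} + d_t along the simulated path.
pricing_residual = p - Ep_next / (1.0 + r) - d
# Forecast error p_t - E_{t-1} p_t, which should reproduce the sunspot shock.
forecast_error = p[1:] - Ep_next[:-1]

print(np.max(np.abs(pricing_residual)))
print(np.max(np.abs(forecast_error - eta[1:])))
```

Both printed values should be zero up to floating-point rounding, whatever white noise is fed in as the sunspot. Since the second diagonal entry of the transition matrix is $1 + r > 1$, a path started from an arbitrary expectation will generally drift away from any stationary solution.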
It is common when considering rational expectations solutions to such problems to restrict attention to those satisfying some stationarity condition. These are often justified by the transversality conditions of the optimization problem from which the equations arose, or by an appeal to agents’ assumption that the future is not radically different from the present. In this model, it turns out that if $d_t \sim \mathrm{NIID}(\mu, \sigma^2)$, for sensible values of $r$ there is always a stationary solution taking the form $p_t = c + d_t$ for some unknown parameter $c$. When this holds we must have $E_t p_{t+1} = c + \mu$, so identifying coefficients $c = (1+r)^{-1}(c + \mu)$, i.e. $c = \mu / r$. This method of guessing solutions based on the state variables of the problem is due to McCallum (1983; 1999) and is known as the minimal state variables (MSV) solution. Unfortunately, for more complex models finding MSV solutions is numerically cumbersome (Binder and Pesaran 1996) and it will not in any case find all solutions of the original model. Instead the solution method we shall use in this paper owes its intellectual debt to that of Blanchard and Kahn (1980).
1.2.2. Indeterminacy
General linear expectational models often have many REE. Although the early DSGE literature confined itself to models in which there was a unique solution, recently models exhibiting indeterminacy have been given more serious consideration. Indeterminacy may arise from increasing returns to scale (Benhabib and Farmer 1994), market imperfections (Benhabib and Nishimura 1998), search externalities (Howitt and McAfee 1988), variable mark-ups (Woodford 1987), collusion (Rotemberg and Woodford 1992), the interaction of monetary policy and cash in advance constraints (Woodford 1994), policy feedback (Blanchard and Summers 1987; Taylor 1998), sticky prices (Benhabib et al. 1998), endogenous growth (Benhabib and Gali 1995) and several other sources³. Indeed, the theoretical evidence at least is almost overwhelming in support of some level of indeterminacy. Indeterminacy can also potentially explain many macroeconomic puzzles. Benhabib and Farmer (1999) suggest it may have a role to play in explaining price stickiness, Auray and Fève (2007) suggest it may explain the price puzzle and Benhabib and Farmer (2000) suggest it may help explain the real effects of an increase in the money supply. All of this suggests that indeterminacy is empirically important as well.
³ This literature is extensively surveyed in Benhabib and Farmer (1999).
Our interest in indeterminacy stems from two facts. Firstly, the traditional macroeconomic learning literature has had most problems with learning under indeterminacy (which is something we will discuss later), and secondly, intuitively, rational learning should perform best under indeterminacy, since under indeterminacy the set of expectations consistent with stability will be much larger, and thus it will be easier to end up within it. In light of the previous remarks, we assert that these problems with traditional learning under indeterminacy should be taken seriously and not dismissed as being the result of poor modelling choices, and that we can be optimistic about the performance of rational learning, even if it turns out to perform badly under determinacy.
1.2.3. Problems with “rational expectations”
We have already hinted at many of the problems with the REE concept. It is objected firstly that agents do not have the information to form rational expectations and secondly that they lack the mental capabilities to act on that information in the required way. The first objection is uncontroversial. Even professional macroeconomists still have a great deal of uncertainty as to the precise impact of a monetary policy shock, for example. Finding out the parameters of a macro-model invariably requires undertaking at least some econometrics – a procedure that will never produce certainty, only posterior probability distributions over the values those parameters might take. It really does then seem hard to justify assuming that all agents in the economy actually form expectations under full information. The second objection leaves more room for debate.
It might be argued that it only takes a few agents in the economy forming expectations rationally for the whole economy soon to acquire rational expectations⁴. For example, given sufficient liquidity it only takes a single risk-neutral agent with rational expectations participating in futures markets for all futures prices to correspond to their prices under rational expectations. Indeed even non-futures markets reveal significant amounts about market expectations of the future paths of output and the interest rate. The media then notice such signals and broadcast them back to the wider population, in effect giving every agent in the economy free access to a set of almost rational forecasts for major macroeconomic variables. Of course, agents may well ignore this information or act on it in irrational ways, but this is not an argument against ascribing them rational expectations so much as one against modelling their micro-behaviour as fully rational. The validity of the second criticism then depends on both the strength of the transmission mechanism of expectations and the extent to which forming fully rational expectations is computationally feasible for those working at investment banks. We will be better placed to answer the latter of these two questions once we have analysed what rational expectations look like under partial information. In any case, though, it seems the full information assumption implicit in the classical REE framework is sufficiently dubious to warrant a search for alternatives.
⁴ Precisely this is shown within the context of a simple model in Blume and Easley (1993: 38). In particular, they show that if all traders in a simple economy have logarithmic preferences and some traders are Bayesian learners who put positive probability on the correct model, then in the long run, assets are correctly priced.
1.3. Bounded rationality
1.3.1. Adaptive expectations
The earliest models of macroeconomic expectations formation (e.g. Cagan 1954) took the form:
$$E^*_t x_{t+1} = \gamma x_t + (1 - \gamma) E^*_{t-1} x_t,$$
where $E^*_t$ is a period $t$ non-rational expectation operator, $\gamma$ is an arbitrary parameter and $x_t$ is the process of interest. With $\gamma = 1$ the variable is not expected to change from its current value and with $\gamma = 0$ expectations can take any constant value, independent of time. With $\gamma \in (0,1)$, expectations adjust sluggishly to changes in the level of $x_t$, which can be thought of as something like a learning process. This form of learning seems reasonable when the REE solution for $x_t$ takes the form $x_t = \bar{x} + \varepsilon_t$ (a form we saw was taken in the Cobweb model when $x_t$ is the price level) and where there is some constant probability in each period of a structural break that changes the value of $\bar{x}$. When $\bar{x}$ is constant over time the learning procedure will soon settle down to satisfying $E^*_t x_{t+1} \approx E_t x_{t+1}$ providing both $x_t$ and $E^*_t x_{t+1}$ are asymptotically stationary (though even asymptotically for $\gamma > 0$ there is greater variance in the process than there would be in the REE) (G. W. Evans and Honkapohja 2001: 49), but the learning procedure is nonetheless also capable of responding to changes in $\bar{x}$. It is worthwhile comparing these models’ properties to those in which we instead have:
$$E^*_t x_{t+1} = t^{-1} x_t + (1 - t^{-1}) E^*_{t-1} x_t \;\Rightarrow\; E^*_t x_{t+1} = \frac{1}{t} \sum_{s=1}^{t} x_s,$$
i.e. $E^*_t x_{t+1}$ is the sample mean of $x_1, \dots, x_t$. If it was genuinely the case that for all $t$, $x_t = \bar{x} + \varepsilon_t$, then this would be the unique fully rational way of forming expectations. Unfortunately, if everyone else is learning at the same time then in models containing expectations it will not in general be the case that $x_t = \bar{x} + \varepsilon_t$, though this may be approximately true for large $t$ if the REE solution takes this form. Consideration of these decreasing-gain learning procedures gives an alternative interpretation of the constant gain case: if we consider a large population of agents all of differing ages each of whom is undertaking decreasing-gain learning, then, providing agents’ life-spans are not changing through time, constant gains may, in the aggregate, be a reasonable approximation⁵. However, crude learning procedures such as these are utterly unsuited to modelling any situation in which the REE solution is not of the form $x_t = \bar{x} + \varepsilon_t$, since then $E_t x_{t+1}$ would not be constant and so, even in the best possible case in which everyone else in the economy has rational expectations, there would still be no possible way in which $E^*_t x_{t+1}$ could be even approximately asymptotically rational.
⁵ This result is highly dependent on the age structure of the population, and the value of $\gamma$ for which this comes closest to holding will be a function of the population’s structure. We will discuss this issue in more detail in § 1.3.3.
1.3.2. Statistical learning à la Evans and Honkapohja
Evans and Honkapohja’s work (henceforth E&H)⁶ (e.g. G. W. Evans and Honkapohja 2001) is designed to address this criticism. They assume agents estimate the parameters of the REE solution by usual econometric techniques such as ordinary least squares (OLS). Due to the “online” nature of the learning, it is usually convenient to express this in recursive least squares (RLS) form. For example if the REE solution has the AR(1) form $x_t = b x_{t-1} + a + c \varepsilon_t$, then the estimates $a_t$ and $b_t$ of $a$ and $b$ would be updated by:
$$\begin{bmatrix} a_t \\ b_t \end{bmatrix} = \begin{bmatrix} a_{t-1} \\ b_{t-1} \end{bmatrix} + t^{-1} R_t^{-1} \begin{bmatrix} 1 \\ x_{t-1} \end{bmatrix} \left( x_t - a_{t-1} - b_{t-1} x_{t-1} \right),$$
where $R_t$ is the estimated covariance matrix of the regressor vector $(1, x_{t-1})'$ (assumed IID) which is updated according to:
$$R_t = R_{t-1} + t^{-1} \left( \begin{bmatrix} 1 & x_{t-1} \\ x_{t-1} & x_{t-1}^2 \end{bmatrix} - R_{t-1} \right).$$
This is fully rational learning if and only if it is actually the case that for all $t$, $x_t = b x_{t-1} + a + c \varepsilon_t$. Again, this will not be true in general if the economy is affected by expectations and everyone is learning at the same time. For example, if $x_t = \beta E_t x_{t+1} + \delta x_{t-1} + \alpha + \varepsilon_t$ (so $b = \frac{1 \pm \sqrt{1 - 4 \beta \delta}}{2 \beta}$, $a = \frac{\alpha}{1 - \beta - \beta b}$ and $c = \frac{1}{1 - \beta b}$) then, if expectations are formed according to the learning procedure given above, so that $E^*_t x_{t+1} = b_{t-1} x_t + a_{t-1}$, it will actually be the case that:
$$x_t = \beta \left( b_{t-1} x_t + a_{t-1} \right) + \delta x_{t-1} + \alpha + \varepsilon_t \;\Rightarrow\; x_t = \frac{\delta}{1 - \beta b_{t-1}} x_{t-1} + \frac{\alpha + \beta a_{t-1}}{1 - \beta b_{t-1}} + \frac{\varepsilon_t}{1 - \beta b_{t-1}}.$$
This means agents are estimating evolving parameters as being in fact constant, so their learning procedure is misspecified and consequently cannot be fully rational.
⁶ The origins of this literature go back at least as far as Bray (1982), but most of the ideas later used and popularised by E&H, not least the stochastic approximation techniques, were introduced by Marcet and Sargent (1989).
E&H derive some general convergence conditions for this type of learning. The current model under consideration serves as a good illustration of its performance⁷. When the REE is fully stable, so one solution for $b$ is in the unit circle and one is outside it⁸, locally at least, RLS learning will always converge to the unique stable REE. However, under indeterminacy, at most one of the two MSV solutions is locally stable under RLS learning and, indeed, in one non-null region of indeterminacy there is a zero probability of convergence to either of these two MSV solutions under RLS learning. This demonstrates that the learning method posited by E&H may fail catastrophically in certain circumstances and illustrates our claim above that statistical learning performs particularly badly under indeterminacy.
⁷ See Figure 8.7 of “Learning and Expectations in Macroeconomics” (G. W. Evans and Honkapohja 2001: 203).
⁸ That this is the condition is shown in § 2.2.
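The RLS recursion discussed above can be sketched in Python for the model in which the current value depends on the expected future value, the lagged value, a constant and a shock. This is only an illustration under stated assumptions: agents are assumed to forecast with the previous period's estimates, the gain is taken to be 1/(t+1) rather than 1/t so that the arbitrary initial moment matrix (the identity) is not discarded in the first step, and the parameter values, starting beliefs and function names are hypothetical. With the values chosen, one root for the autoregressive coefficient lies inside the unit circle and one outside.

```python
import numpy as np


def msv_solutions(alpha, beta, delta):
    """Return the two (a, b) pairs solving the MSV fixed-point conditions."""
    root = np.sqrt(1.0 - 4.0 * beta * delta)  # assumes the roots are real
    solutions = []
    for sign in (-1.0, 1.0):
        b = (1.0 + sign * root) / (2.0 * beta)
        a = alpha / (1.0 - beta - beta * b)
        solutions.append((a, b))
    return solutions


def rls_learning(alpha, beta, delta, a0, b0, periods=5000, sigma=0.1, seed=0):
    """Simulate the economy when every agent forecasts using RLS estimates.

    Timing assumption: expectations use last period's estimates and the
    current value, i.e. E*_t x_{t+1} = a_{t-1} + b_{t-1} x_t.
    """
    rng = np.random.default_rng(seed)
    theta = np.array([a0, b0], dtype=float)  # current estimates of (a, b)
    R = np.eye(2)                            # hypothetical initial moment matrix
    x_prev = a0 / (1.0 - b0)                 # start at the perceived mean
    for t in range(1, periods + 1):
        a, b = theta
        eps = rng.normal(0.0, sigma)
        # Actual law of motion implied by the model under these beliefs.
        x = (alpha + beta * a + delta * x_prev + eps) / (1.0 - beta * b)
        z = np.array([1.0, x_prev])          # regressors (1, x_{t-1})
        gain = 1.0 / (t + 1)                 # decreasing gain
        R = R + gain * (np.outer(z, z) - R)
        theta = theta + gain * np.linalg.solve(R, z) * (x - z @ theta)
        x_prev = x
    return theta


alpha, beta, delta = 1.0, 0.3, 0.3           # illustrative parameter values
(a_low, b_low), (a_high, b_high) = msv_solutions(alpha, beta, delta)
a_hat, b_hat = rls_learning(alpha, beta, delta, a0=1.5, b0=0.3)
print("MSV solutions:", (a_low, b_low), (a_high, b_high))
print("RLS estimates:", (a_hat, b_hat))
```

In this parameterisation the estimates should settle close to the solution with the smaller autoregressive coefficient. The data the agents use are generated under their own changing beliefs throughout, which is the misspecification the text describes.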
When applying their work to real world data, E&H tend to switch from decreasing to constant gain, both to allow for structural breaks and because in the real world agents die, taking their accumulated knowledge with them. The convergence properties of constant gain learning are more complicated, as even in the limit the estimated parameters will be stochastic, which in certain circumstances can cause periodic jumps from one basin of attraction (i.e. an REE solution) to another. Nevertheless, they prove that in certain circumstances even constant gain learning will converge in the mean to an REE solution.
1.3.3. Problems with Evans and Honkapohja’s work
The chief problem with E&H’s approach to learning lies in its fundamental misspecification. They attempt to justify this by noting that “the misspecification may not even be statistically detectable during the transition [to a steady state]” (G. W. Evans and Honkapohja 2001: 32), but this will certainly fail to hold in situations in which RLS learning does not even converge. In these circumstances, surely even the least rational agents would realise their misspecification. Worse still, this criticism applies not just to regions in which RLS fails to converge to anything, but also to those in which some, but not all, stationary REE have a basin of attraction under RLS, such as those described above. To see this, suppose that we are in an economy of this AR(1) form with parameters in an indeterminate region in which the lower solution is uniquely stable under RLS, and suppose that until period $T$, all the agents had full information and were forming expectations in line with the higher of the two REE solutions. If from period $T$ onwards these fully informed agents started slowly dying and being replaced with uninformed agents of infinite lifespan, then we would expect the economy still to remain near its original REE, as the uninformed agents should be able to learn the equilibrium the informed agents had been playing until that point. However, if the uninformed agents were learning by RLS, then their probability of convergence to the larger solution would still be zero, providing the informed agents all died off in a finite period. E&H wish to use RLS convergence as a justification for picking one REE rather than another. However, given that even boundedly rational agents would realise RLS was failing in such circumstances, at best, they have shown criteria for RLS being an acceptable approximation to learning.
Additional problems are caused by E&H’s reliance on constant gain learning in order to get empirical predictions. Even if all agents learned by RLS, constant-gain learning would still not necessarily be a reasonable model of aggregated expectations. For example, if we take the continuous time version of the model described in § 1.3.1, then if $f(a)$ is the density of people of age $a$ in the population, for (continuous time) RLS to aggregate to (continuous time) constant gain learning, it is easy to see that we require
$$\int_s^\infty \frac{f(a)}{a} \, da = \gamma e^{-\gamma s},$$
since these are the contributions of the $t - s$ data point to aggregated RLS and constant gain learning respectively. This can only hold if $f(a) = \gamma^2 a e^{-\gamma a}$, which our numerical calibrations have shown to be a poor model of actual data: in particular, it requires there to be far too many over 80s as this distribution has relatively fat tails. Therefore, in general we expect the dynamics in a population of agents, all of whom are learning by RLS, to differ substantially from the dynamics under constant gain learning.
Both our claim that stability under RLS learning cannot be validly used as an equilibrium selection device and our claim that it is invalid to use constant gain learning as an approximation to aggregate learning are fundamental criticisms of the E&H approach. A perhaps yet more damning one, though, comes from our suggestion that the only reasonable model may be that expectations are rational in aggregate, given the expectational transmission mechanisms present in the economy, and given the many agents who have strong financial incentives for rationality. This approach of full rationality but partial information is what we pursue in this thesis.
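The step from the aggregation condition in the constant gain discussion above to the implied age density can be filled in as follows. The notation is assumed here for illustration: $f(a)$ is the age density, $\gamma$ the constant gain and $s$ the age of a data point. An agent of age $a$ learning by RLS weights each observation from its lifetime by $1/a$, so only agents older than $s$ contribute to the left-hand side.

```latex
\begin{align*}
\int_s^\infty \frac{f(a)}{a}\,da &= \gamma e^{-\gamma s}
  && \text{(aggregation condition)} \\
-\frac{f(s)}{s} &= -\gamma^2 e^{-\gamma s}
  && \text{(differentiate both sides with respect to } s\text{)} \\
f(s) &= \gamma^2 s\, e^{-\gamma s}
  && \text{(a Gamma density with shape 2 and rate } \gamma\text{)} \\
\int_0^\infty \gamma^2 a\, e^{-\gamma a}\,da &= 1
  && \text{(so } f \text{ is a proper density, with mean age } 2/\gamma\text{)}
\end{align*}
```

Under this sketch the constant gain pins down the whole age distribution, so matching a plausible mean age leaves no freedom to match the rest of the distribution.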
1.4. Full rationality, limited information
That economic agents may be fully rational and yet not have full information is certainly not a new idea. There have been substantial tranches of literature devoted to learning in general equilibrium and learning in games. Two fairly comprehensive surveys are Blume and Easley (1993) and Blume et al. (1982). The “rational” part enters from the use of Bayes’ Law for the updating of beliefs. If one accepts the Savage axioms (Savage 1954) as defining rationality, then Bayesian learning is the only rational kind of learning there is. Though far from uncontroversial, for the duration of this thesis we will suppose the Savage axioms are a given, so “rational learning” and “Bayesian learning” are synonymous.
The first thing to note is that much of the existing literature has been concerned with estimating unobserved variables rather than estimating the model’s parameters. This covers estimating current values of variables that are only available with a lag, estimating variables subject to measurement error and estimating the permanent component of variables subject to transitory shocks. The fully general solution to this under homogeneous beliefs in a macroeconomic linear REE context was given in Pearlman et al. (1986), and is (broadly) based on Kalman filter methods (Kalman 1960). Since we are attempting to answer the same question as E&H, ours is an entirely different problem to this and Kalman filtering techniques will not be applicable. That said, future work could examine learning under uncertainty both about unobserved variables and about the model’s parameters.
Another thing to note is that a good deal of the literature deals with heterogeneity in beliefs and hence in expectations, the most famous example of which is Townsend (1983) which deals with this in an unobserved variables context. In assuming homogeneity, we will escape many issues connected with this.
Another source of apparent complication in the existing literature is the placing of learning within the contexts of a very specific general equilibrium model that has not gone through the usual macroeconomic “mashing” process of log-linearization, assumed certainty equivalence etc. to get it into a standard linear expectational reduced form. This means that learning is very closely tied in to the particular agent doing the learning and that inter-temporal optimization needs to take into account how beliefs might be revised in future. Townsend (1978) and the subsequent literature it spawned all fall into this category. A significant explanation for the success of the E&H approach to learning is that it is entirely generic and plugs straight into the linear expectational reduced form, which would normally be calculated anyway in order to find the full information REE in an analytically tractable way.
Admittedly, there are some very good reasons, when one is concerned with modelling strict rationality, for not log-linearizing and assuming certainty equivalence, since at best the reduced form that results is a local approximation to the true behaviour described by the model. However, many of these reasons are just as valid under full information as they are under partial, and yet few quibble with the ascription of “rationality” to the full information REE solution that results from solving the reduced form. In light of this, we will be solely concerned with linear expectational reduced forms and we will treat them as if they were complete and exact descriptions of the micro-founded models from which they arose⁹. This means that, much as in E&H, learning will be performed by a representative agent and will be unrelated to utility.
To the best of our knowledge, the problem of forming partial information rational expectations (in the macroeconomic sense) has never been addressed. In particular, it is the combination of parameter learning and having to choose expectations in order to (attempt to) stay on the stable path that is novel. There has been some literature on the related problem of optimal control under parameter uncertainty, including Prescott (1972), Easley and Kiefer (1988) and Kiefer and Nyarko (1989), but the complications present in these papers (chiefly coming from trade-offs between learning speed and the control target) do not give any great insights into the problems we will encounter below, which is unsurprising since our learning is utility independent and our “control target” is binary (“end up on the stable path” or “don’t”). Our task is made particularly difficult by the fact that if agents are far enough off the stable path then they may never be able to return to it, even if they later know better where it is, since expectational errors must be unpredictable from the period in which the expectations were formed.
1.5. The model
1.5.1. Core details We will be solely concerned with models with the standard -dated expectations form: 1 = +1 + 1 −1 + + + + , 2 = 2 −1 + + + , , = ~NIID 0, Ξ ,
9
The approximation implicit in this is close to what Cogley and Sargent (2008) call an “anticipated-utility” model,
after Kreps (1998). In these models, agents treat parameters as uncertain when learning, but as constants when forming decisions. They show that at least in their model the anticipated utility approximation is close to the fully rational solution. Our agents are slightly more sophisticated than this, though, because they only treat expectations as constants when forming decisions. The formation of the actual expectation each period will fully account for uncertainty as to the model’s parameters, which is not true in the Evans and Honkapohja approach for example.
where is a vector of endogenous variables (in the sense that they can be influenced by expectations) and is a vector of exogenous variables (in the sense that they are not affected by expectations). A large proportion of DSGE models take this form, which justifies our focus on it, and as in the standard REE literature, we shall assume agents have homogeneous beliefs. However, unlike this literature, we shall not assume that agents are aware of the entire past history of the economy before their “birth”¹⁰, or that they know 1 , 2 , , 1 , 2 , , Ξ, , with certainty; in fact we will not even assume agents know which variables are exogenous. We do however assume that all agents ascribe probability 1 to all variables asymptotically growing at a sub-exponential rate, i.e. that for all ∈ ℤ, there is some polynomial such that as → ∞, − → 0. This could be justified by assuming that agents are reluctant to assign probability to the future being significantly different from the past. We have included a linear time trend in this core model to allow for growth, as even removing a linear trend is not a trivial operation in small samples when there is uncertainty about other parameters as well. This model can be simplified if we let ≔ and assume 2 is invertible as then: = +1 + −1 + + + where = 0 0 , = 1 0 0
−1 2 2 , = 1 −1 0 2 2
(1.1)
−1 −1 + 2 + 2 0 , = , = and −1 −1 2 2
where =
−1 , + 2 , −1 2 ,
~NIID 0, Σ , where Σ =
0
−1 2 Ξ −1 ′ ′ −1 2 2
−1 ′ 2
0
.
We will take this equation as our general form from here on. This is valid as in general agents are uncertain which variables are exogenous, so there are no restrictions they can place with certainty on the structure of this equation’s parameters.
10
This can better be thought of as a model of a major structural change to the economy in period − 1, after which
everyone has to start their learning again from scratch. A major change in political institutions or central bank monetary policy regime is the usual example. In future work we will give “birth” its more literal meaning and assess learning in an overlapping generations model (without the assumption of homogeneity of beliefs).
1.5.2. Canonical form
Let us now define the innovation process by ≔ − −1 for all ∈ ℤ. We can stack this definition together with (1.1) to get the canonical form: − +1 = 0 0 0 −1 0 −1 + 0 + 0 + 0 + 0 0 , = , = ,Ψ= and Π = we have: 0 0 0 (1.2)
So defining = , Γ0 = +1
− , Γ1 = 0 0
Γ0 = Γ1 −1 + + + Ψ + Π
Beyond requiring that = , our solution method will not depend at all on the precise internal +1 block structure of Γ0 , Γ1 , , , Ψ and Π. However, it is worth noting that if is invertible then we can pre−1 multiply by Γ0 =
0 −−1 0 −−1
−1
giving:
=
0 0 0 + + + + −1 −−1 −1 −1 −−1 −−1
(1.3)
If is taken to be an arbitrary white noise process, then this is the full set of solutions including explosive ones. The challenge in both the full and partial information cases is to restrict in order to guarantee that is asymptotically polynomial in .
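To illustrate why the expectational error has to be restricted, the sketch below simulates a scalar version of the model. It assumes the model is written as `a*x_t = b*E_t[x_{t+1}] + c*x_{t-1} + eps_t` with no constant or trend; the letters `a`, `b`, `c`, the parameter values and the function name are illustrative choices, not taken from the thesis. With one root inside and one outside the unit circle, setting the expectational error to `eps_t / (b * lam2)` keeps the path on the stable arm, whereas another choice (here zero) sends it off explosively.

```python
import math


def simulate(a, b, c, shocks, restrict_errors):
    """Simulate a*x_t = b*E_t[x_{t+1}] + c*x_{t-1} + eps_t from x_{-1} = 0.

    The state is (x_{t-1}, E_{t-1}[x_t]); each period x_t = E_{t-1}[x_t] + eta_t,
    where eta_t is the expectational error.
    """
    disc = math.sqrt(a * a - 4 * b * c)  # real roots assumed
    lam2 = (a + disc) / (2 * b)          # the larger root when b > 0
    x_prev, expect_prev = 0.0, 0.0
    path = []
    for eps in shocks:
        # Saddle-path restriction: eta_t = eps_t / (b * lam2); otherwise eta_t = 0.
        eta = eps / (b * lam2) if restrict_errors else 0.0
        x = expect_prev + eta
        expect = (a * x - c * x_prev - eps) / b  # model solved for E_t[x_{t+1}]
        path.append(x)
        x_prev, expect_prev = x, expect
    return path


shocks = [1.0] + [0.0] * 40  # one unit shock, then nothing
on_path = simulate(1.0, 0.5, 0.3, shocks, restrict_errors=True)
off_path = simulate(1.0, 0.5, 0.3, shocks, restrict_errors=False)
print(max(abs(x) for x in on_path), abs(off_path[-1]))
```

A single impulse is used instead of random draws so that the comparison is deterministic; the restricted path decays geometrically at the stable root while the unrestricted one grows at roughly the unstable root.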
2. Full information solution
We begin by solving the canonical form under full information. We do this both to introduce the mathematical machinery and because we wish eventually to find necessary and sufficient conditions for the expectational errors under partial information to converge to those under full information, which, unsurprisingly, requires a solution for these errors in both circumstances. We will also introduce the concept of a “Feasible Rational Expectations Equilibrium” in this chapter, without which finding the partial information REE would be incredibly difficult, if not impossible.
2.1. Information sets
In what follows, we will mark all variables that are different under full information by a superscript ∗. This is necessary to make it perfectly clear that (the economy’s state when everyone has limited informa∗ tion) is not the same random variable as (the economy’s state under full information). We will also de∗ note expectations taken under this information set at by ∗ . So we replace by = ∗ . ∗ ∗ +1
We suppose that everyone was born at time −∞ and so knows the complete history of the economy (including contemporaneous values of ∗ ¹¹) and that they also know the values of , , , Σ, , with certainty. We suppose they know the data generating process for and that Σ is of full rank. Furthermore, we suppose that at agents know the value of , a vector of all the sunspot shocks that may possibly affect the economy. Additionally, we suppose that agents know arbitrary matrices and of size ∗ ∗ ∗ dim − × dim and dim − × dim respectively (where is a known constant whose value will be defined later in terms of , , , Σ, , ), which determine the aggregation of sunspots variables into a combined sunspot term. We will require that ∗ = 0, which is to say that sunspots are −1 unpredictable. We also assume that is independent of all other random variables (so in particular ′ ∗ = 0). This assumption is harmless, as the actual sunspot term will be given by + . −1
11 ∗ Allowing to be in the time information set is not completely uncontroversial, since in the real world data often takes a while to arrive. However this is not the level on which to incorporate such insights, since the microfoundations of these models invariably use information sets in which is either observable or at least in equilibrium perfectly predictable at . (For example in Calvo pricing models (Calvo 1983), firms set prices equal to a constant mark-up over nominal marginal cost, which itself depends on the actual aggregate price level that period.) We trust that micro-founded model builders would have written −1 instead of if they did not think the agent in question had access to contemporaneous variables.
More precisely then, the time information set for all agents is given by:
∞ ∗ , =−∞ ∞
ℐ∗
≔
∪ , , , Σ, , , , ∪
=−∞
~NIID 0, Σ
∪ Σ is of full rank
∪
=−∞
= 0 and is independent of , , , Σ, , , , −1 , … , +1 , +2 , …
∪ the economy's law of motion is of the form of (1.1) ∪ the economy is asymptotically growing at a sub-exponential rate Note that we have not assumed that , −1 , … is in the ℐ∗ information set. This is because in the partial information case (where there is some uncertainty over , , , Σ, , ) it is very hard to justify assuming that , −1 , … is known at ; econometric data sources do not have series of shock values, rather econometricians estimate a theoretically justified model from output, inflation etc. and then infer estimates of the shock series. In addition, were known at , then after at most 3 dim + 2 observations of , +1 , −1 and the parameters , , , and would be known with certainty (since Σ is of full rank), which would be a rather poor model of “learning”, particularly as it would lead to all shocks being fully identified, something certainly not true in most macroeconomic contexts. Now despite not being in ℐ∗ , if we take expectations of (1.1), then we have:
∗ ∗ ∗ ∗ = − ∗ +1 − −1 − − =
Thus under the ℐ∗ information set agents will know anyway. However, this result clearly relies on the inclusion of , , , , in ℐ∗ ; if there is any uncertainty at all as to their values then agents will not be able to work out with certainty. In light of this, and since we are chiefly concerned with learning in this
∗ thesis, we will be particularly interested in REE in which ∗ +1 is expressible as linear in ∗∗ , −1 , … , , −1 , … and so in particular is not a function of , −1 , …. We will term such equilibria
“Feasible Rational Expectations Equilibria” or FREE¹². It is worth pointing out that trivially the MSV solution is always feasible in this sense, since it will only include contemporaneous shocks.
2.2. The univariate special case
We commence with an analysis of the univariate case. This provides a gentle introduction to the mathematical methods and the procedure for finding FREE solutions, and gives a convenient way of checking our algebra in the harder cases. It also makes clear the limitations of the MSV solution method. 2.2.1. Stability analysis Suppose temporarily that is one dimensional, so = , = and = for some scalars , and . If = 0, then the model is in AR 1 form and so there is a non-explosive solution if and only if = 0 (in
∗ which case ∗ +1 = 0) or ∗ ∗ ≤ 1 (in which case ∗ +1 = + + + 1 ).
If ≠ 0, then from (1.3): 0 ∗ = ∗ ∗ +1 − 0 The eigenvalues 1 , 2 of − 1
1 0 0 0 1 ∗ −1 ∗ 1 + + + − + ∗ − ∗ − −1 satisfy 2 − + = 0, so:
(2.1)
1 =
− 2 − 4 , 2
2 =
+ 2 − 4 2
If 1 ≤ 1 and 2 ≤ 1 then the system is stable13, so expectations are indeterminate. If precisely one eigenvalue satisfies ≤ 1, then the system is saddle path stable and expectations will be determinate. If 1 > 1 and 2 > 1 then the system is unstable independent of expectations.
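The three-way classification above can be sketched in a few lines. The example assumes the scalar model is written `a*x_t = b*E_t[x_{t+1}] + c*x_{t-1} + ...` with `b != 0`, so that the two eigenvalues solve `b*lam**2 - a*lam + c = 0`; the letters `a`, `b`, `c` and the function name are placeholders, not the thesis's notation. As in the text, a root with modulus exactly one is counted as stable.

```python
import cmath


def classify_univariate(a, b, c, tol=1e-12):
    """Return both roots of b*lam**2 - a*lam + c = 0 and a stability label."""
    disc = cmath.sqrt(a * a - 4 * b * c)  # complex square root handles a*a < 4*b*c
    lam1 = (a - disc) / (2 * b)
    lam2 = (a + disc) / (2 * b)
    # Count roots on or inside the unit circle (unit roots treated as stable).
    n_stable = sum(abs(lam) <= 1 + tol for lam in (lam1, lam2))
    label = {
        2: "stable, indeterminate",
        1: "saddle-path stable, determinate",
        0: "explosive",
    }[n_stable]
    return lam1, lam2, label


for params in [(1.0, 0.5, 0.3), (1.0, 1.0, 0.5), (5.0, 1.0, 6.0)]:
    print(params, classify_univariate(*params))
```

In the complex-root case the two moduli coincide and their square equals `c / b`, which is why a single inequality settles stability there.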
12
∗ It may be objected that for an REE to be feasible, in fact ∗ +1 should not even depend on , −1 , …. There is
certainly some validity to this objection, but the direct observability of may be justified by noting that the source of ’s variance is in some sense a choice variable, since expectations are. We may think of agents as calculating the determinate parts of their expectations and then choosing to use e.g. the deviation between the expected and actual number of goals scored in Premier League matches to determine the other components of their expectations.
Full information solution Note that when 2 − 4 < 0, both eigenvalues are complex and 1
2
= 2
2
= . Thus, in this case,
the system will be stable and indeterminate if ≤ 1 and explosive otherwise. When 0 ≤ 2 − 4, both eigenvalues are real. In this case 1 = 1 if and only if 2 = 1 if and only if = + or = − − . Now
1 2
≤ 0 and
2 2
≥ 0. Thus 1 ≤ 1 if and only if ≥ − + and
2 ≤ 1 if and only if ≤ + . 2.2.2. Fully stable cases In the fully stable cases either 2 − 4 < 0 and
≤ 1 or 0 ≤ 2 − 4 and − + ≤ ≤ + . In
∗ these cases rational expectations impose no restrictions on , so the full set of solutions satisfies ′ ′ ∗ = + , where = is a scalar and = is a row vector (i.e. in this case, = 1). We
are particularly interested in FREE solutions in which +1 does not depend on , −1 , …. We can accomplish this if we are prepared to further restrict . In particular, if we assume ≠ 0 then =
′ ∗ −
∗ so from the bottom row of (2.1) and the definition of , the FREE solutions satisfy:
∗ ∗ +1
′ ∗ ∗ ∗ 1 − ∗ ∗ = − −1 + −1 − − − + ′ 1 1 ∗ 1∗ 1 ∗ ∗ − − −1 + −1 − − +
=
The condition that ≠ 0 is also necessary for the existence of a FREE. To see this suppose for a contradiction that = 0 but that:
∗ ∗ ∗ +1 = + + other terms known at − 1 ∗ ∗ Then 0 = Cov−1 , = Cov−1 , , so we also have:
13 In the sense of exhibiting polynomially bounded, i.e. non-explosive, growth. We are thus treating unit roots as stable. This is valid given our particular definition of explosiveness since expectations of a unit root process, though time dependent, are nonetheless polynomial. For example if $x_t = x_{t-1} + 1 + t + \varepsilon_t$ then $E_t x_{t+s} = x_t + s + ts + \frac{1}{2}s(s+1)$, which is quadratic in $s$. It is possible to treat unit roots as explosive and ensure asymptotic linearity, but this considerably complicates the derivations.
∗ ∗ ∗ ∗ 0 = Cov−1 , = Cov−1 ∗ +1 + −1 + + + , = Cov−1 ∗ +1 + , ∗ = Cov−1 , + Cov−1 , + Var−1 = Σ
However Σ is of full rank, so we have a contradiction from 0 = Σ ≠ 0.
∗ ∗ To obtain the general solution for , we instead use the definition of to replace the expectational
terms in the bottom row of (2.1), which implies: ∗ ∗ 1 ′ − −1 − − + +1 − + +1
∗ +1 =
(2.2)
This is an ARMAX 2,1,1 process and thus is more general than the usual “MSV” AR 1 one. To show that generically these two forms are not equivalent we suppose there exist , , , ℳ , ℳ such that:
∗ ∗ +1 = + + + ℳ +1 + ℳ +1
(This is the sunspot augmented MSV form.) So for any ℬ:
∗ ∗ ∗ +1 = − ℬ + ℬ−1 + + ℬ − ℬ + 1 + ℬ + ℳ +1 + ℬℳ
+ ℳ +1 + ℬℳ
(2.3)
For this to be equivalent to (2.2) we must be able to equate terms, which at least requires that ℬℳ = 0. If ℬ = 0, then the ℬℳ term disappears, which is always present in (2.2), thus in fact we must have
′ ℳ = 0, which can only possibly hold if = 0 too. When this is the case, equating terms we have:
− ℬ =
,
ℬ = − ,
+ ℬ − ℬ = − ,
1 + ℬ = − , ℳ = ,
ℬℳ = −
1
⇒
ℬ=−
1
1 and =
But then from the first equation +
= , so this can only hold if we are also prepared to restrict
, illustrating how many solutions are ruled out by the imposition of the MSV form. 2.2.3. Saddle-path stable cases In the saddle-path stable cases 0 ≤ 2 − 4 and either < − + (for 1 > 1) or > + (for 2 > 1). Without loss of generality we assume the latter holds, so 1 ≤ 1 and 2 > 1. Now by the
Full information solution 0 Schur decomposition14 (Horn and Johnson 1985: 79) of − 1
there exist possibly complex matrices
and Ω, where is unitary15 and Ω is upper triangular with 1 and 2 on its diagonal such that: 0 − 1 = Ω = 11 21 12 1 22 0
12 11 2 12 21 22
where denotes the Hermitian or conjugate transpose of . We note the following implied identities that will prove useful below: −22
−21
11 + 21 12
0 = − + 22
1 + = Ω = 1 11 12 12 2 12 1 = Ω = 11 1 21 1
1 21 + 12 22 2 22
(2.4)
21 −11 + 21
22 0 = −12 + 22 −
11 12 + 12 2 21 12 + 22 2
(2.5)
A third identity follows from ’s unitarity, namely: 1 22 −21 Now if we let ∗ ≔ −12 = −1 = = 11 11 21
12 22
(2.6)
∗ and we pre-multiply (2.1) by then we have: ∗ ∗ +1
∗
1 =0
0 0 0 1 12 ∗ ∗ + 1 + + −1 + − 2 − −
(2.7)
The bottom row of this is given by: 1 ∗ − 22 − 22 + 12 + 22
∗ ∗ 2, = 2 2,−1 − 22
Since 2 > 1, this equation is explosive, so we solve forward following Sims (2002: 9), giving ∀ ∈ ℕ:
14
0 We could as well have just diagonalized −
1
in the usual way, but by using the Schur decomposition here we
hope to make the comparison between the univariate and non-univariate cases clearer.
15
That is to say the conjugate transpose of is ’s inverse.
∗ 2,
=
− ∗ 2 2,+
−
=1
− ∗ 2 12 + − 22
1 ∗ + + + + − +
Taking dated expectations then gives:
∗ 2,
=
∗ ∗ 2,
=
∗ − 2 ∗ 2,+
+
=1
− 2 22
+ +
∗ − By assumption ∗ 2,+ grows at an asymptotically polynomial rate and thus is dominated by 2 . This
means that in the limit as → ∞:
∞ ∗ 2,
=
=1
− 2 22
+ +
=
22 + + 1 + 2 − 1 2 − 1
2
(where we have used standard formulae for geometric series, proved in the matrix case in appendix A, § 5). If we let ≔
22 2 + −
2 −1 2
, then we can write:
22 2 − 1
∗ 2, = +
(2.8)
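The forward solution above leans on two scalar geometric-series identities, valid when the root has modulus greater than one: the sum of `lam**(-s)` over `s >= 1` is `1 / (lam - 1)`, and the sum of `s * lam**(-s)` is `lam / (lam - 1)**2`. The sketch below checks the resulting closed form for a discounted constant-plus-trend term against a long truncated sum. The names `lam`, `m`, `d` and `t` are placeholders for the unstable root, a constant, a trend coefficient and the current date; the sign and scaling factors of the thesis's own expression are not reproduced here.

```python
import math


def forward_sum(lam, m, d, t, terms=1000):
    """Truncated sum over s >= 1 of lam**(-s) * (m + d * (t + s)), for |lam| > 1."""
    return sum(lam ** (-s) * (m + d * (t + s)) for s in range(1, terms + 1))


def forward_closed_form(lam, m, d, t):
    """Closed form implied by the two geometric-series identities."""
    return (m + d * t) / (lam - 1) + d * lam / (lam - 1) ** 2


for args in [(1.5, 2.0, 0.3, 7), (1.1, -1.0, 0.5, 0)]:
    approx, exact = forward_sum(*args), forward_closed_form(*args)
    print(args, approx, exact, math.isclose(approx, exact, rel_tol=1e-9))
```

The closed form is linear in `t`, which is consistent with the constant-plus-trend shape of the solution for the explosive block.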
Now conveniently16:
11 + 21 1 = 12 + 22 11 + 21 − 12 + 22 =0 12 + 22
1
−
11 + 21
Thus if we pre-multiply (2.7) by 1 −
11 +21 12 +22
=1
−
1 21 + 12 22 2 22
=1
1 z 12 − 12 11 2 11
(this is valid
∗ assuming 11 ≠ 0), by (2.4) and (2.6) we will obtain an expression for the linear combination of and ∗ ∗ +1 that is pre-determined, namely:
1
1 z12 − 12 11 ∗ = 1 2 11 +
1 z12 ∗ −1 + − 1 z12 z11 2 22
1 22 − 1 z12 + − 1 z12 2 2 22 1 z12 ∗ 1 −1 + + + z11 2 11 2 11 2 11
= 1
(where we have used (2.5) and (2.6) to simplify). Stacking this equation with (2.8) gives:
16
This trick comes from Sims (2002).
Full information solution 1 z12 − 12 11 ∗ = 1 2 11 1 0 1 0 1 1 z12 1 2 11 2 11 ∗ + 2 11 z11 −1 + + − + 22 2 22 0 0 2 2 − 1 2 − 1
1 z 12 − 12 11 −1 2 11
1 0
Finally pre-multiplying by
1
= ∙1
∙1
12 11 − 1 z 12 2 11
+ ∙2 and again simplifying us-
ing (2.5) and (2.6) gives the solution:
∗ ∗∗ +1
= 1 ∙1 11 +
z12 z11 12
0
∗ −1 ∗ ∗ −1
+
1 1 12 11 − 1 z12 2 + − ∙1 + 2 11 2 − 1 2
+ ∙2 22
2 + − 2 − 1 2
+
1 ∙1 ∙1 − 12 + ∙2 11 + 2 − 1 11 2 11
∗ To obtain the general solution for we take the top row of this equation and simplify, which gives:
∗ = 1
11 22 z12 21 ∗ 1 1 − 2 12 11 12 11 − −1 + + + 2 2 − 11 12 + 12 11 + 1 2
2 + − 2 − 1 2
+
1 2 − 1
∗ = 1 −1 +
1 2 + − 1 1 + + + 2 2 2 2 − 1 2 − 1 2
Thus we have shown that: + 2 − 1 2 − 1 + 2 − 1 2
∗ ∗ = 1 −1 +
2
+
which straight-forward calculation shows to agree with the usual AR 1 “MSV” solution. Pushing this forward one period and taking expectations we have the following FREE form expectation: + + 2 − 1 2 − 1 2 − 1
∗ ∗ ∗ +1 = 1 +
2
+
2.2.4. Proposition 1 The previous sections have shown that in the univariate case under stability ( 2 − 4 < 0 and
≤ 1 or
0 ≤ 2 − 4 and − + ≤ ≤ + ) there is always an ARMAX 2,1,1 form REE, which is a FREE if
and only if is of full rank, and that in the univariate case under saddle-path stability (0 ≤ 2 − 4 and either < − + or > + ), there is an AR 1 form REE, providing 11 ≠ 0, which in fact is always a FREE.
2.3. Solution to the general canonical form
We now turn to solving the generalized canonical form (1.2) in full generality. To do this we broadly follow Lubik and Schorfheide’s (2003) extension to the irregular case of Sims’s (2002) method for solving rational expectations models, which is itself more general than that of Blanchard and Kahn (1980) since it avoids some invertibility assumptions and enables linear combinations of variables to be jointly predetermined. This method is particularly convenient for our purposes since it proceeds by first solving for the expectational error, which, to assess the convergence of the partial information case, is what we shall be interested in. Our chief innovations are the inclusion of the drift and linear terms, which are important as being able to accurately remove a linear trend is non-trivial in the partial information case; the derivation of a simpler condition for existence of REEs for a large class of models; the addition of FREE restrictions, which will
∗ play a role in the partial information case; and the explicit derivation of VARMAX form solutions for .
2.3.1. Set-up By the generalized complex Schur decomposition (also known as the QZ decomposition) (Quarteroni et al. 2000: 225) of the matrices Γ0 and Γ1 defined in § 1.5.2, there always exist possibly complex matrices , , Λ = ,
,
and Ω = ,
,
such that Λ = Γ0 , Ω = Γ1 , and are unitary and Λ and Ω are
upper triangular.
∗ Now let ∗ = for all ∈ ℤ, then if we pre-multiply (1.2) by we have:
∗ ∗ Λ∗ = Ω−1 + + + Ψ + Π
Full information solution Providing Γ0 and Γ1 do not have zero eigenvalues corresponding to the same eigenvector17 the QZ decomposition always exists and the set
∈ 1, … , dim
⊆ ℝ ∪ ∞ is unique even though the
decomposition itself is not (Sims 2002: 9, 20). Thus, without loss of generality we may assume that for < ,
<
. Let be the number of for which
≤ 1 and consider a partition of the matrices
under consideration in which in each case the top left block is of dimension × 18. We then write: Λ11 0 Λ12 Λ22
∗ 1, Ω = 11 ∗ 0 2,
Ω12 Ω22
∗ 1,−1 + 1∙ ∗ 2∙ 2,−1
∗ + + Ψ + Π
(2.9)
Note that this decomposition means that only Λ11 and Ω22 are guaranteed to be invertible. 2.3.2. Derivation of restrictions The second block of (2.9) is purely explosive by construction; thus we solve it forward following Sims (2002: 9). From this block we have that for all ∈ ℕ+:
∗ 2,
=
∗ Ω−1 Λ22 2,+ 22
−
=1
Ω−1 Λ22 22
−1
∗ Ω−1 2∙ + + + Ψ+ + Π+ 22
∗ So if we take dated expectations and then take the limit as → ∞, since the components of ∗ 2,+ −1 are asymptotically polynomial by assumption and thus dominated by Ω22 Λ22 , we have that:
∞ ∗ 2,
=
∗ ∗ 2,
=
−∗
=1
−1 Ω22 Λ22
−1
∗ Ω−1 2∙ + + + Ψ+ + Π+ 22
∞
∞ −1 Ω22 Λ22
=−
=0
Ω−1 2∙ 22
+ + 1
−
=0
−1 Ω22 Λ22
−1
Ω−1 Λ22 Ω−1 2∙ 22 22
where all sums are well defined since the eigenvalues of Ω−1 Λ22 are strictly in the unit circle by construc22 tion, which is shown to be a necessary and sufficient condition for convergence in appendix A, § 5. In fact by the formulae derived in that appendix:
17
This means that there is one or more equation that places no restrictions on either or −1 . This will create an
∗ additional source of indeterminacy in and may also imply that one or more components of and are linear
combinations of the others. We, like both Sims and Lubik & Schorfheide, will not pursue this avenue.
18
Again, this means that we are not treating unit roots as explosive.
∗ −1 2, = − − Ω22 Λ22 −1
Ω−1 2∙ + + 1 22
− − Ω−1 Λ22 22
−2
−1 Ω22 Λ22 Ω−1 2∙ 22 −1
= Λ22 − Ω22
−1
2∙ + + 1
−1
−1 + 2Λ22 − Ω22 − Λ22 Ω22 Λ22
Λ22 Ω−1 2∙ 22 (2.10)
= + Λ22 − Ω22 where: ≔ Λ22 − Ω22
2∙
−1
2∙ + + 2Λ22 − Ω22 − Λ22 Ω−1 Λ22 22
−1
Λ22 Ω−1 2∙ 22
(2.11)
∗ ∗ ∗ If 2, takes this form then ∗ 2, = ∗ 2, which from the forward solution is true if and only if: +1
∞
−∗ +1
=1
−1 Ω22 Λ22
−1
−1 ∗ Ω22 2∙ + + + Ψ+ + Π+
∞
= −∗
=1
−1 Ω22 Λ22
−1
∗ Ω−1 2∙ + + + Ψ+ + Π+ 22
⇔
−1 ∗ Ω22 2∙ Ψ+1 + Π+1 = 0
−1 (Here we have followed Mavroeidis and Zwols (2007).) Thus as Ω22 is of full rank, we require that is
chosen each period such that:
∗ 2∙ Ψ + 2∙ Π = 0
(2.12)
This is identical to the condition derived by Lubik and Schorfheide from a canonical form omitting the constant and linear terms (Lubik and Schorfheide 2003: 5). If we then take the singular value decomposition (SVD) (Horn and Johnson 1985: 414) of 2∙ Π and 2∙ Ψ we can write: ∙2 11 0 ∙2 11 0
0 ∙1 = ∙1 11 ∙1 0 ∙2 0 ∙1 = ∙1 11 ∙1 0 ∙2
2∙ Π = = ∙1 and 2∙ Ψ = = ∙1
(2.13)
where , , and are unitary and 11 and 11 have strictly positive diagonals and zeroes elsewhere. With this we can write down a necessary and sufficient condition (Sims 2002: 13, 20) for the existence of
∗ an satisfying (2.12), namely that:
∙1 ∙1 ∙1 = ∙1 This is necessary as if we pre-multiply (2.12) by ∙1 ∙1 , from the unitarity of :
(2.14)
∗ ∗ 0 = ∙1 ∙1 0 = ∙1 ∙1 ∙1 11 ∙1 + ∙1 ∙1 ∙1 11 ∙1 = ∙1 ∙1 ∙1 11 ∙1 + 2∙ Π = ∙1 ∙1 ∙1 11 ∙1 − 2∙ Ψ = ∙1 ∙1 ∙1 − ∙1 11 ∙1 −1 So since can take any value it can certainly take the value ∙1 11 for some arbitrary vector , so by the unitarity of for all , ∙1 ∙1 ∙1 − ∙1 = 0, which means condition (2.14) must be satisfied.
We note here something not mentioned in Lubik and Schorfheide (2003) or Sims (2002): when the matrix Ψ
19 Π is invertible , condition (2.14) holds if and only if ∙1 ∙1 = . The “if” direction is trivial and the
“only if” direction follows from the fact that by the unitarity of : ΨH ΠH
−1 −1
= 2∙ 2∙ = 2∙ Ψ
ΠΨ
Π
−1
Ψ H 2∙ ΠH
−1
= ∙1 11 ∙1
∙1 11 ∙1 Ψ
Π
ΨH ΠH
Ψ H 2∙ ΠH
so from (2.14), pre-multiplying this by ∙1 ∙1 we have:
∙1 ∙1 = ∙1 ∙1 ∙1 11 ∙1
∙1 ∙1 ∙1 11 ∙1 Ψ
Π
−1
−1
ΨH ΠH
−1
Ψ H 2∙ ΠH
= ∙1 11 ∙1
∙1 11 ∙1 Ψ Π
−1
ΨH ΠH
Ψ H = 2∙ ΠH
as required.
19 This trivially holds when the law of motion is known to be given by (1.1), as in the case under consideration, as this means $[\Psi \;\; \Pi] = I$. Indeed it is very hard to conceive of a realistic model in which it does not hold, as it would mean that there was some linear combination of equations in the model which was entirely non stochastic (i.e. it lacked both a shock term and an expectational error one). It may be argued that simple Taylor rules take precisely this form, but firstly they can easily be substituted in to the other equations giving a system with one fewer equation and secondly most models of any degree of sophistication include an interest-rate targeting shock so their Taylor rule equation is in fact stochastic.
We now demonstrate sufficiency of (2.14) by writing down an explicit solution. First let ≔ rank 2∙ Π, so that 11 is of dimension × . Then following Lubik and Schorfheide (2003: 9), we posit the following set
∗ of solutions for the forecast errors :
−1 ∗ = −∙1 11 ∙1 2∙ Ψ + ∙2 + ∙2
(2.15)
∗ ∗ ∗ (This is valid as and are of size dim − × dim and dim − × dim respectively.)
When (2.14) holds this satisfies (2.12) as by the unitarity of :
∗ −1 ∗ 2∙ Π = ∙1 11 ∙1 = −∙1 11 ∙1 ∙1 11 ∙1 2∙ Ψ + ∙1 11 ∙1 ∙2 + ∙1 11 ∙1 ∙2 = −∙1 ∙1 2∙ Ψ = −∙1 ∙1 ∙1 11 ∙1 = −∙1 11 ∙1 = −2∙ Ψ
where the penultimate step used (2.14). It is immediate from this solution for the forecast errors that
∗ there is a unique solution if and only if = dim , in which case the last two terms drop out.
2.3.3. Derivation of the stacked form solution Now by (2.13) and the unitarity of and :
−1 −1 1∙ − 1∙ Π∙1 11 ∙1 2∙ Π = 1∙ Π − ∙1 11 ∙1 ∙1 11 ∙1 = 1∙ Π − ∙1 ∙1 = 1∙ Π∙2 ∙2
(2.16)
So from (2.15) and unitarity again:
−1 −1 ∗ 1∙ − 1∙ Π∙1 11 ∙1 2∙ Π = 1∙ Π∙2 ∙2 −∙1 11 ∙1 2∙ Ψ + ∙2 + ∙2
= 1∙ Π∙2 + Thus if we now follow Sims (2002) and Mavroeidis and Zwols (2007) and pre-multiply (2.9) by
−1 −1∙ Π∙1 11 ∙1 we have:
Λ11
−1 Λ12 − 1∙ Π∙1 11 ∙1 Λ22
∗ 1, ∗ 2, ∗ 1,−1 −1 ∗ + 1∙ − 1∙ Π∙1 11 ∙1 2∙ + + Ψ + Π ∗ 2,−1 ∗ 1,−1 −1 + 1∙ − 1∙ Π∙1 11 ∙1 2∙ + + Ψ ∗ 2,−1
= Ω11
−1 Ω12 − 1∙ Π∙1 11 ∙1 Ω22
= Ω11
−1 Ω12 − 1∙ Π∙1 11 ∙1 Ω22
+ 1∙ Π∙2 +
Full information solution Now stacking this equation with our solution for the explosive terms, (2.10), we can write: Λ11 0
−1 Λ12 − 1∙ Π∙1 11 ∙1 Λ22 ∗ 1, = Ω11 ∗ 2, 0 −1 Ω12 − 1∙ Π∙1 11 ∙1 Ω22 0 ∗ 1,−1 ∗ 2,−1
−1 Π + 1∙ − 1∙ Π∙1 11 ∙1 2∙ Ψ + 1∙ ∙2 + 0 0
+
−1 1∙ − 1∙ Π∙1 11 ∙1 2∙ +
−1 1∙ − 1∙ Π∙1 11 ∙1 2∙ Λ22 − Ω22 −1 2∙
The matrix multiplying the left hand side is clearly invertible since Λ11 is by construction. In fact by the block inverse formula: Λ11 0
−1 Λ12 − 1∙ Π∙1 11 ∙1 Λ 22 −1 −1 = Λ11 0 −1 −Λ−1 Λ12 − 1∙ Π∙1 11 ∙1 Λ22 11
So if we now pre-multiply both sides by:
−1 ≔ Λ11 0 −1 −Λ−1 Λ12 − 1∙ Π∙1 11 ∙1 Λ22 11
(2.17)
we have:
∗ = Ω11 0 −1 −1 Ω12 − 1∙ Π∙1 11 ∙1 Ω22 ∗ + 1∙ − 1∙ Π∙1 11 ∙1 2∙ Ψ −1 0 0 −1 1∙ − 1∙ Π∙1 11 ∙1 2∙
+
1∙ Π∙2 + + 0
−1 1∙ − 1∙ Π∙1 11 ∙1 2∙ Λ22 − Ω22 −1 2∙
+
−1 ∗ −1 = ∙1 Λ−1 Ω11 ∙1 + Ω12 − 1∙ Π∙1 11 ∙1 Ω22 ∙2 −1 + ∙1 Λ−1 1∙ − 1∙ Π∙1 11 ∙1 2∙ Ψ 11 11
+ ∙1 Λ−1 1∙ Π∙2 + + 11
−1 1∙ − 1∙ Π∙1 11 ∙1 2∙ Λ22 − Ω22 −1 2∙
−1 1∙ − 1∙ Π∙1 11 ∙1 2∙
+
where has been partitioned conformably with . If we now define:
−1 Θ ≔ ∙1 Λ−1 Ω11 ∙1 + Ω12 − 1∙ Π∙1 11 ∙1 Ω22 ∙2 11 −1 Θ ≔ ∙1 Λ−1 1∙ − 1∙ Π∙1 11 ∙1 2∙ Ψ + ∙1 Λ−1 1∙ Π∙2 , 11 11
Θ ≔ ∙1 Λ−1 1∙ Π∙2 11
≔
−1 1∙ − 1∙ Π∙1 11 ∙1 2∙ ,
=
−1 1∙ − 1∙ Π∙1 11 ∙1 2∙ Λ22 − Ω22 −1 2∙
∗ and partition these matrices and vectors conformably with , then we can write:
∗ Θ,11 = ∗∗ Θ,21 +1
Θ,12 Θ,22
∗ Θ ,1∙ ,1∙ ,1∙ Θ,1∙ −1 + + + ∗ ∗ + Θ,2∙ Θ ,2∙ ,2∙ −1 ,2∙
(2.18)
2.3.4. VARMAX form solution
∗ To make the implications of this for the path of clear we explicitly derive a VARMAX form solution.
∗ From the bottom row of (2.18), the definition of and (2.15):
∗ ∗ −1 ∗ ∗ +1 = Θ,21 −1 + Θ,22 + ∙1 11 ∙1 2∙ Ψ − ∙2 , − ∙2 ,
+ ,2∙ + ,2∙ + Θ,2∙ + Θ,2∙
∗ −1 ∗ = Θ,22 + Θ,21 −1 + ,2∙ + ,2∙ + Θ,22 ∙1 11 ∙1 2∙ Ψ − ∙2 , + Θ,2∙
+ Θ,2∙ − Θ,22 ∙2 ,
∗ So again by the definition of and (2.15):
∗ ∗ −1 ∗ +1 = Θ,22 + Θ,21 −1 + ,2∙ + ,2∙ + Θ,22 ∙1 11 ∙1 2∙ Ψ − ∙2 1, + Θ,2∙ −1 + −∙1 11 ∙1 2∙ Ψ + ∙2 , +1 + Θ,2∙ − Θ,22 ∙2 , + ∙2 , +1
which is in VARMAX(2,1,2) form rather than the usually considered “MSV” VAR(1) form, again illustrating the restrictions implicit in looking for MSV form solutions.
2.3.5. FREE solutions
∗ We now look for a FREE solution, which requires us to find an expression for ∗ +1 which does not depend on , −1 , …. Just as in the univariate case, the key to doing this is to express as a function of
and .
∗ Now from our solution for , (2.15):
∗ = ∙1
∙2
−1 −11 ∙1 2∙ Ψ + ∙1
∙2
0
So since is unitary, if we pre-multiply by we have:
−1 0 −11 ∙1 ∙1 11 ∙1 ∗ = − ∗ For to be uniquely determined given and we thus require
(2.19)
−1 −11 ∙1 ∙1 11 ∙1 to be invertible.
Thus we must have that is of full rank and that its rows are linearly independent of those of
−1 −1 11 ∙1 ∙1 11 ∙1 , which also must be of full rank, i.e. rank 11 ∙1 ∙1 11 ∙1 = . We will now investigate under what circumstances this final condition holds.
−1 −1 ∗ Note that by the rank-nullity theorem, rank 11 ∙1 ∙1 11 ∙1 = dim − dim ker 11 ∙1 ∙1 11 ∙1 ; so −1 −1 ∗ rank 11 ∙1 ∙1 11 ∙1 = , if and only if, dim ker 11 ∙1 ∙1 11 ∙1 = dim − . Suppose then for −1 −1 some vector ≠ 0, 11 ∙1 ∙1 11 ∙1 = 0. Pre-multiplying this equation by ∙1 11 ∙1 ∙1 11 , by (2.14) and the unitarity of we have that ∙1 ∙1 = 0. However, by the unitarity of , ker ∙1 = 0
20.
Thus,
ker ∙1 ∙1 = ker ∙1 = span ∙2 21, so ∈ span ∙2 . Note also that if ∈ span ∙2 then by the unitarity of −1 −1 , 11 ∙1 ∙1 11 ∙1 = 0, so this is sufficient as well as necessary for ∈ ker 11 ∙1 ∙1 11 ∙1 . This −1 ∗ means by the above that rank 11 ∙1 ∙1 11 ∙1 = , if and only if ∙2 has dim − rows (since they
are linearly independent by the unitarity of ), i.e. if and only if rank 2∙ Ψ = .
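The kernel facts about column blocks of a unitary matrix that this argument relies on (footnotes 20 and 21) can be illustrated numerically. This is a sketch under assumptions: `U` is a generic real orthogonal matrix split into column blocks `U1` and `U2`, and it is not claimed to be any particular matrix of the decomposition in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 5, 2  # hypothetical sizes: r columns in the first block

# Build a unitary (here real orthogonal) matrix via QR, then split its columns.
U, _ = np.linalg.qr(rng.normal(size=(n, n)))
U1, U2 = U[:, :r], U[:, r:]

# ker(U1) = {0}: U1 has full column rank r.
rank_U1 = np.linalg.matrix_rank(U1)

# ker(U1') = span(U2): U1' annihilates U2, and by rank-nullity the kernel of
# U1' has dimension n - r, which is exactly the number of columns of U2.
annihilates = np.allclose(U1.T @ U2, 0.0)
nullity_U1T = n - np.linalg.matrix_rank(U1.T)
```

The dimension count in the last line is the rank-nullity step: once the kernel is known to contain span(U2) and to have the same dimension, the two coincide.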
−1 Now by (2.14), 2∙ Π∙1 11 ∙1 2∙ Ψ = 2∙ Ψ, thus span 2∙ Ψ ⊆ span 2∙ Π so rank 2∙ Ψ ≤ rank 2∙ Π =
. Thus if it is to be the case that rank 2∙ Ψ = , since span 2∙ Ψ ⊆ span 2∙ Π, we must in fact have that span 2∙ Ψ = span 2∙ Π, which is true if and only if in addition to (2.14) we also have:
∙1 ∙1 ∙1 = ∙1
(2.20)
Note that this condition is independent of both the condition for the existence of a REE (2.14) and the
condition for the REE to be unique, which is that 1∙ Π = 1∙ Π∙1 ∙1 (Sims 2002: 11). Note also that it holds trivially in the fully stable (fully indeterminate) case, since then ∙1 and ∙1 have 0 rows. Finally note that when Ψ Π is invertible, so ∙1 ∙1 = , this is equivalent to the condition that ∙1 ∙1 = .
20
∈ ker ∙1 implies ∙1 = 0, so pre-multiplying by ∙1 , by unitarity we have = 0. The converse is trivial.
21
∈ ker ∙1 implies ∙1 = 0. Now let
1 ≔ ∙1 . By unitarity then if we pre-multiply by we have 2 ∙2
= ∙1
∙2
1 . Thus ∙1 ∙1 2
∙2
1 = 0 which implies by unitarity that 1 = 0, i.e. = ∙2 2 . The 2
converse is again trivial.
∗ The above argument implies that is uniquely determined given and if and only if condition (2.20) −1 holds and is of full rank with rows which are linearly independent of those of 11 ∙1 2∙ Ψ. In this case −1 −11 ∙1 2∙ Ψ is invertible and:
= Now let:
−1 −11 ∙1 2∙ Ψ
−1
0 ∗ −
−1 −11 ∙1 2∙ Ψ Φ ≔
−1
,
−1 −11 ∙1 2∙ Ψ Φ ≔ −
−1
0
∗ where Φ has dim columns. Then from (2.18):
∗ ∗ ∗ ∗ ∗ +1 = Θ,2∙ Φ + Θ,21 −1 + Θ,22 − Θ,2∙ Φ ∗ + ,2∙ + ,2∙ −1
+ Θ,2∙ + Θ,2∙ Φ This is in FREE form as required.
(2.21)
It now just remains to prove that condition (2.20) and being full rank with rows which are linearly independent of those of −1 11 ∙1 ∙1 11 ∙1 (which we have established holds if and only if −1 −11 ∙1 2∙ Ψ
is invertible) are necessary as well as sufficient for there to be a FREE. As in the univariate case we proceed by contradiction and assume that
−1 −11 ∙1 2∙ Ψ is not invertible, but that:
∗ ∗ ∗ +1 = ℛ + + other terms known at − 1 First let = ∙1 11 ∙1 be the SVD of −1 −11 ∙1 2∙ Ψ . Since it is square but not invertible ∙2 must
have a positive number of columns. Then by (2.19) and the unitarity of :
∙1 ∙2
0 = ∙2 ∙1 11 ∙1 = ∙2
0 −
∗ 0 ∗ ⇒ ∙2 = ∙2
∗ ∗ ∗ Thus 0 = Cov−1 ∙2 , = Cov−1 ∙2 , = ∙2 Cov−1 , . Now is invertible which means that ker ∙2 = ker ∙2 = span ∙1
22
23,
thus there exists a rank
−1 −11 ∙1 2∙ Ψ ×
∗ ∗ dim matrix such that Cov−1 , = ∙1 . Now since has fewer rows than columns, if ∗ = ∙1 11 ∙1 is the SVD of , then ∙2 must have a positive number of rows. Cov−1 , ∙2 = ∗ Cov−1 , ∙2 = ∙1 ∙1 11 ∙1 ∙2 = 0, by the unitarity of . Thus:
∗ ∗ ∗ 0 = Cov−1 ∗ +1 + −1 + + + , ∙2 = Cov−1 ∗ +1 + , ∙2 ∗ = ℛ Cov−1 , ∙2 + Cov−1 , ∙2 + Var−1 ∙2 = Σ∙2
However Σ is of full rank, so we must have that ∙2 = 0, which gives the required contradiction from
0 = ∙2 ∙2 = as by the above ∙2 has a positive number of rows. This brings us to:
2.3.6. Proposition 2
The canonical form (1.2) has one or more REEs, given belief in non-explosiveness, if and only if condition (2.14) is satisfied. These REEs are FREEs if and only if condition (2.20) is satisfied and is of full rank with rows which are linearly independent of those of −1 11 ∙1 2∙ Ψ. All REEs (and consequently FREEs) are expressible in VARMAX(2,1,2) form.
22
Let ∈ ker ∙2 and let = . Then = , so 0 = ∙2 = ∙2 = ∙2 , i.e. ∈ ker ∙2 ,
thus ∈ ker ∙2 . Conversely if ∈ ker ∙2 , then = for some satisfying 0 = ∙2 , so ∈ ker ∙2 .
23
By an identical argument to that made in footnote 21.
3. Partial information solution
We now turn to the partial information case. Forming rational expectations under learning is considerably harder than in the full information case. Intuitively, this is because partial information will mean the agents will never be exactly on the stable path, and may perhaps never be able to get back on it, once they know better where it is, without violating the requirement under full rationality that −1 = 0. If they have a chance to return to the stable path, it is by exploiting the fact that their expectational errors may be predictable by an individual with more information. This is the avenue we shall pursue. We approach this in two parts, firstly by working out how expectations are formed with minimal assumptions on information sets and no assumptions on where beliefs are coming from, and then by analysing the resulting solution under Bayesian belief updating with sensible information sets.
3.1. Expectation formation with exogenous beliefs
3.1.1. Set-up
Let all agents have identical beliefs about , , , Σ, , and let the period information set be ℐ for all agents, where ⊆ ℐ ⊆ ℐ∗ ∪ a FREE solution exists with given by:
∞
≔ , ∪
=−∞ ∞
~NIID 0, Σ
∪ Σ is of full rank
∪
=−∞
= 0 and is independent of , , , Σ, , , , −1 , … , +1 , +2 , …
∪ the economy's law of motion is of the form of (1.1) ∪ the economy is asymptotically growing at a sub-exponential rate ∪ conditions (2.14) 3.14 and (2.20) 3.20 hold
−1 ∪ ∀ Γ0 , Γ1 , , , Ψ, Π, Σ : is of full rank with rows linearly independent of those of 11 ∙1 2∙ Ψ
Note that we are now treating , not as matrices but as functions from Γ0 , Γ1 , , , Ψ, Π, Σ -tuples to matrices. That is to say and determine how the sunspot term would be formed given the parameters of the economy. We will however abuse notation slightly and write e.g. rather than
Γ0 , Γ1 , , , Ψ, Π, Σ where no confusion will arise. Assuming agents know these functions, and , is not ideal, but it greatly simplifies the maths and may be justified by noting that these are choice variables, so we are just assuming coordinated choices. This in turn could be justified by assuming that there are psychological processes identical across all individuals that lead people to include sunspots of a particular form, or by assuming that in-period communication leads to everyone having the same sunspots. In future work we will investigate learning under uncertainty about and as well.
As in the previous chapter, we shall solve the canonical form (1.2) rather than assuming (1.1) holds, in order to facilitate applying this to more general models. We will however assume that Ψ Π is invertible, which it certainly is when (1.1) holds. As noted in footnote 19, there are virtually no realistic models for which this does not hold, at least not if we are allowed to substitute out equations24; to put it crudely, economics is never an exact science. We do however have an additional reason for assuming Ψ Π is
invertible when dealing with learning. Were this not the case, then after a finite number of periods a linear combination of equations would be known with certainty, which introduces countless technical difficulties in the belief updating equations and does not concord with the econometrician’s experience of uncertainty about everything. Our goal in this chapter is to find necessary and sufficient conditions for convergence to the full information case. However, if we are to do so it is important that we first specify exactly what we actually mean by this. That we should be asymptotically on the full information stable path is certainly a base requirement. We will additionally require though that as → ∞, the difference between full and partial information expectational errors tends to 0, which is a weak form of convergence of expectations requirement,
∗ since it is independent of the convergence or otherwise of − . Note that both of these conditions are
satisfied by the form of convergence examined by E&H since they look at convergence of the coefficients of the actual law of motion to those of a (possibly sunspot augmented) MSV solution (G. W. Evans and Honkapohja 2001). With this definition in hand, we begin the task of finding such necessary and sufficient conditions.
24
And in fact even market clearing or account balance equations rarely hold exactly due to measurement error.
3.1.2. Derivation of restrictions
We proceed to solve the model much as in the previous chapter. As there, we know that: Λ11 0 Λ12 1, Ω = 11 Λ22 2, 0 Ω12 1,−1 + 1∙ Ω22 2,−1 2∙
+ + Ψ + Π
(3.1)
(note the lack of stars) and again we solve the explosive bottom block forward. This means for all ∈ ℕ+:
2, =
Ω−1 Λ22 2,+ 22
−
=1
Ω−1 Λ22 22
−1
Ω−1 2∙ + + + Ψ+ + Π+ 22
(3.2)
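The forward iteration behind (3.2) can be sketched numerically. In the block below, `M` stands in for $\Omega_{22}^{-1}\Lambda_{22}$ and `f[j]` for the whole forcing term dated $t+j$; both are randomly generated stand-ins, and the dimension and horizon are illustrative assumptions. The sketch checks the $k$-step closed form and that the leading term vanishes when the spectral radius of `M` is below one.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 3, 6  # hypothetical dimension and number of forward iterations

# M stands in for Omega_22^{-1} Lambda_22; rescale so its spectral radius is 0.8.
M = rng.normal(size=(n, n))
M *= 0.8 / np.max(np.abs(np.linalg.eigvals(M)))

# f[j] stands in for the forcing term dated t+j (f[0] is unused).
f = rng.normal(size=(k + 1, n))

# Build a path satisfying x_{t+j-1} = M x_{t+j} - f_{t+j}, starting from x_{t+k}.
x = np.zeros((k + 1, n))
x[k] = rng.normal(size=n)
for j in range(k, 0, -1):
    x[j - 1] = M @ x[j] - f[j]

# Closed form after k iterations: x_t = M^k x_{t+k} - sum_{j=1}^k M^{j-1} f_{t+j}.
closed = np.linalg.matrix_power(M, k) @ x[k] - sum(
    np.linalg.matrix_power(M, j - 1) @ f[j] for j in range(1, k + 1)
)
forward_ok = np.allclose(x[0], closed)

# The leading term vanishes as k grows because the spectral radius is below one.
tail_norm = np.linalg.norm(np.linalg.matrix_power(M, 200))
```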
∗ So if we take expectations under the ℐ information set25 ( ≥ ) and then take the limit as → ∞, since the components of ∗ 2,+ are asymptotically polynomial by our assumptions above and thus dominated by $\Omega_{22}^{-1}\Lambda_{22}$, we have that:
∞
2, =
∗
2, =
−∗
=1 ∞
Ω−1 Λ22 22
−1
Ω−1 2∙ + + + Ψ+ + Π+ 22
∞
=−
=0 −−1
Ω−1 Λ22 22
−1 Ω22 2∙
+ + 1
−−1
−
=0
−1 Ω22 Λ22
−1
−1 Ω22 Λ22 Ω−1 2∙ 22
−
=0 ∞
Ω−1 Λ22 Ω−1 2∙ Ψ++1 22 22
−
=0
−1 Ω22 Λ22 Ω−1 2∙ Π++1 22
−
=−
Ω−1 Λ22 Ω−1 2∙ Π∗ ++1 22 22
where all sums are well defined since the eigenvalues of $\Omega_{22}^{-1}\Lambda_{22}$ are strictly in the unit circle by construction, which is necessary and sufficient for convergence by the results of appendix A, § 5. Now let:
∞
, ≔
=−
−1 −1 Ω22 Λ22 Ω22 2∙ Π∗ ++1
∞
=
−1 Ω22 Λ22 − =0
−1 −1 Ω−1 Λ22 Ω22 2∙ Π∗ ++1 = Ω22 Λ22 22
−
,
(3.3)
25
There is no additional benefit taking expectations under the ℐ information set at this point, since we can always
∗ retrieve ℐ expectations from ℐ ones using the law of iterated expectations.
∗ (Note that when ℐ = ℐ , , = 0. However, in general this will not hold since +1 is defined relative to ∗ the ℐ information set, which is a subset of the ℐ one.) Using this definition and the formulae derived in
the aforementioned appendix, we then have that:
−−1 −−1
2, = + Λ22 − Ω22 − , ⇒ &
−1
2∙ −
=0
Ω−1 Λ22 Ω−1 2∙ Ψ++1 22 22
−
=0
−1 Ω22 Λ22 Ω−1 2∙ Π++1 22
2, = + Λ22 − Ω22 2, = + Λ22 − Ω22
−1
−1
2∙ − ,
(3.4)
−1 −1 2∙ − Ω22 2∙ Ψ+1 − Ω22 2∙ Π+1 − +1,
where is defined as before by (2.11). Equating both right hand sides and pre-multiplying by Ω22 we thus have that: = 2∙ Ψ + 2∙ Π where: ≔ Ω22 −1,−1 − ,−1 = Ω22 −1,−1 − Λ22 , (3.5) (3.6)
As in the previous chapter, we posit an explicit solution for , namely:
−1 −1 = ∙1 11 ∙1 + −∙1 11 ∙1 2∙ Ψ + ∙2 + ∙2 + ∙2
(3.7)
where is some pseudo-sunspot, possibly a time varying function of the other random variables in the system, chosen so as to not violate +1 = 0. (Note we do not require that ∗ +1 = 0 or even neces sarily that +1 dim +1 = 0 due to the possible conditional covariance between ∙2 and .) This satisfies (3.5) since by the unitarity of and the fact that together condition (2.14) and the invertibility of Ψ
Π implies ∙1 ∙1 = , we have:
2∙ Π = ∙1 11 ∙1 −1 −1 = ∙1 11 ∙1 ∙1 11 ∙1 + −∙1 11 ∙1 ∙1 11 ∙1 2∙ Ψ + ∙1 11 ∙1 ∙2 +∙1 11 ∙1 ∙2 + ∙1 11 ∙1 ∙2 = ∙1 ∙1 − ∙1 ∙1 ∙1 11 ∙1 = − 2∙ Ψ
It is also clearly as general a solution as is possible while still satisfying (3.5).
Equation (3.7) has three important consequences. Firstly, it implies necessary and sufficient conditions for convergence, namely that: lim = lim , = 0
→∞
→∞
(3.8)
This is sufficient since by (3.6) if , tends to 0 as → ∞ then so does and that and tend to 0 as
−1 ∗ → ∞ is trivially sufficient for − → 0. Now note that the matrix 11 ∙1 0
0 is invertible as
∗ ∙1 ∙1 = , thus that and tend to 0 as → ∞ is necessary for − to tend to 0. But in fact from
equation (3.4), lim→∞ , = 0 is also necessary, else asymptotically we would be off the stable path.
Secondly, if we pre-multiply (3.6) by ∙2 , using the unitarity of , we have that:
−1 −1 ∙2 = ∙2 ∙1 11 ∙1 + −∙2 ∙1 11 ∙1 2∙ Ψ + ∙2 ∙2 + ∙2 ∙2 + ∙2 ∙2
= + +
(3.9)
∗ which will eventually be used to estimate past values of . Thirdly, since ℐ−1 ⊆ ℐ−1 , −1 ∗ = −1
−1 = 0, so by (3.6):
−1 0 = −1 ∙1 11 ∙1 Ω22 −1,−1 − Λ22 ,
+ −1 ∙2
(3.10)
which will turn out to be our chief rationality constraint. This equation also means that if it is ever known under the ℐ−1 information set that ∙2 has no columns (i.e. it is known that the model is fully determinate), then for some vector satisfying −1 = 0:
−1 −1 ∙1 11 ∙1 Λ22 , = ∙1 11 ∙1 Ω22 −1,−1 +
Thus as ∙1 ∙1 = by condition (2.14) and the invertibility of Ψ Π :
−1 −1,−1 = Ω−1 Λ22 , − Ω22 ∙1 11 ∙1 22
As $\Omega_{22}^{-1}\Lambda_{22}$ has eigenvalues in the unit circle, this suggests26 that expectations of , are explosive, violating the belief in convergence unless we are in fact already on the stable path and we have enough information to stay on it for good. This in turn suggests that it is the lingering possibility that in fact the model is fully stable that enables a return to the stable path. Indeed, we conjecture that, providing we are uncertain where the stable path is, convergence in non-fully-stable cases is impossible unless beliefs put positive probability on full stability (full indeterminacy) at all times.
26
−1 But does not imply, since possibly Ω22 ∙1 11 ∙1 and are conditionally correlated.
3.1.3. Derivation of the stacked form solution
Now by (2.16), (3.7) and the unitarity of :
−1 ∗ 1∙ − 1∙ Π∙1 11 ∙1 2∙ Π −1 −1 = 1∙ Π∙2 ∙2 ∙1 11 ∙1 + −∙1 11 ∙1 2∙ Ψ + ∙2 + ∙2 + ∙2
= 1∙ Π∙2 + + Thus if we pre-multiply (3.1) by
−1 −1∙ Π∙1 11 ∙1 , exactly as in the previous chapter we have:
Λ11 = Ω11 = Ω11
1, −1 Λ12 − 1∙ Π∙1 11 ∙1 Λ22 2, 1,−1 −1 −1 + 1∙ − 1∙ Π∙1 11 ∙1 2∙ + + Ψ + Π Ω12 − 1∙ Π∙1 11 ∙1 Ω22 2,−1 1,−1 −1 −1 + 1∙ − 1∙ Π∙1 11 ∙1 2∙ + + Ψ Ω12 − 1∙ Π∙1 11 ∙1 Ω22 2,−1
+ 1∙ Π∙2 + +
Again as in the previous chapter, if we now stack this equation with our solution for the explosive terms, (3.4), we can write: Λ11 0
−1 Λ12 − 1∙ Π∙1 11 ∙1 Λ22 1, = Ω11 2, 0 −1 Ω12 − 1∙ Π∙1 11 ∙1 Ω22 1,−1 2,−1 0
−1 Π + 1∙ − 1∙ Π∙1 11 ∙1 2∙ Ψ + 1∙ ∙2 + + 0 0
+
−1 1∙ − 1∙ Π∙1 11 ∙1 2∙ +
−1 1∙ − 1∙ Π∙1 11 ∙1 2∙ Λ22 − Ω22 −1 2∙
−
0 ,
Thus if we now pre-multiply both sides by (defined as in (2.17)) we have: = Ω11 0 +
−1 −1 Ω12 − 1∙ Π∙1 11 ∙1 Ω22 1∙ − 1∙ Π∙1 11 ∙1 2∙ Ψ −1 + 0 0
−1 1∙ − 1∙ Π∙1 11 ∙1 2∙ 1∙ Π∙2 + + + 0 −1 1∙ − 1∙ Π∙1 11 ∙1 2∙ Λ22 − Ω22 −1 2∙
+
−
0 ,
−1 −1 = ∙1 Λ−1 Ω11 ∙1 + Ω12 − 1∙ Π∙1 11 ∙1 Ω22 ∙2 −1 + ∙1 Λ−1 1∙ − 1∙ Π∙1 11 ∙1 2∙ Ψ 11 11
+ ∙1 Λ−1 1∙ Π∙2 + + + 11
−1 1∙ − 1∙ Π∙1 11 ∙1 2∙ Λ22 − Ω22 −1 2∙
−1 1∙ − 1∙ Π∙1 11 ∙1 2∙
+
−1 + ∙1 Λ−1 Λ12 − 1∙ Π∙1 11 ∙1 Λ22 , − ∙2 , 11
Thus with Θ , Θ , Θ , and defined as in § 2.3.3, and Θ , Θ and , given by:
−1 Θ ≔ ∙1 Λ−1 Λ12 − 1∙ Π∙1 11 ∙1 Λ 22 − ∙2 , 11
Θ ≔ ∙1 Λ−1 1∙ Π∙2 11
, ≔ Θ ∗ +,+ + Θ ∗ + we can write: = Θ −1 + + + Θ + Θ + ,0 (3.11)
We note here that trivially our condition (3.8) is sufficient for ,0 to tend to 0 as → ∞ and hence, by
∗ (2.18), sufficient for − to tend to 0 as well.
3.1.4. Solution for the off stable path term
Equation (3.11) would be identical to the solution under full information (2.18) were it not for the ,0 term. In this section, we concentrate initially on finding an expression for the , component, which by equation (3.4) measures the distance off the saddle path. It is not at all obvious a priori whether its definition in terms of future expectational errors actually restricts the values it can take. We approach this problem by using equation (3.11) to derive a solution for these expectations of future errors. Pushing equation (3.11) forward one period and then iterating times before finally taking dated full-information expectations, we have that:
∗ + Now:
=
Θ
+
=1
Θ
−
+ + + ,
− Θ
− +1 Θ
Θ −
=1
=
=1
−
=1
Θ
−
= Θ −
and:
Θ −
=1
− 1
− Θ
=
=1
− 1
− +1 Θ
−
=1
− 1 Θ
−
− Θ
− Θ
=
=1
− 1
− −1 Θ
−
=1
+
=1
=
=1
Θ
−
−
Thus assuming Θ − is invertible (which is certainly true whenever there are no unit roots), by the above, for ∈ ℕ: ∗ + = Θ + Θ −
−1 −1
Θ −
−1
+ + 1
−
+ Θ −
Θ −
Θ
− − +
=1
Θ
,
But:
Θ Θ − = Θ+1 − Θ = Θ − Θ
−1
So pre- and post- multiplying by Θ −
we have that Θ Θ −
−1
−1
= Θ −
−1 Θ ,
thus:
∗ + = Θ + Θ − Θ − Θ
−1
+ + 1
−1 −
+
− Θ −
− Θ −
+
=1
Θ
,
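The two matrix identities used in this step, the finite geometric sum and the fact that Θ commutes with $(\Theta - I)^{-1}$, can be checked numerically. This is a minimal sketch assuming a randomly generated stand-in for Θ, rescaled so that $\Theta - I$ is invertible (no unit roots); the dimension and horizon are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 4, 5  # hypothetical dimension and horizon

# Theta is a stand-in transition matrix, rescaled to spectral radius 0.9 so
# that Theta - I has no zero eigenvalue and is therefore invertible.
Theta = rng.normal(size=(n, n))
Theta *= 0.9 / np.max(np.abs(np.linalg.eigvals(Theta)))
I = np.eye(n)
inv = np.linalg.inv(Theta - I)
P = np.linalg.matrix_power

# Finite geometric sum: sum_{j=1}^k Theta^{k-j} = (Theta^k - I)(Theta - I)^{-1}.
geo = sum(P(Theta, k - j) for j in range(1, k + 1))
geo_ok = np.allclose(geo, (P(Theta, k) - I) @ inv)

# Theta commutes with (Theta - I)^{-1}.
commute_ok = np.allclose(Theta @ inv, inv @ Theta)
```

The rescaling is only a convenient way to guarantee invertibility in the example; the identities themselves hold for any Θ without a unit eigenvalue.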
Thus for > 0: ∗ + = ∗ + − +−1 + = = + − − 0 ∗ + − 0
−1
∗ +−1 + − − Θ −
−2
Θ −
−1
+ + 1
−1
−
− − 1 Θ −
+
0 , +
=1
− ,
where ≔
0 Θ − 0
Θ−1 =
0 Θ − 0
Θ−1 . This is true for any information set
and any values for and , providing the conditions at the start of this chapter are satisfied. Therefore it is certainly true in the case when ℐ = ℐ∗ , = = 0 and we are on the saddle-hyper-plane. In this case
∗ = = 0 and , = 0 for all , . Thus we have that for > 0, 0 = ∗ + = ∗ + = . Now ∗ ∗ ∗ note that in this case, 0 = 2, = 2, , thus = = ∙1 1, = ∙1 1, . Also note that “acting as god”
we can choose initial conditions (holding the shock series constant) without violating rationality such that 1, takes any value; thus it must be the case that 0 = ∙1 , at least when ℐ = ℐ∗ and = = 0. In fact though, this must be true no matter what ℐ , and are, as examining the definition of we see that it does not depend on any of these things (though it will not be true more generally that 0 = since we may be off the saddle-hyper-plane). Now by the unitarity of , for > 0: Θ = ∙1 Λ−1 Ω11 Λ−1 11 11
−1 −1 Ω11 ∙1 + Ω12 − 1∙ Π∙1 11 ∙1 Ω22 ∙2
Therefore for > 1, = 0. Note for future reference that 0 = ∙1 also implies that 1 Θ = −1 ∙2 and 1 Θ = 0, so 1 , = −1 ∙2 ∗ +,+ . Also by the unitarity of we have that:
1 = 1 = 1 ∙1 ∙1 + ∙2 ∙2 = 1 ∙2 ∙2
Thus by (3.4) and the just derived equation for 1 , :
1 = 1 ∙2 ∙2 = 1 ∙2 2, = 1 ∙2 + 1 ∙2 Λ22 − Ω22 −1
2∙ + 1 ,0
This means that: ∗ +1 = 1 ∙2 + 1 ∙2 Λ22 − Ω22 + 1 + − and that for > 1: ∗ + = − + − Θ −
−1 −1
2∙ + 1 ,0 + 1 + − − 1 Θ −
−1
Θ −
−1
+ + 1
Θ −
−2
+ −
+
0 ,1
+ + 1
−1
+ −
Θ −
−2
− 1 Θ −
+
0 , + 1 ,−1
Now as before, we note that the equation for ∗ + just derived must hold for any information set, so in particular it holds when ℐ = ℐ∗ and we are on the saddle-hyper-plane. As , = 0 in this case this means that for all and for all > 1:
∗ 0 = ∗ + = ∗ +
= −
Θ −
−1
+ + 1
−1
+ −
Θ −
−2
+ −
− 1 Θ −
−1
This can only hold if −
Θ −
= 0 and:
0 = − Θ −
−1
+ −
Θ −
−2
+ 0
− Θ −
−1
Again, since the variables in these equalities are independent of the information set they must in fact hold for any information set. Thus: ∗ +1 = 1 ∙2 + ∙2 Λ22 − Ω22 + 0 ,1 0 , + 1 ,−1 .
−1
2∙ + ,0 + Θ −
−1
+ + 1
+ Θ −
−2
and for > 1, ∗ + =
Now again we have that this ∗ +1 equation must hold when ℐ = ℐ∗ and we are on the saddle-hyper plane, so as ,0 = ,1 = 0 in this case it must be that:
∗ 0 = ∗ +1 = ∗ +1
= 1 ∙2 + ∙2 Λ22 − Ω22
−1
2∙ + Θ −
−1
+ + 1
+ Θ −
−2
As before we note that since none of these variables depend on the information set this must in fact hold whatever ℐ is. Thus in fact for all > 0: ∗ + = 0 , + 1 ,−1 (3.12)
We can now use this equation to tackle the off-stable path term +,+ . From its definition (3.3), for ≥ 0:
∞
∗ +,+
=
=1
−1 Ω22 Λ 22
−1
−1 Ω22 2∙ Π
0 ,+ + 1 ,+−1
Now as noted above, 1 , = −1 ∙2 ∗ +,+ , thus:
−1 Ω22 2∙ Π1 ∙2 + ∗ +,+ ∞
+
=1 ∞
Ω−1 Λ22 22
−1
−1 Ω−1 Λ22 Ω22 2∙ Π1 ∙2 − 2∙ Π 22
0 Θ ∗ ++,++
=
=1
Ω−1 Λ22 22
−1
Ω−1 2∙ Π 22
0 Θ ∗ ++
This is an infinite order difference equation, but luckily we can reduce it to a first order one. Let and be the left and right side respectively of this equation. Additionally let: ≔ Ω−1 2∙ Π1 ∙2 + , 22 and Then: − ℒ ≔ Ω−1 Λ22 Ω−1 2∙ Π1 ∙2 − 2∙ Π 22 22
−1 ℛ ≔ Ω22 2∙ Π ∞
0 Θ
0 Θ
−1
∗ +,+
=
=1
Ω−1 Λ22 22
ℒ∗ ++,++
∞
and Therefore for ≥ 1:
=
=1
Ω−1 Λ22 22
−1
ℛ∗ ++
∞
−1 −
∗ +−1,+−1
=
ℒ∗ +,+
+
=1
Ω−1 Λ22 ℒ∗ ++,++ 22
= ℒ∗ +,+ + Ω−1 Λ22 − ∗ +,+ 22
∞
−1 = ℛ∗ + +
=1
Ω−1 Λ22 22
ℛ∗ ++ = ℛ∗ + + Ω−1 Λ22 22
So as −1 = −1 and = : ∗ +−1,+−1 + ℒ − Ω−1 Λ22 ∗ +,+ = ℛ∗ + 22 This holding for all ≥ 1 is sufficient for there to be a solution to the infinite order difference equation, as for all ≥ 0:
∞
=
=1 ∞
−1 Ω22 Λ22
−1
ℛ∗ ++
=
=1 ∞
−1 Ω22 Λ22
−1
−1 ∗ ++−1,++−1 + ℒ − Ω22 Λ22 ∗ ++,++
∞ −1 Ω22 Λ22 −1 ℒ∗ ++,++
=
=1
+
=0
−1 Ω22 Λ22 ∗ ++,++
∞
−
=1 ∞
−1 Ω22 Λ22 ∗ ++,++
=
=1
−1 Ω22 Λ22
−1
ℒ∗ ++,++ + ∗ +,+ =
so we do not need to worry about initial conditions. Now let: ∗ +,+ ∗ +
−1 ≔ Ω22 Λ22 − ℒ
ℛ,
ℬ ≔
0,
, ≔
It follows from the first order form just derived then that for all ≥ 1: , = ℬ,−1
Now let = ∙1 11 ∙1 be the SVD of , then from pre-multiplying by ∙1 ∙1 , by the unitarity of
we have that for all ≥ 1:
ℬ,−1 = , = ∙1 11 ∙1 , = ∙1 ∙1 ∙1 11 ∙1 , = ∙1 ∙1 ℬ,−1
(3.13)
As ever we demonstrate this condition is sufficient by exhibiting a solution, in this case:
−1 , = ∙1 11 ∙1 ℬ,−1 + ∙2 ,
(3.14)
for some undetermined variable , . This satisfies , = ℬ,−1 since by the unitarity of and the just derived condition:
−1 , = ∙1 11 ∙1 , = ∙1 11 ∙1 ∙1 11 ∙1 ℬ,−1 + ∙1 11 ∙1 ∙2 , = ∙1 ∙1 ℬ,−1 = ℬ,−1
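The SVD argument above has the same structure as solving a rank-deficient linear system. The sketch below assumes a generic system `A x = b` with hypothetical sizes and rank, rather than the specific matrices of the text: it checks the projection condition for solvability and then verifies that the particular solution plus an arbitrary element of the null space solves the system.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, r = 4, 5, 2  # hypothetical shape and rank

# Rank-deficient A, and a right-hand side b lying in its column space.
A = rng.normal(size=(m, r)) @ rng.normal(size=(r, n))
b = A @ rng.normal(size=n)

# Full SVD, partitioned into the parts with non-zero and zero singular values.
U, s, Vh = np.linalg.svd(A)
U1, D11, V1, V2 = U[:, :r], np.diag(s[:r]), Vh[:r].T, Vh[r:].T

# Solvability condition: b is unchanged by projection onto span(U1).
consistent = np.allclose(U1 @ U1.T @ b, b)

# General solution: particular part plus an arbitrary element of span(V2).
nu = rng.normal(size=n - r)
x = V1 @ np.linalg.inv(D11) @ U1.T @ b + V2 @ nu
solves = np.allclose(A @ x, b)
```

The free vector `nu` plays the role of the undetermined variable in (3.14): it is annihilated by `A`, so it never shows up in the equation being solved.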
−1 Now let Ω be the Schur decomposition of ∙1 11 ∙1 ℬ, with eigenvalues ordered by increasing
modulus. As ever we partition these matrices so that the top and/or leftmost block corresponds to the fully stable eigenvalues (which here we take to be those with modulus strictly less than 1). Pre-multiplying (3.14) by then and writing , = , , we have that: ,1, Ω11 ,2, = 0 Ω12 ,1,−1 + ∙1 ∙2 , ∙2 Ω22 ,2,−1
Now our necessary and sufficient condition for convergence, (3.8), is equivalent to the condition that , → 0 as → ∞, for which it is certainly necessary that , → 0 as → ∞. So as the bottom block of the equation above is explosive (or at least a random walk), the following conditions are necessary for convergence:
∙2 ,0 = 0
(3.15) (3.16)
and:
∙2 ∙2 = 0
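Why a zero initial condition is needed in the non-stable block can be seen in a small deterministic example. The sketch below assumes a hypothetical 3×3 upper-triangular matrix `T` with one stable eigenvalue, one unit root, and one explosive root; it illustrates only the initial-condition requirement, not the restriction on the innovation loading.

```python
import numpy as np

# Hypothetical upper-triangular system x_k = T x_{k-1}, ordered so that the
# stable eigenvalue (0.5) comes first and the non-stable ones (1.0, 1.5) last.
T = np.array([[0.5, 0.3, 0.2],
              [0.0, 1.0, 0.4],
              [0.0, 0.0, 1.5]])

def final_norm(x0, steps=200):
    """Norm of x after iterating x_k = T x_{k-1} for the given number of steps."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = T @ x
    return np.linalg.norm(x)

# Zero initial condition in the non-stable block: the path decays to zero.
decays = final_norm([1.0, 0.0, 0.0])

# Non-zero component on the unit root or the explosive root: no convergence to zero.
unit_root = final_norm([1.0, 1.0, 0.0])
explosive = final_norm([1.0, 0.0, 1e-6])
```

Even a tiny component on the explosive root eventually dominates, and a component on the unit root never dies out, which is the sense in which the bottom block is "explosive (or at least a random walk)".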
Therefore:
, = ∙1 Ω11 ∙1 ,−1 + ∙1 ∙1 ∙2 ,
(3.17)
i.e.
−1
, =
∙1 Ω11 ∙1 ,0
+
=0
∙1 Ω11 ∙1 ∙2 ,−
Note that since ∙2 ,0 = 0, ∙1 ∙1 ,0 = ,0 , so if we define a new variable by ≔ ∙1 ,0 , then ∙1 = ∙1 ∙1 ,0 = ,0 . Therefore for all ≥ 0:
−1
, =
∙1 Ω11
+
=0
∙1 Ω11 ∙1 ∙2 ,−
Now by the law of iterated expectations, for all , ≥ 0, ∗ + , = ,+ , thus for , ≥ 0:
−1 ∙1 Ω11 ∗ + + =0 + −1 ∙1 Ω11 ∙1 ∙2 ∗ + ,−
=
+ ∙1 Ω11
+
=0
∙1 Ω11 ∙1 ∙2 ,+ −
(3.18)
Thus from the = 0 case, ∙1 ∗ + = , for all ≥ 0, so pre-multiplying this equation by ∙1 :
−1
∗ +
=
Ω11
+
=0
Ω11 ∙1 ∙2 , −
(3.19)
Now from (3.18) and (3.19) we also have that for , ≥ 0:
−1 + ∙1 Ω11 ∙1 ∙2 , − =0 −1 −1 + −1 ∙1 Ω11 ∙1 ∙2 ∗ + ,− =0
+
=
=0
∙1 Ω11 ∙1 ∙2 ,+ −
⇒
0=
=0
∙1 Ω11 ∙1 ∙2 ,+ − − ∗ + ,−
So pre-multiplying the = 1 case by ∙1 :
0 = ∙1 ∙2 ,1+ − ∗ + ,1
Therefore by (3.16): 0 = ∙2 ,1+ − ∗ + ,1 0
0=
⇒
0 = ∙2 0 = ∙2 ∙2 ,1+ − ∗ + ,1 = ,1+ − ∗ + ,1
Therefore it seems sensible to define another variable ≔ ,1 , so for all ≥ 1, , = ∗ + −1 , which means that we can rewrite (3.19) as:
∗ +
=
Ω11
+
=1
−1 Ω11 ∙1 ∙2 ∗ + −
Since , = ∙1 ∗ + for all ≥ 0, this also gives us our final solution for , . Note too that since by
(3.16), ∙1 ∙1 ∙2 = ∙2 , we have = ∙2 ∙1 ∗ +1 − Ω11 , given any sequences
and ∗ +1 ,
we can always choose so as to make ∗ +1 rational according to this equation. Thus we can think of this solution as only restricting ∗ + for > 1. We must also have that −1 = 0 so by (3.10):
−1 0 = −1 ∙1 11 ∙1 Ω22 −1 0 ∙1 −1 + −1 −∙1 11 ∙1 Λ22
∙2 ∙1
(3.20)
Finally note that this solution for , implies that we can rewrite equation (3.11) as: = Θ −1 + + + Θ + Θ + Θ
3.1.5. Towards a FREE solution
We will now use the just derived results to put the solution into a form that does not depend on . From our solution for , (3.7), similarly to equation (2.19) we have:
−1 −1 0 −11 ∙1 ∙1 11 ∙1 0 ∗ = − − − 11 ∙1 0
Θ ∙1
(3.21)
As we are assuming condition (2.20) and that is of full rank with rows which are linearly independent
−1 of those of 11 ∙1 2∙ Ψ, −1 −11 ∙1 2∙ Ψ is invertible by the arguments of § 2.3.5 and:
=
−1 −11 ∙1 2∙ Ψ
−1
−1 0 0 ∗ − − − 11 ∙1 0
Now we let Φ and Φ be defined as in § 2.3.5 and we additionally define:
−1 −11 ∙1 2∙ Ψ −1
Φ ≔ −
0 ,
Φ ≔ −
−1 −11 ∙1 2∙ Ψ
−1
−1 11 ∙1 0
∗ (where Φ has dim columns), which means that from equations (3.6) and (3.21):
+1 = Θ,2∙ Φ + Θ,21 −1 + Θ,22 − Θ,2∙ Φ −1 + ,2∙ + ,2∙ + Θ,2∙ + Θ,2∙ Φ +Θ,2∙ Φ Ω22 0 ∙1 −1 + Θ ,2∙ − Θ,2∙ Φ Λ22 Θ ,2∙ + Θ,2∙ Φ ∙1
where Θ and Θ have been partitioned conformably with . Now by (3.4):
−1 2∙ − 1 − ∙2
0 ∙1 −1 = −1,−1 = + Λ22 − Ω22
0 − ∙2 0 −1 −1
Consequently:
+1 = Θ,2∙ Φ + Θ,21 − Θ,2∙ Φ Ω22 ∙2 + Θ,22 − Θ,2∙ Φ − Θ,2∙ Φ Ω22 ∙2
0
−1 −1
−1
0
+ ,2∙ + Θ,2∙ Φ Ω22 − Θ,2∙ Φ Ω22 Λ22 − Ω22 + ,2∙ + Θ,2∙ Φ Ω22 Λ22 − Ω22 + Θ ,2∙ − Θ,2∙ Φ Λ22
−1
2∙
2∙ + Θ,2∙ + Θ,2∙ Φ (3.22)
Θ ,2∙ + Θ,2∙ Φ ∙1
To progress further it will be convenient to specify information sets more precisely, which is what we will now proceed to do.
3.2. Endogenous beliefs
3.2.1. Additional assumptions
Now although we are still assuming homogeneity, unlike in the full information case, § 2.1, we will not assume agents are aware of , , , Σ, , or of what happened more than one period before their “birth” in period . We will also assume that at all points in time beliefs are continuous (i.e. atom-less) and put positive probability on full indeterminacy, which can be justified by the argument made in § 3.1.2. Moreover, we assume that the true distribution from which “nature” chose the variables , , , Σ, , “before the start of time” is absolutely continuous (Young 2004: 97) with respect to the agent’s initial priors over these variables (before any observations of or have been made) and that the agent’s initial
priors are absolutely continuous with respect to the true distribution. This condition means that the true process generating , , , Σ, , puts positive probability on them being in some set if and only if the agent’s initial priors also put positive probability on their being in that set. In the game theoretic strategy learning literature it is common to assume absolute continuity of the truth with respect to beliefs27, but the converse which we assume here is less common in that literature. However, the current problem is analogous to an extensive form game with nature moving first then the other player (the unique representative agent) repeatedly moving from then on under uncertainty as to nature’s first move. Applying the weak perfect Bayesian equilibrium concept (Mas-Colell et al. 1995: 285) to this would imply the far stronger restriction that the representative agent’s initial priors should actually be equal to the probability distribution from which nature chose , , , Σ, , ; this provides some justification for our assumption. We will use these absolute continuity conditions for two purposes, firstly to ensure that the agent puts positive probability on a neighbourhood of the true values and secondly to ensure that anything believed to happen with probability 1 does indeed happen with probability 1 and vice versa. It may be noted that in light of this last assumption we can write “almost surely” (abbreviated “a.s.”) without any ambiguity as to whether it is over beliefs or the truth.
3.2.2. Information sets
From the discussion in § 2.1 then, the only sensible28 period information set is:
ℐ =
=−1
,
∪
27
For example the absolute continuity of the truth with respect to players’ beliefs is the main condition in the Kalai-Lehrer theorem that guarantees that the play of an infinitely repeated stage game played by rational Bayesian players comes arbitrarily close to equilibrium behaviour (Kalai and Lehrer 1991).
28
In the traditional learning literature, it has been common to “forget” what exactly the information set is under full
information and assume that contemporaneous values are not available. However as Ellison and Pearlman (2008) point out, there is “no point in investigating convergence to RE unless one compares like with like”. This is particularly important in the current work since we do not have bounded rationality to fall back on in order to justify not using the same information set for forming expectations of both the model’s parameters and +1 .
(Note that we do allow to be observable; as mentioned in footnote 12 this can be justified by the realisation that the source of ’s variation is a choice variable.) Since this satisfies ⊆ ℐ ⊆ ℐ∗ ∪ a FREE solution exists all the results of § 3.1 apply.
3.2.3. Application of the Martingale Convergence Theorem
Let ℐ∞ ≔
∞ =−1 ℐ ,
then by the conditional expectation form of Doob’s Martingale convergence theorem (Doob 1953), for any function of , , , Σ, , , as → ∞, a.s. ∞ , where ∞ denotes expectations under the ℐ∞ information set29. To reiterate, by our absolute continuity assumptions, this means convergence with probability 1 under both beliefs and the true distribution. We want to place strong enough restrictions on beliefs to ensure that for the ’s we are interested in, ∞ = , , , Σ, , , i.e. expectations converge to the truth. We claim that the following is a sufficient condition: ∃ ≥ s.t. ∀ ≥ : rank Cov , +1 = dim (3.23)
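The convergence of conditional expectations under an expanding information set can be illustrated with the simplest conjugate Bayesian updating problem. This is a sketch under strong simplifying assumptions that are not in the text: a scalar parameter `theta`, a Gaussian prior with full support, and i.i.d. Gaussian observation noise of known variance. The posterior mean is then a martingale under the agent's beliefs and, in this example, converges to the true value.

```python
import numpy as np

rng = np.random.default_rng(5)

theta = 0.7                          # hypothetical true parameter, drawn "by nature"
prior_mean, prior_var = 0.0, 4.0     # hypothetical Gaussian prior
noise_var = 1.0                      # known observation noise variance

# Observations y_t = theta + noise; conjugate normal updating of beliefs about theta.
T = 20000
y = theta + rng.normal(scale=np.sqrt(noise_var), size=T)

# Posterior precision and mean after each observation (closed form).
t = np.arange(1, T + 1)
post_prec = 1.0 / prior_var + t / noise_var
post_mean = (prior_mean / prior_var + np.cumsum(y) / noise_var) / post_prec

final_error = abs(post_mean[-1] - theta)
```

In the model of the text the analogous step is harder, because the limit of the expectation need not equal the truth without a condition such as (3.23); the example only shows the mechanism in a case where identification is immediate.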
We shall see in § 3.3 that, at least in the univariate case, given beliefs put positive probability on full indeterminacy, this is really quite a weak condition. To see the sufficiency of (3.23), suppose first that was invertible and we estimated the equation: = −1 +1 + −1 −1 + −1 + −1 + −1 (i.e. the original law of motion (1.1) pre-multiplied by −1 ) by a standard systems instrumental variables (SIV) regression such as three stage least squares (Wooldridge 2002: 183-208), using as an instrument for +1 and discarding all observations prior to . (Using instrumental variables is necessary as +1 is correlated with , so ordinary least squares is not consistent.) Since by our assumption that Cov , +1 is of full rank and the fact that is independent of , this is a fully valid instrument, hence by the asymptotic properties of generalized method of moments instrumental variables regressions (Wooldridge 2002: 190), in the limit as → ∞ we would know the variables
−1 , −1 , −1 Σ′−1 , −1 , −1 with certainty. Now pre-multiplying (1.1) by any full rank matrix before putting it into canonical form cannot possibly affect the solution for , thus the solution must be identical to the one obtained when:
Γ0 = Ψ= − −1 , Γ = −1 1 0 0 0 , = −1 , = −1 , 0 0 0 ,Π= 0 and ~NIID 0, −1 Σ′−1
(i.e. when Σ has been “redefined” to −1 Σold ′−1 , where Σold is its former value). Consequently since up to here our solution (either full or partial information) has not depended on the precise form (1.1) and only on the canonical form (1.2), in the limit as → ∞ when is invertible running this SIV regression would tell you every variable that is only a function of Γ0 , Γ1 , , , Ψ, Π, Σ, which includes, amongst others, the following: Θ , Θ , Θ , Θ , Θ , , , Φ , Φ , Φ , Φ and in fact any variable mentioned in § 3.1.2-3.1.5. Then, since the singular matrices form a null set in the space of all matrices of any given size, by the continuity of beliefs, must be believed to be invertible with probability 1. Thus by absolute continuity of the truth with respect to beliefs, must also be invertible with probability 1 under the true distribution. Now anyone in possession of the ℐ∞ information set can perform the infinite sample version of the aforementioned regression30 and so since their priors put positive probability on a neighbourhood of the truth, we have the following lemma:
3.2.4. Lemma 1
For any function of Γ0 , Γ1 , , , Ψ, Π, Σ, as → ∞, its conditional expectation converges almost surely to its true value.
29 Strictly speaking, this should be formulated in terms of filtrations on the probability space’s σ-algebra, but increasing mathematical rigour in this manner does not affect our results.
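The instrumental variables argument behind Lemma 1 can be illustrated with a minimal Python sketch. It simulates a toy scalar model (not the model of (1.1)) in which the regressor is correlated with the error, so that ordinary least squares is inconsistent while a simple IV estimator is not. The function names `simulate`, `ols` and `iv`, the coefficient values and the data generating process are all illustrative assumptions; the non-zero covariance between the instrument and the regressor plays the role that the rank condition (3.23) plays in the text.

```python
import numpy as np

def simulate(n, b=0.5, seed=0):
    """Draw n observations from a toy scalar model with an endogenous regressor."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                  # instrument: independent of the error u
    u = rng.normal(size=n)                  # structural error
    noise = 0.5 * rng.normal(size=n)
    x = 0.8 * z + 0.6 * u + noise           # regressor correlated with both z and u
    y = b * x + u                           # equation of interest (true slope b)
    return y, x, z

def ols(y, x):
    """Least squares slope; inconsistent here because Cov(x, u) != 0."""
    return float(x @ y / (x @ x))

def iv(y, x, z):
    """Simple IV slope; consistent because Cov(z, u) = 0 and Cov(z, x) != 0."""
    return float(z @ y / (z @ x))

if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        y, x, z = simulate(n, seed=n)
        print(n, round(ols(y, x), 3), round(iv(y, x, z), 3))
```

In this toy the OLS estimate settles near 0.98 (the true slope plus Cov(x, u)/Var(x) = 0.6/1.25), whereas the IV estimate approaches the true value 0.5 as the sample grows.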
3.2.5. Additional restrictions under this information set Using the information set defined in § 3.2.2 we can now solve for the unknown term in (3.22), . We first define some convenient notation. For all matrices/vectors let: ≔ − so = 0. Thus by taking expectations of both sides of (3.22) under the ℐ information set:
30 That classical consistency implies Bayesian consistency in this way is basically a consequence of Theorem 2.2 of Blume and Easley (1993: 6).
Θ ,2∙ − Θ,2∙ Φ Λ22 Θ ,2∙ + Θ,2∙ Φ ∙1 − Θ ,2∙ − Θ,2∙ Φ Λ22 0 −1 Θ,2∙ + Θ,2∙ Φ ∙1
= Θ,2∙ Φ + Θ,21 − Θ,2∙ Φ Ω22 ∙2
+ Θ,22 − Θ,2∙ Φ − Θ,2∙ Φ Ω22 ∙2
0 −1
−1
+ ,2∙ + Θ,2∙ Φ Ω22 − Θ,2∙ Φ Ω22 Λ22 − Ω22 + ,2∙ + Θ,2∙ Φ Ω22 Λ22 − Ω22
−1
2∙
2∙ + Θ,2∙ + Θ,2∙ Φ
Now the right hand side of this equation (which we shall denote ) is uniquely pinned down in the ℐ∗ information set, but the left hand side still has degrees of freedom since is free; then must be chosen so as to satisfy: ∙1 = ∙1 − where ≔ Θ ,2∙ − Θ,2∙ Φ Λ22 (3.24)
Θ,2∙ + Θ,2∙ Φ . Now dim = dim , so for this to have a solution
that does not restrict the right hand side we must have: dim ≤ rank ∙1 ≤ min rank , rank ∙1 ≤ rank ≤ rows = dim (3.25)
Note that amongst other things this implies that dim ≥ dim . Let us then write for the SVD of ∙1 , therefore since ∙1 has full row rank, we can write = ∙1 ∙1 is invertible and is of size dim × dim . Therefore by (3.24) for some unrestricted variable :
−1 −1 = ∙1 ∙1 ∙1 − ∙1 ∙1 + ∙2
0
∙1 = ∙1 ∙1 , where ∙2
(It is easy to see this is sufficient for (3.24) as well.) This implies no restrictions whatsoever on ∙1
31,
since pre-multiplying both sides by ∙1 ∙1 and taking expectations we have:
−1 −1 ∙1 ∙1 = ∙1 ∙1 ∙1 ∙1 ∙1 − ∙1 ∙1 ∙1 ∙1 + ∙1 ∙1 ∙2 = ∙1 ∙1
31 This is the motivation for looking for solutions that do not restrict the right hand side of (3.24).
(since = 0). However, is restricted by condition (3.20). From the just derived solution for we can rewrite this condition as:
−1 0 = −1 ∙1 11 ∙1 Ω22 −1 0 ∙1 −1 + −1 −∙1 11 ∙1 Λ22 −1 ∙2 ∙1 ∙1 ∙1 ∙1
−1 − −1 −∙1 11 ∙1 Λ22
−1 −1 ∙2 ∙1 ∙1 ∙1 + −1 −∙1 11 ∙1 Λ22
∙2 ∙1 ∙2
−1 Letting ≔ −∙1 11 ∙1 Λ22
−1 ∙2 ∙1 ∙1 ∙1 , by the law of iterated expectations:
−1 ∙1
−1 = −1 − −1 ∙1 11 ∙1 Ω22 −1 0 ∙1 −1 − −1 −∙1 11 ∙1 Λ 22
∙2 ∙1 ∙2
so for some vector satisfying −1 = 0 and = : ∙1
−1 = −1 − −1 ∙1 11 ∙1 Ω22 −1 0 ∙1 −1 − −1 −∙1 11 ∙1 Λ22
∙2 ∙1 ∙2 +
Now first note that is dim × dim . Also note that when Λ22 is invertible we have:
−Λ−1 ∙1 11 ∙1 22 −1 −∙1 11 ∙1 Λ22 ∙2
∙1 ∙1 ∙1
−1 ∙2 ∙1 ∙1 ∙1 =
since condition (2.14) and the invertibility of Ψ Π implies ∙1 ∙1 = . Therefore:
−1 = ∙1 ∙1
−Λ−1 ∙1 11 ∙1 22 ∙2
By the continuity of beliefs, Λ22 is believed to be invertible with probability 1, thus is also believed to be invertible with probability 1, which by our absolute continuity assumption implies that actually is invertible with probability 1. Consequently by Lemma 1, with probability 1, → as → ∞, so by the continuity of the map taking to the modulus of its eigenvalue with smallest absolute value (Hinrichsen and Pritchard 2005: 399), with probability 1 there exists some point in time after which ’s eigenvalues are all bounded away from 0, so is also invertible after this point. (Quite how long this takes will generically depend on the realised shock sequence.) This provides some justification for the following assumption, which we will show to be sufficient for there being a solution for that satisfies (3.20):
∀ ≥ : is invertible (3.26)
By the previous remark, this can be thought of as ruling out any overly outlandish priors, which is a similar restriction to the idea of local convergence used by E&H. When this holds: ∙1 = − Thus:
−1 = ∙1 ∙1 −1 −1 −1 − ∙1 ∙1 −1 −1 −1 −∙1 11 ∙1 Λ22 −1 −1 −1 ∙1 11 ∙1 Ω22 −1 −1
−1 −
−1
−1 −1 ∙1 11 ∙1 Ω22
0 ∙1 −1
−1
−1 −1 −∙1 11 ∙1 Λ22
∙2 ∙1 ∙2 +
0 ∙1 −1
−1 −1 − ∙1 ∙1
−1 − ∙1 ∙1
−1 ∙2 ∙1 ∙2 + ∙1 ∙1
It just remains for us to check what conditions on and are necessary to ensure that condition (3.20) holds. When satisfies this equation we have:
−1 −1 ∙1 11 ∙1 Ω22 −1 = −1 ∙1 11 ∙1 Ω22 −1 0 ∙1 −1 + −1 −∙1 11 ∙1 Λ22
∙2 ∙1
0 ∙1 −1 + −1
−1 −1 ∙1 11 ∙1 Ω22 −1 −1 −∙1 11 ∙1 Λ22
−1
−1
− −1 − −1
−1
0 ∙1 −1 ∙2 ∙1 ∙2
−1
−1
+ −1
−1
− −1
−1 = −1 ∙1 11 ∙1 Ω22
0 ∙1 −1 + −1
−1 −1 −1 ∙1 11 ∙1 Ω22 −1 −1 −∙1 11 ∙1 Λ 22
−1
− −1 − −1 − −1
−1 = −1 ∙1 11 ∙1 Ω22
0 ∙1 −1 ∙2 ∙1 ∙2 + −1
−1
−1
−1 0 ∙1 −1 + −1 − −1 ∙1 11 ∙1 Ω22
0 ∙1 −1
−1 − −1 −∙1 11 ∙1 Λ22 −1 = −−1 −∙1 11 ∙1 Λ22
∙2 ∙1 ∙2 + −1 − −1
∙2 ∙1 ∙2
(where the simplification in the middle block came from the law of iterated expectations), thus we must have −−1 −∙1 11 ∙1 Λ22 −1 ∙2 ∙1 ∙2 = 0. Ideally we would now choose and to maximise the convergence speed subject to this restriction on and the restrictions that −1 = 0 and = , since beyond these conditions rationality imposes no restrictions on and . Unfortunately though, this is not analytically tractable, so instead we will just assume:
= = 0 (3.27)
which trivially satisfies these conditions and should nonetheless be near optimal as it minimises the variance of these terms, though not necessarily of the whole expression. This means:
−1 = ∙1 ∙1 −1
−1
−1 −1 −1 ∙1 11 ∙1 Ω22 −1 0 ∙1 −1 − ∙1 ∙1
−1 − ∙1 ∙1
(3.28)
Finally using (3.22) and (3.28), when (3.26) holds we can write down our fully feasible solution for +1 : 0 −1
+1 = Θ,2∙ Φ + Θ,21 − Θ,2∙ Φ Ω22 ∙2
+ Θ,22 − Θ,2∙ Φ − Θ,2∙ Φ Ω22 ∙2
0 −1
−1
+ ,2∙ + Θ,2∙ Φ Ω22 − Θ,2∙ Φ Ω22 Λ22 − Ω22 + ,2∙ + Θ,2∙ Φ Ω22 Λ22 − Ω22 +
−1 −1
2∙
2∙ + Θ,2∙ + Θ,2∙ Φ 0 ∙1 −1 (3.29)
−1 −
−1
−1 −1 ∙1 11 ∙1 Ω22
These two extra terms have a fairly straightforward intuitive explanation: the first one is correcting for mistakes that with the newly available information you now realise you made last period and the second is correcting for being off the stable path. Given period beliefs (i.e. a probability distribution over , , , Σ, , ) this solution could be computed numerically by fairly standard Monte-Carlo methods. The only minor complications come firstly from the −1 , which requires a nested Monte-Carlo simulation (as contains terms), and secondly from the −1 −1 ∙1 11 ∙1 Ω22 0 ∙1 −1 . By (3.28) we can express this in terms of −1 −2 ∙1 11 ∙1 Ω22 0 ∙1 −2 , so we potentially have an infinite backwards regress. Conveniently though, since we are assuming that the agent was “born” at , we may just take −1 −1 ∙1 11 ∙1 Ω22 0 ∙1 −1 = 0, so preventing this problem.
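The nested Monte-Carlo step mentioned above can be sketched generically in Python. This is only a toy under stated assumptions, not the solution (3.29): a scalar parameter `theta` drawn from a normal distribution stands in for a draw from current beliefs, standard normal `eps` stands in for future shocks, and the quantity estimated is the outer expectation of the square of an inner conditional expectation. The function name `nested_monte_carlo` and all numerical values are hypothetical.

```python
import numpy as np

def nested_monte_carlo(outer_draws, inner_draws, mean=1.0, sd=0.5, seed=0):
    """Estimate E[(E[theta + eps | theta])^2] by nested simulation.

    theta ~ N(mean, sd^2) stands in for a draw from current beliefs and
    eps ~ N(0, 1) for future shocks; both are assumptions of this toy.
    """
    rng = np.random.default_rng(seed)
    theta = rng.normal(mean, sd, size=outer_draws)           # outer draws from beliefs
    eps = rng.normal(size=(outer_draws, inner_draws))        # inner draws, one row per theta
    inner_expectation = (theta[:, None] + eps).mean(axis=1)  # inner Monte-Carlo average
    return float(np.mean(inner_expectation ** 2))            # outer Monte-Carlo average

estimate = nested_monte_carlo(outer_draws=2000, inner_draws=200)
exact = 1.0 ** 2 + 0.5 ** 2   # closed form for this toy: mean^2 + sd^2

if __name__ == "__main__":
    print(estimate, exact)
```

Because the inner average is squared, the nested estimator carries a small upward bias of 1/inner_draws in this toy, which is one reason why nested simulation is more expensive than a single-layer Monte-Carlo estimate of comparable accuracy.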
We now look for necessary and sufficient conditions for convergence to happen with probability 1.
3.2.6. Conditions for almost sure convergence
Recall that providing (3.13), (3.15) and (3.16) hold (which are all necessary for convergence), , =
∙1 . Therefore as ∙1 ∙1 = , by (3.8) when (3.13), (3.15) and (3.16) hold, the following condition is both
necessary and sufficient for convergence: lim = 0 (3.30)
→∞
Consequently (3.13), (3.15), (3.16) and (3.30) are jointly necessary and sufficient for convergence. Now suppose that convergence happened with probability 1, then by absolute continuity of beliefs with respect to the truth, convergence would also be believed to happen with probability 1. So consequently whether or not (3.25) holds, by the sub-multiplicative property of the induced matrix norm:
0 ≤ lim sup ∙1
→∞
= lim ∙1
→∞
≤
∙1 lim
→∞
=0
Since this holds for all ≥ , by the Dominated Convergence Theorem for conditional expectations (Doob 1984: 397) it must also be true that ∞ lim sup→∞ ∙1 = 0. Now by Jensen’s inequality
and Fatou’s Lemma for conditional expectations (Doob 1984: 396-97), the following is believed to hold with probability 1 (and hence does):
0 ≤ lim sup ∙1
→∞
≤ lim sup ∙1 ≤ ∞ lim sup ∙1
→∞ →∞
=0
so since 0 ≤ lim inf→∞ ∙1 , almost surely, lim→∞ ∙1 = 0. Therefore whenever convergence happens with probability 1, by (3.24), almost surely: lim = 0 (3.31)
→∞
By Lemma 1 if is bounded this will hold automatically, which concords with our initial intuition that including a linear trend does make learning more difficult. However, since virtually all macro-variables do exhibit long run growth, we will not pursue this avenue further. More generally though, this condition can be thought of as requiring convergence of beliefs to be sufficiently fast. So for example if exhibits linear growth, then faster than linear convergence would be sufficient for (3.31) to hold.
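The remark about convergence speed can be made concrete with a small Python sketch. It assumes, as the discussion above suggests but as is not restated here in full, that the relevant condition concerns the product of a belief error and a possibly growing variable. The polynomial decay rate, the linear growth path and the function name `scaled_error` are purely illustrative.

```python
def scaled_error(t, rate, growth=1.0, scale=1.0):
    """Belief error scale * t**(-rate) multiplied by a linearly growing variable growth * t."""
    return scale * t ** (-rate) * growth * t

if __name__ == "__main__":
    for rate in (0.5, 1.0, 2.0):
        # rate < 1: product diverges; rate == 1: product stays constant; rate > 1: product vanishes
        print(rate, [scaled_error(t, rate) for t in (10.0, 1e3, 1e6)])
```

Only the faster than linear case (here a rate of 2) drives the product to zero, in line with the statement that faster than linear convergence would suffice under linear growth.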
Also note that, trivially from (3.30), if convergence happens with probability 1 then almost surely:
sup ≥ −1 < ∞ (3.32)
Finally note that again from (3.30), by the Dominated Convergence Theorem for conditional expectations (Doob 1984: 397), almost sure convergence and the absolute continuity of beliefs with respect to the truth implies that as → ∞,
a.s.
0, thus with probability 1: lim − = 0 (3.33)
→∞
We have shown these three fairly weak conditions holding with probability 1 are necessary for (3.30) to hold almost surely. We will now show that jointly with (3.23), (3.25) and (3.27) they are sufficient. We proceed by showing that if (3.23), (3.25), (3.27), (3.31), (3.32) and (3.33) hold almost surely, then with probability 1, so does (3.30). Now by an identical argument to the one above by which we showed the necessity of (3.31) (using Fatou’s lemma etc.), that (3.31) holds with probability 1 implies that as → ∞, −1 → 0 almost surely. Therefore as (3.25) and (3.27) are sufficient for (3.28) to hold asymptotically (despite the fact we need (3.26) for small ), by (3.28), with probability 1:
−1 lim + ∙1 ∙1 −1 −1 −1 ∙1 11 ∙1 Ω22
→∞
0 ∙1 −1
=0
so from pre-multiplying both sides by ∙1 ∙1 , almost surely:
→∞
lim
−1 ∙1 + −1 ∙1 11 ∙1 Ω22
0 ∙1 −1
=0
Now from the sub-multiplicative property of the induced matrix norm and Lemma 1, almost surely:
0 ≤ lim
→∞
− ∙1 ≤ ∙1
sup
≥
→∞
lim
−
=0
since sup≥ is finite almost surely by (3.32). Consequently:
−1 lim ∙1 + −1 ∙1 11 ∙1 Ω22
→∞
0 ∙1 −1
=0
To remove the final expectation from this expression we begin by noting that by Jensen’s inequality, Fatou’s Lemma and the sub-multiplicative property of the induced matrix norm we have:
−1 0 ≤ lim sup −1 ∙1 11 ∙1 Ω22 →∞ −1 ∙1 11 ∙1 Ω22 −1 0 ∙1 − −1 ∙1 11 ∙1 Ω22 −1 0 ∙1 − −1 ∙1 11 ∙1 Ω22
0 ∙1 −1 0 ∙1 −1 0 ∙1 −1 0 ∙1
≤ lim sup −1
→∞
≤ ∞ lim sup
→∞
−1 ∙1 11 ∙1 Ω22
−1 0 ∙1 − −1 ∙1 11 ∙1 Ω22
≤ ∞
sup −1
≥
−1 lim sup ∙1 11 ∙1 Ω22 →∞
−1 0 ∙1 − −1 ∙1 11 ∙1 Ω22
However by (3.32) and Lemma 1, almost surely:
sup −1
≥
−1 lim sup ∙1 11 ∙1 Ω22 →∞
−1 0 ∙1 − −1 ∙1 11 ∙1 Ω22
0 ∙1 0 ∙1 =0
= sup −1
≥
→∞
−1 lim ∙1 11 ∙1 Ω22
−1 0 ∙1 − −1 ∙1 11 ∙1 Ω22
Therefore by the absolute continuity of beliefs with respect to the truth:
−1 ∞ lim sup ∙1 11 ∙1 Ω22 →∞
−1 0 ∙1 − −1 ∙1 11 ∙1 Ω22
0 ∙1
=0
−1 so as 0 ≤ lim inf→∞ −1 ∙1 11 ∙1 Ω22
−1 0 ∙1 − −1 ∙1 11 ∙1 Ω22
0 ∙1 −1
, with
probability 1:
−1 lim −1 ∙1 11 ∙1 Ω22 −1 0 ∙1 −1 − −1 ∙1 11 ∙1 Ω22
→∞
0 ∙1 −1 −1 = 0
Now from the sub-multiplicative property of the induced matrix norm, almost surely: 0 ≤ lim ≤ lim
−1 −1 ∙1 11 ∙1 Ω22 −1 −1 ∙1 11 ∙1 Ω22 −1 0 ∙1 − ∙1 11 ∙1 Ω22 −1 0 ∙1 − ∙1 11 ∙1 Ω22
→∞
0 ∙1 −1 −1 0 ∙1 −1 −1 0 ∙1 =0
→∞
≤ sup −1 −1
≥
→∞
−1 lim −1 ∙1 11 ∙1 Ω22
−1 0 ∙1 − ∙1 11 ∙1 Ω22
where the final equality comes from Lemma 1 and the fact that by (3.32), (3.33): sup −1 −1 ≤ sup −1 −1 − −1 + −1
≥ ≥
≤ sup −1 −1 − −1 + sup −1 < ∞
≥ ≥
Consequently then, by (3.33), almost surely:
−1 lim ∙1 + ∙1 11 ∙1 Ω22
→∞
0 ∙1 −1 = 0
Thus from pre-multiplying this by ∙1 11 ∙1 , by the definition of , the fact that ∙1 ∙1 = and the fact that by (3.28), asymptotically ∙2 , with probability 1:
→∞
lim −Λ 22 ∙1 + Ω22 ∙1 −1 = 0
Therefore since the relationship Λ22 ∙1 = Ω22 ∙1 −1 is explosive, with probability 1 we must have that lim→∞ ∙1 = 0, or else we would be violating (3.32). So finally, as ∙1 ∙1 = , as → ∞, → 0 almost surely.
We have therefore established that (3.23), (3.25), (3.27), (3.31), (3.32) and (3.33) holding with probability 1 are jointly sufficient for (3.30) to hold almost surely.
3.2.7. Performance under full indeterminacy
When the true values of the model’s parameters mean the model is fully indeterminate (fully stable), we can derive stronger results on convergence. Note that by Lemma 1, if we write ∙ for an indicator function that takes the value 1 when its argument is true and 0 otherwise, then in this case as → ∞:
Pr−1 full indeterminacy = −1 full indeterminacy
Now note that when condition (3.32) holds:
−1 lim sup −1 ∙1 11 ∙1 Ω22 →∞ a.s.
1
0 ∙1 −1 not full indeterminacy 0 ∙1 −1 not full indeterminacy not full indeterminacy
−1 ≤ lim sup −1 ∙1 11 ∙1 Ω22 →∞
≤ ∞
−1 ∙1 11 ∙1 Ω22
0 ∙1 lim sup −1
→∞
−1 = ∙1 11 ∙1 Ω22
0 ∙1 lim sup −1 < ∞
→∞
so as → ∞:
−1 −1 ∙1 11 ∙1 Ω22
0 ∙1 −1 0 ∙1 −1 full indeterminacy 0 ∙1 −1 not full indeterminacy
a.s.
−1 = Pr−1 full indeterminacy −1 ∙1 11 ∙1 Ω22
−1 + 1 − Pr−1 full indeterminacy −1 ∙1 11 ∙1 Ω22
0
since under full indeterminacy ∙1 has 0 columns. Thus assuming (3.23), (3.25) and (3.27) hold, by (3.28) as → ∞:
−1 − ∙1 ∙1 −1 −1 −1 − ∙1 ∙1 a.s. a.s.
0
Thus, by identical arguments to those in § 3.2.6, that
0 as → ∞ is sufficient for → 0.
Also note that under full indeterminacy ℬ = 0 so we can take = which means (3.13), (3.15) and (3.16) hold trivially. Thus, under full indeterminacy, conditions (3.23), (3.25), (3.27), (3.31) and (3.32), jointly with the condition for the existence of a full information FREE, are sufficient for convergence. We summarise the results of § 3.2 up to here in the following key proposition:
3.2.8. Proposition 3
Given the invertibility of Ψ Π , the condition that ∙1 ∙1 = is necessary for the existence of a full or partial information REE, given belief in non-explosiveness. Providing this holds along with the assumptions on beliefs in § 3.2.1 and § 3.2.2, and providing there are no unit roots:
Conditions (3.13), (3.15), (3.16) and (3.30) are jointly necessary and sufficient for the partial information solution to converge asymptotically to the full information one. For probability 1 convergence, conditions (3.13), (3.15), (3.16), (3.31), (3.32) and (3.33) holding almost surely are necessary.
In the full or partial information case, the condition that ∙1 ∙1 = and the condition that is of full rank with rows which are linearly independent of those of −1 11 ∙1 2∙ Ψ are necessary for the existence of a FREE. In the partial information case, that this condition, (3.25), (3.26) and (3.27) all hold almost surely is sufficient for the existence of a FREE.
For probability 1 convergence, that the aforementioned necessary condition for the existence of a FREE, (3.13), (3.15), (3.16), (3.23), (3.25), (3.27), (3.31), (3.32) and (3.33) all hold almost surely is sufficient. When the realised true model is fully indeterminate, we can drop (3.13), (3.15), (3.16) and (3.33) from this list and still have sufficiency.
3.2.9. Beliefs and learning
We finish this section by describing the core of an algorithm to update beliefs in each period, in order both to show the difficulties (which justify us not including simulation results) and to investigate any additional assumptions that may be required for this to be computationally feasible. For reasons of tractability, in doing this we restrict our attention to cases in which (3.25), (3.26) and (3.27) hold, so our FREE solution (3.29) is valid, and in which convergence happens with probability 1.
Suppose just before the arrival of the time information set, all agents in the economy have the same joint atom-less prior over , , , Σ, , , which satisfies the restrictions in § 3.2.1 and has continuously differentiable probability density −1 , , , Σ, , . In ℐ the agent receives and , thus: , , , Σ, , = −1 , , , Σ, , , so by Bayes’ Theorem and the independence properties of :
, , , Σ, , ∝ −1 , , , Σ, , , −1 , , , Σ, , −1 , , , Σ, ,
∗ = −1 −1 −1 , , , Σ, , ∗ ∝ −1 −1 , , , Σ, ,
(3.34)
∗ ∗ (where ∝ denotes proportionality32 and where −1 is the density function of beliefs under the ℐ−1 ).
If it was not for the +1 term in our law of motion, (1.1), we would have an entirely standard Bayesian linear regression problem, and by using a Normal-Inverse-Wishart prior (i.e. −1 , , , Σ, , ), we could ensure the posterior (i.e. , , , Σ, , ), would have the same functional form (I. G. Evans 1965). However, as it is we do not have this option – both because of the correlation between +1 and the errors in (1.1)33 and because of the simultaneous determination of and . We will certainly not be able to find conjugate priors, so will be a numerical density in practice. We will however assume that
32 We do not need the normalizing constant for many Monte-Carlo methods, such as the Metropolis-Hastings algorithm (Hastings 1970).
33 This may suggest Bayesian instrumental variables regressions (see e.g. Dreze 1976) to the reader. However even these are not directly applicable here due to non-linearities.
∗ −1
is continuously differentiable which from (3.34) is necessary and sufficient for
, , , Σ, , to be continuously differentiable. Now let: 0 , Θ,22 − Θ,2∙ Φ − Θ,2∙ Φ Ω22 ∙2 0
−1
≔ ∙ Θ,2∙ Φ , Θ,21 − Θ,2∙ Φ Ω22 ∙2
∙, ,2∙ + Θ,2∙ Φ Ω22 − Θ,2∙ Φ Ω22 Λ22 − Ω22 ∙, ,2∙ + Θ,2∙ Φ Ω22 Λ22 − Ω22
−1
2∙
2∙ , Θ,2∙ + Θ,2∙ Φ , ∙
where angled brackets denote ordered tuples. We shall write ,1 , … , ,7 for the 7 ordered members of . So for ∈ 1, … ,7 , there exists a matrix or vector valued function such that:
, = ∝
,,,Σ, ,
Γ0 , Γ1 , , , Ψ, Π, Σ , , , Σ, , , , , Σ, ,
∗ Γ0 , Γ1 , , , Ψ, Π, Σ −1 −1 , , , Σ, , , , , Σ, ,
,,,Σ, ,
Thus we can think of as a function of . To make this clear below we shall always write , in place of , . Using this notation we define another function of , ℎ by: ℎ = − ,1 − + ,2 −1 − ,3 −1 − + ,4 − + ,5 − ,6 − ,7 + ,7
−1 −1 −1 ∙1 11 ∙1 Ω22 −1
−1
0 ∙1 −1
∗ Thus by (1.1) and (3.29), ℎ ℐ−1 , ~NIID 0, Σ . Since Σ is of full rank, this means ℎ can take any
value in ℝdim , i.e. it is surjective. If we could show ℎ was also injective then we could use the change of variables formula to recover the probability distribution of . Unfortunately we will not in fact be able to show this, but with some reasonable additional assumptions we will be able to show that ℎ is approximately invertible. Now, by the results of the previous section, as convergence happens almost surely, with probability 1 (3.30) and (3.31) hold, i.e. as → ∞, , → 0 almost surely. Since by Lemma 1, ,7 converges as → ∞, by identical arguments to those in the previous subsection (using Fatou’s lemma etc.) as → ∞:
− ,7
−1
−1 + ,7
a.s.
−1
−1 −1 ∙1 11 ∙1 Ω22
0 ∙1 −1
a.s.
0
Also from the definition of , that
0 as → ∞ implies that there is a variable , known in
∗ ℐ−1 ∪ (and in particular not a function of ), such that as → ∞:
− ,1 − + ,2 −1 − ,3 −1 − + ,4 − ,6 − − Θ,2∙ Φ + Thus with probability 1 as → ∞: ℎ − − Θ,2∙ Φ +
a.s. a.s.
− + ,5
0
0
a.s.
Now on its own this does not mean that for almost all , ℎ − − Θ,2∙ Φ −
0 as → ∞,
since the probability under the truth that = for all time must be 0. However by the assumed conti∗ nuity of −1 , ℎ must be continuous, thus as it is a surjection for all ∈ ℝdim and ℯ > 0, −1 ℎ ℯ
(where ℯ is the ℯ-ball around ) is a non-empty open set. Therefore for all ≥ , the
probability that ∈ ℯ for all ∈ , ∩ ℤ is strictly positive and hence for all ≥ : Pr lim ℎ − − Θ,2∙ Φ − = 0 ∈ ℯ for all ∈ , ∩ ℤ = 1
→∞
and so if we choose ℯ ≔ −1 then: lim ℎ − − Θ,2∙ Φ −
→∞
∈ −1 for all ∈ , ∩ ℤ = 0
We can think of this as a conditional expectation on a filtration indexed by ≥ , so Doob’s Martingale convergence theorem (Doob 1953) applies and hence: lim ℎ − − Θ,2∙ Φ −
→∞
= for all ≥ ∈ −1 for all ∈ , ∩ ℤ = 0
= lim lim ℎ − − Θ,2∙ Φ −
→0 →∞
Thus for all ∈ ℝdim , almost surely: lim ℎ − − Θ,2∙ Φ − = 0
→∞
Now for all 1 , 2 ∈ ℝdim , by the triangle inequality:
ℎ 1 − ℎ 2 = ℎ 1 − − Θ,2∙ Φ 1 − − ℎ 2 − − Θ,2∙ Φ 2 − + − Θ,2∙ Φ 1 − 2 ≤ ℎ 1 − − Θ,2∙ Φ 1 − + ℎ 2 − − Θ,2∙ Φ 2 − +
and similarly:
− Θ,2∙ Φ 1 − 2 = − ℎ 1 − − Θ,2∙ Φ 1 − + ℎ 2 − − Θ,2∙ Φ 2 − + ℎ 1 − ℎ 2 ≤ ℎ 1 − − Θ,2∙ Φ 1 − + ℎ 2 − − Θ,2∙ Φ 2 − + ℎ 1 − ℎ 2
Consequently for all 1 , 2 ∈ ℝdim with 1 ≠ 2 , as → ∞:
ℎ 1 − ℎ 2 1 − 2
a.s.
− Θ,2∙ Φ 1 − 2
− Θ,2∙ Φ 1 − 2 1 − 2
Now let be the SVD of − Θ,2∙ Φ , and let ≔ 1 − 2 , so 1 − 2 = . Additionally let min be the smallest singular value, i.e. the minimum element on the diagonal of , and hence, given the usual ordering, the element in the very bottom right. Then since unitary matrices preserve scale and is diagonal: − Θ,2∙ Φ 1 − 2 = 1 − 2 = = min 1 − 2
≥ min = min 1 − 2 Furthermore if ∝ 0 ⋯ 0
1 ′ (and has the usual ordering) then this bound is actually attained. ⋯ 0 ′ where ≠ 0, we have that as → ∞:
a.s.
Thus suppose that min = 0, then if 1 = 2 + 0
ℎ 1 − ℎ 2 1 − 2
0
which would mean ℎ was not even injective in the limit, which means that feasibly we can say nothing analytically about whether or not ℎ is injective for finite . Therefore we will require that:
min > 0 (3.35)
i.e. − Θ,2∙ Φ is invertible. We shall see in § 3.3 that, in the univariate case at least, this always holds when the FREE conditions are satisfied, and we conjecture that this generalizes to higher dimensions. Given (3.35) then we have that almost surely: ℎ 1 − ℎ 2 1 − 2
→∞
lim
>0
This is not however sufficient to show asymptotic invertibility, since for that we clearly need uniform convergence. However, by Egoroff’s Theorem (Dudley 2005: 243), we will find that, at least approximately, we can in fact get this.
Before we begin this proof, we note that unfortunately our assumptions up to this point are not, to the best of our knowledge, sufficient to ensure that almost surely for almost all the Jacobian of ℎ , ℎ , converges to the Jacobian of its limit, i.e. − Θ,2∙ Φ , though they do imply that ℎ is continuously differentiable since ∗ −1 is. Indeed, even uniform convergence of a sequence of functions is not sufficient for convergence of their derivatives unless the derivatives can actually be shown to converge. Consequently for reasons of tractability we assume that almost surely for almost all :
lim ℎ ℎ − →∞
→∞
lim
=0
(3.36)
We conjecture this could be proved by placing at most relatively minor additional assumptions on beliefs. We now begin our proof of asymptotic “approximate” invertibility, which we approach by first proving injectivity when 1 − 2 is sufficiently small, and then by proving the same when it is sufficiently large. Note that since ℎ − − Θ,2∙ Φ −
→ 0 almost surely as → ∞ and almost sure convergence implies convergence in distribution, − ∗ ℐ∗ −1 ∪ −1 converges in distribution as → ∞ to a normally distributed variable with mean 0 and variance − Θ,2∙ Φ −1 Σ − Θ,2∙ Φ ′−1 . Let be the probability measure induced by this particular normal distribution and let ℯ > 0 be fixed, then we can find a compact set 1 ⊆ ℝdim such that 1 > 1 − ℯ/7 (we could take 1 to be a sufficiently large closed ball centred on the origin). From assumption (3.36) then, and Egoroff’s Theorem, there is a set 2 ⊆ 1 such that 2 > 1 − 2ℯ/7 and such that as → ∞:
sup 2 ∈2 ℎ 2 + ∗ −1 2 − − Θ,2∙ Φ → 0 almost surely
By the sub-multiplicative property of the induced matrix norm this means that almost surely there exists 1 ≥ such that for all ≥ 1 and all 1 , 2 ∈ 2 :
ℎ 2 +∗ −1 2
1 − 2 − − Θ,2∙ Φ 1 − 2 1 − 2
ℎ 2 + ∗ −1 ≤ 2
− − Θ,2∙ Φ
<
min 3
Hence by the triangle inequality and the definition of min :
ℎ 2 +∗ −1 2
1 − 2
1 − 2
>
− Θ,2∙ Φ 1 − 2 1 − 2
−
min 2min ≥ 3 3
Now note that by the definition of the Jacobian: ℎ 1 + ∗ −1 − ℎ 2 + ∗ −1 1 − 2
ℎ 2 +∗ −1 2
1 →2
lim
−
1 − 2
=0
so again by the triangle inequality, almost surely for all ≥ 1 there exists ,2 > 0 such that 1 ∈ 2 with 1 − 2 < ,2 implies that almost surely: ℎ 1 + ∗ − ℎ 2 + ∗ −1 −1 1 − 2 2min min min − = 3 3 3
>
Without loss of generality we assume ,2 is the greatest possible such value (where we allow ∞), and then, since asymptotically at least this last inequality must hold everywhere, it seems reasonable to conjecture (though we have not found a proof) that:
lim inf →∞ ,2 > 1 ,2 (3.37)
Thus by Egoroff’s Theorem there is a set 3 ⊆ 2 such that 3 > 1 − 3ℯ/7 and such that there exists 2 ≥ 1 such that for all ≥ 2 , and all 2 ∈ 3 , ,2 > 1 ,2 .
Now note that since is a regular measure (Davidson 1994: 413) there exists a closed set 4 ⊆ 3 such that 3 ∖ 4 < ℯ/7 and so 4 > 1 − 4ℯ/7 . Since closed subspaces of compact spaces are compact
(Sutherland 1975: 84) and 4 ⊆ 1 , 4 is also compact. Let us write ,2 2 for the open ball of radius ,2 centred on 2 , then for all ≥ 1 , ,2 2 2 ∈ 3 is an open cover of 3 , and thus, by compactness, has a finite subcover, say , ,1 , , ,2 , … , , , ,
,1
,2
,
where we may assume that ,1 , ,2 , … , , ⊆ +1,1 , +1,2 , … , +1, +1 without loss of generality. Using this we let: ≔ inf sup > 0 ∃ ∈ 1, … , s.t. ∩ 3 ⊆ ,
∈3
,
,
>0
(That > 0 is easily seen diagrammatically.) Then for ≥ 2 , > 1 , so if we define ≔ 1 , almost surely for all ≥ 2 and all 1 , 2 ∈ 3 with 1 − 2 < : ℎ 1 + ∗ − ℎ 2 + ∗ −1 −1 1 − 2 min 3
>
We now turn to the large 1 − 2 case. Since for all ∈ ℝdim almost surely lim→∞ ℎ − − Θ,2∙ Φ − = 0, by Egoroff’s Theorem there is a set 5 ⊆ 4 such that 5 > 1 − and such that as → ∞: sup ℎ + ∗ −1 − − Θ,2∙ Φ + ∗ −1 −
a.s. 5ℯ 7
∈1
0
Then by the inequalities we originally used to derive pointwise convergence, almost surely there exists some 3 ≥ 2 such that for all ≥ 3 and all 1 , 2 ∈ 5 with 1 − 2 ≥ :
ℎ 1 + ∗ − ℎ 2 + ∗ −1 −1 1 − 2 − Θ,2∙ Φ 1 − 2 1 − 2 min min > 2 2 min 2 1 − 2
>
−
≥ min −
Combining this with our previous result we have that, for ≥ 3 and all 1 , 2 ∈ 5 : ℎ 1 + ∗ − ℎ 2 + ∗ −1 −1 1 − 2
>0
Finally by the regularity of we can find a closed, compact set 6 ⊆ 5 such that 6 > 1 − 6ℯ/7 .
Now let ≔ + ∗ −1 t ∈ 6 , then is compact and ℎ restricted to (written ℎ ) is almost surely an injection for ≥ 3 . Note that without loss of generality we may assume that 3 is the least value satisfying this. Note also that since as → ∞, − ∗ ℐ∗ −1 ∪ −1 converges in distribution to a random variable with probability measure , there exists 4 ≥ 3 such that for all ≥ 4 , Pr∗ −1 − ∗ −1 ∈ 6 − 6 < ℯ/7, so from the above for all ≥ 4 , Pr∗ −1 − ∗ −1 ∈ 6 > 1 − ℯ . In conclusion then this means that for ≥ 4 , ℎ is an injection where Pr∗ −1 ∉ < ℯ.
Similarly to our assumption of the invertibility of for ≥ , we make the reasonable assumption that priors are sufficiently close to the truth that 4 = for some fixed ℯ > 0 which determines the accuracy of our approximation. I.e. we assume that almost surely there is a compact set such that for all ≥ : ℎ
∗ is injective and Pr−1 ∉ < ℯ
(3.38)
By the above, if ℯ is sufficiently large this is automatic. For smaller ℯ we have a trade-off between accuracy and restrictiveness on priors. For the approximation of ℎ by ℎ to be fully rational we would need to take the limit as ℯ → 0, but we cannot rule out the possibility that this would necessitate priors starting out already fully converged, though we conjecture this is not the case. We stress though that this does not mean that the updating of beliefs requires them to have already converged; in theory we could recover the distribution ∗ −1 without assuming ℎ is “approximately” invertible, it would just be far less analytically and numerically tractable. Since hardware precision and the desire for finite running times inevitably place accuracy bounds on real world numerical algorithms, the current approach seems sensible in practice.
∗ Now in order to perform the change of variables necessary to recover −1 , the Jacobian of ℎ , ℎ
must exist and have a non-zero determinant almost everywhere. But from the above we have that
almost surely for all 1 , 2 ∈ :
ℎ 2 2
1 − 2
1 − 2
>
2min 3
so for 2 ∈ ° , (where ° is the interior of ° , i.e. the union of all its open subsets), the largest singular value of
ℎ 2 2
is bounded away from 0, and hence by continuity, in fact for all 2 ∈ ,
ℎ 2 2
is invert-
ible and hence has a non-zero determinant.
∗ We can now at last perform the change of variables. Recall that ℎ ℐ−1 , ~NIID 0, Σ , so
ℎ
∗ ℐ−1 , , ∈ is equal in distribution to ∈ ℎ , which has probability density func-
tion proportional to that of an NIID 0, Σ variable on ℎ . Thus from the change of variables formula (Port 1994: 462), for ∈ ℎ : ℎ
−1
∗ −1 ℎ
−1
, ∈ det
1′ ∝ exp − Σ −1 2
so by the inverse function theorem, for all ∈ : 1 ∗ −1 ∝ exp − ℎ ′ Σ −1 ℎ 2
∗ and for ∉ , we approximate −1 ∗ Pr−1 ∉ < ℯ.
det
ℎ
by 0 which is reasonable since as shown above,
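The change of variables step can be checked numerically in a one-dimensional toy, sketched below in Python. The map `h` used here is a hypothetical smooth, strictly increasing function chosen only so that it is invertible; it is not the function ℎ of the text, and the unit variance and the interval [-3, 3] standing in for the compact set are likewise assumptions. The sketch evaluates the normal density at h(x), multiplies by the absolute value of the Jacobian, and confirms that the result integrates to approximately one, with little mass lost when it is set to zero outside the compact set.

```python
import numpy as np

def h(x):
    """Hypothetical invertible map: strictly increasing since h'(x) = 1 + 0.5 cos(x) > 0."""
    return x + 0.5 * np.sin(x)

def h_prime(x):
    """Derivative (one-dimensional Jacobian) of h."""
    return 1.0 + 0.5 * np.cos(x)

def density_x(x, sigma=1.0):
    """Density of x when h(x) ~ N(0, sigma^2): normal pdf at h(x) times |h'(x)|."""
    z = h(x)
    normal_pdf = np.exp(-0.5 * (z / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return normal_pdf * np.abs(h_prime(x))

def trapezoid(values, grid):
    """Trapezoidal rule on a (possibly uneven) grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))

grid = np.linspace(-12.0, 12.0, 48001)
total_mass = trapezoid(density_x(grid), grid)                  # should be close to 1

compact = np.linspace(-3.0, 3.0, 12001)                        # stand-in for the compact set
mass_on_compact = trapezoid(density_x(compact), compact)       # mass kept by the truncation

if __name__ == "__main__":
    print(total_mass, mass_on_compact)
```

The mass outside the interval plays the role of the approximation error bounded by ℯ in the text; widening the interval shrinks it.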
We now almost have an algorithm for updating beliefs. Unfortunately though, careful inspection reveals that ℎ is defined in terms of ∗ −1 , so the just derived relationship should be thought of as describing the functional fixed point condition that characterises ∗ −1 . However, the above results imply that as → ∞, on , ∗ −1 almost surely tends uniformly to the probability density function of a normal random variable whose parameters are not defined in terms of ∗ −1 . More formally there exists some (non-linear) operator on the space of continuous probability density functions over , such that ∗ −1 ∙ = ∗ −1 ∙ and such that almost surely, as → ∞, ∗ −1 ∙ − ∙ ∞ → 0 for some probability density function , where ∙ ∞ is the function space sup-norm. Since
constant operators are trivially contractions, this suggests (but does not imply) that for some and ≥ , is a contraction on the space of continuous probability density functions over . We conjecture this holds; then, since as is compact, the space of all continuous functions on with the sup-norm is complete (Sutherland 1975: 83,123,76), since the subspace of all functions with integral 1 is closed by Fatou’s lemma (Dudley 2005: 131) and since closed subspaces of a complete metric space are complete (Sutherland 1975: 124), then the space of all continuous probability density functions over is complete. Hence by the Banach Fixed Point Theorem (Sutherland 1975: 130-31), if our conjecture holds then for ≥ , has a unique fixed point which can be arrived at by iteration. As usual, by sufficiently restricting priors we may assume that = , so providing the assumptions and conjectures we have made in this sub-section actually hold, for sufficiently tight priors we have an effective way of computing posterior beliefs to an arbitrary degree of accuracy. Since this is an iteration around numerical integrals, it is however likely to be very slow to compute in practice.
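Iteration to a functional fixed point of this kind can be sketched in Python for a toy operator. Everything in the sketch is an assumption made for illustration: the compact set is the interval [-6, 6] on a grid, and the hypothetical `operator` maps a density to a unit-variance normal density (renormalised on the grid) whose centre depends on the mean of the input density, so that, as in the text, the density is defined in terms of itself. For the chosen coefficient a = 0.05 a crude bound (the maximum slope of the normal density, about 0.242, times a, times the integral of |x| over the interval, 36) suggests a sup-norm contraction modulus of roughly 0.44, so the Banach Fixed Point Theorem applies to this toy; nothing here establishes the conjecture made about the operator in the text.

```python
import numpy as np

GRID = np.linspace(-6.0, 6.0, 1201)   # discretised compact set (assumption)
DX = GRID[1] - GRID[0]

def normalise(values):
    """Rescale non-negative grid values so that they integrate to one."""
    return values / (np.sum(values) * DX)

def mean_of(density):
    """Mean of a density on the grid (Riemann sum)."""
    return float(np.sum(GRID * density) * DX)

def operator(density, a=0.05, c=1.0):
    """Hypothetical operator: unit-variance normal density centred at c + a * mean(density)."""
    centre = c + a * mean_of(density)
    return normalise(np.exp(-0.5 * (GRID - centre) ** 2))

def iterate_to_fixed_point(density, tol=1e-12, max_iter=500):
    """Apply the operator repeatedly until successive iterates agree in sup-norm."""
    for n in range(max_iter):
        new_density = operator(density)
        gap = float(np.max(np.abs(new_density - density)))   # sup-norm distance on the grid
        density = new_density
        if gap < tol:
            return density, n + 1
    raise RuntimeError("fixed-point iteration did not converge")

start = normalise(np.exp(-0.5 * (GRID + 3.0) ** 2))   # arbitrary initial guess
fixed_point, n_iter = iterate_to_fixed_point(start)

if __name__ == "__main__":
    print(n_iter, mean_of(fixed_point))
```

In this toy the fixed point has mean close to c/(1 − a) and is reached in a handful of iterations; with an operator built from numerical integrals over several dimensions, each application would be far more costly, which is consistent with the warning above about computation time.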
3.3. Application to the univariate case
We will now apply the results of this chapter to the univariate case, both in order to verify them by comparison with § 2.2 and in order to provide additional intuition. 0 We assume ≠ 0, so we may take Γ0 = , Γ1 = − 1
0 0 0 1 1 and Π = . , Ψ = , = − , = − −
From the properties of the QZ decomposition: Λ = and Ω = Γ1 , thus we may take Λ = . So 0 by unitarity, = , which leaves us with the Schur decomposition formula Ω = − as we had in § 2.2.3, which means the diagonal elements of Ω are the eigenvalues of Γ1 . 1
Note that Ψ Π is invertible providing ≠ 0, which is always true, thus the condition for the existence
1
of a full information REE is that ∙1 ∙1 = and the condition for the existence of a full information FREE is −1 that ∙1 ∙1 = and is of full rank with rows which are linearly independent of those of 11 ∙1 2∙ Ψ.
3.3.1. Fully stable cases When either 2 − 4 < 0 and
≤ 1, or 0 ≤ 2 − 4 and − + ≤ ≤ + , all the diagonal
elements of Ω are in the unit circle. Thus dim 2, = 0, so Ω = Ω11 , = = 1∙ = ∙1 and 2∙ Π and
2∙ Ψ both have zero rows, which implies , , , have zero rows too and we may take ∙2 = ∙2 = 1. This implies that the conditions for the existence of a REE and the first condition for the existence of a FREE are automatically satisfied. The second condition for the existence of a FREE just requires that ≠ 0, which is the condition we derived in § 2.2.2. Putting these values into the formulas derived in § 2.3.2, § 2.3.3 and § 2.3.5 then gives: 1 0 0 = , 1 Θ = Ω = Γ1 Θ = Π = Π
−1 Φ = , −1 Φ = −
dim = 0,
=
Θ = Ψ + Π = Ψ + Π , = = , = = ,
(These are entirely as expected from § 2.2.2.) So from (2.21), in the full information case we have: 1 −1 ∗ ∗ 1 −1 1 −1 ∗ − − −1 + ∗ − − + −1
∗ ∗ +1 =
which agrees with the FREE solution derived in § 2.2.2. Also from the formulas derived in § 3.1.3, § 3.1.4 and § 3.1.5 and § 3.2.5: Θ has 0 columns, Θ = Π∙2 = Π, 1 = 1 0 Γ1 − 0 1 = 0 0
, ℒ, ℛ, , ℬ, , all have 0 rows,
−1 Φ = − ,
∙2 = 1, = ∙1 = 1, Ω = 0, = 1 −1 ,
has 0 rows
Φ has 0 columns,
=
It is easy to see that this implies that conditions (3.13), (3.15), (3.16) and (3.25) hold. From (3.21) we have:
= Θ −1 + + + Θ + Θ + Θ Θ ∙1
= Γ1 −1 + + + Ψ + Π + Π + Π Furthermore, we have that: 1 −1
= Θ ,2∙ − Θ,2∙ Φ Λ22
1
Θ ,2∙ + Θ,2∙ Φ ∙1 =
−1 so we can take = = 1 and = . Additionally from (3.22) this implies:
+1 =
1 −1 ∗ ∗ 1 −1 1 −1 1 −1 ∗ − − −1 + ∗ − − + + −1
which is identical to the full information FREE form apart from the additional term. 3.3.2. Saddle-path stable cases When 0 ≤ 2 − 4 and either < − + or > + then there is precisely one eigenvalue in the unit circle: let us call this eigenvalue 1 and the other 2 , so 1 ≤ 1 and 2 > 1. Thus dim 2, = 1, 11 so Ω11 = 1 , Ω22 = 2 . As in § 2.2.3 we write Ω12 = 12 and = 21
ing 1∙ Π = 11 + 21 = 1 1 1
12 , so = 11 22 12
21 , mean22 1
12 11 − 1 12 (by (2.4) and (2.6)) and 2∙ Π = 12 + 22 =
2 11
(similarly) and 2∙ Ψ = − =
1
11 (by (2.6)). These are scalars so we may take = = = = 1, 11 . This implies that the condition for the existence of a REE and the first
2 11 and = −
condition for the existence of a FREE are satisfied if and only if 11 ≠ 0. The second condition for the existence of a FREE then holds trivially. Thus assuming 11 ≠ 0, from the formulas derived in § 2.3.2, § 2.3.3 and § 2.3.5 and the identities (2.4), (2.5) and (2.6): −1 22 −2 22
22 2 + − 2 − 1 2
= 2 − 1 1
+ + 2 − 1
=
=
12 11 − 1 12 = ∙1 2 11 0 1
∙2 + ∙1
12 11 − 1 12 2 11 0
Θ = 1 ∙1 ∙1 +
z12 z12 ∙2 = 1 ∙1 11 + z11 12 z11
12 11 − 1 12 ∙2 2 11 0 − 0 ∙1 1= , − 2 11
Θ = ∙1 ∙1 −
Θ = 0
=
∙1
12 11 − 1 12 − ∙2 2 11
2 11 = + − 2 22 2 − 1 2
+ ∙2 22
=
1 1 12 11 − 1 z12 2 + − + ∙1 2 11 2 − 1 2 12 11 − 1 12 − ∙2 2 11 1 − 2 −1 ∙2
2 + − 2 − 1 2
=
∙1
0 = −
2 11 11 2 − 1
=
1 ∙1 − 12 + ∙2 11 2 − 1 11 Φ = 2 , Φ = 0
It is clear these agree with the solution derived in § 2.2.3. By these results, from (2.21), (2.5) and (2.6) in the full information case we have the following FREE form: 1 1 + 2 − 1 2 − 1 2 + − 2 − 1 2
∗ 2∗ ∗ ∗ ∗ +1 = 1 + 1 −1 − 1 ∗ + −1
2
+
+
1 + 1 2 − 1
This agrees with the FREE form derived in § 2.2.3 (though it is not quite as simple) since
2∗ ∗ 1 ∗ = 1 −1 + 1 −1 2 −1
+
2 −1 2
+
1 2 −1
.
Now from the identities derived in § 2.2.3 and the formulae derived in § 3.1.3, § 3.1.4 and § 3.2.5: 1 12 − 12 11 − ∙2 , 2 11 0 − 0 1 = 1
Θ = ∙1 1 = 1
0 1 ∙1 11 +
Θ has 0 columns 11 22 z12 21 − −1 = 1 −1
z12 z11 12
=
1 1 21 z12 − 11 22 + 1 = 0 11 1 z12 − 22 + 1 =
ℒ= = 1 2 1 2 11 1 12 − 22 − 11 1 12 − 12 11 − 2 12 11 1 12 − 22 − 11 1 12 − 22 = 1 0 − 0 = 0, 2 = 0, 1 , 11 0
0 columns
=0 0 0
0 columns
ℛ has 0 columns, = ∙2 = 1, = 2 1
ℬ=
=0
= ∙2 = 1, 0
0 columns
= ∙1 = 1,
Ω=0
−1 − Φ = −
= 2 −1 − , 11 = −1 2
Φ has 0 columns,
=
Again it is easy to see that conditions (3.13), (3.15), (3.16) and (3.25) hold. From (3.21) we have: = Θ −1 + + + Θ + Θ + Θ
= 1 ∙1 11 +
Θ ∙1
z12 z11 12
0 −1 2 + − 2 − 1 2
+
1 1 12 11 − 1 z12 2 + − ∙1 + 2 11 2 − 1 2
+ ∙2 22
+
1 ∙1 1 12 − 12 11 ∙1 − 12 + ∙2 11 + + ∙1 − ∙2 2 − 1 11 2 11 2 11
Furthermore, we have that by (2.5) and (2.6): 11
= Θ ,2∙ − Θ,2∙ Φ Λ22
Θ ,2∙ + Θ,2∙ Φ ∙1 = −
so we can take = = 1 and = −
11
. This also implies that by (3.22) and (2.5) and (2.6) again: − 2 − 1 11
+1 = 1 +
+ + 2 − 1 2 − 1
2
+
which were it not for the term would be identical to the simple FREE solution derived in § 2.2.2. (Indeed, we conjecture in light of this that using the derived partial information FREE solution with = 0 and our solution for we can always derive the MSV solution if it exists, even in the multivariate case.)
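The distinction between the cases of § 3.3.1 and § 3.3.2 rests on how many eigenvalues lie in the unit circle. A minimal sketch of that count is given below. It assumes a hypothetical 2×2 real matrix standing in for Γ1 (with Γ0 the identity), follows the convention above that an eigenvalue with modulus at most 1 counts as in the unit circle, and uses function names and a label for the case with no such eigenvalue that are our own and are not taken from the text.

```python
import cmath


def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)  # complex if discriminant < 0
    return ((trace - root) / 2.0, (trace + root) / 2.0)


def classify(m, tol=1e-12):
    """Label a matrix by the number of eigenvalues with modulus <= 1."""
    inside = sum(1 for lam in eigenvalues_2x2(m) if abs(lam) <= 1.0 + tol)
    labels = {
        2: "fully stable",
        1: "saddle-path stable",
        0: "no eigenvalue in the unit circle",  # label assumed for illustration
    }
    return labels[inside]


# Hypothetical example matrices, chosen only to exercise each branch.
examples = {
    "real roots, both inside": [[0.5, 0.0], [0.0, 0.25]],
    "complex roots, both inside": [[0.0, -0.5], [1.0, 0.0]],
    "one inside, one outside": [[0.5, 0.0], [0.0, 2.0]],
    "both outside": [[3.0, 0.0], [0.0, 2.0]],
}
for name, matrix in examples.items():
    print(name, "->", classify(matrix))
```

For a multivariate model the same count would be read off the ordered generalized Schur (QZ) decomposition rather than from a closed-form quadratic.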
3.3.3. Convergence conditions
We showed in § 3.3.1 and § 3.3.2 that providing the condition for the existence of a REE holds, (3.13), (3.15), (3.16) and (3.25) always hold. This leaves conditions (3.23), (3.26), (3.27), (3.31), (3.32) and (3.33) to investigate. We turn first to (3.23). Assuming partial information expectations are formed according to (3.29), we have that Cov , +1 ≈ Θ,2∙ + Θ,2∙ Φ (not exact since the other dated expectations will be slightly correlated with ). Now Θ,2∙ + Θ,2∙ Φ full stability = 1 −1 full stability , which, if and are known non-zero constants, will be generically non-zero providing there is a reasonable degree of asymmetry around 0 to beliefs, which is certainly guaranteed to be true for large . Since Θ,2∙ + Θ,2∙ Φ saddle path stability = 0, this in turn means that providing beliefs put positive probability on full indeterminacy it is reasonable to expect Cov , +1 is of full rank for large . Similarly, from the solutions for in § 3.3.1 and § 3.3.2, with sufficiently tight/asymmetric priors it seems reasonable to expect (3.26) holds. Condition (3.27) is not strictly a condition at all, since we just choose to make it hold, so we are left with conditions (3.31), (3.32) and (3.33), about which we cannot really say anything since they relate to the actual evolution of variables over time. In future work we hope to investigate them through simulations.
3.3.4. Proposition 4
If the true model is univariate with no unit roots and the assumptions on beliefs in § 3.2.1 and § 3.2.2 hold, then: Under full stability a full information REE always exists and ≠ 0 is necessary for the existence of a full or partial information FREE. Under saddle-path stability 11 ≠ 0 is necessary for the existence of a full or partial information, REE or FREE. In either case, it is sufficient for the existence of a partial information FREE that the respective necessary condition for the existence of a FREE, (3.26) and (3.27) hold almost surely. For probability 1 convergence, conditions (3.31), (3.32) and (3.33) must hold almost surely.
Under full stability, that ≠ 0, (3.23), (3.27), (3.31) and (3.32) all hold with probability 1 is sufficient for almost sure convergence. Under saddle-path stability, that 11 ≠ 0, (3.23), (3.27), (3.31), (3.32) and (3.33) all hold with probability 1 is sufficient for almost sure convergence.
3.4. Bounded rationality approximations
We finish by describing how bounded rationality schemes arise naturally from the partial information solution we have described. Many of our assumptions can be dropped or weakened under bounded rationality, and our algorithms simplified. By virtue of their origin, the schemes that result are less vulnerable to the charge of being arbitrary than those in the existing literature. The most obvious first step away from rationality would be to drop all assumptions that we have shown hold asymptotically anyway. This includes our assumption that is always invertible, which we could do without under bounded rationality by taking its pseudo-inverse, and our assumption that ℎ is always injective, which could be avoided by increasing the parameter ℯ sufficiently. A greater restriction on rationality would be to approximate our belief updating algorithm by Bayesian instrumental variables (Kleibergen and Zivot 1998), along similar lines to the classical SIV regression described in § 3.2.3. This would enable a conjugate prior form for beliefs to be found, thereby greatly increasing numerical tractability, while still enabling us to form non-point expectations of the required matrices. Finally, we could arrive at a “Bayesian econometrician” version of E&H’s approach by assuming
−1 that agents irrationally expect = ∙1 11 ∙1 Ω22
0 ∙1 −1 = 0 under the − 1 information set,
which would mean the final two terms dropped out of (3.29), our partial information FREE solution. These approximations could potentially help answer how a model’s dynamics change on the continuum from full rationality to E&H style bounded rationality, which in turn could enable us to provide an empirical answer to what degree of bounded rationality real world agents possess.
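To illustrate why a conjugate prior form increases numerical tractability, the sketch below shows the simplest conjugate update: a normal prior on an unknown mean with normally distributed observations of known variance. This is not the Bayesian instrumental variables procedure of Kleibergen and Zivot (1998); the scalar model, the function name and all parameter values are assumptions chosen only to show that, under conjugacy, beliefs are updated in closed form with no numerical integration.

```python
def update_normal_mean(prior_mean, prior_var, observations, noise_var):
    """Posterior mean and variance of an unknown mean under a normal prior.

    Observations are assumed i.i.d. normal with known variance `noise_var`,
    so the posterior is again normal and depends only on precisions.
    """
    n = len(observations)
    post_precision = 1.0 / prior_var + n / noise_var
    post_var = 1.0 / post_precision
    post_mean = post_var * (prior_mean / prior_var + sum(observations) / noise_var)
    return post_mean, post_var


# Assumed data: updating once on the whole sample...
data = [1.0, 2.0, 3.0]
batch_mean, batch_var = update_normal_mean(0.0, 1.0, data, 1.0)

# ...agrees with updating period by period, carrying only two numbers forward.
seq_mean, seq_var = 0.0, 1.0
for y in data:
    seq_mean, seq_var = update_normal_mean(seq_mean, seq_var, [y], 1.0)

print(batch_mean, batch_var, seq_mean, seq_var)
```

Because the posterior stays within the same family, the belief state carried between periods is a fixed, small set of parameters rather than a whole density.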
4. Conclusion
In this thesis, we have answered a major unsolved question in modern macroeconomics: how expectations should rationally be formed under partial information so as to ensure that the economy is not asymptotically explosive. We have found necessary and sufficient conditions for the path a model follows when agents form expectations in this way to converge to that followed under the standard full information rational expectations solution, and we have gone a significant way towards describing a numerical algorithm for the inter-period updating of beliefs. Additionally we have described a new class of full information REEs, which, in a significant sense, are the only feasible ones.

This is the first paper to address this topic, though, so there is still much work to be done. Tighter necessary and sufficient conditions for convergence need to be found if this work is to be easily applicable, in particular conditions that do not need to be tested by simulations. We conjecture that given sufficiently diffuse priors, the conditions for the existence of a FREE and the conditions for partial information convergence broadly coincide. We further conjecture that beliefs must put positive probability on full indeterminacy for convergence to take place. Future work must also provide a software implementation of our algorithms for the updating of beliefs, and the conjectures we made in § 3.2.9 must be proven to ensure these are valid. In order to answer our criticism of E&H we must also investigate rational learning in an overlapping generations framework, with a realistic population age distribution.

The implications of this work are numerous. Firstly, the conditions for convergence and feasibility it outlines can be used both to rule out particular equilibria and to rule out entire models: potentially any theoretical model that did not produce indeterminacy for parameters with theoretically plausible values. Secondly, the solution can be applied to examine economic dynamics under partial information rational expectations, which could shed light on many major empirical puzzles and assist with the design of optimal fiscal and monetary rules. Thirdly, the described algorithms for belief updating and expectations formation could be used econometrically to produce estimates of our actual economy’s parameters that are not plagued by the omitted variable bias that comes from the lack of a decent proxy for expectations. Finally, this work could be useful for forecasting with truly rational, partial information expectations.
5. Appendix A: Matrix quasi-geometric series
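For any square matrix $A$ and $n \in \mathbb{N}$:

$$(I - A)\sum_{k=0}^{n} A^k = \sum_{k=0}^{n} A^k - \sum_{k=0}^{n} A^{k+1} = I - A^{n+1}.$$

For $\sum_{k=0}^{\infty} A^k$ to converge (in the sense of footnote 34) we at least require $\lim_{n\to\infty} A^n = 0$, which is true if and only if all the eigenvalues of $A$ are strictly inside the unit circle. This is also sufficient for convergence, as it means $I - A$ is invertible and the right hand side converges to $I$, so that $\sum_{k=0}^{\infty} A^k = (I - A)^{-1}$.

Also for any square matrix $A$ and $n \in \mathbb{N}$:

$$(I - A)\left[(I - A)\sum_{k=0}^{n+1} k A^{k-1}\right] = (I - A)\left[\sum_{k=0}^{n+1} k A^{k-1} - \sum_{k=0}^{n+1} k A^{k}\right] = (I - A)\left[\sum_{k=0}^{n} A^{k} - (n+1)A^{n+1}\right] = I - A^{n+1} - (n+1)(I - A)A^{n+1}.$$

As before, we require $\lim_{n\to\infty} n A^{n-1} = 0$. Taking the Jordan normal form of $A$, we can write $n A^{n-1} = n P J^{n-1} P^{-1}$ with $P$ invertible and $J$ block diagonal with diagonal blocks of the form $\lambda_i I + N_i$, with $N_i$ nilpotent and $\lambda_i$ the $i$th eigenvalue; so $\lim_{n\to\infty} n A^{n-1} = 0$ if and only if for each block $i$, $\lim_{n\to\infty} n(\lambda_i I + N_i)^{n-1} = 0$. Then by the binomial theorem and the nilpotency of $N_i$:

$$n(\lambda_i I + N_i)^{n-1} = n\sum_{k=0}^{\min\{n-1,\,\dim N_i\}} \binom{n-1}{k}\lambda_i^{\,n-1-k} N_i^{k} = O\!\left(n^{\dim N_i}\,|\lambda_i|^{\,n-1}\right)$$

(using order notation as $n \to \infty$, again assuming the matrix norm induced by the Euclidean vector norm). Thus, since $n^{\dim N_i}|\lambda_i|^{\,n-1} \to 0$ as $n \to \infty$ if $|\lambda_i| < 1$, we must have that when $|\lambda_i| < 1$, $n(\lambda_i I + N_i)^{n-1} \to 0$ as $n \to \infty$ too. Thus again we have that $\sum_{k=0}^{\infty} k A^{k-1}$ converges if and only if all the eigenvalues of $A$ are strictly inside the unit circle, in which case we have

$$\sum_{k=0}^{\infty} k A^{k-1} = (I - A)^{-2}.$$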
34 These and subsequent limits are taken under the matrix norm induced by the Euclidean vector norm.
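The two limits of this appendix can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the derivation: it uses a hypothetical 2×2 Jordan block with eigenvalue 0.5 (so the nilpotent part is non-zero), truncates both series at 200 terms, and compares the partial sums with the inverse and the squared inverse of the identity minus the matrix. All helper names are our own.

```python
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def mat_mul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_add(x, y):
    """Entrywise sum of two 2x2 matrices."""
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]


def mat_scale(c, x):
    """Scalar multiple of a 2x2 matrix."""
    return [[c * x[i][j] for j in range(2)] for i in range(2)]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def max_abs_diff(x, y):
    """Largest entrywise absolute difference between two 2x2 matrices."""
    return max(abs(x[i][j] - y[i][j]) for i in range(2) for j in range(2))


def partial_sums(a, n):
    """Return (sum_{k=0}^{n} A^k, sum_{k=0}^{n} k A^(k-1))."""
    geometric = [[0.0, 0.0], [0.0, 0.0]]
    weighted = [[0.0, 0.0], [0.0, 0.0]]
    power = IDENTITY   # holds A^k at the start of each pass
    previous = None    # holds A^(k-1); unused when k = 0
    for k in range(n + 1):
        geometric = mat_add(geometric, power)
        if k >= 1:
            weighted = mat_add(weighted, mat_scale(k, previous))
        previous, power = power, mat_mul(power, a)
    return geometric, weighted


# Hypothetical example: a single Jordan block with eigenvalue 0.5.
A = [[0.5, 1.0], [0.0, 0.5]]
inv = inverse_2x2(mat_add(IDENTITY, mat_scale(-1.0, A)))  # (I - A)^(-1)
geometric, weighted = partial_sums(A, 200)

err_first = max_abs_diff(geometric, inv)
err_second = max_abs_diff(weighted, mat_mul(inv, inv))
print(err_first, err_second)
```

A non-diagonalisable block is used deliberately, since that is the case in which the polynomial factor in the order bound above actually matters.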
6. References
Auray, Stéphane and Fève, Patrick (2007), 'On Sunspots, Habits, And Monetary Facts', Macroeconomic Dynamics, 12 (01), 72-96.
Benhabib, Jess and Farmer, Roger E. A. (1994), 'Indeterminacy and Increasing Returns', Journal of Economic Theory, 63 (1), 19-41.
Benhabib, Jess and Gali, Jordi (1995), 'On growth and indeterminacy: some theory and evidence', Carnegie-Rochester Conference Series on Public Policy, 43, 163-211.
Benhabib, Jess and Nishimura, Kazuo (1998), 'Indeterminacy and Sunspots with Constant Returns', Journal of Economic Theory, 81 (1), 58-96.
Benhabib, Jess and Farmer, Roger E. A. (2000), 'The Monetary Transmission Mechanism', Review of Economic Dynamics, 3 (3), 523-50.
Benhabib, Jess, Schmitt-Grohe, Stephanie, and Uribe, Martin (1998), Monetary Policy and Multiple Equilibria (SSRN).
Benhabib, Jess and Farmer, Roger E. A. (1999), 'Chapter 6 Indeterminacy and sunspots in macroeconomics', in John B. Taylor and Michael Woodford (eds.), Handbook of Macroeconomics (Volume 1, Part 1; Elsevier), 387-448.
Binder, M. and Pesaran, H. (1996), 'Multivariate Linear Rational Expectations Models: Characterisation of the Nature of the Solutions and Their Fully Recursive Computation', (Faculty of Economics (formerly DAE), University of Cambridge).
Blanchard, Olivier Jean and Kahn, Charles M. (1980), 'The Solution of Linear Difference Models under Rational Expectations', Econometrica, 48 (5), 1305-11.
Blanchard, Olivier Jean and Summers, Lawrence H. (1987), 'Fiscal increasing returns, hysteresis, real wages and unemployment', European Economic Review, 31 (3), 543-60.
Blume, Lawrence E. and Easley, David (1993), 'Rational Expectations and Rational Learning', (EconWPA).
Blume, Lawrence E., Bray, Margaret M., and Easley, David (1982), 'Introduction to the Stability of Rational Expectations Equilibrium', Journal of Economic Theory, 26 (2), 313-17.
Bray, Margaret (1982), 'Learning, estimation, and the stability of rational expectations', Journal of Economic Theory, 26 (2), 318-39.
Cagan, Phillip (1954), 'The monetary dynamics of hyper-inflations'.
Calvo, Guillermo A. (1983), 'Staggered Prices in a Utility-Maximizing Framework', Journal of Monetary Economics, 12 (3), 383-98.
Cogley, Timothy and Sargent, Thomas J. (2008), 'Anticipated Utility and Rational Expectations as Approximations of Bayesian Decision Making', International Economic Review, 49, 185-221.
Davidson, James (1994), Stochastic limit theory: an introduction for econometricians (Advanced texts in econometrics; Oxford; New York: Oxford University Press).
Doob, Joseph L. (1953), Stochastic processes (Wiley publications in statistics; New York: Wiley).
--- (1984), Classical potential theory and its probabilistic counterpart (Grundlehren der mathematischen Wissenschaften, 262; New York: Springer-Verlag).
Dreze, Jacques H. (1976), 'Bayesian Limited Information Analysis of the Simultaneous Equations Model', Econometrica, 44 (5), 1045-75.
Dudley, R. M. (2005), Real analysis and probability (Cambridge: Cambridge University Press).
Easley, David and Kiefer, Nicholas M. (1988), 'Controlling a Stochastic Process with Unknown Parameters', Econometrica, 56 (5), 1045-64.
Ellison, Martin and Pearlman, Joe (2008), 'Saddlepath Learning'.
Evans, George W. and Honkapohja, Seppo (2001), Learning and expectations in macroeconomics (Frontiers of Economic Research; Princeton and Oxford: Princeton University Press).
Evans, I. G. (1965), 'Bayesian Estimation of Parameters of a Multivariate Normal Distribution', Journal of the Royal Statistical Society. Series B (Methodological), 27 (2), 279-83.
Hastings, W. K. (1970), 'Monte Carlo Sampling Methods Using Markov Chains and Their Applications', Biometrika, 57 (1), 97-109.
Hinrichsen, Diederich and Pritchard, A. J. (2005), Mathematical systems theory (Texts in applied mathematics, 48; Berlin; New York: Springer).
Horn, Roger A. and Johnson, Charles R. (1985), Matrix analysis (Cambridge [Cambridgeshire]; New York: Cambridge University Press).
Howitt, Peter and McAfee, R. Preston (1988), 'Stability of Equilibria with Externalities', The Quarterly Journal of Economics, 103 (2), 261-77.
Kalai, Ehud and Lehrer, Ehud (1991), 'Rational Learning Leads to Nash Equilibrium', (C.V. Starr Center for Applied Economics, New York University).
Kalman, R. E. (1960), 'A New Approach to Linear Filtering and Prediction Problems', Transactions of the ASME - Journal of Basic Engineering, 82, 35-45.
Kiefer, Nicholas M. and Nyarko, Yaw (1989), 'Optimal Control of an Unknown Linear Process with Learning', International Economic Review, 30 (3), 571-86.
Kleibergen, Frank and Zivot, Eric (1998), 'Bayesian and Classical Approaches to Instrumental Variables Regression', (EconWPA).
Kreps, David M. (1998), 'Anticipated Utility and Dynamic Choice', in Donald P. Jacobs, Ehud Kalai, and Morton Kamien (eds.), Frontiers of research in economic theory (Cambridge: Cambridge University Press), 242-74.
Lubik, T. A. and Schorfheide, F. (2003), 'Computing sunspot equilibria in linear rational expectations models', Journal of Economic Dynamics and Control, 28, 273-85.
Lucas, Robert E. (1972), 'Expectations and the neutrality of money', Journal of Economic Theory, 4 (2), 103-24.
Marcet, Albert and Sargent, Thomas J. (1989), 'Convergence of least squares learning mechanisms in self-referential linear stochastic models', Journal of Economic Theory, 48 (2), 337-68.
Mas-Colell, Andreu, Whinston, Michael Dennis, and Green, Jerry R. (1995), Microeconomic theory (New York: Oxford University Press).
Mavroeidis, Sophocles and Zwols, Yori (2007), 'LiRE - An Ox package for solving Linear Rational Expectations Models Manual Version 3.0'. , accessed 07/12/2007.
McCallum, Bennett T. (1983), 'On Non-Uniqueness in Rational Expectations Models: An Attempt at Perspective', (National Bureau of Economic Research, Inc).
--- (1999), 'Role of the Minimal State Variable Criterion in Rational Expectations Models', International Tax and Public Finance, 6 (4), 621-39.
Muth, John F. (1961), 'Rational Expectations and the Theory of Price Movements', Econometrica, 29 (3), 315-35.
Pearlman, Joseph, Currie, David, and Levine, Paul (1986), 'Rational expectations models with partial information', Economic Modelling, 3 (2), 90-105.
Port, Sidney C. (1994), Theoretical probability for applications (Wiley series in probability and mathematical statistics; New York: Wiley).
Prescott, Edward C. (1972), 'The Multi-Period Control Problem Under Uncertainty', Econometrica, 40 (6), 1043-58.
Quarteroni, Alfio, Sacco, Riccardo, and Saleri, Fausto (2000), Numerical mathematics (Texts in applied mathematics, 37; New York: Springer).
Radner, Roy (1979), 'Rational Expectations Equilibrium: Generic Existence and the Information Revealed by Prices', Econometrica, 47 (3), 655-78.
Rotemberg, Julio J. and Woodford, Michael (1992), 'Oligopolistic Pricing and the Effects of Aggregate Demand on Economic Activity', The Journal of Political Economy, 100 (6), 1153-207.
Sargent, Thomas J., Fand, David, and Goldfeld, Stephen (1973), 'Rational Expectations, the Real Rate of Interest, and the Natural Rate of Unemployment', Brookings Papers on Economic Activity, 1973 (2), 429-80.
Savage, Leonard J. (1954), The foundations of statistics (Wiley publications in statistics; New York: Wiley).
Sims, Christopher A. (2002), 'Solving Linear Rational Expectations Models', Computational Economics, 20 (1), 1-20.
Sutherland, W. A. (1975), Introduction to metric and topological spaces (Oxford [Eng.]: Clarendon Press).
Taylor, John B. (1998), 'Monetary policy rules', (Chicago: University of Chicago Press).
Townsend, Robert M. (1978), 'Market Anticipations, Rational Expectations, and Bayesian Analysis', International Economic Review, 19 (2), 481-94.
--- (1983), 'Forecasting the Forecasts of Others', The Journal of Political Economy, 91 (4), 546-88.
Walsh, Carl E. (2003), Monetary theory and policy (Second edition; Cambridge and London: MIT Press).
Woodford, Michael (1987), 'Credit Policy and the Price Level in a Cash-in-Advance Economy', in William A. Barnett and Kenneth J. Singleton (eds.), New approaches to monetary economics: Proceedings of the Second International Symposium in Economic Theory and Econometrics (International Symposia in Economic Theory and Econometrics series; New York and Melbourne: Cambridge University Press), 52-66.
--- (1994), 'Monetary policy and price level determinacy in a cash-in-advance economy', Economic Theory, 4 (3), 345-80.
Wooldridge, Jeffrey M. (2002), Econometric analysis of cross section and panel data (Cambridge, Mass.: MIT Press).
Young, H. Peyton (2004), Strategic learning and its limits (Oxford [England]; New York: Oxford University Press).