Lectures in Labor Economics
Daron Acemoglu
David Autor
Contents
Part 1. Introduction to Human Capital Investments
Chapter 1. The Basic Theory of Human Capital
1. General Issues
2. Uses of Human Capital
3. Sources of Human Capital Differences
4. Human Capital Investments and The Separation Theorem
5. Schooling Investments and Returns to Education
6. A Simple Two-Period Model of Schooling Investments and Some Evidence
7. Evidence on Human Capital Investments and Credit Constraints
8. The Ben-Porath Model
9. Selection and Wages: The One-Factor Model
Chapter 2. Human Capital and Signaling
1. The Basic Model of Labor Market Signaling
2. Generalizations
3. Evidence on Labor Market Signaling
Chapter 3. Externalities and Peer Effects
1. Theory
2. Evidence
3. School Quality
4. Peer Group Effects
Part 2. Incentives, Agency and Efficiency Wages
Chapter 4. Moral Hazard: Basic Models
1. The Baseline Model of Incentive-Insurance Trade-off
2. Incentives without Asymmetric Information
3. Incentives-Insurance Trade-off
4. The Form of Performance Contracts
5. The Use of Information: Sufficient Statistics
Chapter 5. Moral Hazard with Limited Liability, Multitasking, Career Concerns, and Applications
1. Limited Liability
2. Linear Contracts
3. Evidence
4. Multitasking
5. Relative Performance Evaluation
6. Tournaments
7. Application: CEO Pay
8. The Basic Model of Career Concerns
9. Career Concerns Over Multiple Periods
10. Career Concerns and Multitasking: Application to Teaching
11. Moral Hazard and Optimal Unemployment Insurance
Chapter 6. Holdups, Incomplete Contracts and Investments
1. Investments in the Absence of Binding Contracts
2. Incomplete Contracts and the Internal Organization of the Firm
Chapter 7. Efficiency Wage Models
1. The Shapiro-Stiglitz Model
2. Other Solutions to Incentive Problems
3. Evidence on Efficiency Wages
4. Efficiency Wages, Monitoring and Corporate Structure
Part 3. Investment in Post-Schooling Skills
Chapter 8. The Theory of Training Investments
1. General Vs. Specific Training
2. The Becker Model of Training
3. Market Failures Due to Contractual Problems
4. Training in Imperfect Labor Markets
5. General Equilibrium with Imperfect Labor Markets
Chapter 9. Firm-Specific Skills and Learning
1. The Evidence On Firm-Specific Rents and Interpretation
2. Investment in Firm-Specific Skills
3. A Simple Model of Labor Market Learning and Mobility
Part 4. Search and Unemployment
Chapter 10. The Partial Equilibrium Model
1. Basic Model
2. Unemployment with Sequential Search
3. Aside on Riskiness and Mean Preserving Spreads
4. Back to the Basic Partial Equilibrium Search Model
5. Paradoxes of Search
Chapter 11. Basic Equilibrium Search Framework
1. Motivation
2. The Basic Search Model
3. Efficiency of Search Equilibrium
4. Endogenous Job Destruction
5. A Two-Sector Search Model
Chapter 12. Composition of Jobs
1. Endogenous Composition of Jobs with Homogeneous Workers
2. Endogenous Composition of Jobs with Heterogeneous Workers
Chapter 13. Wage Posting and Directed Search
1. Inefficiency of Search Equilibria with Investments
2. The Basic Model of Directed Search
3. Risk Aversion in Search Equilibrium
Part 1
Introduction to Human Capital Investments
CHAPTER 1
The Basic Theory of Human Capital
1. General Issues
One of the most important ideas in labor economics is to think of the set of
marketable skills of workers as a form of capital in which workers make a variety
of investments. This perspective is important in understanding both investment
incentives and the structure of wages and earnings.
Loosely speaking, human capital corresponds to any stock of knowledge or
characteristics the worker has (either innate or acquired) that contributes to his or her
“productivity”. This definition is broad, and this has both advantages and
disadvantages. The advantages are clear: it enables us to think of not only the years
of schooling, but also of a variety of other characteristics as part of human capital
investments. These include school quality, training, attitudes towards work, etc.
Using this type of reasoning, we can make some progress towards understanding some
of the differences in earnings across workers that are not accounted for by schooling
differences alone.
The disadvantages are also related. At some level, we can push this notion of
human capital too far, and think of every difference in remuneration that we observe
in the labor market as due to human capital. For example, if I am paid less than
another Ph.D., that must be because I have lower “skills” in some other dimension
that's not being measured by my years of schooling; this is the famous (or infamous)
unobserved heterogeneity issue. The presumption that all pay differences are related
to skills (even if these skills are unobserved to the economists in the standard data
sets) is not a bad place to start when we want to impose a conceptual structure on
empirical wage distributions, but there are many notable exceptions, some of which
will be discussed later. Here it is useful to mention three:
(1) Compensating differentials: a worker may be paid less in money, because
he is receiving part of his compensation in terms of other (hard-to-observe)
characteristics of the job, which may include lower effort requirements, more
pleasant working conditions, better amenities, etc.
(2) Labor market imperfections: two workers with the same human capital may
be paid different wages because jobs differ in terms of their productivity and
pay, and one of them ended up matching with the high productivity job,
while the other has matched with the low productivity one.
(3) Taste-based discrimination: employers may pay a lower wage to a worker
because of the worker's gender or race due to their prejudices.
In interpreting wage differences, and therefore in thinking of human capital
investments and the incentives for investment, it is important to strike the right
balance between assigning earning differences to unobserved heterogeneity,
compensating wage differentials and labor market imperfections.
2. Uses of Human Capital
The standard approach in labor economics views human capital as a set of
skills/characteristics that increase a worker's productivity. This is a useful
starting place, and for most practical purposes quite sufficient. Nevertheless, it may be
useful to distinguish between some complementary/alternative ways of thinking of
human capital. Here is a possible classification:
(1) The Becker view: human capital is directly useful in the production process.
More explicitly, human capital increases a worker's productivity in all tasks,
though possibly differentially in different tasks, organizations, and
situations. In this view, although the role of human capital in the production
process may be quite complex, there is a sense in which we can think of it as
represented (representable) by a unidimensional object, such as the stock
of knowledge or skills, $h$, and this stock is directly part of the production
function.
(2) The Gardner view: according to this view, we should not think of human
capital as unidimensional, since there are many, many dimensions or types
of skills. A simple version of this approach would emphasize mental vs.
physical abilities as different skills. Let us dub this the Gardner view
after the work by the social psychologist Howard Gardner, who contributed
to the development of multiple-intelligences theory, in particular
emphasizing how many geniuses/famous personalities were very “unskilled” in some
other dimensions.
(3) The Schultz/Nelson-Phelps view: human capital is viewed mostly as the
capacity to adapt. According to this approach, human capital is especially
useful in dealing with “disequilibrium” situations, or more generally, with
situations in which there is a changing environment, and workers have to
adapt to this.
(4) The Bowles-Gintis view: “human capital” is the capacity to work in
organizations, obey orders, in short, adapt to life in a hierarchical/capitalist
society. According to this view, the main role of schools is to instill in
individuals the “correct” ideology and approach towards life.
(5) The Spence view: observable measures of human capital are more a signal of
ability than characteristics independently useful in the production process.
Despite their differences, the first three views are quite similar, in that “human
capital” will be valued in the market because it increases firms' profits. This is
straightforward in the Becker and Schultz views, but also similar in the Gardner
view. In fact, in many applications, labor economists' view of human capital would
be a mixture of these three approaches. Even the Bowles-Gintis view has very similar
implications. Here, firms would pay higher wages to educated workers because these
workers will be more useful to the firm as they will obey orders better and will be
more reliable members of the firm's hierarchy. The Spence view is different from
the others, however, in that observable measures of human capital may be rewarded
because they are signals about some other characteristics of workers. We will discuss
different implications of these views below.
3. Sources of Human Capital Differences
It is useful to think of the possible sources of human capital differences before
discussing the incentives to invest in human capital:
(1) Innate ability: workers can have different amounts of skills/human capital
because of innate differences. Research in biology/social biology has
documented that there is some component of IQ which is genetic in origin (there
is a heated debate about the exact importance of this component, and some
economists have also taken part in this). The relevance of this observation
for labor economics is twofold: (i) there is likely to be heterogeneity in
human capital even when individuals have access to the same investment
opportunities and the same economic constraints; (ii) in empirical
applications, we have to find a way of dealing with this source of differences
in human capital, especially when it's likely to be correlated with other
variables of interest.
(2) Schooling: this has been the focus of much research, since it is the most
easily observable component of human capital investments. It has to be
borne in mind, however, that the $R^2$ of earnings regressions that control for
schooling is relatively small, suggesting that schooling differences account
for a relatively small fraction of the differences in earnings. Therefore,
there is much more to human capital than schooling. Nevertheless, the
analysis of schooling is likely to be very informative if we presume that
the same forces that affect schooling investments are also likely to affect
non-schooling investments. So we can infer from the patterns of schooling
investments what may be happening to non-schooling investments, which
are more difficult to observe.
(3) School quality and non-schooling investments: a pair of identical twins who
grew up in the same environment until the age of 6, and then completed
the same years of schooling may nevertheless have different amounts of
human capital. This could be because they attended different schools with
varying qualities, but it could also be the case even if they went to the same
school. In this latter case, for one reason or another, they may have chosen
to make different investments in other components of their human capital
(one may have worked harder, or studied especially for some subjects, or
because of a variety of choices/circumstances, one may have become more
assertive, better at communicating, etc.). Many economists believe that
these “unobserved” skills are very important in understanding the structure
of wages (and the changes in the structure of wages). The problem is that we
do not have good data on these components of human capital. Nevertheless,
we will see different ways of inferring what's happening to these dimensions
of human capital below.
(4) Training: this is the component of human capital that workers acquire after
schooling, often associated with some set of skills useful for a particular
industry, or useful with a particular set of technologies. At some level,
training is very similar to schooling in that the worker, at least to some
degree, controls how much to invest. But it is also much more complex,
since it is difficult for a worker to make training investments by himself.
The firm also needs to invest in the training of the workers, and often ends
up bearing a large fraction of the costs of these training investments. The
role of the firm is even greater once we take into account that training has
a significant “matching” component in the sense that it is most useful for
the worker to invest in a set of specific technologies that the firm will be
using in the future. So training is often a joint investment by firms and
workers, complicating the analysis.
(5) Pre-labor market influences: there is increasing recognition among
economists that peer group effects to which individuals are exposed before they
join the labor market may also affect their human capital significantly. At
some level, the analysis of these pre-labor market influences may be
“sociological”. But it also has an element of investment. For example, an
altruistic parent deciding where to live is also deciding whether her
offspring will be exposed to good or less good pre-labor market influences.
Therefore, some of the same issues that arise in thinking about the theory
of schooling and training will apply in this context too.
4. Human Capital Investments and The Separation Theorem
Let us start with the partial equilibrium schooling decisions and establish a
simple general result, sometimes referred to as a “separation theorem” for human
capital investments. We set up the basic model in continuous time for simplicity.
Consider the schooling decision of a single individual facing exogenously given
prices for human capital. Throughout, we assume that there are perfect capital
markets. The separation theorem referred to in the title of this section will show
that, with perfect capital markets, schooling decisions will maximize the net present
discounted value of the individual. More specifically, consider an individual with an
instantaneous utility function $u(c)$ that satisfies the standard neoclassical
assumptions. In particular, it is strictly increasing and strictly concave. Suppose that the
individual has a planning horizon of $T$ (where $T = \infty$ is allowed), discounts the
future at the rate $\rho > 0$ and faces a constant flow rate of death equal to $\nu \geq 0$.
Standard arguments imply that the objective function of this individual at time
$t = 0$ is
$$\max \int_0^T \exp(-(\rho + \nu) t)\, u(c(t))\, dt. \tag{1.1}$$
Suppose that this individual is born with some human capital $h(0) \geq 0$. Suppose
also that his human capital evolves over time according to the differential equation
$$\dot{h}(t) = G(t, h(t), s(t)), \tag{1.2}$$
where $s(t) \in [0, 1]$ is the fraction of time that the individual spends for investments
in schooling, and $G : \mathbb{R}_+^2 \times [0, 1] \to \mathbb{R}_+$ determines how human capital evolves as a
function of time, the individual's stock of human capital and schooling decisions. In
addition, we can impose a further restriction on schooling decisions, for example,
$$s(t) \in S(t), \tag{1.3}$$
where $S(t) \subset [0, 1]$ and may be useful to model constraints of the form $s(t) \in \{0, 1\}$,
which would correspond to the restriction that schooling must be full-time (or other
such restrictions on human capital investments).
The individual is assumed to face an exogenous sequence of wage per unit of
human capital given by $[w(t)]_{t=0}^{T}$, so that his labor earnings at time $t$ are
$$W(t) = w(t)\,[1 - s(t)]\,[h(t) + \omega(t)],$$
where $1 - s(t)$ is the fraction of time spent supplying labor to the market and $\omega(t)$
is non-human capital labor that the individual may be supplying to the market at
time $t$. The sequence of non-human capital labor that the individual can supply to
the market, $[\omega(t)]_{t=0}^{T}$, is exogenous. This formulation assumes that the only margin
of choice is between market work and schooling (i.e., there is no leisure).
Finally, let us assume that the individual faces a constant (flow) interest rate
equal to $r$ on his savings. Using the equation for labor earnings, the lifetime budget
constraint of the individual can be written as
$$\int_0^T \exp(-rt)\, c(t)\, dt \leq \int_0^T \exp(-rt)\, w(t)\,[1 - s(t)]\,[h(t) + \omega(t)]\, dt. \tag{1.4}$$
The Separation Theorem, which is the subject of this section, can be stated as
follows:
Theorem 1.1. (Separation Theorem) Suppose that the instantaneous utility
function $u(\cdot)$ is strictly increasing. Then the sequence $[\hat{c}(t), \hat{s}(t), \hat{h}(t)]_{t=0}^{T}$ is a
solution to the maximization of (1.1) subject to (1.2), (1.3) and (1.4) if and only if
$[\hat{s}(t), \hat{h}(t)]_{t=0}^{T}$ maximizes
$$\int_0^T \exp(-rt)\, w(t)\,[1 - s(t)]\,[h(t) + \omega(t)]\, dt \tag{1.5}$$
subject to (1.2) and (1.3), and $[\hat{c}(t)]_{t=0}^{T}$ maximizes (1.1) subject to (1.4) given
$[\hat{s}(t), \hat{h}(t)]_{t=0}^{T}$. That is, human capital accumulation and supply decisions can be
separated from consumption decisions.
Proof. To prove the “only if” part, suppose that $[\hat{s}(t), \hat{h}(t)]_{t=0}^{T}$ does not
maximize (1.5), but there exists $\hat{c}(t)$ such that $[\hat{c}(t), \hat{s}(t), \hat{h}(t)]_{t=0}^{T}$ is a solution to
(1.1). Let the value of (1.5) generated by $[\hat{s}(t), \hat{h}(t)]_{t=0}^{T}$ be denoted $Y$. Since
$[\hat{s}(t), \hat{h}(t)]_{t=0}^{T}$ does not maximize (1.5), there exists $[s(t), h(t)]_{t=0}^{T}$ reaching a value
of (1.5), $Y' > Y$. Consider the sequence $[c(t), s(t), h(t)]_{t=0}^{T}$, where $c(t) = \hat{c}(t) + \varepsilon$.
By the hypothesis that $[\hat{c}(t), \hat{s}(t), \hat{h}(t)]_{t=0}^{T}$ is a solution to (1.1), the budget
constraint (1.4) implies
$$\int_0^T \exp(-rt)\, \hat{c}(t)\, dt \leq Y.$$
Let $\varepsilon > 0$ and consider $c(t) = \hat{c}(t) + \varepsilon$ for all $t$. We have that
$$\int_0^T \exp(-rt)\, c(t)\, dt = \int_0^T \exp(-rt)\, \hat{c}(t)\, dt + \frac{[1 - \exp(-rT)]}{r}\, \varepsilon \leq Y + \frac{[1 - \exp(-rT)]}{r}\, \varepsilon.$$
Since $Y' > Y$, for $\varepsilon$ sufficiently small, $\int_0^T \exp(-rt)\, c(t)\, dt \leq Y'$ and thus $[c(t), s(t), h(t)]_{t=0}^{T}$
is feasible. Since $u(\cdot)$ is strictly increasing, $[c(t), s(t), h(t)]_{t=0}^{T}$ is strictly preferred
to $[\hat{c}(t), \hat{s}(t), \hat{h}(t)]_{t=0}^{T}$, leading to a contradiction and proving the “only if” part.
The proof of the “if” part is similar. Suppose that $[\hat{s}(t), \hat{h}(t)]_{t=0}^{T}$ maximizes
(1.5). Let the maximum value be denoted by $Y$. Consider the maximization of (1.1)
subject to the constraint that $\int_0^T \exp(-rt)\, c(t)\, dt \leq Y$. Let $[\hat{c}(t)]_{t=0}^{T}$ be a solution.
This implies that if $[c'(t)]_{t=0}^{T}$ is a sequence that is strictly preferred to $[\hat{c}(t)]_{t=0}^{T}$, then
$\int_0^T \exp(-rt)\, c'(t)\, dt > Y$. This implies that $[\hat{c}(t), \hat{s}(t), \hat{h}(t)]_{t=0}^{T}$ must be a solution
to the original problem, because any other $[s(t), h(t)]_{t=0}^{T}$ leads to a value of (1.5)
$Y' \leq Y$, and if $[c'(t)]_{t=0}^{T}$ is strictly preferred to $[\hat{c}(t)]_{t=0}^{T}$, then
$\int_0^T \exp(-rt)\, c'(t)\, dt > Y \geq Y'$ for any $Y'$ associated with any feasible $[s(t), h(t)]_{t=0}^{T}$. $\square$
The intuition for this theorem is straightforward: in the presence of perfect
capital markets, the best human capital accumulation decisions are those that maximize
the lifetime budget set of the individual. It can be shown that this theorem does
not hold when there are imperfect capital markets. Moreover, this theorem also
fails to hold when leisure is an argument of the utility function of the individual.
Nevertheless, it is a very useful benchmark as a starting point of our analysis.
5. Schooling Investments and Returns to Education
We now turn to the simplest model of schooling decisions in partial equilibrium,
which will illustrate the main tradeoffs in human capital investments. The model
presented here is a version of Mincer's (1974) seminal contribution. This model also
enables a simple mapping from the theory of human capital investments to the large
empirical literature on returns to schooling.
Let us first assume that $T = \infty$, which will simplify the expressions. The flow
rate of death, $\nu$, is positive, so that individuals have finite expected lives. Suppose
that (1.2) and (1.3) are such that the individual has to spend an interval $S$ with
$s(t) = 1$ (i.e., in full-time schooling), and $s(t) = 0$ thereafter. At the end of the
schooling interval, the individual will have a schooling level of
$$h(S) = \eta(S),$$
where $\eta(\cdot)$ is an increasing, continuously differentiable and concave function. For
$t \in [S, \infty)$, human capital accumulates over time (as the individual works) according
to the differential equation
$$\dot{h}(t) = g_h h(t), \tag{1.6}$$
for some $g_h \geq 0$. Suppose also that wages grow exponentially,
$$\dot{w}(t) = g_w w(t), \tag{1.7}$$
with boundary condition $w(0) > 0$.
Suppose that
$$g_w + g_h < r + \nu,$$
so that the net present discounted value of the individual is finite. Now using
Theorem 1.1, the optimal schooling decision must be a solution to the following
maximization problem
$$\max_S \int_S^{\infty} \exp(-(r + \nu) t)\, w(t)\, h(t)\, dt. \tag{1.8}$$
Now using (1.6) and (1.7), this is equivalent to:
$$\max_S \frac{\eta(S)\, w(0) \exp(-(r + \nu - g_w) S)}{r + \nu - g_h - g_w}. \tag{1.9}$$
Since $\eta(S)$ is concave, the objective function in (1.9) is strictly concave.
Therefore, the unique solution to this problem is characterized by the first-order condition
$$\frac{\eta'(S^*)}{\eta(S^*)} = r + \nu - g_w. \tag{1.10}$$
Equation (1.10) shows that higher interest rates and higher values of $\nu$
(corresponding to shorter planning horizons) reduce human capital investments, while
higher values of $g_w$ increase the value of human capital and thus encourage further
investments.
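The first-order condition (1.10) can be checked numerically. The sketch below assumes a hypothetical schooling technology $\eta(S) = S^a$ with $0 < a \leq 1$, which is not part of the text; the values of `a` and `g_h` are likewise illustrative assumptions. Under this functional form (1.10) reduces to $a / S^* = r + \nu - g_w$, and a brute-force maximization of (1.9) serves as a cross-check.

```python
import math


def eta(S, a):
    """Hypothetical schooling technology eta(S) = S**a (an assumption of this sketch)."""
    return S ** a


def objective(S, a, r, nu, g_w, g_h, w0=1.0):
    """Value of the objective in (1.9) at schooling length S."""
    return eta(S, a) * w0 * math.exp(-(r + nu - g_w) * S) / (r + nu - g_h - g_w)


def schooling_closed_form(a, r, nu, g_w):
    """Solve (1.10) for eta(S) = S**a: eta'(S)/eta(S) = a/S = r + nu - g_w."""
    return a / (r + nu - g_w)


def schooling_grid_search(a, r, nu, g_w, g_h, step=0.01, s_max=40.0):
    """Maximize (1.9) over a grid of schooling lengths, as a cross-check."""
    grid = [step * k for k in range(1, int(s_max / step) + 1)]
    return max(grid, key=lambda S: objective(S, a, r, nu, g_w, g_h))


# Illustrative values only: a and g_h are assumptions made for this sketch.
a, r, nu, g_w, g_h = 0.8, 0.10, 0.02, 0.02, 0.01

s_star = schooling_closed_form(a, r, nu, g_w)
s_grid = schooling_grid_search(a, r, nu, g_w, g_h)
print(f"closed form S* = {s_star:.2f}, grid search S* = {s_grid:.2f}")
```

Raising `r` or `nu` in `schooling_closed_form` lowers the chosen schooling length, and raising `g_w` increases it, which mirrors the comparative statics stated after (1.10).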
Integrating both sides of this equation with respect to $S$, we obtain
$$\ln \eta(S^*) = \text{constant} + (r + \nu - g_w)\, S^*. \tag{1.11}$$
Now note that the wage earnings of the worker of age $\tau \geq S^*$ in the labor market
at time $t$ will be given by
$$W(S, t) = \exp(g_w t) \exp(g_h (t - S))\, \eta(S).$$
Taking logs and using equation (1.11) implies that the earnings of the worker will
be given by
$$\ln W(S^*, t) = \text{constant} + (r + \nu - g_w)\, S^* + g_w t + g_h (t - S^*),$$
where $t - S^*$ can be thought of as worker experience (time after schooling). If we
make a cross-sectional comparison across workers, the time trend term, $g_w t$, will
also go into the constant, so that we obtain the canonical Mincer equation where,
in the cross section, log wage earnings are proportional to schooling and experience.
Written differently, we have the following cross-sectional equation
$$\ln W_j = \text{constant} + \gamma_s S_j + \gamma_e\, \text{experience}, \tag{1.12}$$
where $j$ refers to individual $j$. Note however that we have not introduced any source
of heterogeneity that can generate different levels of schooling across individuals.
Nevertheless, equation (1.12) is important, since it is the typical empirical model
for the relationship between wages and schooling estimated in labor economics.
The economic insight provided by this equation is quite important; it suggests
that the functional form of the Mincerian wage equation is not just a mere
coincidence, but has economic content: the opportunity cost of one more year of
schooling is foregone earnings. This implies that the benefit has to be
commensurate with these foregone earnings, and thus should lead to a proportional increase in
earnings in the future. In particular, this proportional increase should be at the rate
$(r + \nu - g_w)$.
Empirical work using equations of the form (1.12) leads to estimates for $\gamma_s$ in
the range of 0.06 to 0.10. Equation (1.12) suggests that these returns to schooling
are not unreasonable. For example, we can think of the annual interest rate $r$ as
approximately 0.10, $\nu$ as corresponding to 0.02, which gives an expected life of 50
years, and $g_w$ as corresponding to the rate of wage growth holding the human capital
level of the individual constant, which should be approximately 2%. Thus
we should expect an estimate of $\gamma_s$ around 0.10, which is consistent with the upper
range of the empirical estimates.
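The arithmetic behind this calibration can be written out explicitly. The numbers are those quoted in the text; reading $1/\nu$ as the expected length of life relies on the constant flow rate of death assumed in the model.

```latex
\[
\gamma_s \;=\; r + \nu - g_w \;=\; 0.10 + 0.02 - 0.02 \;=\; 0.10,
\qquad
\text{expected life} \;=\; \frac{1}{\nu} \;=\; \frac{1}{0.02} \;=\; 50 \text{ years}.
\]
```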
6. A Simple Two-Period Model of Schooling Investments and Some Evidence
Let us now step back and illustrate these ideas using a two-period model and then
use this model to look at some further evidence. In period 1 an individual (parent)
works, consumes $c$, saves $s$, decides whether to send their offspring to school, $e = 0$
or $1$, and then dies at the end of the period. Utility of household $i$ is given as:
$$\ln c_i + \ln \hat{c}_i \tag{1.13}$$
where $\hat{c}$ is the consumption of the offspring. There is heterogeneity among children,
so the cost of education, $\theta_i$, varies with $i$. In the second period skilled individuals
(those with education) receive a wage $w_s$ and an unskilled worker receives $w_u$.
First, consider the case in which there are no credit market problems, so parents
can borrow on behalf of their children, and when they do so, they pay the same
interest rate, $r$, as the rate they would obtain by saving. Then, the decision problem
of the parent with income $y_i$ is to maximize (1.13) with respect to $e_i$, $c_i$ and $\hat{c}_i$,
subject to the budget constraint:
$$c_i + \frac{\hat{c}_i}{1 + r} \leq \frac{w_u}{1 + r} + e_i\, \frac{w_s - w_u}{1 + r} + y_i - e_i \theta_i$$
Note that $e_i$ does not appear in the objective function, so the education decision will
be made simply to maximize the budget set of the consumer. This is the essence of
the Separation Theorem, Theorem 1.1 above. In particular, here parents will choose
to educate their offspring only if
$$\theta_i \leq \frac{w_s - w_u}{1 + r} \tag{1.14}$$
One important feature of this decision rule is that a greater skill premium, as
captured by $w_s - w_u$, will encourage schooling, while a higher interest rate, $r$, will
discourage schooling (since schooling is a form of investment with upfront costs and
delayed benefits).
In practice, this solution may be difficult to achieve for a variety of reasons.
First, there is the usual list of informational/contractual problems, creating credit
constraints or transaction costs that introduce a wedge between borrowing and
lending rates (or even make borrowing impossible for some groups). Second, in many
cases, it is the parents who make part of the investment decisions for their children,
so the above solution involves parents borrowing to finance both the education
expenses and also part of their own current consumption. These loans are then
supposed to be paid back by their children. With the above setup, this arrangement
works since parents are fully altruistic. However, if there are non-altruistic parents,
this will create obvious problems.
Therefore, in many situations credit problems might be important. Now imagine
the same setup, but also assume that parents cannot have negative savings, which
is a simple and severe form of credit market problems. This modifies the constraint
set as follows
$$c_i \leq y_i - e_i \theta_i - s_i$$
$$s_i \geq 0$$
$$\hat{c}_i \leq w_u + e_i (w_s - w_u) + (1 + r)\, s_i$$
First note that for a parent with $y_i - e_i \theta_i > w_s$, the constraint of nonnegative
savings is not binding, so the same solution as before will apply. Therefore, credit
constraints will only affect parents who needed to borrow to finance their children's
education.
To characterize the solution to this problem, let us look at the utilities from
investing and not investing in education of a parent. Also, to simplify the
discussion, let us focus on parents who would not choose positive savings, that is, those
parents with $(1 + r)\, y_i \leq w_u$. The utilities from investing and not investing in
education are given, respectively, by $U(e = 1 \mid y_i, \theta_i) = \ln(y_i - \theta_i) + \ln w_s$, and
$U(e = 0 \mid y_i, \theta_i) = \ln y_i + \ln w_u$. Comparison of these two expressions implies that
parents with
$$\theta_i \leq y_i\, \frac{w_s - w_u}{w_s}$$
will invest in education. It is then straightforward to verify that:
(1) This condition is more restrictive than (1.14) above, since $(1 + r)\, y_i \leq w_u < w_s$.
(2) As income increases, there will be more investment in education, which
contrasts with the non-credit-constrained case.
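The two decision rules can be compared side by side in a short sketch. The function names and all numerical values are hypothetical and chosen only so that $(1 + r)\, y_i \leq w_u$ holds, which is the case the text focuses on; the rules themselves are (1.14) and the utility comparison above.

```python
import math


def invests_unconstrained(theta, w_s, w_u, r):
    """Rule (1.14): educate iff theta <= (w_s - w_u) / (1 + r); income plays no role."""
    return theta <= (w_s - w_u) / (1 + r)


def invests_constrained(theta, y, w_s, w_u):
    """Zero-savings parent: compare ln(y - theta) + ln(w_s) with ln(y) + ln(w_u)."""
    if theta >= y:
        return False  # education is unaffordable without borrowing
    utility_invest = math.log(y - theta) + math.log(w_s)
    utility_not = math.log(y) + math.log(w_u)
    return utility_invest >= utility_not


def constrained_threshold(y, w_s, w_u):
    """Largest education cost a zero-savings parent accepts: y * (w_s - w_u) / w_s."""
    return y * (w_s - w_u) / w_s


# Hypothetical values satisfying (1 + r) * y <= w_u for both income levels.
w_s, w_u, r, theta = 3.0, 1.5, 0.5, 0.45

for y in (0.8, 1.0):
    print(
        f"y = {y}: unconstrained invests = {invests_unconstrained(theta, w_s, w_u, r)}, "
        f"constrained invests = {invests_constrained(theta, y, w_s, w_u)}, "
        f"constrained cost threshold = {constrained_threshold(y, w_s, w_u):.2f}"
    )
```

With these values the unconstrained rule gives the same answer at both income levels, while the constrained parent invests only at the higher income, which illustrates points (1) and (2).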
One interesting implication of the setup with credit constraints is that the skill
premium, $w_s - w_u$, still has a positive effect on human capital investments.
However, in more general models with credit constraints, the conclusions may be more
nuanced. For example, if $w_s - w_u$ increases because the unskilled wage, $w_u$, falls,
this may reduce the income level of many of the households that are marginal for
the education decision, and thus discourage investment in education.
7. Evidence on Human Capital Investments and Credit Constraints
This finding, that income only matters for education investments in the presence
of credit constraints, motivates investigations of whether there are significant
differences in the educational attainment of children from different parental backgrounds
as a test of the importance of credit constraints on education decisions. In addition,
the empirical relationship between family income and education is interesting in its
own right.
A typical regression would be along the lines of
$$\text{schooling} = \text{controls} + \alpha \cdot \log(\text{parental income})$$
which leads to positive estimates of $\alpha$, consistent with credit constraints. The
problem is that there are at least two alternative explanations for why we may be
estimating a positive $\alpha$:
(1) Children's education may also be a consumption good, so rich parents will
“consume” more of this good as well as other goods. If this is the case,
the positive relationship between family income and education is not
evidence in favor of credit constraints, since the “separation theorem” does
not apply when the decision is not a pure investment (that is, when education
enters the utility function directly). Nevertheless, the implications for labor
economics are quite similar: richer parents will invest more in their children's education.
(2) The second issue is more problematic. The distribution of costs and
benefits of education differs across families, and these are likely to be correlated with
income. That is, the parameter $\theta_i$ in terms of the model above will be
correlated with $y_i$, so a regression of schooling on income will, at least in
part, capture the direct effect of different costs and benefits of education.
One line of attack to deal with this problem has been to include other
characteristics that could proxy for the costs and benefits of education, or attitudes
toward education. The interesting finding here is that when parents' education is
also included in the regression, the role of income is substantially reduced.
Does this mean that credit market problems are not important for education?
Does it mean that parents' income does not have a direct effect on education? Not
necessarily. In particular, there are two reasons why such an interpretation may
not be warranted.
(1) First, parents' income may affect the quality rather than the quantity of
education. This may be particularly important in the U.S. context where
the choice of the neighborhood in which the family lives appears to have
a major effect on the quality of schooling. This implies that in the United
States high income parents may be “buying” more human capital for their
children, not by sending them to school for longer, but by providing them
with better schooling.
(2) Parental income is often measured with error, and has a significant
transitory component, so parental education may be a much better proxy for
permanent income than income observations in these data sets. Therefore,
even when income matters for education, all its effect may load on the
parental education variable.
Neither problem is easy to deal with, but there are possible avenues. First, we could look at the incomes of children rather than their schooling as the outcome variable. To the extent that income reflects skills (broadly defined), it will incorporate unobserved dimensions of human capital, including school quality. This takes us to the literature on intergenerational mobility. The typical regression here is
(1.15) $\log(\text{child income}) = \text{controls} + \alpha \cdot \log(\text{parental income})$
Regressions of this sort were first investigated by Becker and Tomes. They found relatively small coefficients, typically in the neighborhood of 0.3 (while others, for example Behrman and Taubman, estimated coefficients as low as 0.2). This means that if your parents are twice as rich as my parents, you will typically have about 30 to 40 percent higher income than me. With this degree of intergenerational dependence, differences in initial conditions will soon disappear. In fact, your children will be typically about 10 percent ($\alpha^2$ percent) richer than my children. So this finding implies that we are living in a relatively “egalitarian” society.
To see this more clearly, consider the following simple model:
$\ln y_t = \mu + \alpha \ln y_{t-1} + \varepsilon_t$,
where $y_t$ is the income of the $t$-th generation, and $\varepsilon_t$ is a serially independent disturbance term with variance $\sigma_\varepsilon^2$. Then the long-term variance of log income is:
(1.16) $\sigma_y^2 = \frac{\sigma_\varepsilon^2}{1 - \alpha^2}$
Using the estimate of 0.3 for $\alpha$, equation (1.16) implies that the long-term variance of log income will be approximately 10 percent higher than $\sigma_\varepsilon^2$, so the long-run income distribution will basically reflect transitory shocks to dynasties’ incomes and skills, and not inherited differences.
Returning to the interpretation of $\alpha$ in equation (1.15), also note that a degree of persistence in the neighborhood of 0.3 is not very different from what we might expect to result simply from the inheritance of IQ between parents and children, or from the children’s adoption of cultural values favoring education from their parents. As a result, these estimates suggest that there is a relatively small effect of parents’ income on children’s human capital.
This work has been criticized, however, because there are certain simple biases stacking the cards against finding large estimates of the coefficient $\alpha$. First, measurement error will bias the coefficient $\alpha$ towards zero. Second, in typical panel data sets, we observe children at an early stage of their life cycles, where differences in earnings may be less than at later stages, again biasing $\alpha$ downward. Third, income mobility may be very nonlinear, with a lot of mobility among middle-income families, but very little at the tails. Work by Solon and Zimmerman has dealt with the first two problems. They find that controlling for these issues increases the degree of persistence substantially, to about 0.45 or even 0.55. The next figure shows Solon’s baseline estimates.
Figure 1.1
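The first bias listed above, attenuation from measurement error, can be illustrated with a small simulation. The following is only a sketch under assumed, purely hypothetical numbers (a true coefficient of 0.5, unit-variance true log parental income, and unit-variance classical measurement error); none of these values come from the studies cited in the text. With these choices the estimated slope on the mismeasured regressor should be pulled toward roughly half of the true value.

```python
import random


def ols_slope(xs, ys):
    """Slope from a bivariate least-squares regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov_xy / var_x


def simulate_attenuation(alpha=0.5, noise_sd=1.0, n=20000, seed=0):
    """Return (slope on true income, slope on mismeasured income).

    All parameter values are hypothetical and chosen for illustration only.
    """
    rng = random.Random(seed)
    # True log parental income (variance 1).
    parent = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Log child income generated as in (1.15), without controls.
    child = [alpha * p + rng.gauss(0.0, 0.5) for p in parent]
    # Observed parental income = truth + classical measurement error.
    parent_obs = [p + rng.gauss(0.0, noise_sd) for p in parent]
    return ols_slope(parent, child), ols_slope(parent_obs, child)


slope_true, slope_noisy = simulate_attenuation()
print(f"slope with true income:        {slope_true:.3f}")
print(f"slope with mismeasured income: {slope_noisy:.3f}")
```

Under classical measurement error the slope shrinks by the factor var(true income) / (var(true income) + var(error)), which is 1/2 with the assumed variances.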
A paper by Cooper, Durlauf and Johnson, in turn, finds that there is much more persistence at the top and the bottom of the income distribution than at the middle.
That the difference between 0.3 and 0.55 is in fact substantial can be seen by looking at the implications of using $\alpha = 0.55$ in (1.16). Now the long-run income distribution will be substantially more dispersed than the transitory shocks. More specifically, we will have $\sigma_y^2 \approx 1.45 \cdot \sigma_\varepsilon^2$.
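The arithmetic behind equation (1.16) is easy to check numerically. The sketch below simply evaluates the ratio $\sigma_y^2 / \sigma_\varepsilon^2 = 1/(1 - \alpha^2)$ for the persistence values mentioned in the text; the function name is an illustrative choice, not part of the original notes.

```python
def long_run_variance_ratio(alpha):
    """Ratio sigma_y^2 / sigma_eps^2 implied by equation (1.16).

    The stationary variance exists only for |alpha| < 1.
    """
    if not -1.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between -1 and 1")
    return 1.0 / (1.0 - alpha ** 2)


for alpha in (0.2, 0.3, 0.45, 0.55):
    ratio = long_run_variance_ratio(alpha)
    print(f"alpha = {alpha:.2f}: long-run variance is {ratio:.2f} x sigma_eps^2")
```

For $\alpha = 0.3$ the ratio is about 1.10 (the "10 percent higher" figure), while for $\alpha = 0.55$ it is a little above 1.4, in line with the approximate value quoted above.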
To deal with the second empirical issue, one needs a source of exogenous variation in incomes to implement an IV strategy. There are no perfect candidates, but some imperfect ones exist. One possibility, pursued in Acemoglu and Pischke (2001), is to exploit changes in the income distribution that have taken place over the past 30 years to get a source of exogenous variation in household income. The basic idea is that the rank of a family in the income distribution is a good proxy for parental human capital, and conditional on that rank, the income gap has widened over the past 20 years. Moreover, this has happened differentially across states. One can exploit this source of variation by estimating regressions of the form
(1.17) $s_{iqjt} = \delta_q + \delta_j + \delta_t + \beta_q \ln y_{iqjt} + \varepsilon_{iqjt}$,
where $q$ denotes income quartile, $j$ denotes region, and $t$ denotes time. $s_{iqjt}$ is the education of individual $i$ in income quartile $q$, region $j$, and time $t$. With no effect of income on education, the $\beta_q$’s should be zero. With credit constraints, we might expect lower quartiles to have positive $\beta$’s. Acemoglu and Pischke report versions of this equation using data aggregated to income quartile, region and time cells. The estimates of $\beta$ are typically positive and significant, as shown in the next two tables.
However, the evidence does not indicate that the $\beta$’s are higher for lower income quartiles, which suggests that there may be more to the relationship between income and education than simple credit constraints. Potential determinants of the relationship between income and education have already been discussed extensively in the literature, but we still do not have a satisfactory understanding of why parental income may affect children’s educational outcomes (and to what extent it does so).
8. The Ben-Porath Model
The baseline Ben-Porath model enriches the models we have seen so far by allowing human capital investments and non-trivial labor supply decisions throughout the lifetime of the individual. It also acts as a bridge to models of investment in human capital on-the-job, which we will discuss below.
Let $s(t) \in [0, 1]$ for all $t \geq 0$. Together with the Mincer equation (1.12) above, the Ben-Porath model is the basis of much of labor economics. Here it is sufficient to consider a simple version of this model where the human capital accumulation equation, (1.2), takes the form
(1.18) $\dot{h}(t) = \phi(s(t) h(t)) - \delta_h h(t)$,
where $\delta_h > 0$ captures “depreciation of human capital,” for example because new machines and techniques are being introduced, eroding the existing human capital of the worker. The individual starts with an initial value of human capital $h(0) > 0$. The function $\phi : \mathbb{R}_+ \to \mathbb{R}_+$ is strictly increasing, continuously differentiable and strictly concave. Furthermore, we simplify the analysis by assuming that this function satisfies the Inada-type conditions,
$\lim_{x \to 0} \phi'(x) = \infty$ and $\lim_{x \to h(0)} \phi'(x) = 0$.
The latter condition makes sure that we do not have to impose additional constraints to ensure $s(t) \in [0, 1]$.
Figure 1.2
Let us also suppose that there is no non-human capital component of labor, so that $\omega(t) = 0$ for all $t$, that $T = \infty$, and that there is a flow rate of death $\nu > 0$. Finally, we assume that the wage per unit of human capital is constant at $w$ and the interest rate is constant and equal to $r$. We also normalize $w = 1$ without loss of any generality.
Again using Theorem 1.1, human capital investments can be determined as a solution to the following problem
$\max \int_0^\infty \exp(-(r + \nu) t) \, (1 - s(t)) \, h(t) \, dt$
subject to (1.18).
This problem can then be solved by setting up the current-value Hamiltonian, which in this case takes the form
$\mathcal{H}(h, s, \mu) = (1 - s(t)) h(t) + \mu(t) \left( \phi(s(t) h(t)) - \delta_h h(t) \right)$,
where we used $\mathcal{H}$ to denote the Hamiltonian to avoid confusion with human capital. The necessary conditions for an optimal solution to this problem are
$\mathcal{H}_s(h, s, \mu) = -h(t) + \mu(t) h(t) \phi'(s(t) h(t)) = 0$
$\mathcal{H}_h(h, s, \mu) = (1 - s(t)) + \mu(t) \left( s(t) \phi'(s(t) h(t)) - \delta_h \right) = (r + \nu) \mu(t) - \dot{\mu}(t)$
$\lim_{t \to \infty} \exp(-(r + \nu) t) \, \mu(t) h(t) = 0$.
To solve for the optimal path of human capital investments, let us adopt the following transformation of variables:
$x(t) \equiv s(t) h(t)$.
Instead of $s(t)$ (or $\mu(t)$) and $h(t)$, we will study the dynamics of the optimal path in $x(t)$ and $h(t)$.
The first necessary condition then implies that
(1.19) $1 = \mu(t) \phi'(x(t))$,
while the second necessary condition can be expressed as
$\frac{\dot{\mu}(t)}{\mu(t)} = r + \nu + \delta_h - s(t) \phi'(x(t)) - \frac{1 - s(t)}{\mu(t)}$.
Substituting for $\mu(t)$ from (1.19), and simplifying, we obtain
(1.20) $\frac{\dot{\mu}(t)}{\mu(t)} = r + \nu + \delta_h - \phi'(x(t))$.
The steady-state (stationary) solution of this optimal control problem involves $\dot{\mu}(t) = 0$ and $\dot{h}(t) = 0$, and thus implies that
(1.21) $x^* = \phi'^{-1}(r + \nu + \delta_h)$,
where $\phi'^{-1}(\cdot)$ is the inverse function of $\phi'(\cdot)$ (which exists and is strictly decreasing since $\phi(\cdot)$ is strictly concave). This equation shows that $x^* \equiv s^* h^*$ will be higher when the interest rate is low, when the life expectancy of the individual is high, and when the rate of depreciation of human capital is low.
To determine $s^*$ and $h^*$ separately, we set $\dot{h}(t) = 0$ in the human capital accumulation equation (1.18), which gives
(1.22) $h^* = \frac{\phi(x^*)}{\delta_h} = \frac{\phi\left(\phi'^{-1}(r + \nu + \delta_h)\right)}{\delta_h}$.
Since $\phi'^{-1}(\cdot)$ is strictly decreasing and $\phi(\cdot)$ is strictly increasing, this equation implies that the steady-state solution for the human capital stock is uniquely determined and is decreasing in $r$, $\nu$ and $\delta_h$.
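To make the comparative statics in (1.21) and (1.22) concrete, here is a minimal numerical sketch. It assumes a specific technology, $\phi(x) = x^\gamma$ with $0 < \gamma < 1$, which is not taken from the text: it is strictly increasing and strictly concave and satisfies the Inada condition at zero, but it does not satisfy the second Inada-type condition, so the code checks $s^* \le 1$ by computing $s^*$ directly. The parameter values are hypothetical.

```python
def steady_state(r, nu, delta_h, gamma=0.5):
    """Steady state (x*, h*, s*) under the ASSUMED technology phi(x) = x**gamma."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie in (0, 1) for strict concavity")
    user_cost = r + nu + delta_h
    # phi'(x) = gamma * x**(gamma - 1); invert phi'(x*) = r + nu + delta_h as in (1.21).
    x_star = (gamma / user_cost) ** (1.0 / (1.0 - gamma))
    # Equation (1.22): h* = phi(x*) / delta_h.
    h_star = x_star ** gamma / delta_h
    # By definition x* = s* h*.
    s_star = x_star / h_star
    return x_star, h_star, s_star


# Hypothetical parameter values, for illustration only.
for r in (0.03, 0.05, 0.08):
    x_star, h_star, s_star = steady_state(r=r, nu=0.02, delta_h=0.05)
    print(f"r = {r:.2f}: x* = {x_star:.2f}, h* = {h_star:.2f}, s* = {s_star:.3f}")
```

With this functional form $s^* = \gamma \delta_h / (r + \nu + \delta_h)$, which always lies strictly between 0 and 1, and both $x^*$ and $h^*$ fall as $r$, $\nu$ or $\delta_h$ rises.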
More interesting than the stationary (steady-state) solution to the optimization problem is the time path of human capital investments in this model. To derive this, differentiate (1.19) with respect to time to obtain
$\frac{\dot{\mu}(t)}{\mu(t)} = \varepsilon_{\phi'}(x) \frac{\dot{x}(t)}{x(t)}$,
where
$\varepsilon_{\phi'}(x) = -\frac{x \phi''(x)}{\phi'(x)} > 0$
is the elasticity of the function $\phi'(\cdot)$ and is positive since $\phi'(\cdot)$ is strictly decreasing (thus $\phi''(\cdot) < 0$). Combining this equation with (1.20), we obtain
(1.23) $\frac{\dot{x}(t)}{x(t)} = \frac{1}{\varepsilon_{\phi'}(x(t))} \left( r + \nu + \delta_h - \phi'(x(t)) \right)$.
Figure 1.4 plots (1.18) and (1.23) in the $h$-$x$ space. The upward-sloping curve corresponds to the locus for $\dot{h}(t) = 0$, while (1.23) can only be zero at $x^*$, thus the locus for $\dot{x}(t) = 0$ corresponds to the horizontal line in the figure. The arrows of motion are also plotted in this phase diagram and make it clear that the steady-state solution $(h^*, x^*)$ is globally saddle-path stable, with the stable arm coinciding with the horizontal line for $\dot{x}(t) = 0$. Starting with $h(0) \in (0, h^*)$, $s(0)$ jumps to the level necessary to ensure $s(0) h(0) = x^*$. From then on, $h(t)$ increases and $s(t)$ decreases so as to keep $s(t) h(t) = x^*$. Therefore, the pattern of human capital investments implied by the Ben-Porath model is one of high investment at the beginning of an individual’s life followed by lower investments later on.
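The dynamics along the stable arm can be traced out explicitly. Since $s(t) h(t) = x^*$ throughout, (1.18) reduces to the linear equation $\dot{h}(t) = \phi(x^*) - \delta_h h(t)$, whose solution is $h(t) = h^* + (h(0) - h^*) e^{-\delta_h t}$ with $h^* = \phi(x^*)/\delta_h$. The sketch below evaluates this path; the numbers passed in ($x^* = 16$, $\phi(x^*) = 4$, $\delta_h = 0.05$, $h(0) = 20$) are hypothetical and would correspond, for instance, to an assumed $\phi(x) = \sqrt{x}$. It also requires $h(0) \ge x^*$ so that $s(0) \le 1$.

```python
import math


def stable_arm_path(h0, x_star, phi_x_star, delta_h, times):
    """Path of (t, h(t), s(t)) along the stable arm x(t) = x*.

    h0, x_star, phi_x_star and delta_h are illustrative inputs supplied by the caller.
    """
    h_star = phi_x_star / delta_h  # steady state from (1.22)
    if not x_star <= h0 < h_star:
        raise ValueError("need x* <= h(0) < h* so that s(0) <= 1 and h rises")
    path = []
    for t in times:
        # Closed-form solution of h'(t) = phi(x*) - delta_h * h(t).
        h_t = h_star + (h0 - h_star) * math.exp(-delta_h * t)
        # Investment share adjusts to keep s(t) * h(t) = x*.
        s_t = x_star / h_t
        path.append((t, h_t, s_t))
    return path


for t, h_t, s_t in stable_arm_path(20.0, 16.0, 4.0, 0.05, [0, 10, 20, 40, 80]):
    print(f"t = {t:3d}: h(t) = {h_t:6.2f}, s(t) = {s_t:.3f}")
```

The output shows the pattern described above: $h(t)$ rises monotonically toward $h^*$ while the share of time devoted to investment, $s(t)$, falls.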
In our simplified version of the Ben-Porath model this all happens smoothly. In the original Ben-Porath model, which involves the use of other inputs in the production of human capital and finite horizons, the constraint $s(t) \leq 1$ typically binds early on in the life of the individual, and the interval during which $s(t) = 1$ can be interpreted as full-time schooling. After full-time schooling, the individual starts working (i.e., $s(t) < 1$). But even on-the-job, the individual continues to accumulate human capital (i.e., $s(t) > 0$), which can be interpreted as spending time in training programs or allocating some of his time on the job to learning rather than production. Moreover, because the horizon is finite, if the Inada conditions were relaxed, the individual could prefer to stop investing in human capital at some point. As a result, the time path of human capital generated by the standard Ben-Porath model may be hump-shaped, with a possibly declining portion at the end. Instead, the path of human capital (and the earning potential of the individual) in the current model is always increasing, as shown in Figure 1.5.
The importance of the Ben-Porath model is twofold. First, it emphasizes that schooling is not the only way in which individuals can invest in human capital and there is a continuity between schooling investments and other investments in human capital. Second, it suggests that in societies where schooling investments are high we may also expect higher levels of on-the-job investments in human capital. Thus there may be systematic mismeasurement of the amount or the quality of human capital across societies.
This model also provides us with a useful way of thinking of the lifecycle of the individual, which starts with higher investments in schooling, followed by a period of “full-time” work (where $s(t)$ is low), but this is still accompanied by investment in human capital and thus increasing earnings. The increase in earnings takes place at a slower rate as the individual ages. There is also some evidence that earnings may start falling at the very end of workers’ careers, though this does not happen in the simplified version of the model presented here (how would you modify it to make sure that earnings may fall in equilibrium?).
The available evidence is consistent with the broad patterns suggested by the model. Nevertheless, this evidence comes from cross-sectional age-experience profiles, so it has to be interpreted with some caution (in particular, the decline at the very end of an individual’s life cycle that is found in some studies may be due to “selection,” as the higher-ability workers retire earlier).
Perhaps more worrisome for this interpretation is the fact that the increase in earnings may reflect not the accumulation of human capital due to investment, but either:
(1) simple age effects: individuals become more productive as they get older; or
(2) simple experience effects: individuals become more productive as they get more experienced, independently of whether they choose to invest or not.
It is difficult to distinguish between the Ben-Porath model and the second explanation. But there is some evidence that could be useful to distinguish between age effects vs. experience effects (automatic or due to investment).
Josh Angrist’s paper on Vietnam veterans basically shows that workers who served in the Vietnam War lost the experience premium associated with the years they served in the war. This is shown in the next figure.
Presuming that serving in the war has no productivity effects, this evidence suggests that much of the age-earnings profile is due to experience, not simply due to age. Nevertheless, this evidence is consistent both with direct experience effects on worker productivity, and also with a Ben-Porath type explanation where workers are purposefully investing in their human capital while working, and experience is proxying for these investments.
9. Selection and Wages–The One-Factor Model
Issues of selection bias arise often in the analysis of education, migration, labor supply, and sectoral choice decisions. This section illustrates the basic issues of selection using a single-index model, where each individual possesses a one-dimensional skill. Richer models, such as the famous Roy model of selection, incorporate multi-dimensional skills. While models with multi-dimensional skills make a range of additional predictions, the major implications of selection for interpreting wage differences across different groups can be derived using the single-index model.
Suppose that individuals are distinguished by an unobserved type, $z$, which is assumed to be distributed uniformly between 0 and 1. Individuals decide whether to obtain education, which costs $c$. The wage of an individual of type $z$ when he has no education is
$w_0(z) = z$
and when he obtains education, it is
(1.24) $w_1(z) = \alpha_0 + \alpha_1 z$,
where $\alpha_0 > 0$ and $\alpha_1 > 1$. $\alpha_0$ is the main effect of education on earnings, which applies irrespective of ability, whereas $\alpha_1$ interacts with ability. The assumption that $\alpha_1 > 1$ implies that education is complementary to ability, and will ensure that high-ability individuals are “positively selected” into education.
Individuals make their schooling choices to maximize income. It is straightforward to see that all individuals of type $z \geq z^*$ will obtain education, where
$z^* \equiv \frac{c - \alpha_0}{\alpha_1 - 1}$,
which, to make the analysis interesting, we assume lies between 0 and 1. Figure 1.7 gives the wage distribution in this economy.
Now let us look at mean wages by education group. By standard arguments, these are
$\bar{w}_0 = \frac{c - \alpha_0}{2(\alpha_1 - 1)}$
$\bar{w}_1 = \alpha_0 + \alpha_1 \frac{\alpha_1 - 1 + c - \alpha_0}{2(\alpha_1 - 1)}$
It is clear that $\bar{w}_1 - \bar{w}_0 > \alpha_0$, so the wage gap between educated and uneducated groups is greater than the main effect of education in equation (1.24), since $\alpha_1 - 1 > 0$. This reflects two components. First, the return to education is not $\alpha_0$, but it is $\alpha_0 + (\alpha_1 - 1) \cdot z$ for individual $z$. Therefore, for a group of mean ability $\bar{z}$, the return to education is
$w_1(\bar{z}) - w_0(\bar{z}) = \alpha_0 + (\alpha_1 - 1) \bar{z}$,
which we can simply think of as the return to education evaluated at the mean ability of the group.
But there is one more component in $\bar{w}_1 - \bar{w}_0$, which results from the fact that the average ability of the two groups is not the same, and the earning differences resulting from this ability gap are being counted as part of the returns to education. In fact, since $\alpha_1 - 1 > 0$, high-ability individuals are selected into education, increasing the wage differential. To see this, rewrite the observed wage differential as follows
$\bar{w}_1 - \bar{w}_0 = \alpha_0 + (\alpha_1 - 1) \left[ \frac{c - \alpha_0}{2(\alpha_1 - 1)} \right] + \frac{\alpha_1}{2}$
Here, the first two terms give the return to education evaluated at the mean ability of the uneducated group. This would be the answer to the counterfactual question of how much the earnings of the uneducated group would increase if they were to obtain education. The third term is the additional effect that results from the fact that the two groups do not have the same ability level. It is therefore the selection effect. Alternatively, we could have written
$\bar{w}_1 - \bar{w}_0 = \alpha_0 + (\alpha_1 - 1) \left[ \frac{\alpha_1 - 1 + c - \alpha_0}{2(\alpha_1 - 1)} \right] + \frac{1}{2}$,
where now the first two terms give the return to education evaluated at the mean ability of the educated group, which is greater than the return to education evaluated at the mean ability level of the uneducated group. So the selection effect is somewhat smaller, but still positive.
This example illustrates how looking at observed averages, without taking selection into account, may give misleading results, and also provides a simple example of how to think of decisions in the presence of this type of heterogeneity.
It is also interesting to note that if $\alpha_1 < 1$, we would have negative selection into education, and observed returns to education would be less than the true returns. The case of $\alpha_1 < 1$ appears less plausible, but may arise if high-ability individuals do not need to obtain education to perform certain tasks.
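The two decompositions of the observed wage gap in the one-factor model can be verified numerically. The sketch below uses hypothetical parameter values ($\alpha_0 = 0.1$, $\alpha_1 = 1.5$, $c = 0.35$, so that $z^* = 0.5$) and relies only on the uniform distribution of $z$: the uneducated group has mean ability $z^*/2$ and the educated group has mean ability $(1 + z^*)/2$. The function and key names are illustrative choices.

```python
def one_factor_decomposition(alpha0, alpha1, c):
    """Observed wage gap and its two decompositions in the one-factor model."""
    if alpha1 <= 1.0:
        raise ValueError("this sketch covers the positive-selection case alpha1 > 1")
    z_star = (c - alpha0) / (alpha1 - 1.0)  # education cutoff
    if not 0.0 < z_star < 1.0:
        raise ValueError("cutoff z* must lie strictly between 0 and 1")
    zbar_uneducated = z_star / 2.0          # mean of U[0, z*]
    zbar_educated = (1.0 + z_star) / 2.0    # mean of U[z*, 1]
    w0_bar = zbar_uneducated                       # mean of w0(z) = z
    w1_bar = alpha0 + alpha1 * zbar_educated       # mean of w1(z), equation (1.24)
    gap = w1_bar - w0_bar
    # Return to education, alpha0 + (alpha1 - 1) * z, at each group's mean ability.
    return_at_uneducated = alpha0 + (alpha1 - 1.0) * zbar_uneducated
    return_at_educated = alpha0 + (alpha1 - 1.0) * zbar_educated
    return {
        "z_star": z_star,
        "gap": gap,
        "return_at_uneducated": return_at_uneducated,
        "selection_vs_uneducated": gap - return_at_uneducated,  # equals alpha1 / 2
        "return_at_educated": return_at_educated,
        "selection_vs_educated": gap - return_at_educated,      # equals 1 / 2
    }


for name, value in one_factor_decomposition(alpha0=0.1, alpha1=1.5, c=0.35).items():
    print(f"{name:>24s}: {value:.3f}")
```

With these numbers the observed gap is 0.975, of which 0.75 ($= \alpha_1/2$) is selection when the return is evaluated at the uneducated group's mean ability, and 0.5 when it is evaluated at the educated group's mean ability.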
Figure 1.3
Figure 1.4. Steady state and equilibrium dynamics in the simplified Ben-Porath model. [Phase diagram with $h(t)$ and $x(t)$ on the axes; it marks the loci $\dot{h}(t) = 0$ and $\dot{x}(t) = 0$, the steady-state values $h^*$ and $x^*$, the initial stock $h(0)$, and two alternative initial values $x'(0)$ and $x''(0)$.]
Figure 1.5. Time path of human capital investments in the simplified Ben-Porath model. [Plot of $h(t)$ against $t$, marking $h(0)$ and $h^*$.]
Figure 1.6
Figure 1.7. Selection in the One-Factor Model.
CHAPTER 2
Human Capital and Signaling
1. The Basic Model of Labor Market Signaling
The models we have discussed so far are broadly in the tradition of Becker’s approach to human capital. Human capital is viewed as an input in the production process. The leading alternative is to view education purely as a signal. Consider the following simple model to illustrate the issues.
There are two types of workers, high ability and low ability. The fraction of high-ability workers in the population is $\lambda$. Workers know their own ability, but employers do not observe this directly. High-ability workers always produce $y_H$, while low-ability workers produce $y_L$. In addition, workers can obtain education. The cost of obtaining education is $c_H$ for high-ability workers and $c_L$ for low-ability workers. The crucial assumption is that $c_L > c_H$, that is, education is more costly for low-ability workers. This is often referred to as the “single-crossing” assumption, since it makes sure that in the space of education and wages, the indifference curves of high and low types intersect only once. For future reference, let us denote the decision to obtain education by $e = 1$.
For simplicity, we assume that education does not increase the productivity of either type of worker. Once workers obtain their education, there is competition among a large number of risk-neutral firms, so workers will be paid their expected productivity. More specifically, the timing of events is as follows:
(1) Each worker finds out their ability.
(2) Each worker chooses education, $e = 0$ or $e = 1$.
(3) A large number of firms observe the education decision of each worker (but not their ability) and compete à la Bertrand to hire these workers.
Clearly, this environment corresponds to a dynamic game of incomplete information, since individuals know their ability, but firms do not. The natural equilibrium concept in this case is the Perfect Bayesian Equilibrium. Recall that a Perfect Bayesian Equilibrium consists of a strategy profile $\sigma$ (designating a strategy for each player) and a belief profile $\mu$ (designating the beliefs of each player at each information set) such that $\sigma$ is sequentially rational for each player given $\mu$ (so that each player plays the best response in each information set given their beliefs) and $\mu$ is derived from $\sigma$ using Bayes’s rule whenever possible. While Perfect Bayesian Equilibria are straightforward to characterize and often reasonable, in incomplete information games where players with private information move before those without this information, there may also exist Perfect Bayesian Equilibria with certain undesirable characteristics. We may therefore wish to strengthen this notion of equilibrium (see below).
In general, there can be two types of equilibria in this game.
(1) Separating, where high and low ability workers choose different levels of schooling, and as a result, in equilibrium, employers can infer worker ability from education (which is a straightforward application of Bayesian updating).
(2) Pooling, where high and low ability workers choose the same level of education.
In addition, there can be semi-separating equilibria, where some education levels are chosen by more than one type.
1.1. A separating equilibrium. Let us start by characterizing a possible separating equilibrium, which illustrates how education can be valued, even though it has no directly productive role.
Suppose that we have
(2.1) $y_H - c_H > y_L > y_H - c_L$
This is clearly possible since $c_H < c_L$. Then the following is an equilibrium: all high-ability workers obtain education, and all low-ability workers choose no education. Wages (conditional on education) are:
$w(e = 1) = y_H$ and $w(e = 0) = y_L$
Notice that these wages are conditioned on education, and not directly on ability, since ability is not observed by employers. Let us now check that all parties are playing best responses. First consider firms. Given the strategies of workers (to obtain education for high ability and not to obtain education for low ability), a worker with education has productivity $y_H$ while a worker with no education has productivity $y_L$. So no firm can change its behavior and increase its profits.
What about workers? If a high-ability worker deviates to no education, he will obtain $w(e = 0) = y_L$, whereas he is currently getting $w(e = 1) - c_H = y_H - c_H > y_L$. If a low-ability worker deviates to obtaining education, the market will perceive him as a high-ability worker, and pay him the higher wage $w(e = 1) = y_H$. But from (2.1), we have that $y_H - c_L < y_L$, so this deviation is not profitable for a low-ability worker, proving that the separating allocation is indeed an equilibrium.
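The two deviation checks above amount to verifying condition (2.1). As a minimal sketch, the function below performs exactly these two comparisons for given numbers; the parameter values in the examples are hypothetical.

```python
def is_separating_equilibrium(y_h, y_l, c_h, c_l):
    """Check condition (2.1): y_H - c_H > y_L > y_H - c_L.

    Wages are w(e=1) = y_H and w(e=0) = y_L, as in the candidate equilibrium.
    """
    wage_educated, wage_uneducated = y_h, y_l
    # High type must prefer education to deviating to no education.
    high_type_stays = wage_educated - c_h > wage_uneducated
    # Low type must prefer no education to mimicking the high type.
    low_type_stays = wage_uneducated > wage_educated - c_l
    return high_type_stays and low_type_stays


# Education is cheap for the high type and expensive for the low type.
print(is_separating_equilibrium(y_h=2.0, y_l=1.0, c_h=0.5, c_l=1.5))
# Same cost for both types: the two inequalities in (2.1) cannot both hold.
print(is_separating_equilibrium(y_h=2.0, y_l=1.0, c_h=1.2, c_l=1.2))
```

The second call illustrates why the ranking of costs matters: with equal costs, whatever deters the low type also deters the high type.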
In this equilibrium, education is valued simply because it is a signal about ability. Education can be a signal about ability because of the single-crossing property. This can be easily verified by considering the case in which $c_L \leq c_H$. Then we could never have condition (2.1) hold, so it would not be possible to convince high-ability workers to obtain education, while deterring low-ability workers from doing so.
Notice also that if the game were one of perfect information, that is, if the worker type were publicly observed, there could never be education investments here. This is an extreme result, due to the assumption that education has no productivity benefits. But it illustrates the forces at work.
1.2. Pooling equilibria in signaling games. However, the separating equilibrium is not the only one. Consider the following allocation: both low and high ability workers do not obtain education, and the wage structure is
$w(e = 1) = (1 - \lambda) y_L + \lambda y_H$ and $w(e = 0) = (1 - \lambda) y_L + \lambda y_H$
It is straightforward to check that no worker has any incentive to obtain education (given that education is costly, and there are no rewards to obtaining it). Since all workers choose no education, the expected productivity of a worker with no education is $(1 - \lambda) y_L + \lambda y_H$, so firms are playing best responses. (In Nash Equilibrium and Perfect Bayesian Equilibrium, what they do in response to a deviation by a worker who obtains education is not important, since this does not happen along the equilibrium path.)
What is happening here is that the market does not view education as a good signal, so a worker who “deviates” and obtains education is viewed as an average-ability worker, not as a high-ability worker.
What we have just described is a Perfect Bayesian Equilibrium. But is it reasonable? The answer is no. This equilibrium is being supported by the belief that the worker who gets education is no better than a worker who does not. But education is more costly for low-ability workers, so they should be less likely to deviate to obtaining education. There are many refinements in game theory which basically try to restrict beliefs in information sets that are not reached along the equilibrium path, ensuring that “unreasonable” beliefs, such as those that think a deviation to obtaining education is more likely from a low-ability worker, are ruled out.
Perhaps the simplest is the Intuitive Criterion introduced by Cho and Kreps. The underlying idea is as follows. If there exists a type who will never benefit from taking a particular deviation, then the uninformed parties (here the firms) should deduce that this deviation is very unlikely to come from this type. This falls within the category of “forward induction,” where rather than solving the game simply backwards, we think about what type of inferences others will derive from a deviation.
To illustrate the main idea, let us simplify the discussion by slightly strengthening condition (2.1) to
(2.2) $y_H - c_H > (1 - \lambda) y_L + \lambda y_H$ and $y_L > y_H - c_L$.
Now take the pooling equilibrium above. Consider a deviation to $e = 1$. There is no circumstance under which the low type would benefit from this deviation, since by assumption (2.2) we have $y_L > y_H - c_L$, and the most a worker could ever get is $y_H$, and the low-ability worker is now getting $(1 - \lambda) y_L + \lambda y_H$. Therefore, firms can deduce that the deviation to $e = 1$ must be coming from the high type, and offer him a wage of $y_H$. Then (2.2) also ensures that this deviation is profitable for the high types, breaking the pooling equilibrium.
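The argument that breaks the pooling equilibrium can be written as two comparisons against the pooling wage. The sketch below is an illustration with hypothetical numbers; it checks that the low type cannot gain from $e = 1$ even under the most favorable belief (wage $y_H$), which condition (2.2) guarantees because $y_H - c_L < y_L$ and $y_L$ is no greater than the pooling wage, and that the high type does gain once firms believe the deviator is a high type.

```python
def pooling_wage(y_h, y_l, lam):
    """Expected productivity when both types choose e = 0."""
    return (1.0 - lam) * y_l + lam * y_h


def intuitive_criterion_breaks_pooling(y_h, y_l, c_h, c_l, lam):
    """True if the deviation to e = 1 upsets the no-education pooling equilibrium."""
    w_pool = pooling_wage(y_h, y_l, lam)
    best_case_wage = y_h  # the most any worker could ever be paid
    # Low type: even the best-case wage net of cost is below the pooling payoff.
    low_never_gains = best_case_wage - c_l < w_pool
    # High type: gains if firms infer the deviation comes from the high type.
    high_gains_if_believed = y_h - c_h > w_pool
    return low_never_gains and high_gains_if_believed


# Hypothetical numbers satisfying (2.2): pooling wage is 1.5.
print(intuitive_criterion_breaks_pooling(y_h=2.0, y_l=1.0, c_h=0.3, c_l=1.5, lam=0.5))
# With a higher c_H the high type no longer gains, so pooling survives this test.
print(intuitive_criterion_breaks_pooling(y_h=2.0, y_l=1.0, c_h=0.8, c_l=1.5, lam=0.5))
```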
The reason why this refinement is referred to as “The Intuitive Criterion” is that it can be supported by a relatively intuitive “speech” by the deviator along the following lines: “you have to deduce that I must be the high type deviating to $e = 1$, since low types would never ever consider such a deviation, whereas I would find it profitable if I could convince you that I am indeed the high type.” You should bear in mind that this speech is used simply as a loose and intuitive description of the reasoning underlying this equilibrium refinement. In practice there are no such speeches, because the possibility of making such speeches has not been modeled as part of the game. Nevertheless, this heuristic device gives the basic idea.
The overall conclusion is that as long as the separating condition is satisfied, we expect the equilibrium of this economy to involve a separating allocation, where education is valued as a signal.
2. Generalizations
It is straightforward to generalize this equilibrium concept to a situation in which education has a productive role as well as a signaling role. Then the story would be one where education is valued for more than its productive effect, because it is also associated with higher ability.
Figure 2.1
Let me give the basic idea here. Imagine that education is continuous, $e \in [0, \infty)$, and the cost functions for the high and low types are $c_H(e)$ and $c_L(e)$, which are both strictly increasing and convex, with $c_H(0) = c_L(0) = 0$. The single-crossing property is that
$c_H'(e) < c_L'(e)$ for all $e \in [0, \infty)$,
that is, the marginal cost of investing in a given unit of education is always higher for the low type. Figure 2.1 shows these cost functions.
Moreover, suppose that the output of the two types as a function of their educations are $y_H(e)$ and $y_L(e)$, with
$y_H(e) > y_L(e)$ for all $e$.
Figure 2.2 shows the first-best, which would arise in the absence of incomplete information.
Figure 2.2. The first best allocation with complete information.
In particular, as the figure shows, the first best involves effort levels $(e_l^*, e_h^*)$ such that
(2.3) $y_L'(e_l^*) = c_L'(e_l^*)$
and
(2.4) $y_H'(e_h^*) = c_H'(e_h^*)$.
With incomplete information, there are again many equilibria, some separating, some pooling and some semi-separating. But applying a stronger form of the Intuitive Criterion reasoning, we will pick the Riley equilibrium of this game, which is a particular separating equilibrium. It is characterized as follows. We first find the most preferred education level for the low type in the perfect information case, which coincides with the first best $e_l^*$ determined in (2.3). Then we can write the incentive compatibility constraint for the low type, such that when the market expects low types to obtain education $e_l^*$, the low type does not try to mimic the high type; in other words, the low type agent should not prefer to choose the education level the market expects from the high type, $e$, and receive the wage associated with this level of education. This incentive compatibility constraint is straightforward to write once we note that the wage level that low type workers will obtain is exactly $y_L(e_l^*)$ in this case, since we are looking at the separating equilibrium. Thus the incentive compatibility constraint is simply
(2.5) $y_L(e_l^*) - c_L(e_l^*) \geq w(e) - c_L(e)$ for all $e$,
where $w(e)$ is the wage rate paid for a worker with education $e$. Since $e_l^*$ is the first-best effort level for the low type worker, if we had $w(e) = y_L(e)$, this constraint would always be satisfied. However, since the market cannot tell low and high type workers apart, by choosing a different level of education, a low type worker may be able to “mimic” a high type worker and thus we will typically have $w(e) \geq y_L(e)$ when $e \geq e_l^*$, with a strict inequality for some values of education. Therefore, the separating (Riley) equilibrium must satisfy (2.5) for the equilibrium wage function $w(e)$.
To make further progress, note that in a separating equilibrium, there will exist some level of education, say $e_h$, that will be chosen by high type workers. Then, Bertrand competition among firms, with reasoning similar to that in the previous section, implies that $w(e_h) = y_H(e_h)$. Therefore, if a low type worker deviates to this level of effort, the market will take him to be a high type worker and pay him the wage $y_H(e_h)$. Now take this education level $e_h$ to be such that the incentive compatibility constraint, (2.5), holds as an equality, that is,
(2.6)  $y_L(e_l) - c_L(e_l) = y_H(e_h) - c_L(e_h)$.
Figure 2.3. The Riley equilibrium.
Then the Riley equilibrium is such that low types choose $e_l$ and obtain the wage $w(e_l) = y_L(e_l)$, and high types choose $e_h$ and obtain the wage $w(e_h) = y_H(e_h)$. That high types are happy to do this follows immediately from the single-crossing property, since
$y_H(e_h) - c_H(e_h) = y_H(e_h) - c_L(e_h) - (c_H(e_h) - c_L(e_h))$
$> y_H(e_h) - c_L(e_h) - (c_H(e_l) - c_L(e_l))$
$= y_L(e_l) - c_L(e_l) - (c_H(e_l) - c_L(e_l))$
$= y_L(e_l) - c_H(e_l)$,
where the first line is introduced by adding and subtracting $c_L(e_h)$. The second line follows from single crossing, since $c_H(e_h) - c_L(e_h) < c_H(e_l) - c_L(e_l)$ in view of the fact that $e_l < e_h$. The third line exploits (2.6), and the final line simply cancels the two $c_L(e_l)$ terms from the right hand side.
Figure 2.3 depicts this equilibrium diagrammatically (for clarity it assumes that $y_H(e)$ and $y_L(e)$ are linear in $e$).
Notice that in this equilibrium, high type workers invest more than they would have done in the perfect information case, in the sense that the $e_h$ characterized here by (2.6) is greater than the education level that high type individuals would choose with perfect information, that is, the first-best level given by (2.4).
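The construction of the Riley equilibrium can be checked numerically. The sketch below is only an illustration under assumed functional forms that are not in the text: linear output, y_L(e) = e and y_H(e) = 2e (linear as in Figure 2.3), and quadratic costs, c_L(e) = e^2 and c_H(e) = 0.75 e^2, chosen so that single crossing holds and the low type would want to mimic at the first best. The helper `bisect` is likewise a hypothetical utility written for this example.

```python
import math

# Assumed functional forms (illustrative only, not from the text).
def y_L(e): return e             # output of a low type
def y_H(e): return 2.0 * e       # output of a high type
def c_L(e): return e ** 2        # education cost of a low type
def c_H(e): return 0.75 * e ** 2 # education cost of a high type (lower marginal cost)

def bisect(f, lo, hi, tol=1e-12):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# First-best levels from (2.3) and (2.4): marginal product equals marginal cost.
e_l = 0.5                    # solves 1 = 2 e
e_h_first_best = 2.0 / 1.5   # solves 2 = 1.5 e

def mimic_gain(e):
    """Low type's gain from posing as a high type at education e; zero when (2.6) holds."""
    return (y_H(e) - c_L(e)) - (y_L(e_l) - c_L(e_l))

# Under these forms the low type would mimic at the first best (mimic_gain > 0 there),
# so the Riley level e_h is the root of (2.6) above the first-best level.
e_h = bisect(mimic_gain, e_h_first_best, 3.0)

high_payoff_separating = y_H(e_h) - c_H(e_h)
high_payoff_as_low = y_L(e_l) - c_H(e_l)

print(f"first-best e_h = {e_h_first_best:.4f}, Riley e_h = {e_h:.4f}")
print(f"high type payoff: separating {high_payoff_separating:.4f} vs. posing as low {high_payoff_as_low:.4f}")
```

With these assumed forms, (2.6) reduces to a quadratic in e_h, so the bisection result can be compared against the closed-form larger root, 1 + sqrt(0.75). The high type's payoff comparison mirrors the chain of inequalities derived from single crossing.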
3. Evidence on Labor Market Signaling
Is the signaling role of education important? There are a number of different ways of approaching this question. Unfortunately, direct evidence is difficult to find since ability differences across workers are not only unobserved by firms, but also by econometricians. Nevertheless, a number of different strategies can be used to gauge the importance of signaling in the labor market. Here we will discuss a number of different attempts that investigate the importance of labor market signaling. In the next section, we will discuss empirical work that may give a sense of how important signaling considerations are in the aggregate.
Before this discussion, note the parallel between the selection stories discussed above and the signaling story. In both cases, the observed earnings differences between high and low education workers will include a component due to the fact that the abilities of the high and low education groups differ. There is one important difference, however: in the selection stories, the market observed ability, and it was only we, the economists or the econometricians, who were unable to do so. In the signaling story, the market is also unable to observe ability, and is inferring it from education. For this reason, proper evidence in favor of the signaling story should go beyond documenting the importance of some type of “selection”.
There are four different approaches to determining whether signaling is important. The first line of work looks at whether degrees matter, in particular, whether a high school degree or the fourth year of college that gets an individual a university degree matters more than other years of schooling (e.g., Kane and Rouse). This approach suffers from two serious problems. First, the final year of college (or high school) may in fact be more useful than the third year, especially because it shows that the individual is able to learn all the required information that makes up a college degree. Second, and more serious, there is no way of distinguishing selection and signaling as possible explanations for these patterns. It may be that those who drop out of high school are observationally different to employers, and hence receive different wages, but these differences are not observed by us in the standard data sets. This is a common problem that will come back again: the implications of unobserved heterogeneity and signaling are often similar.
Second, a creative paper by Lang and Kropp tests for signaling by looking at whether compulsory schooling laws affect schooling above the regulated age. The reasoning is that if the 11th year of schooling is a signal, and the government legislates that everybody has to have 11 years of schooling, now high ability individuals have to get 12 years of schooling to distinguish themselves. They find evidence for this, which they interpret as supportive of the signaling model. The problem is that there are other reasons why compulsory schooling laws may have such effects. For example, an individual who does not drop out of 11th grade may then decide to complete high school. Alternatively, there can be peer group effects in that as fewer people drop out of school, it may become less socially acceptable to drop out even at later grades.
The third approach is the best. It is pursued in a very creative paper by Tyler, Murnane and Willett. They observe that passing grades in the General Educational Development (GED) exam differ by state. So an individual with the same grade in the GED exam will get a GED in one state, but not in another. If the score in the exam is an unbiased measure of human capital, and there is no signaling, these two individuals should get the same wages. In contrast, if the GED is a signal, and employers do not know where the individual took the GED exam, these two individuals should get different wages.
Using this methodology, the authors estimate that there is a 10-19 percent return to a GED signal. The attached table shows the results.
An interesting result that Tyler, Murnane and Willett find is that there are no GED returns to minorities. This is also consistent with the signaling view, since it turns out that many minorities prepare for and take the GED exam in prison. Therefore, a GED would not only be a positive signal about ability, but also potentially a signal that the individual was at some point incarcerated. This latter feature makes a GED less of a positive signal for minorities.
CHAPTER 3
Externalities and Peer Effects
Many economists believe that human capital not only creates private returns, increasing the earnings of the individual who acquires it, but it also creates externalities, i.e., it increases the productivity of other agents in the economy (e.g., Jacobs, Lucas). If so, existing research on the private returns to education is only part of the picture: the social return, i.e., the private return plus the external return, may far exceed the private return. Conversely, if signaling is important, the private return overestimates the social return to schooling. Estimating the external and the social returns to schooling is a first-order question.
1. Theory
To show how and why external returns to education may arise, we will briefly discuss two models. The first is a theory of non-pecuniary external returns, meaning that external returns arise from technological linkages across agents or firms. The second is a pecuniary model of external returns, in which externalities arise from market interactions and changes in market prices resulting from the average education level of the workers.
1.1. Non-pecuniary human capital externalities. Suppose that the output (or marginal product) of a worker, $i$, is
$y_i = A h_i^{\nu}$,
where $h_i$ is the human capital (schooling) of the worker, and $A$ is aggregate productivity. Assume that labor markets are competitive. So individual earnings are
$W_i = A h_i^{\nu}$.
The key idea of externalities is that the exchange of ideas among workers raises productivity. This can be modeled by allowing $A$ to depend on aggregate human capital. In particular, suppose that
(3.1)  $A = B H^{\delta} \equiv B\,E[h_i]^{\delta}$,
where $H$ is a measure of aggregate human capital, $E$ is the expectation operator, and $B$ is a constant.
Individual earnings can then be written as $W_i = A h_i^{\nu} = B H^{\delta} h_i^{\nu}$. Therefore, taking logs, we have:
(3.2)  $\ln W_i = \ln B + \delta \ln H + \nu \ln h_i$.
If external effects are stronger within a geographical area, as seems likely in a world where human interaction and the exchange of ideas are the main forces behind the externalities, then equation (3.2) should be estimated using measures of $H$ at the local level. This is a theory of non-pecuniary externalities, since the external returns arise from the technological nature of equation (3.1).
1.2. Pecuniary human capital externalities. The alternative is pecuniary externalities. As first conjectured by Alfred Marshall in his Principles of Economics, increasing the geographic concentration of specialized inputs may increase productivity since the matching between factor inputs and industries is improved. A similar story is developed in Acemoglu (1997), where firms find it profitable to invest in new technologies only when there is a sufficient supply of trained workers to replace employees who quit. We refer to this sort of effect as a pecuniary externality since greater human capital encourages more investment by firms and raises other workers’ wages via this channel.
Here, we will briefly explain a simplified version of the model in Acemoglu (1996).
Consider an economy lasting two periods, with production only in the second period, and a continuum of workers normalized to 1. Take human capital, $h_i$, as given. There is also a continuum of risk-neutral firms. In period 1, firms make an irreversible investment decision, $k$, at cost $Rk$. Workers and firms come together in the second period. The labor market is not competitive; instead, firms and workers are matched randomly, and each firm meets a worker. The only decision workers and firms make after matching is whether to produce together or not to produce at all (since there are no further periods). If firm $f$ and worker $i$ produce together, their output is
(3.3)  $k_f^{\alpha} h_i^{\nu}$,
where $\alpha < 1$, $\nu \leq 1 - \alpha$. Since it is costly for the worker-firm pair to separate and find new partners in this economy, employment relationships generate quasi-rents. Wages will therefore be determined by rent-sharing. Here, simply assume that the worker receives a share $\beta$ of this output as a result of bargaining, while the firm receives the remaining $1 - \beta$ share.
An equilibrium in this economy is a set of schooling choices for workers and a set of physical capital investments for firms. Firm $f$ maximizes the following expected profit function:
(3.4)  $(1-\beta)\,k_f^{\alpha}\,E[h_i^{\nu}] - R k_f$,
with respect to $k_f$. Since firms do not know which worker they will be matched with, their expected profit is an average of profits from different skill levels. The function (3.4) is strictly concave, so all firms choose the same level of capital investment, $k_f = k$, given by
(3.5)  $k = \left( \frac{(1-\beta)\,\alpha H}{R} \right)^{1/(1-\alpha)}$,
where
$H \equiv E[h_i^{\nu}]$
is the measure of aggregate human capital. Substituting (3.5) into (3.3), and using the fact that wages are equal to a fraction $\beta$ of output, the wage income of individual $i$ is given by $W_i = \beta\,((1-\beta)\alpha H)^{\alpha/(1-\alpha)} R^{-\alpha/(1-\alpha)} (h_i)^{\nu}$. Taking logs, this is:
(3.6)  $\ln W_i = c + \frac{\alpha}{1-\alpha} \ln H + \nu \ln h_i$,
where $c$ is a constant and $\alpha/(1-\alpha)$ and $\nu$ are positive coefficients.
Human capital externalities arise here because firms choose their physical capital in anticipation of the average human capital of the workers they will employ in the future. Since physical and human capital are complements in this setup, a more educated labor force encourages greater investment in physical capital and leads to higher wages. In the absence of the need for search and matching, firms would immediately hire workers with skills appropriate to their investments, and there would be no human capital externalities.
Nonpecuniary and pecuniary theories of human capital externalities lead to similar empirical relationships since equation (3.6) is identical to equation (3.2), with $c = \ln B$ and $\delta = \alpha/(1-\alpha)$. Again presuming that these interactions exist in local labor markets, we can estimate a version of (3.2) using differences in schooling across labor markets (cities, states, or even countries).
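The step from the firm's problem (3.4) to the log wage equation (3.6) can be verified with a few lines of arithmetic. The following is a minimal sketch with made-up parameter values and a hypothetical list of human capital levels `h`; none of these numbers come from the text. It checks that the capital level in (3.5) satisfies the first-order condition of (3.4), and that the wage implied by rent-sharing matches the log-linear form (3.6) with the constant written out explicitly.

```python
import math

# Illustrative parameter values (assumptions), chosen so that nu <= 1 - alpha.
alpha, nu, beta, R = 0.3, 0.5, 0.5, 1.1
h = [0.8, 1.0, 1.5, 2.0]   # hypothetical human capital levels, equally likely matches

# Aggregate human capital, H = E[h^nu].
H = sum(x ** nu for x in h) / len(h)

def expected_profit(k):
    """Firm's objective (3.4): expected share of output minus the cost of capital."""
    return (1 - beta) * k ** alpha * H - R * k

# Capital choice from (3.5).
k_star = ((1 - beta) * alpha * H / R) ** (1 / (1 - alpha))

# First-order condition of (3.4) evaluated at k_star; should be numerically zero.
foc = (1 - beta) * alpha * k_star ** (alpha - 1) * H - R

# Constant term of (3.6), obtained by taking logs of the wage expression.
c = math.log(beta) + alpha / (1 - alpha) * math.log((1 - beta) * alpha / R)

def log_wage_direct(h_i):
    """Log of the worker's share beta of output k^alpha h^nu."""
    return math.log(beta * k_star ** alpha * h_i ** nu)

def log_wage_formula(h_i):
    """Right-hand side of (3.6)."""
    return c + alpha / (1 - alpha) * math.log(H) + nu * math.log(h_i)

for h_i in h:
    print(f"h = {h_i}: direct {log_wage_direct(h_i):.6f}, formula {log_wage_formula(h_i):.6f}")
```

Raising any element of `h` raises H, which raises `k_star` and therefore every worker's wage, which is the pecuniary externality described above.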
1.3. Signaling and negative externalities. The above models focused on positive externalities to education. However, in a world where education plays a signaling role, we might also expect significant negative externalities. To see this, consider the most extreme world in which education is only a signal: it does not have any productive role.
Contrast two situations: in the first, all individuals have 12 years of schooling and in the second all individuals have 16 years of schooling. Since education has no productive role, and all individuals have the same level of schooling, in both allocations they will earn exactly the same wage (equal to average productivity). Therefore, here the increase in aggregate schooling does not translate into aggregate increases in wages. But in the same world, if one individual obtains more education than the rest, there will be a private return to him, because he would signal that he is of higher ability. Therefore, in a world where signaling is important, we might also want to estimate an equation of the form (3.2), but when signaling issues are important, we would expect $\delta$ to be negative.
The basic idea here is that in this world, what determines an individual’s wages is his “ranking” in the signaling distribution. When others invest more in their education, a given individual’s rank in the distribution declines, hence others are creating a negative externality on this individual via their human capital investment.
2. Evidence
Ordinary Least Squares (OLS) estimation of equations like (3.2) using city or state-level data yields very significant and positive estimates of $\delta$, indicating substantial positive human capital externalities. The leading example is the paper by Jim Rauch.
There are at least two problems with OLS estimates of this type. First, it may be precisely high-wage cities or states that either attract a large number of high education workers or give strong support to education. Rauch’s estimates used a cross-section of cities. Including city or state fixed effects ameliorates this problem, but does not solve it, since states’ attitudes towards education and the demand for labor may comove. The ideal approach would be to find a source of quasi-exogenous variation in average schooling across labor markets (variation unlikely to be correlated with other sources of variation in the demand for labor in the state).
Acemoglu and Angrist try to accomplish this using differences in compulsory schooling laws. The advantage is that these laws not only affect individual schooling but average schooling in a given area.
There is an additional econometric problem in estimating externalities, which remains even if we have an instrument for average schooling in the aggregate. This is that if individual schooling is measured with error (or for some other reason OLS returns to individual schooling are not the causal effect), some of this discrepancy between the OLS returns and the causal return may load on average schooling, even when average schooling is instrumented. This suggests that we may need to instrument for individual schooling as well (so as to get to the correct return to individual schooling).
More explicitly, let $Y_{ijt}$ be the log weekly wage; then the estimating equation is
(3.7)  $Y_{ijt} = X_i' \mu + \delta_j + \delta_t + \gamma_1 S_{jt} + \gamma_{2i} s_i + u_{jt} + \varepsilon_i$.
To illustrate the main issues, ignore time dependence, and consider the population regression of $Y_{ij}$ on $s_i$:
(3.8)  $Y_{ij} = \mu_0 + \rho_0 s_i + \varepsilon_{0i}$; where $E[\varepsilon_{0i} s_i] \equiv 0$.
Next consider the IV population regression using a full set of state dummies. This is equivalent to
(3.9)  $Y_{ij} = \mu_1 + \rho_1 S_j + \varepsilon_{1i}$; where $E[\varepsilon_{1i} S_j] \equiv 0$,
since the projection of individual schooling on a set of state dummies is simply average schooling in each state.
Now consider the estimation of the empirical analogue of equation (3.2):
(3.10)  $Y_{ij} = \mu + \pi_0 s_i + \pi_1 S_j + \xi_i$; where $E[\xi_i s_i] = E[\xi_i S_j] \equiv 0$.
Then, we have
(3.11)  $\pi_0 = \rho_1 + \phi(\rho_0 - \rho_1)$
        $\pi_1 = \phi(\rho_1 - \rho_0)$
where $\phi = 1/(1 - R^2) > 1$, and $R^2$ is the first-stage R-squared for the 2SLS estimates in (3.9). Therefore, when $\rho_1 > \rho_0$, for example because there is measurement error in individual schooling, we may find positive external returns even when there are none.
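The relationships in (3.11) can be illustrated with simulated data. The sketch below is an assumption-laden example, not the procedure used in the literature cited here: it generates synthetic wages with a private return of 0.10 and no external return at all, adds classical measurement error to individual schooling, and then computes the regressions (3.8), (3.9) and (3.10) by hand. All numbers (number of states, sample sizes, variances) are invented for the illustration.

```python
import random

random.seed(0)
J, n = 50, 200          # hypothetical number of states and workers per state
true_return = 0.10      # private return to schooling; the external return is zero

rows = []               # (state index, measured schooling, log wage)
for j in range(J):
    state_mean = random.uniform(10, 14)
    for _ in range(n):
        s_true = state_mean + random.gauss(0, 2.0)
        s_obs = s_true + random.gauss(0, 1.5)            # classical measurement error
        y = true_return * s_true + random.gauss(0, 0.3)  # no role for average schooling
        rows.append((j, s_obs, y))

# Average measured schooling by state (the projection of s on state dummies).
totals = [0.0] * J
for j, s_obs, _ in rows:
    totals[j] += s_obs
state_avg = [t / n for t in totals]

s = [r[1] for r in rows]
S = [state_avg[r[0]] for r in rows]
Y = [r[2] for r in rows]

def mean(x):
    return sum(x) / len(x)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / len(a)

rho0 = cov(Y, s) / cov(s, s)      # (3.8): OLS of Y on individual schooling
rho1 = cov(Y, S) / cov(S, S)      # (3.9): regression of Y on state average schooling
r2 = cov(S, S) / cov(s, s)        # R-squared of s on state dummies
phi = 1.0 / (1.0 - r2)

# (3.10): OLS of Y on both s and S, from the 2x2 normal equations.
den = cov(s, s) * cov(S, S) - cov(s, S) ** 2
pi0 = (cov(Y, s) * cov(S, S) - cov(Y, S) * cov(s, S)) / den
pi1 = (cov(Y, S) * cov(s, s) - cov(Y, s) * cov(s, S)) / den

print(f"rho0 = {rho0:.4f}, rho1 = {rho1:.4f}, phi = {phi:.4f}")
print(f"pi0 = {pi0:.4f}, pi1 = {pi1:.4f}")
```

In this simulated sample the coefficient on average schooling comes out positive even though the data contain no externality, because measurement error attenuates the individual-level coefficient more than the state-level one. The two equalities in (3.11) hold exactly in sample here, since the state average is computed from the same measured schooling variable.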
If we could instrument for both individual and average schooling, we would solve this problem. But what type of instrument?
Consider the relationship of interest:
(3.12)  $Y_{ij} = \mu + \gamma_1 S_j + \gamma_{2i} s_i + u_j + \varepsilon_i$,
which could be estimated by OLS or instrumental variables, to obtain an estimate of $\gamma_1$ as well as an average estimate of $\gamma_{2i}$, say $\gamma_2$.
An alternative way of expressing this relationship is to adjust for the effect of individual schooling by directly rewriting (3.12):
(3.13)  $Y_{ij} - \gamma_2 s_i \equiv \tilde{Y}_{ij}$
        $= \mu + \gamma_1 S_j + [u_j + \varepsilon_i + (\gamma_{2i} - \gamma_2) s_i]$.
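In this case, with $z_i$ denoting a binary instrument, the instrumental variables estimate of external returns is equivalent to the Wald formula
$\gamma_1^{IV} = \frac{E[\tilde{Y}_{ij} | z_i = 1] - E[\tilde{Y}_{ij} | z_i = 0]}{E[S_j | z_i = 1] - E[S_j | z_i = 0]}$
$= \gamma_1 + \left[ \frac{E[\gamma_{2i} s_i | z_i = 1] - E[\gamma_{2i} s_i | z_i = 0]}{E[s_i | z_i = 1] - E[s_i | z_i = 0]} - \gamma_2 \right] \cdot \left[ \frac{E[s_i | z_i = 1] - E[s_i | z_i = 0]}{E[S_j | z_i = 1] - E[S_j | z_i = 0]} \right]$.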
This shows that we should set
(3.14)  $\gamma_2 = \frac{E[\gamma_{2i} s_i | z_i = 1] - E[\gamma_{2i} s_i | z_i = 0]}{E[s_i | z_i = 1] - E[s_i | z_i = 0]}$
        $= \frac{E[(Y_{ij} - \gamma_1 S_j) | z_i = 1] - E[(Y_{ij} - \gamma_1 S_j) | z_i = 0]}{E[s_i | z_i = 1] - E[s_i | z_i = 0]}$.
This is typically not the OLS estimator of the private return, and we should be using some instrument to simultaneously estimate the private return to schooling. The ideal instrument would be one affecting exactly the same people as the compulsory schooling laws.
Quarter of birth instruments might come close to this. Since quarter of birth instruments are likely to affect the same people as compulsory schooling laws, adjusting with the quarter of birth estimate, or using quarter of birth dummies as instruments for individual schooling, is the right strategy.
So the strategy is to estimate an equation similar to (3.2) or (3.10) using compulsory schooling laws for average schooling and quarter of birth dummies for individual schooling.
The estimation results from using this strategy in Acemoglu and Angrist (2000) suggest that there are no significant external returns. The estimates are typically around 1 or 2 percent, and statistically not different from zero. They also suggest that in the aggregate signaling considerations are unlikely to be very important (at the very least, they do not dominate positive externalities).
3. School Quality
Differences in school quality could be a crucial factor in differences in human capital. Two individuals with the same years of schooling might have very different skills and very different earnings because one went to a much better school, with better teachers, instruction and resources. Differences in school quality would add to the unobserved component of human capital.
A natural conjecture is that school quality as measured by teacher-pupil ratios, spending per pupil, length of school year, and educational qualifications of teachers would be a major determinant of human capital. If school quality indeed matters a lot, an effective way of increasing human capital might be to increase the quality of instruction in schools.
This view was however challenged by a number of economists, most notably Hanushek. Hanushek noted that the substantial increase in spending per student and teacher-pupil ratios, as well as the increase in the qualifications of teachers, was not associated with improved student outcomes, but on the contrary with a deterioration in many measures of high school students’ performance. In addition, Hanushek conducted a meta-analysis of the large number of papers in the education literature, and concluded that there was no overwhelming case for a strong effect of resources and class size on student outcomes.
Although this research has received substantial attention, a number of careful papers show that exogenous variation in class size and other resources is in fact associated with sizable improvements in student outcomes.
Most notable:
(1) Krueger analyzes the data from the Tennessee STAR experiment where students were randomly allocated to classes of different sizes.
(2) Angrist and Lavy analyze the effect of class size on test scores using a unique characteristic of Israeli schools which caps class size at 40, thus creating a natural regression discontinuity as a function of the total number of students in the school.
(3) Card and Krueger look at the effects of pupil-teacher ratio, term length and relative teacher wage by comparing the earnings of individuals working in the same state but educated in different states with different school resources.
(4) Another paper by Card and Krueger looks at the effect of the “exogenously” forced narrowing of the resource gap between black and white schools in South Carolina on the gap between black and white pupils’ education and subsequent earnings.
All of these papers find sizable effects of school quality on student outcomes. Moreover, a recent paper by Krueger shows that there were many questionable decisions in the meta-analysis by Hanushek, casting doubt on the usefulness of this analysis. On the basis of these various pieces of evidence, it is safe to conclude that school quality appears to matter for human capital.
4. Peer Group Effects
Issues of school quality are also intimately linked to those of externalities. An important type of externality, different from the external returns to education discussed above, that arises in the context of education is peer group effects, or more generally social effects in the process of education. The fact that children growing up in different areas may choose different role models will lead to this type of externalities/peer group effects. More simply, to the extent that schooling and learning are group activities, there could be this type of peer group effects.
There are a number of theoretical issues that need to be clarified, as well as important work that needs to be done in understanding where peer group effects are coming from. Moreover, empirical investigation of peer group effects is in its infancy, and there are very difficult issues involved in estimation and interpretation. Since there is little research in understanding the nature of peer group effects, here we will simply take peer group effects as given, and briefly discuss some of their efficiency implications, especially for community structure and school quality, and then very briefly mention some work on estimating peer group effects.
4.1. Implications of peer group effects for mixing and segregation. An important question is whether the presence of peer group effects has any particular implications for the organization of schools, and in particular, whether children who provide positive externalities on other children should be put together in a separate school or classroom.
The basic issue here is equivalent to an assignment problem. The general principle in assignment problems, such as Becker’s famous model of marriage, is that if inputs from the two parties are complementary, there should be assortative matching, that is, the highest quality individuals should be matched together. In the context of schooling, this implies that children with better characteristics, who are likely to create more positive externalities and be better role models, should be segregated in their own schools, and children with worse characteristics, who will tend to create negative externalities, should go to separate schools. This practically means segregation along income lines, since often children with “better characteristics” are those from better parental backgrounds, while children with worse characteristics are often from lower socioeconomic backgrounds.
So much is well-known and well understood. The problem is that there is an important confusion in the literature, which involves deducing complementarity from the fact that in equilibrium we do observe segregation (e.g., rich parents sending their children to private schools with other children from rich parents, or living in suburbs and sending their children to suburban schools, while poor parents live in ghettos and children from disadvantaged backgrounds go to school with other disadvantaged children in inner cities). This reasoning is often used in discussions of Tiebout competition, together with the argument that allowing parents with different characteristics/tastes to sort into different neighborhoods will often be efficient.
The underlying idea can be given by the following simple model. Suppose that schools consist of two kids, and denote the parental background (e.g., home education or parental expenditure on non-school inputs) of kids by $e$, and the resulting human capitals by $h$. Suppose that we have
(3.15)  $h_1 = e_1^{\alpha} e_2^{1-\alpha}$
        $h_2 = e_1^{1-\alpha} e_2^{\alpha}$
where $\alpha > 1/2$. This implies that parental backgrounds are complementary, and each kid’s human capital will depend mostly on his own parent’s background, but also on that of the other kid in the school. For example, it may be easier to learn or be motivated when other children in the class are also motivated. This explains why we have $\partial h_1/\partial e_2 > 0$ and $\partial h_2/\partial e_1 > 0$. But an equally important feature of (3.15) is that $\partial^2 h_1/\partial e_2 \partial e_1 > 0$ and $\partial^2 h_2/\partial e_1 \partial e_2 > 0$, that is, the backgrounds of the two kids are complementary. This implies that a classmate with a good background is especially useful to another kid with a good background. We can think of this as the “bad apple” theory of the classroom: one bad kid in the classroom brings down everybody.
As a digression, notice an important feature of the way we wrote (3.15): it links the outcome variables, $h_1$ and $h_2$, to predetermined characteristics of children $e_1$ and $e_2$, which creates a direct analogy with the human capital externalities discussed above. However, this may simply be the reduced form of a somewhat different model, for example,
(3.16)  $h_1 = H_1(e_1, h_2)$
        $h_2 = H_2(e_2, h_1)$
whereby each individual’s human capital depends on his own background and the human capital choice of the other individual. Although in reduced form (3.15) and (3.16) are very similar, they provide different interpretations of peer group effects, and econometrically they pose different challenges, which we will discuss below.
The complementarity has two implications:
(1) It is socially efficient, in the sense of maximizing the sum of human capitals, to have parents with good backgrounds send their children to school with the children of other parents with good backgrounds. This follows simply from the definition of complementarity, a positive cross-partial derivative, which is clearly verified by the production functions in (3.15).
(2) It will also be an equilibrium outcome that parents will do so. To see this, suppose that we have a situation in which there are two sets of parents with background $e_l$ and $e_h > e_l$. Suppose that there is mixing. Now the marginal willingness to pay of a parent with the high background to be in the same school with the child of another high-background parent, rather than a low-background student, is
$e_h - e_h^{\alpha} e_l^{1-\alpha}$,
while the marginal willingness to pay of a low-background parent to stay in the school with the high-background parents is
$e_l^{\alpha} e_h^{1-\alpha} - e_l$.
The complementarity between $e_h$ and $e_l$ in (3.15) implies that $e_h - e_h^{\alpha} e_l^{1-\alpha} > e_l^{\alpha} e_h^{1-\alpha} - e_l$.
Therefore, the high-background parent can always outbid the low-background parent for the privilege of sending his children to school with other high-background parents. Thus with profit maximizing schools, segregation will arise as the outcome.
Next consider a production function with substitutability (negative cross-partial derivative). For example,
(3.17)  $h_1 = \phi e_1 + e_2 - \lambda e_1^{1/2} e_2^{1/2}$
        $h_2 = e_1 + \phi e_2 - \lambda e_1^{1/2} e_2^{1/2}$
where $\phi > 1$ and $\lambda > 0$ but small, so that human capital is increasing in parental background. With this production function, we again have $\partial h_1/\partial e_2 > 0$ and $\partial h_2/\partial e_1 > 0$, but now, in contrast to (3.15), we have
$\frac{\partial^2 h_1}{\partial e_2 \partial e_1} < 0$ and $\frac{\partial^2 h_2}{\partial e_1 \partial e_2} < 0$.
This can be thought of as corresponding to the “good apple” theory of the classroom, where the kids with the best characteristics and attitudes bring the rest of the class up.
In this case, because the cross-partial derivative is negative, the marginal willingness to pay of low-background parents to have their kid together with high-background parents is higher than that of high-background parents. With perfect markets, we will observe mixing, and in equilibrium schools will consist of a mixture of children from high- and low-background parents.
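The contrast between the two production functions can be made concrete with a small numerical comparison. The sketch below uses invented parameter values (alpha = 0.7, phi = 1.5, lambda = 0.2, backgrounds 2 and 1) and assumes an economy with two high-background and two low-background children allocated to two schools of two kids each; these choices are illustrative assumptions, not taken from the text. It compares total human capital under segregation and mixing, and the willingness to pay of each type of parent for a high-background classmate.

```python
import math

# Illustrative parameters (assumptions): alpha > 1/2, phi > 1, lam > 0 but small.
alpha = 0.7
phi, lam = 1.5, 0.2
e_h, e_l = 2.0, 1.0

def h_complements(e_own, e_peer):
    """Human capital under (3.15): positive cross-partial derivative."""
    return e_own ** alpha * e_peer ** (1 - alpha)

def h_substitutes(e_own, e_peer):
    """Human capital under (3.17): negative cross-partial derivative."""
    return phi * e_own + e_peer - lam * math.sqrt(e_own * e_peer)

def total_human_capital(h, schools):
    """Sum of both kids' human capital over all two-kid schools."""
    return sum(h(a, b) + h(b, a) for a, b in schools)

def willingness_to_pay(h):
    """Gain from having a high- rather than low-background classmate, by own type."""
    wtp_high = h(e_h, e_h) - h(e_h, e_l)
    wtp_low = h(e_l, e_h) - h(e_l, e_l)
    return wtp_high, wtp_low

segregated = [(e_h, e_h), (e_l, e_l)]
mixed = [(e_h, e_l), (e_h, e_l)]

for name, h in [("(3.15) complements", h_complements), ("(3.17) substitutes", h_substitutes)]:
    seg = total_human_capital(h, segregated)
    mix = total_human_capital(h, mixed)
    wtp_high, wtp_low = willingness_to_pay(h)
    print(f"{name}: segregated total {seg:.4f}, mixed total {mix:.4f}, "
          f"WTP high {wtp_high:.4f}, WTP low {wtp_low:.4f}")
```

Under these assumed values, segregation yields the larger total and the high-background parent has the higher willingness to pay with (3.15), while both comparisons reverse with (3.17). In both cases the willingness to pay of low-background parents is positive, reflecting the positive first derivatives.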
Now combining the outcomes of these two models, many people jump to the conclusion that since we do observe segregation of schooling in practice, parental backgrounds must be complementary, so segregation is in fact efficient. Again the conclusion is that allowing Tiebout competition and parental sorting will most likely achieve efficient outcomes.
However, this conclusion is not correct, since even if the correct production function was (3.17), segregation would arise in the presence of credit market problems. In particular, the way that mixing is supposed to occur with (3.17) is that low-background parents make a payment to high-background parents so that the latter send their children to a mixed school. To see why such payments are necessary, recall that even with (3.17) the first derivatives are positive, that is
$\frac{\partial h_1}{\partial e_2} > 0$ and $\frac{\partial h_2}{\partial e_1} > 0$.
This means that everything else being equal all children benefit from being in the same class with other children with good backgrounds. With (3.17), however, children from better backgrounds benefit less than children from less good backgrounds. This implies that there have to be payments from parents of less good backgrounds to high-background parents.
Such payments are both difficult to implement in practice, and practically impossible taking into account the credit market problems facing parents of low socioeconomic status.
This implies that, if the true production function is (3.17) but there are credit market problems, we will observe segregation in equilibrium, and the segregation will be inefficient. Therefore we cannot simply appeal to Tiebout competition, or deduce efficiency from the equilibrium patterns of sorting.
Another implication of this analysis is that in the absence of credit market problems (and with complete markets), cross-partials determine the allocation of students to schools. With credit market problems, first derivatives become important. This is a general result, with a range of implications for empirical work.
4.2. The Benabou model. A similar point is developed by Benabou even in the absence of credit market problems, but relying on other missing markets. His model has competitive labor markets, and local externalities (externalities in schooling in the local area). All agents are assumed to be ex ante homogeneous, and will ultimately end up either low skill or high skill.
Utility of agent $i$ is assumed to be
$U_i = w_i - c_i - r_i$
where $w$ is the wage, $c$ is the cost of education, which is necessary to become either low skill or high skill, and $r$ is rent.
The cost of education is assumed to depend on the fraction of the agents in the neighborhood, denoted by $x$, who become high skill. In particular, we have $c_H(x)$ and $c_L(x)$ as the costs of becoming high skill and low skill. Both costs are decreasing in $x$, meaning that when there are more individuals acquiring high skill, becoming high skill is cheaper (positive peer group effects). In addition, we have
$$c_H(x) > c_L(x)$$
so that becoming high skill is always more expensive, and as shown in Figure 3.1
$$c_H'(x) < c_L'(x),$$
so that the effect of an increase in the fraction of high skill individuals in the neighborhood is bigger on the cost of becoming high skill.
Figure 3.1
Since all agents are ex ante identical, in equilibrium we must have
$$U(L) = U(H)$$
that is, the utility of becoming high skill and low skill must be the same.
Assume that the labor market in the economy is global, and that production takes the constant returns to scale form $F(H, L)$. The important implication here is that irrespective of where the worker obtains his education, he will receive the same wage as a function of his skill level.
Also assume that there are two neighborhoods of fixed size, and individuals will compete in the housing market to locate in one neighborhood or the other.
As shown in Figures 3.2 and 3.3, there can be two types of equilibria:
(1) Integrated city equilibrium, where in both neighborhoods there is a fraction $\hat{x}$ of individuals obtaining high education.
(2) Segregated city equilibrium, where one of the neighborhoods is homogeneous. For example, we could have a situation where one neighborhood has $x = 1$ and the other has $\tilde{x} < 1$, or one neighborhood has $x = 0$ and the other has $\bar{x} > 0$.
Figure 3.2. Integrated City Equilibrium
The important observation here is that only segregated city equilibria are "stable". To see this consider an integrated city equilibrium, and imagine relocating a fraction $\varepsilon$ of the high-skill individuals (that is, individuals getting high skills) from neighborhood 1 to neighborhood 2. This will reduce the cost of education in neighborhood 2, both for high and low skill individuals. But by assumption, it reduces it more for high skill individuals, so all high skill individuals will now pay higher rents to be in that neighborhood, and they will outbid low-skill individuals, taking the economy toward the segregated city equilibrium.
Figure 3.3. Segregated City Equilibrium
In contrast, the segregated city equilibrium is always stable. So we again have a situation in which segregation arises as the equilibrium outcome, and this is again because of reasoning that relies on the notion of "complementarity". As in the previous section, high-skill individuals can outbid the low-skill individuals because they benefit more from the peer group effects of high skill individuals.
But crucially there are again missing markets in this economy. In particular, rather than paying high skill individuals for the positive externalities that they create, as would be the case in complete markets, agents transact simply through the housing market. In the housing market, there is only one rent level, which both high and low skill individuals pay. In contrast, with complete markets, we can think of the pricing scheme for housing as being such that high skill individuals pay a lower rent (to be compensated for the positive externality that they are creating on the other individuals).
Therefore, there are missing markets, and efficiency is not guaranteed. Is the allocation with segregation efficient?
It turns out that it may or may not be. To see this consider the problem of a utilitarian social planner maximizing total output minus costs of education for workers. This implies that the social planner will maximize
$$F(H, L) - H_1 c_H(x_1) - H_2 c_H(x_2) - L_1 c_L(x_1) - L_2 c_L(x_2)$$
where
$$x_1 = \frac{H_1}{L_1 + H_1} \quad \text{and} \quad x_2 = \frac{H_2}{L_2 + H_2}.$$
This problem can be broken into two parts: first, the planner will choose the aggregate amount of skilled individuals, and then she will choose how to actually allocate them between the two neighborhoods. The second part is simply one of cost minimization, and the solution depends on whether
$$\Phi(x) = x c_H(x) + (1 - x) c_L(x)$$
is concave or convex. This function is simply the cost of giving high skills to a fraction $x$ of the population. When it is convex, it is best to choose the same level of $x$ in both neighborhoods, and when it is concave, the social planner minimizes costs by choosing two extreme values of $x$ in the two neighborhoods.
It turns out that this function can be convex, i.e. $\Phi''(x) > 0$. More specifically, we have:
$$\Phi''(x) = 2\left(c_H'(x) - c_L'(x)\right) + x\left(c_H''(x) - c_L''(x)\right) + c_L''(x)$$
We can have $\Phi''(x) > 0$ when the second and third terms are large. Intuitively, this can happen because although a high skill individual benefits more from being together with other high skill individuals, he is also creating a positive externality on low skill individuals when he mixes with them. This externality is not internalized, potentially leading to inefficiency.
This model gives another example of why equilibrium segregation does not imply efficient segregation.
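To make the planner's cost-minimization step concrete, the following is a minimal numerical sketch. The two cost functions are hypothetical, chosen only so that they satisfy the assumptions in the text (both decreasing in $x$, $c_H > c_L$, and $c_H'(x) < c_L'(x)$ for $x < 1$); they are not taken from Benabou's paper. With these particular functions $\Phi$ happens to be convex, so the planner would prefer the integrated allocation even though the stability argument above points toward segregation.

```python
# Hypothetical cost functions (assumptions for illustration only):
# both decreasing in x, c_H > c_L, and c_H'(x) < c_L'(x) for x < 1.
def c_L(x):
    return 1.0 + (1.0 - x) ** 2


def c_H(x):
    return 2.0 + 1.1 * (1.0 - x) ** 2


def phi(x):
    """Cost of giving high skills to a fraction x of a neighborhood of size 1."""
    return x * c_H(x) + (1.0 - x) * c_L(x)


def phi_second_derivative(x, h=1e-4):
    """Central finite-difference approximation of Phi''(x)."""
    return (phi(x + h) - 2.0 * phi(x) + phi(x - h)) / h ** 2


def total_cost(x1, x2):
    """Total education cost with two neighborhoods of equal size."""
    return phi(x1) + phi(x2)


# Same aggregate share of high-skill agents (one half), allocated two ways.
integrated = total_cost(0.5, 0.5)
segregated = total_cost(1.0, 0.0)

print("Phi''(0.5) is approximately", phi_second_derivative(0.5))
print("integrated cost:", integrated, "segregated cost:", segregated)
```

Replacing the quadratic terms with linear ones (for example $c_L(x) = 1 - 0.2x$ and $c_H(x) = 2 - 0.5x$, again hypothetical) makes $\Phi$ concave, in which case the segregated allocation minimizes costs.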
4.3. Empirical issues and evidence. Peer group effects are generally difficult to identify. In addition, we can think of two alternative formulations where one is practically impossible to identify satisfactorily. To discuss these issues, let us go back
to the previous discussion, and recall that the two "structural" formulations, (3.15) and (3.16), have very similar reduced forms, but the peer group effects work quite differently, and have different interpretations. In (3.15), it is the (predetermined) characteristics of my peers that determine my outcomes, whereas in (3.16), it is the outcomes of my peers that matter. Above we saw how to identify externalities in human capital, which is in essence similar to the structural form in (3.15). More explicitly, the equation of interest is
$$y_{ij} = \theta x_{ij} + \alpha \bar{X}_j + \varepsilon_{ij} \tag{3.18}$$
where $\bar{X}$ is the average characteristic (e.g., average schooling) and $y_{ij}$ is the outcome of the $i$th individual in group $j$. Here, for identification all we need is exogenous variation in $\bar{X}$.
The alternative is
$$y_{ij} = \theta x_{ij} + \alpha \bar{Y}_j + \varepsilon_{ij} \tag{3.19}$$
where $\bar{Y}$ is the average of the outcomes. Some reflection will reveal why the parameter $\alpha$ is now practically impossible to identify. Since $\bar{Y}_j$ does not vary by individual, this regression amounts to one of $\bar{Y}_j$ on itself at the group level. This is a serious econometric problem. One imperfect way to solve this problem is to replace $\bar{Y}_j$ on the right hand side by $\bar{Y}_j^{-i}$, which is the average excluding individual $i$. Another approach is to impose some timing structure. For example:
$$y_{ijt} = \theta x_{ijt} + \alpha \bar{Y}_{j,t-1} + \varepsilon_{ijt}$$
There are still some serious problems irrespective of the approach taken. First, the timing structure is arbitrary, and second, there is no way of distinguishing peer group effects from "common shocks".
As an example consider the paper by Sacerdote, which uses random assignment of roommates at Dartmouth. He finds that the GPAs of randomly assigned roommates are correlated, and interprets this as evidence for peer group effects. The next table summarizes some of the key results.
Figure 3.4
Despite the very nice nature of the experiment, the conclusion is problematic, because Sacerdote attempts to identify (3.19) rather than (3.18). For example, to the extent that there are common shocks to both roommates (e.g., they are in a noisier dorm), this may not reflect peer group effects. Instead, the problem would not have arisen if the right-hand side regressor was some predetermined characteristic of the
roommate (i.e., then we would be estimating something similar to (3.18) rather than (3.19)).
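The common-shock concern can be illustrated with a small simulation. This is only a sketch under invented assumptions: the data-generating process, the variances, and all variable names below are hypothetical and have nothing to do with Sacerdote's actual data. The true peer effect is set to zero, yet regressing one roommate's outcome on the other's outcome (in the spirit of (3.19)) yields a clearly positive slope because both share a room-level shock, while regressing on the roommate's predetermined characteristic (in the spirit of (3.18)) yields a slope close to zero.

```python
import random


def ols_slope(xs, ys):
    """Slope from a bivariate OLS regression of ys on xs (with intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def simulate_rooms(n_rooms=20000, seed=0):
    """Simulate roommate pairs with a common shock and NO true peer effect."""
    rng = random.Random(seed)
    own_gpa, mate_gpa, mate_background = [], [], []
    for _ in range(n_rooms):
        shock = rng.gauss(0.0, 1.0)          # common shock, e.g. a noisy dorm
        background_1 = rng.gauss(0.0, 1.0)   # predetermined characteristics
        background_2 = rng.gauss(0.0, 1.0)
        gpa_1 = background_1 + shock + rng.gauss(0.0, 1.0)
        gpa_2 = background_2 + shock + rng.gauss(0.0, 1.0)
        own_gpa.append(gpa_1)
        mate_gpa.append(gpa_2)
        mate_background.append(background_2)
    return own_gpa, mate_gpa, mate_background


own_gpa, mate_gpa, mate_background = simulate_rooms()
slope_on_outcome = ols_slope(mate_gpa, own_gpa)            # spuriously positive
slope_on_background = ols_slope(mate_background, own_gpa)  # close to zero

print("slope on roommate outcome:", slope_on_outcome)
print("slope on roommate background:", slope_on_background)
```

With unit variances as assumed here, the population slope on the roommate's outcome is the share of the common shock in the outcome variance (one third), even though no student affects the other.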
Part 2
Incentives, Agency and Efficiency Wages
A key issue in all organizations is how to give the right incentives to employees. This topic is central to contract theory and organizational economics, but it also needs to be taken into account in labor economics, especially in order to better understand the employment relationship.
Here we give a quick overview of the main issues.
CHAPTER 4
Moral Hazard: Basic Models
Moral hazard refers to a situation where an individual takes a "hidden action" that affects the payoffs to his employer (the principal). We generally think of this as the level of "effort", but other actions, such as the composition of effort, the allocation of time, or even stealing, are potential examples of moral hazard-type behavior. Although effort is not observed, some of the outcomes that the principal cares about, such as output or performance, are observed.
Because the action is hidden, the principal cannot simply dictate the level of effort. She has to provide incentives through some other means. The simplest way to approach the problem is to think of the principal as providing "high-powered" incentives, and rewarding success. This will work to some degree, but will run into two sorts of problems:
(1) Limited liability
(2) Risk
More explicitly, high-powered incentives require the principal to punish the agent as well as to reward him, but limited liability (i.e., the fact that the agent cannot be paid a negative wage in many situations) implies that this is not possible. Therefore high-powered incentives come at the expense of a high average level of payments.
The risk problem is that rewarding the agent as a function of performance conflicts with optimal risk sharing between the principal and the agent. Generally, we think of the agent as earning most of his living from this wage income, whereas the principal employs a number of similar agents, or is a corporation with diffuse ownership. In that case, we can think of the firm as risk neutral and the employment contract should not only provide incentives to the agent, but also insure him
against fluctuations in performance. More generally, even if the firm is risk averse, the employment contract should involve an element of risk sharing between the firm and the worker. Risk sharing in employment contracts will often conflict with the provision of incentives.
Because the incentive-insurance tradeoff is a central problem, moral hazard problems often arise in the context of health insurance; in fact, the term moral hazard originates from this literature. In particular, the idea is that if an individual is provided with full insurance against all of the possible health expenses that he may incur (which is good from a risk-sharing point of view), he may no longer be discouraged from undertaking hazardous behavior, potentially increasing the risk of bad health outcomes.
1. The Baseline Model of Incentive-Insurance Tradeoff
Let us start with the one agent case, and build on the key paper by Holmström, "Moral Hazard and Observability," Bell Journal, 1979.
Imagine a single agent is contracting with a single principal.
The agent's utility function is
$$H(w, a) = U(w) - c(a)$$
where $w$ is the wage he receives, $U$ is a concave (risk-averse) utility function and $a \in \mathbb{R}_+$ denotes his action, with $c(\cdot)$ an increasing and convex cost function. Basically, the agent likes more income and dislikes effort.
The agent has an "outside option," representing the minimum amount that he will accept for accepting the employment contract (for example, this outside option may be working for another firm or self-employment). This is represented by some reservation utility $\bar{H}$ such that the agent would not participate in the employment relationship unless he receives at least this level of utility. This will lead to his participation constraint.
The action that the agent takes affects his performance, which we simply think of as output here. Let us denote output by $x$, and write
$$x(a, \theta)$$
where $\theta \in \mathbb{R}$ is the state of nature. In other words,
$$x : \mathbb{R}_+ \times \mathbb{R} \to \mathbb{R}.$$
This emphasizes that output depends on effort and some other influences outside the control of the agent and the principal. There is therefore a stochastic element.
Since greater effort should correspond with good things, we assume that
$$x_a \equiv \frac{\partial x}{\partial a} > 0.$$
If output were a non-stochastic function of effort, contracting on output would be equivalent to contracting on effort, and risk sharing issues would not arise. Here $\theta$ is the source of risk.
The principal cares about output minus costs, so her utility function is
$$V(x - w)$$
where $V$ is also an increasing concave utility function. A special case of interest is where $V$ is linear, so that the principal is risk neutral.
What is a contract here?
Let $\Omega$ be the set of observable and contractible events, so when only $x$ is observable, $\Omega = \mathbb{R}$. When any two of $x$, $a$, and $\theta$ are observable, then $\Omega = \mathbb{R}_+ \times \mathbb{R}$ (the third one is redundant given the information concerning the first two).
A contract is a mapping
$$s : \Omega \to \mathbb{R}$$
which specifies how much the agent will be paid.
Alternatively, when there is limited liability so that the agent cannot be paid a negative wage,
$$s : \Omega \to \mathbb{R}_+.$$
Here let us start with the case without limited liability.
Digression: what is the difference between observable and contractible? What happens if something is observable only by the principal and the agent, but by nobody else?
What we have here is a dynamic game, so the timing is important. It is:
Timing:
(1) The principal offers a contract $s : \Omega \to \mathbb{R}$ to the agent.
(2) The agent accepts or rejects the contract. If he rejects the contract, he receives his outside utility $\bar{H}$.
(3) If the agent accepts the contract $s : \Omega \to \mathbb{R}$, then he chooses effort $a$.
(4) Nature draws $\theta$, determining $x(a, \theta)$.
(5) The agent receives the payment specified by contract $s$.
This is a game of incomplete information and as in signaling games, we will look for a Perfect Bayesian Equilibrium. However, in this context, the concept of Perfect Bayesian Equilibrium will be strong enough.
2. Incentives without Asymmetric Information
Let us start with the case of full information. Then the problem is straightforward.
The principal chooses the contract $s(x, a)$ (why is it a function of both $x$ and $a$?), and the agent chooses $a$. The Perfect Bayesian Equilibrium can be characterized by backward induction. The first interesting action is at step 3, where the agent chooses the effort level given the contract, and then at step 2, where the agent decides whether to accept contract $s$. Given what types of contracts will be accepted by the agent and what the corresponding effort level will be, at step 1 the principal chooses the contract that maximizes her utility. By analogy with oligopoly games, we can think of the principal, who moves first, as a Stackelberg leader. As usual with Stackelberg leaders, when choosing the contract the principal anticipates the action that the agent will choose. Thus, we should think of the principal as choosing the effort level as well, and the optimization condition of the agent will be
a constraint for the principal. This is what we refer to as the incentive compatibility constraint (IC).
Thus the problem is
$$\begin{aligned}
&\max_{s(x,a),\,a} \; E\left[V(x - s(x, a))\right] \\
&\text{s.t.} \quad E\left[H(s(x, a), a)\right] \ge \bar{H} && \text{Participation Constraint (PC)} \\
&\text{and} \quad a \in \arg\max_{a'} E\left[H(s(x, a'), a')\right] && \text{Incentive Constraint (IC)}
\end{aligned}$$
where expectations are taken over the distribution of $\theta$.
This problem has exactly the same structure as the canonical moral hazard problem, but is much simpler, because the principal is choosing $s(x, a)$. In particular, she can choose $s$ such that $s(x, a) = -\infty$ for all $a \neq a^*$, thus effectively implementing $a^*$. This is because there is no moral hazard problem here given that there is no hidden action.
Therefore, presuming that the level of effort $a^*$ is the optimum from the point of view of the principal, the problem collapses to
$$\max_{s(x)} \; E\left[V(x - s(x))\right]$$
subject to
$$E\left[U(s(x))\right] \ge \bar{H} + c(a^*)$$
where we have already imposed that the agent will choose $a^*$ and the expectation is conditional on effort level $a^*$. We have also dropped the incentive compatibility constraint, and rewritten the participation constraint to take into account the equilibrium level of effort by the agent.
This is simply a risk-sharing problem, and the solution is straightforward. It can be found by setting up a simple Lagrangean:
$$\min_{\lambda} \max_{s(x)} \; \mathcal{L} = E\left[V(x - s(x))\right] - \lambda\left[\bar{H} + c(a^*) - E\left[U(s(x))\right]\right]$$
Now this might appear to be a complicated problem, because we are choosing a function $s(x)$, but this specific case is not difficult because there is no constraint on the form of the function, so the maximization can be carried out pointwise (think, for example, that $x$ only took discrete values).
We might then be tempted to write:
$$E\left[V'(x - s(x))\right] = \lambda E\left[U'(s(x))\right].$$
However, this is not quite right, and somewhat misleading. Recall that $x = x(a, \theta)$, so once we fix $a = a^*$, and conditional on $x$, there is no more uncertainty. In other words, the right way to think about the problem is that for a given level of $a$, the variation in $\theta$ induces a distribution of $x$, which typically we will refer to as $F(x \mid a)$ in what follows. For now, since $a = a^*$ and we can choose $s(x)$ separately for each $x$, there is no more uncertainty conditional on $x$.
Hence, the right first-order conditions are:
$$\frac{V'(x - s(x))}{U'(s(x))} = \lambda \quad \text{for all } x, \tag{4.1}$$
i.e., perfect risk sharing. In all states, represented by $x$, the marginal value of one more dollar to the principal divided by the marginal value of one more dollar to the agent must be constant.
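As an illustration of condition (4.1), the following sketch assumes exponential (CARA) utilities for both parties, $V(y) = -e^{-r_P y}$ and $U(w) = -e^{-r_A w}$. This functional form, the risk-aversion coefficients, and the value of the multiplier are all assumptions made for the example, not part of the model above. Under these assumptions (4.1) can be solved by hand and gives a linear sharing rule $s(x) = \frac{r_P}{r_P + r_A} x + \text{constant}$; the code simply checks that the ratio of marginal utilities is the same at every output level.

```python
import math

# Hypothetical parameters: absolute risk aversion of principal and agent,
# and an arbitrary positive value for the multiplier lambda.
R_P = 0.5
R_A = 2.0
LAM = 1.5


def v_prime(y):
    """Marginal utility of the principal, V(y) = -exp(-R_P * y)."""
    return R_P * math.exp(-R_P * y)


def u_prime(w):
    """Marginal utility of the agent, U(w) = -exp(-R_A * w)."""
    return R_A * math.exp(-R_A * w)


def sharing_rule(x):
    """Closed-form solution of V'(x - s) / U'(s) = LAM under CARA utilities."""
    slope = R_P / (R_P + R_A)
    intercept = math.log(LAM * R_A / R_P) / (R_P + R_A)
    return intercept + slope * x


def marginal_utility_ratio(x):
    """Left-hand side of (4.1) evaluated at the sharing rule."""
    s = sharing_rule(x)
    return v_prime(x - s) / u_prime(s)


for output in (-1.0, 0.0, 2.5, 10.0):
    print(output, sharing_rule(output), marginal_utility_ratio(output))
```

In this parametrization the agent's share of output risk, $r_P/(r_P + r_A)$, shrinks as the principal becomes less risk averse relative to the agent, which is consistent with the idea that a nearly risk-neutral principal should bear almost all of the risk.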
3. Incentives-Insurance Trade-off
Next, let us move to the real principal-agent model where $\Omega$ only includes the output performance, $x$, so feasible contracts are of the form $s(x)$, and are not conditioned on $a$. The effort is chosen by the agent to maximize his utility, and the incentive compatibility constraint will play an important role. The problem can be written in a similar form to before as
$$\begin{aligned}
&\max_{s(x),\,a} \; E\left[V(x - s(x))\right] \\
&\text{s.t.} \quad E\left[H(s(x), a)\right] \ge \bar{H} && \text{Participation Constraint (PC)} \\
&\text{and} \quad a \in \arg\max_{a'} E\left[H(s(x), a')\right] && \text{Incentive Constraint (IC)}
\end{aligned}$$
with the major difference that $s(x)$ instead of $s(x, a)$ is used.
As already hinted above, the analysis is more tractable when we suppress $\theta$, and instead directly work with the distribution function of outcomes as a function of the effort level, $a$:
$$F(x \mid a)$$
A natural assumption is
$$F_a(x \mid a) < 0,$$
which is related to and implied by $x_a > 0$. Expressed differently, an increase in $a$ leads to a first-order stochastically dominant shift in $F$. Recall that we say a distribution function $F$ first-order stochastically dominates another $G$ if
$$F(z) \le G(z)$$
for all $z$ (alternatively, the definition of first-order stochastic dominance may be strengthened by requiring the inequality to be strict for some $z$).
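The dominance condition is easy to check numerically for a specific family. The sketch below assumes, purely for illustration, that output is exponentially distributed with mean equal to effort, $F(x \mid a) = 1 - e^{-x/a}$ for $x \ge 0$; this family is not part of the model in the text. In this family higher effort lowers the distribution function at every positive output level.

```python
import math


def cdf(x, a):
    """Exponential output distribution with mean a (illustrative assumption)."""
    return 1.0 - math.exp(-x / a) if x > 0 else 0.0


def dominates(a_high, a_low, grid):
    """True if F(. | a_high) <= F(. | a_low) at every grid point,
    i.e. the first distribution first-order stochastically dominates the second."""
    return all(cdf(z, a_high) <= cdf(z, a_low) for z in grid)


grid = [0.1 * k for k in range(0, 101)]  # output levels from 0 to 10

print(dominates(2.0, 1.0, grid))
print(dominates(1.0, 2.0, grid))
```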
Using this way of expressing the problem, the principal's problem now becomes
$$\begin{aligned}
&\max_{s(x),\,a} \int V(x - s(x))\, dF(x \mid a) \\
&\text{s.t.} \quad \int \left[U(s(x)) - c(a)\right] dF(x \mid a) \ge \bar{H} \\
&\phantom{\text{s.t.}} \quad a \in \arg\max_{a'} \int \left[U(s(x)) - c(a')\right] dF(x \mid a')
\end{aligned}$$
This problem is considerably more difficult, because the second constraint, the IC, is no longer an inequality constraint, but an abstract constraint requiring the value of a function, $\int \left[U(s(x)) - c(a')\right] dF(x \mid a')$, to be highest when evaluated at $a' = a$.
It is very difficult to make progress on this unless we take some shortcuts. The standard shortcut is called the "first-order approach," and involves replacing the second constraint with the first-order conditions of the agent. Now this would be no big step if the agent's problem
$$\max_{a'} \int \left[U(s(x)) - c(a')\right] dF(x \mid a')$$
was strictly concave, but we can make no such statement since this problem depends on $s(x)$, which is itself the choice variable that the principal chooses. For sufficiently non-convex $s$ functions, the whole program will be non-concave, thus first-order conditions will not be sufficient.
Therefore, the first-order approach always comes with some risks (and one should not apply it without recognizing these risks and the potential for making mistakes,
though there are many instances in which it is applied when it should not be). Nevertheless, bypassing those, and in addition assuming that $F$ is twice continuously differentiable, so that the density function $f$ exists, and in turn is differentiable with respect to $a$, the first-order condition for the agent is:
$$\int U(s(x)) f_a(x \mid a)\, dx = c'(a).$$
Now using this, we can modify the principal's problem to
$$\min_{\lambda, \mu} \max_{s(x),\,a} \; \mathcal{L} = \int \left\{ V(x - s(x)) + \lambda\left[U(s(x)) - c(a) - \bar{H}\right] + \mu\left[U(s(x)) \frac{f_a(x \mid a)}{f(x \mid a)} - c'(a)\right] \right\} f(x \mid a)\, dx$$
Again carrying out point-wise maximization with respect to $s(x)$:
$$0 = \frac{\partial \mathcal{L}}{\partial s(x)} = -V'(x - s(x)) + \lambda U'(s(x)) + \mu U'(s(x)) \frac{f_a(x \mid a)}{f(x \mid a)} \quad \text{for all } x$$
This implies
$$\frac{V'(x - s(x))}{U'(s(x))} = \lambda + \mu \frac{f_a(x \mid a)}{f(x \mid a)}. \tag{4.2}$$
The nice thing now is that this is identical to (4.1) if $\mu = 0$, that is, if the incentive compatibility constraint is slack. As a corollary, if $\mu \neq 0$, and the incentive compatibility constraint is binding, there will be a trade-off between insurance and incentives, and (4.2) will be different from (4.1). What sign should $\mu$ have? Since $\mu$ is the multiplier associated with an equality constraint, we cannot say this on a priori grounds. But it can be proved under some regularity conditions (in particular, when the Monotone Likelihood Ratio Property introduced below holds) that $\mu > 0$. Let us assume that this is indeed the case for now.
Note also that the solution must feature $\lambda > 0$, i.e., the participation constraint is binding. Why is this? Suppose not. Then, the principal could reduce $s(x)$ for all $x$ by a little without violating incentive compatibility and increase her net income. Equivalently, in this problem the agent is receiving exactly what he would in his next best opportunity, and is obtaining no rents. (There are no rents, because there
are no constraints on the level of payments; we will see below how this will change with limited liability constraints).
We can also use (4.2) to derive further insights about the trade-off between insurance and incentives.
To do this, let us assume that $V'$ is constant, so that the principal is risk neutral. Let us ask what it would take to make sure that we have full insurance, i.e., $V'(x - s(x))/U'(s(x)) = \text{constant}$. Since $V'$ is constant, this is only possible if $U'$ is constant. Suppose that the agent is risk-averse, so that $U$ is strictly concave or $U'$ is strictly decreasing. Therefore, full insurance (or full risk sharing) is only possible if $s(x)$ is constant. But in turn, if $s(x)$ is constant, the incentive compatibility constraint will typically be violated (unless the optimal contract asks for $a = 0$), and the agent will choose $a = 0$.
Next, consider another extreme case, where the principal simply sells the firm to the agent for a fixed amount, so $s(x) = x - s_0$. In this case, the agent's first-order condition will give a high level of effort (we can think of this as the "first-best" level of effort, though this is not literally true, since this level of effort potentially depends on $s_0$):
$$\int U(x - s_0) f_a(x \mid a)\, dx = c'(a).$$
This higher level of effort comes at the expense of no insurance for the agent.
Instead of these two extremes, the optimal contract will be "second-best", trading off incentives and insurance.
We can interpret the solution (4.2) further. But first, note that as the optimization problem already makes clear, as long as the IC constraint of the agent has a unique solution, once the agent signs the contract $s(x)$, there is no uncertainty about the action choice $a$. Nevertheless, lack of full insurance means that the agent is being punished for low realizations of $x$. Why is this?
At some intuitive level, this is because had it not been so, ex ante the agent would have had no incentive to exert high effort. What supports high effort here is the threat of punishment ex post.
This interpretation suggests that there is no need for the principal to draw inferences about the effort choice $a$ from the realizations of $x$. However, it turns out that the optimal way of incentivizing the agent has many similarities to an optimal signal extraction problem.
To develop this intuition, consider the following maximum likelihood estimation problem: we know the distribution of $x$ conditional on $a$, we observe $x$, and we want to estimate $a$. The estimate is a solution to the following maximization problem
$$\max_{a'} \; \ln f(x \mid a'),$$
for given $x$, which has the first-order condition
$$\frac{f_a(x \mid a')}{f(x \mid a')} = 0$$
which can be solved for $a(x)$. Let the level of effort that the principal wants to implement be $\bar{a}$; then when $a(x) = \bar{a}$, this first-order condition is satisfied at $a' = \bar{a}$.
Now going back to (4.2), we can write this as:
$$\frac{V'(x - s(x))}{U'(s(x))} = \lambda + \mu \frac{f_a(x \mid \bar{a})}{f(x \mid \bar{a})}.$$
If $a(x) > \bar{a}$, then $f_a(x \mid \bar{a})/f(x \mid \bar{a}) > 0$. Since $\mu > 0$, this implies that $V'/U'$ must be greater and therefore $U'$ must be lower. This is in turn possible only when $s(x)$ is increasing in $x$. Therefore, when the realization of output is good news relative to what was expected, the agent is rewarded; when it is bad news, he is punished.
Thus in a way, the principal is acting as if she is trying to infer what the agent did, even though of course the principal knows the agent's action along the equilibrium path.
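The signal-extraction reading of (4.2) can be sketched with a specific example. The assumptions here are all illustrative: output is exponentially distributed with mean $a$, so that $f_a(x \mid a)/f(x \mid a) = (x - a)/a^2$ and the maximum likelihood estimate is $a(x) = x$; the principal is risk neutral ($V' = 1$); and the agent has $U(s) = \ln s$, so that $1/U'(s) = s$ and (4.2) reads $s(x) = \lambda + \mu (x - \bar{a})/\bar{a}^2$. The multiplier values passed to `contract` are hypothetical numbers, not solved from the participation and incentive constraints.

```python
import math


def log_density(x, a):
    """Log of the exponential density with mean a (illustrative assumption)."""
    return -math.log(a) - x / a


def score(x, a, h=1e-6):
    """Numerical likelihood-ratio term f_a / f, i.e. d ln f(x | a) / da."""
    return (log_density(x, a + h) - log_density(x, a - h)) / (2.0 * h)


def contract(x, a_bar, lam, mu):
    """Payment implied by (4.2) when V' = 1 and U(s) = ln s, so 1 / U'(s) = s.
    lam and mu are hypothetical multiplier values."""
    return lam + mu * (x - a_bar) / a_bar ** 2


A_BAR = 2.0  # effort level the principal wants to implement (assumed)

for output in (1.0, 2.0, 3.0):
    print(output, score(output, A_BAR), contract(output, A_BAR, lam=3.0, mu=1.0))
```

Outputs above $\bar{a}$ have a positive score (they suggest that effort was higher than $\bar{a}$) and are paid more; outputs below $\bar{a}$ have a negative score and are paid less, which is the reward and punishment pattern described in the text.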
4. The Form of Performance Contracts
Can we say anything else on the form of $s(x)$? At a minimum, we would like to say that $s(x)$ is increasing, so that greater output leads to greater remuneration for the agent, which seems to be a feature of real world contracts for managers, workers etc.
Unfortunately, this is not true without putting more structure on technology. Consider the following example where the agent chooses between two effort levels, high and low:
$$a \in \{a_H, a_L\}$$
and the distribution function of output conditional on effort is as follows:
$$F(x \mid a_H) = \begin{cases} 4 & \text{with probability } \tfrac{1}{2} \\ 2 & \text{with probability } \tfrac{1}{2} \end{cases}$$
$$F(x \mid a_L) = \begin{cases} 3 & \text{with probability } \tfrac{1}{2} \\ 1 & \text{with probability } \tfrac{1}{2} \end{cases}$$
The agent has an arbitrary strictly concave utility function.
It is quite clear that in this case full risk-sharing can be achieved (what does this mean in terms of the multipliers in our formulation above?). In particular, full risk sharing is possible if the principal punishes the agent whenever 1 or 3 is observed. In fact, the following contract would do the trick:
$$s(2) = s(4) = \bar{H} + c(a_H)$$
$$s(1) = s(3) = -K$$
where $K$ is a very large number. Thus the agent is punished severely for the outcomes 1 or 3, since these occur only when he chooses low effort. When the outcome is 2 or 4, he gets a payment consistent with his participation constraint.
Clearly this contract is not increasing in $x$; in particular, $s(3) < s(2)$.
You might wonder whether there is something special here because of the discrete distribution of $x$. This is not the case. For example, a continuous distribution with peaks at $\{2, 4\}$ for $a = a_H$ and $\{1, 3\}$ for $a = a_L$ would do the same job.
So how can we ensure that $s(x)$ is increasing in $x$?
Milgrom, Bell Journal, 1981, "Good News, Bad News" shows the following result:
A sufficient condition for $s(x)$ to be increasing is that higher values of $x$ are good news about $a$, i.e.,
$$\frac{f_a(x \mid a)}{f(x \mid a)} \text{ is increasing in } x$$
or, equivalently,
$$\frac{f(x \mid a_1)}{f(x \mid a_2)} \text{ is increasing in } x \text{ for } a_1 > a_2.$$
This is referred to as the Monotone Likelihood Ratio Property (MLRP).
We can also note that this implies that
$$\int x f(x \mid a_1)\, dx > \int x f(x \mid a_2)\, dx \quad \text{for } a_1 > a_2,$$
meaning this condition is sufficient (but not necessary) for the expected value of $x$ to increase with the level of effort.
Given the characterization above in terms of inferring $a(x)$ from $x$, we need $a(x)$ to be increasing, so that higher output levels correspond to better news about the level of effort that the agent must have exerted.
This is clearly not the case with our example above, where 3 relative to 2 is bad news about the agent having exerted a high level of effort.
When we assume MLRP, we can also show that the multiplier $\mu$ must be positive. To see this, note that if MLRP holds and $\mu < 0$, then with the same argument as above (4.2) implies that $s(x)$ must be decreasing in $x$ everywhere (since $f_a(x \mid a)/f(x \mid a)$ is increasing, $V'/U'$ would be decreasing in $x$). However, if $s(x)$ is decreasing everywhere, the agent would necessarily choose the lowest effort level and the incentive compatibility constraint would then be slack, and thus $\mu$ must be equal to zero, leading to a contradiction with the hypothesis that $\mu < 0$. This establishes that when MLRP holds, we must have $\mu > 0$.
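For discrete distributions, MLRP can be checked mechanically. The sketch below uses the cross-product form of the condition, $f(x_j \mid a_1) f(x_i \mid a_2) \ge f(x_i \mid a_1) f(x_j \mid a_2)$ for $x_i < x_j$, which avoids dividing by zero probabilities. The first pair of distributions is the two-effort example from the text; the second pair is a hypothetical example added for contrast.

```python
def satisfies_mlrp(f_high, f_low):
    """Check the monotone likelihood ratio property for two discrete
    distributions over the same outcomes, listed in increasing order of x.
    Uses cross-products so that zero probabilities cause no division errors."""
    n = len(f_high)
    return all(
        f_high[j] * f_low[i] >= f_high[i] * f_low[j]
        for i in range(n)
        for j in range(i + 1, n)
    )


# Outcomes x = 1, 2, 3, 4 in the two-effort example from the text.
example_high = [0.0, 0.5, 0.0, 0.5]
example_low = [0.5, 0.0, 0.5, 0.0]

# A hypothetical pair in which higher output is always better news about effort.
monotone_high = [0.1, 0.2, 0.3, 0.4]
monotone_low = [0.4, 0.3, 0.2, 0.1]

print(satisfies_mlrp(example_high, example_low))
print(satisfies_mlrp(monotone_high, monotone_low))
```

The example from the text fails the check at the pair $x = 2$ and $x = 3$: moving from 2 to 3 lowers the likelihood of high effort relative to low effort, which is why the contract there is not increasing in $x$.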
5. The Use of Information: Sufficient Statistics
Finally, another important result that follows from this framework is that of a sufficient statistic result. Imagine that in addition to $x$, the principal observes another signal of the agent's effort, $y$, in the sense that $y$ is a random variable with distribution $G(y \mid a)$. The principal does not care about $y$ per se, and still wants to maximize $E(V(x - s))$.
The key question is whether the principal should offer a contract $s(x, y)$ which depends (non-trivially) on the signal $y$ as well as the output $x$.
The answer is: yes, if $y$ helps reduce noise or yields extra information on $a$, and no if $x$ is a sufficient statistic for $(x, y)$ in the estimation of $a$. Recall that a statistic $T$ is a sufficient statistic for some family of random variables $\mathcal{F}$ in estimating a parameter $\theta \in \Theta$ if and only if the marginal distributions of $\theta$ conditional on $T$ and on $\mathcal{F}$ coincide, that is,
$$f(\theta \mid T) = f(\theta \mid \mathcal{F}) \quad \text{for all } \theta \in \Theta.$$
To develop this point more formally, let us look at the first-order conditions for choosing $s(x, y)$. By direct analogy to before, the first-order conditions imply:
$$\frac{V'(x - s(x, y))}{U'(s(x, y))} = \lambda + \mu \frac{f_a(x, y \mid a)}{f(x, y \mid a)}$$
The problem we are interested in can be posed as whether $s(x, y) = S(x)$ for all $x$ and $y$, where $s(x, y)$ is the solution to the maximization problem.
Equivalently,
$$s(x, y) = S(x) \text{ for all } x \text{ and } y \quad \text{if and only if} \quad \frac{f_a(x, y \mid a)}{f(x, y \mid a)} = k(x \mid a) \text{ for all } x \text{ and } y$$
for some function $k$.
What does this condition mean? The condition
$$\frac{f_a(x, y \mid a)}{f(x, y \mid a)} = k(x \mid a) \quad \forall x, y$$
is equivalent to $f(x, y \mid a) = g(x, y) h(x \mid a)$ (simply differentiate both sides with respect to $a$ to verify this claim). This condition, in turn, means that conditional on $x$, $y$ has no additional information on $a$, or using Bayes' rule
$$f(a \mid x, y) = f(a \mid x)$$
that is, $x$ is a sufficient statistic for $(x, y)$ with respect to inferences about $a$.
The implication is the important result suggested above: the optimal contract conditional on $x$ and $y$, $s(x, y)$, will not use $y$ if and only if $x$ is a sufficient statistic for $(x, y)$ with respect to $a$.
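The factorization condition can be illustrated numerically with two hypothetical signal structures; the distributions below are assumptions chosen only for the example. In the first, $y$ equals $x$ plus independent noise, so $f(x, y \mid a) = f(x \mid a)\, g(y \mid x)$ and the likelihood-ratio term does not depend on $y$. In the second, $y$ equals $a$ plus noise independently of $x$, so $y$ carries extra information about effort and the likelihood-ratio term varies with $y$.

```python
import math


def log_f_x(x, a):
    """Log density of output: exponential with mean a (illustrative assumption)."""
    return -math.log(a) - x / a


def log_normal(z, mean):
    """Log density of a normal variable with the given mean and unit variance."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (z - mean) ** 2


def log_joint_garbled(x, y, a):
    """y = x + noise: y depends on effort only through x."""
    return log_f_x(x, a) + log_normal(y, x)


def log_joint_informative(x, y, a):
    """y = a + noise, independent of x given a: y is a separate signal of effort."""
    return log_f_x(x, a) + log_normal(y, a)


def score(log_joint, x, y, a, h=1e-6):
    """Numerical f_a(x, y | a) / f(x, y | a) for a given joint log density."""
    return (log_joint(x, y, a + h) - log_joint(x, y, a - h)) / (2.0 * h)


# Same output x = 3 and effort a = 2, two different realizations of the signal y.
garbled_scores = [score(log_joint_garbled, 3.0, y, 2.0) for y in (0.0, 5.0)]
informative_scores = [score(log_joint_informative, 3.0, y, 2.0) for y in (0.0, 5.0)]

print(garbled_scores)      # the two entries coincide: y should not enter s(x, y)
print(informative_scores)  # the two entries differ: y should enter s(x, y)
```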
CHAPTER 5
M oral Ha z a r d with Lim it e d Liab ility, Mu lt itasking, Ca r e e r
Concerns, and Applications
1. L im ited Liability
Let us modify the baseline moral hazard model b y adding a lim it ed liability
constraint,sothats (x) 0.
The problem becomes:
max
s(x),a
Z
V (x s(x))dF (x | a)
subject to
Z
[U(s(x) c(a))] dF (x | a)
H
a arg max
a
0
Z
[U(s(x)) c(a
0
)] dF (x | a
0
)
s (x) 0 for all x
Again taking the rst-order approach, and assigning a multiplier η (x) to the last
set of constrain ts, the rst but her conditions become:
V
0
(x s(x)) =
λ + μ
f
a
(x | a)
f(x | a)
¸
U
0
(s(x)) + η (x) .
If $s(x)$ was going to be positive for all $x$ in any case, the multiplier for the last set of constraints, $\eta(x)$, would be equal to zero, and the problem would have an identical solution to before.
However, if, previously, $s(x) < 0$ for some $x$, the structure of the solution has to change. In particular, to obtain the intuition, suppose that we shift up the entire function $s(x)$ to $\tilde{s}(x)$ so that $\tilde{s}(x) \ge 0$. Since the participation constraint was binding at $s(x)$, it must be slack at $\tilde{s}(x)$. Clearly this will not be optimal, and in fact, because of income effects, this shifted-up schedule may no longer lead to the same optimal choice of effort for the agent. In particular, as we increase the level of payments at low realizations of $x$, the entire payment schedule has to change in a more complex way. Nevertheless, this “shifting-up” intuition makes it clear that the participation constraint will no longer be binding, thus $\lambda = 0$.
This informally is the basis of the intuition that without limited liability constraints, there are no rents; but with limited liability there will be rents, making the agent’s participation constraint slack.
Let us now illustrate this with a simple example. Suppose that effort takes two values, $a \in \{a_L, a_H\}$. Assume that output also takes only two values, $x \in \{0, 1\}$; moreover,
$$F(x \mid a_H): \quad x = 1 \text{ with probability } 1,$$
$$F(x \mid a_L): \quad x = \begin{cases} 1 & \text{with probability } q \\ 0 & \text{with probability } 1 - q. \end{cases}$$
Normalize $\bar{H}$ and $c(a_L)$ to zero, and assume $c(a_H) = c_H < 1 - q$.
Finally, to make things even simpler, assume that both the agent and the principal are risk neutral.
Let us first look at the problem without the limited liability constraint. The assumption that $c(a_H) = c_H < 1 - q$ implies that high effort is optimal, so in an ideal world this would be the effort level.
Let us start by assuming that the principal would like to implement this. In this case, the problem of the principal can be written as
$$\min_{s(0), s(1)} s(1)$$
subject to
$$s(1) - c_H \ge q\, s(1) + (1 - q)\, s(0)$$
$$s(1) - c_H \ge 0,$$
where $s(0)$ and $s(1)$ are the payments to the agent conditional on the outcome. (Why are these the only two control variables?)
The first constraint is the incentive compatibility constraint; it requires that the agent prefers to exert high effort and to receive the high payment rather than taking the gamble between high and low payment, while also saving the cost of effort (this statement is written presuming that $s(0) < s(1)$, which will be the case).
The second constraint is the participation constraint, requiring that the along-the-equilibrium-path payment to the agent, net of the cost of effort, exceed his outside option, 0.
This problem does not impose a limited liability constraint yet.
The principal simply minimizes the cost of hiring the agent, since conditional on implementing the high effort, there is no other interesting choice for her.
The solution is straightforward, and involves the participation constraint holding as an equality, thus
$$s(1) = c_H.$$
Then the incentive compatibility constraint implies that
$$s(0) \le -\frac{q}{1 - q}\, c_H,$$
so that the agent receives a harsh enough punishment for generating the wrong level of output. It can also be verified that in this case the principal indeed prefers to implement the high level of effort.
Clearly, $s(0)$ needs to be negative, so the solution will not be possible when we impose limited liability.
Let us now look at the problem with the limited liability constraint. Again presuming that the high level of effort will be implemented, the maximization problem boils down to:
$$\min_{s(0), s(1)} s(1)$$
subject to
$$s(1) - c_H \ge q\, s(1) + (1 - q)\, s(0)$$
$$s(1) - c_H \ge 0$$
$$s(0) \ge 0,$$
where we could have also imposed $s(1) \ge 0$, but did not, because this constraint will clearly be slack (why?).
It is straightforward to verify that the solution to this problem will be
$$s(0) = 0, \qquad s(1) = \frac{c_H}{1 - q}.$$
Thus now, when successful, the agent is paid more than in the case without the limited liability constraint, and as a consequence, the participation constraint is slack. A different way of expressing this is that now the agent receives a rent from the employment relationship. This rent can be easily calculated to be equal to
$$\text{rent} = \frac{q}{1 - q}\, c_H.$$
As a result, with limited liability, we have the issue of rents in addition to the issue of insurance.
We can also see that the presence of rents may actually distort the choice of effort. To develop this point further, let us calculate the return to the principal with high effort. It is clearly
$$\text{Return}_H = 1 - \frac{1}{1 - q}\, c_H.$$
In contrast, if she chooses the low effort, she can pay the agent $s(0) = s(1) = 0$, thus making
$$\text{Return}_L = q,$$
which can be greater than $\text{Return}_H$. Without rents for the agent, the return to the principal from implementing high effort would have been $1 - c_H$, which is greater than $q$ by assumption. This implies that even though high effort might be “socially optimal” in the sense of increasing net output (net surplus), the principal may choose low effort in order to reduce the rents that the agent receives (and thus distort the structure of production and effort).
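The two-outcome example can be checked numerically. The sketch below is only an illustration of the closed-form expressions derived above; the function names and the parameter values in the usage note are hypothetical and not part of the original notes.

```python
def two_outcome_contract(q, c_high, limited_liability):
    """Cheapest payments (s0, s1) that implement high effort.

    q      : probability of x = 1 under low effort
    c_high : cost of high effort, assumed to satisfy c_high < 1 - q
    """
    if not (0.0 < q < 1.0 and 0.0 < c_high < 1.0 - q):
        raise ValueError("need 0 < q < 1 and 0 < c_high < 1 - q")
    if limited_liability:
        s0 = 0.0                      # limited liability binds at x = 0
        s1 = c_high / (1.0 - q)       # incentive compatibility binds
    else:
        s1 = c_high                   # participation constraint binds
        s0 = -q * c_high / (1.0 - q)  # largest s0 allowed by incentive compatibility
    return s0, s1


def principal_returns(q, c_high):
    """Rent and principal's returns under limited liability."""
    _, s1 = two_outcome_contract(q, c_high, limited_liability=True)
    return {
        "rent": s1 - c_high,                # agent's surplus over outside option 0
        "return_high": 1.0 - s1,            # implement high effort, pay s1
        "return_low": q,                    # pay nothing, agent chooses low effort
        "return_high_no_rent": 1.0 - c_high # benchmark without limited liability
    }
```

With, say, q = 0.5 and c_high = 0.4, `principal_returns` gives a rent of 0.4 and a return from high effort of 0.2, which is below the return of 0.5 from low effort, even though the no-rent benchmark 0.6 exceeds 0.5. This is the distortion described in the text.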
The limited liability constraint and the associated rents will play a very important role below when we discuss efficiency wages.
2. Linear Contracts
One problem with the baseline model developed above is that, despite a number of useful insights, it is quite difficult to work with. Moreover, the exact shape of the density functions can lead to very different forms of contracts, some with very nonlinear features.
One approach in the literature has been to look for “robust” contracts that are both intuitively simpler and easier to work with to derive some first-order predictions. But why should optimal contracts be “robust”? And how do we model “robust” contracts?
A potentially promising answer to this question is developed in an important paper by Holmstrom and Milgrom. They established the optimality of linear contracts under certain conditions, which is interesting both because linear contracts can be viewed as more robust than highly nonlinear contracts, and also because the intuition of their result stems from robustness considerations.
Providing a detailed exposition of Holmstrom and Milgrom’s model would take us too far afield from our main focus. Nevertheless, it is useful to outline the environment and the main intuition. Holmstrom and Milgrom consider a dynamic principal-agent problem in continuous time. The interaction between the principal and the agent takes place over an interval normalized to $[0, 1]$. The agent chooses an effort level $a_t \in A$ at each instant after observing the realization of output up to that instant. More formally, the output process is given by the continuous-time random walk, that is, the following Brownian motion process:
$$dx_t = a_t\, dt + \sigma\, dW_t,$$
where $W$ is a standard Brownian motion (Wiener process). This implies that its increments are independent and normally distributed, that is, $W_{t+\tau} - W_t$ for any $t$ and $\tau$ is distributed normally with variance equal to $\tau$. Let $X^t = (x_\tau;\ 0 \le \tau < t)$ be the entire history of the realizations of the increments of output $x$ up until time $t$ (or alternatively a “sample path” of the random variable $x$). The assumption that the individual chooses $a_t$ after observing past realizations implies that $a_t$ can be represented by a mapping $a_t : X^t \to A$. Similarly, the principal also observes the realizations of the increments (though obviously not the effort levels and the realizations of $W_t$), so a contract for the agent is given by a mapping $s_t : X^t \to \mathbb{R}$, specifying what the individual will be paid at time $t$ as a function of the entire realization of output levels up to that point.
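To make the environment concrete, the output process can be simulated on a discrete time grid. The sketch below uses a simple Euler discretization, which is an assumption of this illustration rather than part of the Holmstrom-Milgrom analysis; the function names are hypothetical. The `effort` argument plays the role of the mapping from histories to effort levels.

```python
import math
import random


def simulate_output_path(effort, sigma, n_steps, rng):
    """Euler discretization of dx_t = a_t dt + sigma dW_t on [0, 1].

    effort : function (t, history) -> a_t, mirroring the mapping from
             past realizations to effort in the text
    rng    : a random.Random instance, so that runs are reproducible
    """
    dt = 1.0 / n_steps
    x = 0.0
    path = [x]
    for k in range(n_steps):
        a_t = effort(k * dt, path)
        dW = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment, variance dt
        x += a_t * dt + sigma * dW
        path.append(x)
    return path


def constant_effort(level):
    """Effort rule that ignores the history."""
    return lambda t, history: level


def mean_final_output(effort, sigma, n_steps=50, n_paths=2000, seed=0):
    """Monte Carlo average of cumulative output x_1."""
    rng = random.Random(seed)
    finals = [simulate_output_path(effort, sigma, n_steps, rng)[-1]
              for _ in range(n_paths)]
    return sum(finals) / n_paths
```

Under a constant effort level $a$, cumulative output $x_1$ is normally distributed with mean $a$ and variance $\sigma^2$, so `mean_final_output(constant_effort(0.5), 1.0)` should be close to 0.5 up to simulation noise.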
Holmstrom and Milgrom assume that the utility function of the agent is
$$u\left( C_1 - \int_0^1 a_t\, dt \right),$$
where $C_1$ is the agent’s consumption at time $t = 1$. This utility function makes two special assumptions: first, the individual only derives utility from consumption at the end (at time $t = 1$), and second, the concave utility function applies to consumption minus the total cost of effort between 0 and 1. In addition, Holmstrom and Milgrom assume that $u$ takes the special constant absolute risk aversion (CARA) form
$$u(z) = -\exp(-rz) \tag{5.1}$$
with the degree of absolute risk aversion equal to $r$, and that the principal is risk neutral, so that she only cares about her net revenue at time $t = 1$, given by $x_1 - C_1$ (since consumption of the agent at time $t = 1$ is equal to total payments from the principal to the agent).
The key result of Holmstrom and Milgrom is that in this model, the optimal contract is linear in final (cumulative) output $x_1$. In particular, it does not depend on the exact sample path leading to this cumulative output. Moreover, in response to this contract the optimal behavior of the agent is to choose a constant level of effort, which is also independent of the history of past realizations of the stochastic shock (can you see why the utility function (5.1) is important here?).
The loose intuition is that with any nonlinear contract there will exist an event, i.e., a sample path, after which the incentives of the agent will be distorted, whereas the linear contract achieves a degree of “robustness”. A more formal intuition is that we can think of a discrete approximation to the Brownian motion, which will be a binomial process specifying success or failure for the agent at each instant. The agent should be rewarded for success and punished for failure, and this will amount to the individual being remunerated according to total cumulative output. Moreover, generally this remuneration should depend on the wealth level of the agent, but with CARA, the wealth level does not matter, so the reward is constant. A linear reward schedule is the limit of this process corresponding to the continuous-time limit of the binomial process, which is the Brownian motion.
Now motivated by this result, many applied papers look at the following static problem:
(1) The principal chooses a linear contract of the form $s = \alpha + \beta x$ (note that this implies there is no limited liability; we have also switched from $S$ to $s$ to simplify notation).
(2) The agent chooses $a \in A \equiv [0, \infty)$.
(3) $x = a + \varepsilon$, where $\varepsilon \sim N(0, \sigma^2)$.
In addition, the principal is risk neutral, while the utility function of the agent is
$$U(s, a) = -\exp\left( -r\left( s - c(a) \right) \right)$$
with $c(a) = ca^2/2$ corresponding to the cost of effort for some $c > 0$.
The argument is that a linear contract is approximately optimal here.
It turns out that the results of this framework are very intuitive and consistent with the baseline model. However, it is important to emphasize that a linear contract is not optimal in this case. It is only optimal in the Holmstrom-Milgrom model with continuous time and the other assumptions. In fact, it is a well-known result in agency theory that a static problem with a normally distributed outcome has sufficiently unlikely events that the first-best level of effort, which here is $a^{fb} = 1/c$, can be approximated by highly nonlinear contracts. Thus the linear contracts studied here are very different from the optimal contracts that would arise if the actual model were the static model with normally distributed shocks.
Let us derive the optimal contract in this case.
The first-order approach works in this case. The maximization problem of the agent is
$$\max_a E\left\{ -\exp\left( -r\left( s(a) - c(a) \right) \right) \right\} = \max_a \left\{ -\exp\left( -r E s(a) + \frac{r^2}{2} \mathrm{Var}(s(a)) + r c(a) \right) \right\},$$
where the equality between the two expressions follows from the normality of $s$ ($s$ is linear in $x$, and $x$ is normally distributed), so this is equivalent to
$$\max_a \left\{ E s(a) - \frac{r}{2} \mathrm{Var}(s(a)) - \frac{c}{2} a^2 \right\}.$$
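The step from expected CARA utility to the mean-variance objective rests on the moment generating function of the normal distribution. The sketch below checks this identity by deterministic numerical integration; the function names, grid size, and integration width are choices made for this illustration.

```python
import math


def expected_cara_utility(mean, sd, r, n=20001, width=12.0):
    """E[-exp(-r*s)] for s ~ N(mean, sd^2), by the trapezoidal rule."""
    lo = mean - width * sd
    hi = mean + width * sd
    h = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        s = lo + k * h
        density = math.exp(-0.5 * ((s - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        weight = 0.5 if k in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * (-math.exp(-r * s)) * density
    return total * h


def closed_form_cara_utility(mean, sd, r):
    """-exp(-r*mean + r^2 * sd^2 / 2), the expression used in the text."""
    return -math.exp(-r * mean + 0.5 * r ** 2 * sd ** 2)


def certainty_equivalent(mean, sd, r):
    """Mean payment minus the risk premium (r/2) * variance."""
    return mean - 0.5 * r * sd ** 2
```

The two utility functions agree up to integration error, and `-exp(-r * certainty_equivalent(...))` equals the closed form, which is why maximizing expected utility is the same as maximizing the certainty equivalent net of effort cost.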
Now substituting for the contract, the problem is:
$$\max_a\ \beta a - \frac{c}{2} a^2 - \frac{r}{2} \beta^2 \sigma^2,$$
so the first-order condition for the agent’s optimal effort choice is:
$$a = \frac{\beta}{c}.$$
The principal will then solve
$$\max_{a, \alpha, \beta} E\left( (1 - \beta)(a + \varepsilon) - \alpha \right)$$
subject to
$$a = \frac{\beta}{c}$$
$$\alpha + \frac{\beta^2}{2} \left( \frac{1}{c} - r\sigma^2 \right) \ge \bar{h},$$
where the second constraint is the participation constraint, with the definition $\bar{h} = -\ln\left( -\bar{H} \right)$, where $\bar{H}$ is the reservation utility of the agent, and requires the expected utility of the agent under the contract to be greater than $\bar{H}$.
The solution to this problem is
$$\beta^* = \frac{1}{1 + rc\sigma^2} \tag{5.2}$$
and
$$\alpha^* = \bar{h} - \frac{1 - rc\sigma^2}{2c\left( 1 + rc\sigma^2 \right)^2},$$
and because negative salaries are allowed, the participation constraint is binding.
In other words, the more risk-averse is the agent, i.e., the greater is $r$, the more costly is effort, i.e., the greater is $c$, and the more uncertainty there is, i.e., the greater is $\sigma^2$, the lower powered are the agent’s incentives.
The equilibrium level of effort is
$$a^* = \frac{1}{c\left( 1 + rc\sigma^2 \right)}.$$
This is always lower than the first-best level of effort, which is $a^{fb} = 1/c$.
We can see that as $r \to 0$ and the individual becomes more and more risk neutral, the equilibrium approaches this first-best level of effort. Similarly, the first best applies as $\sigma^2 \to 0$, which corresponds to the case where risk disappears (and thus the moral hazard problem becomes moot).
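The closed-form slope can be cross-checked against a direct search over $\beta$. The sketch below assumes the setup of the text (binding participation constraint, agent effort $a = \beta / c$); the function names and the grid search are illustrative choices, not the authors' method.

```python
def optimal_linear_contract(r, c, sigma2, h_bar=0.0):
    """Closed-form slope, effort, and intercept of the linear contract."""
    beta = 1.0 / (1.0 + r * c * sigma2)
    effort = beta / c
    # Intercept from the binding participation constraint.
    alpha = h_bar - 0.5 * beta ** 2 * (1.0 / c - r * sigma2)
    return alpha, beta, effort


def principal_payoff(beta, r, c, sigma2, h_bar=0.0):
    """Principal's expected payoff when the agent best-responds to beta
    and the intercept is set so that participation just binds."""
    effort = beta / c
    alpha = h_bar - 0.5 * beta ** 2 * (1.0 / c - r * sigma2)
    return (1.0 - beta) * effort - alpha


def grid_argmax_beta(r, c, sigma2, n=10001):
    """Slope on an evenly spaced grid over [0, 1] that maximizes the payoff."""
    grid = [k / (n - 1) for k in range(n)]
    return max(grid, key=lambda b: principal_payoff(b, r, c, sigma2))
```

For example, with r = 2, c = 1, and sigma2 = 0.5 the closed form gives a slope of 0.5 and effort 0.5, half of the first-best level 1/c = 1; setting r or sigma2 to zero returns a slope of 1 and first-best effort.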
Let us now derive some of the other results of the baseline model. Suppose that there is another signal of the effort
$$z = a + \eta,$$
where $\eta$ is $N\left( 0, \sigma_\eta^2 \right)$ and is independent of $\varepsilon$. Now, let us restrict attention to linear contracts of the form
$$s = \alpha + \beta_x x + \beta_z z.$$
Note that this contract can also be interpreted alternatively as $s = \alpha + \mu w$, where $w = w_1 x + w_2 z$ is a sufficient statistic derived from the two random variables $x$ and $z$. This already highlights that the sufficient statistic principle is still at work here.
Now with this type of contract, the first-order condition of the agent is
$$a = \frac{\beta_x + \beta_z}{c}$$
and the optimal contract can be obtained as:
$$\beta_x = \frac{\sigma_\eta^2}{\sigma^2 + \sigma_\eta^2 + rc\, \sigma^2 \sigma_\eta^2}$$
and
$$\beta_z = \frac{\sigma^2}{\sigma^2 + \sigma_\eta^2 + rc\, \sigma^2 \sigma_\eta^2}.$$
These expressions show that generally $x$ is not a sufficient statistic for $(x, z)$, and the principal will use information about $z$ as well to determine the compensation of the agent.
The exception is when $\sigma_\eta^2 \to \infty$, so that there is almost no information in $z$ regarding the effort chosen by the agent. In this case, $\beta_z \to 0$ and $\beta_x \to \beta^*$ as given by (5.2), so in this case $x$ becomes a sufficient statistic.
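A small numerical illustration of the two-signal slopes may help. The sketch below simply evaluates the formulas above; the function name and parameter values are hypothetical.

```python
def two_signal_slopes(r, c, sigma2, sigma2_eta):
    """Slopes (beta_x, beta_z) of the linear contract with signals
    x = a + eps and z = a + eta, eps and eta independent."""
    denom = sigma2 + sigma2_eta + r * c * sigma2 * sigma2_eta
    beta_x = sigma2_eta / denom
    beta_z = sigma2 / denom
    return beta_x, beta_z


def single_signal_slope(r, c, sigma2):
    """Slope from (5.2), used as the benchmark when z is uninformative."""
    return 1.0 / (1.0 + r * c * sigma2)
```

Note that the slopes satisfy $\beta_x \sigma^2 = \beta_z \sigma_\eta^2$, so each signal is weighted in inverse proportion to its variance. With equal variances both signals receive the same weight, and as `sigma2_eta` grows large `beta_x` approaches `single_signal_slope` while `beta_z` vanishes.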
3. Evidence
The evidence on the basic principal-agent model is mixed. A series of papers, notably those by Ed Lazear using data from a large auto glass installer, present convincing evidence that in a variety of settings high-powered incentives lead to more effort. For example, Lazear’s evidence shows that when this particular company went from fixed salaries to piece rates, productivity rose by 35% because of greater effort by the employees (the increase in average wages was 12%), but part of this response might be due to selection, as the composition of employees might have changed.
Similar evidence is reported in other papers. For example, Kahn and Sherer, using the personnel files of a large company, show that employees (white-collar office workers) whose pay depends more on subjective evaluations obtain better evaluations and are more productive.
More starkly, and perhaps more interestingly, a number of papers using Chinese data, in particular work by John McMillan, show that the responsibility system in Chinese agriculture, allowing local communes to retain a share of their profits, led to substantial increases in productivity. Separate work by Ted Groves finds similar effects in Chinese industry. This evidence is quite conclusive about the effect of incentives on effort and productivity. However, the principal-agent approach to contracting and to the incentive structure not only requires that effort and performance are responsive to incentives, but also that these incentives are designed optimally, and that the types of theories developed so far capture the salient features of these optimal contracts. The evidence in favor of this latter, more stringent evaluation of the principal-agent theory is weaker.
To start with, even though in some stock examples, such as Chinese agriculture or industry, higher-powered incentives (meaning greater rewards for success) lead to better outcomes, in other contexts more high-powered incentives seem to be counterproductive. One example of this is the evidence in the paper by Ernst Fehr and Simon Gachter showing that incentive contracts might destroy voluntary cooperation.
More standard examples are situations in which high-powered incentives lead to distortions that were not anticipated by the principals. A well-known case is the consequences of Soviet incentive schemes specifying “performance” by number of nails or the weight of the materials used, leading to totally unusable products.
Evidence closer to home also indicates similar issues. A number of papers have documented that agents with high-powered incentives try to “game” these incentives (potentially creating costs for the principals). A telling example is work by Paul Oyer, and work by Pascal Courty and Gerard Marschke, which look at performance contracts that are nonlinear functions of outcomes, and show that there is considerable gaming going on. For example, managers who get bonuses for reaching a particular target by a certain date put in a lot of effort before this date, and much less during other times. This would be costly if a more even distribution of effort were optimal for the firm.
More generally, the greatest challenge to the principal-agent approach is that it does not perform well in terms of its predictions regarding the types of contracts that should be offered (how these contracts should vary across environments). First, as discussed at length by Prendergast, there is little association between riskiness and noisiness of tasks and the types of contracts when we look at a cross section of jobs.
Second, and perhaps more starkly, in many professions performance contracts are largely absent. There is a debate as to whether this is efficient, for example, in teaching and bureaucracy, but many believe that such contracts are absent precisely because their use would lead to distorted incentives in other spheres, as in the models of multitasking we discuss next. A widespread view related to this is that the basic moral hazard models are not useful in thinking about bureaucracies, where there are many countervailing effects related to multitasking and “career concerns,” creating incentives for other types of behavior, and consequently the power of incentives is often weak in such organizations.
4. Multitasking
We now discuss incentive models in which agents undertake more than one task, or more than one agent interacts with the principal or performs similar tasks. These models are useful both to extend the reach of agency theory, and also to generate some insights on why we may not see very high-powered incentives in most occupations.
Multitasking is the broad name given by Holmstrom and Milgrom to situations in which an agent has to work on more than one task. Multitasking is generally associated with problems of giving incentives to the agent in one sphere without excessively distorting his other incentives. In other words, multitasking is about balancing the distortions created in different tasks undertaken by a single agent.
Let us now modify the above linear model so that there are two efforts that the individual chooses, $a_1$ and $a_2$, with a cost function $c(a_1, a_2)$ which is increasing and convex as usual.
These efforts lead to two outcomes:
$$x_1 = a_1 + \varepsilon_1$$
and
$$x_2 = a_2 + \varepsilon_2,$$
where $\varepsilon_1$ and $\varepsilon_2$ could be correlated. The principal cares about both of these outcomes with potentially different weights, so her return is
$$\phi_1 x_1 + \phi_2 x_2 - s,$$
where $s$ is the salary paid to the agent.
What is different from the previous setup is that only $x_1$ is observed, while $x_2$ is unobserved.
A simple example is a home contractor where $x_1$ is an inverse measure of how long it takes to finish the contracted work, while $x_2$ is the quality of the job, which is not observed until much later, and consequently, payments cannot be conditioned on this.
Another example would be the behavior of employees in the public sector, where the quality of the service provided to citizens is often difficult to contract on.
So what is the solution to this problem?
Again let us take a linear contract of the form
$$s(x_1) = \alpha + \beta x_1,$$
since $x_1$ is the only observable output.
The first-order conditions of the agent now give:
$$\beta = \frac{\partial c(a_1, a_2)}{\partial a_1}, \qquad 0 = \frac{\partial c(a_1, a_2)}{\partial a_2}. \tag{5.3}$$
So if
$$\frac{\partial c(a_1, a_2)}{\partial a_2} > 0$$
whenever $a_2 > 0$, then the agent will choose $a_2 = 0$, and there is no way of inducing him to choose $a_2 > 0$.
However, suppose that
$$\frac{\partial c(a_1, a_2 = 0)}{\partial a_2} < 0,$$
so without incentives the agent will exert some positive effort in the second task. In this case, providing incentives in task 1 can in fact undermine the incentives in task 2. This will be the case when the two efforts are substitutes, i.e., $\partial^2 c(a_1, a_2) / \partial a_1 \partial a_2 > 0$, so that exerting more effort in one task increases the cost of effort in the other task. Now, stronger incentives for task 1 increase effort $a_1$, reducing effort $a_2$ because of the substitutability between the two efforts.
To see this more formally, imagine that the equations in (5.3) have an interior solution (why is an interior solution important?), and differentiate these two first-order conditions with respect to $\beta$. Using the fact that these two first-order conditions correspond to a maximum (i.e., the second-order conditions are satisfied), we can use the Implicit Function Theorem on (5.3) and immediately see that
$$\frac{\partial a_1}{\partial \beta} > 0.$$
(It is useful for you to derive this yourself.) This has the natural interpretation that high-powered incentives lead to greater effort, as the evidence discussed above suggests.
However, we also have that if $\partial^2 c(a_1, a_2) / \partial a_1 \partial a_2 > 0$, then
$$\frac{\partial a_2}{\partial \beta} < 0,$$
thus high-powered incentives in one task adversely affect the other task.
Now it is intuitive that if the second task is sufficiently important for the principal, then she will “shy away” from high-powered incentives; if you are afraid that the contractor will sacrifice quality for speed, you are unlikely to offer a contract that puts a high reward on speed.
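More formally, with a similar analysis to before, it can be shown that in this case
$$\beta^{**} = \frac{\phi_1 - \phi_2 \left( \partial^2 c / \partial a_1 \partial a_2 \right) / \left( \partial^2 c / \partial a_2^2 \right)}{1 + r\sigma_1^2 \left( \partial^2 c / \partial a_1^2 - \left( \partial^2 c / \partial a_1 \partial a_2 \right)^2 / \left( \partial^2 c / \partial a_2^2 \right) \right)},$$
where all derivatives of $c$ are evaluated at $(a_1, a_2)$.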
Therefore, the optimal linear contract from the point of view of the principal has sensitivity $\beta^{**}$ to performance, and $\beta^{**}$ is declining in $\phi_2$ (the importance of the second task) and in $\partial^2 c(a_1, a_2) / \partial a_1 \partial a_2$ (the degree of substitutability between the efforts of the two tasks).
This equation is the basis of many of the claims based on the multitask model. In particular, think of the loose claim that “in the presence of multitasking high-powered incentives may be suboptimal”. Basically, this equation shows why.
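The multitask slope can be illustrated with a concrete cost function. The sketch below assumes a hypothetical quadratic cost $c(a_1, a_2) = \tfrac{1}{2} c_{11} a_1^2 + c_{12} a_1 a_2 + \tfrac{1}{2} c_{22} a_2^2 - k a_2$, where $k > 0$ makes the marginal cost of the second effort negative at $a_2 = 0$ so that an interior solution exists. This functional form, the parameter names, and the grid search are assumptions of this illustration, not part of the original analysis.

```python
def beta_multitask(phi1, phi2, r, sigma1_sq, c11, c12, c22):
    """Slope beta** from the multitask formula, with constant second
    derivatives c11, c12, c22 of the cost function."""
    numerator = phi1 - phi2 * c12 / c22
    denominator = 1.0 + r * sigma1_sq * (c11 - c12 ** 2 / c22)
    return numerator / denominator


def efforts(beta, c11, c12, c22, k):
    """Solve the two first-order conditions (5.3) for the quadratic cost:
    beta = c11*a1 + c12*a2 and 0 = c12*a1 + c22*a2 - k."""
    det = c11 * c22 - c12 ** 2
    a1 = (c22 * beta - c12 * k) / det
    a2 = (c11 * k - c12 * beta) / det
    return a1, a2


def total_surplus(beta, phi1, phi2, r, sigma1_sq, c11, c12, c22, k):
    """Expected output value minus effort cost minus the risk premium."""
    a1, a2 = efforts(beta, c11, c12, c22, k)
    cost = 0.5 * c11 * a1 ** 2 + c12 * a1 * a2 + 0.5 * c22 * a2 ** 2 - k * a2
    return phi1 * a1 + phi2 * a2 - cost - 0.5 * r * sigma1_sq * beta ** 2


def grid_argmax_beta(phi1, phi2, r, sigma1_sq, c11, c12, c22, k, n=10001):
    """Slope on an evenly spaced grid over [0, 1] that maximizes total surplus."""
    grid = [i / (n - 1) for i in range(n)]
    return max(grid, key=lambda b: total_surplus(
        b, phi1, phi2, r, sigma1_sq, c11, c12, c22, k))
```

With phi1 = phi2 = 1, r = 1, sigma1_sq = 1, c11 = c22 = 1, c12 = 0.5, and k = 0.2, the formula gives a slope of about 0.286 and both efforts are positive at that slope; raising phi2 lowers the slope, and with c12 = 0 the expression collapses to the single-task slope in (5.2).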
This model has enormous potential to explain why many organizations are unwilling to go to high-powered incentives.
These ideas are developed in Holmstrom and Milgrom’s 1991 paper “Multitask Principal-Agent Analyses” in the Journal of Law, Economics, and Organization. This is a fantastic paper, and you should read it.
They also show how the multitask idea explains why you want to put restrictions on the outside activities of workers or managers, and how it gives you a new perspective on thinking of how different tasks should be organized into various jobs.
5. Relative Performance Evaluation
This framework also naturally leads to relative performance evaluation when there are many agents working on similar tasks.
Let us go back to the one-task linear model where
$$U(s, a) = -\exp\left( -r\left( s - c(a) \right) \right)$$
with $c(a) = ca^2/2$, and the principal cares about
$$x = a + \varepsilon.$$
The only difference now is that there is another worker (perhaps working for some other principal), whose performance is given by
$$\tilde{x} = \tilde{a} + \tilde{\varepsilon},$$
where the tilde denotes the other worker. The random shocks $\varepsilon$ and $\tilde{\varepsilon}$ are both normally distributed. They can also be correlated, which will play an important role in relative performance evaluation.
Assume that $\tilde{x}$ is publicly observed. In equilibrium, everybody will guess the level of effort that this other worker exerts given his contract, so $\tilde{x}$, along the equilibrium path, will reveal $\tilde{\varepsilon}$. (This is a very important comment; it may be obvious to you, or it may not be; in either case think about it, as it will play a very important role in what follows.)
Now if $\varepsilon$ and $\tilde{\varepsilon}$ are uncorrelated, the equilibrium derived above applies.
But suppose that these two agents are in the same line of business, and thus are affected by common shocks. Then we may assume, for example, that
$$\mathrm{Var}(\varepsilon) = \mathrm{Var}(\tilde{\varepsilon}) = \sigma^2, \qquad \mathrm{Corr}(\varepsilon, \tilde{\varepsilon}) = \rho,$$
where $\rho$ determines whether both agents are simultaneously lucky or not.
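In this case, using the same argument as before, it can be shown that the optimal (linear) contract for our agent will take the form
$$s = \alpha + \beta x - \tilde{\beta} \tilde{x}$$
with
$$\beta = \frac{1}{1 + rc\sigma^2 \left( 1 - \rho^2 \right)}$$
and
$$\tilde{\beta} = \frac{\rho}{1 + rc\sigma^2 \left( 1 - \rho^2 \right)}.$$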
Let us now consider the case where $\rho > 0$ so that performance between the two agents is positively correlated. In this case, the agent’s payment is more sensitive to his own performance ($\beta$ is now larger), but he will be punished for the successful performance of the other agent (and the extent of this depends on the degree of correlation between the two performances, $\rho$). This is clearly a form of relative performance evaluation, where the agent is judged not according to some absolute standard but with respect to a relative standard set by others in the same field.
Can you see what would happen if $\rho < 0$?
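The relative performance slopes can be checked against a direct search. The sketch below assumes, as in the one-task model, that the agent's effort is $a = \beta / c$ (the other worker's output does not depend on $a$), that the participation constraint binds, and that the principal therefore maximizes $a - ca^2/2 - (r/2)\mathrm{Var}(s)$ with $\mathrm{Var}(s) = \sigma^2(\beta^2 + \tilde{\beta}^2 - 2\rho\beta\tilde{\beta})$. Under these assumptions the maximizers are $\beta = 1 / (1 + rc\sigma^2(1 - \rho^2))$ and $\tilde{\beta} = \rho\beta$. Function names and the grid are illustrative.

```python
def rpe_contract(r, c, sigma2, rho):
    """Closed-form slopes (beta, beta_tilde) for s = alpha + beta*x - beta_tilde*x_tilde."""
    beta = 1.0 / (1.0 + r * c * sigma2 * (1.0 - rho ** 2))
    return beta, rho * beta


def pay_variance(beta, beta_tilde, sigma2, rho):
    """Variance of beta*eps - beta_tilde*eps_tilde with common variance sigma2."""
    return sigma2 * (beta ** 2 + beta_tilde ** 2 - 2.0 * rho * beta * beta_tilde)


def surplus(beta, beta_tilde, r, c, sigma2, rho):
    """Expected output minus effort cost minus the agent's risk premium."""
    a = beta / c
    return a - 0.5 * c * a ** 2 - 0.5 * r * pay_variance(beta, beta_tilde, sigma2, rho)


def grid_search(r, c, sigma2, rho, step=0.005):
    """Brute-force maximizer over beta in [0, 1] and beta_tilde in [-1, 1]."""
    betas = [i * step for i in range(int(round(1.0 / step)) + 1)]
    tildes = [-1.0 + j * step for j in range(int(round(2.0 / step)) + 1)]
    return max(((b, bt) for b in betas for bt in tildes),
               key=lambda pair: surplus(pair[0], pair[1], r, c, sigma2, rho))
```

Filtering out the common shock lowers the variance of pay for a given own-performance slope, which is why the own slope rises relative to (5.2) whenever $\rho \neq 0$. With a negative $\rho$ the coefficient `beta_tilde` turns negative, so the agent is rewarded rather than punished when the other worker does well.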
6. Tournaments
Something akin to relative performance evaluation, some form of “yardstick competition” where employees are compared to each other, often occurs inside firms. For example, the employee who is most successful gets promoted. Also related are the very common “up-or-out contracts,” where after a while employees are either promoted or fired (e.g., tenure in academic systems). The parallel between these contracts and relative performance evaluation comes from the fact that it is typically impossible to promote all low-level workers, so there is an implicit element of yardstick competition in up-or-out contracts.
This situation is sometimes referred to as “tournaments”.
The analysis is developed in the famous paper by Ed Lazear and Sherwin Rosen, JPE 1981.
They analyze the problem of a firm employing two workers in a similar task, one producing $x_1$, the other producing $x_2$.
We know from the above analysis that the optimal contract that the principal can offer to these workers should make their remuneration a function of both $x_1$ and $x_2$.
Instead, Lazear and Rosen look at a non-optimal but intuitive contract where the agents’ remunerations are a function of their “rank”, exactly as in a sports tournament, where the highest prize goes to the winner, etc.
More concretely, let us assume that both the principal and the agents are risk-neutral, and the output of each agent is given by
$$x_i = a_i + \theta_i,$$
where $a_i$ is effort and $\theta_i$ is a stochastic term.
Both agents have the same cost function for effort given by $c(a)$, which, as before, is increasing and convex. Let us denote the reservation utility of both agents by $\bar{H}$ as before.
Clearly the first best will solve
$$\max_{a_i}\ x_i - c(a_i),$$
so will satisfy
$$c'\left( a^{fb} \right) = 1.$$
(Recall that there is no effort interaction, only interactions through stochastic elements.)
Let us simplify the problem and look at the extreme case where $\theta_1$ and $\theta_2$ are independent, so what we are dealing with is not standard “relative performance evaluation”. Thus let us assume that they are both drawn independently from a continuous distribution $F(\theta)$, with density $f(\theta)$. This is a very useful benchmark, especially since from our analysis of the baseline Holmstrom model above we know quite a few things about the optimal contract in this case (what do we know?).
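The principal is restricted to the following contract:
$$w_i(x_1, x_2) = \begin{cases} \bar{w} & \text{if } x_i > x_j \\ \underline{w} & \text{if } x_i < x_j \\ \tfrac{1}{2}\left( \bar{w} + \underline{w} \right) & \text{if } x_i = x_j. \end{cases}$$
In other words, the principal only chooses two levels of payments, $\bar{w}$ for the more successful agent and $\underline{w}$ for the less successful agent.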
There is a difference here from what we have studied so far, since now, conditional on the contract offered by the principal, the two agents will be playing a game, since their effort choices will affect the other agent’s payoff.
More specifically, the timing of moves is now given by
(1) The principal chooses $\bar{w}, \underline{w}$.
(2) Agents simultaneously choose $a_1$, $a_2$.
Formally, this again corresponds to a dynamic game where the principal is like a Stackelberg leader. Since we have a dynamic game, we should look for the subgame perfect Nash equilibrium, which means backward induction. That is, we need to analyze the Nash equilibrium in the subgames between the agents, and then the optimal contract choice of the principal.
In other words, we first take each subgame characterized by a different choice of the contract $w(x_1, x_2)$, and find the Nash equilibrium $a_1^*(\bar{w}, \underline{w}), a_2^*(\bar{w}, \underline{w})$ of the two agents (why do different contracts correspond to different subgames?). Then the principal will maximize expected profits by choosing $\bar{w}, \underline{w}$ given the agents’ reaction functions, $a_1^*(\bar{w}, \underline{w}), a_2^*(\bar{w}, \underline{w})$.
The key object will be the probability that one worker performs better than the other as a function of their efforts. Define
$$P_i(a_i, a_j) \equiv \mathrm{Prob}\left\{ x_i > x_j \mid a_i, a_j \right\}.$$
Clearly
$$x_i = a_i + \theta_i > a_j + \theta_j = x_j$$
if and only if
$$\theta_i > a_j - a_i + \theta_j.$$
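Using this, we can derive:
$$\begin{aligned} P_i(a_i, a_j) &= \mathrm{Prob}\left\{ \theta_i > a_j - a_i + \theta_j \mid a_i, a_j \right\} \\ &= \int \mathrm{Prob}\left\{ \theta_i > a_j - a_i + \theta_j \mid \theta_j, a_i, a_j \right\} f(\theta_j)\, d\theta_j \\ &= \int \left[ 1 - F(a_j - a_i + \theta_j) \right] f(\theta_j)\, d\theta_j \end{aligned}$$
(using indefinite integrals to denote integration over the whole support).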
Nash equilibrium in the subgame given the wage function $w(x_1, x_2)$, or simply $(\bar{w}, \underline{w})$, is defined as a pair of effort choices $(a_1^*, a_2^*)$ such that
$$a_i^* \in \arg\max_{a_i}\ P_i(a_i, a_j^*)\, \bar{w} + \left[ 1 - P_i(a_i, a_j^*) \right] \underline{w} - c(a_i).$$
The first-order condition for the Nash equilibrium for each agent is therefore given by
$$\left( \bar{w} - \underline{w} \right) \frac{\partial P_i(a_i, a_j)}{\partial a_i} - c'(a_i) = 0.$$
This equation is very intuitive: each agent will exert effort up to the point where the marginal gain, which is equal to the prize for success times the increase in the probability of success, is equal to the marginal cost of exerting effort.
The solution to this first-order condition is the best response of agent $i$ to $(\bar{w}, \underline{w}, a_j)$.
Since agents are risk-neutral, it is possible to implement the first best here. Clearly, the first best involves both agents choosing the same level of effort, $a^{fb}$, as shown above. Let us then look for a symmetric equilibrium implementing the first best:
$$a_i = a_j = a^{fb}.$$
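Now using the first-order condition of the agent, this implies that $a^{fb}$ has to be a solution to:
$$\left( \bar{w} - \underline{w} \right) \frac{\partial P_i(a_i, a^{fb})}{\partial a_i} - c'(a^{fb}) = 0.$$
Since the first-best effort level is defined by $c'(a^{fb}) = 1$, this is equivalent to
$$\left( \bar{w} - \underline{w} \right) \frac{\partial P_i(a_i, a^{fb})}{\partial a_i} = 1.$$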
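Note that
$$\frac{\partial P_i(a_i, a_j)}{\partial a_i} = \int f(a_j - a_i + \theta_j)\, f(\theta_j)\, d\theta_j.$$
Now using symmetry, i.e., $a_1 = a_2$, this equation becomes
$$\left. \frac{\partial P_i(a_i, a^{fb})}{\partial a_i} \right|_{a_i = a^{fb}} = \int f(\theta_j)^2\, d\theta_j$$
or
$$\left( \bar{w} - \underline{w} \right) \int f(\theta_j)^2\, d\theta_j = 1.$$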
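This characterizes the constraint that the principal will face in choosing the optimal contract.
Expressed differently, the principal must set
$$\bar{w} - \underline{w} = \left[ \int f(\theta_j)^2\, d\theta_j \right]^{-1}, \quad \text{that is,} \quad \bar{w} = \underline{w} + \Delta,$$
where
$$\Delta \equiv \left[ \int f(\theta_j)^2\, d\theta_j \right]^{-1}.$$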
This expression implies that in order to induce the agents to play this symmetric equilibrium, which will in turn lead to the first best, the principal has to induce a wage gap of at least $\Delta$ between the more and the less successful agent. The constraint facing the principal to ensure that both agents exert effort is the equivalent of the incentive compatibility constraint of the standard moral hazard models translated into this tournament context (Can you interpret this incentive compatibility constraint in greater detail?).
The principal also needs to satisfy the participation constraint; this will tie down the values of $\bar{w}$ and $\underline{w}$:
$$P_i(a^{fb}, a^{fb})\,\bar{w} + \left[1 - P_i(a^{fb}, a^{fb})\right]\underline{w} - c(a^{fb}) \ge H.$$
Symmetry ensures that $P_i(a^{fb}, a^{fb}) = 1/2$, that is,
$$\frac{1}{2}(\bar{w} + \underline{w}) \ge H + c(a^{fb}).$$
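The principal's problem therefore boils down to:
$$\min_{\bar{w},\, \underline{w}}\; \bar{w} + \underline{w}$$
$$\text{s.t.}\quad \bar{w} = \underline{w} + \Delta \qquad \text{(IC)}$$
$$\bar{w} + \underline{w} \ge 2\left(H + c(a^{fb})\right) \qquad \text{(PC)}$$
This has the solutions
$$\bar{w} = H + c(a^{fb}) + \frac{\Delta}{2},$$
$$\underline{w} = H + c(a^{fb}) - \frac{\Delta}{2}.$$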
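To make the prize spread concrete, here is a minimal sketch under two assumptions that are not in the notes: the noise terms are $N(0, \sigma^2)$, and the cost of effort is $c(a) = a^2/2$, so that $a^{fb} = 1$ and $c(a^{fb}) = 1/2$. For a normal density, $\int f(\theta)^2 d\theta = 1/(2\sigma\sqrt{\pi})$, so $\Delta = 2\sigma\sqrt{\pi}$. The function names are hypothetical.

```python
import math

def squared_density_integral(sigma, n=4001, width=10.0):
    # Trapezoid-rule value of the integral of f(theta)^2 for f the N(0, sigma^2) density.
    lo = -width * sigma
    step = 2 * width * sigma / (n - 1)
    total = 0.0
    for k in range(n):
        t = lo + k * step
        f = math.exp(-0.5 * (t / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
        total += (0.5 if k in (0, n - 1) else 1.0) * f * f
    return total * step

def prize_spread(sigma):
    # Delta = 1 / integral of f^2; closed form for the assumed normal noise.
    return 2.0 * sigma * math.sqrt(math.pi)

def tournament_wages(sigma, H, cost_at_fb=0.5):
    # Winner and loser prizes from the (IC) and binding (PC) constraints.
    delta = prize_spread(sigma)
    base = H + cost_at_fb
    return base + delta / 2.0, base - delta / 2.0

if __name__ == "__main__":
    w_high, w_low = tournament_wages(sigma=1.0, H=2.0)
    print(w_high, w_low, w_high - w_low)
```

Noisier output (larger $\sigma$) requires a larger spread, because a given increase in effort then changes the probability of winning by less. Note that the loser's prize can become negative for large $\sigma$, which this sketch does not rule out.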
Therefore, in this case by using a simple tournament, the principal can induce an equilibrium in which both agents choose first-best effort.
Now you may wonder how this compares with what we have done so far.
Clearly the tournament is not an “optimal contract” in general (we have restricted the functional form of the contract severely). But it is implementing the first best. In fact, given our assumption that $\theta_1$ and $\theta_2$ are independent, the full analysis above based on Holmstrom's paper shows that the optimal contract for agent 1 should be independent of $x_2$ and vice versa. So what is happening?
The answer is that the environment here is simple enough that both the optimal contract à la Holmstrom and the non-optimal contract à la Lazear-Rosen reach the first best. This would generally not be the case. However, tournaments may still be attractive because they are simple and one might hope they might be “more robust” to variations in the technology or information structure (even though, again, the exact meaning of “robust” is not entirely clear here).
Having said that, while “up-or-out” contracts are quite common, tournaments are not as common within organizations. This may be because designing a tournament among employees might create an adversarial environment, leading workers to sabotage their coworkers (Lazear) or to some type of collusion among the agents (Mookherjee).
There is little careful empirical work on these issues, so there is a lot of potential here.
7. Application: CEO Pay
A major application of the ideas of agency theory is to the behavior and remuneration of managers.
Consistent with theory, it seems that agency considerations are important, but there are also many other issues to consider.
One easy way of thinking about the problem is to equate the principal with the shareholders of the firm (ignoring the free-rider problem among the shareholders) and the agent with the CEO (ignoring other managers that also take important decisions). In practice, the relationship between the CEO and managers, and between shareholders and debtholders, could be quite important in thinking of the right model.
Ignoring these more complex issues, we can map the CEO example into our model as:
“output” $x$ = change in shareholder wealth
“wages” $s(x)$ = salary, bonus, benefits, pensions, stock options, ...
The basic question in the empirical literature is whether executive compensation is sensitive to the firm's stock market performance (i.e., whether $s'(x) > 0$), and the answer is a clear yes. CEOs that are more productive for their shareholders are paid much more (some much, much, much more), and are less likely to be fired.
Nevertheless, an influential paper by Jensen and Murphy in 1990 argued that $s'(x)$ is too low. They blamed regulations and social norms for it. They argued that if $s'(x)$ could be increased, performance would improve substantially.
Clearly $s'(x)$ increased over the 1990s, and there is a debate as to whether it is too high or too low, but it is clear that $s'(x)$ is positive. Again, there is little careful empirical work here.
The recent scandals also illustrate that high values of $s'(x)$ create problems similar to employees gaming the rewards discussed above. This is related to the side effects of high-powered incentives discussed above.
CEO pay data also give us an opportunity to investigate whether Relative Performance Evaluation is used in practice. We may naturally expect that there should be a lot of relative performance evaluation, since there are many common factors determining shareholder value for all the firms in a particular industry. Here the evidence is more mixed.
An early paper by Gibbons and Murphy, in ILRR 1990, finds some evidence for relative performance evaluation. However, it seems that the CEOs are being compared to the market rather than their own industry. More recent research, for example, Bertrand and Mullainathan, finds that there is not enough relative performance evaluation in general. In particular, they find that CEOs are rewarded for common shocks affecting their industry and they interpret this finding as evidence against standard moral hazard models. While this result is interesting, the interpretation may not be fully warranted. In particular, a more careful theoretical framework would be useful in interpreting such results. For example, almost all empirical work interprets the data as being generated from a model in which output is given by
$$x = a + \eta + \varepsilon,$$
where $\eta$ is a common shock affecting not only one firm but all others in the industry. But this additive structure is clearly special. A more general model would be
$$x = g(a, \eta, \varepsilon),$$
which allows for effort to be more valuable in periods in which the common shock is high. In this case optimal contracts may compensate CEOs when there are positive shocks not because they are suboptimal, but because this is necessary for optimal
incentive provision. Even though this may appear not as simple as the additive structure, the fact that it is theoretically possible implies that a somewhat more careful combination of theoretical and empirical work may be fruitful. This is an area for future work.
8. The Basic Model of Career Concerns
In addition to multitasking, another issue important in many settings, especially in the public sector or for politicians, but equally for managers, is that they are not simply remunerated for their current performance with wages, but their future prospects for promotion and employment depend on their current performance. This is referred to as “career concerns” following the seminal paper by Holmstrom, “Managerial Incentive Schemes-A Dynamic Perspective,” in Essays in Economics and Management in the Honor of Lars Wahlbeck, 1982.
The issues here are very important theoretically, and also have practical importance. Eugene Fama in a paper in 1980 suggested that competition in the market for managers might be sufficient to give them sufficient incentives without agency contracts. Perhaps more important, and anticipating the incomplete contracts literature, which we will discuss soon, it may be the case that the performance of the agent is “observable”, so that the market knows about it and then decides whether to hire the agent or not accordingly, but is not easy to contract upon. This will naturally lead to career concern type models.
The original Holmstrom model is infinite horizon, and we will see an infinite horizon model next, but let us start with a 2-period model. This class of models is sometimes referred to as “signal jamming” models (e.g., by Fudenberg and Tirole) for reasons that will become clear soon.
Output produced is equal to
$$x_t = \underbrace{\eta}_{\text{ability}} + \underbrace{a_t}_{\text{effort}} + \underbrace{\varepsilon_t}_{\text{noise}}, \qquad t = 1, 2,$$
which is only different from what we have seen so far because of the presence of the ability term $\eta$.
We go to the extreme case where there are no performance contracts.
Moreover, assume that
$$\varepsilon_t \sim N(0, 1/h_\varepsilon),$$
where $h$ is referred to as “precision”.
Also, the prior on $\eta$ has a normal distribution with mean $m_0$ and precision $h_0$, i.e.,
$$\eta \sim N(m_0, 1/h_0),$$
and $\eta$, $\varepsilon_1$, $\varepsilon_2$ are independent.
As before, $a_t \in [0, \infty)$. Even without $a_t$, a dynamic model of this sort has a lot of interesting features (for example, this is analyzed in the dynamic wage contract model of Harris and Holmstrom).
Differently from the basic moral hazard model, this is an equilibrium model, in the sense that there are other firms out there who can hire this agent. This is the source of the career concerns. Loosely speaking, a higher perception of the market about the ability of the agent, $\eta$, will translate into higher wages.
The name signal jamming now makes sense; it originates from the fact that under certain circumstances the agent might have an interest in working harder in order to improve the perception of the market about his ability.
Given these issues, let us be more specific about the information structure. This is as follows:
• the firm, the worker, and the market all share the same prior belief about $\eta$ (thus there is no asymmetric information and adverse selection; is this important?).
• they all observe $x_t$ each period.
• only the worker sees $a_t$ (moral hazard/hidden action).
In equilibrium, the firm and the market correctly conjecture $a_t$. This is important from a technical point of view, because along the equilibrium path, despite the fact that there is hidden action, information will stay symmetric.
The model of the labor market is simple. It is competitive, which does not introduce any difficult technicalities since all firms have symmetric information. The other important assumption is that contracts cannot be contingent on output and wages are paid at the beginning of each period.
In particular, competition in the labor market implies that the wage of the worker at time $t$ is equal to the mathematical expectation of the output he will produce given the history of his outputs:
$$w_t(x^{t-1}) = E(x_t \mid x^{t-1}),$$
where $x^{t-1} = \{x_1, \ldots, x_{t-1}\}$ is the history of his output realizations.
Of course, we can write this as
$$\begin{aligned}
w_t(x^{t-1}) &= E(x_t \mid x^{t-1}) \\
&= E(\eta \mid x^{t-1}) + a_t(x^{t-1}),
\end{aligned}$$
where $a_t(x^{t-1})$ is the effort that the agent will exert given history $x^{t-1}$, which is perfectly anticipated by the market along the equilibrium path.
Preferences are as before. In particular, the instantaneous utility function of the agent is
$$u(w_t, a_t) = w_t - c(a_t).$$
But we live in a dynamic world, so the agent maximizes:
$$U(w, a) = \sum_{t=1}^{T} \beta^{t-1}\left[w_t - c(a_t)\right],$$
where $\beta$ is the agent's discount factor and $T$ is the length of the horizon, which equals 2 here (later we will discuss the case where $T = \infty$).
We do not need to take a position on where $\beta$ comes from (the market or just discounting). It suffices that $\beta \le 1$.
We also have the standard assumptions on the cost function for effort,
$$c' > 0, \quad c'' > 0, \quad c'(0) = 0,$$
to guarantee a unique interior solution. The first-best level of effort $a^{fb}$ solves $c'(a^{fb}) = 1$.
Recall that all players, including the agent himself, have prior $\eta \sim N(m_0, 1/h_0)$.
So the world can be summarized as:
period 1:
• wage $w_1$
• effort $a_1$ chosen by the agent (unobserved)
• output is realized: $x_1 = \eta + a_1 + \varepsilon_1$
period 2:
• wage $w_2(x_1)$
• effort $a_2$ chosen
• output is realized: $x_2 = \eta + a_2 + \varepsilon_2$
The appropriate equilibrium concept is again Perfect Bayesian Equilibrium, but for our purposes what matters is that there will be backward induction again, and all beliefs will be pinned down by application of Bayes' rule. So let us start from the second period.
Backward induction immediately makes it clear that $a_2 = 0$ irrespective of what happens in the first period, i.e., the agent will exert no effort in the last period because the wage does not depend on second-period output, and the world ends after that, so certifications don't matter.
Given this, we can write:
$$\begin{aligned}
w_2(x_1) &= E(\eta \mid x_1) + a_2(x_1) \\
&= E(\eta \mid x_1).
\end{aligned}$$
Then the problem of the market is the estimation of $\eta$ given information $x_1 = \eta + a_1 + \varepsilon_1$. The only difficulty is that $x_1$ depends on first-period effort.
In a Perfect Bayesian Equilibrium, the market will anticipate the level of effort $a_1$, and given the beliefs, agents will in fact play exactly this level. Let the conjecture of the market be $\bar{a}_1$.
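Define
$$z_1 \equiv x_1 - \bar{a}_1 = \eta + \varepsilon_1$$
as the deviation of observed output from this conjecture.
Once we have $z_1$, life is straightforward because everything is normal. In particular, the standard normal updating formula implies that
$$\eta \mid z_1 \sim N\left(\frac{h_0 m_0 + h_\varepsilon z_1}{h_0 + h_\varepsilon},\; \frac{1}{h_0 + h_\varepsilon}\right),$$
so the posterior precision is $h_0 + h_\varepsilon$.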
The interpretation of this equation is straightforward, especially with the analogy to linear regression. Intuitively, we start with prior $m_0$, and update $\eta$ according to the information contained in $z_1$. How much weight we give to this new information depends on its precision relative to the precision of the prior. The greater $h_\varepsilon$ is relative to $h_0$, the more the new information matters. Finally, the variance of this posterior will be less than the variance of both the prior and the new information, since these two bits of information are being combined (hence its precision is greater).
Therefore, we have
$$E(\eta \mid z_1) = \frac{h_0 m_0 + h_\varepsilon z_1}{h_0 + h_\varepsilon},$$
or going back to the original notation:
$$E(\eta \mid x_1) = \frac{h_0 m_0 + h_\varepsilon (x_1 - \bar{a}_1)}{h_0 + h_\varepsilon}.$$
Consequently
$$w_2(x_1) = \frac{h_0 m_0 + h_\varepsilon (x_1 - \bar{a}_1)}{h_0 + h_\varepsilon}.$$
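The updating rule is easy to experiment with. The following is a minimal sketch of the market's inference; the function and argument names are hypothetical and chosen only for illustration.

```python
def second_period_wage(m0, h0, h_eps, x1, a1_conjecture):
    """Posterior mean and precision of ability after observing first-period output.

    m0, h0        : prior mean and prior precision of ability (eta)
    h_eps         : precision of the noise term
    x1            : observed first-period output
    a1_conjecture : effort the market believes was exerted (a-bar_1)
    """
    z1 = x1 - a1_conjecture                 # backed-out signal: eta + eps_1 on the equilibrium path
    precision = h0 + h_eps                  # precisions add under normal updating
    mean = (h0 * m0 + h_eps * z1) / precision
    return mean, precision

if __name__ == "__main__":
    # Since a_2 = 0, the second-period wage equals the posterior mean of ability.
    wage, precision = second_period_wage(m0=2.0, h0=1.0, h_eps=3.0, x1=5.0, a1_conjecture=1.0)
    print(wage, precision)
```

The posterior mean is a precision-weighted average of the prior mean $m_0$ and the signal $z_1$, so raising $h_\varepsilon$ relative to $h_0$ moves the wage closer to the signal.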
So to complete the characterization of equilibrium we have to find the level of $a_1$ that the agent will choose as a function of $\bar{a}_1$, and make sure that this is indeed equal to $\bar{a}_1$; that is, this will ensure that this is a fixed point, as required by our concept of Perfect Bayesian Equilibrium.
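Let us first write the optimization problem of the agent. This is
$$\max_{a_1}\; \left[w_1 - c(a_1)\right] + \beta\, E\{w_2(x_1) \mid \bar{a}_1\},$$
where we have used the fact that $a_2 = 0$. Substituting from above and dropping $w_1$, which is just a constant, this is equivalent to:
$$\max_{a_1}\; \beta\, E\left\{\left.\frac{h_0 m_0 + h_\varepsilon (x_1 - \bar{a}_1)}{h_0 + h_\varepsilon}\,\right|\, \bar{a}_1\right\} - c(a_1).$$
Recall that both $\eta$ and $\varepsilon_1$ are uncertain, even to the agent.
Therefore
$$\max_{a_1}\; \beta\, E\left\{\left.\frac{h_0 m_0 + h_\varepsilon (\eta + \varepsilon_1 + a_1 - \bar{a}_1)}{h_0 + h_\varepsilon}\,\right|\, a_1\right\} - c(a_1)$$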
and finally making use of the fact that $a_1$ is not stochastic (the agent is choosing it, so he knows what it is!), the problem is
$$\max_{a_1}\; \beta \frac{h_\varepsilon}{h_0 + h_\varepsilon}\, a_1 - c(a_1) + \beta\, E\left\{\frac{h_0 m_0 + h_\varepsilon (\eta + \varepsilon_1 - \bar{a}_1)}{h_0 + h_\varepsilon}\right\}.$$
Now carrying out the maximization problem, we obtain the first-order condition:
$$c'(a_1) = \beta \frac{h_\varepsilon}{h_0 + h_\varepsilon} < 1 = c'(a^{fb}),$$
so that the agent exerts less than first-best effort in period one. This is because there are two “leakages” (increases in output that the agent does not capture): first, the payoff from higher effort only occurs next period, therefore its value is discounted by $\beta$. Second, the agent only gets credit for a fraction $h_\varepsilon/(h_0 + h_\varepsilon)$ of her effort, the part that is attributed to ability.
Holmstrom shows that as long as $\beta < 1$, equilibrium effort will also be less than the first-best in a stationary infinite horizon model, but as we will see next, with finite horizon or non-stationary environments, “over-effort” is a possibility.
The characterization of the equilibrium is completed by imposing $\bar{a}_1 = a_1$, which enables us to compute $w_1$. Recall that
$$\begin{aligned}
w_1 &= E(x_1 \mid \text{prior}) \\
&= E(\eta) + \bar{a}_1 \\
&= m_0 + a_1.
\end{aligned}$$
The model has straightforward comparative statics. In particular, we have:
$$\frac{\partial a_1}{\partial \beta} > 0, \qquad \frac{\partial a_1}{\partial h_\varepsilon} > 0, \qquad \frac{\partial a_1}{\partial h_0} < 0.$$
These are all intuitive. Greater $\beta$ means that the agent discounts the future less, so exerts more effort because the first source of leakage is reduced.
More interestingly, a greater $h_\varepsilon$ implies that there is less variability in the random component of performance. This, from the normal updating formula, implies that any given increase in performance is more likely to be attributed to ability, so the agent is more tempted to jam the signal by exerting more effort. Naturally, in equilibrium, nobody is fooled, but equilibrium is only consistent with a higher level of equilibrium effort.
The intuition for the negative effect of $h_0$ is similar. When there is more variability in ability, career concerns are stronger.
This model gives a number of insights about what type of professions might have good incentives coming from career concerns. For example, if we think that ability matters a lot and shows a lot of variability in politics, the model would suggest that career concerns should be important for politicians.
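The comparative statics of the two-period model can be checked with a small sketch. It assumes a quadratic cost $c(a) = a^2/2$, which is not in the notes but satisfies the stated assumptions and gives $c'(a) = a$, so that $a^{fb} = 1$ and the first-order condition can be solved in closed form. The function name is hypothetical.

```python
FIRST_BEST_EFFORT = 1.0  # solves c'(a) = 1 under the assumed cost c(a) = a**2 / 2

def first_period_effort(beta, h0, h_eps):
    """Equilibrium a_1 from c'(a_1) = beta * h_eps / (h0 + h_eps), with c'(a) = a (assumed)."""
    return beta * h_eps / (h0 + h_eps)

if __name__ == "__main__":
    base = first_period_effort(beta=0.9, h0=1.0, h_eps=1.0)
    print(base, base < FIRST_BEST_EFFORT)                    # under-provision relative to first best
    print(first_period_effort(beta=0.95, h0=1.0, h_eps=1.0))  # more patient agent
    print(first_period_effort(beta=0.9, h0=1.0, h_eps=2.0))   # less noisy output
    print(first_period_effort(beta=0.9, h0=2.0, h_eps=1.0))   # tighter prior on ability
```

Because the market's conjecture $\bar{a}_1$ enters the agent's objective only through a constant, the optimal $a_1$ does not depend on it, and the fixed-point requirement $\bar{a}_1 = a_1$ is satisfied by setting the conjecture equal to this value.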
9. Career Concerns Over Multiple Periods
Let us briefly emphasize one implication of having multiple periods in this setting. There will be more learning earlier on than later.
To illustrate this, let us look at the same model with three periods. This model can be summarized by the following matrix:
$$\begin{array}{ll}
w_1 & a_1 \\
w_2(x_1) & a_2 \\
w_3(x_1, x_2) & a_3
\end{array}$$
With similar analysis to before, the first-order conditions for the agent are
$$c'(a_1) = \beta \frac{h_\varepsilon}{h_0 + h_\varepsilon} + \beta^2 \frac{h_\varepsilon}{h_0 + 2h_\varepsilon},$$
$$c'(a_2) = \beta \frac{h_\varepsilon}{h_0 + 2h_\varepsilon}.$$
This immediately implies that
$$a_1 > a_2 > a_3 = 0.$$
More generally, in the $T$ period model, the relevant first-order condition is
$$c'(a_t) = \sum_{\tau = t}^{T-1} \beta^{\tau - t + 1} \frac{h_\varepsilon}{h_0 + \tau h_\varepsilon}.$$
Holmstrom shows that in this case, with $T$ sufficiently large, there exists a period $\bar{\tau}$ such that
$$a_{t < \bar{\tau}} \ge a^{fb} \ge a_{t > \bar{\tau}}.$$
In other words, workers work too hard when young and not hard enough when old (think of the working hours of assistant professors versus tenured faculty). Importantly and interestingly, these effort levels depend on the horizon (time periods), but not on past realizations.
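The effort profile implied by the $T$-period first-order condition can be traced out directly. The sketch below again assumes the illustrative cost $c(a) = a^2/2$ (so $c'(a) = a$ and $a^{fb} = 1$); the function name is hypothetical.

```python
def effort_path(T, beta, h0, h_eps):
    """Efforts a_1, ..., a_T from c'(a_t) = sum_{tau=t}^{T-1} beta^(tau-t+1) h_eps/(h0 + tau*h_eps).

    Assumes c'(a) = a, so each effort level equals the right-hand side directly.
    """
    path = []
    for t in range(1, T + 1):
        marginal_return = sum(
            beta ** (tau - t + 1) * h_eps / (h0 + tau * h_eps)
            for tau in range(t, T)          # empty for t = T, so a_T = 0
        )
        path.append(marginal_return)
    return path

if __name__ == "__main__":
    print(effort_path(T=3, beta=0.9, h0=1.0, h_eps=1.0))      # a_1 > a_2 > a_3 = 0
    print(effort_path(T=30, beta=1.0, h0=1.0, h_eps=1.0)[0])   # early over-effort with a long horizon
```

With three periods this reproduces the two conditions above. With a long horizon and little discounting, the early terms add up to more than 1, which is the "over-effort when young" case; how large $T$ must be depends on $\beta$, $h_0$ and $h_\varepsilon$.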
Remarkably, similar results hold when ability is not constant, but evolves over time (as long as it follows a normal process). For example, we could have
$$\eta_t = \eta_{t-1} + \delta_t$$
with
$$\eta_0 \sim N(m_0, 1/h_0), \qquad \delta_t \sim N(0, 1/h_\delta) \;\; \forall t.$$
In this case, it can be shown that the updating process is stable, so that the process and therefore the effort level converge, and in particular as $t \to \infty$, we have
$$a_t \to \bar{a},$$
but as long as $\beta < 1$,
$$\bar{a} < a^{fb}.$$
10. Career Concerns and Multitasking: Application to Teaching
Acemoglu, Kremer and Mian investigate a dynamic model of incentives with career concerns and multitasking, motivated by the example of teachers, and use this model to discuss which tasks should be organized in markets, firms or governments. Here is a quick overview, which will be useful in getting us to work more with infinite-horizon career concerns models.
Consider an infinite horizon economy with $n$ infinitely lived teachers, and $n' > n$ parents in every period, each with one child to be educated. $K = 1, 2, \ldots$ children
can be taught jointly by $K$ teachers. Each teacher, $i$, is endowed with a teaching ability $a_t^i$ at the beginning of period $t$. The level of $a_t^i$ is unknown, but both teacher $i$ and parents share the same belief about the distribution of $a_t^i$. The common belief about teacher $i$'s ability at time $t$ is given by a normal distribution:
$$a_t^i \sim N(m_t^i, v_t)$$
(note that, following the article, we will now use variances rather than precisions, so $v_t$ is the variance, whereas the precision would have been $1/v_t$).
Ability evolves over time according to the stochastic process given by:
$$a_{t+1}^i = a_t^i + \varepsilon_t^i, \tag{5.4}$$
where $\varepsilon_t^i$ is i.i.d. with $\varepsilon \sim N(0, \sigma_\varepsilon^2)$.
A teacher can exert two types of effort, “good” and “bad”, denoted by $g_t^i$ and $b_t^i$ respectively.
The human capital, $h_t^j$, of child $j$ is given by:
$$h_t^j = \bar{a}_t^j + \overline{f(g_t)}^j, \tag{5.5}$$
where $\bar{a}_t^j = \frac{1}{K_j} \sum_{i \in \mathcal{K}_j} a_t^i$ and $\overline{f(g_t)}^j = \frac{1}{K_j} \sum_{i \in \mathcal{K}_j} f(g_t^i)$, with $\mathcal{K}_j$ the set of teachers teaching child $j$, and $K_j$ the number of teachers in the set $\mathcal{K}_j$. In addition, $f(g)$ is increasing and strictly concave in $g$, with $f(0) = 0$, and $h_t^j = 0$ if the child is not taught by a teacher.
Let us start with the case where each child is taught by a single teacher, in which case (5.5) specializes to
$$h_t^i = a_t^i + f(g_t^i), \tag{5.6}$$
where, in this case, we can index the child taught by teacher $i$ by $i$.
Parents only care about the level of human capital provided to their children. The expected utility of a parent at time $t$ is given by:
$$U_t^P = E_t[h_t] - w_t,$$
where $E_t[\cdot]$ denotes expectations with respect to publicly available information at the beginning of time $t$ and $w$ is the wage paid to the teacher.
The expected utility of a teacher $i$ at time $t$ is given by:
$$U_t^i = E_t\left[\sum_{\tau=0}^{\infty} \delta^\tau \left(w_{t+\tau}^i - g_{t+\tau}^i - b_{t+\tau}^i\right)\right],$$
where $w_{t+\tau}^i$ denotes the wage of the teacher at time $t + \tau$, and $\delta < 1$ is the discount factor.
The level of $h_t^i$ provided by a teacher is not observable to parents. Instead, parents have to rely on an imperfect signal of $h$, given by the test scores, $s$. The test score of child $j$ in the general case is given by:
$$s_t^j = h_t^j + \gamma\, \overline{f(b_t)}^j + \bar{\theta}_t^j + \eta_t, \tag{5.7}$$
where $\gamma \ge 0$, $\theta_t^i$ is an i.i.d. student-level shock distributed as $N(0, \sigma_\theta^2)$, for example, the ability of the students to learn, and $\eta_t$ is a common shock that every teacher receives in period $t$. For example, if all students are given the same test, $\eta_t$ can be thought of as the overall difficulty of the test, or any other cohort-specific difference in ability or the curriculum. $\eta_t$ is distributed i.i.d. $N(0, \sigma_\eta^2)$. In addition, $\overline{f(b_t)}^j$ and $\bar{\theta}_t^j$ are defined analogously as averages over the set of teachers in $\mathcal{K}_j$.
In the special case where each child is taught by a single teacher, we have:
$$s_t^i = h_t^i + \gamma f(b_t^i) + \theta_t^i + \eta_t. \tag{5.8}$$
Naturally the variance $\sigma_\theta^2$ measures the quality of the signal $s_t^i$, but the variance of the common shock, $\sigma_\eta^2$, also affects the informativeness of the signal.
The timing of events is similar to the baseline career concerns model. In the beginning of every period $t$, parents form priors, $m_t^i$, on the abilities of teachers based on the histories of test scores of the teachers. They then offer a wage $w_t^i$ based on the expected ability of the teacher working with their child. The teacher then decides on the levels of good and bad effort, and $h$ and $s$ are realized at the end of period $t$. Ability $a_t^i$ is then updated according to the stochastic process (5.4). The process then repeats itself in period $t + 1$.
Again we are interested in Perfect Bayesian Equilibria, where all teachers choose $\{g_{t+\tau}^i, b_{t+\tau}^i\}_{\tau = 0, 1, \ldots}$ optimally given their rewards, and the beliefs about teacher ability are given by Bayesian updating.
Let us also simplify the analysis by focusing on the stationary equilibrium where the variance of each teacher's ability is constant, i.e., $v_t = v_{t+1} = v$, and there are many teachers, so $n \to \infty$.
We have to start by deriving the equations for the evolution of beliefs.
Parents' belief about teacher $i$ at the beginning of period $t$ can be summarized as $a_t^i \sim N(m_t^i, v_t)$.
Let $S_t = [s_t^1 \; \ldots \; s_t^n]^T$ denote the vector of $n$ test scores that the agents observe during period $t$ when each child is taught by a single teacher.
As in the analysis above, parents back out the part of $S_t$ which only reflects the ability levels of the teachers, plus the noise. Let $Z_t = [z_t^1 \; \ldots \; z_t^n]^T$ denote this backed out signal, where
$$\begin{aligned}
z_t^i &= s_t^i - f(g_t^i) - \gamma f(b_t^i) \\
&= a_t^i + \theta_t^i + \eta_t.
\end{aligned}$$
Let $a_{t+1}^i$ be the updated prior on teacher $i$'s ability conditional on observing $Z_t$. Then the normality of the error terms and the additive structure in equation (5.8) imply that
$$a_{t+1}^i \sim N(m_{t+1}^i, v_{t+1}),$$
where $m_{t+1}^i$ and $v_{t+1}$ denote the mean and the variance of the posterior distribution. Using the normal updating formula, setting $v_{t+1} = v_t = v$, it can be derived that:
$$m_{t+1}^i = m_t^i + \beta (z_t^i - m_t^i) - \bar{\beta}\, (\bar{z}_t^{-i} - \bar{m}_t^{-i}), \tag{5.9}$$
where
$$\beta = \bar{\beta} = \frac{1 + \sqrt{1 + 4\left(\sigma_\theta^2 / \sigma_\varepsilon^2\right)}}{1 + 2\left(\sigma_\theta^2 / \sigma_\varepsilon^2\right) + \sqrt{1 + 4\left(\sigma_\theta^2 / \sigma_\varepsilon^2\right)}}, \tag{5.10}$$
$z_t^i$ is the $i$th element of the vector $Z_t$, and refers to the signal from teacher $i$, while $\bar{z}_t^{-i}$ is the average test score excluding teacher $i$. Since $n \to \infty$, we have $(\bar{z}_t^{-i} - \bar{m}_t^{-i}) \to \eta_t$, so the common shock is revealed and filtered out.
This expression indicates that we can think of the parameter $\beta$ as the “career concerns” parameter, in the sense that it indicates how much a given increase in test scores of children feeds into an improved perception of the ability of the teacher.
Note also that there is a natural form of relative performance evaluation here because of the common shock $\eta_t$: by comparing two different teachers (schools), the common shock $\eta_t$ can be perfectly filtered out.
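One way to see where (5.10) comes from is to iterate the variance of beliefs until it settles down. The sketch below does this under an assumed timing convention (not spelled out in the notes): $v$ is the variance of beliefs at the start of a period, the signal $z = a + \theta$ is observed once the common shock has been filtered out, and ability then receives the innovation $\varepsilon$. The function names are hypothetical.

```python
import math

def career_concerns_beta(var_theta, var_eps):
    """Closed-form weight on the signal, as in (5.10)."""
    r = var_theta / var_eps
    root = math.sqrt(1.0 + 4.0 * r)
    return (1.0 + root) / (1.0 + 2.0 * r + root)

def stationary_weight_by_iteration(var_theta, var_eps, v0=1.0, iterations=500):
    """Iterate the belief variance to its fixed point and return the updating weight."""
    v = v0
    for _ in range(iterations):
        posterior_var = v * var_theta / (v + var_theta)  # after observing z = a + theta
        v = posterior_var + var_eps                      # ability then moves by eps
    return v / (v + var_theta)                           # weight on (z - m) in the update

if __name__ == "__main__":
    for var_theta, var_eps in [(1.0, 1.0), (4.0, 1.0), (0.25, 1.0)]:
        print(career_concerns_beta(var_theta, var_eps),
              stationary_weight_by_iteration(var_theta, var_eps))
```

Under this convention the two numbers coincide, and the weight falls as $\sigma_\theta^2$ rises relative to $\sigma_\varepsilon^2$: noisier test scores feed less strongly into perceived ability.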
Let us next look at the first and second-best by considering the social welfare function:
$$U_t^W = \sum_{\tau=0}^{\infty} \delta^\tau \left(A + f(g_{t+\tau}) - g_{t+\tau} - b_{t+\tau}\right), \tag{5.11}$$
where $A$ is the average ability of teachers in the population, which is constant when $n \to \infty$, and $g_{t+\tau}$ and $b_{t+\tau}$ are the good and bad effort levels chosen by all teachers.
Naturally we have:
First Best: Maximizing (5.11) gives us the first-best. In the first-best, there is no bad effort, $b_t = 0$, and the level of good effort, $g^{FB}$, is given by $f'(g^{FB}) = 1$.
Second-Best: Since teacher effort and the level of human capital are not directly observable, a more useful benchmark is given by solving for the optimal mechanism given these informational constraints.
Let $\Omega_t^i = [m_0^i \; s_0^i \; s_1^i \; s_2^i \; \ldots \; s_{t-1}^i]$ be the information set containing the vector of test scores for teacher $i$ at the beginning of period $t$ when all children are taught by a single teacher.
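Let $w_t^i(\Omega_t^i)$ be the wage paid to teacher $i$ in period $t$. Then the constrained maximization problem to determine the second-best allocation can be written as:
$$\max_{\{w_{t+\tau}^i(\Omega_{t+\tau}^i)\}_{\tau = 0, 1, \ldots}} U_t^W$$
subject to
$$\{g_{t+\tau}, b_{t+\tau}\}_{\tau = 0, 1, \ldots} \in \arg\max_{\{g'_{t+\tau},\, b'_{t+\tau}\}_{\tau = 0, 1, \ldots}} E_t\left[\sum_{\tau=0}^{\infty} \delta^\tau \left(w_{t+\tau}(\Omega_{t+\tau}^i) - g'_{t+\tau} - b'_{t+\tau}\right)\right].$$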
While the exact solution of this problem is slightly involved, the first-order condition immediately implies that:
$$\gamma f'(b_{t+\tau}) = f'(g_{t+\tau}).$$
Therefore, teachers can be encouraged to exert good effort only at the cost of bad effort. As a result, the opportunity cost of inducing high effort is greater in the second-best problem than in the first-best.
Next consider a wage schedule of the form
$$w_t = \alpha m_t + \kappa,$$
which links teacher compensation to their contemporaneous perceived ability.
Given such a schedule, the privately optimal levels of good and bad effort are obtained as:
$$f'(g_{t+\tau}) = \gamma f'(b_{t+\tau}) = \frac{1 - \delta(1 - \beta)}{\alpha \delta \beta} \quad \text{for all } \tau \ge 0.$$
Consequently, a greater $\alpha$, i.e., higher-powered incentives, translates into greater good and bad effort, and for intuitive reasons, the magnitude of this effect depends both on the career concerns coefficient $\beta$ and the discount factor $\delta$. (Can you develop the intuition?)
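To see these effort levels in numbers, the sketch below picks a specific technology, $f(e) = 2\sqrt{e}$, which is an assumption made here only because it is increasing, strictly concave, satisfies $f(0) = 0$, and has the simple derivative $f'(e) = 1/\sqrt{e}$. The function name is hypothetical.

```python
def effort_levels(alpha, delta, beta, gamma):
    """Good and bad effort under the wage schedule w_t = alpha * m_t + kappa.

    Assumes f(e) = 2 * sqrt(e), so f'(e) = 1 / sqrt(e) (an illustrative choice).
    """
    # Common right-hand side of the first-order conditions.
    k = (1.0 - delta * (1.0 - beta)) / (alpha * delta * beta)
    good = (1.0 / k) ** 2      # solves f'(g) = k
    bad = (gamma / k) ** 2     # solves gamma * f'(b) = k
    return good, bad

if __name__ == "__main__":
    print(effort_levels(alpha=1.0, delta=0.9, beta=0.5, gamma=0.5))  # market case, alpha = 1
    print(effort_levels(alpha=1.5, delta=0.9, beta=0.5, gamma=0.5))  # higher-powered incentives
```

Raising $\alpha$ lowers the common right-hand side and therefore raises both kinds of effort; with this functional form, bad effort is always the fraction $\gamma^2$ of good effort.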
Putting this together with the second-best above, we immediately see that
$$\alpha^{SB} = \frac{1 - \delta(1 - \beta)}{\delta \beta f'(g^{SB})} \tag{5.12}$$
would achieve the second-best for a given level of $\beta$.
Interestingly, if somehow $\alpha$ were constant, but the planner could manipulate $\beta$, that is, the degree to which teachers have “career concerns”, the second-best could
be achieved by setting:
$$\beta^{SB} = \frac{1 - \delta}{\delta\left(\alpha f'(g^{SB}) - 1\right)}. \tag{5.13}$$
Now it is an immediate corollary of what we have seen so far that if all teachers work in “singleton teams,” that is, if they work by themselves, the market wage for teacher $i$ will be:
$$w_t^i = m_t^i + E_t[f(g_t^i)]. \tag{5.14}$$
The market equilibrium is therefore similar to the second-best equilibrium, except that now $\alpha$ is fixed to be 1. This leads to a result that parallels the possibility of excess incentives in the multitask models.
In particular, the market equilibrium level of good effort will be $g^M$, given by:
$$f'(g^M) = \frac{1 - \delta(1 - \beta)}{\delta \beta}.$$
An interesting implication is that $g^M < g^{SB}$ if $\gamma < \bar{\gamma}$, and $g^M > g^{SB}$ if $\gamma > \bar{\gamma}$.
The result that $g^M < g^{SB}$ if $\gamma < \bar{\gamma}$ is similar to the result in Holmstrom discussed above that, with discounting, career concerns are typically insufficient to induce the optimal level of effort. So in this case, even markets do not provide strong enough incentives.
The case where $\gamma > \bar{\gamma}$, on the other hand, leads to the opposite conclusion. Now the natural career concerns provided by the market equilibrium create too high-powered incentives relative to the second-best.
The extent to which the market provides excessively high-powered incentives depends on the career concerns coefficient, $\beta$, and via this, on $\sigma_\theta^2$ and $\sigma_\varepsilon^2$. When $\sigma_\theta^2$ is small relative to $\sigma_\varepsilon^2$, $\beta$ is high, and teachers in the market care a lot about their pupils' scores, giving them very high-powered incentives. In this case, since markets are encouraging too much bad effort, firms or governments may be useful by modifying the organization of production to dull incentives.
If indeed markets provide too high-powered incentives, one way of overcoming this may be to form teams of teachers to weaken the signaling ability of individual teachers.
Let us model the firm as a partnership of $K$ teachers working together, engaged in joint teaching as captured in equation (5.5) above.
Crucially, parents only observe the aggregate or average test score of all the teachers (or pupils) in the firm.
The notation for the analysis in this case is somewhat involved, so we will not provide details, but simply highlight the main result. This is that when $\gamma > \bar{\gamma}$, so that markets provide excessive incentives, there exists a unique equilibrium where firms have size equal to $K^* = \beta/\beta^{SB} > 1$ and where teachers exert the second-best level of good effort, $g^{SB}$.
The paper also shows why this beneficial firm-equilibrium may be impossible to sustain because of inside information about the performance of employees within the firm (think about why such inside information will be problematic?), and how government-type organizations with duller incentives may be useful (what are the things that the government can do and the private sector cannot?).
10.1. Team Production. Finally, let us briefly discuss the Holmstrom 1982 paper where output is produced by a team, where every worker's contribution raises the total output of the firm. This seems like a good approximation to many production processes in practice.
The information structure is such that only total output is observed, that is, the principal cannot tell the contribution of different workers to total production.
Given this assumption, the environment can be simplified by first removing uncertainty, because there is still a non-trivial problem for the principal, since she cannot invert the output-effort relationship to obtain the actions of all agents (since all of their efforts matter for output).
More formally, consider an organization consisting of $n$ agents $i \in \{1, \ldots, n\}$. They all choose effort $a_i \in [0, \infty)$.
Let the vector of efforts be denoted by
$$a = (a_1, \ldots, a_n).$$
The key assumption is that of team production, so output is equal to
$$x = x(a_1, \ldots, a_n) \in \mathbb{R}$$
and does not depend on the stochastic variable $\theta$. We make the natural assumption that higher effort leads to higher output, that is,
$$\frac{\partial x}{\partial a_i} > 0 \quad \forall i.$$
All of the workers have risk-neutral preferences:
$$U(w_i, a_i) = w_i - c_i(a_i),$$
with the usual assumption that $c_i' > 0$, $c_i'' > 0$ $\forall i$.
What is a contract here?
Since only $x$ is observable, a contract has to be a vector that specifies payments to each agent as a function of the realization of output.
Let us refer to this as a sharing rule, denoted by
$$s(x) = \big(s_1(x), \ldots, \underbrace{s_i(x)}_{\text{payment to agent } i \text{ when team output is } x}, \ldots, s_n(x)\big).$$
It is natural to impose limited liability in this case, so
$$s_i(x) \ge 0 \quad \forall x, \forall i.$$
Moreover, we may want to impose that the firm can never pay out more than what it generates,
$$\sum_{i=1}^{n} s_i(x) \le x$$
(though this may be relaxed if the firm is represented by a risk-taking entrepreneur, who makes a loss in some periods and is compensated by gains during other times).
The timing of events is as usual:
(1) Principal and agents sign s(x)
123
Lectures in Labor Economics
(2) n agen ts simultaneously choose eor ts.
(3) Ev er ybody observ es x
(4) The paym ents specied by s(x) are distributed and the principal keeps
x
n
X
i=1
s
i
(x)
Before we analyze this game, let us imagine that there is no principal and the team manages itself, in the spirit of a labor-managed firm.
How should the labor-managed firm ideally set $s(x)$?
The key constraint is that of budget balance, i.e., the labor-managed firm has to distribute all of the output between its employees (there is no principal to make additional payments or take a share of profits; money-burning type rules would be ex post non-credible). Thus, we have
\[ \sum_{i=1}^{n} s_i(x) = x \quad \forall x. \]
Let us ask whether the labor-managed firm can achieve efficiency, that is, effort levels such that
\[ \frac{\partial x}{\partial a_i} = c_i'(a_i) \quad \forall i. \tag{5.15} \]
Thus the question is whether there exists a sharing rule $s(x)$ that achieves full efficiency.
To answer this question, we have to look at the first-order conditions of the agents, which take the natural form:
\[ s_i'(x) \frac{\partial x}{\partial a_i} = c_i'(a_i). \]
Now for this condition to be consistent with (5.15), it must be that $s_i'(x) = 1$ for all $i$. But differentiating the budget balance constraint with respect to $x$ requires $\sum_{i=1}^{n} s_i'(x) = 1$, which cannot hold together with $s_i'(x) = 1$ for every $i$ when $n > 1$, so we cannot have full efficiency.
One solution is to have a “budget breaker” so that the budget balance constraint is relaxed.
Consider the contract
\[ s_i(x) = \begin{cases} b_i & \text{if } x \ge x(a^*) \\ 0 & \text{if } x < x(a^*) \end{cases} \]
where $a^* = (a_1^*,\dots,a_n^*)$ is the vector of efficient effort levels and $\sum_i b_i = x(a^*)$. What is happening to the output when it is less than $x(a^*)$?
It can be verified easily that given this contract all agents will choose the efficient level of effort. If they do not, then they will all be punished severely.
In fact, with this contract, along the equilibrium path there is budget balance, but the principal can design a contract whereby, off the equilibrium path, output is taken away from the workers as punishment.
Looked at in this light, the problem of the labor-managed firm (relative to the capitalist firm) is its inability to punish its employees by throwing away output.
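The free-riding logic of team production can be checked numerically. The following is a minimal sketch, assuming (purely for illustration, not taken from the text) a linear technology $x = \sum_i a_i$ and quadratic costs $c_i(a_i) = a_i^2/2$, so that the efficient effort is $a_i^* = 1$. All function names are hypothetical.

```python
import numpy as np

def best_response(payoff, grid):
    """Return the grid point that maximizes a one-argument payoff function."""
    values = np.array([payoff(a) for a in grid])
    return float(grid[int(np.argmax(values))])

def equal_share_payoff(a_i, others_total, n):
    """Budget-balanced rule s_i(x) = x / n, with assumed cost a_i^2 / 2."""
    x = a_i + others_total
    return x / n - 0.5 * a_i ** 2

def group_penalty_payoff(a_i, others_total, n, a_star=1.0):
    """Budget-breaker rule: b_i = x(a*) / n if x >= x(a*), and 0 otherwise."""
    x = a_i + others_total
    target = n * a_star
    pay = target / n if x >= target - 1e-12 else 0.0
    return pay - 0.5 * a_i ** 2

n = 4
grid = np.linspace(0.0, 2.0, 201)

# Equal sharing: the first-order condition (1/n) * 1 = a_i gives a_i = 1/n.
br_share = best_response(lambda a: equal_share_payoff(a, (n - 1) / n, n), grid)

# Group penalty: when the others supply a* = 1, agent i's best response is a* = 1.
br_penalty = best_response(lambda a: group_penalty_payoff(a, (n - 1) * 1.0, n), grid)

print(br_share, br_penalty)
```

Under these assumptions the equal-sharing rule yields effort $1/n$ per agent, while the group-penalty contract sustains the efficient effort as a best response to the others choosing $a^*$.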
10.2. Teams with Observed Individual Outputs. Let us end this discussion by going back to tournaments, relative performance evaluation and sufficient statistics. Let us consider the team production problem and investigate the role that the performance of other agents plays in the optimal contract.
Let us assume that the principal is risk neutral while agents have utility
\[ u_i(w_i) - c_i(a_i) \]
with the standard assumptions,
\[ u_i' > 0, \quad u_i'' < 0, \quad c_i' > 0, \quad c_i'' > 0. \]
In the baseline model, the output $x_i$ of each agent is observed but not his effort level $a_i$. The output of the agent is again a function of his own effort and some state of nature $\theta_i$:
\[ x_i(a_i, \theta_i) = \text{output of } i. \]
Notice that there is no “team production” here, since $x_i$ only depends on the action of individual $i$, $a_i$.
As usual we assume that
\[ \frac{\partial x_i}{\partial a_i} > 0, \]
and as a normalization,
\[ \frac{\partial x_i}{\partial \theta_i} > 0. \]
Finally, let us denote the vector of stochastic elements by
\[ \theta = (\theta_1,\dots,\theta_n) \sim F(\theta). \]
Possible sharing rules in the setup are vectors of the form
\[ \{s_i(x_1,\dots,x_i,\dots,x_n)\}_{i=1}^{n}. \]
The general sharing rules are difficult to characterize. However, the sufficient statistic result from before enables us to answer an interesting question: when is a sharing rule that only depends on the individual agent's output, $\{s_i(x_i)\}_{i=1}^{n}$, optimal?
To answer this, let us set up the principal's problem as
\[ \max_{a = (a_1,\dots,a_n),\ \{s_i(x_1,\dots,x_n)\}_{i=1}^{n}} \int_{\theta} \left[ \sum_{i=1}^{n} \bigl\{ x_i(a_i, \theta_i) - s_i\bigl(x_1(a_1, \theta_1),\dots,x_n(a_n, \theta_n)\bigr) \bigr\} \right] dF(\theta) \]
subject to the participation constraint of each agent
\[ \int_{\theta} u_i\bigl(s_i(x_1(a_1, \theta_1),\dots,x_n(a_n, \theta_n))\bigr)\, dF(\theta) - c_i(a_i) \ge H_i \quad \text{for all } i \]
and the incentive compatibility constraints (combined with Nash equilibrium in the tournament-like environments that we have already seen), that is,
\[ a_i \in \arg\max_{a_i'} \int_{\theta} u_i\bigl(s_i(x_1(a_1, \theta_1),\dots,x_i(a_i', \theta_i),\dots,x_n(a_n, \theta_n))\bigr)\, dF(\theta) - c_i(a_i') \quad \text{for all } i. \]
Recall from our discussion above that for $n = 1$ we know $x$ is a sufficient statistic for $(x, y)$ if and only if $f(a \mid x, y) = f(a \mid x)$, which implies that the optimal contract satisfies $s(x, y) = S(x)$.
Generalizing that analysis in a natural way, for $n > 1$, we are interested in the question: is $x_i$ a sufficient statistic for $(x_1,\dots,x_n)$ with respect to $a_i$?
Using the previous definition, let $y$ now be a vector, defined as
\[ y = x_{-i} = (x_1,\dots,x_{i-1},x_{i+1},\dots,x_n). \]
The sufficient statistic result says that
\[ s_i(x_1,\dots,x_n) = s_i(x_i) \quad \text{if and only if} \quad f(a_i \mid x_i, x_{-i}) = f(a_i \mid x_i). \]
This leads to the natural result that the optimal sharing rules $\{s_i(x_1,\dots,x_n)\}_{i=1}^{n}$ are functions of $x_i$ alone if and only if the $\theta_i$'s are independent, i.e.,
\[ F(\theta) = F_1(\theta_1) F_2(\theta_2) \cdots F_n(\theta_n). \]
What the proposition says is that forcing agents to compete with each other is useless if there exists no common uncertainty. This contrasts with the tournament results of Lazear and Rosen, and highlights that tournament-type contracts are generally not optimal.
Even when the different $\theta$s are not independent, the sharing rules might take simple forms.
For example, suppose that $\theta_i = \eta + \varepsilon_i$, so that individual uncertainty is the sum of an aggregate component and an individual component (independent across agents), and that both of these components are normally distributed. Then
\[ x_i(a_i, \theta_i) = a_i + \theta_i = a_i + \eta + \varepsilon_i. \]
In this case, it can be shown that optimal contracts are of the form
\[ s_i(x_i, \bar{x}) \]
where
\[ \bar{x} = \frac{1}{n} \sum_{j=1}^{n} x_j \]
is average output. This is because $(\bar{x}, x_i)$ is a sufficient statistic for $(x_1,\dots,x_n)$ for the estimation of $a_i$.
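To see why average output is informative when shocks share a common component, the following simulation sketch compares the noise in $x_i$ alone with the noise in $x_i - \bar{x}$. The parameter values, the seed, and the choice to hold all efforts at zero are assumptions made only for illustration; the sketch shows that benchmarking against the average filters out $\eta$, not that any particular contract is optimal.

```python
import numpy as np

rng = np.random.default_rng(0)
n, draws = 5, 200_000
sigma_eta, sigma_eps = 1.0, 0.5      # assumed standard deviations
a = np.zeros(n)                      # efforts held fixed, so x_i = theta_i

eta = rng.normal(0.0, sigma_eta, size=(draws, 1))   # common shock
eps = rng.normal(0.0, sigma_eps, size=(draws, n))   # idiosyncratic shocks
x = a + eta + eps                                   # x_i = a_i + eta + eps_i

x_bar = x.mean(axis=1)
raw_var = x[:, 0].var()                  # noise in x_1 on its own
relative_var = (x[:, 0] - x_bar).var()   # noise in x_1 relative to the average

print(raw_var, relative_var)
```

With these numbers the variance of $x_1$ is close to $\sigma_\eta^2 + \sigma_\varepsilon^2 = 1.25$, whereas the variance of $x_1 - \bar{x}$ is close to $\sigma_\varepsilon^2 (1 - 1/n) = 0.2$, since the common shock cancels.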
11. Moral Hazard and Optimal Unemployment Insurance
Let us now consider an application of ideas related to moral hazard to the design of optimum unemployment insurance. The standard approach in the literature, first developed in Shavell and Weiss's classic 1979 paper, considers the problem of the design of optimal unemployment insurance as a dynamic moral hazard problem, where unemployed individuals have to exert effort to find jobs and the unemployment insurance system provides consumption insurance. Greater consumption insurance is desirable all else equal, but it tends to discourage search effort and thus increases unemployment duration.
Here I will present a slight generalization of Shavell and Weiss's approach based on a more recent paper by Hopenhayn and Nicolini (JPE, 1997). The interaction between a general equilibrium model of search and unemployment insurance is discussed in later chapters.
The model incorporates moral hazard regarding search effort (but there are no application decisions). Since the firm side is left implicit, it is essentially a partial equilibrium model. The preferences of the agent are
\[ E \sum_{t=0}^{\infty} \beta^{t} \left[ u(c_t) - a_t \right] \]
where $c_t \in \mathbb{R}$ is consumption and $a_t \in A$ is search effort, which leads to a probability of finding a job $p_t = p(a_t)$. All jobs are homogeneous and pay $w$ (the feature that rules out the application margin). We naturally assume that
\[ p'' < 0, \quad p' > 0. \]
We also assume that the individual has zero income when unemployed and does not have access to any savings or borrowing opportunities. This last assumption is crucial and simplifies the analysis by allowing the unemployment insurance authority to directly control the consumption level of the individual. Otherwise, there will be an additional constraint which determines the optimal consumption path of the individual.
Let $s_j$ be the state at time $j$:
\[ s_j = 0 \ \text{(unemployed)}, \qquad s_j = 1 \ \text{(employed)}. \]
The important object will be the history of the agent up to time $t$, which is denoted by $h^t = \{s_j\}_{j<t}$. Let $H^t$ be the set of all such histories.
A general insurance contract can be represented as a mapping
\[ \tau : H^t \longrightarrow A \times \mathbb{R} \]
where the first element of the mapping is $a_t$, the "recommended search effort", and the second element $z_t$ is the transfer to the worker, which will directly determine his consumption, since he has no access to an outside source of consumption and no savings opportunities.
Let $V_0(\tau)$ be the expected discounted utility at $t = 0$ associated with contract $\tau$, and to prepare for setting up the dual of this problem, let $C_0(\tau)$ be the expected cost (net transfers) to the agent.
Now the optimal contract choice can be set up as
\[ \max_{\tau} V_0(\tau) \]
subject to the IC (incentive compatibility) constraints, if any, and
\[ C_0(\tau) \le C. \]
The last constraint, for example, may require the total cost to be equal to zero, i.e., all benefits to be financed by some type of payroll taxes or other taxation (e.g., budget balance as in the previous model).
Instead of this problem, we can look at the dual problem
\[ C(V) = \min_{\tau} C_0(\tau) \]
subject to IC and
\[ V_0(\tau) \ge V. \]
Let us start with the full information case where the social planner (the unemployment insurance authority) can directly monitor the search effort of the unemployed individual, so the individual has no choice but to choose the recommended search effort. This implies that there are no IC constraints.
Then, it is straightforward that full insurance is optimal, i.e., $c_t = c$ for all $t$, and the level of search effort will solve:
\[ a^* = \arg\max_{a} \; p(a) \sum_{t=0}^{\infty} \beta^{t} [1 - p(a)]^{t} \left[ \frac{u(c)}{1 - \beta} - a \right] \]
The more interesting case is the one with imperfect information, where $a$ is the private information of the individual, so he will only follow the recommended search effort if this is incentive compatible for him. In other words, as in all types of implementation or optimal policy problems, there is an "argmax" constraint on the maximization problem.
Suppose $V_0(\tau) = V$. Let us introduce some useful notation:
\[ V^e = V_1(\tau) \ \text{if } s_1 = 1, \qquad V^u = V_1(\tau) \ \text{if } s_1 = 0. \]
This implies that we can write the value of the individual as
\[ V = u(c) - a + \beta \left\{ p(a) V^e + (1 - p(a)) V^u \right\}. \]
Now the incentive compatibility constraints (IC) boil down to
\[ a \in \arg\max_{a'} \; u(c) - a' + \beta \left\{ p(a') V^e + (1 - p(a')) V^u \right\}. \tag{5.16} \]
Naturally, (5.16) defines a very high dimensional object. It basically requires $a$ to be better than or as good as any other feasible choice in $A$. These kinds of constraints are very difficult to work with, so the literature usually takes the first-order approach, which is to represent (5.16) with the corresponding first-order condition of the agent, i.e.,
\[ \beta p'(a) (V^e - V^u) = 1. \tag{5.17} \]
This may seem innocuous, but in many situations it leads to the wrong solution. One has to be very careful in using the first-order approach. In this case, the situation is not so bad, because the individual only has a single choice, and given $V^e$ and $V^u$, his maximization problem is strictly concave, so the first-order condition (5.17) is necessary and sufficient for the individual's maximization problem. Nevertheless, this constraint itself, i.e., (5.17), is non-linear and non-convex, so some of the difficulties of designing optimal contracts carry over to this case.
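The equivalence between the argmax constraint (5.16) and the first-order condition (5.17) in this particular problem can be illustrated numerically. The sketch below assumes a job-finding function $p(a) = 1 - e^{-a}$ (which satisfies $p' > 0$, $p'' < 0$) and illustrative values for $\beta$ and $V^e - V^u$; none of these come from the text.

```python
import math
import numpy as np

beta, gap = 0.95, 3.0    # gap stands for V^e - V^u (illustrative numbers)

def p(a):
    """Assumed job-finding probability, increasing and concave in effort."""
    return 1.0 - np.exp(-a)

def objective(a):
    # u(c) and beta * V^u do not depend on a, so they are dropped here.
    return -a + beta * p(a) * gap

# First-order condition (5.17): beta * p'(a) * gap = 1 with p'(a) = exp(-a).
a_foc = math.log(beta * gap)

# Direct maximization of the agent's objective on a fine grid, as in (5.16).
grid = np.linspace(0.0, 5.0, 50_001)
a_grid = float(grid[np.argmax(objective(grid))])

print(a_foc, a_grid)
```

Because the objective is strictly concave in $a$ under this assumed $p$, the grid maximizer and the solution of the first-order condition coincide up to the grid step.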
The problem is further simplified by noting that after the individual finds a job, there is no further incentive problem, so after that point there will be full consumption smoothing, i.e.,
\[ V^e = \frac{u(c^e)}{1 - \beta} \quad \text{for some } c^e. \tag{5.18} \]
This is equivalent to a per-period transfer $c^e - w$ to the agent. In other words, there may be negative or positive transfers to the agent after he finds a job. The level of these transfers will be a function of his history, i.e., when (after how many periods of unemployment) he has found a job.
Now let
\[ W(V^e) = \frac{c^e - w}{1 - \beta} \]
be the discounted present value of the transfer from the principal to the agent.
Inverting (5.18), we have
\[ W(V^e) = \frac{-w + u^{-1}\left[ (1 - \beta) V^e \right]}{1 - \beta}. \]
Differentiating this equation, we obtain an intuitive formula
\[ W'(V^e) = \frac{1}{u'(c^e)}, \]
which states that the cost of providing greater utility is the reciprocal of the marginal utility of consumption for the individual. When $u'(c^e)$ is high, providing more utility to the individual is relatively cheap. From the concavity of the individual's utility function, $u$, $W$ is also seen to be a convex function (it is clearly increasing).
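As a quick check of the formula $W'(V^e) = 1/u'(c^e)$, the sketch below uses logarithmic utility, so that $u^{-1}(v) = e^{v}$ and $1/u'(c) = c$. The utility function and the numbers for $\beta$, $w$ and $V^e$ are assumptions chosen only for illustration.

```python
import math

beta, w = 0.95, 1.0      # illustrative discount factor and wage

def W(V_e):
    """Cost of delivering V_e to an employed worker under u = log."""
    c_e = math.exp((1.0 - beta) * V_e)      # inverts V_e = log(c_e) / (1 - beta)
    return (c_e - w) / (1.0 - beta)

V_e = 10.0
c_e = math.exp((1.0 - beta) * V_e)

h = 1e-6
numeric_slope = (W(V_e + h) - W(V_e - h)) / (2.0 * h)   # central difference
analytic_slope = 1.0 / (1.0 / c_e)                      # 1 / u'(c_e), u'(c) = 1 / c

print(numeric_slope, analytic_slope)
```

The finite-difference slope matches $1/u'(c^e) = c^e$, and it rises with $V^e$, which is the convexity of $W$ noted above.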
Now let $C(V)$ be the cost of providing utility $V$ to an unemployed individual. It can be written in a recursive form as
\[ C(V) = \min_{a,\, c^u,\, V^e,\, V^u} \; c^u + \beta \left\{ p(a) W(V^e) + [1 - p(a)] C(V^u) \right\} \]
subject to
\[ u(c^u) - a + \beta \left\{ p(a) V^e + [1 - p(a)] V^u \right\} = V \tag{5.19} \]
\[ \beta p'(a) (V^e - V^u) = 1 \tag{5.20} \]
where $c^u$ is the consumption given to the unemployed individual, (5.19) is the promise-keeping constraint, which makes sure that the agent indeed receives utility $V$, and (5.20) is the IC constraint using the first-order approach. Note that this formulation makes it clear that the social planner or the unemployment insurance authority is directly controlling consumption. Otherwise, there would be another constraint corresponding, for example, to the Euler equation of the individual.
Also, notice that this is a standard recursive equation, so time has been dropped and everything has been written recursively. This creates quite a bit of economy in terms of notation. Moreover, the existence of a function $C(V)$ can be again guaranteed using the contraction mapping theorem (Theorem ??).
An interesting question is whether $C(V)$ is convex. Recall that in the standard dynamic programming problems, concavity of the payoff function and the convexity of the constraint set were sufficient to establish concavity of the value function. Here we are dealing with a minimization problem, so the equivalent result would be convexity of the cost function. However, the constraint set is no longer convex, so the convexity of $C(V)$ is not guaranteed. This does not create a problem for the solution, but it implies that there may be a better policy than the one outlined above which would involve using lotteries.
Can you see why lotteries would improve the allocation in this case? Can you see how the problem should be formulated with lotteries?
Here, to simplify the analysis, let us ignore lotteries.
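To make more progress, let us assign multiplier $\lambda$ to (5.19) and $\eta$ to (5.20). Then the first-order conditions (with respect to $a$, $c^u$, $V^e$ and $V^u$) are
\[ \begin{aligned}
\beta p'(a) \left[ W(V^e) - C(V^u) \right] - \lambda \left[ \beta p'(a) (V^e - V^u) - 1 \right] - \eta \beta p''(a) (V^e - V^u) &= 0 \\
1 - \lambda u'(c^u) &= 0 \\
\beta p(a) W'(V^e) - \lambda \beta p(a) - \eta \beta p'(a) &= 0 \\
\beta [1 - p(a)] C'(V^u) - \lambda \beta [1 - p(a)] + \eta \beta p'(a) &= 0
\end{aligned} \]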
The second first-order condition immediately implies
\[ \lambda = 1 / u'(c^u). \]
Now substituting this into the other conditions (and using constraint (5.20)), we have
\[ p'(a) \left[ W(V^e) - C(V^u) \right] = \eta p''(a) (V^e - V^u) \tag{5.21} \]
\[ C'(V^u) = \frac{1}{u'(c^u)} - \eta \frac{p'(a)}{1 - p(a)} \tag{5.22} \]
\[ W'(V^e) = \frac{1}{u'(c^e)} = \frac{1}{u'(c^u)} + \eta \frac{p'(a)}{p(a)} \tag{5.23} \]
In addition, we have the following envelope condition by differentiating the cost function with respect to $V$:
\[ C'(V) = \frac{1}{u'(c^u)} = [1 - p(a)] C'(V^u) + p(a) W'(V^e) \tag{5.24} \]
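The conditions (5.22), (5.23) and (5.24) fit together arithmetically: the terms in $\eta$ cancel when the two marginal costs are weighted by $1 - p(a)$ and $p(a)$. The sketch below verifies this with made-up numbers for $u'(c^u)$, $\eta$, $p(a)$ and $p'(a)$; these values are assumptions, not solutions of the model.

```python
lam = 1.0 / 0.5                  # lambda = 1 / u'(c^u), with u'(c^u) = 0.5 assumed
eta, p, p_prime = 0.3, 0.4, 0.5  # assumed multiplier, p(a) and p'(a)

C_prime_Vu = lam - eta * p_prime / (1.0 - p)   # right-hand side of (5.22)
W_prime_Ve = lam + eta * p_prime / p           # right-hand side of (5.23)

# Right-hand side of the envelope condition (5.24).
envelope_rhs = (1.0 - p) * C_prime_Vu + p * W_prime_Ve

print(C_prime_Vu, W_prime_Ve, envelope_rhs)
```

With a positive $\eta$, the weighted average reproduces $\lambda = 1/u'(c^u)$ and lies strictly between $C'(V^u)$ and $W'(V^e)$.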
We now have a key result of optimal unemployment insurance:
Theorem 5.1. The unemployment benefit and thus unemployed consumption, $c^u$, is decreasing over time. In addition, if $C(V)$ is convex, then $V^u < V$.
Proof. (sketch) From (5.22) and (5.23), we have that
\[ W'(V^e) - C'(V^u) = \eta p'(a) \left[ \frac{1}{1 - p(a)} + \frac{1}{p(a)} \right]. \]
Since $\eta > 0$ (see the paper, or think intuitively), this immediately implies
\[ W'(V^e) > C'(V^u). \]
Now use the envelope condition (5.24), which immediately implies
\[ W'(V^e) > C'(V) > C'(V^u). \tag{5.25} \]
Let $\hat{c}^u$ be next period's consumption. Then we have
\[ C'(V^u) = \frac{1}{u'(\hat{c}^u)}, \]
which combined with (5.25) and (5.24) and the concavity of the utility function $u$ immediately implies
\[ \hat{c}^u < c^u \]
as claimed. Moreover, (5.25) also implies that $V^u < V$ as long as $C$ is convex, completing the proof of the theorem. $\square$
What is the intuition? Dynamic incentives: the planner can give more efficient incentives by reducing consumption in the future.
A related question is what happens to the transfer/tax to employed workers. Is this a function of history?
Theorem 5.2. The wage tax/subsidy is a function of history, $h^t$, i.e., it is not constant.
Proof. (sketch) Let us revisit the envelope condition and rewrite it as
\[ C'(V_t) = [1 - p(a_t)] C'(V_{t+1}) + p(a_t) W'(V^e_t). \]
Iterating this condition forward for $T$ periods gives
\[ C'(V_t) = \sum_{i=0}^{T-1} \left\{ \prod_{j=0}^{i-1} (1 - p(a_{t+j})) \right\} p(a_{t+i}) W'(V^e_{t+i}) + \left\{ \prod_{j=0}^{T-1} (1 - p(a_{t+j})) \right\} C'(V^u_{t+T}). \]
Now to obtain a contradiction, suppose that $V^e_t = V^e$ for all $t$. From Theorem 5.1, $V^u_t$ must eventually be decreasing (since consumption benefits are). Let the second term, the one involving $\prod_{j=0}^{T-1}$, be denoted by $b_2$. Since $C'(V^u_{t+T})$ is bounded, as $T \to \infty$ we have $b_2 \to 0$. Therefore,
\[ C'(V_t) = \sum_{i=0}^{\infty} \left\{ \prod_{j=0}^{i-1} (1 - p(a_{t+j})) \right\} p(a_{t+i}) W'(V^e_{t+i}). \]
Since, by hypothesis, $W'(V^e_{t+i})$ is constant, we have
\[ C'(V_t) = W'(V^e) \sum_{i=0}^{\infty} \left\{ \prod_{j=0}^{i-1} (1 - p(a_{t+j})) \right\} p(a_{t+i}) = W'(V^e), \]
which contradicts (5.25), so $V^e$ cannot be constant and therefore $c^e$ cannot be constant. $\square$
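The last step of the proof uses the fact that the weights $\prod_{j<i} (1 - p(a_{t+j}))\, p(a_{t+i})$ sum to one, since they are the probabilities of finding a job in exactly period $t+i$. A minimal numerical sketch, with an assumed (hypothetical) sequence of job-finding probabilities bounded away from zero:

```python
def job_finding_weights(p_seq):
    """Return (sum of prod_{j<i}(1 - p_j) * p_i, remaining survival probability)."""
    total, survive = 0.0, 1.0
    for p_i in p_seq:
        total += survive * p_i     # find a job in this period after surviving so far
        survive *= 1.0 - p_i       # still unemployed after this period
    return total, survive

# Assumed path of p(a_{t+i}); it declines over time but stays above 0.3.
p_seq = [0.3 + 0.1 * (0.9 ** t) for t in range(200)]
total, survive = job_finding_weights(p_seq)

print(total, survive)
```

The survival term plays the role of the coefficient on $C'(V^u_{t+T})$ in $b_2$: it vanishes as the horizon grows, and the remaining weights add up to one.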
Under further assumptions, it can be established that generally $c^e$ is a decreasing sequence, which implies that optimal unemployment insurance schemes should make use of employment taxes conditional on history as well as allow for decreasing benefits.
Can you see the intuition for why wage taxes/subsidies are non-constant? Can you relate this result to decreasing benefits?
CHAPTER 6
Holdups, Incomplete Contracts and Investments
Before we discuss theories of investments in general and specific training, it is useful to review certain basic notions and models of holdups and investments in the absence of perfect markets and complete contracts. This chapter discusses the model by Grout where wage negotiations in the absence of a binding contract between workers and firms lead to underinvestment by firms, and the famous incomplete contract approach to the organization of the firm due to Williamson and Grossman and Hart.
1. Investments in the Absence of Binding Contracts
Consider the following simple setup. A firm and a worker are matched together, and because of labor market frictions, they cannot switch partners, so wages are determined by bargaining. As long as it employs the worker, the total output of the firm is
\[ f(k) \]
where $k$ is the amount of physical capital the firm has, and $f$ is an increasing, continuous and strictly concave production function.
The timing of events in this simple model is as follows:
(1) The firm decides how much to invest, at the cost $rk$.
(2) The worker and the firm bargain over the wage, $w$. We assume that bargaining can be represented by the Nash solution with asymmetric bargaining powers. In this bargaining problem, if there is disagreement, the worker receives an outside wage, $\bar{w}$, and the firm produces nothing, so its payoff is $-rk$.
The equilibrium has to be found by backward induction, starting in the second period. This involves first characterizing the Nash solution to bargaining. In this Nash bargaining problem, let the bargaining power of the worker be $\beta \in (0, 1)$. Recall that the (asymmetric) Nash solution to bargaining between two players, 1 and 2, is given by maximizing
\[ (\text{payoff}_1 - \text{outside option}_1)^{\beta} \, (\text{payoff}_2 - \text{outside option}_2)^{1-\beta}. \tag{6.1} \]
Digression: Before proceeding further, let us review where equation (6.1) comes from. Nash's bargaining theorem considers the bargaining problem of choosing a point $x$ from a set $X \subset \mathbb{R}^N$ for some $N \ge 1$ by two parties with utility functions $u_1(x)$ and $u_2(x)$, such that if they cannot agree, they will obtain respective disagreement payoffs $d_1$ and $d_2$ (these are sometimes referred to as “outside options,” though if they are literally modeled as outside options in a dynamic game of alternating offers bargaining, we would not necessarily end up with the Nash solution). The remarkable Nash bargaining theorem is as follows. Suppose we impose the following four axioms on the problem and solution: (1) $u_1(x)$ and $u_2(x)$ are Von Neumann-Morgenstern utility functions, in particular, unique up to positive linear transformations; (2) Pareto optimality: the agreement point will be along the frontier; (3) Independence of Irrelevant Alternatives: suppose $X' \subset X$ and the choice when bargaining over the set $X$ is $x' \in X'$; then $x'$ is also the solution when bargaining over $X'$; (4) Symmetry: identities of the players do not matter, only their utility functions. Then, there exists a unique bargaining solution that satisfies these four axioms. This unique solution is given by
\[ x^{NS} = \arg\max_{x \in X} \; (u_1(x) - d_1)(u_2(x) - d_2). \]
If we relax the symmetry axiom, so that the identities of the players can matter (e.g., worker versus firm have different “bargaining powers”), then we obtain:
\[ x^{NS} = \arg\max_{x \in X} \; (u_1(x) - d_1)^{\beta} (u_2(x) - d_2)^{1-\beta} \tag{6.2} \]
where $\beta \in [0, 1]$ is the bargaining power of player 1.
Next note that if both utilities are linear and defined over their share of some pie, and the set $X \subset \mathbb{R}^2$ is given by $x_1 + x_2 \le 1$, then the solution to (6.2) is given by
\[ (1 - \beta)(x_1 - d_1) = \beta (x_2 - d_2), \]
with $x_1 = 1 - x_2$, which implies the linear sharing rule:
\[ x_2 = (1 - \beta)(1 - d_1 - d_2) + d_2. \]
Intuitively, player 2 receives a fraction $1 - \beta$ of the net surplus $1 - d_1 - d_2$ plus his outside option, $d_2$.
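The linear sharing rule can be verified by maximizing the asymmetric Nash product directly on a grid. The values of $\beta$, $d_1$ and $d_2$ below are illustrative assumptions.

```python
import numpy as np

beta, d1, d2 = 0.3, 0.1, 0.2     # beta is the bargaining power of player 1

# Split a pie of size 1: player 2 gets x2, player 1 gets x1 = 1 - x2.
x2_grid = np.linspace(d2, 1.0 - d1, 70_001)
x1_grid = 1.0 - x2_grid

# Gains over the disagreement payoffs, clipped at zero to avoid tiny negative
# values from floating-point rounding at the endpoints of the grid.
gain1 = np.maximum(x1_grid - d1, 0.0)
gain2 = np.maximum(x2_grid - d2, 0.0)
nash_product = gain1 ** beta * gain2 ** (1.0 - beta)

x2_numeric = float(x2_grid[np.argmax(nash_product)])
x2_formula = (1.0 - beta) * (1.0 - d1 - d2) + d2

print(x2_numeric, x2_formula)
```

Both approaches give player 2 his outside option plus the share $1 - \beta$ of the net surplus $1 - d_1 - d_2$.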
In our context, the Nash bargaining solution amounts to choosing the wage, $w$, so as to maximize:
\[ (f(k) - w)^{1-\beta} (w - \bar{w})^{\beta}. \]
An important observation is that the cost of investment, $rk$, does not feature in this expression, since these investment costs are sunk. In other words, the profits of the firm are $f(k) - w - rk$, while its outside option is $-rk$. So the difference between payoff and outside option for the firm is simply $f(k) - w$. The above argument immediately implies that the Nash solution, as a function of the capital investment $k$, can be expressed as
\[ w(k) = \beta f(k) + (1 - \beta) \bar{w}. \]
This expression emphasizes the dependence of the equilibrium wage on the capital stock of the firm. In contrast, in a competitive labor market, the wage that a worker of a given skill is paid is always independent of the physical capital level of his employer. Here this dependence arises because of wage bargaining, i.e., absence of a competitive market.
Therefore, at the point of investment, the profits of the firm are
\[ \pi(k) = f(k) - w(k) - rk = (1 - \beta)(f(k) - \bar{w}) - rk. \]
The first-order condition of the profit maximization problem gives the equilibrium investment/physical capital level, $k^e$, as
\[ (1 - \beta) f'(k^e) = r. \]
In comparison, the efficient level of investment that would have emerged in a competitive labor market is given by
\[ f'(k^*) = r. \]
The concavity of $f$ immediately implies that $k^e < k^*$, thus there will be underinvestment.
The reason for underinvestment is straightforward to see. Because of bargaining, the firm is not the full residual claimant of the additional returns it generates by its investment. A fraction $\beta$ of the returns is received by the worker, since the wage that the firm has to pay is increasing in its capital stock.
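A small numerical sketch makes the underinvestment result concrete. It assumes the production function $f(k) = 2\sqrt{k}$ and illustrative values for $\beta$, $r$ and $\bar{w}$; with these choices the two first-order conditions give $k^e = ((1-\beta)/r)^2$ and $k^* = (1/r)^2$.

```python
import numpy as np

beta, r, w_bar = 0.5, 0.5, 0.2   # illustrative parameter values

def f(k):
    """Assumed production function with f'(k) = k ** -0.5."""
    return 2.0 * np.sqrt(k)

def wage(k):
    """Nash-bargained wage w(k) = beta * f(k) + (1 - beta) * w_bar."""
    return beta * f(k) + (1.0 - beta) * w_bar

def firm_profit(k):
    return f(k) - wage(k) - r * k

def total_surplus(k):
    # The wage is a pure transfer, so efficiency maximizes f(k) - r * k.
    return f(k) - r * k

grid = np.linspace(0.01, 10.0, 100_000)
k_e = float(grid[np.argmax(firm_profit(grid))])       # holdup investment
k_star = float(grid[np.argmax(total_surplus(grid))])  # efficient investment

print(k_e, k_star)
```

With these numbers the firm invests $k^e = 1$ instead of the efficient $k^* = 4$, because it keeps only the fraction $1 - \beta$ of the marginal return.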
That there are no binding contracts is important for this result. Imagine an alternative scenario where the firm and the worker first negotiate a wage contract $w(k)$ which specifies the wage that the worker will be paid for every level of physical capital. Assume this wage contract is binding, and to simplify discussion, let us limit attention to differentiable wage functions. Then, the equilibrium can be characterized as follows: first, given the wage function, find the firm's investment. This is clearly given by
\[ f'(k) - w'(k) = r. \tag{6.3} \]
Then, the wage function, $w(k)$, and the level of investment, $k$, will be chosen so as to maximize:
\[ (f(k) - w(k) - rk)^{1-\beta} (w(k) - \bar{w})^{\beta} \]
where notice that $rk$ is now subtracted from the firm's payoff, since the negotiation is before investment costs are sunk. It is straightforward to see that the solution to this problem must have $w'(k) = 0$, so the efficient level of investment will be implemented (to see this, consider changes in the function $w(k)$ such that the derivative at the value $k$ changes without $w(k)$ changing. By considering such changes, we can manipulate the level of $k$ from (6.3). So the equilibrium has to satisfy the first-order condition with respect to $k$
\[ (1 - \beta) \frac{f'(k) - w'(k) - r}{f(k) - w(k) - rk} + \beta \frac{w'(k)}{w(k) - \bar{w}} = 0. \]
Using (6.3) immediately gives $w'(k) = 0$).
This analysis establishes that underinvestment arises in this investment problem because of the absence of binding contracts, which in turn leads to a holdup problem; once the firm invests a larger amount in physical capital, it is potentially “held up” by the worker who can demand higher wages to work for the firm, with the threat point that, if he does not agree to work for the firm, the investment of the firm will be wasted.
Is the assumption of “no binding contracts,” which underlies this holdup problem, reasonable? There are two reasons why binding contracts are generally not possible and instead contracts have to be “incomplete”:
(1) Such contracts require the level of investment, $k$, to be easily observable by outside parties, so that the terms of a contract that makes payments conditional on $k$ are easily enforceable (notice the important emphasis here; there is no asymmetric information between the parties, but outside courts cannot observe what the firm and the worker observe; can there be no contracts that transmit this information to outside parties in order to make contracts conditional on this information?).
(2) We need to rule out renegotiation.
A combination of these reasons implies that such binding contracts are not easy to write. This problem becomes even more serious when investments are not in physical capital, but human capital, which will be our focus below.
2. Incomplete Contracts and the Internal Organization of the Firm
The type of incompleteness of contracts discussed in the previous section plays an important role in thinking about the internal organization of the firm. This is the essence of the approach started by Coase, Williamson, and Grossman and Hart.
An important application of this approach is a theory of vertical integration. This theory provides potential answers to the question: when should two divisions be part of a single firm, in a vertically-integrated structure, and when should they function as separate firms at arm's length? Although issues related to vertical integration are not central to labor economics, they highlight the implications of incomplete contracts in other settings as well.
The answer that the incomplete contracts literature gives to the vertical integration question links the organization of the firm to the distribution of bargaining power. Whoever has the right to use physical assets has the residual rights of control, so in the event of disagreement in bargaining, he or she can use these assets. This improves the outside option of the party who owns more assets. In a vertically-integrated structure, the owner/manager of the (downstream) firm has residual rights of control, and the manager of the (upstream) division is an employee. He can be fired at will. In an arm's-length relationship, in case of disagreement, the owner of the separate (upstream) firm has the residual rights of control of his own production and assets. This improves his incentives, but harms the incentives of the owner of the other firm.
To make these ideas more specific, consider the following simple setup. A downstream firm will buy an input of quality $h$ from an upstream supplier. He will then combine this with another input of quality $s$, to produce output of value
\[ 2(1 - \delta + \delta e) s^{\alpha} h^{\gamma}, \tag{6.4} \]
where $e \in \{0, 1\}$ is an ex post effort that the upstream manager exerts with some small cost $\varepsilon$. The cost of investing in quality for the downstream manager is equal to $s$, and for the upstream firm, it is $h$. Both qualities, as well as $e$, are unobserved by other parties, thus contracts conditional on $h$, $s$ and $e$ cannot be written.
We will consider two different organizational forms. The first organizational form is vertical integration, which involves the vesting of property rights over assets in the downstream manager. In particular, once the input is produced, the downstream manager owns the input and can appropriate the input from the upstream manager (naturally, there could also be the converse case in which there is vertical integration with the upstream manager having property rights over all the assets, but for our purposes here, it is sufficient to focus on vertical integration with property rights vested in the downstream manager). Now consider the case where the upstream manager has chosen $h$ and the downstream firm has chosen $s$. If the two managers agree, then the upstream manager will exert the required effort, $e = 1$, and total surplus is given by (6.4) with $e = 1$:
\[ 2 s^{\alpha} h^{\gamma}. \]
If they disagree, the upstream firm obtains nothing, while the downstream firm obtains
\[ 2(1 - \delta) s^{\alpha} h^{\gamma}. \]
Therefore, symmetric Nash bargaining gives the gross payoffs of the two parties as
\[ \pi^d = [2(1 - \delta) + \delta] s^{\alpha} h^{\gamma} \quad \text{and} \quad \pi^u = \delta s^{\alpha} h^{\gamma}. \]
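Now going back to the investment stage, the two firms (managers) will choose $h$ and $s$ with the following first-order conditions:
\[ \alpha [2(1 - \delta) + \delta] s^{\alpha - 1} h^{\gamma} = 1 \quad \text{and} \quad \gamma \delta s^{\alpha} h^{\gamma - 1} = 1, \]
which implies that
\[ \frac{h}{s} = \frac{\gamma \delta}{\alpha [2(1 - \delta) + \delta]}, \tag{6.5} \]
and thus
\[ h = \left( (\alpha [2(1 - \delta) + \delta])^{\alpha} (\gamma \delta)^{1 - \alpha} \right)^{1/(1 - \alpha - \gamma)} \]
and
\[ s = \left( (\alpha [2(1 - \delta) + \delta])^{1 - \gamma} (\gamma \delta)^{\gamma} \right)^{1/(1 - \alpha - \gamma)}. \]
Thus total surplus with vertical integration is:
\[ \begin{aligned}
V^{VI} = {} & 2 \left( (\alpha [2(1 - \delta) + \delta])^{\alpha} (\gamma \delta)^{\gamma} \right)^{1/(1 - \alpha - \gamma)} \\
& - \left( (\alpha [2(1 - \delta) + \delta])^{1 - \gamma} (\gamma \delta)^{\gamma} \right)^{1/(1 - \alpha - \gamma)} - \left( (\alpha [2(1 - \delta) + \delta])^{\alpha} (\gamma \delta)^{1 - \alpha} \right)^{1/(1 - \alpha - \gamma)}
\end{aligned} \tag{6.6} \]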
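Next consider the same problem with an arm's-length relationship. Now, if there is disagreement, the upstream firm would not supply the input, so the output of the downstream firm would be 0. Similarly, the output of the upstream firm is also 0, since he is making no sales. Thus, gross payoffs are
\[ \pi^d = s^{\alpha} h^{\gamma} \quad \text{and} \quad \pi^u = s^{\alpha} h^{\gamma}. \]
The ex ante maximization problem then gives:
\[ \frac{h}{s} = \frac{\gamma}{\alpha} \tag{6.7} \]
and
\[ s = \left( \alpha^{1 - \gamma} \gamma^{\gamma} \right)^{1/(1 - \alpha - \gamma)} \]
and
\[ h = \left( \alpha^{\alpha} \gamma^{1 - \alpha} \right)^{1/(1 - \alpha - \gamma)} \]
and total surplus is
\[ V^{NI} = 2 \left( \alpha^{\alpha} \gamma^{\gamma} \right)^{1/(1 - \alpha - \gamma)} - \left( \alpha^{1 - \gamma} \gamma^{\gamma} \right)^{1/(1 - \alpha - \gamma)} - \left( \alpha^{\alpha} \gamma^{1 - \alpha} \right)^{1/(1 - \alpha - \gamma)}. \tag{6.8} \]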
Comparison of (6.5) and (6.7) shows that the upstream firm is investing more relative to the downstream firm with an arm's-length relationship. This is because it has better outside options with this arrangement.
Comparison of (6.6) and (6.8) in turn shows that as $\gamma$ increases (while keeping $\gamma + \alpha$ constant), $V^{NI}$ increases relative to $V^{VI}$. Thus for relatively high $\gamma$, implying that the quality of the input from the upstream firm is relatively important, $V^{NI}$ will exceed $V^{VI}$. It is also straightforward to see that in this world with no credit market problems, when $V^{NI} > V^{VI}$, the equilibrium organization will be arm's-length, and when $V^{NI} < V^{VI}$, it will be vertical integration. This is because in the absence of credit market problems, ex ante transfers will ensure that the equilibrium organizational form is chosen jointly to maximize total surplus.
Therefore, we now have a theory of vertical integration based on the relative importance of incentives of the downstream and the upstream firms.
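The comparison between the two organizational forms can be evaluated numerically from the closed-form investments. The sketch below uses illustrative parameter values ($\delta = 0.5$ and $\alpha + \gamma = 0.5$), which are assumptions and not taken from the text; the function names are hypothetical.

```python
def investments(A, B, alpha, gamma):
    """Solve A * s**(alpha-1) * h**gamma = 1 and B * s**alpha * h**(gamma-1) = 1."""
    D = 1.0 - alpha - gamma
    s = (A ** (1.0 - gamma) * B ** gamma) ** (1.0 / D)
    h = (A ** alpha * B ** (1.0 - alpha)) ** (1.0 / D)
    return s, h

def total_surplus(s, h, alpha, gamma):
    """Output with e = 1 net of the two investment costs."""
    return 2.0 * s ** alpha * h ** gamma - s - h

def compare(alpha, gamma, delta):
    # Vertical integration: marginal shares alpha * (2 - delta) and gamma * delta.
    s_vi, h_vi = investments(alpha * (2.0 - delta), gamma * delta, alpha, gamma)
    # Arm's length: each party receives s**alpha * h**gamma.
    s_ni, h_ni = investments(alpha, gamma, alpha, gamma)
    return {
        "ratio_vi": h_vi / s_vi,
        "ratio_ni": h_ni / s_ni,
        "V_vi": total_surplus(s_vi, h_vi, alpha, gamma),
        "V_ni": total_surplus(s_ni, h_ni, alpha, gamma),
    }

high_gamma = compare(alpha=0.1, gamma=0.4, delta=0.5)   # upstream input important
low_gamma = compare(alpha=0.4, gamma=0.1, delta=0.5)    # downstream input important

print(high_gamma)
print(low_gamma)
```

In this parameterization the arm's-length relationship yields the larger surplus when $\gamma$ is high, and vertical integration yields the larger surplus when $\gamma$ is low, in line with the comparison of (6.6) and (6.8).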
CHAPTER 7
Efficiency Wage Models
Efficiency wage models are basically models with imperfect information where the participation constraint of employees is slack because of limited liability constraints, or sometimes because of other informational problems.
While basic agency models are widely used in contract theory, organizational economics and corporate finance, efficiency wage models are used mostly in macro and labor economics. But they are all part of the same family.
We start here with the Shapiro-Stiglitz model, which is the most famous macro/labor efficiency wage model, and provides a useful way of thinking about unemployment, which we will discuss in the context of search models as well later in the course.
1. The Shapiro-Stiglitz Model
The Shapiro-Stiglitz model is one of the workhorses of macro/labor. In this model, unemployment arises because wages need to be above the market clearing level in order to give incentives to workers. In fact, it is the combination of unemployment and high wages that makes work more attractive for workers, hence the title of the paper, "unemployment as a worker-discipline device".
Since the model is somewhat familiar, it is sufficient to sketch the main ingredients here: the model is in continuous time and all agents are infinitely lived.
Workers have to choose between two levels of effort, and are only productive if they exert effort:
effort = 0: cost = 0, not productive
effort = 1: cost = e, productive
Without any informational problems firms would write contracts to pay workers only if they exert effort. The problem arises because firms cannot observe whether a worker has exerted effort or not, and cannot deduce it from output, since output is a function of all workers' efforts. This introduces the moral hazard problem.
The model is in continuous time, so instead of probabilities we will be talking about flow rates.
If a worker "shirks", i.e., chooses effort = 0, then there is probability (flow rate) $q$ of getting detected and fired. [...For example, the worker's actions affect the probability distribution of some observable signal on the basis of which the firm compensates him. When the worker exerts effort, this signal takes the value 1. When he shirks, this signal is equal to 1 with probability $1 - q$ and 0 with probability $q$...]
All agents are risk neutral, and there are $N$ workers.
$b$ = exogenous separation rate
$a$ = job finding rate, which will be determined in equilibrium
$r$ = interest rate/discount rate
These types of dynamic models are typically solved by using dynamic programming/Bellman equations. Although the theory of dynamic programming can sometimes be difficult, in the context of this model its application is easy.
We will make the analysis even easier by focusing on steady states. In steady state, we can simply think of the present discounted value (PDV) of workers as a function of their "strategy" of shirking or working hard.
Denote the PDV of an employed shirker by $V_E^S$ (recall we are in continuous time):
$$rV_E^S = w + (b + q)(V_U - V_E^S) \tag{7.1}$$
where we have imposed $\dot{V}_E^S = 0$, since here we will only characterize steady states.
The intuition for this equation is straightforward. The worker always receives his wage (his compensation for this instant of work) $w$, but at the flow rate $b$ he separates from the firm exogenously, and at the flow rate $q$ he gets caught for shirking, and in both cases he becomes unemployed, receiving $V_U$ and losing $V_E^S$.
[The full continuous-time dynamic programming equation would be
$$rV_E^S - \dot{V}_E^S = w + (b + q)(V_U - V_E^S),$$
but in steady state $\dot{V}_E^S = 0$...]
Denote the PDV of an employed non-shirker by $V_E^N$:
$$rV_E^N = w - e + b(V_U - V_E^N), \tag{7.2}$$
which is different from (7.1) because the worker incurs the cost $e$, but loses his job at the slower rate $b$.
The PDV of unemployed workers, $V_U$, is
$$rV_U = z + a(V_E - V_U),$$
where
$$V_E = \max\left\{V_E^S, V_E^N\right\}$$
and $z$ is the utility of leisure + unemployment benefit.
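The non-shirking condition is an incentive-compatibility constraint that requires the worker to prefer to exert effort. Combining these equations, we obtain it as
$$V_E^N \ge V_E^S \iff w \ge rV_U + (r + b + q)\frac{e}{q} \iff w \ge z + e + (r + b + a)\frac{e}{q} \quad \text{[non-shirking condition]}.$$
The first form follows from (7.1) and (7.2) alone. The second uses $rV_U = z + a(V_E^N - V_U)$ together with (7.2), which give $V_E^N - V_U = (w - e - z)/(r + a + b)$, and the fact that the worker prefers not to shirk exactly when $q(V_E^N - V_U) \ge e$.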
This equation is intuitive. The greater is the unemployment benefit and the greater is the cost of effort, the greater should the wage be. More importantly, the more likely the worker is to be caught when he shirks, the lower is the wage. Also, the wage is higher when $r$, $b$ and $a$ are higher. Why is this?
Not shirking is an investment (why is this?), so the greater is $r$, the less attractive it is. This also explains the effect of $b$: the greater is this parameter, the more likely one is to leave the job, so this is just like discounting.
Finally, the effect of $a$ can be understood by thinking of unemployment as punishment (after all, if it weren't, why would the worker care about being fired?). The lower is $a$, the harder it is to move out of unemployment, the harsher is unemployment as a punishment, and thus wages will not need to be as high.
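The non-shirking condition can be verified numerically by solving the steady-state value equations directly. The sketch below takes the job finding rate $a$ as given and assumes that employed workers do not shirk in equilibrium, so that $V_E = V_E^N$ in the unemployment equation; the function names and parameter values are illustrative assumptions, not part of the model's exposition.

```python
def value_functions(w, z, e, a, b, q, r):
    """Steady-state values (V_E^N, V_E^S, V_U), assuming V_E = V_E^N."""
    # Subtracting the unemployment equation from (7.2) gives
    # (r + a + b) * (V_E^N - V_U) = w - e - z.
    gap = (w - e - z) / (r + a + b)
    v_u = (z + a * gap) / r
    v_n = v_u + gap
    # (7.1) solved for the value of a worker who deviates to shirking.
    v_s = (w + (b + q) * v_u) / (r + b + q)
    return v_n, v_s, v_u


def critical_wage(z, e, a, b, q, r):
    """Lowest wage satisfying the non-shirking condition."""
    return z + e + (r + b + a) * e / q


# Illustrative parameter values (assumed).
params = dict(z=1.0, e=0.5, a=0.8, b=0.1, q=0.5, r=0.05)
w_star = critical_wage(**params)
v_n, v_s, v_u = value_functions(w_star, **params)
print("critical wage:", w_star)
print("V_E^N - V_E^S at the critical wage:", v_n - v_s)
```

At the critical wage the worker is indifferent between working and shirking; any higher wage makes working strictly preferred, and any lower wage makes shirking strictly preferred.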
Steady state requires that
flow into unemployment = flow out of unemployment.
Again this is a type of equation we will see a lot when we study search models below.
In equilibrium, no one shirks because the non-shirking condition holds (similar to the agents doing the right thing in the agency models).
Therefore,
$$bL = aU,$$
where $L$ is employment and $U$ is unemployment.
This equation immediately determines the flow rate out of unemployment as
$$a = \frac{bL}{U} = \frac{bL}{N - L}.$$
Now substituting for this we get the full non-shirking condition as
$$\text{Non-Shirking Condition:} \quad w \ge z + e + \left[r + \frac{bN}{N - L}\right]\frac{e}{q}.$$
Notice that a higher level of $N/(N - L)$, which corresponds to lower unemployment, necessitates a higher wage to satisfy the non-shirking condition. This is the sense in which unemployment is a worker-discipline device. Higher unemployment makes losing the job more costly, hence encourages workers not to shirk.
Next, let us consider the determination of labor demand in this economy. Let us suppose that there are $M$ firms, each with access to a production function
$$AF(L),$$
where $L$ denotes their labor. We make the standard assumptions on $F$; in particular, it is increasing and strictly concave, i.e. $F'' < 0$.
These firms maximize static profits (no firing/hiring costs).
This implies that the equilibrium will satisfy
$$AF'(L) = w.$$
Aggregate labor demand will therefore be given by
$$AF'\left(\frac{L}{M}\right) = w.$$
Figure 7.1 shows the determination of the equilibrium diagrammatically. It plots the non-shirking condition in the labor-wage space as an upward sloping curve asymptoting to infinity at $L = N$ (full employment), as well as the downward sloping "labor demand curve" given by $AF'(L)$ (for now ignore the average product line).
Figure 7.1
Set $M = 1$ as a normalization; then equilibrium will be given by the following (famous) equation:
$$z + e + \left[r + \frac{bN}{N - L}\right]\frac{e}{q} = AF'(L).$$
This equation basically equates labor demand to quasi-labor supply.
This is quasi-labor supply rather than real labor supply, because it is not determined by the work-leisure trade-off of workers, but by the non-shirking condition: if wages did not satisfy this condition, workers would shirk and firms would lose money.
Given this equation, comparative statics are straightforward and intuitive:
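$A \downarrow \implies L \downarrow$: lower productivity $\implies$ high unemployment
$z \uparrow \implies L \downarrow$: high reservation wages $\implies$ high unemployment
$q \downarrow \implies L \downarrow$: bad monitoring $\implies$ high unemployment
$r \uparrow \implies L \downarrow$: high interest rates $\implies$ high unemployment
$b \uparrow \implies L \downarrow$: high turnover $\implies$ high unemployment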
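These comparative statics can be illustrated by solving the equilibrium equation numerically. The sketch below assumes the functional form $F(L) = L^{\beta}$ with $0 < \beta < 1$, which is not in the text, and finds the intersection of labor demand and the non-shirking condition by bisection; all names and parameter values are illustrative.

```python
def nsc_wage(L, N, z, e, b, q, r):
    """Wage required by the non-shirking condition at employment L < N."""
    return z + e + (r + b * N / (N - L)) * e / q


def labor_demand_wage(L, A, beta):
    """Marginal product A * F'(L) under the assumed F(L) = L**beta."""
    return A * beta * L ** (beta - 1.0)


def equilibrium_employment(A, z, e, b, q, r, N=1.0, beta=0.5, iterations=200):
    """Bisection on labor demand minus the non-shirking wage (with M = 1)."""
    lo, hi = 1e-12 * N, N * (1.0 - 1e-12)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # Demand lies above the NSC to the left of the intersection.
        if labor_demand_wage(mid, A, beta) > nsc_wage(mid, N, z, e, b, q, r):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative baseline (assumed values).
base = dict(A=2.0, z=0.5, e=0.2, b=0.1, q=0.5, r=0.05)
L0 = equilibrium_employment(**base)
print("baseline employment:", L0)
for name, value in [("A", 1.5), ("z", 0.8), ("q", 0.3), ("r", 0.10), ("b", 0.2)]:
    changed = dict(base, **{name: value})
    print(name, "->", value, "gives L =", equilibrium_employment(**changed))
```

Each of the five perturbations in the loop lowers equilibrium employment relative to the baseline, matching the signs listed above.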
Since there is unemployment, rents and information problems here, it is also natural to ask the welfare question: is the level of unemployment too high? It depends on what notion of welfare we are using and whether firms are owned by non-workers.
What are the externalities?
(1) By hiring one more worker, the firm is reducing unemployment, and forcing other firms to pay higher wages $\implies$ unemployment is too low.
(2) By hiring one more worker, the firm is increasing the worker's utility at the margin, since each worker is receiving a rent (wage > opportunity cost) $\implies$ unemployment is too high.
The diagram shows that the second effect always dominates (now consider the average product line). Unemployment is too high. A subsidy on wages financed by a tax on profits will increase output.
So can there be a Pareto-improving tax-subsidy scheme?
Contrary to what Shapiro and Stiglitz claim, the answer is not necessarily. If firms are owned by capitalists, the above policy will increase output, but will not constitute a Pareto improvement (think about whether Pareto improvement is the right criterion to look at in this case).
If firms are owned by workers, the above policy will constitute a Pareto improvement. But in this case workers have enough income. Why do they not already enter into "bonding" contracts or at least write better contracts as in our moral hazard models?
2. Other Solutions to Incentive Problems
In fact, the discussion at the end of the previous section immediately leads to the question: Are firms behaving optimally? The answer to this question is also: not necessarily. Firms can write better contracts (from the viewpoint of maximizing profits) even when workers are severely credit constrained. In particular, backloading compensation for workers is always feasible and will be more effective in preventing shirking.
This is one of the main criticisms of the shirking model: the presence of the monitoring problem does not necessarily imply "rents" for workers, and it is the rents for the workers that lead to distortions and unemployment.
Moreover, if workers have wealth, they can enter into bonding contracts where they post a bond that they lose if they are caught shirking.
Problems: firm-side moral hazard. Firms may claim workers have shirked and fire them either to reduce labor costs when the worker's wage has increased enough (above the opportunity cost), or to collect the bond payments.
In fact, in practice we observe a lot of upward sloping payment schedules for workers, and pensions and other benefits that they receive after retirement. Ed Lazear has argued that these are precisely responses to the incentive problems that workers face.
In any case, this discussion highlights that there are two empirical questions:
(1) Are monitoring problems important?
(2) Do more severe monitoring problems lead to greater rents for workers?
3. Evidence on Efficiency Wages
There are two types of evidence offered in the literature in support of efficiency wages.
The first type of evidence shows the presence of substantial inter-industry wage differences (e.g., Krueger and Summers). Such wage differentials are consistent with efficiency wage theories since the monitoring problem ($q$ in terms of the model above) is naturally more serious in some industries than others.
Nevertheless, this evidence does not establish that efficiency wage considerations are important, since there are at least two other explanations for the inter-industry wage differentials:
(1) These differentials may reflect compensating wages (since some jobs may be less pleasant than others) or premia for unobserved characteristics of workers, which differ systematically across industries because workers select into industries based on their abilities.
It seems to be the case that a substantial part of the wage differentials are in fact driven by these considerations. Nevertheless, it also seems to be the case that part of the inter-industry wage differences do in fact correspond to "rents". Workers who move from a low wage to a high wage industry receive a wage increase in line with the wage differential between these two sectors (Krueger and Summers; Gibbons and Katz), suggesting that the differentials do not simply reflect unobserved ability. (Nevertheless, this evidence is not watertight; what if workers move precisely when there is "good news" about their abilities?)
Compensating wage differentials also do not seem to be the whole story: workers are less likely to quit such jobs (Krueger and Summers), but see the discussion of Holzer, Katz, and Krueger below.
(2) Inter-industry wage differentials may correspond to differential worker rents in different industries, not because of efficiency wages, but because of differences in unionization or other industry characteristics that give greater bargaining power to workers in some industries than others (e.g., capital intensity).
Therefore, the inter-industry wage differentials are consistent with efficiency wages, but do not prove that efficiency wage considerations are important.
An interesting paper by Holzer, Katz, and Krueger investigates whether higher wages attract more applicants (which would be an indication that these jobs might be more attractive). They find that when wages are exogenously higher because of minimum wages, there are in fact more applicants. However, inter-industry wage differentials don't seem to induce more applications!
The second line of attack looks for direct evidence for efficiency wage considerations. A number of studies find support for efficiency wages. These include:
(1) Krueger compares wages and tenure premia in franchised and company-owned fast food restaurants. Krueger makes the natural assumption that there is less monitoring of workers in a franchised restaurant. He finds higher wages and steeper wage-tenure profiles in the franchised restaurants, which he interprets as evidence for efficiency wages.
(2) Cappelli and Chauvin provide more convincing evidence. They look at the number of disciplinary dismissals, which they interpret as a measure of shirking, in different plants located in different areas, but all owned by the same automobile manufacturer (and covered by the same union). The firm pays the same nominal wage everywhere (because of union legislation). This nominal wage translates into greater wage premia in some areas because outside wages differ. They find that when wage premia are greater, there are fewer disciplinary dismissals. This appears to provide strong support to the basic implication of the shirking model.
(3) Campbell and Kamlani survey 184 firms and find that firms are often unwilling to cut wages because this will reduce worker effort and increase shirking.
These various pieces of evidence together suggest efficiency wage considerations are important. Nevertheless they do not indicate whether these efficiency wages are the main reason why wages are higher than market-clearing levels and unemployment is high either in the U.S. or in Europe. Such an investigation requires more aggregate evidence.
4. Efficiency Wages, Monitoring and Corporate Structure
Next consider a simple model where we use the ideas of efficiency wages to think about the corporate structure.
For simplicity, take corporate structure to be the extent of monitoring (e.g., number of supervisors to production workers).
Consider a one-period economy consisting of a continuum of measure $N$ of workers and a continuum of measure 1 of firm owners who are different from the workers. Each firm $i$ has the production function $AF(L_i)$.
Differently from the Shapiro-Stiglitz model, let the probability of catching a shirking worker be endogenous. In particular, let $q_i = q(m_i)$, where $m_i$ is the degree of monitoring per worker by firm $i$.
The cost of monitoring for firm $i$ which hires $L_i$ workers is $s m_i L_i$. For example, $m_i$ could be the number of managers per production worker and $s$ the salary of managers.
Since there is a limited liability constraint, workers cannot be paid a negative wage, and the worst thing that can happen to a worker is to receive zero income. Since all agents are risk-neutral, without loss of generality, restrict attention to the case where workers are paid zero when caught shirking.
Therefore, the incentive compatibility constraint of a worker employed in firm $i$ can be written as:
$$w_i - e \ge (1 - q_i) w_i.$$
If the worker exerts effort, he gets utility $w_i - e$, which gives the left hand side of the expression. If he chooses to shirk, he gets caught with probability $q_i$ and receives zero. If he is not caught, he gets $w_i$ without suffering the cost of effort. This gives the right hand side of the expression.
Notice an important difference here from the Shapiro-Stiglitz model. Now if the worker is caught shirking, he does not receive the wage payment.
Firm $i$'s maximization problem can be written as:
$$\max_{w_i, L_i, q_i} \Pi = AF(L_i) - w_i L_i - s m_i L_i \tag{7.3}$$
subject to:
$$w_i \ge \frac{e}{q(m_i)} \tag{7.4}$$
$$w_i - e \ge u \tag{7.5}$$
The first constraint is the incentive compatibility condition rearranged. The second is the participation constraint, where $u$ is the ex ante reservation utility (outside option) of the worker; in other words, what he could receive from another firm in this market.
The maximization problem (7.3) has a recursive structure: $m$ and $w$ can be determined first without reference to $L$ by minimizing the cost of a worker, $w + sm$, subject to (7.4) and (7.5); then, once this cost is determined, the profit maximizing level of employment can be found. Each subproblem is strictly convex, so the solution is uniquely determined, and all firms will make the same choices: $m_i = m$, $w_i = w$ and $L_i = L$. In other words, the equilibrium will be symmetric.
Another useful observation is that the incentive compatibility constraint (7.4) will always bind. [Why is this? Think as follows: if the incentive compatibility constraint (7.4) did not bind, the firm could lower $q$ and increase profits without affecting anything else. This differs from the simplest moral hazard problem with fixed $q$, in which the incentive compatibility constraint (7.4) could be slack...]
By contrast, the participation constraint (7.5) may or may not bind; hence there may or may not be rents for workers. Contrast this with the Shapiro-Stiglitz model. The comparative statics of the solution have a very different character depending on whether it binds. The two situations are sketched in the figures.
(1) When (7.5) does not bind, the solution is characterized by the tangency of (7.4) with the per-worker cost $w + sm$. Call this solution $(w^*, m^*)$, where:
$$\frac{e\,q'(m^*)}{(q(m^*))^2} = s \quad \text{and} \quad w^* = \frac{e}{q(m^*)}. \tag{7.6}$$
In this case, because the participation constraint (7.5) does not bind, $w$ and $m$ are given by (7.6) and small changes in $u$ leave these variables unchanged.
(2) In contrast, if (7.5) binds, $w$ is determined directly from this constraint as equal to $u + e$, and an increase in $u$ causes the firm to raise this wage. Since (7.4) holds in this case, the firm will also reduce the amount of information gathering, $m$.
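The two cases can be made concrete with a specific detection technology. The sketch below assumes $q(m) = m/(1+m)$, which is not in the text; under this assumption (7.6) gives $m^* = \sqrt{e/s}$ and $w^* = e + \sqrt{es}$, and when (7.5) binds the binding constraint (7.4) gives $\hat m = e/u$. The function names and parameter values are illustrative.

```python
import math


def q(m):
    """Assumed detection probability: increasing, concave, q(0) = 0."""
    return m / (1.0 + m)


def optimal_contract(e, s, u):
    """Return (wage, monitoring, pc_binds) minimizing w + s*m under (7.4)-(7.5)."""
    # Unconstrained tangency from (7.6): e * q'(m) / q(m)**2 = e / m**2 = s.
    m_star = math.sqrt(e / s)
    w_star = e / q(m_star)
    if w_star - e >= u:
        # Participation constraint (7.5) is slack.
        return w_star, m_star, False
    # (7.5) binds: w = u + e, and (7.4) binding gives q(m) = e / (u + e).
    w_hat = u + e
    m_hat = e / u
    return w_hat, m_hat, True


# Illustrative values (assumed): e = 1, s = 4.
print(optimal_contract(e=1.0, s=4.0, u=1.0))  # slack case
print(optimal_contract(e=1.0, s=4.0, u=4.0))  # binding case
```

In the slack case the contract does not depend on $u$; once (7.5) binds, a higher $u$ raises the wage and lowers monitoring.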
What determines whether (7.5) binds?
Let $\hat w$ and $\hat m$ be the per-worker cost minimizing wage and monitoring levels (which would not be equal to $w^*$ and $m^*$ when (7.5) binds). Then, labor demand of a representative firm solves:
$$AF'(\hat L) = \hat w + s\hat m. \tag{7.7}$$
Next, using labor demand, we can determine $u$, workers' ex ante reservation utility, from market equilibrium. It depends on how many jobs there are. If aggregate demand $\hat L$ is greater than or equal to $N$, then a worker who turns down a job is sure to get another. In contrast, if aggregate demand $\hat L$ is less than $N$, then a worker who turns down a job may end up without another. In particular, in this case,
$$u = \frac{\hat L}{N}(w - e) + \left(1 - \frac{\hat L}{N}\right) z,$$
where $z$ is an unemployment benefit that a worker who cannot find a job receives.
When $\hat L = N$, there are always firms who want to hire an unemployed worker at the beginning of the period, and thus $u \ge w - e$. If there is excess supply of workers, i.e. $\hat L < N$, then firms can set the wage as low as they want, and so they will choose the profit maximizing wage level $w^*$ as given by (7.6). In contrast, with full employment, firms have to pay a wage equal to $u + e$, which will generically exceed the (unconstrained) profit maximizing wage rate $w^*$. Therefore, we can think of labor demand as a function of $u$, the reservation utility of workers: firms are "utility-takers" rather than price-takers. The figures show the two cases; the outcome depends on the state of labor demand. More importantly, the comparative statics are very different in the two cases.
Figure 7.2. Participation Constraint is Slack. (Axes: $m$, $w$; curves: IC, isocost, PC; solution at $(m^*, w^*)$.)
Now comparative statics are straightforward.
First, consider a small increase in $A$ and suppose that (7.5) is slack. The tangency between (7.4) and the per-worker cost is unaffected. Therefore, neither $w$ nor $m$ change. Instead, the demand for labor shifts to the right and firms hire more workers. As long as (7.5) is slack, firms will continue to choose their (market) unconstrained optimum, $(w^*, m^*)$, which is independent of the marginal product of labor. As a result, changes in labor demand do not affect the organizational form of the firm.
Figure 7.3. Participation Constraint is Binding. (Axes: $m$, $w$; curves: IC, isocost, PC; solution at $(\hat m, \hat w)$.)
If instead (7.5) holds as an equality, comparative static results will be different. In this case, (7.4), (7.7), and $L = N$ jointly determine $\hat q$ and $\hat w$. An increase in $A$ induces firms to demand more labor, increasing $\hat w$. Since (7.4) holds, this reduces $\hat q$, as can be seen by shifting the PC curve up. Therefore, when (7.5) holds as an equality, an improvement in the state of labor demand reduces monitoring. The intuition is closely related to the fact that workers are subject to limited liability. When workers cannot be paid negative amounts, the level of their wages is directly related to the power of the incentives. The higher are their wages, the more they have to lose by being fired and thus the less willing they are to shirk.
Figure 7.4. Participation Constraint is Slack. (Axes: $L$, $u$; curves: Labor Demand, Labor Supply; marked values: $w^* - e$, $\hat L$, $N$.)
Next, suppose that the government introduces a wage floor $\underline{w}$ above the equilibrium wage (or alternatively, unions demand a higher wage than would have prevailed in the non-unionized economy). Since the incentive compatibility constraint (7.4) will never be slack, a higher wage will simply move firms along the IC curve in the figure and reduce $m$. However, this will also increase the total cost of hiring a worker, reducing employment.
Can this model be useful in thinking about why the extent of monitoring appears to be behaving differently in continental Europe and the U.S.?
Figure 7.5. Participation Constraint is Binding. (Axes: $L$, $u$; curves: Labor Demand, Labor Supply; marked values: $\hat w - e$, $w^* - e$, $N$.)
Whether this model can explain these patterns or not is not clear. But cross-country differences in broad features of organizations are very stark, and investigation of these issues seems to be a very interesting area for future research.
Finally, let us look at welfare in this model.
Consider the aggregate surplus $Y$ generated by the economy:
$$Y = AF(L) - smL - eL, \tag{7.8}$$
where $AF(L)$ is total output, and $eL$ and $smL$ are the (social) input costs.
Figure 7.6. Trends in the ratio of managerial employees to non-managerial, non-agricultural workers in six countries (U.S., Japan, Canada, Spain, Italy, Norway), 1973 to 1995. Vertical axis: management ratio; horizontal axis: year. Source: ILO Labor Statistics.
In this economy, the equilibrium is constrained Pareto efficient: subject to the informational constraints, a social planner could not increase the utility of workers without hurting the owners. But total surplus $Y$ is never maximized in laissez-faire equilibrium. This is because of the following reason: if we can reduce $q$ without changing $L$, then $Y$ increases. A tax on profits used to subsidize $w$ relaxes the incentive constraint (7.4) and allows a reduction in monitoring. Indeed, the second-best allocation which maximizes $Y$ subject to (7.4) would set wages as high as possible subject to zero profits for firms. Suppose that the second-best optimal level of employment is $\tilde L$; then we have:
$$\tilde w + s\,q^{-1}\!\left(\frac{e}{\tilde w}\right) = \frac{AF(\tilde L)}{\tilde L}. \tag{7.9}$$
In this allocation, all firms would be making zero profits; since in the decentralized allocation, due to decreasing returns, they are always making positive profits, the two will never coincide.
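As a sketch of where (7.9) comes from: with (7.4) binding, monitoring per worker is pinned down by the wage, and the zero-profit condition then gives (7.9) after dividing by employment. The symbol $\tilde m$ and the assumption that $q$ is strictly increasing (so that $q^{-1}$ exists) are introduced here for illustration and are not stated explicitly in the text.

```latex
\begin{align*}
\tilde w = \frac{e}{q(\tilde m)}
  \quad &\Longrightarrow \quad
  \tilde m = q^{-1}\!\left(\frac{e}{\tilde w}\right), \\
0 = AF(\tilde L) - \tilde w \tilde L - s \tilde m \tilde L
  \quad &\Longrightarrow \quad
  \tilde w + s\, q^{-1}\!\left(\frac{e}{\tilde w}\right) = \frac{AF(\tilde L)}{\tilde L}.
\end{align*}
```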
A different intuition for why the decentralized equilibrium fails to maximize net output is as follows: part of the expenditure on monitoring, $smL$, can be interpreted as "rent-seeking" by firms. Firms are expending resources to reduce wages (they are trying to minimize the private cost of a worker, $w + sm$), which is, to a first-order approximation, a pure transfer from workers to firms. A social planner who cares only about the size of the national product wants to minimize $e + sm$, and therefore would spend less on monitoring. Reducing monitoring starting from the decentralized equilibrium would therefore increase net output.
Part 3
Investment in Post-Schooling Skills
CHAPTER 8
The Theory of Training Investments
1. General Vs. Specific Training
In the Ben-Porath model, an individual continues to invest in his human capital after he starts employment. We normally think of such investments as "training", provided either by the firm itself on-the-job, or acquired by the worker (and the firm) through vocational training programs. This approach views training just as schooling, which is perhaps too blackbox for most purposes.
More specifically, two complications that arise in thinking about training are:
(1) Most of the skills that the worker acquires via training will not be as widely applicable as schooling. As an example, consider a worker who learns how to use a printing machine. This will only be useful in the printing industry, and perhaps in some other specialized firms; in this case, the worker will be able to use his skills only if he stays within the same industry. Next, consider the example of a worker who learns how to use a variety of machines, and the current employer is the only firm that uses this exact variety; in this case, if the worker changes employer, some of his skills will become redundant. Or more extremely, consider a worker who learns how to get along with his colleagues or with the customers of his employer. These skills are even more "specific", and will become practically useless if he changes employer.
(2) A large part of the costs of training consist of forgone production and other costs borne directly by the employer. So at the very least, training investments have to be thought of as joint investments by the firm and the worker, and in many instances, they may correspond to the firm's decisions more than to that of the worker.
The first consideration motivates a particular distinction between two types of human capital in the context of training:
(1) Firm-specific training: this provides a worker with firm-specific skills, that is, skills that will increase his or her productivity only with the current employer.
(2) General training: this type of training will contribute to the worker's general human capital, increasing his productivity with a range of employers.
Naturally, in practice actual training programs could (and often do) provide a combination of firm-specific and general skills.
The second consideration above motivates models in which firms have an important say in whether or not the worker undertakes training investments. The extreme case is the one where training costs are borne by the firm (for example, because the process of training reduces production), and in this case, the firm directly deciding whether and how much training the worker will obtain may be a good approximation to reality and a good starting point for our analysis.
2. The Becker Model of Training
Let us start with investments in general skills. Consider the following stylized model:
• At time $t = 0$, there is an initial production of $y_0$, and also the firm decides the level of training $\tau$, incurring the cost $c(\tau)$. Let us assume that $c(0) = 0$, $c'(0) = 0$, $c'(\cdot) \ge 0$ and $c''(\cdot) > 0$. The second assumption here ensures that it is always socially beneficial to have some amount of positive training.
• At time $t = 1/2$, the firm makes a wage offer $w$ to the worker, and other firms also compete for the worker's labor. The worker decides whether to quit and work for another firm. Let us assume that there are many identical firms who can use the general skills of the worker, and the worker does not incur any cost in the process of changing jobs. This assumption makes the labor market essentially competitive. (Recall: there is no informational asymmetry here.)
• At time $t = 1$, there is the second and final period of production, where output is equal to $y_1 + \alpha(\tau)$, with $\alpha(0) = 0$, $\alpha'(\cdot) > 0$ and $\alpha''(\cdot) < 0$. For simplicity, let us ignore discounting.
First, note that a social planner wishing to maximize net output would choose a positive level of training investment, $\tau^* > 0$, given by
$$c'(\tau^*) = \alpha'(\tau^*).$$
The fact that $\tau^*$ is strictly positive immediately follows from the fact that $c'(0) = 0$ and $\alpha'(0) > 0$.
Before Becker analyzed this problem, the general conclusion, for example conjectured by Pigou, was that there would be underinvestment in training. The reasoning went along the following lines. Suppose the firm invests some amount $\tau > 0$. For this to be profitable for the firm, at time $t = 1$ it needs to pay the worker at most a wage of
$$w_1 < y_1 + \alpha(\tau) - c(\tau)$$
to recoup its costs. But suppose that the firm was offering such a wage. Could this be an equilibrium? No, because there are other firms who have access to exactly the same technology, and they would be willing to bid a wage of $w_1 + \varepsilon$ for this worker's labor services. Since there are no costs of changing employer, for $\varepsilon$ small enough such that
$$w_1 + \varepsilon < y_1 + \alpha(\tau),$$
a firm offering $w_1 + \varepsilon$ would both attract the worker by offering this higher wage and also make positive profits. This reasoning implies that in any competitive labor market, we must have
$$w_1 = y_1 + \alpha(\tau).$$
But then, the firm cannot recoup any of its costs and would like to choose $\tau = 0$. Despite the fact that a social planner would choose a positive level of training investment, $\tau^* > 0$, the pre-Becker view was that this economy would fail to invest in training.
The mistake in this reasoning was that it did not take into account the worker's incentives to invest in his own training. In effect, the firm does not get any of the returns from training because the worker is receiving all of them. In other words, the worker is the full residual claimant of the increase in his own productivity, and in the competitive equilibrium of this economy without any credit market or contractual frictions, he would have the right incentives to invest in his training.
Let us analyze this equilibrium now. As is the case in all games of this sort, we are interested in the subgame perfect equilibria. So we have to solve the game by backward induction. First note that at $t = 1$, the worker will be paid $w_1 = y_1 + \alpha(\tau)$. Next recall that $\tau^*$ is the efficient level of training given by $c'(\tau^*) = \alpha'(\tau^*)$. Then in the unique subgame perfect equilibrium, in the first period the firm will offer the following package: training of $\tau^*$ and a wage of
$$w_0 = y_0 - c(\tau^*).$$
Then, in the second period the worker will receive the wage of
$$w_1 = y_1 + \alpha(\tau^*)$$
either from the current firm or from another firm.
To see why no other allocation could be an equilibrium, suppose that the firm offered $(\tau, w_0)$, such that $\tau \ne \tau^*$. For the firm to break even we need that $w_0 \le y_0 - c(\tau)$, but by the definition of $\tau^*$, we have
$$y_0 - c(\tau^*) + y_1 + \alpha(\tau^*) > y_0 - c(\tau) + y_1 + \alpha(\tau) \ge w_0 + y_1 + \alpha(\tau).$$
So the deviation of offering $(\tau^*, y_0 - c(\tau^*) - \varepsilon)$ for $\varepsilon$ sufficiently small would attract the worker and make positive profits. Thus, the unique equilibrium is the one in which the firm offers training $\tau^*$.
Therefore, in this economy the efficient level of training will be achieved with firms bearing none of the cost of training, and workers financing training by taking a wage cut in the first period of employment (i.e., a wage $w_0 < y_0$).
There is a range of examples for which this model appears to provide a good description. These include some of the historical apprenticeship programs where young individuals worked for very low wages and then "graduated" to become master craftsmen; pilots who work for the Navy or the Air Force for low wages, and then obtain much higher wages working for private sector airlines; securities brokers, often highly qualified individuals with MBA degrees, working at a pay level close to the minimum wage until they receive their professional certification; or even academics taking an assistant professor job at Harvard despite the higher salaries in other departments.
3. Market Failures Due to Contractual Problems
The above result was achieved because firms could commit to a wage-training
contract. In other words, the firm could make a credible commitment to providing
training in the amount of $\tau^*$. Such commitments are in general difficult, since
outsiders cannot observe the exact nature of the “training activities” taking place
inside the firm. For example, the firm could hire workers at a low wage pretending
to offer them training, and then employ them as cheap labor. This implies that
contracts between firms and workers concerning training investments are naturally
incomplete.
To capture these issues let us make the timing of events regarding the provision
of training somewhat more explicit.
• At time $t = -1/2$, the firm makes a training-wage contract offer $(\tau_0, w_0)$.
  Workers accept offers from firms.
• At time $t = 0$, there is an initial production of $y_0$, the firm pays $w_0$, and
  also unilaterally decides the level of training $\tau$, which could be different
  from the promised level of training $\tau_0$.
• At time $t = 1/2$, wage offers are made, and the worker decides whether to
  quit and work for another firm.
• At time $t = 1$, there is the second and final period of production, where
  output is equal to $y_1 + \alpha(\tau)$.
Now the subgame perfect equilibrium can be characterized as follows: at time
$t = 1$, a worker of training $\tau$ will receive $w_1 = y_1 + \alpha(\tau)$. Realizing this, at time
$t = 0$, the firm would offer training $\tau = 0$, irrespective of its contract promise.
Anticipating this, the worker will only accept a contract offer of the form
$(\tau_0, w_0)$ such that $w_0 \ge y_0$, and $\tau_0$ does not matter, since the worker knows that
the firm is not committed to this promise. As a result, we are back to the outcome
conjectured by Pigou, with no training investment by the firm.
A similar conclusion would also be reached if the firm could write a binding
contract about training, but the worker were subject to credit constraints and
$c(\tau^*) > y_0$, so the worker cannot take enough of a wage cut to finance his training.
In the extreme case where $y_0 = 0$, we are again back to the Pigou outcome, where
there is no training investment, despite the fact that it is socially optimal to invest
in skills (which one of these problems, contractual incompleteness or credit market
constraints, appears more important in the context of training?).
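The no-commitment outcome can be checked with a small backward-induction sketch. Once the contract is signed, the firm picks $\tau$ knowing that the second-period wage will equal $y_1 + \alpha(\tau)$, so training only adds cost. The functional forms and numbers below are illustrative assumptions, not taken from the text.

```python
import math

# Illustrative functional forms (assumptions, not from the text).
def alpha(tau):
    return 2.0 * math.sqrt(tau)

def cost(tau):
    return 0.5 * tau ** 2

def firm_payoff(tau, w0, y0=1.0, y1=1.0):
    """Firm's total payoff from choosing tau at t = 0 after paying w0."""
    w1 = y1 + alpha(tau)                    # competitive wage at t = 1
    first_period = y0 - w0 - cost(tau)      # firm bears the training cost
    second_period = y1 + alpha(tau) - w1    # zero: the worker captures the return
    return first_period + second_period

# Search over a grid of training levels for the firm's best choice.
grid = [i / 100 for i in range(0, 201)]
best_tau = max(grid, key=lambda t: firm_payoff(t, w0=0.5))
print(best_tau)
```

Whatever first-period wage was agreed, the firm's payoff falls with $\tau$, so the grid search selects zero training, in line with the argument above.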
4. Training in Imperfect Labor Markets
4.1. Motivation. The general conclusion of both the Becker model with perfect
(credit and labor) markets and the model with incomplete contracts (or severe credit
constraints) is that there will be no firm-sponsored investment in general training.
This conclusion follows from the common assumption of these two models, that the
labor market is competitive, so the firm will never be able to recoup its training
expenditures in general skills later during the employment relationship.
Is this a reasonable prediction? The answer appears to be no. There are many
instances in which firms bear a significant fraction (sometimes all) of the costs of
general training investments.
The first piece of evidence comes from the German apprenticeship system.
Apprenticeship training in Germany is largely general. Firms training apprentices have
to follow a prescribed curriculum, and apprentices take a rigorous outside exam in
their trade at the end of the apprenticeship. The industry or crafts chambers certify
whether firms fulfill the requirements to train apprentices adequately, while works
councils in the firms monitor the training and resolve grievances. At least in certain
technical and business occupations, the training curricula limit the firms’ choices
over the training content fairly severely. Estimates of the net cost of apprenticeship
programs to employers in Germany indicate that firms bear a significant financial
burden associated with these training investments. The net costs of apprenticeship
training may be as high as DM 6,000 per worker (in the 1990s, equivalent of about
$6,000 today).
Another interesting example comes from a recent growth sector of the US,
the temporary help industry. Temporary help firms provide workers to various
employers on short-term contracts, and receive a fraction of the workers’ wages as
commission. Although blue-collar and professional temporary workers are becoming
increasingly common, the majority of temporary workers are in clerical and secretarial
jobs. These occupations require some basic computer, typing and other clerical
skills, which temporary help firms often provide before the worker is assigned to an
employer. Workers are under no contractual obligation to the temporary help firm
after this training program. Most large temporary help firms offer such training
to all willing individuals. As training prepares the workers for a range of different
assignments, it is almost completely general. Although workers taking part in the
training programs do not get paid, all the monetary costs of training are borne
by the temporary help firms, giving us a clear example of firm-sponsored general
training. This was first noted by Krueger and is discussed in more detail by David
Autor.
Other evidence is not as clear-cut, but suggests that firm-sponsored investments
in general skills are widespread. A number of studies have investigated whether
workers who take part in general training programs pay for the costs by taking
lower wages. The majority of these studies do not find lower wages for workers in
training programs, and even when wages are lower, the amounts typically appear
too small to compensate firms for the costs. Although this pattern can be explained
within the paradigm of Becker’s theory by arguing that workers selected for training
were more skilled in unobserved dimensions, it is broadly supportive of widespread
firm-sponsored training.
There are also many examples of firms that send their employees to college, MBA
or literacy programs, and problem solving courses, and pay for the expenses while the
wages of workers who take up these benefits are not reduced. In addition, many large
companies, such as consulting firms, offer training programs to college graduates
involving general skills. These employers typically pay substantial salaries and
bear the full monetary costs of training, even during periods of full-time classroom
training.
How do we make sense of these firm-sponsored investments in general training?
We will now illustrate how in frictional labor markets, firms may also be willing to
make investments in the general skills of their employees.
4.2. A Basic Framework. Consider the following two-period model. In period
1, the worker and/or the employer choose how much to invest in the worker’s general
human capital, $\tau$. There is no production in the first period. In period 2, the worker
either stays with the firm, produces output $y = f(\tau)$, where $f(\tau)$ is a strictly
increasing and concave function, and is paid a wage rate $w(\tau)$ as a
function of his skill level (training) $\tau$, or he quits and obtains an outside wage.
The cost of acquiring $\tau$ units of skill is again $c(\tau)$, which is again assumed to be
continuous, differentiable, strictly increasing and convex, and to satisfy $c'(0) = 0$.
There is no discounting, and all agents are risk-neutral.
Assume that all training is technologically general in the sense that $f(\tau)$ is the
same in all firms.
If a worker leaves his original firm, then he will earn $v(\tau)$ in the outside labor
market. Suppose
$$v(\tau) < f(\tau).$$
That is, despite the fact that $\tau$ is general human capital, when the worker separates
from the firm, he will get a lower wage than his marginal product in the current
firm. The fact that $v(\tau) < f(\tau)$ implies that there is a surplus that the firm and
the worker can share when they are together. Also note that $v(\tau) < f(\tau)$ is only
possible in labor markets with frictions; otherwise, the worker would be paid his
full marginal product, and $v(\tau) = f(\tau)$.
Let us suppose that this surplus will be divided by asymmetric Nash bargaining
with worker bargaining power given by $\beta \in (0, 1)$. Recall from above that asymmetric
Nash bargaining and risk neutral preferences imply that the wage rate as a
function of training is
(8.1) $$w(\tau) = v(\tau) + \beta\,[f(\tau) - v(\tau)].$$
An important point to note is that the equilibrium wage rate $w(\tau)$ is independent
of $c(\tau)$: the level of training is chosen first, and then the worker and the firm bargain
over the wage rate. At this point the training costs are already sunk, so they do not
feature in the bargaining calculations (bygones are bygones).
Assume that $\tau$ is determined by the investments of the firm and the worker, who
independently choose their contributions, $c_w$ and $c_f$, and $\tau$ is given by
$$c(\tau) = c_w + c_f.$$
Assume that one dollar of investment by the worker costs him $p$ dollars, where $p \ge 1$.
When $p = 1$, the worker has access to perfect credit markets, and when $p \to \infty$, the
worker is severely constrained and cannot invest at all.
More explicitly, the timing of events is:
• The worker and the firm simultaneously decide their contributions to training
  expenses, $c_w$ and $c_f$. The worker receives an amount of training $\tau$ such
  that $c(\tau) = c_w + c_f$.
• The firm and the worker bargain over the wage for the second period, $w(\tau)$,
  where the threat point of the worker is the outside wage, $v(\tau)$, and the
  threat point of the firm is not to produce.
• Production takes place.
Given this setup, the contributions to training expenses $c_w$ and $c_f$ will be
determined noncooperatively. More specifically, the firm chooses $c_f$ to maximize profits:
$$\pi(\tau) = f(\tau) - w(\tau) - c_f = (1 - \beta)\,[f(\tau) - v(\tau)] - c_f$$
subject to $c(\tau) = c_w + c_f$. The worker chooses $c_w$ to maximize utility:
$$u(\tau) = w(\tau) - p\,c_w = \beta f(\tau) + (1 - \beta)\,v(\tau) - p\,c_w$$
subject to the same constraint.
The first-order conditions are:
(8.2) $$(1 - \beta)\,[f'(\tau) - v'(\tau)] - c'(\tau) = 0 \quad \text{if } c_f > 0$$
(8.3) $$v'(\tau) + \beta\,[f'(\tau) - v'(\tau)] - p\,c'(\tau) = 0 \quad \text{if } c_w > 0$$
Inspection of these equations implies that generically, one of them will hold as a
strict inequality; therefore, one of the parties will bear the full cost of training.
The result of no firm-sponsored investment in general training obtains
when $f(\tau) = v(\tau)$, which is the case of perfectly competitive labor markets. (8.2)
then implies that $c_f = 0$, so when workers receive their full marginal product in the
outside labor market, the firm will never pay for training. Moreover, as $p \to \infty$, so
that the worker is severely credit constrained, there will be no investment in training.
In all cases, the firm is not constrained, so one dollar of spending on training costs
one dollar for the firm.
In contrast, suppose there are labor market imperfections, so that the outside
wage is less than the productivity of the worker, that is, $v(\tau) < f(\tau)$. Is this
gap between marginal product and market wage enough to ensure firm-sponsored
investments in training? The answer is no. To see this, first consider the case
with no wage compression, that is, the case in which a marginal increase in skills
is valued appropriately in the outside market. Mathematically this corresponds to
$v'(\tau) = f'(\tau)$ for all $\tau$. Substituting for this in the first-order condition of the firm,
(8.2), we immediately find that if $c_f > 0$, then $c'(\tau) = 0$, which, given that $c$ is strictly
convex with $c'(0) = 0$, is only possible at $\tau = 0$. So in other words, there
will be no firm contribution to training expenditures.
Next consider the case in which there is wage compression, i.e., $v'(\tau) < f'(\tau)$.
Now it is clear that the firm may be willing to invest in the general training of the
worker. The simplest way to see this is again to consider the case of severe credit
constraints on the worker, that is, $p \to \infty$, so that the worker cannot invest in
training. Then, $v'(0) < f'(0)$ is sufficient to induce the firm to invest in training.
This shows the importance of wage compression for firm-sponsored training.
The intuition is simple: wage compression in the outside market translates into
wage compression inside the firm, i.e., it implies $w'(\tau) < f'(\tau)$. As a result, the firm
makes greater profits from a more skilled (trained) worker, and has an incentive to
increase the skills of the worker.
To clarify this point further, the figure draws the productivity, $f(\tau)$, and wage,
$w(\tau)$, of the worker. The gap between these two curves is the second-period profit
of the firm. When $f'(\tau) = w'(\tau)$, this profit is independent of the skill level of the
worker, and the firm has no interest in increasing the worker’s skill. A competitive
labor market, $f(\tau) = v(\tau)$, implies this case. In contrast, if $f'(\tau) > w'(\tau)$, which
is a direct implication of $f'(\tau) > v'(\tau)$ given Nash bargaining, the firm makes
more profits from more skilled workers, and is willing to invest in the general skills
of its employees.
Let $\tau_w$ be the level of training that satisfies (8.3) as equality, and $\tau_f$ be the
solution to (8.2). Then, it is clear that if $\tau_w > \tau_f$, the worker will bear all the cost
of training. And if $\tau_f > \tau_w$, then the firm will bear all the cost of training (despite
the fact that the worker may have access to perfect capital markets, i.e., $p = 1$).
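The comparison between $\tau_w$ and $\tau_f$ can be illustrated with a short sketch. It assumes $f(\tau) = 2\sqrt{\tau}$, $c(\tau) = \tau^2/2$ and an outside wage $v(\tau) = a f(\tau) - b$ with $0 \le a \le 1$; these forms are hypothetical choices that give closed-form solutions to (8.2) and (8.3), not the text's specification.

```python
# Assumed forms (not from the text): f(tau) = 2*sqrt(tau), c(tau) = tau**2 / 2,
# v(tau) = a*f(tau) - b, so that f'(tau) = tau**-0.5, v'(tau) = a*tau**-0.5, c'(tau) = tau.

def tau_firm(a, beta):
    """Solution to (8.2): (1 - beta) * (1 - a) * tau**-0.5 = tau."""
    return ((1.0 - beta) * (1.0 - a)) ** (2.0 / 3.0)

def tau_worker(a, beta, p):
    """Solution to (8.3): (a + beta * (1 - a)) * tau**-0.5 = p * tau."""
    return ((a + beta * (1.0 - a)) / p) ** (2.0 / 3.0)

def who_pays(a, beta, p):
    """The party whose desired training level is larger bears the full cost."""
    tf, tw = tau_firm(a, beta), tau_worker(a, beta, p)
    return ("firm", tf) if tf > tw else ("worker", tw)

print(who_pays(a=1.0, beta=0.5, p=1.0))   # no wage compression
print(who_pays(a=0.2, beta=0.5, p=4.0))   # compressed outside wages, costly credit
```

With $a = 1$ the outside wage rises one-for-one with productivity and the firm's desired training level is zero; with a small $a$ and a credit-constrained worker, the firm is the party that invests.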
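[Figure 8.1: productivity $f(\tau)$ and wage $w(\tau)$ plotted against training $\tau$. With $w(\tau) = f(\tau) - \Delta$ (constant gap): no firm-sponsored training. With $w(\tau) = f(\tau) - \Delta(\tau)$ (gap varying with $\tau$): firm-sponsored training.]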
To derive the implications of changes in the skill premium on training, let $v(\tau) =
a f(\tau) - b$. A decrease in $a$ is equivalent to a decrease in the price of skill in the outside
market, and would also tilt the wage function inside the firm, $w(\tau)$, decreasing the
relative wages of more skilled workers because of bargaining between the firm and
the worker, with the outside wage $v(\tau)$ as the threat point of the worker. Starting
from $a = 1$ and $p < \infty$, a point at which the worker makes all investments, a decrease
in $a$ leads to less investment in training from (8.3). This is simply an application of
the Becker reasoning; without any wage compression, the worker is the one receiving
all the benefits and bearing all the costs, and a decline in the returns to training
will reduce his investments.
As $a$ declines further, we will eventually reach the point where $\tau_w = \tau_f$. Now
the firm starts paying for training, and a further decrease in $a$ increases investment
in general training (from (8.2)). Therefore, there is a U-shaped relation between the
skill premium and training: starting from a compressed wage structure, a further
decrease in the skill premium may increase training. Holding $f(\tau)$ constant, a tilting
up of the wage schedule, $w(\tau)$, reduces the profits from more skilled workers, and
the firm has less interest in investing in skills.
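The U-shaped relation can be traced numerically. The sketch below assumes $f(\tau) = 2\sqrt{\tau}$, $c(\tau) = \tau^2/2$, $v(\tau) = a f(\tau) - b$, $\beta = 0.5$ and $p = 4$ (all hypothetical choices), and takes equilibrium training to be the larger of the levels solving (8.2) and (8.3).

```python
BETA, P = 0.5, 4.0   # hypothetical bargaining power and cost of worker funds

def training(a, beta=BETA, p=P):
    """Equilibrium training under the assumed forms: max of the two desired levels."""
    tau_f = ((1.0 - beta) * (1.0 - a)) ** (2.0 / 3.0)      # from (8.2)
    tau_w = ((a + beta * (1.0 - a)) / p) ** (2.0 / 3.0)    # from (8.3)
    return max(tau_f, tau_w)

# Scan the outside price of skill a from 0 to 1.
grid = [i / 20 for i in range(0, 21)]
a_min = min(grid, key=training)
for a in grid:
    print(f"a = {a:.2f}  training = {training(a):.3f}")
print("training is lowest at a =", a_min)
```

In this parameterization training falls as $a$ declines from 1 while the worker is the investor, bottoms out where the two desired levels coincide, and rises again once the firm takes over the investment.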
Changes in labor market institutions, such as minimum wages and unionization,
will therefore affect the amount of training in this economy. To see the impact
of a minimum wage, consider the next figure, and start with a situation where
$v(\tau) = f(\tau)$ and $p \to \infty$ so that the worker cannot invest in training, and there
will be no training. Now impose a minimum wage as drawn in the figure. This
distorts the wage structure and encourages the firm to invest in skills up to $\tau^*$, as
long as $c(\tau^*)$ is not too high. This is because the firm makes higher profits from
workers with skills $\tau^*$ than workers with skills $\tau = 0$.
This is an interesting comparative static result, since the standard Becker model
with competitive labor markets implies that minimum wages should always reduce
training. The reason for this is straightforward. Workers take wage cuts to finance
their general skills training, and minimum wages will prevent these wage cuts, thus
reducing training. We will discuss this issue further below.
5. General Equilibrium with Imperfect Labor Markets
The above analysis showed how in imperfect labor markets firms will find it
profitable to invest in the general skills of their employees as long as the equilibrium wage
structure is compressed. The equilibrium wage structure will be compressed, in turn,
when the outside wage structure, $v(\tau)$, is compressed, that is, when $v'(\tau) < f'(\tau)$.
The analysis was partial equilibrium in that this outside wage structure was taken as
given. There are many reasons why in frictional labor markets we may expect this
outside wage structure to be compressed. These include adverse selection, bargaining,
and efficiency wages, as well as complementarity between general and specific
skills. Here we will discuss how adverse selection leads to wage compression.
5.1. The Basic Model of Adverse Selection and Training. This is a simplified
version of the model in Acemoglu and Pischke (1998). Suppose that fraction
$p$ of workers are high ability, and have productivity $\alpha(\tau)$ in the second period if
they receive training $\tau$ in the first period. The remaining $1 - p$ are low ability and
produce nothing (in terms of the above model, we are setting $y = 0$).
No one knows the worker’s ability in the first period, but in the second period, the
current employer learns this ability. Firms never observe the ability of the workers
they have not employed, so outsiders will have to form beliefs about the worker’s
ability.
The exact timing of events is as follows:
• Firms make wage offers to workers. At this point, worker ability is unknown.
• Firms make training decisions, $\tau$.
• Worker ability is revealed to the current employer and to the worker.
• Employers make second period wage offers to workers.
• Workers decide whether to quit.
• Outside firms compete for workers in the “secondhand” labor market. At
  this point, these firms observe neither worker ability nor whether the worker
  has quit or was laid off.
• Production takes place.
Since outside firms do not know worker ability when they make their bids, this is
a (dynamic) game of incomplete information. So we will look for a Perfect Bayesian
Equilibrium of this game, which is defined in the standard manner. We will
characterize equilibria using backward induction conditional on beliefs at a given
information set.
First, note that all workers will leave their current employer if outside wages
are higher. In addition, a fraction $\lambda$ of workers, irrespective of ability, realize that
they form a bad match with the current employer, and leave whatever the wage is.
The important assumption here is that firms in the outside market observe neither
worker ability nor whether a worker has quit or has been laid off. However, worker
training is publicly observed (what would happen to the model if training was not
observed by outside employers?).
These assumptions ensure that in the second period each worker obtains his
expected productivity conditional on his training. That is, his wage will be
independent of his own productivity, but will depend on the average productivity of the
workers who are in the secondhand labor market.
By Bayes’s rule, the expected productivity of a worker of training $\tau$ is
(8.4) $$v(\tau) = \frac{\lambda p\,\alpha(\tau)}{\lambda p + (1 - p)}$$
To see why this expression applies, note that all low ability workers will leave their
initial employer, who will at most pay a wage of 0 (since this is the productivity
of a low ability worker), and since, as we will see, outside wages are positive, low ability
workers will quit (therefore, the offer of a wage of 0 is equivalent to a layoff; can
there exist an equilibrium in which workers receive zero wage and stay at their job?).
Those workers make up a fraction $1 - p$ of the total workforce. In addition, of the
high ability workers who make up a fraction $p$ of the total workforce, a fraction $\lambda$
of them will also leave. Therefore, the total size of the secondhand labor market
is $\lambda p + (1 - p)$, which is the denominator of (8.4). Of those, the low ability ones
produce nothing, whereas the $\lambda p$ high ability workers produce $\alpha(\tau)$, which explains
this expression.
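A small sketch of the Bayes-rule calculation in (8.4) follows. The parameter values are hypothetical, and the function name is introduced here only for illustration.

```python
def outside_wage(alpha_tau, p, lam):
    """Expected productivity in the secondhand market, as in (8.4).

    alpha_tau: productivity of a trained high ability worker, alpha(tau)
    p:         population share of high ability workers
    lam:       share of workers who separate for exogenous (bad match) reasons
    """
    high_leavers = lam * p      # high ability workers who leave
    low_leavers = 1.0 - p       # all low ability workers leave
    return high_leavers * alpha_tau / (high_leavers + low_leavers)

# Hypothetical numbers: half the workforce is high ability, 20% bad matches.
v = outside_wage(alpha_tau=10.0, p=0.5, lam=0.2)
print(v)   # well below both alpha(tau) = 10 and the population average p * alpha(tau) = 5
```

Because low ability workers are overrepresented among leavers, the secondhand wage is lower than the average productivity of the workforce as a whole.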
Anticipating this outside wage, the initial employer has to pay each high ability
worker $v(\tau)$ to keep him. This observation, combined with (8.4), immediately
implies that there is wage compression in this world, in the sense that
$$v'(\tau) = \frac{\lambda p\,\alpha'(\tau)}{\lambda p + (1 - p)} < p\,\alpha'(\tau),$$
so the adverse selection problem introduces wage compression, and via this channel,
will lead to firm-sponsored training.
To analyze this issue more carefully, consider the previous stage of the game.
Now firm profits as a function of the training choice can be written as
$$\pi(\tau) = (1 - \lambda)\,p\,[\alpha(\tau) - v(\tau)] - c(\tau).$$
The first-order condition for the firm is
(8.5) $$\pi'(\tau) = (1 - \lambda)\,p\,[\alpha'(\tau) - v'(\tau)] - c'(\tau) = \frac{(1 - \lambda)\,p\,(1 - p)\,\alpha'(\tau)}{\lambda p + (1 - p)} - c'(\tau) = 0$$
There are a number of noteworthy features:
(1) $c'(0) = 0$ is sufficient to ensure that there is firm-sponsored training (that
is, the solution to (8.5) is interior).
(2) There is underinvestment in training relative to the first-best, which would
have involved $p\,\alpha'(\tau) = c'(\tau)$ (notice that the first-best already takes into
account that only a fraction $p$ of the workers will benefit from training).
This is because of two reasons: first, a fraction $\lambda$ of the high ability workers
quit, and the firm does not get any profits from them. Second, even for the
workers who stay, the firm is forced to pay them a higher wage, because they
have an outside option that improves with their training, i.e., $v'(\tau) > 0$.
This reduces profits from training, since the firm has to pay higher wages
to keep the trained workers.
(3) The firm has monopsony power over the workers, enabling it to recover the
costs of training. In particular, high ability workers who produce $\alpha(\tau)$ are
paid $v(\tau) < \alpha(\tau)$.
(4) Monopsony power is not enough by itself. Wage compression is also essential
for this result. To see this, suppose that we impose that there is no wage
compression, i.e., $v'(\tau) = \alpha'(\tau)$; then inspection of the first line of (8.5)
immediately implies that there will be zero training, $\tau = 0$.
(5) But wage compression is also not automatic; it is a consequence of some of
the assumptions in the model. Let us modify the model so that high ability
workers produce $\eta + \alpha(\tau)$ in the second period, while low ability workers
produce $\alpha(\tau)$. This modification implies that training and ability are no
longer complements. Both types of workers get exactly the same marginal
increase in productivity (this contrasts with the previous specification where
only high ability workers benefited from training, hence training and ability
were highly complementary). Then, it is straightforward to check that we
will have
$$v(\tau) = \frac{\lambda p\,\eta}{\lambda p + (1 - p)} + \alpha(\tau),$$
and hence $v'(\tau) = \alpha'(\tau)$. Thus there is no wage compression, and no firm-sponsored
training. Intuitively, the complementarity between ability and training
induces wage compression, because the training of high ability workers who
are contemplating leaving their firm is judged by the market as the training
of a relatively low ability worker (since low ability workers are overrepresented
in the secondhand labor market). Therefore, the marginal increase
in a (high ability) worker’s productivity due to training is valued less in the
outside market, which views this worker, on average, as low ability. Hence
the firm does not have to pay as much for the marginal increase in the
productivity of a high ability worker, and makes greater profits from more
trained high-ability workers.
(6) What happens if
$$\pi(\tau) = (1 - \lambda)\,p\,[\alpha(\tau) - v(\tau)] - c(\tau) > 0,$$
that is, if firms are making positive profits (at the equilibrium level of
training)? If there is free entry at time $t = 0$, this implies that firms will
compete for workers, since hiring a worker now guarantees positive profits
in later periods. As a result, firms will have to pay a positive wage at time
$t = 0$, precisely equal to
$$W = \pi(\tau)$$
as a result of this competition. This is because once a worker accepts a job
with a firm, the firm acquires monopsony power over this worker’s labor
services at time $t = 1$ to make positive profits. Competition then implies
that these profits have to be transferred to the worker at time $t = 0$. The
interesting result is that not only do firms pay for training, but they may
also pay workers extra in order to attract them.
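Features (1) and (2) can be checked numerically. The sketch below solves the firm's first-order condition (8.5) and the first-best condition $p\,\alpha'(\tau) = c'(\tau)$ by bisection, assuming $\alpha(\tau) = 2\sqrt{\tau}$ and $c(\tau) = \tau^2/2$; these forms and the parameter values are illustrative assumptions only.

```python
import math

def bisect_root(g, lo, hi, tol=1e-12):
    """Root of a decreasing function g on [lo, hi], found by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumed forms (not from the text): alpha(tau) = 2*sqrt(tau), c(tau) = tau**2 / 2.
def alpha_prime(tau):
    return 1.0 / math.sqrt(tau)

def cost_prime(tau):
    return tau

def equilibrium_training(p, lam):
    """Solve (8.5): (1-lam)*p*(1-p)/(lam*p + 1-p) * alpha'(tau) = c'(tau)."""
    k = (1.0 - lam) * p * (1.0 - p) / (lam * p + (1.0 - p))
    return bisect_root(lambda t: k * alpha_prime(t) - cost_prime(t), 1e-9, 10.0)

def first_best_training(p):
    """Solve p * alpha'(tau) = c'(tau)."""
    return bisect_root(lambda t: p * alpha_prime(t) - cost_prime(t), 1e-9, 10.0)

tau_eq = equilibrium_training(p=0.5, lam=0.2)
tau_fb = first_best_training(p=0.5)
print(tau_eq, tau_fb)
```

In this example the firm invests a strictly positive amount, but less than the first-best level, consistent with the underinvestment result.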
5.2. Evidence. How can this model be tested? One way is to look for evidence
of this type of adverse selection among highly trained workers. The fact that
employers know more about their current employees may be a particularly good
assumption for young workers, so a good area of application would be for apprentices
in Germany.
According to the model, workers who quit or are laid off should get lower wages
than those who stay in their jobs, which is a prediction that follows simply from
adverse selection (and which Gibbons and Katz tested in the U.S. labor market for all
workers by comparing laid-off workers to those who lost their jobs as a result of plant
closings). The more interesting implication here is that if the worker is separated
from his firm for an exogenous reason that is clearly observable to the market, he
should not be punished by the secondhand labor market. In fact, he is “freed” from
the monopsony power of the firm, and he may get even higher wages than stayers
(who are on average of higher ability, though subject to the monopsony power of
their employer).
To see this, note that a worker who is exogenously separated from his firm will
get the wage of $p\,\alpha(\tau)$, whereas stayers, who are still subject to the monopsony power
of their employer, obtain the wage of $v(\tau)$ as given by (8.4), which could be less
than $p\,\alpha(\tau)$. In the German context, workers who leave their apprenticeship firm
to serve in the military provide a potential group of such exogenous separators.
Interestingly, the evidence suggests that although these military quitters are on
average lower ability than those who stay in the apprenticeship firm, the military
quitters receive higher wages.
5.3. Mobility, training and wages. The interaction between training and
adverse selection in the labor market also provides a different perspective in thinking
about mobility patterns. To see this, change the above model so that $\lambda = 0$, but
workers now quit if
$$w(\tau) - \theta < v(\tau),$$
where $\theta$ is a worker-specific draw from a uniform distribution over $[0, 1]$. $\theta$, which
can be interpreted as the disutility of work in the current job, is the worker’s private
information. This implies that the fraction of high ability workers who quit their
initial employer will be
$$1 - w(\tau) + v(\tau),$$
so the outside wage is now
(8.6) $$v(\tau) = \frac{p\,[1 - w(\tau) + v(\tau)]\,\alpha(\tau)}{p\,[1 - w(\tau) + v(\tau)] + (1 - p)}$$
Note that if $v(\tau)$ is high, many workers leave their employer because outside wages in
the secondhand market are high. But also the right hand side of (8.6) is increasing
in the fraction of quitters, $[1 - w(\tau) + v(\tau)]$, so $v(\tau)$ will increase further. This
reflects the fact that with a higher quit rate, the secondhand market is not as
adversely selected (it has a better composition).
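The feedback from quits to outside wages can be seen by evaluating the right hand side of (8.6) at different quit rates. The sketch below treats the quit fraction of high ability workers as a given number $q$ rather than solving for the full equilibrium, and the parameter values are hypothetical.

```python
def secondhand_wage(q, alpha_tau, p):
    """Right hand side of (8.6) for a given quit fraction q of high ability workers.

    All low ability workers (share 1 - p) are in the secondhand market;
    high ability workers contribute a share p * q.
    """
    return p * q * alpha_tau / (p * q + (1.0 - p))

# Hypothetical values: p = 0.5, alpha(tau) = 10.
for q in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"quit fraction {q:.1f} -> outside wage {secondhand_wage(q, 10.0, 0.5):.3f}")
```

A higher quit rate improves the composition of the secondhand pool and so raises the outside wage, which is the complementarity behind the multiplicity of equilibria discussed next.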
This implies that there can be multiple equilibria in this economy: one equilibrium
with a high quit rate, high wages for workers changing jobs, i.e., high $v(\tau)$, but
low training; and another equilibrium with low mobility, low wages for job changers, and
high training. This seems to give a stylized description of the differences between
the U.S. and German labor markets. In Germany, the turnover rate is much lower
than in the U.S., and also there is much more training. Also, in Germany workers
who change jobs are much more severely penalized (on average, in Germany such
workers experience a substantial wage loss, while they experience a wage gain in the
U.S.).
Which equilibrium is better? There is no unambiguous answer to this question.
While the low-turnover equilibrium achieves higher training, it does worse in terms
of matching workers to jobs, in that workers often get stuck in jobs that they do
not like. In terms of the above model, we can see this by looking at the average
disutility of work that workers receive (i.e., the average $\theta$’s).
5.4. Adverse selection and training in the temporary help industry.
An alternative place to look for evidence is the temporary help industry in the U.S.
Autor (2001) develops an extended version of this model, which also incorporates
self-selection by workers, for the temporary help industry. Autor modifies the above
model in four respects to apply it to the U.S. temporary help industry. These are:
(1) The model now lasts for three periods, and in the last period, all workers
receive their full marginal products. This is meant to proxy the fact that
at some point temporary-help workers may be hired into permanent jobs
where their remuneration may better reflect their productivity.
(2) Workers have different beliefs about the probability that they are high ability.
Some workers receive a signal which makes them believe that they are
high ability with probability $p$, while others believe that they are high ability
with probability $p' < p$. This assumption will allow self-selection among
workers between training and no-training firms.
(3) Worker ability is only learned via training. Firms that do not offer training
will not have superior information relative to the market. In addition, in
contrast to the baseline version of the above model, it is also assumed that
firms can offer different training levels and commit to them, so firms can
use training levels as a method of attracting workers.
(4) The degree of competitiveness in the market is modeled by assuming that
firms need to make a certain level of profits $\bar\pi$, and a higher $\bar\pi$ corresponds
to a less competitive market.
Autor looks for a “separating”/self-selection equilibrium in which $p'$ workers select
into no-training firms, whereas $p$ workers go to training firms. In this context, a
self-selection equilibrium is one in which workers with different abilities (different
beliefs) choose to accept jobs in different firms, because ability is rewarded differentially
in different firms. This makes sense since training and ability are complements
as before. Since firms that do not train their employees do not learn about their
employees’ ability, there is no adverse selection for workers who quit from no-training firms.
Therefore, the second-period wage of workers who quit from no-training firms will
be simply
$$v(0) = p'\,\alpha(0).$$
In contrast, the secondhand labor market wage of workers from training firms will
be given by $v(\tau)$ from (8.4) above.
In the third period, all workers will receive their expected full marginal product.
For workers who were employed by the non-training firms (and thus did not
receive training), this is $p'\,\alpha(0)$, whereas for workers with training, it is $p\,\alpha(\tau)$.
In the second period, all workers receive their outside option in the secondhand
market, so $v(0)$ for workers in no-training firms, and $v(\tau)$ for workers in training
firms.
The condition for a self-selection equilibrium is
$$p\,(\alpha(\tau) - \alpha(0)) > v(0) - v(\tau) > p'\,(\alpha(\tau) - \alpha(0)),$$
that is, the expected gain in third-period wages for high-belief workers should outweigh
the loss (if any) in terms of second-period wages (since there are no costs in the first
period by the assumption that there are no wages in the first period), while the opposite
holds for low-belief workers. Otherwise,
there could not be a separating equilibrium.
This immediately implies that if $v(0) - v(\tau) < 0$, that is, if workers with training
receive higher wages in the second period, then there cannot be a self-selection
equilibrium: all workers, irrespective of their beliefs, would like to take a job with
training firms. Therefore, the adverse selection problem needs to be strong enough
to ensure that $v(0) - v(\tau) > 0$. This is the first implication that Autor investigates
empirically using data about the wages of temporary help workers in firms that offer
free training compared to the wages of workers in firms that do not offer training.
He finds that this is generally the case.
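As a quick illustration, the self-selection condition can be checked numerically. The sketch below is only a toy: the function name and all numbers are hypothetical, chosen so that the first case separates and the second (where trained workers earn more in the second period) does not.

```python
def self_selection_holds(p_high, p_low, alpha_train, alpha_none, v_train, v_none):
    """True when p*(a(tau) - a(0)) > v(0) - v(tau) > p'*(a(tau) - a(0))."""
    gain = alpha_train - alpha_none   # third-period productivity gain from training
    loss = v_none - v_train           # second-period wage given up by choosing training
    return p_high * gain > loss > p_low * gain

# Hypothetical numbers, for illustration only.
separating = self_selection_holds(0.8, 0.3, 2.0, 1.0, 0.5, 1.0)
# If trained workers earn more in period two, every worker prefers training firms.
pooling = self_selection_holds(0.8, 0.3, 2.0, 1.0, 1.2, 1.0)
print(separating, pooling)
```

The second call fails the right-hand inequality, which is exactly the case $v(0) - v(\tau) < 0$ discussed above.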
The second implication concerns the impact of greater competition on training.
To see this more formally, simply return to the basic model, and look at the profits
of a typical training firm. These are
$$\pi(\tau) = \frac{(1-\lambda)\,p\,(1-p)\,\alpha(\tau)}{\lambda p + (1-p)} - c(\tau).$$
Therefore, if in equilibrium we must have $\pi(\tau) = \pi$ for some exogenous level of
profits $\pi$, and $\pi$ increases exogenously, the training level offered by training firms
must decrease. To see this, note that in equilibrium we could never have $\pi'(\tau) > 0$,
since then the firm can increase both its profits and attract more workers by simply
increasing training. Therefore, the equilibrium must feature $\pi'(\tau) \le 0$, and thus a
decline in $\pi$, that is, increasing competitiveness, will lead to higher training.
Autor investigates this empirically using differences in temporary help firms’
concentration across MSAs, and finds that in areas where there is greater concentration,
training is lower.
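To make the comparative static concrete, here is a minimal numerical sketch. The functional forms $\alpha(\tau) = 1 + \tau$ and $c(\tau) = \tau^2$, the parameter values, and the function names are all assumptions made for illustration; they are not part of the model above. With these forms, $\pi(\tau) = \pi$ is a quadratic, and the equilibrium is its larger root, which lies on the branch where $\pi'(\tau) \le 0$.

```python
import math

LAM, P = 0.5, 0.5                       # hypothetical lambda and p
K = (1 - LAM) * P * (1 - P) / (LAM * P + (1 - P))

def profit(tau):
    # pi(tau) under the assumed forms alpha(tau) = 1 + tau and c(tau) = tau**2.
    return K * (1 + tau) - tau ** 2

def equilibrium_training(pi_bar):
    """Training level with profit(tau) = pi_bar on the branch where profit is decreasing."""
    disc = K ** 2 + 4 * (K - pi_bar)
    return (K + math.sqrt(disc)) / 2    # larger root of tau**2 - K*tau - (K - pi_bar) = 0

for pi_bar in (0.05, 0.10, 0.15):
    print(pi_bar, round(equilibrium_training(pi_bar), 3))
```

A higher required profit level (a less competitive market) yields a lower training level, in line with the argument in the text.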
5.5. Labor market institutions and training. The theory developed here
also implies that changes in labor market institutions, such as minimum wages and
unionization, will affect the amount of training in this economy. To see
the impact of a binding minimum wage on training, let us return to the baseline
framework and consider the next figure, and start with a situation where
$v(\tau) = f(\tau) - \Delta$ and $p \to \infty$ so that the worker cannot invest in training, and there will
be no training. Now impose a minimum wage as drawn in the figure. This distorts
the wage structure and encourages the firm to invest in skills up to $\tau^*$, as long as
$c(\tau^*)$ is not too high. This is because the firm makes higher profits from workers
with skills $\tau^*$ than from workers with skills $\tau = 0$.
This is an interesting comparative static result, since the standard Becker model
with competitive labor markets implies that minimum wages should always reduce
training. The reason for this is straightforward. Workers take wage cuts to finance
their general skills training, and minimum wages will prevent these wage cuts, thus
reducing training.
Therefore, an empirical investigation of the relationship between minimum wage
changes and worker training is a way of finding out whether the Becker channel
or the wage-compression channel is more important. Empirical evidence suggests
that higher minimum wages are typically associated with more training for low-skill
workers (though this relationship is not always statistically significant).
[Figure 8.2: $f(\tau)$ and $v(\tau) = f(\tau) - \Delta$ plotted against $\tau$, with a horizontal minimum wage line.]
Figure 8.3
CHAPTER 9
Firm-Specific Skills and Learning
The analysis so far has focused on general skills, acquired in school or by
investments in general training. Most labor economists also believe that there are
important firm-specific skills, acquired either thanks to firm-specific experience, or
by investment in firm-specific skills, or via “matching”. If such firm-specific skills
are important, we should observe worker productivity and wages increasing with
tenure, that is, a worker who has stayed longer in a given job should earn more
than a comparable worker (with the same schooling and experience) who has less
tenure.
1. The Evidence On Firm-Specific Rents and Interpretation
1.1. Some Evidence. The empirical investigation of the importance of
firm-specific skills and rents is a difficult and challenging area. There are two important
conceptual issues that arise in thinking about the relationship between wages and
tenure, as well as a host of econometric issues. The conceptual issues are as follows:
(1) We can imagine a world in which firm-specific skills are important, but
there may be no relationship between tenure and wages. This is because,
as we will see in more detail below, productivity increases due to
firm-specific skills do not necessarily translate into wage increases. The usual
reasoning for why high worker productivity translates into higher wages is
that otherwise, competitors would bid for the worker and steal him. This
argument does not apply when skills are firm-specific since such skills do not
contribute to the worker’s productivity in other firms. More generally, the
relationship between productivity and wages is more complex when
firm-specific skills are a significant component of productivity. For example, we
might have two different jobs, one with faster accumulation of firm-specific
skills, but wages may grow faster in the other job because the outside option
of the worker is improving faster.
(2) An empirical relationship between tenure and wages does not establish that
there are important firm-specific effects. To start with, wages may increase
with tenure because of backloaded compensation packages, which, as we
saw above, are useful for dealing with moral hazard problems. Such a
relationship might also result from the fact that there are some jobs with high
“rents,” and workers who get these jobs never quit, creating a positive
relationship between tenure and wages. Alternatively, a positive relationship
between tenure and wages may reflect the fact that high-ability workers
stay in their jobs longer (selection).
The existing evidence may therefore either overstate or understate the
importance of tenure and firm-specific skills, and there are no straightforward ways of
dealing with these problems. In addition, there are important econometric problems, for
example, the fact that in most data sets most tenure spells are uncompleted (most
workers are in the middle of their job tenure), complicating the analysis. A number
of researchers have used the usual strategies, as well as some creative strategies, to
deal with the selection and omitted variable biases pointed out in the second
problem. But this still requires us to set aside the first problem (i.e., we should be cautious in inferring
the tenure-productivity relationship from the observed tenure-wage relationship).
In any case, the empirical relationship between tenure and wages is of interest
in its own right, even if we cannot immediately deduce from this the relationship
between tenure and firm-specific productivity.
With all of these complications, the evidence nevertheless suggests that there is
a positive relationship between tenure and wages, consistent with the importance of
firm-specific skills. Here we will discuss two different types of evidence.
The first type of evidence is from regression analyses of the relationship between
wages and tenure exploiting within-job wage growth. Here the idea is that by looking
at how wages grow within a job (as long as the worker does not change jobs), and
comparing this to the experience premium, we will get an estimate of the tenure
premium. In other words, we can think of wages as given by the following model
$$\ln w_{it} = \beta_1 X_{it} + \beta_2 T_{it} + \varepsilon_{it} \tag{9.1}$$
where $X_{it}$ is the total labor market experience of individual $i$, and $T_{it}$ is his tenure in
the current job. Since experience and tenure both rise by one for each period the worker
stays on the job, his wage growth on this job is:
$$\Delta \ln w_{it} = \beta_1 + \beta_2 + \Delta\varepsilon_{it}$$
If we knew the experience premium, $\beta_1$, we could then immediately compute the
tenure premium $\beta_2$. The problem is that we do not know the experience premium.
Topel suggests that we can get an upper bound for the experience premium by
looking at the relationship between entry-level wages and labor market experience
(that is, wages in jobs with tenure equal to zero). This is an upper bound to the
extent that workers do not randomly change jobs, but only accept new jobs if these
offer a relatively high wage. Therefore, whenever $T_{it} = 0$, the disturbance term $\varepsilon_{it}$ in
(9.1) is likely to be positively selected. According to this reasoning, we can obtain
a lower bound estimate of $\beta_2$, $\hat{\beta}_2$, using a two-step procedure: first estimate the
rate of within-job wage growth, $\widehat{\beta_1 + \beta_2}$, and then subtract from this the estimate of the
experience premium obtained from entry-level jobs (can you see reasons why this
will lead to an upwardly biased estimate of the importance of tenure rather than a
lower bound on tenure effects as Topel claims?).
Using this procedure, Topel estimates relatively high rates of return to tenure.
For example, his main estimates imply that ten years of tenure increase wages by
about 25 percent, over and above the experience premium.
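The logic of the two-step procedure can be sketched on simulated data. This is only an illustration under strong assumptions: the parameter values are hypothetical, the disturbances are i.i.d., and there is no selection in who changes jobs, so the procedure recovers the tenure premium exactly in expectation. It is not Topel's actual estimator or data.

```python
import random

random.seed(0)
BETA1, BETA2 = 0.03, 0.02   # hypothetical true experience and tenure premia

def log_wage(experience, tenure):
    # Equation (9.1) with an i.i.d. disturbance (no selection in this toy data).
    return BETA1 * experience + BETA2 * tenure + random.gauss(0.0, 0.05)

# Step 1: average within-job wage growth estimates beta1 + beta2.
growth = []
for _ in range(20000):
    x, t = random.randint(0, 20), random.randint(0, 10)
    growth.append(log_wage(x + 1, t + 1) - log_wage(x, t))
within_job = sum(growth) / len(growth)

# Step 2: regress entry-level (tenure = 0) log wages on experience to get beta1.
xs = [random.randint(0, 20) for _ in range(20000)]
ys = [log_wage(x, 0) for x in xs]
x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
beta1_hat = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))

# Tenure premium: within-job growth minus the experience premium.
beta2_hat = within_job - beta1_hat
print(round(within_job, 3), round(beta1_hat, 3), round(beta2_hat, 3))
```

Introducing selection into the entry-level wages, or heterogeneous returns, would be the natural way to explore the biases discussed in the text.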
It is possible, however, that this procedure might generate tenure premium
estimates that are upward biased. For example, this would be the case if the return
to tenure or experience is higher among high-ability workers, and those are
underrepresented among the job-changers. Alternatively, returns to experience may be
non-constant, and they may be higher in jobs to which workers are a better match.
If this is the case, returns to experience for new jobs will understate the average returns
to experience for jobs in which workers choose to stay.
On the other hand, the advantage of this evidence is that it is unlikely to reflect
simply the presence of some jobs that offer high rents to workers, unless these jobs
that provide high rents also have (for some reason) higher wage growth (one
possibility might be that union jobs pay higher wages, and have higher wage growth,
and of course, workers do not leave union jobs, but this seems unlikely).
The second type of evidence comes from the wage changes of workers resulting
from job displacement. A number of papers, most notably Jacobson, LaLonde and
Sullivan, find that displaced workers experience a substantial drop in earnings. This
is shown in the next figure.
Part of this is due to non-employment following displacement, but even after
three years a typical displaced worker is earning about $1500 less (1987 dollars).
Econometrically, this evidence is simpler to interpret than the tenure-premium
estimates. Economically, the interpretation is somewhat more difficult than the tenure
estimates, since it may simply reflect the loss of high-rent (e.g. union) jobs.
In any case, these two pieces of evidence together are consistent with the view
that there are important firm-specific skills/expertises that are accumulated on the
job.
1.2. What Are Firm-Specific Skills? If we are going to interpret the above
evidence as reflecting the importance of firm-specific skills, then we have to be more
specific about what constitutes firm-specific skills. Here are four different views:
(1) Firm-specific skills can be thought to result mostly from firm-specific training
investments made by workers and firms. Here it is important to distinguish
between firms’ and workers’ investments, since they will have different
incentives.
Figure 9.1
(2) Firm-specific skills simply reflect what the worker learns on-the-job without
making any investments. In other words, they are simply unintentional
byproducts of working on the job. The reason why it is useful to distinguish
this particular view from the firm-specific investments view is that according
to this view, we do not need to worry about the incentives to acquire
firm-specific skills. However, most likely, even for simple skills that workers can
acquire on-the-job, they need to exert some effort, so this view may have
relatively little applicability.
(3) Firm-specific skills may reflect “matching” as in Jovanovic’s approach. Here,
there is no firm-specific skill, but some workers are better matches to some
firms. Ex ante, neither the firm nor the worker knows this, and the
information is revealed only slowly. Only workers who are revealed to be good
matches to a particular job will stay on that job, and as a result, they
will be more productive in this job than a randomly chosen worker. We
can think of this process of learning about the quality of the match as the
“accumulation of firm-specific skills”.
(4) There may be no technologically firm-specific skills. Instead, you may think
of all skills as technologically general, in the sense that if the worker is more
productive in a given firm, another firm that adopts exactly the same
technologies and organizational structure, and hires the same set of co-workers
will also be able to benefit from this high productivity. These technologically
general skills are transformed into de facto firm-specific skills because
of market imperfections. For example, if worker mobility is costly, or if it
is difficult or unprofitable for firms to copy some other firms’ technology
choices, these skills will be de facto specific to the firm that has first made
the technology/organizational choices. But if this is the case, we are back
to the model of general training investments under imperfect markets we
studied above. The reason why it is important to distinguish this view of
de facto firm-specific skills from the first view above is that now changes in
technology/market organization will affect which skills are specific and how
much of a given bundle of technologically-determined skills are “specific”.
2. Investment in Firm-Specific Skills
2.1. The basic problem. The problem with general training investments was
that part of the costs had to be borne by the firm, but, at least in competitive
labor markets, the worker was the residual claimant. The worker, in turn, was the
residual claimant because the skills were general, and other firms could compete for
this worker’s labor services. In contrast, with specific skills, the current employer is
the only (or at least the main) “consumer,” so there is no competition from other
firms to push up the worker’s wages. As a result, firm-specific skills will make the
firm the ex post monopsonist. This creates the converse problem. Now the worker
also bears some (perhaps most) of the costs of investment, but may not have the
right incentives to invest, since the firm will get most of the benefits.
To capture these problems, consider the following very simple model:
• At time $t = 0$, the worker decides how much to invest in firm-specific skills,
denoted by $s$, at the cost $\gamma(s)$. $\gamma(s)$ is strictly increasing and convex, with
$\gamma'(0) = 0$.
• At time $t = 1$, the firm makes a wage offer to the worker.
• The worker decides whether to accept this wage offer and work for this firm,
or take another job.
• Production takes place and wages are paid.
Let the productivity of the worker be $y_1 + f(s)$, where $y_1$ is also what he would
produce with another firm. Since $s$ is specific skills, it does not affect the worker’s
productivity in other firms.
First, note that the first-best level of firm-specific skills is given by
$$\gamma'(s^*) = f'(s^*).$$
Here $s^*$ is strictly positive since $\gamma'(0) = 0$ (provided that $f'(0) > 0$).
Let us next solve this game by backward induction again, starting in the last
period. The worker will accept any wage offer $w_1 \ge y_1$, since this is what he can get
in an outside firm. Knowing this, the firm simply offers $w_1 = y_1$. In the previous
period, realizing that his wage is independent of his specific skills, the worker makes
no investment in specific skills, even though the first-best level of firm-specific skills
$s^*$ is strictly positive.
What is the problem here? By investing in his firm-specific skills, the worker is
increasing the firm’s profits. Therefore, the firm would like to encourage the worker
to invest. However, given the timing of the game, wages are determined by a
take-it-or-leave-it offer by the firm after the investment. Therefore, it will always be in the
interest of the firm to offer a low wage to the worker after the investment; in other
words, the firm will hold the worker up. The worker anticipates this holdup problem
and does not invest in his firm-specific skills.
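The holdup logic can be illustrated with a small numerical sketch. The functional forms $\gamma(s) = s^2/2$ and $f(s) = s$, the value of $y_1$, and the grid search are assumptions made purely for illustration; the model above does not specify them.

```python
def gamma(s):          # hypothetical convex cost with gamma'(0) = 0
    return 0.5 * s ** 2

def f(s):              # hypothetical firm-specific productivity gain
    return s

Y1 = 1.0               # what the worker would produce at any other firm
GRID = [i / 100 for i in range(201)]   # candidate investment levels in [0, 2]

def worker_choice(wage_rule):
    """Investment level maximizing the worker's wage net of investment cost."""
    return max(GRID, key=lambda s: wage_rule(s) - gamma(s))

# First best: the investor receives the full product y1 + f(s).
s_first_best = worker_choice(lambda s: Y1 + f(s))
# Holdup: the firm offers w1 = y1 whatever the worker has invested.
s_holdup = worker_choice(lambda s: Y1)
print(s_first_best, s_holdup)
```

Because the wage under the take-it-or-leave-it offer does not vary with $s$, the worker's best response is zero investment, whereas the first-best level satisfies $\gamma'(s^*) = f'(s^*)$.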
Why is there not a contractual solution to this underinvestment problem? For
example, the firm could write a contract ex ante promising a certain payment to
the worker. Leaving aside the problems of enforcing such contracts (the firm could
always try to fire the worker, or threaten to fire him), there is a more fundamental
problem. If the employment contract does not make the wage of the worker
conditional on his firm-specific skills, it will not encourage investment. So the only
contracts that could help with the underinvestment problem are those that make
the worker’s wages contingent on his firm-specific skills. However, such skills are
very difficult to observe or verify by outside parties. This motivates the assumption
in this literature, as well as in the incomplete contracts literature, that such
contingent contracts cannot be written (they cannot be enforced, and hence are
useless). Therefore, contractual solutions to the underinvestment problem are difficult
to devise.
As a result, there is a severe underinvestment problem here, driven by exactly
the converse of the underinvestment problem in general training. The worker will
not undertake the required investments, because he is afraid of being held up by the
firm.
2.2. Worker power and investment. How can we improve the worker’s
investment incentives?
At a very general level, the answer is simple. The worker’s earnings have to
be conditioned on his specific skills. There are a number of ways of achieving
this. Perhaps the simplest is to give the worker some “power” in the employment
relationship. This power may come simply because the worker can bargain with
his employer effectively (either individually or via unions, though the latter would
probably not be useful in this context, since union bargaining does not typically
link a worker’s wage to his productivity). The worker may be able to bargain with
the firm, in turn, for a variety of reasons. Here are some:
(1) Because of regulations, such as employment protection legislation, or
precisely because of his specific skills, the firm needs the worker, hence we are
in the bilateral monopoly situation, and the rents will be shared (rather
than the firm making a take-it-or-leave-it offer).
(2) The firm may purposefully give access to some important assets of the firm
to the worker, so that the worker may feel secure that he will not be held up.
This is basically the insight that follows from the incomplete contracting
approach to property rights, which we discussed previously. Recall that
in the Grossman-Hart-Moore approach to the internal organization of the
firm, the allocation of property rights determines who can use assets, and the
use of the firm’s assets is a way of manipulating ex post bargaining and, via
this channel, ex ante investment incentives.
(3) The firm may change its organizational form in order to make a credible
commitment not to hold up the worker.
(4) The firm may develop a reputation for not holding up workers who have
invested in firm-specific human capital.
Here let us consider a simple example of investment incentives with bargaining
power, and show why firms may prefer to give more bargaining power to their
employees in order to ensure high levels of firm-specific investments. In the next
section, we discuss alternative “organizational” solutions to this problem.
Modify the above game simply by assuming that in the final period, rather than
the firm making a take-it-or-leave-it offer, the worker and the firm bargain over the
firm-specific surplus, so the worker’s wage is
$$w_1(s) = y_1 + \beta f(s).$$
Now at time $t = 0$, the worker maximizes
$$y_1 + \beta f(s) - \gamma(s),$$
which gives his investment as
$$\beta f'(\hat{s}) = \gamma'(\hat{s}). \tag{9.2}$$
Here $\hat{s}$ is strictly positive, so giving the worker bargaining power has improved
investment incentives. However, for $\beta < 1$, $\hat{s}$ is strictly less than the first-best investment level
$s^*$.
To investigate the relationship between firm-specific skills, firm profits and the
allocation of power within firms, now consider an extended game, where at time
$t = -1$, the firm chooses whether to give the worker access to a key asset. If it does,
ex post the worker has bargaining power $\beta$, and if it does not, the worker has no
bargaining power and wages are determined by a take-it-or-leave-it offer of the firm.
Essentially, the firm is choosing between the game in this section and the previous
one. Let us look at the profits of the firm from choosing the two actions. When
it gives no access, the worker chooses zero investment, and since $w_1 = y_1$, the firm’s
profits are $\pi_0 = 0$. In contrast, with the change in organizational form giving access
to the worker, the worker undertakes investment $\hat{s}$, and profits are
$$\pi_\beta = (1 - \beta)\, f(\hat{s}).$$
Therefore, the firm would prefer to give the worker some bargaining power in order
to encourage investment in specific skills.
Notice the contrast in the role of worker bargaining power between the standard
framework and the one here. In the standard framework, worker bargaining power
always reduces profits and causes inefficiency. Here, it may do the opposite. This
suggests that in some situations reducing worker bargaining power may actually be
counterproductive for efficiency.
Note another interesting implication of the framework here. If the firm could
choose the bargaining power of the worker without any constraints, it would set
$\bar{\beta}$ such that
$$\frac{\partial \pi_\beta}{\partial \beta} = 0 = -f(\hat{s}(\bar{\beta})) + (1 - \bar{\beta})\, f'(\hat{s}(\bar{\beta}))\, \frac{d\hat{s}(\bar{\beta})}{d\beta},$$
where $\hat{s}(\beta)$ and $d\hat{s}/d\beta$ are given by the first-order condition of the worker, (9.2).
One observation is immediate. The firm would certainly choose $\bar{\beta} < 1$, since with
$\bar{\beta} = 1$, we could never have $\partial \pi_\beta / \partial \beta = 0$ (or more straightforwardly, profits would be
zero). In contrast, a social planner who did not care about the distribution of income
between profits and wages would necessarily choose $\beta = 1$. The reason why the firm
would not choose the structure of organization that achieves the best investment
outcomes is that it cares about its own profits, not total income or surplus.
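A small numerical sketch may help fix ideas. Assume, purely for illustration, that $f(s) = s$ and $\gamma(s) = s^2/2$ (these forms, the grid search and the function names are not part of the model above). Then (9.2) gives $\hat{s}(\beta) = \beta$, the firm's profit is $(1-\beta)\beta$, and the firm's preferred bargaining power is interior.

```python
def gamma(s):          # hypothetical convex cost with gamma'(0) = 0
    return 0.5 * s ** 2

def f(s):              # hypothetical firm-specific productivity gain
    return s

GRID = [i / 100 for i in range(101)]   # grid on [0, 1] for both s and beta

def s_hat(beta):
    """Worker's investment when he receives the share beta of f(s), as in (9.2)."""
    return max(GRID, key=lambda s: beta * f(s) - gamma(s))

def firm_profit(beta):
    # pi_beta = (1 - beta) * f(s_hat(beta))
    return (1 - beta) * f(s_hat(beta))

beta_bar = max(GRID, key=firm_profit)      # firm's preferred bargaining power
print(beta_bar, s_hat(beta_bar), s_hat(1.0))
```

Under these assumed forms the firm stops short of $\beta = 1$, so investment stays below the level the planner would implement, which is the wedge described in the text.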
If there were an ex ante market in which the worker and the firm could “transact”,
so that the worker could make side payments to the firm to encourage it to choose $\beta = 1$,
then the efficient outcome would be achieved. This is basically the solution that
follows from the analysis of the incomplete contracts literature discussed above, but
this literature focuses on vertical integration, and attempts to answer the question
of who among many entrepreneurs/managers should own the firm or its assets. In
the context of worker-firm relationships, such a solution is not possible, given credit
constraints facing workers. Perhaps more importantly, such an arrangement would
effectively amount to the worker buying the firm, which is not possible for two
important reasons:
• the entrepreneur/owner of the firm most likely has some essential knowledge
for the production process, and transferring all profits to workers or to a
single worker is impractical and would destroy the value-generating capacity
of the firm;
• in practice there are many workers, so it is impossible to improve their
investment incentives by making each worker the residual claimant of the
firm’s profits.
2.3. Promotions. An alternative arrangement to encourage workers to invest
in firm-specific skills is to design a promotion scheme. Consider the following setup.
Suppose that there are two investment levels, $s = 0$, and $s = 1$ which costs $c$.
Suppose also that at time $t = 1$, there are two tasks in the firm, difficult and
easy, $D$ and $E$. Assume outputs in these two tasks as a function of the skill level are
$$y_D(0) < y_E(0) < y_E(1) < y_D(1).$$
Therefore, skills are more useful in the difficult task, and without skills the difficult
task is not very productive.
Moreover, suppose that
$$y_D(1) - y_E(1) > c,$$
meaning that the productivity gain of assigning a skilled worker to the difficult task
is greater than the cost of the worker obtaining skills.
In this situation, the firm can induce firm-specific investments in skills if it can
commit to a wage structure attached to promotions. In particular, suppose that
the firm commits to a wage of $w_D$ for the difficult task and $w_E$ for the easy task.
Notice that the wages do not depend on whether the worker has undertaken the
investment, so we are assuming some degree of commitment on the side of the firm,
but not modifying the crucial incompleteness of contracts assumption.
Now imagine the firm chooses the wage structure such that
$$y_D(1) - y_E(1) > w_D - w_E > c, \tag{9.3}$$
and then ex post decides whether the worker will be promoted.
Again by backward induction, we have to look at the decisions in the final period
of the game. When it comes to the promotion decision, and the worker is unskilled,
the firm will naturally choose to allocate him to the easy task (his productivity is
higher in the easy task and his wage is lower). If the worker is skilled, and the firm
allocates him to the easy task, its profits are $y_E(1) - w_E$. If it allocates him to
the difficult task, its profits are $y_D(1) - w_D$. The wage structure in (9.3) ensures
that profits from allocating him to the difficult task are higher. Therefore, with this
wage structure the firm has made a credible commitment to pay the worker a higher
wage if he becomes skilled, because it will find it profitable to promote the worker.
Next, going to the investment stage, the worker realizes that when he does not
invest he will receive $w_E$, and when he invests, he will get the higher wage $w_D$.
Since, again by (9.3), $w_D - w_E > c$, the worker will find it profitable to undertake
the investment.
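The backward-induction argument can be traced in a few lines of code. The sketch below is illustrative only: the function name, the dictionary layout and all numbers are hypothetical. It shows a wage gap satisfying (9.3), one that is too small to cover the investment cost, and one so large that the firm would not promote even a skilled worker.

```python
def promotion_outcome(y, w, cost):
    """Backward induction in the promotion game.

    y maps (task, skill) to output and w maps task to the committed wage.
    Returns the worker's investment (0 or 1) and the task he is assigned.
    """
    def assigned_task(skill):
        # Ex post, the firm picks the task with the higher profit y - w.
        return max(("E", "D"), key=lambda task: y[(task, skill)] - w[task])

    payoff_invest = w[assigned_task(1)] - cost
    payoff_not = w[assigned_task(0)]
    skill = 1 if payoff_invest > payoff_not else 0
    return skill, assigned_task(skill)

# Hypothetical outputs satisfying y_D(0) < y_E(0) < y_E(1) < y_D(1).
y = {("D", 0): 1.0, ("E", 0): 2.0, ("E", 1): 3.0, ("D", 1): 6.0}
good = promotion_outcome(y, {"E": 1.0, "D": 3.0}, cost=1.0)   # wage gap 2 satisfies (9.3)
flat = promotion_outcome(y, {"E": 1.0, "D": 1.5}, cost=1.0)   # wage gap 0.5 is below c
steep = promotion_outcome(y, {"E": 1.0, "D": 5.0}, cost=1.0)  # wage gap 4 exceeds y_D(1) - y_E(1)
print(good, flat, steep)
```

Only the first wage structure induces investment; in the other two cases either the worker's or the firm's side of (9.3) fails.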
2.4. Investments and layoffs–The Hashimoto model. Consider the
following model, which is useful in a variety of circumstances. The worker can invest
in $s = 1$ at time $t = 0$, again at the cost $c$. The investment increases the worker’s
productivity by an amount $m + \eta$, where $\eta$ is a mean-zero random variable observed
only by the firm at $t = 1$. The total productivity of the worker is $x + m + \eta$ (if he
does not invest, his productivity is simply $x$). The firm unilaterally decides whether
to fire the worker, so the worker will be fired if
$$\eta < \eta^* \equiv w - x - m,$$
where $w$ is his wage. This wage is assumed to be fixed, and cannot be renegotiated
as a function of $\eta$, since the worker does not observe $\eta$. (There can be other
more complicated ways of revealing information about $\eta$, using stochastic contracts,
whereby workers and firms make direct reports about the values of $\eta$ and $\theta$, and
different values of these variables map into a wage level and a probability that the
relationship will continue; using the Revelation Principle we can restrict attention
to truthful reports subject to incentive compatibility constraints and solve for the most
efficient contracts of this form; nevertheless, to keep the discussion simple, we ignore
these stochastic contracts here.)
If the worker is fired or quits, he receives an outside wage $v$. If he stays, he
receives the wage paid by the firm, $w$, and also disutility, $\theta$, only observed by him.
The worker unilaterally decides whether to quit or not, so he will quit if
$$\theta > \theta^* \equiv w - v.$$
Denoting the distribution function of $\theta$ by $Q$ and that of $\eta$ by $F$, and assuming that
the draws from these distributions are independent, the expected profit of the firm
is
$$Q(\theta^*)\,[1 - F(\eta^*)]\,[x + m - w + E(\eta \mid \eta \ge \eta^*)].$$
The expected utility of the worker is
$$v + Q(\theta^*)\,[1 - F(\eta^*)]\,[w - v - E(\theta \mid \theta \le \theta^*)].$$
In contrast, if the worker does not invest in skills, he will obtain
$$\begin{cases} v & \text{if } w > x, \\ v + Q(\theta^*)\,[w - v - E(\theta \mid \theta \le \theta^*)] & \text{if } w \le x. \end{cases}$$
So we can see that a high wage promise by the firm may have either a beneficial or
an adverse effect on investment incentives. If $w = x + \varepsilon > v$, the worker realizes
that he can only keep his job by investing. But on the other hand, a high wage
makes it more likely that $\eta < \eta^*$, so it may increase the probability that, given the
realization of the productivity shock, profits will be negative, and the worker will
be fired. This will reduce the worker’s investment incentives. In addition, a lower
wage would make it more likely that the worker will quit, and through this channel
increase inefficiency and discourage investment.
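To see how the wage moves the firing and quitting margins, the expected payoffs above can be evaluated numerically. The sketch below assumes, for illustration only, that $\eta$ and $\theta$ are independent mean-zero normal variables (the text leaves $F$ and $Q$ unspecified); the function name and parameter values are hypothetical. The conditional means use the standard truncated-normal formulas.

```python
from statistics import NormalDist

def expected_payoffs(w, x, m, v, sigma_eta=1.0, sigma_theta=1.0):
    """Expected firm profit and worker utility when the worker has invested.

    Assumes (illustration only) independent mean-zero normal eta and theta.
    """
    F, Q = NormalDist(0.0, sigma_eta), NormalDist(0.0, sigma_theta)
    eta_star = w - x - m        # the firm fires if eta < eta_star
    theta_star = w - v          # the worker quits if theta > theta_star
    stay = Q.cdf(theta_star)            # probability the worker does not quit
    keep = 1.0 - F.cdf(eta_star)        # probability the firm does not fire
    # Truncated-normal means: E(eta | eta >= eta*) and E(theta | theta <= theta*).
    mean_eta = sigma_eta ** 2 * F.pdf(eta_star) / keep
    mean_theta = -sigma_theta ** 2 * Q.pdf(theta_star) / stay
    profit = stay * keep * (x + m - w + mean_eta)
    utility = v + stay * keep * (w - v - mean_theta)
    return profit, utility

for wage in (1.0, 2.0, 3.0):
    print(wage, expected_payoffs(wage, x=1.0, m=1.0, v=0.5))
```

Varying the wage in this sketch shows the two opposing forces described above: a higher $w$ lowers the quit probability but raises the probability of being fired.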
According to Hashimoto, the wage structure has to be determined to balance
these effects, and moreover, the ex post wage structure chosen to minimize inefficient
separations may dictate a particular division of the costs of firm-specific investments.
An interesting twist on this comes from Carmichael, who suggests that
commitment to a promotion ladder might improve incentives to invest without encouraging
further layoffs by the firm. Suppose the firm commits to promote $N_h$ workers at
time $t = 1$ (how such a commitment is made is an interesting and difficult question).
Promotion comes with an additional wage of $B$. So the expected wage of the worker,
if he keeps his job, is now
$$w + \frac{N_h}{N} B,$$
where $N$ is employment at time $t = 1$, and this expression assumes that a random
selection of the workers will be promoted. A greater $N_h$ or $B$, holding the layoff
rate of the firm constant, increases the incentive of the worker to stay around, and
encourages investment.
Next think about the layoff rate of the firm. The total wage bill of the firm at
time $t = 1$ is then
$$W = N w + N_h B.$$
The significance of this expression is that if the firm fires a worker, this will only
save the firm $w$, since it is still committed to promote $N_h$ workers. Therefore, this
commitment to (an absolute number of) promotions reduces the firm’s incentive to
fire, while simultaneously increasing the reward to staying in the firm for the worker.
This is an interesting idea, but we can push the reasoning further, perhaps
suggesting that it is not as compelling as it first appears. If the firm can commit
to promote $N_h$ workers, why can it not commit to employing $N'$ workers, and by
manipulating this number effectively make a commitment not to fire workers? So
if this type of commitment to an employment level is allowed, promotions are not
necessary, and if such a commitment is not allowed, it is not plausible that the firm
can commit to promoting $N_h$ workers.
3. A Simple Model of Labor Market Learning and Mobility
An important idea related to firm-specific skills is that these skills are (at least
in part) a manifestation of the quality of the match between a worker and his job.
Naturally, if workers could costlessly learn about the quality of the matches between
themselves and all potential jobs, they would immediately choose the job for which
they are most suited. In practice, however, jobs are “experience goods,” meaning
that workers can only find out whether they are a good match to a job (and to a
firm) by working in that firm and job. Moreover, this type of learning does not take
place immediately.
What makes these ideas particularly useful for labor economics is that a simple
model incorporating this type of match-specific learning provides a range of useful
results and also opens up an even larger set of questions for analysis. Interestingly,
however, after the early models on these topics, there has been relatively little
research.
The first model to formalize these ideas is due to Jovanovic. Jovanovic considered
a model in which match-specific productivity is a draw from a normal distribution,
and the output of the worker conditional on his match-specific productivity is also
normally distributed. Though, as we have seen, normal distributions are often very
convenient, in this particular context the normal distribution has a disadvantage,
which is that as the worker learns about his match-specific productivity, we need to
keep track of both his belief about the level of the quality and also the precision of
his beliefs. This makes the model somewhat difficult to work with.
Instead, let us consider a simpler version of the same model.
Each worker is infinitely lived in discrete time and maximizes the expected
discounted value of income, with a discount factor $\beta < 1$. There is no ex ante
heterogeneity among the workers. But worker-job matches are random.
In particular, the worker may be a good match for a job (or a firm) or a bad
match. Let the (population) probability that the worker is a good match be
$\mu_0 \in (0, 1)$. A worker in any given job can generate one of two levels of output,
high, $y_h$, and low, $y_l < y_h$. In particular, suppose that we have
good match: $y_h$ with probability $p$, $y_l$ with probability $1 - p$
and
bad match: $y_h$ with probability $q$, $y_l$ with probability $1 - q$
where, naturally,
$$p > q.$$
Let us assume that all learning is symmetric (as in the career concerns model).
This is natural in the present context, since there is only learning about the match
quality of the worker, and the firm also observes the productivity realizations of
the worker from the beginning of their employment relationship. This implies that
the firm and the worker will share the same posterior probability that the worker is
a good match to the job. For worker $i$, job $j$ and time $t$, we can denote this posterior
probability (belief) as $\mu_{ijt}$. When there is no risk of confusion, we will denote this
simply by $\mu$.
Jovanovic assumes that workers always receive their full marginal product in
each job. This is a problematic assumption, since match-specific quality is also
firm specific, thus there is no reason for the worker to receive this entire firm-specific
surplus. As in the models with firm-specific investments, the more natural
assumption would be to have some type of wage bargaining. Let us assume the
simplest bargaining structure in which a firm will pay the worker a fraction $\phi \in (0, 1]$
of his expected productivity at that point. In particular, the wage of a worker whose
posterior of a good match is $\mu$ will be
$$w(\mu) = \phi \left[ \mu y_h + (1 - \mu) y_l \right].$$
Note that this is different from the Nash bargaining solution, which would have to
take into account the outside option and also the future benefits to the worker from
being in this job (which result from learning). But having such a simple expression
facilitates the analysis and the exposition here. [Alternatively, we could have assumed
that bargaining takes place after the realization of output, in which case the wage
would be equal to $\phi y_h$ with probability $\mu$ and to $\phi y_l$ with probability $1 - \mu$; since
both the worker and the firm are risk neutral, there is no difference between these two
cases.]
To make progress, let us consider a worker with belief $\mu$. If this worker produces
output $y_h$, then Bayes's rule implies that his posterior (belief) next period should
be
$$\mu'_h(\mu) \equiv \frac{\mu p}{\mu p + (1 - \mu) q},$$
where the fact that this is greater than $\mu$ immediately follows from the assumption
that $p > q$. Similarly, following an output realization of $y_l$, the belief of the worker
will be
$$\mu'_l(\mu) \equiv \frac{\mu (1 - p)}{\mu (1 - p) + (1 - \mu)(1 - q)}.$$
Finally, let us also assume that every time a worker changes jobs, he has to incur
a training or mobility cost equal to $\gamma \geq 0$.
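The two updating formulas can be checked numerically. The following is a minimal Python sketch; the function name `update_belief` and the parameter values are illustrative assumptions, not part of the original notes. It also verifies that beliefs are a martingale, i.e., that the expected posterior equals the prior.

```python
def update_belief(mu, high_output, p, q):
    """Posterior probability of a good match after one output observation (Bayes's rule)."""
    like_good = p if high_output else 1.0 - p   # likelihood of the observation if the match is good
    like_bad = q if high_output else 1.0 - q    # likelihood of the observation if the match is bad
    return mu * like_good / (mu * like_good + (1.0 - mu) * like_bad)

# Illustrative (assumed) parameter values with p > q.
p, q, mu0 = 0.8, 0.3, 0.5
mu_h = update_belief(mu0, True, p, q)    # belief after high output
mu_l = update_belief(mu0, False, p, q)   # belief after low output

# Probability of observing high output given the current belief.
prob_high = mu0 * p + (1.0 - mu0) * q
expected_posterior = prob_high * mu_h + (1.0 - prob_high) * mu_l
print(mu_h, mu_l, expected_posterior)
```

With $p > q$, the posterior rises above the prior after high output and falls below it after low output, as the text states.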
Under these assumptions, we can write the net present discounted value of a
worker with belief $\mu$ recursively using simple dynamic programming arguments. In
particular, this is
$$V(\mu) = w(\mu) + \beta \Big[ (\mu p + (1 - \mu) q) \, V(\mu'_h(\mu)) + (\mu (1 - p) + (1 - \mu)(1 - q)) \max\{ V(\mu'_l(\mu)) ; V(\mu_0) - \gamma \} \Big].$$
Intuitively, the worker receives the wage $w(\mu)$ as a function of the (symmetrically
held) belief about the quality of his match at the moment.
The continuation value, which is discounted with the discount factor $\beta < 1$, has
the following explanation: with probability $\mu$, the match is indeed good and then
the worker will produce an output equal to $y_h$ with probability $p$. With probability
$1 - \mu$, the match is not good and the worker will produce high output with probability
$q$. In either case, the posterior about match quality will be $\mu'_h(\mu)$, and using the
recursive reasoning, his value will be $V(\mu'_h(\mu))$. Since he was happy to be in this
job with belief $\mu$, $\mu'_h(\mu) \geq \mu$ as stated above, and clearly (can you prove this?)
$V(\mu)$ is increasing in $\mu$, he will not want to quit after a good realization and thus
his value is written as $V(\mu'_h(\mu))$.
With probability $\mu (1 - p) + (1 - \mu)(1 - q)$, on the other hand, he will produce
low output, $y_l$, and in this case the posterior will be $\mu'_l(\mu)$. Since $\mu'_l(\mu) \leq \mu$, at this
point the worker may prefer to quit and take another job. Since a new job is a new
draw from the match-quality distribution, the probability that he will be good at
this job is $\mu_0$. Subtracting the cost of mobility, $\gamma$, the value of taking a new job is
therefore $V(\mu_0) - \gamma$. The worker chooses the maximum of this and the continuation
value in the same firm, $V(\mu'_l(\mu))$.
An immediate result from dynamic programming is that if the instantaneous
reward function, here $w(\mu)$, is strictly increasing in the state variable, which here
is the belief $\mu$, then the value function $V(\mu)$ will also be strictly increasing. This
implies that there will exist some cutoff level of belief $\mu^*$ such that workers will stay
in their job as long as
$$\mu \geq \mu^*,$$
and they will quit if $\mu < \mu^*$.
Let $\bar{\mu} = \inf\{\mu : \mu'_l(\mu) \geq \mu^*\}$. Then a worker with beliefs $\mu > \bar{\mu}$ will not quit
irrespective of the realization of output. Workers with $\mu < \mu^*$ should have quit
already. Therefore, the only remaining range of beliefs is $\mu \in [\mu^*, \bar{\mu}]$. A worker with
beliefs in this range will quit the job if he generates low output.
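The recursion for $V(\mu)$ and the cutoff belief can be approximated numerically. What follows is only a sketch under stated assumptions: the belief is discretized on a grid, $V$ is linearly interpolated between grid points, and all parameter values and the function name `solve_learning_model` are illustrative choices that do not come from the original notes. The wage rule is the one given in the text, $w(\mu) = \phi[\mu y_h + (1 - \mu) y_l]$.

```python
import numpy as np

def solve_learning_model(beta=0.9, p=0.8, q=0.3, mu0=0.5, gamma=0.0,
                         phi=1.0, y_h=2.0, y_l=1.0, n_grid=401, tol=1e-10):
    """Value function iteration for V(mu) on a belief grid; returns (grid, V, mu_star)."""
    grid = np.linspace(0.0, 1.0, n_grid)
    wage = phi * (grid * y_h + (1.0 - grid) * y_l)   # w(mu) from the text
    prob_h = grid * p + (1.0 - grid) * q             # probability of high output
    prob_l = 1.0 - prob_h                            # probability of low output
    mu_h = grid * p / prob_h                         # posterior after high output
    mu_l = grid * (1.0 - p) / prob_l                 # posterior after low output
    V = wage / (1.0 - beta)                          # initial guess
    for _ in range(10_000):
        V_h = np.interp(mu_h, grid, V)               # V at the posterior after high output
        V_l = np.interp(mu_l, grid, V)               # V at the posterior after low output
        V_quit = np.interp(mu0, grid, V) - gamma     # value of a fresh job net of mobility cost
        V_new = wage + beta * (prob_h * V_h + prob_l * np.maximum(V_l, V_quit))
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    V_quit = np.interp(mu0, grid, V) - gamma
    # Lowest grid belief at which keeping the job is at least as good as quitting.
    mu_star = grid[np.argmax(V >= V_quit - 1e-12)]
    return grid, V, mu_star

grid, V, mu_star_free = solve_learning_model(gamma=0.0)
_, _, mu_star_costly = solve_learning_model(gamma=0.5)
print(mu_star_free, mu_star_costly)
```

Under these assumed parameters, the computed value function is increasing in the belief, and a positive mobility cost lowers the cutoff below the prior $\mu_0$, so workers tolerate more bad news before quitting.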
Now a couple of observations are immediate.
(1) Provided that $\mu_0 \in (0, 1)$, $\mu$ will never reach 0 or 1 in finite time.
Therefore, a worker who generates high output will have higher wages in
the following period, and a worker who generates low output will have lower
wages in the following period. Thus, in this model worker wages will move
with past performance.
(2) It can be easily proved that if $\gamma = 0$, then $\mu^* = \mu_0$. This implies that when
$\gamma$ is equal to 0 or is very small, a worker who starts a job and generates
low output will quit immediately. Therefore, as long as $\gamma$ is not very high,
there will be a high likelihood of separation in new jobs.
(3) Next consider a worker who has been in a job for a long time. Such workers
will on average have high values of $\mu$, since they have never experienced (on
this job) a belief less than $\mu^*$. This implies that the average value of their
beliefs must be high. Therefore, workers with long tenure are unlikely to
quit or separate from their job. [Here average refers to the average among
the set of workers who have been in a job for a given length of time; for
example, the average value of $\mu$ for all workers who have been in a job for $T$
periods.]
(4) With the same argument, workers who have been in a job for a long time will
have high average $\mu$ and thus high wages. This implies that in equilibrium
there will be a tenure premium.
(5) Moreover, because Bayesian updating immediately implies that the gaps
between $\mu'_h(\mu)$ and $\mu$ and between $\mu'_l(\mu)$ and $\mu$ are lowest when $\mu$ is close
to 1 (and symmetrically when it is close to 0, but workers are never in
jobs where their beliefs are close to 0), workers with long tenure will not
experience large wage changes. In contrast, workers at the beginning of
their tenure will have higher wage variability.
(6) What will happen to wages when workers quit? If $\gamma = 0$, wages will
necessarily fall when workers quit (since before they quit $\mu > \mu_0$, whereas in the
new job $\mu = \mu_0$). If, on the other hand, $\gamma$ is non-infinitesimal, workers will
experience a wage gain when they change jobs, since in this case $\mu < \mu_0$
because they are staying in their current job until this job is sufficiently
unlikely to be a good match. This last prediction is also consistent with
the data, where on average workers who change jobs experience an increase
in wages. [But is this a reasonable explanation for wage increases when
workers change jobs?]
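Observations (2) and (3) can be illustrated with a small simulation. This is a sketch only: it takes the cutoff as given and sets it to $\mu^* = \mu_0$ (the $\gamma = 0$ case from observation (2)), and the parameter values, the seed and the function name `separation_hazard` are assumptions made for illustration.

```python
import random

def separation_hazard(mu0=0.5, p=0.8, q=0.3, mu_star=0.5, n_workers=50_000,
                      max_tenure=6, seed=0):
    """Share of surviving matches that end after each period of tenure."""
    rng = random.Random(seed)
    at_risk = [0] * max_tenure   # matches still alive at the start of each period
    quits = [0] * max_tenure     # matches that end at the end of each period
    for _ in range(n_workers):
        good = rng.random() < mu0            # true match quality, never observed directly
        mu = mu0
        for tenure in range(max_tenure):
            at_risk[tenure] += 1
            high = rng.random() < (p if good else q)
            like_good = p if high else 1.0 - p
            like_bad = q if high else 1.0 - q
            mu = mu * like_good / (mu * like_good + (1.0 - mu) * like_bad)
            if mu < mu_star:                 # quit once the belief falls below the cutoff
                quits[tenure] += 1
                break
    return [k / n if n else 0.0 for k, n in zip(quits, at_risk)]

hazard = separation_hazard()
print(hazard)
```

In this parameterization the separation rate is highest in the first period of a job and lower at every later tenure, although it need not decline monotonically period by period.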
What is missing from this model is differential learning opportunities in different
jobs. If we assume that output and underlying job quality are normally distributed,
we already obtain some amount of differential learning, since the value of learning
is higher in new jobs because the precision of the posterior is smaller. Another
possibility would be to have heterogeneous jobs, where some jobs have greater returns
to match quality, or perhaps some jobs enable faster learning (e.g., more informative
signals). Even more interesting would be to allow some amount of learning about
general skills. For example, an academic will not be learning and revealing only
about his match-specific quality but also about his industry-specific quality (e.g.,
his research potential). When this is the case, some jobs may play the role of
“stepping stones” because they reveal information about the skills and productivity
of the worker in a range of other jobs.
Finally, if instead of the reduced-form wage equation, we incorporate competition
among firms into this model, some of the predictions change again. For example,
we can consider a world in which a finite number of firms with access to the same
technology compete à la Bertrand for the worker. Clearly the worker will start
working for the firm where the prior of a good match is greatest. Bertrand competition
implies that this firm will pay the worker his value at the next best job. Once the
worker receives bad news and decides to quit, then he will switch to the job that
was previously his next best option. But this implies that his wage, which will now
be determined by the third best option (which may in fact be his initial employer),
is necessarily smaller, thus job changes will always be associated with wage declines.
This discussion shows that wage determination assumptions in these models are not
innocuous, and more realistic wage determination schemes may lead to results that
are not entirely consistent with the data (wage declines rather than wage increases
upon job changes). This once again highlights the need for introducing some amount
of general skills and job heterogeneity, so that workers quit not only because they
have received bad news in their current job but also because they have learned about
their ability and can therefore go and work for “higher-quality” jobs.
Part 4
Search and Unemployment
Let us start with the classical McCall model of search. This model is not only
elegant, but has also become a workhorse for many questions in macro, labor and
industrial organization. An important feature of the model is that it is much more
tractable than the original Stigler formulation of search, as one of sampling multiple
offers, but we will return to this theme below.
CHAPTER 10
The Partial Equilibrium Model
1. Basic Model
Imagine a partial equilibrium setup with a risk neutral individual in discrete
time. At time $t = 0$, this individual has preferences given by
$$\sum_{t=0}^{\infty} \beta^t c_t,$$
where $c_t$ is his consumption. He starts life as unemployed. When unemployed,
he has access to consumption equal to $b$ (from home production, value of leisure or
unemployment benefit). At each time period, he samples a job. All jobs are identical
except for their wages, and wages are given by an exogenous stationary distribution
$F(w)$ with finite (bounded) support $W$, i.e., $F$ is defined only for $w \in W$. Without
loss of any generality, we can take the lower support of $W$ to be 0, since negative
wages can be ruled out. In other words, at every date, the individual samples a
wage $w_t \in W$, and has to decide whether to take this or continue searching. Draws
from $W$ over time are independent and identically distributed.
This type of sequential search model can also be referred to as a model of
undirected search, in the sense that the individual has no ability to seek or direct his
search towards different parts of the wage distribution (or towards different types
of jobs). This will contrast with models of directed search which we will see later.
Let us assume for now that there is no recall, so that the only thing the individual
can do is to take the job offered within that date (with recall, the individual would
be able to accumulate offers, so at time $t$, he can choose any of the offers he has
received up to that point). If he accepts a job, he will be employed at that job
forever, so the net present value of accepting a job of wage $w_t$ is
$$\frac{w_t}{1 - \beta}.$$
This is a simple decision problem. Let us specify the class of decision rules of the
agent. In particular, let
$$a_t : W \to [0, 1]$$
denote the action of the agent at time $t$, which specifies his acceptance probability
for each wage in $W$ at time $t$. Let $a'_t \in \{0, 1\}$ be the realization of the action by
the individual (thus allowing for mixed strategies). Let also $A_t$ denote the set of
realized actions by the individual, and define $A^t = \prod_{s=0}^{t} A_s$. Then a strategy for the
individual in this game is
$$p_t : A^{t-1} \times W \to [0, 1].$$
Let $\mathcal{P}$ be the set of such functions (with the property that $p_t(\cdot)$ is defined only if
$p_s(\cdot) = 0$ for all $s \leq t$) and $\mathcal{P}^{\infty}$ the set of infinite sequences of such functions. The
most general way of expressing the problem of the individual would be as follows.
Let $E$ be the expectations operator. Then the individual's problem is
$$\max_{\{p_t\}_{t=0}^{\infty} \in \mathcal{P}^{\infty}} E \sum_{t=0}^{\infty} \beta^t c_t$$
subject to $c_t = b$ if $t < s$ and $c_t = w_s$ if $t \geq s$, where $s = \inf\{n \in \mathbb{N} : a'_n = 1\}$.
Naturally, written in this way, the problem looks complicated. Nevertheless, the
dynamic programming formulation of this problem will be quite tractable.
To develop this approach, let us analyze this problem by writing it recursively
using dynamic programming techniques. First, let us define the value of the agent
when he has sampled a job of $w \in W$. This is clearly given by
$$v(w) = \max\left\{ \frac{w}{1 - \beta}, \; \beta v + b \right\}, \tag{10.1}$$
where
$$v = \int_W v(\omega) \, dF(\omega) \tag{10.2}$$
is the continuation value of not accepting a job. Here we have made no assumptions
about the structure of the set $W$, which could be an interval, or might have a mass
point, and the density of the distribution $F$ may not exist. Therefore, the integral
in (10.2) should be interpreted as a Lebesgue integral.
Equation (10.1) follows from the observation that the individual will either accept
the job, receiving a constant consumption stream of $w$ (valued at $w / (1 - \beta)$), or will
turn down this job, in which case he will enjoy the consumption level $b$, and receive
the continuation value $v$ from the next period on. Maximization implies that the
individual takes whichever of these two options gives higher net present value.
Equation (10.2), on the other hand, follows from the fact that from tomorrow on,
the individual faces the same distribution of job offers, so $v$ is simply the expected
value of $v(w)$ over the stationary distribution of wages.
We are interested in finding both the value function $v(w)$ and the optimal policy
of the individual.
Combining these two equations, we can write
$$v(w) = \max\left\{ \frac{w}{1 - \beta}, \; b + \beta \int_W v(\omega) \, dF(\omega) \right\}. \tag{10.3}$$
We can now deduce the existence of optimal policies using standard theorems from
dynamic programming. But in fact, (10.3) is simple enough that one can derive
these results without appealing to these theorems. In particular, this equation
makes it clear that $v(w)$ must be piecewise linear with first a flat portion and then
an increasing portion.
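The shape of $v(w)$ can be seen by iterating on (10.3) directly. The sketch below assumes a discrete wage distribution (three equally likely wages), $b = 0$ and $\beta = 0.9$; these values and the function name `mccall_value_iteration` are illustrative assumptions rather than anything specified in the notes.

```python
import numpy as np

def mccall_value_iteration(wages, probs, b, beta, tol=1e-12, max_iter=100_000):
    """Iterate v(w) = max{ w/(1-beta), b + beta * E[v] } for a discrete wage distribution."""
    wages = np.asarray(wages, dtype=float)
    probs = np.asarray(probs, dtype=float)
    accept = wages / (1.0 - beta)                # value of accepting each wage forever
    v = accept.copy()                            # initial guess
    for _ in range(max_iter):
        reject = b + beta * float(probs @ v)     # value of turning the offer down
        v_new = np.maximum(accept, reject)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v

beta, b = 0.9, 0.0
v = mccall_value_iteration([1.0, 2.0, 3.0], [1 / 3, 1 / 3, 1 / 3], b, beta)
implied_reservation_wage = (1.0 - beta) * v.min()   # the flat portion equals R/(1-beta)
print(v, implied_reservation_wage)
```

In this example the two lowest wages share the same value (the flat portion), while the highest wage lies on the increasing portion $w / (1 - \beta)$.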
The next task is to determine the optimal policy. But the fact that $v(w)$ is
non-decreasing and is piecewise linear with first a flat portion immediately tells us
that the optimal policy will take a reservation wage form, which is a key result of
the sequential search model. More explicitly, there will exist some reservation wage
$R$ such that all wages above $R$ will be accepted and those $w < R$ will be turned
down. Moreover, this reservation wage has to be such that
$$\frac{R}{1 - \beta} = b + \beta \int_W v(\omega) \, dF(\omega), \tag{10.4}$$
so that the individual is just indifferent between taking $w = R$ and waiting for one
more period. Next we also have that since $w < R$ are turned down, for all $w < R$
$$v(w) = b + \beta \int_W v(\omega) \, dF(\omega) = \frac{R}{1 - \beta},$$
and for all $w \geq R$,
$$v(w) = \frac{w}{1 - \beta}.$$
Therefore,
$$\int_W v(\omega) \, dF(\omega) = \frac{R F(R)}{1 - \beta} + \int_{w \geq R} \frac{w}{1 - \beta} \, dF(w).$$
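Combining this with (10.4), we have
$$\frac{R}{1 - \beta} = b + \beta \left[ \frac{R F(R)}{1 - \beta} + \int_{w \geq R} \frac{w}{1 - \beta} \, dF(w) \right].$$
Manipulating this equation, we can write
$$R = \frac{1}{1 - \beta F(R)} \left[ b (1 - \beta) + \beta \int_R^{+\infty} w \, dF(w) \right],$$
which is one way of expressing the reservation wage. More useful is to rewrite this
equation as
$$\int_{w < R} \frac{R}{1 - \beta} \, dF(w) + \int_{w \geq R} \frac{R}{1 - \beta} \, dF(w) = b + \beta \left[ \int_{w < R} \frac{R}{1 - \beta} \, dF(w) + \int_{w \geq R} \frac{w}{1 - \beta} \, dF(w) \right].$$
Now subtracting $\beta R \int_{w \geq R} dF(w) / (1 - \beta) + \beta R \int_{w < R} dF(w) / (1 - \beta)$ from both
sides, we obtain
$$\int_{w < R} \frac{R}{1 - \beta} \, dF(w) + \int_{w \geq R} \frac{R}{1 - \beta} \, dF(w) - \beta \int_{w \geq R} \frac{R}{1 - \beta} \, dF(w) - \beta \int_{w < R} \frac{R}{1 - \beta} \, dF(w) = b + \beta \left[ \int_{w \geq R} \frac{w - R}{1 - \beta} \, dF(w) \right].$$
Collecting terms, we obtain
$$R - b = \frac{\beta}{1 - \beta} \left[ \int_{w \geq R} (w - R) \, dF(w) \right], \tag{10.5}$$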
which is a particularly useful and economically intuitive way of characterizing the
reservation wage. The left-hand side is best understood as the cost of foregoing the
wage of $R$, while the right-hand side is the expected benefit of one more search.
Clearly, at the reservation wage, these two are equal.
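Equation (10.5) can also be solved for $R$ directly. The sketch below uses bisection on the difference between the two sides of (10.5), again assuming a discrete wage distribution; the wage grid, $b = 0.5$, $\beta = 0.95$ and the function name `reservation_wage` are illustrative assumptions.

```python
def reservation_wage(wages, probs, b, beta, tol=1e-12):
    """Solve R - b = beta/(1-beta) * E[max(w - R, 0)] for R by bisection."""
    def excess(R):
        # Left-hand side minus right-hand side of (10.5); increasing in R.
        gain = sum(pr * max(w - R, 0.0) for w, pr in zip(wages, probs))
        return (R - b) - beta / (1.0 - beta) * gain
    # excess <= 0 at the lower end and >= 0 at the upper end, so a root is bracketed.
    lo, hi = min(b, min(wages)), max(b, max(wages))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

wages = [0.1 * i for i in range(21)]     # 0.0, 0.1, ..., 2.0, equally likely
probs = [1.0 / 21] * 21
R = reservation_wage(wages, probs, b=0.5, beta=0.95)
print(R)
```

Bisection is a convenient choice here because the left-hand side of (10.5) rises with $R$ while the right-hand side falls, so the difference changes sign exactly once.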
One implication of the reservation wage policy is that the assumption of no
recall, made above, was of no consequence. In a stationary environment, the worker
will have a constant reservation wage, and therefore has no desire to go back and
take a job that he had previously rejected.
Let us define the right-hand side of equation (10.5) as
$$g(R) \equiv \frac{\beta}{1 - \beta} \left[ \int_{w \geq R} (w - R) \, dF(w) \right],$$
which represents the expected benefit of one more search as a function of the
reservation wage. Clearly,
$$g'(R) = -\frac{\beta}{1 - \beta} (R - R) f(R) - \frac{\beta}{1 - \beta} \left[ \int_{w \geq R} dF(w) \right] = -\frac{\beta}{1 - \beta} [1 - F(R)] < 0.$$
This implies that equation (10.5) has a unique solution. Moreover, by the implicit
function theorem,
$$\frac{dR}{db} = \frac{1}{1 - g'(R)} > 0,$$
so that as expected, higher benefits when unemployed increase the reservation wage,
making workers more picky.
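The comparative static $dR/db = 1 / (1 - g'(R))$ can be checked in a case with a closed form. The sketch below assumes, purely for illustration, that wages are uniform on $[0, 1]$, so that $F(R) = R$ and $g(R) = \frac{\beta}{1 - \beta} (1 - R)^2 / 2$; equation (10.5) is then a quadratic in $1 - R$. The function name and parameter values are assumptions.

```python
import math

def reservation_wage_uniform(b, beta):
    """Reservation wage from (10.5) when wages are uniform on [0, 1] (requires b < 1)."""
    k = beta / (1.0 - beta)
    # With x = 1 - R, (10.5) becomes k*x**2/2 + x - (1 - b) = 0; take the positive root.
    x = (-1.0 + math.sqrt(1.0 + 2.0 * k * (1.0 - b))) / k
    return 1.0 - x

beta, b, eps = 0.95, 0.2, 1e-6
R = reservation_wage_uniform(b, beta)

# Finite-difference estimate of dR/db.
numeric = (reservation_wage_uniform(b + eps, beta)
           - reservation_wage_uniform(b - eps, beta)) / (2.0 * eps)

# Formula from the text: g'(R) = -beta/(1-beta) * [1 - F(R)], with F(R) = R here.
g_prime = -beta / (1.0 - beta) * (1.0 - R)
analytic = 1.0 / (1.0 - g_prime)
print(R, numeric, analytic)
```

The derivative lies strictly between 0 and 1: a higher $b$ raises the reservation wage, but less than one for one, because a higher $R$ also reduces the expected gain from further search.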
Moreover, for future reference, also note that when the density of $F(R)$, denoted
by $f(R)$, exists, the second derivative of $g$ also exists and is
$$g''(R) = \frac{\beta}{1 - \beta} f(R) \geq 0,$$
so that the right-hand side of equation (10.5) is also convex.
The next question is to investigate how changes in the distribution of wages
$F$ affect the reservation wage. Before doing this, however, we will use this partial
equilibrium McCall model to derive a very simple theory of unemployment.
2. Unemployment with Sequential Search
Let us now use the McCall model to construct a simple model of unemployment.
In particular, let us suppose that there is now a continuum of measure 1 of identical
individuals sampling jobs from the same stationary distribution $F$. Moreover, once a
job is created, it lasts until the worker dies, which happens with probability $s$. There is
a mass of $s$ workers born every period, so that population is constant, and these
workers start out as unemployed. The death probability means that the effective
discount factor of workers is equal to $\beta (1 - s)$. Consequently, the value of having
accepted a wage of $w$ is:
$$v^a(w) = \frac{w}{1 - \beta (1 - s)}.$$
Moreover, with the same reasoning as before, the value of having a job offer at
wage $w$ at hand is
$$v(w) = \max\{ v^a(w), \; b + \beta (1 - s) v \}$$
with
$$v = \int_W v(w) \, dF.$$
Therefore, the same steps lead to the reservation wage equation:
$$R - b = \frac{\beta (1 - s)}{1 - \beta (1 - s)} \left[ \int_{w \geq R} (w - R) \, dF(w) \right].$$
Now what is interesting is to look at the law of motion of unemployment. Let
us start time $t$ with $U_t$ unemployed workers. There will be $s$ new workers born into
the unemployment pool. Out of the $U_t$ unemployed workers, those who survive and
do not find a job will remain unemployed. Therefore
$$U_{t+1} = s + (1 - s) F(R) U_t,$$
where $F(R)$ is the probability of not finding a job (i.e., a wage offer below the
reservation wage), so $(1 - s) F(R)$ is the joint probability of not finding a job and
surviving, i.e., of remaining unemployed. This is a simple first-order linear difference
equation (only depending on the reservation wage $R$, which is itself independent of
the level of unemployment, $U_t$) and determines the law of motion of unemployment.
Moreover, since $(1 - s) F(R) < 1$, it is asymptotically stable, and will converge to
a unique steady-state level of unemployment.
To get more insight, subtract $U_t$ from both sides, and rearrange to obtain
$$U_{t+1} - U_t = s (1 - U_t) - (1 - s)(1 - F(R)) U_t.$$
This is the simplest example of the flow approach to the labor market, where
unemployment dynamics are determined by flows in and out of unemployment. In fact
this equation has the canonical form for change in unemployment in the flow approach.
The left-hand side is the change in unemployment (which can be in either discrete or
continuous time), while the right-hand side consists of the job destruction rate (in
this case $s$) multiplied by $(1 - U_t)$ minus the rate at which workers leave
unemployment (in this case $(1 - s)(1 - F(R))$) multiplied by $U_t$.
The unique steady-state unemployment rate where $U_{t+1} = U_t$ is given by
$$U = \frac{s}{s + (1 - s)(1 - F(R))}.$$
This is again the canonical formula of the flow approach. The steady-state
unemployment rate is equal to the job destruction rate (here the rate at which workers
die, $s$) divided by the job destruction rate plus the job creation rate (here in fact the
rate at which workers leave unemployment, which is different from the job creation
rate). Clearly, an increase in $s$ will raise steady-state unemployment. Moreover, an
increase in $R$, that is, a higher reservation wage, will also depress job creation and
increase unemployment.
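The convergence of the law of motion to the steady state can be illustrated by iterating the difference equation. In this sketch the death rate $s = 0.02$ and the job acceptance probability $1 - F(R) = 0.3$ are assumed values chosen only for illustration, as is the function name `unemployment_path`.

```python
def unemployment_path(u0, s, accept_prob, periods):
    """Iterate U_{t+1} = s + (1 - s) * F(R) * U_t, where F(R) = 1 - accept_prob."""
    path = [u0]
    for _ in range(periods):
        path.append(s + (1.0 - s) * (1.0 - accept_prob) * path[-1])
    return path

s, accept_prob = 0.02, 0.3      # assumed death rate and acceptance probability 1 - F(R)
path = unemployment_path(u0=1.0, s=s, accept_prob=accept_prob, periods=200)
steady_state = s / (s + (1.0 - s) * accept_prob)
print(path[:5], path[-1], steady_state)
```

Starting from full unemployment, the path falls monotonically towards the steady-state formula, at a geometric rate given by $(1 - s) F(R)$.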
3. Aside on Riskiness and Mean Preserving Spreads
To investigate the effect of changes in the distribution of wages on the reservation
wage, let us introduce the concept of mean preserving spreads. Loosely speaking, a
mean preserving spread is a change in distribution that increases risk. Let a family
of distributions over some set $X \subset \mathbb{R}$ with generic element $x$ be denoted by $F(x, r)$,
where $r$ is a shift variable, which changes the distribution function. An example
will be $F(x, r)$ to stand for mean zero normal variables, with $r$ parameterizing the
variance of the distribution. In fact, the normal distribution is special in the sense
that the mean and the variance completely describe the distribution, so the notion
of risk can be captured by the variance. This is generally not true. The notion of
“riskier” is a more stringent notion than having a greater variance. In fact, we will
see that “riskier than” is a partial order (while, clearly, comparing variances is a
complete order).
Here is a natural definition of one distribution being riskier than another, first
introduced by Blackwell, and then by Rothschild and Stiglitz.
Definition 10.1. $F(x, r)$ is less risky than $F(x, r')$, written as $F(x, r) \succeq_R F(x, r')$,
if for all concave and increasing $u : \mathbb{R} \to \mathbb{R}$, we have
$$\int_X u(x) \, dF(x, r) \geq \int_X u(x) \, dF(x, r').$$
At some level, it may be a more intuitive definition of “riskiness” to require that
$F(x, r)$ and $F(x, r')$ have the same mean, i.e., $\int_X x \, dF(x, r) = \int_X x \, dF(x, r')$,
while still $F(x, r) \succeq_R F(x, r')$. However, whether we do this or not is not important
for our focus.
A related definition is that of second-order stochastic dominance.
Definition 10.2. $F(x, r)$ second order stochastically dominates $F(x, r')$, written
as $F(x, r) \succeq_{SD} F(x, r')$, if
$$\int_{-\infty}^{c} F(x, r) \, dx \leq \int_{-\infty}^{c} F(x, r') \, dx, \quad \text{for all } c \in X.$$
In other words, this definition requires the distribution function of $F(x, r)$ to
start lower and always keep a lower integral than that of $F(x, r')$. One easy case
where this will be satisfied is when both distribution functions have the same mean
and they intersect only once (“single crossing”), with $F(x, r)$ cutting $F(x, r')$ from
below.
The definitions above use weak inequalities. Alternatively, they can be
strengthened to strict inequalities. In particular, the first definition would require a strict
inequality for functions that are strictly concave over some range, while the second
definition will require strict inequality for some $c$.
Theorem 10.1. (Blackwell, Rothschild and Stiglitz) $F(x, r) \succeq_R F(x, r')$
if and only if $F(x, r) \succeq_{SD} F(x, r')$.
Therefore, there is an intimate link between second-order stochastic dominance
and the notion of riskiness. This also shows that variance is not a good measure of
riskiness, since second order stochastic dominance is a partial order.
Now mean preserving spreads are essentially equivalent to second-order
stochastic dominance with the additional restriction that both distributions have the
same mean. As the term suggests, a mean preserving spread is equivalent to taking
a given distribution and shifting some of the weight from around the mean to the
tails. Alternative representations also include one distribution being obtained from
the other by adding “white noise” to the other.
Second-order stochastic dominance plays a very important role in the theory of
learning, and also more generally in the theory of decision-making under uncertainty.
Here it will be useful for comparative statics.
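Definition 10.2 is straightforward to check for discrete distributions. The sketch below assumes both distributions are defined on the same equally spaced grid, so that the integral of each (step-function) distribution function is a running sum; the function name `second_order_dominates` and the example distributions are illustrative assumptions.

```python
import numpy as np

def second_order_dominates(points, probs_a, probs_b, tol=1e-12):
    """True if A second-order stochastically dominates B (same equally spaced grid)."""
    points = np.asarray(points, dtype=float)
    step = points[1] - points[0]
    cdf_a = np.cumsum(probs_a)
    cdf_b = np.cumsum(probs_b)
    # The CDFs are step functions, so area[k] is the integral of the CDF up to
    # points[k] + step. Both integrals are piecewise linear with kinks on the grid,
    # so comparing them at these points is enough.
    area_a = np.cumsum(cdf_a) * step
    area_b = np.cumsum(cdf_b) * step
    return bool(np.all(area_a <= area_b + tol))

grid = [0.0, 1.0, 2.0]
certain = [0.0, 1.0, 0.0]    # all mass at the mean
spread = [0.5, 0.0, 0.5]     # same mean, mass moved to the tails

print(second_order_dominates(grid, certain, spread))   # certain vs. its spread
print(second_order_dominates(grid, spread, certain))   # the reverse comparison

# Theorem 10.1 in action for one concave, increasing utility function (square root).
eu_certain = float(np.dot(certain, np.sqrt(grid)))
eu_spread = float(np.dot(spread, np.sqrt(grid)))
```

Here `spread` is a mean preserving spread of `certain`, so the first comparison holds and the second does not, and expected utility under the square-root function is higher for `certain`.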
4. Back to the Basic Partial Equilibrium Search Model
Let us return to the McCall search model. To investigate the effect of changes
in the riskiness (or dispersion) of the wage distribution on reservation wages, and
thus on search and unemployment behavior, let us express the reservation wage
somewhat differently. Start with equation (10.5) above, which is reproduced here
for convenience,
$$R - b = \frac{\beta}{1 - \beta} \left[ \int_{w \geq R} (w - R) \, dF(w) \right].$$
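Rewrite this as
$$\begin{aligned}
R - b &= \frac{\beta}{1 - \beta} \left[ \int_{w \geq R} (w - R) \, dF(w) \right] + \frac{\beta}{1 - \beta} \left[ \int_{w \leq R} (w - R) \, dF(w) \right] - \frac{\beta}{1 - \beta} \left[ \int_{w \leq R} (w - R) \, dF(w) \right] \\
&= \frac{\beta}{1 - \beta} (Ew - R) - \frac{\beta}{1 - \beta} \left[ \int_{w \leq R} (w - R) \, dF(w) \right],
\end{aligned}$$
where $Ew$ is the mean of the wage distribution, i.e.,
$$Ew = \int_W w \, dF(w).$$
Now rearranging this last equation, we have
$$R - b = \beta (Ew - b) - \beta \int_{w \leq R} (w - R) \, dF(w).$$
Applying integration by parts to the integral on the right-hand side, in particular,
noting that
$$\int_{w \leq R} w \, dF(w) = \int_0^R w \, dF(w) = w F(w) \Big|_0^R - \int_0^R F(w) \, dw = R F(R) - \int_0^R F(w) \, dw,$$
this equation can be rewritten as
$$R - b = \beta (Ew - b) + \beta \int_0^R F(w) \, dw. \tag{10.6}$$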
Now consider a shift from $F$ to $\tilde{F}$ corresponding to a mean preserving spread.
This implies that $Ew$ is unchanged, but by definition of a mean preserving spread
(second-order stochastic dominance), the last integral increases. Therefore, the
mean preserving spread induces a shift in the reservation wage from $R$ to $\tilde{R} > R$.
This reflects the greater option value of waiting when faced with a more dispersed
wage distribution; lower wages are already turned down, while higher wages are now
more likely.
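The effect of a mean preserving spread on the reservation wage can be seen in a two-distribution example. The sketch below solves (10.6) by bisection for discrete wage distributions with non-negative wages, using the fact that for such a distribution $\int_0^R F(w) \, dw = \sum_i p_i \max(R - w_i, 0)$. The distributions, $b = 0.5$, $\beta = 0.9$ and the function name are assumptions made for illustration.

```python
def reservation_wage_10_6(wages, probs, b, beta, tol=1e-12):
    """Solve R - b = beta*(Ew - b) + beta * integral_0^R F(w) dw by bisection."""
    mean_wage = sum(pr * w for w, pr in zip(wages, probs))
    def excess(R):
        # Integral of the CDF from 0 to R for a discrete distribution on w >= 0.
        cdf_area = sum(pr * max(R - w, 0.0) for w, pr in zip(wages, probs))
        return (R - b) - beta * (mean_wage - b) - beta * cdf_area
    # excess is increasing in R (slope 1 - beta*F(R) > 0) and changes sign on [lo, hi].
    lo, hi = min(b, min(wages)), max(b, max(wages))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

b, beta = 0.5, 0.9
R_original = reservation_wage_10_6([1.0], [1.0], b, beta)            # every offer equals 1
R_spread = reservation_wage_10_6([0.0, 2.0], [0.5, 0.5], b, beta)    # same mean, more dispersed
print(R_original, R_spread)
```

Both distributions have mean 1, but the dispersed one yields the higher reservation wage, which is the option value effect described above.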
A different way of viewing this result is that the analysis above established that
the value function $v(w)$ is convex. While Theorem 10.1 shows that concave utility
functions like less risky distributions, convex functions like more risky distributions.
5. Paradoxes of Search
The search framework is attractive especially when we want to think of a world
without a Walrasian auctioneer, or alternatively a world with “frictions”. How do
prices get determined? How do potential buyers and sellers get together? Can we
think of Walrasian equilibrium as an approximation to such a world under some
conditions?
Search theory holds the promise of potentially answering these questions, and
providing us with a framework for analysis.
5.1. The Rothschild Critique. The McCall model is an attractive starting
point. It captures the intuition that individuals may be searching for the right types
of job (e.g., jobs offering higher wages), trading off the prospects of future benefits
(high wages) for the costs of foregoing current wages.
But everything hinges on the distribution of wages, $F(w)$. Where does this
come from? Presumably somebody is offering every wage in the support of this
distribution.
The basis of the Rothschild critique is that it is difficult to rationalize the
distribution function $F(w)$ as resulting from profit-maximizing choices of firms.
Imagine that the economy consists of a mass 1 of identical workers similar to our
searching agent. On the other side, there are $N$ firms that can productively employ
workers. Imagine that firm $j$ has access to a technology such that it can employ $l_j$
workers to produce
$$y_j = x_j l_j$$
units of output (with its price normalized to one as the numeraire, so that $w$ is
the real wage). Suppose that each firm can only attract workers by posting a
single vacancy. Moreover, to simplify life, suppose that firms post a vacancy at
the beginning of the game at $t = 0$, and then do not change the wage from then on.
This will both simplify the strategies, and imply that the wage distribution will be
stationary, since all the same wages will remain active throughout time. [Can you
see why this simplifies the discussion? Imagine, for contrast, the case in which each
firm only hires one worker; then think of the wage distribution at time $t$, $F_t(w)$,
starting with some arbitrary $F_0(w)$. Will it remain constant?]
Suppose that the distribution of $x$ in the population of firms is given by $G(x)$
with support $X \subset \mathbb{R}_+$. Also assume that there is some cost $\gamma > 0$ of posting a
vacancy at the beginning, and finally, that $N \gg 1$ (i.e., $N = \int_{-\infty}^{\infty} dG(x) \gg 1$)
and each worker samples one firm from the distribution of posting firms.
As before, we will assume that once a worker accepts a job, this is permanent,
and he will be employed at this job forever. Moreover let us set $b = 0$, so that there
are no unemployment benefits. Finally, to keep the environment entirely stationary,
assume that once a worker accepts a job, a new worker is born, and starts search.
Will these firms offer a non-degenerate wage distribution $F(w)$?
The answer is no.
First, note that an endogenous wage distribution equilibrium would correspond
to a function
$$p : X \to \{0, 1\},$$
denoting whether the firm is posting a vacancy or not, and if it is, i.e., $p = 1$,
$$h : X \to \mathbb{R}_+,$$
specifying the wage it is offering.
It is intuitive that $h(x)$ should be non-decreasing (higher wages are more
attractive to high productivity firms). Let us suppose that this is so, and denote
its set-valued inverse mapping by $h^{-1}$. Then, the along-the-equilibrium path wage
distribution is
$$F(w) = \frac{\int_{-\infty}^{h^{-1}(w)} p(x) \, dG(x)}{\int_{-\infty}^{\infty} p(x) \, dG(x)}.$$
Why?
In addition, the strategies of workers can be represented by a function
$$a : \mathbb{R}_+ \to [0, 1]$$
denoting the probability that the worker will accept any wage in the “potential
support” of the wage distribution, with 1 standing for acceptance. This is general
enough to nest non-symmetric or mixed strategies.
The natural equilibrium concept is subgame perfect Nash equilibrium, whereby
the strategies of firms $(p, h)$ and those of workers, $a$, are best responses to each other
in all subgames.
The same arguments as above imply that all workers will use a reservation wage,
so
$$a(w) = \begin{cases} 1 & \text{if } w \geq R \\ 0 & \text{otherwise.} \end{cases}$$
Since all workers are identical and the equation above determining the reservation
wage, (10.5), has a unique solution, all workers will be using the same reservation
rule, accepting all wages $w \geq R$ and turning down those $w < R$. Workers' strategies
are therefore again characterized by a reservation wage $R$.
Now take a firm with productivity $x$ offering a wage $w' > R$. Its net present
value of profits from this period’s matches is
\[ \pi(p = 1, w' > R, x) = -\gamma + \frac{1}{n} \frac{x - w'}{1 - \beta}, \]
where
\[ n = \int_{-\infty}^{\infty} p(x)\, dG(x) \]
is the measure of active firms, $1/n$ is the probability of a match within each period
(since the populations of active firms and searching workers are constant), and $x - w'$
is the profit from the worker, discounted at the discount factor $\beta$.
Notice two (implicit) assumptions here: (1) wage posting: each job comes with
a commitment to a certain wage; (2) undirected search: the worker makes a random
draw from the distribution $F$, and the only way he can seek higher wages is by
turning down lower wages that he samples.
This firm can deviate and cut its wage to some value in the interval $[R, w')$. All
workers will still accept this job since its wage is at least the reservation wage, and
the firm will increase its profits to
\[ \pi(p = 1, w \in [R, w'), x) = -\gamma + \frac{1}{n} \frac{x - w}{1 - \beta} > \pi(p = 1, w', x). \]
So there should not be any wages strictly above $R$.
Next consider a firm offering a wage $\tilde{w} < R$. This wage will be rejected by all
workers, and the firm would lose the cost of posting a vacancy, i.e.,
\[ \pi(p = 1, w < R, x) = -\gamma, \]
and this firm can deviate to $p = 0$ and make zero profits. Therefore, in equilibrium,
when workers use the reservation wage rule of accepting only wages of at least $R$,
all firms will offer the same wage $R$, and there is no distribution and no search.
This establishes
Theorem 10.2. When all workers are homogeneous and engage in undirected
search, all equilibrium distributions will have a mass point at their reservation wage
R.
In fact, the paradox is even deeper.
5.2. The Diamond Paradox. The following result is one form of the Diamond
paradox:
Theorem 10.3. (Diamond Paradox) For all $\beta < 1$, the unique equilibrium
in the above economy is $R = 0$.
Given Theorem 10.2, this result is easy to understand. Theorem 10.2 implies
that all firms will offer the same wage, $R$.
Suppose $R > 0$ and $\beta < 1$. What is the optimal acceptance function, $a$, for a
worker?
If the answer is
\[ a(w) = \begin{cases} 1 & \text{if } w \ge R \\ 0 & \text{otherwise} \end{cases} \]
then we can support all firms offering $w = R$ as an equilibrium (notice that the
acceptance function needs to be defined for wages “off-the-equilibrium path”). Why
is this important?
However, we can prove:
Lemma 10.1. There exists $\varepsilon > 0$ such that when “almost all” firms are offering
$w = R$, it is optimal for each worker to use the following acceptance strategy:
\[ a(w) = \begin{cases} 1 & \text{if } w \ge R - \varepsilon \\ 0 & \text{otherwise.} \end{cases} \]
Note: think about what “almost all” means here and why it is necessary.
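Proof. If the worker accepts the wage of $R - \varepsilon$ today, his payoff is
\[ u^{\text{accept}} = \frac{R - \varepsilon}{1 - \beta}. \]
If he rejects and waits until next period, then since “almost all” firms are offering
$R$, he will receive the wage of $R$, so
\[ u^{\text{reject}} = \frac{\beta R}{1 - \beta}, \]
where the additional $\beta$ comes in because of the waiting period. For all $\beta < 1$, there
exists $\varepsilon > 0$ (any $\varepsilon < (1 - \beta) R$ will do) such that
\[ u^{\text{accept}} > u^{\text{reject}}, \]
proving the claim. $\square$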
What is the intuition for this lemma?
But this implies that, starting from an allocation where all firms offer $R$, any
firm can deviate and offer a wage of $R - \varepsilon$ and increase its profits. This proves that
no wage $R > 0$ can be the equilibrium, proving the theorem.
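The unraveling logic can be illustrated numerically. The following is only a sketch: the function names, the starting wage, and the rule of cutting by half of the largest acceptable amount are assumptions made for illustration, not part of the model.

```python
def accept_payoff(R, eps, beta):
    """Lifetime payoff from accepting the wage R - eps today (as in Lemma 10.1)."""
    return (R - eps) / (1.0 - beta)


def reject_payoff(R, beta):
    """Lifetime payoff from waiting one period and then earning R forever."""
    return beta * R / (1.0 - beta)


def unravel(R0, beta, share=0.5, tol=1e-9, max_rounds=10_000):
    """Repeatedly undercut the common wage R by eps = share * (1 - beta) * R.

    Any eps strictly below (1 - beta) * R is still accepted by workers, so each
    round is a profitable deviation; `share` in (0, 1) is an illustrative choice.
    """
    R, rounds = R0, 0
    while R > tol and rounds < max_rounds:
        eps = share * (1.0 - beta) * R
        # The worker strictly prefers accepting R - eps to waiting for R.
        assert accept_payoff(R, eps, beta) > reject_payoff(R, beta)
        R -= eps
        rounds += 1
    return R, rounds


R_final, n_rounds = unravel(R0=1.0, beta=0.95)
print(R_final, n_rounds)
```

In this sketch the common wage shrinks geometrically, so no strictly positive wage survives the sequence of deviations.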
Notice that subgame perfection is important here. We know that there are
non-subgame-perfect Nash equilibria, and this highlights the importance of using the
right equilibrium concept in the context of dynamic economies.
So now we are in a conundrum. Not only does there fail to be a wage distribution,
but irrespective of the distribution of productivities or the degree of discounting, all
firms offer the lowest possible wage, i.e., they are full monopsonists.
How do we resolve this paradox?
(1) By assumption: assume that $F(w)$ is not the distribution of wages, but
the distribution of “fruits” exogenously offered by “trees”. This is clearly
unsatisfactory, both from the modeling point of view, and from the point
of view of asking policy questions from the model (e.g., how does
unemployment insurance affect the equilibrium? The answer will depend also on
how the equilibrium wage distribution changes).
(2) Introduce other dimensions of heterogeneity: to be done later.
(3) Modify the wage determination assumptions: to be done in a little bit.
CHAPTER 11
Basic Equilibrium Search Framework
1. Motivation
Importance of labor market flows, job creation, job destruction.
Need for a framework that can be used for equilibrium analysis, but allows for
unemployment $\implies$ equilibrium search models.
More reduced form than a partial equilibrium model in order to avoid the
“paradoxes” mentioned above.
2. The Basic Search Model
Now we discuss the basic search-matching model, sometimes called the flow
approach to the labor market.
Here the basic idea is that there are frictions in the labor market, making it
costly (time-consuming) for workers to find firms and vice versa. This will lead
to what is commonly referred to as “frictional unemployment”. However, as soon
as there are these types of frictions, there are also quasi-rents in the relationship
between firms and workers, and there will be room for rent-sharing. In the basic
search model, the main reason for high unemployment may not be the time costs
of finding partners, but bargaining between firms and workers, which leads to
non-market-clearing equilibrium prices.
Here is a simple version of the basic search model.
The first important object is the matching function, which gives the number
of matches between firms and workers as a function of the number of unemployed
workers and the number of vacancies.
Matching Function: Matches $= x(U, V)$
This function captures the frictions inherent in the process of assigning workers
to jobs in a very reduced-form way. This reduced-form structure is its advantage
and disadvantage. It is difficult to have microfoundations for this function, but it is
very tractable, fairly easy to map to data (at least to data on job flows and worker
flows), and captures the intuitive notion that job finding rates for workers should
depend on how many unemployed workers are chasing how many vacancies.
Of course the form of the matching function will also depend on what the time
horizon is.
Following our treatment of the Shapiro-Stiglitz model, we will work with
continuous time, so we should think of $x(U, V)$ as the flow rate of matches.
We typically assume that this matching function exhibits constant returns to
scale (CRS), that is,
\[ \text{Matches} = xL = x(uL, vL) \implies x = x(u, v). \]
Here we have adopted the usual notation:
$U$ = unemployment;
$u$ = unemployment rate;
$V$ = vacancies;
$v$ = vacancy rate (per worker in labor force);
$L$ = labor force.
Existing aggregate evidence suggests that the assumption of $x$ exhibiting CRS
is reasonable (Blanchard and Diamond, 1989).
Using the constant returns assumption, we can express everything as a function
of the tightness of the labor market.
Therefore,
\[ q(\theta) \equiv \frac{x}{v} = x\left(\frac{u}{v}, 1\right), \]
where $\theta \equiv v/u$ is the tightness of the labor market.
Since we are in continuous time, these things immediately map to flow rates.
Namely,
$q(\theta)$: Poisson arrival rate of a match for a vacancy
$q(\theta)\theta$: Poisson arrival rate of a match for an unemployed worker
What does Poisson mean?
Take a short period of time $\Delta t$; then the Poisson process is defined such that
during this time interval, the probability that there will be one arrival, for example
one arrival of a job for a worker, is
\[ \Delta t\, q(\theta)\theta. \]
The probability that there will be more than one arrival is vanishingly small
(formally, of order $o(\Delta t)$).
Therefore,
$1 - \Delta t\, q(\theta)\theta$: probability that a worker looking for a job will not find one during
$\Delta t$
This probability depends on $\theta$, thus leading to a potential externality: the search
behavior of others affects my own job finding rate.
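A small numerical sketch of these objects, assuming a Cobb-Douglas matching function $x(U, V) = m U^{a} V^{1-a}$. The functional form, the parameter values, and all function names are illustrative assumptions; the text only requires constant returns to scale.

```python
import math


def matches(U, V, m=0.5, a=0.5):
    """Assumed Cobb-Douglas matching function x(U, V) = m * U**a * V**(1 - a)."""
    return m * U ** a * V ** (1.0 - a)


def q(theta, m=0.5, a=0.5):
    """Arrival rate of a match for a vacancy: q(theta) = x(1/theta, 1)."""
    return m * theta ** (-a)


def prob_no_offer_exact(theta, dt, m=0.5, a=0.5):
    """Exact Poisson probability of no arrival for a worker within dt."""
    return math.exp(-theta * q(theta, m, a) * dt)


def prob_no_offer_linear(theta, dt, m=0.5, a=0.5):
    """First-order approximation 1 - dt * theta * q(theta) used in the text."""
    return 1.0 - dt * theta * q(theta, m, a)


U, V = 3.0, 5.0
theta = V / U
print(matches(2 * U, 2 * V) / matches(U, V))     # constant returns: ratio is 2
print(q(theta), matches(U, V) / V)               # matches per vacancy
print(theta * q(theta), matches(U, V) / U)       # matches per unemployed worker
print(prob_no_offer_exact(theta, 1e-4), prob_no_offer_linear(theta, 1e-4))
```

The two probabilities differ only by terms of order $(\Delta t)^2$, which is the sense in which multiple arrivals can be ignored over a short interval.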
The search model is also sometimes called the flow approach to unemployment
because it is all about job flows, that is, about job creation and job destruction.
This is another dividing line between labor and macro. Many macroeconomists
look at data on job creation and job destruction following Davis and Haltiwanger.
Most labor economists do not look at these data. Presumably there is some
information in them.
Job creation is equal to
\[ \text{Job creation} = u\theta q(\theta) L. \]
What about job destruction?
Let us start with the simplest model of job destruction, which is basically to
treat it as “exogenous”.
Think of it as follows: firms are hit by adverse shocks, and then they decide
whether to destroy or to continue.
Adverse shock $\longrightarrow$ destroy
              $\longrightarrow$ continue
Exogenous job destruction: adverse shock $= -\infty$ with “probability” $s$.
As in the Shapiro-Stiglitz model, we will focus on steady states.
Steady State:
flow into unemployment = flow out of unemployment
Therefore, with exogenous job destruction:
\[ s(1 - u) = \theta q(\theta) u. \]
This gives the steady-state unemployment rate as
\[ u = \frac{s}{s + \theta q(\theta)}. \]
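To see the steady state as the resting point of the flows, one can integrate the flow-accounting equation $\dot{u} = s(1 - u) - \theta q(\theta) u$ forward in time. This is a sketch only: the Euler step, the parameter values, and the function name are illustrative assumptions, and `f` stands for the job finding rate $\theta q(\theta)$, held fixed.

```python
def simulate_unemployment(u0, s, f, dt=0.1, steps=2000):
    """Euler steps for du/dt = s * (1 - u) - f * u, with f = theta * q(theta)."""
    u = u0
    path = [u]
    for _ in range(steps):
        inflow = s * (1.0 - u)   # employed workers losing their jobs
        outflow = f * u          # unemployed workers finding jobs
        u += dt * (inflow - outflow)
        path.append(u)
    return path


s, f = 0.02, 0.4
u_star = s / (s + f)             # steady-state unemployment rate from the text
from_above = simulate_unemployment(0.30, s, f)
from_below = simulate_unemployment(0.01, s, f)
print(u_star, from_above[-1], from_below[-1])
```

Both paths settle at the same rate $s / (s + \theta q(\theta))$ regardless of the starting point, which is why the analysis can focus on steady states.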
This relationship is sometimes referred to as the Beveridge Curve, or the U-V
curve. It draws a downward-sloping locus of unemployment-vacancy combinations
in the U-V space that are consistent with the flow into unemployment being equal to
the flow out of unemployment. Some authors interpret shifts of this relationship as
reflecting structural changes in the labor market, but we will see that there are
many factors that might actually shift a generalized version of such a relationship.
It is a crucial equation even if you don’t like the search model. It relates the
unemployment rate to the rate at which people leave their jobs and enter unemployment
and the rate at which people leave the unemployment pool.
In a more realistic model, of course, we have to take into account the rate at
which people go to and come back from out-of-labor-force status.
Let’s next turn to the production side.
Let the output of each firm be given by a neoclassical production function
combining labor and capital:
\[ Y = AF(K, N), \]
where the production function $F$ is assumed to exhibit constant returns, $K$ is the
capital stock of the economy, and $N$ is employment (different from labor force
because of unemployment).
Defining $k \equiv K/N$ as the capital-labor ratio, we have that output per worker is
\[ \frac{Y}{N} = Af(k) \equiv AF\left(\frac{K}{N}, 1\right) \]
because of constant returns.
Two interpretations:
$\longrightarrow$ each firm is a “job” and hires one worker
$\longrightarrow$ each firm can hire as many workers as it likes
For our purposes either interpretation is fine.
Hiring: vacancy costs
$\gamma_0$: fixed cost of hiring
$r$: cost of capital
$\delta$: depreciation
The key assumption here is that capital is perfectly reversible.
As in the Shapiro-Stiglitz model, we will solve everything by using dynamic
programming, or in other words by writing the asset value equations. As there,
let us define those in terms of the present discounted values.
Namely, let
$J^V$: PDV of a vacancy
$J^F$: PDV of a “job”
$J^U$: PDV of a searching worker
$J^E$: PDV of an employed worker
More generally, we have that worker utility is $EU_0 = \int_0^\infty e^{-rt} U(c_t)\, dt$, but for
what we care about here, risk-neutrality is sufficient.
Utility $U(c) = c$, in other words, linear utility, so agents are risk-neutral.
A perfect capital market gives the asset value for a vacancy (in steady state) as
\[ rJ^V = -\gamma_0 + q(\theta)(J^F - J^V). \]
Intuitively, there is a cost of vacancy equal to $\gamma_0$ at every instant, and the vacancy
turns into a filled job at the flow rate $q(\theta)$.
Notice that in writing this expression, we have assumed that firms are risk
neutral. Why is this important?
$\longrightarrow$ workers risk neutral, or
$\longrightarrow$ complete markets
The question is how to model job creation (which is the equivalent of how to
model labor demand in a competitive labor market).
Presumably, firms decide to create jobs when there are profit opportunities.
The simplest and perhaps the most extreme form of endogenous job creation is
to assume that there will be a firm that creates a vacancy as soon as the value of a
vacancy is positive (after all, unless there are scarce factors necessary for creating
vacancies, anybody should be able to create one).
This is sometimes referred to as the free-entry assumption, because it amounts
to imposing that whenever there are potential profits they will be eroded by entry.
\[ \text{Free Entry} \implies J^V \equiv 0 \]
The most important implication of this assumption is that job creation can
happen really “fast”, except for the frictions created by matching searching
workers to searching vacancies.
Alternative would be: $\gamma_0 = \Gamma_0(V)$ or $\Gamma_1(\theta)$, so as there are more and more jobs
created, the cost of opening an additional job increases.
Free entry implies that
\[ J^F = \frac{\gamma_0}{q(\theta)}. \]
Next, we can write another asset value equation for the value of a filled job:
\[ r(J^F + k) = Af(k) - \delta k - w - s(J^F - J^V). \]
Intuitively, the firm has two assets: the fact that it is matched with a worker,
and its capital, $k$. So its asset value is $J^F + k$ (more generally, without perfect
reversibility, we would have the more general $J^F(k)$). Its return is equal to
production, $Af(k)$, and its costs are depreciation of capital and wages, $\delta k$ and $w$. Finally,
at the rate $s$, the relationship comes to an end and the firm loses $J^F$.
Perfect reversibility implies that $w$ does not depend on the firm’s choice of
capital
$\implies$ equilibrium capital utilization $Af'(k) = r + \delta$ (Modified Golden Rule)
[...Digression: Suppose $k$ is not perfectly reversible; then suppose that the worker
captures a fraction $\beta$ of all the output in bargaining. Then the wage depends on the
capital stock of the firm, as in the holdup models discussed before:
\[ w(k) = \beta Af(k), \qquad Af'(k) = \frac{r + \delta}{1 - \beta}; \]
capital accumulation is distorted
...]
Now, ignoring this digression,
\[ Af(k) - (r + \delta)k - w - \frac{(r + s)}{q(\theta)} \gamma_0 = 0. \]
Now returning to the worker side, the risk neutrality of workers gives
\[ rJ^U = z + \theta q(\theta)(J^E - J^U), \]
where $z$ is unemployment benefits. The intuition for this equation is similar. We
also have
\[ rJ^E = w + s(J^U - J^E). \]
Solving these equations we obtain
\[ rJ^U = \frac{(r + s) z + \theta q(\theta) w}{r + s + \theta q(\theta)} \]
\[ rJ^E = \frac{s z + [r + \theta q(\theta)] w}{r + s + \theta q(\theta)}. \]
How are wages determined? Nash bargaining.
Why do we need bargaining? Answer: bilateral monopoly, or much more
specifically: match-specific surplus.
Think of a competitive labor market: at the margin the firm is indifferent
between employing the marginal worker or not, and the worker is indifferent between
supplying the marginal hour or not (or working for this firm or another firm). We
can make both parties indifferent at the same time: no match-specific surplus.
In a frictional labor market, if we choose the wage such that $J^E = 0$, we will
typically have $J^F > 0$ and vice versa. There is some surplus to be shared.
Nash solution to bargaining is again the natural benchmark. Let us assume that
the worker has bargaining power $\beta$.
rJ
F
i
= Af(k) (r + δ)k w
i
sJ
F
i
rJ
E
i
= w
i
s(J
E
i
J
U
0
).
The Nash solution will solve
max(J
E
i
J
U
)
β
(J
F
i
J
V
)
1β
β = bargaining power of the worker
Since we have linear utility, thus “transferable utility”, this implies
\[ \implies J_i^E - J^U = \beta (J_i^F + J_i^E - J^V - J^U) \]
\[ \implies w = (1 - \beta) z + \beta \left[ Af(k) - (r + \delta)k + \theta\gamma_0 \right]. \]
Here $[Af(k) - (r + \delta)k + \theta\gamma_0]$ is the quasi-rent created by a match that the firm
and workers share. Why is the term $\theta\gamma_0$ there?
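One way to trace where the wage equation comes from is sketched here, imposing free entry ($J^V = 0$) and writing $\bar{y} \equiv Af(k) - (r + \delta)k$, a shorthand introduced only for this sketch:

```latex
\begin{align*}
J^F &= \frac{\bar{y} - w}{r + s}, \qquad
J^E - J^U = \frac{w - rJ^U}{r + s}
  && \text{(asset equations for a filled job and an employed worker)} \\
(1 - \beta)(J^E - J^U) &= \beta J^F
  \;\Longrightarrow\; w = (1 - \beta)\, rJ^U + \beta \bar{y}
  && \text{(Nash sharing rule)} \\
rJ^U &= z + \theta q(\theta)(J^E - J^U)
      = z + \frac{\beta}{1 - \beta}\, \theta q(\theta) J^F
      = z + \frac{\beta}{1 - \beta}\, \theta\gamma_0
  && \text{(using } J^F = \gamma_0 / q(\theta) \text{)} \\
w &= (1 - \beta) z + \beta \left[ \bar{y} + \theta\gamma_0 \right].
\end{align*}
```

Read this way, the term $\theta\gamma_0$ enters through the worker's outside option $rJ^U$: in a tighter market an unemployed worker finds another job faster, which raises the wage he can bargain for.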
Now we are in a position to characterize the steady-state equilibrium.
Steady State Equilibrium is given by four equations
(1) The Beveridge curve:
\[ u = \frac{s}{s + \theta q(\theta)} \]
(2) Job creation leads to zero profits:
\[ Af(k) - (r + \delta)k - w - \frac{(r + s)}{q(\theta)} \gamma_0 = 0 \]
(3) Wage determination:
\[ w = (1 - \beta) z + \beta \left[ Af(k) - (r + \delta)k + \theta\gamma_0 \right] \]
(4) Modified golden rule:
\[ Af'(k) = r + \delta \]
These four equations define a block recursive system:
(4) + $r$ $\longrightarrow$ $k$
$k$ + $r$ + (2) + (3) $\longrightarrow$ $\theta, w$
$\theta$ + (1) $\longrightarrow$ $u$
Alternatively, combining three of these equations we obtain the zero-profit locus,
the VS curve, and combine it with the Beveridge curve. More specifically,
(2), (3), (4) $\implies$ the VS curve
\[ (1 - \beta)\left[ Af(k) - (r + \delta)k - z \right] - \frac{r + s + \beta\theta q(\theta)}{q(\theta)} \gamma_0 = 0. \]
Therefore, the equilibrium looks very similar to the intersection of “quasi-labor
demand” and “quasi-labor supply”.
Quasi-labor supply is given by the Beveridge curve, while labor demand is given
by the zero-profit conditions.
Given this equilibrium, comparative statics (for steady states) are straightforward.
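The block-recursive structure can be followed step by step in a small numerical sketch. Everything specific here is an assumption made for illustration: the Cobb-Douglas forms $f(k) = k^{\alpha}$ and $q(\theta) = m\theta^{-a}$, the parameter values, the bisection bracket, and the function name.

```python
def solve_steady_state(A=1.0, alpha=0.3, r=0.05, delta=0.1, s=0.1,
                       beta=0.5, z=0.2, gamma0=0.3, m=1.0, a=0.5):
    """Solve equations (1)-(4) in block-recursive order (illustrative forms)."""
    def q(theta):
        return m * theta ** (-a)

    # Step 1: modified golden rule A f'(k) = r + delta with f(k) = k**alpha.
    k = (A * alpha / (r + delta)) ** (1.0 / (1.0 - alpha))
    ybar = A * k ** alpha - (r + delta) * k   # output net of capital costs

    # Step 2: the VS curve (equations (2) and (3) combined) pins down theta.
    def vs(theta):
        return ((1.0 - beta) * (ybar - z)
                - (r + s + beta * theta * q(theta)) * gamma0 / q(theta))

    lo, hi = 1e-12, 1e6
    if not (vs(lo) > 0.0 > vs(hi)):
        raise ValueError("no interior equilibrium for these parameters")
    for _ in range(200):                      # bisection: vs is decreasing in theta
        mid = 0.5 * (lo + hi)
        if vs(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)

    # Step 3: the wage from (3), then unemployment from the Beveridge curve (1).
    w = (1.0 - beta) * z + beta * (ybar + theta * gamma0)
    u = s / (s + theta * q(theta))
    return {"k": k, "ybar": ybar, "theta": theta, "q": q(theta),
            "w": w, "u": u, "v": theta * u}


base = solve_steady_state()
print(base)
print(solve_steady_state(z=0.3)["u"] > base["u"])   # higher benefits, more unemployment
print(solve_steady_state(A=1.1)["u"] < base["u"])   # higher productivity, less unemployment
```

Re-solving with one parameter changed at a time gives a quick numerical check of the comparative statics for this particular parameterization.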
Figure 11.1
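For example:
$s \uparrow$: $U \uparrow$, $V \uparrow$, $\theta \downarrow$, $w \downarrow$
$r \uparrow$: $U \uparrow$, $V \downarrow$, $\theta \downarrow$, $w \downarrow$
$\gamma_0 \uparrow$: $U \uparrow$, $V \downarrow$, $\theta \downarrow$, $w \downarrow$
$\beta \uparrow$: $U \uparrow$, $V \downarrow$, $\theta \downarrow$, $w \uparrow$
$z \uparrow$: $U \uparrow$, $V \downarrow$, $\theta \downarrow$, $w \uparrow$
$A \uparrow$: $U \downarrow$, $V \uparrow$, $\theta \uparrow$, $w \uparrow$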
Thus, a greater exogenous separation rate, higher discount rates, higher costs
of creating vacancies, higher bargaining power of workers, and higher unemployment
benefits lead to higher unemployment. Greater productivity of jobs leads to lower
unemployment.
Interestingly, some of those, notably the greater separation rate, also increase
the number of vacancies.
Can we think of any of these factors as explaining the rise in unemployment
in Europe during the 1980s, or the lesser rise in unemployment in the 1980s in the
United States?
3. Efficiency of Search Equilibrium
Is the search equilibrium efficient? Clearly, it is inefficient relative to a first-best
alternative, e.g., a social planner that can avoid the matching frictions.
However, this is not an interesting benchmark. Much more interesting is whether
a social planner affected by exactly the same externalities as the market economy
can do better than the decentralized equilibrium.
An alternative way of asking this question is to think about externalities. In this
economy there are two externalities:
$\theta \uparrow$ $\implies$ workers find jobs more easily
(thick-market externality)
$\implies$ firms find workers more slowly
(congestion externality)
Therefore, the question of efficiency boils down to whether these two externalities
cancel each other or whether one of them dominates.
To analyze this question more systematically, consider a social planner subject
to the same constraints, intending to maximize “total surplus”, in other words,
pursuing a utilitarian objective.
First ignore discounting, i.e., $r \to 0$; then the planner’s problem can be written
as
\[ \max_{u, \theta} \; SS = (1 - u) y + u z - u\theta\gamma_0 \]
\[ \text{s.t.} \quad u = \frac{s}{s + \theta q(\theta)}, \]
where we assumed that $z$ corresponds to the utility of leisure rather than
unemployment benefits (how would this be different if $z$ were unemployment benefits?)
The form of the objective function is intuitive. For every employed worker, a
fraction $1 - u$ of the workers, the society receives an output of $y$; for every unemployed
worker, a fraction $u$ of the population, it receives $z$, and in addition for every vacancy
it pays the cost of $\gamma_0$ (and there are $u\theta$ vacancies).
The constraint on this problem is that imposed by the matching frictions, i.e., the
Beveridge curve, capturing the fact that lower unemployment can only be achieved
by creating more vacancies, i.e., higher $\theta$.
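Holding $r = 0$ turns this from a dynamic into a static optimization problem,
and it can be analyzed by forming the Lagrangian, which is
\[ \mathcal{L} = (1 - u) y + u z - u\theta\gamma_0 + \lambda \left[ u - \frac{s}{s + \theta q(\theta)} \right]. \]
The first-order conditions with respect to $u$ and $\theta$ are straightforward:
\[ (y - z) + \theta\gamma_0 = \lambda \]
\[ u\gamma_0 = \lambda s \frac{\theta q'(\theta) + q(\theta)}{(s + \theta q(\theta))^2}. \]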
Since the constraint will clearly be binding (why is this? Otherwise reduce $\theta$, and social
surplus increases), we can substitute for $u$ from the Beveridge curve, and obtain:
\[ \lambda = \frac{\gamma_0 (s + \theta q(\theta))}{\theta q'(\theta) + q(\theta)}. \]
Now substitute this into the first condition to obtain
\[ [\theta q'(\theta) + q(\theta)] (y - z) + [\theta q'(\theta) + q(\theta)]\, \theta\gamma_0 - \gamma_0 (s + \theta q(\theta)) = 0. \]
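Now simplifying and dividing through by $q(\theta)$, we obtain
\[ [1 - \eta(\theta)] [y - z] - \frac{s + \eta(\theta)\theta q(\theta)}{q(\theta)} \gamma_0 = 0, \]
where
\[ \eta(\theta) = -\frac{\theta q'(\theta)}{q(\theta)} = \frac{\partial M(U, V)}{\partial U} \frac{U}{M(U, V)} \]
is the elasticity of the matching function with respect to unemployment.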
Recall that in equilibrium, we have (with $r = 0$)
\[ (1 - \beta)(y - z) - \frac{s + \beta\theta q(\theta)}{q(\theta)} \gamma_0 = 0. \]
Comparing these two conditions we find that efficiency obtains if and only if
\[ \beta = \eta(\theta). \]
In other words, efficiency requires the bargaining power of the worker to be equal
to the elasticity of the matching function with respect to unemployment.
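The comparison of the two conditions can be checked numerically in the $r = 0$ case. This is a sketch under assumed forms: $q(\theta) = m\theta^{-a}$, so that $\eta(\theta) = a$ is constant, with illustrative parameter values and function names.

```python
def q(theta, m=1.0, a=0.5):
    """Assumed Cobb-Douglas vacancy-filling rate, so that eta(theta) = a."""
    return m * theta ** (-a)


def surplus(theta, y=1.0, z=0.2, s=0.1, gamma0=0.3, m=1.0, a=0.5):
    """Planner's objective (1 - u) y + u z - u theta gamma0 on the Beveridge curve."""
    u = s / (s + theta * q(theta, m, a))
    return (1.0 - u) * y + u * z - u * theta * gamma0


def entry_theta(share, y=1.0, z=0.2, s=0.1, gamma0=0.3, m=1.0, a=0.5):
    """Root of (1 - share)(y - z) - (s + share * theta * q) * gamma0 / q = 0.

    With share = beta this is the equilibrium condition; with share = eta = a
    it is the planner's first-order condition from the text.
    """
    def g(theta):
        qt = q(theta, m, a)
        return (1.0 - share) * (y - z) - (s + share * theta * qt) * gamma0 / qt

    lo, hi = 1e-12, 1e6
    for _ in range(200):          # bisection: g is decreasing in theta
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


theta_planner = entry_theta(0.5)      # share = eta = a = 0.5
theta_high_beta = entry_theta(0.7)    # workers' share above eta
theta_low_beta = entry_theta(0.3)     # workers' share below eta
print(theta_high_beta, theta_planner, theta_low_beta)
print(surplus(theta_high_beta), surplus(theta_planner), surplus(theta_low_beta))
```

In this example tightness is too low when $\beta > \eta$ and too high when $\beta < \eta$, and social surplus is highest at the tightness obtained with $\beta = \eta$.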
We can also note that this result is made possible by the fact that the matching
function is constant returns to scale, and efficiency would never obtain if it exhibited
increasing or decreasing returns to scale. (Why is this? How would you go about proving
this?)
The condition $\beta = \eta(\theta)$ is the famous Hosios condition. It requires the bargaining
power of a factor to be equal to the elasticity of the matching function with respect
to the corresponding factor.
What is the intuition?
It is not easy to give an intuition for this result, but here is an attempt: as a
planner you would like to increase the number of vacancies to the point where the
marginal benefit in terms of additional matches is equal to the cost. In equilibrium,
vacancies enter until the marginal benefit in terms of their bargained returns is
equal to the cost. So if $\beta$ is too high, they are getting too small a fraction of the
return, and they will not enter enough. If $\beta$ is too low, then they are getting too
much of the surplus, so there will be excess entry. The right value of $\beta$ turns out to
be the one that is equal to the elasticity of the matching function with respect to
unemployment (thus $1 - \beta$ is equal to the elasticity of the matching function with
respect to vacancies, by constant returns to scale).
Exactly the same result holds when we have discounting, i.e., $r > 0$.
In this case, the objective function is
\[ SS = \int_0^\infty e^{-rt} \left[ Ny - zN - \gamma_0\theta(L - N) \right] dt \]
and will be maximized subject to
\[ \dot{N} = q(\theta)\theta(L - N) - sN. \]
The first-order condition is
\[ y - z - \frac{r + s + \eta(\theta) q(\theta)\theta}{q(\theta)[1 - \eta(\theta)]} \gamma_0 = 0, \]
compared to the equilibrium, where
\[ (1 - \beta)[y - z] - \frac{r + s + \beta q(\theta)\theta}{q(\theta)} \gamma_0 = 0. \]
Again, $\eta(\theta) = \beta$ would decentralize the constrained efficient allocation.
At this point, you may be puzzled. Isn’t there unemployment in equilibrium?
So the equilibrium being efficient means that the social planner likes unemployment
too. This raises the question: What is the use of unemployment?
The answer to this question is quite revealing. Unemployment in fact has a social
role in this model. Its role is to facilitate trade at low transaction costs; the greater
is unemployment, the less costly it is to fill vacancies (which are in turn costly
to open). This highlights why the bargaining parameter should be related to the
elasticity of the matching function. The greater is this elasticity, the
more important it is to have more unemployed workers around to facilitate matching,
and that means a high shadow value of unemployed workers, which corresponds to
a high $\beta$ in equilibrium.
4. Endogenous Job Destruction
So far we treated the rate at which jobs get destroyed as a constant, $s$, giving
us a simple equation
\[ \dot{u} = s(1 - u) - \theta q(\theta) u. \]
But presumably thinking of job destruction as exogenous is not satisfactory.
Firms decide when to expand and contract, so it’s a natural next step to endogenize
$s$.
To do this, suppose that each firm consists of a single job (so we are now taking
a position on firm size). Also assume that the productivity of each firm consists of
two components, a common productivity and a firm-specific productivity.
In particular,
\[ \text{productivity for firm } i = \underbrace{p}_{\text{common productivity}} + \underbrace{\sigma \times \varepsilon_i}_{\text{firm-specific}}, \]
where
\[ \varepsilon_i \sim F(\cdot) \]
over the support $[\underline{\varepsilon}, \bar{\varepsilon}]$, and $\sigma$ is a parameter capturing the importance of firm-specific
shocks.
Moreover, suppose that each new job starts at $\varepsilon = \bar{\varepsilon}$, but does not necessarily
stay there. In particular, there is a new draw from $F(\cdot)$ arriving at the flow rate
$\lambda$.
To simplify the discussion, let us ignore wage determination and set
\[ w = b. \]
This then gives the following value function (written in steady state) for an active
job with productivity shock $\varepsilon$ (though this job may decide not to be active):
\[ rJ^F(\varepsilon) = p + \sigma\varepsilon - b + \lambda \left[ \int_{\underline{\varepsilon}}^{\bar{\varepsilon}} \max\{J^F(x), J^V\}\, dF(x) - J^F(\varepsilon) \right], \]
where $J^V$ is the value of a vacant job, which is what the firm becomes if it decides
to destroy. The max operator takes care of the fact that the firm has a choice after
the realization of the new shock, $x$, whether to destroy or to continue.
Since with free entry $J^V = 0$, we have
\[ rJ^F(\varepsilon) = p + \sigma\varepsilon - b + \lambda \left[ E(J^F) - J^F(\varepsilon) \right], \tag{11.1} \]
where now we write $J^F(\varepsilon)$ to denote the fact that the value of employing a worker
for a firm depends on firm-specific productivity, and
\[ E(J^F) = \int_{\underline{\varepsilon}}^{\bar{\varepsilon}} \max\{J^F(x), 0\}\, dF(x) \tag{11.2} \]
is the expected value of a job after a draw from the distribution $F(\varepsilon)$.
Given the Markov structure, the value conditional on a draw does not depend
on history.
What is the intuition for this equation?
Differentiation of (11.1) immediately gives
\[ \frac{dJ^F(\varepsilon)}{d\varepsilon} = \frac{\sigma}{r + \lambda} > 0. \tag{11.3} \]
Greater productivity gives greater value to the firm.
When will job destruction take place?
Since (11.3) establishes that $J^F$ is monotonic in $\varepsilon$, job destruction will be
characterized by a cut-off rule, i.e.,
\[ \exists\, \varepsilon_d : \varepsilon < \varepsilon_d \longrightarrow \text{destroy}. \]
Clearly, this cutoff threshold will be defined by
\[ rJ^F(\varepsilon_d) = 0. \]
But we also have $rJ^F(\varepsilon_d) = p + \sigma\varepsilon_d - b + \lambda \left[ E(J^F) - J^F(\varepsilon_d) \right]$, which yields an
equation for the value of a job after a new draw:
\[ E(J^F) = -\frac{p + \sigma\varepsilon_d - b}{\lambda} > 0. \]
This is an interesting result; it implies that since the expected value of continuation is
positive (remember equation (11.2)), the flow profits of the marginal job, $p + \sigma\varepsilon_d - b$,
must be negative. Why is this? The answer is option value. Continuing as a
productive unit means that the firm has the option of getting a better draw in the
future, which is potentially profitable. For this reason it waits until current profits
are sufficiently negative to destroy the job; in other words there is a natural form of
labor hoarding in this economy.
Furthermore, we have a tractable equation for $J^F(\varepsilon)$:
\[ J^F(\varepsilon) = \frac{\sigma}{r + \lambda} (\varepsilon - \varepsilon_d). \]
Let us now make more progress towards characterizing $E(J^F)$.
By definition, we have
\[ E(J^F) = \int_{\varepsilon_d}^{\bar{\varepsilon}} J^F(x)\, dF(x) \]
(where we have used the fact that when $\varepsilon < \varepsilon_d$, the job will be destroyed).
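Now doing integration by parts, we have
\[
\begin{aligned}
E(J^F) = \int_{\varepsilon_d}^{\bar{\varepsilon}} J^F(x)\, dF(x)
  &= J^F(x) F(x) \Big|_{\varepsilon_d}^{\bar{\varepsilon}} - \int_{\varepsilon_d}^{\bar{\varepsilon}} F(x) \frac{dJ^F(x)}{dx}\, dx \\
  &= J^F(\bar{\varepsilon}) - \frac{\sigma}{\lambda + r} \int_{\varepsilon_d}^{\bar{\varepsilon}} F(x)\, dx \\
  &= \frac{\sigma}{\lambda + r} \int_{\varepsilon_d}^{\bar{\varepsilon}} [1 - F(x)]\, dx,
\end{aligned}
\]
where the last line uses the fact that $J^F(\bar{\varepsilon}) = \frac{\sigma}{\lambda + r} (\bar{\varepsilon} - \varepsilon_d)$, and so incorporates $J^F(\bar{\varepsilon})$ into
the integral.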
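Next, we have that
\[ \underbrace{p + \sigma\varepsilon_d - b}_{\text{profit flow from marginal job}} = -\frac{\lambda\sigma}{r + \lambda} \int_{\varepsilon_d}^{\bar{\varepsilon}} [1 - F(x)]\, dx < 0 \quad \text{due to option value}, \]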
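which again highlights the hoarding result. More importantly, we have
\[ \frac{d\varepsilon_d}{d\sigma} = \frac{p - b}{\sigma} \left[ \sigma \left( \frac{r + \lambda F(\varepsilon_d)}{r + \lambda} \right) \right]^{-1} > 0 \]
(given $p > b$), which implies that when there is more dispersion of firm-specific shocks, there will
be more job destruction.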
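The cutoff can be computed explicitly once a distribution is chosen. The sketch below assumes $F$ is uniform on $[-1, 1]$ and uses illustrative parameter values with $p > b$; the function name and the bisection routine are likewise assumptions made for illustration.

```python
def solve_cutoff(p=1.0, b=0.8, sigma=2.0, r=0.05, lam=0.2, lo=-1.0, hi=1.0):
    """Destruction cutoff eps_d when F is (by assumption) uniform on [lo, hi].

    Solves p + sigma*e - b + lam*sigma/(r + lam) * int_e^hi [1 - F(x)] dx = 0.
    """
    def F(x):
        return (x - lo) / (hi - lo)

    def option_term(e):
        # Closed form of int_e^hi [1 - F(x)] dx for the uniform distribution.
        return (hi - e) ** 2 / (2.0 * (hi - lo))

    def G(e):
        return p + sigma * e - b + lam * sigma / (r + lam) * option_term(e)

    left, right = lo, hi
    if not (G(left) < 0.0 < G(right)):
        raise ValueError("no interior cutoff for these parameters")
    for _ in range(100):              # bisection: G is increasing in e
        mid = 0.5 * (left + right)
        if G(mid) < 0.0:
            left = mid
        else:
            right = mid
    e_d = 0.5 * (left + right)
    return e_d, F(e_d)


e_d, F_d = solve_cutoff()
flow_profit_marginal = 1.0 + 2.0 * e_d - 0.8     # p + sigma * eps_d - b
print(e_d, F_d, flow_profit_marginal)

# Finite-difference slope of the cutoff in sigma versus the formula in the text.
h = 1e-6
slope_fd = (solve_cutoff(sigma=2.0 + h)[0] - solve_cutoff(sigma=2.0 - h)[0]) / (2.0 * h)
slope_formula = ((1.0 - 0.8) / 2.0) / (2.0 * (0.05 + 0.2 * F_d) / (0.05 + 0.2))
print(slope_fd, slope_formula)
```

In this parameterization the marginal job runs a negative flow profit, which is the labor hoarding result, and the cutoff rises with $\sigma$ as the derivative above indicates.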
The job creation part of this economy is similar to before. In particular, since
firms enter at the productivity $\bar{\varepsilon}$, we have
\[ q(\theta) J^F(\bar{\varepsilon}) = \gamma_0 \implies \frac{\gamma_0 (r + \lambda)}{\sigma (\bar{\varepsilon} - \varepsilon_d)} = q(\theta). \]
Recall that as in the basic search model, job creation is “sluggish”, in the sense
that it is dictated by the matching function; it cannot jump, it can only increase by
investing more resources in matching.
On the other hand, job destruction is a jump variable so it has the potential to
adjust much more rapidly (this feature was emphasized a lot when search models
with endogenous job destruction first came around, because at the time the general
belief was that job destruction rates were more variable than job creation rates; now
it’s not clear whether this is true; it seems to be true in manufacturing, but not in
the whole economy).
The Beveridge curve is also different now. Flow into unemployment is also
endogenous, so in steady state we need to have
\[ \lambda F(\varepsilon_d)(1 - u) = q(\theta)\theta u. \]
In other words:
\[ u = \frac{\lambda F(\varepsilon_d)}{\lambda F(\varepsilon_d) + q(\theta)\theta}, \]
which is very similar to our Beveridge curve above, except that $\lambda F(\varepsilon_d)$ replaces $s$.
The most important implication of this is that shocks (for example to
productivity) now also shift the Beveridge curve. For example, an increase in $p$
will cause an inward shift of the Beveridge curve; so at a given level of creation,
unemployment will be lower.
How do you think endogenous job destruction affects efficiency?
5. A Two-Sector Search Model
Now consider a two-sector version of the search model, where there are skilled
and unskilled workers. In particular, suppose that the labor force consists of $L_1$ and
$L_2$ workers, i.e.,
$L_1$: unskilled workers
$L_2$: skilled workers
Firms decide whether to open a skilled vacancy or an unskilled vacancy.
\[ M_1 = x(U_1, V_1), \qquad M_2 = x(U_2, V_2): \]
the same matching function in both sectors.
Opening vacancies is costly in both markets with
$\gamma_1$: cost of vacancy for unskilled worker
$\gamma_2$: cost of vacancy for skilled worker.
As before, shocks arrive at some rate, here assumed to be exogenous and
potentially different between the two types of jobs:
$s_1, s_2$: separation rates
Finally, we allow for population growth of both skilled and unskilled workers to be able
to discuss changes in the composition of the labor force. In particular, let the rates
of population growth of $L_1$ and $L_2$ be $n_1$ and $n_2$ respectively.
$n_1, n_2$: population growth rates
This structure immediately implies that there will be two separate Beveridge
curves for unskilled and skilled workers, given by
\[ u_1 = \frac{s_1 + n_1}{s_1 + n_1 + \theta_1 q(\theta_1)}, \qquad u_2 = \frac{s_2 + n_2}{s_2 + n_2 + \theta_2 q(\theta_2)} \]
(can you explain these equations? Derive them?)
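A sketch of one derivation, assuming (this is not stated in the text) that new labor-force entrants start out unemployed, and dropping the sector subscript:

```latex
\begin{align*}
\dot{U} &= s(L - U) + nL - \theta q(\theta) U
  && \text{(separations and new entrants flow in, matches flow out)} \\
\dot{u} &= \frac{\dot{U}}{L} - n u
        = (s + n)(1 - u) - \theta q(\theta) u
  && \text{(using } u = U/L,\ \dot{L}/L = n \text{)} \\
\dot{u} = 0 &\;\Longrightarrow\; u = \frac{s + n}{s + n + \theta q(\theta)}.
\end{align*}
```

Under this assumption population growth acts like an additional separation rate: a faster-growing group has more new entrants who must first pass through unemployment.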
So different unemployment rates are due to three observable features: separation
rates, population growth and job creation rates.
The production side is largely the same as before:
output $= AF(K, N)$,
where $N$ is the effective units of labor, consisting of skilled and unskilled workers.
We assume that each unskilled worker has one unit of effective labor, while
each skilled worker has $\eta > 1$ units of effective labor.
Finally, the interest rate is still $r$ and the capital depreciation rate is $\delta$.
Asset Value Equations are as before.
For filled jobs,
\[ rJ_1^F = Af(k) - (r + \delta)k - w_1 - s_1 J_1^F \]
\[ rJ_2^F = Af(k)\eta - (r + \delta)k\eta - w_2 - s_2 J_2^F. \]
While for vacancies, we have
\[ rJ_1^V = -\gamma_1 + q(\theta_1)(J_1^F - J_1^V) \]
\[ rJ_2^V = -\gamma_2 + q(\theta_2)(J_2^F - J_2^V). \]
Zero profit for opening jobs in both sectors implies
\[ J_1^V = J_2^V = 0. \]
Using this, we have the value of filled jobs in the two sectors:
\[ J_1^F = \frac{\gamma_1}{q(\theta_1)} \quad \text{and} \quad J_2^F = \frac{\gamma_2}{q(\theta_2)}. \]
The worker side is also identical, especially since workers don’t have a choice affecting
their status. In particular,
\[ rJ_1^U = z + \theta_1 q(\theta_1)(J_1^E - J_1^U) \]
\[ rJ_2^U = z + \theta_2 q(\theta_2)(J_2^E - J_2^U), \]
where we have assumed the unemployment benefit is equal for both groups (this is
not important; what’s important is that unemployment benefits are not proportional
to equilibrium wages).
Finally, the value of being employed for the two types of workers is
\[ rJ_i^E = w_i - s_i (J_i^E - J_i^U). \]
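The structure of the equilibrium is similar to before; in particular, the modified
golden rule and the two wage equations are:
\[ Af'(k) = r + \delta \quad \text{(M.G.R.)} \]
\[ w_1 = (1 - \beta) z + \beta \left[ Af(k) - (r + \delta)k + \theta_1\gamma_1 \right] \]
\[ w_2 = (1 - \beta) z + \beta \left[ Af(k)\eta - (r + \delta)k\eta + \theta_2\gamma_2 \right] \]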
The most important result here is that wage differences between skilled and unskilled
workers are compressed.
To illustrate this, let us take a simple case and suppose first that
\[ \gamma_1 = \gamma_2, \quad n_1 = n_2, \quad s_1 = s_2, \quad z = 0. \]
Thus there are no differences in costs of creating vacancies, separation rates,
unemployment benefits, and population growth rates between skilled and unskilled
workers.
Then we have
\[ u_1 > u_2. \]
Why? Let’s see:
\[ J_1^F = \frac{\gamma}{q(\theta_1)} \quad \text{and} \quad J_2^F = \frac{\gamma}{q(\theta_2)} \]
\[ J_2^F > J_1^F \implies \theta_1 < \theta_2 \implies u_1 > u_2. \]
High-skill jobs yield higher rents, so everything else equal firms will be keener to
create these types of jobs, and the only thing that will equate their marginal profits
is a slower rate of finding skilled workers, i.e., a lower rate of unemployment for
skilled than unskilled workers.
There are also other reasons for higher unemployment for unskilled workers.
Also, $s_1 > s_2$, but lately $n_1 < n_2$, so the recent fall in $n_1$ and increase in $n_2$ should
have helped unskilled unemployment.
But $z$ has more impact on unskilled wages.
$\eta \uparrow$ $\implies$ “skill-biased” technological change
$\implies$ $u_1$ = cst, $w_1$ = cst
$u_2 \downarrow$, $w_2 \uparrow$
A set of interesting eects happen when r are endogenous. What are they?
Suppose w e hav e η , this implies that demand for capital goes up, and this will
increase the inter est rate, i.e., r
The increase in the interest rate will cause
u
1
,w
1
.
What about labor force participation? Can this model explain non-participation?
Suppose that workers have outside opportunities distributed in the population, and they decide to take these outside opportunities if the market is not attractive enough. Suppose that there are $N_1$ and $N_2$ unskilled and skilled workers in the population. Each unskilled worker has an outside option drawn from a distribution $G_1(v)$, while the same distribution is $G_2(v)$ for skilled workers. In summary:
$G_1(v)$, $N_1$: unskilled
$G_2(v)$, $N_2$: skilled
Given $v$, the worker has a choice between $J^U_i$ and $v$.
Clearly, only those unskilled workers with
$$J^U_1 \geq v$$
will participate, and only skilled workers with
$$J^U_2 \geq v$$
(why are we using the values of unemployed workers and not employed workers?)
Since $L_1$ and $L_2$ are irrelevant to the steady-state labor market equilibrium above (because of constant returns to scale), the equilibrium equations are unchanged. Then,
$$L_1 = N_1\int_0^{J^U_1} dG_1(v)$$
$$L_2 = N_2\int_0^{J^U_2} dG_2(v).$$
$$\eta\uparrow,\ r\uparrow \implies u_1\uparrow,\ w_1\downarrow \implies J^U_1\downarrow \implies \text{unskilled participation falls}$$
(consistent with Juhn, Murphy and Topel's findings on US labor markets in the 1980s).
But this mechanism requires an interest rate response. Was the interest rate higher in the '80s?
Alternative formulation: the skilled do the unskilled jobs and there are not so many jobs (demand??). This takes us to the next topic.
CHAPTER 12
Composition of Jobs
Search models, and more generally models with frictional labor markets, also provide a useful perspective for thinking about the endogenous composition of jobs. The "composition of jobs" here refers to the quality distribution of jobs. For example, some jobs may involve higher quality or newer vintage machines or more physical capital, and the same worker will be more productive in these jobs than in others with lower quality machines or less physical capital. An investigation of the composition of jobs is interesting in part because this is one of the main margins in which labor markets may have different degrees of success in achieving an efficient allocation. For example, depending on labor market institutions or other features of the environment, the equilibrium may or may not involve the "appropriate" allocation of workers to firms, or the creation of the right types of jobs.
1. Endogenous Composition of Jobs with Homogeneous Workers
Let us start with the simplest setup, in which workers are homogeneous, but they can be employed in two different types of jobs. Labor and capital are used to produce two non-storable intermediate goods that are then sold in a competitive market and immediately transformed into the final consumption good. Preferences of all agents are defined over the final consumption good alone. Let us normalize the price of the final good to 1.
There is a continuum of identical workers with measure normalized to 1. All workers are infinitely lived and risk-neutral. They derive utility from the consumption of the unique final good and maximize the present discounted value of their utility. Time is continuous and the discount rate of workers is equal to $r$. On the other side of the market, there is a larger continuum of firms that are also risk-neutral with discount rate $r$.
The technology of production for the final good is:
$$Y = \left(\alpha Y_b^{\rho} + (1-\alpha)Y_g^{\rho}\right)^{1/\rho} \tag{12.1}$$
where $Y_g$ is the aggregate production of the first input, $Y_b$ is the aggregate production of the second input, and $\rho < 1$. The elasticity of substitution between $Y_g$ and $Y_b$ is $1/(1-\rho)$, and $\alpha$ parameterizes the relative importance of $Y_b$. The subscripts $g$ and $b$ refer to "good" and "bad" jobs, as will become clear shortly. This formulation captures the idea that there is some need for diversity in overall consumption/production, and is also equivalent to assuming that (12.1) is the utility function defined over the two goods.
Since the two intermediate goods are sold in competitive markets, their prices are:
$$p_b = \alpha Y_b^{\rho-1}Y^{1-\rho}, \qquad p_g = (1-\alpha)Y_g^{\rho-1}Y^{1-\rho}. \tag{12.2}$$
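As a quick numerical illustration of (12.1) and (12.2), the following minimal sketch computes the two competitive prices and checks the adding-up property $p_b Y_b + p_g Y_g = Y$ implied by constant returns. The function names and the parameter values are illustrative assumptions, not taken from the text.

```python
def ces_output(y_b, y_g, alpha, rho):
    """Final-good output Y from (12.1); requires rho != 0."""
    return (alpha * y_b**rho + (1 - alpha) * y_g**rho) ** (1 / rho)


def ces_prices(y_b, y_g, alpha, rho):
    """Competitive input prices (p_b, p_g) from (12.2), i.e. marginal products."""
    y = ces_output(y_b, y_g, alpha, rho)
    p_b = alpha * y_b ** (rho - 1) * y ** (1 - rho)
    p_g = (1 - alpha) * y_g ** (rho - 1) * y ** (1 - rho)
    return p_b, p_g


# Assumed illustrative values: the good input is scarcer than the bad input.
alpha, rho = 0.4, 0.5
y_b, y_g = 0.6, 0.3
p_b, p_g = ces_prices(y_b, y_g, alpha, rho)
y = ces_output(y_b, y_g, alpha, rho)

# With constant returns, payments to the two inputs exhaust final output.
assert abs(p_b * y_b + p_g * y_g - y) < 1e-9
print(p_b, p_g)
```

With these particular numbers the scarcer good input commands the higher price, which is the configuration the text focuses on.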
The technology of production for the inputs is Leontief. When matched with a firm with the necessary equipment (capital $k_b$ or $k_g$), a worker produces 1 unit of the respective good. The equipment required to produce the first input costs $k_g$, while the cost of equipment for the second input is $k_b$. Let us assume that
$$k_g > k_b.$$
Before we move to the search economy, it is useful to consider the perfectly competitive benchmark. Since $k_g > k_b$, in equilibrium we will have
$$p_g > p_b.$$
But firms hire workers at the common wage, $w$, irrespective of their sector. Thus, there will be neither wage differences nor bad nor good jobs. Also, since the first welfare theorem applies to this economy, the composition of output will be optimal.
Given the setup so far, we can obtain the main idea before presenting the detailed analysis. As soon as we enter the world of search, there will be some rent-sharing. This implies that a worker who produces a higher valued output will receive a higher wage. As noted above, because $k_g > k_b$, the input which costs more to produce will command a higher price, thus in equilibrium $p_g > p_b$. Rent-sharing, then, leads to equilibrium wage differentials across identical workers. That is, $w_g > w_b$. Hence, the terms good and bad jobs. Next, it is intuitive that since, compared to the economy with competitive labor markets, good jobs have higher relative labor costs, their relative production will be less than optimal. In other words, the proportion of good (high-wage) jobs will be too low compared to what a social planner would choose. The rest of this section will formally analyze the search economy and establish these claims. It will then demonstrate that higher minimum wages and more generous unemployment benefits will improve the composition of jobs and possibly welfare.
1.1. The Technology of Search. As in the canonical search model, firms and workers come together via a matching technology $M(u, v)$, where $u$ is the unemployment rate and $v$ is the vacancy rate (the number of vacancies). Once again, we assume that search is undirected, thus both types of vacancies have the same probability of meeting workers, and it is the total number of vacancies that enters the matching function. $M(u, v)$ is twice differentiable and increasing in its arguments and exhibits constant returns to scale. This enables us to write the flow rate of match for a vacancy as
$$\frac{M(u, v)}{v} = q(\theta),$$
where $q(\cdot)$ is a differentiable decreasing function and
$$\theta = \frac{v}{u}$$
is the tightness of the labor market. It also immediately follows from the constant returns to scale assumption that the flow rate of match for an unemployed worker is
$$\frac{M(u, v)}{u} = \theta q(\theta).$$
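In general, $q(\theta)$ and $\theta q(\theta)$ are finite, thus it takes time for workers and firms to find suitable production partners. We also make the standard Inada-type assumptions on $M(u, v)$, which ensure that $\theta q(\theta)$ is increasing in $\theta$, and that $\lim_{\theta\to\infty} q(\theta) = 0$, $\lim_{\theta\to 0} q(\theta) = \infty$, $\lim_{\theta\to 0} \theta q(\theta) = 0$ and $\lim_{\theta\to\infty} \theta q(\theta) = \infty$.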
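The matching rates above can be illustrated with a concrete functional form. The sketch below assumes a Cobb-Douglas matching function $M(u, v) = m\,u^{a}v^{1-a}$, which is one common choice satisfying constant returns and the Inada-type conditions; the text itself does not commit to this form, and the names `m`, `a` and the numbers are illustrative.

```python
def matches(u, v, m=1.0, a=0.5):
    """Assumed Cobb-Douglas matching function M(u, v) = m * u**a * v**(1 - a)."""
    return m * u**a * v ** (1 - a)


def q(theta, m=1.0, a=0.5):
    """Flow rate at which a vacancy meets a worker, q(theta) = M(u, v) / v."""
    return m * theta ** (-a)


def job_finding_rate(theta, m=1.0, a=0.5):
    """Flow rate at which an unemployed worker meets a vacancy, theta * q(theta)."""
    return theta * q(theta, m, a)


u, v = 0.08, 0.04          # illustrative unemployment and vacancy rates
theta = v / u              # labor market tightness

# Constant returns to scale: both rates depend on (u, v) only through theta.
assert abs(matches(u, v) / v - q(theta)) < 1e-12
assert abs(matches(u, v) / u - job_finding_rate(theta)) < 1e-12

# q is decreasing and theta * q(theta) is increasing in tightness.
assert q(2.0) < q(1.0)
assert job_finding_rate(2.0) > job_finding_rate(1.0)
```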
All jobs end at the exogenous flow rate $s$, and in this case, the firm becomes an unfilled vacancy and the worker becomes unemployed. Finally, there is free entry into both good and bad job vacancies, therefore both types of vacancies should expect zero net profits.
Let us denote the flow return from unemployment by $z$, which will be thought of as the level of unemployment benefit financed by lump-sum taxation. As usual, we assume that wages are determined by asymmetric Nash Bargaining where the worker has bargaining power $\beta$. Nash Bargaining per se is not essential, though rent-sharing is crucial for the results.
Firms can choose either one of two types of vacancies: (i) a vacancy for intermediate good 1, a good job; (ii) a vacancy for intermediate good 2, a bad job. Therefore, before opening a vacancy a firm has to decide which input it will produce, and at this point, it will have to buy the equipment that costs either $k_b$ or $k_g$. The important aspect is that these creation costs are incurred before the firm meets its employees; this is a reasonable assumption, since, in practice, $k$ corresponds to the costs of machinery, which are sector and occupation specific.
1.2. The Basic Bellman Equations. As usual, we will solve the model via a series of Bellman equations. We denote the discounted value of a vacancy by $J^V$, of a filled job by $J^F$, of being unemployed by $J^U$ and of being employed by $J^E$. We will use subscripts $b$ and $g$ to denote bad and good jobs. We also denote the proportion of bad job vacancies among all vacancies by $\phi$. Then, in steady state:
$$rJ^U = z + \theta q(\theta)\left[\phi J^E_b + (1-\phi)J^E_g - J^U\right] \tag{12.3}$$
Being unemployed is similar to holding an asset; this asset pays a dividend of $z$, the unemployment benefit, and has a probability $\theta q(\theta)\phi$ of being transformed into a bad job, in which case the worker obtains $J^E_b$, the asset value of being employed in a bad job, and loses $J^U$; it also has a probability $\theta q(\theta)(1-\phi)$ of being transformed into a good job, yielding a capital gain $J^E_g - J^U$ (out of steady state, $\dot{J}^U$ has to be added to the right-hand side to capture future changes in the value of unemployment). Observe that this equation is written under the implicit assumption that workers will not turn down jobs, which we will discuss further below. The steady state discounted present value of employment can be written as:
$$rJ^E_i = w_i + s\left(J^U - J^E_i\right) \tag{12.4}$$
for $i = b, g$. (12.4) has a similar intuition to (12.3).
Similarly, when matched, both vacancies produce 1 unit of their goods, so:
$$rJ^F_i = p_i - w_i + s\left(J^V_i - J^F_i\right) \tag{12.5}$$
$$rJ^V_i = q(\theta)\left(J^F_i - J^V_i\right) \tag{12.6}$$
for $i = b, g$, where we have ignored the possibility of voluntary job destruction, which will never take place in steady state.
Since workers and firms are risk-neutral and have the same discount rate, Nash Bargaining implies that $w_b$ and $w_g$ will be chosen so that:
$$(1-\beta)\left(J^E_b - J^U\right) = \beta\left(J^F_b - J^V_b\right) \tag{12.7}$$
$$(1-\beta)\left(J^E_g - J^U\right) = \beta\left(J^F_g - J^V_g\right)$$
Note that an important feature is already incorporated in these expressions: workers cannot pay to be employed in high wage jobs. Due to search frictions, at the moment a worker finds a job, there is bilateral monopoly, and this leads to rent-sharing over the surplus of the match.
As there is free entry on the firm side, it should not be possible for an additional vacancy to open and make expected net profits. Hence:
$$J^V_i = k_i. \tag{12.8}$$
Finally, the steady state unemployment rate is given by equating flows out of unemployment to the number of destroyed jobs. Thus:
$$u = \frac{s}{s + \theta q(\theta)}. \tag{12.9}$$
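The flow-balance logic behind (12.9) can be checked directly: in steady state, job destruction $s(1-u)$ must equal job finding $\theta q(\theta)u$. A minimal sketch, assuming a Cobb-Douglas form for $q(\theta)$ (an assumption made only for illustration) and illustrative parameter values:

```python
def q(theta, m=1.0, a=0.5):
    """Assumed Cobb-Douglas vacancy-filling rate q(theta) = m * theta**(-a)."""
    return m * theta ** (-a)


def steady_state_unemployment(theta, s, m=1.0, a=0.5):
    """Steady-state unemployment rate from (12.9)."""
    return s / (s + theta * q(theta, m, a))


s, theta = 0.1, 0.8                      # illustrative separation rate and tightness
u = steady_state_unemployment(theta, s)

inflow = s * (1 - u)                     # jobs destroyed per unit time
outflow = theta * q(theta) * u           # unemployed workers finding jobs
assert abs(inflow - outflow) < 1e-12

# A tighter labor market lowers steady-state unemployment.
assert steady_state_unemployment(1.0, s) < steady_state_unemployment(0.5, s)
```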
1.3. Characterization of Steady State Equilibria. A steady state equilibrium is defined as a proportion $\phi$ of bad jobs, tightness of the labor market $\theta$, value functions $J^V_b, J^F_b, J^E_b, J^V_g, J^F_g, J^E_g$ and $J^U$, and prices for the two goods, $p_b$ and $p_g$, such that equations (12.2), (12.3), (12.4), (12.5), (12.6), (12.7) and (12.8) for both $i = b$ and $g$ are satisfied. The steady state unemployment rate is then given by (12.9).¹
In steady state, both types of vacancies meet workers at the same rate, and in equilibrium workers accept both types of jobs, therefore $Y_b = (1-u)\phi$ and $Y_g = (1-u)(1-\phi)$. Then, from (12.2), the prices of the two inputs can be written as:
$$p_g = (1-\alpha)(1-\phi)^{\rho-1}\left[\alpha\phi^{\rho} + (1-\alpha)(1-\phi)^{\rho}\right]^{\frac{1-\rho}{\rho}} \tag{12.10}$$
$$p_b = \alpha\phi^{\rho-1}\left[\alpha\phi^{\rho} + (1-\alpha)(1-\phi)^{\rho}\right]^{\frac{1-\rho}{\rho}}.$$
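One implication of (12.10) worth checking is that the employment level $1-u$ drops out of prices: only the composition $\phi$ matters. The sketch below compares prices computed from the output levels via (12.2) with prices computed from $\phi$ alone via (12.10). Function names and parameter values are illustrative assumptions.

```python
def prices_from_outputs(y_b, y_g, alpha, rho):
    """Prices (p_b, p_g) from (12.2), given the two output levels."""
    y = (alpha * y_b**rho + (1 - alpha) * y_g**rho) ** (1 / rho)
    p_b = alpha * y_b ** (rho - 1) * y ** (1 - rho)
    p_g = (1 - alpha) * y_g ** (rho - 1) * y ** (1 - rho)
    return p_b, p_g


def prices_from_phi(phi, alpha, rho):
    """Prices (p_b, p_g) from (12.10), given only the share of bad jobs phi."""
    agg = alpha * phi**rho + (1 - alpha) * (1 - phi) ** rho
    scale = agg ** ((1 - rho) / rho)
    p_b = alpha * phi ** (rho - 1) * scale
    p_g = (1 - alpha) * (1 - phi) ** (rho - 1) * scale
    return p_b, p_g


alpha, rho, phi = 0.4, -1.0, 0.6          # rho <= 0: gross complements
for u in (0.05, 0.20):                    # two different unemployment rates
    direct = prices_from_outputs((1 - u) * phi, (1 - u) * (1 - phi), alpha, rho)
    reduced = prices_from_phi(phi, alpha, rho)
    assert abs(direct[0] - reduced[0]) < 1e-9
    assert abs(direct[1] - reduced[1]) < 1e-9
```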
Simple algebra using (12.4), (12.5), (12.7) and (12.8) gives:
$$w_i = \beta\left(p_i - rk_i\right) + (1-\beta)rJ^U \tag{12.11}$$
as the wage equation. Intuitively, the surplus that the firm gets is equal to the value of output, which is $p_i$, minus the flow cost of the equipment, $rk_i$. The worker gets a share $\beta$ of this, plus $(1-\beta)$ times his outside option, $rJ^U$. Using (12.5) and (12.6), the zero-profit condition (12.8) can be rewritten as:
$$\frac{q(\theta)(1-\beta)\left(p_b - rJ^U\right)}{r + s + (1-\beta)q(\theta)} = rk_b \tag{12.12}$$
$$\frac{q(\theta)(1-\beta)\left(p_g - rJ^U\right)}{r + s + (1-\beta)q(\theta)} = rk_g. \tag{12.13}$$
¹ One might wonder at this point whether a different type of equilibrium, with $J^U = J^E_b$ and workers accepting bad jobs with probability $\zeta < 1$, could exist. The answer is no. From equation (8.1), this would imply $J^V_b = J^F_b$, but in this case, firms could never recover their upfront investment costs.
A firm buys equipment that costs $k_i$, which remains idle for a while due to search frictions (i.e., because $q(\theta) < \infty$). This cost is larger for firms that buy more expensive equipment and open good jobs. They need to recover these costs in the form of higher net flow profits, i.e., $p_g - rk_g > p_b - rk_b$. From rent-sharing, this immediately implies that $w_g > w_b$. More specifically, combining (12.11), (12.12) and (12.13), we get:
$$w_g - w_b = \frac{(r+s)\beta\left(rk_g - rk_b\right)}{(1-\beta)q(\theta)} > 0 \tag{12.14}$$
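The algebra linking (12.4)-(12.8) to (12.11)-(12.14) can be verified numerically. The sketch below takes $\theta$ and $rJ^U$ as given, backs out the prices that satisfy the zero-profit conditions (12.12)-(12.13), computes wages from (12.11), and then checks that the implied value functions satisfy (12.6), (12.7) and the wage gap (12.14). All numbers, the Cobb-Douglas form of $q(\theta)$, and the variable names are illustrative assumptions.

```python
r, s, beta = 0.05, 0.10, 0.5        # discount rate, separation rate, worker share
k_b, k_g = 1.0, 2.0                 # creation costs, with k_g > k_b
theta = 0.8
q = theta ** -0.5                   # assumed Cobb-Douglas q(theta)
rJU = 0.3                           # flow value of unemployment, taken as given


def price_from_zero_profit(k):
    """Invert (12.12)/(12.13) for the price p_i consistent with zero profits."""
    return rJU + r * k * (r + s + (1 - beta) * q) / ((1 - beta) * q)


def wage(p, k):
    """Wage equation (12.11)."""
    return beta * (p - r * k) + (1 - beta) * rJU


wages = {}
for name, k in (("b", k_b), ("g", k_g)):
    p = price_from_zero_profit(k)
    w = wage(p, k)
    JF = (p - w + s * k) / (r + s)          # (12.5) solved for J^F with J^V = k
    JE_minus_JU = (w - rJU) / (r + s)       # (12.4) solved for J^E - J^U
    assert abs(r * k - q * (JF - k)) < 1e-9                            # (12.6)
    assert abs((1 - beta) * JE_minus_JU - beta * (JF - k)) < 1e-9      # (12.7)
    wages[name] = w

gap = wages["g"] - wages["b"]
formula = (r + s) * beta * (r * k_g - r * k_b) / ((1 - beta) * q)      # (12.14)
assert abs(gap - formula) < 1e-9
assert gap > 0
```

The check confirms that identical workers earn more in good jobs whenever $k_g > k_b$ and $q(\theta)$ is finite.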
Therefore, wage differences are related to the differences in capital costs and also to the average duration of a vacancy. In particular, when $q(\theta) \to \infty$, the equilibrium converges to the Walrasian limit point, and both $w_g$ and $w_b$ converge to $rJ^U$, so wage differences disappear. The reason is that in this limit point, capital investments never remain idle, thus good jobs do not need to make higher net flow profits. Also, with equal creation costs, i.e., $k_b = k_g$, wage differentials disappear again.
Finally, (12.3) gives the value of an unemployed worker as
$$rJ^U = G(\theta, \phi) \equiv \frac{(r+s)z + \beta\theta q(\theta)\left[\phi(p_b - rk_b) + (1-\phi)(p_g - rk_g)\right]}{r + s + \beta\theta q(\theta)} \tag{12.15}$$
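A short derivation of (12.15), using only the equations already stated, may help. Equation (12.4) gives $J^E_i - J^U = (w_i - rJ^U)/(r+s)$, and the wage equation (12.11) gives $w_i - rJ^U = \beta(p_i - rk_i - rJ^U)$. Substituting both into (12.3):
$$rJ^U = z + \frac{\beta\theta q(\theta)}{r+s}\left[\phi(p_b - rk_b) + (1-\phi)(p_g - rk_g) - rJ^U\right].$$
Collecting the $rJ^U$ terms on the left-hand side:
$$rJ^U\left[r + s + \beta\theta q(\theta)\right] = (r+s)z + \beta\theta q(\theta)\left[\phi(p_b - rk_b) + (1-\phi)(p_g - rk_g)\right],$$
which is (12.15) after dividing through by $r + s + \beta\theta q(\theta)$.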
It can easily be verified that $G(\cdot, \cdot)$ is continuous, strictly increasing in $\theta$, and strictly decreasing in $\phi$. Intuitively, as the tightness of the labor market, $\theta$, increases, workers find jobs faster, thus $rJ^U$ is higher. Also, as $\phi$ decreases, the greater fraction of good jobs among vacancies increases the value of being unemployed, since $w_g > w_b$ (i.e., $J^E_g > J^E_b$). The dependence of $rJ^U$ on $\phi$ is the general equilibrium effect mentioned in the introduction: as the composition of jobs changes, the option value of being unemployed also changes.
A steady-state equilibrium is characterized by the intersection of two loci: the bad job locus, (12.12), and the good job locus, (12.13) (both evaluated with (12.10) and (12.15) substituted in).
The next figure draws these two loci in the $\theta$-$\phi$ plane.
[Figure 12.1: the good job locus and the bad job locus in the $\theta$-$\phi$ plane, with $\phi$ ranging from 0 to 1 and the equilibrium at $(\theta^*, \phi^*)$.]
In this figure, the curve for (12.13), along which a firm that opens a good job vacancy makes zero profits, is upward sloping: a higher value of $\phi$ increases the left-hand side, thus $\theta$ needs to change to increase the right-hand side (and reduce the left-hand side through $G(\theta, \phi)$). Intuitively, an increase in $\phi$ implies a higher $p_g$ (from equation (12.10)). So to ensure zero profits, $\theta$ needs to increase to raise the duration of vacancies. In contrast, (12.12) cannot be shown to be decreasing everywhere. Intuitively, an increase in $\phi$ reduces $p_b$, thus requires a fall in $\theta$ to equilibrate the market, but the general equilibrium effect through $J^U$ (i.e., that a rise in $\phi$ reduces $J^U$) counteracts this and may dominate. This issue is discussed further below.
Here, let us start with the case in which $\rho \leq 0$, so that good and bad jobs are gross complements. In this case, it is straightforward to see that as $\phi$ tends to 1, (12.12) gives $\theta \to \infty$ whereas (12.13) implies $\theta \to 0$. Thus, the bad job locus is above the good job locus. The opposite is the case as $\phi$ goes to zero. Then by the continuity of the two functions, they must intersect at least once in the range $\phi \in (0, 1)$. Therefore, we can conclude that a steady state equilibrium with $\phi \in (0, 1)$ always exists and is characterized by (12.10), (12.11), (12.12), (12.13) and (12.15). In equilibrium, for all $k_g > k_b$, we have $p_g > p_b$ and $w_g > w_b$.
When $\rho > 0$, an equilibrium continues to exist, but does not need to be interior, so one of (12.12) and (12.13) may not hold. We now discuss a particular case of this.
1.4. Multiple equilibria. Since (12.12) can be upward sloping over some range, more than one intersection, hence multiple equilibria, are possible. (12.12) is more likely to be upward sloping when relative prices change little as a result of a change in the composition of jobs. Therefore, to illustrate the possibility of multiple equilibria, let us consider the extreme case where $\rho = 1$, so that goods $g$ and $b$ are perfect substitutes, and there are no relative price effects. Furthermore, we assume that
$$1 - 2\alpha > r(k_g - k_b).$$
In the absence of this assumption, good jobs are not productive enough, and will never exist in equilibrium.
With perfect substitution between good and bad jobs, prices are fixed at
$$p_g = 1 - \alpha > p_b = \alpha.$$
The equilibrium can then be characterized diagrammatically. To do this, totally differentiate (12.12) and (12.13), with $p_g = 1 - \alpha$ and $p_b = \alpha$, which gives
$$\left.\frac{d\theta}{d\phi}\right|_i = \frac{\dfrac{\partial G(\theta,\phi)}{\partial\phi}}{\dfrac{rk_i(r+s)(1-\beta)q'(\theta)}{\left[(1-\beta)q(\theta)\right]^2} - \dfrac{\partial G(\theta,\phi)}{\partial\theta}} > 0 \tag{12.16}$$
where $i = b$ is the zero-profit condition for bad jobs, (12.12), and $i = g$ is the zero-profit condition for good jobs, (12.13). The derivative in (12.16) is positive, irrespective of whether it is for good or bad jobs, because $rJ^U = G(\theta, \phi)$ is decreasing in $\phi$ and increasing in $\theta$, while $q'(\theta) < 0$. Since $k_b < k_g$, this equation also immediately implies that (12.12) is steeper than (12.13). So (12.12) has to intersect (12.13) from below if at all, in which case there will be three equilibria. This is shown in the next figure.
[Figure 12.2: the good job locus and the bad job locus in the $\theta$-$\phi$ plane with multiple equilibria; the values of $\theta_b$ and $\theta_g$ are marked on the two loci at $\phi = 0$ and at $\phi = 1$.]
The first is a "mixed strategy" equilibrium at the point where the two curves intersect. The other two equilibria are more interesting. When $\phi = 0$, we have $\theta_g > \theta_b$, so that it is more profitable to open a good job. Hence there is an equilibrium in which all firms open good jobs. It is not profitable for firms to open a bad job, because when $\phi = 0$, workers receive high wages and have attractive outside options; so a firm that opens a bad job will be forced to pay a relatively high wage, making a deviation to a bad job unprofitable. In contrast, at $\phi = 1$, we have $\theta'_g < \theta'_b$, so it is an equilibrium for all firms to open bad jobs.
Intuitively, when all firms open bad jobs, the outside option of workers is low, so firms bargain to low wages, making entry relatively profitable. In equilibrium, $\theta$ has to be high to ensure zero profits. But a tight labor market (a high $\theta$) hurts good jobs relatively more since they have to make larger upfront investments. The multiplicity of equilibria in this model illustrates the strength of the general equilibrium forces that operate through the impact of job composition on the overall level of wages.
1.5. Welfare. Let us next analyze the welfare properties of equilibrium using the notion of total surplus as in the baseline search model. In this case, total surplus (in steady state) can be written as:
$$TS = (1-u)\left[\phi(p_b - rk_b) + (1-\phi)(p_g - rk_g)\right] - \theta u\left(\phi rk_b + (1-\phi)rk_g\right) \tag{12.17}$$
Total surplus is equal to the total flow of net output, which consists of the number of workers in good jobs, $(1-\phi)(1-u)$, times their net output ($p_g$ minus the flow cost of capital $rk_g$), plus the number of workers in bad jobs, $\phi(1-u)$, times their net product ($p_b - rk_b$), minus the flow costs of job creation for good and bad vacancies (respectively, $\theta u(1-\phi)rk_g$ and $\theta u\phi rk_b$).
It is straightforward to locate the set of allocations that maximize total social surplus. This set would be the solution to the maximization of (12.17) subject to (12.9). Inspecting the first-order conditions of this problem, it can be seen that decentralized equilibria will not in general belong to this set, thus a social planner can improve over the equilibrium allocation. The results regarding the socially optimal amount of job creation are standard: if $\beta$ is too high, that is, $\beta > \eta(\theta)$, where $\eta(\theta)$ is the elasticity of the matching function, $q(\theta)$, then there will be too little job creation, and if $\beta < \eta(\theta)$, there will be too much. Since our focus here is the composition of jobs, we will not discuss these issues in detail. Instead, we will show that irrespective of the value of $\theta$, the equilibrium value of $\phi$ is always too high; that is, there are too many bad jobs relative to the number of good jobs.
To prove this claim, it is sufficient to consider the derivative of $TS$ with respect to $\phi$ at $z = 0$ (note that the constraint, (12.9), does not depend on $\phi$):
$$\frac{dTS}{d\phi} = (1-u)\cdot\left[\frac{d\left(\phi p_b + (1-\phi)p_g\right)}{d\phi}\right] - (1 - u + u\theta)\cdot\left\{rk_b - rk_g\right\} \tag{12.18}$$
For the composition of jobs to be efficient at the laissez-faire equilibrium, (12.18) needs to equal zero when evaluated in the equilibrium characterized above. Some simple algebra using (12.9), (12.10), (12.12) and (12.13) to substitute out $u$ and $k_i$ gives (details of the algebra available upon request):
$$\left.\frac{dTS}{d\phi}\right|_{\text{dec. eq.}} = \frac{\theta q(\theta)}{s + \theta q(\theta)}\cdot\left(1 - \frac{(s + q(\theta))(1-\beta)}{r + s + (1-\beta)q(\theta)}\right)\cdot(p_b - p_g) < 0$$
This expression is always negative, irrespective of the value of $\theta$, so starting from the laissez-faire equilibrium, a reduction in $\phi$ will increase social surplus. Therefore, we can conclude that, given the labor market tightness $\theta$, a surplus-maximizing social planner would choose $\phi^s(\theta) < \phi^*(\theta)$, where $\phi^*(\theta)$ is the decentralized equilibrium with $z = 0$. In other words, the equilibrium proportion of bad jobs is too high.
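The sign of this derivative can be traced in a few steps, sketched here under the equations stated above. Because $\phi p_b + (1-\phi)p_g$ equals the linearly homogeneous aggregate $\left[\alpha\phi^{\rho} + (1-\alpha)(1-\phi)^{\rho}\right]^{1/\rho}$, whose partial derivatives are $p_b$ and $p_g$, we have
$$\frac{d\left(\phi p_b + (1-\phi)p_g\right)}{d\phi} = p_b - p_g.$$
Subtracting (12.13) from (12.12) gives
$$rk_b - rk_g = \frac{(1-\beta)q(\theta)}{r + s + (1-\beta)q(\theta)}\,(p_b - p_g),$$
and (12.9) gives $1 - u = \dfrac{\theta q(\theta)}{s + \theta q(\theta)}$ and $1 - u + u\theta = \dfrac{\theta\left(s + q(\theta)\right)}{s + \theta q(\theta)}$. Substituting these into (12.18):
$$\frac{dTS}{d\phi} = \frac{\theta q(\theta)}{s + \theta q(\theta)}\left(1 - \frac{(s + q(\theta))(1-\beta)}{r + s + (1-\beta)q(\theta)}\right)(p_b - p_g) = \frac{\theta q(\theta)}{s + \theta q(\theta)}\cdot\frac{r + \beta s}{r + s + (1-\beta)q(\theta)}\cdot(p_b - p_g).$$
The first two factors are strictly positive, and $p_b < p_g$ in equilibrium, so the derivative is negative for any $\theta$.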
The intuition is simple; in a decentralized equilibrium, it is always the case that $w_g > w_b$. Yet, firms do not take into account the higher utility they provide to workers by creating a good job rather than a bad job, hence there is an uninternalized positive externality, which leads to an excessively high fraction of bad jobs in equilibrium. Search and rent-sharing are crucial for this result. Search ensures that firms have to share the ex post rents with the workers, and they cannot induce competition among workers to bid down wages. Firms would ideally like to contract with their workers on the wage rate before they make the investment decision, but search also implies that they do not know who these workers will be, thus cannot contract with them at the time of investment.
1.6. The Impact of Minimum Wages and Unemployment Benefits. As is usual in models with potential multiple equilibria, only the comparative statics of "extremal" equilibria are of interest. Therefore, let us focus on an economy where in equilibrium (12.13) cuts (12.12) from below (or alternatively, an economy with a unique equilibrium). Now consider an increase in $z$, which corresponds to the UI system becoming more generous. Both the bad job locus, (12.12), and the good job locus, (12.13), will shift down. Hence, $\theta$ will definitely fall. It is also straightforward to verify that (12.12) will shift by more, therefore, $\phi$ is unambiguously reduced. Intuitively, with $\phi$ unchanged, relative prices and hence wages will be unchanged, but then with the higher unemployment benefits, workers would prefer to wait for good jobs rather than accept bad jobs. This increases $w_b$ and reduces $\phi$ (the fraction of bad jobs).
Furthermore, a more generous unemployment benefit not only increases the fraction of good jobs, but may also increase the total number of good jobs. Totally differentiating (12.12) and (12.13), we obtain that the total number of good jobs will increase if and only if:
$$w_g - w_b > \left(\frac{1}{\eta(\theta)} - 1\right)u(1-\phi)\left(\frac{d(p_g - p_b)}{d\phi}\right)$$
where recall that $\eta(\theta)$ is the elasticity of $q(\theta)$. This inequality is likely to be satisfied when the two inputs are highly substitutable, i.e., $\rho$ close to 1; when wage differences are large; when $\eta(\theta)$ is close to 1; and/or when unemployment is low to start with. Thus, it is only increases in unemployment benefit starting from moderate levels that increase the number of good jobs.
The impact on welfare depends on how large the effect on $\theta$ is relative to the effect on $\phi$. We can see this by totally differentiating (12.17) after substituting for $u$. This gives a relationship between $\theta$ and $\phi$, drawn as the dashed line in the next figure, along which total surplus is constant.
Shifts of this curve towards the North-East give higher surplus. When this curve is steeper than (12.13), a higher $z$ can improve welfare, and this is the case drawn in the figure. For example, if $\beta$ is very low to start with, then unemployment will be too low relative to the social optimum, and in this case an increase in $z$ will unambiguously increase total welfare.
More generally, irrespective of whether total surplus increases, a more generous unemployment benefit raises average labor productivity, $\phi p_b + (1-\phi)p_g$, which is unambiguously decreasing in $\phi$. Therefore, when unemployment benefits increase, the composition of jobs shifts towards more capital intensive good jobs, and labor productivity increases.
[Figure 12.3: the good job locus, the bad job locus, and the constant total surplus (TS) curve in the $\theta$-$\phi$ plane, with the equilibrium at $(\theta^*, \phi^*)$.]
A minimum wage has a similar effect on job composition. Consider a minimum wage $\underline{w}$ such that $w_b < \underline{w} < w_g$, so it is only binding for bad jobs. The equation for $J^F_b$ now becomes:
$$J^F_b = \frac{p_b - \underline{w} + sk_b}{r + s}.$$
Then, (12.12) changes to:
$$q(\theta)\,\frac{p_b - \underline{w}}{r + s + q(\theta)} = rk_b. \tag{12.19}$$
Since at a given $\theta$, the left-hand side of (12.19) is less than that of (12.12), the impact of higher minimum wages is to shift the bad job locus, curve (12.12), down. The good job locus is still given by (12.13), but now, combining (12.3) and (12.4),
$$rJ^U = G(\theta, \phi) \equiv \frac{(r+s)z + \theta q(\theta)\left[\phi\underline{w} + \beta(1-\phi)(p_g - rk_g)\right]}{r + s + \theta q(\theta)\left(1 - (1-\beta)(1-\phi)\right)}$$
Since $\underline{w} > w_b$, both curves shift down, but as in the case of unemployment benefits, (12.12) shifts down by more, so both $\phi$ and $\theta$ fall. Again, the rise in minimum wages can increase the number, not just the proportion, of good jobs and total welfare. Moreover, for the same decline in $\theta$, an increase in minimum wages reduces $\phi$ more than an increase in $z$, therefore, minimum wages appear to be more powerful in shifting the composition of employment away from bad towards good jobs.
Overall, we can conclude that both the introduction of a minimum wage $\underline{w}$ and an increase in unemployment benefit $z$ decrease $\theta$ and $\phi$. Therefore, they improve the composition of jobs and average labor productivity, but increase unemployment. The impact on overall surplus is ambiguous.
2. Endogenous Composition of Jobs with Heterogeneous Workers
Now consider a somewhat more realistic environment in which workers are also of heterogeneous skills. In particular, consider a world in which workers may have high or low skills and they have to match with firms. Firms will choose the level of their capital stock before matching with the workers. The basic idea that will be highlighted by the model is that when either the productivity gap between skilled and unskilled workers is limited or when the number of skilled workers in the labor force is small, it will be profitable for firms to create jobs that employ both skilled and unskilled workers. But when the productivity gap is large or there is a sufficient number of skilled workers, it may become profitable for (some) firms to target skilled workers, designing the jobs specifically for these workers. Then these firms will wait for the skilled workers, and will try to screen for the more skilled ones among the applicants. In the meantime, there will be lower-quality (low capital) jobs specifically targeted at the unskilled.
Suppose that there are two types of workers. The unskilled have human capital (productivity) 1, while the skilled have human capital $\eta > 1$. Denote the fraction of skilled workers in the labor force by $\phi$.
Firms choose the capital stock $k$ before they meet a worker, and matching is assumed to be random, in the sense that each firm, irrespective of its physical capital, has exactly the same probability of meeting different types of workers. Once the firm and the worker match, separating is costly, so there is a quasi-rent to be divided between the pair. Here, the economy is assumed to last for one period, so if the firm and worker do not agree they lose all of the output (see Acemoglu, 1999, for the model where the economy is infinite-horizon and agents who do not agree with their partners can resample). Therefore, bargaining will result in workers receiving a certain fraction of output, which is again denoted by $\beta$.
The production function of a pair of worker and firm is
$$y = k^{1-\alpha}h^{\alpha},$$
where $k$ is the physical capital of the firm and $h$ is the human capital of the worker.
Firms choose their capital stock to maximize profits, before knowing which type of worker will apply to their job. For simplicity, we assume that firms do not bear the cost of capital if they decide not to produce with the worker who has applied to the job. We also denote the cost of capital by $c$.
Their expected profits are therefore given by
$$\phi x^H (1-\beta)\left(k^{1-\alpha}\eta^{\alpha} - ck\right) + (1-\phi)x^L(1-\beta)\left(k^{1-\alpha} - ck\right),$$
where $x^j$ is the probability, chosen by the firm, that it will produce with a worker of type $j$ conditional on matching with that type of worker. Therefore, the first term is profits conditional on matching with a skilled worker, and the second term gives the profits from matching with an unskilled worker.
There can be two different types of equilibria in this economy:
(1) A pooling equilibrium in which firms choose a level of capital and use it with both skilled and unskilled workers. We will see that in the pooling equilibrium inequality is limited.
(2) A separating equilibrium in which firms target the skilled and choose a higher level of capital. In this equilibrium inequality will be greater.
In this one-period economy, firms never specifically target the unskilled, but that outcome arises in the dynamic version of this economy.
Now it is straightforward to characterize the firms' profit-maximizing capital choice and the resulting organization of production (whether firms will employ both skilled and unskilled workers). It turns out that firms choose the pooling strategy as long as
$$\eta < \left(\frac{1-\phi}{\phi^{\alpha} - \phi}\right)^{1/\alpha}.$$
Therefore, a sufficiently large increase in $\eta$ (the relative productivity of skilled workers) and/or in $\phi$ (the fraction of skilled workers in the labor force) switches the economy from pooling to separating.
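The pooling condition can be checked numerically. The sketch below assumes the production function $y = k^{1-\alpha}h^{\alpha}$ with $h = \eta$ for the skilled and $h = 1$ for the unskilled, and compares the firm's maximized expected profit under pooling ($x^H = x^L = 1$) with that under separating ($x^H = 1$, $x^L = 0$). Function names and parameter values are illustrative assumptions.

```python
def max_net_output(a_coef, alpha, c):
    """Closed-form maximum over k of a_coef * k**(1 - alpha) - c * k."""
    k_star = ((1 - alpha) * a_coef / c) ** (1 / alpha)
    return a_coef * k_star ** (1 - alpha) - c * k_star


def pooling_profit(eta, phi, alpha, c, beta):
    """Expected profit when the firm produces with both worker types."""
    a_coef = phi * eta**alpha + (1 - phi)
    return (1 - beta) * max_net_output(a_coef, alpha, c)


def separating_profit(eta, phi, alpha, c, beta):
    """Expected profit when the firm produces only with skilled workers.

    Capital costs are borne only when production takes place, so the whole
    bracket (output minus capital cost) is weighted by phi.
    """
    return (1 - beta) * phi * max_net_output(eta**alpha, alpha, c)


def eta_threshold(phi, alpha):
    """Skill premium below which pooling is more profitable than separating."""
    return ((1 - phi) / (phi**alpha - phi)) ** (1 / alpha)


phi, alpha, c, beta = 0.3, 0.5, 1.0, 0.4
eta_bar = eta_threshold(phi, alpha)

# Below the threshold pooling wins; above it separating wins.
assert pooling_profit(2.0, phi, alpha, c, beta) > separating_profit(2.0, phi, alpha, c, beta)
assert pooling_profit(10.0, phi, alpha, c, beta) < separating_profit(10.0, phi, alpha, c, beta)

# At the threshold the firm is indifferent between the two strategies.
gap_at_threshold = (pooling_profit(eta_bar, phi, alpha, c, beta)
                    - separating_profit(eta_bar, phi, alpha, c, beta))
assert abs(gap_at_threshold) < 1e-9
```

The threshold falls as $\phi$ rises, in line with the statement that a larger skilled share pushes the economy toward the separating configuration.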
Figure 12.4
Such a switch will be associated with important changes in the organization of production, an increase in inequality, and a decline in the wages of low-skill workers.
Is there any evidence that there has been such a change in the organization of production? This is difficult to ascertain, but some evidence suggests that there may have been some important changes in how jobs are designed and organized now.
First, firms spend much more on recruiting and screening, and are now much less happy to hire low-skill workers for jobs that they can fill with high-skill workers.
Second, as already mentioned above, the distribution of capital to labor across industries has become much more unequal over the past 25 years. This is consistent with a change in the organization of production where rather than choosing the same (or a similar) level of capital with both skilled and unskilled workers, now some firms target the skilled workers with high-capital jobs, while other firms go after unskilled workers with jobs with lower capital intensity.
Figure 12.5
Third, evidence from the CPS suggests that the distribution of jobs has changed significantly since the early 1980s, with job categories that used to pay "average wages" having declined in importance, and more jobs at the bottom and top of the wage distribution. In particular, if we classify industry-occupation cells into high-wage, middle-wage and low-wage ones (based either on wages or residual wages), there are many fewer workers employed in the middle-wage cells today as compared to the early 1980s; that is, the weight-at-the-tails of the job quality distribution has increased substantially, as the next figure shows.
This framework also suggests that there should be better "matching" between
firms and workers now, since firms are targeting high-skilled workers. Therefore,
measures of mismatch should have declined over the past 25 or so years. Consistent
with this prediction, evidence from the PSID suggests that there is much less over-
or under-education today than in the 1970s.
Figure 12.6. The evolution of the percentage of employment in the
top and bottom 25 percentile industry-occupation cells (weight-at-the-tails
of the job quality distribution).
CHAPTER 13
Wage Posting and Directed Search
1. Inefficiency of Search Equilibria with Investments
Before turning to wage posting and directed search, let us highlight a more
severe (and more fundamental) source of inefficiency in search models than the
bargaining power not satisfying the Hosios condition. This inefficiency arises in the
presence of investments.
Production still requires 1 firm - 1 worker, but now there is the intensive margin
of capital per worker. In particular, this pair produces $f(k)$, where $k$ is capital per
worker. We assume
$$f' > 0, \qquad f'' < 0.$$
The most important feature is that $k$ is to be chosen ex ante and is irreversible. The
important economic implications of this are two:
(1) If there is bargaining, at the stage of bargaining the capital is already sunk
and the capital to labor ratio is irreversibly determined.
(2) While looking for a worker, the firm incurs an opportunity cost equal to the
user cost of capital times the amount of capital that it has, i.e., $u_k \times k$, where
$u_k$ is the user cost, which will be determined below.
Trading frictions will be modeled in a way similar to before, but since my interest
here is in "inefficiency," which is easily possible with increasing or decreasing
returns to scale in the matching technology, I will assume constant returns to scale
from the beginning. I will also develop the notation that will be useful when we
look at wage posting and directed search.
First note that if $M = M(U, V)$ exhibits constant returns to scale, then, exploiting
the standard linear homogeneity properties, we can write
$$q = \frac{M}{V} = M\left(\frac{U}{V}, 1\right) = q(\theta),$$
where $\theta \equiv V/U$ is the tightness of the labor market (the vacancy to unemployment
ratio), and the function $q(\theta)$ is decreasing in $\theta$ given our assumptions above. This
means that vacancies have a harder time finding matches in a tighter labor market.
This is the standard notation in the Diamond-Mortensen-Pissarides macro search
models.
Moreover,
$$p = \frac{M}{U} = \frac{V}{U}\, M\left(\frac{U}{V}, 1\right) = \theta q(\theta),$$
where $\theta q(\theta)$ is increasing in $\theta$. This means that unemployed workers have an easier
time finding matches in a tighter labor market.
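As a quick numerical illustration of these two properties, the sketch below assumes a Cobb-Douglas matching function $M(U,V) = U^{a}V^{1-a}$ with $a = 0.5$. The functional form, the parameter value and the function names are illustrative assumptions, not part of the notes.

```python
def matches(U, V, a=0.5):
    """Hypothetical Cobb-Douglas matching function M(U, V) = U^a * V^(1-a) (CRS)."""
    return U ** a * V ** (1 - a)


def vacancy_rate(theta, a=0.5):
    """q(theta) = M(U, V) / V = M(1/theta, 1): matching rate of a vacancy."""
    return matches(1.0 / theta, 1.0, a)


def worker_rate(theta, a=0.5):
    """p(theta) = M(U, V) / U = theta * q(theta): matching rate of a worker."""
    return theta * vacancy_rate(theta, a)


thetas = [0.25, 0.5, 1.0, 2.0, 4.0]
q_values = [vacancy_rate(t) for t in thetas]
p_values = [worker_rate(t) for t in thetas]

for t, q, p in zip(thetas, q_values, p_values):
    print(f"theta={t:4.2f}  q(theta)={q:.3f}  theta*q(theta)={p:.3f}")
```

Under this assumed form, $q(\theta)$ falls and $\theta q(\theta)$ rises along the grid, as in the text.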
Now let us develop a slightly different notation. Assume that there are $Q$
workers searching for 1 job (think of the analogy to queues); $Q$ is equivalent to $1/\theta$
in the above notation.
Then with constant returns to scale, we have
$\mu(Q)$: flow rate of match for workers, assumed to be continuously differentiable,
with $\mu' < 0$;
$\eta(Q) \equiv Q\mu(Q)$: flow rate of match for a vacancy, with $\eta' > 0$.
The fact that $\mu$ and $\eta$ are simply functions of $Q$ is equivalent to assuming constant
returns to scale.
As before, let $r$ be the rate of time preference, and $s$ be the separation rate due
to destruction of capital.
Here let us change the order a little, and start with the efficient allocation, which
is again a solution to the planner's problem subject to the search constraints.
The objective function of the planner can be written as:
$$\int_0^\infty e^{-rt}\left[\mu(Q_t)\underbrace{\frac{f(k_t)-(r+s)k_t}{r+s}}_{\text{net output of a matched worker}} u_t \;-\; \underbrace{(r+s)k_t\,\frac{u_t}{Q_t}}_{\text{cost of unfilled vacancies}}\right]dt,$$
where $u_t$ is the measure of unemployed workers, or alternatively the unemployment
rate, at time $t$.
Here it is easy to see that $(r+s)k$ is the flow cost of investment, or user cost of
capital, for $k$ ($k$ is paid up front; $rk$ is the opportunity cost and $sk$ is the cost of
destruction). The planner incurs this cost for $V_t = u_t/Q_t$ vacancies.
Less obvious at first, but equally intuitive, is the value of an unemployed
worker: with probability (flow rate) $\mu(Q_t)$ he will find a job, in which case he will
produce a net output of $f(k_t) - (r+s)k_t$ until the job is destroyed, which has discounted
value $\frac{f(k_t)-(r+s)k_t}{r+s}$. Thus the value of an unemployed worker is
$$\mu(Q_t)\,\frac{f(k_t)-(r+s)k_t}{r+s}.$$
This expression already imposes that all firms will choose the same capital level,
and that there is no segmentation in the market (Homework exercise: set up and solve this
problem when the planner allows firms to choose different levels of capital).
The constraint that the planner faces is very similar to the flow constraints we
saw above:
$$\dot{u}_t = s(1-u_t) - \mu(Q_t)u_t.$$
This equation says that the evolution of unemployment is given by the flows into
unemployment, $s(1-u_t)$, and exits from unemployment, i.e., job creation, $\mu(Q_t)u_t$.
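The flow constraint can be illustrated with a minimal simulation. The sketch below holds the job-finding rate $\mu(Q)$ constant and uses illustrative values for $s$ and $\mu$; both the parameter values and the Euler discretization are assumptions made only for this example. Setting $\dot u = 0$ in the constraint gives the steady state $u = s/(s+\mu)$, which the simulated paths approach from either side.

```python
def simulate_unemployment(u0, s, mu, dt=0.01, horizon=200.0):
    """Euler steps for du/dt = s * (1 - u) - mu * u, holding mu = mu(Q) fixed."""
    u = u0
    for _ in range(int(horizon / dt)):
        u += dt * (s * (1.0 - u) - mu * u)
    return u


s, mu = 0.1, 0.6            # illustrative separation and job-finding rates
u_steady = s / (s + mu)     # value of u at which du/dt = 0
u_from_low = simulate_unemployment(0.02, s, mu)
u_from_high = simulate_unemployment(0.50, s, mu)

print(f"steady state: {u_steady:.6f}")
print(f"from u0=0.02: {u_from_low:.6f}, from u0=0.50: {u_from_high:.6f}")
```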
Now we can write the current-value Hamiltonian as
$$H(k, Q, u, \lambda) = u\left[\mu(Q)\left(\frac{f(k)}{r+s} - k\right) - \frac{(r+s)k}{Q}\right] + \lambda\left[s(1-u) - \mu(Q)u\right].$$
The necessary conditions are
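$$H_k = u\left[\mu(Q)\left(\frac{f'(k)}{r+s} - 1\right) - \frac{(r+s)}{Q}\right] = 0$$
$$H_Q = u\left[\mu'(Q)\left(\frac{f(k)}{r+s} - k - \lambda\right) + \frac{(r+s)}{Q^2}k\right] = 0$$
$$H_u = \mu(Q)\left(\frac{f(k)}{r+s} - k\right) - \frac{(r+s)}{Q}k - \lambda\left(s + \mu(Q)\right) = r\lambda - \dot{\lambda}$$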
Again, focusing on steady state, we impose $\dot{\lambda} = 0$ in the condition for $H_u$, which gives
$$\lambda = \frac{\mu(Q)\left(\frac{f(k)}{r+s} - k\right) - \frac{(r+s)}{Q}k}{r+s+\mu(Q)},$$
which is the shadow value of an unemployed worker. This equation has a very
intuitive interpretation. The shadow value of a worker is given by the probability
(flow rate) that he will create a job, which is $\mu(Q)$, and the value of the job is
$$\frac{f(k)}{r+s} - k.$$
While unemployed, the worker induces the planner to have more vacancies open (so
as to keep $Q$ constant), hence the term
$$\frac{(r+s)}{Q}k.$$
Finally, once the job is destroyed, which happens at the rate $s$, a new cycle begins,
at the rate $\mu(Q)$, which gives the denominator for discounting.
The condition that $H_k = 0$ gives
$$\frac{Q^S\mu(Q^S)f'(k^S)}{(r+s)\left(r+s+Q^S\mu(Q^S)\right)} = 1. \tag{13.1}$$
Now combining this and the value of $\lambda$ obtained above with $H_Q = 0$ gives
$$f(k^S)\,\frac{\mu'(Q^S)}{r+s} + \frac{r+s+\mu(Q^S)+Q^S\mu'(Q^S)-(Q^S)^2\mu'(Q^S)}{(Q^S)^2}\,k^S = 0. \tag{13.2}$$
Conditions (13.1) and (13.2) characterize the constrained efficient allocation.
Next, consider the equilibrium allocation. With bargaining this corresponds to:
$$rJ^F(k) = f(k) - w(k) - sJ^F(k)$$
$$rJ^V(k) = \eta(Q)\left(J^F(k) - J^V(k)\right) - sJ^V(k)$$
Recall that there is random matching, so there are $Q$ workers for each vacancy. Then I can
write
$$rJ^E(k) = w(k) + s\left(J^U - J^E(k)\right)$$
$$rJ^U = \mu(Q)\int a(k)\left(J^E(k) - J^U\right)dF(k)$$
where $a(k)$ is the decision rule of the worker on whether to match with a firm with
capital $k$, and $F(k)$ is the endogenous distribution of capital (please do not confuse
this with $f$, which is the production function).
Nash bargaining again implies:
$$(1-\beta)\left(J^E(k) - J^U\right) = \beta\left(J^F(k) - J^V(k)\right).$$
Now we will impose free entry as in the basic Mortensen-Pissarides models, so
$$J^V(k) - k = 0.$$
That is, opening a job costs $k$ (the sunk investment), and has a return of $J^V(k)$.
This implies
$$w(k) = \beta\left(f(k) - (r+s)k\right) + (1-\beta)rJ^U.$$
Now use this wage rule with $J^V$ and $J^F$:
$$J^V(k) = \frac{\eta(Q)\left((1-\beta)f(k) + \beta(r+s)k - (1-\beta)rJ^U\right)}{(r+s)\left(r+s+\eta(Q)\right)}. \tag{13.3}$$
Also recall that $\eta(Q) = Q\mu(Q)$.
How is the capital-labor ratio chosen? Firms will clearly choose it to maximize
profits: that is,
$$k \text{ maximizes } J^V(k) - k.$$
Since this is a strictly concave problem, all firms will choose
the same level of capital, $k^B$, so that
$F(k)$ is a degenerate distribution with all of its mass at $k^B$,
where
$$\frac{\eta(Q^B)(1-\beta)f'(k^B)}{(r+s)\left(r+s+(1-\beta)\eta(Q^B)\right)} = 1, \tag{13.4}$$
with $Q^B$ as the equilibrium queue length in the economy.
Now use (13.3) with $J^V$ and $J^E$ to obtain an equation determining $Q^B$:
$$\frac{\eta(Q^B)(1-\beta)f(k^B)}{r+s} = \left(r+s+(1-\beta)\eta(Q^B)+\beta\mu(Q^B)\right)k^B. \tag{13.5}$$
The equations (13.4) and (13.5) characterize the equilibrium, and can be directly
compared to the conditions (13.1) and (13.2) for the efficient allocation.
First, compare $k^S$ to $k^B$: we can see that for all $\beta > 0$, $k^B < k^S$. In other
words, there will be underinvestment as long as workers have ex post bargaining
power. This is a form of holdup, in the sense that the firm makes an investment
and the returns from the investment are shared between the worker and the firm.
Because the investment is made before there is a match, there is no feasible way of
contracting between the worker and the firm in order to avoid this holdup problem.
Thus the only way of obtaining efficiency is to set $\beta = 0$.
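To see the underinvestment result in numbers, the following sketch solves (13.1) and (13.4) in closed form for an assumed production function $f(k) = k^{\alpha}$, holding the vacancy matching rate $\eta = \eta(Q) = Q\mu(Q)$ fixed at the same value in both conditions. The functional form, the parameter values and the function names are illustrative assumptions, not part of the notes.

```python
def capital_planner(eta, r, s, alpha):
    """k^S from (13.1): eta * f'(k) / ((r+s) * (r+s+eta)) = 1, with f(k) = k**alpha."""
    rs = r + s
    return (alpha * eta / (rs * (rs + eta))) ** (1.0 / (1.0 - alpha))


def capital_bargaining(eta, r, s, alpha, beta):
    """k^B from (13.4): as (13.1), but with eta scaled by the firm's share (1 - beta)."""
    rs = r + s
    effective = (1.0 - beta) * eta
    return (alpha * effective / (rs * (rs + effective))) ** (1.0 / (1.0 - alpha))


eta_value, r, s, alpha = 0.6, 0.05, 0.10, 0.5   # illustrative values
k_planner = capital_planner(eta_value, r, s, alpha)
betas = [0.0, 0.25, 0.5, 0.75]
k_bargaining = [capital_bargaining(eta_value, r, s, alpha, b) for b in betas]

print(f"k^S = {k_planner:.4f}")
for b, k in zip(betas, k_bargaining):
    print(f"beta={b:.2f}  k^B = {k:.4f}")
```

In this example $k^B$ coincides with $k^S$ only at $\beta = 0$ and falls as the worker's bargaining share rises, which is the holdup logic described above.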
What about $Q^S$ versus $Q^B$?
To compare $Q^S$ and $Q^B$, let $f(k^B) = f(k^S)$; then we obtain that
$$\beta = \beta^*(Q) \equiv \frac{\eta'(Q)Q}{\eta(Q)} = 1 + \frac{\mu'(Q)Q}{\mu(Q)}$$
is necessary and sufficient for $Q^S = Q^B$.
In other words, with $f(k^B) = f(k^S)$, we are back to the model without investment,
so all we need is the Hosios condition for efficiency.
$$M = \mu \cdot U \implies M_U = \mu' Q + \mu \implies \frac{M_U\, U}{M} = 1 + \frac{\mu' Q}{\mu},$$
which can be verified as the Hosios condition in this case.
Thus when $f(k^B) = f(k^S)$, the Hosios condition is necessary and sufficient for
efficiency.
This is not surprising, since with $f(k^B) = f(k^S)$, the economy is identical to the
one with fixed capital.
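The Hosios share can be checked numerically. The sketch below assumes a Cobb-Douglas matching function $M = U^{a}V^{1-a}$, so that $\mu(Q) = Q^{a-1}$, and approximates derivatives by central differences; the functional form, the value of $a$ and the helper names are assumptions made for illustration. With this form both expressions for $\beta^*(Q)$ reduce to the elasticity of matches with respect to unemployment, $a$.

```python
def hosios_share(mu, Q, h=1e-6):
    """beta*(Q) = 1 + mu'(Q) * Q / mu(Q), with mu' from a central difference."""
    mu_prime = (mu(Q + h) - mu(Q - h)) / (2.0 * h)
    return 1.0 + mu_prime * Q / mu(Q)


def eta_elasticity(mu, Q, h=1e-6):
    """eta'(Q) * Q / eta(Q) for eta(Q) = Q * mu(Q), again by central difference."""
    def eta(x):
        return x * mu(x)
    eta_prime = (eta(Q + h) - eta(Q - h)) / (2.0 * h)
    return eta_prime * Q / eta(Q)


a = 0.4                                  # illustrative elasticity of M with respect to U


def mu_cobb_douglas(Q):
    """Worker matching rate M/U = (V/U)**(1-a) = Q**(a-1) under the assumed form."""
    return Q ** (a - 1.0)


share_from_mu = hosios_share(mu_cobb_douglas, 2.0)
share_from_eta = eta_elasticity(mu_cobb_douglas, 2.0)
print(share_from_mu, share_from_eta)
```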
The key question is whether it is possible to ensure both $f(k^B) = f(k^S)$ and
$Q^S = Q^B$ simultaneously.
Of course, from the analysis the answer is no.
If $\beta > 0$, there is a hold-up problem and $k^S > k^B$.
If $\beta = 0$, there is excessive entry of firms: $Q^B < Q^S$.
Theorem 13.1. Constrained efficiency is impossible with ex ante investments
and ex post bargaining.
The intuition is quite straightforward: as long as $\beta > 0$, there is rent sharing on
the marginal increase in productivity, thus hold-up. But $\beta = 0$ is inconsistent with
optimal entry.
2. The Basic Model of Directed Search
Workers do not randomly search among all possible jobs, but apply for jobs that
are more likely to be appropriate for their skills and interests. How do we model
this? And how does this change the positive and normative implications of search
models?
One way is to construct a general equilibrium model with a non-degenerate
wage distribution and then allow workers to search, perhaps in a smart way, among
these jobs. These models have the potential of leading to a coherent general
equilibrium model with sequential search. But they are rather difficult to work with.
However, when all workers are assumed to observe all possible wage offers and can
direct their search to one of these potential offers, then these models become quite
tractable. At some level, this modeling assumption removes the actual "search"
problem, but something akin to it, the coordination problem among the application
decisions of workers, is present and plays the same role.
These models are sometimes referred to as competitive search models, but it is more
useful to emphasize the two underlying assumptions: wage posting and directed
search, so we will refer to them as directed search models.
To bring out the most important points, let us start from the economic
environment of the search and investment model. Recall that in this model there are
ex ante investments by firms, and bilateral search to form productive partnerships.
In particular, recall that production requires 1 firm - 1 worker, with access to the
production function $f(k)$, where $k$ is capital per worker chosen before the matching
stage by the firm. Recall that
$$f' > 0, \qquad f'' < 0.$$
The rate of time preference is $r$, and the rate of separation due to the destruction
of capital is $s$.
We will now think of search frictions as equivalent to "coordination frictions".
In particular, if there are an average of $q$ workers per vacancy of a certain type, then
the flow rate of match for workers is $\mu(q)$, which is assumed to be continuously
differentiable with $\mu' < 0$. Similarly, the flow rate of matching for a vacancy is
$\eta(q) \equiv q\mu(q)$, where I am purposefully using the notation little $q$ to distinguish this
from the capital $Q$ before, which referred to the economy-wide queue length, whereas
$q$ is specific to a type of job.
This might seem somewhat strange: workers know what the various wages
are, but conditional on applying to a job they may not get it. This is sensible
when there is no (centralized) coordination in the economy, because too many other
people may be applying specifically to that job. The urn-ball technology captured
this in a very specific way, and in particular, we had
$$\eta(q) = 1 - \exp(-q) \quad \text{and} \quad \mu(q) = \frac{1 - \exp(-q)}{q}.$$
The technology here generalizes that.
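A minimal sketch of the urn-ball rates, using only the two formulas above, confirms on a small grid the properties assumed for the general technology: $\mu$ is decreasing in $q$, $\eta$ is increasing in $q$, and $\eta(q) = q\mu(q)$. The grid values are arbitrary and chosen only for illustration.

```python
import math


def eta(q):
    """Urn-ball rate at which a vacancy facing queue length q is filled."""
    return 1.0 - math.exp(-q)


def mu(q):
    """Urn-ball rate at which a worker in a queue of length q gets the job."""
    return (1.0 - math.exp(-q)) / q


queue_lengths = [0.25, 0.5, 1.0, 2.0, 4.0]
mu_values = [mu(q) for q in queue_lengths]
eta_values = [eta(q) for q in queue_lengths]

for q, m, e in zip(queue_lengths, mu_values, eta_values):
    print(f"q={q:4.2f}  mu(q)={m:.3f}  eta(q)={e:.3f}")
```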
As explained above, first all firms post wages $w$ and also choose their capital $k$.
Workers observe all wages and then choose which job to seek (they do not care
about capital stocks).
Now more specifically, let $q(w)$ be the ratio of workers seeking wage $w$ to firms
offering $w$. Then $\mu(q(w))$ is the flow rate of workers getting a job with wage $w$ and
$\eta(q(w))$ is the flow rate of firms filling their jobs.
What equilibrium concept should we use here? Thinking about it intuitively, it
is clear that we should ensure that workers apply to jobs that maximize utility and
anticipate queue lengths at various wages rationally. This is straightforward.
The harder part is for firms. Firms should choose wages and investment to
maximize profits, anticipating queue lengths at wages not offered in equilibrium.
The last part is very important and corresponds to subgame perfection. This is
obviously important, since we have a dynamic economy, and you can see what would
go wrong if we did not impose subgame perfection.
Before we go further, let us first write the Bellman equations, which are intuitive
and standard for the firm (again imposing steady state throughout):
$$rJ^V(w, k) = \eta(q(w))\left(J^F(w, k) - J^V(w, k)\right) - sJ^V(w, k)$$
$$rJ^F(w, k) = f(k) - w - sJ^F(w, k)$$
implying a simple equation for the value of a firm,
$$J^V(w, k) = \frac{\eta\,(f(k) - w)}{(r+s)(r+s+\eta)}, \qquad \eta = \eta(q(w)),$$
which we will use below.
The value of an employed worker is also simple:
$$rJ^E(w) = w + s\left(J^U - J^E(w)\right).$$
What is slightly more involved is the value for an unemployed worker.
Recall that unemployed workers take an important action: they decide which
job to seek. Let $J^U(w)$ be the value of an unemployed worker when seeking wage
$w$:
$$\underbrace{rJ^U(w)}_{\text{utility of applying to wage } w} = \mu(q(w))\Big[J^E(w) - \underbrace{J^U}_{\text{maximal utility of unemployment}}\Big],$$
where I have suppressed unemployment benefits without loss of any generality.
So what is $J^U$? Clearly:
$$J^U = \max_{w \in W} J^U(w),$$
where $W$ is the support of the equilibrium wage distribution.
Now this already builds in the requirement that $w$ maximizes $J^U(w)$.
Also it is clear that $w, k$ should maximize $J^V(w, k)$.
But what are the $q(w)$'s?
If we did not impose subgame perfection, then we could have crazy $q(w)$'s.
Instead, firms have to anticipate what workers would do if they deviate and
create a new wage distribution.
So off-the-equilibrium-path $q(w)$ should satisfy
$$\mu(q(w))\left[J^E(w) - J^U\right] = rJ^U,$$
or if $J^E(w) - J^U < rJ^U$, then $q(w) = 0$.
To define an equilibrium more formally, let an allocation be a tuple
$$\left\langle W, Q, K, J^U \right\rangle,$$
where $W$ is the support of the wage distribution, $Q : W \to \mathbb{R}$ is a queue length
function, $K : W \rightrightarrows \mathbb{R}$ is a capital choice correspondence, and $J^U \in \mathbb{R}$ is the
equilibrium utility of unemployed workers.
Definition 13.1. A directed search equilibrium satisfies
(1) For all $w \in W$ and $k \in K(w)$, $J^V(w, k) = 0$.
(2) For all $k$ and for all $w$, $J^V(w, k) \le 0$.
(3) $J^U = \sup_{w \in W} J^U(w)$.
(4) $Q(w)$ s.t. $\forall w$, $J^U \ge J^U(w)$, and $Q(w) \ge 0$, with complementary slackness.
In words, the first condition requires firms to make zero profits when they choose
equilibrium wages and corresponding capital stocks. The second requires that for
all other capital stock and wage combinations, profits are nonpositive. The third
condition defines $J^U$ as the maximal utility that an unemployed worker can get.
The fourth condition is the most important one. It defines queue lengths to be such
that workers are indifferent between applying to available jobs, or, if they cannot
be made indifferent, nobody applies to a particular job (thus the complementary
slackness part is very important). This builds in the notion of subgame perfection.
Now we have
Theorem 13.2. (Acemoglu and Shimer) Equilibrium $k, w, q$ maximize
$\frac{\mu(q)w}{r+s+\mu(q)}$ $(= rJ^U)$ subject to $\eta(q)\frac{f(k)-w}{r+s+\eta(q)} = (r+s)k$. And conversely,
any solution to this maximization problem can be supported as an equilibrium.
Basically what this theorem says is that the equilibrium will be such that the
utility of an unemployed worker is maximized subject to zero profit.
Proof. (sketch) Suppose not. Take $(k', w', q')$ which fails to maximize the above
program. Then another firm can offer $(k'', w'')$, where $(k^*, w^*, q^*)$ is the solution and
$w'' = w^* - \varepsilon$. For $\varepsilon$ small enough, workers prefer $(k'', w'')$ to $(k', w')$, so $q'' > q^*$,
which implies that $(k'', w'')$ makes positive profits, proving that $(k', w', q')$ cannot be an
equilibrium. $\square$
This theorem is very useful because it tells us that all we have to do is to solve
the program:
$$\max \ \frac{\mu(q)w}{r+s+\mu(q)}$$
s.t.
$$\frac{\eta(q)(f(k) - w)}{r+s+\eta(q)} = (r+s)k.$$
Is this a convex problem?
No, but let us assume differentiability (which we have so far); then first-order
conditions are necessary.
Forming the Lagrangian with multiplier $\lambda$, the first-order conditions are
$$\frac{\eta(q)f'(k)}{r+s+\eta(q)} = r+s, \tag{13.6}$$
$$\frac{\mu(q)}{r+s+\mu(q)} - \lambda\frac{\eta(q)}{r+s+\eta(q)} = 0, \tag{13.7}$$
and
$$\frac{(r+s)\mu'(q)\,w}{(r+s+\mu(q))^2} + \lambda\left(\frac{(r+s)\eta'(q)(f(k) - w)}{(r+s+\eta(q))^2}\right) = 0. \tag{13.8}$$
Figure 13.1
Now (13.6) is identical to (13.1) above, which was
$$\frac{Q^S\mu(Q^S)f'(k^S)}{r+s+Q^S\mu(Q^S)} = r+s.$$
This implies that, denoting the capital-labor ratio in the wage posting equilibrium by
$k^{wp}$,
$$k^{wp} = k^S.$$
Therefore, with wage posting, capital investments are always efficient.
Why is this? You might think this is because there is no more holdup problem,
and this is essentially true, but the intuition is a bit more subtle. In fact, there is
something like hold-up because firms that invest more in equilibrium prefer to pay
higher wages, but despite this the efficient level of investment results. The reason
is that the higher wages that they pay are exactly offset by the higher probability
that they will attract workers, so net returns are not subject to hold-up.
Next, from (13.7) we have
$$\lambda = \frac{r+s+\eta(q)}{(r+s+\mu(q))\,q}.$$
Substituting this into (13.8), and using the zero-profit constraint to solve for
$$w = f(k) - \frac{(r+s)(r+s+\eta(q))}{\eta(q)}\,k,$$
we have:
$$\mu'(q)\,q^2\,\frac{f(k)}{r+s} + \left[r+s+\mu(q)+\mu'(q)q - q^2\mu'(q)\right]k = 0,$$
which, after dividing through by $q^2$, is identical to (13.2). We have therefore established:
Theorem 13.3. The directed search equilibrium of the search and investment
model is constrained efficient.
Therefore, the equilibrium is constrained efficient (note that uniqueness is not
guaranteed, but neither was it in the social optimum).
Thus, wage posting decentralizes the efficient allocation.
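The equivalence between the directed search program and the planner's conditions can be checked numerically. The sketch below assumes the urn-ball rates, $f(k) = k^{\alpha}$ with $\alpha = 0.5$, and $r + s = 0.15$; these choices, the grid search and all function names are illustrative assumptions. For each $q$ it picks $k$ from (13.6), reads the wage off the zero-profit constraint, and maximizes the worker's objective over a grid. It then evaluates the left-hand side of (13.2) on either side of the maximizer.

```python
import math

R_PLUS_S = 0.15   # r + s, illustrative value
ALPHA = 0.5       # exponent of the assumed production function f(k) = k**ALPHA


def eta(q):
    return 1.0 - math.exp(-q)


def mu(q):
    return eta(q) / q


def mu_prime(q):
    return (q * math.exp(-q) - (1.0 - math.exp(-q))) / q ** 2


def best_capital(q):
    """k solving (13.6): eta(q) * f'(k) / (r + s + eta(q)) = r + s."""
    user_cost = R_PLUS_S * (R_PLUS_S + eta(q)) / eta(q)
    return (ALPHA / user_cost) ** (1.0 / (1.0 - ALPHA))


def wage(q):
    """Zero-profit wage: w = f(k) - (r+s)(r+s+eta(q)) k / eta(q)."""
    k = best_capital(q)
    return k ** ALPHA - R_PLUS_S * (R_PLUS_S + eta(q)) / eta(q) * k


def worker_value(q):
    """Objective of the program: r J^U = mu(q) w / (r + s + mu(q))."""
    return mu(q) * wage(q) / (R_PLUS_S + mu(q))


def planner_residual(q):
    """Left-hand side of (13.2), evaluated at (k, Q) = (best_capital(q), q)."""
    k = best_capital(q)
    bracket = R_PLUS_S + mu(q) + q * mu_prime(q) - q ** 2 * mu_prime(q)
    return k ** ALPHA * mu_prime(q) / R_PLUS_S + bracket * k / q ** 2


grid = [0.05 + 0.001 * i for i in range(5000)]
q_star = max(grid, key=worker_value)
print(f"q* = {q_star:.3f}, k* = {best_capital(q_star):.3f}, w* = {wage(q_star):.3f}")
print(planner_residual(q_star - 0.05), planner_residual(q_star + 0.05))
```

In this parameterization the residual of (13.2) changes sign around the queue length that maximizes the worker's objective, consistent with the theorem; a grid search is used only because the program need not be convex.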
How can we understand this efficiency better?
Acemoglu-Shimer consider a number of different economies:
(1) Wage posting but no directed search. Clearly, in this case things are very
bad, and we get the Diamond paradox.
(2) An economy where firms choose their own capital level, and then "post"
a bargaining parameter $\beta$, and upon matching, the firm and the worker
Nash bargain with this parameter. It can be shown that if there is no
capital choice, this economy will lead to an equilibrium in which all firms
post the Hosios $\beta$, and constrained efficiency is achieved. But if there is a
capital choice, and the only thing workers observe are the posted $\beta$'s, then
in equilibrium all firms offer the Hosios $\beta$, but there is underinvestment
because of the hold-up problem.
(3) An economy where firms choose their own capital level and workers apply
to firms observing these capital levels, and then they bargain according
to some exogenously given parameter $\beta$. In this case, the equilibrium is
inefficient and may have under- or overinvestment. If the value of $\beta$ is at
the Hosios value, then the equilibrium will be constrained efficient.
(4) An economy where firms choose their own capital level and post $\beta$, and
workers observe both $k$ and $\beta$: then the equilibrium is always constrained efficient.
So what do we learn? What is important is directed search, and especially the
ability to direct search towards higher capital intensity firms. With wage posting,
those are the high-wage firms, hence the objective is achieved. But the same outcome
is also obtained if $\beta$ is at the Hosios level, and workers observe capital levels.
Next, one might wonder whether an economy in which workers know/observe all
of the wages offered in equilibrium is too extreme (especially given our motivation
of doing away with a Walrasian auctioneer). A more plausible economy may be one
where workers observe a finite number of wages.
Interestingly, we do not need all workers to observe all the wages, as the model
with a non-degenerate wage distribution in the last lecture illustrated.
Theorem 13.4. Suppose each worker observes (can apply to) at least two of
the firms among the continuum of active firms. Then the efficient allocation is an
equilibrium of the search and investment model with directed search and wage posting.
Proof. (sketch) Suppose all firms are offering $(q^{wp}, w^{wp}, k^{wp})$. Now consider a
deviation to some other $(w', k')$. Any worker who observes $(w', k')$ has also observed
another firm offering $(w^{wp}, k^{wp})$. Since $(w^{wp}, k^{wp})$ maximizes worker utility, he will
apply to this in preference to $(w', k')$, which implies $q(w') = 0$.
Consequently, all firms will be happy to offer $(w^{wp}, k^{wp})$ and they will each
attract the queue length $q^{wp}$. $\square$
What is the intuition? Effectively Bertrand competition. Each firm knows that
it will effectively be competing with another firm offering the best possible deal to
the worker, even though, differently from the standard Bertrand model, it does not
know which particular firm this will be. Nevertheless, the Bertrand reasoning forces
each firm to go to the allocation that is best for the workers.
Note that this theorem is not stated as an "if and only if" theorem. In particular,
it can be proved that when each worker observes only two wages, there can exist
other, "non-efficient" equilibria. This last result notwithstanding,
the conclusion of this analysis is that relatively little information is required for
wage posting to decentralize the efficient allocation.
3. Risk Aversion in Search Equilibrium
The tools we developed so far can also be used to analyze general equilibrium
search with risk aversion. Let us focus on the one-period model with wage posting.
This can again be extended to the dynamic version, but explicit-form solutions
are possible only under constant absolute risk aversion (see Acemoglu-Shimer, JPE
1999).
There is a measure 1 of workers, and they all have utility $u(c)$, where the consumption of
individual $i$ is
$$C_i = A_i + y_i - \tau_i,$$
where $A_i$ is the non-labor income of the individual, $y_i$ is his labor income, equal to the
wage $w$ that he applies to and obtains if he is employed, and equal to the unemployment
benefit $z$ when unemployed. Finally, $\tau_i$ is equal to the taxes paid by this individual.
$u$ is increasing, concave and differentiable.
Let us start with a homogeneous economy where $A_i = A_0$ and $\tau_i = \tau$ for all $i$.
We also assume that firms are risk-neutral, which is natural, for example because
workers may hold a balanced mutual fund. I will only present the analysis for the
static economy here.
Timing of events:
Firms decide to enter, buy capital $k > 0$ (as before irreversible), and post
a wage $w$.
Workers observe all wage offers and decide which wage to seek (apply to).
As before, if on average there are $q$ times as many workers seeking wage $w$ as
firms offering $w$, then workers get a job with prob. $\mu(q)$.
Firms fill their vacancies with prob. $\eta(q) \equiv q\mu(q)$, with our standard
assumptions, $\mu'(q) < 0$ and $\eta'(q) > 0$.
As before, let an allocation be $\langle W, Q, K, U \rangle$, where $W$ is the support of the
wage distribution, $Q : W \to \mathbb{R}$ is a queue length function, $K : W \rightrightarrows \mathbb{R}$ is a capital
choice correspondence, and $U \in \mathbb{R}$ is the equilibrium utility of unemployed workers.
Definition 13.2. An allocation is an equilibrium iff
(1) $\forall w \in W$ and $k \in K(w)$, $\eta(Q(w))(f(k) - w) - k = 0$.
(2) $\forall w, k$, $\eta(Q(w))(f(k) - w) - k \le 0$.
(3) $U = \sup_{w \in W} \mu(Q(w))u(A + w) + (1 - \mu(Q(w)))u(A + z)$.
(4) $Q(w)$ s.t. $\forall w$, $U \ge \mu(Q(w))u(A + w) + (1 - \mu(Q(w)))u(A + z)$ and $Q(w) \ge 0$,
with complementary slackness.
As before, this builds in a type of subgame perfection on beliefs about queue lengths
after a deviation.
Characterization of equilibrium is similar to before.
Theorem 13.5. $(W, Q, K, U)$ is an equilibrium if and only if, for all $w^* \in W$,
$q^* = Q(w^*)$ and $k^* \in K(w^*)$,
$$(w^*, q^*, k^*) \in \arg\max \ \mu(q)u(A + w) + (1 - \mu(q))u(A + z)$$
s.t.
$$\eta(q)(f(k) - w) - k \ge 0.$$
In words, every equilibrium maximizes worker utility subject to zero profits, as
proved before in the context of the risk-neutral model.
The analysis is similar to before. Profit maximization implies an even simpler
condition (because the environment is static):
$$\eta(q^*)f'(k^*) = 1.$$
Zero profits gives
$$\eta(q^*)(f(k^*) - w^*) = k^*.$$
Now combining these two:
$$w^* = f(k^*) - k^* f'(k^*),$$
which you will notice is exactly the neoclassical condition that wages equal the marginal
product. Why is that?
Finally, combining this with $\eta(q^*)f'(k^*) = 1$, we can derive a relation in the
$(q, w)$ space which corresponds to the zero-profits and profit-maximization
constraints that an equilibrium has to satisfy.
An equilibrium is then a tangency point between the indifference curves of
homogeneous workers and this profit-maximization constraint, as we had in the
risk-neutral model of Acemoglu-Shimer (IER, 1999).
The equilibrium can be depicted and analyzed diagrammatically.
Notice again that uniqueness is not guaranteed.
What makes this attractive is that comparative statics can also be done in a
simple way, exploiting "revealed preference" or single crossing.
For example, suppose we have a change such that all workers become more risk-averse,
i.e., the utility function becomes more concave. What happens to the equilibrium?
We can show that as risk-aversion increases, $w$, $q$ and $k$ all decrease.
Why? Indifference curves become everywhere steeper, causing the tangency
point to shift to the left. This is unambiguous despite the fact that the equilibrium may not
be unique.
Figure 13.2
Figure 13.3
Essentially, the comparative static result is unambiguous because the $u_1$-curve
single-crosses the $u_2$-curve.
Intuition: "Market Insurance." Workers are more risk-averse, so firms offer insurance
by creating low-wage but easier-to-get jobs. Capital falls because once jobs are
easier to get for workers, vacancies remain open for longer (with higher probability),
so capital is unused for longer, reducing investment. Summarizing this:
Theorem 13.6. Consider a change from utility function $u_1$ to $u_2$, where $u_2$ is
a strictly concave transformation of $u_1$. Then if $(k_1, w_1, q_1)$ is any equilibrium with
preferences $u_1$ and $(k_2, w_2, q_2)$ is any equilibrium with preferences $u_2$, then $k_2 < k_1$,
$w_2 < w_1$ and $q_2 < q_1$.
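The market-insurance comparative static can be illustrated with a small numerical sketch of the static model. It assumes the urn-ball rates, $f(k) = 2\sqrt{k}$, CARA utility $u(c) = (1 - e^{-\theta c})/\theta$ and $A = z = 0$; these functional forms, the two values of $\theta$ and the grid search are illustrative assumptions. A higher $\theta$ is a strictly concave transformation of a lower one, so the two cases play the roles of $u_2$ and $u_1$.

```python
import math


def eta(q):
    return 1.0 - math.exp(-q)


def mu(q):
    return eta(q) / q


def constraint(q):
    """(k, w) on the firms' locus for the assumed f(k) = 2 * sqrt(k)."""
    k = eta(q) ** 2                          # from eta(q) * f'(k) = 1, f'(k) = k**-0.5
    w = 2.0 * math.sqrt(k) - k / eta(q)      # zero profits: eta(q) * (f(k) - w) = k
    return k, w


def cara(c, theta):
    """Assumed CARA utility with coefficient of absolute risk aversion theta."""
    return (1.0 - math.exp(-theta * c)) / theta


def equilibrium(theta, A=0.0, z=0.0):
    """Grid search for the point on the locus that maximizes expected utility."""
    grid = [0.01 + 0.001 * i for i in range(5000)]

    def expected_utility(q):
        _, w = constraint(q)
        return mu(q) * cara(A + w, theta) + (1.0 - mu(q)) * cara(A + z, theta)

    q = max(grid, key=expected_utility)
    k, w = constraint(q)
    return k, w, q


k1, w1, q1 = equilibrium(theta=0.5)   # less risk-averse workers (u_1)
k2, w2, q2 = equilibrium(theta=2.0)   # more risk-averse workers (u_2)
print(f"theta=0.5: k={k1:.3f}, w={w1:.3f}, q={q1:.3f}")
print(f"theta=2.0: k={k2:.3f}, w={w2:.3f}, q={q2:.3f}")
```

In this example the more risk-averse economy has shorter queues, lower wages and lower capital, in line with Theorem 13.6.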
Similarly, what happens when the unemployment benefit $z$ increases from $z_1$ to
$z_2$?
Theorem 13.7. Consider a change from unemployment benefits $z_1$ to $z_2 > z_1$.
Then if $(k_1, w_1, q_1)$ is any equilibrium with benefits $z_1$ and $(k_2, w_2, q_2)$ is any
equilibrium with benefits $z_2$, then $k_2 > k_1$, $w_2 > w_1$ and $q_2 > q_1$.
Proof. (sketch) By revealed preference,
$$\mu(q_1)\left(u(A + w_1) - u(A + z_1)\right) \ge \mu(q_2)\left(u(A + w_2) - u(A + z_1)\right)$$
$$\mu(q_2)\left(u(A + w_2) - u(A + z_2)\right) \ge \mu(q_1)\left(u(A + w_1) - u(A + z_2)\right).$$
Multiply through and simplify:
$$\left(u(A + z_1) - u(A + z_2)\right)\left(u(A + w_1) - u(A + w_2)\right) \ge 0$$
$$\implies z_1 \le z_2 \iff w_1 \le w_2.$$
All inequalities are strict since all curves are smooth. $\square$
What happens when there is heterogeneity?
Suppose that there are $s = 1, 2, \ldots, S$ types of workers, where type $s$ has utility
function $u_s$, after-tax asset level $A_s$, and unemployment benefit $z_s$. Let $U$ now be a
vector in $\mathbb{R}^S$, and assume, for simplicity. Then:
Theorem 13.8. There always exists an equilibrium. If $\{K, W, Q, U\}$ is an
equilibrium, then any $k_s \in K$, $w_s \in W$, and $q_s = Q(w_s)$ solves
$$U_s = \max_{k, w, q} \ \mu(q)u_s(A_s + w) + (1 - \mu(q))u_s(A_s + z_s)$$
subject to $\eta(q)(f(k) - w) - k = 0$ for some $s = 1, 2, \ldots, S$. If $\{k_s, w_s, q_s\}$ solves the
above program for some $s$, then there exists an equilibrium $\{K, W, Q, U\}$ such that
$k_s \in K$, $w_s \in W$, and $q_s = Q(w_s)$.
The important result here is that any triple $\{k_s, w_s, q_s\}$ that is part of an equilibrium
maximizes the utility of one group of workers, subject to firms making zero profits.
The market endogenously segments into $S$ different submarkets, each catering to the
preferences of one type of worker, and receiving applications only from that type.
The efficiency and output-maximization implications of this model are also
interesting. First, suppose that $u(\cdot)$ is linear. Then $z = \tau = 0$ maximizes output.
In particular, we have
Theorem 13.9. Suppose that $u$ is linear. Then $z = \tau = 0$ maximizes output.
Proof. (sketch) The equilibrium solves $\max \mu(q)w$ subject to $q\mu(q)(f(k) - w) = k$.
Substituting for $w$ we obtain:
$$\mu(q)f(k) - k/q \equiv y(k, q),$$
which is net output, and thus is maximized by equilibrium choices. $\square$
But an immediate corollary is that if $u(\cdot)$ is strictly concave, then the equilibrium
with $z = \tau = 0$ does not maximize output.
Theorem 13.10. Suppose that $u$ is strictly concave. Then $z = \tau = 0$ does not
maximize output.
This is an immediate corollary of the previous theorems.
Theorem 13.11. Let $u$ be an arbitrary concave utility function, let $q^e$ be the
output-maximizing level of queue length, and let $z^e$ satisfy
$$w^e = \frac{u(A_0 - \tau^e + w^e) - u(A_0 - \tau^e + z^e)}{u'(A_0 - \tau^e + w^e)}$$
and the balanced-budget condition
$$\tau^e = (1 - \mu(q^e))z^e.$$
Then the economy with unemployment benefit $z^e$ achieves an equilibrium with $q^e$ and
the maximum output.
The following figure gives the intuition:
Figure 13.4
But this is not "optimal," since when workers are risk averse, maximizing output
is not necessarily the right objective. Optimal unemployment benefits, $z^o$, should
maximize ex ante utility. Interestingly, this could be greater or less than the efficient
level of unemployment benefits, $z^e$, which maximizes output. What is the intuition
for this?