[I]n his short and broken treatise he provides an eternal example—not of laws, or
even of method, for there is no method except to be very intelligent, but of
intelligence itself swiftly operating the analysis of sensation to the point of
principle and definition.
— T. S. Eliot on Aristotle, “The Perfect Critic”, The Sacred Wood (1921)
The nice thing about standards is that you have so many to choose from;
furthermore, if you do not like any of them, you can just wait for next year’s model.
— Andrew S. Tannenbaum, Computer Networks (1981)
It is a rare mind indeed that can render the hitherto non-existent blindingly obvious.
The cry “I could have thought of that” is a very popular and misleading one, for the
fact is that they didn’t, and a very significant and revealing fact it is too.
— Dirk Gently to Richard McDuff
in Douglas Adams’ Dirk Gently’s Holistic Detective Agency (1987)
If a problem has no solution, it may not be a problem, but a fact —
not to be solved, but to be coped with over time.
— Shimon Peres, as quoted by David Rumsfeld, Rumsfeld’s Rules (2001)
12
NP-Hardness
12.1 A Game You Can’t Win
Imagine that a salesman in a red suit, who looks suspiciously like Tom Waits,
presents you with a black steel box with n binary switches on the front and
a light bulb on top. The salesman tells you that the state of the light bulb is
controlled by a complex boolean circuit—a collection of A,
O,
and N
gates
connected by wires, with one input wire for each switch and a single output
wire for the bulb. He then asks you a simple question: Is it possible to set
the switches so that the light bulb turns on? If you can answer this question
correctly, he will give you one million one hundred billion dollars; if you answer
incorrectly, or if you die without answering, he will take your soul.
x
y x x
y x∧y x∨y ¬x
Figure 12.1. An AND gate, an OR gate, and a NOT gate.
379
12. NP-HARDNESS
Figure 12.2. A boolean circuit. Inputs enter from the left, and the output leaves to the right.
As far as you can tell, the Adversary hasn’t connected the switches to the
light bulb at all, so no matter how you set the switches, the light bulb will stay
o.
If you declare that it is possible to turn on the light, the Adversary will open
the box and reveal that there is no circuit at all. But if you declare that it is
not possible to turn on the light, before testing all 2n settings, the Adversary
will magically create a circuit inside the box that turns on the light if and only
if the switches are in one of the settings you haven’t tested, and then flip the
switches to that setting, turning on the light. (You can’t detect the Adversary’s
cheating, because you can’t see inside the box until the end.) The only way
to provably answer the Adversary’s question correctly is to try all 2n possible
settings. You quickly realize that this will take far longer than you expect to
live, so you gracefully decline the Adversary’s oer.
The Adversary smiles and says, in a growl like Heath Ledger’s Joker after
smoking a carton of Marlboros, “Ah, yes, of course, you have no reason to trust
me. But perhaps I can set your mind at ease.” He hands you a large roll of
parchment—which you hope was made from sheep skin—with a circuit diagram
drawn (or perhaps tattooed) on it. “Here are the complete plans for the circuit
inside the box. Feel free to poke around inside the box to make sure the plans
are correct. Or build your own circuit from these plans. Or write a computer
program to simulate the circuit. Whatever you like. If you discover that the
plans don’t match the actual circuit in the box, you win the hundred billion
bucks.” A few spot checks convince you that the plans have no obvious flaws;
subtle cheating appears to be impossible.
But you should still decline the Adversary’s “generous” oer.
The problem
that the Adversary is posing is called circuit satisfiability or CS:
Given
a boolean circuit, is there a set of inputs that makes the circuit output T,
or
conversely, whether the circuit always outputs F.
For any particular input
setting, we can calculate the output of the circuit in polynomial (actually, linear)
time using depth-first-search. But nobody knows how to solve CS
faster
than trying all 2n possible inputs to the circuit by brute force, which requires
exponential time. Admittedly, nobody has actually formally proved that we can’t
380
12.2. P versus NP
beat brute force—maybe, just maybe, there’s a clever algorithm that just hasn’t
been discovered yet—but nobody has actually formally proved that anti-gravity
unicorns don’t exist, either. For all practical purposes, it’s safe to assume that
there is no fast algorithm for CS.
You tell the salesman no. He smiles and says, “You’re smarter than you look,
kid,” and then flies away on his anti-gravity unicorn.
12.2 P versus NP
A minimal requirement for an algorithm to be considered “ecient”
is that its
running time is bounded by a polynomial function of the input size: O(nc
) for
some constant c, where n is the size of the input.
Researchers recognized
early on that not all problems can be solved this quickly, but had a hard time
figuring out exactly which ones could and which ones couldn’t. There are
several so-called NP-hard problems, which most people believe cannot be solved
in polynomial time, even though nobody can prove a super-polynomial lower
bound.
A decision problem is a problem whose output is a single boolean value: Y
or N.
Let me define three classes of decision problems:
• P is the set of decision problems that can be solved in polynomial time.
Intuitively, P is the set of problems that can be solved quickly.
• NP is the set of decision problems with the following property: If the answer
is Y,
then there is a proof of this fact that can be checked in polynomial
time. Intuitively, NP is the set of decision problems where we can verify a
Y
answer quickly if we have the solution in front of us.
• co-NP is essentially the opposite of NP. If the answer to a problem in co-NP
is N,
then there is a proof of this fact that can be checked in polynomial
time.
For example, the circuit satisfiability problem is in NP. If a given boolean circuit
is satisfiable, then any set of m input values that produces T
output is a
proof that the circuit is satisfiable; we can check the proof by evaluating the
circuit in polynomial time. It is widely believed that circuit satisfiability is not
in P or in co-NP, but nobody actually knows.
Every decision problem in P is also in NP. If a problem is in P, we can verify
Y
answers in polynomial time recomputing the answer from scratch! Similarly,
every problem in P is also in co-NP.
This notion of eciency
was independently formalized by Alan Cobham in ,
Jack
Edmonds in ,
and Michael Rabin in ,
although similar notions were considered more
than a decade earlier by Kurt Gödel, John Nash, and John von Neumann.
381
12. NP-HARDNESS
Perhaps the single most important unanswered question in theoretical
computer science—if not all of computer science—if not all of science—is
whether the complexity classes P and NP are actually dierent.
Intuitively,
it seems obvious to most people that P 6= NP; the homeworks and exams in
your algorithms and data structures classes have (I hope) convinced you that
problems can be incredibly hard to solve, even when the solutions are simple
in retrospect. It’s completely obvious; of course solving problems from scratch
is harder than verifying that a given solution is correct. We can reasonably
accept—and most algorithm designers do accept—the statement “P 6= NP” as a
law of nature, similar to other laws of nature like Maxwell’s equations, general
relativity, and the sun rising tomorrow morning that are strongly supported by
evidence, but have no mathematical proof.
But if we’re being mathematically rigorous, we have to admit that nobody
knows how to prove that that P 6= NP. In fact, there has been little or no real
progress toward a proof for decades.
The Clay Mathematics Institute lists
P-versus-NP as the first of its seven Millennium Prize Problems, oering
a
$,,
reward for its solution. And yes, in fact, several people have lost
their souls, or at least their sanity, attempting to solve this problem.
A more subtle but still open question is whether the complexity classes NP
and co-NP are dierent.
Even if we can verify every Y
answer quickly, there’s
no reason to believe we can also verify N
answers quickly. For example, as far
as we know, there is no short proof that a boolean circuit is not satisfiable. It is
generally believed that NP 6= co-NP, but again, nobody knows how to prove it.
P
coNP NP
Figure 12.3. What we think the world looks like.
12.3 NP-hard, NP-easy, and NP-complete
A problem ⇧ is NP-hard if a polynomial-time algorithm for ⇧ would imply a
polynomial-time algorithm for every problem in NP. In other words:
⇧ is NP-hard () If ⇧ can be solved in polynomial time, then P=NP
Perhaps the most significant progress has taken the form of barrier results, which imply that
entire avenues of attack are doomed to fail. In a very real sense, not only do we have no idea
how to prove P 6= NP, but we can actually prove that we have no idea how to prove P 6= NP!
382
12.3. NP-hard, NP-easy, and NP-complete
Intuitively, if we could solve one particular NP-hard problem quickly, then we
could quickly solve any problem whose solution is easy to understand, using
the solution to that one special problem as a subroutine. NP-hard problems are
at least as hard as every problem in NP.
Finally, a problem is NP-complete if it is both NP-hard and an element of
NP (or “NP-easy”). Informally, NP-complete problems are the hardest problems
in NP. A polynomial-time algorithm for even one NP-complete problem would
immediately imply a polynomial-time algorithm for every NP-complete problem.
Literally thousands of problems have been shown to be NP-complete, so a
polynomial-time algorithm for one (and therefore all) of them seems incredibly
unlikely.
Calling a problem NP-hard is like saying “If I own a dog, then it can speak
fluent English.” You probably don’t know whether or not I own a dog, but I bet
you’re pretty sure that I don’t own a talking dog. Nobody has a mathematical
proof that dogs can’t speak English—the fact that no one has ever heard a dog
speak English is evidence, as are the hundreds of examinations of dogs that
lacked the proper mouth shape and brainpower, but mere evidence is not a
mathematical proof. Nevertheless, no sane person would believe me if I said I
owned a dog that spoke fluent English.
So the statement “If I own a dog, then
it can speak fluent English” has a natural corollary: No one in their right mind
should believe that I own a dog! Similarly, if a problem is NP-hard, no one in
their right mind should believe it can be solved in polynomial time.
P
NP coNP
NP-hard
NP-complete
Figure 12.4. More of what we think the world looks like.
It is not immediately obvious that any problems are NP-hard. The following
remarkable theorem was first published by Stephen Cook in
and
independently by Leonid Levin in .
. . . The department chair shakes his head sadly and says, “Oh, come on, that just sounds
like barking. Let me ask a question. Who was the greatest complexity theorist of all time?” The
dog cocks his head, pauses for a few seconds, and then says “Karp!” After the chair chases them
out of his oce,
the dog turns to its owner and says, “Maybe I should have said Impagliazzo?” Levin
first reported his results at seminars in Moscow in ,
while still a PhD student. News
of Cook’s result did not reach the Soviet Union until at least ,
after Levin’s announcement of
his results had been published; in accordance with Stigler’s Law, this result is often called “Cook’s
Theorem”. Levin was denied his PhD at Moscow University for political reasons; he emigrated to
the US in
and earned a PhD at MIT a year later. Cook was denied tenure by the Berkeley
383
12. NP-HARDNESS
The Cook-Levin Theorem. Circuit satisfiability is NP-hard.
I won’t even sketch a proof here, because I’ve been (deliberately) vague
about the definitions.
™12.4 Formal Definitions (HC SVNT DRACONES)
Formally, the complexity classes P, NP, and co-NP are defined in terms of
languages and Turing machines. A language is a set of strings over some finite
alphabet ⌃; without loss of generality, we can assume that ⌃ = {0, 1}. A Turing
machine is a very restrictive type of computer—crudely, a finite-state machine
with an unbounded memory tape—whose precise definition is surprisingly
unimportant. P is the set of languages that can be decided in Polynomial time
by a deterministic single-tape Turing machine. Similarly, NP is the set of all
languages that can be decided in polynomial time by a nondeterministic Turing
machine; NP is an abbreviation for Nondeterministic Polynomial-time.
The requirement of polynomial time is suciently
crude that we do not
have to specify the precise form of Turing machine (number of tapes, number
of heads, number of tracks, size of the tape alphabet, and so on). In fact, any
algorithm that runs on a random-access machine
in T(n) time can be simulated
by a single-tape, single-track, single-head Turing machine that runs in O(T(n)
4)
time. This simulation result allows us to argue formally about computational
complexity in terms of standard high-level programming constructs like arrays
and loops and recursion, instead of describing everything directly in terms of
Turing machines.
Formally, a problem ⇧ is NP-hard if and only if, for every language ⇧0 2 NP,
there is a polynomial-time Turing reduction from ⇧0 to ⇧. A Turing reduction
means a reduction that can be executed on a Turing machine; that is, a Turing
machine M that can solve ⇧0 using another Turing machine M0 for ⇧ as
a black-box subroutine. Turing reductions are also called oracle reductions;
polynomial-time Turing reductions are also called Cook reductions.
mathematics department in ,
just one year before publishing his seminal paper; he (but not
Levin) later won the Turing award for his proof.
Interested readers find a proof in my lecture notes on nondeterministic Turing machines at
http://algorithms.wtf, or in Boaz Barak’s excellent Introduction to Theoretical Computer Science. Random-access
machines are a model of computation that more faithfully models physical
computers. A standard random-access machine has unbounded random-access memory, modeled
as an unbounded array M[0 ..1] where each address M[i] holds a single w-bit integer, for
some fixed integer w, and can read to or write from any memory addresses in constant time.
RAM algorithms are formally written in assembly-like language, using instructions like ADD i, j, k
(meaning “M[i] M[ j] + M[k]”), INDIR i, j (meaning “M[i] M[M[ j]]”), and IF0GOTO i, `
(meaning “if M[i] = 0, go to line `”), but the precise instruction set is surprisingly irrelevant. By
definition, each instruction executes in unit time. In practice, RAM algorithms can be faithfully
described using higher-level pseudocode, as long as we’re careful about arithmetic precision.
384
12.5. Reductions and SAT
Researchers in complexity theory prefer to define NP-hardness in terms of
polynomial-time many-one reductions, which are also called Karp reductions.
A many-one reduction from one language L0 ✓ ⌃⇤ to another language L ✓ ⌃⇤
is a function f : ⌃⇤ ! ⌃⇤ such that x 2 L0 if and only if f (x) 2 L. Then we can
define a language L to be NP-hard if and only if, for any language L0 2 NP, there
is a many-one reduction from L0 to L that can be computed in polynomial time.
Every Karp reduction “is” a Cook reduction, but not vice versa. Specifically,
any Karp reduction from one decision problem ⇧ to another decision ⇧0 is
equivalent to transforming the input to ⇧ into the input for ⇧0
, invoking an
oracle (that is, a subroutine) for ⇧0
, and then returning the answer verbatim.
However, as far as we know, not every Cook reduction can be simulated by a
Karp reduction.
Complexity theorists prefer Karp reductions primarily because NP is closed
under Karp reductions, but is not closed under Cook reductions (unless NP=coNP,
which is considered unlikely). There are natural problems that are ()
NP-hard with respect to Cook reductions, but ()
NP-hard with respect to Karp
reductions only if P=NP. One trivial example of such a problem is US:
Given
a boolean formula, is it always false? On the other hand, many-one reductions
apply only to decision problems (or more formally, to languages); formally, no
optimization or construction problem is Karp-NP-hard.
To make things even more confusing, both Cook and Karp originally defined
NP-hardness in terms of logarithmic-space reductions. Every logarithmic-space
reduction is a polynomial-time reduction, but (as far as we know) not vice
versa. It is an open question whether relaxing the set of allowed (Cook or
Karp) reductions from logarithmic-space to polynomial-time changes the set of
NP-hard problems.
Fortunately, none of these subtleties rear their ugly heads in practice—in
particular, every reduction described in this chapter can be formalized as a
logarithmic-space many-one reduction—so you can wake up now.
12.5 Reductions and SAT
To prove that any problem other than circuit satisfiability is NP-hard, we use a
reduction argument. Reducing problem A to another problem B means describing
an algorithm to solve problem A under the assumption that an algorithm for
problem B already exists. You’ve already been doing reduction for years, even
before starting this book, only you probably called them something else, like
subroutines or utility functions or modular programming or using a calculator.
To prove something is NP-hard, we describe a similar transformation between
problems, but not in the direction that most people expect.
385
12. NP-HARDNESS
You should tattoo the following rule of onto the back of your hand, right
next to your mom’s birthday and the actual rules of Monopoly.
To prove that problem A is NP-hard,
reduce a known NP-hard problem to A.
In other words, to prove that your problem is hard, you need to describe an
ecient
algorithm to solve a dierent
problem, which you already know is
hard, using an hypothetical ecient
algorithm for your problem as a black-box
subroutine. The essential logic is a proof by contradiction. The reduction
implies that if your problem were easy, then the other problem would be easy,
which it ain’t. Equivalently, since you know the other problem is hard, the
reduction implies that your problem must also be hard; your hypothetical
ecient
algorithm does not actually exist.
As a canonical example, consider the formula satisfiability problem, usually
just called S.
The input to S
is a boolean formula like
(a _ b _ c _ d¯) , ((b ^ ¯c) _ (a¯ ) d) _ (c 6= a ^ b)),
and the question is whether it is possible to assign boolean values to the variables
a, b,c,... so that the entire formula evaluates to T.
To prove that S
is NP-hard, we need to give a reduction from a known
NP-hard problem. The only problem we know is NP-hard so far is CS,
so let’s start there.
Let K be an arbitrary boolean circuit. We can transform (or more accurately,
transcribe) K into a boolean formula
as follows. First, label each interior wire
by a new variable yj, and label the output wire with a new variable z. The
formula
consists of a list of equations, one for each gate, separated by As,
followed by a final ^ z. Figure .
shows the resulting transcription for our
example circuit.
Now we claim that the original circuit K is satisfiable if and only if the
resulting formula
is satisfiable. Like every other “if and only if” statement, we
prove this claim in two steps:
If a player lands on an available property and declines (or is unable) to buy it, that property
is immediately auctioned o
to the highest bidder; the player who originally declined the property
may bid, and bids may be arbitrarily higher or lower than the list price. Players in Jail can still
buy and sell property, buy and sell houses and hotels, and collect rent. The game has
houses
and
hotels; once they’re gone, they’re gone. In particular, if all houses are already on the
board, you cannot downgrade a hotel to four houses; you must raze all the hotels in the group
to the ground. Players can sell or exchange undeveloped properties with each other, but cannot
sell property back to the bank; on the other hand, players can sell buildings to the bank (at half
price), but cannot sell or exchange buildings with each other. All penalties are paid directly to
the bank. A player landing on Free Parking does not win anything. A player landing on Go gets
exactly $,
no more. Railroads are not magic transporters. Finally, Je
always gets the boot.
No, not the T-Rex or the penguin—the boot, dammit.
386
12.5. Reductions and SAT
x1
x2
x3
x4
x5
y1
y2
y3
y4
y5
y6
z
( y1 = x1 ^ x4) ^ ( y2 = x4) ^ ( y3 = x3 ^ y2) ^ ( y4 = y1 _ x2) ^
( y5 = x2) ^ ( y6 = x5) ^ ( y7 = y3 _ y5) ^ (z = y4 ^ y7 ^ y6) ^ z
Figure 12.5. Transcribing a boolean circuit as a boolean formula.
) Given a set of inputs that satisfy the circuit K, we can derive a satisfying
assignment for the formula
by computing the output of every gate in K.
( Given a satisfying assignment for the formula ,
we can obtain a satisfying
input the circuit by simply ignoring the internal wire variables yi and the
output variable z.
The entire transformation from circuit to formula can be carried out in linear
time. Moreover, the size of the resulting formula is at most a constant factor
larger than any reasonable representation of the circuit.
Now suppose, for the sake of argument, there is an algorithm that can
determine in polynomial time whether a given boolean formula is satisfiable.
Then given any boolean circuit K, we can decide whether K is satisfiable by first
transforming K into a boolean formula
as described above, and then asking
our magical mystery S
algorithm whether
is satisfiable, as suggested by the
following cartoon. Each box represents an algorithm. The red box on the left is
the transformation subroutine. The box on the right is the hypothetical magic
S
algorithm. It must be magic, because it has a rainbow on it.
C⌅⌃⇥ ⌅SAT
SAT
K
Boolean
formula
transform
in O(n)
time
Φ
Boolean
circuit
Φ is
satisfiable
Φ is not
satisfiable
T⌃ ⇤
F⇧⌥⇤
K is
satisfiable
K is not
satisfiable
T⌃ ⇤
F⇧⌥⇤
If you prefer magic pseudocode to magic boxes:
Kay
Erickson, personal communication, .
For those of you reading black-and-white
printed copies: Yes, that round thing is a rainbow.
387
12. NP-HARDNESS
CS(K):
transcribe K into a boolean formula
return S()
hh????MAGIC????ii
Transcribing K into
requires only polynomial time (in fact, only linear time,
but whatever), so the entire CS
algorithm also runs in polynomial time.
TCS(n)
O(n) + TS(O(n))
We conclude that any polynomial-time algorithm for S
would give us a
polynomial-time algorithm for CS,
which in turn would imply P=NP.
So S
is NP-hard!
12.6 3SAT (from CIRCUITSAT)
A special case of S
that is particularly useful in proving NP-hardness results is
called CNF-S
or more often simply S.
A boolean formula is in conjunctive normal form (CNF) if it is a conjunction
()
of several clauses, each of which is the disjunction ()
of several literals,
each of which is either a variable or its negation. For example:
clause z }| {
(a _ b _ c _ d) ^ (b _ ¯c _ d¯) ^ (a¯ _ c _ d) ^ (a _ ¯
b)
A CNF
formula is a CNF formula with exactly three literals per clause; the
previous example is not a CNF
formula, since its first clause has four literals
and its last clause has only two. S
is the restriction of S
to CNF
formulas:
Given a CNF
formula, is there an assignment to the variables that makes the
formula evaluate to T?
We could prove that S
is NP-hard by a reduction from the more general
S
problem, but it’s actually easier to start over from scratch, by reducing
directly from CS.
Figure 12.6. A polynomial-time reduction from CIRCUITSAT to 3SAT.
Given an arbitrary boolean circuit K, we transform K into an equivalent
CNF
formula in several stages. Except for the very last stage, this reduction
388
12.6. 3SAT (from CIRCUITSAT)
was actually described by Grigorii Tseitin in ,
five years before Cook and
Levin reported their proofs of the Cook-Levin Theorem. (In the same
paper, Tseitin described the problem we now call CNF-S,
possibly for the first
time.) As we describe each stage, we will also prove that stage is correct.
• Make sure every
and
gate in K has exactly two inputs. If any gate has
k > 2 inputs, replace it with a binary tree of k
1 binary gates. Call the
resulting circuit K0
. The circuits K and K0 are logically equivalent circuits,
so every satisfying input for K is a satisfying input for K0 and vice versa.
• Transcribe K0 into a boolean formula 1
with one clause per gate, exactly as in
our previous reduction to S.
We already proved that every satisfying input
for K0 can be transformed into a satisfying assignment for 1
and vice versa.
• Replace each clause in 1
with a CNF formula. There are only three types of
clauses in 1,
one for each type of gate in K0
:
a = b ^ c 7!
(a _ ¯
b _ ¯c) ^ (a¯ _ b) ^ (a¯ _ c)
a = b _ c 7!
(a¯ _ b _ c) ^ (a _ ¯
b) ^ (a _ ¯c)
a = ¯
b 7!
(a _ b) ^ (a¯ _ ¯
b)
Call the resulting CNF formula 2.
Because 1
and 2
are logically equivalent
formulas, every satisfying assignment for 1
is also a satisfying assignment
for 2,
and vice versa.
• Replace each clause in 2
with a CNF
formula. Every clause in 2
has at most
three literals, but we need clauses with exactly three literals. To obtain a
CNF
formula, we expand each two-literal clause in 2
into two three-literal
clauses by introducing one new variable, and we expand the final one-literal
clause in 2
into four three-literal clauses by introducing two new variables.
a _ b 7!
(a _ b _ x) ^ (a _ b _ x¯)
z 7!
(z _ x _ y) ^ (z _ x¯ _ y) ^ (z _ x _ ¯y) ^ (z _ x¯ _ ¯y)
Call the resulting CNF
formula 3.
Every satisfying assignment for 2
can be
transformed into a satisfying assignment for 3
by assigning arbitrary values
to the new variables (x and y). Conversely, every satisfying assignment
for 3
can be transformed into a satisfying assignment for 2
by ignoring
the new variables.
For example, our example circuit is transformed into the following CNF
formula;
compare with Figure ..
389
12. NP-HARDNESS
( y1 _ x1 _ x4) ^ ( y1 _ x1 _ z1) ^ ( y1 _ x1 _ z1) ^ ( y1 _ x4 _ z2) ^ ( y1 _ x4 _ z2)
^ ( y2 _ x4 _ z3) ^ ( y2 _ x4 _ z3) ^ ( y2 _ x4 _ z4) ^ ( y2 _ x4 _ z4)
^ ( y3 _ x3 _ y2) ^ ( y3 _ x3 _ z5) ^ ( y3 _ x3 _ z5) ^ ( y3 _ y2 _ z6) ^ ( y3 _ y2 _ z6)
^ ( y4 _ y1 _ x2) ^ ( y4 _ x2 _ z7) ^ ( y4 _ x2 _ z7) ^ ( y4 _ y1 _ z8) ^ ( y4 _ y1 _ z8)
^ ( y5 _ x2 _ z9) ^ ( y5 _ x2 _ z9) ^ ( y5 _ x2 _ z10) ^ ( y5 _ x2 _ z10)
^ ( y6 _ x5 _ z11) ^ ( y6 _ x5 _ z11) ^ ( y6 _ x5 _ z12) ^ ( y6 _ x5 _ z12)
^ ( y7 _ y3 _ y5) ^ ( y7 _ y3 _ z13) ^ ( y7 _ y3 _ z13) ^ ( y7 _ y5 _ z14) ^ ( y7 _ y5 _ z14)
^ ( y8 _ y4 _ y7) ^ ( y8 _ y4 _ z15) ^ ( y8 _ y4 _ z15) ^ ( y8 _ y7 _ z16) ^ ( y8 _ y7 _ z16)
^ ( y9 _ y8 _ y6) ^ ( y9 _ y8 _ z17) ^ ( y9 _ y6 _ z18) ^ ( y9 _ y6 _ z18) ^ ( y9 _ y8 _ z17)
^ ( y9 _ z19 _ z20) ^ ( y9 _ z19 _ z20) ^ ( y9 _ z19 _ z20) ^ ( y9 _ z19 _ z20)
Yikes! At first glance, this formula might look a lot more complicated than the
original circuit, but in fact, it’s only larger by a constant factor. Specifically, the
simplified circuit K0 has at most twice as many wires as the original circuit K,
each binary gate in K0 is transformed into at most five clauses in 3.
Even if the
formula size were a large polynomial function (like n374) of the circuit size, we
would still have a valid reduction.
Our reduction transforms an arbitrary boolean circuit K into a CNF
formula
3
in polynomial time (in fact, in linear time). Moreover, any satisfying
input for the input circuit K can be transformed into a satisfying assignment
for 3,
and any satisfying assignment for 3
can be transformed into a satisfying
input for K. In other words, K is satisfiable if and only if 3
is satisfiable. Thus,
if S
can be solved in polynomial time, then CS
can be solved in
polynomial time, which implies that P = NP. We conclude that S
is NP-hard.
12.7 Maximum Independent Set (from 3SAT)
For the next few problems we consider, the input is a simple, unweighted,
undirected graph, and the problem asks for the size of the largest or smallest
subgraph satisfying some structural property.
Let G be an arbitrary graph. An independent set in G is a subset of the
vertices of G with no edges between them. The maximum independent set
problem, which I’ll abbreviate MIS,
asks for the size of the largest
independent set in a given graph. I will prove that MIS
is NP-hard using
a reduction from S,
as suggested by Figure ..
Given an arbitrary CNF
formula ,
we construct a graph G as follows.
Let k denote the number of clauses in .
The graph G contains exactly 3k
vertices, one for each literal in .
Two vertices in G are connected by an
edge if and only if either ()
they correspond to literals in the same clause, or
()
they correspond to a variable and its inverse. For example, the formula
(a _ b _ c) ^ (b _ ¯c _ d¯) ^ (a¯ _ c _ d) ^ (a _ ¯
b _ d¯) is transformed into the graph
shown in Figure ..
390
12.7. Maximum Independent Set (from 3SAT)
S⇥
M⇥↵I⌃⇤S⌅
G
CNF
Boolean
formula
Φ
graph
Φ is
satisfiable
Φ is not
satisfiable
T⌥⌦⌅
F⇥⇧⌅
G has an
independent
set of size k
T⌥⌦⌅
F⇥⇧⌅
transform
in O(n)
time
G has no
independent
set of size k
=?
k
number of clauses in Φ
size of largest
independent
set in G
Figure 12.7. A polynomial-time reduction from 3SAT to MAXINDSET.
‾ ‾
a
b
c
c
d
a
b
d
b‾
d
a‾
‾c
(a _ b _ c) ^ (b _ ¯c _ d¯) ^ (a¯ _ c _ d) ^ (a _ ¯
b _ d¯)
Figure 12.8. A graph derived from the satisfiable 3CNF formula with 4 clauses, and an independent
set of size 4.
Each independent set in G contains at most one vertex from each clause
triangle, because any two vertices in each triangle are connected. Thus, the
largest independent set in G has size at most k. I claim that G contains an
independent set of size exactly k if and only if the original formula
is satisfiable.
As usual for “if and only if” statements, the proof consists of two parts.
) Suppose
is satisfiable. Fix an arbitrary satisfying assignment. By definition,
each clause in
contains at least one T
literal. Thus, we can choose
a subset S of k vertices in G that contains exactly one vertex per clause
triangle, such that the corresponding k literals are all T.
Because each
triangle contains at most one vertex in S, no two vertices in S are connected
by a triangle edge. Because every literal corresponding to a vertex in S is
T,
no two vertices in S are connected by a negation edge. We conclude
that S is an independent set of size k in G.
( On the other hand, suppose G contains an independent set S of size k. Each
vertex in S must lie in a dierent
clause triangle. Suppose we assign the
value T
to each literal in S; because contradictory literals are connected
by edges, this assignment is consistent. There may be variables x such that
neither x nor x¯ corresponds to a vertex in S; we can set these variables to
391
12. NP-HARDNESS
any value we like. Because S contains one vertex in each clause triangle,
each clause in
contains (at least) one T
literal. We conclude that
is
satisfiable.
Transforming the CNF
formula
into the graph G takes polynomial time,
even if we do everything by brute force. Thus, if we could solve MIS
in polynomial time, then we could also solve S
in polynomial time, by
transforming the input formula
into a graph G and comparing the size of the
largest independent set in G with the number of clauses in .
But that would
imply P=NP, which is ridiculous! We conclude that MIS
is NP-hard.
12.8 The General Pattern
All NP-hardness proofs—and more generally, all polynomial-time reductions—
follow the same general outline. To reduce problem X to problem Y in polynomial
time, we need to do three things:
.
Describe a polynomial-time algorithm to transform an arbitrary instance
of x of X into a special instance y of Y .
.
Prove that if x is a “good” instance of X, then y is a “good” instance of Y .
.
Prove that if y is a “good” instance of Y , then x is a “good” instance of X.
(This is usually the part that causes the most trouble.)
Of course, developing a correct reduction doesn’t mean handling these three
tasks one at a time. First writing down an algorithm that seems to work and then
trying prove that it actually works is rarely successful, especially in time-limited
settings like exams. We must develop the algorithm, the “if” proof, and the
“only if” proof simultaneously.
To quote the late great Ricky Jay:
This is an acquired skill.
One point that confuses many students is that the reduction algorithm only
“works one way”—from X to Y—but the correctness proof needs to “work both
ways”. But the correctness proofs are not actually symmetric. The “if” proof
needs to handle arbitrary instances of X, but the “only if” only needs to handle
the special instances of Y produced by the reduction algorithm. Exploiting this
asymmetry is the key to successfully designing correct reductions.
I find it useful to think in terms of transforming certificates—proofs that a
given instance is “good”—along with the instances themselves. For example,
a certificate for CS
is a set of inputs that turns on the light bulb;
a certificate for S
or S
is a satisfying assignment; a certificate for MIS
is a large independent set. To reduce X to Y , we actually need to design three
algorithms, one for each of the following tasks:
from
his
o-Broadway
show Ricky Jay and his
Assistants
392
12.8. The General Pattern
• Transform an arbitrary instance x of X into a special instance y of Y in
polynomial time.
• Transform an arbitrary certificate for x into a certificate for y, and
• Transform an arbitrary certificate for y into a certificate for x.
The second and third tasks refer to the input and output of the first algorithm. The
certificate transformation needs to be reversible, not the instance transformation.
We never have to transform instances of Y , and we don’t need to think about
arbitrary instances of Y at all. Only the first algorithm needs to run in polynomial
time (although in practice, the second and third algorithms are almost always
simpler than the first).
For example, our reduction from CS
to S
consists of three
algorithms:
• The first transforms an arbitrary boolean circuit K into a special CNF
boolean formula 3,
in polynomial time. (Encode each wire as a variable
and each gate as a sub-formula, and then expand each sub-formula into
CNF.)
• The second transforms an arbitrary satisfying input for K into a satisfying
assignment for 3.
(Trace the input through the circuit, transfer values from
each wire to the corresponding variable, and give any additional variables
arbitrary values.)
• The third transforms an arbitrary satisfying assignment for 3
into a satisfying
input for K. (Transfer values from each wire variable in 3
to the
corresponding wire in K.)
The reduction works because the first algorithm encodes any boolean circuit K
into a highly structured CNF
formula 3.
The specific structure of 3
restricts
how it can be satisfied; every satisfying assignment for 3
must “come from”
some satisfying input for K. We don’t have to think about arbitrary CNF
formulas at all.
Similarly, our reduction from S
to MIS
consists of three algorithms:
• The first transforms an arbitrary CNF
formula
into a special graph G and
a specific integer k, in polynomial time.
• The second transforms an arbitrary satisfying assignment for
into an
independent set in G of size k.
• The third transforms an arbitrary independent set in G of size k into a
satisfying assignment for .
Again, our first transformation encodes the input formula
into a highly
structured graph G and a specific integer k. The structure of G ensures that
every independent set of size k “comes from” a satisfying assignment for .
We
don’t consider arbitrary graphs or arbitrary independent set sizes at all.
393
12. NP-HARDNESS
12.9 Clique and Vertex Cover (from Independent Set)
A clique is another name for a complete graph, that is, a graph where every
pair of vertices is connected by an edge. The MC
problem asks for the
number of nodes in its largest complete subgraph in a given graph. A vertex
cover of a graph is a set of vertices that touches every edge in the graph. The
MVC
problem asks for the size of the smallest vertex cover in a
given graph.
Figure 12.9. A graph whose largest independent set, largest clique, and smallest vertex cover all have
size 4.
We can prove that MC
is NP-hard using the following easy reduction
from MIS.
Any graph G has an edge-complement G with the same
vertices, but with exactly the opposite set of edges—uv is an edge in G if and
only if uv is not an edge in G. A set of vertices is independent in G if and
only if the same vertices define a clique in G. Thus, the largest independent in
G has the same vertices (and thus the same size) as the largest clique in the
complement of G.
M↵I⌃⇥⇤⌥⇤⌃⇥⇤⌃
S⇤
M↵C⇧⌅⌦⇤
G
graph
complement
in O(V2)
time
G
graph size of largest
clique in G
k
size of largest
independent
set in G
‾ k
‾
‾
Figure 12.10. An easy reduction from MAXINDSET to MAXCLIQUE.
The proof that MVC
is NP-hard is even simpler, because it relies
on the following easy observation: I is an independent set in a graph G = (V, E)
if and only if its complement V \ I is a vertex cover of the same graph G. Thus,
the largest independent set in any graph is the complement of the smallest vertex
cover of the same graph! Thus, if the smallest vertex cover in an n-vertex graph
has size k, then the largest independent set has size n
k.
394
12.10. Graph Coloring (from 3SAT)
M↵I⇧⇥⇤⌥⇤⇧⇥⇤⇧
S⇤
G
graph –
M⌅⇧V⇤
⇤↵
C⌃⌦⇤
size of largest
independent
set in G
n – k
size of smallest
vertex cover in G
k
n
number of vertices in G
Figure 12.11. An even easier reductions from MAXINDSET to MINVERTEXCOVER.
12.10 Graph Coloring (from 3SAT)
A proper k-coloring of a graph G = (V, E) is a function C : V ! {1, 2, . . . , k}
that assigns one of k “colors” to each vertex, so that every edge has two dierent
colors at its endpoints. (The “colors” are really arbitrary labels, which for
simplicity we represent by small positive integers, rather than electromagnetic
frequencies, CMYK vectors, or Pantone numbers, for example.) The graph
coloring problem asks for the smallest possible number of colors in a legal
coloring of a given graph.
To prove that graph coloring is NP-hard, it suces
to consider the decision
problem C:
Given a graph, does it have a proper 3-coloring? We prove
C
is NP-hard using a reduction from S.
(Why S?
Because it has a
in it. You probably think I’m joking, but I’m not.) Given a CNF
formula ,
we
construct a graph G that is -colorable
if and only if
is satisfiable, as suggested
by the usual diagram.
SAT
C⇧⌅⇧⌃
G
3CNF
Boolean
formula
Φ
graph
G is
-colorable
G is not
-colorable
T⌃⇤
F⇥⌅⌥⇤
Φ is
satisfiable
Φ is not
satisfiable
T⌃⇤
F⇥⌅⌥⇤
transform
in O(n)
time
We describe the reduction using a standard strategy of decomposing the
output graph G into gadgets, subgraphs that enforce various semantics of the
input formula
in the language of graph coloring. Decomposing reductions
into separate gadgets is not only helpful for understanding existing reductions
and proving them correct, but for designing new NP-hardness reductions.
Our
reduction from CS
to S
encoding each gate in the input circuit as a clause
in the output formula; these clauses are “gate gadgets”. Similarly, our reduction from S
to
MIS
used two type of gadgets: “clause gadgets” (triangles) and “variable gadgets” (edges
between contradicting literals).
395
12. NP-HARDNESS
Our formula-to-graph reduction uses three types of gadgets:
• There is a single truth gadget: a triangle with three vertices T, F, and X,
which intuitively stand for T,
F,
and O.
Since these vertices
are all connected, they must have dierent
colors in any 3-coloring. For the
sake of convenience, we will name those colors T,
F,
and O.
Thus, when we say that a node is colored T,
we mean that it has the
same color as vertex T.
• For each variable a, the graph contains a variable gadget, which is a triangle
joining two new nodes labeled a and a to node X in the truth gadget. Node a
must be colored either T
or F,
and therefore node a must be colored
either F
or T,
respectively.
X
T F
X
a a
Figure 12.12. The truth gadget and a variable gadget for a.
• Finally, for each clause in ,
the graph contains a clause gadget. Each clause
gadget joins three literal nodes (from the corresponding variable gadgets)
to node T (from the truth gadget) using five new unlabeled nodes and ten
edges, as shown below.
a
b
c
T
Figure 12.13. A clause gadget for (a _ b _ ¯c).
In eect,
each triangle in the clause gadget behaves like a “majority
gate”. In any valid -coloring,
if the two vertices to the left of the triangle
have the same color, the rightmost vertex of the triangle must have the same
color; on the other hand, if the two left vertices have dierent
colors, the
color of the right vertex can be chosen arbitrarily. See Figure ..
It follows that there is no valid -coloring
of a clause gadget where all
three literal nodes are colored F.
On the other hand, any coloring of the
literal nodes with more than one color can be extended to a valid -coloring
of the clause gadget. The variable gadgets force each literal node to be
colored either T
or F;
thus, in any valid -coloring
of the clause
gadget, at least one literal node is colored T.
396
12.10. Graph Coloring (from 3SAT)
Figure 12.14. All valid 3-colorings of a “half-gadget”, up to permutations of the colors
The final graph G contains exactly one node T, exactly one node F, and exactly
two nodes a and a¯ for each variable. For example, Figure .
shows the graph
that results from the same CNF
formula (a _ b _ c)^(b _ ¯c _ d¯)^(a¯ _ c _ d)^
(a _ ¯
b _ d¯) that we previously used to illustrate the MIS
reduction in
Figure ..
The -coloring
is one of several that correspond to the satisfying
assignment a = c = T,
b = d = F.
X
T F
a a b b c c d d
(a _ b _ c) ^ (b _ ¯c _ d¯) ^ (a¯ _ c _ d) ^ (a _ ¯
b _ d¯)
Figure 12.15. The 3-colorable graph derived from a satisfiable 3CNF formula.
We’ve already done most of the work for a proof of correctness. If the formula
is satisfiable, then we can color the literal nodes according to any satisfying
assignment, and then (because each clause is satisfied) extend the coloring
across every clause gadget. On the other hand, if the graph is -colorable,
then
we can extract a satisfying assignment from any -coloring—at
least one of the
three literal nodes in every clause gadget is colored T.
Because C
is a special case of the more general graph coloring problem—
397
12. NP-HARDNESS
What is the minimum number of colors?—the more general optimization problem
is also NP-hard.
12.11 Hamiltonian Cycle
A Hamiltonian cycle in a graph is a cycle that visits every vertex exactly once.
(This is very dierent
from an Euler circuit, which is a closed walk that traverses
every edge exactly once; Euler circuits are easy to find and construct in linear
time using depth-first search.) Here we consider two dierent
proofs that the
Hamiltonian cycle problem in directed graphs is NP-hard.
From Vertex Cover
Our first NP-hardness proof reduces from the decision version of the vertex cover
problem. Given an undirected graph G and an integer k, we construct a directed
graph H, such that H has a Hamiltonian cycle if and only if G has a vertex cover
of size k. As in our previous reductions, the output graph H is composed of
several gadgets, each corresponding to certain features of the inputs G and k.
V⌅ ↵⌅C⌅
G
graph
D⇧ ⌅⇥↵⌅⇤
H⌥C✏⇥⌃⌅
integer
k
H has a
Ham. cycle
H has no
Ham. cycle
T ⌅
F⌃⌦⌅
G has a
vertex cover
of size k
T ⌅
F⌃⌦⌅
H
graph
transform
in O(V+E)
time
G has no
vertex cover
of size k
• For each undirected edge uv in G, the directed graph H contains an edge
gadget consisting of four vertices (u, v,in), (u, v, out), (v, u,in), (v, u, out)
and six directed edges
(u, v,in)(u,
v, out) (u, v,in)(v,
u,in) (v, u,in)(u,
v,in)
(v, u,in)(v,
u, out) (u, v, out)(v,
u, out) (v, u, out)(u,
v, out)
as shown in Figure ..
Each “in” vertex has an additional incoming edge,
and each “out” vertex has an additional outgoing edge. Any Hamiltonian
cycle in H must pass through an edge gadget in one of three ways—either
straight through on both sides, or with a detour from one side to the other
and back. Eventually, these options will correspond to both u and v, only u,
or only v belonging to some vertex cover.
• For each vertex u in G, all the edge gadgets for incident edges uv are
connected in H into a single directed path, which we call a vertex chain.
Specifically, suppose vertex u has d neighbors v1, v2,..., vd . Then H has
d
1 additional edges (u, vi, out)(u,
vi+1,in) for each i from 1 to d
1.
398
12.11. Hamiltonian Cycle
u v
(u,v,in)
(u,v,out)
(v,u,in)
(v,u,out)
u v u v u v
Figure 12.16. An edge gadget and its only possible intersections with a Hamiltonian cycle.
• Finally, H also contains k cover vertices x1, x2,..., xk. Each cover vertex has
a directed edge to the first vertex in each vertex chain, and a directed edge
from the last vertex in each vertex chain.
Figure .
shows a complete example of our transformation; each doublearrowed
blue segment represents a pair of directed edges.
u v
w x
u v
w x
k=2
Figure 12.17. Example of our reduction from VERTEXCOVER to DIRECTEDHAMCYCLE.
As usual, we prove our reduction is correct in two stages.
) First, suppose C = {u1, u2,..., uk} is a vertex cover of G of size k. We can
construct a Hamiltonian cycle in H that “encodes” C as follows. For each
index i from 1 to k, we traverse a path from cover vertex xi, through the
vertex chain for ui, to cover vertex xi+1 (or cover vertex x1 if i = k). As we
traverse the chain for each vertex ui, we determine how to proceed from
each node (ui, v,in) as follows:
– If v 2 C, follow the edge (ui, v,in)(ui,
v, out).
– If v 62 C, detour through (ui, v,in)(v,
ui,in)(v,
ui, out)(ui,
v, out).
Thus, for each edge uv of G, the Hamiltonian cycle visits (u, v,in) and
(u, v, out) as part of u’s vertex chain if u 2 C and as part of v’s vertex chain
otherwise. See Figure ..
399
12. NP-HARDNESS
u v
w x
u v
w x
k=2
Figure 12.18. Every vertex cover of size k in G corresponds to a Hamiltonian cycle in H and vice versa.
( On the other hand, suppose H contains a Hamiltonian cycle C. This
cycle must contain an edge from each cover vertex to the start of some
vertex chain. Our case analysis of edge gadgets inductively implies that
after C enters the vertex chain for some vertex u, it must traverse the entire
vertex chain. Specifically, at each vertex (u, v,in), the cycle must
contain either the single edge (u, v,in)(u,
v, out) or the detour path
(u, v,in)(v,
u,in)(v,
u, out)(u,
v, out), followed by an edge to the next
edge gadget in u’s vertex chain, or to a cover vertex if this is the last edge
gadget in u’s vertex chain. In particular, if C contains the detour edge
(u, v,in)(v,
u,in), it cannot contain edges between any cover vertex and v’s
vertex chain. It follows that C traverses exactly k vertex chains. Moreover,
these vertex chains describe a vertex cover of the original graph G, because C
visits the vertex (u, v,in) for every edge uv in G.
We conclude that G has a vertex cover of size k if and only if H contains a
Hamiltonian cycle. The transformation from G to H takes at most O(V2) time;
it follows that the directed Hamiltonian cycle problem is NP-hard.
From 3SAT
We can also prove that the directed Hamiltonian cycle problem is NP-hard by
reducing directly from S.
Given an arbitrary CNF
formula
with n variables
x1, x2,..., xn and k clauses c1,c2,...,ck, we construct a directed graph H that
contains a Hamiltonian cycle if and only if
is satisfiable, as follows.
For each variable xi, we construct a variable gadget, which consists of a
doubly-linked list of 2k vertices (i, 0),(i, 1),...,(i, 2k), connected by edges
(i, j
1)(i,
j) and (i, j)(i,
j
1) for each index j. We connect the first and
400
12.11. Hamiltonian Cycle
last nodes in each adjacent pair of variable gadgets by adding edges
(i, 0)(i
+ 1, 0) (i, 2k)(i
+ 1, 0) (i, 0)(i
+ 1, 2k) (i, 2k)(i
+ 1, 2k)
for each index i; we also connect the endpoints of the first and last variable
gadgets with the edges
(n, 0)(1,
0) (n, 2k)(1,
0) (n, 0)(1,
2k) (n, 2k)(1,
2k).
The resulting graph G has exactly 2n Hamiltonian cycles, one for each assignment
of boolean values to the n variables of .
Specifically, for each i, we traverse the
ith variable gadget from left to right if xi = T
and right to left if xi = F.
See Figure ..
a
b
c
d
a
b
c
d
Figure 12.19. Left: Variable gadgets and connectors in G, for any formula with 4 variables and 4 clauses.
Right: The Hamiltonian cycle in G corresponding to the assignment a = b = d = T
and c = F
Now we extend G to a larger graph H by adding a clause vertex [ j] for each
clause cj, connected to the variable gadgets by six edges, as shown in Figure ..
For each positive literal xi in cj, we add the edges (i, 2j
1)[
j](i,
2j), and
for each negative literal x¯i in cj, we add the edges (i, 2j)[
j](i,
2j
1). The
connections to the clause vertices guarantee that a Hamiltonian cycle in G
can be extended to a Hamiltonian cycle in H if and only if the corresponding
variable assignment satisfies .
Exhaustive case analysis now implies that H has
a Hamiltonian cycle if and only if
is satisfiable.
Transforming the formula
into the graph H takes O(kn) time, which is at
most quadratic in the total length of the formula; we conclude that the directed
Hamiltonian cycle problem is NP-hard.
Variants and Extensions
Trivial modifications of the previous reductions imply that the Hamiltonian path
problem in directed graphs is also NP-hard. A Hamiltonian path in a graph G is
of course a simple path that visits every vertex of G exactly once. In fact, there
401
12. NP-HARDNESS
a
b
c
d
a
b
c
d
Figure 12.20. Left: Clause gadgets for the formula (a _ b _ c)^(b _¯c _ d¯)^(a¯_ c _ d)^(a _¯
b _ d¯).
Right: A hamiltonian cycle in H corresponding to the satisfying assignment a = b = d = T
and
c = F.
are simple polynomial-time reductions from the Hamiltonian cycle problem to
the Hamiltonian path problem and vice versa. I’ll leave the details of these
reductions as exercises.
Both of the previous reductions deal with directed graphs, but the corresponding
question in undirected graph is also NP-hard. In fact, there is a
relatively simple reduction from the directed Hamiltonian cycle/path problem
to the undirected Hamiltonian cycle/path problem; again, I’ll leave the details
of this reduction as an exercise.
Finally, the infamous traveling salesman problem asks to find the shortest
Hamiltonian cycle (or path) in a graph with weighted edges. Since finding the
shortest cycle/path is obviously harder than determining if a cycle/path exists
at all—Consider a graph where every edge has weight 1!—TS
is also NP-hard.
12.12 Subset Sum (from Vertex Cover)
The next problem that we prove NP-hard is the SS
problem considered
in Chapter :
Given a set X of positive integers and an integer T, determine
whether X has a subset whose elements sum to T.
We once again reduce from VC.
Given a graph G and an integer k,
we need to compute a set X of positive integers and an integer T, such that X
has a subset that sums to T if and only if G has an vertex cover of size k. Our
transformation uses only two types of “gadgets”, which are integers representing
the vertices and edges in G.
Number the edges of G arbitrarily from 0 to E
1. Our set X contains the
402
12.12. Subset Sum (from Vertex Cover)
integer bi := 4i for each edge i, and the integer
av := 4E +
X
i2(v)
4i
for each vertex v, where (v)
is the set of edges that have v has an endpoint.
Alternately, we can think of each integer in X as an (E +1)-digit number written
in base 4. The Eth digit is 1 if the integer represents a vertex, and 0 otherwise;
and for each i < E, the ith digit is 1 if the integer represents edge i or one of its
endpoints, and 0 otherwise. Finally, we set the target sum
T := k · 4E +
X
E1
i=0
2 · 4i
.
Now let’s prove that the reduction is correct.
) First, suppose G has a vertex cover C of size k. Consider the subset
X0 := {av | v 2 C} [ {bi | edge i has exactly one endpoint in C}
The sum of the elements of X0
, written in base ,
has most significant digit k
and all other digits equal to 2. Thus, the elements of X0 sum to exactly T.
( On the other hand, suppose there is a subset X0 ✓ X that sums to T.
Specifically, we must have
X
v2V0
av +
X
i2E0
bi = t
for some subsets V0 ✓ V and E0 ✓ E. Again, if we sum these base-
numbers,
there are no carries in the first E digits, because for each i there are only
three numbers in X whose ith digit is 1. Each edge number bi contributes
only one 1 to the ith digit of the sum, but the ith digit of t is 2. Thus, for
each edge in G, at least one of its endpoints must be in V0
. In other words,
V0 is a vertex cover. On the other hand, only vertex numbers are larger
than 4E, and bT/4Ec = k, so V0 has at most k elements. (In fact, it’s not
hard to see that V0 has exactly k elements.)
For example, given the four-vertex graph G = (V, E) where V = {u, v, w, x} and
E = {uv, uw, vw, v x, wx}, our set X might contain the following base-
integers:
au := 1110004 = 1344 buv := 0100004 = 256
av := 1101104 = 1300 buw := 0010004 = 64
aw := 1011014 = 1105 bvw := 0001004 = 16
ax := 1000114 = 1029 bv x := 0000104 = 4
bwx := 0000014 = 1
403
12. NP-HARDNESS
If we are looking for a vertex cover of size k = 2, our target sum would be
T := 2222224 = 2730. Indeed, the vertex cover {v, w} corresponds to the subset
{av , aw, buv , buw, bv x , bwx }, whose sum is 1300+1105+256+64+4+1 = 2730.
The reduction can clearly be performed in polynomial time. We’ve already
proved that VC
is NP-hard, so it follows that SS
is NP-hard.
Caveat Reductor!
One subtle point must be emphasized here. 3-something
pages ago, back in
Chapter ,
we developed a dynamic programming algorithm to solve SS
in O(nT) time. Isn’t this a polynomial-time algorithm? Didn’t we just prove that
P=NP? Hey, where’s my million dollars?!
Alas, life is not so simple. True, the running time is a polynomial function
of the variables n and T, but to qualify as a true polynomial-time algorithm, the
running time must be a polynomial function of the input size—the number of bits
required to represent the input. The values of the elements of X and the target
sum T could be exponentially larger than the number of input bits. Indeed,
the reduction we just described produces a value of T that is exponentially
larger than the size of our original input graph, which forces our dynamic
programming algorithm to run in exponential time.
Algorithms like this are said to run in pseudo-polynomial time, and any
NP-hard problem with such an algorithm is called weakly NP-hard. Equivalently,
a weakly NP-hard problem is a problem that can be solved in polynomial time
when all input numbers are represented in unary (as a sum of 1s), but becomes
NP-hard when all input numbers are represented in binary. If a problem is
NP-hard even when all the input numbers are represented in unary, we say
that the problem is strongly NP-hard. A good example of a strongly NP-hard
problem is TS,
which remains NP-hard even if the input graph
is complete and all edge weights are equal to 1 or 2.
12.13 Other Useful NP-hard Problems
Literally thousands of problems have been proved to be NP-hard. Here I will list
a few NP-hard problems that are useful in deriving reductions.
I won’t describe
the NP-hardness proofs for these problems in detail, but you can find most of
them in Garey and Johnson’s classic Scary Black Book of NP-Completeness.
All
As
someday it may happen a reduction must be found, I’ve got a little list. I’ve got a little
list. Of some Herculean problems we can use to lower bound. Fast solutions don’t exist. Our
proofs they can assist. Michael
Garey and David Johnson. Computers and Intractability: A Guide to the Theory of
NP-Completeness. W. H. Freeman and Co., .
And yes, it really is black.
404
12.13. Other Useful NP-hard Problems
of the problems I’ve discussed so far, and most of the problems in the following
list, were first proved NP-hard in a single landmark
paper by Richard Karp.
• PlanarCircuitSat: Given a boolean circuit that can be embedded in the
plane so that no two wires cross, is there an input that makes the circuit
output T?
This problem can be proved NP-hard by reduction from the
general circuit satisfiability problem, by replacing each crossing with a small
assemblage of gates.
• 1-in-3Sat: Given a CNF
formula, is there an assignment of values to the
variables so that each clause contains exactly one T
literal? This problem
can be proved NP-hard by reduction from the usual S.
• NotAllEqual3Sat: Given a CNF
formula, is there an assignment of
values to the variables so that every clause contains at least one T
literal
and at least one F
literal? This problem can be proved NP-hard by
reduction from the usual S.
• Planar3Sat: Given a CNF
boolean formula, consider a bipartite graph
whose vertices are the clauses and variables, where an edge indicates that
a variable (or its negation) appears in a clause. If this graph is planar, the
CNF
formula is also called planar. The PS
problem asks, given a
planar CNF
formula, whether it has a satisfying assignment. This problem
can be proved NP-hard by reduction from PCS.
• Exact3DimensionalMatching or X3M: Given a set S and a collection of
three-element subsets of S, called triples, is there a sub-collection of disjoint
triples that exactly cover S? This problem can be proved NP-hard by a
reduction from S,
because it has a
in it.
• Partition: Given a set S of n integers, are there subsets A and B such that
A[ B = S, A\ B = ?, and
X
a2A
a =
X
b2B
b?
This problem can be proved NP-hard by a simple reduction from SS.
Like SS,
the P
problem is only weakly NP-hard.
• 3Partition: Given a set S of 3n integers, can it be partitioned into n disjoint
three-element subsets, such that every subset has exactly the same sum?
Despite the similar names, this problem is very dierent
from P;
sorry, I didn’t make up the names. This problem can be proved NP-hard
by reduction from XM,
because it has a
in it. Unlike P,
the
P
problem is strongly NP-hard; it remains NP-hard even if every
input number is at most n3.
Later
performed o-Broadway
as Richard Karp and his
Assistants, for which Karp won a
well-deserved Tony Turing award. Surprisingly,
PNAES
is solvable in polynomial time!
405
12. NP-HARDNESS
• SetCover: Given a collection of sets S = {S1, S2,..., Sm}, find the smallest
sub-collection of Si’s that contains all the elements of S
i Si. This problem is
a generalization of both VC
and XM.
• HittingSet: Given a collection of sets S = {S1, S2,..., Sm}, find the minimum
number of elements of S
i Si that hit every set in S. This problem is
also a generalization of VC.
• LongestPath: Given a non-negatively weighted graph G (either directed
or undirected) and two vertices u and v, what is the longest simple path
from u to v in the graph? A path is simple if it visits each vertex at most
once. This problem is a generalization of the corresponding Hamiltonian
path problem. Of course, the corresponding shortest path problem can be
solved in polynomial time.
• SteinerTree: Given a weighted, undirected graph G with some of the
vertices marked, what is the minimum-weight subtree of G that contains
every marked vertex? If every vertex is marked, the minimum Steiner tree
is the minimum spanning tree; if exactly two vertices are marked, the
minimum Steiner tree is the shortest path between them. This problem can
be proved NP-hard by reduction from VC.
• Max2Sat: Given a Boolean formula in conjunctive normal form, with exactly
two literals per clause, find a variable assignment that maximizes the number
of clauses with at least one T
literal. This problem can be proved NP-hard
by reduction from S
(yes, even though it doesn’t have a
in it). The
simpler decision problem S,
which asks if there is an assignment that
satisfies every clause, can actually be solved in polynomial time.
• MaxCut: Given an undirected graph G = (V, E), find a subset S ⇢ V that
maximizes the number of edges with exactly one endpoint in S. Equivalently,
find the largest bipartite subgraph of G. This problem can be proved NP-hard
by reduction from MS.
In addition to these dry but useful problems, most interesting puzzles and solitaire
games have been shown to be NP-hard, or to have NP-hard generalizations.
(Arguably, if a game or puzzle isn’t at least NP-hard, it isn’t interesting!) Here
are some examples you may find familiar:
• Minesweeper (from CS)
• Sudoku (utlimately from S)
Richard
Kaye. Minesweeper is NP-complete. Mathematical Intelligencer ():–,
.
But see also: Allan Scott, Ulrike Stege, and Iris van Rooij. Minesweeper may not be NP-complete
but is hard nonetheless. Mathematical Intelligencer ():–,
.
Takayuki
Yato and Takahiro Seta. Complexity and completeness of finding another solution
and its application to puzzles. IEICE Transactions on Fundamentals of Electronics, Communications
and Computer Sciences E-A():–,
.
http://www-imai.is.s.u-tokyo.ac.jp/~yato/
data/MasterThesis.pdf.
406
12.14. Choosing the Right Problem
• Tetris (from P)
• Klondike, aka “Solitaire” (from S)
• Pac-Man (from HC)
• Super Mario Brothers (from S)
• Candy Crush Saga (from a variant of S)
• Threes/
(from S,
of course)
• Trainyard (from DS;
see Exercise )
• Shortest n ⇥ n ⇥ n Rubik’s cube solution (from S
via a special case of
PUHC).
• Cookie Clicker (from P
or P)
This list is necessarily incomplete, thanks to a limited footnote budget.
As
of June ,
nobody has published a proof that a generalization of Ultimate
Paperclips, Line Rider, Twister, or Cards Against Humanity is NP-hard, but I’m
sure it’s only a matter of time.
12.14 Choosing the Right Problem
One of the most dicult
steps in proving that a problem is NP-hard is choosing
the best problem to reduce from. The Cook-Levin Theorem implies that if there
is a reduction from any NP-hard problem to problem X, then there is a reduction
Ron
Breukelaar, Erik D. Demaine, Susan Hohenberger, Hendrik J. Hoogeboom, Walter A.
Kosters, and David Liben-Nowell. Tetris is hard, even to approximate. International Journal of
Computational Geometry and Applications :–,
.
Luc
Longpré and Pierre McKenzie. The complexity of Solitaire. Proceedings of the nd
International Mathematical Foundations of Computer Science, –,
.
Giovanni
Viglietta. Gaming is a hard job, but someone has to do it! Theory of Computing
Systems, ():–,
.
http://giovanniviglietta.com/papers/gaming.pdf.
Greg
Aloupis, Erik D. Demaine, Alan Guo, and Giovanni Viglietta. Classic Nintendo games
Are (computationally) hard. Theoretical Computer Science :–,
.
http://arxiv.org/
abs/..
Luciano
Gualà, Stefano Leucci, Emanuele Natale. Bejeweled, Candy Crush and other matchthree
games are (NP-)hard. Proc.
IEEE Conference on Computational Intelligence and Games,
.
http://arxiv.org/abs/..
Stefan
Langerman and Yushi Uno. Threes!, Fives, !,
and
are Hard. Proc. th
International Conference on Fun with Algorithms, .
https://arxiv.org/abs/..
Matteo
Almanza, Stefano Leucci, and Alessandro Panconesi. Trainyard is NP-Hard. Proc.
th
International Conference on Fun with Algorithms, .
https://arxiv.org/abs/..
Erik
D. Demaine, Sarah Eisenstat, and Mikhail Rudoy. Solving the Rubik’s Cube optimally
is NP-complete. Proc. th
Symposium on Theoretical Aspects of Computer Science, .
https:
//arxiv.org/abs/..
Erik
D. Demaine, Hiro Ito, Stefan Langerman, Jayson Lynch, Mikhail Rudoy, and Kai Xiao.
Cookie Clicker. Preprint, August .
https://arxiv.org/abs/..
See
https://xkcd.com//
407
12. NP-HARDNESS
from every NP-complete problem to problem X, but some problems are easier
to work with than others. There’s no systematic method for choosing the right
problem, but here are a few useful rules of thumb.
• If the problem asks how to assign bits to objects, or to choose a subset of
objects, or to partition objects into two dierent
subsets, try reducing from
some version of S
or P.
• If the problem asks how to assign labels to objects from a small fixed set,
or to partition objects into a small number of subsets, try reducing from
kC
or even C.
• If the problem asks to arrange a set of objects in a particular order, try
reducing from DHC
or DHP
or T-
S.
• If the problem asks to find a small subset satisfying some constraints, try
reducing from MVC.
• If the problem asks to find a large subset satisfying some constraints, try
reducing from MIS
or MC
or MS.
• If the problem asks to partition objects into a large number of small subsets,
try reducing from P.
• If the number 3 appears naturally in the problem, try S
or C
or
XM
or P.
(No, this is not a joke.)
• If all else fails, try S
or even CS!
I do not recommend trying to reduce from T,
SMB,
or
T.
You really want to choose a starting problem that is as simple as
possible, while still capturing some feature of your problem that makes it dicult
to solve.
12.15 A Frivolous Real-World Example
Draughts is a family of board games that have been played for thousands of
years. Most Americans are familiar with the version called checkers or English
draughts, but the most common variant worldwide, known as international
draughts or Polish draughts, originated in the Netherlands in the th
century.
For a complete set of rules, the reader should consult Wikipedia; here a few
important dierences
from the Anglo-American game:
• Flying kings: As in checkers, a piece that ends a move in the row closest to
the opponent becomes a king and gains the ability to move backward. Unlike
in checkers, however, a king in international draughts can move any distance
along a diagonal line in a single turn, as long as the intermediate squares
are empty or contain exactly one opposing piece (which is captured).
408
12.15. A Frivolous Real-World Example
• Forced maximum capture: In each turn, the moving player must capture
as many opposing pieces as possible. This is distinct from the forced-capture
rule in checkers, which requires only that each player must capture if possible,
and that a capturing move ends only when the moving piece cannot capture
further. In other words, checkers requires capturing a locally maximal set of
opposing pieces on each turn; whereas, international draughts requires a
globally maximum capture.
• Capture subtleties: As in checkers, captured pieces are removed from the
board only at the end of the turn. Any piece can be captured at most once.
Thus, when an opposing piece is jumped, that piece remains on the board
but cannot be jumped again until the end of the turn.
For example, in the first position shown below, each circle represents a piece,
and doubled circles represent kings. Black must make the first indicated move,
capturing five white pieces, because it is not possible to capture more than five
pieces, and there is no other move that captures five. Black cannot extend
his capture further, either northeast or southeast, because the captured White
pieces remain on the board until his turn is over. Then White must make the
second indicated move, thereby winning the game.
Figure 12.21. Two forced(!) moves in international draughts; doubled circles are kings.
The actual game, which is played on a 10 ⇥ 10 board with
pieces of each
color, is computationally trivial; we can precompute the optimal move for both
players in every possible board configuration and hard-code the results into a
lookup table of constant size. Sure, it’s a big constant, but it’s still a constant!
But consider the natural generalization of international draughts to an n ⇥ n
board. In this setting, finding a legal move is actually NP-hard! The following
reduction from the Hamiltonian cycle problem in directed graphs was discovered
by Bob Hearn in .
In most two-player games, finding the best move is
NP-hard (or worse). This is the only example I know of a game—and moreover
a real game played by millions of people for centuries—where merely following
the rules is NP-hard!
Given an undirected graph G with n vertices, we construct a board con-
figuration for international draughts, such that White can capture a certain
See
Theoretical Computer Science Stack Exchange: http://cstheory.stackexchange.com/a/
/.
409
12. NP-HARDNESS
number of black pieces in a single move if and only if G has a Hamiltonian
cycle. We treat G as a directed graph, with two arcs uv
and vu
in place of
each undirected edge uv. Number the vertices arbitrarily from 1 to n. The final
draughts configuration has several gadgets:
• The vertices of G are represented by rabbit-shaped vertex gadgets, which
are evenly spaced along a horizontal line. Each arc ij
is represented by a
path of two diagonal line segments from the “left ear” of vertex gadget i to
the “right ear” of vertex gadget j. The path for arc ij
is located above the
vertex gadgets if i < j, and below the vertex gadgets if i > j.
1 2 3 4
2 4
1
3
Figure 12.22. A high-level overview of the reduction from Hamiltonian cycle to international draughts.
• The bulk of each vertex gadget is a diamond-shaped region called a vault.
The walls of the vault are composed of two solid layers of black pieces,
which cannot be captured; these pieces are drawn as gray circles in the
figures. There are N capturable black pieces inside each vault, for some
large integer N to be determined later. A white king can enter the vault
through the “right ear”, capture every internal piece, and then exit through
the “left ear”. Both ears are hallways, again with walls two pieces thick,
with gaps where the arc paths end to allow the white king to enter and
leave. The lengths of the “ears” can easily be adjusted to align with the
other gadgets.
• For each arc ij,
we have a corner gadget, which allows a white king leaving
vertex gadget i to be redirected to vertex gadget j.
• Finally, wherever two arc paths cross, we have a crossing gadget; these
gadgets allow the white king to traverse either arc path, but forbid switching
from one arc path to the other.
A single white king starts at the bottom corner of one of the vaults. In any
legal move, this king must alternate between traversing entire arc paths and
clearing vaults. The king can traverse the various gadgets backward, entering
410
12.15. A Frivolous Real-World Example
1
2
4
4
2
1
3 3
Entrance
Exit
Figure 12.23. Left: A vertex gadget with three entrances and three exits. Right: A white king emptying
the vault. Gray circles are black pieces that cannot be captured.
Figure 12.24. Left: One of two paths through a corner gadget. Right: One of two paths through a
crossing gadget.
each vault through the exit and vice versa. But the reversal of a Hamiltonian
cycle in G is another Hamiltonian cycle in G, so walking backward is fine.
If there is a Hamiltonian cycle in G, the white king can capture at least nN
black pieces by visiting each of the other vaults and returning to the starting
vault. On the other hand, if there is no Hamiltonian cycle in G, the white king
can capture at most half of the pieces in the starting vault, and thus can capture
at most (n
1/2)N + O(n3) enemy pieces altogether. The O(n3) term accounts
for the corner and crossing gadgets; each edge passes through one corner gadget
and at most n2/2 crossing gadgets.
To complete the reduction, we set N = n4. Summing up, we obtain an
O(n5) ⇥ O(n5) board configuration, with O(n5) black pieces and one white king.
We can clearly construct this board configuration by brute force in polynomial
time. Figure .
shows a complete example of the construction.
It is still open whether the following related question is NP-hard: Given an
n ⇥ n board configuration for international draughts, can (and therefore must)
White capture all the black pieces (thereby winning the game) in a single turn?
411
12. NP-HARDNESS
1 2 3 4
Figure 12.25. The final draughts configuration for the 4-vertex graph in Figure 12.22. (The green arrows
are not actually part of the configuration.)
™12.16 On Beyond Zebra
P and NP are only the first two steps in an enormous hierarchy of complexity
classes. To close this chapter (and the book), let me describe a few more classes
of interest.
Polynomial Space
PSPACE is the set of decision problems that can be solved using polynomial
space. Every problem in NP (and therefore in P) is also in PSPACE. It is generally
believed that NP 6= PSPACE, but nobody can even prove that P 6= PSPACE.
A problem ⇧ is PSPACE-hard if, for any problem ⇧0 that can be solved using
polynomial space, there is a polynomial-time many-one reduction from ⇧0 to ⇧.
If any PSPACE-hard problem is in NP, then PSPACE=NP; similarly, if any
PSPACE-hard problem is in P, then PSPACE=P.
412
™12.16. On Beyond Zebra
The canonical PSPACE-hard problem is the quantified boolean formula
problem, or QBF: Given a boolean formula
that may include any number
of universal or existential quantifiers, but no free variables, is
equivalent to
T?
For example, the following expression is a valid input to QBF:
9a: 8b: 9c: (8d : a _ b _ c _ d¯) , ((b ^ ¯c) _ (9e : (a¯ ) e) _ (c 6= a ^ e))).
S
is equivalent to the special case of QBF where the input formula contains
only existential quantifiers (9). QBF remains PSPACE-hard even when the input
formula must have all its quantifiers at the beginning, the quantifiers strictly
alternate between 9 and 8, and the quantified proposition is in conjunctive
normal form, with exactly three literals in each clause, for example:
9a: 8b: 9c: 8d :
(a _ b _ c) ^ (b _ ¯c _ d¯) ^ (a¯ _ c _ d) ^ (a _ ¯
b _ d¯)
This restricted version of QBF can also be phrased as a two-player strategy
question. Suppose two players, Alice and Bob, are given a CNF
predicate
with free variables x1, x2,..., xn. The players alternately assign values to the
variables in order by index—Alice assigns a value to x1, Bob assigns a value
to x2, and so on. Alice eventually assigns values to every variable with an
odd index, and Bob eventually assigns values to every variable with an even
index. Alice wants to make the expression T,
and Bob wants to make
it F.
Assuming Alice and Bob play perfectly, who wins this game? Not
surprisingly, most two-player games
like tic-tac-toe, reversi, checkers, go,
chess, and mancala—or more accurately, appropriate generalizations of these
constant-size games to arbitrary board sizes—are PSPACE-hard.
Another canonical PSPACE-hard problem is NFA totality: Given a nondeterministic
finite-state automaton M over some alphabet ⌃, does M accept
every string in ⌃⇤? The closely related problems NFA equivalence (Do two given
NFAs accept the same language?) and NFA minimization (Find the smallest NFA
that accepts the same language as a given NFA) are also PSPACE-hard, as are
the corresponding questions about regular expressions. (The corresponding
questions about deterministic finite-state automata are actually solvable in
polynomial time.)
Exponential Time
The next significantly larger complexity class, EXP (also called EXPTIME), is
the set of decision problems that can be solved in exponential time, that is,
using at most 2nc
steps for some constant c > 0. Every problem in PSPACE (and
For
a good (but inevitably aging) overview of known results on the computational complexity
of games and puzzles, see Erik Demaine and Bob Hearn’s monograph Games, Puzzles, and
Computation (CRC Press, ).
413
12. NP-HARDNESS
therefore in NP (and therefore in P)) is also in EXP. It is generally believed
that PSPACE ( EXP, but nobody can even prove that NP 6= EXP. A problem ⇧ is
EXP-hard if, for any problem ⇧0 that can be solved in exponential time, there is
a polynomial-time many-one reduction from ⇧0 to ⇧. If any EXP-hard problem is
in PSPACE, then EXP=PSPACE; similarly, if any EXP-hard problem is in NP, then
EXP=NP. We do know that P 6= EXP; in particular, no EXP-hard problem is in P.
Natural generalizations of many interesting -player
games—like checkers,
chess, mancala, and go—are actually EXP-hard. The boundary between PSPACEcomplete
games and EXP-hard games is rather subtle. For example, there are
three ways to draw in chess (the standard 8 ⇥ 8 game): stalemate (the player to
move is not in check but has no legal moves), repeating the same board position
three times, or moving fifty times without capturing or moving a pawn. The
n ⇥ n generalization of chess is either in PSPACE or EXP-hard depending on
how we generalize these rules. If we declare a draw after (say) n3 capture-free
moves, then every game must end after a polynomial number of moves, so we
can simulate all possible games from any given position using only polynomial
space. On the other hand, if we ignore the capture-free move rule entirely, the
resulting game can last an exponential number of moves, so there no obvious
way to detect a repeating position using only polynomial space; indeed, this
version of n ⇥ n chess is EXP-hard.
Excelsior!
Naturally, even exponential time is not the end of the story. NEXP is the class
of decision problems that can be solve in nondeterministic exponential time;
equivalently, a decision problem is in NEXP if and only if, for every Y
instance,
there is a proof of this fact that can be checked in exponential time. EXPSPACE
is the set of decision problems that can be solved using exponential space. Even
these larger complexity classes have hard problems; for example, if we add the
intersection operator \ to the syntax of regular expressions, deciding whether
two such expressions describe the same language is EXPSPACE-hard. Beyond
EXPSPACE are complexity classes with doubly-exponential resource bounds
(EEXP, NEEXP, and EEXPSPACE), then triply exponential resource bounds
(EEEXP, NEEEXP, and EEEXPSPACE), and so on ad infinitum.
All these complexity classes can be ordered by inclusion:
P ✓ NP ✓ PSPACE ✓ EXP ✓ NEXP ✓ EXPSPACE ✓ EEXP ✓ NEEXP ✓ ···
Most complexity theorists strongly believe that every inclusion in this sequence
is strict; that is, no two of these complexity classes are equal. However, the
strongest result that has been proved is that every class in this sequence is strictly
contained in the class three steps later in the sequence. For example, we have
414
Exercises
proofs that P 6= EXP and PSPACE 6= EXPSPACE, but not whether P 6= PSPACE or
NP 6= EXP.
The limit of this series of increasingly exponential complexity classes is the
class ELEMENTARY of decision problems that can be solved using time or space
bounded by a function the form 2 "k n for some constant integer k, where
2 "k n :=
®
n if k = 0,
22"k1n
otherwise.
For example, 2 "1 n = 2n and 2 "2 n = 22n
.
It may be tempting to conjecture that every natural decidable problem can
be solved in elementary time, but in fact this conjecture is incorrect. Consider
the generalized regular expressions defined by recursively combining (possibly
empty) strings over some finite alphabet by concatenation (x y), union (x + y),
Kleene closure (x⇤), and negation (x). For example, the generalized regular
expression (0 + 1)
⇤00(0 + 1)
⇤ represents the set of strings in {0, 1}⇤ that do not
contain two 0s in a row. It is possible to determine algorithmically whether
two generalized regular expressions describe identical languages, by recursively
converting each expression into an equivalent NFA, converting each NFA into a
DFA, and then minimizing the DFA. However, the running time of this algorithm
has the non-elementary bound 2 "⇥(n) 2, intuitively because each layer of
recursive negation can exponentially increase the number of states. In fact,
Larry Stockmeyer proved in
that this problem cannot be solved in merely
elementary time, even if we forbid Kleene closure.
Exercises
.
(a) Describe and analyze and algorithm to solve P
in time O(nM),
where n is the size of the input set and M is the sum of the absolute
values of its elements.
(b) Why doesn’t this algorithm imply that P=NP?
.
Consider the following problem, called BD:
Given a set of n axisaligned
rectangles in the plane, how big is the largest subset of these
rectangles that contain a common point?
(a) Describe a polynomial-time reduction from BD
to MC.
(b) Describe and analyze a polynomial-time algorithm for BD.
[Hint:
O(n3) time should be easy, but O(n log n) time is possible.]
(c) Why don’t these two results imply that P=NP?
415
12. NP-HARDNESS
.
A boolean formula is in disjunctive normal form (or DNF) if it consists of a
disjunction (O)
or several terms, each of which is the conjunction (A)
of
one or more literals. For example, the formula
(x ^ y ^ z) _ ( y ^ z) _ (x ^ y ^ z)
is in disjunctive normal form. DNF-S
asks, given a boolean formula in
disjunctive normal form, whether that formula is satisfiable.
(a) Describe a polynomial-time algorithm to solve DNF-S.
(b) What is the error in the following argument that P=NP?
Suppose we are given a boolean formula in conjunctive normal form with
at most three literals per clause, and we want to know if it is satisfiable.
We can use the distributive law to construct an equivalent formula in disjunctive
normal form. For example,
(x _ y _ z) ^ (x _ y) () (x ^ y) _ (y ^ x) _ (z ^ x) _ (z ^ y)
Now we can use the algorithm from part (a) to determine, in polynomial
time, whether the resulting DNF formula is satisfiable. We have just solved
3SAT in polynomial time. Since 3SAT is NP-hard, we must conclude that
P=NP!
.
The problem AONS
asks, given a CNF
boolean formula,
whether there is an assignment to the variables such that each clause either
has three T
literals or has three F
literals.
(a) Describe a polynomial-time algorithm to solve AONS.
(b) But S
is NP-hard! Why doesn’t the existence of this algorithm prove
that P=NP?
.
(a) Suppose you are given a magic black box that can determine in polynomial
time, given an arbitrary weighted graph G, the length of the
shortest Hamiltonian cycle in G. Describe and analyze a polynomialtime
algorithm that computes, given an arbitrary weighted graph G,
the shortest Hamiltonian cycle in G, using this magic black box as a
subroutine.
(b) Suppose you are given a magic black box that can determine in polynomial
time, given an arbitrary graph G, the number of vertices in the
largest complete subgraph of G. Describe and analyze a polynomialtime
algorithm that computes, given an arbitrary graph G, a complete
subgraph of G of maximum size, using this magic black box as a subroutine.
416
Exercises
(c) Suppose you are given a magic black box that can determine in polynomial
time, given an arbitrary graph G, whether G is 3-colorable.
Describe and analyze a polynomial-time algorithm that either computes
a proper 3-coloring of a given graph or correctly reports that no
such coloring exists, using the magic black box as a subroutine. [Hint:
The input to the magic black box is a graph. Only a graph. Vertices and
edges. Nothing else.]
(d) Suppose you are given a magic black box that can determine in polynomial
time, given an arbitrary boolean formula ,
whether
is satisfiable.
Describe and analyze a polynomial-time algorithm that either computes
a satisfying assignment for a given boolean formula or correctly reports
that no such assignment exists, using the magic black box as a subroutine.
(e) Suppose you are given a magic black box that can determine in polynomial
time, given an arbitrary set X of positive integers, whether X can
be partitioned into two sets A and B such that PA = PB. Describe and
analyze a polynomial-time algorithm that either computes an equal
partition of a given set of positive integers or correctly reports that no
such partition exists, using the magic black box as a subroutine.
®™(f) Suppose you are given a magic black box that can determine in polynomial
time, given an arbitrary generalized regular expression R (as
defined just before the Exercises), whether R matches any string. Describe
and analyze a polynomial-time algorithm that either finds a
single string that matches a given generalized regular expression or
correctly reports that no such string exists, using the magic black box as
a subroutine.
.
There’s something special about the number 3.
(a) Describe and analyze a polynomial-time algorithm for P.
Given
a set S of 2n positive integers, your algorithm will determine in polynomial
time whether the elements of S can be split into n disjoint pairs
whose sums are all equal.
(b) Describe and analyze a polynomial-time algorithm for C.
Given an
undirected graph G, your algorithm will determine in polynomial time
whether G has a proper coloring that uses only two colors.
(c) Describe and analyze a polynomial-time algorithm for S.
Given a
boolean formula
in conjunctive normal form, with exactly two literals
per clause, your algorithm will determine in polynomial time whether
has a satisfying assignment. [Hint: This problem is strongly connected
to topics described in an earlier chapter.]
417
12. NP-HARDNESS
.
There’s nothing special about the number 3.
(a) The problem P
is defined as follows: Given a set S of 12n
positive integers, determine whether the elements of S can be split into n
subsets, each with 12 elements, whose sums are all equal. Prove that
P
is NP-hard. [Hint: Reduce from P.
It may be
easier to consider multisets first.]
(b) The problem C
is defined as follows: Given an undirected graph G,
determine whether we can color each vertex with one of twelve colors,
so that every edge touches two dierent
colors. Prove that C
is
NP-hard. [Hint: Reduce from C.]
(c) The problem SAT
is defined as follows: Given a boolean formula
in conjunctive normal form, with exactly twelve literals per clause,
determine whether
has a satisfying assignment. Prove that S
is
NP-hard. [Hint: Reduce from S.]
.
There are two dierent
versions of the Hamiltonian cycle problem, one for
directed graphs and one for undirected graphs. Earlier in this chapter you
can find two proofs that the directed Hamiltonian cycle problem is NP-hard.
(a) Describe a polynomial-time reduction from the undirected Hamiltonian
cycle problem to the directed Hamiltonian cycle problem. Prove your
reduction is correct.
(b) Describe a polynomial-time reduction from the directed Hamiltonian
cycle problem to the undirected Hamiltonian cycle problem. Prove your
reduction is correct.
(c) Which of these two reductions implies that the undirected Hamiltonian
cycle problem is NP-hard?
.
(a) Describe a polynomial-time reduction from UH-
C
to DHC.
(b) Describe a polynomial-time reduction from DH-
C
to UHC.
.
(a) Describe a polynomial-time reduction from the HP
problem
to HC.
(b) Describe a polynomial-time reduction from the HC
problem to HP.
[Hint: A polynomial-time reduction can
call the black-box subroutine more than once, but it doesn’t have to.]
.
Consider the following subtle variants of CNFS.
For each problem, the
input is a boolean formula
in conjunctive normal form, and the goal is to
determine whether
has a satisfying assignment.
418
Exercises
(a) Suppose every clause of
contains at most three literals and each
variable appears in at most three clauses. Prove that this variant of
CNFS
is NP-hard.
(b) Suppose every clause of
contains exactly three literals and each
variable appears in at most four clauses. Prove that this variant of S
is NP-hard. [Hint: Solve part (a) first.]
™(c) Suppose every clause of
can contain any number of literals, but each
variable appears in at most two clauses. Describe a polynomial-time
algorithm for this variant of CNFS.
™(d) Suppose every clause of
contains exactly three literals and each variable
appears in at most three clauses. Prove that
must be satisfiable. (So
this variant of S
is completely trivial!)
.
(a) Prove that NAES
is NP-hard.
(b) Prove that --S
is NP-hard.
.
A boolean formula in exclusive-or conjunctive normal form (XCNF) is a
conjunction (A)
of several clauses, each of which is the exclusive-or of
several literals; that is, a clause is true if and only if it contains an odd number
of true literals. The XCNF-S
problem asks whether a given XCNF formula
is satisfiable. Either describe a polynomial-time algorithm for XCNF-S
or
prove that XCNF-S
is NP-hard. [Hint: Do not try to do both.]
™.
Consider the following variant of S,
called MS.
Just like S,
the input to MS
is a boolean formula
in conjunctive normal
form, with exactly three literals er clause. MS
asks whether
there is an assignment to the variables of ,
such that every clause contains
at least two T
literals.
Either describe an algorithm that solves MS
in polynomial
time or prove that MS
is NP-hard. [Hint: Do not try to do both.]
´™.
For any subset X ✓ {0, 1, 2, 3}, consider the following problem, which I’ll
call X-S.
The input is a boolean formula
in conjunctive normal form,
with exactly three literals in each clause. The problem is to decide whether
there is an assignment to the variables of
such that in each clause of ,
the number of T
literals is in the set X. For example:
• {1, 2, 3}-S
is the standard S
problem.
• {0, 3}-S
is the same as AONS.
(See Exercise .)
• {1, 2}-S
is usually called NAES.
(See Exercise (a).)
• {1}-S
is usually called --S.
(See Exercise (b).)
419
12. NP-HARDNESS
• {1, 3}-S
is usually called XCNF-S.
(See Exercise .)
• {2, 3}-S
is usually called MS.
(See Exercise .)
Give a complete list of all subsets X ✓ {0, 1, 2, 3} such that X-S
is solvable
in polynomial time, assuming P 6= NP. [Hint: Don’t give 16 dierent
arguments.]
.
Prove that the following problems are NP-hard.
(a) Given an undirected graph G, does G contain a simple path that visits
all but
vertices?
(b) Given an undirected graph G, does G have a spanning tree in which
every node has degree at most ?
(c) Given an undirected graph G, does G have a spanning tree with at most
leaves?
(d) Given an undirected graph G = (V, E), what is the size of the largest
subset of vertices S ✓ V such that at most
edges in E have both
endpoints in S?
(e) Given an undirected graph G = (V, E), what is the size of the largest
subset of vertices S ✓ V such that each vertex in S has at most
neighbors in S?
(f) Given an undirected graph G, is it possible to color the vertices of G with
three dierent
colors, so that at most
edges have both endpoints
the same color?
.
Prove that the following variants of the minimum spanning tree problem
are NP-hard.
(a) Given a graph G, compute the maximum-diameter spanning tree of G.
(The diameter of a tree T is the length of the longest path in T.)
(b) Given a graph G with weighted edges, compute the minimum-weight
depth-first spanning tree of G.
(c) Given a graph G with weighted edges and a subset S of vertices of G,
compute the minimum-weight spanning tree all of whose leaves are in S.
(d) Given a graph G with weighted edges and an integer `, compute the
minimum-weight spanning tree with at most ` leaves.
(e) Given a graph G with weighted edges and an integer ,
compute the
minimum-weight spanning tree where every node has degree at most .
.
(a) Using the gadget in Figure .(a),
prove that deciding whether a given
planar graph is -colorable
is NP-hard. [Hint: Show that the gadget
can be -colored,
and then replace any crossings in a planar embedding
with the gadget appropriately.]
420
Exercises
(b) Using part (a) and the gadget in Figure .(b),
prove that deciding
whether a planar graph with maximum degree
is -colorable
is NP-hard.
[Hint: Replace any vertex with degree greater than
with a collection
of gadgets connected so that no degree is greater than four.]
(a) (b)
Figure 12.26. (a) Gadget for planar 3-colorability. (b) Gadget for degree-4 planar 3-colorability.
.
Prove that PCS
is NP-hard. [Hint: Construct a gadget for
crossing wires.]
.
(a) Describe a polynomial-time reduction from S
to S.
(b) Describe a polynomial-time reduction from S
to S.
™.
Describe a direct polynomial-time reduction from C
to C.
(This
is a lot harder than the opposite direction.)
.
A domino is a 1 ⇥ 2 rectangle divided into two squares, each of which is
labeled with an integer.
In a legal arrangement of dominos, the dominos
are lined up end-to-end so that the numbers on adjacent ends match.
Figure 12.27. A legal arrangement of dominos in which every integer between 0 and 6 appears twice
For each of the following problems, either describe a polynomial-time
algorithm or prove that the problem is NP-hard:
(a) Given an arbitrary bag D of dominos, is there a legal arrangement of all
the dominos in D?
(b) Given an arbitrary bag D of dominos, is there a legal arrangement of a
dominos from D in which every integer between 1 and n appears exactly
twice?
These
integers are usually represented by pips, exactly like dice. On a standard domino, the
number of pips on each side is between
and ,
although one can buy sets with up to
or even
pips on each side; we will allow arbitrary integer labels. A standard set of dominos contains
exactly one domino for each possible unordered pair of labels; we do not assume that the inputs
to our problems have this property.
421
12. NP-HARDNESS
™(c) Given an arbitrary bag D of dominos, what is the largest number of
dominos we can take from D to make a legal arrangement?
.
Pebbling is a solitaire game played on an undirected graph G, where each
vertex has zero or more pebbles. A single pebbling move consists of removing
two pebbles from a vertex v and adding one pebble to an arbitrary neighbor
of v. (Obviously, the vertex v must have at least two pebbles before the move.)
The PD
problem asks, given a graph G = (V, E) and a
pebble count p(v) for each vertex v, whether is there a sequence of pebbling
moves that removes all but one pebble. Prove that PD
is
NP-hard.
.
Recall that a -coloring
of a graph G is a function that assigns each vertex
of G a “color” from the set {0, 1, 2, 3, 4}, such that for any edge uv, vertices u
and v are assigned dierent
“colors”. A -coloring
is careful if the colors
assigned to adjacent vertices are not only distinct, but dier
by more than 1
(mod 5). Prove that deciding whether a given graph has a careful 5-coloring
is NP-hard. [Hint: Reduce from the standard C
problem.]
3
4
0 4
2
3 0 2
1
Figure 12.28. A careful 5-coloring.
.
(a) A subset S of vertices in an undirected graph G is half-independent if
each vertex in S is adjacent to at most one other vertex in S. Prove that
finding the size of the largest half-independent set of vertices in a given
undirected graph is NP-hard.
(b) A subset S of vertices in an undirected graph G is sort-of-independent if
if each vertex in S is adjacent to at most
other vertices in S. Prove
that finding the size of the largest sort-of-independent set of vertices in
a given undirected graph is NP-hard.
(c) A subset S of vertices in an undirected graph G is almost independent if at
most
edges in G have both endpoints in S. Prove that finding the size
of the largest almost-independent set of vertices in a given undirected
graph is NP-hard.
.
Let G = (V, E) be a graph. A dominating set in G is a subset S of the vertices
such that every vertex in G is either in S or adjacent to a vertex in S. The
422
Exercises
DS
problem asks, given a graph G and an integer k as input,
whether G contains a dominating set of size k. Prove that this problem is
NP-hard.
Figure 12.29. A dominating set of size 3 in the Petersen graph.
.
A subset S of vertices in an undirected graph G is triangle-free if, for every
triple of vertices u, v, w 2 S, at least one of the three edges uv, uw, vw is
absent from G. Prove that finding the size of the largest triangle-free subset
of vertices in a given undirected graph is NP-hard.
Figure 12.30. A triangle-free subset of 7 vertices. This is not the largest such subset in this graph.
.
The RT
problem is defined as follows: Given one large
rectangle and several smaller rectangles, determine whether the smaller
rectangles can be placed inside the large rectangle with no gaps or overlaps.
(a) Prove that RT
is NP-hard.
(b) Prove that RT
is strongly NP-hard.
7
1
5 3
6
2
4
7
1
5 3
6
2 4
Figure 12.31. A positive instance of the RT
problem.
.
(a) A subset B of vertices in a graph G is a Burr set if removing every vertex
in B from G leaves a subgraph that does not contain a Hamiltonian path.
Prove that finding the smallest Burr set in a given graph is NP-hard.
(b) A subset S of vertices in a graph G is a Schuyler set if removing every
vertex in S from G leaves a subgraph that does contain a Hamiltonian
423
12. NP-HARDNESS
path. Prove that finding the smallest Schuyler set in a given graph is
NP-hard.
.
(a) A tonian path in a graph G is a path that goes through at least half of
the vertices of G. Show that determining whether a graph has a tonian
path is NP-hard.
(b) A tonian cycle in a graph G is a cycle that goes through at least half of
the vertices of G. Show that determining whether a graph has a tonian
cycle is NP-hard. [Hint: Use part (a). Or not.]
.
Let G be an undirected graph with weighted edges. A heavy Hamiltonian
cycle is a cycle C that passes through each vertex of G exactly once, such that
the total weight of the edges in C is more than half of the total weight of all
edges in G. Prove that deciding whether a graph has a heavy Hamiltonian
cycle is NP-hard.
4
8
2
7
5
3
1
12
8
6
5
9
5
Figure 12.32. A heavy Hamiltonian cycle. The cycle has total weight 34; the graph has total weight 67.
.
For each of the following problems, either describe a polynomial-time
algorithm or prove that the problem is NP-hard.
(a) A double-Eulerian tour in an undirected graph G is a closed walk that
traverses every edge in G exactly twice. Given a graph G, does G have a
double-Eulerian tour?
(b) A double-Hamiltonian tour in an undirected graph G is a closed walk
that visits every vertex in G exactly twice. Given a graph G, does G have
a double-Hamiltonian tour?
(c) A double-Hamiltonian circuit in an undirected graph G is a closed walk
that visits every vertex in G exactly twice and traverses each edge in G at
most once. Given a graph G, does G have a double-Hamiltonian circuit?
(d) A triple-Eulerian tour in an undirected graph G is a closed walk that
traverses every edge in G exactly three times. Given a graph G, does G
have a triple-Eulerian tour?
424
Exercises
(e) A triple-Hamiltonian tour in an undirected graph G is a closed walk that
visits every vertex in G exactly three times. Given a graph G, does G
have a triple-Hamiltonian tour?
.
This exercise asks you to prove that a certain reduction from VC
to ST
is correct. Suppose we want to find the smallest vertex
cover in a given undirected graph G = (V, E). We construct a new graph
H = (V0
, E0
) as follows:
• V0 = V [ E [ {z}
• E0 = {ve | v 2 V is an endpoint of e 2 E} [ {vz | v 2 V}.
Equivalently, we construct H by subdividing each edge in G with a new
vertex, and then connecting all the original vertices of G to a new apex
vertex z.
Prove that G has a vertex cover of size k if and only if there is a subtree
of H with k + |E| + 1 vertices that contains every vertex in E [ {z}.
.
Consider the following solitaire game. The puzzle consists of an n ⇥ m
grid of squares, where each square may be empty, occupied by a red stone,
or occupied by a blue stone. The goal of the puzzle is to remove some of
the given stones so that the remaining stones satisfy two conditions: ()
every row contains at least one stone, and ()
no column contains stones of
both colors. For some initial configurations of stones, reaching this goal is
impossible.
A solvable puzzle and one of its many solutions. An unsolvable puzzle.
Prove that it is NP-hard to determine, given an initial configuration of
red and blue stones, whether the puzzle can be solved.
.
Each of the following games involves an n ⇥ m grid of squares, where each
square is either empty or occupied by a stone. In a single move, you can
remove all the stones in an arbitrary column.
(a) Prove that it is NP-hard to find the smallest subset of columns that can
be cleared so that at most one stone remains in each row of the grid.
(b) Prove that it is NP-hard to find the largest subset of columns that can be
cleared so that at least one stone remains in each row of the grid.
425
12. NP-HARDNESS
™(c) Prove that it is NP-hard to determine whether any subset of columns can
be cleared so that exactly one stone remains in each row of the grid.
.
Je
tries to make his students happy. At the beginning of class, he passes out
a questionnaire that lists a number of possible course policies in areas where
he is flexible. Every student is asked to respond to each possible course
policy with one of “strongly favor”, “mostly neutral”, or “strongly oppose”.
Each student may respond with “strongly favor” or “strongly oppose” to at
most five questions. Because Je’s
students are very understanding, each
student is happy if (but only if) he or she prevails in at least one of their
strong policy preferences. Either describe a polynomial-time algorithm for
setting course policy to maximize the number of happy students, or show
that the problem is NP-hard.
.
You’re in charge of choreographing a musical for your local community
theater, and it’s time to figure out the final pose of the big show-stopping
number at the end. (“Streetcar!”) You’ve decided that each of the n cast
members in the show will be positioned in a big line when the song finishes,
all with their arms extended and showing o
their best spirit fingers.
The director has declared that during the final flourish, each cast member
must either point both their arms up or point both their arms down; it’s
your job to figure out who points up and who points down. Moreover, the
director has also given you a list of arrangements that will upset his delicate
artistic temperament. Each forbidden arrangement is a subset of the cast
members paired with arm positions; for example: “Marge may not point her
arms up while Ned, Apu, and Smithers point their arms down.”
Prove that finding an acceptable arrangement of arm positions is NP-hard.
.
The next time you are at a party, one of the guests will suggest everyone play a
round of Three-Way Mumbletypeg, a game of skill and dexterity that requires
three teams and a knife. The ocial
Rules of Three-Way Mumbletypeg
(fixed during the Holy Roman Three-Way Mumbletypeg Council in )
require that ()
each team must have at least one person, ()
any two people
on the same team must know each other, and ()
everyone watching the
game must be on one of the three teams. Of course, it will be a really fun
party; nobody will want to leave. There will be several pairs of people at
the party who don’t know each other. The host of the party, having heard
thrilling tales of your prowess in all things algorithmic, will hand you a list
of which pairs of party-goers know each other and ask you to choose the
teams, while he sharpens the knife.
426
Exercises
Either describe and analyze a polynomial time algorithm to determine
whether the party-goers can be split into three legal Three-Way Mumbletypeg
teams, or prove that the problem is NP-hard.
The Algorithm March is performed by a line of people; each person in line
starts a specific sequence of movements one measure later than the person
directly in front of them. Thus, the march is the dance equivalent of a
musical round or canon, like “Row Row Row Your Boat” or "Frère Jacques".
Proper etiquette dictates that each marcher must know the person
directly in front of them in line, lest a minor mistake lead to horrible
embarrassment between strangers. Suppose you are given a complete list
of which people at your party know each other. Prove that it is NP-hard
to determine the largest number of party-goers that can participate in the
Algorithm March. You may assume without loss of generality that there are
no ninjas at your party.
®.
Prove that the following problems about nondeterministic finite-state automata
and regular expressions are NP-hard:
(a) Given an NFA M over the alphabet ⌃ = {0, 1}, is there a string in ⌃⇤
that M does not accept?
(b) Given an acyclic NFA M over the alphabet ⌃ = {0, 1}, what is the length
of the shortest string in ⌃⇤ that M does not accept?
(c) Given a regular expression R over the alphabet ⌃ = {0, 1}, is there a
string in ⌃⇤ that R does not match?
(d) Given a star-free regular expression R over the alphabet ⌃ = {0, 1}, what
is the length of the shortest string in ⌃⇤ that R does not match?
(In fact, problems (a) and (c) are PSPACE-complete; even proving that these
problems are in PSPACE is nontrivial.)
®.
(a) Describe a polynomial-time algorithm for the following problem: Given
an NFA M over the alphabet ⌃ = {0, 1}, is there a string in ⌃⇤ that M
does accept?
(b) Describe a polynomial-time algorithm for the following problem: Given
a regular expression R over the alphabet ⌃ = {0, 1}, is there a string
in ⌃⇤ that R does match?
427
12. NP-HARDNESS
(c) The complement of any regular language is another regular language.
So why don’t these two algorithms, together with the NP-hardness results
in Problem ,
prove that P=NP?
.
Charon needs to ferry n recently deceased people across the river Acheron
into Hades. Certain pairs of these people are sworn enemies, who cannot
be together on either side of the river unless Charon is also present. (If two
enemies are left alone, one will steal the obol from the other’s mouth, leaving
them to wander the banks of the Acheron as a ghost for all eternity. Let’s
just say this is a Very Bad Thing.) The ferry can hold at most k passengers
at a time, including Charon, and only Charon can pilot the ferry.
Prove that it is NP-hard to decide whether Charon can ferry all n people
across the Acheron unharmed (aside from being, you know, dead). The
input for Charon’s problem consists of the integers k and n and an n-vertex
graph G describing the pairs of enemies. The output is either T
or F.
Please do not write your solution in classical Latin.
This
problem is a generalization of the well-known wolf-goat-and-cabbage puzzle, whose
first known appearance is in the remarkable medieval manuscript Propositiones ad Acuendos
Juvenes [Problems to Sharpen the Young].
Homo quidam debebat ultra fluuium transferre lupum, capram, et fasciculum cauli. Et
non potuit aliam nauem inuenire, nisi quae duos tantum ex ipsis ferre ualebat. Praeceptum
itaque ei fuerat, ut omnia haec ultra illaesa omnino transferret. Dicat, qui potest, quomodo
eis illaesis transire potuit?
Solutio. Simili namque tenore ducerem prius capram et dimitterem foris lupum et caulum.
Tum deinde uenirem, lupumque transferrem: lupoque foris misso capram naui receptam ultra
reducerem; capramque foris missam caulum transueherem ultra; atque iterum remigassem,
capramque assumptam ultra duxissem. Sicque faciendo facta erit remigatio salubris, absque
uoragine lacerationis.
For those few readers whose classical Latin is a little rusty, here is an English translation:
A man needed to transfer a wolf, a goat, and a bundle of cabbage across a river. However,
he found that his boat could only bear the weight of two [objects at a time, including the
man]. And he had to get everything across unharmed. Tell me if you can: How they were
able to cross unharmed?
Solution. In a similar fashion [as an earlier problem], I would first take the goat across
and leave the wolf and cabbage on the opposite bank. Then I would take the wolf across;
leaving the wolf on shore, I would retrieve the goat and bring it back again. Then I would
leave the goat and take the cabbage across. And then I would row across again and get the
goat. In this way the crossing would go well, without any threat of slaughter.
The most likely author of the Propositiones is the prolific th-century
English scholar Alcuin of
York. The evidence for Alcuin’s authorship of this treatise is somewhat circumstantial; however,
we do know from his correspondence with Charlemagne that he sent the emperor some “simple
arithmetical problems for fun”. Most modern scholars believe that even if Alcuin did write the
Propositiones, he did not invent all of the problems himself, but rather collected them from even
earlier sources.
Some things never change.
428
版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。