联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp

您当前位置:首页 >> CS作业CS作业

日期:2020-12-08 11:35

[I]n his short and broken treatise he provides an eternal example—not of laws, or

even of method, for there is no method except to be very intelligent, but of

intelligence itself swiftly operating the analysis of sensation to the point of

principle and definition.

— T. S. Eliot on Aristotle, “The Perfect Critic”, The Sacred Wood (1921)

The nice thing about standards is that you have so many to choose from;

furthermore, if you do not like any of them, you can just wait for next year’s model.

— Andrew S. Tannenbaum, Computer Networks (1981)

It is a rare mind indeed that can render the hitherto non-existent blindingly obvious.

The cry “I could have thought of that” is a very popular and misleading one, for the

fact is that they didn’t, and a very significant and revealing fact it is too.

— Dirk Gently to Richard McDuff

in Douglas Adams’ Dirk Gently’s Holistic Detective Agency (1987)

If a problem has no solution, it may not be a problem, but a fact —

not to be solved, but to be coped with over time.

— Shimon Peres, as quoted by David Rumsfeld, Rumsfeld’s Rules (2001)

12

NP-Hardness

12.1 A Game You Can’t Win

Imagine that a salesman in a red suit, who looks suspiciously like Tom Waits,

presents you with a black steel box with n binary switches on the front and

a light bulb on top. The salesman tells you that the state of the light bulb is

controlled by a complex boolean circuit—a collection of A,

O,

and N

gates

connected by wires, with one input wire for each switch and a single output

wire for the bulb. He then asks you a simple question: Is it possible to set

the switches so that the light bulb turns on? If you can answer this question

correctly, he will give you one million one hundred billion dollars; if you answer

incorrectly, or if you die without answering, he will take your soul.

x

y x x

y x∧y x∨y ¬x

Figure 12.1. An AND gate, an OR gate, and a NOT gate.

379

12. NP-HARDNESS

Figure 12.2. A boolean circuit. Inputs enter from the left, and the output leaves to the right.

As far as you can tell, the Adversary hasn’t connected the switches to the

light bulb at all, so no matter how you set the switches, the light bulb will stay

o.

If you declare that it is possible to turn on the light, the Adversary will open

the box and reveal that there is no circuit at all. But if you declare that it is

not possible to turn on the light, before testing all 2n settings, the Adversary

will magically create a circuit inside the box that turns on the light if and only

if the switches are in one of the settings you haven’t tested, and then flip the

switches to that setting, turning on the light. (You can’t detect the Adversary’s

cheating, because you can’t see inside the box until the end.) The only way

to provably answer the Adversary’s question correctly is to try all 2n possible

settings. You quickly realize that this will take far longer than you expect to

live, so you gracefully decline the Adversary’s oer.

The Adversary smiles and says, in a growl like Heath Ledger’s Joker after

smoking a carton of Marlboros, “Ah, yes, of course, you have no reason to trust

me. But perhaps I can set your mind at ease.” He hands you a large roll of

parchment—which you hope was made from sheep skin—with a circuit diagram

drawn (or perhaps tattooed) on it. “Here are the complete plans for the circuit

inside the box. Feel free to poke around inside the box to make sure the plans

are correct. Or build your own circuit from these plans. Or write a computer

program to simulate the circuit. Whatever you like. If you discover that the

plans don’t match the actual circuit in the box, you win the hundred billion

bucks.” A few spot checks convince you that the plans have no obvious flaws;

subtle cheating appears to be impossible.

But you should still decline the Adversary’s “generous” oer.

The problem

that the Adversary is posing is called circuit satisfiability or CS:

Given

a boolean circuit, is there a set of inputs that makes the circuit output T,

or

conversely, whether the circuit always outputs F.

For any particular input

setting, we can calculate the output of the circuit in polynomial (actually, linear)

time using depth-first-search. But nobody knows how to solve CS

faster

than trying all 2n possible inputs to the circuit by brute force, which requires

exponential time. Admittedly, nobody has actually formally proved that we can’t

380

12.2. P versus NP

beat brute force—maybe, just maybe, there’s a clever algorithm that just hasn’t

been discovered yet—but nobody has actually formally proved that anti-gravity

unicorns don’t exist, either. For all practical purposes, it’s safe to assume that

there is no fast algorithm for CS.

You tell the salesman no. He smiles and says, “You’re smarter than you look,

kid,” and then flies away on his anti-gravity unicorn.

12.2 P versus NP

A minimal requirement for an algorithm to be considered “ecient”

is that its

running time is bounded by a polynomial function of the input size: O(nc

) for

some constant c, where n is the size of the input.

Researchers recognized

early on that not all problems can be solved this quickly, but had a hard time

figuring out exactly which ones could and which ones couldn’t. There are

several so-called NP-hard problems, which most people believe cannot be solved

in polynomial time, even though nobody can prove a super-polynomial lower

bound.

A decision problem is a problem whose output is a single boolean value: Y

or N.

Let me define three classes of decision problems:

• P is the set of decision problems that can be solved in polynomial time.

Intuitively, P is the set of problems that can be solved quickly.

• NP is the set of decision problems with the following property: If the answer

is Y,

then there is a proof of this fact that can be checked in polynomial

time. Intuitively, NP is the set of decision problems where we can verify a

Y

answer quickly if we have the solution in front of us.

• co-NP is essentially the opposite of NP. If the answer to a problem in co-NP

is N,

then there is a proof of this fact that can be checked in polynomial

time.

For example, the circuit satisfiability problem is in NP. If a given boolean circuit

is satisfiable, then any set of m input values that produces T

output is a

proof that the circuit is satisfiable; we can check the proof by evaluating the

circuit in polynomial time. It is widely believed that circuit satisfiability is not

in P or in co-NP, but nobody actually knows.

Every decision problem in P is also in NP. If a problem is in P, we can verify

Y

answers in polynomial time recomputing the answer from scratch! Similarly,

every problem in P is also in co-NP.


This notion of eciency

was independently formalized by Alan Cobham in ,

Jack

Edmonds in ,

and Michael Rabin in ,

although similar notions were considered more

than a decade earlier by Kurt Gödel, John Nash, and John von Neumann.

381

12. NP-HARDNESS

Perhaps the single most important unanswered question in theoretical

computer science—if not all of computer science—if not all of science—is

whether the complexity classes P and NP are actually dierent.

Intuitively,

it seems obvious to most people that P 6= NP; the homeworks and exams in

your algorithms and data structures classes have (I hope) convinced you that

problems can be incredibly hard to solve, even when the solutions are simple

in retrospect. It’s completely obvious; of course solving problems from scratch

is harder than verifying that a given solution is correct. We can reasonably

accept—and most algorithm designers do accept—the statement “P 6= NP” as a

law of nature, similar to other laws of nature like Maxwell’s equations, general

relativity, and the sun rising tomorrow morning that are strongly supported by

evidence, but have no mathematical proof.

But if we’re being mathematically rigorous, we have to admit that nobody

knows how to prove that that P 6= NP. In fact, there has been little or no real

progress toward a proof for decades.

The Clay Mathematics Institute lists

P-versus-NP as the first of its seven Millennium Prize Problems, oering

a

$,,

reward for its solution. And yes, in fact, several people have lost

their souls, or at least their sanity, attempting to solve this problem.

A more subtle but still open question is whether the complexity classes NP

and co-NP are dierent.

Even if we can verify every Y

answer quickly, there’s

no reason to believe we can also verify N

answers quickly. For example, as far

as we know, there is no short proof that a boolean circuit is not satisfiable. It is

generally believed that NP 6= co-NP, but again, nobody knows how to prove it.

P

coNP NP

Figure 12.3. What we think the world looks like.

12.3 NP-hard, NP-easy, and NP-complete

A problem ⇧ is NP-hard if a polynomial-time algorithm for ⇧ would imply a

polynomial-time algorithm for every problem in NP. In other words:

⇧ is NP-hard () If ⇧ can be solved in polynomial time, then P=NP


Perhaps the most significant progress has taken the form of barrier results, which imply that

entire avenues of attack are doomed to fail. In a very real sense, not only do we have no idea

how to prove P 6= NP, but we can actually prove that we have no idea how to prove P 6= NP!

382

12.3. NP-hard, NP-easy, and NP-complete

Intuitively, if we could solve one particular NP-hard problem quickly, then we

could quickly solve any problem whose solution is easy to understand, using

the solution to that one special problem as a subroutine. NP-hard problems are

at least as hard as every problem in NP.

Finally, a problem is NP-complete if it is both NP-hard and an element of

NP (or “NP-easy”). Informally, NP-complete problems are the hardest problems

in NP. A polynomial-time algorithm for even one NP-complete problem would

immediately imply a polynomial-time algorithm for every NP-complete problem.

Literally thousands of problems have been shown to be NP-complete, so a

polynomial-time algorithm for one (and therefore all) of them seems incredibly

unlikely.

Calling a problem NP-hard is like saying “If I own a dog, then it can speak

fluent English.” You probably don’t know whether or not I own a dog, but I bet

you’re pretty sure that I don’t own a talking dog. Nobody has a mathematical

proof that dogs can’t speak English—the fact that no one has ever heard a dog

speak English is evidence, as are the hundreds of examinations of dogs that

lacked the proper mouth shape and brainpower, but mere evidence is not a

mathematical proof. Nevertheless, no sane person would believe me if I said I

owned a dog that spoke fluent English.

So the statement “If I own a dog, then

it can speak fluent English” has a natural corollary: No one in their right mind

should believe that I own a dog! Similarly, if a problem is NP-hard, no one in

their right mind should believe it can be solved in polynomial time.

P

NP coNP

NP-hard

NP-complete

Figure 12.4. More of what we think the world looks like.

It is not immediately obvious that any problems are NP-hard. The following

remarkable theorem was first published by Stephen Cook in

and

independently by Leonid Levin in .



. . . The department chair shakes his head sadly and says, “Oh, come on, that just sounds

like barking. Let me ask a question. Who was the greatest complexity theorist of all time?” The

dog cocks his head, pauses for a few seconds, and then says “Karp!” After the chair chases them

out of his oce,

the dog turns to its owner and says, “Maybe I should have said Impagliazzo?” Levin

first reported his results at seminars in Moscow in ,

while still a PhD student. News

of Cook’s result did not reach the Soviet Union until at least ,

after Levin’s announcement of

his results had been published; in accordance with Stigler’s Law, this result is often called “Cook’s

Theorem”. Levin was denied his PhD at Moscow University for political reasons; he emigrated to

the US in

and earned a PhD at MIT a year later. Cook was denied tenure by the Berkeley

383

12. NP-HARDNESS

The Cook-Levin Theorem. Circuit satisfiability is NP-hard.

I won’t even sketch a proof here, because I’ve been (deliberately) vague

about the definitions.

™12.4 Formal Definitions (HC SVNT DRACONES)

Formally, the complexity classes P, NP, and co-NP are defined in terms of

languages and Turing machines. A language is a set of strings over some finite

alphabet ⌃; without loss of generality, we can assume that ⌃ = {0, 1}. A Turing

machine is a very restrictive type of computer—crudely, a finite-state machine

with an unbounded memory tape—whose precise definition is surprisingly

unimportant. P is the set of languages that can be decided in Polynomial time

by a deterministic single-tape Turing machine. Similarly, NP is the set of all

languages that can be decided in polynomial time by a nondeterministic Turing

machine; NP is an abbreviation for Nondeterministic Polynomial-time.

The requirement of polynomial time is suciently

crude that we do not

have to specify the precise form of Turing machine (number of tapes, number

of heads, number of tracks, size of the tape alphabet, and so on). In fact, any

algorithm that runs on a random-access machine

in T(n) time can be simulated

by a single-tape, single-track, single-head Turing machine that runs in O(T(n)

4)

time. This simulation result allows us to argue formally about computational

complexity in terms of standard high-level programming constructs like arrays

and loops and recursion, instead of describing everything directly in terms of

Turing machines.

Formally, a problem ⇧ is NP-hard if and only if, for every language ⇧0 2 NP,

there is a polynomial-time Turing reduction from ⇧0 to ⇧. A Turing reduction

means a reduction that can be executed on a Turing machine; that is, a Turing

machine M that can solve ⇧0 using another Turing machine M0 for ⇧ as

a black-box subroutine. Turing reductions are also called oracle reductions;

polynomial-time Turing reductions are also called Cook reductions.

mathematics department in ,

just one year before publishing his seminal paper; he (but not

Levin) later won the Turing award for his proof.

Interested readers find a proof in my lecture notes on nondeterministic Turing machines at

http://algorithms.wtf, or in Boaz Barak’s excellent Introduction to Theoretical Computer Science. Random-access

machines are a model of computation that more faithfully models physical

computers. A standard random-access machine has unbounded random-access memory, modeled

as an unbounded array M[0 ..1] where each address M[i] holds a single w-bit integer, for

some fixed integer w, and can read to or write from any memory addresses in constant time.

RAM algorithms are formally written in assembly-like language, using instructions like ADD i, j, k

(meaning “M[i] M[ j] + M[k]”), INDIR i, j (meaning “M[i] M[M[ j]]”), and IF0GOTO i, `

(meaning “if M[i] = 0, go to line `”), but the precise instruction set is surprisingly irrelevant. By

definition, each instruction executes in unit time. In practice, RAM algorithms can be faithfully

described using higher-level pseudocode, as long as we’re careful about arithmetic precision.

384

12.5. Reductions and SAT

Researchers in complexity theory prefer to define NP-hardness in terms of

polynomial-time many-one reductions, which are also called Karp reductions.

A many-one reduction from one language L0 ✓ ⌃⇤ to another language L ✓ ⌃⇤

is a function f : ⌃⇤ ! ⌃⇤ such that x 2 L0 if and only if f (x) 2 L. Then we can

define a language L to be NP-hard if and only if, for any language L0 2 NP, there

is a many-one reduction from L0 to L that can be computed in polynomial time.

Every Karp reduction “is” a Cook reduction, but not vice versa. Specifically,

any Karp reduction from one decision problem ⇧ to another decision ⇧0 is

equivalent to transforming the input to ⇧ into the input for ⇧0

, invoking an

oracle (that is, a subroutine) for ⇧0

, and then returning the answer verbatim.

However, as far as we know, not every Cook reduction can be simulated by a

Karp reduction.

Complexity theorists prefer Karp reductions primarily because NP is closed

under Karp reductions, but is not closed under Cook reductions (unless NP=coNP,

which is considered unlikely). There are natural problems that are ()

NP-hard with respect to Cook reductions, but ()

NP-hard with respect to Karp

reductions only if P=NP. One trivial example of such a problem is US:

Given

a boolean formula, is it always false? On the other hand, many-one reductions

apply only to decision problems (or more formally, to languages); formally, no

optimization or construction problem is Karp-NP-hard.

To make things even more confusing, both Cook and Karp originally defined

NP-hardness in terms of logarithmic-space reductions. Every logarithmic-space

reduction is a polynomial-time reduction, but (as far as we know) not vice

versa. It is an open question whether relaxing the set of allowed (Cook or

Karp) reductions from logarithmic-space to polynomial-time changes the set of

NP-hard problems.

Fortunately, none of these subtleties rear their ugly heads in practice—in

particular, every reduction described in this chapter can be formalized as a

logarithmic-space many-one reduction—so you can wake up now.

12.5 Reductions and SAT

To prove that any problem other than circuit satisfiability is NP-hard, we use a

reduction argument. Reducing problem A to another problem B means describing

an algorithm to solve problem A under the assumption that an algorithm for

problem B already exists. You’ve already been doing reduction for years, even

before starting this book, only you probably called them something else, like

subroutines or utility functions or modular programming or using a calculator.

To prove something is NP-hard, we describe a similar transformation between

problems, but not in the direction that most people expect.

385

12. NP-HARDNESS

You should tattoo the following rule of onto the back of your hand, right

next to your mom’s birthday and the actual rules of Monopoly.

To prove that problem A is NP-hard,

reduce a known NP-hard problem to A.

In other words, to prove that your problem is hard, you need to describe an

ecient

algorithm to solve a dierent

problem, which you already know is

hard, using an hypothetical ecient

algorithm for your problem as a black-box

subroutine. The essential logic is a proof by contradiction. The reduction

implies that if your problem were easy, then the other problem would be easy,

which it ain’t. Equivalently, since you know the other problem is hard, the

reduction implies that your problem must also be hard; your hypothetical

ecient

algorithm does not actually exist.

As a canonical example, consider the formula satisfiability problem, usually

just called S.

The input to S

is a boolean formula like

(a _ b _ c _ d¯) , ((b ^ ¯c) _ (a¯ ) d) _ (c 6= a ^ b)),

and the question is whether it is possible to assign boolean values to the variables

a, b,c,... so that the entire formula evaluates to T.

To prove that S

is NP-hard, we need to give a reduction from a known

NP-hard problem. The only problem we know is NP-hard so far is CS,

so let’s start there.

Let K be an arbitrary boolean circuit. We can transform (or more accurately,

transcribe) K into a boolean formula

as follows. First, label each interior wire

by a new variable yj, and label the output wire with a new variable z. The

formula

consists of a list of equations, one for each gate, separated by As,

followed by a final ^ z. Figure .

shows the resulting transcription for our

example circuit.

Now we claim that the original circuit K is satisfiable if and only if the

resulting formula

is satisfiable. Like every other “if and only if” statement, we

prove this claim in two steps:


If a player lands on an available property and declines (or is unable) to buy it, that property

is immediately auctioned o

to the highest bidder; the player who originally declined the property

may bid, and bids may be arbitrarily higher or lower than the list price. Players in Jail can still

buy and sell property, buy and sell houses and hotels, and collect rent. The game has

houses

and

hotels; once they’re gone, they’re gone. In particular, if all houses are already on the

board, you cannot downgrade a hotel to four houses; you must raze all the hotels in the group

to the ground. Players can sell or exchange undeveloped properties with each other, but cannot

sell property back to the bank; on the other hand, players can sell buildings to the bank (at half

price), but cannot sell or exchange buildings with each other. All penalties are paid directly to

the bank. A player landing on Free Parking does not win anything. A player landing on Go gets

exactly $,

no more. Railroads are not magic transporters. Finally, Je

always gets the boot.

No, not the T-Rex or the penguin—the boot, dammit.

386

12.5. Reductions and SAT

x1

x2

x3

x4

x5

y1

y2

y3

y4

y5

y6

z

( y1 = x1 ^ x4) ^ ( y2 = x4) ^ ( y3 = x3 ^ y2) ^ ( y4 = y1 _ x2) ^

( y5 = x2) ^ ( y6 = x5) ^ ( y7 = y3 _ y5) ^ (z = y4 ^ y7 ^ y6) ^ z

Figure 12.5. Transcribing a boolean circuit as a boolean formula.

) Given a set of inputs that satisfy the circuit K, we can derive a satisfying

assignment for the formula

by computing the output of every gate in K.

( Given a satisfying assignment for the formula ,

we can obtain a satisfying

input the circuit by simply ignoring the internal wire variables yi and the

output variable z.

The entire transformation from circuit to formula can be carried out in linear

time. Moreover, the size of the resulting formula is at most a constant factor

larger than any reasonable representation of the circuit.

Now suppose, for the sake of argument, there is an algorithm that can

determine in polynomial time whether a given boolean formula is satisfiable.

Then given any boolean circuit K, we can decide whether K is satisfiable by first

transforming K into a boolean formula

as described above, and then asking

our magical mystery S

algorithm whether

is satisfiable, as suggested by the

following cartoon. Each box represents an algorithm. The red box on the left is

the transformation subroutine. The box on the right is the hypothetical magic

S

algorithm. It must be magic, because it has a rainbow on it.

C⌅⌃⇥ ⌅SAT

SAT

K

Boolean

formula

transform

in O(n)

time

Φ

Boolean

circuit

Φ is

satisfiable

Φ is not

satisfiable

T⌃ ⇤

F⇧⌥⇤

K is

satisfiable

K is not

satisfiable

T⌃ ⇤

F⇧⌥⇤

If you prefer magic pseudocode to magic boxes:

Kay

Erickson, personal communication, .

For those of you reading black-and-white

printed copies: Yes, that round thing is a rainbow.

387

12. NP-HARDNESS

CS(K):

transcribe K into a boolean formula

return S()

hh????MAGIC????ii

Transcribing K into

requires only polynomial time (in fact, only linear time,

but whatever), so the entire CS

algorithm also runs in polynomial time.

TCS(n)

 O(n) + TS(O(n))

We conclude that any polynomial-time algorithm for S

would give us a

polynomial-time algorithm for CS,

which in turn would imply P=NP.

So S

is NP-hard!

12.6 3SAT (from CIRCUITSAT)

A special case of S

that is particularly useful in proving NP-hardness results is

called CNF-S

or more often simply S.

A boolean formula is in conjunctive normal form (CNF) if it is a conjunction

()

of several clauses, each of which is the disjunction ()

of several literals,

each of which is either a variable or its negation. For example:

clause z }| {

(a _ b _ c _ d) ^ (b _ ¯c _ d¯) ^ (a¯ _ c _ d) ^ (a _ ¯

b)

A CNF

formula is a CNF formula with exactly three literals per clause; the

previous example is not a CNF

formula, since its first clause has four literals

and its last clause has only two. S

is the restriction of S

to CNF

formulas:

Given a CNF

formula, is there an assignment to the variables that makes the

formula evaluate to T?

We could prove that S

is NP-hard by a reduction from the more general

S

problem, but it’s actually easier to start over from scratch, by reducing

directly from CS.

Figure 12.6. A polynomial-time reduction from CIRCUITSAT to 3SAT.

Given an arbitrary boolean circuit K, we transform K into an equivalent

CNF

formula in several stages. Except for the very last stage, this reduction

388

12.6. 3SAT (from CIRCUITSAT)

was actually described by Grigorii Tseitin in ,

five years before Cook and

Levin reported their proofs of the Cook-Levin Theorem. (In the same

paper, Tseitin described the problem we now call CNF-S,

possibly for the first

time.) As we describe each stage, we will also prove that stage is correct.

• Make sure every

and

gate in K has exactly two inputs. If any gate has

k > 2 inputs, replace it with a binary tree of k

1 binary gates. Call the

resulting circuit K0

. The circuits K and K0 are logically equivalent circuits,

so every satisfying input for K is a satisfying input for K0 and vice versa.

• Transcribe K0 into a boolean formula 1

with one clause per gate, exactly as in

our previous reduction to S.

We already proved that every satisfying input

for K0 can be transformed into a satisfying assignment for 1

and vice versa.

• Replace each clause in 1

with a CNF formula. There are only three types of

clauses in 1,

one for each type of gate in K0

:

a = b ^ c 7!

(a _ ¯

b _ ¯c) ^ (a¯ _ b) ^ (a¯ _ c)

a = b _ c 7!

(a¯ _ b _ c) ^ (a _ ¯

b) ^ (a _ ¯c)

a = ¯

b 7!

(a _ b) ^ (a¯ _ ¯

b)

Call the resulting CNF formula 2.

Because 1

and 2

are logically equivalent

formulas, every satisfying assignment for 1

is also a satisfying assignment

for 2,

and vice versa.

• Replace each clause in 2

with a CNF

formula. Every clause in 2

has at most

three literals, but we need clauses with exactly three literals. To obtain a

CNF

formula, we expand each two-literal clause in 2

into two three-literal

clauses by introducing one new variable, and we expand the final one-literal

clause in 2

into four three-literal clauses by introducing two new variables.

a _ b 7!

(a _ b _ x) ^ (a _ b _ x¯)

z 7!

(z _ x _ y) ^ (z _ x¯ _ y) ^ (z _ x _ ¯y) ^ (z _ x¯ _ ¯y)

Call the resulting CNF

formula 3.

Every satisfying assignment for 2

can be

transformed into a satisfying assignment for 3

by assigning arbitrary values

to the new variables (x and y). Conversely, every satisfying assignment

for 3

can be transformed into a satisfying assignment for 2

by ignoring

the new variables.

For example, our example circuit is transformed into the following CNF

formula;

compare with Figure ..

389

12. NP-HARDNESS

( y1 _ x1 _ x4) ^ ( y1 _ x1 _ z1) ^ ( y1 _ x1 _ z1) ^ ( y1 _ x4 _ z2) ^ ( y1 _ x4 _ z2)

^ ( y2 _ x4 _ z3) ^ ( y2 _ x4 _ z3) ^ ( y2 _ x4 _ z4) ^ ( y2 _ x4 _ z4)

^ ( y3 _ x3 _ y2) ^ ( y3 _ x3 _ z5) ^ ( y3 _ x3 _ z5) ^ ( y3 _ y2 _ z6) ^ ( y3 _ y2 _ z6)

^ ( y4 _ y1 _ x2) ^ ( y4 _ x2 _ z7) ^ ( y4 _ x2 _ z7) ^ ( y4 _ y1 _ z8) ^ ( y4 _ y1 _ z8)

^ ( y5 _ x2 _ z9) ^ ( y5 _ x2 _ z9) ^ ( y5 _ x2 _ z10) ^ ( y5 _ x2 _ z10)

^ ( y6 _ x5 _ z11) ^ ( y6 _ x5 _ z11) ^ ( y6 _ x5 _ z12) ^ ( y6 _ x5 _ z12)

^ ( y7 _ y3 _ y5) ^ ( y7 _ y3 _ z13) ^ ( y7 _ y3 _ z13) ^ ( y7 _ y5 _ z14) ^ ( y7 _ y5 _ z14)

^ ( y8 _ y4 _ y7) ^ ( y8 _ y4 _ z15) ^ ( y8 _ y4 _ z15) ^ ( y8 _ y7 _ z16) ^ ( y8 _ y7 _ z16)

^ ( y9 _ y8 _ y6) ^ ( y9 _ y8 _ z17) ^ ( y9 _ y6 _ z18) ^ ( y9 _ y6 _ z18) ^ ( y9 _ y8 _ z17)

^ ( y9 _ z19 _ z20) ^ ( y9 _ z19 _ z20) ^ ( y9 _ z19 _ z20) ^ ( y9 _ z19 _ z20)

Yikes! At first glance, this formula might look a lot more complicated than the

original circuit, but in fact, it’s only larger by a constant factor. Specifically, the

simplified circuit K0 has at most twice as many wires as the original circuit K,

each binary gate in K0 is transformed into at most five clauses in 3.

Even if the

formula size were a large polynomial function (like n374) of the circuit size, we

would still have a valid reduction.

Our reduction transforms an arbitrary boolean circuit K into a CNF

formula

3

in polynomial time (in fact, in linear time). Moreover, any satisfying

input for the input circuit K can be transformed into a satisfying assignment

for 3,

and any satisfying assignment for 3

can be transformed into a satisfying

input for K. In other words, K is satisfiable if and only if 3

is satisfiable. Thus,

if S

can be solved in polynomial time, then CS

can be solved in

polynomial time, which implies that P = NP. We conclude that S

is NP-hard.

12.7 Maximum Independent Set (from 3SAT)

For the next few problems we consider, the input is a simple, unweighted,

undirected graph, and the problem asks for the size of the largest or smallest

subgraph satisfying some structural property.

Let G be an arbitrary graph. An independent set in G is a subset of the

vertices of G with no edges between them. The maximum independent set

problem, which I’ll abbreviate MIS,

asks for the size of the largest

independent set in a given graph. I will prove that MIS

is NP-hard using

a reduction from S,

as suggested by Figure ..

Given an arbitrary CNF

formula ,

we construct a graph G as follows.

Let k denote the number of clauses in .

The graph G contains exactly 3k

vertices, one for each literal in .

Two vertices in G are connected by an

edge if and only if either ()

they correspond to literals in the same clause, or

()

they correspond to a variable and its inverse. For example, the formula

(a _ b _ c) ^ (b _ ¯c _ d¯) ^ (a¯ _ c _ d) ^ (a _ ¯

b _ d¯) is transformed into the graph

shown in Figure ..

390

12.7. Maximum Independent Set (from 3SAT)

S⇥


M⇥↵I⌃⇤S⌅

G

CNF

Boolean

formula

Φ

graph

Φ is

satisfiable

Φ is not

satisfiable

T⌥⌦⌅

F⇥⇧⌅

G has an

independent

set of size k

T⌥⌦⌅

F⇥⇧⌅

transform

in O(n)

time

G has no

independent

set of size k

=?

k

number of clauses in Φ

size of largest

independent

set in G

Figure 12.7. A polynomial-time reduction from 3SAT to MAXINDSET.

‾ ‾

a

b

c

c

d

a

b

d

b‾

d

a‾

‾c

(a _ b _ c) ^ (b _ ¯c _ d¯) ^ (a¯ _ c _ d) ^ (a _ ¯

b _ d¯)

Figure 12.8. A graph derived from the satisfiable 3CNF formula with 4 clauses, and an independent

set of size 4.

Each independent set in G contains at most one vertex from each clause

triangle, because any two vertices in each triangle are connected. Thus, the

largest independent set in G has size at most k. I claim that G contains an

independent set of size exactly k if and only if the original formula

is satisfiable.

As usual for “if and only if” statements, the proof consists of two parts.

) Suppose

is satisfiable. Fix an arbitrary satisfying assignment. By definition,

each clause in

contains at least one T

literal. Thus, we can choose

a subset S of k vertices in G that contains exactly one vertex per clause

triangle, such that the corresponding k literals are all T.

Because each

triangle contains at most one vertex in S, no two vertices in S are connected

by a triangle edge. Because every literal corresponding to a vertex in S is

T,

no two vertices in S are connected by a negation edge. We conclude

that S is an independent set of size k in G.

( On the other hand, suppose G contains an independent set S of size k. Each

vertex in S must lie in a dierent

clause triangle. Suppose we assign the

value T

to each literal in S; because contradictory literals are connected

by edges, this assignment is consistent. There may be variables x such that

neither x nor x¯ corresponds to a vertex in S; we can set these variables to

391

12. NP-HARDNESS

any value we like. Because S contains one vertex in each clause triangle,

each clause in

contains (at least) one T

literal. We conclude that

is

satisfiable.

Transforming the CNF

formula

into the graph G takes polynomial time,

even if we do everything by brute force. Thus, if we could solve MIS

in polynomial time, then we could also solve S

in polynomial time, by

transforming the input formula

into a graph G and comparing the size of the

largest independent set in G with the number of clauses in .

But that would

imply P=NP, which is ridiculous! We conclude that MIS

is NP-hard.

12.8 The General Pattern

All NP-hardness proofs—and more generally, all polynomial-time reductions—

follow the same general outline. To reduce problem X to problem Y in polynomial

time, we need to do three things:

.

Describe a polynomial-time algorithm to transform an arbitrary instance

of x of X into a special instance y of Y .

.

Prove that if x is a “good” instance of X, then y is a “good” instance of Y .

.

Prove that if y is a “good” instance of Y , then x is a “good” instance of X.

(This is usually the part that causes the most trouble.)

Of course, developing a correct reduction doesn’t mean handling these three

tasks one at a time. First writing down an algorithm that seems to work and then

trying prove that it actually works is rarely successful, especially in time-limited

settings like exams. We must develop the algorithm, the “if” proof, and the

“only if” proof simultaneously.

To quote the late great Ricky Jay:

This is an acquired skill.

One point that confuses many students is that the reduction algorithm only

“works one way”—from X to Y—but the correctness proof needs to “work both

ways”. But the correctness proofs are not actually symmetric. The “if” proof

needs to handle arbitrary instances of X, but the “only if” only needs to handle

the special instances of Y produced by the reduction algorithm. Exploiting this

asymmetry is the key to successfully designing correct reductions.

I find it useful to think in terms of transforming certificates—proofs that a

given instance is “good”—along with the instances themselves. For example,

a certificate for CS

is a set of inputs that turns on the light bulb;

a certificate for S

or S

is a satisfying assignment; a certificate for MIS

is a large independent set. To reduce X to Y , we actually need to design three

algorithms, one for each of the following tasks:

from

his

o-Broadway

show Ricky Jay and his

Assistants

392

12.8. The General Pattern

• Transform an arbitrary instance x of X into a special instance y of Y in

polynomial time.

• Transform an arbitrary certificate for x into a certificate for y, and

• Transform an arbitrary certificate for y into a certificate for x.

The second and third tasks refer to the input and output of the first algorithm. The

certificate transformation needs to be reversible, not the instance transformation.

We never have to transform instances of Y , and we don’t need to think about

arbitrary instances of Y at all. Only the first algorithm needs to run in polynomial

time (although in practice, the second and third algorithms are almost always

simpler than the first).

For example, our reduction from CS

to S

consists of three

algorithms:

• The first transforms an arbitrary boolean circuit K into a special CNF

boolean formula 3,

in polynomial time. (Encode each wire as a variable

and each gate as a sub-formula, and then expand each sub-formula into

CNF.)

• The second transforms an arbitrary satisfying input for K into a satisfying

assignment for 3.

(Trace the input through the circuit, transfer values from

each wire to the corresponding variable, and give any additional variables

arbitrary values.)

• The third transforms an arbitrary satisfying assignment for 3

into a satisfying

input for K. (Transfer values from each wire variable in 3

to the

corresponding wire in K.)

The reduction works because the first algorithm encodes any boolean circuit K

into a highly structured CNF

formula 3.

The specific structure of 3

restricts

how it can be satisfied; every satisfying assignment for 3

must “come from”

some satisfying input for K. We don’t have to think about arbitrary CNF

formulas at all.

Similarly, our reduction from S

to MIS

consists of three algorithms:

• The first transforms an arbitrary CNF

formula

into a special graph G and

a specific integer k, in polynomial time.

• The second transforms an arbitrary satisfying assignment for

into an

independent set in G of size k.

• The third transforms an arbitrary independent set in G of size k into a

satisfying assignment for .

Again, our first transformation encodes the input formula

into a highly

structured graph G and a specific integer k. The structure of G ensures that

every independent set of size k “comes from” a satisfying assignment for .

We

don’t consider arbitrary graphs or arbitrary independent set sizes at all.

393

12. NP-HARDNESS

12.9 Clique and Vertex Cover (from Independent Set)

A clique is another name for a complete graph, that is, a graph where every

pair of vertices is connected by an edge. The MC

problem asks for the

number of nodes in its largest complete subgraph in a given graph. A vertex

cover of a graph is a set of vertices that touches every edge in the graph. The

MVC

problem asks for the size of the smallest vertex cover in a

given graph.

Figure 12.9. A graph whose largest independent set, largest clique, and smallest vertex cover all have

size 4.

We can prove that MC

is NP-hard using the following easy reduction

from MIS.

Any graph G has an edge-complement G with the same

vertices, but with exactly the opposite set of edges—uv is an edge in G if and

only if uv is not an edge in G. A set of vertices is independent in G if and

only if the same vertices define a clique in G. Thus, the largest independent in

G has the same vertices (and thus the same size) as the largest clique in the

complement of G.

M↵I⌃⇥⇤⌥⇤⌃⇥⇤⌃

S⇤

M↵C⇧⌅⌦⇤

G

graph

complement

in O(V2)

time

G

graph size of largest

clique in G

k

size of largest

independent

set in G

‾ k

Figure 12.10. An easy reduction from MAXINDSET to MAXCLIQUE.

The proof that MVC

is NP-hard is even simpler, because it relies

on the following easy observation: I is an independent set in a graph G = (V, E)

if and only if its complement V \ I is a vertex cover of the same graph G. Thus,

the largest independent set in any graph is the complement of the smallest vertex

cover of the same graph! Thus, if the smallest vertex cover in an n-vertex graph

has size k, then the largest independent set has size n

k.

394

12.10. Graph Coloring (from 3SAT)

M↵I⇧⇥⇤⌥⇤⇧⇥⇤⇧

S⇤

G

graph –

M⌅⇧V⇤

⇤↵

C⌃⌦⇤

size of largest

independent

set in G

n – k

size of smallest

vertex cover in G

k

n

number of vertices in G

Figure 12.11. An even easier reductions from MAXINDSET to MINVERTEXCOVER.

12.10 Graph Coloring (from 3SAT)

A proper k-coloring of a graph G = (V, E) is a function C : V ! {1, 2, . . . , k}

that assigns one of k “colors” to each vertex, so that every edge has two dierent

colors at its endpoints. (The “colors” are really arbitrary labels, which for

simplicity we represent by small positive integers, rather than electromagnetic

frequencies, CMYK vectors, or Pantone numbers, for example.) The graph

coloring problem asks for the smallest possible number of colors in a legal

coloring of a given graph.

To prove that graph coloring is NP-hard, it suces

to consider the decision

problem C:

Given a graph, does it have a proper 3-coloring? We prove

C

is NP-hard using a reduction from S.

(Why S?

Because it has a

in it. You probably think I’m joking, but I’m not.) Given a CNF

formula ,

we

construct a graph G that is -colorable

if and only if

is satisfiable, as suggested

by the usual diagram.

SAT

C⇧⌅⇧⌃

G

3CNF

Boolean

formula

Φ

graph

G is

-colorable

G is not

-colorable

T⌃⇤

F⇥⌅⌥⇤

Φ is

satisfiable

Φ is not

satisfiable

T⌃⇤

F⇥⌅⌥⇤

transform

in O(n)

time

We describe the reduction using a standard strategy of decomposing the

output graph G into gadgets, subgraphs that enforce various semantics of the

input formula

in the language of graph coloring. Decomposing reductions

into separate gadgets is not only helpful for understanding existing reductions

and proving them correct, but for designing new NP-hardness reductions.

Our

reduction from CS

to S

encoding each gate in the input circuit as a clause

in the output formula; these clauses are “gate gadgets”. Similarly, our reduction from S

to

MIS

used two type of gadgets: “clause gadgets” (triangles) and “variable gadgets” (edges

between contradicting literals).

395

12. NP-HARDNESS

Our formula-to-graph reduction uses three types of gadgets:

• There is a single truth gadget: a triangle with three vertices T, F, and X,

which intuitively stand for T,

F,

and O.

Since these vertices

are all connected, they must have dierent

colors in any 3-coloring. For the

sake of convenience, we will name those colors T,

F,

and O.

Thus, when we say that a node is colored T,

we mean that it has the

same color as vertex T.

• For each variable a, the graph contains a variable gadget, which is a triangle

joining two new nodes labeled a and a to node X in the truth gadget. Node a

must be colored either T

or F,

and therefore node a must be colored

either F

or T,

respectively.

X

T F

X

a a

Figure 12.12. The truth gadget and a variable gadget for a.

• Finally, for each clause in ,

the graph contains a clause gadget. Each clause

gadget joins three literal nodes (from the corresponding variable gadgets)

to node T (from the truth gadget) using five new unlabeled nodes and ten

edges, as shown below.

a

b

c

T

Figure 12.13. A clause gadget for (a _ b _ ¯c).

In eect,

each triangle in the clause gadget behaves like a “majority

gate”. In any valid -coloring,

if the two vertices to the left of the triangle

have the same color, the rightmost vertex of the triangle must have the same

color; on the other hand, if the two left vertices have dierent

colors, the

color of the right vertex can be chosen arbitrarily. See Figure ..

It follows that there is no valid -coloring

of a clause gadget where all

three literal nodes are colored F.

On the other hand, any coloring of the

literal nodes with more than one color can be extended to a valid -coloring

of the clause gadget. The variable gadgets force each literal node to be

colored either T

or F;

thus, in any valid -coloring

of the clause

gadget, at least one literal node is colored T.

396

12.10. Graph Coloring (from 3SAT)

Figure 12.14. All valid 3-colorings of a “half-gadget”, up to permutations of the colors

The final graph G contains exactly one node T, exactly one node F, and exactly

two nodes a and a¯ for each variable. For example, Figure .

shows the graph

that results from the same CNF

formula (a _ b _ c)^(b _ ¯c _ d¯)^(a¯ _ c _ d)^

(a _ ¯

b _ d¯) that we previously used to illustrate the MIS

reduction in

Figure ..

The -coloring

is one of several that correspond to the satisfying

assignment a = c = T,

b = d = F.

X

T F

a a b b c c d d

(a _ b _ c) ^ (b _ ¯c _ d¯) ^ (a¯ _ c _ d) ^ (a _ ¯

b _ d¯)

Figure 12.15. The 3-colorable graph derived from a satisfiable 3CNF formula.

We’ve already done most of the work for a proof of correctness. If the formula

is satisfiable, then we can color the literal nodes according to any satisfying

assignment, and then (because each clause is satisfied) extend the coloring

across every clause gadget. On the other hand, if the graph is -colorable,

then

we can extract a satisfying assignment from any -coloring—at

least one of the

three literal nodes in every clause gadget is colored T.

Because C

is a special case of the more general graph coloring problem—

397

12. NP-HARDNESS

What is the minimum number of colors?—the more general optimization problem

is also NP-hard.

12.11 Hamiltonian Cycle

A Hamiltonian cycle in a graph is a cycle that visits every vertex exactly once.

(This is very dierent

from an Euler circuit, which is a closed walk that traverses

every edge exactly once; Euler circuits are easy to find and construct in linear

time using depth-first search.) Here we consider two dierent

proofs that the

Hamiltonian cycle problem in directed graphs is NP-hard.

From Vertex Cover

Our first NP-hardness proof reduces from the decision version of the vertex cover

problem. Given an undirected graph G and an integer k, we construct a directed

graph H, such that H has a Hamiltonian cycle if and only if G has a vertex cover

of size k. As in our previous reductions, the output graph H is composed of

several gadgets, each corresponding to certain features of the inputs G and k.

V⌅ ↵⌅C⌅


G

graph

D⇧ ⌅⇥↵⌅⇤

H⌥C✏⇥⌃⌅

integer

k

H has a

Ham. cycle

H has no

Ham. cycle

T ⌅

F⌃⌦⌅

G has a

vertex cover

of size k

T ⌅

F⌃⌦⌅

H

graph

transform

in O(V+E)

time

G has no

vertex cover

of size k

• For each undirected edge uv in G, the directed graph H contains an edge

gadget consisting of four vertices (u, v,in), (u, v, out), (v, u,in), (v, u, out)

and six directed edges

(u, v,in)(u,

v, out) (u, v,in)(v,

u,in) (v, u,in)(u,

v,in)

(v, u,in)(v,

u, out) (u, v, out)(v,

u, out) (v, u, out)(u,

v, out)

as shown in Figure ..

Each “in” vertex has an additional incoming edge,

and each “out” vertex has an additional outgoing edge. Any Hamiltonian

cycle in H must pass through an edge gadget in one of three ways—either

straight through on both sides, or with a detour from one side to the other

and back. Eventually, these options will correspond to both u and v, only u,

or only v belonging to some vertex cover.

• For each vertex u in G, all the edge gadgets for incident edges uv are

connected in H into a single directed path, which we call a vertex chain.

Specifically, suppose vertex u has d neighbors v1, v2,..., vd . Then H has

d

1 additional edges (u, vi, out)(u,

vi+1,in) for each i from 1 to d

1.

398

12.11. Hamiltonian Cycle

u v

(u,v,in)

(u,v,out)

(v,u,in)

(v,u,out)

u v u v u v

Figure 12.16. An edge gadget and its only possible intersections with a Hamiltonian cycle.

• Finally, H also contains k cover vertices x1, x2,..., xk. Each cover vertex has

a directed edge to the first vertex in each vertex chain, and a directed edge

from the last vertex in each vertex chain.

Figure .

shows a complete example of our transformation; each doublearrowed

blue segment represents a pair of directed edges.

u v

w x

u v

w x

k=2

Figure 12.17. Example of our reduction from VERTEXCOVER to DIRECTEDHAMCYCLE.

As usual, we prove our reduction is correct in two stages.

) First, suppose C = {u1, u2,..., uk} is a vertex cover of G of size k. We can

construct a Hamiltonian cycle in H that “encodes” C as follows. For each

index i from 1 to k, we traverse a path from cover vertex xi, through the

vertex chain for ui, to cover vertex xi+1 (or cover vertex x1 if i = k). As we

traverse the chain for each vertex ui, we determine how to proceed from

each node (ui, v,in) as follows:

– If v 2 C, follow the edge (ui, v,in)(ui,

v, out).

– If v 62 C, detour through (ui, v,in)(v,

ui,in)(v,

ui, out)(ui,

v, out).

Thus, for each edge uv of G, the Hamiltonian cycle visits (u, v,in) and

(u, v, out) as part of u’s vertex chain if u 2 C and as part of v’s vertex chain

otherwise. See Figure ..

399

12. NP-HARDNESS

u v

w x

u v

w x

k=2

Figure 12.18. Every vertex cover of size k in G corresponds to a Hamiltonian cycle in H and vice versa.

( On the other hand, suppose H contains a Hamiltonian cycle C. This

cycle must contain an edge from each cover vertex to the start of some

vertex chain. Our case analysis of edge gadgets inductively implies that

after C enters the vertex chain for some vertex u, it must traverse the entire

vertex chain. Specifically, at each vertex (u, v,in), the cycle must

contain either the single edge (u, v,in)(u,

v, out) or the detour path

(u, v,in)(v,

u,in)(v,

u, out)(u,

v, out), followed by an edge to the next

edge gadget in u’s vertex chain, or to a cover vertex if this is the last edge

gadget in u’s vertex chain. In particular, if C contains the detour edge

(u, v,in)(v,

u,in), it cannot contain edges between any cover vertex and v’s

vertex chain. It follows that C traverses exactly k vertex chains. Moreover,

these vertex chains describe a vertex cover of the original graph G, because C

visits the vertex (u, v,in) for every edge uv in G.

We conclude that G has a vertex cover of size k if and only if H contains a

Hamiltonian cycle. The transformation from G to H takes at most O(V2) time;

it follows that the directed Hamiltonian cycle problem is NP-hard.

From 3SAT

We can also prove that the directed Hamiltonian cycle problem is NP-hard by

reducing directly from S.

Given an arbitrary CNF

formula

with n variables

x1, x2,..., xn and k clauses c1,c2,...,ck, we construct a directed graph H that

contains a Hamiltonian cycle if and only if

is satisfiable, as follows.

For each variable xi, we construct a variable gadget, which consists of a

doubly-linked list of 2k vertices (i, 0),(i, 1),...,(i, 2k), connected by edges

(i, j

1)(i,

j) and (i, j)(i,

j

1) for each index j. We connect the first and

400

12.11. Hamiltonian Cycle

last nodes in each adjacent pair of variable gadgets by adding edges

(i, 0)(i

+ 1, 0) (i, 2k)(i

+ 1, 0) (i, 0)(i

+ 1, 2k) (i, 2k)(i

+ 1, 2k)

for each index i; we also connect the endpoints of the first and last variable

gadgets with the edges

(n, 0)(1,

0) (n, 2k)(1,

0) (n, 0)(1,

2k) (n, 2k)(1,

2k).

The resulting graph G has exactly 2n Hamiltonian cycles, one for each assignment

of boolean values to the n variables of .

Specifically, for each i, we traverse the

ith variable gadget from left to right if xi = T

and right to left if xi = F.

See Figure ..

a

b

c

d

a

b

c

d

Figure 12.19. Left: Variable gadgets and connectors in G, for any formula with 4 variables and 4 clauses.

Right: The Hamiltonian cycle in G corresponding to the assignment a = b = d = T

and c = F

Now we extend G to a larger graph H by adding a clause vertex [ j] for each

clause cj, connected to the variable gadgets by six edges, as shown in Figure ..

For each positive literal xi in cj, we add the edges (i, 2j

1)[

j](i,

2j), and

for each negative literal x¯i in cj, we add the edges (i, 2j)[

j](i,

2j

1). The

connections to the clause vertices guarantee that a Hamiltonian cycle in G

can be extended to a Hamiltonian cycle in H if and only if the corresponding

variable assignment satisfies .

Exhaustive case analysis now implies that H has

a Hamiltonian cycle if and only if

is satisfiable.

Transforming the formula

into the graph H takes O(kn) time, which is at

most quadratic in the total length of the formula; we conclude that the directed

Hamiltonian cycle problem is NP-hard.

Variants and Extensions

Trivial modifications of the previous reductions imply that the Hamiltonian path

problem in directed graphs is also NP-hard. A Hamiltonian path in a graph G is

of course a simple path that visits every vertex of G exactly once. In fact, there

401

12. NP-HARDNESS

a

b

c

d

a

b

c

d

Figure 12.20. Left: Clause gadgets for the formula (a _ b _ c)^(b _¯c _ d¯)^(a¯_ c _ d)^(a _¯

b _ d¯).

Right: A hamiltonian cycle in H corresponding to the satisfying assignment a = b = d = T

and

c = F.

are simple polynomial-time reductions from the Hamiltonian cycle problem to

the Hamiltonian path problem and vice versa. I’ll leave the details of these

reductions as exercises.

Both of the previous reductions deal with directed graphs, but the corresponding

question in undirected graph is also NP-hard. In fact, there is a

relatively simple reduction from the directed Hamiltonian cycle/path problem

to the undirected Hamiltonian cycle/path problem; again, I’ll leave the details

of this reduction as an exercise.

Finally, the infamous traveling salesman problem asks to find the shortest

Hamiltonian cycle (or path) in a graph with weighted edges. Since finding the

shortest cycle/path is obviously harder than determining if a cycle/path exists

at all—Consider a graph where every edge has weight 1!—TS

is also NP-hard.

12.12 Subset Sum (from Vertex Cover)

The next problem that we prove NP-hard is the SS

problem considered

in Chapter :

Given a set X of positive integers and an integer T, determine

whether X has a subset whose elements sum to T.

We once again reduce from VC.

Given a graph G and an integer k,

we need to compute a set X of positive integers and an integer T, such that X

has a subset that sums to T if and only if G has an vertex cover of size k. Our

transformation uses only two types of “gadgets”, which are integers representing

the vertices and edges in G.

Number the edges of G arbitrarily from 0 to E

1. Our set X contains the

402

12.12. Subset Sum (from Vertex Cover)

integer bi := 4i for each edge i, and the integer

av := 4E +

X

i2(v)

4i

for each vertex v, where (v)

is the set of edges that have v has an endpoint.

Alternately, we can think of each integer in X as an (E +1)-digit number written

in base 4. The Eth digit is 1 if the integer represents a vertex, and 0 otherwise;

and for each i < E, the ith digit is 1 if the integer represents edge i or one of its

endpoints, and 0 otherwise. Finally, we set the target sum

T := k · 4E +

X

E1

i=0

2 · 4i

.

Now let’s prove that the reduction is correct.

) First, suppose G has a vertex cover C of size k. Consider the subset

X0 := {av | v 2 C} [ {bi | edge i has exactly one endpoint in C}

The sum of the elements of X0

, written in base ,

has most significant digit k

and all other digits equal to 2. Thus, the elements of X0 sum to exactly T.

( On the other hand, suppose there is a subset X0 ✓ X that sums to T.

Specifically, we must have

X

v2V0

av +

X

i2E0

bi = t

for some subsets V0 ✓ V and E0 ✓ E. Again, if we sum these base-

numbers,

there are no carries in the first E digits, because for each i there are only

three numbers in X whose ith digit is 1. Each edge number bi contributes

only one 1 to the ith digit of the sum, but the ith digit of t is 2. Thus, for

each edge in G, at least one of its endpoints must be in V0

. In other words,

V0 is a vertex cover. On the other hand, only vertex numbers are larger

than 4E, and bT/4Ec = k, so V0 has at most k elements. (In fact, it’s not

hard to see that V0 has exactly k elements.)

For example, given the four-vertex graph G = (V, E) where V = {u, v, w, x} and

E = {uv, uw, vw, v x, wx}, our set X might contain the following base-

integers:

au := 1110004 = 1344 buv := 0100004 = 256

av := 1101104 = 1300 buw := 0010004 = 64

aw := 1011014 = 1105 bvw := 0001004 = 16

ax := 1000114 = 1029 bv x := 0000104 = 4

bwx := 0000014 = 1

403

12. NP-HARDNESS

If we are looking for a vertex cover of size k = 2, our target sum would be

T := 2222224 = 2730. Indeed, the vertex cover {v, w} corresponds to the subset

{av , aw, buv , buw, bv x , bwx }, whose sum is 1300+1105+256+64+4+1 = 2730.

The reduction can clearly be performed in polynomial time. We’ve already

proved that VC

is NP-hard, so it follows that SS

is NP-hard.

Caveat Reductor!

One subtle point must be emphasized here. 3-something

pages ago, back in

Chapter ,

we developed a dynamic programming algorithm to solve SS

in O(nT) time. Isn’t this a polynomial-time algorithm? Didn’t we just prove that

P=NP? Hey, where’s my million dollars?!

Alas, life is not so simple. True, the running time is a polynomial function

of the variables n and T, but to qualify as a true polynomial-time algorithm, the

running time must be a polynomial function of the input size—the number of bits

required to represent the input. The values of the elements of X and the target

sum T could be exponentially larger than the number of input bits. Indeed,

the reduction we just described produces a value of T that is exponentially

larger than the size of our original input graph, which forces our dynamic

programming algorithm to run in exponential time.

Algorithms like this are said to run in pseudo-polynomial time, and any

NP-hard problem with such an algorithm is called weakly NP-hard. Equivalently,

a weakly NP-hard problem is a problem that can be solved in polynomial time

when all input numbers are represented in unary (as a sum of 1s), but becomes

NP-hard when all input numbers are represented in binary. If a problem is

NP-hard even when all the input numbers are represented in unary, we say

that the problem is strongly NP-hard. A good example of a strongly NP-hard

problem is TS,

which remains NP-hard even if the input graph

is complete and all edge weights are equal to 1 or 2.

12.13 Other Useful NP-hard Problems

Literally thousands of problems have been proved to be NP-hard. Here I will list

a few NP-hard problems that are useful in deriving reductions.

I won’t describe

the NP-hardness proofs for these problems in detail, but you can find most of

them in Garey and Johnson’s classic Scary Black Book of NP-Completeness.


All

As

someday it may happen a reduction must be found, I’ve got a little list. I’ve got a little

list. Of some Herculean problems we can use to lower bound. Fast solutions don’t exist. Our

proofs they can assist. Michael

Garey and David Johnson. Computers and Intractability: A Guide to the Theory of

NP-Completeness. W. H. Freeman and Co., .

And yes, it really is black.

404

12.13. Other Useful NP-hard Problems

of the problems I’ve discussed so far, and most of the problems in the following

list, were first proved NP-hard in a single landmark

paper by Richard Karp.

• PlanarCircuitSat: Given a boolean circuit that can be embedded in the

plane so that no two wires cross, is there an input that makes the circuit

output T?

This problem can be proved NP-hard by reduction from the

general circuit satisfiability problem, by replacing each crossing with a small

assemblage of gates.

• 1-in-3Sat: Given a CNF

formula, is there an assignment of values to the

variables so that each clause contains exactly one T

literal? This problem

can be proved NP-hard by reduction from the usual S.

• NotAllEqual3Sat: Given a CNF

formula, is there an assignment of

values to the variables so that every clause contains at least one T

literal

and at least one F

literal? This problem can be proved NP-hard by

reduction from the usual S.

• Planar3Sat: Given a CNF

boolean formula, consider a bipartite graph

whose vertices are the clauses and variables, where an edge indicates that

a variable (or its negation) appears in a clause. If this graph is planar, the

CNF

formula is also called planar. The PS

problem asks, given a

planar CNF

formula, whether it has a satisfying assignment. This problem

can be proved NP-hard by reduction from PCS.


• Exact3DimensionalMatching or X3M: Given a set S and a collection of

three-element subsets of S, called triples, is there a sub-collection of disjoint

triples that exactly cover S? This problem can be proved NP-hard by a

reduction from S,

because it has a

in it.

• Partition: Given a set S of n integers, are there subsets A and B such that

A[ B = S, A\ B = ?, and

X

a2A

a =

X

b2B

b?

This problem can be proved NP-hard by a simple reduction from SS.

Like SS,

the P

problem is only weakly NP-hard.

• 3Partition: Given a set S of 3n integers, can it be partitioned into n disjoint

three-element subsets, such that every subset has exactly the same sum?

Despite the similar names, this problem is very dierent

from P;

sorry, I didn’t make up the names. This problem can be proved NP-hard

by reduction from XM,

because it has a

in it. Unlike P,

the

P

problem is strongly NP-hard; it remains NP-hard even if every

input number is at most n3.

Later

performed o-Broadway

as Richard Karp and his

Assistants, for which Karp won a

well-deserved Tony Turing award. Surprisingly,

PNAES

is solvable in polynomial time!

405

12. NP-HARDNESS

• SetCover: Given a collection of sets S = {S1, S2,..., Sm}, find the smallest

sub-collection of Si’s that contains all the elements of S

i Si. This problem is

a generalization of both VC

and XM.

• HittingSet: Given a collection of sets S = {S1, S2,..., Sm}, find the minimum

number of elements of S

i Si that hit every set in S. This problem is

also a generalization of VC.

• LongestPath: Given a non-negatively weighted graph G (either directed

or undirected) and two vertices u and v, what is the longest simple path

from u to v in the graph? A path is simple if it visits each vertex at most

once. This problem is a generalization of the corresponding Hamiltonian

path problem. Of course, the corresponding shortest path problem can be

solved in polynomial time.

• SteinerTree: Given a weighted, undirected graph G with some of the

vertices marked, what is the minimum-weight subtree of G that contains

every marked vertex? If every vertex is marked, the minimum Steiner tree

is the minimum spanning tree; if exactly two vertices are marked, the

minimum Steiner tree is the shortest path between them. This problem can

be proved NP-hard by reduction from VC.

• Max2Sat: Given a Boolean formula in conjunctive normal form, with exactly

two literals per clause, find a variable assignment that maximizes the number

of clauses with at least one T

literal. This problem can be proved NP-hard

by reduction from S

(yes, even though it doesn’t have a

in it). The

simpler decision problem S,

which asks if there is an assignment that

satisfies every clause, can actually be solved in polynomial time.

• MaxCut: Given an undirected graph G = (V, E), find a subset S ⇢ V that

maximizes the number of edges with exactly one endpoint in S. Equivalently,

find the largest bipartite subgraph of G. This problem can be proved NP-hard

by reduction from MS.

In addition to these dry but useful problems, most interesting puzzles and solitaire

games have been shown to be NP-hard, or to have NP-hard generalizations.

(Arguably, if a game or puzzle isn’t at least NP-hard, it isn’t interesting!) Here

are some examples you may find familiar:

• Minesweeper (from CS)

• Sudoku (utlimately from S)

Richard

Kaye. Minesweeper is NP-complete. Mathematical Intelligencer ():–,

.

But see also: Allan Scott, Ulrike Stege, and Iris van Rooij. Minesweeper may not be NP-complete

but is hard nonetheless. Mathematical Intelligencer ():–,

.

Takayuki

Yato and Takahiro Seta. Complexity and completeness of finding another solution

and its application to puzzles. IEICE Transactions on Fundamentals of Electronics, Communications

and Computer Sciences E-A():–,

.

http://www-imai.is.s.u-tokyo.ac.jp/~yato/

data/MasterThesis.pdf.

406

12.14. Choosing the Right Problem

• Tetris (from P)

• Klondike, aka “Solitaire” (from S)

• Pac-Man (from HC)

• Super Mario Brothers (from S)

• Candy Crush Saga (from a variant of S)

• Threes/

(from S,

of course)

• Trainyard (from DS;

see Exercise )

• Shortest n ⇥ n ⇥ n Rubik’s cube solution (from S

via a special case of

PUHC).

• Cookie Clicker (from P

or P)

This list is necessarily incomplete, thanks to a limited footnote budget.

As

of June ,

nobody has published a proof that a generalization of Ultimate

Paperclips, Line Rider, Twister, or Cards Against Humanity is NP-hard, but I’m

sure it’s only a matter of time.

12.14 Choosing the Right Problem

One of the most dicult

steps in proving that a problem is NP-hard is choosing

the best problem to reduce from. The Cook-Levin Theorem implies that if there

is a reduction from any NP-hard problem to problem X, then there is a reduction

Ron

Breukelaar, Erik D. Demaine, Susan Hohenberger, Hendrik J. Hoogeboom, Walter A.

Kosters, and David Liben-Nowell. Tetris is hard, even to approximate. International Journal of

Computational Geometry and Applications :–,

.

Luc

Longpré and Pierre McKenzie. The complexity of Solitaire. Proceedings of the nd

International Mathematical Foundations of Computer Science, –,

.

Giovanni

Viglietta. Gaming is a hard job, but someone has to do it! Theory of Computing

Systems, ():–,

.

http://giovanniviglietta.com/papers/gaming.pdf.

Greg

Aloupis, Erik D. Demaine, Alan Guo, and Giovanni Viglietta. Classic Nintendo games

Are (computationally) hard. Theoretical Computer Science :–,

.

http://arxiv.org/

abs/..

Luciano

Gualà, Stefano Leucci, Emanuele Natale. Bejeweled, Candy Crush and other matchthree

games are (NP-)hard. Proc.

IEEE Conference on Computational Intelligence and Games,

.

http://arxiv.org/abs/..

Stefan

Langerman and Yushi Uno. Threes!, Fives, !,

and

are Hard. Proc. th

International Conference on Fun with Algorithms, .

https://arxiv.org/abs/..

Matteo

Almanza, Stefano Leucci, and Alessandro Panconesi. Trainyard is NP-Hard. Proc.

th

International Conference on Fun with Algorithms, .

https://arxiv.org/abs/..

Erik

D. Demaine, Sarah Eisenstat, and Mikhail Rudoy. Solving the Rubik’s Cube optimally

is NP-complete. Proc. th

Symposium on Theoretical Aspects of Computer Science, .

https:

//arxiv.org/abs/..

Erik

D. Demaine, Hiro Ito, Stefan Langerman, Jayson Lynch, Mikhail Rudoy, and Kai Xiao.

Cookie Clicker. Preprint, August .

https://arxiv.org/abs/..

See

https://xkcd.com//

407

12. NP-HARDNESS

from every NP-complete problem to problem X, but some problems are easier

to work with than others. There’s no systematic method for choosing the right

problem, but here are a few useful rules of thumb.

• If the problem asks how to assign bits to objects, or to choose a subset of

objects, or to partition objects into two dierent

subsets, try reducing from

some version of S

or P.

• If the problem asks how to assign labels to objects from a small fixed set,

or to partition objects into a small number of subsets, try reducing from

kC

or even C.

• If the problem asks to arrange a set of objects in a particular order, try

reducing from DHC

or DHP

or T-

S.

• If the problem asks to find a small subset satisfying some constraints, try

reducing from MVC.

• If the problem asks to find a large subset satisfying some constraints, try

reducing from MIS

or MC

or MS.

• If the problem asks to partition objects into a large number of small subsets,

try reducing from P.

• If the number 3 appears naturally in the problem, try S

or C

or

XM

or P.

(No, this is not a joke.)

• If all else fails, try S

or even CS!

I do not recommend trying to reduce from T,

SMB,

or

T.

You really want to choose a starting problem that is as simple as

possible, while still capturing some feature of your problem that makes it dicult

to solve.

12.15 A Frivolous Real-World Example

Draughts is a family of board games that have been played for thousands of

years. Most Americans are familiar with the version called checkers or English

draughts, but the most common variant worldwide, known as international

draughts or Polish draughts, originated in the Netherlands in the th

century.

For a complete set of rules, the reader should consult Wikipedia; here a few

important dierences

from the Anglo-American game:

• Flying kings: As in checkers, a piece that ends a move in the row closest to

the opponent becomes a king and gains the ability to move backward. Unlike

in checkers, however, a king in international draughts can move any distance

along a diagonal line in a single turn, as long as the intermediate squares

are empty or contain exactly one opposing piece (which is captured).

408

12.15. A Frivolous Real-World Example

• Forced maximum capture: In each turn, the moving player must capture

as many opposing pieces as possible. This is distinct from the forced-capture

rule in checkers, which requires only that each player must capture if possible,

and that a capturing move ends only when the moving piece cannot capture

further. In other words, checkers requires capturing a locally maximal set of

opposing pieces on each turn; whereas, international draughts requires a

globally maximum capture.

• Capture subtleties: As in checkers, captured pieces are removed from the

board only at the end of the turn. Any piece can be captured at most once.

Thus, when an opposing piece is jumped, that piece remains on the board

but cannot be jumped again until the end of the turn.

For example, in the first position shown below, each circle represents a piece,

and doubled circles represent kings. Black must make the first indicated move,

capturing five white pieces, because it is not possible to capture more than five

pieces, and there is no other move that captures five. Black cannot extend

his capture further, either northeast or southeast, because the captured White

pieces remain on the board until his turn is over. Then White must make the

second indicated move, thereby winning the game.

Figure 12.21. Two forced(!) moves in international draughts; doubled circles are kings.

The actual game, which is played on a 10 ⇥ 10 board with

pieces of each

color, is computationally trivial; we can precompute the optimal move for both

players in every possible board configuration and hard-code the results into a

lookup table of constant size. Sure, it’s a big constant, but it’s still a constant!

But consider the natural generalization of international draughts to an n ⇥ n

board. In this setting, finding a legal move is actually NP-hard! The following

reduction from the Hamiltonian cycle problem in directed graphs was discovered

by Bob Hearn in .


In most two-player games, finding the best move is

NP-hard (or worse). This is the only example I know of a game—and moreover

a real game played by millions of people for centuries—where merely following

the rules is NP-hard!

Given an undirected graph G with n vertices, we construct a board con-

figuration for international draughts, such that White can capture a certain

See

Theoretical Computer Science Stack Exchange: http://cstheory.stackexchange.com/a/

/.

409

12. NP-HARDNESS

number of black pieces in a single move if and only if G has a Hamiltonian

cycle. We treat G as a directed graph, with two arcs uv

and vu

in place of

each undirected edge uv. Number the vertices arbitrarily from 1 to n. The final

draughts configuration has several gadgets:

• The vertices of G are represented by rabbit-shaped vertex gadgets, which

are evenly spaced along a horizontal line. Each arc ij

is represented by a

path of two diagonal line segments from the “left ear” of vertex gadget i to

the “right ear” of vertex gadget j. The path for arc ij

is located above the

vertex gadgets if i < j, and below the vertex gadgets if i > j.

1 2 3 4

2 4

1

3

Figure 12.22. A high-level overview of the reduction from Hamiltonian cycle to international draughts.

• The bulk of each vertex gadget is a diamond-shaped region called a vault.

The walls of the vault are composed of two solid layers of black pieces,

which cannot be captured; these pieces are drawn as gray circles in the

figures. There are N capturable black pieces inside each vault, for some

large integer N to be determined later. A white king can enter the vault

through the “right ear”, capture every internal piece, and then exit through

the “left ear”. Both ears are hallways, again with walls two pieces thick,

with gaps where the arc paths end to allow the white king to enter and

leave. The lengths of the “ears” can easily be adjusted to align with the

other gadgets.

• For each arc ij,

we have a corner gadget, which allows a white king leaving

vertex gadget i to be redirected to vertex gadget j.

• Finally, wherever two arc paths cross, we have a crossing gadget; these

gadgets allow the white king to traverse either arc path, but forbid switching

from one arc path to the other.

A single white king starts at the bottom corner of one of the vaults. In any

legal move, this king must alternate between traversing entire arc paths and

clearing vaults. The king can traverse the various gadgets backward, entering

410

12.15. A Frivolous Real-World Example

1

2

4

4

2

1

3 3

Entrance

Exit

Figure 12.23. Left: A vertex gadget with three entrances and three exits. Right: A white king emptying

the vault. Gray circles are black pieces that cannot be captured.

Figure 12.24. Left: One of two paths through a corner gadget. Right: One of two paths through a

crossing gadget.

each vault through the exit and vice versa. But the reversal of a Hamiltonian

cycle in G is another Hamiltonian cycle in G, so walking backward is fine.

If there is a Hamiltonian cycle in G, the white king can capture at least nN

black pieces by visiting each of the other vaults and returning to the starting

vault. On the other hand, if there is no Hamiltonian cycle in G, the white king

can capture at most half of the pieces in the starting vault, and thus can capture

at most (n

1/2)N + O(n3) enemy pieces altogether. The O(n3) term accounts

for the corner and crossing gadgets; each edge passes through one corner gadget

and at most n2/2 crossing gadgets.

To complete the reduction, we set N = n4. Summing up, we obtain an

O(n5) ⇥ O(n5) board configuration, with O(n5) black pieces and one white king.

We can clearly construct this board configuration by brute force in polynomial

time. Figure .

shows a complete example of the construction.

It is still open whether the following related question is NP-hard: Given an

n ⇥ n board configuration for international draughts, can (and therefore must)

White capture all the black pieces (thereby winning the game) in a single turn?

411

12. NP-HARDNESS

1 2 3 4

Figure 12.25. The final draughts configuration for the 4-vertex graph in Figure 12.22. (The green arrows

are not actually part of the configuration.)

™12.16 On Beyond Zebra

P and NP are only the first two steps in an enormous hierarchy of complexity

classes. To close this chapter (and the book), let me describe a few more classes

of interest.

Polynomial Space

PSPACE is the set of decision problems that can be solved using polynomial

space. Every problem in NP (and therefore in P) is also in PSPACE. It is generally

believed that NP 6= PSPACE, but nobody can even prove that P 6= PSPACE.

A problem ⇧ is PSPACE-hard if, for any problem ⇧0 that can be solved using

polynomial space, there is a polynomial-time many-one reduction from ⇧0 to ⇧.

If any PSPACE-hard problem is in NP, then PSPACE=NP; similarly, if any

PSPACE-hard problem is in P, then PSPACE=P.

412

™12.16. On Beyond Zebra

The canonical PSPACE-hard problem is the quantified boolean formula

problem, or QBF: Given a boolean formula

that may include any number

of universal or existential quantifiers, but no free variables, is

equivalent to

T?

For example, the following expression is a valid input to QBF:

9a: 8b: 9c: (8d : a _ b _ c _ d¯) , ((b ^ ¯c) _ (9e : (a¯ ) e) _ (c 6= a ^ e))).

S

is equivalent to the special case of QBF where the input formula contains

only existential quantifiers (9). QBF remains PSPACE-hard even when the input

formula must have all its quantifiers at the beginning, the quantifiers strictly

alternate between 9 and 8, and the quantified proposition is in conjunctive

normal form, with exactly three literals in each clause, for example:

9a: 8b: 9c: 8d :


(a _ b _ c) ^ (b _ ¯c _ d¯) ^ (a¯ _ c _ d) ^ (a _ ¯

b _ d¯)


This restricted version of QBF can also be phrased as a two-player strategy

question. Suppose two players, Alice and Bob, are given a CNF

predicate

with free variables x1, x2,..., xn. The players alternately assign values to the

variables in order by index—Alice assigns a value to x1, Bob assigns a value

to x2, and so on. Alice eventually assigns values to every variable with an

odd index, and Bob eventually assigns values to every variable with an even

index. Alice wants to make the expression T,

and Bob wants to make

it F.

Assuming Alice and Bob play perfectly, who wins this game? Not

surprisingly, most two-player games

like tic-tac-toe, reversi, checkers, go,

chess, and mancala—or more accurately, appropriate generalizations of these

constant-size games to arbitrary board sizes—are PSPACE-hard.

Another canonical PSPACE-hard problem is NFA totality: Given a nondeterministic

finite-state automaton M over some alphabet ⌃, does M accept

every string in ⌃⇤? The closely related problems NFA equivalence (Do two given

NFAs accept the same language?) and NFA minimization (Find the smallest NFA

that accepts the same language as a given NFA) are also PSPACE-hard, as are

the corresponding questions about regular expressions. (The corresponding

questions about deterministic finite-state automata are actually solvable in

polynomial time.)

Exponential Time

The next significantly larger complexity class, EXP (also called EXPTIME), is

the set of decision problems that can be solved in exponential time, that is,

using at most 2nc

steps for some constant c > 0. Every problem in PSPACE (and

For

a good (but inevitably aging) overview of known results on the computational complexity

of games and puzzles, see Erik Demaine and Bob Hearn’s monograph Games, Puzzles, and

Computation (CRC Press, ).

413

12. NP-HARDNESS

therefore in NP (and therefore in P)) is also in EXP. It is generally believed

that PSPACE ( EXP, but nobody can even prove that NP 6= EXP. A problem ⇧ is

EXP-hard if, for any problem ⇧0 that can be solved in exponential time, there is

a polynomial-time many-one reduction from ⇧0 to ⇧. If any EXP-hard problem is

in PSPACE, then EXP=PSPACE; similarly, if any EXP-hard problem is in NP, then

EXP=NP. We do know that P 6= EXP; in particular, no EXP-hard problem is in P.

Natural generalizations of many interesting -player

games—like checkers,

chess, mancala, and go—are actually EXP-hard. The boundary between PSPACEcomplete

games and EXP-hard games is rather subtle. For example, there are

three ways to draw in chess (the standard 8 ⇥ 8 game): stalemate (the player to

move is not in check but has no legal moves), repeating the same board position

three times, or moving fifty times without capturing or moving a pawn. The

n ⇥ n generalization of chess is either in PSPACE or EXP-hard depending on

how we generalize these rules. If we declare a draw after (say) n3 capture-free

moves, then every game must end after a polynomial number of moves, so we

can simulate all possible games from any given position using only polynomial

space. On the other hand, if we ignore the capture-free move rule entirely, the

resulting game can last an exponential number of moves, so there no obvious

way to detect a repeating position using only polynomial space; indeed, this

version of n ⇥ n chess is EXP-hard.

Excelsior!

Naturally, even exponential time is not the end of the story. NEXP is the class

of decision problems that can be solve in nondeterministic exponential time;

equivalently, a decision problem is in NEXP if and only if, for every Y

instance,

there is a proof of this fact that can be checked in exponential time. EXPSPACE

is the set of decision problems that can be solved using exponential space. Even

these larger complexity classes have hard problems; for example, if we add the

intersection operator \ to the syntax of regular expressions, deciding whether

two such expressions describe the same language is EXPSPACE-hard. Beyond

EXPSPACE are complexity classes with doubly-exponential resource bounds

(EEXP, NEEXP, and EEXPSPACE), then triply exponential resource bounds

(EEEXP, NEEEXP, and EEEXPSPACE), and so on ad infinitum.

All these complexity classes can be ordered by inclusion:

P ✓ NP ✓ PSPACE ✓ EXP ✓ NEXP ✓ EXPSPACE ✓ EEXP ✓ NEEXP ✓ ···

Most complexity theorists strongly believe that every inclusion in this sequence

is strict; that is, no two of these complexity classes are equal. However, the

strongest result that has been proved is that every class in this sequence is strictly

contained in the class three steps later in the sequence. For example, we have

414

Exercises

proofs that P 6= EXP and PSPACE 6= EXPSPACE, but not whether P 6= PSPACE or

NP 6= EXP.

The limit of this series of increasingly exponential complexity classes is the

class ELEMENTARY of decision problems that can be solved using time or space

bounded by a function the form 2 "k n for some constant integer k, where

2 "k n :=

®

n if k = 0,

22"k1n

otherwise.

For example, 2 "1 n = 2n and 2 "2 n = 22n

.

It may be tempting to conjecture that every natural decidable problem can

be solved in elementary time, but in fact this conjecture is incorrect. Consider

the generalized regular expressions defined by recursively combining (possibly

empty) strings over some finite alphabet by concatenation (x y), union (x + y),

Kleene closure (x⇤), and negation (x). For example, the generalized regular

expression (0 + 1)

⇤00(0 + 1)

⇤ represents the set of strings in {0, 1}⇤ that do not

contain two 0s in a row. It is possible to determine algorithmically whether

two generalized regular expressions describe identical languages, by recursively

converting each expression into an equivalent NFA, converting each NFA into a

DFA, and then minimizing the DFA. However, the running time of this algorithm

has the non-elementary bound 2 "⇥(n) 2, intuitively because each layer of

recursive negation can exponentially increase the number of states. In fact,

Larry Stockmeyer proved in

that this problem cannot be solved in merely

elementary time, even if we forbid Kleene closure.

Exercises

.

(a) Describe and analyze and algorithm to solve P

in time O(nM),

where n is the size of the input set and M is the sum of the absolute

values of its elements.

(b) Why doesn’t this algorithm imply that P=NP?

.

Consider the following problem, called BD:

Given a set of n axisaligned

rectangles in the plane, how big is the largest subset of these

rectangles that contain a common point?

(a) Describe a polynomial-time reduction from BD

to MC.

(b) Describe and analyze a polynomial-time algorithm for BD.

[Hint:

O(n3) time should be easy, but O(n log n) time is possible.]

(c) Why don’t these two results imply that P=NP?

415

12. NP-HARDNESS

.

A boolean formula is in disjunctive normal form (or DNF) if it consists of a

disjunction (O)

or several terms, each of which is the conjunction (A)

of

one or more literals. For example, the formula

(x ^ y ^ z) _ ( y ^ z) _ (x ^ y ^ z)

is in disjunctive normal form. DNF-S

asks, given a boolean formula in

disjunctive normal form, whether that formula is satisfiable.

(a) Describe a polynomial-time algorithm to solve DNF-S.

(b) What is the error in the following argument that P=NP?

Suppose we are given a boolean formula in conjunctive normal form with

at most three literals per clause, and we want to know if it is satisfiable.

We can use the distributive law to construct an equivalent formula in disjunctive

normal form. For example,

(x _ y _ z) ^ (x _ y) () (x ^ y) _ (y ^ x) _ (z ^ x) _ (z ^ y)

Now we can use the algorithm from part (a) to determine, in polynomial

time, whether the resulting DNF formula is satisfiable. We have just solved

3SAT in polynomial time. Since 3SAT is NP-hard, we must conclude that

P=NP!

.

The problem AONS

asks, given a CNF

boolean formula,

whether there is an assignment to the variables such that each clause either

has three T

literals or has three F

literals.

(a) Describe a polynomial-time algorithm to solve AONS.

(b) But S

is NP-hard! Why doesn’t the existence of this algorithm prove

that P=NP?

.

(a) Suppose you are given a magic black box that can determine in polynomial

time, given an arbitrary weighted graph G, the length of the

shortest Hamiltonian cycle in G. Describe and analyze a polynomialtime

algorithm that computes, given an arbitrary weighted graph G,

the shortest Hamiltonian cycle in G, using this magic black box as a

subroutine.

(b) Suppose you are given a magic black box that can determine in polynomial

time, given an arbitrary graph G, the number of vertices in the

largest complete subgraph of G. Describe and analyze a polynomialtime

algorithm that computes, given an arbitrary graph G, a complete

subgraph of G of maximum size, using this magic black box as a subroutine.

416

Exercises

(c) Suppose you are given a magic black box that can determine in polynomial

time, given an arbitrary graph G, whether G is 3-colorable.

Describe and analyze a polynomial-time algorithm that either computes

a proper 3-coloring of a given graph or correctly reports that no

such coloring exists, using the magic black box as a subroutine. [Hint:

The input to the magic black box is a graph. Only a graph. Vertices and

edges. Nothing else.]

(d) Suppose you are given a magic black box that can determine in polynomial

time, given an arbitrary boolean formula ,

whether

is satisfiable.

Describe and analyze a polynomial-time algorithm that either computes

a satisfying assignment for a given boolean formula or correctly reports

that no such assignment exists, using the magic black box as a subroutine.

(e) Suppose you are given a magic black box that can determine in polynomial

time, given an arbitrary set X of positive integers, whether X can

be partitioned into two sets A and B such that PA = PB. Describe and

analyze a polynomial-time algorithm that either computes an equal

partition of a given set of positive integers or correctly reports that no

such partition exists, using the magic black box as a subroutine.

®™(f) Suppose you are given a magic black box that can determine in polynomial

time, given an arbitrary generalized regular expression R (as

defined just before the Exercises), whether R matches any string. Describe

and analyze a polynomial-time algorithm that either finds a

single string that matches a given generalized regular expression or

correctly reports that no such string exists, using the magic black box as

a subroutine.

.

There’s something special about the number 3.

(a) Describe and analyze a polynomial-time algorithm for P.

Given

a set S of 2n positive integers, your algorithm will determine in polynomial

time whether the elements of S can be split into n disjoint pairs

whose sums are all equal.

(b) Describe and analyze a polynomial-time algorithm for C.

Given an

undirected graph G, your algorithm will determine in polynomial time

whether G has a proper coloring that uses only two colors.

(c) Describe and analyze a polynomial-time algorithm for S.

Given a

boolean formula

in conjunctive normal form, with exactly two literals

per clause, your algorithm will determine in polynomial time whether

has a satisfying assignment. [Hint: This problem is strongly connected

to topics described in an earlier chapter.]

417

12. NP-HARDNESS

.

There’s nothing special about the number 3.

(a) The problem P

is defined as follows: Given a set S of 12n

positive integers, determine whether the elements of S can be split into n

subsets, each with 12 elements, whose sums are all equal. Prove that

P

is NP-hard. [Hint: Reduce from P.

It may be

easier to consider multisets first.]

(b) The problem C

is defined as follows: Given an undirected graph G,

determine whether we can color each vertex with one of twelve colors,

so that every edge touches two dierent

colors. Prove that C

is

NP-hard. [Hint: Reduce from C.]

(c) The problem SAT

is defined as follows: Given a boolean formula

in conjunctive normal form, with exactly twelve literals per clause,

determine whether

has a satisfying assignment. Prove that S

is

NP-hard. [Hint: Reduce from S.]

.

There are two dierent

versions of the Hamiltonian cycle problem, one for

directed graphs and one for undirected graphs. Earlier in this chapter you

can find two proofs that the directed Hamiltonian cycle problem is NP-hard.

(a) Describe a polynomial-time reduction from the undirected Hamiltonian

cycle problem to the directed Hamiltonian cycle problem. Prove your

reduction is correct.

(b) Describe a polynomial-time reduction from the directed Hamiltonian

cycle problem to the undirected Hamiltonian cycle problem. Prove your

reduction is correct.

(c) Which of these two reductions implies that the undirected Hamiltonian

cycle problem is NP-hard?

.

(a) Describe a polynomial-time reduction from UH-

C

to DHC.

(b) Describe a polynomial-time reduction from DH-

C

to UHC.

.

(a) Describe a polynomial-time reduction from the HP

problem

to HC.

(b) Describe a polynomial-time reduction from the HC

problem to HP.

[Hint: A polynomial-time reduction can

call the black-box subroutine more than once, but it doesn’t have to.]

.

Consider the following subtle variants of CNFS.

For each problem, the

input is a boolean formula

in conjunctive normal form, and the goal is to

determine whether

has a satisfying assignment.

418

Exercises

(a) Suppose every clause of

contains at most three literals and each

variable appears in at most three clauses. Prove that this variant of

CNFS

is NP-hard.

(b) Suppose every clause of

contains exactly three literals and each

variable appears in at most four clauses. Prove that this variant of S

is NP-hard. [Hint: Solve part (a) first.]

™(c) Suppose every clause of

can contain any number of literals, but each

variable appears in at most two clauses. Describe a polynomial-time

algorithm for this variant of CNFS.

™(d) Suppose every clause of

contains exactly three literals and each variable

appears in at most three clauses. Prove that

must be satisfiable. (So

this variant of S

is completely trivial!)

.

(a) Prove that NAES

is NP-hard.

(b) Prove that --S

is NP-hard.

.

A boolean formula in exclusive-or conjunctive normal form (XCNF) is a

conjunction (A)

of several clauses, each of which is the exclusive-or of

several literals; that is, a clause is true if and only if it contains an odd number

of true literals. The XCNF-S

problem asks whether a given XCNF formula

is satisfiable. Either describe a polynomial-time algorithm for XCNF-S

or

prove that XCNF-S

is NP-hard. [Hint: Do not try to do both.]

™.

Consider the following variant of S,

called MS.

Just like S,

the input to MS

is a boolean formula

in conjunctive normal

form, with exactly three literals er clause. MS

asks whether

there is an assignment to the variables of ,

such that every clause contains

at least two T

literals.

Either describe an algorithm that solves MS

in polynomial

time or prove that MS

is NP-hard. [Hint: Do not try to do both.]

´™.

For any subset X ✓ {0, 1, 2, 3}, consider the following problem, which I’ll

call X-S.

The input is a boolean formula

in conjunctive normal form,

with exactly three literals in each clause. The problem is to decide whether

there is an assignment to the variables of

such that in each clause of ,

the number of T

literals is in the set X. For example:

• {1, 2, 3}-S

is the standard S

problem.

• {0, 3}-S

is the same as AONS.

(See Exercise .)

• {1, 2}-S

is usually called NAES.

(See Exercise (a).)

• {1}-S

is usually called --S.

(See Exercise (b).)

419

12. NP-HARDNESS

• {1, 3}-S

is usually called XCNF-S.

(See Exercise .)

• {2, 3}-S

is usually called MS.

(See Exercise .)

Give a complete list of all subsets X ✓ {0, 1, 2, 3} such that X-S

is solvable

in polynomial time, assuming P 6= NP. [Hint: Don’t give 16 dierent

arguments.]

.

Prove that the following problems are NP-hard.

(a) Given an undirected graph G, does G contain a simple path that visits

all but

vertices?

(b) Given an undirected graph G, does G have a spanning tree in which

every node has degree at most ?

(c) Given an undirected graph G, does G have a spanning tree with at most


leaves?

(d) Given an undirected graph G = (V, E), what is the size of the largest

subset of vertices S ✓ V such that at most

edges in E have both

endpoints in S?

(e) Given an undirected graph G = (V, E), what is the size of the largest

subset of vertices S ✓ V such that each vertex in S has at most

neighbors in S?

(f) Given an undirected graph G, is it possible to color the vertices of G with

three dierent

colors, so that at most

edges have both endpoints

the same color?

.

Prove that the following variants of the minimum spanning tree problem

are NP-hard.

(a) Given a graph G, compute the maximum-diameter spanning tree of G.

(The diameter of a tree T is the length of the longest path in T.)

(b) Given a graph G with weighted edges, compute the minimum-weight

depth-first spanning tree of G.

(c) Given a graph G with weighted edges and a subset S of vertices of G,

compute the minimum-weight spanning tree all of whose leaves are in S.

(d) Given a graph G with weighted edges and an integer `, compute the

minimum-weight spanning tree with at most ` leaves.

(e) Given a graph G with weighted edges and an integer ,

compute the

minimum-weight spanning tree where every node has degree at most .

.

(a) Using the gadget in Figure .(a),

prove that deciding whether a given

planar graph is -colorable

is NP-hard. [Hint: Show that the gadget

can be -colored,

and then replace any crossings in a planar embedding

with the gadget appropriately.]

420

Exercises

(b) Using part (a) and the gadget in Figure .(b),

prove that deciding

whether a planar graph with maximum degree

is -colorable

is NP-hard.

[Hint: Replace any vertex with degree greater than

with a collection

of gadgets connected so that no degree is greater than four.]

(a) (b)

Figure 12.26. (a) Gadget for planar 3-colorability. (b) Gadget for degree-4 planar 3-colorability.

.

Prove that PCS

is NP-hard. [Hint: Construct a gadget for

crossing wires.]

.

(a) Describe a polynomial-time reduction from S

to S.

(b) Describe a polynomial-time reduction from S

to S.

™.

Describe a direct polynomial-time reduction from C

to C.

(This

is a lot harder than the opposite direction.)

.

A domino is a 1 ⇥ 2 rectangle divided into two squares, each of which is

labeled with an integer.

In a legal arrangement of dominos, the dominos

are lined up end-to-end so that the numbers on adjacent ends match.

Figure 12.27. A legal arrangement of dominos in which every integer between 0 and 6 appears twice

For each of the following problems, either describe a polynomial-time

algorithm or prove that the problem is NP-hard:

(a) Given an arbitrary bag D of dominos, is there a legal arrangement of all

the dominos in D?

(b) Given an arbitrary bag D of dominos, is there a legal arrangement of a

dominos from D in which every integer between 1 and n appears exactly

twice?

These

integers are usually represented by pips, exactly like dice. On a standard domino, the

number of pips on each side is between

and ,

although one can buy sets with up to

or even


pips on each side; we will allow arbitrary integer labels. A standard set of dominos contains

exactly one domino for each possible unordered pair of labels; we do not assume that the inputs

to our problems have this property.

421

12. NP-HARDNESS

™(c) Given an arbitrary bag D of dominos, what is the largest number of

dominos we can take from D to make a legal arrangement?

.

Pebbling is a solitaire game played on an undirected graph G, where each

vertex has zero or more pebbles. A single pebbling move consists of removing

two pebbles from a vertex v and adding one pebble to an arbitrary neighbor

of v. (Obviously, the vertex v must have at least two pebbles before the move.)

The PD

problem asks, given a graph G = (V, E) and a

pebble count p(v) for each vertex v, whether is there a sequence of pebbling

moves that removes all but one pebble. Prove that PD

is

NP-hard.

.

Recall that a -coloring

of a graph G is a function that assigns each vertex

of G a “color” from the set {0, 1, 2, 3, 4}, such that for any edge uv, vertices u

and v are assigned dierent

“colors”. A -coloring

is careful if the colors

assigned to adjacent vertices are not only distinct, but dier

by more than 1

(mod 5). Prove that deciding whether a given graph has a careful 5-coloring

is NP-hard. [Hint: Reduce from the standard C

problem.]

3

4

0 4

2

3 0 2

1

Figure 12.28. A careful 5-coloring.

.

(a) A subset S of vertices in an undirected graph G is half-independent if

each vertex in S is adjacent to at most one other vertex in S. Prove that

finding the size of the largest half-independent set of vertices in a given

undirected graph is NP-hard.

(b) A subset S of vertices in an undirected graph G is sort-of-independent if

if each vertex in S is adjacent to at most

other vertices in S. Prove

that finding the size of the largest sort-of-independent set of vertices in

a given undirected graph is NP-hard.

(c) A subset S of vertices in an undirected graph G is almost independent if at

most

edges in G have both endpoints in S. Prove that finding the size

of the largest almost-independent set of vertices in a given undirected

graph is NP-hard.

.

Let G = (V, E) be a graph. A dominating set in G is a subset S of the vertices

such that every vertex in G is either in S or adjacent to a vertex in S. The

422

Exercises

DS

problem asks, given a graph G and an integer k as input,

whether G contains a dominating set of size k. Prove that this problem is

NP-hard.

Figure 12.29. A dominating set of size 3 in the Petersen graph.

.

A subset S of vertices in an undirected graph G is triangle-free if, for every

triple of vertices u, v, w 2 S, at least one of the three edges uv, uw, vw is

absent from G. Prove that finding the size of the largest triangle-free subset

of vertices in a given undirected graph is NP-hard.

Figure 12.30. A triangle-free subset of 7 vertices. This is not the largest such subset in this graph.

.

The RT

problem is defined as follows: Given one large

rectangle and several smaller rectangles, determine whether the smaller

rectangles can be placed inside the large rectangle with no gaps or overlaps.

(a) Prove that RT

is NP-hard.

(b) Prove that RT

is strongly NP-hard.

7

1

5 3

6

2

4

7

1

5 3

6

2 4

Figure 12.31. A positive instance of the RT

problem.

.

(a) A subset B of vertices in a graph G is a Burr set if removing every vertex

in B from G leaves a subgraph that does not contain a Hamiltonian path.

Prove that finding the smallest Burr set in a given graph is NP-hard.

(b) A subset S of vertices in a graph G is a Schuyler set if removing every

vertex in S from G leaves a subgraph that does contain a Hamiltonian

423

12. NP-HARDNESS

path. Prove that finding the smallest Schuyler set in a given graph is

NP-hard.

.

(a) A tonian path in a graph G is a path that goes through at least half of

the vertices of G. Show that determining whether a graph has a tonian

path is NP-hard.

(b) A tonian cycle in a graph G is a cycle that goes through at least half of

the vertices of G. Show that determining whether a graph has a tonian

cycle is NP-hard. [Hint: Use part (a). Or not.]

.

Let G be an undirected graph with weighted edges. A heavy Hamiltonian

cycle is a cycle C that passes through each vertex of G exactly once, such that

the total weight of the edges in C is more than half of the total weight of all

edges in G. Prove that deciding whether a graph has a heavy Hamiltonian

cycle is NP-hard.

4

8

2

7

5

3

1

12

8

6

5

9

5

Figure 12.32. A heavy Hamiltonian cycle. The cycle has total weight 34; the graph has total weight 67.

.

For each of the following problems, either describe a polynomial-time

algorithm or prove that the problem is NP-hard.

(a) A double-Eulerian tour in an undirected graph G is a closed walk that

traverses every edge in G exactly twice. Given a graph G, does G have a

double-Eulerian tour?

(b) A double-Hamiltonian tour in an undirected graph G is a closed walk

that visits every vertex in G exactly twice. Given a graph G, does G have

a double-Hamiltonian tour?

(c) A double-Hamiltonian circuit in an undirected graph G is a closed walk

that visits every vertex in G exactly twice and traverses each edge in G at

most once. Given a graph G, does G have a double-Hamiltonian circuit?

(d) A triple-Eulerian tour in an undirected graph G is a closed walk that

traverses every edge in G exactly three times. Given a graph G, does G

have a triple-Eulerian tour?

424

Exercises

(e) A triple-Hamiltonian tour in an undirected graph G is a closed walk that

visits every vertex in G exactly three times. Given a graph G, does G

have a triple-Hamiltonian tour?

.

This exercise asks you to prove that a certain reduction from VC

to ST

is correct. Suppose we want to find the smallest vertex

cover in a given undirected graph G = (V, E). We construct a new graph

H = (V0

, E0

) as follows:

• V0 = V [ E [ {z}

• E0 = {ve | v 2 V is an endpoint of e 2 E} [ {vz | v 2 V}.

Equivalently, we construct H by subdividing each edge in G with a new

vertex, and then connecting all the original vertices of G to a new apex

vertex z.

Prove that G has a vertex cover of size k if and only if there is a subtree

of H with k + |E| + 1 vertices that contains every vertex in E [ {z}.

.

Consider the following solitaire game. The puzzle consists of an n ⇥ m

grid of squares, where each square may be empty, occupied by a red stone,

or occupied by a blue stone. The goal of the puzzle is to remove some of

the given stones so that the remaining stones satisfy two conditions: ()

every row contains at least one stone, and ()

no column contains stones of

both colors. For some initial configurations of stones, reaching this goal is

impossible.

A solvable puzzle and one of its many solutions. An unsolvable puzzle.

Prove that it is NP-hard to determine, given an initial configuration of

red and blue stones, whether the puzzle can be solved.

.

Each of the following games involves an n ⇥ m grid of squares, where each

square is either empty or occupied by a stone. In a single move, you can

remove all the stones in an arbitrary column.

(a) Prove that it is NP-hard to find the smallest subset of columns that can

be cleared so that at most one stone remains in each row of the grid.

(b) Prove that it is NP-hard to find the largest subset of columns that can be

cleared so that at least one stone remains in each row of the grid.

425

12. NP-HARDNESS

™(c) Prove that it is NP-hard to determine whether any subset of columns can

be cleared so that exactly one stone remains in each row of the grid.

.

Je

tries to make his students happy. At the beginning of class, he passes out

a questionnaire that lists a number of possible course policies in areas where

he is flexible. Every student is asked to respond to each possible course

policy with one of “strongly favor”, “mostly neutral”, or “strongly oppose”.

Each student may respond with “strongly favor” or “strongly oppose” to at

most five questions. Because Je’s

students are very understanding, each

student is happy if (but only if) he or she prevails in at least one of their

strong policy preferences. Either describe a polynomial-time algorithm for

setting course policy to maximize the number of happy students, or show

that the problem is NP-hard.

.

You’re in charge of choreographing a musical for your local community

theater, and it’s time to figure out the final pose of the big show-stopping

number at the end. (“Streetcar!”) You’ve decided that each of the n cast

members in the show will be positioned in a big line when the song finishes,

all with their arms extended and showing o

their best spirit fingers.

The director has declared that during the final flourish, each cast member

must either point both their arms up or point both their arms down; it’s

your job to figure out who points up and who points down. Moreover, the

director has also given you a list of arrangements that will upset his delicate

artistic temperament. Each forbidden arrangement is a subset of the cast

members paired with arm positions; for example: “Marge may not point her

arms up while Ned, Apu, and Smithers point their arms down.”

Prove that finding an acceptable arrangement of arm positions is NP-hard.

.

The next time you are at a party, one of the guests will suggest everyone play a

round of Three-Way Mumbletypeg, a game of skill and dexterity that requires

three teams and a knife. The ocial

Rules of Three-Way Mumbletypeg

(fixed during the Holy Roman Three-Way Mumbletypeg Council in )

require that ()

each team must have at least one person, ()

any two people

on the same team must know each other, and ()

everyone watching the

game must be on one of the three teams. Of course, it will be a really fun

party; nobody will want to leave. There will be several pairs of people at

the party who don’t know each other. The host of the party, having heard

thrilling tales of your prowess in all things algorithmic, will hand you a list

of which pairs of party-goers know each other and ask you to choose the

teams, while he sharpens the knife.

426

Exercises

Either describe and analyze a polynomial time algorithm to determine

whether the party-goers can be split into three legal Three-Way Mumbletypeg

teams, or prove that the problem is NP-hard.

The Algorithm March is performed by a line of people; each person in line

starts a specific sequence of movements one measure later than the person

directly in front of them. Thus, the march is the dance equivalent of a

musical round or canon, like “Row Row Row Your Boat” or "Frère Jacques".

Proper etiquette dictates that each marcher must know the person

directly in front of them in line, lest a minor mistake lead to horrible

embarrassment between strangers. Suppose you are given a complete list

of which people at your party know each other. Prove that it is NP-hard

to determine the largest number of party-goers that can participate in the

Algorithm March. You may assume without loss of generality that there are

no ninjas at your party.

®.

Prove that the following problems about nondeterministic finite-state automata

and regular expressions are NP-hard:

(a) Given an NFA M over the alphabet ⌃ = {0, 1}, is there a string in ⌃⇤

that M does not accept?

(b) Given an acyclic NFA M over the alphabet ⌃ = {0, 1}, what is the length

of the shortest string in ⌃⇤ that M does not accept?

(c) Given a regular expression R over the alphabet ⌃ = {0, 1}, is there a

string in ⌃⇤ that R does not match?

(d) Given a star-free regular expression R over the alphabet ⌃ = {0, 1}, what

is the length of the shortest string in ⌃⇤ that R does not match?

(In fact, problems (a) and (c) are PSPACE-complete; even proving that these

problems are in PSPACE is nontrivial.)

®.

(a) Describe a polynomial-time algorithm for the following problem: Given

an NFA M over the alphabet ⌃ = {0, 1}, is there a string in ⌃⇤ that M

does accept?

(b) Describe a polynomial-time algorithm for the following problem: Given

a regular expression R over the alphabet ⌃ = {0, 1}, is there a string

in ⌃⇤ that R does match?

427

12. NP-HARDNESS

(c) The complement of any regular language is another regular language.

So why don’t these two algorithms, together with the NP-hardness results

in Problem ,

prove that P=NP?

.

Charon needs to ferry n recently deceased people across the river Acheron

into Hades. Certain pairs of these people are sworn enemies, who cannot

be together on either side of the river unless Charon is also present. (If two

enemies are left alone, one will steal the obol from the other’s mouth, leaving

them to wander the banks of the Acheron as a ghost for all eternity. Let’s

just say this is a Very Bad Thing.) The ferry can hold at most k passengers

at a time, including Charon, and only Charon can pilot the ferry.

Prove that it is NP-hard to decide whether Charon can ferry all n people

across the Acheron unharmed (aside from being, you know, dead). The

input for Charon’s problem consists of the integers k and n and an n-vertex

graph G describing the pairs of enemies. The output is either T

or F.

Please do not write your solution in classical Latin.

This

problem is a generalization of the well-known wolf-goat-and-cabbage puzzle, whose

first known appearance is in the remarkable medieval manuscript Propositiones ad Acuendos

Juvenes [Problems to Sharpen the Young].

Homo quidam debebat ultra fluuium transferre lupum, capram, et fasciculum cauli. Et

non potuit aliam nauem inuenire, nisi quae duos tantum ex ipsis ferre ualebat. Praeceptum

itaque ei fuerat, ut omnia haec ultra illaesa omnino transferret. Dicat, qui potest, quomodo

eis illaesis transire potuit?

Solutio. Simili namque tenore ducerem prius capram et dimitterem foris lupum et caulum.

Tum deinde uenirem, lupumque transferrem: lupoque foris misso capram naui receptam ultra

reducerem; capramque foris missam caulum transueherem ultra; atque iterum remigassem,

capramque assumptam ultra duxissem. Sicque faciendo facta erit remigatio salubris, absque

uoragine lacerationis.

For those few readers whose classical Latin is a little rusty, here is an English translation:

A man needed to transfer a wolf, a goat, and a bundle of cabbage across a river. However,

he found that his boat could only bear the weight of two [objects at a time, including the

man]. And he had to get everything across unharmed. Tell me if you can: How they were

able to cross unharmed?

Solution. In a similar fashion [as an earlier problem], I would first take the goat across

and leave the wolf and cabbage on the opposite bank. Then I would take the wolf across;

leaving the wolf on shore, I would retrieve the goat and bring it back again. Then I would

leave the goat and take the cabbage across. And then I would row across again and get the

goat. In this way the crossing would go well, without any threat of slaughter.

The most likely author of the Propositiones is the prolific th-century

English scholar Alcuin of

York. The evidence for Alcuin’s authorship of this treatise is somewhat circumstantial; however,

we do know from his correspondence with Charlemagne that he sent the emperor some “simple

arithmetical problems for fun”. Most modern scholars believe that even if Alcuin did write the

Propositiones, he did not invent all of the problems himself, but rather collected them from even

earlier sources.

Some things never change.

428


版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp