
Theoretical Chemistry I

Martin Schütz1

1Institute of Physical and Theoretical Chemistry, University of Regensburg, Universitätsstraße 31, D-93040 Regensburg, Germany

“So long as no concept can be found for the chemical actions of matters on one another that admits of construction, that is, so long as no law of the approach or withdrawal of the parts can be specified according to which, say, in proportion to their densities and the like, their motions together with their consequences can be made intuitive and presented a priori in space (a demand that will hardly ever be fulfilled), chemistry can become nothing more than a systematic art or experimental doctrine, but never a proper science, because its principles are merely empirical and permit no presentation a priori in intuition; consequently they do not make the principles of chemical appearances in the least conceivable as to their possibility, for they are incapable of the application of mathematics.”

I. Kant (1786), Metaphysische Anfangsgründe der Naturwissenschaft

“The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble.”

P.A.M. Dirac, Proc. R. Soc. London 123, 714 (1929)


Mathematical primer

A. Vectors, operators, and their matrix representations

A three-dimensional vector |a〉 can be specified by its components a1, a2, a3 w.r. to an orthonormalized basis |1〉, |2〉, |3〉 as

|a〉 = |1〉a1 + |2〉a2 + |3〉a3 = ∑_{i=1}^{3} |i〉ai, with 〈i|j〉 = { 1 (if i = j), 0 (otherwise) } = δij. (1)

In the previous equation we have introduced the Kronecker delta δij.

Usually, the components of a vector are written in matrix form, i.e. as a column matrix a = (a1, a2, a3)^T.

Of course the choice of the basis is not unique, and a different orthonormalized basis |1′〉, |2′〉, |3′〉 can be used instead. W.r. to that basis |a〉 has different components, i.e., a′ = (a′1, a′2, a′3)^T.

The scalar product between two vectors |a〉 and |b〉 is defined as

〈a||b〉 ≡ 〈a|b〉 = ∑_{i=1}^{3} a∗i bi = (a∗1 a∗2 a∗3)(b1, b2, b3)^T. (2)

The asterisk in the above equation implies complex conjugation of the bra component ai and of course has no effect provided that ai is real. The reader may have noticed so far that I have introduced Dirac's general 〈bra| |ket〉 formalism to denote the vectors. In matrix form, ket vectors correspond to the usual column matrices; bra vectors correspond to their adjoints, i.e., to row matrices obtained from the former by transposition and complex conjugation.

Let us now write the scalar product 〈a|b〉 by inserting the individual vectors as defined in eq. (1). We obtain

〈a|b〉 = ∑_{i=1}^{3} ∑_{j=1}^{3} a∗i 〈i|j〉 bj, (3)

which corresponds to eq. (2) only if 〈i|j〉 = δij, i.e., for an orthonormal basis. The latter equation is a more general form of the scalar product for an arbitrary (non-orthonormal) basis with metric 〈i|j〉. In the following we restrict our discussion to orthonormal bases.

Having a vector |a〉, its component along |i〉 is obtained as 〈i|a〉 = ∑_k 〈i|k〉 ak = ∑_k δik ak = ai. This is also known as the projection of |a〉 onto |i〉. Using this in eq. (1) yields

|a〉 = ∑_k |k〉〈k|a〉 ≡ ∑_k |k〉〈k| |a〉, (4)

which shows that ∑_k |k〉〈k| ≡ 1, where 1 is the unity operator. The matrix representation of the unity operator is the unity matrix 1, i.e., a matrix with ones on the diagonal and zeros elsewhere (here, of course, a 3×3 unity matrix). Note that in contrast to the scalar product 〈i|j〉 (row times column matrix), yielding a number, the dyadic product |i〉〈j| (column times row matrix) yields a square matrix.
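As a quick numerical illustration of eq. (4) (a minimal NumPy sketch, not part of the original notes): the columns of any unitary matrix form an orthonormal basis of C³, the dyadic products |k〉〈k| sum to the unit matrix, and an arbitrary vector is recovered from its projections 〈k|a〉.

```python
import numpy as np

rng = np.random.default_rng(0)
# Any unitary matrix provides an orthonormal basis: take its columns.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
basis = [Q[:, k] for k in range(3)]

# Completeness relation of eq. (4): sum_k |k><k| equals the unit matrix.
completeness = sum(np.outer(k, k.conj()) for k in basis)
assert np.allclose(completeness, np.eye(3))

# Projection of an arbitrary |a> onto the basis, then re-synthesis.
a = np.array([1.0, 2.0 - 1j, 3.0])
coeffs = [k.conj() @ a for k in basis]            # a_k = <k|a>
a_rebuilt = sum(c * k for c, k in zip(coeffs, basis))
assert np.allclose(a_rebuilt, a)
```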

We now have already obtained the simplest operator, namely unity. There are of course more interesting operators. Generally, operators are entities which convert a vector into another vector, i.e., O|a〉 = |b〉. We will exclusively deal with linear operators, for which we have for any numbers x and y

O(x|a〉 + y|b〉) = xO|a〉 + yO|b〉. (5)


Since any vector |a〉 can be expressed as a linear combination of its basis vectors |i〉 it is sufficient to know what O does with the individual |i〉. Now since O|i〉 still is a vector living in the same vector space we can expand it as

O|i〉 = ∑_k |k〉 Oki, (6)

where Oki is an element of the square matrix O. O is the matrix representation of the operator O in the basis |i〉. The action of the operator O on an arbitrary vector |a〉 living in the vector space spanned by |i〉 then is

|b〉 = O|a〉 = O ∑_i |i〉 ai = ∑_k ∑_i |k〉 Oki ai, (7)

and an individual component bj of vector |b〉 is

bj = 〈j|b〉 = ∑_i ∑_k 〈j|k〉 Oki ai = ∑_i ∑_k δjk Oki ai = ∑_i Oji ai, i.e., b = Oa. (8)

Eq. (8) is the matrix representation of the operator equation |b〉 = O|a〉. In the context of a 3-dimensional vector space O is a 3 × 3 square matrix, and a, b are column matrices with 3 elements. Of course, everything we have outlined so far is straightforwardly generalized from 3- to N-dimensional vector spaces: we then have not three but N orthonormal basis vectors |i〉 = |1〉, |2〉, …, |N〉. Accordingly, the bra and ket vectors correspond to row and column matrices, respectively, involving N rather than three elements, and the operators operating in that vector space correspond to N × N rather than 3 × 3 square matrices.

Starting from equation (6) we have

〈j|O|i〉 = ∑_k 〈j|k〉 Oki = ∑_k δjk Oki = Oji, (9)

implying that the elements of the matrix representation of O are obtained as the scalar products of, e.g., bra vector 〈j| with ket vector O|i〉.

Operators can be applied one after the other onto a vector to their right, e.g., O1O2|a〉. It is imperative that the rightmost operator (next to the vector) is applied first. For example, if O1 corresponds to a rotation operation, while O2 is a mirror operation, the final vector obtained after application of both operators will obviously depend on the order in which the operators are applied.

The operator product O1O2 of course can be interpreted as a new operator, i.e., O = O1O2. If O1 and O2 are the matrix representations of the operators O1 and O2, respectively, then the matrix representation of the operator product O = O1O2 can be found as follows:

O|i〉 = ∑_k |k〉 Oki = O1O2|i〉 = O1 ∑_l |l〉 (O2)li = ∑_l O1|l〉 (O2)li = ∑_l ∑_k |k〉 (O1)kl (O2)li = ∑_k |k〉 ∑_l (O1)kl (O2)li,

hence Oki = ∑_l (O1)kl (O2)li. (10)

Eq. (10) defines the matrix multiplication of two matrices, O = O1O2.
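The element-wise rule of eq. (10) can be checked against the built-in matrix product, and the order dependence of operator application can be made concrete with the rotation/mirror example mentioned above (a NumPy sketch under the assumption of a 2D vector space; not part of the notes):

```python
import numpy as np

theta = np.pi / 3
# O1: counter-clockwise rotation by theta in the plane.
O1 = np.array([[np.cos(theta), -np.sin(theta)],
               [np.sin(theta),  np.cos(theta)]])
# O2: mirror operation (reflection about the x-axis).
O2 = np.array([[1.0, 0.0],
               [0.0, -1.0]])

# Matrix of the product operator via eq. (10): O_ki = sum_l (O1)_kl (O2)_li.
O_manual = np.array([[sum(O1[k, l] * O2[l, i] for l in range(2))
                      for i in range(2)] for k in range(2)])
assert np.allclose(O_manual, O1 @ O2)

# The rightmost operator acts first: mirroring then rotating differs
# from rotating then mirroring.
a = np.array([1.0, 0.0])
assert not np.allclose(O1 @ (O2 @ a), O2 @ (O1 @ a))
```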

As already pointed out above, operators, as well as their matrix representations, usually do not commute (even though there are operators and matrices which do commute). It is therefore sensible to define the commutator for matrices or operators as

[O1, O2] = O1O2 − O2O1 (matrices), (11)

[O1, O2] = O1O2 − O2O1 (operators). (12)

As you may already have heard, commutation relations of operators have a profound impact on quantum mechanics. Think of the Heisenberg uncertainty principle, which is a direct consequence of the non-commutativity of certain operators related to certain observables.
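The commutator of eqs. (11)–(12) is easy to probe numerically. A standard pair of non-commuting Hermitian matrices are the Pauli matrices, which satisfy [σx, σy] = 2iσz (a NumPy sketch; the Pauli matrices are an outside example, not introduced in these notes):

```python
import numpy as np

def commutator(A, B):
    """[A, B] = AB - BA, eq. (11)."""
    return A @ B - B @ A

# Pauli matrices: Hermitian, but mutually non-commuting.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# [sigma_x, sigma_y] = 2i sigma_z: a non-vanishing commutator.
assert np.allclose(commutator(sx, sy), 2j * sz)

# Diagonal matrices, in contrast, always commute.
D1, D2 = np.diag([1.0, 2.0]), np.diag([3.0, 4.0])
assert np.allclose(commutator(D1, D2), np.zeros((2, 2)))
```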


We now introduce the adjoint of an operator O, which we denote by O†. If O changes the ket |a〉 into ket |b〉, then its adjoint O† changes bra 〈a| into bra 〈b|, i.e.,

O|a〉 = |b〉, 〈a|O† = 〈b|. (13)

Multiplying the first equation from the left with the bra 〈c|, the second from the right with the ket |c〉 yields

〈c|O|a〉 = 〈c|b〉, and 〈a|O†|c〉 = 〈b|c〉. (14)

Using 〈c|b〉 = 〈b|c〉∗ one finally obtains the relation

〈a|O†|c〉 = 〈c|O|a〉∗. (15)

Substituting the general bra and ket vectors by basis vectors finally yields the matrix representations of O and its adjoint O†, for which we have

〈j|O†|i〉 = (O†)ji = 〈i|O|j〉∗ = (O)∗ij. (16)

So the matrix representation of the adjoint operator is obtained by transposition and complex conjugation of the matrix representation of the original operator.

We conclude this section by specifying certain definitions and properties of square matrices:

• A matrix A is diagonal if all its off-diagonal matrix elements are zero, Aij = Aii δij.

• The trace of the matrix A is the sum of its diagonal elements, tr A = ∑_i Aii.

• The unit matrix 1 is defined by 1A = A1 = A, ∀A. It is a diagonal matrix of the form (1)ij = δij.

• The inverse of a matrix A, denoted by A−1, is a matrix which fulfills A−1A = AA−1 = 1.

• A unitary matrix U is defined as a matrix whose adjoint (transposition plus complex conjugation) is identical to its inverse, U† = U−1. A real unitary matrix is denoted as an orthogonal matrix.

• A Hermitian matrix is self-adjoint, i.e., A† = A, or A∗ji = Aij. Consequently, a Hermitian operator fulfills the equation O† = O. A real Hermitian matrix is denoted as a symmetric matrix.
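The definitions above can be verified directly on concrete matrices (a NumPy sketch, not part of the notes; the Hermitian and unitary matrices are built by construction from a random complex matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))

A = M + M.conj().T          # Hermitian by construction: A† = A
U = np.linalg.qr(M)[0]      # unitary by construction: U† = U^-1

assert np.allclose(A.conj().T, A)                 # self-adjoint
assert np.allclose(U.conj().T @ U, np.eye(4))     # U†U = 1
assert np.allclose(U.conj().T, np.linalg.inv(U))  # adjoint equals inverse
assert np.isclose(np.trace(A), A.diagonal().sum())  # trace = sum of diagonal
```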

B. Determinants

The determinant of an N × N square matrix A is a number (a scalar) defined as

det(A) = |A| = |A11 ⋯ A1N; ⋮ ⋱ ⋮; AN1 ⋯ ANN| = ∑_{i=1}^{N!} (−1)^{pi} Pi A11 A22 ⋯ ANN, (17)

where Pi is a permutation operator that permutes the column indices 1, 2, …, N (there are N! such permutations), and pi is the number of transpositions required to restore a given permutation i1, i2, …, iN to natural order 1, 2, …, N. Note that pi determines only the sign of the contribution to the sum due to the permutation Pi A11 A22 ⋯ ANN, and therefore only whether pi is odd or even is relevant. As an example, let's have a look at the determinant of a 2 × 2 matrix. There are only two possible permutations of the column indices 1 and 2, namely the natural order 1, 2 with p1 = 0, and the permutation 2, 1 with p2 = 1. Hence we have

|A11 A12; A21 A22| = (−1)^0 A11 A22 + (−1)^1 A12 A21 = A11 A22 − A12 A21. (18)
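The permutation sum of eq. (17) can be implemented literally for small N and compared against a library determinant (a NumPy sketch, not part of the notes; the parity is obtained by counting inversions, which equals (−1) to the number of transpositions):

```python
import itertools
import numpy as np

def det_by_permutations(A):
    """Determinant via eq. (17): sum over all N! column permutations."""
    N = A.shape[0]
    total = 0.0
    for perm in itertools.permutations(range(N)):
        # parity (-1)^p: p = number of inversions = transpositions needed
        # to restore the permutation to natural order
        sign = (-1) ** sum(1 for i in range(N) for j in range(i + 1, N)
                           if perm[i] > perm[j])
        total += sign * np.prod([A[i, perm[i]] for i in range(N)])
    return total

A = np.array([[1.0, 2.0], [3.0, 4.0]])
assert np.isclose(det_by_permutations(A), 1 * 4 - 2 * 3)   # eq. (18)

B = np.random.default_rng(2).normal(size=(4, 4))
assert np.isclose(det_by_permutations(B), np.linalg.det(B))
```

Note the N! cost: this is only viable for tiny matrices; practical codes use LU decomposition instead.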


Some interesting properties of determinants:

• if each element in a row or a column of matrix A is zero, then det(A) = 0.

• if A is diagonal, then det(A) = ∏_i Aii = A11 A22 ⋯ ANN.

• a single interchange of any two columns or rows in the matrix A changes the sign of det(A).

• det(A) = det(A†)∗

• det(AB) = det(A) det(B)

• if there are any linear dependencies in the rows or columns of A, then det(A) = 0.

• det(A−1) = det(A)−1

• if AA† = 1 then det(A) det(A)∗ = 1.

• if U†AU = B with U†U = UU† = 1, then det(A) = det(B).

It is left to the reader to prove, by using these properties of determinants, the following statement: the equation Ac = 0 has a non-trivial solution c ≠ 0 only when det(A) = 0.
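The statement can at least be illustrated numerically (a NumPy sketch, not a proof, and not part of the notes): a matrix with linearly dependent rows has a vanishing determinant, and a non-trivial null vector can be extracted, e.g., from the singular value decomposition.

```python
import numpy as np

# Linearly dependent rows: row3 = row1 + row2, hence det(A) = 0.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [5.0, 7.0, 9.0]])
assert np.isclose(np.linalg.det(A), 0.0)

# A non-trivial c with Ac = 0: the right-singular vector belonging
# to the (numerically) zero singular value.
_, s, Vh = np.linalg.svd(A)
c = Vh[-1]
assert np.isclose(s[-1], 0.0)
assert np.allclose(A @ c, np.zeros(3), atol=1e-10)
assert np.isclose(np.linalg.norm(c), 1.0)   # non-trivial (unit norm)
```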

C. Change of basis

We have already stated above that the choice of an orthonormal basis is not unique. Given two orthonormal bases |i〉 and |i′〉 with

〈i|j〉 = δij, ∑_i |i〉〈i| = 1,
〈i′|j′〉 = δi′j′, ∑_i′ |i′〉〈i′| = 1, (19)

what is the relationship between them? Since both bases live in the same vector space we can express any basis vector |i′〉 in the basis |i〉,

|i′〉 = 1|i′〉 = ∑_i |i〉〈i|i′〉 = ∑_i |i〉 Uii′ = ∑_i |i〉 (U)ii′, (20)

where we have implicitly defined the transformation matrix U. Vice versa, we can express the basis vectors |i〉 in the basis |i′〉,

|i〉 = 1|i〉 = ∑_i′ |i′〉〈i′|i〉 = ∑_i′ |i′〉 U∗ii′ = ∑_i′ |i′〉 (U†)i′i, (21)

where we have used 〈i′|i〉 = 〈i|i′〉∗ in the third equality (note that the matrix U was implicitly defined in eq. (20), not in eq. (21)). The transformation matrix U is unitary, which is a consequence of the orthonormality of the two bases it connects. We have

δij = 〈i|j〉 = 〈i|1|j〉 = ∑_k′ 〈i|k′〉〈k′|j〉 = ∑_k′ (U)ik′ (U†)k′j = (UU†)ij, (22)

thus UU† = 1. U†U = 1 can be proven analogously.
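Eqs. (20) and (22) can be checked concretely: build two orthonormal bases, assemble U from the overlaps 〈i|i′〉, and confirm its unitarity (a NumPy sketch, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(3)
# Two different orthonormal bases of C^3: columns of two unitary matrices.
B1 = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))[0]
B2 = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))[0]

# Transformation matrix of eq. (20): U_{ii'} = <i|i'>.
U = B1.conj().T @ B2

# Unitarity, eq. (22): U U† = U† U = 1.
assert np.allclose(U @ U.conj().T, np.eye(3))
assert np.allclose(U.conj().T @ U, np.eye(3))
```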

Let us now look at how the matrix representation of an operator changes on a change of the basis. We have

O|i〉 = ∑_k |k〉〈k|O|i〉 = ∑_k |k〉 Oki, and
O|i′〉 = ∑_k′ |k′〉〈k′|O|i′〉 = ∑_k′ |k′〉 O′k′i′. (23)


The relation between the two matrix representations O and O′ of O in the two bases can be derived by insertion of the unity as follows:

(O′)i′j′ = 〈i′|O|j′〉 = 〈i′|1O1|j′〉 = ∑_k ∑_l 〈i′|k〉〈k|O|l〉〈l|j′〉 = ∑_k ∑_l (U†)i′k (O)kl (U)lj′, (24)

or in matrix form

O′ = U†OU. (25)

Vice versa, from eq. (25) we obtain the reverse relation

O = UO′U† (26)

by exploiting the fact that U is unitary, i.e., UU† = 1 and U†U = 1.

Now that we have seen how the matrix representation of an operator changes when changing the basis, one may raise the question whether there is a particular choice of basis in which the matrix representation takes a particularly simple, namely diagonal, form. This brings us to the next section discussing the eigenvalue problem.

D. The eigenvalue problem

As already stated above, operators acting on a vector convert that vector into another vector, i.e., O|a〉 = |b〉. If |b〉 is identical to |a〉 times a scalar scaling factor ωa, then |a〉 is designated as an eigenvector of O. The eigenvalue equation

O|a〉 = ωa|a〉 (27)

determines the possible eigenvectors |a〉 and the related eigenvalues ωa of operator O. The matrix representation of eq. (27) is immediately obtained by insertion of the unit operator, i.e.,

〈i|O1|a〉 = ωa〈i|a〉 ⇒ ∑_k 〈i|O|k〉〈k|a〉 = ωa〈i|a〉 ⇒ ∑_k Oik ak = ωa ai ⇒ Oa = ωa a. (28)

Without loss of generality, one can require the eigenvectors to be normalized, i.e., 〈a|a〉 = 1. This freedom is immediately clear from eq. (27), where |a〉 appears on both sides of the equation. Moreover, each subset of degenerate eigenvectors (eigenvectors related to the same eigenvalue) can be orthonormalized separately.

In quantum mechanics most operators (actually all operators related to an observable) are Hermitian, i.e., O† = O (vide supra). There are two important properties of Hermitian operators:

• The eigenvalues of a Hermitian operator are real (this is actually the underlying reason why all QM operators related to observables are Hermitian). The proof of this statement is straightforward; we have

ωa = ωa〈a|a〉 = 〈a|O|a〉 = 〈a|O†|a〉∗ = 〈a|O|a〉∗ = ω∗a, (29)

where we have used eq. (27) in the second and fifth, and Hermiticity of O in the fourth equality.

• The eigenvectors of a Hermitian operator are orthogonal. To see this consider the two eigenvalue equations O|a〉 = ωa|a〉, O|b〉 = ωb|b〉. Using these we obtain

〈b|O|a〉 = ωa〈b|a〉, 〈a|O|b〉 = ωb〈a|b〉,
0 = 〈a|O|b〉 − 〈b|O†|a〉∗ = 〈a|O|b〉 − 〈b|O|a〉∗ = ωb〈a|b〉 − ω∗a〈b|a〉∗ = (ωb − ωa)〈a|b〉 ⇒ 〈a|b〉 = 0, ∀ ωa ≠ ωb. (30)

As already stated above, each subset of degenerate eigenvectors (eigenvectors related to the same eigenvalue) can be orthonormalized separately (without proof).


We can conclude that the eigenvectors of a Hermitian operator form an orthonormalized basis. The matrix representation of a Hermitian operator is generally not diagonal, yet in the basis of its eigenvectors it trivially is. Hence the eigenvalue problem can also be recast in the following form: given a non-diagonal matrix representation O of operator O in a certain basis, we wish to change the basis such that the new matrix representation O′ in the new basis is diagonal. Utilizing what we have learned in the previous section we can write

O′ = U†OU = ω = diag(ω1, ω2, …, ωN). (31)

Obviously, an N × N matrix has N eigenvalues.

An alternative way to reformulate the eigenvalue problem proceeds as follows: given a non-diagonal matrix representation O of operator O in a certain basis, we wish to find all distinct column vectors a and related numbers ωa such that

Oa = ωa a (32)

(this is the matrix representation of eq. (27)). The equivalent equation Oa − ωa a = 0 has a nontrivial solution a ≠ 0 only when

det(O − ωa 1) = 0. (33)

The left hand side (lhs) of eq. (33) is denoted as the secular determinant. It is a polynomial of degree N in the unknown ωa. The eigenvalues thus correspond to the N roots of the secular determinant. For each root ωa the related eigenvector is found by substituting ωa back into eq. (32) and solving the resulting linear equation system for a. So there are in total N eigenvectors ai, i = 1 … N, each related to an eigenvalue ωai. Arranging the column vectors ai in a matrix U, i.e.,

U = (a1, a2, …, aN) = [a11 a12 ⋯ a1N; a21 a22 ⋯ a2N; ⋮ ⋮ ⋱ ⋮; aN1 aN2 ⋯ aNN] (34)

and employing eq. (32) yields

OU = U diag(ω1, ω2, …, ωN) = Uω. (35)

Note that the eigenvectors ai are orthonormalized, i.e.,

a†i aj = (a∗1i a∗2i … a∗Ni)(a1j, a2j, …, aNj)^T = δij, (36)

hence U†U = UU† = 1. Using this and multiplying eq. (35) with U† from the left we finally retrieve eq. (31). We conclude that the ith column vector of the unitary transformation matrix U in eq. (31) is the eigenvector related to eigenvalue ωi.
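The whole eigenvalue machinery of eqs. (27)–(36) can be exercised in a few lines (a NumPy sketch, not part of the notes; `numpy.linalg.eigh` returns the eigenvalues and the eigenvector columns of a Hermitian matrix):

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
O = M + M.conj().T                      # Hermitian matrix representation

# Eigenvalues omega and eigenvector columns U of the Hermitian matrix O.
omega, U = np.linalg.eigh(O)

# Eigenvalues of a Hermitian matrix are real, cf. eq. (29).
assert np.all(np.isreal(omega))
# Eigenvectors are orthonormal, cf. eq. (36): U†U = 1.
assert np.allclose(U.conj().T @ U, np.eye(4))
# OU = U omega, cf. eq. (35), and U†OU is diagonal, cf. eq. (31).
assert np.allclose(O @ U, U @ np.diag(omega))
assert np.allclose(U.conj().T @ O @ U, np.diag(omega))
```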

E. Evaluating functions of matrices and operators

For operators and matrices we know the operations add, subtract (both trivial), and multiply, as well as multiplication by a scalar. This is sufficient to evaluate functions of operators or matrices by utilizing an appropriate power expansion of the corresponding function. For example, in order to evaluate the exponential of the matrix O we can use the Taylor expansion of an exponential, yielding

exp(O) = 1 + (1/1!) O + (1/2!) OO + (1/3!) OOO + ⋯ = 1 + (1/1!) O + (1/2!) O² + (1/3!) O³ + ⋯ (37)

These expansions define the corresponding functions; when using them directly, fast convergence is essential, since one has to truncate them at some point. An alternative way is to diagonalize the matrix O, evaluate the function for the eigenvalues, and transform back, i.e.,

U†OU = ω,
f(ω) = diag(f(ω1), f(ω2), …, f(ωN)),
f(O) = U f(ω) U†. (38)

To see this let's consider an arbitrary term of the power series of the exponential. We have

O^n = (UωU†)^n = Uω (U†U) ω (U†U) ω U† ⋯ = U ω^n U†. (39)

From eq. (38) it also immediately follows that a matrix having at least one zero eigenvalue does not possess an inverse (1/ωi is undefined for ωi = 0).
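Both routes to a matrix function can be compared numerically for the exponential (a NumPy sketch, not part of the notes; the truncated Taylor series of eq. (37) versus the diagonalize-and-transform-back recipe of eq. (38)):

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.normal(size=(3, 3))
O = (M + M.T) / 4                        # real symmetric, modest norm

# Route 1: truncated Taylor series, eq. (37).
expO_taylor = np.eye(3)
term = np.eye(3)
for n in range(1, 30):
    term = term @ O / n                  # accumulates O^n / n!
    expO_taylor = expO_taylor + term

# Route 2: diagonalize, apply exp to the eigenvalues, back-transform, eq. (38).
omega, U = np.linalg.eigh(O)
expO_diag = U @ np.diag(np.exp(omega)) @ U.T

assert np.allclose(expO_taylor, expO_diag)
```

For a matrix of small norm the series converges rapidly; for large norms or functions with limited radius of convergence the diagonalization route of eq. (38) is the robust choice.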

F. Generalization to functions, operators acting on functions, and eigenfunctions of operators

One can show that a sufficiently well behaved function a(x) can be expanded in a series, e.g., as a Taylor expansion, or a Fourier series. Hence, as a vector can be represented by a set of coefficients w.r. to a certain basis, also a function a(x) can be represented by a set of coefficients w.r. to the basis functions of the series, e.g., the sines and cosines in a Fourier series. In order to adopt the formalism we have developed so far for vectors and operators acting on vectors, we need an orthonormal set of basis functions. Furthermore, in order to specify such a basis we first have to redefine the scalar product in the present context (cf. eq. (2) for vectors). The scalar product of two functions a(x) and b(x) is defined as

〈a|b〉 = ∫_{x1}^{x2} a∗(x) b(x) dx, (40)

where the asterisk again implies complex conjugation of the function. In the following we will drop the integration interval [x1, x2] for brevity. As the reader may have noticed, I again used Dirac's 〈bra||ket〉 notation; in the present context the ket |a〉 ≡ a(x) corresponds to the function a(x), the corresponding bra 〈a| ≡ a∗(x) to its complex conjugate.

Let's now introduce an infinite set of orthonormal basis functions φi(x) = φ1(x), φ2(x), … with

〈i|j〉 = ∫ φ∗i(x) φj(x) dx = δij, ∀ i, j. (41)

We consider the basis as complete, meaning that any (sufficiently well behaved) function a(x) can be expressed as a linear combination of these basis functions,

a(x) = ∑_i φi(x) ai. (42)

As in the context of vectors, the coefficient ai is obtained by projecting a(x) onto the basis function φi(x), i.e.,

〈i|a〉 = ∫ φ∗i(x) a(x) dx = ∑_k ∫ φ∗i(x) φk(x) dx ak = ∑_k 〈i|k〉 ak = ∑_k δik ak = ai. (43)


Substituting this back into the original expansion (42) yields

a(x) = ∑_i φi(x) ai = ∑_i φi(x) ∫ φ∗i(x′) a(x′) dx′ = ∑_i |φi〉〈φi|a〉 = ∫ (∑_i φi(x) φ∗i(x′)) a(x′) dx′, (44)

where in the last equality we have interchanged summation and integration, which is permissible since φi(x) does not depend on the integration variable x′. We identify the expression in big parentheses as the Dirac delta function, i.e.,

∑_i φi(x) φ∗i(x′) = ∑_i |φi〉〈φi| = δ(x − x′). (45)

The Dirac delta function is the continuous analogue of Kronecker's delta, which was introduced at the very beginning of our short review of the mathematical framework required for TC I. In an integral over a function a(x) times the Dirac delta function, the latter plucks out the value of a(x) where the argument of the delta function vanishes, i.e.,

∫ a(x′) δ(x − x′) dx′ = a(x), or specifically ∫ a(x′) δ(x′) dx′ = a(0). (46)

Setting a(x) ≡ 1 we immediately obtain from the last equation

∫ δ(x′) dx′ = 1. (47)

The Dirac delta function thus is an infinitely high, infinitely narrow function, but still with the finite value 1 for the area under its curve. Regard it as a Gaussian made narrower and narrower while maintaining the area under its curve.
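Eqs. (41)–(43) can be made concrete with a specific orthonormal basis (a NumPy sketch, not part of the notes; the choice φn(x) = √2 sin(nπx) on [0, 1] and the test function a(x) = x(1 − x) are illustrative assumptions): the projection integrals give the expansion coefficients, and the truncated expansion (42) reproduces a(x).

```python
import numpy as np

# Orthonormal basis on [0, 1]: phi_n(x) = sqrt(2) sin(n pi x), n = 1, 2, ...
x = np.linspace(0.0, 1.0, 2001)
dx = x[1] - x[0]

def phi(n):
    return np.sqrt(2.0) * np.sin(n * np.pi * x)

a = x * (1.0 - x)          # smooth test function vanishing at the interval ends

# Coefficients a_n = <phi_n|a>, eq. (43); the integrand vanishes at the
# endpoints, so a plain Riemann sum equals the trapezoidal rule here.
nmax = 50
coeffs = [np.sum(phi(n) * a) * dx for n in range(1, nmax + 1)]

# Partial sum of the expansion (42) recovers a(x) to good accuracy.
a_rebuilt = sum(c * phi(n) for n, c in zip(range(1, nmax + 1), coeffs))
assert np.max(np.abs(a_rebuilt - a)) < 1e-4
```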

We now turn our attention to operators acting on functions. In analogy to operators acting on vectors as discussed above, an operator O acting on function a(x) maps it into another function b(x), i.e.,

Oa(x) = b(x) = ∫ O(x, x′) a(x′) dx′. (48)

The last equality is the continuous generalization of eq. (8), and O(x, x′) can be regarded as a continuous matrix. The Dirac delta function δ(x − x′) just introduced hence can be interpreted as the identity operator in the context of functions, since it maps a(x) onto itself. The matrix representation thereof in the discrete basis φi(x) takes the form

〈φi|δ|φj〉 = ∫ φ∗i(x) ∫ δ(x − x′) φj(x′) dx′ dx = ∫ φ∗i(x) φj(x) dx = δij, or
= ∫ φ∗i(x) ∫ ∑_k φk(x) φ∗k(x′) φj(x′) dx′ dx = ∑_k ∫ φ∗i(x) φk(x) dx ∫ φ∗k(x′) φj(x′) dx′ = ∑_k δik δkj = δij, or
= 〈φi| ∑_k |φk〉〈φk|φj〉 = ∑_k 〈φi|φk〉〈φk|φj〉 = ∑_k δik δkj = δij. (49)

Obviously, the matrix representation of the Dirac delta function corresponds to an ∞ × ∞ unity matrix 1 (recall that we introduced an infinite basis set φi(x)).

The operator O is denoted as local if, in order to evaluate b(x0) according to eq. (48), we have to know a(x) only in an infinitesimal environment around x0. Obviously, δ is a local operator. Another example is the derivative operator

Da(x) = (d/dx) a(x) = ∫ δ(x − x′) (d/dx′) a(x′) dx′. (50)

In analogy to vectors we can specify an eigenvalue equation for functions, i.e.,

Oa(x) = ωa a(x), or shorter, O|a〉 = ωa|a〉. (51)

If a(x) fulfills this equation, it is denoted as an eigenfunction of operator O, corresponding to eigenvalue ωa. It is left to the reader as an exercise to show that also in the present context the eigenvalue problem can be written in matrix representation as Oa = ωa a with

(O)ij = 〈φi|O|φj〉 = ∫ φ∗i(x) O φj(x) dx, and (a)i = 〈i|a〉 = ∫ φ∗i(x) a(x) dx. (52)


For Hermitian operators we have

(O)ji = 〈φj|O|φi〉 = ∫ φ∗j(x) O φi(x) dx = ∫ φi(x) (O φj(x))∗ dx = (∫ φ∗i(x) O φj(x) dx)∗ = 〈φi|O|φj〉∗ = (O)∗ij. (53)

Eigenfunctions of Hermitian operators form an orthonormal basis, just as eigenvectors of Hermitian operators do (vide supra).

As became evident so far, the whole machinery we have developed for vectors can also be adopted for functions. The generalization to functions of many variables is straightforward. Essentially, we have to redefine the scalar product eq. (40) as

〈a|b〉 = ∫∫…∫ a∗(x1, x2, …, xn) b(x1, x2, …, xn) dx1 dx2 … dxn = ∫ a∗(x1, x2, …, xn) b(x1, x2, …, xn) dx1 dx2 … dxn (54)

(for convenience we usually just write a single rather than multiple integration signs; the number of integrations is just reflected in the number of dxi). Analogously, for a matrix element of an operator we have

〈a|O|b〉 = ∫ a∗(x1, x2, …, xn) O b(x1, x2, …, xn) dx1 dx2 … dxn. (55)

Observe that 〈a|b〉 or 〈a|O|b〉 are always numbers, and not functions, since we integrate over all variables of the functions!

G. Optimization with constraints, the method of Lagrange

In the following, in particular in the derivation of the Hartree-Fock equations in section VII, we will have to optimize the expectation energy while simultaneously insisting on certain constraints. This can be done by applying the method of Lagrange. The trick is not to require stationarity of the target function (or functional) itself, but of the so-called Lagrangian. The latter is obtained by adding to the original function all the conditions Ωi (written in a form such that they are equal to zero, so that nothing is added to the original function if the constraints are fulfilled), each multiplied by a so-called Lagrange multiplier, e.g.,

L(x, y, …, λ1, λ2, …, λn) = f(x, y, …) + ∑_{i}^{n} λi Ωi(x, y, …). (56)

The Lagrangian is required to be stationary w.r. to all variables x, y, …, λ1, λ2, …, λn (note that we now have additional variables besides x, y, …, namely the Lagrange multipliers λi). Stationarity w.r. to the Lagrange multipliers just yields the respective constraint, e.g.,

0 = ∂L(x, y, …, λ1, λ2, …, λn)/∂λk = Ωk(x, y, …), (57)

since by construction the Lagrangian depends linearly on the multipliers. Stationarity w.r. to the actual variables yields equations for the multipliers.

In order to make all this more clear, we present here a small illustrative example. Consider the function

f(x, y) = cos(x) sin(y), (58)

which is displayed in the form of a contour plot in Fig. 1. We would like to optimize this function with the constraint that x = y. So here we have just a single constraint. The Lagrangian, according to the recipe just described above, is

L(x, y, λ) = f(x, y) + λ(x − y) = cos(x) sin(y) + λ(x − y). (59)


FIG. 1: Contour plot of f(x, y) = cos(x) sin(y). The line representing x = y is also given.

Stationarity w.r. to λ yields the constraint x − y = 0, i.e., x = y. Furthermore, we require stationarity w.r. to x and y,

0 = ∂L(x, y, λ)/∂x = −sin(x) sin(y) + λ,
0 = ∂L(x, y, λ)/∂y = cos(x) cos(y) − λ. (60)

Subtracting the first from the second equation above yields

0 = cos(x) cos(y) + sin(x) sin(y) − 2λ = cos(x − y) − 2λ = cos(0) − 2λ ⇒ λ = 1/2, (61)

where we have used one of the usual trigonometric formulae in the second equality, and the constraint in the third. Knowing now λ we can get x = y from either one of the two stationarity conditions above, e.g.,

0 = cos²(x) − 1/2 ⇒ cos(x) = ±1/√2 ⇒ x = y = (2n + 1)π/4, ∀ n ∈ ℤ. (62)

Obviously, when comparing this result with Fig. 1 it appears that we have indeed maxima f(x, x) = 0.5 along the line x = y at x = y = (2n + 1)π/4 for even n, and minima f(x, x) = −0.5 for odd n.
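The result can be cross-checked numerically without any calculus (a plain-Python sketch, not part of the notes): along the constraint line x = y the target reduces to g(t) = cos(t) sin(t) = sin(2t)/2, and a brute-force scan over one period recovers the stationary points of eq. (62) and the multiplier λ = 1/2 from eq. (60).

```python
import math

# On the constraint x = y, f(x, y) = cos(x) sin(y) reduces to
# g(t) = cos(t) sin(t) = sin(2t)/2.
def g(t):
    return math.cos(t) * math.sin(t)

# Brute-force scan of t in [0, pi], which contains one maximum and one minimum.
ts = [i * math.pi / 100000 for i in range(100001)]
t_max = max(ts, key=g)
t_min = min(ts, key=g)

assert abs(t_max - math.pi / 4) < 1e-4        # maximum at pi/4, f = 0.5
assert abs(g(t_max) - 0.5) < 1e-8
assert abs(t_min - 3 * math.pi / 4) < 1e-4    # minimum at 3 pi/4, f = -0.5
# Multiplier from the first condition in eq. (60): lambda = sin(x) sin(y).
assert abs(math.sin(t_max) ** 2 - 0.5) < 1e-4
```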

This concludes our short mathematical overview, and we now turn to the real thing, namely theoretical chemistry.


I. THE ELECTRONIC SCHRÖDINGER EQUATION FOR AN N-PARTICLE SYSTEM

The time-independent Schrödinger equation of a system consisting of N electrons, specified by the set of combined space/spin coordinates x1, …, xN with x1 = (x1, y1, z1, ω1), and M nuclei, specified by the set of space coordinates R1, …, RM with R1 = (X1, Y1, Z1), takes the form

HΨ(x1, …, xN, R1, …, RM) = EΨ(x1, …, xN, R1, …, RM), (I.1)

with

H = −∑_{i=1}^{N} (ħ²/2me) ∇²i − ∑_{K=1}^{M} (ħ²/2MK) ∇²K − ∑_{i=1}^{N} ∑_{K=1}^{M} ZK e²/(4πε0 riK) + ∑_{i=1}^{N} ∑_{j>i}^{N} e²/(4πε0 rij) + ∑_{K=1}^{M} ∑_{L>K}^{M} ZK ZL e²/(4πε0 rKL). (I.2)

Obviously, when recalling our mathematical review, eq. (I.1) has the form of an eigenvalue equation. The individual terms in eq. (I.2) describe the kinetic energy of electrons and nuclei, the electron-nuclear Coulomb attraction, the electron-electron Coulomb repulsion, and the nuclear-nuclear repulsion, respectively. e and me denote charge and mass of an electron, ZK and MK charge and mass of nucleus K. riK = ((xi − XK)² + (yi − YK)² + (zi − ZK)²)^{1/2} is the distance between electron i and nucleus K; rij as the distance between two electrons, and rKL as the distance between two nuclei, are analogously defined. Note that in the electron-electron and nuclear-nuclear repulsion terms the second summation is restricted to j > i and L > K in order to avoid double counting of the pairwise interactions. Furthermore, the differential operators ∇²i and ∇²K act only on the space coordinates of electron i and nucleus K, respectively.

Since the mass of the electrons is much smaller than the mass of the nuclei (a proton is 1836 times heavier than an electron), the Schrödinger equation (I.1) can be decoupled. This is known as the Born-Oppenheimer approximation (think of a hike in northern Sweden: the cloud of mosquitoes flying around your head does not care too much whether you run fast or slowly). The Born-Oppenheimer approximation leads to a separation of the electronic and the nuclear problem. The electronic problem can be solved for a fixed arrangement of the nuclei,

HelΨ(x1, . . . ,xN ) = EΨ(x1, . . . ,xN ), (I.3)

with

H_el = ∑_{i=1}^N ( −(ℏ²/2mₑ) ∇i² − ∑_{K=1}^M Z_K e²/(4πε0 r_iK) ) + (1/2) ∑′_{ij} e²/(4πε0 r_ij) + (1/2) ∑′_{KL} Z_K Z_L e²/(4πε0 r_KL) = ∑_{i=1}^N h(i) + (1/2) ∑′_{ij} g(i, j) + E_nuc. (I.4)

Note the primes in the summations of eq. (I.4), which exclude the diagonal terms. Using these together with the factors of one half again avoids the above-mentioned double counting, in a somewhat more elegant and compact notation.

The Schrödinger equation (I.3) now depends only parametrically on the coordinates of the nuclei. This implies that the clamped-nuclei Schrödinger equation (I.3) can be solved for many different arrangements of the nuclei, e.g., different angles and O–H distances in the water molecule. Furthermore, since eq. (I.3) is an eigenvalue problem, there is, for a given nuclear geometry, not just one but many eigenvalues E_k, actually a discrete spectrum of such solutions. The lowest E_k corresponds to the electronic ground state, higher ones to electronically excited states, which for example can be populated by absorption of a photon. The resulting high-dimensional potential energy surfaces E_k(R1, . . . , RM) comprise the relevant chemical information of the system. For example, the minima on the ground state surface E0(R1, . . . , RM) correspond to energetically stable nuclear geometries, and the first-order saddle points in between correspond to transition structures determining reaction rates. Intersections between two surfaces (so-called conical intersections) are very important in photochemistry; their position relative to the Franck-Condon point determines the photostability of molecules. Having calculated the relevant surfaces E_k(R1, . . . , RM), it is then also possible to solve the nuclear Schrödinger equation separated off by the Born-Oppenheimer approximation, i.e.,

( −∑_{K=1}^M (ℏ²/2M_K) ∇K² + E_k(R1, . . . , RM) ) Θ_{k,l}(R1, . . . , RM) = E_{k,l} Θ_{k,l}(R1, . . . , RM). (I.5)

In this lecture we will concentrate on the electronic (clamped-nuclei) Schrödinger equation (I.3). For the hydrogen atom it is possible to solve it analytically; for many-electron systems, however, it is necessary to introduce approximations. To make our life (and writing) easier we want to get rid of all the constants in H_el on the lhs of Eq. (I.3), i.e., we

want to use the so-called atomic units bohr and Hartree. To this end we make the substitution x, y, z −→ λx′, λy′, λz′, yielding for Eq. (I.3)

[ ∑_{i=1}^N ( −(ℏ²/2mₑλ²) ∇′i² − ∑_{K=1}^M Z_K e²/(4πε0 λ r′_iK) ) + (1/2) ∑′_{ij} e²/(4πε0 λ r′_ij) + (1/2) ∑′_{KL} Z_K Z_L e²/(4πε0 λ r′_KL) ] Ψ(x1, . . . , xN) = E Ψ(x1, . . . , xN). (I.6)

It is now possible to factor out the constants in front of the kinetic and Coulomb operators, provided that

ℏ²/(mₑλ²) = e²/(4πε0 λ) = E_H, (I.7)

yielding

E_H [ ∑_{i=1}^N ( −(1/2) ∇′i² − ∑_{K=1}^M Z_K/r′_iK ) + (1/2) ∑′_{ij} 1/r′_ij + (1/2) ∑′_{KL} Z_K Z_L/r′_KL ] Ψ(x1, . . . , xN) = E Ψ(x1, . . . , xN). (I.8)

Eq. (I.7) is obviously fulfilled for

λ = 4πε0 ℏ²/(e² mₑ) = a0, (I.9)

hence λ is equal to the Bohr radius a0 = 1 bohr = 5.292 · 10⁻¹¹ m. Setting λ = a0 in Eq. (I.7) yields E_H = 27.21 eV = 1 Hartree.

Finally, Eq. (I.6) can be scaled by 1/E_H, yielding

[ ∑_{i=1}^N ( −(1/2) ∇′i² − ∑_{K=1}^M Z_K/r′_iK ) + (1/2) ∑′_{ij} 1/r′_ij + (1/2) ∑′_{KL} Z_K Z_L/r′_KL ] Ψ(x1, . . . , xN) = (E/E_H) Ψ(x1, . . . , xN) = E′ Ψ(x1, . . . , xN), (I.10)

with energy E′ in units of Hartree and distances in units of bohr. Consequently, we can write the one- and two-electron operators of H_el in Eq. (I.4) as

h(i) = −(1/2) ∇i² − ∑_K Z_K/r_iK, (I.11)

g(i, j) = 1/r_ij, (I.12)

provided that we work in atomic units, i.e., that we measure distances in bohr and energies in Hartree.
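As a numerical sanity check of eqs. (I.7) and (I.9), the following sketch recomputes a0 and E_H from the SI constants (CODATA-style values hard-coded here purely for illustration):

```python
import math

# SI constants (CODATA-style values, assumed here for illustration)
hbar = 1.054571817e-34   # reduced Planck constant, J s
m_e  = 9.1093837015e-31  # electron mass, kg
e    = 1.602176634e-19   # elementary charge, C
eps0 = 8.8541878128e-12  # vacuum permittivity, F/m

# Bohr radius from eq. (I.9): a0 = 4*pi*eps0*hbar^2 / (e^2 * m_e)
a0 = 4 * math.pi * eps0 * hbar**2 / (e**2 * m_e)

# Hartree energy from eq. (I.7): E_H = e^2 / (4*pi*eps0*a0)
E_H = e**2 / (4 * math.pi * eps0 * a0)

print(a0)        # ≈ 5.292e-11 m, i.e. 1 bohr
print(E_H / e)   # ≈ 27.21 eV, i.e. 1 Hartree
```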

II. PROPERTIES AND SYMMETRIES OF THE WAVEFUNCTION

Before we can explore further the different electronic structure methods which provide good approximations for energy and wavefunction of eq. (I.3), we first have to discuss certain properties and symmetry conditions a proper wavefunction has to fulfill.

A. Quadratic integrability

An important property already discussed in basic quantum mechanics is the quadratic integrability of the wavefunction (mathematicians would say that the wavefunction has to belong to the L2 class of functions). This excludes


singularities like poles. To require such a property for the wavefunction is sensible, since the square of the wavefunction, Ψ*(x1, . . . , xN)Ψ(x1, . . . , xN) dx1 . . . dxN, corresponds to the probability of finding electron 1 in dx1 and simultaneously electron 2 in dx2, etc.; thus

∫_{−∞}^{∞} Ψ*(x1, . . . , xN)Ψ(x1, . . . , xN) dx1 . . . dxN = 1. (II.1)

In other words, the wavefunction should be normalizable. This is fulfilled when the wavefunction solving eq. (I.3) belongs to L2. Assume that Ψ′(x1, . . . , xN) is a solution of eq. (I.3) which belongs to L2,

∫_{−∞}^{∞} Ψ′*(x1, . . . , xN)Ψ′(x1, . . . , xN) dx1 . . . dxN = K, (II.2)

with K denoting a finite number. We can then define a new, scaled wavefunction Ψ(x1, . . . , xN) = cΨ′(x1, . . . , xN), which trivially is also a solution of eq. (I.3), with

∫_{−∞}^{∞} Ψ*(x1, . . . , xN)Ψ(x1, . . . , xN) dx1 . . . dxN = c² ∫_{−∞}^{∞} Ψ′*(x1, . . . , xN)Ψ′(x1, . . . , xN) dx1 . . . dxN = c²K. (II.3)

Now choosing c such that c²K = 1 yields Ψ(x1, . . . , xN) as a normalized wavefunction. A consequence of the condition of quadratic integrability is that the spectrum of eigenfunctions of eq. (I.3) is not continuous but discrete (vide supra). Furthermore, since the Hamiltonian is a hermitian operator, its eigenfunctions belonging to different eigenvalues must be orthogonal. Thus for normalized wavefunctions (and, as we have seen, they can always be chosen normalized) we have

∫_{−∞}^{∞} Ψk*(x1, . . . , xN)Ψl(x1, . . . , xN) dx1 . . . dxN = δkl, (II.4)

where Ψk(x1, . . . , xN) and Ψl(x1, . . . , xN) belong to different energy eigenvalues Ek ≠ El. In the following we are primarily interested in the lowest eigenvalue E0 = E belonging to the electronic ground state of the system.
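The rescaling argument of eqs. (II.2) and (II.3) can be made concrete for a one-dimensional model function; the Gaussian trial function and the quadrature grid below are assumptions made purely for this sketch:

```python
import math

# Normalize a 1D trial function psi'(x) = exp(-x^2): compute K = ∫|psi'|^2 dx
# on a grid (trapezoidal rule) and rescale by c = 1/sqrt(K), as in eq. (II.3).
def trapz(ys, dx):
    return dx * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

n, L = 2001, 10.0
dx = 2 * L / (n - 1)
xs = [-L + i * dx for i in range(n)]
psi = [math.exp(-x * x) for x in xs]

K = trapz([p * p for p in psi], dx)   # finite, so psi' belongs to L2
c = 1.0 / math.sqrt(K)                # choose c such that c^2 K = 1
psi_normed = [c * p for p in psi]
norm = trapz([p * p for p in psi_normed], dx)
print(norm)   # ≈ 1.0
```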

B. The Hartree product

Let’s now have a look at the simplest N-electron system, the He atom with two electrons. Its Hamiltonian has the form

H(1, 2) = h(1) + h(2) + g(1, 2),   h(i) = −(1/2) ∇i² − Z/ri,   g(1, 2) = r12⁻¹. (II.5)

Note that in eq. (II.5) we use the short-hand notation 1, 2 instead of x1, x2, which makes the equations a bit less messy. In the following we will use both notations interchangeably, as convenient. H(1, 2) obviously is a two-particle operator (it acts on the coordinates of two electrons), while the h(i) are one-particle operators. The latter are identical to the Hamiltonian of the hydrogen atom problem, apart from the fact that Z is equal to two rather than one. Obviously, the two-particle operator g(1, 2) describing the repulsion of the two electrons makes life complicated. Let’s assume for the moment that its effect on the energy (in comparison to the h(i)) is small (which is actually true), such that we can neglect it for now and take care of it a posteriori, e.g. by perturbation theory. We then have the simplified zeroth-order Hamiltonian

H0(1, 2) = h(1) + h(2). (II.6)

It is an easy exercise to show that the Schrödinger equation

H0(1, 2)Ψ0(1, 2) = E0Ψ0(1, 2) (II.7)

is solved by the simple product ansatz Ψ0(1, 2) = χ1(1)χ2(2), where the χi(i) are eigenfunctions of h(i),

h(1)χ1(1) = ε1χ1(1),   h(2)χ2(2) = ε2χ2(2), (II.8)

(i.e., the χi(i) are solutions of the hydrogen atom problem). The energy eigenvalue belonging to the eigenfunction Ψ0(1, 2) = χ1(1)χ2(2) in eq. (II.7) then is E0 = ε1 + ε2. This is the independent particle model. We have now seen that the ansatz of a many-electron wavefunction as a product of one-electron functions (also called orbitals) is an eigenfunction of the operator H0(1, 2). The generalization of such a product ansatz to N electrons,

Ψ(x1, . . . ,xN ) = χ1(x1)χ2(x2) . . . χN (xN ) (II.9)

is known as the Hartree product.
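The independent-particle result E0 = ε1 + ε2 can be illustrated on a small matrix model: if the one-particle Hamiltonian h is represented by a hypothetical, arbitrarily chosen symmetric 2×2 matrix, then H0 = h(1) + h(2) acts on the product space as h⊗I + I⊗h, and the Kronecker product of two one-particle eigenvectors plays the role of the Hartree product:

```python
import math

def kron_vec(u, v):
    # product "wavefunction" chi1(1)chi2(2) in the 4-dimensional product space
    return [ui * vj for ui in u for vj in v]

def mv(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

# model one-particle Hamiltonian (hypothetical 2x2 matrix, illustration only)
h = [[2.0, 1.0], [1.0, 2.0]]
s = 1 / math.sqrt(2)
chi1, eps1 = [s, s], 3.0    # eigenpair of h
chi2, eps2 = [s, -s], 1.0   # second eigenpair of h

# H0 = h(1) + h(2) on the product space: h⊗I + I⊗h
H0 = [[0.0] * 4 for _ in range(4)]
for i in range(2):
    for j in range(2):
        for k in range(2):
            H0[2 * i + k][2 * j + k] += h[i][j]   # h acting on electron 1
            H0[2 * k + i][2 * k + j] += h[i][j]   # h acting on electron 2

psi = kron_vec(chi1, chi2)   # Hartree product
Hpsi = mv(H0, psi)
print(all(abs(a - (eps1 + eps2) * b) < 1e-12 for a, b in zip(Hpsi, psi)))  # True
```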


C. Symmetry or antisymmetry of the wavefunction, part 1

The Hartree product ansatz in eq. (II.9), however, misses an important property of electrons, namely that electrons are indistinguishable. For simplicity let’s return to the He atom problem comprising only two electrons. The two-particle density function describing the physical situation is given by the square of the wavefunction as

ρ2(x1, x2) = Ψ*(x1, x2)Ψ(x1, x2) = |Ψ(x1, x2)|². (II.10)

Now, due to the indistinguishability of the two electrons, ρ2(x1, x2) must be invariant with respect to permutation of the two electrons, i.e., ρ2(x1, x2) = ρ2(x2, x1). This implies that for real (non-complex) wavefunctions we have

|Ψ(x1, x2)|² = |Ψ(x2, x1)|²  →  Ψ(x1, x2) = ±Ψ(x2, x1). (II.11)

Therefore, as a consequence, a proper wavefunction for the He atom problem must be either symmetric (plus sign) or antisymmetric (minus sign) with respect to the permutation of all coordinates of electron 1 with electron 2. Presently we do not know whether the wavefunction is symmetric or antisymmetric; we will return to this point a bit later, after discussing spin symmetries of the wavefunction. However, the Hartree product ansatz for He,

Ψ(x1,x2) = χ1(x1)χ2(x2) (II.12)

in general is neither symmetric nor antisymmetric, and therefore does not respect the indistinguishability of the electrons. To see this more clearly, let’s consider a particular excited state with χ1(x) = 1s(x) and χ2(x) = 2s(x) (the 1s and 2s type wavefunctions of the hydrogen atom problem). We then have

Ψ(x1,x2) = 1s(x1)2s(x2), (II.13)

Ψ(x2,x1) = 1s(x2)2s(x1) 6= ±Ψ(x1,x2).

The Hartree product itself thus is not a proper ansatz for a wavefunction, since it lacks the important symmetry/antisymmetry property. However, a linear combination of two Hartree products does the job,

Ψ(x1,x2) = 1s(x1)2s(x2)± 1s(x2)2s(x1), (II.14)

Ψ(x2,x1) = 1s(x2)2s(x1)± 1s(x1)2s(x2) = ±Ψ(x1,x2).

Again, it is an easy exercise for the reader to verify that the linear combination of Hartree products defined in eq. (II.14) is an eigenfunction of eq. (II.7), provided that the corresponding simple Hartree products are eigenfunctions of (II.7).

To summarize, for the He atom we have, for the ground and the excited state with electron configurations 1s² and 1s2s, respectively, three possible proper wavefunctions,

Ψ0(x1, x2) = 1s(x1)1s(x2),

Ψ1(x1, x2) = (1/√2) (1s(x1)2s(x2) + 1s(x2)2s(x1)),

Ψ2(x1, x2) = (1/√2) (1s(x1)2s(x2) − 1s(x2)2s(x1)). (II.15)

Note that we have introduced the normalization factor 1/√2 such that all wavefunctions are normalized. Furthermore, observe that for the ground state with 1s² electron configuration only the symmetric Ψ0(x1, x2) exists; the antisymmetric linear combination is trivially identical to zero.

We now want to generalize these findings to the N-electron case: a proper wavefunction for an N-electron system should have the form

PΨ(x1, . . . ,xN ) = Ψ(x1, . . . ,xN ) (is symmetric), (II.16)

PΨ(x1, . . . ,xN ) = εPΨ(x1, . . . ,xN ) (is antisymmetric), (II.17)

where P is a general permutation operator permuting all coordinates of two or more arbitrary electrons (e.g. electron 1 with 3, 3 with 4, 4 with 1, 6 with 7), and εP = ±1 is the related sign factor: +1 if P decomposes into an even number of pairwise transpositions, −1 for an odd number. We will later discover that in fact the antisymmetric form, eq. (II.17), is the proper symmetry of a wavefunction for fermions (like electrons), while bosons adhere to the symmetric form, eq. (II.16). However, first we have to explore another symmetry of the wavefunction, namely the spin symmetry.
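The sign factor εP can be computed from the parity of a permutation via its inversion count; a minimal Python sketch (the function name is ours):

```python
# eps_P of eq. (II.17): +1 for an even number of transpositions, -1 for odd.
# A permutation is given as a tuple/list of target positions, e.g. [1, 0]
# exchanges electrons 1 and 2.
def perm_sign(perm):
    inv = sum(perm[i] > perm[j]
              for i in range(len(perm))
              for j in range(i + 1, len(perm)))
    return -1 if inv % 2 else 1

print(perm_sign([1, 0]))      # -1: a single transposition, odd
print(perm_sign([1, 2, 0]))   # +1: a 3-cycle = two transpositions, even
```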


D. Spin and spin coupling

In order to discuss the spin symmetry of wavefunctions we first have to introduce the spin of an electron and different spin operators acting on the wavefunction and providing information about the spin. It is known that a single electron is not completely characterised by its spatial orbital, say the 1s(r) orbital of the hydrogen atom. Applying a magnetic field B reveals that the 1s level at zero magnetic field is doubly degenerate. For non-zero field strength B ≠ 0 the two degenerate energy levels split,

ε(Bz) = ε(Bz = 0) ± (1/2) g b Bz, (II.18)

with the g-factor g = 2.0023, the Bohr magneton b = eℏ/2mₑ, and Bz representing the z-component of the magnetic field B. This splitting is known as the Zeeman effect. It implies that the electron must have an intrinsic magnetic moment. Its component along the z-axis (parallel to Bz) can have two possible (quantized) values ±ℏ/2, leading to two possible energy values for a given Bz. An electron thus has two possible spin functions, α(s) and β(s), corresponding to these two possible components of its intrinsic magnetic moment along the z-axis. An electron in an orbital thus has to be specified by a spatial orbital part φ(r) and a spin part η(s), where η(s) is either α(s) or β(s). Here, a simple product ansatz does the job, and we arrive at the spin orbital

χ(x) = φ(r)η(s). (II.19)

These spin orbitals χ(x) are actually used to set up the symmetrized or antisymmetrized wavefunctions in a linear combination of Hartree products (vide supra). The electron coordinate x = (r, s) = (x, y, z, s) is a vector containing four elements, i.e., the three spatial components of the position vector r, and the spin coordinate s.

The spin functions α(s) and β(s) are by construction eigenfunctions of the spin operator Sz(s), with

Sz(s)α(s) = (1/2)α(s),   Sz(s)β(s) = −(1/2)β(s). (II.20)

Trivially, the spin orbital χ(x) = φ(r)η(s) is also an eigenfunction of Sz(s). Sz(s) acting on a spin function yields the z-component of its spin angular momentum vector (±1/2 for α(s) and β(s), respectively), in entire analogy to the orbital angular momentum operator Lz known from the hydrogen atom problem. Again in entire analogy to the orbital angular momentum case, the operators related to the remaining two Cartesian components, Sx(s) and Sy(s), do not commute with Sz(s), nor with each other, i.e.,

[Sx(s), Sy(s)] = iSz(s), [Sy(s), Sz(s)] = iSx(s), [Sz(s), Sx(s)] = iSy(s). (II.21)

Consequently, any spin function can only be an eigenfunction of one of these operators at a time, by convention of Sz(s). However, again in analogy to the orbital angular momentum case, each of these operators commutes with the spin operator S²(s) = Sx²(s) + Sy²(s) + Sz²(s), which represents the square of the norm (length) of the spin angular momentum vector, e.g.,

[S2(s), Sz(s)] = 0. (II.22)

Therefore, any spin function can simultaneously be an eigenfunction of S²(s) and one of the Cartesian component operators, conventionally Sz(s). These are precisely two further requirements an appropriate wavefunction has to fulfill: it should also be an eigenfunction of S²(s) and Sz(s). Acting with S²(s) on the one-electron spin functions α(s) and β(s) yields

S²(s)α(s) = (3/4)α(s),   S²(s)β(s) = (3/4)β(s). (II.23)

When the system comprises N electrons, the individual Cartesian components are simply added; e.g., for N = 2 we have

Sz(s1, s2) = Sz(s1) + Sz(s2), (II.24)

and analogous equations for Sx(s1, s2) and Sy(s1, s2), while S²(s1, s2) is defined in the usual way as the sum of the squares of these operators. We will learn how to work with such operators in a minute. Now let’s briefly return to


the problem of the energetic splitting of one-electron wavefunctions (as for the hydrogen atom) in the presence of a magnetic field. In order to capture this Zeeman effect we have to augment the Hamiltonian of the hydrogen atom by a spin-dependent term,

h(x) = h(r) + g b Bz Sz(s). (II.25)

h(x) now depends on the spin coordinate, and it is an easy exercise to show that h(x) acting on the spin orbital χ(x) = φ(r)η(s) yields the Zeeman splitting g b Bz.

The one-electron spin functions α(s) and β(s) can also be considered as Dirac delta functions δ(s ∓ 1/2) (Dirac delta functions can be envisaged as infinitely narrow and high Gaussians with norm one). Consequently, they form an orthonormal basis,

∫ α*(s)α(s) ds = ∫ β*(s)β(s) ds = 1,   ∫ α*(s)β(s) ds = ∫ β*(s)α(s) ds = 0, (II.26)

which is very convenient, as we will see later. We now turn to an alternative and very convenient formulation of the one-electron spin functions α(s) and β(s), and the spin operators working on them, namely a representation as vectors and Pauli matrices in a two-dimensional space. The one-electron spin functions α(s) and β(s) are represented by

α(s) = (1, 0)ᵀ,   β(s) = (0, 1)ᵀ, (II.27)

respectively. The Pauli matrices are defined as

σz(s) = [[1, 0], [0, −1]],   σx(s) = [[0, 1], [1, 0]],   σy(s) = [[0, −i], [i, 0]]. (II.28)

All these matrices are hermitian and have trace zero. Furthermore, σz is chosen in diagonal form such that α(s) and β(s) of eq. (II.27) indeed are eigenfunctions of the operator Sz(s) = (1/2) σz(s) with eigenvalues ±1/2, as can easily be verified. The remaining matrices σx(s) and σy(s) are then determined, since they have to fulfill the proper commutator relations

[σx(s), σy(s)] = 2iσz(s), [σy(s), σz(s)] = 2iσx(s), [σz(s), σx(s)] = 2iσy(s), (II.29)

mentioned in eq. (II.21). These commutator relations, too, can easily be verified as an exercise. For the S²(s) operator we get

S²(s) = (1/4) σx²(s) + (1/4) σy²(s) + (1/4) σz²(s) = (3/4) [[1, 0], [0, 1]]. (II.30)

Acting with S²(s) on α(s) and β(s) indeed trivially yields eq. (II.23). Acting with Sx(s) and Sy(s) on α(s) and β(s) yields

Sx(s)α(s) = (1/2)β(s),   Sx(s)β(s) = (1/2)α(s),

Sy(s)α(s) = (i/2)β(s),   Sy(s)β(s) = −(i/2)α(s). (II.31)

This can be used to define the versatile ladder operators

S+(s) = Sx(s) + iSy(s), S−(s) = Sx(s)− iSy(s), (II.32)

for which

S+(s)α(s) = (1/2)β(s) + i (i/2)β(s) = 0,

S+(s)β(s) = (1/2)α(s) + i (−i/2)α(s) = α(s),

S−(s)α(s) = (1/2)β(s) − i (i/2)β(s) = β(s),

S−(s)β(s) = (1/2)α(s) − i (−i/2)α(s) = 0. (II.33)
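All of these relations, eqs. (II.21), (II.30), and (II.33), can be checked directly in the 2×2 matrix representation; the following sketch uses plain Python complex arithmetic (the helper names are ours):

```python
# Verify the Pauli-matrix spin algebra numerically with 2x2 complex matrices.
def mm(A, B):  # matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(A, B, sign=1):
    return [[A[i][j] + sign * B[i][j] for j in range(2)] for i in range(2)]

def smul(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

sx = [[0, 1], [1, 0]]
sy = [[0, -1j], [1j, 0]]
sz = [[1, 0], [0, -1]]
Sx, Sy, Sz = smul(0.5, sx), smul(0.5, sy), smul(0.5, sz)

# commutator [Sx, Sy] = i Sz, eq. (II.21)
comm = add(mm(Sx, Sy), mm(Sy, Sx), sign=-1)
print(comm == smul(1j, Sz))   # True

# S^2 = Sx^2 + Sy^2 + Sz^2 = (3/4) * identity, eq. (II.30)
S2 = add(add(mm(Sx, Sx), mm(Sy, Sy)), mm(Sz, Sz))
print(S2)                     # (3/4) * identity (entries as complex numbers)

# ladder operator S+ = Sx + i Sy, eq. (II.32): kills alpha, raises beta
Sp = add(Sx, smul(1j, Sy))
mv = lambda A, v: [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]
alpha, beta = [1, 0], [0, 1]
print(mv(Sp, alpha), mv(Sp, beta))  # S+ annihilates alpha, maps beta to alpha
```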


S+(s) and S−(s) are step-up and step-down operators: the step-up operator S+(s) ”raises” β(s) to α(s), or annihilates α(s); the step-down operator S−(s) does the opposite. These ladder operators are very useful for rewriting the S²(s) operator. Of course, we can also write Sx(s) and Sy(s) in terms of these ladder operators as

Sx(s) = (1/2) (S+(s) + S−(s)),   Sy(s) = −(i/2) (S+(s) − S−(s)). (II.34)

From the definition of the ladder operators in eq. (II.32) it immediately follows that

S+(s)S−(s) = Sx²(s) + Sy²(s) − i (Sx(s)Sy(s) − Sy(s)Sx(s)) = Sx²(s) + Sy²(s) + Sz(s),

S−(s)S+(s) = Sx²(s) + Sy²(s) + i (Sx(s)Sy(s) − Sy(s)Sx(s)) = Sx²(s) + Sy²(s) − Sz(s),

and therefore

S+(s)S−(s) + S−(s)S+(s) = 2 (Sx²(s) + Sy²(s)). (II.35)

For the derivation of the previous equation we used the commutator relations of eq. (II.21). Using these relations we get for the S²(s) operator the following convenient equations,

S²(s) = S+(s)S−(s) + Sz(s) (Sz(s) − 1), (II.36)

S²(s) = S−(s)S+(s) + Sz(s) (Sz(s) + 1), (II.37)

S²(s) = (1/2) (S+(s)S−(s) + S−(s)S+(s)) + Sz²(s). (II.38)

We now turn our attention to spin functions and operators of more than one electron. For simplicity let’s return once more to the two-electron problem, e.g. the helium atom or the H2 molecule. The overall spin vector of the two-electron system is the vector sum of the two spin vectors of the individual electrons, thus we have

Sx(s1, s2) = Sx(s1) + Sx(s2), (II.39)

and analogous equations for Sy(s1, s2) and Sz(s1, s2). Note that these operators now depend on two spin coordinates, s1 and s2, related to the two electrons. Using these we can write the S²(s1, s2) operator as

S²(s1, s2) = Sx²(s1, s2) + Sy²(s1, s2) + Sz²(s1, s2) = S²(s1) + S²(s2) + 2 S(s1)S(s2), (II.40)

where S²(si) = Sx²(si) + Sy²(si) + Sz²(si) as above, and

S(s1)S(s2) = Sx(s1)Sx(s2) + Sy(s1)Sy(s2) + Sz(s1)Sz(s2). (II.41)

Using the definition of the ladder operators given in eq. (II.32) it is easy to verify the relation

S+(s1)S−(s2) + S−(s1)S+(s2) = 2 (Sx(s1)Sx(s2) + Sy(s1)Sy(s2)), (II.42)

and thus

S(s1)S(s2) = (1/2) (S+(s1)S−(s2) + S−(s1)S+(s2)) + Sz(s1)Sz(s2). (II.43)

Using this in eq. (II.40) we finally get a convenient equation for the S2(s1, s2) operator,

S2(s1, s2) = S2(s1) + S2(s2) + S+(s1)S−(s2) + S−(s1)S+(s2) + 2Sz(s1)Sz(s2). (II.44)

Any proper two-electron spin function Θ_{S,MS}(s1, s2) should simultaneously be an eigenfunction of Sz(s1, s2) and S²(s1, s2), according to

S²(s1, s2) Θ_{S,MS}(s1, s2) = S(S + 1) Θ_{S,MS}(s1, s2), (II.45)

Sz(s1, s2) Θ_{S,MS}(s1, s2) = MS Θ_{S,MS}(s1, s2). (II.46)

Using the simple product ansatz we can trivially form four different two-electron spin functions out of the one-electron spin functions α(s) and β(s), which are all eigenfunctions of Sz(s1, s2) with eigenvalue MS,


product          MS
α(s1)α(s2)        1
α(s1)β(s2)        0
β(s1)α(s2)        0
β(s1)β(s2)       −1

For example,

Sz(s1, s2) α(s1)β(s2) = β(s2)Sz(s1)α(s1) + α(s1)Sz(s2)β(s2) = (1/2)α(s1)β(s2) − (1/2)α(s1)β(s2) = 0. (II.47)

Furthermore, the two products α(s1)α(s2) and β(s1)β(s2) are also eigenfunctions of S²(s1, s2), which can easily be verified by employing eq. (II.44). E.g.,

S²(s1, s2) α(s1)α(s2) = (S²(s1) + S²(s2) + S+(s1)S−(s2) + S−(s1)S+(s2) + 2Sz(s1)Sz(s2)) α(s1)α(s2)
= (3/4 + 3/4 + 0 + 0 + 2·(1/2)·(1/2)) α(s1)α(s2) = 2 α(s1)α(s2), (II.48)

yielding the eigenvalue S(S + 1) = 2 and thus S = 1. On the other hand, the two remaining products α(s1)β(s2) and β(s1)α(s2) are NOT eigenfunctions of S²(s1, s2), e.g.,

S²(s1, s2) α(s1)β(s2) = (S²(s1) + S²(s2) + S+(s1)S−(s2) + S−(s1)S+(s2) + 2Sz(s1)Sz(s2)) α(s1)β(s2)
= (3/4 + 3/4) α(s1)β(s2) + 0 + β(s1)α(s2) − 2·(1/2)·(1/2) α(s1)β(s2) = α(s1)β(s2) + β(s1)α(s2). (II.49)

However, taking the plus and minus linear combinations of these products leads to two functions which are proper eigenfunctions of S²(s1, s2),

S2(s1, s2) (α(s1)β(s2) + β(s1)α(s2)) = 2 (α(s1)β(s2) + β(s1)α(s2)) (II.50)

S2(s1, s2) (α(s1)β(s2)− β(s1)α(s2)) = 0 (α(s1)β(s2)− β(s1)α(s2)) , (II.51)

as can easily be verified as an exercise. So we have now generated, out of the four simple products, four proper spin functions:

spin function                                           S    MS
Θ0,0(s1, s2) = (1/√2) (α(s1)β(s2) − β(s1)α(s2))         0     0
Θ1,1(s1, s2) = α(s1)α(s2)                               1     1
Θ1,0(s1, s2) = (1/√2) (α(s1)β(s2) + β(s1)α(s2))         1     0
Θ1,−1(s1, s2) = β(s1)β(s2)                              1    −1

The single eigenfunction of S²(s1, s2) with eigenvalue S = 0 is a singlet function; the three eigenfunctions of S²(s1, s2) with eigenvalue S = 1 constitute the three components of a triplet state with z-components of the spin momentum vector of MS = 1, 0, −1, respectively. Applying a magnetic field, the three components of the triplet state would no longer be energetically degenerate but would split into three different levels, just as the doublet state of the single electron of e.g. the hydrogen atom splits into two levels in the presence of a magnetic field (cf. the Zeeman effect discussed above).

Closer inspection of the four spin functions collected in the previous table reveals another important property: these functions are either symmetric or antisymmetric with respect to exchange of the two spin coordinates. The singlet function is antisymmetric (it changes sign), while the three triplet functions are symmetric (they don’t).
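The S and MS assignments in the table above can be verified numerically by building Sz(s1, s2) and S²(s1, s2) as 4×4 matrices via Kronecker products, following eqs. (II.39) and (II.40) (a pure-Python sketch; the helper names are ours):

```python
import math

def kron(A, B):  # Kronecker product of two 2x2 matrices
    return [[A[i // 2][j // 2] * B[i % 2][j % 2] for j in range(4)] for i in range(4)]

def mm(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))] for i in range(len(A))]

def madd(*Ms):
    return [[sum(M[i][j] for M in Ms) for j in range(len(Ms[0][0]))] for i in range(len(Ms[0]))]

def mv(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

I2 = [[1, 0], [0, 1]]
Sx = [[0, 0.5], [0.5, 0]]
Sy = [[0, -0.5j], [0.5j, 0]]
Sz = [[0.5, 0], [0, -0.5]]

# one-electron operators promoted to the two-electron space and added, eq. (II.39)
SX = madd(kron(Sx, I2), kron(I2, Sx))
SY = madd(kron(Sy, I2), kron(I2, Sy))
SZ = madd(kron(Sz, I2), kron(I2, Sz))
S2 = madd(mm(SX, SX), mm(SY, SY), mm(SZ, SZ))   # eq. (II.40)

# basis ordering: aa, ab, ba, bb
s = 1 / math.sqrt(2)
singlet = [0, s, -s, 0]    # Theta_{0,0}
triplet0 = [0, s, s, 0]    # Theta_{1,0}

print(mv(S2, singlet))         # 0 * singlet      -> S(S+1) = 0, S = 0
print(mv(S2, triplet0))        # 2 * triplet0     -> S(S+1) = 2, S = 1
print(mv(SZ, [1, 0, 0, 0]))    # 1 * aa           -> MS = 1
```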

E. Symmetry or antisymmetry of the wavefunction, part 2

Let us now return to the question about symmetry or antisymmetry of the electronic wavefunction raised in section II C, which has led so far to eqs. (II.16) and (II.17). For simplicity we still consider the two-electron problem, i.e.,


FIG. 2: Singlet and triplet energy levels of the 1s2 and 1s2s electron configurations of the helium atom.

the helium atom. For the electronic ground state with a two-fold occupation of the 1s(r) orbital we can construct the four functions

antisymmetric                                      symmetric
Ψ1ᵃ(x1, x2) = 1s(r1)1s(r2) Θ0,0(s1, s2)            Ψ1ˢ(x1, x2) = 1s(r1)1s(r2) Θ1,1(s1, s2)
                                                   Ψ2ˢ(x1, x2) = 1s(r1)1s(r2) Θ1,0(s1, s2)
                                                   Ψ3ˢ(x1, x2) = 1s(r1)1s(r2) Θ1,−1(s1, s2)

These four functions result from the combination of the symmetric space function 1s(r1)1s(r2) with an either antisymmetric (singlet) or symmetric (triplet) spin function. Note that there is no way to produce a sensible antisymmetric space function for a two-fold occupation of the 1s(r) orbital; it would be identical to zero. Now let’s turn to an electronically excited state with e.g. 1s2s electron configuration. Here we have eight possibilities by combination of either a symmetric or antisymmetric space function with a singlet or a triplet spin function, i.e.,

antisymmetric                                                        symmetric
Ψ2ᵃ(x1, x2) = (1/√2)(1s(r1)2s(r2) + 2s(r1)1s(r2)) Θ0,0(s1, s2)       Ψ4ˢ(x1, x2) = (1/√2)(1s(r1)2s(r2) − 2s(r1)1s(r2)) Θ0,0(s1, s2)
Ψ3ᵃ(x1, x2) = (1/√2)(1s(r1)2s(r2) − 2s(r1)1s(r2)) Θ1,1(s1, s2)       Ψ5ˢ(x1, x2) = (1/√2)(1s(r1)2s(r2) + 2s(r1)1s(r2)) Θ1,1(s1, s2)
Ψ4ᵃ(x1, x2) = (1/√2)(1s(r1)2s(r2) − 2s(r1)1s(r2)) Θ1,0(s1, s2)       Ψ6ˢ(x1, x2) = (1/√2)(1s(r1)2s(r2) + 2s(r1)1s(r2)) Θ1,0(s1, s2)
Ψ5ᵃ(x1, x2) = (1/√2)(1s(r1)2s(r2) − 2s(r1)1s(r2)) Θ1,−1(s1, s2)      Ψ7ˢ(x1, x2) = (1/√2)(1s(r1)2s(r2) + 2s(r1)1s(r2)) Θ1,−1(s1, s2)

Fig. 2 shows, in a qualitative way, the energetic ordering of the electronic wavefunctions belonging to the 1s² and 1s2s electron configurations of the helium atom. Note that antisymmetric space functions are more stable than symmetric ones, since the wavefunction has a ”hole” for r1 → r2 and approaches zero at electron coalescence. This corresponds to the first of Hund’s rules.

Now, by exploiting the Zeeman effect we can once and for all decide about symmetry or antisymmetry of the overall electronic wavefunction by comparison to spectroscopic results. When applying an external magnetic field, the triplet states split. The spectra of the symmetric vs. the antisymmetric case are clearly distinguishable. Comparison to experiment reveals that the overall electronic wavefunction must in fact be antisymmetric. This is generally true for fermions (particles with half-integer spin). Note that the Pauli exclusion principle is a consequence of the general antisymmetry of the wavefunction: occupying, say, the space orbital 1s(r) twice with electrons of the same spin leads to a forbidden symmetric wavefunction. From now on we know once and for all that electronic wavefunctions are always antisymmetric, i.e., they obey eq. (II.17) and not eq. (II.16). We also note in passing that triplet states of a given electron configuration are energetically more favorable (lower) than singlet states, which corresponds to the first of Hund’s rules. The reason for that, as already mentioned above, is the ”Fermi hole” in the wavefunction when electrons approach each other (Fermi correlation). We will return to that later.


F. Spatial symmetry

The symmetry in the arrangement of the individual atoms of a molecule also has an impact on the symmetry of the electronic wavefunction. Any symmetry operator (e.g. reflection at a mirror plane, or rotation about an axis of symmetry) of the point group of the molecule trivially commutes with the Hamilton operator (clearly, it does not matter whether we first perform the symmetry operation and then calculate the energy, or vice versa). This leads to certain further symmetry properties of the electronic wavefunction, which we won’t discuss here (there are other lectures devoted to that topic). We just mention that the electronic wavefunction has to transform according to one of the irreducible representations of the point group of the molecule. Closed-shell systems transform according to the totally symmetric one.

III. SLATER DETERMINANTS

Now that we know that the electronic wavefunction is antisymmetric, we can conclude that a simple Hartree product ansatz as discussed in section II B, i.e., eq. (II.9), is clearly inappropriate: it does not fulfill the required antisymmetry condition. The most straightforward possibility then is to take such an orbital product ansatz and properly antisymmetrize it, i.e., to generate a linear combination out of that product ansatz including all necessary terms with proper assignments of electron coordinates to orbitals and proper signs (according to eq. (II.17)). What results is the so-called Slater determinant,

Ψ(x1, . . . , xN) = (1/√N!) ∑_P^{N!} εP P χ1(x1)χ2(x2) . . . χN(xN), (III.1)

which replaces eq. (II.9). Here, P is a general permutation operator permuting electron coordinates (of both space and spin), as encountered already in eq. (II.17), and εP = ±1 is its related sign factor. Having a product of N spin orbitals χi(x), and N electrons to occupy them, there are naturally N! ways to distribute them. Consequently, the sum in eq. (III.1) runs over all N! possible permutation operators P. The pre-factor 1/√N!, as we will see later, properly normalizes the Slater determinant, i.e.,

∫_{−∞}^{∞} Ψ*(x1, . . . , xN)Ψ(x1, . . . , xN) dx1 . . . dxN = ⟨Ψ | Ψ⟩ = 1. (III.2)

Slater determinants can also be written in the form

Ψ(x1, . . . , xN) = (1/√N!) | χ1(x1)  χ2(x1)  . . .  χN(x1) |
                           | χ1(x2)  χ2(x2)  . . .  χN(x2) |
                           |    ⋮        ⋮     ⋱       ⋮    |
                           | χ1(xN)  χ2(xN)  . . .  χN(xN) |

                 = (1/√N!) ‖ χ1(x1) χ2(x2) . . . χN(xN) ‖, (III.3)

i.e., as the determinant of a matrix containing the orbitals in ascending order as columns, and the electron coordinates in ascending order as rows. A shorthand notation for this is to write just the diagonal of this matrix, as done after the second equals sign in eq. (III.3).

One can immediately see from eq. (III.3) that the determinant is invariant (up to a constant factor) with respect to linear combinations (e.g. rotations) of the orbitals χI(x). This follows from the important property of determinants that adding a scalar multiple of one column (or row) to another column (or row) of the matrix leaves its determinant unchanged. This has important consequences for the physical meaning (or rather meaninglessness) of the individual orbitals, as we will see later.
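The antisymmetry built into eq. (III.3) can be tried out numerically: below, a three-electron Slater determinant is assembled from three arbitrarily chosen model ”orbitals” (hypothetical functions, for illustration only) and shown to change sign when the coordinates of two electrons are exchanged:

```python
import math
from itertools import permutations

# hypothetical one-dimensional stand-ins for chi_1, chi_2, chi_3
orbitals = [
    lambda x: math.exp(-x * x),
    lambda x: x * math.exp(-x * x),
    lambda x: (2 * x * x - 1) * math.exp(-x * x),
]

def det(M):
    # Leibniz formula: sum over permutations with sign eps_P (inversion count)
    n = len(M)
    total = 0.0
    for perm in permutations(range(n)):
        inv = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = (-1) ** inv
        for row, col in enumerate(perm):
            term *= M[row][col]
        total += term
    return total

def slater(xs):
    # matrix element (i, j) = chi_j(x_i), as in eq. (III.3)
    M = [[chi(x) for chi in orbitals] for x in xs]
    return det(M) / math.sqrt(math.factorial(len(xs)))

a = slater([0.1, 0.5, 0.9])
b = slater([0.5, 0.1, 0.9])   # electrons 1 and 2 exchanged
print(a, b)                    # b == -a: the determinant is antisymmetric
```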

Clearly, by construction, the Slater determinant ansatz fulfills the requirement of antisymmetry according to eq. (II.17). Slater determinants are therefore proper building blocks for constructing good wavefunction approximations, since they already incorporate antisymmetry, and with it, Fermi correlation. Actually, the springboard to ab initio electronic structure theory, Hartree-Fock theory, is a single Slater determinant ansatz, as we will see later; but there are also linear combinations and other multi-determinant ansätze (Configuration Interaction, Coupled Cluster theory), which are more accurate and repair the main deficiencies of Hartree-Fock theory. Furthermore, it is an easy exercise to show that any N-electron Slater determinant is an eigenfunction of the Sz(s1, s2, . . . , sN) spin operator, i.e. that it obeys the N-electron analogue of eq. (II.46). On the other hand, an N-electron Slater determinant is not generally an eigenfunction


of the S^2(s_1, s_2, \dots, s_N) operator, as we have already seen in section II D. However, appropriate linear combinations of Slater determinants, so-called configuration state functions (CSFs), can be formed which are eigenfunctions of S^2(s_1, s_2, \dots, s_N) and thus fully comply with the required spin symmetry. For N-electron closed-shell systems a single Slater determinant indeed is an eigenfunction also of the S^2(s_1, s_2, \dots, s_N) operator. This is less trivial to demonstrate; a convenient proof requires knowledge of second quantization, which will be discussed in a later lecture series.

IV. DENSITY FUNCTIONS AND DENSITY MATRICES

A. Density functions...

The wave function Ψ(x_1, \dots, x_N) is a very complicated object, which depends on all 4N coordinates of the N electrons. Actually, it contains much more information than is physically essential. We will see in a minute that the exact energy can in principle be computed from much simpler objects, namely from the exact one-particle density matrix and the exact two-particle density function, provided that they are known. In order to elaborate on that, let's start with the simple system of a single electron of α spin, described by the spin orbital χ(x) = φ(r)α(s).

|\chi(x)|^2\, dx = \chi^*(x)\chi(x)\, dx = \rho(x)\, dx \qquad (IV.1)

then is the probability of finding that electron in the (space-spin) volume element dx = drds. The probability offinding this electron with α spin thus is governed by the density function ρ(x), implicitly defined in the previousequation. If we just want to know how the electron is distributed in space regardless of spin we can integrate out spinas

P(r)\, dr = \left[\int \rho(x)\, ds\right] dr = \phi^*(r)\phi(r)\, dr \int \alpha^*(s)\alpha(s)\, ds = |\phi(r)|^2\, dr, \qquad (IV.2)

where we have made use of eq. (II.26).

P(r) = \int \rho(x)\, ds \qquad (IV.3)

is the density function without reference to spin. Let us now generalize this to the case of N electrons.

|\Psi(x_1,\dots,x_N)|^2\, dx_1 dx_2 \dots dx_N = \Psi^*(x_1,\dots,x_N)\Psi(x_1,\dots,x_N)\, dx_1 dx_2 \dots dx_N \qquad (IV.4)

is the probability of finding a first electron in the (space-spin) volume element dx_1 = dr_1 ds_1, and simultaneously a second electron in volume element dx_2, and simultaneously a third in dx_3, …, and simultaneously the last in dx_N. Since we are dealing with at most two-electron operators, this is clearly much more information than is physically relevant. So let's start to extract the relevant information out of this object by integrating out unwanted coordinates, just as we did above with the spin coordinate. The probability of finding the first electron in volume element dx_1, regardless of where all the other electrons simultaneously are, is given by

dx_1 \int \Psi^*(x_1,\dots,x_N)\Psi(x_1,\dots,x_N)\, dx_2 \dots dx_N. \qquad (IV.5)

Since electrons are indistinguishable particles, the probability of finding any electron (not just the first) in dx is N times larger, i.e.,

\rho_1(x)\, dx = \left[ N \int \Psi^*(x, x_2, \dots, x_N)\Psi(x, x_2, \dots, x_N)\, dx_2 \dots dx_N \right] dx. \qquad (IV.6)

ρ1(x) is the one-particle density function of the system. It depends on four variables. Note that we have dropped thesubscript 1 since we now consider a single, arbitrary, infinitesimal volume element dx at position vector x.
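The definition (IV.6) can be made concrete in a discretized toy model, where the combined space-spin coordinate x runs over M grid points and integrals become sums. The following numpy sketch (all sizes and orbitals hypothetical) builds an antisymmetric two-electron wavefunction and checks that ρ_1 integrates to the number of electrons N:

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 6, 2                       # M grid points for the coordinate x, N = 2 electrons
phi, _ = np.linalg.qr(rng.standard_normal((M, 2)))   # two orthonormal discrete "spin orbitals"
a, b = phi[:, 0], phi[:, 1]

# Antisymmetric two-electron wavefunction (a 2x2 Slater determinant)
Psi = (np.outer(a, b) - np.outer(b, a)) / np.sqrt(2.0)

# eq. (IV.6): integrate out the second coordinate and multiply by N
rho1 = N * np.sum(np.abs(Psi) ** 2, axis=1)
assert np.isclose(rho1.sum(), N)   # rho1 integrates to the number of electrons
```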

Again, the corresponding spinless density function is obtained by integrating out also the remaining spin coordinate,

P_1(r) = \int \rho_1(x)\, ds. \qquad (IV.7)


P1(r) carries the information about the spatial distribution of the electrons and can be measured e.g. by X-rayscattering experiments.

Since the Hamiltonian is a two-electron operator (electrons interact pairwise via the Coulomb operator r_{ij}^{-1}, cf. eq. (I.2)), we also need the two-particle density function. We ask for the probability of finding an arbitrary electron in volume element dx_1, and simultaneously another one in dx_2; hence we integrate over all coordinates but x_1 and x_2. Furthermore, we exploit that the electron pairs are indistinguishable. This yields

\rho_2(x_1, x_2)\, dx_1 dx_2 = \left[ N(N-1) \int \Psi^*(x_1,\dots,x_N)\Psi(x_1,\dots,x_N)\, dx_3 \dots dx_N \right] dx_1 dx_2. \qquad (IV.8)

ρ_2(x_1, x_2) is the two-particle density function of the system and depends on eight variables. Note that here the subscripts 1 and 2 just serve the purpose of distinguishing two arbitrary infinitesimal volume elements dx_1, dx_2 at positions x_1 and x_2, respectively. They are no longer related to the electron coordinates x_1 and x_2.

And again, the related spinless quantity is obtained by integrating over the two remaining spin coordinates s1 ands2,

P_2(r_1, r_2) = \int \rho_2(x_1, x_2)\, ds_1 ds_2. \qquad (IV.9)

P2(r1, r2) is the spin free two particle density function and carries the information about the correlated movement ofinteracting electrons. Higher density functions could be constructed analogously, but are not needed.
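The pair normalization implicit in eq. (IV.8) can be checked in the same discretized toy model as before (a hypothetical two-electron wavefunction on M grid points; for N = 2 no coordinates remain to be integrated out):

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 6, 2
phi, _ = np.linalg.qr(rng.standard_normal((M, 2)))   # two orthonormal discrete orbitals
a, b = phi[:, 0], phi[:, 1]
Psi = (np.outer(a, b) - np.outer(b, a)) / np.sqrt(2.0)  # antisymmetric 2-electron wavefunction

rho2 = N * (N - 1) * np.abs(Psi) ** 2        # eq. (IV.8) for N = 2
assert np.isclose(rho2.sum(), N * (N - 1))   # integrates to the number of ordered pairs
assert np.allclose(rho2, rho2.T)             # symmetric under exchange of the two positions
```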

B. ...and density matrices

The expectation value of a one-electron operator X = \sum_{i=1}^{N} x(x_i) related to a certain quantity is calculated as

\langle X \rangle = \langle \Psi \,|\, X \,|\, \Psi \rangle = \int \Psi^*(x_1,\dots,x_N)\, X\, \Psi(x_1,\dots,x_N)\, dx_1 \dots dx_N. \qquad (IV.10)

Provided that X is only a multiplier (not a differential operator), we can just as well put it in front of Ψ^*(x_1,\dots,x_N). Furthermore, since electrons are indistinguishable, we can just consider x acting on the first electron coordinate and multiply by N (the contributions of all N electrons are identical). Using this, eq. (IV.10) can be rewritten as

\langle X \rangle = \int x(x_1) \left[ N \int \Psi^*(x_1,\dots,x_N)\Psi(x_1,\dots,x_N)\, dx_2 \dots dx_N \right] dx_1 = \int x(x)\, \rho_1(x)\, dx, \qquad (IV.11)

where we have used the definition of the one-particle density function implicitly given in eq. (IV.6). As eq. (IV.11) shows, the expectation value of a multiplicative one-electron operator can be calculated by tracing the operator with the one-particle density function. This is a very nice result and we would like to generalize it to arbitrary one-electron operators (including differential operators). This can be achieved by applying a small trick. For an arbitrary one-electron operator we still have

\langle X \rangle = \int \left[ N \int \Psi^*(x_1,\dots,x_N)\, x(x_1)\, \Psi(x_1,\dots,x_N)\, dx_2 \dots dx_N \right] dx_1. \qquad (IV.12)

We could move x(x1) in front of Ψ∗(x1, . . . ,xN ) if we somehow protect the dependence of Ψ∗ on coordinate x1 fromthe operator x(x1) now standing in front of it. This can be done by renaming the argument x1 of Ψ∗ to x′1. However,before carrying out the outer integration over x1 we must remember that x′1 actually is x1. The mathematical notationfor all this is

\langle X \rangle = \int_{x_1'=x_1} x(x_1) \left[ N \int \Psi^*(x_1', x_2, \dots, x_N)\Psi(x_1,\dots,x_N)\, dx_2 \dots dx_N \right] dx_1 = \int_{x'=x} x(x)\, \rho_1(x'; x)\, dx, \qquad (IV.13)

where we have implicitly defined the one-particle density matrix ρ1(x′; x). Obviously, it depends on two sets ofvariables, x′ and x, and as its name says, it can be considered as a kind of an abstract matrix, with its diagonal


ρ1(x; x) = ρ1(x) being identical to the density function introduced earlier. Analogously, we could define the two-particle density matrix as

\rho_2(x_1', x_2'; x_1, x_2) = N(N-1) \int \Psi^*(x_1', x_2', x_3, \dots, x_N)\, \Psi(x_1,\dots,x_N)\, dx_3 \dots dx_N. \qquad (IV.14)

Its diagonal, ρ_2(x_1, x_2; x_1, x_2) = ρ_2(x_1, x_2), again is identical to the two-particle density function introduced earlier. Now, since the only two-electron operator we will ever encounter, i.e., the electron-electron Coulomb operator g(1, 2) = |r_1 - r_2|^{-1}, is a multiplier, the two-particle density matrix will never be needed.
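Why the off-diagonal elements of ρ_1 are needed for differential operators is easy to see numerically: on a grid, a differential operator becomes a non-diagonal matrix h, and ⟨ψ|h|ψ⟩ equals the trace of h with the discretized density matrix. A minimal numpy sketch for a single electron (grid, operator, and state all hypothetical):

```python
import numpy as np

M = 8
rng = np.random.default_rng(2)
psi = rng.standard_normal(M) + 1j * rng.standard_normal(M)
psi /= np.linalg.norm(psi)                # normalized one-electron state on M grid points

# A non-multiplicative one-electron operator: a discretized kinetic-energy-like
# operator, non-diagonal in the coordinate representation
h = np.diag(np.full(M, 1.0)) - 0.5 * (np.diag(np.ones(M - 1), 1) + np.diag(np.ones(M - 1), -1))

rho = np.outer(psi, psi.conj())           # discretized density matrix rho_1(x; x')
direct = psi.conj() @ h @ psi             # <psi|h|psi> evaluated directly
via_dm = np.trace(h @ rho)                # trace of the operator with the density matrix
assert np.isclose(direct, via_dm)
```

The diagonal of `rho` alone (the density function) would not suffice here, since `h` couples different grid points.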

C. The energy as a functional of density matrix and density functions

Now we are in a position to rewrite the expectation value of the Hamiltonian defined in eq. (I.4), i.e.,

\langle \Psi \,|\, H_{el} \,|\, \Psi \rangle, \quad \text{with} \quad H_{el} = \sum_{i=1}^{N} h(r_i) + \frac{1}{2} \sum_{ij}{}' g(r_i, r_j) + E_{nuc} \qquad (IV.15)

in terms of density matrices and functions. The last term is trivial: since E_{nuc} is just a number, we can factorize it out of the integral, which itself is one due to the normalization of the wave function. Since the one-electron operator h(r_i) contains a differential operator (cf. eq. (I.11)), we need the one-particle density matrix ρ_1(x'; x) for this term. For the two-electron term involving the purely multiplicative two-electron operator g(r_i, r_j), on the other hand, only the two-particle density function is required. This all yields for the expectation value of the Hamiltonian

\langle \Psi \,|\, H_{el} \,|\, \Psi \rangle = \int_{x'=x} h(r)\, \rho_1(x'; x)\, dx + \frac{1}{2} \int g(r_1, r_2)\, \rho_2(x_1, x_2)\, dx_1 dx_2 + E_{nuc}. \qquad (IV.16)

Since none of the operators acts on a spin coordinate, we could as well integrate out spin, yielding

\langle \Psi \,|\, H_{el} \,|\, \Psi \rangle = \int_{r'=r} h(r)\, P_1(r'; r)\, dr + \frac{1}{2} \int g(r_1, r_2)\, P_2(r_1, r_2)\, dr_1 dr_2 + E_{nuc}. \qquad (IV.17)

This is an important result! With eq. (IV.17) we now have a prescription for how to calculate the exact expectation value of the Hamiltonian, and with that the exact energy, from much simpler quantities than the exact wavefunction. Remember that the wave function depends on 3N variables (disregarding spin), while the one-particle density matrix and the two-particle density function each depend on just six variables. In other words, provided that we somehow have the exact one-particle density matrix and two-particle density function at hand, we could calculate the exact energy from them according to eq. (IV.17). The catch is in the small print, and its name is the N-representability condition: we also have to make sure that a proper wave function exists from which the density matrix/function could be computed, even though it is not needed itself. This is a hard problem, and again an example of the "no free lunch theorem" ubiquitous in science.

In any case, eq. (IV.17) is a clear prescription (functional) mapping the one-particle density matrix and the two-particle density function to the energy. We will later learn, in the context of density functional theory, that in principle there is a functional mapping the even much simpler one-particle density function (which depends on just three variables) to the energy. However, nobody in the world knows what this functional looks like, and one has to try to model it somehow.

V. THE SLATER-CONDON RULES

We have learned in section III that the elementary building blocks of electronic wavefunctions are Slater determinants. We now have to learn how to calculate expectation values involving Slater determinants, in order to arrive at expressions involving less abstract quantities, like integrals over orbitals, which can be evaluated by computers. In this section we derive a set of rules, called the Slater-Condon rules, which do exactly that. We consider matrix elements of the type

\langle \Psi_K(x_1,\dots,x_N) \,|\, O_1 \,|\, \Psi_L(x_1,\dots,x_N) \rangle = \langle K \,|\, O_1 \,|\, L \rangle, \qquad (V.1)

\langle \Psi_K(x_1,\dots,x_N) \,|\, O_2 \,|\, \Psi_L(x_1,\dots,x_N) \rangle = \langle K \,|\, O_2 \,|\, L \rangle, \qquad (V.2)


where |Ψ_K(x_1,\dots,x_N)⟩ = |K⟩ and |Ψ_L(x_1,\dots,x_N)⟩ = |L⟩ are Slater determinants as specified in eq. (III.1), which may differ in their individual sets of orbitals, and O_1, O_2 are general one- and two-electron operators of the form

O_1 = \sum_{i=1}^{N} h(x_i), \qquad (V.3)

O_2 = \frac{1}{2} \sum_{ij}{}' g(x_i, x_j), \qquad (V.4)

as they appear e.g. in the expression for the electronic Hamilton operator (IV.15). However, let's first check whether the Slater determinants form an orthonormal N-electron basis, i.e., whether

\langle \Psi_K(x_1,\dots,x_N) \,|\, \Psi_L(x_1,\dots,x_N) \rangle = \langle K | L \rangle = \delta_{KL}. \qquad (V.5)

Let's first explore the case of identical Slater determinants |K⟩ = |L⟩ (constructed from the same set of orbitals). Inserting eq. (III.1) into the expectation value yields

\langle K | K \rangle = (N!)^{-1} \sum_{P_K}^{N!} \sum_{P_L}^{N!} \varepsilon_{P_K} \varepsilon_{P_L} \int P_K\bigl(\chi_1^*(x_1)\chi_2^*(x_2)\dots\chi_N^*(x_N)\bigr)\, P_L\bigl(\chi_1(x_1)\chi_2(x_2)\dots\chi_N(x_N)\bigr)\, dx_1 \dots dx_N. \qquad (V.6)

Due to the premise that the orbitals constituting the Slater determinants form an orthonormal one-electron basis, i.e.,

\int \chi_I^*(x)\, \chi_J(x)\, dx = \delta_{IJ}, \qquad (V.7)

every term of eq. (V.6) contains a factor of zero, unless both permutations are identical, i.e., P_K = P_L = P. Eq. (V.6) hence simplifies to

\langle K | K \rangle = (N!)^{-1} \sum_{P}^{N!} \int P\bigl(\chi_1^*(x_1)\chi_2^*(x_2)\dots\chi_N^*(x_N)\bigr)\, P\bigl(\chi_1(x_1)\chi_2(x_2)\dots\chi_N(x_N)\bigr)\, dx_1 \dots dx_N, \qquad (V.8)

where we have used that \varepsilon_P^2 \equiv 1. We can now see that the integral factorizes into one-electron integrals, i.e.,

\langle K | K \rangle = (N!)^{-1} \Bigl( \int \chi_1^*(x_1)\chi_1(x_1)\, dx_1 \int \chi_2^*(x_2)\chi_2(x_2)\, dx_2 \dots \int \chi_N^*(x_N)\chi_N(x_N)\, dx_N

+ \int \chi_2^*(x_1)\chi_2(x_1)\, dx_1 \int \chi_1^*(x_2)\chi_1(x_2)\, dx_2 \dots \int \chi_N^*(x_N)\chi_N(x_N)\, dx_N + \dots \Bigr). \qquad (V.9)

Each individual factorized integral obviously is equal to one, and we have a sum of N! ones. Hence the N! cancels with the normalization (N!)^{-1} and we indeed get ⟨K|K⟩ = 1, as anticipated. It is now also easy to see what happens when the orbital sets of the two determinants |K⟩ and |L⟩ differ by at least one orbital. In this case, whatever we do with the permutations P_K and P_L, we will always have a factor of zero in each term of the sum. We can conclude that the Slater determinants indeed form an orthonormal N-electron basis, according to eq. (V.5).
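This orthonormality can be verified by brute force in a small discrete model: orbitals become orthonormal vectors on M grid points, and the determinant amplitudes Ψ(i_1,…,i_N) are evaluated explicitly for every coordinate tuple. The following numpy sketch (hypothetical sizes M = 4, N = 3) checks ⟨K|K⟩ = 1 and ⟨K|L⟩ = 0 for determinants differing in one orbital:

```python
import math
import numpy as np
from itertools import product

def slater(orbs):
    """Brute-force amplitudes Psi(i1,...,iN) of an N-electron Slater determinant
    built from discrete orbitals given as the columns of `orbs` (shape M x N)."""
    M, N = orbs.shape
    Psi = np.zeros((M,) * N)
    for idx in product(range(M), repeat=N):
        Psi[idx] = np.linalg.det(orbs[list(idx), :]) / math.sqrt(math.factorial(N))
    return Psi

rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # 4 orthonormal discrete orbitals
K = slater(Q[:, :3])                              # |K> built from orbitals 0, 1, 2
L = slater(Q[:, [0, 1, 3]])                       # |L> differs in one orbital

assert np.isclose(np.sum(K * K), 1.0)   # <K|K> = 1
assert np.isclose(np.sum(K * L), 0.0)   # <K|L> = 0 when the orbital sets differ
```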

A. The Slater-Condon rules for one-electron operators

Let us now explore the case of an expectation value of a one-electron operator O_1. We again explore the case of identical determinants |K⟩ = |L⟩. Inserting O_1 and exploiting the fact that electrons are indistinguishable (each electron gives the same contribution as electron 1) yields

\langle K \,|\, O_1 \,|\, K \rangle = N\, \langle K \,|\, h(x_1) \,|\, K \rangle. \qquad (V.10)

We now play the same game as before. We insert |K〉 and obtain

\langle K \,|\, O_1 \,|\, K \rangle = ((N-1)!)^{-1} \sum_{P_K}^{N!} \sum_{P_L}^{N!} \varepsilon_{P_K} \varepsilon_{P_L} \int P_K\bigl(\chi_1^*(x_1)\chi_2^*(x_2)\dots\chi_N^*(x_N)\bigr)\, h(x_1)\, P_L\bigl(\chi_1(x_1)\chi_2(x_2)\dots\chi_N(x_N)\bigr)\, dx_1 \dots dx_N, \qquad (V.11)


TABLE I: Slater-Condon rules for matrix elements involving a one-electron operator O_1.

Slater determinant: \Psi_K = |K\rangle = \frac{1}{\sqrt{N!}} \sum_{P_K} \varepsilon_{P_K} P_K\, \chi_1(x_1)\chi_2(x_2)\dots\chi_N(x_N) = |\chi_1(1)\chi_2(2)\dots\chi_N(N)|

One-electron operator: O_1 = \sum_{i=1}^{N} h(x_i)

Difference between |K⟩ and |L⟩:

0 SOs, |K⟩ = |L⟩:
\langle K|O_1|K\rangle = \sum_{I=1}^{N} \int \chi_I^*(x_1)\, h(x_1)\, \chi_I(x_1)\, dx_1 = \sum_{I=1}^{N} \langle I|h|I\rangle = \sum_{I=1}^{N} h_{II}

1 SO, |K⟩ = |\dots \chi_I(x_i) \dots|, |L⟩ = |\dots \chi_J(x_j) \dots|:
\langle K|O_1|L\rangle = \Gamma_I^K \Gamma_J^L \int \chi_I^*(x_1)\, h(x_1)\, \chi_J(x_1)\, dx_1 = \Gamma_I^K \Gamma_J^L\, h_{IJ},
with \Gamma_I^K |\chi_I(x_1) \dots| = |\dots \chi_I(x_i) \dots|, i.e., \Gamma_I^K = \pm 1

2 SOs:
\langle K|O_1|L\rangle = 0

which is very similar to eq. (V.6). Again, by the same arguments, the two permutations P_K and P_L must be identical for the electron coordinates x_2 \dots x_N (h(x_1) obviously acts only on x_1). Moreover, if the two permutations are identical for all electron coordinates but one, then this one is automatically fixed as well. We therefore have again P_K = P_L = P. Eq. (V.11) thus simplifies to

\langle K \,|\, O_1 \,|\, K \rangle = ((N-1)!)^{-1} \sum_{P}^{N!} \int P\bigl(\chi_1^*(x_1)\chi_2^*(x_2)\dots\chi_N^*(x_N)\bigr)\, h(x_1)\, P\bigl(\chi_1(x_1)\chi_2(x_2)\dots\chi_N(x_N)\bigr)\, dx_1 \dots dx_N. \qquad (V.12)

And again, the integral in the summation over the permutations factorizes,

\langle K \,|\, O_1 \,|\, K \rangle = ((N-1)!)^{-1} \Bigl( \int \chi_1^*(x_1)\, h(x_1)\, \chi_1(x_1)\, dx_1 \int \chi_2^*(x_2)\chi_2(x_2)\, dx_2 \dots \int \chi_N^*(x_N)\chi_N(x_N)\, dx_N

+ \int \chi_2^*(x_1)\, h(x_1)\, \chi_2(x_1)\, dx_1 \int \chi_1^*(x_2)\chi_1(x_2)\, dx_2 \dots \int \chi_N^*(x_N)\chi_N(x_N)\, dx_N + \dots \Bigr). \qquad (V.13)

In these N! summands corresponding to the individual permutations, each orbital χ_I is once occupied by electron 1 and thus enters the integral involving the operator h(x_1). For each of these occupations there are (N-1)! possibilities to permute electrons 2 to N. Therefore we have

\langle K \,|\, O_1 \,|\, K \rangle = ((N-1)!)^{-1} \sum_{I=1}^{N} (N-1)! \int \chi_I^*(x_1)\, h(x_1)\, \chi_I(x_1)\, dx_1 = \sum_{I=1}^{N} \int \chi_I^*(x_1)\, h(x_1)\, \chi_I(x_1)\, dx_1 = \sum_{I=1}^{N} h_{II}. \qquad (V.14)

With eq. (V.14) we have arrived at a simple expression for this expectation value, involving only integrals over one-electron orbitals.
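The rule (V.14) can be tested numerically against a brute-force evaluation of ⟨K|O_1|K⟩ in a small discrete model (hypothetical sizes M = 4, N = 3; a generic hermitian matrix stands in for h):

```python
import math
import numpy as np
from itertools import product

def slater(orbs):
    """Brute-force amplitudes of a Slater determinant from orthonormal
    discrete orbitals (columns of `orbs`, shape M x N)."""
    M, N = orbs.shape
    Psi = np.zeros((M,) * N)
    for idx in product(range(M), repeat=N):
        Psi[idx] = np.linalg.det(orbs[list(idx), :]) / math.sqrt(math.factorial(N))
    return Psi

rng = np.random.default_rng(4)
M, N = 4, 3
C, _ = np.linalg.qr(rng.standard_normal((M, M)))
orbs = C[:, :N]
Psi = slater(orbs)

h = rng.standard_normal((M, M))
h = 0.5 * (h + h.T)                       # a generic hermitian one-electron operator

# Brute force: apply h to each of the three electron coordinates and contract
O1Psi = (np.einsum('ax,xjk->ajk', h, Psi)
         + np.einsum('bx,ixk->ibk', h, Psi)
         + np.einsum('cx,ijx->ijc', h, Psi))
brute = np.sum(Psi * O1Psi)

# Slater-Condon rule, eq. (V.14): sum of diagonal matrix elements h_II
rule = sum(orbs[:, I] @ h @ orbs[:, I] for I in range(N))
assert np.isclose(brute, rule)
```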

In a similar way one can derive the Slater-Condon rules for matrix elements of O_1 involving different determinants. Since such matrix elements are not needed for the methods discussed in this first course, we won't derive them explicitly here, but just provide the final result in Table I. An excellent discussion and derivation of the Slater-Condon rules can be found, e.g., in the book Modern Quantum Chemistry by Szabo and Ostlund.


B. The Slater-Condon rules for two-electron operators

Next we consider the expectation value of a general two-electron operator O_2. Inserting O_2 and exploiting the fact that electron pairs are indistinguishable (each electron pair gives the same contribution) yields

\langle K \,|\, O_2 \,|\, K \rangle = \frac{N(N-1)}{2}\, \langle K \,|\, g(x_1, x_2) \,|\, K \rangle. \qquad (V.15)

And again, we insert |K〉 and obtain

\langle K \,|\, O_2 \,|\, K \rangle = (2(N-2)!)^{-1} \sum_{P_K}^{N!} \sum_{P_L}^{N!} \varepsilon_{P_K} \varepsilon_{P_L} \int P_K\bigl(\chi_1^*(x_1)\chi_2^*(x_2)\dots\chi_N^*(x_N)\bigr)\, g(x_1, x_2)\, P_L\bigl(\chi_1(x_1)\chi_2(x_2)\dots\chi_N(x_N)\bigr)\, dx_1 \dots dx_N. \qquad (V.16)

Likewise, this expression is, apart from prefactor and operator, identical to eq. (V.11), and we can argue along the same lines: in order to avoid zero factors the two permutations P_K and P_L must be identical for the electron coordinates x_3 \dots x_N (g(x_1, x_2) obviously acts only on x_1 and x_2). Moreover, if the two permutations are identical for all electron coordinates but the first two, then there are just two remaining possibilities to deal with x_1 and x_2: either not to permute them (identity operator), or to permute them (permutation operator P_{12} for coordinates x_1 and x_2). Using this, eq. (V.16) simplifies to

\langle K \,|\, O_2 \,|\, K \rangle = (2(N-2)!)^{-1} \sum_{P}^{N!} \int P\bigl(\chi_1^*(x_1)\chi_2^*(x_2)\dots\chi_N^*(x_N)\bigr)\, g(x_1, x_2)\, (1 - P_{12})\, P\bigl(\chi_1(x_1)\chi_2(x_2)\dots\chi_N(x_N)\bigr)\, dx_1 \dots dx_N. \qquad (V.17)

Note the factor (1 - P_{12}) in eq. (V.17). It acts to the right on the (by P permuted) orbital product P(\chi_1(x_1)\chi_2(x_2)\dots\chi_N(x_N)) and reflects the two possibilities of permuting coordinates x_1 and x_2: 1 for the identity, P_{12} for their permutation (on top of P). P_{12} is multiplied by (-1) since in this case we have an additional transposition on top of P (if P implies an odd number of transpositions (as reflected in \varepsilon_P), then P_{12}P implies an even number, and vice versa).

Here, the integral in the summation over the permutations also factorizes, yet not entirely into one-electron integrals, but

\langle K \,|\, O_2 \,|\, K \rangle = (2(N-2)!)^{-1} \Bigl( \int \chi_1^*(x_1)\chi_2^*(x_2)\, g(x_1, x_2)\, (1 - P_{12})\, \chi_1(x_1)\chi_2(x_2)\, dx_1 dx_2 \int \chi_3^*(x_3)\chi_3(x_3)\, dx_3 \dots \int \chi_N^*(x_N)\chi_N(x_N)\, dx_N

+ \int \chi_2^*(x_1)\chi_3^*(x_2)\, g(x_1, x_2)\, (1 - P_{12})\, \chi_2(x_1)\chi_3(x_2)\, dx_1 dx_2 \int \chi_1^*(x_3)\chi_1(x_3)\, dx_3 \dots \int \chi_N^*(x_N)\chi_N(x_N)\, dx_N + \dots \Bigr). \qquad (V.18)

The above expression now involves two-electron integrals; since the related operator g(x_1, x_2) is a two-electron operator, these integrals do not factorize further. In the N! summands corresponding to the individual permutations, each orbital pair χ_I, χ_J is once occupied by electrons 1 and 2 and thus enters the two-electron integral. For each of these occupations there are (N-2)! possibilities to permute electrons 3 to N. Therefore we have

\langle K \,|\, O_2 \,|\, K \rangle = (2(N-2)!)^{-1} \sum_{I=1}^{N} \sum_{J \ne I}^{N} (N-2)! \int \chi_I^*(x_1)\chi_J^*(x_2)\, g(x_1, x_2)\, (1 - P_{12})\, \chi_I(x_1)\chi_J(x_2)\, dx_1 dx_2

= \frac{1}{2} \sum_{I=1}^{N} \sum_{J \ne I}^{N} \Bigl[ \int \chi_I^*(x_1)\chi_J^*(x_2)\, g(x_1, x_2)\, \chi_I(x_1)\chi_J(x_2)\, dx_1 dx_2 - \int \chi_I^*(x_1)\chi_J^*(x_2)\, g(x_1, x_2)\, \chi_J(x_1)\chi_I(x_2)\, dx_1 dx_2 \Bigr]

= \frac{1}{2} \sum_{I,J=1}^{N} \bigl( \langle IJ \,|\, IJ \rangle - \langle IJ \,|\, JI \rangle \bigr) = \frac{1}{2} \sum_{I,J=1}^{N} \langle IJ \,||\, IJ \rangle, \qquad (V.19)

where we have introduced the physicist's notation for two-electron integrals, ⟨IJ|IJ⟩, and for antisymmetrized two-electron integrals, ⟨IJ||IJ⟩, which are all implicitly defined in eq. (V.19). In the third equality of eq. (V.19) we have used that ⟨II||II⟩ = 0.
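Since g(x_1, x_2) is a multiplier, O_2 is diagonal in the coordinate representation, so the rule (V.19) can likewise be verified by brute force in a small discrete model (hypothetical sizes M = 4, N = 3; a symmetric matrix stands in for the pair potential g):

```python
import math
import numpy as np
from itertools import product

def slater(orbs):
    """Brute-force amplitudes of a Slater determinant from orthonormal
    discrete orbitals (columns of `orbs`, shape M x N)."""
    M, N = orbs.shape
    Psi = np.zeros((M,) * N)
    for idx in product(range(M), repeat=N):
        Psi[idx] = np.linalg.det(orbs[list(idx), :]) / math.sqrt(math.factorial(N))
    return Psi

rng = np.random.default_rng(5)
M, N = 4, 3
C, _ = np.linalg.qr(rng.standard_normal((M, M)))
orbs = C[:, :N]
Psi = slater(orbs)

g = rng.random((M, M))
g = 0.5 * (g + g.T)                        # symmetric multiplicative pair potential

# Brute force: O_2 is a multiplier, so it acts pointwise on |Psi|^2
brute = 0.0
for idx in product(range(M), repeat=N):
    pair_pot = sum(g[idx[a], idx[b]] for a in range(N) for b in range(N) if a != b)
    brute += 0.5 * Psi[idx] ** 2 * pair_pot

# Slater-Condon rule, eq. (V.19): (1/2) sum_IJ <IJ||IJ>
def two_el(I, J, K, L):                    # <IJ|KL> in physicist's notation (real orbitals)
    return np.einsum('p,q,pq,p,q->', orbs[:, I], orbs[:, J], g, orbs[:, K], orbs[:, L])

rule = 0.5 * sum(two_el(I, J, I, J) - two_el(I, J, J, I)
                 for I in range(N) for J in range(N))
assert np.isclose(brute, rule)
```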


TABLE II: Slater-Condon rules for matrix elements involving a two-electron operator O_2.

Slater determinant: \Psi_K = |K\rangle = \frac{1}{\sqrt{N!}} \sum_{P_K} \varepsilon_{P_K} P_K\, \chi_1(x_1)\chi_2(x_2)\dots\chi_N(x_N) = |\chi_1(1)\chi_2(2)\dots\chi_N(N)|

Two-electron operator: O_2 = \frac{1}{2} \sum_{ij}{}' g(x_i, x_j)

Difference between |K⟩ and |L⟩:

0 SOs, |K⟩ = |L⟩:
\langle K|O_2|K\rangle = \frac{1}{2} \sum_{I,J=1}^{N} \int \chi_I^*(x_1)\chi_J^*(x_2)\, g(x_1, x_2)\, (1 - P_{12})\, \chi_I(x_1)\chi_J(x_2)\, dx_1 dx_2
= \frac{1}{2} \sum_{I,J=1}^{N} \bigl( \langle IJ|IJ\rangle - \langle IJ|JI\rangle \bigr) = \frac{1}{2} \sum_{I,J=1}^{N} \langle IJ||IJ\rangle

1 SO, |K⟩ = |\dots \chi_I(i) \dots|, |L⟩ = |\dots \chi_J(j) \dots|:
\langle K|O_2|L\rangle = \Gamma_I^K \Gamma_J^L \sum_{M=1}^{N} \int \chi_I^*(x_1)\chi_M^*(x_2)\, g(x_1, x_2)\, (1 - P_{12})\, \chi_J(x_1)\chi_M(x_2)\, dx_1 dx_2
= \Gamma_I^K \Gamma_J^L \sum_{M=1}^{N} \bigl( \langle IM|JM\rangle - \langle IM|MJ\rangle \bigr) = \Gamma_I^K \Gamma_J^L \sum_{M=1}^{N} \langle IM||JM\rangle

2 SOs, |K⟩ = |\dots \chi_I(i) \dots \chi_J(j) \dots|, |L⟩ = |\dots \chi_M(m) \dots \chi_N(n) \dots|:
\langle K|O_2|L\rangle = \Gamma_I^K \Gamma_J^K \Gamma_M^L \Gamma_N^L \int \chi_I^*(x_1)\chi_J^*(x_2)\, g(x_1, x_2)\, (1 - P_{12})\, \chi_M(x_1)\chi_N(x_2)\, dx_1 dx_2
= \Gamma_I^K \Gamma_J^K \Gamma_M^L \Gamma_N^L \bigl( \langle IJ|MN\rangle - \langle IJ|NM\rangle \bigr) = \Gamma_I^K \Gamma_J^K \Gamma_M^L \Gamma_N^L\, \langle IJ||MN\rangle

3 SOs:
\langle K|O_2|L\rangle = 0

As in the one-electron operator case, the Slater-Condon rules for matrix elements of O_2 involving different determinants can be derived similarly. Since they are also not needed for the present course, we again just provide the final result in Table II.

C. The Slater-Condon rules for density functions and matrices

In an entirely analogous way one can now proceed and derive expressions for the density functions and matricesintroduced in section IV for the case that the wavefunction in eqs. (IV.8) and (IV.13) is represented by a Slaterdeterminant. Let’s first investigate the case of the one-particle density matrix implicitly defined in eq. (IV.13).Inserting the expression for the determinant we get

\rho_1(x_1'; x_1) = ((N-1)!)^{-1} \sum_{P_K}^{N!} \sum_{P_L}^{N!} \varepsilon_{P_K} \varepsilon_{P_L} \int P_K\bigl(\chi_1^*(x_1')\chi_2^*(x_2)\dots\chi_N^*(x_N)\bigr)\, P_L\bigl(\chi_1(x_1)\chi_2(x_2)\dots\chi_N(x_N)\bigr)\, dx_2 \dots dx_N, \qquad (V.20)

which is entirely analogous to eq. (V.11), with the sole difference that the operator h(x_1), which prevented a factor of zero for one-electron integrals over x_1 involving different orbitals, is replaced by the fact that there is no integration over x_1 at all (and therefore also no factor of zero for different orbitals occupied by electron 1). The train of thought is identical to that used in the derivation of the Slater-Condon rules for the one-electron operator case discussed above. So, in entire analogy to eq. (V.14), we get

\rho_1(x'; x) = \sum_{I=1}^{N} \chi_I^*(x')\, \chi_I(x), \qquad (V.21)


and from its diagonal the one-particle density function

\rho_1(x) = \sum_{I=1}^{N} \chi_I^*(x)\, \chi_I(x). \qquad (V.22)

Let’s now turn to the two-particle density function defined in eq. (IV.8). Insertion of the expression for the Slaterdeterminant yields

\rho_2(x_1, x_2) = ((N-2)!)^{-1} \sum_{P_K}^{N!} \sum_{P_L}^{N!} \varepsilon_{P_K} \varepsilon_{P_L} \int P_K\bigl(\chi_1^*(x_1)\chi_2^*(x_2)\dots\chi_N^*(x_N)\bigr)\, P_L\bigl(\chi_1(x_1)\chi_2(x_2)\dots\chi_N(x_N)\bigr)\, dx_3 \dots dx_N, \qquad (V.23)

which in turn is entirely analogous to eq. (V.16), with the sole difference that the operator g(x_1, x_2) (preventing a factor of zero upon integration over x_1 and x_2 with different orbitals occupied by electrons 1 and 2) is replaced by not integrating over these coordinates at all, which has the analogous effect. Hence, in entire analogy to the reasoning leading to eq. (V.19), we arrive at

\rho_2(x_1, x_2) = ((N-2)!)^{-1} \sum_{I=1}^{N} \sum_{J \ne I}^{N} (N-2)!\, \chi_I^*(x_1)\chi_J^*(x_2)\, (1 - P_{12})\, \chi_I(x_1)\chi_J(x_2)

= \sum_{I=1}^{N} \sum_{J \ne I}^{N} \bigl[ \chi_I^*(x_1)\chi_J^*(x_2)\chi_I(x_1)\chi_J(x_2) - \chi_I^*(x_1)\chi_J^*(x_2)\chi_J(x_1)\chi_I(x_2) \bigr]

= \Bigl( \sum_{I=1}^{N} \chi_I^*(x_1)\chi_I(x_1) \Bigr) \Bigl( \sum_{J=1}^{N} \chi_J^*(x_2)\chi_J(x_2) \Bigr) - \Bigl( \sum_{I=1}^{N} \chi_I^*(x_1)\chi_I(x_2) \Bigr) \Bigl( \sum_{J=1}^{N} \chi_J^*(x_2)\chi_J(x_1) \Bigr)

= \rho_1(x_1)\, \rho_1(x_2) - \rho_1(x_1; x_2)\, \rho_1(x_2; x_1). \qquad (V.24)

In the third equality of eq. (V.24) we have utilized the obvious fact that the J = I term is just zero, while in the last equality we made use of the definitions of the one-particle density function (V.22) and matrix (V.21). Eq. (V.24) is an important result with grave consequences: apparently, for a wavefunction ansatz consisting of a single Slater determinant (as in the Hartree-Fock method, vide infra), the two-particle density function factorizes and can be calculated from the one-particle density matrix alone. We will explore this further in the following section.
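The factorization (V.24) can again be checked against a brute-force evaluation of ρ_2 in a small discrete model (hypothetical sizes M = 4, N = 3):

```python
import math
import numpy as np
from itertools import product

def slater(orbs):
    """Brute-force amplitudes of a Slater determinant from orthonormal
    discrete orbitals (columns of `orbs`, shape M x N)."""
    M, N = orbs.shape
    Psi = np.zeros((M,) * N)
    for idx in product(range(M), repeat=N):
        Psi[idx] = np.linalg.det(orbs[list(idx), :]) / math.sqrt(math.factorial(N))
    return Psi

rng = np.random.default_rng(6)
M, N = 4, 3
C, _ = np.linalg.qr(rng.standard_normal((M, M)))
orbs = C[:, :N]
Psi = slater(orbs)

# Brute-force two-particle density function, eq. (IV.8): integrate out x3
rho2_brute = N * (N - 1) * np.einsum('abx,abx->ab', Psi, Psi)

# One-particle density matrix for a single determinant, eq. (V.21)
rho1 = orbs @ orbs.T                      # rho1[x', x] = sum_I chi_I(x') chi_I(x)

# Factorized form, eq. (V.24): rho1(x1) rho1(x2) - rho1(x1; x2) rho1(x2; x1)
rho2_factored = np.outer(np.diag(rho1), np.diag(rho1)) - rho1 * rho1.T

assert np.allclose(rho2_brute, rho2_factored)
assert np.isclose(rho2_brute.sum(), N * (N - 1))   # pair normalization
```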

VI. THE FERMI – AND THE COULOMB HOLE

Let us now have a look at the spinless density matrix and density functions as introduced in eqs. (IV.7) and (IV.9), for the case of a single Slater determinant ansatz for the wavefunction. The corresponding one-particle density matrix is given by eq. (V.21). Obviously, when inserting the product ansatz for spin orbitals, eq. (II.19), it can be partitioned, according to the spin part, into two different contributions as

\rho_1(x'; x) = P_1^{\alpha}(r'; r)\,\alpha^*(s')\alpha(s) + P_1^{\beta}(r'; r)\,\beta^*(s')\beta(s),

\quad\text{where}\quad P_1^{\alpha}(r'; r) = \sum_{I=1}^{N_\alpha} \phi_I^*(r')\,\phi_I(r); \quad \forall I:\ (\eta_I^*(s') \equiv \alpha^*(s')) \wedge (\eta_I(s) \equiv \alpha(s)). \qquad (VI.1)

The spinless one-particle density matrix is obtained by integrating out spin, i.e.,

P_1(r'; r) = \int_{s'=s} \rho_1(x'; x)\, ds = P_1^{\alpha}(r'; r) + P_1^{\beta}(r'; r), \qquad (VI.2)

where we have used the orthonormality of the one-electron spin functions, eq. (II.26). The spinless one-particledensity function is the diagonal of the density matrix and can be written as

P_1(r) = P_1(r; r) = P_1^{\alpha}(r) + P_1^{\beta}(r). \qquad (VI.3)


This tells us that the total (spinless) density, which can be measured e.g. by X-ray diffraction, is the sum of the densities corresponding to electrons with α and β spin, respectively, which is not a surprising result. For a closed-shell system with P_1^{\alpha}(r) = P_1^{\beta}(r) we have

P_1(r) = 2 P_1^{\alpha}(r) = 2 \sum_{I=1}^{N_{occ}} \phi_I^*(r)\, \phi_I(r). \qquad (VI.4)

It is also appropriate to define at this point the spin-density function, which is often utilized to characterize open-shellsystems,

Q_1(r) = P_1^{\alpha}(r) - P_1^{\beta}(r). \qquad (VI.5)

It reflects the excess of the α over the β density and represents the density of spin angular momentum about thez-axis. It is of course zero for closed-shell systems.

Now let's do the same exercise for the two-particle density function given in eq. (V.24). Integrating out the two spin coordinates yields

P_2(r_1, r_2) = \int \rho_1(x_1)\rho_1(x_2)\, ds_1 ds_2 - \int \rho_1(x_1; x_2)\rho_1(x_2; x_1)\, ds_1 ds_2. \qquad (VI.6)

The first integral factorizes, and by utilizing eq. (VI.3) we get

\int \rho_1(x_1)\rho_1(x_2)\, ds_1 ds_2 = \int \rho_1(x_1)\, ds_1 \int \rho_1(x_2)\, ds_2 = \bigl( P_1^{\alpha}(r_1) + P_1^{\beta}(r_1) \bigr) \bigl( P_1^{\alpha}(r_2) + P_1^{\beta}(r_2) \bigr). \qquad (VI.7)

The second integral is a little bit more intricate. Employing eq. (VI.1) yields

\int \rho_1(x_1; x_2)\rho_1(x_2; x_1)\, ds_1 ds_2 = \int \bigl( P_1^{\alpha}(r_1; r_2)\,\alpha^*(s_1)\alpha(s_2) + P_1^{\beta}(r_1; r_2)\,\beta^*(s_1)\beta(s_2) \bigr) \bigl( P_1^{\alpha}(r_2; r_1)\,\alpha^*(s_2)\alpha(s_1) + P_1^{\beta}(r_2; r_1)\,\beta^*(s_2)\beta(s_1) \bigr)\, ds_1 ds_2. \qquad (VI.8)

Due to the orthonormality of the one-electron spin functions this simplifies to

\int \rho_1(x_1; x_2)\rho_1(x_2; x_1)\, ds_1 ds_2 = P_1^{\alpha}(r_1; r_2)\, P_1^{\alpha}(r_2; r_1) + P_1^{\beta}(r_1; r_2)\, P_1^{\beta}(r_2; r_1). \qquad (VI.9)

The total spinless two-particle density function hence takes the form

P_2(r_1, r_2) = P_1^{\alpha}(r_1)\, P_1^{\alpha}(r_2) - P_1^{\alpha}(r_1; r_2)\, P_1^{\alpha}(r_2; r_1)

+ P_1^{\beta}(r_1)\, P_1^{\beta}(r_2) - P_1^{\beta}(r_1; r_2)\, P_1^{\beta}(r_2; r_1)

+ P_1^{\alpha}(r_1)\, P_1^{\beta}(r_2) + P_1^{\beta}(r_1)\, P_1^{\alpha}(r_2). \qquad (VI.10)

This is a very important result with severe implications. Let's first have a look at electron pairs with electrons of the same spin, i.e., αα or ββ electron pairs. Their contribution to P_2(r_1, r_2) is given in the first two lines of eq. (VI.10). Evidently, when the two electrons coalesce (r_1 = r_2), these contributions are zero (recall that the diagonal of the density matrix is the density function). Therefore, according to the meaning of P_2(r_1, r_2) (the probability of finding an electron at r_1, and simultaneously another at r_2), the probability density of finding two electrons with the same spin at the same place is zero. This is reasonable, since such electrons would violate the Pauli principle, which is, as we have learned, a consequence of the antisymmetry of the wavefunction. So here it pays off to work with Slater determinants: the antisymmetry is automatically built into the wavefunction and all its consequences are automatically taken care of. P_2(r_1, r_2) exhibits a depletion for r_1 → r_2 and goes to zero for r_1 = r_2. This is the so-called Fermi hole in P_2(r_1, r_2).
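The Fermi hole is easy to visualize numerically. In the following sketch (two hypothetical orthonormal 1D orbitals on a grid, Gaussian-shaped for illustration), the same-spin part of eq. (VI.10) vanishes identically on the diagonal r_1 = r_2:

```python
import numpy as np

# Two same-spin electrons in two orthonormal 1D orbitals (hypothetical shapes)
x = np.linspace(-5, 5, 201)
g0 = np.exp(-0.5 * x ** 2)
g1 = x * np.exp(-0.5 * x ** 2)
phi = np.stack([g0 / np.linalg.norm(g0), g1 / np.linalg.norm(g1)], axis=1)

P1 = phi @ phi.T                                     # spinless density matrix P1(r'; r)
P2 = np.outer(np.diag(P1), np.diag(P1)) - P1 * P1.T  # same-spin part of eq. (VI.10)

assert np.allclose(np.diag(P2), 0.0)   # the Fermi hole: P2(r, r) = 0 for same spin
assert np.all(P2 >= -1e-12)            # the pair density is non-negative (Cauchy-Schwarz)
```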

But what about αβ (or βα) pairs? As we can see from the last line of eq. (VI.10), there is no effect or depletion when two electrons of opposite spin coalesce. The contribution of such pairs to P_2(r_1, r_2) is simply the product of the corresponding one-particle densities and therefore entirely independent of the relative position of the two electrons. In other words, the two electrons are un-correlated. Is this reasonable? Obviously not: even though there is no violation of the Pauli principle here, the two electrons, in spite of their opposite spin, avoid each other due to


their mutual Coulomb repulsion. In reality, P_2(r_1, r_2) also exhibits a depletion for r_1 → r_2 for pairs of electrons with opposite spin. This is the so-called Coulomb hole in P_2(r_1, r_2). Apparently, it is not described at all by a wavefunction ansatz based on a single Slater determinant. So we can conclude that any single determinant ansatz, including Hartree-Fock theory, has from the outset the handicap of not describing the Coulomb hole, or Coulomb correlation. A possible route to describe the Coulomb hole are electron correlation methods based on multi-determinant ansätze. The simplest one is Configuration Interaction (CI), which employs a linear combination of Slater determinants. Electron correlation methods are the topic of a later course.

VII. THE HARTREE-FOCK METHOD

In the following we will now discuss the Hartree-Fock (HF) method, which is the basic ab initio electronic structure method and the platform for more sophisticated methods like configuration interaction, perturbation theory, or coupled cluster methods. The HF method is specified by the following two statements: (i) it is a single determinant ansatz; (ii) the orbitals forming the determinant are chosen such that the energy expectation value E = ⟨Ψ|H_{el}|Ψ⟩ (which is a functional of the orbitals) constitutes a minimum. In other words, the energy expectation value is stationary w.r. to linear variations of the orbitals and, according to the variational principle, bounded from below. The HF wavefunction can be considered as the variationally best Slater determinant, yielding the lowest possible energy expectation value E.

According to what we have learned in sections V A and V B we are now in a position to write E in terms of orbitalsas

E = \langle \Psi | H_{el} | \Psi \rangle = \sum_{I=1}^{N} \int \chi_I^*(x_1)\, h(x_1)\, \chi_I(x_1)\, dx_1 + \frac{1}{2} \sum_{I,J=1}^{N} \int \chi_I^*(x_1)\chi_J^*(x_2)\, g(x_1, x_2)\, (1 - P_{12})\, \chi_I(x_1)\chi_J(x_2)\, dx_1 dx_2

= \sum_{I=1}^{N} \langle \chi_I | h | \chi_I \rangle + \frac{1}{2} \sum_{I,J=1}^{N} \langle \chi_I \chi_J || \chi_I \chi_J \rangle. \qquad (VII.1)

In the last equality of eq. (VII.1), employing the Dirac bra-ket notation for the integrals, we have explicitly written the orbital symbols, rather than just the orbital indices, for didactical reasons.

A. The stationary conditions

Now, according to the stationary conditions specifying the HF determinant, the first-order variation of E has to vanish for linear variations of the orbitals χ_K → χ_K' = χ_K + δχ_K, where δχ_K is an infinitesimal change of orbital χ_K. So we have

0 \overset{!}{=} \delta E^{(K)} = E^{(K)\prime} - E = \langle \delta\chi_K | h | \chi_K \rangle + \frac{1}{2} \sum_{J=1}^{N} \langle \delta\chi_K \chi_J || \chi_K \chi_J \rangle + \frac{1}{2} \sum_{I=1}^{N} \langle \chi_I \delta\chi_K || \chi_I \chi_K \rangle + c.c.

= \langle \delta\chi_K | h | \chi_K \rangle + \sum_{I=1}^{N} \langle \delta\chi_K \chi_I || \chi_K \chi_I \rangle + c.c., \qquad (VII.2)

where c.c. stands for the corresponding complex conjugate terms, with the variations on the ket rather than the bra side. In the last equality we have renamed the summation index from J to I and exploited the fact that two-electron integrals are invariant w.r. to swapping the two electron coordinates. Eq. (VII.2) specifies a set of N equations, one for each orbital χ_K. However, up to this point we have overlooked a subtlety: when deriving the Slater-Condon rules in sections V A and V B, which are used here to express the energy expectation value E according to eq. (VII.1), we assumed that the orbitals used to construct the Slater determinant form an orthonormal basis. Since varying the orbitals may compromise the orthonormality, we have to guarantee it via constraints. The mathematical device for optimization under constraints is the method of Lagrange: instead of the target functional, a Lagrange functional is optimized, to which the different constraints, each times a multiplier, are added. Thus instead of E we optimize


the Lagrangian

\begin{equation}
L = E - \sum_{I,J=1}^{N} \varepsilon_{JI}\left(\langle\chi_I|\chi_J\rangle - \delta_{IJ}\right), \tag{VII.3}
\end{equation}

where the εJI are the Lagrange multipliers, which have to be determined. Note that E depends on N orbitals, while L depends on N orbitals and N(N + 1)/2 independent multipliers. The stationary conditions for linear variations of the orbitals χK, eq. (VII.2), are now replaced by

\begin{align}
0 \overset{!}{=} \delta L^{(K)} &= \delta E^{(K)} - \sum_{I=1}^{N}\varepsilon_{IK}\langle\delta\chi_K|\chi_I\rangle + \mathrm{c.c.} \nonumber\\
&= \langle\delta\chi_K|h|\chi_K\rangle + \sum_{I=1}^{N}\langle\delta\chi_K\chi_I||\chi_K\chi_I\rangle - \sum_{I=1}^{N}\varepsilon_{IK}\langle\delta\chi_K|\chi_I\rangle + \mathrm{c.c.} \tag{VII.4}
\end{align}

Since the Lagrangian also has to be stationary w.r. to linear variations in the multipliers εLK, we have in addition the set of equations

\begin{equation}
0 \overset{!}{=} \delta L^{(LK)} = \delta\varepsilon_{LK}\left(\langle\chi_K|\chi_L\rangle - \delta_{KL}\right), \tag{VII.5}
\end{equation}

which have to be fulfilled for arbitrary variations δεLK. This simply yields the orthonormality constraints of the orbitals (as is always the case for the method of Lagrange, where the additional terms in the Lagrangian containing the constraints are linear in the multipliers).
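The same mechanism in miniature: minimizing a quadratic form x†Hx under the normalization constraint x†x = 1 via the Lagrangian L = x†Hx − λ(x†x − 1) leads to the eigenvalue problem Hx = λx, so the constrained minimum is the lowest eigenvalue. A small numerical check of this fact (a random Hermitian H serves as an illustrative stand-in):

```python
import numpy as np

rng = np.random.default_rng(1)
H = rng.standard_normal((6, 6)); H = 0.5 * (H + H.T)   # Hermitian "Hamiltonian"

# Stationary points of L = x^T H x - lam (x^T x - 1) satisfy H x = lam x,
# so the constrained minimum of x^T H x is the smallest eigenvalue:
lam_min = np.linalg.eigvalsh(H)[0]

# Rayleigh quotients of random normalized trial vectors never undercut lam_min
for _ in range(1000):
    x = rng.standard_normal(6); x /= np.linalg.norm(x)
    assert x @ H @ x >= lam_min - 1e-12
```

This is exactly the structure that reappears below: the constrained orbital optimization turns into the eigenvalue problem (VII.25).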

Before we proceed and explore the HF equations (VII.4) further, we first want to introduce the Coulomb and exchange operators, which are very expedient for casting the HF equations in a more concise form.

B. The Coulomb and exchange operators

The Coulomb operator JI(x) related to orbital χI is defined by its action on a general orbital χ as

\begin{equation}
J_I(\mathbf{x})\chi(\mathbf{x}) = \left[\int g(\mathbf{x},\mathbf{x}_2)\,\chi_I^*(\mathbf{x}_2)\chi_I(\mathbf{x}_2)\, d\mathbf{x}_2\right]\chi(\mathbf{x}). \tag{VII.6}
\end{equation}

Obviously, JI(x) is a one-electron operator, since it acts only on orbitals depending on the electron coordinate x (the second electron coordinate, which g(x,x2) depends on, is integrated out). The action of JI(x) on χ(x) is to multiply χ(x) by the Coulomb potential caused by the charge distribution |χI|². Furthermore, JI(x) is a local operator, implying that in order to calculate JI(x)χ(x) for a certain x we do not have to know χ(x) over its whole domain.

The exchange operator KI(x) related to orbital χI, on the other hand, is defined by its action on a general orbital χ as

\begin{equation}
K_I(\mathbf{x})\chi(\mathbf{x}) = \left[\int g(\mathbf{x},\mathbf{x}_2)\,\chi_I^*(\mathbf{x}_2)\chi(\mathbf{x}_2)\, d\mathbf{x}_2\right]\chi_I(\mathbf{x}). \tag{VII.7}
\end{equation}

As JI(x), KI(x) is a one-electron operator. Yet, in contrast to the previous equation, the electron coordinates x and x2 are now swapped for the ket functions. Consequently, χ, i.e., the function KI(x) acts on, is now part of the integrand. Hence, KI(x) is a non-local operator, implying that in order to calculate KI(x)χ(x) for a certain x we indeed have to know χ(x) over its whole domain (since χ is part of the integrand).

Let us now define the total Coulomb and exchange operators, related to all occupied orbitals, i.e.,

\begin{align}
J(\mathbf{x})\chi(\mathbf{x}) &= \sum_{I=1}^{N} J_I(\mathbf{x})\chi(\mathbf{x}) = \left[\int g(\mathbf{x},\mathbf{x}_2)\sum_{I=1}^{N}\chi_I^*(\mathbf{x}_2)\chi_I(\mathbf{x}_2)\, d\mathbf{x}_2\right]\chi(\mathbf{x}) = \left[\int g(\mathbf{x},\mathbf{x}_2)\,\rho_1(\mathbf{x}_2)\, d\mathbf{x}_2\right]\chi(\mathbf{x}), \tag{VII.8}\\
K(\mathbf{x})\chi(\mathbf{x}) &= \sum_{I=1}^{N} K_I(\mathbf{x})\chi(\mathbf{x}) = \sum_{I=1}^{N}\left[\int g(\mathbf{x},\mathbf{x}_2)\,\chi_I^*(\mathbf{x}_2)\chi(\mathbf{x}_2)\, d\mathbf{x}_2\right]\chi_I(\mathbf{x}) = \int g(\mathbf{x},\mathbf{x}_2)\,\rho_1(\mathbf{x}_2;\mathbf{x})\,\chi(\mathbf{x}_2)\, d\mathbf{x}_2, \tag{VII.9}
\end{align}


where we have used eqs. (V.22) and (V.21), respectively. Note that for the local Coulomb operator the one-particle density function appears in the integrand of eq. (VII.8), while for the non-local exchange operator we have the one-particle density matrix instead.
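The locality distinction can be made concrete on a grid. The sketch below discretizes eqs. (VII.8) and (VII.9) in one dimension with a softened kernel g(x, x2) = 1/(|x − x2| + 1), an illustrative stand-in for the Coulomb kernel (no real physical system is implied): J becomes a pointwise multiplicative potential built from the density ρ1(x2), while K needs the full density matrix ρ1(x2; x) and the values of χ everywhere.

```python
import numpy as np

n, L = 200, 10.0
x = np.linspace(-L/2, L/2, n)
w = x[1] - x[0]                                    # quadrature weight
g = 1.0 / (np.abs(x[:, None] - x[None, :]) + 1.0)  # softened stand-in kernel g(x, x2)

rng = np.random.default_rng(0)
occ = np.linalg.qr(rng.standard_normal((n, 3)))[0] / np.sqrt(w)  # 3 orthonormal "orbitals"
chi = rng.standard_normal(n)                       # a general orbital chi(x)

rho  = (occ**2).sum(axis=1)                        # density rho_1(x2), eq. (VII.8)
rho1 = occ @ occ.T                                 # density matrix rho_1(x2; x) on the grid

J_chi = (g @ rho) * w * chi                        # local: chi multiplied pointwise
K_chi = ((g * rho1) @ chi) * w                     # non-local: chi sits in the integrand
```

Note that J_chi at a point needs only chi at that point, whereas K_chi at every point involves all values of chi, which is precisely the local/non-local distinction made in the text.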

Using the definitions of the Coulomb and exchange operators in eqs. (VII.8) and (VII.9) we can now re-write

\begin{align}
\sum_{I=1}^{N}\langle\chi_J\chi_I||\chi_K\chi_I\rangle &= \sum_{I=1}^{N}\left(\int \chi_J^*(\mathbf{x}_1)\chi_I^*(\mathbf{x}_2)\, g(\mathbf{x}_1,\mathbf{x}_2)\,\chi_K(\mathbf{x}_1)\chi_I(\mathbf{x}_2)\, d\mathbf{x}_1 d\mathbf{x}_2 - \int \chi_J^*(\mathbf{x}_1)\chi_I^*(\mathbf{x}_2)\, g(\mathbf{x}_1,\mathbf{x}_2)\,\chi_I(\mathbf{x}_1)\chi_K(\mathbf{x}_2)\, d\mathbf{x}_1 d\mathbf{x}_2\right) \nonumber\\
&= \int \chi_J^*(\mathbf{x}_1)\, J(\mathbf{x}_1)\,\chi_K(\mathbf{x}_1)\, d\mathbf{x}_1 - \int \chi_J^*(\mathbf{x}_1)\, K(\mathbf{x}_1)\,\chi_K(\mathbf{x}_1)\, d\mathbf{x}_1 = \int \chi_J^*(\mathbf{x})\left(J(\mathbf{x}) - K(\mathbf{x})\right)\chi_K(\mathbf{x})\, d\mathbf{x} \nonumber\\
&= \langle\chi_J|J - K|\chi_K\rangle = \langle\chi_J|G|\chi_K\rangle, \tag{VII.10}
\end{align}

where we have introduced in the last equality the new one-electron operator,

\begin{equation}
G(\mathbf{x}) = J(\mathbf{x}) - K(\mathbf{x}), \tag{VII.11}
\end{equation}

which is simply the difference between the Coulomb and exchange operators. Using this we can now re-write the stationary conditions for the orbitals (VII.4) in a more compact form as

\begin{equation}
0 \overset{!}{=} \delta L^{(K)} = \langle\delta\chi_K|h + G|\chi_K\rangle - \sum_{I=1}^{N}\varepsilon_{IK}\langle\delta\chi_K|\chi_I\rangle + \mathrm{c.c.} = \langle\delta\chi_K|F|\chi_K\rangle - \sum_{I=1}^{N}\varepsilon_{IK}\langle\delta\chi_K|\chi_I\rangle + \mathrm{c.c.}, \tag{VII.12}
\end{equation}

where in the last equality we have introduced the one-electron Fock operator

\begin{equation}
F(\mathbf{x}) = h(\mathbf{x}) + G(\mathbf{x}) = h(\mathbf{x}) + J(\mathbf{x}) - K(\mathbf{x}). \tag{VII.13}
\end{equation}

Similarly, we can re-write the expression of the energy expectation value (VII.1) as

\begin{equation}
E = \langle\Psi|H_{el}|\Psi\rangle = \sum_{I=1}^{N}\langle\chi_I|h + \tfrac{1}{2}G|\chi_I\rangle = \frac{1}{2}\sum_{I=1}^{N}\langle\chi_I|F + h|\chi_I\rangle. \tag{VII.14}
\end{equation}

C. The Hartree-Fock equations

After introducing these new operators we now proceed and explore the stationary conditions for the orbitals (VII.12) further. To do this we write them out completely, i.e., including the complex conjugate terms, yielding

\begin{align}
0 \overset{!}{=}\ &\langle\delta\chi_K|F|\chi_K\rangle - \sum_{I=1}^{N}\varepsilon_{IK}\langle\delta\chi_K|\chi_I\rangle + \langle\chi_K|F|\delta\chi_K\rangle - \sum_{I=1}^{N}\varepsilon_{KI}\langle\chi_I|\delta\chi_K\rangle \nonumber\\
=\ &\langle\delta\chi_K|\left(F|\chi_K\rangle - \sum_{I=1}^{N}\varepsilon_{IK}|\chi_I\rangle\right) + \left(\langle\chi_K|F - \sum_{I=1}^{N}\varepsilon_{KI}\langle\chi_I|\right)|\delta\chi_K\rangle. \tag{VII.15}
\end{align}

Note that the equation has to be fulfilled for independent bra and ket variations for all orbitals χK. Hence we have two sets of N equations,

\begin{align}
\langle\delta\chi_K|\left(F|\chi_K\rangle - \sum_{I=1}^{N}\varepsilon_{IK}|\chi_I\rangle\right) \overset{!}{=} 0 \quad &\Rightarrow\quad F|\chi_K\rangle - \sum_{I=1}^{N}\varepsilon_{IK}|\chi_I\rangle \overset{!}{=} 0, \quad \forall K, \quad\text{and} \tag{VII.16}\\
\left(\langle\chi_K|F - \sum_{I=1}^{N}\varepsilon_{KI}\langle\chi_I|\right)|\delta\chi_K\rangle \overset{!}{=} 0 \quad &\Rightarrow\quad \langle\chi_K|F - \sum_{I=1}^{N}\varepsilon_{KI}\langle\chi_I| \overset{!}{=} 0, \quad \forall K. \tag{VII.17}
\end{align}


Let us now complex conjugate eq. (VII.17) and subtract the result from eq. (VII.16). This yields

\begin{equation}
0 = \langle\delta\chi_K|F|\chi_K\rangle - \langle\chi_K|F|\delta\chi_K\rangle^* - \sum_{I=1}^{N}\left(\varepsilon_{IK}\langle\delta\chi_K|\chi_I\rangle - \varepsilon_{KI}^*\langle\chi_I|\delta\chi_K\rangle^*\right) = \sum_{I=1}^{N}\left(\varepsilon_{KI}^* - \varepsilon_{IK}\right)\langle\delta\chi_K|\chi_I\rangle \quad\Rightarrow\quad \varepsilon_{KI}^* = \varepsilon_{IK}, \tag{VII.18}
\end{equation}

where we have used 〈χK|F|δχK〉∗ = 〈δχK|F|χK〉 (F(x) is Hermitian) and 〈χI|δχK〉∗ = 〈δχK|χI〉. This shows that the matrix of the Lagrange multipliers, ε, is Hermitian, or, if real valued, symmetric. This is not surprising! We have already anticipated that not all N², but only N(N + 1)/2 multipliers are independent, since if orbital χK is orthogonal to χI, then χI is also orthogonal to χK. Moreover, it now follows that eq. (VII.17) is just the complex conjugate of eq. (VII.16). Thus, for real valued orbitals (which we usually employ) both are equivalent, and we just have to consider, say, eq. (VII.16). We now want to have a closer look at this Hartree-Fock equation,

\begin{equation}
F|\chi_K\rangle - \sum_{I=1}^{N}\varepsilon_{IK}|\chi_I\rangle = 0, \quad \forall K. \tag{VII.19}
\end{equation}

Solving this equation yields the proper orbitals for constructing the Slater determinant, which in turn has the lowest energy expectation value E = 〈Ψ|Hel|Ψ〉. From its structure, the Hartree-Fock equation (VII.19) almost looks like an eigenvalue problem; if the matrix of the Lagrange multipliers, ε, were diagonal, (VII.19) would indeed be an eigenvalue problem. To see this more clearly we expand the orbitals χK(x) in an orthonormal basis ϕ1(x), ..., ϕP(x), ..., ϕM(x) as

\begin{equation}
|\chi_K\rangle = \sum_{P=1}^{M}|\varphi_P\rangle C_{PK}, \quad\text{with}\quad \langle\varphi_P|\varphi_Q\rangle = \delta_{PQ}, \quad\text{and thus}\quad C_{PK} = \langle\varphi_P|\chi_K\rangle. \tag{VII.20}
\end{equation}

Provided that the basis ϕ1(x), ..., ϕP(x), ..., ϕM(x) is complete, such an expansion is exact and can be made without loss of generality. Inserting this expansion for χK(x) and χI(x) into eq. (VII.19), multiplying with any ϕ∗P(x), and integrating over x then yields

\begin{align}
&\sum_{Q=1}^{M}\langle\varphi_P|F|\varphi_Q\rangle C_{QK} - \sum_{I=1}^{N}\varepsilon_{IK}\sum_{Q=1}^{M}\langle\varphi_P|\varphi_Q\rangle C_{QI} = 0, \quad\forall K, \nonumber\\
&\Rightarrow\quad \sum_{Q=1}^{M} F_{PQ}\, C_{QK} = \sum_{I=1}^{N}\varepsilon_{IK}\, C_{PI}, \quad\forall K, \quad\Leftrightarrow\quad \mathbf{F}\mathbf{C} = \mathbf{C}\boldsymbol{\varepsilon}, \tag{VII.21}
\end{align}

where we have utilized the orthonormality (VII.20) of the basis ϕ1(x), ..., ϕM(x). FPQ = 〈ϕP|F|ϕQ〉 is the Fock matrix, i.e., the representation of the Fock operator F(x) in the basis ϕ1(x), ..., ϕM(x). On going from eq. (VII.19) to eq. (VII.21) we change from the operator to the matrix form of the HF equations; both are equivalent, provided that the basis ϕ1(x), ..., ϕM(x) is complete. In matrix form the HF orbitals χK are found in terms of the elements of the coefficient matrix C. It is now crystal clear that for a diagonal multiplier matrix ε eq. (VII.21) is a simple linear algebra eigenvalue problem (assuming that M is finite). Actually, we can exploit the fact that a Slater determinant (as any determinant) is invariant w.r. to rotations among its orbitals (cf. discussion in section III). We can therefore choose a unitarily rotated set of HF orbitals χ̄K, represented in the basis ϕ1(x), ..., ϕM(x) by the coefficient matrix C̄, with

\begin{equation}
\bar{\mathbf{C}} = \mathbf{C}\mathbf{U}, \quad\text{with}\quad \mathbf{U}\mathbf{U}^\dagger = \mathbf{U}^\dagger\mathbf{U} = \mathbf{1} \quad\Rightarrow\quad \mathbf{C} = \bar{\mathbf{C}}\mathbf{U}^\dagger. \tag{VII.22}
\end{equation}

The unitary rotation matrix U is chosen such that it diagonalizes the multiplier matrix ε, i.e.,

\begin{equation}
\mathbf{U}^\dagger\boldsymbol{\varepsilon}\mathbf{U} = \bar{\boldsymbol{\varepsilon}}, \quad\text{where}\quad \bar{\varepsilon}_{IK} = \varepsilon_K\,\delta_{IK}. \tag{VII.23}
\end{equation}

Since ε is Hermitian (cf. eq. (VII.18)), it is guaranteed that such a U exists. Combining eqs. (VII.21), (VII.22), and (VII.23) leads to

\begin{equation}
\mathbf{F}\bar{\mathbf{C}}\mathbf{U}^\dagger = \bar{\mathbf{C}}\mathbf{U}^\dagger\boldsymbol{\varepsilon} \quad\Rightarrow\quad \mathbf{F}\bar{\mathbf{C}} = \bar{\mathbf{C}}\bar{\boldsymbol{\varepsilon}}, \quad\text{or}\quad \sum_{Q=1}^{M} F_{PQ}\,\bar{C}_{QK} = \varepsilon_K\,\bar{C}_{PK}, \quad\forall K, \tag{VII.24}
\end{equation}


which now evidently is a simple linear algebra eigenvalue problem. Eq. (VII.24) represents the HF equations for the canonical orbitals χ̄K in matrix form and corresponds to the HF equations in operator form,

\begin{equation}
F|\bar{\chi}_K\rangle = \varepsilon_K|\bar{\chi}_K\rangle, \quad\forall K, \quad\text{with}\quad |\bar{\chi}_K\rangle = \sum_{P=1}^{M}|\varphi_P\rangle\,\bar{C}_{PK}. \tag{VII.25}
\end{equation}

Since F(x) (and F) is Hermitian, its eigenvectors χ̄K(x) form an orthonormal basis. Therefore, multiplying eq. (VII.25) with χ̄∗I(x) and integrating over x yields

\begin{equation}
\langle\bar{\chi}_I|F|\bar{\chi}_K\rangle = \bar{F}_{IK} = \varepsilon_K\,\delta_{IK}, \tag{VII.26}
\end{equation}

which shows that the matrix representation of the Fock operator in the canonical orbital basis is diagonal and corresponds to the diagonalized matrix of the Lagrange multipliers. The latter thus have the physical meaning of orbital energies of the related canonical orbitals. Among the infinite manifold of all possible HF orbital sets yielding the same HF determinant with the same energy expectation value, the canonical orbitals are special in the sense that they are the only ones with a well defined orbital energy. We will come back to that when discussing Koopmans' theorem in the next section.
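The existence of the unitary U in eq. (VII.23) is just the spectral theorem for Hermitian matrices. A quick numerical illustration, with a random Hermitian ε as a stand-in for the multiplier matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
eps = 0.5 * (A + A.conj().T)            # Hermitian multiplier matrix (stand-in)

eps_K, U = np.linalg.eigh(eps)          # columns of U = eigenvectors, eps_K real
assert np.allclose(U.conj().T @ U, np.eye(5))             # U is unitary
assert np.allclose(U.conj().T @ eps @ U, np.diag(eps_K))  # U^dagger eps U diagonal
```

Hermiticity of ε is what guarantees both the existence of this U and the realness of the resulting "orbital energies" on the diagonal.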

Eq. (VII.25) (or equivalently eq. (VII.24)) is an independent particle eigenvalue equation, with the Fock operator being an effective one-electron Hamiltonian describing a single electron moving in the field of the nuclei (contained in h(x)) and the mean Coulomb-exchange field caused by the other electrons (contained in G(x) = J(x) − K(x)). Note that F(x) itself depends via G(x) on the HF orbitals (cf. eqs. (VII.8) and (VII.9)). Therefore, the HF eigenvalue problem has to be solved repeatedly until self-consistency is reached (i.e., the orbitals obtained from the diagonalization of the last Fock operator are virtually the same as those of the previous iteration). For that reason HF theory is also known as Self Consistent Field (SCF) theory.

The total N -electron (0th-order) independent particle model Hamiltonian is defined as

\begin{equation}
H^{(0)}(\mathbf{x}_1,\ldots,\mathbf{x}_N) = \sum_{i=1}^{N} F(\mathbf{x}_i) \tag{VII.27}
\end{equation}

(cf. eq. (II.6) for comparison). It is left as an exercise to show that the HF Slater determinant |Ψ〉 is an eigenfunction of H(0),

\begin{equation}
H^{(0)}(\mathbf{x}_1,\ldots,\mathbf{x}_N)\,\Psi(\mathbf{x}_1,\ldots,\mathbf{x}_N) = \sum_{I=1}^{N}\varepsilon_I\,\Psi(\mathbf{x}_1,\ldots,\mathbf{x}_N) = E^{(0)}\Psi(\mathbf{x}_1,\ldots,\mathbf{x}_N), \tag{VII.28}
\end{equation}

with E(0) being the sum of the canonical orbital energies. H(0) is the starting point for Møller-Plesset (MP) perturbation theory. MP perturbation theory is one way to deal with the missing Coulomb hole (cf. discussion in section VI), i.e., electron correlation, and will be discussed in a later course. Nevertheless, note that the expectation value of H(0),

\begin{equation}
\langle\Psi|H^{(0)}|\Psi\rangle = \sum_{I=1}^{N}\langle\chi_I|F|\chi_I\rangle = E^{(0)}, \tag{VII.29}
\end{equation}

is different from the HF energy

\begin{equation}
E = \langle\Psi|H_{el}|\Psi\rangle = \frac{1}{2}\sum_{I=1}^{N}\langle\chi_I|F + h|\chi_I\rangle = \frac{1}{2}\left(E^{(0)} + \sum_{I=1}^{N} h_{II}\right). \tag{VII.30}
\end{equation}

To derive eqs. (VII.29) and (VII.30) we have utilized Table I and eqs. (VII.27), (VII.14), and (VII.26). The difference between E and E(0) amounts to

\begin{equation}
E^{(1)} = E - E^{(0)} = -\frac{1}{2}E^{(0)} + \frac{1}{2}\sum_{I=1}^{N} h_{II} = -\frac{1}{2}\sum_{I,J=1}^{N}\langle IJ||IJ\rangle \tag{VII.31}
\end{equation}


and corresponds to the first order correction in MP perturbation theory. E(1) corrects for the implicit double counting of electron repulsion in

\begin{equation}
E^{(0)} = \sum_{I=1}^{N}\langle\chi_I|F|\chi_I\rangle = \sum_{I=1}^{N} h_{II} + \sum_{I,J=1}^{N}\langle IJ||IJ\rangle, \tag{VII.32}
\end{equation}

where the proper factor of 1/2 in front of the second summation is missing when just adding up the canonical orbital energies (compare the above equation with eq. (VII.1)). Therefore, the lowest order correction due to electron correlation effects is of second order in MP perturbation theory.

D. Koopmans’ theorem

Let us now explore further the orbital energies εK related to the canonical orbitals χK(x). According to eq. (VII.26) we have

\begin{equation}
\varepsilon_K = \langle\chi_K|F|\chi_K\rangle = \langle\chi_K|h + G|\chi_K\rangle = \langle\chi_K|h|\chi_K\rangle + \sum_{I=1}^{N}\langle\chi_K\chi_I||\chi_K\chi_I\rangle, \tag{VII.33}
\end{equation}

where we have used eqs. (VII.13) and (VII.10). For the corresponding HF energy of our N-electron system we have, according to eq. (VII.1),

\begin{equation}
{}^{N}\!E = \sum_{I=1}^{N}\langle\chi_I|h|\chi_I\rangle + \frac{1}{2}\sum_{I=1}^{N}\sum_{J=1}^{N}\langle\chi_I\chi_J||\chi_I\chi_J\rangle, \tag{VII.34}
\end{equation}

while for the corresponding (N−1)-electron system, where we have deleted orbital χK(x) from the determinant, we get analogously

\begin{equation}
{}^{N-1}\!E_K = \sum_{\substack{I=1\\ I\neq K}}^{N}\langle\chi_I|h|\chi_I\rangle + \frac{1}{2}\sum_{\substack{I=1\\ I\neq K}}^{N}\sum_{\substack{J=1\\ J\neq K}}^{N}\langle\chi_I\chi_J||\chi_I\chi_J\rangle. \tag{VII.35}
\end{equation}

For the difference of these two energies we thus obtain

\begin{equation}
{}^{N}\!E - {}^{N-1}\!E_K = \langle\chi_K|h|\chi_K\rangle + \frac{1}{2}\sum_{J=1}^{N}\langle\chi_K\chi_J||\chi_K\chi_J\rangle + \frac{1}{2}\sum_{I=1}^{N}\langle\chi_I\chi_K||\chi_I\chi_K\rangle = \varepsilon_K, \tag{VII.36}
\end{equation}

where we have used the symmetry of the electron repulsion integrals (indistinguishability of electrons 1 and 2) and the fact that 〈χKχK||χKχK〉 = 0. We conclude that the orbital energy εK related to canonical orbital χK(x) can be interpreted, in the framework of HF theory, as the ionization energy related to the removal of the electron from spin orbital χK(x). Note that in this derivation the orbitals of the N- and the (N−1)-electron system are assumed to be identical, i.e., they correspond to the solution of the HF equations of the N-electron system. In other words, the orbitals are un-relaxed for the (N−1)-electron system (solving the HF equations for that system would yield different, for that system more optimal orbitals). Consequently, N−1EK is too high in energy, and the ionization potential calculated according to Koopmans' theorem is too large. On the other hand, the missing correlation energy (cf. section VI) is larger (more negative) for the system with more electrons, i.e., the N-electron system. Missing orbital relaxation and correlation contributions thus are of opposite sign and cancel to a large extent, rendering the Koopmans' ionization potentials quite a fair approximation to the true IPs.

What happens if we add an additional electron to our N-electron system, i.e., if we augment the determinant by a further canonical orbital χA? (The HF eqs. (VII.25) of course produce an arbitrary number of canonical orbitals, and we usually build the HF determinant from just the N orbitals with the lowest orbital energies.) The orbital energy εA of this virtual (or unoccupied) orbital χA of the N-electron system is, as above,

\begin{equation}
\varepsilon_A = \langle\chi_A|F|\chi_A\rangle = \langle\chi_A|h + G|\chi_A\rangle = \langle\chi_A|h|\chi_A\rangle + \sum_{I=1}^{N}\langle\chi_A\chi_I||\chi_A\chi_I\rangle. \tag{VII.37}
\end{equation}


It is now left as an exercise to the reader to prove, along the same lines as for the IP, that εA corresponds to the electron affinity of the N-electron system, related to putting an additional electron into spin orbital χA,

\begin{equation}
{}^{N+1}\!E_A - {}^{N}\!E = \varepsilon_A. \tag{VII.38}
\end{equation}

As for the Koopmans' IPs, the orbitals are determined for the N-electron system and thus are non-optimal for the (N+1)-electron system. The Koopmans' EAs are thus too small due to the neglect of this orbital relaxation effect. In contrast to the IPs, the neglect of electron correlation here has the same sign and thus increases the error in the Koopmans' EAs even further. Koopmans' EAs therefore are not such a good approximation to the true EAs.

As we have now learned, the difference in orbital energies εA − εK does NOT correspond to an electronic excitation of the N-electron system (related to the promotion of an electron from the occupied orbital χK(x) to the virtual orbital χA(x)). Instead it corresponds to the difference between IP (removing an electron from the N-electron system) and EA (adding a further electron to the N-electron system), which is an entirely different physical process.

E. Brillouin’s theorem

In the following we want to investigate a matrix element of the form 〈Ψ^A_K|Hel|Ψ0〉. |Ψ^A_K〉 is related to the HF (ground state) determinant |Ψ0〉 by a substitution of the occupied orbital χK(x) by the virtual orbital χA(x). |Ψ^A_K〉 and |Ψ0〉 hence differ by one spin orbital, and we obtain, according to the Slater-Condon rules in Table I,

\begin{equation}
\langle\Psi_K^A|H_{el}|\Psi_0\rangle = \langle\chi_A|h|\chi_K\rangle + \sum_{I=1}^{N}\langle\chi_A\chi_I||\chi_K\chi_I\rangle = \langle\chi_A|h + G|\chi_K\rangle = \langle\chi_A|F|\chi_K\rangle = F_{AK} = 0, \tag{VII.39}
\end{equation}

where we again have used eqs. (VII.10) and (VII.13), and in the last equality eq. (VII.26). Obviously, matrix elements mixing the HF (ground state) determinant with singly substituted (or excited) determinants generated from it vanish. Such matrix elements occur in electron correlation methods like configuration interaction, as mentioned in section VI. Consider for a moment a Configuration Interaction Singles (CIS) wavefunction, which is a linear combination of the HF determinant and all singly substituted determinants generated from it, i.e.,

\begin{equation}
|\Psi_{CIS}\rangle = c_0|\Psi_0\rangle + \sum_{I}^{N}\sum_{A}^{M-N} c_{IA}|\Psi_I^A\rangle \tag{VII.40}
\end{equation}

(M ≫ N is the total number of spin orbitals available, occupied plus virtual). As will be discussed later in a lecture series dedicated to electron correlation methods, the CI coefficients can be determined by solving the eigenvalue problem

\begin{equation}
\begin{pmatrix} \langle\Psi_0|H_{el}|\Psi_0\rangle & \langle\Psi_0|H_{el}|\Psi_J^B\rangle \\ \langle\Psi_I^A|H_{el}|\Psi_0\rangle & \langle\Psi_I^A|H_{el}|\Psi_J^B\rangle \end{pmatrix}\begin{pmatrix} c_0 \\ c_{JB} \end{pmatrix} = E_0\begin{pmatrix} c_0 \\ c_{IA} \end{pmatrix}. \tag{VII.41}
\end{equation}

E0 corresponds to the "correlated" ground state energy, i.e., it is the lowest eigenvalue of the CI matrix on the lhs of eq. (VII.41). However, according to Brillouin's theorem the CI matrix becomes block diagonal, and we obtain

\begin{equation}
\begin{pmatrix} \langle\Psi_0|H_{el}|\Psi_0\rangle & 0 \\ 0 & \langle\Psi_I^A|H_{el}|\Psi_J^B\rangle \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = E_0\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \tag{VII.42}
\end{equation}

with the pure HF determinant as the eigenstate corresponding to the lowest eigenvalue, which is the HF energy E0 = 〈Ψ0|Hel|Ψ0〉. Obviously, in order to improve the ground state energy beyond E0 we have to include at least doubly substituted determinants, as a consequence of Brillouin's theorem.

The fulfillment of Brillouin’s theorem,

\begin{equation}
\langle\chi_A|F|\chi_K\rangle = F_{AK} = 0, \tag{VII.43}
\end{equation}

is in fact sufficient to specify a proper set of Hartree-Fock orbitals. We have already mentioned in section VII C that the canonical orbitals are just one possible set of valid spin orbitals for a HF determinant. They obviously fulfill Brillouin's theorem, since the Fock matrix is diagonal in that basis (cf. eq. (VII.26)). However, any orbital set for which the occupied-virtual blocks of the Fock matrix vanish constitutes valid HF orbitals leading to an identical energy and determinant. For example, rather than canonical orbitals we could as well use localized orbitals, as long as Brillouin's theorem is fulfilled. Orbitals thus are not unique! They are just appropriate one-particle functions to build determinants.


F. Introduction of a finite basis: the Roothaan-Hall equations

So far we have discussed the Hartree-Fock equations from a formal angle. The Hartree-Fock equations according to eq. (VII.25), with the Fock operator specified in eq. (VII.13), clearly constitute an integro-differential equation that is difficult to solve. In section VII C we have introduced a formally complete orthogonal basis to transform the Hartree-Fock equations into linear algebra form. As long as the introduced basis is complete this is an entirely equivalent description, i.e., no approximation is invoked. However, the matrices involved are infinite and therefore cannot be handled on any computer. In order to obtain a practical method we need to introduce a finite basis, which consequently implies an approximation. When specifying such a basis we have to find a reasonable compromise between accuracy and efficiency (the more compact the basis, the smaller the matrices, and the higher the efficiency of the method). In solid state physics, which deals with surfaces and crystals, i.e., two- and three-dimensional periodic systems, a plane wave basis set is an obvious choice. In chemistry, where the electronic structure of finite molecules is to be described, atom centered local functions, i.e., atomic orbitals (AOs), are mostly used instead. This is known as the Linear Combination of Atomic Orbitals (LCAO) approximation. As atomic orbitals, pre-contracted Gaussians are almost exclusively used in all the program packages. We will see in a later section why Gaussians are such an excellent choice.

Let us now expand our molecular orbitals χP (x) in a basis of AOs as

\begin{equation}
|\chi_P\rangle = \sum_{\nu=1}^{M}|\varphi_\nu\rangle C_{\nu P}. \tag{VII.44}
\end{equation}

The ϕν(x) with ν = 1, ..., M represent the AOs, and CνP is the coefficient of AO ϕν in the LCAO expansion of MO χP(x). Analogously to section VII C we now expand the MOs in eq. (VII.25) according to eq. (VII.44), multiply on the bra side with ϕ∗μ(x), and integrate over x, which yields

\begin{align}
&\sum_{\nu}^{M}\langle\varphi_\mu|F|\varphi_\nu\rangle\,\bar{C}_{\nu P} = \varepsilon_P\sum_{\nu}^{M}\langle\varphi_\mu|\varphi_\nu\rangle\,\bar{C}_{\nu P}, \quad\text{or}\quad \sum_{\nu}^{M} F_{\mu\nu}\,\bar{C}_{\nu P} = \varepsilon_P\sum_{\nu}^{M} S_{\mu\nu}\,\bar{C}_{\nu P}, \quad\text{or in matrix form} \nonumber\\
&\mathbf{F}\bar{\mathbf{C}} = \mathbf{S}\bar{\mathbf{C}}\boldsymbol{\varepsilon}. \tag{VII.45}
\end{align}

Note that the AOs form a non-orthogonal basis, which gives rise to the overlap matrix (or metric) S. F is the Fock matrix in the AO basis, and ε is the diagonal matrix holding the canonical orbital energies. The coefficient matrix C̄ transforms from the AO to the canonical MO basis. In the following we will drop the overbar, implicitly assuming that the MOs are always the canonical ones. S, F, and also ε and C are M × M matrices with M ≫ N. The HF determinant is built from the P = 1, ..., N MOs with the lowest orbital energies εP, which are denoted as occupied MOs. The P = N+1, ..., M unoccupied or virtual MOs are not used in the HF wavefunction, but can be salvaged in post-HF electron correlation treatments like configuration interaction. Occupied MOs are usually indexed by I, J, K, ... (spin orbitals) or i, j, k, ... (spatial orbitals); virtual MOs by A, B, C, ... or a, b, c, ....

By the LCAO approximation the original HF integro-differential equation is mapped onto a linear algebra problem, i.e., a generalized eigenvalue problem involving matrices of finite dimension, which is amenable to calculations on a computer. These equations are known as the Roothaan-Hall equations.
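Generalized eigensolvers for exactly this kind of problem are readily available. A minimal sketch with 2×2 model Fock and overlap matrices (the numbers are illustrative only, not from a real molecule):

```python
import numpy as np
from scipy.linalg import eigh   # generalized symmetric eigensolver

F = np.array([[-1.5, -0.4],
              [-0.4, -0.8]])              # model Fock matrix in the AO basis
S = np.array([[ 1.0,  0.45],
              [ 0.45, 1.0 ]])             # model overlap matrix (metric)

eps, C = eigh(F, S)                        # solves F C = S C eps, eq. (VII.45)
assert np.allclose(F @ C, S @ C @ np.diag(eps))  # Roothaan-Hall equations hold
assert np.allclose(C.T @ S @ C, np.eye(2))       # MOs orthonormal w.r.t. the metric S
```

Note that the eigenvectors come out S-orthonormal rather than orthonormal in the ordinary sense, reflecting the non-orthogonality of the AO basis.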

Let us next have a closer look at the Fock matrix. For an element of F we have

\begin{align}
F_{\mu\nu} = \langle\varphi_\mu|F|\varphi_\nu\rangle = \langle\varphi_\mu|h + G|\varphi_\nu\rangle &= \langle\varphi_\mu|h|\varphi_\nu\rangle + \sum_{I=1}^{N}\langle\varphi_\mu\chi_I||\varphi_\nu\chi_I\rangle \nonumber\\
&= \langle\varphi_\mu|h|\varphi_\nu\rangle + \sum_{I=1}^{N}\sum_{\rho,\sigma=1}^{M} C_{\rho I}^* C_{\sigma I}\,\langle\varphi_\mu\varphi_\rho||\varphi_\nu\varphi_\sigma\rangle = h_{\mu\nu} + \sum_{\rho,\sigma=1}^{M} D_{\rho\sigma}\langle\mu\rho||\nu\sigma\rangle = h_{\mu\nu} + G_{\mu\nu}, \tag{VII.46}
\end{align}

where we have introduced in the second to last equality the density matrix in the AO basis, D, with elements

\begin{equation}
D_{\mu\nu} = \sum_{I=1}^{N} C_{\mu I}^* C_{\nu I}. \tag{VII.47}
\end{equation}


The last equality implicitly defines the two-electron part of the Fock matrix. The Fock matrix obviously depends via the density matrix on the unknown MO coefficient matrix C, which has to be determined according to eq. (VII.45), which in turn depends on F. For that reason the HF equations have to be solved iteratively until self-consistency is reached, as already mentioned in section VII C.

The relation between D and ρ1(x′; x) is easily obtained by expanding the MOs of eq. (V.21) in terms of AOs, which yields

\begin{equation}
\rho_1(\mathbf{x}';\mathbf{x}) = \sum_{I=1}^{N}\chi_I^*(\mathbf{x}')\chi_I(\mathbf{x}) = \sum_{\mu,\nu=1}^{M}\sum_{I=1}^{N} C_{\mu I}^* C_{\nu I}\,\varphi_\mu^*(\mathbf{x}')\varphi_\nu(\mathbf{x}) = \sum_{\mu,\nu=1}^{M} D_{\mu\nu}\,\varphi_\mu^*(\mathbf{x}')\varphi_\nu(\mathbf{x}). \tag{VII.48}
\end{equation}

So D indeed is the representation of the one-particle density matrix ρ1(x′; x) in the AO basis.
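Two useful sanity checks on D follow directly from eq. (VII.47) and the S-orthonormality of the MOs: tr(DS) = N (the electron count) and DSD = D (idempotency with respect to the metric). A sketch with model matrices (random numbers, purely illustrative):

```python
import numpy as np
from scipy.linalg import eigh

M, N = 6, 3
rng = np.random.default_rng(3)
A = rng.standard_normal((M, M))
S = A @ A.T + M * np.eye(M)                 # model positive-definite overlap matrix
h = rng.standard_normal((M, M)); h = 0.5 * (h + h.T)  # model one-electron matrix

C = eigh(h, S)[1]                           # S-orthonormal "MOs": C^T S C = 1
D = C[:, :N] @ C[:, :N].T                   # eq. (VII.47), N occupied spin orbitals

assert np.isclose(np.trace(D @ S), N)       # tr(DS) = number of electrons
assert np.allclose(D @ S @ D, D)            # idempotency of D w.r.t. the metric S
```

Both properties are routinely used in SCF codes to validate a density matrix (e.g., after the initial guess of step 5 in Table III).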

Finally, let us re-write the energy expression for the HF energy, eq. (VII.1) in the AO basis, yielding

\begin{align}
E &= \sum_{I=1}^{N}\langle\chi_I|h|\chi_I\rangle + \frac{1}{2}\sum_{I,J=1}^{N}\langle\chi_I\chi_J||\chi_I\chi_J\rangle \nonumber\\
&= \sum_{\mu,\nu=1}^{M}\sum_{I=1}^{N} C_{\mu I}^* C_{\nu I}\, h_{\mu\nu} + \frac{1}{2}\sum_{\mu,\nu=1}^{M}\sum_{I=1}^{N} C_{\mu I}^* C_{\nu I}\sum_{\rho,\sigma=1}^{M}\sum_{J=1}^{N} C_{\rho J}^* C_{\sigma J}\,\langle\mu\rho||\nu\sigma\rangle \nonumber\\
&= \sum_{\mu,\nu=1}^{M} D_{\mu\nu}\, h_{\mu\nu} + \frac{1}{2}\sum_{\mu,\nu,\rho,\sigma=1}^{M} D_{\mu\nu} D_{\rho\sigma}\langle\mu\rho||\nu\sigma\rangle = \sum_{\mu,\nu=1}^{M} D_{\mu\nu}\left(h_{\mu\nu} + \frac{1}{2}G_{\mu\nu}\right). \tag{VII.49}
\end{align}

The HF energy is obtained from the HF density matrix by tracing it with the one- and two-electron integrals.

G. Solving the Roothaan-Hall equations

Now let us turn to the question of how to solve the Roothaan-Hall equations (VII.45). Obviously, since the metric S is not diagonal, i.e., Sμν = 〈ϕμ|ϕν〉 ≠ δμν, this is not an ordinary eigenvalue problem. However, we can transform eq. (VII.45) into an orthonormal basis

\begin{equation}
|\varphi_\mu'\rangle = \sum_{\nu=1}^{M}|\varphi_\nu\rangle X_{\nu\mu}, \quad\text{with}\quad \langle\varphi_\mu'|\varphi_\nu'\rangle = \delta_{\mu\nu} = \sum_{\rho,\sigma=1}^{M} X_{\rho\mu}^*\langle\varphi_\rho|\varphi_\sigma\rangle X_{\sigma\nu}, \quad\text{or}\quad \mathbf{X}^\dagger\mathbf{S}\mathbf{X} = \mathbf{1}. \tag{VII.50}
\end{equation}

Since S is Hermitian, there exists a unitary transformation matrix U diagonalizing it, i.e., U†SU = s, with U†U = 1. s is the diagonalized metric of the original AO basis. We can use it to specify the transformation matrix X. There are actually two choices: (i) the symmetric, and (ii) the canonical orthogonalization. Choosing symmetric orthogonalization we define X as

\begin{equation}
\mathbf{X} = \mathbf{U}\mathbf{s}^{-\frac{1}{2}}\mathbf{U}^\dagger. \tag{VII.51}
\end{equation}

We then indeed have

\begin{equation}
\mathbf{X}^\dagger\mathbf{S}\mathbf{X} = \mathbf{U}\mathbf{s}^{-\frac{1}{2}}\mathbf{U}^\dagger\,\mathbf{S}\,\mathbf{U}\mathbf{s}^{-\frac{1}{2}}\mathbf{U}^\dagger = \mathbf{U}\mathbf{s}^{-\frac{1}{2}}\mathbf{s}\,\mathbf{s}^{-\frac{1}{2}}\mathbf{U}^\dagger = \mathbf{U}\mathbf{U}^\dagger = \mathbf{1}, \tag{VII.52}
\end{equation}

as required (X indeed transforms to a properly orthonormalized basis).

On the other hand, choosing canonical orthogonalization we define X as

\begin{equation}
\mathbf{X} = \mathbf{U}\mathbf{s}^{-\frac{1}{2}}, \tag{VII.53}
\end{equation}

which leads to

\begin{equation}
\mathbf{X}^\dagger\mathbf{S}\mathbf{X} = \mathbf{s}^{-\frac{1}{2}}\mathbf{U}^\dagger\,\mathbf{S}\,\mathbf{U}\mathbf{s}^{-\frac{1}{2}} = \mathbf{s}^{-\frac{1}{2}}\mathbf{s}\,\mathbf{s}^{-\frac{1}{2}} = \mathbf{1}, \tag{VII.54}
\end{equation}

which also is correct.
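Both choices of X can be checked numerically side by side. With a model overlap matrix (random, illustrative numbers) both satisfy X†SX = 1, and the symmetric choice additionally yields a Hermitian X:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))
S = A @ A.T + 5 * np.eye(5)               # model positive-definite overlap matrix

s, U = np.linalg.eigh(S)                  # U^dagger S U = diag(s)
X_sym = U @ np.diag(s**-0.5) @ U.T        # eq. (VII.51): X = U s^{-1/2} U^dagger
X_can = U @ np.diag(s**-0.5)              # eq. (VII.53): X = U s^{-1/2}

for X in (X_sym, X_can):
    assert np.allclose(X.T @ S @ X, np.eye(5))   # eqs. (VII.52)/(VII.54)
assert np.allclose(X_sym, X_sym.T)               # symmetric choice: X is Hermitian
```

In practice the canonical variant is preferred when S has near-zero eigenvalues (near-linear dependencies), since the offending columns can simply be dropped.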


For a molecular orbital χP (x) we have, according to eq. (VII.44),

\begin{equation}
|\chi_P\rangle = \sum_{\nu=1}^{M}|\varphi_\nu\rangle C_{\nu P} = \sum_{\mu,\nu=1}^{M}|\varphi_\mu'\rangle X_{\mu\nu}^{-1} C_{\nu P} = \sum_{\mu=1}^{M}|\varphi_\mu'\rangle C_{\mu P}', \quad\text{with}\quad \mathbf{C}' = \mathbf{X}^{-1}\mathbf{C}, \quad\text{or}\quad \mathbf{C} = \mathbf{X}\mathbf{C}'. \tag{VII.55}
\end{equation}

In the second equality of the above equation we have inserted the inverse of eq. (VII.50). It is now straightforward to transform the Roothaan-Hall equations,

\begin{equation}
\mathbf{F}\mathbf{C} = \mathbf{S}\mathbf{C}\boldsymbol{\varepsilon} \;\Rightarrow\; \mathbf{F}\mathbf{X}\mathbf{C}' = \mathbf{S}\mathbf{X}\mathbf{C}'\boldsymbol{\varepsilon} \;\Rightarrow\; \mathbf{X}^\dagger\mathbf{F}\mathbf{X}\mathbf{C}' = \mathbf{X}^\dagger\mathbf{S}\mathbf{X}\mathbf{C}'\boldsymbol{\varepsilon} \;\Rightarrow\; \mathbf{F}'\mathbf{C}' = \mathbf{C}'\boldsymbol{\varepsilon}, \quad\text{with}\quad \mathbf{F}' = \mathbf{X}^\dagger\mathbf{F}\mathbf{X}. \tag{VII.56}
\end{equation}

The transformed Roothaan-Hall equations obviously constitute an ordinary eigenvalue problem, which can be solved with standard methods. The resulting eigenvectors C′ are easily back-transformed to the original AO basis by multiplication from the left with X.

H. Integrating out spin: restricted closed shell Hartree-Fock

In most cases closed shell molecules are treated, implying that the HF determinant consists of N spin orbitals χI(x) = φ_(I+1) div 2(r)η(s), where η(s) is either α(s) or β(s). In other words, we have N/2 different spatial orbitals φi(r), and each is "doubly occupied" by an α and a β electron. The HF determinant thus has the form (cf. eq. (III.3))

\begin{equation}
\Psi(\mathbf{x}_1,\ldots,\mathbf{x}_N) = \frac{1}{\sqrt{N!}}\left\|\phi_1(\mathbf{r}_1)\alpha(s_1)\ \phi_1(\mathbf{r}_2)\beta(s_2)\ \phi_2(\mathbf{r}_3)\alpha(s_3)\ \phi_2(\mathbf{r}_4)\beta(s_4)\ \ldots\ \phi_{\frac{N}{2}}(\mathbf{r}_{N-1})\alpha(s_{N-1})\ \phi_{\frac{N}{2}}(\mathbf{r}_N)\beta(s_N)\right\|. \tag{VII.57}
\end{equation}

In such a case it is easy to integrate out the dependence on the spin coordinates (cf. eq. (II.26)), such that only expressions involving the spatial orbitals and integrations over spatial coordinates remain. Let us first consider the density matrix for the restricted closed shell Hartree-Fock (RHF) case. For the spinless one-particle density matrix we get

\begin{align}
P_1(\mathbf{r}';\mathbf{r}) = \int_{s_1'=s_1}\rho_1(\mathbf{x}';\mathbf{x})\, ds_1 &= \sum_{\mu,\nu=1}^{M}\sum_{i=1}^{N/2} C_{\mu i}^* C_{\nu i}\,\varphi_\mu^*(\mathbf{r}')\varphi_\nu(\mathbf{r})\int\left(\alpha^*(s)\alpha(s) + \beta^*(s)\beta(s)\right) ds \nonumber\\
&= 2\sum_{\mu,\nu=1}^{M}\sum_{i=1}^{N/2} C_{\mu i}^* C_{\nu i}\,\varphi_\mu^*(\mathbf{r}')\varphi_\nu(\mathbf{r}) = \sum_{\mu,\nu=1}^{M} D_{\mu\nu}\,\varphi_\mu^*(\mathbf{r}')\varphi_\nu(\mathbf{r}), \tag{VII.58}
\end{align}

by combining eqs. (VI.2) with (VII.48). The density matrix in the AO basis for the RHF case thus is

\begin{equation}
D_{\mu\nu} = 2\sum_{i=1}^{N/2} C_{\mu i}^* C_{\nu i}. \tag{VII.59}
\end{equation}

Note that we use capital letters I, J, ... to index spin orbitals, but lower-case letters i, j, ... to index spatial orbitals. For the lhs of the spinless Roothaan-Hall equation we obtain

\begin{align}
\sum_{\nu=1}^{M}\left[\int \varphi_\mu^*(\mathbf{r}_1)\alpha^*(s_1)\, F\,\varphi_\nu(\mathbf{r}_1)\alpha(s_1)\, d\mathbf{r}_1 ds_1\right] C_{\nu p} = \sum_{\nu=1}^{M}\Bigg[&\int \varphi_\mu^*(\mathbf{r}_1)\alpha^*(s_1)\, h\,\varphi_\nu(\mathbf{r}_1)\alpha(s_1)\, d\mathbf{r}_1 ds_1 \nonumber\\
+ \sum_{i=1}^{N/2}\sum_{\rho,\sigma=1}^{M} C_{\rho i}^* C_{\sigma i}&\int \varphi_\mu^*(\mathbf{r}_1)\alpha^*(s_1)\varphi_\rho^*(\mathbf{r}_2)\alpha^*(s_2)\, r_{12}^{-1}\,\varphi_\nu(\mathbf{r}_1)\alpha(s_1)\varphi_\sigma(\mathbf{r}_2)\alpha(s_2)\, d\mathbf{r}_1 d\mathbf{r}_2 ds_1 ds_2 \nonumber\\
+ \sum_{i=1}^{N/2}\sum_{\rho,\sigma=1}^{M} C_{\rho i}^* C_{\sigma i}&\int \varphi_\mu^*(\mathbf{r}_1)\alpha^*(s_1)\varphi_\rho^*(\mathbf{r}_2)\beta^*(s_2)\, r_{12}^{-1}\,\varphi_\nu(\mathbf{r}_1)\alpha(s_1)\varphi_\sigma(\mathbf{r}_2)\beta(s_2)\, d\mathbf{r}_1 d\mathbf{r}_2 ds_1 ds_2 \nonumber\\
- \sum_{i=1}^{N/2}\sum_{\rho,\sigma=1}^{M} C_{\rho i}^* C_{\sigma i}&\int \varphi_\mu^*(\mathbf{r}_1)\alpha^*(s_1)\varphi_\rho^*(\mathbf{r}_2)\alpha^*(s_2)\, r_{12}^{-1}\,\varphi_\nu(\mathbf{r}_2)\alpha(s_2)\varphi_\sigma(\mathbf{r}_1)\alpha(s_1)\, d\mathbf{r}_1 d\mathbf{r}_2 ds_1 ds_2 \nonumber\\
- \sum_{i=1}^{N/2}\sum_{\rho,\sigma=1}^{M} C_{\rho i}^* C_{\sigma i}&\int \varphi_\mu^*(\mathbf{r}_1)\alpha^*(s_1)\varphi_\rho^*(\mathbf{r}_2)\beta^*(s_2)\, r_{12}^{-1}\,\varphi_\nu(\mathbf{r}_2)\alpha(s_2)\varphi_\sigma(\mathbf{r}_1)\beta(s_1)\, d\mathbf{r}_1 d\mathbf{r}_2 ds_1 ds_2\Bigg] C_{\nu p}. \tag{VII.60}
\end{align}


In eq. (VII.60) we consider the case of a spin orbital with α(s1) spin function (of course an entirely analogous equation exists for a spin orbital with β(s1) spin function). Furthermore, the summation over N in eq. (VII.46) is partitioned into two sums over N/2, corresponding to spin orbitals involving α(s2) and β(s2) spin functions, respectively. Since none of the operators in (VII.60) depends on spin, the spin is easily integrated out by exploiting eq. (II.26). Obviously, the last term in (VII.60) vanishes, yielding

\begin{align}
\sum_{\nu=1}^{M}\left[\int \varphi_\mu^*(\mathbf{r}_1)\, F\,\varphi_\nu(\mathbf{r}_1)\, d\mathbf{r}_1\right] C_{\nu p} = \sum_{\nu=1}^{M}\Bigg[&\int \varphi_\mu^*(\mathbf{r}_1)\, h\,\varphi_\nu(\mathbf{r}_1)\, d\mathbf{r}_1 + 2\sum_{i=1}^{N/2}\sum_{\rho,\sigma=1}^{M} C_{\rho i}^* C_{\sigma i}\int \varphi_\mu^*(\mathbf{r}_1)\varphi_\rho^*(\mathbf{r}_2)\, r_{12}^{-1}\,\varphi_\nu(\mathbf{r}_1)\varphi_\sigma(\mathbf{r}_2)\, d\mathbf{r}_1 d\mathbf{r}_2 \nonumber\\
&- \sum_{i=1}^{N/2}\sum_{\rho,\sigma=1}^{M} C_{\rho i}^* C_{\sigma i}\int \varphi_\mu^*(\mathbf{r}_1)\varphi_\rho^*(\mathbf{r}_2)\, r_{12}^{-1}\,\varphi_\nu(\mathbf{r}_2)\varphi_\sigma(\mathbf{r}_1)\, d\mathbf{r}_1 d\mathbf{r}_2\Bigg] C_{\nu p},
\end{align}
or, written in a more compact form after using eq. (VII.59),
\begin{equation}
\sum_{\nu=1}^{M} F_{\mu\nu}\, C_{\nu p} = \sum_{\nu=1}^{M}\left[h_{\mu\nu} + \sum_{\rho,\sigma=1}^{M} D_{\rho\sigma}\left(\langle\mu\rho|\nu\sigma\rangle - \frac{1}{2}\langle\mu\rho|\sigma\nu\rangle\right)\right] C_{\nu p} = \sum_{\nu=1}^{M}\left[h_{\mu\nu} + G_{\mu\nu}\right] C_{\nu p}. \tag{VII.61}
\end{equation}

The rhs of the spinless Roothaan-Hall equation, involving only the overlap integrals, is trivial, and we obtain as the final expression for the spinless Roothaan-Hall equations again

\begin{equation}
\mathbf{F}\mathbf{C} = \mathbf{S}\mathbf{C}\boldsymbol{\varepsilon}, \tag{VII.62}
\end{equation}

with Sμν = ∫ϕ∗μ(r1)ϕν(r1)dr1, and the Fock matrix F defined as

\begin{equation}
F_{\mu\nu} = h_{\mu\nu} + \sum_{\rho,\sigma=1}^{M} D_{\rho\sigma}\left(\langle\mu\rho|\nu\sigma\rangle - \frac{1}{2}\langle\mu\rho|\sigma\nu\rangle\right) = h_{\mu\nu} + G_{\mu\nu}, \tag{VII.63}
\end{equation}

where the last equality implicitly defines the two electron matrix G for the RHF case.

It is left to the reader to verify that the RHF energy expression can be written as

\begin{equation}
E = \sum_{\mu,\nu=1}^{M} D_{\mu\nu}\left(h_{\mu\nu} + \frac{1}{2}G_{\mu\nu}\right), \tag{VII.64}
\end{equation}

i.e., that it has exactly the same form as eq. (VII.49), but with the matrices D and G for the RHF case, as defined in eqs. (VII.59) and (VII.61), respectively. What we have achieved now is (i) the removal of all spin integration in our integrals, and (ii) a reduction in the number of (spatial) orbitals which we have to determine in the SCF procedure from N to N/2.

So far, we have discussed the construction of the Fock and density matrices, as well as how to solve the Roothaan-Hall equations in a finite basis of atom centered basis functions, in particular for the RHF case. Thus, we are now in a position to sketch the computer algorithm of a Hartree-Fock program, as it is implemented in many ab initio electronic structure program packages like MOLPRO, TURBOMOLE, GAUSSIAN, QCHEM, etc., cf. Table III.

In the following we want to have a look at what kind of atom centered basis functions ϕμ(x) it is advisable to use in the LCAO approximation.


TABLE III: The Hartree-Fock SCF procedure

1. Specification of the molecular geometry (Cartesian coordinates of nuclei, Z-matrix),

Specification of the atom centered basis set

2. Pre-calculation of all one-electron integrals hµν , Sµν ; stored on disk

3. For non-direct SCF: Pre-calculation of two-electron integrals 〈µρ|νσ〉 ; stored on disk

4. Diagonalization of metric S, transformation matrix X; stored on disk

5. Obtain a first guess of the density matrix D (e.g. pieced together from atomic densities,

or calculated by using a semi-empirical method.

6. Calculation of the two electron part of the Fock matrix, G, by contracting the electron repulsion integrals

〈µρ||νσ〉 with the density matrix D cf. eq. (VII.46).

7. Calculation of the Fock matrix F by adding to G the one-electron integral matrix h, cf. eq. (VII.46).

8. Transform the Fock matrix F with X, i.e., F′ = X†FX, cf. eq. (VII.56).

9. Solve the ordinary eigenvalue problem F′C′ = C′ε, cf. eq. (VII.56).

10. Back-transform the resulting MO coefficient matrix, C = XC′, cf. eq. (VII.55).

11. Construct new density matrix D from back-transformed MO coefficient matrix C, according to eq. (VII.47).

12. Possibly extrapolate the new density matrix D from the one just calculated and from those

of previous iterations (convergence acceleration, e.g. DIIS (direct inversion in the iterative subspace))

13. Calculate the Hartree-Fock energy E by contracting D with h and G, according to eq. (VII.49).

14. Test convergence (energy difference relative to E of previous iteration,

norm of change in density matrix, other criteria)

IF NOT CONVERGED GOTO 6.

15. Calculate other properties of interest from converged density matrix (expectation values, e.g. dipole moment)

Print/Plot Hartree-Fock orbitals, density, etc.
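The loop structure of Table III can be condensed into a short sketch. The following Python is a minimal, schematic closed-shell SCF driver, not production code: the integral arrays h, S and the ERI tensor are assumed to be supplied by an integral program, and a crude zero-density (core-Hamiltonian) start guess replaces step 5; DIIS (step 12) is omitted.

```python
import numpy as np

def rhf_scf(h, S, eri, nocc, max_iter=100, tol=1e-10):
    """Schematic RHF SCF loop following steps 4-14 of Table III.
    eri[m,n,s,r] holds (mn|sr) in chemists' notation."""
    # step 4: symmetric orthogonalization, X = S^(-1/2)
    s_val, s_vec = np.linalg.eigh(S)
    X = s_vec @ np.diag(s_val**-0.5) @ s_vec.T
    D = np.zeros_like(h)                    # step 5: zero-density start guess
    E_old = np.inf
    for it in range(max_iter):
        # step 6: two-electron part G = J - K/2, ERIs contracted with D
        J = np.einsum('rs,mnsr->mn', D, eri)
        K = np.einsum('rs,mrsn->mn', D, eri)
        F = h + J - 0.5 * K                 # step 7: Fock matrix
        E = 0.5 * np.sum(D * (h + F))       # step 13: electronic energy
        if abs(E - E_old) < tol:            # step 14: convergence test
            return E, it
        E_old = E
        Fp = X.T @ F @ X                    # step 8: transform F
        eps, Cp = np.linalg.eigh(Fp)        # step 9: ordinary eigenvalue problem
        C = X @ Cp                          # step 10: back-transform C
        Cocc = C[:, :nocc]
        D = 2.0 * Cocc @ Cocc.T             # step 11: new density matrix
    raise RuntimeError("SCF did not converge")
```

For a toy non-interacting system (all ERIs zero, orthonormal basis) the converged energy must be twice the lowest eigenvalue of h, which makes a simple consistency check.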

VIII. BASIS SETS AND INTEGRAL EVALUATION

As we have seen in the previous section, the following integrals over atom-centered basis functions have to be evaluated,

S_µν = ∫ ϕ*_µ(r1 − R_A) ϕ_ν(r1 − R_B) dr1,   (VIII.1)

h_µν = ∫ ϕ*_µ(r1 − R_A) ( −(1/2)∇²_1 − Σ_C Z_C / |r1 − R_C| ) ϕ_ν(r1 − R_B) dr1,   (VIII.2)

〈µρ|νσ〉 = ∫∫ ϕ*_µ(r1 − R_A) ϕ*_ρ(r2 − R_C) (1 / |r2 − r1|) ϕ_ν(r1 − R_B) ϕ_σ(r2 − R_D) dr1 dr2,   (VIII.3)

where we have assumed that basis function ϕµ(r1) is centered on atom A, ϕν(r1) on atom B, etc. The meaning of the position vectors is indicated in Fig. 3.

Obviously, the argument of the basis functions corresponds to the position of electron 1 or 2, as it should, but argument zero now refers to the individual origin of each basis function. As one can see from eqs. (VIII.1)-(VIII.3), the overlap integrals involve at most two, the nucleus-electron attraction integrals at most three, and the two-electron repulsion integrals (ERIs) at most four different atoms. Furthermore, in particular the number of ERIs grows quite


steeply with the number of basis functions, i.e., as M⁴. Appropriate basis functions should therefore, on the one hand, already be quite good approximations to the target MOs, such that a relatively "short" LCAO expansion

|φ_I〉 = Σ_{µ=1}^{M} |ϕ_µ〉 C_µI

with not too large M provides a sufficiently accurate description of the MOs |φI〉. On the other hand, the basis functions should be of a kind such that the evaluation of the integrals in eqs. (VIII.1)-(VIII.3), and in particular of the ERIs, can be carried out efficiently. Based on the discussion of the hydrogen atom, Slater-type orbitals (STOs) could be a reasonable choice due to their similarity to the MOs of the hydrogen atom. In polar coordinates (r, θ, φ) STOs have the form

ϕ_{ζ,n,l,m}(r, θ, φ) = N Y_lm(θ, φ) r^(n−1) e^(−ζr),   (VIII.4)

where N is a normalization constant, r is the distance between the electron and the origin of the function, ζ is an appropriate exponent, n = 1, 2, . . .; l = 0, 1, . . .; m = −l, . . . , +l are integer quantum numbers, and Y_lm(θ, φ) are the spherical harmonics encountered already when discussing the hydrogen atom problem. STOs obviously have the correct exponential decay behavior and the nuclear cusp known from the solution of the radial part of the hydrogen atom problem. However, unlike the hydrogen atom radial functions, the radial parts of the STOs do not have any nodes for r > 0. Furthermore, all s-type orbitals (l = 0) except 1s (n = 1) have a node at r = 0. The correct radial behavior of a certain atomic orbital thus has to be taken care of by forming appropriate linear combinations of STOs with different exponents ζ.

Unfortunately, the necessary integrals defined in eqs. (VIII.1)-(VIII.3) over STOs are difficult to calculate. In particular, the calculation of the three- and four-center ERIs in eq. (VIII.3) cannot be performed analytically. So, even though STOs would perfectly fulfill our first requirement of leading to short and compact LCAO expansions, they do not meet our second requirement of efficient integral evaluation. An alternative choice of basis functions are Gaussians (or GTOs, for Gaussian-type orbitals). Again in polar coordinates (r, θ, φ), GTOs have the form

ϕ_{ζ,n,l,m}(r, θ, φ) = N Y_lm(θ, φ) r^(2n−2−l) e^(−ζr²).   (VIII.5)

In contrast to the STO specification above, the radial functions are different for orbitals belonging to the same n but with different angular momenta l, e.g., for the 2s and the 2p shell. Much of the usefulness of the GTOs stems from the fact that they are not confined to a local, polar coordinate system, but can easily be expressed in terms of their Cartesian components, i.e.,

ϕ_{ζ,lx,ly,lz,R_A}(x, y, z) = N G_{ζ,lx}(x − X_A) G_{ζ,ly}(y − Y_A) G_{ζ,lz}(z − Z_A), with   (VIII.6)

G_{ζ,lx}(x) = x^lx e^(−ζx²).

The GTO is assumed to be centered on atom A with position vector R_A = (X_A, Y_A, Z_A). The sum lx + ly + lz defines the angular momentum of the GTO. For example, lx + ly + lz = 1 implies a p-shell with the three components

FIG. 3: Position vectors of nuclei A and C, and of electrons 1 and 2, relative to the origin O of the coordinate system.


FIG. 4: Two normed s-type 1-D Gaussians G1(r) = N1 exp(−0.3r²) and G2(r) = N2 exp(−0.05(r − 5)²), centered at r = 0 and r = 5, respectively, and their (unnormed) product G3(r) = G1(r)G2(r).

((lx = 1, ly = 0, lz = 0), (lx = 0, ly = 1, lz = 0), (lx = 0, ly = 0, lz = 1)), and lx + ly + lz = 2 a d-shell with six components, etc. Note that there is a difference in the number of components between a representation in polar and in Cartesian coordinates. In the latter a d-shell comprises six components, as just said, while in the former it comprises just five components. When transforming a d-shell from Cartesian to polar coordinates one obtains the five d-components in polar coordinates plus one additional s-type function proportional to x² + y² + z².
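The Cartesian components of a shell with total angular momentum L = lx + ly + lz, and the pointwise evaluation of eq. (VIII.6), are easy to sketch (hypothetical helper functions; the normalization N is left to the caller):

```python
import math

def cartesian_components(L):
    """All (lx, ly, lz) with lx + ly + lz = L: 3 for a p-shell, 6 for d, 10 for f."""
    return [(lx, ly, L - lx - ly)
            for lx in range(L, -1, -1)
            for ly in range(L - lx, -1, -1)]

def gto(x, y, z, center, zeta, lx, ly, lz):
    """Unnormalized Cartesian GTO of eq. (VIII.6), centered at `center`."""
    dx, dy, dz = x - center[0], y - center[1], z - center[2]
    return (dx**lx * dy**ly * dz**lz
            * math.exp(-zeta * (dx*dx + dy*dy + dz*dz)))
```

The number of Cartesian components of a shell is (L+1)(L+2)/2, which reproduces the counts quoted above.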

GTOs perfectly fulfill our second requirement; the integrals in eqs. (VIII.1)-(VIII.3) over GTOs can be handled very efficiently, and recursive analytic formulae are available to evaluate ERIs involving three or four centers. At the heart of this is the Gaussian Product Theorem (GPT), stating that a product of two Gaussians centered at two (different) centers A and B can be expressed as a short expansion of Gaussians, all centered on a single center P, i.e.,

G_{ζA,lxA}(x − X_A) G_{ζB,lxB}(x − X_B) = Σ_{k=0}^{lxA+lxB} c_k G_{ζP,k}(x − X_P), with   (VIII.7)

ζ_P = ζ_A + ζ_B,

X_P = (ζ_A X_A + ζ_B X_B) / ζ_P,

G_{ζP,k}(x) = x^k e^(−ζP x²).

It is left to the reader as an exercise to (i) verify the GPT for the case of two s-type Gaussians centered on atoms A and B, and (ii) check that an analogous product theorem does not hold for Slater functions. A graphical representation of the GPT for two s-type Gaussians is provided in Fig. 4.
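For exercise (i), the s-type case can at least be checked numerically: completing the square in the exponent shows that the product of two s Gaussians equals a single Gaussian at P times the constant prefactor exp(−ζAζB(X_A − X_B)²/ζP), which plays the role of c0 in eq. (VIII.7). A small sketch:

```python
import math

def product_lhs(x, za, Xa, zb, Xb):
    """Left-hand side of the GPT: product of two s-type Gaussians."""
    return math.exp(-za * (x - Xa)**2) * math.exp(-zb * (x - Xb)**2)

def product_rhs(x, za, Xa, zb, Xb):
    """Right-hand side: one Gaussian at the combined center P."""
    zp = za + zb
    Xp = (za * Xa + zb * Xb) / zp
    c0 = math.exp(-za * zb / zp * (Xa - Xb)**2)  # x-independent prefactor
    return c0 * math.exp(-zp * (x - Xp)**2)
```

For exercise (ii), note that a product of two Slater functions exp(−ζA|x − X_A|) exp(−ζB|x − X_B|) has kinks at both centers, so it cannot equal any single exponential centered at one point.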

As a consequence of the GPT, all ERIs (VIII.3) can be expressed in terms of just two-center quantities, which is of enormous help when evaluating these integrals. There is a number of different analytic algorithms used in different program packages to evaluate ERIs, like Rys-Gauss quadrature, the McMurchie-Davidson, or the Obara-Saika scheme, which are all based on the GPT. They essentially differ in the recurrence relations used to shift angular momentum from one center to the other.

Another nice property of a GTO concerns its derivative. We have

(d/dx) ϕ_{ζ,lx,ly,lz,R_A}(x, y, z) = (d/dx) N G_{ζ,lx}(x − X_A) G_{ζ,ly}(y − Y_A) G_{ζ,lz}(z − Z_A)

= N ( dG_{ζ,lx}(x − X_A)/dx ) G_{ζ,ly}(y − Y_A) G_{ζ,lz}(z − Z_A), with   (VIII.8)

dG_{ζ,lx}(x − X_A)/dx = (d/dx) (x − X_A)^lx e^(−ζ(x−X_A)²) = ( lx (x − X_A)^(lx−1) − 2ζ (x − X_A)^(lx+1) ) e^(−ζ(x−X_A)²).

This shows that the derivative, e.g. w.r. to x, leads in the product to two x-component Gaussians, one with the


FIG. 5: An STO (red), and linear combinations of one (STO-1G, black), two (STO-2G, green), and three (STO-3G, blue) Gaussians fitting that STO.

angular momentum increased by one, the other decreased by one, relative to the original lx. Derivatives of Gaussians, as they appear e.g. in the integral for the kinetic energy, eq. (VIII.2), are thus also simple to evaluate.
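The one-dimensional derivative formula of eq. (VIII.8) is easily checked against a central finite difference (a quick numerical sanity check; the exponent and evaluation point below are chosen arbitrarily):

```python
import math

def G(x, zeta, l):
    """1-D Gaussian factor G_zeta,l(x) = x^l exp(-zeta x^2), cf. eq. (VIII.6)."""
    return x**l * math.exp(-zeta * x * x)

def dG(x, zeta, l):
    """Analytic derivative, eq. (VIII.8): one term with l-1, one with l+1."""
    lower = l * x**(l - 1) if l > 0 else 0.0   # the l x^(l-1) term vanishes for l = 0
    return (lower - 2 * zeta * x**(l + 1)) * math.exp(-zeta * x * x)
```

A central difference (G(x+h) − G(x−h))/2h with small h agrees with dG to high accuracy for any angular momentum l.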

Obviously, GTOs exhibit neither the proper exponential decay behavior nor the nuclear cusp. Fig. 5 displays an STO reference curve (STO, red) and a GTO fitted by a least-squares fit to the STO reference (STO-1G, black). As can be seen by comparing these two curves, the GTO decays more quickly and is differentiable at x = 0 (the first derivative w.r. to x vanishes at x = 0). However, already a linear combination of two GTOs with different exponents fitted to the STO provides a much better description (STO-2G, green curve), and a linear combination of just three Gaussians with different exponents almost matches the STO. Of course, no linear combination of GTOs will ever reproduce the non-differentiable cusp, but this is in practice irrelevant, since it affects only a tiny volume element. The bottom line is that, even though GTOs exhibit neither the nuclear cusp nor the proper exponential decay behavior of the STOs, just a linear combination of a few GTOs with proper exponents is required to provide a very accurate approximate description of these features. This makes GTOs the most widely used basis functions for electronic structure calculations on molecular systems. Most of the prominent ab initio quantum chemistry program packages like MOLPRO, TURBOMOLE, QCHEM, GAUSSIAN, ... employ GTOs as basis functions. On the other hand, for periodic systems (crystals, surface slabs), the use of plane waves is a natural choice; the latter are commonly used in the solid state physics community. Plane waves, in contrast to GTOs, form an orthonormal basis, which has many advantages. However, since our target systems are molecules, we focus on GTOs.

To fit and pre-contract GTOs to STOs, as done in Fig. 5, actually was a common strategy in earlier days; the term STO-nG implies a pre-contracted set of n GTOs with the exponents optimized to best fit an STO representing a certain orbital of an atom. The STO-3G basis set is still used as a minimal set for testing purposes; e.g., for the atoms of the first row it consists of five functions (1s, 2s, 2p), each represented by three pre-contracted GTOs. Nowadays, however, much richer basis sets are employed in serious calculations.
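The STO-nG idea can be reproduced in a few lines: for fixed exponents, the contraction coefficients that best fit a 1s STO e^(−r) in a least-squares sense on a radial grid follow from a linear fit. The exponents below are the commonly quoted STO-nG 1s values for ζ = 1; since the coefficients are refit here on a simple, unweighted grid, they differ slightly from the tabulated, overlap-maximizing ones. A sketch:

```python
import numpy as np

r = np.linspace(0.0, 8.0, 400)
sto = np.exp(-r)                      # 1s STO with zeta = 1

def rms_fit(zetas):
    """RMS error of the best least-squares contraction for fixed exponents."""
    A = np.exp(-np.outer(r**2, np.asarray(zetas)))  # primitive s-GTOs on the grid
    c, *_ = np.linalg.lstsq(A, sto, rcond=None)     # linear fit of coefficients
    return np.sqrt(np.mean((A @ c - sto)**2))

# commonly quoted STO-1G/2G/3G exponents for a zeta = 1 STO
errors = [rms_fit(z) for z in ([0.270950],
                               [0.151623, 0.851819],
                               [0.109818, 0.405771, 2.227660])]
```

The fit error drops markedly from one to three primitives, mirroring the STO-1G/2G/3G curves of Fig. 5.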

A GTO basis set for a certain atom type is represented by a set of primitive GTOs (PGTOs), i.e., individual GTOs of a certain angular momentum type and with a certain exponent, which are then pre-contracted to a smaller set of contracted GTOs (CGTOs). The exponents and contraction coefficients are atom-type specific; they are determined once and for all, either by fitting to reference STOs (as in the case of the STO-nG basis sets), or, as done in more recent times, by minimizing the variational energy in HF or CI calculations of the related atom, or of characteristic small molecules formed with that atom. Such a CGTO thus is a fixed linear combination of PGTOs and is henceforth considered as a single basis function. In ab initio quantum chemistry program packages GTO basis sets are usually available as libraries, i.e., optimal exponents and contraction coefficients are available for most of the atoms, and are entirely transparent to the user. In the following we will provide a short overview of different GTO basis sets.

The smallest basis sets are, as the name already indicates, minimal basis sets. They comprise just as many CGTOs as are necessary to accommodate all the electrons of the neutral atoms of the molecule. E.g., for the water molecule a minimal set comprises two s functions (originating from the two H atoms), plus another two s functions (from the 1s and 2s shells, respectively, of oxygen), plus a p-function set (from the 2p shell) consisting of three components, thus in


total seven functions. An example of a minimal basis set is the above-mentioned STO-3G set. Minimal basis sets are much too small for serious calculations and are just used for test purposes, e.g., when debugging new code.

When extending the basis set, the focus is usually on the valence shells of the individual atoms of the molecule. This is motivated by the fact that chemistry involves the valence electrons, but hardly affects the electrons in the inner shells of the atoms. The latter thus are already well described by the related CGTOs, which were optimized in e.g. atomic calculations (vide infra). In so-called valence double zeta basis sets the number of CGTOs representing the valence shells of the atoms is doubled. For our water molecule this implies that we now have two s functions (rather than one) for each H atom, plus another three s functions (one from the 1s shell, two from the 2s shell of oxygen), plus two p-function sets (from the 2p shell), hence in total 13 functions. Examples are the 3-21G and 6-31G basis sets. The nomenclature of these basis sets actually reveals how they are constructed: 6-31G implies a single CGTO consisting of six PGTOs describing an inner shell (1s of oxygen), and two CGTOs describing the valence shells (2s, 2p of oxygen), one consisting of three, the other of a single PGTO. Valence triple zeta and valence quadruple zeta basis sets are constructed analogously.
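The function counts quoted above for water follow from a trivial tally of shell components (one per s shell, three per p shell), which can be sketched as:

```python
NCOMP = {'s': 1, 'p': 3, 'd': 5}       # spherical components per shell type

def count_functions(atoms):
    """atoms: one list of shell labels per atom in the molecule."""
    return sum(NCOMP[shell] for shells in atoms for shell in shells)

# water: two H atoms and one O atom
minimal = [['s'], ['s'], ['s', 's', 'p']]                      # minimal basis
vdz     = [['s', 's'], ['s', 's'], ['s', 's', 's', 'p', 'p']]  # valence double zeta
```

This reproduces the seven functions of the minimal set and the 13 functions of the valence double zeta set.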

Just splitting the valence shells into more and more CGTOs, however, is not a very balanced way to improve a basis set. In a molecule, where the electrons on a certain nucleus experience the field exerted by the other electrons and nuclei, this nucleus no longer is the optimal center for an expansion of the molecular orbitals in terms of basis functions; a point slightly displaced from the position of that nucleus would be better. This displacement is only small, such that we can stick to our LCAO expansion around the nucleus and describe the slightly displaced basis function as

ϕ_{ζ,lx,ly,lz,R_A}(r + ∆r) = ϕ_{ζ,lx,ly,lz,R_A}(r) + ∇ϕ_{ζ,lx,ly,lz,R_A}(r) · ∆r, with   (VIII.9)

∇ϕ_{ζ,lx,ly,lz,R_A}(r) · ∆r = (d/dx) ϕ_{ζ,lx,ly,lz,R_A}(x, y, z) ∆x + (d/dy) ϕ_{ζ,lx,ly,lz,R_A}(x, y, z) ∆y + (d/dz) ϕ_{ζ,lx,ly,lz,R_A}(x, y, z) ∆z.

According to eq. (VIII.8) this implies that, in addition to our ordinary Gaussian basis set, we should also include additional functions with lower and higher angular momenta than those in the original basis, but with the same exponents. Such functions with lower angular momenta are usually covered by the original basis anyway, but functions with higher angular momenta (denoted as polarization functions) indeed have to be added. This implies that, for example, for hydrogen atoms in a molecule not just s, but also p, and possibly even higher GTOs should be included in the basis set; for atoms of the second period not just s and p, but d, and possibly f GTOs. Examples of such basis sets including polarization functions are 6-31G*, 6-31G**, or the correlation consistent Dunning basis sets cc-pVXZ (X=D,T,Q). The polarization function shells remain uncontracted in most basis sets.

Finally, some systems, like negatively charged anions, or polar systems with one part of the molecule partially negatively charged, exhibit diffuse charge distributions for which the standard basis sets are inappropriate. Note also that a GTO decays too quickly compared to an STO, as displayed in Fig. 5. To cover such cases, standard basis sets can be augmented by additional diffuse GTOs, i.e., GTOs with particularly low exponents, which also remain uncontracted. Diffuse polarization functions are usually not needed. Examples of such basis sets with diffuse functions are 6-31G*+, 6-311G**++, or the augmented correlation consistent Dunning basis sets aug-cc-pVXZ (X=D,T,Q). It is usually a good idea to employ basis sets with diffuse functions when polarizabilities are to be calculated, or interaction energies of intermolecular complexes and clusters, or electronically excited states. For such purposes, methods beyond the Hartree-Fock approximation are required. The Hartree-Fock approximation excludes e.g. van der Waals dispersive forces. In a pure Hartree-Fock picture there is no attractive, but just a repulsive interaction between two Ar atoms. This obvious failure of Hartree-Fock theory is intimately connected to the lack of electron correlation: in section VI we demonstrated that any single Slater determinant ansatz (like Hartree-Fock) lacks the description of the Coulomb hole. Therefore, to properly account for dispersive forces, post-Hartree-Fock electron correlation methods like Møller-Plesset perturbation theory or coupled cluster methods are required, which include many Slater determinants. These are beyond the scope of the present discussion and deferred to a later lecture series.

Instead, we will turn our attention to a method which tries to tackle the electron correlation problem in an entirely different and less straightforward way, namely Density Functional Theory (DFT).

IX. DENSITY FUNCTIONAL THEORY (DFT)

Density Functional Theory has its origin in solid state physics, but is nowadays also widely used in chemistry for the description of molecules. It has a semi-empirical flavor, as we will see, and its theoretical foundation is not as strong as for wavefunction based methods, i.e., for Hartree-Fock plus electron correlation methods.


A. The Hohenberg-Kohn Theorem

The basis of DFT is the Hohenberg-Kohn theorem. It states that the energy of the electronic ground state of a system is completely determined by its one-particle density function ρ1(x), defined in eq. (IV.6) of section IV. Or, in other words, it states that the energy of a system is a functional of ρ1(x). From section IV C we already know that the energy is a functional of the one-particle density matrix and the two-particle density function; yet that the energy can also be regarded as a functional of the much simpler one-particle density function alone comes as a surprise.

The proof of the Hohenberg-Kohn theorem is relatively simple and proceeds as follows. Let's start by writing once more the clamped nuclei Hamiltonian (I.4) in atomic units,

H_el = −Σ_{i=1}^{N} (1/2)∇²_i − Σ_{K=1}^{M} Σ_{i=1}^{N} Z_K/r_iK + (1/2) Σ'_{ij} 1/r_ij + E_nuc = −Σ_{i=1}^{N} (1/2)∇²_i + V_ext(r) + (1/2) Σ'_{ij} 1/r_ij + E_nuc.   (IX.1)

The nuclear repulsion energy E_nuc does not have any influence on the electrons; it just adds a constant value to the electronic energy. The only term which couples the electrons to external potentials is the second term, V_ext(r). If external electric or magnetic fields are not present, as in eq. (IX.1), V_ext corresponds to the potential between electrons and nuclei. It is thus clear that V_ext(r), as any potential, depends on the position vector r. For ease of reading we drop the function argument when writing V_ext(r) in the following.

Let's now assume that we have two different external potentials, V_ext and V′_ext, which, when solving the related Schrodinger equations

H_el|Ψ〉 = E_0|Ψ〉,

H′_el|Ψ′〉 = E′_0|Ψ′〉,

and calculating the density ρ1(x) according to eq. (IV.6), yield exactly the same density. Now, obviously, |Ψ′〉 is only an approximation to the true solution |Ψ〉 of the first Schrodinger equation. Consequently, according to the variational principle we can conclude that

〈Ψ′|H_el|Ψ′〉 > E_0,
〈Ψ′|H′_el|Ψ′〉 + 〈Ψ′|H_el − H′_el|Ψ′〉 > E_0,
E′_0 + 〈Ψ′|V_ext − V′_ext|Ψ′〉 > E_0,

E′_0 + ∫ ρ_1(x)(V_ext − V′_ext) dx > E_0.   (IX.2)

In the last step we made use of the premise that ρ1(x) = ρ′1(x). Note that we just consider the electronic ground state (hence the subscript 0 in the energies), for which the variational principle is strictly valid.

On the other hand, |Ψ〉 is an approximation to the true solution |Ψ′〉 of the second Schrodinger equation, and we can conclude that

〈Ψ|H′_el|Ψ〉 > E′_0,
〈Ψ|H_el|Ψ〉 + 〈Ψ|H′_el − H_el|Ψ〉 > E′_0,
E_0 + 〈Ψ|V′_ext − V_ext|Ψ〉 > E′_0,

E_0 + ∫ ρ_1(x)(V′_ext − V_ext) dx > E′_0,  i.e.,  E_0 − ∫ ρ_1(x)(V_ext − V′_ext) dx > E′_0.   (IX.3)

Now adding up the two inequalities (IX.2) and (IX.3) yields the logical contradiction

E_0 + E′_0 > E_0 + E′_0,  i.e.,  0 > 0.   (IX.4)

Apparently, our initial premise, ρ1(x) = ρ′1(x), is wrong! If the external potentials V_ext and V′_ext are different, then the one-particle density functions ρ1(x) and ρ′1(x) must also be different! In other words, there is a bijective mapping between densities ρ1(x) and energies E_0, i.e., the energy is a functional E_0[ρ1(x)] of the density ρ1(x). This is an amazing result! The energy is uniquely determined by ρ1(x), i.e., a much, much simpler object than the wavefunction itself. However, here comes the catch: in contrast to eq. (IV.17), where we have a clear prescription on how to compute the exact energy from the exact one-particle density matrix and the exact two-particle density function (provided that these objects are at hand), there is no recipe on how to calculate the exact energy from the exact density function. The Hohenberg-Kohn theorem is just a proof of the existence of such a mapping; it does not tell us more. Actually, calculating the exact energy from the exact density function could be as complicated as solving the Schrodinger equation in the first place. The connection between energy and density is unknown and therefore the terrain of modelling and approximations.


B. Partitioning the Energy Functional

Now that we know of its existence by virtue of the Hohenberg-Kohn theorem, we can partition the energy functional into terms which are straightforward to compute, and others which are not. In the following we just consider the spinless densities P1(r) defined in eq. (IV.7): our Hamiltonian does not depend on spin, which thus can be integrated out. Dropping furthermore the subscript 0 for convenience (the whole discussion indeed is restricted to electronic ground states), we write the energy functional as

E[P1(r)] = T [P1(r)] + Eext[P1(r)] + Eee[P1(r)] = T [P1(r)] + Eext[P1(r)] + J [P1(r)] +K[P1(r)] (IX.5)

where T[P1(r)], Eext[P1(r)], and Eee[P1(r)] represent the kinetic energy, the external energy (electron-nucleus interaction energy in the absence of external fields), and the electron-electron interaction energy, respectively. The latter is further partitioned into the classical Coulomb repulsion energy J[P1(r)] plus its complement, denoted as K[P1(r)]. Eext[P1(r)] and J[P1(r)] are classical electron-nucleus and electron-electron Coulomb energies, respectively,

E_ext[P1(r)] = −Σ_{K=1}^{M} ∫ Z_K P1(r) / |R_K − r| dr,    J[P1(r)] = (1/2) ∫∫ P1(r) P1(r′) / |r − r′| dr dr′,   (IX.6)

and easily calculated. More problematic are T[P1(r)] and K[P1(r)]. What physicists usually do in such a situation is employ a model to calculate such terms; in the present case the uniform electron gas, an excellent model for simple metals (recall that DFT has its roots in solid state physics). For the model of the uniform electron gas, T[P1(r)] and K[P1(r)] can be written as

T_TF[P1(r)] = C_F ∫ P1^(5/3)(r) dr,    C_F = (3/10)(3π²)^(2/3),   (IX.7)

K_D[P1(r)] = C_X ∫ P1^(4/3)(r) dr,    C_X = −(3/4)(3/π)^(1/3).   (IX.8)

Using these expressions in the energy functional leads to the Thomas-Fermi,

ETF[P1(r)] = TTF[P1(r)] + Eext[P1(r)] + J [P1(r)], (IX.9)

and to the Thomas-Fermi-Dirac,

ETFD[P1(r)] = ETF[P1(r)] +KD[P1(r)], (IX.10)

theories (the former neglects any electron-electron interactions beyond the classical Coulomb repulsion). The errors in the total energy are 15-50 % (in comparison to Hartree-Fock theory, which usually has less than 1 % error in the total energy). Moreover, neither Thomas-Fermi nor Thomas-Fermi-Dirac theory predicts chemical bonds; molecules do not exist in these theories. This is a rather devastating outcome, making DFT in this form fairly uninteresting for chemistry. So where is the problem? Apparently, the model of the uniform electron gas is entirely unsuitable for one of the two terms T[P1(r)] or K[P1(r)], or possibly even for both. Most problematic is the kinetic energy T[P1(r)], which obviously is much better described by the Hartree-Fock Slater determinant. Kohn-Sham theory benefits from that in the following way:
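How badly the uniform electron gas expressions fail for a real, strongly inhomogeneous density is easy to quantify. For the exact hydrogen 1s density P1(r) = e^(−2r)/π the exact kinetic energy is 0.5 hartree and the exact (self-)exchange energy is −5/16 = −0.3125 hartree. A quick radial quadrature of eqs. (IX.7) and (IX.8) (a numerical sketch; grid parameters chosen ad hoc):

```python
import numpy as np

CF = 0.3 * (3 * np.pi**2)**(2/3)        # Thomas-Fermi constant, eq. (IX.7)
CX = -0.75 * (3 / np.pi)**(1/3)         # Dirac constant, eq. (IX.8)

r, dr = np.linspace(1e-8, 30.0, 300000, retstep=True)
P = np.exp(-2 * r) / np.pi              # exact hydrogen 1s density

def radial_integral(f):
    """Integrate a spherically symmetric integrand over all space."""
    return float(np.sum(f * 4 * np.pi * r**2) * dr)

N   = radial_integral(P)                # electron count, should be 1
TTF = CF * radial_integral(P**(5/3))    # Thomas-Fermi kinetic energy
KD  = CX * radial_integral(P**(4/3))    # Dirac exchange energy
```

This gives TTF ≈ 0.289 hartree versus the exact 0.5 (about 40 % too small), and KD ≈ −0.213 versus −0.3125, illustrating why Thomas-Fermi(-Dirac) theory is useless for chemistry.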

C. Kohn-Sham DFT, part 1

The basic idea is to partition T[P1(r)] into two terms: a first, large term which is calculated similarly as in Hartree-Fock theory, and a second, small correction term. This is accomplished by re-specifying the Hamilton operator (IX.1) as

H_el(λ) = −Σ_{i=1}^{N} (1/2)∇²_i + V_ext(λ) + (λ/2) Σ'_{ij} 1/r_ij + E_nuc.   (IX.11)

Here, λ serves as a knob to dim the electron-electron interaction: for λ = 1 we have the physical system with normal electron-electron interaction, for λ = 0 the electron-electron interaction is switched off. Note that in eq. (IX.11) V_ext(λ) is a function of λ. We choose it such that for any setting of λ still the exact density P1(r) is obtained (as when solving the corresponding Schrodinger equation for λ = 1 and calculating P1(r) subsequently via eqs. (IV.6)


and (IV.7)). Thus, H_el(λ = 0) is the Hamiltonian of a fictitious non-interacting system, for which, when solving the corresponding Schrodinger equation, the exact density P1(r) is obtained. This is the so-called Kohn-Sham system. In other words, we have specified an unphysical Hamiltonian for a non-interacting system which yields the exact density P1(r) of the physical, interacting system. Needless to say, we don't know V_ext(λ = 0). However, we now know exactly how to calculate the kinetic energy of the Kohn-Sham system: since the (properly antisymmetrized) eigenfunction of a non-interacting Hamiltonian is a Slater determinant, we just calculate it as the expectation value over the (one-electron) kinetic energy operator via the Slater-Condon rules as

T_S = Σ_{i=1}^{N/2} 〈φ_i| −(1/2)∇² |φ_i〉.   (IX.12)

Of course, this looks a lot like the kinetic energy of Hartree-Fock theory, with the sole difference that the orbitals φi(r) used to construct the Slater determinant (cf. eq. (VII.57)) are not Hartree-Fock, but Kohn-Sham spatial orbitals. We will see in a minute how to get them. For the Kohn-Sham system with λ = 0, TS is the exact kinetic energy by construction (provided that the Kohn-Sham orbitals are exact). For the true system with λ = 1, TS is an approximation, but a much, much better one than TTF. Let's now go back and rewrite our energy functional for the true system as

EKS[P1(r)] = TS[P1(r)] + Eext[P1(r)] + J [P1(r)] + Exc[P1(r)], with

Exc[P1(r)] = T [P1(r)]− TS[P1(r)] + Eee[P1(r)]− J [P1(r)]. (IX.13)

An attentive reader will object to the previous equation, pointing out that the kinetic energy is no longer a proper functional of the density, but rather a functional of the Kohn-Sham orbitals. One could counter that it is in principle possible to write, in turn, the orbitals as a functional of the density, but this is non-trivial, and we sweep this objection under the carpet for the present discussion.

In eq. (IX.13) the infamous exchange-correlation functional Exc[P1(r)] now appears as the container for everything that is unclear and difficult to compute. Obviously, due to the term Eee[P1(r)] − J[P1(r)], it comprises everything of the electron-electron interaction but the straightforward classical Coulomb repulsion (IX.6), namely exchange, but also electron correlation effects. Furthermore, due to the term T[P1(r)] − TS[P1(r)], it also contains the small correction term for the kinetic energy between λ = 0 and λ = 1. Of course, Exc[P1(r)] is unknown and has to be modeled. At that point the method is no longer exact. It would be exact if the exact Exc[P1(r)] were known and used. That is what DFT people mean when they say that DFT is in principle exact. In practice, DFT is as good as the employed Exc[P1(r)]. Over the past decades an incredible zoo of different Exc[P1(r)] functionals has been proposed, and they have become increasingly complicated over the years. We will have a glimpse at that zoo later.

Let's now see how to obtain the Kohn-Sham orbitals. If we look at EKS[P1(r)] in eq. (IX.13), it looks a lot like the Hartree-Fock energy functional. In fact, when replacing Exc[P1(r)] by the Hartree-Fock exchange energy, we retrieve exactly the Hartree-Fock energy functional (cf. eq. (IX.6)). We can thus proceed as we did in Hartree-Fock theory. We variationally minimize EKS[P1(r)] w.r. to the Kohn-Sham orbitals φi(r), with the condition that the φi(r) form an orthonormal basis (recall that we used this property when deriving the Slater-Condon rules). To this end we set up the Lagrangian

L_KS[P1(r)] = E_KS[P1(r)] − Σ_{i,j=1}^{N} ε_ji ( 〈φ_i|φ_j〉 − δ_ij ),   (IX.14)

which indeed is virtually identical to eq. (VII.3) of Hartree-Fock theory. We proceed in an entirely analogous way by setting up stationary conditions for linear variations of the orbitals. The first three terms of EKS[P1(r)] in LKS[P1(r)] are identical to those in the Hartree-Fock Lagrangian (VII.3) and depend explicitly on the Kohn-Sham orbitals φi(r); the fourth term, Exc[P1(r)], however, does not, and we have to discuss functional derivatives first.

D. Short digression: Functional derivatives

Consider a function of a single variable, F(y). Its derivative at y = q is dF/dy|_q. If we want to know how the function changes if we move an infinitesimal amount dy away from y = q, we can calculate this change as

δF = (dF/dy)|_q dy.   (IX.15)


Now let us consider the same for a function of n variables, F(y1, y2, . . . , yn). Here we have

δF = (∂F/∂y_1)|_{y0} dy_1 + (∂F/∂y_2)|_{y0} dy_2 + · · · + (∂F/∂y_n)|_{y0} dy_n = Σ_{k=1}^{n} (∂F/∂y_k)|_{y0} dy_k = ∇F|_{y0} · dy,   (IX.16)

where ∇F|_{y0} means that the gradient of the function F is evaluated at position vector y0 = (q1, q2, . . . , qn) in the variable space. We can write the independent variables y1, y2, . . . , yn collectively as y(k), k = 1, 2, . . . , n. Then we can regard our function of n variables, F(y1, y2, . . . , yn), as a kind of functional F[y(k)], where the function y(k) is just defined for the discrete integer values k = 1, 2, . . . , n.
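Eq. (IX.16) is just the multivariate chain rule and can be verified numerically for, e.g., the arbitrary test function F(y) = Σ_k y_k², whose gradient is 2y:

```python
import numpy as np

def F(y):
    """Test function F(y) = sum_k y_k^2, with gradient 2 y."""
    return np.sum(y**2)

y0 = np.array([0.3, -1.2, 0.7])
dy = 1e-6 * np.array([1.0, 2.0, -0.5])      # small displacement of the variables
delta_F  = F(y0 + dy) - F(y0)               # actual change of F
grad_dot = np.dot(2 * y0, dy)               # gradient formula, eq. (IX.16)
```

The two numbers agree up to terms quadratic in the displacement.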

True functionals, however, depend on functions which are defined for continuous, real valued arguments within a certain interval [a, b]. Let's consider such a "true" functional F[y(x)]. We can represent the function y(x) by its values on n points, where the k-th point is defined as xk = a + kε, with nε = b − a. Obviously, the more points n we have, the smaller ε becomes. This yields the approximate representation y(x) ≈ y(xk), k = 1, 2, . . . , n, of the function y(x). According to the second equality of (IX.16) we have for the derivative of this approximate functional

δF = Σ_{k=1}^{n} (∂F/∂y_k)|_{y0} dy_k.   (IX.17)

The larger n is, i.e., the smaller ε is, the finer the representation of y(x) becomes, and in the limit n → ∞, y(xk) becomes identical to y(x). How does δF in eq. (IX.17) behave in the limit n → ∞? To see this, recall the definition of an integral,

∫_a^b dx f(x) = lim_{ε→0} Σ_{k=1}^{n} ε f(x_k).   (IX.18)

Furthermore, let's rewrite eq. (IX.17) such that it takes the form

δF = Σ_{k=1}^{n} ε ( (1/ε) ∂F[y(x_k)]/∂y(x_k)|_{y0(xk)} ) dy(x_k).   (IX.19)

Now taking the limit n→∞, or equivalently, ε→ 0, and using eq. (IX.18) we obtain

\[
  \delta F = \int_a^b dx\, \left.\frac{\delta F[y(x)]}{\delta y(x)}\right|_{y_0(x)} \delta y(x). \tag{IX.20}
\]

When writing the previous equation we have introduced the notation δy(x) = dy(x_k)|_{n→∞}. Moreover, the denominator ε has been absorbed into δF[y(x)]/δy(x). Eq. (IX.20) defines the functional derivative δF[y(x)]/δy(x). The meaning of eq. (IX.20) is analogous to that of eq. (IX.16): there, the change δF of a function F of discrete variables is proportional to the infinitesimal changes in these variables, with proportionality constants equal to the partial derivatives of the function w.r. to these variables. Here, the change δF of the functional F[y(x)] on altering the function y(x) slightly is proportional to the infinitesimal change δy(x), with constant of proportionality equal to the functional derivative. Since y(x), as well as δF[y(x)]/δy(x), are continuous functions of x, we have to integrate over x rather than sum over a discrete variable index.
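The limiting procedure can be illustrated numerically. The following sketch (Python/NumPy; the functional F[y] = ∫ y(x)² dx and the functions chosen are hypothetical examples, not taken from the text) discretizes a simple functional on a grid, as above, and checks that its change under a small variation δy(x) agrees with the right-hand side of eq. (IX.20), using δF/δy(x) = 2y(x):

```python
import numpy as np

# Numerical sketch of the discretization limit behind eq. (IX.20).
# Hypothetical example functional: F[y] = ∫_a^b y(x)^2 dx, with
# functional derivative δF/δy(x) = 2 y(x) (not an equation from the text).
a, b, n = 0.0, 1.0, 100_000
eps = (b - a) / n
x = a + eps * np.arange(1, n + 1)             # grid points x_k = a + k*eps

def F(y):
    """Discretized functional, cf. eq. (IX.18): sum_k eps * y_k^2."""
    return eps * np.sum(y**2)

y0 = np.sin(np.pi * x)                        # reference function y_0(x)
dy = 1e-6 * x * (1.0 - x)                     # small variation δy(x)

dF_exact = F(y0 + dy) - F(y0)                 # actual change of the functional
dF_func_deriv = eps * np.sum(2.0 * y0 * dy)   # ∫ dx (δF/δy)|_{y0} δy

print(dF_exact, dF_func_deriv)                # agree to first order in δy
```

The residual difference between the two numbers is of second order in δy, exactly as the first-order Taylor argument used below for the LDA functional suggests.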

E. Kohn-Sham DFT, part 2

Now we are in a position to write down the linear variation of the functional Exc[P1(r)] w.r. to the density P1(r). Using eq. (IX.20) we get

\[
  \delta E_{xc} = \int d\mathbf{r}\, \frac{\delta E_{xc}[P_1(\mathbf{r})]}{\delta P_1(\mathbf{r})}\, \delta P_1(\mathbf{r})
  = \int d\mathbf{r}\, V_{xc}([P_1(\mathbf{r})]; \mathbf{r})\, \delta P_1(\mathbf{r}), \tag{IX.21}
\]

where we have dropped the explicit reference to the reference point. This equation also implicitly defines the exchange-correlation potential as the functional derivative of the exchange-correlation energy w.r. to the density, i.e.,

\[
  V_{xc}([P_1(\mathbf{r})]; \mathbf{r}) = \frac{\delta E_{xc}[P_1(\mathbf{r})]}{\delta P_1(\mathbf{r})}. \tag{IX.22}
\]


Note that the exchange-correlation potential is still a functional of the density, but it is also a function of r, whereas the exchange-correlation energy (like any energy) is independent of r (r is integrated out). Actually, we are interested not in the linear variation of Exc[P1(r)] w.r. to the density P1(r), but w.r. to the k-th Kohn-Sham orbital, since we want to minimize the Lagrangian (IX.14) w.r. to linear variations in the orbitals φi(r). Since Kohn-Sham DFT implies a single-determinant ansatz (vide supra) we have the relation (cf. eq. (V.22))

\[
  P_1(\mathbf{r}) = 2 \sum_{i=1}^{N/2} \phi_i^*(\mathbf{r})\, \phi_i(\mathbf{r}). \tag{IX.23}
\]

Using this, we get

\[
  \delta P_1^{(k)}(\mathbf{r}) = 2 \left( \delta\phi_k^*(\mathbf{r})\, \phi_k(\mathbf{r}) + \phi_k^*(\mathbf{r})\, \delta\phi_k(\mathbf{r}) \right), \tag{IX.24}
\]

and by plugging that into eq. (IX.21),

\[
  \delta E_{xc}^{(k)} = 2 \int d\mathbf{r}\, V_{xc}([P_1(\mathbf{r})]; \mathbf{r}) \left( \delta\phi_k^*(\mathbf{r})\, \phi_k(\mathbf{r}) + \phi_k^*(\mathbf{r})\, \delta\phi_k(\mathbf{r}) \right)
  = 2 \langle \delta\phi_k | V_{xc}([P_1(\mathbf{r})]; \mathbf{r}) | \phi_k \rangle
  + 2 \langle \phi_k | V_{xc}([P_1(\mathbf{r})]; \mathbf{r}) | \delta\phi_k \rangle. \tag{IX.25}
\]

Having this last piece at hand we can now proceed exactly as in the Hartree-Fock case, i.e., requiring stationarity of the Kohn-Sham Lagrangian (IX.14),

\[
  \delta L_{KS}^{(k)}[P_1(\mathbf{r})] \overset{!}{=} 0, \quad \forall k, \tag{IX.26}
\]

w.r. to variations of the Kohn-Sham orbitals. This finally yields the Kohn-Sham equations

\[
  F_{KS}\,|\phi_k\rangle - \sum_{i=1}^{N/2} \varepsilon_{ik}\, |\phi_i\rangle = 0, \quad \forall k, \quad \text{with} \tag{IX.27}
\]
\[
\begin{aligned}
  F_{KS}(\mathbf{r}) &= -\frac{1}{2}\nabla^2 - \sum_K \frac{Z_K}{|\mathbf{R}_K - \mathbf{r}|}
  + \int \frac{P_1(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d\mathbf{r}' + V_{xc}([P_1(\mathbf{r})]; \mathbf{r}) \\
  &= -\frac{1}{2}\nabla^2 + V_{ext}(\lambda = 0) \\
  &= h(\mathbf{r}) + J(\mathbf{r}) + V_{xc}([P_1(\mathbf{r})]; \mathbf{r}),
\end{aligned} \tag{IX.28}
\]

which determine the Kohn-Sham orbitals. By comparing the first and second equalities in (IX.28) we can now recognize Vext(λ = 0) of the fictitious non-interacting Kohn-Sham system: it is the sum of the nucleus-electron potential, the potential between an electron and the mean field of all electrons, and the exchange-correlation potential, the latter two evaluated with the exact density.

Eqs. (IX.27, IX.28) are identical to the closed-shell Hartree-Fock equations apart from the last term in the Kohn-Sham operator, where Vxc([P1(r)]; r) substitutes −1/2 K(r). In order to proceed to a practical method we again introduce a finite basis set as in section VII F, which leads us to

\[
  \mathbf{F}^{KS} \mathbf{C} = \mathbf{S}\, \mathbf{C}\, \boldsymbol{\varepsilon}, \tag{IX.29}
\]

with

\[
  \left(F^{KS}\right)_{\mu\nu} = h_{\mu\nu} + \sum_{\rho\sigma=1}^{M} D_{\rho\sigma} \langle \mu\rho | \nu\sigma \rangle
  + \langle \mu | V_{xc}([P_1(\mathbf{r})]; \mathbf{r}) | \nu \rangle. \tag{IX.30}
\]

Comparing the last equation to eq. (VII.63) in the Hartree-Fock case, it is clear that when converting an existing Hartree-Fock program into a DFT program the only thing to do is to replace the calculation of the exchange contribution by the calculation of ⟨μ|Vxc([P1(r)]; r)|ν⟩. The latter is usually evaluated numerically by integration over a grid. The form of Vxc([P1(r)]; r) is of course given by the specification of the exchange-correlation functional Exc[P1(r)] (Vxc([P1(r)]; r) is its functional derivative). For local functionals of the form

\[
  E_{xc}[P_1(\mathbf{r})] = \int d\mathbf{r}\, f(P_1(\mathbf{r})) \tag{IX.31}
\]


the functional derivative just corresponds to the partial derivative of f(P1(r)) w.r. to the density P1(r), i.e.,

\[
  V_{xc}([P_1(\mathbf{r})]; \mathbf{r}) = \frac{\delta E_{xc}[P_1(\mathbf{r})]}{\delta P_1(\mathbf{r})}
  = \frac{\partial f(P_1(\mathbf{r}))}{\partial P_1(\mathbf{r})}. \tag{IX.32}
\]

We demonstrate this for the simple example of the local density approximation (LDA) exchange functional (IX.8),

\[
  E_x^{LDA}[P_1(\mathbf{r})] = C_x \int d\mathbf{r}\, P_1(\mathbf{r})^{4/3}. \tag{IX.33}
\]

For the linear variation of \(E_x^{LDA}[P_1(\mathbf{r})]\) w.r. to the density we straightforwardly get

\[
\begin{aligned}
  \delta E_x^{LDA}[P_1(\mathbf{r})] &= E_x^{LDA}[P_1(\mathbf{r}) + \delta P_1(\mathbf{r})] - E_x^{LDA}[P_1(\mathbf{r})]
  = C_x \int d\mathbf{r} \left( [P_1(\mathbf{r}) + \delta P_1(\mathbf{r})]^{4/3} - P_1(\mathbf{r})^{4/3} \right) \\
  &= C_x \int d\mathbf{r} \left( P_1(\mathbf{r})^{4/3} + \frac{4}{3} P_1(\mathbf{r})^{1/3}\, \delta P_1(\mathbf{r}) - P_1(\mathbf{r})^{4/3} \right)
  = C_x \int d\mathbf{r}\, \frac{4}{3} P_1(\mathbf{r})^{1/3}\, \delta P_1(\mathbf{r}),
\end{aligned} \tag{IX.34}
\]

where in the third equality we made a first-order Taylor expansion of \([P_1(\mathbf{r}) + \delta P_1(\mathbf{r})]^{4/3}\). Comparing eq. (IX.34) to eq. (IX.21) we can conclude that

\[
  V_x^{LDA}[P_1(\mathbf{r})] = C_x\, \frac{4}{3} P_1(\mathbf{r})^{1/3}
  = \frac{\partial\, C_x P_1(\mathbf{r})^{4/3}}{\partial P_1(\mathbf{r})}. \tag{IX.35}
\]
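This relation is easy to check numerically. The following sketch (Python/NumPy; the one-dimensional grid, the density profile, and the variation are illustrative choices, not from the text) compares the actual change of E_x^LDA under a small density variation with the first-order prediction ∫ dr V_x^LDA δP1:

```python
import numpy as np

# Numerical check of eq. (IX.35): the change of E_x^LDA under a small
# density variation should equal ∫ dr V_x^LDA δP_1 to first order.
# The 1D grid and density profile are illustrative choices.
x = np.linspace(-5.0, 5.0, 20_001)
dx = x[1] - x[0]
Cx = -(3.0 / 4.0) * (3.0 / np.pi) ** (1.0 / 3.0)   # Dirac/LDA exchange constant

P1 = np.exp(-x**2)                    # illustrative density profile
dP = 1e-6 * np.exp(-(x - 1.0)**2)     # small density variation

E = lambda rho: Cx * dx * np.sum(rho ** (4.0 / 3.0))   # eq. (IX.33), discretized
Vx = Cx * (4.0 / 3.0) * P1 ** (1.0 / 3.0)              # eq. (IX.35)

dE_exact = E(P1 + dP) - E(P1)          # lhs of eq. (IX.34)
dE_linear = dx * np.sum(Vx * dP)       # first-order prediction

print(dE_exact, dE_linear)
```

The two numbers differ only by terms quadratic in δP1, i.e., by the part neglected in the Taylor expansion above.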

Let us now summarize the Kohn-Sham part: Kohn-Sham DFT, in contrast to Hartree-Fock, is in principle exact, provided that the exact exchange-correlation functional were known and employed, which of course cannot be realized in practice. The computational cost is comparable to Hartree-Fock; with simple functionals it is even much lower, since the expensive calculation of the Hartree-Fock exchange part is replaced by numerical integration over a simple exchange-correlation potential.
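To give a rough feeling for this grid quadrature, here is a sketch (Python/NumPy; the one-dimensional Gaussian "basis functions", the quadrature grid, and the single-orbital density are toy assumptions, not a real molecular setup) that assembles the matrix elements ⟨μ|Vxc|ν⟩ ≈ Σ_g w_g χ_μ(r_g) Vxc(r_g) χ_ν(r_g) for the LDA exchange potential of eq. (IX.35):

```python
import numpy as np

# Toy 1D sketch of the grid quadrature <mu|Vxc|nu> ≈ Σ_g w_g χ_μ(x_g) Vxc(x_g) χ_ν(x_g).
# Basis functions and quadrature grid are illustrative, not a real DFT grid.
x_g, w_g = np.polynomial.legendre.leggauss(200)   # Gauss-Legendre nodes/weights on [-1, 1]
x_g, w_g = 5.0 * x_g, 5.0 * w_g                   # map to [-5, 5]

def chi(alpha, x):
    """Normalized 1D Gaussian 'basis function' (hypothetical)."""
    return (2.0 * alpha / np.pi) ** 0.25 * np.exp(-alpha * x**2)

basis = [0.5, 1.5]                                    # exponents of two basis functions
C = np.array([0.7, 0.5])                              # some occupied-orbital coefficients
phi = sum(c * chi(a, x_g) for c, a in zip(C, basis))  # occupied orbital on the grid
P1 = 2.0 * phi**2                                     # density, cf. eq. (IX.23), one orbital

Cx = -(3.0 / 4.0) * (3.0 / np.pi) ** (1.0 / 3.0)      # Dirac/LDA exchange constant
Vxc = Cx * (4.0 / 3.0) * P1 ** (1.0 / 3.0)            # LDA potential, eq. (IX.35)

# Assemble the Vxc block of the Kohn-Sham matrix by quadrature:
M = len(basis)
Vxc_mat = np.empty((M, M))
for mu in range(M):
    for nu in range(M):
        Vxc_mat[mu, nu] = np.sum(w_g * chi(basis[mu], x_g) * Vxc * chi(basis[nu], x_g))

print(Vxc_mat)   # symmetric, negative-definite diagonal here since Vxc < 0
```

In a real program the grid is a three-dimensional atom-centered quadrature grid, but the structure of the loop is the same.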

The accuracy of Kohn-Sham DFT of course strongly depends on the choice of the functional, and there are many different ones. Since it is not very insightful in the present context to discuss individual functionals at length, we will just have a quick glimpse at the zoo of functionals.

F. The zoo of functionals

It is customary to split Exc[P1(r)] into a pure exchange and a pure correlation energy functional. Furthermore, these energy functionals are often written in terms of the related energy densities εx[P1(r)], εc[P1(r)], such that we have

\[
  E_{xc}[P_1(\mathbf{r})] = E_x[P_1(\mathbf{r})] + E_c[P_1(\mathbf{r})]
  = \int d\mathbf{r}\, P_1(\mathbf{r})\, \varepsilon_x[P_1(\mathbf{r})]
  + \int d\mathbf{r}\, P_1(\mathbf{r})\, \varepsilon_c[P_1(\mathbf{r})]. \tag{IX.36}
\]

Starting from

\[
  E_{xc}[P_1(\mathbf{r})] = \int d\mathbf{r}\, P_1(\mathbf{r})\, \varepsilon_{xc}[P_1(\mathbf{r})] \tag{IX.37}
\]

and employing eq. (IX.32) it is easy to show that for local functionals we have the relation

\[
  V_{xc}([P_1(\mathbf{r})]; \mathbf{r}) = \varepsilon_{xc}[P_1(\mathbf{r})]
  + P_1(\mathbf{r})\, \frac{\partial \varepsilon_{xc}[P_1(\mathbf{r})]}{\partial P_1(\mathbf{r})}. \tag{IX.38}
\]

1. The local density approximation, LDA

Employing once more the uniform electron gas to model the exchange-correlation energy we arrive at the local density approximation (LDA), which we already encountered above. For the exchange functional we have (cf. eqs. (IX.8) and (IX.33))

\[
  E_x^{LDA}[P_1(\mathbf{r})] = C_x \int d\mathbf{r}\, P_1(\mathbf{r})^{4/3}, \qquad
  \varepsilon_x^{LDA}[P_1(\mathbf{r})] = C_x\, P_1(\mathbf{r})^{1/3}. \tag{IX.39}
\]


The correlation energy of the uniform electron gas has been determined accurately by Ceperley and Alder in quantum Monte Carlo calculations for a number of different densities. An analytic interpolation formula proposed by Vosko, Wilk, and Nusair (VWN) is usually employed to exploit these data in DFT calculations. The actual formula can be found in their article (Can. J. Phys. 58, 1200, 1980) and is not given here. The combination of these two functionals corresponds to the LDA approximation in DFT. It is still often used in physics; for chemistry it is usually not accurate enough.

2. Gradient corrected functionals, GGAs

Judging from the lack of success of the LDA in chemistry we conclude that the electron density in a molecule is not as uniform as in the electron gas. Recall that the use of that model for the kinetic energy led to disastrous results, as we have seen for the Thomas-Fermi theory. Using it in the Kohn-Sham context to model the exchange-correlation functional is not as bad, but still unusable for applications in chemistry. How could one improve on it? A next generation of functionals, denoted as the generalized gradient approximation (GGA), introduces also the gradient of the density (and eventually higher derivatives) into the functionals. There exists a plethora of different GGAs, proposed by different research schools. We certainly do not want to discuss these GGA functionals in detail here, or to compare any of the related formulae. Just to convey the general idea behind GGAs we provide here two formulae for functionals of the exchange energy density, both of which are relatively simple. The first is the functional by Perdew and Wang, known by the acronym PW86, which reads

\[
  \varepsilon_x^{PW86}[P_1(\mathbf{r})] = \varepsilon_x^{LDA}[P_1(\mathbf{r})]
  \left( 1 + a x^2 + b x^4 + c x^6 \right)^{1/15},
  \quad \text{with} \quad x = \frac{|\nabla P_1(\mathbf{r})|}{P_1(\mathbf{r})^{4/3}}. \tag{IX.40}
\]

The second is the B88 functional by Becke,

\[
  \varepsilon_x^{B88}[P_1(\mathbf{r})] = \varepsilon_x^{LDA} + \Delta\varepsilon_x^{B88},
  \quad \text{with} \quad
  \Delta\varepsilon_x^{B88} = -\beta\, \frac{P_1(\mathbf{r})^{1/3}\, x^2}{1 + 6\beta x \sinh^{-1} x}, \tag{IX.41}
\]

which is frequently used in the community. The individual parameters are determined by fitting to known data sets. At this point the DFT method becomes semi-empirical. There are also corresponding GGA correlation functionals, for example the LYP (by Lee, Yang, and Parr), the P86 (by Perdew, 1986), and the PW91 (Perdew, Wang, 1991) functionals. We won't provide any formulae for these here.
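To convey how such an expression is evaluated in practice, here is a sketch (Python/NumPy; β = 0.0042 is Becke's fitted value, while the density profile and the one-dimensional treatment of the gradient are illustrative assumptions) of the B88 exchange energy density of eq. (IX.41) on a grid:

```python
import numpy as np

# Sketch: evaluate the B88 exchange energy density, eq. (IX.41), on a 1D grid.
# beta = 0.0042 is Becke's fitted parameter; the density profile is illustrative.
beta = 0.0042
Cx = -(3.0 / 4.0) * (3.0 / np.pi) ** (1.0 / 3.0)

r = np.linspace(0.1, 6.0, 1000)
P1 = np.exp(-2.0 * r)                 # hydrogen-like illustrative density
gradP1 = np.gradient(P1, r)           # stands in for |∇P1| in this 1D model

x = np.abs(gradP1) / P1 ** (4.0 / 3.0)                     # reduced gradient
eps_lda = Cx * P1 ** (1.0 / 3.0)                           # LDA energy density
d_eps_b88 = -beta * P1 ** (1.0 / 3.0) * x**2 / (
    1.0 + 6.0 * beta * x * np.arcsinh(x))                  # B88 gradient correction
eps_b88 = eps_lda + d_eps_b88

# The correction vanishes where the density is uniform (x → 0) and makes the
# exchange energy density more negative wherever the density varies.
print(eps_b88[:3])
```

Note that the correction is never positive: the denominator is positive for x ≥ 0, so the gradient term always lowers the exchange energy density relative to LDA.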

GGAs certainly are a major improvement over LDA. GGAs are used in chemical applications, for example in the context of geometry optimizations, or in Car-Parrinello molecular dynamics calculations. A DFT calculation based on GGAs is usually much cheaper than a Hartree-Fock calculation. On the other hand, some fundamental physical principles are violated. For example, the correlation energy is not zero for a one-electron system (P86, PW91). Furthermore, a fundamental problem of DFT is the self-repulsion of electrons. In Hartree-Fock theory the self-repulsion of electrons in the Coulomb term is naturally cancelled by the exchange term. However, in DFT we throw away the Hartree-Fock exchange term and replace it by an obscure exchange-correlation potential. Cancellation of the self-repulsion of electrons in the Coulomb term thus is no longer guaranteed, as it is in Hartree-Fock theory. This has severe consequences, e.g., when calculating electronic excitation energies via TD-DFT linear response theory: excitation energies of excited states with charge-transfer character are much underestimated and then mix with other lower states, such that calculated TD-DFT spectra are qualitatively wrong if charge-transfer states are involved.

The severe problem of the self-repulsion of electrons is alleviated when employing hybrid functionals, the next stepon our ladder of functionals.

3. Hybrid functionals

The essence of hybrid functionals is to introduce a certain fraction of non-local Hartree-Fock-like exchange into the exchange-correlation functional. In doing so DFT takes out a second loan from Hartree-Fock theory. From our discussion so far we know that the exchange-correlation potential Vxc([P1(r)]; r) is the functional derivative of the exchange-correlation energy Exc[P1(r)] (cf. eqs. (IX.21) and (IX.22)). Furthermore, for local functionals the connection is trivial, i.e., given by eq. (IX.32). But for non-local functionals, where we cannot apply the simple trick of the first-order Taylor expansion, the connection between Exc[P1(r)] and Vxc([P1(r)]; r) is more obscure. However,


there is an exact connection via the fictitious non-interacting Kohn-Sham system, which is always valid. This is the adiabatic connection formula,

\[
  E_{xc}[P_1(\mathbf{r})] = \int_0^1 d\lambda\,
  \langle \Psi(\lambda) | V_{xc}([P_1(\mathbf{r})]; \mathbf{r}, \lambda) | \Psi(\lambda) \rangle. \tag{IX.42}
\]

So we are integrating over the λ parameter of the Kohn-Sham Hamiltonian (IX.11) from λ = 0 (Kohn-Sham system) to λ = 1 (real system). The wave function and the exchange-correlation potential of course depend on λ, i.e., on the choice of the Hamiltonian.

Eq. (IX.42) can be proven via the Hellmann-Feynman theorem, which states that the derivative of the energy expectation value w.r. to the strength of an external perturbation is equal to the expectation value of the related perturbation operator. For example, the dipole moment of a system can be calculated either as the expectation value of the position operator, or, alternatively, as the derivative of the energy w.r. to the strength of the applied electric field. The Hellmann-Feynman theorem holds for exact and variational wave functions. A detailed discussion of the Hellmann-Feynman theorem is deferred to a subsequent lecture series.
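The theorem is easily checked numerically for a small model. The following sketch (Python/NumPy; the two-level Hamiltonian H(λ) = H0 + λV is a hypothetical example, not from the text) compares a finite-difference derivative of the lowest eigenvalue with the expectation value of the perturbation operator:

```python
import numpy as np

# Hellmann-Feynman check for a hypothetical two-level Hamiltonian H(λ) = H0 + λV:
# dE/dλ should equal <ψ(λ)| ∂H/∂λ |ψ(λ)> = <ψ(λ)| V |ψ(λ)>.
H0 = np.array([[0.0, 0.2], [0.2, 1.0]])
V  = np.array([[0.5, 0.1], [0.1, -0.3]])

def ground(lmbda):
    """Lowest eigenvalue and normalized eigenvector of H0 + lmbda * V."""
    w, U = np.linalg.eigh(H0 + lmbda * V)   # eigh returns ascending eigenvalues
    return w[0], U[:, 0]

lam, h = 0.7, 1e-5
E_plus, _ = ground(lam + h)
E_minus, _ = ground(lam - h)
dE_numeric = (E_plus - E_minus) / (2.0 * h)   # central finite difference of E(λ)

_, psi = ground(lam)
dE_hf = psi @ V @ psi                         # Hellmann-Feynman expectation value

print(dE_numeric, dE_hf)                      # the two agree
```

The agreement relies on the eigenvector being exact (variational) at each λ, which is precisely the condition under which the theorem holds.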

In the present context we consider λ as the perturbation strength parameter. The Hellmann-Feynman theorem then can be written as

\[
  \frac{d}{d\lambda} \langle \Psi(\lambda) | H_{el}(\lambda) | \Psi(\lambda) \rangle
  = \langle \Psi(\lambda) | \frac{\partial H_{el}(\lambda)}{\partial\lambda} | \Psi(\lambda) \rangle. \tag{IX.43}
\]

Plugging (IX.11) into the rhs of this equation yields

\[
  \frac{d}{d\lambda} \langle \Psi(\lambda) | H_{el}(\lambda) | \Psi(\lambda) \rangle
  = \langle \Psi(\lambda) | \frac{\partial V_{ext}(\lambda)}{\partial\lambda} + V_{ee} | \Psi(\lambda) \rangle,
  \quad \text{with} \quad V_{ee} = \frac{1}{2} {\sum_{ij}}' \frac{1}{r_{ij}}. \tag{IX.44}
\]

We now integrate this equation from λ = 0 to λ = 1. By construction, the density P1(r) remains constant in this integration. Since we simultaneously integrate and differentiate w.r. to the λ variable we get

\[
\begin{aligned}
  \langle \Psi(1)|H_{el}(1)|\Psi(1)\rangle - \langle \Psi(0)|H_{el}(0)|\Psi(0)\rangle
  &= \langle \Psi(1)|V_{ext}(1)|\Psi(1)\rangle - \langle \Psi(0)|V_{ext}(0)|\Psi(0)\rangle
  + \int_0^1 d\lambda\, \langle \Psi(\lambda)|V_{ee}|\Psi(\lambda)\rangle, \\
  E(1) - E(0) &= \int d\mathbf{r}\, P_1(\mathbf{r})\, V_{ext}(1)
  - \int d\mathbf{r}\, P_1(\mathbf{r})\, V_{ext}(0)
  + \int_0^1 d\lambda\, \langle \Psi(\lambda)|V_{ee}|\Psi(\lambda)\rangle.
\end{aligned} \tag{IX.45}
\]

On the other hand, the energy E(0) of the fictitious non-interacting Kohn-Sham system is

\[
  E(0) = T_S[P_1(\mathbf{r})] + \int d\mathbf{r}\, P_1(\mathbf{r})\, V_{ext}(0). \tag{IX.46}
\]

Plugging the latter into the previous equation yields

\[
  E(1) = T_S[P_1(\mathbf{r})]
  + \underbrace{\int d\mathbf{r}\, P_1(\mathbf{r})\, V_{ext}(1)}_{E_{ext}[P_1(\mathbf{r})]}
  + \int_0^1 d\lambda\, \langle \Psi(\lambda)|V_{ee}|\Psi(\lambda)\rangle. \tag{IX.47}
\]

By comparing with (IX.13) we can now identify

\[
  \int_0^1 d\lambda\, \langle \Psi(\lambda)|V_{ee}|\Psi(\lambda)\rangle
  = J[P_1(\mathbf{r})] + E_{xc}[P_1(\mathbf{r})]. \tag{IX.48}
\]

Furthermore, partitioning Vee into the λ-independent classical Coulomb potential and the exchange-correlation potential,

\[
  V_{ee} = \int d\mathbf{r}'\, \frac{P_1(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}
  + V_{xc}([P_1(\mathbf{r})]; \mathbf{r}, \lambda), \tag{IX.49}
\]

and insertion of this into the lhs of eq. (IX.48) yields

\[
  \int_0^1 d\lambda\, \langle \Psi(\lambda)|V_{ee}|\Psi(\lambda)\rangle
  = \underbrace{\frac{1}{2} \int d\mathbf{r}\, d\mathbf{r}'\,
  \frac{P_1(\mathbf{r})\, P_1(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}}_{J[P_1(\mathbf{r})]}
  + \int_0^1 d\lambda\, \langle \Psi(\lambda)|V_{xc}([P_1(\mathbf{r})]; \mathbf{r}, \lambda)|\Psi(\lambda)\rangle. \tag{IX.50}
\]


Comparison of eqs. (IX.48) and (IX.50) finally implies eq. (IX.42), q.e.d.

How can we now exploit the adiabatic connection formula (IX.42) to devise a hybrid functional? We could ad hoc assume that the exchange-correlation potential Vxc([P1(r)]; r, λ) depends linearly on λ. If this is the case, the integration over λ just corresponds to the mean value of the integrands at λ = 0 and λ = 1,

\[
  E_{xc}[P_1(\mathbf{r})] = \frac{1}{2} \left(
  \langle \Psi(0)|V_{xc}([P_1(\mathbf{r})]; \mathbf{r}, 0)|\Psi(0)\rangle
  + \langle \Psi(1)|V_{xc}([P_1(\mathbf{r})]; \mathbf{r}, 1)|\Psi(1)\rangle \right). \tag{IX.51}
\]

For the first term of eq. (IX.51), corresponding to the fictitious non-interacting Kohn-Sham system (λ = 0), there is no electron correlation, and the exact wavefunction corresponds to a single Slater determinant. The exact exchange energy then corresponds to Hartree-Fock-like exchange, evaluated with the Kohn-Sham orbitals. We denote this by \(E_x^{HF}[P_1(\mathbf{r})]\). An attentive reader might again object, pointing out that in order to calculate the exchange the density matrix rather than the density function is required, so we once more leave the proper concept of density functional theory. We again sweep this under the carpet, just mentioning that with the Kohn-Sham orbitals there is of course no problem to calculate the exchange; we proceed just in the same way as in the case of Hartree-Fock theory.

The second term of eq. (IX.51), corresponding to the real system (λ = 1), is of course unknown. We may approximate it by once more employing the model of the uniform electron gas. This leads to Becke's Half-and-Half (H+H) hybrid functional,

\[
  E_{xc}^{H+H}[P_1(\mathbf{r})] = \frac{1}{2} E_x^{HF}[P_1(\mathbf{r})]
  + \frac{1}{2} \left( E_x^{LDA}[P_1(\mathbf{r})] + E_c^{LDA}[P_1(\mathbf{r})] \right). \tag{IX.52}
\]

Better hybrid functionals are obtained when combining the hybrid approach with GGAs. For example, the Becke three-parameter functional B3 is widely used,

\[
  E_{xc}^{B3}[P_1(\mathbf{r})] = (1-a)\, E_x^{LDA}[P_1(\mathbf{r})] + a\, E_x^{HF}[P_1(\mathbf{r})]
  + b\, \Delta E_x^{B88}[P_1(\mathbf{r})] + E_c^{LDA}[P_1(\mathbf{r})]
  + c\, \Delta E_c^{GGA}[P_1(\mathbf{r})]. \tag{IX.53}
\]

The parameters a, b, c are determined by fitting to experimental data and depend on the chosen ΔE_c^GGA[P1(r)]. Using that of Lee, Yang, and Parr we end up at the famous B3-LYP functional, which is so often used in chemistry that it almost reaches the status of a religion.
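Schematically, eq. (IX.53) is just a weighted sum of precomputed component energies. The following sketch (Python; a = 0.20, b = 0.72, c = 0.81 are Becke's published three-parameter values, while the component energies below are placeholder numbers, not results of any calculation) assembles E_xc^B3:

```python
# Schematic assembly of the B3 hybrid exchange-correlation energy, eq. (IX.53).
# a, b, c are Becke's published three-parameter fit; in a real code the
# component energies come from grid integrals over the density and from the
# Hartree-Fock-like exchange evaluated with the Kohn-Sham orbitals.
a, b, c = 0.20, 0.72, 0.81

def e_xc_b3(E_x_lda, E_x_hf, dE_x_b88, E_c_lda, dE_c_gga):
    """E_xc^B3 = (1-a) E_x^LDA + a E_x^HF + b dE_x^B88 + E_c^LDA + c dE_c^GGA."""
    return ((1.0 - a) * E_x_lda + a * E_x_hf + b * dE_x_b88
            + E_c_lda + c * dE_c_gga)

# Placeholder component energies (hartree), purely for illustration:
exc = e_xc_b3(E_x_lda=-8.10, E_x_hf=-8.30, dE_x_b88=-0.25,
              E_c_lda=-0.60, dE_c_gga=-0.05)
print(exc)
```

Only the a E_x^HF term requires the expensive non-local exchange; the remaining terms are ordinary (semi-)local grid integrals.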

Hybrid functionals are computationally much more expensive than purely local and semi-local GGAs, since the expensive non-local exchange, as in Hartree-Fock theory, has to be computed. The computational cost of DFT with hybrid functionals is comparable to that of Hartree-Fock theory.

The problem of electron self-repulsion of course is not rigorously fixed, but just alleviated to some extent. In order to fix it rigorously, 100%, rather than just a fraction, of non-local exchange would be needed, and then one would be more or less thrown back to Hartree-Fock theory, where rather expensive correlation methods are required to really improve the results. DFT with the functionals discussed also does not describe van der Waals dispersion. Depending on the functional, it may happen that there is an attractive interaction between, say, two argon atoms, but this has nothing to do with the proper physics of dispersion: the van der Waals interaction energy should decay as R⁻⁶ with the distance R between the argon atoms, but DFT predicts an exponential decay instead.
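The difference between these two decay laws can be made concrete with a small sketch (Python/NumPy; C6, the prefactor A, and the exponent α are placeholder values, not fitted Ar₂ constants): at large R the exponentially decaying model interaction becomes negligible compared to the correct −C6/R⁶ dispersion tail.

```python
import numpy as np

# Illustrative comparison of asymptotic decay laws for a rare-gas dimer:
# true dispersion behaves like -C6/R^6, while a dispersion-free DFT
# functional decays roughly exponentially, ~ -A exp(-alpha R).
# C6, A, alpha are placeholder values, not fitted Ar2 constants.
C6, A, alpha = 64.0, 100.0, 2.0
R = np.array([5.0, 8.0, 12.0, 20.0])   # internuclear distances (bohr, illustrative)

E_disp = -C6 / R**6                    # correct dispersion asymptotics
E_expo = -A * np.exp(-alpha * R)       # exponential (density-overlap) decay

ratio = E_expo / E_disp                # shrinks rapidly with increasing R
print(ratio)
```

Whatever the constants, the exponential term is eventually outrun by any inverse power, which is why (semi-)local functionals cannot reproduce the dispersion tail.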

There is very active research in the development of new DFT functionals. These new developments, such as range-separated hybrid functionals, or new correlation functionals on the basis of the random phase approximation, are far beyond the scope of this very basic introduction to DFT. At the end of this discussion a personal comment of the author might be in order: in the course of his career the author (who himself is not active in the development of DFT, but rather of wavefunction-based methods) has observed a steady increase in the complexity of the newly derived functionals. These developments also more and more leave the original basis of DFT and use orbital, rather than density, functionals. This in turn brings the computational complexity of such DFT calculations back into the range of wavefunction-based correlation methods (like perturbation theory or coupled cluster methods). For example, the random phase approximation can be regarded as a simplified coupled cluster doubles method. All in all, the "no free lunch" theorem appears still to be valid: the price to pay in order to achieve the same reliability and accuracy as wavefunction-based correlation methods is to accept a similar computational complexity as that of the latter. In that sense DFT, employed with hybrid functionals such as B3-LYP, appears to be a sensible compromise between efficiency and accuracy, despite its inherent problems. This is also illustrated by the plethora of DFT/B3-LYP applications in the literature. For high accuracy and reliability, on the other hand, one may want to employ wavefunction-based correlation methods.