Ab Initio Methods for Electron Correlation in Molecules

John von Neumann Institute for Computing

Peter Knowles, Martin Schütz, and Hans-Joachim Werner
published in
Modern Methods and Algorithms of Quantum Chemistry, Proceedings, Second Edition, J. Grotendorst (Ed.), John von Neumann Institute for Computing, Jülich, NIC Series, Vol. 3, ISBN 3-00-005834-6, pp. 97-179, 2000.
© 2000 by John von Neumann Institute for Computing. Permission to make digital or hard copies of portions of this work for personal or classroom use is granted provided that the copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise requires prior specific permission by the publisher mentioned above.
http://www.fz-juelich.de/nic-series/

PETER KNOWLES, School of Chemistry, University of Birmingham, Edgbaston, Birmingham, B15 2TT, United Kingdom. E-mail: P.J.Knowles@bham.ac.uk
MARTIN SCHÜTZ AND HANS-JOACHIM WERNER, Institute for Theoretical Chemistry, University of Stuttgart, Pfaffenwaldring 55, 70569 Stuttgart, Germany. E-mail: {schuetz, werner}@theochem.uni-stuttgart.de
Reliable ab initio electronic structure calculations require high-level treatment of electron correlation effects. For molecules in electronic ground states, single-reference correlation methods, which are based on the Hartree-Fock self-consistent field (SCF) wavefunction as zeroth-order approximation, are usually sufficient. Møller-Plesset perturbation theory up to fourth order (MP2-MP4) and coupled-cluster methods with all single and double excitations followed by a perturbative treatment of triple excitations [CCSD(T)] are the most popular single-reference methods. All of these approaches can also be formulated in a local framework in which the demand on computational resources scales only linearly with system size; they can also be carried out using integral-direct techniques, which avoid the storage of large numbers of two-electron integrals by recomputing them on demand. For computing electronically excited states or global potential energy functions, multiconfiguration self-consistent field (MCSCF) wavefunctions are required for a qualitatively correct representation of the wavefunction. The major part of dynamical electron correlation effects can then be accounted for by subsequent multireference correlation treatments, in which a large number of single and double excitations relative to the MCSCF reference configurations are taken into account. In multireference configuration interaction (MRCI) calculations the expansion coefficients are determined variationally. Alternatively, the coefficients can be obtained by first-order perturbation theory, and the energy evaluated to second (MRPT2) or third (MRPT3) order. These lecture notes give a short review of all these methods.
1 Introduction
1.1 Electron correlation and the configuration interaction method
Hartree-Fock Self-Consistent Field (SCF) Theory enjoys considerable success in the first-principles determination of molecular electronic wavefunctions and properties. However, there are important situations where the underlying assumption of molecular orbital theory, that the electronic wavefunction can be approximated by an antisymmetrized product of orbitals, breaks down. There are still further situations where SCF does provide a reasonable qualitative description, but fails to predict energetics to desired accuracy. We explore here the deficiencies of Hartree-Fock, and survey the various techniques available for going beyond SCF.
Hartree-Fock is a mean field theory, in which each electron has its own wavefunction (orbital), which in turn obeys an effective 1-electron Schrödinger equation. The effective hamiltonian (Fock operator) contains the average field (Coulomb and exchange) of all other electrons in the system. The total electronic wavefunction for the molecule, ignoring complications introduced by the Pauli principle, is a simple product of the orbitals. Following the Born interpretation of wavefunctions, this implies that if $P(r_1, r_2)$ is the probability density for finding electrons labelled 1 and 2 in regions of space around $r_1$ and $r_2$ respectively,
$$P(r_1, r_2) = P(r_1)\,P(r_2) \tag{1}$$
i.e., the probability density for a given electron is independent of the positions of all others. In reality, however, the motions of electrons are more intimately correlated. Because of the direct Coulomb repulsion of electrons, the instantaneous position of electron 2 forms the centre of a region in space which electron 1 will avoid. This avoidance is more than that caused by the mean field, and is local; if electron 2 changes position, the Coulomb hole for electron 1 moves with it. In contrast, in the mean-field theory, electron 1 has no knowledge of the instantaneous position of 2, only its average value, and thus motions are uncorrelated, and there is no depletion in $P(r_1, r_2)$ near $r_1 = r_2$.
The effects of neglecting electron correlation in Hartree-Fock are spectacularly illustrated when one attempts to compute complete potential curves for diatomic molecules using SCF. Figure 1 shows potential curves for H2 from both a very accurate calculation and from Hartree-Fock. It is seen that the spin-restricted Hartree-Fock (RHF) approximation breaks down as dissociation is reached, predicting energies which are much too high, and a potential curve characteristic of the interaction of ions rather than neutral atoms. The RHF wavefunction for the $X\,^1\Sigma_g^+$ ground state of H2 takes the form
$$\Psi_X = \mathcal{A}\,\sigma_g^\alpha(1)\,\sigma_g^\beta(2) \tag{2}$$
where $\mathcal{A}$ is the antisymmetrizing operator, $\alpha$ and $\beta$ are the usual one-electron spin functions, and the bonding orbital $\sigma_g = Z_g(\chi_A + \chi_B)$, with $\chi_A$ an s-like orbital centred on atom A, and $Z_g$ a normalization constant. As the atoms become infinitely separated, $\chi_A \to 1s_A$, $Z_g \to 1/\sqrt{2}$ and thus
$$\Psi_X \to \tfrac12\,\mathcal{A}\left(1s_A^\alpha 1s_B^\beta + 1s_B^\alpha 1s_A^\beta + 1s_A^\alpha 1s_A^\beta + 1s_B^\alpha 1s_B^\beta\right) \tag{3}$$
The first two terms are direct products of neutral $^2S$ hydrogen atom wavefunctions on the two atoms A and B, as desired. However, the last two terms describe a spurious H$^+$ ... H$^-$ pair. The overall energy of this unphysical wavefunction exceeds the energy of two hydrogen atoms by half the difference of the ionization energy and electron affinity of H (i.e., 6.4 eV), and at long range the potential energy curve has an unphysical ionic $R^{-1}$ behaviour.
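The 6.4 eV figure can be checked by simple arithmetic. The sketch below follows the text's estimate: the covalent and ionic pairs in (3) each carry weight 1/2, and the energy of the ionic pair is taken from the ionization energy (IE) and electron affinity (EA) of the hydrogen atom. The numerical values IE = 13.60 eV and EA = 0.75 eV are standard values assumed here; they are not quoted in the text.

```latex
\begin{align*}
E_{\mathrm{RHF}}(R\to\infty)
  &\approx \tfrac12\,\bigl[\,2E(\mathrm{H})\,\bigr]
   + \tfrac12\,\bigl[\,E(\mathrm{H}^+) + E(\mathrm{H}^-)\,\bigr], \\
E(\mathrm{H}^+) + E(\mathrm{H}^-) - 2E(\mathrm{H})
  &= \mathrm{IE} - \mathrm{EA}, \\
E_{\mathrm{RHF}}(R\to\infty) - 2E(\mathrm{H})
  &\approx \tfrac12\,(\mathrm{IE} - \mathrm{EA})
   \approx \tfrac12\,(13.60 - 0.75)\ \mathrm{eV}
   \approx 6.4\ \mathrm{eV}.
\end{align*}
```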
Figure 1. Potential Energy Curves for H2. Curves shown: RHF, UHF and Exact; vertical axis Energy / eV, horizontal axis R / Angstrom.
The failure of RHF for this example can be easily understood in terms of electron correlation. At long internuclear separations, if one electron is located near atom A, the other will on physical grounds be found close to atom B. This correlation is reflected in the exact wavefunction, which is asymptotically the product of hydrogenic orbitals on the two nuclei. In contrast, within the Hartree-Fock framework, each electron is made to experience only the average effect of the other. Since in RHF, the two electrons are constrained to be in the same spatial orbital, this $\sigma_g$ orbital will be symmetrical between the atoms, and thus each electron has equal probability of being on A or B, irrespective of the position of the other electron. The possibility of both electrons being on the same atom is not excluded, as reflected in the ionic terms in the RHF wavefunction (3).
In the case of H2, and in fact for a number of other dissociating molecules, Hartree-Fock theory can give correct behaviour provided the restriction to identical spatial orbitals for $\alpha$ and $\beta$ spin is relaxed. The Unrestricted HF (UHF) wavefunction for H2 is identical to RHF at short bond lengths, but when the two atoms are separated, it becomes variationally advantageous for the $\alpha$ and $\beta$ spin orbitals to localize on different hydrogen atoms. In this way, a correct asymptotic energy is obtained, as seen in Figure 1. However, the wavefunction can never be identical to the exact wavefunction. Asymptotically, the UHF wavefunction is either $\mathcal{A}\,1s_B^\alpha 1s_A^\beta$ or $\mathcal{A}\,1s_A^\alpha 1s_B^\beta$, whereas the true wavefunction is the sum of these two degenerate determinants. Although the energy is unaffected, the UHF wavefunction is not an eigenfunction of the spin-squared operator $\hat{S}^2$, being an unphysical mixture of singlet and triplet states. This spin contamination is displeasing, and can have serious undesirable effects. In the case of the H2 UHF potential curve, at the point where UHF and RHF diverge, the curve is discontinuous in its second derivative. For more advanced correlation methods which build on UHF, spin contamination has a disastrous effect [1,2]. In the case of F2 (Figure 2), UHF does not repair the inability of RHF to give an energy at equilibrium geometry which is lower than at dissociation, and as a consequence the UHF potential curve is purely repulsive. For all these reasons, the use of UHF is becoming increasingly rare.
Figure 2. Potential Energy Curves for F2. Curves shown: RHF, UHF and Exact; vertical axis Energy / eV, horizontal axis R / Angstrom.
1.2 Long-range correlation: Molecular Dissociation
To see how a theory can overcome the inability of RHF to describe dissociation, we examine first of all an excited $^1\Sigma_g^+$ state of H2 for which the RHF wavefunction takes the form
$$\Psi_E = \mathcal{A}\,\sigma_u^\alpha(1)\,\sigma_u^\beta(2) \tag{4}$$
and where we now have two electrons in the antibonding orbital $\sigma_u = Z_u(\chi_A - \chi_B)$. Asymptotically, this becomes
$$\Psi_E \to \tfrac12\,\mathcal{A}\left(-1s_A^\alpha 1s_B^\beta - 1s_B^\alpha 1s_A^\beta + 1s_A^\alpha 1s_A^\beta + 1s_B^\alpha 1s_B^\beta\right) \tag{5}$$
This wavefunction also contains an unphysical mixture of covalent and ionic terms. However, we observe that it is possible to construct purely ionic or purely covalent wavefunctions by taking a linear combination of $\Psi_X$ and $\Psi_E$. In $\Psi_X - \Psi_E = \sigma_g^2 - \sigma_u^2$, the ionic terms cancel exactly, and the correct asymptotic wavefunction is obtained.
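The cancellation can be checked numerically. The following minimal sketch keeps only the spatial part of the two-electron products and works in the asymptotic basis {1s_A, 1s_B}; the names `orbital_product`, `sigma_g` and `sigma_u` are introduced here for illustration and do not come from the text.

```python
from math import sqrt

def orbital_product(orb):
    """Coefficients of orb(1) * orb(2) in the product basis 1s_i(1) 1s_j(2),
    with index 0 = atom A and index 1 = atom B."""
    return [[orb[i] * orb[j] for j in range(2)] for i in range(2)]

z = 1.0 / sqrt(2.0)      # Z_g = Z_u -> 1/sqrt(2) at infinite separation
sigma_g = [z, z]         # bonding orbital      Z_g (1s_A + 1s_B)
sigma_u = [z, -z]        # antibonding orbital  Z_u (1s_A - 1s_B)

psi_x = orbital_product(sigma_g)   # spatial part of the ground-state determinant
psi_e = orbital_product(sigma_u)   # spatial part of the doubly excited determinant

difference = [[psi_x[i][j] - psi_e[i][j] for j in range(2)] for i in range(2)]
total = [[psi_x[i][j] + psi_e[i][j] for j in range(2)] for i in range(2)]

# Diagonal entries (A,A) and (B,B) are the ionic terms,
# off-diagonal entries (A,B) and (B,A) are the covalent terms.
print("Psi_X - Psi_E:", difference)
print("Psi_X + Psi_E:", total)
```

In the difference only the off-diagonal (covalent) entries survive, while in the sum only the diagonal (ionic) entries survive.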
This is an example of configuration interaction (CI), whereby the wavefunction is considered as being a mixture of several Slater determinants. For H2 at general internuclear separations, the form of the CI wavefunction is
$$\Psi = c_X \Psi_X + c_E \Psi_E \tag{6}$$
and the coefficients specifying this linear combination must be allowed to vary, since it is known that near equilibrium, the RHF wavefunction is already a good approximation. Thus the best wavefunction near equilibrium will have $c_X \approx 1$ and $c_E$ small, in contrast to their asymptotic values of $1/\sqrt{2}$ and $-1/\sqrt{2}$.
In general, in the standard CI method, the variational principle is used to determine the CI coefficients. For any approximate wavefunction $\Psi$, the Rayleigh quotient
$$E = \frac{\langle\Psi|H|\Psi\rangle}{\langle\Psi|\Psi\rangle} \tag{7}$$
is an upper bound to the exact ground-state energy $E_{\mathrm{exact}}$, i.e., $E \ge E_{\mathrm{exact}}$. Variational methods proceed by assuming that the best wavefunction will be the one which gives the lowest $E$. In the specific case of a linear expansion, as in CI, i.e.,
$$\Psi = \sum_I c_I \Phi_I \tag{8}$$
minimising $E$ is equivalent to finding the lowest eigensolution of the hamiltonian matrix $\mathbf{H}$, whose elements are the integrals
$$H_{IJ} = \langle\Phi_I|H|\Phi_J\rangle , \tag{9}$$
i.e. one needs to solve
$$\mathbf{H}\mathbf{c} = E\,\mathbf{c} \tag{10}$$
with the minimum Rayleigh quotient $E$ appearing as the eigenvalue. The linear ansatz allows also the calculation of approximations to excited states, through the Hylleraas-Undheim-MacDonald theorem, which states that the $n$-th eigenvalue is an upper bound to the exact energy of the $(n-1)$-th excited state. Finding the lowest few eigensolutions of a symmetric matrix is a well-studied problem; for the diagonally-dominant hamiltonian matrices invariably arising in molecular CI, algorithms exist [3] which will converge in around ten iterations, each of which requires the evaluation of the action of the hamiltonian matrix on some trial vector, i.e.,
$$v_I = \sum_J H_{IJ}\, c_J . \tag{11}$$
This feature allows the solution of CI problems of very large dimensions; because $\mathbf{H}$ is often extremely sparse, forming $\mathbf{H}\mathbf{c}$ is much easier than forming the matrix itself, and the limiting factor is the availability of memory to store $\mathbf{c}$ and $\mathbf{v}$. Calculations with more than $10^9$ configurations have been carried out in this way.
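The idea of an eigensolver that touches the hamiltonian only through the product (11) can be sketched as follows. This is not the algorithm of reference [3]. It is a simplified stand-in: a diagonally preconditioned correction vector followed by a two-dimensional Rayleigh-Ritz step, restarted at every iteration. The model matrix (a diagonal of hypothetical configuration energies with a constant nearest-neighbour coupling) and all function names are assumptions made for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigma(c, diag, coupling):
    """v_I = sum_J H_IJ c_J for a model matrix with H_II = diag[I] and
    H_IJ = coupling when |I - J| = 1. The matrix itself is never stored."""
    v = [d * x for d, x in zip(diag, c)]
    for i in range(len(c) - 1):
        v[i] += coupling * c[i + 1]
        v[i + 1] += coupling * c[i]
    return v

def lowest_eigenpair(diag, coupling, tol=1e-8, max_iter=100):
    n = len(diag)
    c = [0.0] * n
    c[min(range(n), key=lambda i: diag[i])] = 1.0      # start from the lowest diagonal element
    energy, iterations = 0.0, 0
    for iterations in range(1, max_iter + 1):
        hc = sigma(c, diag, coupling)
        energy = dot(c, hc)                             # Rayleigh quotient (c is normalized)
        r = [hc[i] - energy * c[i] for i in range(n)]   # residual vector
        if math.sqrt(dot(r, r)) < tol:
            break
        # Diagonal preconditioner: first-order perturbation estimate of the correction.
        t = [-r[i] / (diag[i] - energy) if abs(diag[i] - energy) > 1e-6 else 0.0
             for i in range(n)]
        overlap = dot(c, t)
        t = [t[i] - overlap * c[i] for i in range(n)]   # orthogonalize against c
        norm = math.sqrt(dot(t, t))
        if norm < 1e-14:
            break
        t = [x / norm for x in t]
        ht = sigma(t, diag, coupling)
        # Rayleigh-Ritz step in the space spanned by c and t: lowest root of [[a, b], [b, d]].
        a, b, d = energy, dot(c, ht), dot(t, ht)
        lam = 0.5 * (a + d) - math.sqrt(0.25 * (a - d) ** 2 + b * b)
        x, y = lam - d, b                               # corresponding eigenvector
        scale = math.hypot(x, y)
        c = [(x * c[i] + y * t[i]) / scale for i in range(n)]
    return energy, c, iterations

diag = [float(i) for i in range(200)]                   # hypothetical configuration energies
energy, c, iterations = lowest_eigenpair(diag, coupling=0.1)
print("lowest eigenvalue:", energy, "after", iterations, "iterations")
```

Only the vectors c, t and their images under `sigma` are held in memory, which mirrors the storage argument made above. A production solver would keep a growing subspace of correction vectors rather than restarting each step.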
1.3 Short-range correlation: the Interelectronic Cusp
Although consideration of electron correlation is clearly vital for the proper description of molecules close to dissociation, it also has important implications in situations where Hartree-Fock is a reasonable approximation. Since the hamiltonian operator contains $r_{ij}^{-1}$, the inverse distance between two electrons, the nature of the electronic wavefunction in regions close to $r_{ij} = 0$ will have a strong effect on the energy. We will consider initially the helium atom, for which the hamiltonian is
$$H = -\tfrac12\nabla_1^2 - \tfrac12\nabla_2^2 - \frac{2}{r_1} - \frac{2}{r_2} + \frac{1}{r_{12}} . \tag{12}$$
The electronic wavefunction will satisfy Schrödinger's equation
$$H\Psi(r_1, r_2) = E\,\Psi(r_1, r_2) \tag{13}$$
at all points in six-dimensional space. We note that close to $r_{12} = 0$ there is a paradox; the left hand side of (13) apparently becomes infinite, because of the $1/r_{12}$ Coulomb singularity, whereas $E$ is constant, and so the right hand side is well behaved. The local energy $H\Psi/\Psi$ cannot have singularities since it is constant, and the inescapable conclusion is that there must be an additional singularity in the left hand side of (13) which exactly cancels $1/r_{12}$ close to $r_{12} = 0$. Since the electrons are not necessarily close to a nucleus, the only candidate for this cancelling term is the kinetic energy. It is convenient to transform to centre-of-mass and relative coordinates,
$$R = \tfrac12\,(r_2 + r_1); \qquad r = r_2 - r_1 , \tag{14}$$
in which the hamiltonian becomes
$$H = -\tfrac14\nabla_R^2 - \nabla_r^2 - \frac{2}{r_1} - \frac{2}{r_2} + \frac{1}{r} . \tag{15}$$
If we expand the two-electron wavefunction in a Taylor series in $r$ about $r = 0$, on the (correct for the singlet state) assumption that angular terms in $r$ can be ignored at low order,
$$\Psi = a_0 + a_1 r + a_2 r^2 + \ldots \tag{16}$$
then the Schrödinger equation expands as
$$0 = r^{-1}\,(a_0 - 2a_1) + r^{0}\left(a_1 - 6a_2 - 4R^{-1}a_0 - E\,a_0\right) + r^{1}\,(\ldots) + \ldots \tag{17}$$
The $r^{-1}$ singularity is removed if $a_1 = \tfrac12 a_0$, or
$$\left.\frac{\partial\Psi}{\partial r}\right|_{r=0} = \tfrac12\,\Psi\big|_{r=0} . \tag{18}$$
This is the well-known cusp condition [4,5,6,7], which shows that in whatever direction one moves from $r = 0$, the wavefunction increases linearly. The exact wavefunction must have the shape depicted in Figure 3, showing the existence of a Coulomb Hole around the point of coalescence. In Figure 3, the wavefunctions are plotted against $z = z_2 - z_1$, with the two electrons having identical $x, y$ coordinates.
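The step from the Taylor series to the singular term can be written out explicitly. The sketch below assumes a spherically symmetric dependence on the relative coordinate, so that $\nabla_r^2 f(r) = f'' + (2/r) f'$, and keeps only the two terms of (15) that can be singular at $r = 0$; the remaining terms are finite there when the electrons are away from the nucleus and contribute only at order $r^0$ and higher.

```latex
\begin{align*}
\nabla_r^2\, r^n &= n(n+1)\, r^{n-2}, \\
\left(-\nabla_r^2 + \frac{1}{r}\right)\left(a_0 + a_1 r + a_2 r^2\right)
  &= -\frac{2a_1}{r} - 6a_2 + \frac{a_0}{r} + a_1 + a_2 r \\
  &= r^{-1}\,(a_0 - 2a_1) + r^{0}\,(a_1 - 6a_2) + r^{1}\,a_2 .
\end{align*}
```

The coefficient of $r^{-1}$ vanishes only for $a_1 = \tfrac12 a_0$, which is the cusp condition.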
Figure 3. The interelectronic cusp. Curves shown: SCF and Exact.
The Hartree-Fock wavefunction is
$$\Psi_{\mathrm{RHF}} = \mathcal{A}\,1s^\alpha 1s^\beta = 1s(r_1)\,1s(r_2)\,\tfrac{1}{\sqrt{2}}\left(\alpha(1)\beta(2) - \beta(1)\alpha(2)\right) \tag{19}$$
which has no special behaviour near coalescence; in fact it is easy to show that $\partial\Psi_{\mathrm{RHF}}/\partial r = 0$ at $r = 0$. Thus the RHF wavefunction must have the shape shown in Figure 3; clearly, it overestimates the probability of finding the two electrons close together, and this in turn implies an overestimate of the electron repulsion energy. This is consistent with the variational principle, which requires the RHF energy to be higher than the exact energy. We define the correlation energy to be
$$\Delta E = E_{\mathrm{RHF}} - E_{\mathrm{exact}} \tag{20}$$
where $E_{\mathrm{exact}}$ is the lowest exact eigenvalue of Schrödinger's equation. For He, $\Delta E \approx 0.042$ hartree = 1.1 eV.
The above analysis for the helium ground state, consisting of two electrons with opposing spin, needs to be modified when spins are instead aligned. A triplet spin wavefunction, e.g., $\alpha(1)\alpha(2)$, is symmetric with respect to electron label exchange, and so, by the Pauli principle, the spatial wavefunction must be antisymmetric. This has the consequences that, in a picture like Figure 3, the triplet wavefunction must pass through the origin, and has dipole rather than monopole angular variation in $r$. There is a corresponding cusp condition specifying $\partial^2\Psi/\partial r^2$ in terms of $\partial\Psi/\partial r$ at the coalescence point [5], but the important thing is that in the energetically important region, the electrons are already kept apart by the Pauli principle, even in Hartree-Fock, and the effects of neglecting electron correlation are fairly minor.
Electron correlation effects are most important for electrons with opposing spins. A further observation for polyelectronic systems is that the biggest contributions will come from pairs of electrons which occupy the same regions of physical space. If orbitals are well localized, there will be a large contribution to the correlation energy from each doubly occupied orbital, with smaller additions from pairs consisting of two different orbitals. This leads to a rough rule of thumb, that each doubly occupied orbital contributes approximately 1 eV to the total electron correlation energy.
In atomic and molecular systems, an alternative and equivalent way of visualising two-electron correlations relative to the nuclear positions is possible. If one electron is far from the nucleus of an atom, then the second electron will prefer to be closer to the nucleus than its Hartree-Fock average; this is termed radial correlation. If a first electron is, say, to the right of a nucleus, then another electron will tend to visit regions of space to the left of that nucleus more than predicted by HF; this is termed angular correlation. These short-range correlation effects arising from the Coulomb hole can be represented using CI wavefunctions just as with the long-range correlations discussed above. The simplest such wavefunction representing the angular correlation in the helium atom would have the form, with $\lambda$ a variational mixing coefficient,
$$\Psi = \mathcal{A}\left[\,1s^\alpha(1)\,1s^\beta(2) + \lambda\left(2p_x^\alpha(1)\,2p_x^\beta(2) + 2p_y^\alpha(1)\,2p_y^\beta(2) + 2p_z^\alpha(1)\,2p_z^\beta(2)\right)\right] \tag{21}$$
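One way to see where the interelectronic distance enters (21) is to assume, purely for illustration, that the three 2p orbitals have the form x f(r), y f(r), z f(r) with a common radial factor f. The sum of the three products is then f(r1) f(r2) (x1 x2 + y1 y2 + z1 z2), and the scalar product of the two position vectors equals (r1^2 + r2^2 - r12^2)/2. The short check below verifies that identity numerically; the function names are introduced here and are not part of the text.

```python
import random

def norm_squared(r):
    return sum(a * a for a in r)

def angular_factor(r1, r2):
    """x1*x2 + y1*y2 + z1*z2, the factor produced by the 2p(1) 2p(2) products."""
    return sum(a * b for a, b in zip(r1, r2))

def via_r12(r1, r2):
    """The same quantity written in terms of r1, r2 and the interelectronic distance r12."""
    r12_squared = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 0.5 * (norm_squared(r1) + norm_squared(r2) - r12_squared)

random.seed(0)
samples = [([random.uniform(-2.0, 2.0) for _ in range(3)],
            [random.uniform(-2.0, 2.0) for _ in range(3)]) for _ in range(5)]
largest_difference = max(abs(angular_factor(r1, r2) - via_r12(r1, r2)) for r1, r2 in samples)
print("largest difference between the two expressions:", largest_difference)
```

The dependence on the interelectronic distance is through its square only, so a wavefunction built this way has zero slope in r12 at coalescence.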
It is straightforward to show that such an ansatz introduces explicit $r_{12}$ dependence into the wavefunction. This demonstrates that CI does support correlated wavefunctions. Unfortunately, however, the $r_{12}$ dependence introduced is entirely in terms of $r_{12}^2$; there are no linear terms. A CI wavefunction can never satisfy the cusp condition (18), since its gradient will always be zero at coalescence; however, given sufficient terms, the linear combination of functions of $r_{12}^2$ will give a reasonable representation of the shape of the Coulomb hole. Because the expansion functions are not ideally suited to the problem, the convergence of the CI expansion is unfortunately slow, and this is discussed further below.
Historically, even some of the earliest molecular electronic structure calculations [8,9] used 2-electron basis functions of a type better adapted to the problem than orbital products (i.e., CI). Inclusion of linear terms in $r_{12}$ is an efficient way to obtain an accurate wavefunction with a small number of functions, and probably it will remain the approach of choice when very high accuracy is needed, particularly for atoms. However, despite successful research activity in this area [10,11], this approach has not yet emerged as the best method generally applicable to molecules; CI expansions remain computationally preferable. The reason for this preference is that, although very large numbers of basis functions might be required, the hamiltonian integrals which have to be computed for CI are much simpler than for explicitly correlated wavefunctions. The explicit $r_{12}$ terms introduce 3- and 4-electron integrals [12,13] which are potentially very numerous. In contrast, CI needs only the two-electron integrals required in an SCF calculation. Although the 3- and 4-electron integrals can be reasonably approximated [11], explicitly correlated wavefunctions still remain a specialist rather than general-purpose tool.
1.4 Second Quantization
The adoption of the CI (or other related) approach to electron correlation implies that we deal with wavefunctions which are represented as vectors in a linear space of Slater determinants; this space is in turn a subspace of $N$-fold products of orbitals. For the moment, we will assume that we generate all of the $N$-electron basis functions that we can after appropriate symmetry adaptation (electron antisymmetry, point group, etc.). Therefore the $N$-electron basis set is determined entirely by a choice of 1-electron basis. Before considering what this choice should be for optimum accuracy, we consider the analysis and manipulation of $N$-electron functions of this orbital-product type. We note initially that the orbital basis will contain at least the SCF occupied orbitals, denoted $\{\phi_i\}$, but in order that further configurations be generated, it must be augmented by virtual or external orbitals, $\{\phi_a\}$. Both the occupied and virtual orbitals can be considered as linear combinations of an underlying chosen fixed basis $\{\chi_\mu\}$, which will usually be atom-centred functions, exactly as in basis-set SCF calculations. The functions $\phi_p$ and $\chi_\mu$ depend only on the spatial coordinate $r$; where spin-orbitals are required, they will be denoted by $\psi_p(x)$ and can be constructed as a product of a spatial orbital $\phi_p(r)$ and a spin function $\alpha$ or $\beta$.
Consider a complete (infinite) one particle basis set $\{\phi_p(r), p = 1, 2, \ldots\}$; any function of the position $r$ can be represented as a linear combination of the spatial orbitals
$$f(r) = \sum_p x_p\,\phi_p(r) . \tag{22}$$
For a system of $N$ electrons, a complete spatial basis can then be generated by taking all possible products $\phi_{p_1}(r_1)\phi_{p_2}(r_2)\ldots\phi_{p_N}(r_N)$, i.e., any $N$ particle spatial function may be expanded as
$$F(r_1, r_2, \ldots, r_N) = \sum_{p_1 p_2 \ldots p_N} X_{p_1 p_2 \ldots p_N}\,\phi_{p_1}(r_1)\,\phi_{p_2}(r_2)\ldots\phi_{p_N}(r_N) . \tag{23}$$
This fact is not much use for practical calculations, since we cannot use an infinite set of functions, but if we consider now the case of a finite one particle basis $\{\phi_p, p = 1, 2, \ldots, m\}$, then we see the concept of the corresponding complete $N$ particle space, composed of all possible products of orbitals. A variational calculation in such a basis will yield the lowest possible energy eigenvalue for the given one particle basis set, and such a calculation is termed Full or Complete configuration interaction (FCI). It is, however, easily appreciated that the number of possible orbital products $m^N$ ($m$ one electron and spin orbitals, $N$ electrons) can become exceedingly large.
We introduce the useful concept of second quantization by defining the orbital excitation operator as (assuming orthogonal orbitals)
$$E_{pq} = \sum_{i=1}^{N} |\phi_p(i)\rangle\langle\phi_q(i)| . \tag{24}$$
The Dirac bracket notation means that whenever the brackets become closed, $\langle f(i)|g(i)\rangle$, integration over the coordinates of electron $i$ is performed on the functions within the bracket, $\int \mathrm{d}\tau_i\, f^*(i)\,g(i)$. If $E_{pq}$ is made to act on any $N$ electron function which is a product of orbitals, or a linear combination of such products, the effect is for each occurrence of $\phi_q$ to generate a function which is identical, but with $\phi_q$ replaced by $\phi_p$. Thus if $\phi_q$ does not appear, $E_{pq}$ annihilates the function. $E_{pq}$ is a spatial orbital excitation operator; it acts on space coordinates and does not affect spin. In fact, it can be decomposed into a sum of operators which excite $\alpha$ and $\beta$ spin orbitals separately, $E_{pq} = e_{pq}^\alpha + e_{pq}^\beta = \eta_p^{\alpha\dagger}\eta_q^\alpha + \eta_p^{\beta\dagger}\eta_q^\beta$, where $\eta_q^\alpha$ destroys spin orbital $\phi_q^\alpha$ and $\eta_p^{\alpha\dagger}$ creates spin orbital $\phi_p^\alpha$. The idea of second quantization is that the orbitals themselves now become quantum mechanical operators. Thus a Slater determinant can be viewed as arising from successive applications of creation operators on the empty (vacuum) state,
$$\ldots\,\eta_r^\dagger\,\eta_q^\dagger\,\eta_p^\dagger\,|\mathrm{vacuum}\rangle = \mathcal{A}\,(\psi_p\,\psi_q\,\psi_r \ldots) . \tag{25}$$
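The analysis that follows continues to use pure spatial orbitals $\phi_p$; however, exactly analogous results are obtained by using explicit spin-orbitals $\psi_p$ and spin-orbital excitation operators $e_{pq}$. Further details of the properties of the second quantization can be found in the literature [14]. As well as the single orbital excitation operators $E_{pq}$, it is possible to define multiple excitation operators:
$$E_{pq,rs} = \sum_{i \ne j}^{N} |\phi_p(i)\rangle\langle\phi_q(i)|\;|\phi_r(j)\rangle\langle\phi_s(j)| \equiv E_{rs,pq} \tag{26}$$
$$E_{pq,rs,tu} = \sum_{i \ne j \ne k}^{N} |\phi_p(i)\rangle\langle\phi_q(i)|\;|\phi_r(j)\rangle\langle\phi_s(j)|\;|\phi_t(k)\rangle\langle\phi_u(k)| \tag{27}$$
etc. These can all be formulated as combinations of the single excitations:
$$E_{pq,rs} = \sum_{i,j}^{N} |\phi_p(i)\rangle\langle\phi_q(i)|\;|\phi_r(j)\rangle\langle\phi_s(j)| - \sum_{i}^{N} |\phi_p(i)\rangle\langle\phi_q(i)|\phi_r(i)\rangle\langle\phi_s(i)| \tag{28}$$
$$\phantom{E_{pq,rs}} = E_{pq}E_{rs} - \delta_{qr}E_{ps} \tag{29}$$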
Similar consideration of the identical operator $E_{rs,pq}$ yields the commutation relation for the single excitations:
$$[E_{pq}, E_{rs}] \equiv E_{pq}E_{rs} - E_{rs}E_{pq} = \delta_{qr}E_{ps} - \delta_{ps}E_{rq} . \tag{30}$$
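The replacement rule for $E_{pq}$ and the commutation relation (30) can be checked on a small example. In the sketch below, an N-electron function is stored as a dictionary that maps a tuple of orbital labels (one label per electron) to a coefficient. This data layout and the function names are assumptions made for illustration, and orthonormal orbitals are assumed as in (24).

```python
from collections import defaultdict
from itertools import product
import random

def excite(p, q, func):
    """Apply E_pq: every occurrence of orbital q is replaced by p, one electron at a time."""
    out = defaultdict(float)
    for labels, coeff in func.items():
        for i, label in enumerate(labels):
            if label == q:
                out[labels[:i] + (p,) + labels[i + 1:]] += coeff
    return dict(out)

def linear_combination(a, f, b, g):
    """Return a*f + b*g for two functions stored as dictionaries."""
    out = defaultdict(float)
    for labels, coeff in f.items():
        out[labels] += a * coeff
    for labels, coeff in g.items():
        out[labels] += b * coeff
    return dict(out)

def commutator_defect(p, q, r, s, func):
    """Largest coefficient of ([E_pq, E_rs] - delta_qr E_ps + delta_ps E_rq) acting on func."""
    lhs = linear_combination(1.0, excite(p, q, excite(r, s, func)),
                             -1.0, excite(r, s, excite(p, q, func)))
    rhs = {}
    if q == r:
        rhs = linear_combination(1.0, rhs, 1.0, excite(p, s, func))
    if p == s:
        rhs = linear_combination(1.0, rhs, -1.0, excite(r, q, func))
    diff = linear_combination(1.0, lhs, -1.0, rhs)
    return max((abs(x) for x in diff.values()), default=0.0)

random.seed(1)
m, n_electrons = 3, 3
func = {labels: random.uniform(-1.0, 1.0)
        for labels in product(range(m), repeat=n_electrons)}
worst = max(commutator_defect(p, q, r, s, func)
            for p, q, r, s in product(range(m), repeat=4))
print("largest violation of the commutation relation:", worst)
```

For instance, `excite(0, 2, {(0, 1): 1.0})` returns an empty dictionary, because orbital 2 does not appear in the product and the function is annihilated.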
Given that any wavefunction we construct is ultimately composed as a linear combination in the space of orbital products, then the following completeness identity is true for all $i = 1, 2, \ldots, N$:
$$\left(\sum_{p}^{m} |\phi_p(i)\rangle\langle\phi_p(i)|\right)|\Psi\rangle = |\Psi\rangle . \tag{31}$$
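Now we insert this identity into the electronic hamiltonian operator
$$H = Z + \sum_{i}^{N} h(i) + \sum_{i>j}^{N} r_{ij}^{-1} , \tag{32}$$
where $Z$ is the nuclear repulsion energy, $r_{ij}$ are the separations of the electrons, and $h(i)$ is the single particle hamiltonian for each electron, incorporating its kinetic energy and the field of all the nuclei. This has the effect of replacing $H$ by the effective model or second quantized hamiltonian $H_M$, with the understanding that the only thing we will ever do with $H_M$ is to take matrix elements between functions in the orbital product space:
$$H_M = Z + \sum_{i}^{N}\sum_{pq}^{m} |\phi_p(i)\rangle\langle\phi_p(i)|h(i)|\phi_q(i)\rangle\langle\phi_q(i)| + \sum_{i>j}^{N}\sum_{pqrs}^{m} |\phi_p(i)\rangle|\phi_r(j)\rangle\,\langle\phi_p(i)|\langle\phi_r(j)|\,r_{ij}^{-1}\,|\phi_q(i)\rangle|\phi_s(j)\rangle\,\langle\phi_q(i)|\langle\phi_s(j)| \tag{33}$$
$$\phantom{H_M} = Z + \sum_{pq} h_{pq}E_{pq} + \tfrac12\sum_{pqrs} (pq|rs)\,E_{pq,rs} , \tag{34}$$
where we introduce the one and two electron hamiltonian integrals
$$h_{pq} = \langle\phi_p|h|\phi_q\rangle = \int \mathrm{d}r_1\,\phi_p^*(1)\,h(1)\,\phi_q(1) \tag{35}$$
$$(pq|rs) = \langle\phi_p(1)|\langle\phi_r(2)|\,r_{12}^{-1}\,|\phi_q(1)\rangle|\phi_s(2)\rangle = \int \mathrm{d}r_1 \int \mathrm{d}r_2\,\phi_p^*(1)\,\phi_r^*(2)\,r_{12}^{-1}\,\phi_q(1)\,\phi_s(2) . \tag{36}$$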
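For matrix elements between the $N$ electron basis functions we then have
$$\langle\Phi_I|H|\Phi_J\rangle = \langle\Phi_I|H_M|\Phi_J\rangle = Z\,\langle\Phi_I|\Phi_J\rangle + \sum_{pq} h_{pq}\,\langle\Phi_I|E_{pq}|\Phi_J\rangle + \tfrac12\sum_{pqrs} (pq|rs)\,\langle\Phi_I|E_{pq,rs}|\Phi_J\rangle . \tag{37}$$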
In this way, we separate integrals $h_{pq}$, $(pq|rs)$ and coupling coefficients $d_{pq}^{IJ} = \langle\Phi_I|E_{pq}|\Phi_J\rangle$, $D_{pqrs}^{IJ} = \langle\Phi_I|E_{pq,rs}|\Phi_J\rangle$. The coupling coefficients depend only on the algebraic structure of the $N$ electron functions, and not on such factors as molecular geometry, external fields, etc.
We illustrate the use of the second-quantized formalism by considering CI wavefunctions for two electrons. Unnormalized spin-adapted basis functions can be constructed as
$$\Phi_{pq} = \tfrac12\left[\mathcal{A}(\phi_p^\alpha\phi_q^\beta) \pm \mathcal{A}(\phi_q^\alpha\phi_p^\beta)\right] , \tag{38}$$
with the upper (+) sign for spin $S = 0$ (singlet) and the lower ($-$) for $S = 1$ (triplet). The total wavefunction can then be expanded in this basis, with $C_{pq} = \pm C_{qp}$, as
$$\Psi = \sum_{p \ge q} (2 - \delta_{pq})\,C_{pq}\,\Phi_{pq} = \sum_{pq} C_{pq}\,\Phi_{pq} . \tag{39}$$
The orbital excitation operator $E_{rs}$ when acting on $\Phi_{pq}$ will completely annihilate the function if $s$ is not equal to at least one of $p, q$; otherwise, each occurrence of $\phi_s$ is replaced by $\phi_r$. Thus
$$E_{rs}\Phi_{pq} = (1 \pm \tau_{pq})\,\delta_{sq}\,\Phi_{pr} \tag{40}$$
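and then
$$E_{rs,tu}\Phi_{pq} = (1 \pm \tau_{pq})\,\delta_{sp}\,\delta_{uq}\,\Phi_{rt} , \tag{41}$$
where $\tau_{pq}$ has the effect of swapping the labels $p, q$ in whatever follows it. Then the action of the hamiltonian operator is
$$H\Phi_{pq} = (1 \pm \tau_{pq})\left[\sum_{r} h_{rq}\,\Phi_{pr} + \tfrac12\sum_{rs} (rp|qs)\,\Phi_{rs}\right] \tag{42}$$
i.e.,
$$H\Psi = \sum_{pq} C_{pq}\,H\Phi_{pq} = \sum_{rs} \Phi_{rs}\left(K(C)_{rs} + 2(hC)_{rs}\right) . \tag{43}$$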
Here, we have defined a generalized exchange matrix $K(C)$, which for any given coefficient matrix $C$ is
$$K(C)_{rs} = \sum_{pq} C_{pq}\,(rp|qs) . \tag{44}$$
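A direct transcription of (44) into code is short. The sketch below stores the integrals in a nested list with `eri[p][q][r][s]` holding $(pq|rs)$. The energy expression is not given in the text: it is what one obtains by projecting (43) onto the wavefunction for a real, symmetric (singlet) coefficient matrix, so it should be read as a derived assumption. Like (42), it covers only the electronic part and leaves out the nuclear repulsion $Z$. The two-orbital integral values are hypothetical numbers chosen only to exercise the code.

```python
def generalized_exchange(C, eri):
    """K(C)_rs = sum_pq C_pq (rp|qs), with eri[p][q][r][s] = (pq|rs)."""
    m = len(C)
    return [[sum(C[p][q] * eri[r][p][q][s] for p in range(m) for q in range(m))
             for s in range(m)] for r in range(m)]

def singlet_energy(C, h, eri):
    """Rayleigh quotient sum_rs C_rs [K(C) + 2 hC]_rs / sum_rs C_rs^2 for symmetric C."""
    m = len(C)
    K = generalized_exchange(C, eri)
    hC = [[sum(h[r][p] * C[p][s] for p in range(m)) for s in range(m)] for r in range(m)]
    numerator = sum(C[r][s] * (K[r][s] + 2.0 * hC[r][s])
                    for r in range(m) for s in range(m))
    return numerator / sum(C[r][s] ** 2 for r in range(m) for s in range(m))

def two_orbital_integrals(j11, j22, j12, k12):
    """Two-electron integrals for two real orbitals of different symmetry."""
    eri = [[[[0.0] * 2 for _ in range(2)] for _ in range(2)] for _ in range(2)]
    eri[0][0][0][0] = j11
    eri[1][1][1][1] = j22
    eri[0][0][1][1] = eri[1][1][0][0] = j12
    for p, q, r, s in [(0, 1, 0, 1), (0, 1, 1, 0), (1, 0, 0, 1), (1, 0, 1, 0)]:
        eri[p][q][r][s] = k12
    return eri

# Hypothetical integrals for a two-orbital, two-electron model (not from the text).
h = [[-1.25, 0.0], [0.0, -0.48]]
j11, j22, j12, k12 = 0.67, 0.70, 0.66, 0.18
eri = two_orbital_integrals(j11, j22, j12, k12)

e_single = singlet_energy([[1.0, 0.0], [0.0, 0.0]], h, eri)     # one closed-shell configuration
e_two = singlet_energy([[0.99, 0.0], [0.0, -0.11]], h, eri)     # two-configuration mixture
print(e_single, e_two)
```

With a single doubly occupied orbital the expression reduces to $2h_{11} + (11|11)$, the familiar closed-shell two-electron energy, which gives some confidence in the factor of 2 in front of $hC$.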
1.5 Orbital basis sets
Calculations with complete (infinite) orbital basis sets are impossible; therefore, one immediately wants to know how to choose optimally a finite basis set such that the CI wavefunction is as close to the exact wavefunction as possible for a given number of orbitals. Insight into this problem can be gained from the two-electron example developed above. Consider the one-electron density matrix generated by the wavefunction, defined as
$$d_{pq} = \langle\Psi|E_{pq}|\Psi\rangle \tag{45}$$
For the two-electron example, it is straightforward to show using (40) that
$$d_{pq} = 2\sum_{s} C_{sp}\,C_{sq} , \tag{46}$$
or $d = 2C^\dagger C$. Suppose that we now consider truncating the basis set by deleting the last ($m$th) orbital to leave $m - 1$ remaining functions. The overlap between the new and old wavefunctions is
$$\langle\Psi_{\mathrm{New}}|\Psi_{\mathrm{Old}}\rangle = \langle\Psi_{\mathrm{Old}}|\Psi_{\mathrm{Old}}\rangle - 2\sum_{pqr} C_{pq}C_{rm}\,\langle\Phi_{pq}|\Phi_{rm}\rangle + \sum_{pq} C_{pq}C_{mm}\,\langle\Phi_{pq}|\Phi_{mm}\rangle = 1 - 2(C^\dagger C)_{mm} + C_{mm}^2 \tag{47}$$
Ignoring the last ($C_{mm}^2$) term, which can be shown to be of lesser importance, we deduce that the amount that the overlap differs from unity is $d_{mm}$. Consider making linear transformations amongst the underlying orbitals. Of all the possible transformations, the one which minimises $d_{mm}$ is that which brings $d$ to diagonal form, with $d_{mm}$ being the smallest eigenvalue. Such orbitals are known as natural orbitals (NOs), and are of great utility in interpreting correlated many-electron wavefunctions. The trace of the density matrix is equal to the number of electrons, leading to an interpretation of the eigenvalues as occupation numbers. In the above example, therefore, if natural orbitals are chosen, the effects of deleting the last ($m$-th) orbital are minimized. In other words, the CI wavefunction in $m - 1$ orbitals is as good as it can be. We have thus shown that of all the possible choices of orbitals, natural orbitals offer the most compact or efficient basis set, for a two-electron system.
For many-electron systems, the situation is, of course, more complicated. One can still define natural orbitals as density matrix eigenvectors, but their relationship with the wavefunction is not so transparent. For the special case of CI wavefunctions that contain up to double excitations from the Hartree-Fock determinant, one can also construct pair natural orbitals (PNOs) for each pair of occupied orbitals that are excited; these PNOs do have similar properties to the two-electron NOs, and typically show a similar convergence of eigenvalues towards zero. The true NOs, however, are an average of the various PNOs, and the convergence of their spectrum and their usefulness in evaluating the correlating effect of basis functions is usually less advantageous.
In contrast to Hartree-Fock, where reasonably good wavefunctions can be obtained using a double-zeta plus polarization (DZP) basis set allowing for simple contraction and deformation of atomic orbitals, a much larger basis set is required for recovering a large fraction of the correlation energy; i.e., the sequence of NO occupation numbers is found to be rather slowly convergent. It is then not a trivial problem to decide straightaway what basis functions $\{\chi_\mu\}$ should be used for optimum recovery of electron correlation effects.
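The two-electron natural orbital analysis can be illustrated with a small numerical sketch. The coefficient matrix below is a hypothetical normalized singlet example, and the cyclic Jacobi routine is a generic textbook eigenvalue method used here only to keep the example free of external libraries. Neither is taken from the text.

```python
import math

def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-24):
    """Eigenvalues of a small real symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that annihilates the (p, q) element.
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):                      # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):                      # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

# Hypothetical symmetric (singlet) coefficient matrix, normalized so that sum_pq C_pq^2 = 1.
raw = [[0.95, 0.10, 0.05],
       [0.10, -0.25, 0.02],
       [0.05, 0.02, -0.10]]
norm = math.sqrt(sum(x * x for row in raw for x in row))
C = [[x / norm for x in row] for row in raw]
m = len(C)

# One-electron density matrix d_pq = 2 sum_s C_sp C_sq, as in the two-electron example.
d = [[2.0 * sum(C[s][p] * C[s][q] for s in range(m)) for q in range(m)] for p in range(m)]
occupations = jacobi_eigenvalues(d)
print("natural occupation numbers:", occupations)
print("sum of occupation numbers :", sum(occupations))
```

The occupation numbers sum to the number of electrons, and the smallest one is never larger than the smallest diagonal element of d in the original basis, which is why deleting the least occupied natural orbital does the least damage.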
The idea of using natural orbitals to obtain basis sets is taken to the extreme in the atomic natural orbital (ANO) basis scheme [15]. Here, the basis functions are (approximate) atomic natural orbitals, obtained from a CI calculation on each of the molecule's constituent atoms. The idea is that the ANOs, which are near-optimum correlating functions for the atomic problem, will be good functions for describing molecular electron correlation. Within each of the atomic symmetries (s, p, d, ...), each contracted basis function is a linear combination of all the primitive gaussian functions; thus each primitive function enters into all contractions (general contraction). Within the ANO scheme, there also arises the concept of sequences of basis sets, in which each basis set is derived from the previous one by the addition of the next most important atomic natural orbital. This allows for the systematic improvement of basis sets and consequent elimination of possible spurious errors arising from unbalanced choices of basis functions. For example, for most first row atoms, examination of the ANO occupation numbers identifies [3s2p1d], [4s3p2d1f] and [5s4p3d2f1g] as good choices of contracted basis sets, whilst a set such as [5s3p2d2f2g] is unbalanced, and would be inefficient in recovering electron correlation effects.
For certain applications, selection of a small or medium-sized ANO set will not necessarily result in a good basis set, and can lead to spurious results. An example is the calculation of atomic or molecular electrical polarizabilities. Here, it is vital to include diffuse basis functions, particularly of d type in the usual case that the highest atomic shell is of p type. Such basis functions do not appear in the set which is optimum for the correlation problem, and so such functions must be included additionally, or the basis set redesigned somewhat. This case occurs to a milder degree in all molecules, where the atomic functions are polarized by their neighbours; even for SCF calculations, polarization functions are required to cover this effect, and the optimum gaussian exponents are not necessarily related to those best for correlation. Another type of calculation which presents problems for ANO sets is that where several different atomic states are involved; the classic case is in transition metal chemistry, where $d^n s^2$, $d^{n+1} s^1$ and $d^{n+2}$ atomic states often all make significant contributions to the molecular situation. ANO bases based on each state are drastically different, particularly for the d orbitals, which are much more diffuse in $d^{n+2}$ than in $d^n$; so the use of an ANO set derived from one particular atomic state can introduce an unwanted bias towards that state. A partial solution is to select functions which are eigenfunctions of the sum of the density matrices for each state [16,17,18], although caution is still needed.
For general applications, a good compromise is found in the correlation consistent basis sets [19], which are similar to ANO sets, except that the most diffuse s and p functions are left uncontracted, and the polarization functions are simple uncontracted gaussians designed to cover both the polarization and correlation requirements. In fact, the advantage in using ANOs for the polarization functions is not that great, and the correlation consistent basis sets are usually more compact than standard ANOs for a given level of accuracy. Just as with ANOs, a systematic sequence of basis sets is defined, with members conventionally denoted cc-pVDZ, cc-pVTZ, cc-pVQZ, cc-pV5Z, etc., which for first row atoms comprise 3s2p1d, 4s3p2d1f, 5s4p3d2f1g, 6s5p4d3f2g1h, ....
1.6 Dynamical vs. Non-Dynamical Correlation
The correlation energy arising from overestimation of short-range electron repulsions in Hartree-Fock wavefunctions is usually referred to as dynamical correlation. Dynamical correlation is always reduced when a normal chemical bond (i.e., doubly occupied orbital) is broken. It is the neglect of dynamical correlation which causes the RHF equilibrium energy of F2 to be higher than twice the RHF energy of a fluorine atom, since in F2 there are 9 pairs of electrons, but in each F there are only 4. The effect is so pronounced for F because the molecular orbitals are considerably smaller than their atomic parents, and crowding the electrons together means there is more correlation energy. Where dynamical correlation effects are important, Hartree-Fock will therefore generally overestimate bond lengths and underestimate binding. An extreme example is that of rare-gas dimers, which are unbound at the Hartree-Fock level, but in reality are held together by dispersion, which is a manifestation of dynamic correlation. That part of the correlation energy arising from long-range correlation effects, such as observed on molecular dissociation, is often referred to as non-dynamical (or static) correlation. Static correlation effects mean that (spin-restricted) Hartree-Fock tends to artificially overbind molecules, underestimating bond lengths and overestimating vibrational frequencies. Thus the effects of dynamic and non-dynamic correlation are very often in opposition, and the partial cancellation of correlation errors enhances the value of SCF; it is often observed that, for example, use of methods which represent properly the non-dynamical correlation effects leads to
much worse agreement of computed properties with experiment than RHF. The division between dynamical and non-dynamical correlation is difficult to define in most cases. For example, when thinking about electron correlation in a bond in a molecule, the radial and angular short-range concepts are somewhat blurred with the ideas of long-range dissociation-enabling correlation. One useful visualization is that the non-dynamical correlation is that which is recovered with the minimum CI expansion describing properly all correlation effects; in contrast, convergence of the dynamical correlation energy with increasing size of CI expansion is very slow.

When non-dynamical correlation is weak, Hartree-Fock theory already provides a qualitatively correct description of the wavefunction. Under such circumstances, which, fortunately, apply for the majority of molecules in their ground state near equilibrium geometry, one may use single-reference methods for representing the dynamical correlation effect. These methods build on the SCF reference determinant, typically using perturbative arguments to define classes of configurations or excitations deemed to be of most importance in constructing an approximate correlated wavefunction.

For most excited states, for molecules that are close to dissociation, and for situations in which there is near electronic degeneracy, Hartree-Fock is a poor approximation. Static correlation effects often mean that there is no single Slater determinant that dominates the wavefunction, and perturbative or other approaches that assume a good single-reference starting point are doomed to failure. Under such circumstances, a viable way forward is to first deal with the static correlation problem using a CI expansion that covers all the important effects. One may then go further using this many-determinant reference as a starting point for further recovery of the dynamic correlation. Such approaches are termed multi-reference methods.
2 Closed-Shell single-reference methods
In this section we will discuss the most important electron correlation methods based on closed-shell Hartree-Fock reference functions. This includes Møller-Plesset perturbation theory, singles and doubles configuration interaction (CISD), and non-variational variants like the coupled-electron pair approximation (CEPA), as well as coupled cluster methods with single and double excitations (CCSD). The effect of triple excitations can be accounted for by perturbation theory, leading to CCSD(T). From a computational point of view, it is important to minimize the logic in the code, and to formulate the theory in terms of matrix and vector operations. The most efficient operations one can perform on any kind of current hardware are matrix multiplications. This applies to vector computers as well as to RISC workstations or even PCs. The reason for this is that on most machines the bottleneck is not the floating point operation itself, but getting the data from memory, in particular if the quantities involved do not fit into the fast cache. By an appropriate unrolling of the three loops in a matrix multiplication one can ensure that each data element obtained from memory is used in several floating point operations, and in this way often about 80% of the theoretical peak performance can be achieved.
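The data-reuse argument can be illustrated with a tiled matrix product. The following is only a sketch in Python/NumPy, not the scheme used in any particular program: the function name `blocked_matmul` and the tile size are assumptions made for illustration, and in practice one would simply call an optimized BLAS routine. Each tile that is loaded takes part in many multiply-add operations before the next tile is fetched.

```python
import numpy as np

def blocked_matmul(a, b, block=32):
    """Tiled C = A @ B (illustrative): every loaded tile is reused
    for block**2 multiply-adds per row/column of the partner tile."""
    n, k = a.shape
    k2, m = b.shape
    assert k == k2, "inner dimensions must agree"
    c = np.zeros((n, m))
    for i0 in range(0, n, block):          # tiles of rows of C
        for j0 in range(0, m, block):      # tiles of columns of C
            for k0 in range(0, k, block):  # tiles of the contracted index
                c[i0:i0 + block, j0:j0 + block] += (
                    a[i0:i0 + block, k0:k0 + block]
                    @ b[k0:k0 + block, j0:j0 + block]
                )
    return c

rng = np.random.default_rng(0)
A = rng.standard_normal((150, 130))
B = rng.standard_normal((130, 170))
C = blocked_matmul(A, B, block=32)
```

The tile size would normally be chosen so that three tiles fit into the fast cache at the same time.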
For the formulation of the theory in terms of matrix multiplications it is essential to use unnormalized or even non-orthogonal configuration state functions. We start with a general discussion of the configuration spaces which are common to all methods discussed in the subsequent sections.

2.1 The first-order interacting space
According to second-order perturbation theory, the most important contributions to the correlation energy arise from configurations $\Phi_I$ which have non-zero matrix elements $\langle\Phi_I|H|\Psi_{\rm SCF}\rangle$, i.e., which span the first-order interacting space of $\Psi_{\rm SCF}$. In the following, the SCF wavefunction will be denoted $|0\rangle \equiv \Psi_{\rm SCF}$. According to the Slater-Condon rules, only Slater determinants which differ by at most two spin-orbitals from the Hartree-Fock determinant can contribute. The spin-adapted singly and doubly excited configurations are conveniently generated by applying the excitation operators $E_{ai}$ to the reference function,

$$\Phi_i^a = E_{ai}|0\rangle , \tag{48}$$

$$\Phi_{ij}^{ab} = E_{ai}E_{bj}|0\rangle , \tag{49}$$

where $i, j$ refer to occupied orbitals in $|0\rangle$, and $a, b$ to virtual orbitals (unoccupied in $|0\rangle$). If $|0\rangle$ is an optimized closed-shell Hartree-Fock wavefunction, the matrix elements $\langle\Phi_i^a|H|0\rangle = 2f_{ai}$ vanish for all single replacements $\Phi_i^a$, since the optimized orbitals satisfy the conditions $f_{ai} = 0$ (Brillouin theorem). Therefore, the first-order wavefunction is a linear combination of all doubly excited configurations $\Phi_{ij}^{ab}$,

$$\Psi^{(1)} = \frac{1}{2}\sum_{ij}\sum_{ab} T_{ab}^{ij}\,\Phi_{ij}^{ab} , \tag{50}$$

where $T_{ab}^{ij}$ are the amplitudes. Note that the operators $E_{ai}$ and $E_{bj}$ commute, and therefore

$$\Phi_{ij}^{ab} = \Phi_{ji}^{ba} , \tag{51}$$

i.e., the configuration set used in the expansion of $\Psi^{(1)}$ is redundant. In the formulation of correlation theories it will be convenient to use this redundant set, but we must account for this by the restriction

$$T_{ab}^{ij} = T_{ba}^{ji} . \tag{52}$$

We will consider $T_{ab}^{ij}$ as matrices with elements $ab$. Different matrices are labeled by the superscripts $ij$:

$$[\mathbf{T}^{ij}]_{ab} = T_{ab}^{ij} , \qquad \mathbf{T}^{ij} = \mathbf{T}^{ji\dagger} . \tag{53}$$

The matrix elements for $i > j$, all $a, b$ and $i = j$, $a \ge b$ form the non-redundant set of amplitudes. The definition of the doubly excited configurations in eq. (49) is most simple, but has the disadvantage that the resulting functions are non-orthogonal. Using the commutation relations (30) and the fact that zero results if an external annihilator acts on the reference function $|0\rangle$, one obtains

$$\langle\Phi_{ij}^{ab}|\Phi_{kl}^{cd}\rangle = \delta_{ac}\delta_{bd}\,\langle 0|E_{ik,jl}|0\rangle + \delta_{ad}\delta_{bc}\,\langle 0|E_{il,jk}|0\rangle , \tag{54}$$
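where $\langle 0|E_{ik,jl}|0\rangle$ are the elements of the second-order reduced density matrix of the reference function. For closed-shell Hartree-Fock reference functions one obtains explicitly

$$\langle 0|E_{ik,jl}|0\rangle = 4\delta_{ik}\delta_{jl} - 2\delta_{il}\delta_{jk} ,$$

$$\langle\Phi_{ij}^{ab}|\Phi_{kl}^{cd}\rangle = \delta_{ac}\delta_{bd}\,(4\delta_{ik}\delta_{jl} - 2\delta_{il}\delta_{jk}) + \delta_{ad}\delta_{bc}\,(4\delta_{il}\delta_{jk} - 2\delta_{ik}\delta_{jl}) . \tag{55}$$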
Straightforward use of these non-orthogonal configurations is in principle possible, but leads to some complications. There are two ways for simplification: in the first case a set of orthogonal configuration state functions is defined as

$$\Phi_{ijp}^{ab} = \frac{1}{2}\left(\Phi_{ij}^{ab} + p\,\Phi_{ij}^{ba}\right) \quad \text{for } p = \pm 1,\; i \ge j,\; a \ge b , \tag{56}$$

where $p = 1$ corresponds to singlet coupling of the two external electrons, and $p = -1$ to triplet coupling. Note that these functions are not normalized; for a closed-shell reference function we have

$$\langle\Phi_{ijp}^{ab}|\Phi_{klq}^{cd}\rangle = (2 - p)\,\delta_{pq}\,(\delta_{ac}\delta_{bd} + p\,\delta_{ad}\delta_{bc})(\delta_{ik}\delta_{jl} + p\,\delta_{il}\delta_{jk}) , \tag{57}$$

and thus the normalization factors are

$$\langle\Phi_{ijp}^{ab}|\Phi_{ijq}^{ab}\rangle = (2 - p)\,\delta_{pq}\,(1 + p\,\delta_{ab})(1 + p\,\delta_{ij}) . \tag{58}$$

As will become clear later, for an efficient formulation of all electron correlation methods it is essential not to normalize the configurations. This was first realized in the theory of self-consistent electron pairs (SCEP) by Meyer [20], who showed that by using unnormalized configurations all terms involving the virtual orbital labels $a, b, \ldots$ can be formulated in a computationally convenient matrix form without any logic. Most importantly, this concerns the factor $(1 + p\,\delta_{ab})$, which implies a different normalization for diagonal configurations ($a = b$) than for non-diagonal ones ($a \ne b$). We note that in the original SCEP theory of Meyer [20] the configurations were normalized by the factors $[(2 - p)(1 + p\,\delta_{ij})]^{-1/2}$, but this leads to some unnecessary factors in the resulting equations. A similar definition is possible for multireference wavefunctions and will be used in section 5.

For single-reference methods it turns out that even simpler equations can be obtained by directly using the configurations (49) together with a set of contravariant configurations [21,22]

$$\tilde\Phi_{ab}^{ij} = \frac{1}{6}\left(2\Phi_{ij}^{ab} + \Phi_{ji}^{ab}\right) , \tag{59}$$

which have the properties

$$\langle\tilde\Phi_{ab}^{ij}|\Phi_{kl}^{cd}\rangle = \delta_{ac}\delta_{bd}\delta_{ik}\delta_{jl} + \delta_{ad}\delta_{bc}\delta_{il}\delta_{jk} , \tag{60}$$

$$\langle\tilde\Phi_{ab}^{ij}|\Psi^{(1)}\rangle = T_{ab}^{ij} , \tag{61}$$

$$\langle\tilde\Phi_{ab}^{ij}|H|\Psi^{(0)}\rangle = (ai|bj) . \tag{62}$$
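The biorthogonality property (60) can be checked numerically from the closed-shell overlap formula (55) and the definition (59). The sketch below, in Python/NumPy, builds the overlap as an eight-index array for a small, arbitrarily chosen number of occupied and virtual orbitals; the function name `pair_overlap` and the array layout `[a, b, i, j, c, d, k, l]` are assumptions made purely for this illustration.

```python
import numpy as np

def pair_overlap(n_occ, n_virt):
    """S[a,b,i,j,c,d,k,l] = <Phi^{ab}_{ij}|Phi^{cd}_{kl}> according to eq. (55)."""
    d_o, d_v = np.eye(n_occ), np.eye(n_virt)
    A = np.einsum('ac,bd->abcd', d_v, d_v)   # delta_ac delta_bd
    B = np.einsum('ad,bc->abcd', d_v, d_v)   # delta_ad delta_bc
    X = np.einsum('ik,jl->ijkl', d_o, d_o)   # delta_ik delta_jl
    Y = np.einsum('il,jk->ijkl', d_o, d_o)   # delta_il delta_jk
    S = (np.einsum('abcd,ijkl->abijcdkl', A, 4 * X - 2 * Y)
         + np.einsum('abcd,ijkl->abijcdkl', B, 4 * Y - 2 * X))
    return S, A, B, X, Y

S, A, B, X, Y = pair_overlap(n_occ=2, n_virt=3)

# Bra side of eq. (59): (2 Phi^{ab}_{ij} + Phi^{ab}_{ji}) / 6, i.e. swap i and j in the bra.
S_contra = (2.0 * S + S.transpose(0, 1, 3, 2, 4, 5, 6, 7)) / 6.0

# Right-hand side of eq. (60).
expected = (np.einsum('abcd,ijkl->abijcdkl', A, X)
            + np.einsum('abcd,ijkl->abijcdkl', B, Y))
```

The original configurations are visibly non-orthogonal (for example, the overlap between $\Phi_{ij}^{ab}$ and $\Phi_{ji}^{ab}$ with $a \ne b$, $i \ne j$ is $-2$), whereas the mixed overlap `S_contra` reduces to products of Kronecker deltas.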
The last expression is obtained by inserting the hamiltonian in second quantization (cf. eq. (34))

$$\langle\tilde\Phi_{ab}^{ij}|H|\Psi^{(0)}\rangle = \frac{1}{2}\sum_{rstu}\langle\tilde\Phi_{ab}^{ij}|E_{rs,tu}|\Psi^{(0)}\rangle\,(rs|tu) , \tag{63}$$
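and realizing that the indices $r, t$ must be external and match $a, b$, while $s, u$ must be internal and match $i, j$ according to eq. (60):

$$\langle\tilde\Phi_{ab}^{ij}|H|\Psi^{(0)}\rangle = \frac{1}{2}\sum_{kl}\sum_{cd}\langle\tilde\Phi_{ab}^{ij}|E_{ck}E_{dl}|\Psi^{(0)}\rangle\,(ck|dl) = \frac{1}{2}\sum_{kl}\sum_{cd}\langle\tilde\Phi_{ab}^{ij}|\Phi_{kl}^{cd}\rangle\,(ck|dl) = (ai|bj) . \tag{64}$$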
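We can now express $\Psi^{(1)}$ either in the original basis or in the basis of contravariant functions,

$$\Psi^{(1)} = \frac{1}{2}\sum_{ij}\sum_{ab} T_{ab}^{ij}\,\Phi_{ij}^{ab} = \sum_{ij}\sum_{ab}\tilde T_{ab}^{ij}\,\tilde\Phi_{ab}^{ij} , \tag{65}$$

which leads to

$$\tilde T_{ab}^{ij} = 2T_{ab}^{ij} - T_{ab}^{ji} \qquad \text{or} \qquad \tilde{\mathbf{T}}^{ij} = 2\mathbf{T}^{ij} - \mathbf{T}^{ji} . \tag{66}$$

The factor $\frac{1}{2}$ has been omitted in the second sum for convenience in later expressions. For the singles we can define the contravariant space analogously, but in this case only the normalization of $\tilde\Phi_a^i$ and $\Phi_i^a$ differs:

$$\tilde\Phi_a^i = \frac{1}{2}\Phi_i^a , \tag{67}$$

$$\tilde t_a^i = 2t_a^i . \tag{68}$$

2.2 Matrix notation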
We have seen above that the amplitudes $T_{ab}^{ij}$ for a given correlated orbital pair $(ij)$ can be considered as a matrix $\mathbf{T}^{ij}$, and the amplitudes $t_a^i$ of the single excitations as vectors $\mathbf{t}^i$. Unless otherwise noted, here and in the following $i, j, k, l$ refer to occupied orbitals, $a, b, c, d$ to virtual orbitals (unoccupied in the reference function), and $p, q, r, s$ to any orbitals. In open-shell and MCSCF methods, $t, u, v, w$ will denote open-shell (active) orbitals. Similarly, it is convenient to order the two-electron integrals over two occupied and two virtual orbitals into matrices. In this case there are two types, namely Coulomb and exchange matrices

$$J_{ab}^{ij} = (ab|ij) , \tag{69}$$

$$K_{ab}^{ij} = (ai|bj) . \tag{70}$$

The labels $ij$ refer to different matrices, and $ab$ to their elements. Often it will be possible to write equations in matrix form, involving matrix multiplications and additions, and then bold face letters will be used for matrices, e.g., $\mathbf{J}^{ij}$ and $\mathbf{K}^{ij}$. For convenience in later expressions, we also define

$$L_{ab}^{ij} = 2K_{ab}^{ij} - K_{ba}^{ij} \tag{71}$$

and the closed-shell Fock matrix

$$f_{rs} = h_{rs} + \sum_i \left[2J_{rs}^{ii} - K_{rs}^{ii}\right] . \tag{72}$$
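As an illustration of this notation, the matrices of eqs. (69)-(72) can be sliced out of a full array of MO integrals. The following Python/NumPy sketch uses synthetic integrals that merely have the correct permutational symmetry; the variable names, the storage layout `[i, j, a, b]`, and the random construction of the integrals are assumptions for the example, not taken from any program.

```python
import numpy as np

rng = np.random.default_rng(1)
n_orb, n_occ = 6, 2

# Synthetic (pq|rs) = sum_P B[P,p,q] B[P,r,s]: symmetric under p<->q, r<->s, (pq)<->(rs).
Bfac = rng.standard_normal((10, n_orb, n_orb))
Bfac = Bfac + Bfac.transpose(0, 2, 1)
eri = np.einsum('Ppq,Prs->pqrs', Bfac, Bfac)
h = rng.standard_normal((n_orb, n_orb))
h = h + h.T                                   # synthetic one-electron matrix

o, v = slice(0, n_occ), slice(n_occ, n_orb)
J = eri[v, v, o, o].transpose(2, 3, 0, 1)     # J[i,j,a,b] = (ab|ij), eq. (69)
K = eri[v, o, v, o].transpose(1, 3, 0, 2)     # K[i,j,a,b] = (ai|bj), eq. (70)
L = 2.0 * K - K.transpose(0, 1, 3, 2)         # L^{ij}_{ab} = 2K^{ij}_{ab} - K^{ij}_{ba}, eq. (71)

# Closed-shell Fock matrix over all orbitals, eq. (72):
# f_rs = h_rs + sum_i [2 (rs|ii) - (ri|si)]
f = (h + 2.0 * np.einsum('rsii->rs', eri[:, :, o, o])
       - np.einsum('risi->rs', eri[:, o, :, o]))
```

With this layout `K[i, j]` is the matrix $\mathbf{K}^{ij}$, and the relation $\mathbf{K}^{ij} = \mathbf{K}^{ji\dagger}$ follows directly from the integral symmetry.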
In the subsequent sections, the matrix $\mathbf{f}$ will only refer to the external part, i.e., the elements $f_{ab}$.

2.3 Second-order Møller-Plesset perturbation theory
The simplest method to treat electron correlation is Møller-Plesset perturbation theory, which is a special variant of Rayleigh-Schrödinger perturbation theory, with the zeroth-order hamiltonian

$$H^{(0)} = \sum_{i=1}^{N_{el}} f(i) = \sum_{rs} E_{rs} f_{rs} , \tag{73}$$

and with

$$H^{(1)} = H - H^{(0)} , \tag{74}$$
where $f(i)$ is the closed-shell Fock operator for electron $i$. For optimized orbitals the matrix elements $f_{ai}$ vanish (Brillouin conditions), and it is then easily shown that the Hartree-Fock wavefunction $\Psi^{(0)} = \Psi_{\rm SCF}$ is an eigenfunction of $H^{(0)}$, i.e.,

$$H^{(0)}\Psi^{(0)} = E^{(0)}\Psi^{(0)} , \tag{75}$$

$$E^{(0)} = 2\sum_{i=1}^{m_{occ}} f_{ii} , \tag{76}$$

$$E^{(0)} + E^{(1)} = \langle\Psi^{(0)}|H|\Psi^{(0)}\rangle = E^{\rm SCF} , \tag{77}$$
where $E^{\rm SCF}$ is the Hartree-Fock energy expectation value. The first-order wavefunction is expanded according to eq. (50), and the amplitudes $T_{ab}^{ij}$ are obtained by solving the first-order perturbation equations

$$\langle\tilde\Phi_{ab}^{ij}|H^{(0)} - E^{(0)}|\Psi^{(1)}\rangle + \langle\tilde\Phi_{ab}^{ij}|H|\Psi^{(0)}\rangle = 0 \tag{78}$$

for all $i \ge j$, $ab$. Inserting eq. (50) and evaluating the matrix elements yields the linear equations

$$R_{ab}^{ij} = K_{ab}^{ij} + \sum_c \left[f_{ac}T_{cb}^{ij} + T_{ac}^{ij}f_{cb}\right] - \sum_k \left[f_{ik}T_{ab}^{kj} + T_{ab}^{ik}f_{kj}\right] = 0 . \tag{79}$$
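For the case that canonical Hartree-Fock orbitals are used, which obey

$$f_{ij} = \epsilon_i\,\delta_{ij} , \tag{80}$$

$$f_{ab} = \epsilon_a\,\delta_{ab} , \tag{81}$$

one obtains

$$R_{ab}^{ij} = K_{ab}^{ij} + (\epsilon_a + \epsilon_b - \epsilon_i - \epsilon_j)\,T_{ab}^{ij} , \tag{82}$$

$$T_{ab}^{ij} = -K_{ab}^{ij}/(\epsilon_a + \epsilon_b - \epsilon_i - \epsilon_j) , \tag{83}$$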
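which is, of course, the well-known MP2 expression. Using eqs. (61) and (62) the second-order energy takes the form

$$E^{(2)} = \langle\Psi^{(0)}|H|\Psi^{(1)}\rangle = \sum_{ij}\sum_{ab}\langle\Psi^{(0)}|H|\tilde\Phi_{ab}^{ij}\rangle\,\tilde T_{ab}^{ij} = \sum_{ij}\langle\mathbf{K}^{ij}\tilde{\mathbf{T}}^{ji}\rangle = \sum_{ij}\langle\mathbf{K}^{ij}(2\mathbf{T}^{ji} - \mathbf{T}^{ij})\rangle , \tag{84}$$

where

$$\langle\mathbf{K}^{ij}\tilde{\mathbf{T}}^{ji}\rangle = \sum_{ab} K_{ab}^{ij}\,\tilde T_{ba}^{ji} \tag{85}$$

$$\phantom{\langle\mathbf{K}^{ij}\tilde{\mathbf{T}}^{ji}\rangle} = \sum_{ab} K_{ab}^{ij}\,\tilde T_{ab}^{ij} \tag{86}$$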
denotes the trace of the matrix product in the brackets. From the above equations it is obvious that evaluating the second-order energy is trivial once the exchange integrals $K_{ab}^{ij} = (ai|bj)$ are available. These integrals are in the MO basis, and must therefore be generated from the two-electron integrals in the AO basis by a four-index transformation

$$(ai|bj) = \sum_{\mu\nu\rho\sigma} X_{\mu a}X_{\rho b}X_{\nu i}X_{\sigma j}\,(\mu\nu|\rho\sigma) . \tag{87}$$
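This transformation is most efficiently done in four steps, each being a matrix multiplication, i.e.

$$(\mu\nu|\rho j) = \sum_\sigma (\mu\nu|\rho\sigma)\,X_{\sigma j} , \tag{88}$$

$$(\mu i|\rho j) = \sum_\nu (\mu\nu|\rho j)\,X_{\nu i} , \tag{89}$$

$$(\mu i|bj) = \sum_\rho (\rho j|\mu i)\,X_{\rho b} , \tag{90}$$

$$(ai|bj) = \sum_\mu (\mu i|bj)\,X_{\mu a} . \tag{91}$$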
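The four quarter transformations (88)-(91) can be sketched with `numpy.einsum`. The example below is a minimal illustration on synthetic data: the AO integrals are random numbers with the correct permutational symmetry, the coefficient matrix `X` is a random orthogonal matrix, and the orbital counts are arbitrary; none of these are taken from a real calculation. The result is compared with a direct evaluation of eq. (87).

```python
import numpy as np

rng = np.random.default_rng(3)
m, n_occ = 7, 2                                   # basis functions, correlated orbitals

Bfac = rng.standard_normal((9, m, m))
Bfac = Bfac + Bfac.transpose(0, 2, 1)
ao = np.einsum('Pmn,Prs->mnrs', Bfac, Bfac)       # synthetic (mu nu|rho sigma)
X, _ = np.linalg.qr(rng.standard_normal((m, m)))  # synthetic MO coefficients
Xo, Xv = X[:, :n_occ], X[:, n_occ:]               # occupied / virtual columns

q1 = np.einsum('mnrs,sj->mnrj', ao, Xo)           # eq. (88): (mu nu|rho j)
q2 = np.einsum('mnrj,ni->mirj', q1, Xo)           # eq. (89): (mu i|rho j)
q3 = np.einsum('mirj,rb->mibj', q2, Xv)           # eq. (90): (mu i|b j)
K_aibj = np.einsum('mibj,ma->aibj', q3, Xv)       # eq. (91): (a i|b j)

# Reference: all four indices transformed in one contraction, eq. (87).
K_ref = np.einsum('ma,ni,rb,sj,mnrs->aibj', Xv, Xo, Xv, Xo, ao)
```

The intermediate arrays shrink from step to step because an AO index of range `m` is replaced by an occupied index of range `n_occ`, or finally by a virtual index of range `m - n_occ`.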
Since the number of occupied orbitals $i, j$ is usually much smaller than the number of basis functions, the number of transformed integrals becomes smaller in each step, and therefore the first quarter transformation step is most expensive. It requires about $\frac{1}{2} m_{val} m^4$ operations, where $m$ is the number of basis functions and $m_{val}$ the number of correlated orbitals. Since both $m_{val}$ and $m$ increase linearly with system size $N$, the computational effort scales as $O(N^5)$. For large systems not only the computation time but also the storage of the two-electron integrals and intermediate quantities is a severe bottleneck. Chapter 6 discusses integral-direct transformations, in which the integrals $(\mu\nu|\rho\sigma)$ are computed on the fly whenever needed, without ever being stored on disk.
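Once the exchange matrices are available, the MP2 amplitudes of eq. (83) and the energy of eq. (84) take only a few lines. The Python/NumPy sketch below assumes canonical orbitals and the storage layout `K[i, j, a, b] = (ai|bj)`; the function name `mp2_energy` and the numerical inputs are illustrative assumptions. The first example is a two-electron, two-orbital model for which the result can be checked by hand.

```python
import numpy as np

def mp2_energy(K, e_occ, e_virt):
    """Closed-shell MP2 with canonical orbitals; K[i,j,a,b] = (ai|bj)."""
    D = (e_virt[None, None, :, None] + e_virt[None, None, None, :]
         - e_occ[:, None, None, None] - e_occ[None, :, None, None])
    T = -K / D                                   # eq. (83)
    T_tilde = 2.0 * T - T.transpose(1, 0, 2, 3)  # eq. (66): 2 T^{ij} - T^{ji}
    e2 = np.einsum('ijab,ijab->', K, T_tilde)    # eq. (84)
    return e2, T

# One occupied and one virtual orbital: E2 = -(ai|ai)^2 / (2 (e_a - e_i)).
K_model = np.full((1, 1, 1, 1), 0.2)
e2_model, T_model = mp2_energy(K_model, np.array([-0.5]), np.array([0.5]))

# Larger synthetic example with the symmetry K^{ij}_{ab} = K^{ji}_{ba}.
rng = np.random.default_rng(5)
K_rand = rng.standard_normal((3, 3, 4, 4))
K_rand = K_rand + K_rand.transpose(1, 0, 3, 2)
e2_rand, T_rand = mp2_energy(K_rand, np.array([-1.2, -0.9, -0.6]),
                             np.array([0.3, 0.7, 1.0, 1.6]))
```

For the model system the denominator is $2(\epsilon_a - \epsilon_i) = 2$, the amplitude is $-0.1$, and the energy is $0.2 \times (-0.1) = -0.02$.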
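An alternative way to compute the second-order energy is to start from the Hylleraas functional

$$E_2 = 2\langle\Psi^{(1)}|H|\Psi^{(0)}\rangle + \langle\Psi^{(1)}|H^{(0)} - E^{(0)}|\Psi^{(1)}\rangle$$

$$\phantom{E_2} = 2\sum_{ij}\Big[\langle\mathbf{K}^{ij}\tilde{\mathbf{T}}^{ji}\rangle + \langle\mathbf{T}^{ij}\mathbf{f}\,\tilde{\mathbf{T}}^{ji}\rangle - f_{ij}\sum_k\langle\mathbf{T}^{ik}\tilde{\mathbf{T}}^{kj}\rangle\Big]$$

$$\phantom{E_2} = \sum_{ij}\langle(\mathbf{K}^{ij} + \mathbf{R}^{ij})\,\tilde{\mathbf{T}}^{ji}\rangle . \tag{92}$$

Minimizing this functional with respect to the $\tilde T_{ab}^{ij}$ yields

$$\frac{\partial E_2}{\partial\tilde T_{ab}^{ij}} = 2R_{ab}^{ij} , \tag{93}$$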
with the $R_{ab}^{ij}$ defined in eq. (79). Thus, the Hylleraas functional is stationary with respect to small variations of the $T_{ab}^{ij}$ if the first-order perturbation equations are fulfilled, i.e. $R_{ab}^{ij} = 0$. For the corresponding amplitudes we have $E_2 = E^{(2)}$. It is straightforward to show that in general $E_2 \ge E^{(2)}$ for any trial function $\Psi^{(1)}$. The stationary property is very convenient for deriving the MP2 gradient expression and in the context of local electron correlation methods to be discussed later. Even though we will not discuss applications of the methods in this article, it should be noted that the applicability of MP2 is restricted to cases with a sufficiently large HOMO-LUMO gap. If this is not the case, the energy denominators in eq. (83) become small and the perturbation expansion diverges.

2.4 Singles and doubles configuration interaction
In singles and doubles configuration interaction (CISD) the expansion coefficients are determined variationally. Consequently, the resulting energy is an upper bound to the exact energy, but it is not size extensive or size consistent, i.e., it does not scale correctly with the number of electrons or the number of independent subsystems. Therefore, CISD usually yields poor results, and its use is not recommended. However, much better results can be obtained by some simple modifications of the variational conditions, leading to the coupled electron pair approximation (CEPA) [23,24] or the coupled pair functional (CPF) [25], which are approximately size consistent and yield much better results at the same computational cost as CISD. The first matrix formulation of CISD is due to Meyer and known as SCEP theory [20] (cf. section 2.1). This method was formulated originally in the AO basis, but here we will continue to work in a basis of orthogonal MOs, which is somewhat simpler. However, we will come back to the AO formulation when discussing local electron correlation theories. The CISD wavefunction is expanded in terms of the same configurations as used in the MP2 wavefunction, but also includes single excitations

$$\Psi_{\rm CISD} = \Psi_{\rm SCF} + \sum_{ia} t_a^i\,\Phi_i^a + \frac{1}{2}\sum_{ij}\sum_{ab} T_{ab}^{ij}\,\Phi_{ij}^{ab} . \tag{94}$$
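The coefficients $t_a^i$, $T_{ab}^{ij}$ are optimized variationally by minimizing the Rayleigh quotient

$$E^{\rm CISD} = \frac{\langle\Psi_{\rm CISD}|H|\Psi_{\rm CISD}\rangle}{\langle\Psi_{\rm CISD}|\Psi_{\rm CISD}\rangle} . \tag{95}$$

Using eqs. (61) and (65) one finds for the norm

$$N = \langle\Psi_{\rm CISD}|\Psi_{\rm CISD}\rangle = 1 + \sum_{ai}\tilde t_a^i\,t_a^i + \sum_{ij}\sum_{ab}\tilde T_{ab}^{ij}\,T_{ab}^{ij}$$

$$\phantom{N} = 1 + \sum_i \tilde{\mathbf{t}}^{i\dagger}\mathbf{t}^i + \sum_{i\ge j}(2 - \delta_{ij})\,\langle\tilde{\mathbf{T}}^{ij}\mathbf{T}^{ji}\rangle . \tag{96}$$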
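Differentiating the expectation value with respect to the amplitudes yields the eigenvalue equations

$$r_a^i = \langle\tilde\Phi_a^i|H - E^{\rm CISD}|\Psi_{\rm CISD}\rangle = 0 , \qquad R_{ab}^{ij} = \langle\tilde\Phi_{ab}^{ij}|H - E^{\rm CISD}|\Psi_{\rm CISD}\rangle = 0 . \tag{97}$$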
These equations can be solved iteratively (direct CI). In each iteration one has to compute the residuals

$$r_a^i = v_a^i - \Delta E^{\rm CISD}\,t_a^i , \tag{98}$$

$$R_{ab}^{ij} = V_{ab}^{ij} - \Delta E^{\rm CISD}\,T_{ab}^{ij} , \tag{99}$$
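where

$$v_a^i = \langle\tilde\Phi_a^i|H - E^{\rm SCF}|\Psi_{\rm CISD}\rangle , \tag{100}$$

$$V_{ab}^{ij} = \langle\tilde\Phi_{ab}^{ij}|H - E^{\rm SCF}|\Psi_{\rm CISD}\rangle , \tag{101}$$

and $\Delta E^{\rm CISD} = E^{\rm CISD} - E^{\rm SCF}$ is the correlation energy

$$\Delta E^{\rm CISD} = \frac{1}{N}\Big[\sum_{ia}(f_{ai} + v_a^i)\,\tilde t_a^i + \sum_{ij}\sum_{ab}(K_{ab}^{ij} + V_{ab}^{ij})\,\tilde T_{ab}^{ij}\Big] . \tag{102}$$

The residuals are used to obtain an update of the CI coefficients by simple perturbation theory:

$$\Delta t_a^i = -\frac{r_a^i}{\langle\tilde\Phi_a^i|H - E^{\rm CISD}|\Phi_i^a\rangle} , \qquad \Delta T_{ab}^{ij} = -\frac{R_{ab}^{ij}}{\langle\tilde\Phi_{ab}^{ij}|H - E^{\rm CISD}|\Phi_{ij}^{ab}\rangle} . \tag{103}$$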
This procedure relies on the fact that the hamiltonian in the configuration basis is diagonally dominant. Convergence can be improved and guaranteed by the Davidson procedure [26]. For the sake of simplicity, we will restrict the following discussion to double excitations (CID); the inclusion of single excitations is quite straightforward and does not lead to any difficulties of principle. In the CID case the matrices $\mathbf{V}^{ij}$ take the explicit form

$$V_{ab}^{ij} = K_{ab}^{ij} + K(\mathbf{T}^{ij})_{ab} + \sum_{kl} K_{kl}^{ij}\,T_{ab}^{kl} + G_{ab}^{ij} + G_{ba}^{ji} \tag{104}$$
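with the auxiliary matrices

$$\mathbf{G}^{ij} = \mathbf{T}^{ij}\mathbf{f} - \sum_k\Big[\mathbf{T}^{ik}f_{kj} + \mathbf{T}^{ik}\mathbf{J}^{kj} + (\mathbf{T}^{ki}\mathbf{J}^{kj})^\dagger - \tilde{\mathbf{T}}^{ik}\mathbf{K}^{kj}\Big] . \tag{105}$$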
The matrices $\mathbf{G}^{ij}$ account for the contributions of the two-electron integrals over two external and two occupied orbitals, i.e., all matrices occurring in eq. (105) are defined in the space of external orbitals only. The evaluation of all $\mathbf{G}^{ij}$ requires $2m_{val}^3$ matrix multiplications. Since each matrix multiplication involves $2m_{ext}^3$ floating point operations, the total cost scales with the sixth power of the molecular size. Note the exceedingly simple matrix form of these equations, which do not involve any complicated logic. This is solely due to the fact that unnormalized and non-orthogonal configurations are used, as outlined in section 2.1. In contrast, in the early direct CISD method of Roos and Siegbahn [27], which employed orthonormalized configuration state functions, about 140 different types of matrix elements had to be distinguished. The so-called external exchange operators $K(\mathbf{T}^{ij})$ in the second term of (104) account for all contributions of integrals over four external orbitals

$$K(\mathbf{T}^{ij})_{ab} = \sum_{cd} T_{cd}^{ij}\,(ac|db) . \tag{106}$$
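These terms require about $m_{val}^2 m_{ext}^4$ floating point operations, and for large basis sets and not too many correlated orbitals $m_{val}$ their evaluation dominates the total computational cost. As written in eq. (106) one would need a full integral transformation for generating the integrals $(ac|db)$. This would not only be rather expensive ($O(N^5)$ operations), but also double the disk space. The transformation can be avoided by expanding the virtual MOs in the integral, yielding

$$K(\mathbf{T}^{ij})_{ab} = \sum_{\mu\nu} X_{\mu a}X_{\nu b}\sum_{\rho\sigma}\Big[\sum_{cd} X_{\rho c}\,T_{cd}^{ij}\,X_{\sigma d}\Big](\mu\rho|\sigma\nu)$$

$$\phantom{K(\mathbf{T}^{ij})_{ab}} = \sum_{\mu\nu} X_{\mu a}X_{\nu b}\sum_{\rho\sigma} T_{\rho\sigma}^{ij}\,(\mu\rho|\sigma\nu) = \left[\mathbf{X}^\dagger K(\mathbf{T}^{ij}_{\rm AO})_{\rm AO}\,\mathbf{X}\right]_{ab} . \tag{107}$$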
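The equivalence of eqs. (106) and (107) can be checked on a small synthetic example. In the Python/NumPy sketch below the AO integrals, the virtual MO coefficients `Xv`, and the amplitude matrix `T_mo` of a single pair are random placeholders chosen only to exercise the two routes; the variable names are assumptions for this illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n_virt = 6, 4

Bfac = rng.standard_normal((8, m, m))
Bfac = Bfac + Bfac.transpose(0, 2, 1)
ao = np.einsum('Pmn,Prs->mnrs', Bfac, Bfac)        # synthetic (mu nu|rho sigma)
X, _ = np.linalg.qr(rng.standard_normal((m, m)))
Xv = X[:, m - n_virt:]                             # virtual MO coefficients
T_mo = rng.standard_normal((n_virt, n_virt))       # amplitudes T^{ij} of one pair

# Route 1, eq. (106): transform the four-external integrals (ac|db) explicitly.
ext = np.einsum('ma,rc,sd,nb,mrsn->acdb', Xv, Xv, Xv, Xv, ao)
K_mo = np.einsum('cd,acdb->ab', T_mo, ext)

# Route 2, eq. (107): contract AO-basis amplitudes with AO integrals, then back-transform.
T_ao = Xv @ T_mo @ Xv.T                            # T_AO = X T_MO X^dagger
K_ao = np.einsum('rs,mrsn->mn', T_ao, ao)          # Fock-exchange-like contraction
K_via_ao = Xv.T @ K_ao @ Xv
```

The second route never forms the four-external integrals; its AO contraction has the same structure as the exchange part of a Fock matrix build, with the amplitudes playing the role of the density matrix.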
The quantities $\mathbf{T}^{ij}_{\rm AO} = \mathbf{X}\,\mathbf{T}^{ij}_{\rm MO}\,\mathbf{X}^\dagger$ are the amplitudes in the AO basis. These are precomputed and then contracted with the two-electron integrals $(\mu\rho|\sigma\nu)$, which very much resembles the calculation of the exchange terms in the Fock matrix. The resulting operators in the AO basis $K(\mathbf{T}^{ij})$ are finally back-transformed into the MO basis by the two matrix multiplications in the last line. Similar operators are also needed in coupled cluster theory (cf. section 2.5) and multireference configuration interaction (cf. section 5). The third-order energy in Møller-Plesset perturbation theory is obtained as

$$E^{(3)} = \sum_{ij}\sum_{ab}(K_{ab}^{ij} + V_{ab}^{ij})\,\tilde T_{ab}^{ij} , \tag{108}$$
where the $V_{ab}^{ij}$ and $\tilde T_{ab}^{ij}$ are computed from the MP2 amplitudes. Note that this energy expression is similar to the expectation value, eq. (102), but without the
normalization factor. In contrast to the CID energy, $E^{(3)}$ is size consistent, but not an upper bound to the exact energy. Finally, we note that the CEPA equations [23,24] can be obtained from the CISD equations by replacing in the residual the correlation energy by individual pair energies, e.g., for CEPA-2

$$R_{ab}^{ij} = V_{ab}^{ij} - \epsilon_{ij}\,T_{ab}^{ij} \tag{109}$$

with

$$\epsilon_{ij} = (2 - \delta_{ij})\sum_{ab} K_{ab}^{ij}\,\tilde T_{ab}^{ij} . \tag{110}$$

Other CEPA variants use slightly different expressions for the residual. The CEPA correlation energy is the sum of all pair energies

$$E^{\rm CEPA} = \sum_{i\ge j}\epsilon_{ij} . \tag{111}$$
Obviously, the computational effort per iteration is virtually the same as for CISD, but the results are much better (almost as good as for CCSD(T) if singles are included).

2.5 Singles and doubles coupled-cluster
The main disadvantage of the variational configuration interaction method is the fact that it is not size consistent. This can easily be understood by considering two independent subsystems, e.g., two water molecules. The correct wavefunction for the total system AB should then be the (antisymmetrized) product of the wavefunctions of the two molecules A and B. If each of these wavefunctions contains double excitations from the SCF determinant, the total system will contain quadruple excitations, e.g.,

$$\Psi(A) = \Psi_{\rm SCF}(A) + \Psi_c(A) = \Big[1 + \frac{1}{2}\sum_{ij}\sum_{ab} T_{ab}^{ij}(A)\,E_{ai}E_{bj}\Big]\Psi_{\rm SCF}(A) ,$$

$$\Psi(B) = \Psi_{\rm SCF}(B) + \Psi_c(B) = \Big[1 + \frac{1}{2}\sum_{kl}\sum_{cd} T_{cd}^{kl}(B)\,E_{ck}E_{dl}\Big]\Psi_{\rm SCF}(B) ,$$

$$\Psi(AB) = \Psi_{\rm SCF}(AB) + \mathcal{A}\left[\Psi_{\rm SCF}(A)\Psi_c(B) + \Psi_{\rm SCF}(B)\Psi_c(A)\right] + \frac{1}{4}\sum_{ij}\sum_{ab}\sum_{kl}\sum_{cd} T_{ab}^{ij}(A)\,T_{cd}^{kl}(B)\,E_{ai}E_{bj}E_{ck}E_{dl}\,\Psi_{\rm SCF}(AB) , \tag{112}$$
where $\mathcal{A}$ is the antisymmetrizer. It is seen that the coefficients of the quadruple excitations $\Phi_{ijkl}^{abcd} = E_{ai}E_{bj}E_{ck}E_{dl}\,\Psi_{\rm SCF}(AB)$ are simple products $T_{ab}^{ij}(A)\,T_{cd}^{kl}(B)$ of the coefficients of the subsystems. However, these terms are not included in the CISD wavefunction for the dimer, and therefore the total CISD energy is not equal to the sum of the monomer energies.
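The size-consistency error can be made explicit with a minimal model that is not taken from the text: each monomer is described by only two configurations, the reference and one double excitation, with coupling `k` and excitation energy `2*delta` (both values below are arbitrary assumptions). For two non-interacting monomers the hamiltonian is a Kronecker sum, and CID corresponds to deleting the simultaneous double excitation on both monomers.

```python
import numpy as np

delta, k = 1.0, 0.3                       # hypothetical model parameters
h_mono = np.array([[0.0, k],
                   [k, 2.0 * delta]])     # basis: |0>, |D>; energies relative to SCF
e_mono = np.linalg.eigvalsh(h_mono)[0]    # exact (= CID) monomer correlation energy

eye = np.eye(2)
# Dimer basis ordering: |00>, |0 D_B>, |D_A 0>, |D_A D_B>
h_dimer = np.kron(h_mono, eye) + np.kron(eye, h_mono)

w, vecs = np.linalg.eigh(h_dimer)
e_fci = w[0]                                      # full CI of the dimer
e_cid = np.linalg.eigvalsh(h_dimer[:3, :3])[0]    # CID: quadruple |D_A D_B> removed

c = vecs[:, 0] / vecs[0, 0]               # intermediate normalization of the FCI vector
```

In the full CI vector the coefficient of the quadruple excitation, `c[3]`, equals the product `c[1] * c[2]` of the monomer coefficients, as stated above. The CID energy of the dimer is $\Delta - \sqrt{\Delta^2 + 2k^2}$ in this model, which lies above twice the monomer energy $\Delta - \sqrt{\Delta^2 + k^2}$.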
In coupled-cluster theory [28,29,30] the wavefunction is generated by an exponential excitation operator

$$\Psi_{\rm CC} = \exp(T)\,\Psi_{\rm SCF} , \tag{113}$$

where the exponential is defined by the Taylor expansion

$$\exp(T) = 1 + T + \frac{1}{2!}TT + \frac{1}{3!}TTT + \ldots . \tag{114}$$

The excitation operator $T$ may be decomposed into single, double, and possibly higher excitation operators

$$T = T_1 + T_2 + \ldots \tag{115}$$

with

$$T_1 = \sum_{ai} t_a^i\,E_{ai} , \tag{116}$$

$$T_2 = \frac{1}{2}\sum_{ij}\sum_{ab} T_{ab}^{ij}\,E_{ai}E_{bj} , \tag{117}$$
etc. Truncating the expansion after $T_2$ yields the CCSD theory [31,21,32,22]. For two independent subsystems we can decompose $T$ into a sum of two operators each acting only on one subsystem

$$\Psi(AB) = \exp(T_A + T_B)\,\Psi_{\rm SCF}(AB) = \mathcal{A}\left[\exp(T_A)\Psi_{\rm SCF}(A)\,\exp(T_B)\Psi_{\rm SCF}(B)\right] = \mathcal{A}\left[\Psi(A)\Psi(B)\right] . \tag{119}$$

Thus, the coupled-cluster wavefunction is size consistent as required. It implicitly contains triple, quadruple, and higher excitations, but the coefficients of these are all products of the single and double excitation amplitudes $t_a^i$ and $T_{ab}^{ij}$. Unfortunately, it is not possible to determine these amplitudes variationally, since, like the full CI, the expansion (113) includes up to $N$-fold excitations, which makes the evaluation of an expectation value too expensive. However, one can obtain a nonlinear system of equations for the amplitudes by projecting the Schrödinger equation from the left with the contravariant configurations $\tilde\Phi_a^i$ and $\tilde\Phi_{ab}^{ij}$ as defined in section 2.1. An additional equation for the correlation energy is obtained by projecting with the reference function. This yields

$$E^{\rm CCSD} = \langle 0|H\big(1 + T_1 + T_2 + \tfrac{1}{2}T_1^2\big)|0\rangle , \tag{120}$$

$$r_a^i = \langle\tilde\Phi_a^i|(H - E^{\rm CCSD})\big(1 + T_1 + T_2 + \tfrac{1}{2}T_1^2 + T_1T_2 + \tfrac{1}{3!}T_1^3\big)|0\rangle = 0 , \tag{121}$$

$$R_{ab}^{ij} = \langle\tilde\Phi_{ab}^{ij}|(H - E^{\rm CCSD})\big(1 + T_1 + T_2 + \tfrac{1}{2}T_1^2 + T_1T_2 + \tfrac{1}{3!}T_1^3 + \tfrac{1}{2}T_1^2T_2 + \tfrac{1}{2}T_2^2 + \tfrac{1}{4!}T_1^4\big)|0\rangle = 0 \quad (i \ge j,\ \text{all } a, b) . \tag{122}$$
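The structure of these projected equations can be illustrated with a two-configuration model per monomer (reference and one double excitation, coupling `k`, excitation energy `2*delta`); the model and its parameter values are assumptions introduced only for this sketch. For one monomer the doubles equation reduces to $k + 2\Delta t - k t^2 = 0$ with energy $kt$. For two non-interacting monomers, $\exp(T_A + T_B)$ acting on the reference generates the quadruple excitation with coefficient $t^2$, and the projection onto a double excitation is satisfied by the monomer amplitude.

```python
import numpy as np

delta, k = 1.0, 0.3                       # hypothetical model parameters

def ccd_amplitude(delta, k, n_iter=200):
    """Solve k + 2*delta*t - k*t**2 = 0 by fixed-point iteration (monomer CCD)."""
    t = 0.0
    for _ in range(n_iter):
        t = -(k - k * t * t) / (2.0 * delta)
    return t

t = ccd_amplitude(delta, k)
e_mono_cc = k * t                         # <0|H T2|0> for one monomer

h_mono = np.array([[0.0, k],
                   [k, 2.0 * delta]])
eye = np.eye(2)
h_dimer = np.kron(h_mono, eye) + np.kron(eye, h_mono)   # |00>, |0D>, |D0>, |DD>

psi = np.array([1.0, t, t, t * t])        # exp(T_A + T_B)|00>; t*t is the product term
h_psi = h_dimer @ psi
e_dimer_cc = h_psi[0]                     # projection onto the reference
resid = h_psi[1] - e_dimer_cc * psi[1]    # projection onto one double excitation
```

The dimer residual contains $k t^2$ from the coupling to the quadruple excitation and $-2kt \cdot t$ from the energy term; together they leave exactly the monomer equation, so `resid` vanishes and `e_dimer_cc` is twice the monomer energy.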
The expansions on the right-hand side terminate after the quadruple excitations since the hamiltonian can couple only configurations that differ by at most two excitations. The number of equations corresponds exactly to the number of amplitudes. Even though these equations look quite complicated, it turns out that their solution is not much more difficult than that of the CISD equations. It can be shown that in the coupled-cluster case the contributions of the energy in the residual equations cancel out, as required for a size-consistent theory. In order to exemplify the structure of the resulting equations, we will omit the single excitation operator $T_1$ and consider only the coupled-cluster doubles (CCD) case. The full CCSD equations in a similar matrix formulation can be found in Ref. 22. The explicit expressions for the CCD residual matrices $\mathbf{R}^{ij}$ are

$$\mathbf{R}^{ij} = \mathbf{K}^{ij} + K(\mathbf{T}^{ij}) + \sum_{kl}\alpha_{ij,kl}\,\mathbf{T}^{kl} + \mathbf{G}^{ij} + \mathbf{G}^{ji\dagger} , \tag{123}$$

with

$$\mathbf{G}^{ij} = \mathbf{T}^{ij}\mathbf{X} - \sum_k\Big[\beta_{ik}\mathbf{T}^{kj} - \tilde{\mathbf{T}}^{ik}\mathbf{Y}^{kj} + \frac{1}{2}\mathbf{T}^{ki}\mathbf{Z}^{kj} + (\mathbf{T}^{ki}\mathbf{Z}^{kj})^\dagger\Big] . \tag{124}$$
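The form of these equations is exactly the same as for the CID, discussed in the previous section, but there are now intermediate quantities which depend linearly on the amplitudes. In detail, the integrals $K_{kl}^{ij}$ in the CID equations are replaced by $\alpha_{ij,kl}$, $f_{ik}$ by $\beta_{ik}$, $\mathbf{f}$ by $\mathbf{X}$, $\mathbf{K}^{kj}$ by $\mathbf{Y}^{kj}$, and $\mathbf{J}^{kj}$ by $\mathbf{Z}^{kj}$. The explicit form of these quantities is

$$\alpha_{ij,kl} = K_{kl}^{ij} + \mathrm{tr}\left(\mathbf{T}^{ij}\mathbf{K}^{lk}\right) , \tag{125}$$

$$\beta_{ik} = f_{ik} + \sum_l \mathrm{tr}\left(\mathbf{T}^{il}\mathbf{L}^{lk}\right) , \tag{126}$$

$$\mathbf{X} = \mathbf{f} - \sum_{kl}\mathbf{L}^{kl}\mathbf{T}^{lk} , \tag{127}$$

$$\mathbf{Y}^{kj} = \mathbf{K}^{kj} - \frac{1}{2}\mathbf{J}^{kj} + \frac{1}{4}\sum_l \mathbf{L}^{kl}\tilde{\mathbf{T}}^{lj} , \tag{128}$$

$$\mathbf{Z}^{kj} = \mathbf{J}^{kj} - \frac{1}{2}\sum_l \mathbf{K}^{lk}\mathbf{T}^{jl} . \tag{129}$$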
The computational effort of the CCD differs from CID basically by the additional $2m_{val}^3$ matrix multiplications in eqs. (128) and (129), which doubles the time for evaluating the matrices $\mathbf{G}^{ij}$. However, the same external exchange operators $K(\mathbf{T}^{ij})$ are needed, and therefore the difference in total time is less significant. If singles are included, there are additional terms in the intermediates, but these require only minor computational effort. The products of singles arising from the $T_1^2$, $T_1^3$, and $T_1^4$ terms in eqs. (121) and (122) can all be accounted for by defining modified amplitude matrices

$$\mathbf{C}^{ij} = \mathbf{T}^{ij} + \mathbf{t}^i\mathbf{t}^{j\dagger} , \qquad \bar{\mathbf{C}}^{ij} = \frac{1}{2}\mathbf{T}^{ij} + \mathbf{t}^i\mathbf{t}^{j\dagger} , \tag{130}$$
and then all intermediates depend only linearly on either $\mathbf{T}^{ij}$, $\mathbf{C}^{ij}$, or $\bar{\mathbf{C}}^{ij}$. The most notable difference between CISD and CCSD is that in the latter case one needs additional contractions of singles amplitudes with 3-external integrals

$$J(\mathbf{E}^{ij})_{ab} = \sum_c (ab|ci)\,t_c^j , \tag{131}$$

$$K(\mathbf{E}^{ij})_{ab} = \sum_c (ai|bc)\,t_c^j . \tag{132}$$
Like the external exchange operators, these terms can be evaluated in two different ways. Either the 3-external integrals $(ab|ci)$ are explicitly generated, which requires a more expensive integral transformation (note, however, that the effort for the first quarter transformation is the same), or the storage of these integrals is avoided by computing these terms directly from the integrals in the AO basis. In the latter case, the singles amplitudes are first transformed into the AO basis

$$t_\mu^i = \sum_c X_{\mu c}\,t_c^i , \tag{133}$$
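then the operators are computed in the AO basis

$$J(\mathbf{E}^{ij})_{\mu\nu} = \sum_{\rho\sigma} X_{\rho i}\,t_\sigma^j\,(\mu\nu|\sigma\rho) , \tag{134}$$

$$K(\mathbf{E}^{ij})_{\mu\nu} = \sum_{\rho\sigma} X_{\rho i}\,t_\sigma^j\,(\mu\rho|\nu\sigma) , \tag{135}$$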
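and finally they are back-transformed into the MO basis

$$J(\mathbf{E}^{ij})_{\rm MO} = \mathbf{X}^\dagger J(\mathbf{E}^{ij})_{\rm AO}\,\mathbf{X} , \tag{136}$$

$$K(\mathbf{E}^{ij})_{\rm MO} = \mathbf{X}^\dagger K(\mathbf{E}^{ij})_{\rm AO}\,\mathbf{X} . \tag{137}$$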
This procedure, which is similar to the computation of the operators $\mathbf{J}^{kl}$ and $\mathbf{K}^{kl}$, requires about $\frac{3}{4}m^4 m_{occ} + 4m^3 m_{occ}^2$ additions and multiplications ($m$ basis functions, $m_{occ}$ correlated orbitals) rather than $\frac{3}{2}m^3 m_{occ}^2$ operations if the same quantities are computed from the fully transformed two-electron integrals (the full integral transformation scales as $m^5$). The additional effort is, however, quite insignificant as compared to the $\frac{1}{2}m^4 m_{occ}^2$ operations needed to evaluate the operators $K(\mathbf{T}^{ij})$ and will therefore not introduce a bottleneck. Nevertheless, it should be noted that the three-external integrals $(ab|ci)$ are also needed for evaluating the perturbative correction for triple excitations, and then it is of course advantageous to use them also for the CCSD. Finally, we note that the QCISD (quadratic configuration interaction) equations [33] are obtained by omitting all $T_1^2$, $T_1^3$, $T_1^4$ terms and the $T_1T_2$ term in eq. (122). The residuals then include only part of the singles terms present in the CCSD. Most notably, the operators $J(\mathbf{E}^{ij})$ and $K(\mathbf{E}^{ij})$ are not needed in QCISD; as in the case of CISD all contributions of three-external integrals can be absorbed into the external exchange operators by computing these with modified coefficient matrices [22]. Another variant is the Brueckner coupled-cluster doubles
(BCCD) theory [34,35,36,37,38,39,22]. In this case the orbitals are modified in each iteration so that at convergence all singles amplitudes vanish. This can be achieved by absorbing after each update the singles into the orbitals

$$\phi_i \leftarrow \phi_i + \sum_a t_a^i\,\phi_a \tag{138}$$
with subsequent symmetrical reorthonormalization of the new occupied orbitals. Furthermore, the virtual orbitals have to be Schmidt-orthogonalized to the occupied space. Then the integral transformation must be repeated, since the $\mathbf{J}^{kl}$ and $\mathbf{K}^{kl}$ change. The Brueckner theory has some theoretical advantages. In particular, the resulting wavefunction is less sensitive to symmetry breaking problems than the CCSD wavefunction on the basis of canonical Hartree-Fock orbitals.

2.6 Computational aspects
As already pointed out, the matrix formulation with a minimum amount of logic is one of the prerequisites for an efficient CISD or CCSD program. Often this can be exploited to the best possible extent by using highly optimized routines for matrix multiplication (e.g., dgemm), which are available in BLAS (basic linear algebra subroutines) libraries on many platforms. These routines also allow one or both of the two matrices to be transposed on the fly, without the need to precompute and store the transposed matrix. This is often useful, since the amplitudes $\mathbf{T}^{ij}$ are stored only for $i \ge j$, and $\mathbf{T}^{ji}$ is the transpose of $\mathbf{T}^{ij}$. The same holds for the operators $\mathbf{K}^{kl} = \mathbf{K}^{lk\dagger}$. It is equally important to think carefully about memory and I/O usage. The number of amplitudes $\mathbf{T}^{ij}$, as well as the number of transformed integrals $\mathbf{J}^{kl}$, $\mathbf{K}^{kl}$, scales with the fourth power of the molecular size, and in large calculations it will often not be possible to keep all these quantities simultaneously in high speed memory. One can then use paging algorithms, which read blocks of data from disk as required. The algorithm should therefore be optimized so that for a given amount of available memory the I/O is minimized. As a first example consider the evaluation of the matrices $\mathbf{G}^{ij}$ in the CID case. The $\mathbf{G}^{ij}$ do not need to be stored, but their contribution can be immediately added to the residuals $\mathbf{R}^{ij}$. If the outer two loops run over $j$ and $k$, one $\mathbf{J}^{kj}$ and one $\mathbf{K}^{kj}$ at a time need to be in memory and have to be read just once for a given $kj$. The simplest algorithm would then assume that all $\mathbf{R}^{ij}$ and $\mathbf{T}^{ik}$ can be kept in memory. Should this not be possible, one could split them into batches. For instance, if $k$ is the outermost loop, one could read in this loop all $\mathbf{T}^{ik}$ for a fixed $k$; if still not all $\mathbf{R}^{ij}$ fit into memory, one could treat the largest possible subsets of them together. In this case, one would have to read the $\mathbf{J}^{kj}$, $\mathbf{K}^{kj}$, and $\mathbf{T}^{ik}$ for each batch of $\mathbf{R}^{ij}$.
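The I/O pattern just described can be mimicked with a toy accumulation. The Python/NumPy sketch below uses a single simplified contraction, $\mathbf{R}^{ij} \mathrel{+}= \sum_k \mathbf{T}^{ik}\mathbf{J}^{kj}$, as a stand-in for the terms of $\mathbf{G}^{ij}$, and counts every access to a $\mathbf{J}^{kj}$ as one disk read; the function name `accumulate`, the batching over the first pair index, and the array sizes are assumptions made for illustration.

```python
import numpy as np

def accumulate(T, J, n_batches):
    """R^{ij} += sum_k T^{ik} J^{kj}, with the R^{ij} split into batches over i.
    Returns the residual-like array and the number of simulated J^{kj} reads."""
    n_occ = T.shape[0]
    R = np.zeros_like(T)
    reads = 0
    for batch in np.array_split(np.arange(n_occ), n_batches):
        for k in range(n_occ):
            for j in range(n_occ):
                Jkj = J[k, j]              # stands in for reading one matrix from disk
                reads += 1
                for i in batch:            # only the R^{ij} of the current batch are updated
                    R[i, j] += T[i, k] @ Jkj
    return R, reads

rng = np.random.default_rng(6)
n_occ, n_virt = 6, 5
T = rng.standard_normal((n_occ, n_occ, n_virt, n_virt))
J = rng.standard_normal((n_occ, n_occ, n_virt, n_virt))

R_one, reads_one = accumulate(T, J, n_batches=1)      # everything fits in memory
R_three, reads_three = accumulate(T, J, n_batches=3)  # three batches of R^{ij}
R_ref = np.einsum('ikac,kjcb->ijab', T, J)
```

Both runs give the same result, but with three batches over $i$ every $\mathbf{J}^{kj}$ is read three times.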
Reading all the $\mathbf{J}^{kj}$ and $\mathbf{K}^{kj}$ for each batch of $\mathbf{R}^{ij}$ could be avoided if each batch comprised only a subset of $j$. The situation is more complicated in the coupled cluster case, since then one has to evaluate the intermediates $\mathbf{Y}^{kj}$ and $\mathbf{Z}^{kj}$ instead of simply reading the $\mathbf{J}^{kj}$ and $\mathbf{K}^{kj}$. This requires all operators $\mathbf{K}^{kl}$ for a fixed $k$ and all $\mathbf{T}^{lj}$ for fixed $j$. Thus, the simplest algorithm requires keeping all $\mathbf{R}^{ij}$ and $\mathbf{T}^{ij}$ together with all $\mathbf{K}^{kj}$ for a fixed $k$ in memory. A simple paging over the $\mathbf{R}^{ij}$ and/or $\mathbf{T}^{ij}$ as in the CI case
is not possible, since this would involve repeated calculation of the intermediate quantities. It would be possible, however, first to evaluate the $Y^{kj}$ and $Z^{kj}$, using a paging algorithm similar to that of the CI case, and to store these on disk. The $R^{ij}$ are then computed in a second stage, exactly as in the CI case, but instead of the $J^{kj}$ and $K^{kj}$ one would read $Y^{kj}$ and $Z^{kj}$.

The computation time and memory requirements can be much reduced if molecular symmetry is exploited, which is easy as long as only one-dimensional irreducible representations are present, i.e. $D_{2h}$ and subgroups. If symmetry adapted molecular orbitals are used, all matrices are blocked. The block structure of a given matrix $T^{ij}_{ab}$ is determined by the product symmetry of the orbitals i and j, which must be the same as the product symmetry of a and b. The same holds for the $R^{ij}$, $J^{ij}$, and $K^{ij}$. Of course, only the non-zero blocks are stored, and since each symmetry block can have a different dimension, the matrices are stored in one-dimensional arrays; block dimensions and offsets are precomputed and kept in memory. It is then convenient to have a set of subroutines for operations like matrix multiplications, matrix traces, outer products etc., which handle all the symmetry blocking internally. Thus, the rest of the program requires only a minimum amount of symmetry information, and remains readable and easy to debug.

2.7 Triple excitations
The accuracy of coupled cluster calculations with single and double excitations (CCSD) can be significantly improved by subsequently computing the effects of higher order excitations through Rayleigh-Schrödinger perturbation theory (RSPT) based on the Fock (Møller-Plesset) hamiltonian and the computed CCSD amplitudes of single and double excitations [40,33,41]. The most important such correction is that which is linear in triple excitations, since its inclusion gives an energy expression which is consistent with the exact solution of Schrödinger's equation up to fourth order [41,42,43,44]. The most widely used ansatz of this type, usually denoted CCSD(T) [41], is also consistent with many of the fifth order terms, and includes much of the sixth and higher order energies as well [45,46], provided that the reference wavefunction is a true variational solution of the Hartree-Fock equations. This analysis takes into account the fact that terms such as $T_1 T_2$ present in the CCSD expansion already partially include the effects of triple excitations. The evaluation of the triples (T) correction requires terms like
\[ W^{ijk}_{abc} = \sum_d (bd|ck)\, T^{ij}_{ad} - \sum_m (mj|ck)\, T^{im}_{ab} + \text{permutations}. \tag{139} \]
The first term scales with $m_{val}^3 m_{ext}^4$, the second with $m_{val}^4 m_{ext}^3$, where $m_{val}$ and $m_{ext}$ are the number of correlated and virtual (external) orbitals, respectively. Thus, the computational cost increases as $O(N^7)$, where N is a measure of the molecular size. In most cases the calculation of the triples correction is therefore much more expensive than the CCSD calculation itself, and the applicability is limited to quite small molecules. The elapsed time (not the cost!) can be reduced by parallelization of the code, but it should be noted that this does not substantially increase the molecular size that can be handled. Doubling the molecular size increases the time by a factor of $2^7 = 128$, and therefore even the largest parallel computers do not help much further. The dramatic increase of CPU time with molecular size is demonstrated in Table 1 for some glycine peptides, (Gly)n ≡ HO[C(O)CH2NH]nH, using the correlation consistent double zeta basis set (cc-pVDZ) of Dunning [19]. The increase of the CPU times is close to the expected theoretical factors. It is easily estimated that the evaluation of the triples correction for the next larger peptide (Gly)4 would already take about three weeks of CPU time.

Table 1. CPU times (a) of coupled cluster calculations for glycine peptides (b)

Program                    (Gly)1    (Gly)2    (Gly)3
Basis functions                95       166       237
Transformation (c)             10       180      1471
CCSD (11 iterations)          312      7453     62741
Triples (T) correction        520     21081    220486

a) In seconds on Sun Enterprise 3500, UltraSPARC 336 MHz processor
b) Using Cs symmetry
c) Partial transformation to generate two-external integrals $J^{kl}$, $K^{kl}$ and the three-external integrals (ab|ci).

Another bottleneck of the triples calculation is the storage of the integrals (ab|ci) over three external orbitals and one occupied orbital, which must be stored on disk. Since these integrals have less permutational symmetry than the integrals in the AO basis, and the molecular orbitals are more diffuse than the basis functions, the number of significant integrals may even be larger than the number of AO integrals.

The cc-pVDZ basis set used in these calculations is too small for obtaining reliable results. Table 2 shows the dependence of the CPU times on the basis set for closed-shell coupled-cluster calculations on another molecule, p-dimethylbenzene C8H10, performed in Cs symmetry on a medium workstation. It is seen that increasing the basis set by about a factor of 1.6 increases the CPU times by a factor of 8-12, as expected from the quartic dependence. The larger calculation does not even include f-functions on the carbon atoms, as would be required for accurate results. The computation time is strongly dominated by the triples correction, while the differences between the various methods are quite small. Clearly, the treatment of molecules of this size is about the maximum that can be done in a reasonable time, which demonstrates the limitations of the conventional coupled cluster methods. Even the fastest current workstations or supercomputers are only about a factor of 3-4 faster, and do not much extend the range of applicability.
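The scaling argument can be checked with a short back-of-the-envelope calculation. The sketch below uses the triples timings of Table 1 and assumes pure $N^7$ scaling with the number of basis functions; the basis set size of (Gly)4 is not given in the text and is extrapolated here from the constant increment between the listed peptides, so it should be read as an assumption.

```python
def cost_ratio(size_ratio, power=7):
    """Relative cost of a method scaling as N**power when N grows by size_ratio."""
    return size_ratio ** power

# Doubling the molecular size for an O(N^7) method.
doubling = cost_ratio(2.0)

# Data taken from Table 1 (triples correction, CPU seconds).
n_basis = {1: 95, 2: 166, 3: 237}
t_triples = {1: 520.0, 2: 21081.0, 3: 220486.0}

# Assumption: (Gly)4 adds the same number of basis functions as the previous step.
n_basis_gly4 = n_basis[3] + (n_basis[3] - n_basis[2])

# Extrapolated triples time for (Gly)4 under pure N^7 scaling.
t_gly4 = t_triples[3] * cost_ratio(n_basis_gly4 / n_basis[3])
days_gly4 = t_gly4 / 86400.0
```

With these assumptions the estimate comes out at a little over two weeks, which is the same order of magnitude as the figure quoted in the text; the precise value depends on how closely the real timings follow the ideal exponent.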
The strong dependence of the computer time on the molecular size can be dramatically reduced using local correlation methods, as will be discussed in section 4. In particular, as will be demonstrated in section 4.3, the evaluation of an approximate local triples correction no longer dominates the calculation, but takes only a small amount of the total time.
Table 2. CPU times (a) of coupled cluster calculations for C8H10 with different basis sets

Program                    cc-pVDZ (b)    cc-pVTZ(d/p) (c)
Transformation (d)                  35                 318
CCSD/iteration                     374                2313
QCISD/iteration                    360                2180
BCCD/iteration                     399                2520
Transformation (e)                 119                1443
Triples (T) correction            9059              122515

a) In seconds on HP J282, PA8000/180 MHz processor
b) 162 basis functions (114 a', 48 a'')
c) 274 basis functions (188 a', 86 a'')
d) Partial transformation to generate the two-external integrals $J^{kl}$, $K^{kl}$
e) Partial transformation to generate two-external integrals $J^{kl}$, $K^{kl}$ and the three-external integrals (ab|ci)
3 Open-shell single-reference methods
The coupled-cluster treatment of open-shell systems is more complicated than the closed-shell case, since additional types of orbitals and excitations occur. First of all, it is possible to use either a spin-unrestricted (UHF) or a spin-restricted (RHF) Hartree-Fock wavefunction as a reference. In the UHF case the α and β spin orbitals are optimized independently, which leads to a wavefunction that is not an eigenfunction of the total spin operator $\hat S^2$. It is well known that the problems associated with the spin contamination of the UHF wavefunction can become magnified when electron correlation effects are introduced [1], in particular in second-order perturbation theory (UMP2). It is therefore more desirable to use RHF orbitals.

The second difficulty is the definition of the excitation operators used in coupled-cluster treatments. It turns out that a fully spin-adapted treatment based on an RHF reference function and the spin-free excitation operators $E_{rs}$ is very complicated. It is much easier to use spin-orbital excitation operators $e^\sigma_{ai}$, which replace a spin orbital $\phi^\sigma_i$ by another spin orbital $\phi^\sigma_a$ with the same spin. However, then the correlated wavefunction is not spin-adapted, even if an RHF reference function is used. This problem already arises in the linear configuration interaction theory if the first-order interacting space, spanned by the functions $e^\sigma_{ai} e^{\sigma'}_{bj}|0\rangle$, is used as a basis; this is due to the fact that for high-spin open-shell cases this space does not include all possible Slater determinants of given $M_S$ which arise from a particular occupancy of spatial orbitals. For instance, in a three electron case with reference function $|\phi^\alpha_1 \phi^\beta_1 \phi^\alpha_2|$, the determinant $|\phi^\alpha_a \phi^\alpha_b \phi^\beta_2|$ is a triple excitation and not included in the first-order interacting space. This function would be necessary, however, to generate one of the two possible doublet spin eigenfunctions together with the determinants $|\phi^\alpha_a \phi^\beta_b \phi^\alpha_2|$ and $|\phi^\beta_a \phi^\alpha_b \phi^\alpha_2|$. A quartet spin contamination arises if the
latter two Slater determinants have coefficients of different magnitude. Thus, the RHF-UCISD and RHF-UCCSD theories based on spin-orbital single and double excitations are not spin adapted.

As will be shown in Section 3.2, the spin contamination in the linear UCISD wavefunction can be quite easily removed by applying appropriate projection operators to the UCISD residual vector. The same projection can be used to remove the spin contamination from the linear terms of the CCSD wavefunction. But even then, the presence of higher powers of T in the CCSD wavefunction can introduce a spin contamination in a nontrivial way. Fortunately, this effect is usually very small. The partial spin adaptation (PSA-CCSD) of only the linear terms has a number of advantages: the number of independent parameters (amplitudes) is minimized and corresponds exactly to the first-order interacting space; also, spin contamination effects are minimized, though not entirely removed. In an optimum implementation, the computational cost of PSA-CCSD should be approximately the same as for a closed-shell calculation with the same number of correlated orbitals.

3.1 Spin-unrestricted coupled-cluster theory (UCCSD)
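We will first consider spin-unrestricted coupled cluster theory (UCCSD) for the case that the reference function is a high-spin RHF Slater determinant with $m_{closed}$ doubly occupied and $m_{open}$ singly occupied orbitals; high spin means that all open-shell electrons have α spin. The UCCSD wavefunction is obtained using the following cluster operator $T = T_1 + T_2$ in the exponential ansatz (113):
\[
\begin{aligned}
T = {} & \sum_{ia} \left( \tilde t^i_a e^\alpha_{ai} + \bar t^i_a e^\beta_{ai} \right) + \sum_{it} \bar t^i_t e^\beta_{ti} + \sum_{ta} \tilde t^t_a e^\alpha_{at} \\
& + \sum_{ij} \sum_{ab} T^{ij}_{ab} e^\alpha_{ai} e^\beta_{bj} + \sum_{tj} \sum_{ab} T^{tj}_{ab} e^\alpha_{at} e^\beta_{bj} \\
& + \sum_{ij} \sum_{ab} \left( \tilde T^{ij}_{ab} e^\alpha_{ai} e^\alpha_{bj} + \bar T^{ij}_{ab} e^\beta_{ai} e^\beta_{bj} \right) + \sum_{ij} \sum_{tu} \bar T^{ij}_{tu} e^\beta_{ti} e^\beta_{uj} \\
& + \sum_{ij} \sum_{at} T^{ij}_{at} e^\alpha_{ai} e^\beta_{tj} + \sum_{tj} \sum_{au} T^{tj}_{au} e^\alpha_{at} e^\beta_{uj} + \sum_{tu} \sum_{ab} \tilde T^{tu}_{ab} e^\alpha_{at} e^\alpha_{bu},
\end{aligned}
\tag{140}
\]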
where $e^\sigma_{ai} = \eta^{\sigma\dagger}_a \eta^\sigma_i$ are the usual spin-orbital excitation operators. If applied to a Slater determinant, $e^\sigma_{ai}$ replaces spin orbital $\phi^\sigma_i$ by $\phi^\sigma_a$; σ = {α, β} denotes the spin. Here and in the following, the indices i, j refer to closed-shell orbitals, t, u to open-shell orbitals, and a, b to virtual orbitals. For each orbital pair (ij), there are three sets of amplitudes, namely those for pure α- or β-spin excitations, $\tilde T^{ij}_{ab}$ and $\bar T^{ij}_{ab}$, respectively, and those for mixed α, β excitations, $T^{ij}_{ab}$. In total, there are about three times as many amplitudes as in the closed-shell case. The corresponding cluster amplitudes are obtained by solving a non-linear set of equations obtained by projecting the Schrödinger equation on the left with the RHF reference $|0\rangle$, $e^\sigma_{ai}|0\rangle$, $e^\sigma_{ai} e^{\sigma'}_{bj}|0\rangle$, etc., as in eqs. (120)-(122). The resulting explicit equations can be found in Ref. 47. They have a matrix structure very similar to that of the closed-shell equations discussed in the previous sections and will not be further discussed here. It should be noted, however, that there are three times as many equations as in the closed-shell case, and the total computational effort is about three times larger.
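The three-electron example from the introduction to this section can be made concrete numerically. The sketch below works in the space of the three spin functions with $M_S = 1/2$ that arise when three spatial orbitals are singly occupied, and evaluates $\langle \hat S^2 \rangle$ for different coefficient vectors. It uses simple spin products and ignores the phase conventions of Slater determinants, so the relative sign needed for a pure doublet may differ in a determinant basis; the statement about equal magnitudes does not depend on this. All names are illustrative.

```python
import numpy as np

# Spin products with M_S = 1/2 for three singly occupied orbitals,
# ordered as (alpha alpha beta, alpha beta alpha, beta alpha alpha).
# Diagonal: 3*(3/4) + 2*sum_{i<j} s_iz s_jz = 7/4; each pair of functions
# is connected by one spin flip-flop with matrix element 1.
S2 = np.array([[1.75, 1.00, 1.00],
               [1.00, 1.75, 1.00],
               [1.00, 1.00, 1.75]])

def s2_expectation(coeffs):
    """Expectation value of S^2 for a (not necessarily normalized) vector."""
    c = np.asarray(coeffs, dtype=float)
    return float(c @ S2 @ c / (c @ c))

eigenvalues = np.linalg.eigvalsh(S2)          # two doublets and one quartet

# Two functions only, coefficients of equal magnitude: a pure doublet.
pure = s2_expectation([1.0, -1.0, 0.0])

# Two functions only, coefficients of different magnitude: quartet admixture.
contaminated = s2_expectation([1.0, -0.5, 0.0])
```

The two doublets have $S(S+1) = 0.75$ and the quartet has 3.75; any value between these indicates contamination. With only two of the three functions available, a pure doublet is possible only for coefficients of equal magnitude, which is the situation described above for the first-order interacting space.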
3.2 Partially spin-restricted coupled-cluster theory (RCCSD)
In fully spin coupled theory [48], it is recognized that the hamiltonian operator is spin free, and therefore the excitation operators used in the previous section may be replaced by the smaller set $E_{ai}$, $E_{at}$, $E_{ti}$ and their products, where again t, u, ... denote orbitals lying in the singly occupied space, while i, j, ... denote true closed-shell orbitals, and a, b, ... external orbitals. A simpler theory [47,49,50,51], including some but not all of the spin coupling, may be obtained by using the operators $E_{ai}$, $e^\alpha_{at}$, $e^\beta_{ti}$ and their products; because the orbitals $\phi^\alpha_t$ are occupied and the orbitals $\phi^\beta_t$ are unoccupied in $\Phi_0$, the wave function is then spin adapted for a CISD configuration expansion, which is linear in these operators. In the non-linear CCSD case products of these operators can still give a spin-contaminated contribution to the wave function. This ansatz is denoted partially spin adapted CCSD (PSA-CCSD). It has the advantage that the complications occurring through the spin adaptation are minimized, while most of the spin contamination is removed.

A slight complication arises for the so-called semi-internal configurations generated by the operators $e^\alpha_{at} e^\beta_{ti}$, which have the same orbital occupancy as the single excitations $E_{ai}$ but a different spin contribution. It is easily seen that $e^\alpha_{at} e^\beta_{ti}|0\rangle$ is not a spin eigenfunction; a correct spin eigenfunction is generated by the operator $e^\alpha_{at} e^\beta_{ti} - \frac{1}{2} e^\alpha_{ai} + \frac{1}{2} e^\beta_{ai}$. In fact, analysis of the action of the hamiltonian operator on the RHF reference function shows that this operator together with $E_{ai}$ generates the two possible spin eigenfunctions that contribute to the first-order interacting space. The cluster operator can now be written as
\[
\begin{aligned}
T = {} & \sum_{ia} \left( \tilde t^i_a e^\alpha_{ai} + \bar t^i_a e^\beta_{ai} \right) + \sum_{it} \bar t^i_t e^\beta_{ti} + \sum_{ta} \tilde t^t_a e^\alpha_{at} \\
& + \sum_{ij} \sum_{ab} T^{ij}_{ab} E_{ai} E_{bj} + \sum_{tj} \sum_{au} T^{tj}_{au} e^\alpha_{at} e^\beta_{uj} + \sum_{tj} \sum_{ab} T^{tj}_{ab} e^\alpha_{at} E_{bj} \\
& + \sum_{ij} \sum_{at} T^{ij}_{at} E_{ai} e^\beta_{tj} + \sum_{tu} \sum_{ab} T^{tu}_{ab} e^\alpha_{at} e^\alpha_{bu}
\end{aligned}
\tag{141}
\]
with the restrictions
\[ \tilde t^i_a = t^i_a - \frac{1}{2} \sum_t T^{ti}_{at}, \tag{142} \]
\[ \bar t^i_a = t^i_a + \frac{1}{2} \sum_t T^{ti}_{at}, \tag{143} \]
which account for the fact that there are only two independent spin eigenfunctions for the orbital configurations $\ldots \phi_i \phi_t \phi_a$, as discussed above. Equating the operator T with the spin-unrestricted operator in eq. (140) yields the following relations between the amplitudes
\[ \tilde T^{tu}_{ab} = T^{tu}_{ab}, \tag{144} \]
\[ \bar T^{ij}_{tu} = T^{ij}_{tu}, \tag{145} \]
\[ \tilde T^{pj}_{ab} = T^{pj}_{ab} - T^{pj}_{ba}, \tag{146} \]
\[ \bar T^{ij}_{ar} = T^{ij}_{ar} - T^{ji}_{ar}, \tag{147} \]
where p, q refer to all occupied orbitals (closed + open), and r, s to all open-shell + virtual orbitals. Setting further $t^i_t = \frac{1}{2} \bar t^i_t$ and $t^t_a = \frac{1}{2} \tilde t^t_a$ we obtain a unique set of amplitudes $T^{pq}_{rs}$ and $t^p_r$ to be solved for. The number of independent parameters is then exactly the same as in a fully spin adapted formulation and about three times smaller than in the spin-unrestricted case. The corresponding minimal set of coupled equations can be obtained by projecting the Schrödinger equation onto the set of functions generated by applying the individual excitation operators in the cluster operator to the reference function. Since the configurations generated in this way are non-orthonormal, simpler equations can again be derived by projecting the Schrödinger equation with the equivalent set of contravariant configurations. For details refer to Ref. 47.

The simplest possibility to solve the PSA-CCSD equations is to compute the UCCSD residuals, and then to form appropriate linear combinations of the different spin components to generate the spin-restricted residuals as needed for updating the amplitudes. Finally, the UCCSD amplitudes can be generated from the PSA-CCSD ones using eqs. (144)-(147). Of course, this procedure does not save any computer time relative to UCCSD, but it requires only a minor modification of an existing UCCSD program to perform the spin projection.

4 Linear scaling local correlation methods
As pointed out in the previous sections, the computational cost of conventional electron correlation methods like MP2 or CCSD(T) increases dramatically with the size of the system. The steep scaling mainly originates from the delocalized character of the canonical MO basis. This leads to a quadratic increase of the number of amplitudes used for correlating a given electron pair, and a quartic increase of the total number of parameters. The increase of the CPU time with molecular size is even steeper, being $O(N^7)$ for the method of choice, which is usually CCSD(T). From a physical point of view, however, there should be no need to correlate all electrons in an extended molecular system: dynamic electron correlation in non-metallic systems is a short-range effect with an asymptotic distance dependence of $r^{-6}$ (dispersion energy), and thus the high-order dependence of the computational cost on the number of electrons of the system is just an artifact of the canonical orthogonal basis, in which the diverse correlation methods have traditionally been formulated.

One natural way to circumvent this problem is to use local orbitals to span the occupied and virtual spaces. Such local correlation methods have been proposed by several authors. Some recent papers which also summarize previous work can be found in Refs. 52,53,54,55,56. Particularly successful has been the local correlation method originally proposed by Pulay [57], which was first implemented by Saebø and Pulay for Møller-Plesset perturbation theory up to fourth order (LMP2 - LMP4(SDQ) without triple excitations) and the coupled-electron pair approximation (CEPA) [58,59]. Later it was generalized to full local CCSD by Hampel and Werner [53]. While in the early work of Saebø and Pulay [58,59] it could already be shown that only 1-2% of the correlation energy (relative to a conventional calculation with the same basis set) is lost
by the local approximation, it was not yet possible at that time to demonstrate that the scaling of the computational cost can actually be reduced, and that larger systems than with conventional methods can be treated. Significant progress in this direction was only made during the last few years, when the local correlation methods were combined with newly developed integral-direct techniques [60], which fully exploit the possibilities for integral screening. Within such a framework, it has been possible to develop O(N) algorithms (asymptotic linear scaling of all computational resources, i.e. CPU time, memory and disk space, with molecular size) for local MP2 [54], local CCSD [61] and even for the local connected triples correction (T) [62].

In the local correlation methods the occupied space is usually spanned by localized molecular orbitals (LMOs), which are obtained from the occupied canonical orbitals of a preceding SCF calculation by means of a unitary localization procedure [63,64,65], which maintains the orthogonality of the occupied SCF orbitals (see footnote a):
\[ |\phi^{loc}_i\rangle = \sum_k |\phi^{can}_k\rangle W_{ki} \quad \text{with} \quad W W^\dagger = 1. \tag{148} \]
The corresponding MO coefficient matrices are related similarly:
\[ L = X_{occ} W. \tag{149} \]
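A short numerical check illustrates why a unitary transformation among the occupied orbitals is harmless: the rotated orbitals remain orthonormal in the AO metric, and the projector onto the occupied space is unchanged. The sketch below uses random matrices in place of real SCF data; the overlap matrix, the dimensions, and the random rotation standing in for an actual localization procedure are all assumptions of the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n_ao, n_occ = 6, 3

# Hypothetical AO overlap matrix: symmetric and positive definite.
A = rng.standard_normal((n_ao, n_ao))
S = A @ A.T + n_ao * np.eye(n_ao)

# Orthonormal "canonical" orbitals with X^T S X = 1 (symmetric orthogonalization).
w, U = np.linalg.eigh(S)
X = U @ np.diag(w ** -0.5) @ U.T
X_occ = X[:, :n_occ]

# A random orthogonal matrix W stands in for the localization transformation.
W, _ = np.linalg.qr(rng.standard_normal((n_occ, n_occ)))

# Eq. (149): coefficients of the rotated occupied orbitals.
L = X_occ @ W

orthonormal = np.allclose(L.T @ S @ L, np.eye(n_occ))
same_projector = np.allclose(L @ L.T, X_occ @ X_occ.T)
```

Because $L L^\dagger = X_{occ} X_{occ}^\dagger$, any quantity that depends only on the occupied space as a whole, such as the SCF density or a projector onto the virtual space, can be built from either set of orbitals.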
(If core orbitals are not correlated, the localization should be restricted to the subspace of correlated valence orbitals.) The idea of Pulay was to abandon the orthogonality of the virtual orbitals, and to use a basis of functions which resemble the atomic orbitals (AOs) as much as possible. Obviously, the AOs are optimally localized, but since they are not orthogonal to the occupied orbitals one cannot use them straightaway. The strong orthogonality between the occupied and virtual spaces must be retained, since otherwise excitations would violate the Pauli exclusion principle and the theory would become very complicated. The orthogonality to the occupied space can be enforced by applying a projection operator $(1 - \sum_i |\phi_i\rangle\langle\phi_i|)$ to the AOs, yielding projected atomic orbitals (PAOs)
\[ |\tilde\phi_r\rangle = \Big( 1 - \sum_{i=1}^{m_{occ}} |\phi_i\rangle\langle\phi_i| \Big) |\chi_r\rangle = \sum_\mu |\chi_\mu\rangle P_{\mu r} \tag{150} \]
with
\[ P = 1 - L L^\dagger S = 1 - X_{occ} X_{occ}^\dagger S = X_{virt} X_{virt}^\dagger S. \tag{151} \]
Here, $X_{occ}$ and $X_{virt}$ denote the rectangular submatrices of the MO coefficient matrix X for the occupied and virtual (external) canonical orbitals, respectively, i.e.,
\[ (X_{virt} X_{virt}^\dagger)_{\mu\nu} = \sum_a X_{\mu a} X_{\nu a}, \tag{152} \]
and the last equality in Eq. (151) follows from the orthonormality condition
\[ (X_{occ} X_{occ}^\dagger + X_{virt} X_{virt}^\dagger) S = 1. \tag{153} \]
The PAOs are orthogonal to all occupied orbitals,
\[ \langle \tilde\phi_r | \phi^{loc}_i \rangle = (P^\dagger S L)_{ri} = 0, \tag{154} \]
but non-orthogonal among themselves,
\[ \langle \tilde\phi_r | \tilde\phi_s \rangle = (P^\dagger S P)_{rs} = (S X_{virt} X_{virt}^\dagger S)_{rs}. \tag{155} \]

Footnote a: Recently, it has also been proposed to use non-orthogonal basis functions to span the occupied space [66,67], but the computational efficiency of this approach has not yet been proven.
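The properties in eqs. (151), (154) and (155) are easy to verify numerically. The following sketch builds the projection matrix for a random model problem; the overlap matrix and orbital coefficients are synthetic stand-ins for SCF data, and the function name `make_pao_matrix` is chosen for this example only.

```python
import numpy as np

def make_pao_matrix(S, X_occ):
    """Projection matrix P = 1 - X_occ X_occ^T S of eq. (151), real orbitals."""
    return np.eye(S.shape[0]) - X_occ @ X_occ.T @ S

rng = np.random.default_rng(2)
n_ao, n_occ = 6, 3

# Hypothetical AO overlap matrix and orthonormal MOs (X^T S X = 1).
A = rng.standard_normal((n_ao, n_ao))
S = A @ A.T + n_ao * np.eye(n_ao)
w, U = np.linalg.eigh(S)
X = U @ np.diag(w ** -0.5) @ U.T
X_occ, X_virt = X[:, :n_occ], X[:, n_occ:]

P = make_pao_matrix(S, X_occ)
S_pao = P.T @ S @ P                                   # PAO overlap, eq. (155)

# Number of linearly independent PAOs: non-zero eigenvalues of the PAO overlap.
n_independent = int(np.sum(np.linalg.eigvalsh(S_pao) > 1e-8))
```

There are as many PAOs as AOs, but they span only the virtual space, so the PAO overlap matrix has exactly as many non-zero eigenvalues as there are virtual orbitals. This is the linear dependency that has to be removed later, domain by domain.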
For non-metallic systems the PAOs are intrinsically localized, though less well than the unprojected AOs. Due to the projection the full set of PAOs is linearly dependent, but these linear dependencies can be removed at a later stage.

After having introduced local functions to span both the occupied and the virtual spaces, it is possible to truncate the expansion of the wavefunction in a physically reasonable way. First, one assigns to each localized orbital $\phi^{loc}_i$ an orbital domain [i] which contains all AOs needed to approximate the orbital $\phi^{loc}_i$ with a prescribed accuracy. In practice, all AOs at a given atom are always treated together, and as many atoms are added as required. The order in which atoms are added is determined by gross atomic Mulliken charges. The corresponding orbital domain in the virtual space is spanned by the PAOs generated by applying the projector to the selected AOs. The PAOs in domain [i] are then all spatially close to the localized orbital $\phi^{loc}_i$. This selection procedure can be performed fully automatically as described in Ref. 68.

The first approximation to the correlated wavefunction is now that single excitations from orbital $\phi^{loc}_i$ are restricted to PAOs in the domain [i], while double excitations from a pair of occupied LMOs i and j are restricted to a subset [ij] of PAOs. The pair domain [ij] is simply the union of the two orbital domains [i] and [j]. The immediate consequence of these truncations is that for a given pair (ij) the number of amplitudes $T^{ij}_{rs}$, $r, s \in [ij]$, no longer increases quadratically with increasing molecular size, but instead becomes independent of molecular size.

The second approximation is to introduce a hierarchical treatment of different pairs based on the interorbital distance $R_{ij}$ between two LMOs i and j. $R_{ij}$ is defined as the shortest distance between any centre included in the orbital domain [i] and any centre in the domain [j]. We distinguish strong, weak, distant, and very distant pairs.
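The distance criterion and the resulting pair classes can be sketched for a simple model. In the example below each LMO is a hypothetical bond orbital on a linear chain whose domain consists of the two atoms it connects; the atom spacing of 2.5 bohr and the default thresholds (1, 8 and 15 bohr, the typical values quoted in this section) are parameters of the illustration, not fixed constants of the method.

```python
import numpy as np

def interorbital_distance(coords, dom_i, dom_j):
    """Shortest distance between any centre in [i] and any centre in [j]."""
    diff = coords[dom_i][:, None, :] - coords[dom_j][None, :, :]
    return float(np.linalg.norm(diff, axis=-1).min())

def classify_pair(r_ij, r_strong=1.0, r_weak=8.0, r_distant=15.0):
    """Assign a pair class from R_ij (bohr); thresholds are typical values."""
    if r_ij <= r_strong:
        return "strong"
    if r_ij < r_weak:
        return "weak"
    if r_ij <= r_distant:
        return "distant"
    return "very distant"

def count_pairs(n_lmo, spacing=2.5):
    """Count pair classes for a model chain with one bond LMO per atom pair."""
    coords = np.array([[k * spacing, 0.0, 0.0] for k in range(n_lmo + 1)])
    domains = [[k, k + 1] for k in range(n_lmo)]      # atoms in each domain
    counts = {"strong": 0, "weak": 0, "distant": 0, "very distant": 0}
    for i in range(n_lmo):
        for j in range(i + 1):                        # pairs with i >= j
            r_ij = interorbital_distance(coords, domains[i], domains[j])
            counts[classify_pair(r_ij)] += 1
    return counts

counts_20 = count_pairs(20)
counts_40 = count_pairs(40)
```

Doubling the chain length roughly doubles the numbers of strong, weak and distant pairs in this model, while the number of very distant pairs grows much faster, in line with the scaling behaviour discussed in the text.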
The strong pairs have at least one atom in common and usually account for about 95% of the correlation energy. These pairs are treated at the highest level, e.g., CCSD. Weak pairs are those for which the minimum distance is smaller than, typically, 8 bohr. These pairs can be treated at a lower level, e.g., MP2. Distant pairs ($8 \le R_{ij} \le 15$ bohr) are also treated by MP2, but the required two-electron integrals can be approximated by a multipole expansion [69], which reduces the cost of the integral transformation (see section 6.3). Finally, the very distant pairs ($R_{ij} > 15$ bohr) contribute to the correlation energy only by a few microhartree and can therefore be neglected. The important point to notice is now that the numbers of strong, weak, and distant pairs all scale linearly with size. Only the number of very distant pairs, which are neglected, scales quadratically. This is demonstrated in Fig. 4 for linear chains of glycine peptides, (Gly)n ≡ HO[C(O)CH2NH]nH. Thus, the total number of amplitudes on which the wavefunction depends scales only linearly with molecular size. This forms the basis for the development of electron correlation methods with linear cost scaling. Furthermore, the number of strong pairs remains quite modest, which is very important for an efficient CCSD algorithm (cf. section 4.2).

[Figure 4: number of pairs (vertical axis, 0 to 10000) versus chain length n (horizontal axis, 0 to 20), with separate curves for very distant, weak, distant, and strong pairs.]
Figure 4. Number of pairs for a chain of glycine peptides (Gly)n as function of the chain length.

4.1 Local MP2
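In the local LMO/PAO basis, the first-order wave function takes the form
\[ |\Psi^{(1)}\rangle = \frac{1}{2} \sum_{ij \in P} \sum_{rs \in [ij]} T^{ij}_{rs} |\Phi^{rs}_{ij}\rangle \quad \text{with} \quad T^{ij}_{rs} = T^{ji}_{sr}, \tag{156} \]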
where P represents the truncated pair list and it is implicitly assumed that the pair domains [ij] are defined as described above. The configurations $|\Phi^{rs}_{ij}\rangle$ are defined as in eq. (49), but now the virtual labels r, s refer to the non-orthogonal PAOs. Note that the commutation relations of the excitation operators involving non-orthogonal orbitals are different and depend on overlap matrix elements. In order to derive the LMP2 equations in the non-orthogonal basis of PAOs we first consider the transformation properties of the operators and amplitudes. The projected orbitals can be expressed in the basis of virtual orbitals as
\[ P = X_{virt} [X_{virt}^\dagger S] = X_{virt} V, \tag{157} \]
and therefore the MP2 residual given in eq. (79) for a basis of orthogonal MOs can be transformed to the PAO basis as
\[ R^{ij}_{PAO} = V^\dagger R^{ij}_{MO} V. \tag{158} \]
The Fock and exchange matrices transform similarly. The transformation properties of the amplitude matrices can be obtained by expanding the projected orbitals in the pair correlation functions $\Psi_{ij}$ into the MO basis
\[ \Psi_{ij} = \sum_{rs \in [ij]} T^{ij}_{rs} |\ldots \tilde\phi_r \tilde\phi_s \ldots| = \sum_{ab} \Big[ \sum_{rs \in [ij]} V_{ar} T^{ij}_{rs} V_{bs} \Big] |\ldots \phi_a \phi_b \ldots|, \tag{159} \]
which yields the relation
\[ T^{ij}_{MO} = V T^{ij}_{PAO} V^\dagger. \tag{160} \]
Inserting this into eq. (79) yields
\[ R^{ij}_{PAO} = K^{ij}_{PAO} + f_{PAO} T^{ij}_{PAO} S_{PAO} + S_{PAO} T^{ij}_{PAO} f_{PAO} - \sum_k S_{PAO} \left[ f_{ik} T^{kj}_{PAO} + f_{kj} T^{ik}_{PAO} \right] S_{PAO} = 0, \tag{161} \]
where $S_{PAO} = P^\dagger S_{AO} P = V^\dagger V$ is the overlap matrix of the projected orbitals, cf. eq. (155). In the local basis the occupied-occupied and virtual-virtual blocks of the Fock matrix are not diagonal, and therefore the linear equations (161) have to be solved iteratively for the amplitudes $T^{ij}_{PAO}$. Restricting the excitations to domains [ij] of PAOs means that only the elements $T^{ij}_{rs}$ with $r, s \in [ij]$ are nonzero, and only the corresponding elements of the residual, $R^{ij}_{rs}$, $r, s \in [ij]$, must vanish at convergence. For a given set of amplitudes, the Hylleraas functional (eq. 92)
\[ E_2 = \sum_{ij \in P} \sum_{rs \in [ij]} (2 T^{ij}_{rs} - T^{ij}_{sr}) (K^{ij}_{rs} + R^{ij}_{rs}) \tag{162} \]
can be computed. At convergence, $R^{ij}_{rs} = 0$ for $r, s \in [ij]$, and then $E_2 = E^{(2)}$. Since the projected orbitals are not orthogonal and may even be linearly dependent, straightforward application of an update formula such as eq. (103) will lead to slow or no convergence. In order to perform the amplitude update it is therefore necessary to transform the residuals to a pseudo-canonical basis, which diagonalizes the Fock operator in the subspace of the domain [ij], i.e.
\[ \sum_{s \in [ij]} f_{rs} X^{ij}_{sa} = \epsilon^{ij}_a \sum_{s \in [ij]} S_{rs} X^{ij}_{sa} \quad \text{for} \quad r \in [ij], \tag{163} \]
\[ R^{ij}_{ab} = \sum_{rs \in [ij]} X^{ij}_{ra} R^{ij}_{rs} X^{ij}_{sb}. \tag{164} \]
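The update is then computed in this orthogonal basis and finally backtransformed to the projected basis
\[ \Delta T^{ij}_{ab} = -R^{ij}_{ab} / (\epsilon^{ij}_a + \epsilon^{ij}_b - f_{ii} - f_{jj}), \tag{165} \]
\[ \Delta T^{ij}_{rs} = \sum_{ab} X^{ij}_{ra} \Delta T^{ij}_{ab} X^{ij}_{sb}. \tag{166} \]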
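The update step of eqs. (163)-(166) for a single pair can be sketched as follows. This is a minimal illustration for real matrices restricted to one pair domain: the function names and the eigenvalue threshold are assumptions of the example, and the removal of linear dependencies by discarding small eigenvalues of the domain overlap matrix is one common way of projecting them out, not necessarily the procedure of any specific program.

```python
import numpy as np

def pseudo_canonical_basis(f_dom, S_dom, thresh=1e-6):
    """Solve f X = S X eps within a domain, dropping near-null vectors of S."""
    s_val, s_vec = np.linalg.eigh(S_dom)
    keep = s_val > thresh                        # project out linear dependencies
    A = s_vec[:, keep] / np.sqrt(s_val[keep])    # A^T S A = 1
    eps, C = np.linalg.eigh(A.T @ f_dom @ A)
    return A @ C, eps                            # X^T S X = 1, X^T f X = diag(eps)

def amplitude_update(R_rs, f_dom, S_dom, f_ii, f_jj, thresh=1e-6):
    """Return Delta T^{ij} in the PAO basis for one pair (ij)."""
    X, eps = pseudo_canonical_basis(f_dom, S_dom, thresh)
    R_ab = X.T @ R_rs @ X                                         # eq. (164)
    dT_ab = -R_ab / (eps[:, None] + eps[None, :] - f_ii - f_jj)   # eq. (165)
    return X @ dT_ab @ X.T                                        # eq. (166)

# Orthonormal two-function domain: reduces to the canonical update formula.
R = np.array([[0.1, 0.2], [0.2, 0.3]])
dT = amplitude_update(R, np.diag([1.0, 2.0]), np.eye(2), -0.5, -0.5)

# Linearly dependent three-function domain spanning only two dimensions.
B = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
S_dep = B.T @ B
f_dep = B.T @ np.diag([1.0, 2.0]) @ B
X_dep, eps_dep = pseudo_canonical_basis(f_dep, S_dep)
```

Since the dimension of each domain is independent of the molecular size, the cost of these small diagonalizations per pair is constant, and they need to be carried out only once before the iterations start.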
Note that the square transformation matrix $X^{ij}$ is different for each electron pair. The dimension of this matrix corresponds to the number of projected orbitals in domain [ij] and is therefore independent of the molecular size. If the overlap matrix $S^{ij}_{rs}$, $r, s \in [ij]$, has small or zero eigenvalues, i.e., if the functions in the domain are linearly dependent, the corresponding eigenvectors of $S^{ij}$ are projected out [53]. Convergence of this scheme is reached quickly; usually 5-7 iterations are sufficient to converge the energy to better than 0.1 μH using no further convergence acceleration [70]. In order to compute the residuals, only the small subset of exchange integrals
\[ K^{ij}_{rs} = (ri|sj) = \sum_{\mu\nu} P_{\mu r} P_{\nu s} \sum_{\rho\sigma} L_{\rho i} L_{\sigma j} (\mu\rho|\nu\sigma), \quad r, s \in [ij] \tag{167} \]
is needed, where all r, s are close either to i or j. This makes it possible to devise an integral-direct transformation scheme which scales only linearly with molecular size [54]. Taking further into account that for a given pair (ij) the number of terms k in the summation of eq. (161) becomes asymptotically independent of the molecular size (provided very distant pairs are neglected), it follows that the computational effort to solve the linear equations scales linearly with molecular size as well [54]. Thus, the overall cost to transform the integrals, to solve the linear equations (161), and to compute the second order energy depends linearly on the molecular size. This has made it possible to perform LMP2 calculations with about 2000 basis functions and 500 correlated electrons without using molecular symmetry. Since the memory demands are also small and scale linearly with molecular size, such calculations can even be performed on low-cost personal computers.

Finally we note that analytical energy gradients for LMP2 have been developed [71]. It has been shown that the local ansatz largely eliminates basis set superposition errors (BSSE), and it is therefore possible to optimize BSSE-free equilibrium structures of molecular clusters [72,73]. Recently, the theory for computing NMR chemical shifts using the LMP2 method has also been derived, and first promising results have been obtained [74].

4.2 Local CCSD
The LCCSD equations can be obtained in exactly the same way as indicated above for the LMP2 case, namely by transforming the residuals from the MO to the PAO basis. The resulting equations differ formally from the canonical ones only by the occurrence of additional matrix multiplications with the overlap matrix. The full formalism has been presented in Ref. 53 and will therefore not be repeated here. As already pointed out before, it is usually sufficient to treat pairs with interorbital distances $R_{ij} \le 1$ bohr (strong pairs) at the CCSD level. Exceptions are cases where it is important to treat long-range interactions accurately at a high level, for instance for computing intermolecular interactions. In the following discussion we will assume, however, that this is not the case, and that the number of strong pairs included in the CCSD treatment is relatively small and scales linearly with molecular size, as shown in Fig. 4.
For the LMP2 case it is immediately obvious that the number of transformed exchange integrals $K^{ij}_{rs} = (ri|sj)$ that need to be computed and stored depends only linearly on the molecular size. This follows from the fact that there is a one-to-one correspondence between these integrals and the corresponding amplitudes $T^{ij}_{rs}$. In the coupled cluster case, however, the situation is more complicated, since integrals like the above also couple different electron pairs in the CCSD formalism. Furthermore, as already discussed in section 6.6, there are additional contributions of Coulomb integrals $J^{ij}_{rs} = (ij|rs)$, as well as of integrals (ir|st) and (rs|tu) with three and four external indices, respectively. Closer inspection of the problem reveals, however, that also in the coupled cluster case the number of transformed integrals scales only linearly with molecular size. The same is true for the number of floating point operations needed to compute the residuals. In order to illustrate the main ideas we will consider the contribution of the $Y^{jk}$ intermediates to the LCCD residual, cf. eqs. (124) and (128),
\[ G^{ij} = \ldots + \sum_k S T^{ik} \left( K^{kj} - \frac{1}{2} J^{kj} \right) + \frac{1}{4} \sum_{kl} S T^{ik} L^{kl} T^{lj} S + \ldots. \tag{168} \]
Here, all matrices are assumed to be in the PAO basis. Now, since (ik) and (lj) both are strong pairs, there is only a constant number of LMOs k and l interacting with given i and j, respectively. Furthermore, since (ij) is also a strong pair, it follows that for a fixed (ij) the total number of operators contributing to each $G^{ij}$ is asymptotically constant and independent of the molecular size. Thus, the total number of integral matrices $J^{kl}$ and $K^{kl}$ needed in eq. (168) scales linearly; the same holds for the number of matrix multiplications. Furthermore, the LMOs k and l of the surviving operators have to be close, which is important to achieve linear scaling in the integral transformation needed to compute the $J^{kl}$ and $K^{kl}$. Note that fewer $J^{kl}$ than $K^{kl}$ are needed, since the $J^{kl}$ only occur in the linear terms. Thus, separate operator lists for the $J^{kl}$ and $K^{kl}$ have to be maintained. In contrast to the canonical case, the evaluation of the residuals is driven by individual $J^{kl}$ and $K^{kl}$, and the $Y^{jk}$ and $Z^{jk}$ intermediates are never explicitly computed.

The PAO range r, s of a particular operator $K^{kl}_{rs}$ is also independent of molecular size: since i must be close to k, and l close to j, all the r, s occurring in the matrix multiplications of eq. (168) must be within a limited distance of k, l. This leads to a different operator domain for each surviving operator. Again, the operator domains for the $J^{kl}$ are smaller than for the $K^{kl}$. Since the number of Coulomb and exchange matrices scales linearly with molecular size, and the number of elements per matrix is independent of size, it is evident that the overall number of transformed integrals scales linearly with molecular size.

So far, no approximations were involved in introducing the sparse operator lists and operator domains. However, there are a few terms like
\[ G^{ij} = \ldots - \sum_{kl} S T^{ij} L^{kl} T^{lk} S \ldots \tag{169} \]
with no coupling between ij and kl via pair amplitudes, and for those terms additional approximations have to be introduced to achieve linear scaling. Fortunately, the integrals involved in these contractions diminish quickly with increasing
distance between the pairs $(ij)$ and $(kl)$, and it is well justified to neglect couplings between remote pairs. For a detailed discussion of these approximations we refer to Ref. 61.

Another important feature of LCCSD is the fact that the number of 3-external and 4-external integrals $(ir|st)$ and $(rs|tu)$ also scales linearly with molecular size, and in fact remains rather modest. For the 3-external integrals this follows from the fact that the $r, s$ in the operators

$$ J(E^{kj})_{rs} = \sum_t (rs|tk)\, t^j_t , \qquad K(E^{kj})_{rs} = \sum_t (rk|ts)\, t^j_t \tag{170} $$
are restricted to the J-operator domain $[kj]$, while $t$ in the sum is restricted to the pair domain $[jj]$, which is identical to the orbital domain $[j]$. For the 4-external integrals the PAO indices simply all belong to the same pair domain $[ij]$, since there is a one-to-one correspondence between the residual $(R^{ij})_{rs}$ and the external exchange operators (Refs. 53, 61)

$$ K(T^{ij})_{rs} = \sum_{tu} T^{ij}_{tu}\, (rt|us) . \tag{171} $$
Thus the number of 4-external integrals per pair is a constant. Fig. 5 shows the number of 3-external and 4-external integrals in the local basis as a function of the length $n$ of a linear polyglycine peptide chain (Gly)$_n$ in a cc-pVDZ basis. Even for a molecule as large as (Gly)$_{20}$, with about 1500 basis functions and almost 500 correlated electrons, the disk storage required to hold the 3-external integrals is less than 1.5 GByte (compared to more than 3000 GByte in the canonical case). A similar amount is required for the 4-external integrals. Disk storage of the 3-external and 4-external integrals is very appealing, since the computational cost per iteration is then minimized. It can be estimated that forming the contractions of the 3-external and 4-external integrals with the amplitudes would take virtually no time (e.g., less than 50 sec for (Gly)$_{20}$). However, the transformation for the 4-external integrals is quite complicated and has not been implemented so far. Alternatively, the contribution of these integrals can be accounted for by computing for each strong pair an external exchange operator, as defined in eq. (107). In an integral-direct scheme, as will be discussed in section 6.5, it is then also possible to achieve linear cost scaling.

4.3 Local connected triples correction
Figure 5. Number of transformed integrals for local CCSD calculations for glycine peptides (Gly)$_n$ as a function of the chain length. [Plot not reproduced: file size / GByte (0.0 to 1.5) against $n$ (10 to 20), with one curve each for the 3-external and the 4-external integrals.]

The ultimate bottleneck for accurate conventional coupled cluster calculations is the connected triples correction, as outlined in section 2.7. If canonical orbitals are used, the Fock matrix is diagonal, and the perturbative energy correction can be obtained directly without storing the triples amplitudes. In the local basis this no longer holds, and in principle an iterative scheme is required, as described above for local MP2. One might therefore think that the evaluation of a local triples correction for large molecules is impossible, since the storage requirements for all triples amplitudes would scale as $O(N^6)$. However, as for the doubles, triple domains can be introduced, and the correlation of distant electrons can be neglected. The theory has been outlined in Ref. 62. Similarly to the LMP2 case the triples amplitudes are obtained by solving a system of linear equations

$$ Q^{ijk}_{rst} + W^{ijk}_{rst} = 0 \tag{172} $$
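with

$$ Q^{ijk}_{rst} = \sum_{r's'} \Big\{ \sum_{v} f_{tv}\, T^{ijk}_{r's'v}\, S_{r'r} S_{s's} + \text{permutations} \Big\} - \sum_{r's't'} \Big\{ \sum_{m} f_{km}\, T^{ijm}_{r's't'}\, S_{r'r} S_{s's} S_{t't} + \text{permutations} \Big\} \tag{173} $$

and

$$ W^{ijk}_{rst} = \sum_{r'} \Big\{ \sum_{v} (vs|tk)\, T^{ij}_{r'v}\, S_{r'r} + \text{permutations} \Big\} - \sum_{r's'} \Big\{ \sum_{m} (mj|kt)\, T^{im}_{r's'}\, S_{r'r} S_{s's} + \text{permutations} \Big\} . \tag{174} $$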
These equations have to be solved iteratively, and therefore all triples amplitudes $T^{ijk}_{rst}$ must be stored on disk. This seems devastating at first glance, but by virtue of the local approximations the number of amplitudes can be drastically reduced:

Firstly, a sparse triples list $(ijk)$ of strong triples is constructed by restricting the related pairs $ij$, $ik$ and $jk$ to strong pairs. The number of strong triples then scales linearly with molecular size.

Secondly, the excitations are restricted to triple domains $[ijk]$, constructed as the union of the three strong pair domains, i.e., $[ijk] = [ij] \cup [ik] \cup [jk]$. Since the sizes of the individual pair domains are independent of the molecular size, the size of the triple domain $[ijk]$ is also independent of the molecular size, yielding overall an asymptotically linear scaling of the number of triples amplitudes.

Another important implication of the constant size of the triple domains is that the number of required 3-external integrals occurring in eq. (174) scales only linearly with molecular size. In practice, for each orbital $l$ a united triple domain $UT(l)$ is defined as the union of all triple domains $[ijk]$ comprising a common LMO index $l$, i.e.,

$$ UT(l) = \bigcup\, [ijk], \quad \text{for } (i = l) \vee (j = l) \vee (k = l), \tag{175} $$

and all 3-external integrals $(vs|tl)$ with $v, s, t \in UT(l)$ are generated using an integral-direct transformation module. Obviously, the size of $UT(l)$ is independent of the molecular size, and the CPU time as well as the memory and disk requirements of the transformation scale asymptotically linearly with molecular size. In fact, the set of 3-external integrals needed for the triples correction remains rather small (Ref. 62), and usually it is a subset of the 3-external integral set required in the preceding coupled cluster calculation (cf. Fig. 5 in section 4.2).

A linear scaling algorithm for local triples has been implemented in MOLPRO 2000. So far, inter-triples couplings via the occupied-occupied off-diagonal Fock matrix elements are neglected (couplings via the virtual-virtual block and the overlap matrices are included, though). This yields about 95% of the local triples correction and has the advantage that the iterative solution of eq. (172) can be avoided. As in the canonical case, the correlation contribution of each individual triple can be computed separately. First test results (Ref. 62) presented in Table 3 are very promising, showing already for medium sized molecules speedups by factors of 500-1000 compared to the conventional (T) calculation presented earlier in Table 1 (note that the calculations in Table 1 used molecular symmetry, while the current calculations were done with no symmetry). In these calculations about 85% of the canonical triples correction was recovered (Ref. 62). The savings quickly increase with increasing molecular size. In sharp contrast to the conventional case, the time to compute the local triples correction is very small compared to the time for the preceding integral transformation and LCCSD calculation. Considering the efficiency of the new triples kernel, it even seems possible to go beyond the CCSD(T) model, i.e. to include the triples in the CC iterations, even for large chemical systems.
Finally, it should be emphasized that the triples amplitudes can be stored and an iterative full local triples algorithm is presently under development.
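The two counting arguments of this section (a linearly growing list of strong triples, and triple domains of constant size) can be illustrated with a toy model. The sketch below is not the MOLPRO implementation: it assumes LMOs on a linear chain, calls a pair strong when the two sites are at most `max_sep` apart, uses a hypothetical orbital domain of fixed radius, and takes each pair domain as the union of the two orbital domains.

```python
from itertools import combinations


def strong_pairs(n_lmo, max_sep=2):
    """Pairs (i, j), i < j, of chain sites separated by at most max_sep."""
    return {
        (i, j)
        for i in range(n_lmo)
        for j in range(i + 1, min(n_lmo, i + max_sep + 1))
    }


def strong_triples(n_lmo, pairs, max_sep=2):
    """Triples i < j < k whose three pairs ij, ik and jk are all strong."""
    triples = []
    for i in range(n_lmo):
        partners = range(i + 1, min(n_lmo, i + max_sep + 1))
        for j, k in combinations(partners, 2):
            if (i, j) in pairs and (i, k) in pairs and (j, k) in pairs:
                triples.append((i, j, k))
    return triples


def orbital_domain(i, n_lmo, radius=1):
    """Toy orbital domain [i]: all sites within `radius` of LMO i (assumption)."""
    return set(range(max(0, i - radius), min(n_lmo, i + radius + 1)))


def triple_domain(i, j, k, n_lmo):
    """[ijk] = [ij] U [ik] U [jk], each pair domain being [i] U [j] etc."""
    return (orbital_domain(i, n_lmo)
            | orbital_domain(j, n_lmo)
            | orbital_domain(k, n_lmo))


def summary(n_lmo):
    """Return (number of strong pairs, number of strong triples, largest [ijk])."""
    pairs = strong_pairs(n_lmo)
    triples = strong_triples(n_lmo, pairs)
    largest = max(len(triple_domain(i, j, k, n_lmo)) for i, j, k in triples)
    return len(pairs), len(triples), largest
```

Doubling the chain length in `summary` roughly doubles the first two numbers while the third stays fixed, whereas the number of all triples grows cubically.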
Table 3. CPU, disk and memory requirements for computing the (T) correction. All calculations were performed with a development version of MOLPRO 2000 (Ref. 75). No molecular symmetry was used.

Molecule    bf    Memory/MW   Disk(a)/MW   CPU(b)/sec
(Gly)1      95       2.57         7.46        187.3
(Gly)2     166       6.38        25.51        757.8
(Gly)3     236       8.82        39.98        934.6
(Gly)4     308      13.28        61.06       1296.6
(Gly)6     450      18.95        94.17       1852.5

a) Disk space for storage of 3-external integrals necessary for (T) only.
b) HP J282 PA8000/180MHz.

5 Multireference electron correlation methods

5.1 Configuration Interaction: general aspects
For a given orbital basis set, Schrödinger's equation as expressed using the second-quantized hamiltonian $\hat H$ (equation (34)) is solved by finding eigenvectors and eigenvalues of the hamiltonian matrix in the complete basis of $N$-electron orbital products. This full CI problem is of extremely large dimension for even a small number of electrons with a modest orbital basis size, and is usually intractable. However, it is important to consider it for two reasons: first of all, where the full CI problem can be solved, it provides very important benchmark data against which approximate methods can be evaluated; secondly, the techniques and algorithms applicable to the full CI problem serve as appropriate building blocks for the sometimes more complicated approximate methods.

Although the full configuration space for $N$ electrons in $m$ spatial orbitals consists formally of the complete set of $(2m)^N$ spin-orbital products $\phi_{i_1}(x_1)\phi_{i_2}(x_2) \ldots \phi_{i_N}(x_N)$, the space can be reduced substantially through symmetry considerations:

Spatial (point group) symmetry. $\hat H$ is invariant to geometrical transformations whose only effect is to interchange identical nuclei. The action of the symmetry operators on the wavefunction is defined through

$$ T\, \Psi(q) = \Psi(T^{-1} q) \tag{176} $$

where $q$ represents the coordinates of the particles. In electronic structure calculations, the use of abelian point group symmetry is straightforward; provided each orbital is a basis for an irreducible representation, then so is every orbital product. All orbital products not of the required symmetry can then be simply discarded from the basis. For nonabelian point groups, orbital products are in general of mixed symmetry, and it is therefore usual to exploit the symmetry of only the highest abelian subgroup.

Permutational symmetry. $\hat H$ is totally symmetric in the labels of the electrons, and so is invariant under the operation $I_{ij}$ which interchanges the labels of 2 electrons $i, j$, i.e., $[\hat H, I_{ij}] = 0$. At the simplest level, $I_{ij}^2 = 1$, and so there are $\frac12 N(N-1)$ two-element symmetry groups $\{1, I_{ij}\}$. Symmetry adapted wavefunctions will satisfy $I_{ij}\Psi = \pm\Psi$, the different signs corresponding to boson and fermion states. We are interested only in fermion solutions, and so it is vital to use this symmetry to exclude unwanted boson and non-physical states. In further detail, there is actually a total of $N!$ permutations of the electron labels, which can be built as products of $I_{ij}$ operators. As with point groups, we define the action of a permutation operator on the wavefunction through equation (176). The permutations form a group isomorphic with the Symmetric Group $S_N$, and to use permutational symmetry to the full, we must consider all of these $N!$ operators which commute with $\hat H$. Since the electronic wavefunction is antisymmetric with respect to all the $I_{ij}$, it must form a basis for the one dimensional totally antisymmetric representation of $S_N$; the representation matrix elements $\Gamma(P)$ are equal to the parity $\epsilon_P$ of the permutation $P$, which is $\pm 1$ according to whether $P$ is made up from an even or odd number of interchanges $I_{ij}$. To enforce the symmetry, we apply a multiple of the Wigner projection operator for this representation, the antisymmetrizer

$$ \mathcal{A} = \frac{1}{\sqrt{N!}} \sum_{P}^{N!} \epsilon_P P . \tag{177} $$
When applied to a simple product of orbitals, $\mathcal{A}$ yields the corresponding Slater determinant

$$ \mathcal{A}\, \phi_1(1)\phi_2(2) \ldots \phi_N(N) = \frac{1}{\sqrt{N!}} \sum_P^{N!} \epsilon_P P\, \phi_1(1)\phi_2(2) \ldots \phi_N(N) = \frac{1}{\sqrt{N!}} \begin{vmatrix} \phi_1(1) & \phi_2(1) & \ldots & \phi_N(1) \\ \phi_1(2) & \phi_2(2) & \ldots & \phi_N(2) \\ \vdots & \vdots & & \vdots \\ \phi_1(N) & \phi_2(N) & \ldots & \phi_N(N) \end{vmatrix} \tag{178} $$

Note that, apart from a possible phase factor, exactly the same determinant would arise if $\mathcal{A}$ were applied to a string of the same orbitals, but in a different order, e.g., $\phi_2(1)\phi_3(2)\phi_1(3) \ldots$. Therefore we can symmetry reduce the full set of $m^N$ orbital products to a much smaller basis of $\binom{m}{N}$ Slater determinants obtained by acting with the antisymmetrizer on each of the $\binom{m}{N}$ unique orbital products. The valid unique orbital products can be determined by assuming an ordering for the orbitals; each of the $m$ orbitals $\phi_i$ is assigned a sequence number $i$, $i = 1, 2, \ldots, m$, and only orbital products $\phi_{i_1}\phi_{i_2} \ldots \phi_{i_N}$ for which $i_1 < i_2 < \ldots < i_N$ are included.
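As an illustration of (177) and (178), the following minimal sketch evaluates the antisymmetrized product as an explicit sum over permutations with their parities, and enumerates the ordered index lists that label the unique determinants. The monomial "orbitals" are hypothetical and chosen only so that the result is a Vandermonde determinant that can be checked by hand; the $N!$ loop is meant for small $N$ only.

```python
from itertools import combinations, permutations
from math import factorial, sqrt


def parity(perm):
    """Return +1 or -1 for a permutation given as a tuple of 0..N-1."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1


def antisymmetrized_product(orbitals, coords):
    """Evaluate (1/sqrt(N!)) * sum_P eps_P * prod_k orbitals[P[k]](coords[k])."""
    n = len(orbitals)
    total = 0.0
    for perm in permutations(range(n)):
        term = float(parity(perm))
        for electron, orbital_index in enumerate(perm):
            term *= orbitals[orbital_index](coords[electron])
        total += term
    return total / sqrt(factorial(n))


def unique_products(m, n):
    """Ordered index lists i1 < i2 < ... < iN, one per Slater determinant."""
    return list(combinations(range(1, m + 1), n))


# Three hypothetical one-dimensional "orbitals", for illustration only.
orbitals = [lambda x: 1.0, lambda x: x, lambda x: x * x]
value = antisymmetrized_product(orbitals, [0.3, 1.1, 2.0])
swapped = antisymmetrized_product(orbitals, [1.1, 0.3, 2.0])  # electrons 1, 2 exchanged
```

Exchanging two electron coordinates changes the sign of the result, and using the same orbital twice gives zero, which is the Pauli principle in this language.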
Spin symmetry. The electron spin operators are defined through

$$ \begin{aligned} \hat S^2 &= \hat{\mathbf S} \cdot \hat{\mathbf S}; \qquad \hat{\mathbf S} = \sum_i^N \hat{\mathbf s}(i) \\ \hat s_x \alpha &= \tfrac12 \beta; \quad \hat s_y \alpha = \tfrac{i}{2} \beta; \quad \hat s_z \alpha = \tfrac12 \alpha; \\ \hat s_x \beta &= \tfrac12 \alpha; \quad \hat s_y \beta = -\tfrac{i}{2} \alpha; \quad \hat s_z \beta = -\tfrac12 \beta, \end{aligned} \tag{179} $$

where $\alpha$ and $\beta$ are the one electron spin eigenfunctions. The nonrelativistic hamiltonian contains no spin operators, and so $[\hat H, \hat S_z] = [\hat H, \hat S^2] = 0$. It is not possible to use simple group theory to exploit these symmetries, since the operators $\hat S_z$, $\hat S^2$ do not form a closed finite group. But we can use other considerations to force the $N$ electron basis set, and hence the wavefunction, to be eigenfunctions of $\hat S_z$ and/or $\hat S^2$.

In the case of $\hat S_z$, the usual approach is to use a basis of $2m$ spin orbitals, made up of $m$ spatial orbitals $\phi_i$, $i = 1, 2, \ldots, m$, each multiplied by a spin function $\alpha$ or $\beta$. Then any orbital product, or Slater determinant, is automatically an eigenfunction of $\hat S_z$ according to (179), with eigenvalue $\frac12 (N_\alpha - N_\beta)$, where $N_\alpha$ is the number of $\alpha$ spin orbitals $\phi_i \alpha$ in the function, and $N_\beta = N - N_\alpha$. Thus the basis is already adapted to $\hat S_z$ symmetry, and we may discard all those $N$ electron functions with the wrong $\hat S_z$ eigenvalue. This reduces the size of the Slater determinant basis from $\binom{2m}{N}$ to

$$ M_D = \binom{m}{N_\alpha} \binom{m}{N_\beta} , \tag{180} $$

since for each of the $\binom{m}{N_\alpha}$ possible arrangements of the $\alpha$ spin orbitals there are $\binom{m}{N_\beta}$ choices for the $\beta$ spin orbitals.
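The relations (179) fix how $\hat S_z$ and $\hat S^2$ act on any product of $\alpha$ and $\beta$ functions. The sketch below is one way to check this by hand: a spin function is stored as a dictionary from strings such as "ab" (electron 1 has $\alpha$, electron 2 has $\beta$) to exact coefficients. The representation and the function names are illustrative assumptions; the only physics used is $\hat S^2 = \hat S_- \hat S_+ + \hat S_z^2 + \hat S_z$ with the ladder operators $\hat s_\pm = \hat s_x \pm i \hat s_y$ that follow from (179).

```python
from collections import defaultdict
from fractions import Fraction

HALF = Fraction(1, 2)


def s_z(state):
    """Apply S_z: each product is scaled by (N_alpha - N_beta) / 2."""
    out = defaultdict(Fraction)
    for spins, coeff in state.items():
        out[spins] += coeff * HALF * (spins.count("a") - spins.count("b"))
    return dict(out)


def ladder(state, raise_spin):
    """Apply S_+ (raise_spin=True) or S_- (False) as a sum of one-electron flips."""
    source, target = ("b", "a") if raise_spin else ("a", "b")
    out = defaultdict(Fraction)
    for spins, coeff in state.items():
        for k, label in enumerate(spins):
            if label == source:
                out[spins[:k] + target + spins[k + 1:]] += coeff
    return dict(out)


def s_squared(state):
    """Apply S^2 = S_- S_+ + S_z^2 + S_z and drop vanishing terms."""
    out = defaultdict(Fraction)
    parts = (ladder(ladder(state, True), False), s_z(s_z(state)), s_z(state))
    for part in parts:
        for spins, coeff in part.items():
            out[spins] += coeff
    return {spins: coeff for spins, coeff in out.items() if coeff != 0}


alpha_beta = {"ab": Fraction(1)}
difference = {"ab": Fraction(1), "ba": Fraction(-1)}  # unnormalized
total = {"ab": Fraction(1), "ba": Fraction(1)}        # unnormalized
```

Applying `s_squared` to `alpha_beta` returns a mixture of "ab" and "ba", so a single product is not a spin eigenfunction, while the two combinations are eigenfunctions with $S(S+1) = 0$ and $2$ respectively. Note that these are combinations of primitive spin products, not of Slater determinants.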
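For $\hat S^2$, the situation is not so simple. Orbital products or Slater determinants are not in general eigenfunctions of $\hat S^2$; for example, following (179), $\hat S^2 \alpha(1)\beta(2) = \alpha(1)\beta(2) + \beta(1)\alpha(2)$. If the symmetry is to be exploited, Slater determinants must be linearly combined into functions which are eigenfunctions of $\hat S^2$. Such functions are often termed Configuration State Functions (CSFs). As a simple example, for two electrons in two orbitals with $N_\alpha = N_\beta = 1$, the normalized Slater determinants are

$$ \mathcal{A}\phi_1^\alpha\phi_1^\beta, \quad \mathcal{A}\phi_1^\alpha\phi_2^\beta, \quad \mathcal{A}\phi_2^\alpha\phi_1^\beta, \quad \mathcal{A}\phi_2^\alpha\phi_2^\beta ; $$

the normalized CSFs with $S = 0$ are

$$ \mathcal{A}\phi_1^\alpha\phi_1^\beta, \quad \mathcal{A}\phi_2^\alpha\phi_2^\beta, \quad (1/\sqrt2)\left(\mathcal{A}\phi_1^\alpha\phi_2^\beta + \mathcal{A}\phi_2^\alpha\phi_1^\beta\right) , $$

and the CSF with $S = 1$ (i.e., the eigenvalue of $\hat S^2$ is $S(S+1) = 2$) is $(1/\sqrt2)\left(\mathcal{A}\phi_1^\alpha\phi_2^\beta - \mathcal{A}\phi_2^\alpha\phi_1^\beta\right)$.

Generally, the set of Slater determinants exactly spans the sets of CSFs with spin quantum numbers $S = \frac12(N_\alpha - N_\beta), \frac12(N_\alpha - N_\beta) + 1, \ldots, \frac12 N$. Ignoring any point group symmetry, the number of CSFs with spin quantum number $S$ is given by the Weyl formula (Ref. 76)

$$ M_C = \frac{2S+1}{m+1} \binom{m+1}{\frac12 N - S} \binom{m+1}{m - \frac12 N - S} \tag{181} $$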
for the case that $S = \frac12(N_\alpha - N_\beta)$. So, for example, for $S = 0$ and large $m$, the number of CSFs is less than the number of Slater determinants by a factor of about $\frac12 N + 1$. The advantage of reducing the basis in this way has to be offset against the increased complexity of the functions which must be dealt with; in practice both Slater determinants and CSFs are commonly used, and we discuss the practicalities of matrix element evaluation with each below.

Orbital rotation symmetry. If we have all (unique) orbital products possible for $N$ electrons in $m$ orbitals, then the basis is invariant to rotations (or in fact any nonsingular linear transformation) of the orbitals amongst themselves. These rotations form a continuous group $U(m)$, the unitary group (or $GL(m)$, the general linear group), and the theory of such groups is exploited to advantage, for example, in the Graphical Unitary Group Approach (GUGA) (Ref. 77) for configuration interaction.

In order to perform a variational configuration interaction calculation in either the full or a truncated configuration space, it is necessary to find an eigenvector of the matrix $\mathbf H$ of the hamiltonian operator $\hat H$ in the appropriate configuration space. Direct construction and diagonalization of $\mathbf H$ is usually out of the question since it is typically of dimension $10^3$ to $10^7$; but algorithms to find a few eigenvectors for such matrices exist (Refs. 3, 78), and rely on the construction, for a few (about 10-20) given trial vectors $\mathbf c$, of the action of $\mathbf H$ on $\mathbf c$,

$$ \mathbf v = \mathbf{H c} . \tag{182} $$
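The counting formulas (180) and (181) are easy to check numerically. The sketch below is a direct transcription, with the spin passed as the integer `two_s` $= 2S$ so that half-integer values stay exact; the function names are illustrative.

```python
from math import comb


def n_determinants(m, n_alpha, n_beta):
    """M_D of eq. (180): alpha and beta strings are chosen independently."""
    return comb(m, n_alpha) * comb(m, n_beta)


def n_csfs(m, n_elec, two_s):
    """M_C of eq. (181) for n_elec electrons in m orbitals with spin S = two_s / 2."""
    if two_s < 0 or two_s > n_elec or (n_elec - two_s) % 2:
        return 0
    k = (n_elec - two_s) // 2           # N/2 - S
    lower = m - k - two_s               # m - N/2 - S
    if lower < 0:
        return 0
    product = (two_s + 1) * comb(m + 1, k) * comb(m + 1, lower)
    return product // (m + 1)           # the Weyl formula always gives an integer


def reduction_factor(m, n_elec):
    """Ratio M_D / M_C for a closed-shell-like case, S = 0 and N_alpha = N_beta."""
    half = n_elec // 2
    return n_determinants(m, half, half) / n_csfs(m, n_elec, 0)
```

For the two-electron, two-orbital example above this gives three singlet CSFs and one triplet CSF, which together account for the four determinants with $N_\alpha = N_\beta = 1$. For larger $m$ the value of `reduction_factor(m, N)` approaches $\frac12 N + 1$ from below.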
Other ab initio approaches which are not simple matrix eigenproblems can also proceed through (182). Therefore it is vital to have an efficient scheme for constructing (182) from the hamiltonian integrals $h_{pq}$, $(pq|rs)$. Following (37), this means we must be able to compute rapidly the set of one and two particle coupling coefficients $d^{IJ}_{pq}$, $D^{IJ}_{pqrs}$.

In many circumstances, the most efficient schemes for building (182) require computation only of the one particle coefficients $d^{IJ}_{pq}$, without explicit construction of the two body terms $D^{IJ}_{pqrs}$. This is achieved through a formal insertion of the resolution of the identity as a sum over the complete space of orbital products,

$$ \begin{aligned} D^{IJ}_{pqrs} &= \langle \Phi_I | E_{pq} E_{rs} - \delta_{qr} E_{ps} | \Phi_J \rangle \\ &= \sum_K \langle \Phi_I | E_{pq} | \Phi_K \rangle \langle \Phi_K | E_{rs} | \Phi_J \rangle - \delta_{qr} \langle \Phi_I | E_{ps} | \Phi_J \rangle \\ &= \sum_K d^{IK}_{pq} d^{KJ}_{rs} - \delta_{qr} d^{IJ}_{ps} . \end{aligned} \tag{183} $$
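The structure of (183) can be checked by brute force in a small determinant space. The sketch below makes two simplifying assumptions that are not in the text: it works with spin-orbital excitations $a_p^\dagger a_q$ rather than the spin-summed $E_{pq}$, and it stores a determinant as a bit mask with the usual sign convention of ascending creation operators. It then verifies that the sum over intermediate determinants, minus the $\delta_{qr}$ term, reproduces the matrix element of $a_p^\dagger a_r^\dagger a_s a_q$.

```python
from itertools import combinations


def annihilate(state, q):
    """Apply a_q to (sign, mask); return None if orbital q is empty."""
    sign, mask = state
    if not (mask >> q) & 1:
        return None
    below = bin(mask & ((1 << q) - 1)).count("1")
    return (-sign if below % 2 else sign, mask ^ (1 << q))


def create(state, p):
    """Apply a_p^dagger to (sign, mask); return None if orbital p is occupied."""
    sign, mask = state
    if (mask >> p) & 1:
        return None
    below = bin(mask & ((1 << p) - 1)).count("1")
    return (-sign if below % 2 else sign, mask | (1 << p))


def element(bra, ket, ops):
    """<bra| ops |ket> for a product of operators, applied right to left."""
    state = (1, ket)
    for func, index in reversed(ops):
        state = func(state, index)
        if state is None:
            return 0
    return state[0] if state[1] == bra else 0


def one_body(bra, ket, p, q):
    """Spin-orbital analogue of the coupling coefficient d^{IJ}_{pq}."""
    return element(bra, ket, [(create, p), (annihilate, q)])


def check_identity(m, n):
    """Compare both sides of (183) for all determinants of n electrons in m spin orbitals."""
    dets = [sum(1 << i for i in occ) for occ in combinations(range(m), n)]
    orbs = range(m)
    for bra in dets:
        for ket in dets:
            for p in orbs:
                for q in orbs:
                    for r in orbs:
                        for s in orbs:
                            ops = [(create, p), (create, r),
                                   (annihilate, s), (annihilate, q)]
                            direct = element(bra, ket, ops)
                            summed = sum(one_body(bra, mid, p, q)
                                         * one_body(mid, ket, r, s)
                                         for mid in dets)
                            if q == r:
                                summed -= one_body(bra, ket, p, s)
                            if direct != summed:
                                return False
    return True
```

The intermediate sum runs over all determinants with the same number of electrons, which is why no truncation of the set $\{\Phi_K\}$ is allowed in this check.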
Note that $E_{pq}$ commutes with electron label permutations and spin operators; therefore the set of intermediate states $\{\Phi_K\}$ can be reduced to the full set of Slater determinants or CSFs as convenient; but the same is not true for point group operations, and $\{\Phi_K\}$ must therefore extend over all spatial symmetries. The algorithm for building (182) then proceeds as (Ref. 79)

    DO K = 1, M
      DO p, q = 1, m such that d^{KJ}_{pq} ≠ 0
        F^K_{pq} = F^K_{pq} + d^{KJ}_{pq} c_J
      END DO
    END DO                                              (184)

    DO r ≥ s
      DO p ≥ q
        DO K = 1, M
          E^K_{rs} = E^K_{rs} + F^K_{pq} (pq|rs)
        END DO
      END DO
    END DO                                              (185)

    DO K = 1, M
      DO p, q = 1, m such that d^{IK}_{pq} ≠ 0
        v_I = v_I + E^K_{pq} d^{IK}_{pq}
      END DO
    END DO                                              (186)
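The three stages (184)-(186) can be written out as a short sketch. Everything about the data layout here is an assumption made for illustration: the coupling coefficients are held in a dictionary of nonzero entries, the integrals in nested lists, and the loops run over all index pairs rather than $r \ge s$, $p \ge q$. The factor $\frac12$, the $\delta_{qr}$ term of (183) and the one-electron part are left out, so only the factorized double contraction itself is shown, together with a reference loop to compare against.

```python
import random


def sigma_factorized(d, eri, c, m):
    """v_I = sum over K, J, pq, rs of d[I,K,p,q] (pq|rs) d[K,J,r,s] c[J], in three stages.

    d   : dict {(I, K, p, q): value} with only the nonzero coupling coefficients
    eri : nested lists, eri[p][q][r][s] = (pq|rs)
    c   : trial vector of length M
    """
    big_m = len(c)
    # Stage 1, cf. (184): scatter the trial vector through the sparse couplings.
    f = [[0.0] * (m * m) for _ in range(big_m)]
    for (k, j, r, s), value in d.items():
        f[k][r * m + s] += value * c[j]
    # Stage 2, cf. (185): dense product with the integral matrix (the dominant step).
    e = [[0.0] * (m * m) for _ in range(big_m)]
    for k in range(big_m):
        for p in range(m):
            for q in range(m):
                e[k][p * m + q] = sum(
                    eri[p][q][r][s] * f[k][r * m + s]
                    for r in range(m) for s in range(m)
                )
    # Stage 3, cf. (186): gather the result through the sparse couplings again.
    v = [0.0] * big_m
    for (i, k, p, q), value in d.items():
        v[i] += value * e[k][p * m + q]
    return v


def sigma_direct(d, eri, c, m):
    """Reference: the same sum as one explicit loop nest over pairs of couplings."""
    v = [0.0] * len(c)
    for (i, k, p, q), left in d.items():
        for (k2, j, r, s), right in d.items():
            if k2 == k:
                v[i] += left * eri[p][q][r][s] * right * c[j]
    return v


def random_case(big_m=5, m=3, fill=0.3, seed=1):
    """Random sparse couplings and integrals, for testing only (not physical)."""
    rng = random.Random(seed)
    d = {
        (i, k, p, q): rng.choice([-1.0, 1.0])
        for i in range(big_m) for k in range(big_m)
        for p in range(m) for q in range(m)
        if rng.random() < fill
    }
    eri = [[[[rng.uniform(-1, 1) for _ in range(m)] for _ in range(m)]
            for _ in range(m)] for _ in range(m)]
    c = [rng.uniform(-1, 1) for _ in range(big_m)]
    return d, eri, c, m
```

Only stage 2 touches the integrals, and it has the form of a matrix product between the $M \times m^2$ array `f` and the $m^2 \times m^2$ integral matrix, which is the property the text relies on.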
The one electron part and second term of (183) are easily dealt with in an additional stage, or may be included in (184)-(186) by modifying the two electron integrals. The advantage of using this scheme is that, for sufficiently large cases, the computation time is dominated by (185), requiring approximately $\frac12 M m^4$ floating point operations, and this step is a large dimension matrix multiplication capable of driving most computer hardware at optimal speeds. In what follows, therefore, we are concerned principally with the evaluation, rapidly and in the correct order for assembly of (184)-(186), of the nonzero $d^{IJ}_{pq}$, without the need to consider the more complicated structure, and much larger number, of $D^{IJ}_{pqrs}$ coefficients. In some circumstances, simple Slater determinants offer the most efficient route to calculating (182), whilst elsewhere the greater compactness of the CSF basis is important. Therefore we develop techniques for evaluating $d^{IJ}_{pq}$ in both types of basis set.

5.2 Matrix elements between Slater Determinants

Any Slater determinant can be written in the form

$$ \Phi_{I,J} = \mathcal{A}\, \Phi^\alpha_I \Phi^\beta_J \tag{187} $$
where $\Phi^\alpha_I(r_1, r_2, \ldots, r_{N_\alpha})$ is a string (product) of occupied $\alpha$ spin orbitals

$$ \Phi^\alpha_I = \phi^\alpha_{I_1}(1)\, \phi^\alpha_{I_2}(2) \ldots \phi^\alpha_{I_{N_\alpha}}(N_\alpha) , \tag{188} $$

which is completely specified by the ordered list of sequence numbers of occupied orbitals, $\{I_1 < I_2 < \ldots < I_{N_\alpha}\}$. Similarly, $\Phi^\beta_J(r_{N_\alpha+1}, r_{N_\alpha+2}, \ldots, r_N)$ is a string of occupied $\beta$ spin orbitals. For the case of a complete basis of determinants, this is a particularly helpful classification, since a wavefunction is then specified by a fully populated rectangular matrix of coefficients $\mathbf C$,

$$ \Psi = \sum_{IJ} C_{IJ} \Phi_{I,J} , \tag{189} $$
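and this simple rectangular addressing structure makes for a particularly efficient computer implementation. For certain special types of incomplete CI expansion, it is possible to obtain similar structures (Ref. 80), but it is the case of full CI (FCI) for which the determinant basis has found particularly useful application. For the evaluation of coupling coefficients, we can exploit the fact that the orbital excitation operator partitions as

$$ E_{pq} = e^\alpha_{pq} + e^\beta_{pq} , \tag{190} $$

where $e^\alpha_{pq}$, $e^\beta_{pq}$ excite only $\alpha$, $\beta$ spin orbitals respectively; thus the effect of $E_{pq}$ on any determinant is to produce at most two new determinants:

$$ E_{pq}\, \mathcal{A}\, \Phi^\alpha_I \Phi^\beta_J = \mathcal{A} \left( e^\alpha_{pq} \Phi^\alpha_I \right) \Phi^\beta_J + \mathcal{A}\, \Phi^\alpha_I \left( e^\beta_{pq} \Phi^\beta_J \right) . \tag{191} $$

Note that the $\alpha$ excitation is completely independent of $\Phi^\beta_J$, and so once a particular $\alpha$ spin excitation has been characterized, one can use the information found for all $\beta$ strings, obtaining for the $\alpha$ contribution

$$ \langle \Phi_{I,J} | e^\alpha_{pq} | \Phi_{K,L} \rangle = \langle \Phi^\alpha_I | \mathcal{A}^\dagger \mathcal{A} | e^\alpha_{pq} \Phi^\alpha_K \rangle\, \delta_{JL} . \tag{192} $$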
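For this to be non zero, $\Phi^\alpha_I$ must be identical to $\Phi^\alpha_K$ apart from the replacement of $\phi_q$ by $\phi_p$. Suppose that in $\Phi^\alpha_I$, $\phi_p$ appears as a function of electron $i$, and in $\Phi^\alpha_K$, $\phi_q$ is correspondingly in position $j$, i.e.,

$$ \Phi^\alpha_I = \phi_{I_1}(1)\phi_{I_2}(2) \ldots \phi_{I_{i-1}}(i-1)\,\phi_p(i)\,\phi_{I_{i+1}}(i+1) \ldots \phi_{I_j}(j) \ldots \tag{193} $$

and

$$ \Phi^\alpha_K = \phi_{I_1}(1)\phi_{I_2}(2) \ldots \phi_{I_{i-1}}(i-1)\,\phi_{I_{i+1}}(i)\,\phi_{I_{i+2}}(i+1) \ldots \phi_q(j) \ldots \tag{194} $$

Then

$$ E_{pq} \Phi^\alpha_K = \phi_{I_1}(1)\phi_{I_2}(2) \ldots \phi_{I_{i-1}}(i-1)\,\phi_{I_{i+1}}(i)\,\phi_{I_{i+2}}(i+1) \ldots \phi_p(j) \ldots \tag{195} $$

This is not the same as the string $\Phi^\alpha_I$, but is related to it by a permutation of the electron labels, known as the lineup permutation $L$, which in this case is the cyclic permutation $C(i,j)$, defined through

$$ C(i,j)\, \phi_1(i)\phi_2(i+1) \ldots \phi_{j-i+1}(j) = \phi_{j-i+1}(i)\,\phi_1(i+1) \ldots \phi_{j-i}(j) ; \tag{196} $$

Thus $L\, e^\alpha_{pq} \Phi^\alpha_K = C(i,j)\, e^\alpha_{pq} \Phi^\alpha_K = \Phi^\alpha_I$. For any permutation $P$, the following is true:

$$ P \mathcal{A} = \mathcal{A} P = \epsilon_P \mathcal{A} \tag{197} $$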
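and so the matrix element (192) is

$$ \langle \Phi^\alpha_I | \mathcal{A}^\dagger \mathcal{A} | e^\alpha_{pq} \Phi^\alpha_K \rangle = \langle \Phi^\alpha_I | \mathcal{A}^\dagger L^{-1} \mathcal{A} | \Phi^\alpha_I \rangle = \epsilon_L \sqrt{N!}\, \langle \Phi^\alpha_I | \mathcal{A} | \Phi^\alpha_I \rangle \tag{198} $$

since $\mathcal{A}^\dagger \mathcal{A} = \sqrt{N!}\, \mathcal{A}$, and the only nonzero contribution to $\langle \Phi^\alpha_I | \mathcal{A} | \Phi^\alpha_I \rangle$ comes from the identity permutation. Therefore all coupling coefficients are 0 or $\pm 1$, and the sign is determined by the parity of the lineup permutation $L$. Hence the construction of $F$ in (184) proceeds as

    DO K
      DO p, q = 1, m such that Φ^α_I = L E_{pq} Φ^α_K exists
        Determine parity ε_L of lineup permutation L
        DO J
          F(K, J, pq) ← ε_L C(I, J)
        END DO
      END DO
    END DO                                              (199)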
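The "determine parity" step of (199) amounts to counting how far the new orbital has to travel to restore the ascending order of the string. A minimal sketch, assuming a string is stored as a tuple of occupied orbital numbers in ascending order (the function names are illustrative):

```python
def excite(string, p, q):
    """Apply E_pq (move an electron from orbital q to orbital p) to an ordered string.

    Returns (new_string, sign), or None when the result vanishes.
    """
    if q not in string:
        return None
    if p == q:
        return string, 1
    if p in string:
        return None
    j = string.index(q)                      # position of the electron being moved
    rest = string[:j] + string[j + 1:]
    i = sum(1 for orb in rest if orb < p)    # position where p restores ascending order
    new_string = rest[:i] + (p,) + rest[i:]
    sign = -1 if (i - j) % 2 else 1          # parity of the cyclic lineup permutation
    return new_string, sign


def single_excitations(string, m):
    """Tabulate every nonvanishing E_pq on one string for orbitals 1..m."""
    table = []
    for p in range(1, m + 1):
        for q in range(1, m + 1):
            result = excite(string, p, q)
            if result is not None:
                table.append((p, q) + result)
    return table
```

Because the table depends only on the string itself, it can be built once per $\alpha$ string and reused for every $\beta$ partner, which is what makes the inner loop over $J$ in (199) free of logic. The sign equals $(-1)^k$ with $k$ the number of occupied orbitals lying strictly between $q$ and $p$.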
The innermost loop over $J$ contains no logic or even multiplication and vectorizes perfectly on all pipeline computers. A similar loop structure is required for the contributions from $e^\beta_{pq}$, and the logic of (186) can be treated in a similar fashion. Because the number of $\alpha$, $\beta$ strings is rather small ($\approx \sqrt{M_D}$), all the necessary single excitation information can be computed once and held in high speed storage. The result is a perfectly vectorized, disk free algorithm (Refs. 81, 82), where for reasonably sized problems at least, there is practically no overhead above the cost of the matrix multiplication (185).

There have been a number of algorithmic developments which have further enhanced the efficiency and applicability of the determinant FCI method. Olsen et al. (Ref. 80) showed how it was possible to reduce the operation count to be proportional to $N^2 m^2$ rather than $m^4$, with, however, some degradation of the vector performance; their method is particularly useful when the ratio $m/N$ is relatively large. Zarrabian et al. (Ref. 83) have used an alternative resolution of the identity to (183), with an intermediate summation over $N-2$ electron (rather than $N$ electron) Slater determinants. Again, when $m/N$ is large, there are many fewer of these, allowing for considerable enhancement in efficiency.

5.3 Matrix elements between Configuration State Functions
In order to build a basis of spin-adapted CSFs, we begin by finding explicit spin functions $\Theta$, which are not dependent on space coordinates, and which satisfy $\hat S^2 \Theta = S(S+1)\Theta$. Having done this, we then attempt to build fully symmetry adapted space-spin functions. For a single electron, there are two possible spin functions $\theta(s)$, where $s$ represents the spin coordinate, namely the usual $\alpha$ and $\beta$. For $N$ electrons, the complete space of spin functions is then spanned exactly by the $N$ electron primitive spin functions, written as $[\theta_{i_1}\theta_{i_2} \ldots \theta_{i_N}]$, where the function $\theta_i$ of the spin coordinates of each electron in turn may be $\alpha$ or $\beta$. There are a total of $2^N$ such functions, and they are eigenfunctions of $\hat S_z$, the eigenvalue $M_S$ being $\frac12(N_\alpha - N_\beta)$, where $N_\alpha$ is the number of times $\alpha$ appears in the function, and $N_\beta = N - N_\alpha$; it is then convenient to group them together in sets of those functions sharing the same $M_S$, the number in each set being $\binom{N}{\frac12 N + M_S}$.

The primitive spin functions are not in general eigenfunctions of $\hat S^2$, and so we seek linear combinations which will be spin eigenfunctions. This is achieved most simply by repeated application of standard angular momentum coupling theory (Refs. 84, 85). If we have two independent physical systems in each of which we have sets of angular momentum eigenfunctions, $\{|J_1 M_1\rangle\}$ and $\{|J_2 M_2\rangle\}$, then the members of the set of all products of such wavefunctions are not in general eigenfunctions of the total angular momentum for the combined system. But for a given $J_1$, $J_2$ and feasible final quantum numbers $J$, $M$, it is possible to find exactly one composite eigenfunction

$$ |JM\rangle = \sum_{M_1 M_2} \langle J_1 J_2 M_1 M_2 | JM \rangle\, |J_1 M_1\rangle |J_2 M_2\rangle \tag{200} $$
where the number $\langle J_1 J_2 M_1 M_2 | JM \rangle$ is a standard Clebsch-Gordan coefficient. Note that all the different $M_1$ and $M_2$ components appear in the sum, but only a single $J_1$ and $J_2$ value is involved. For $N$ electron spin functions, this suggests a recursive scheme whereby $N$ electron functions are made from such a composite of an $N-1$ electron system with a further single electron. The $N-1$ electron functions arise in the same way from $N-2$ electron spin eigenfunctions, the chain being repeated down to a single particle. For each coupling, the value of $J_2$ is $\frac12$, and so the sum over $M_2$ extends over two possible values, $\pm\frac12$, i.e. a contribution involving $\alpha$ for the last electron and a contribution with $\beta$. In this genealogical construction, each $N$ electron function is fully described by its parentage, the history of the coupling scheme, which can be visualized as a path on the branching diagram shown in Figure 6. Because in the angular momentum coupling one need sum only over the $M$ and not the $S$ quantum numbers, there are in general many independent functions having the same $S$, $M_S$, but different ancestry, and we label the functions as $\Theta^N_{S,M,\lambda}$, where $\lambda$ is an index which distinguishes functions with different parentage. The number $f^N_S$ of such functions is indicated at each node on the branching diagram, and one can show inductively that

$$ f^N_S = \binom{N}{\frac12 N - S} - \binom{N}{\frac12 N - S - 1} . \tag{201} $$

It follows, again inductively, that

$$ \sum_S (2S+1)\, f^N_S = 2^N . \tag{202} $$
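Both (201) and (202) can be checked by counting paths on the branching diagram directly. In the sketch below (an illustration, with the spin again passed as the integer `two_s` $= 2S$), each added electron moves the path up or down by one half, and paths that would reach negative $S$ are discarded.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def f_branching(n, two_s):
    """Number of branching-diagram paths ending at n electrons and spin two_s / 2."""
    if two_s < 0 or two_s > n or (n - two_s) % 2:
        return 0
    if n == 0:
        return 1
    # The last electron was coupled either up (from S - 1/2) or down (from S + 1/2).
    return f_branching(n - 1, two_s - 1) + f_branching(n - 1, two_s + 1)


def f_closed(n, two_s):
    """Closed form of eq. (201)."""
    if two_s < 0 or two_s > n or (n - two_s) % 2:
        return 0
    k = (n - two_s) // 2                      # N/2 - S
    return comb(n, k) - (comb(n, k - 1) if k >= 1 else 0)


def total_spin_functions(n):
    """Left-hand side of eq. (202): sum over S of (2S + 1) f(N, S)."""
    return sum((two_s + 1) * f_branching(n, two_s) for two_s in range(n + 1))
```

For four electrons this gives $f = 2, 3, 1$ for $S = 0, 1, 2$, and $1 \cdot 2 + 3 \cdot 3 + 5 \cdot 1 = 16 = 2^4$.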
It is straightforward to show (Ref. 86) that the genealogical functions are orthonormal. For each path on the diagram, there are $(2S+1)$ functions, corresponding to the possible different $M_S$ values, and so $\sum_S (2S+1) f^N_S$ represents the total number of independent $N$ electron branching diagram functions; this is the same as the number of primitive spin functions, and so we have a complete set of spin functions.

Figure 6. The branching diagram. [Diagram not reproduced: nodes on a grid of spin $S$ against number of electrons $N$, each labelled with the number $f^N_S$ of spin functions.]

Because eventually we need to consider the effect of the antisymmetrizing operator $\mathcal{A}$, it is important to develop the permutation properties of the spin functions. Since $\hat S^2$ is totally symmetric in the particle labels, it commutes with any permutation, $P \hat S^2 = \hat S^2 P$. Then it follows that, since $\hat S^2 \Theta^N_{S,M,\lambda} = S(S+1)\Theta^N_{S,M,\lambda}$,

$$ \hat S^2 P\, \Theta^N_{S,M,\lambda} = P \hat S^2\, \Theta^N_{S,M,\lambda} \tag{203} $$

$$ = S(S+1)\, P\, \Theta^N_{S,M,\lambda} , \tag{204} $$
i.e. $P\,\Theta^N_{S,M,\lambda}$ is a spin eigenfunction with quantum numbers $S$, $M$. Since $\{\Theta^N_{S,M,\lambda}, \lambda = 1, 2, \ldots, f^N_S\}$ is a complete set of such functions, then $P\,\Theta^N_{S,M,\lambda}$ must be a linear combination of these:

$$ P\, \Theta^N_{S,M,\lambda} = \sum_\mu \Theta^N_{S,M,\mu}\, U_{\mu\lambda}(P) , \tag{205} $$

i.e., $\{\Theta^N_{S,M,\lambda}, \lambda = 1, 2, \ldots, f^N_S\}$ is a basis for a representation of the symmetric group $S_N$. The representation is actually isomorphic with particular cases of Young's Orthogonal Representation, which is generated (also genealogically) using ideas from the theory of $S_N$. Young's orthogonal representation is often depicted graphically. A given representation is drawn as a Young diagram, consisting of $N$ adjoining square boxes with rows numbered numerically downwards, and columns rightwards; there may not be more rows in column $i$ than in column $i-1$, nor columns in row $j$ than in row $j-1$. For example, in $S_4$, the possible Young diagrams, written here by their row lengths, are

$$ [4], \quad [3,1], \quad [2,2], \quad [2,1,1], \quad [1,1,1,1] . \tag{206} $$

For the case of the spin $\frac12$ particles which are our exclusive concern, only those representations whose Young diagram has at most two rows are relevant, and they correspond to spin quantum numbers $S$ equal to half the difference between the number of boxes in the two rows. Thus for $S_4$, the diagrams $[2,2]$, $[3,1]$ and $[4]$ represent, respectively, the sets of spin functions with $S = 0, 1, 2$. Within each representation, a given basis function is depicted as a Young tableau, which is an arrangement of the numbers $1, 2, \ldots, N$ in the Young diagram, such that numbers always increase along all rows and down all columns. For the two-row Young frames which we consider, the number of such tableaux (i.e., the dimension of the representation) is exactly $f^N_S$, and in fact there is a one-to-one correspondence between the branching diagram functions and the tableaux; when a particle number appears in the first row, its spin is coupled up, and for those in the second row, the spin is coupled down. For the case of four electrons, the complete set of Young tableaux and corresponding branching diagram functions are shown in Figure 7.

The representation matrices $\mathbf U(P)$ constitute all the information which we require for developing properties of the branching diagram functions; for example, the branching diagram functions themselves can be generated from a primitive spin function by use of a suitable projection operator. Formulae for the $\mathbf U(P)$ for any permutation $P$ are straightforward to derive from simple rules given in terms of the Young tableaux (Ref. 86), or, equivalently, from consideration of the Clebsch-Gordan coefficients (Ref. 86).

Having obtained the representation matrices, we are now in a position to use them in constructing a basis of space and spin functions which are spin eigenfunctions and satisfy the Pauli principle. We write members of this basis as

$$ \Psi_{A\lambda} = \mathcal{A}\, \Phi_A\, \Theta^N_{S,M,\lambda} \tag{207} $$
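where the spatial function $\Phi_A$ is usually an ordered product of spatial orbitals, and $\Theta^N_{S,M,\lambda}$ is a branching diagram function. Note that the antisymmetrizer involves a sum over all permutations $P$, and each $P$ permutes both the space and the spin coordinate labels. Inserting the definition of the antisymmetrizer,

$$ \begin{aligned} \Psi_{A\lambda} &= \frac{1}{\sqrt{N!}} \sum_P \epsilon_P\, P_{\rm space}\Phi_A\; P_{\rm spin}\Theta^N_{S,M,\lambda} \\ &= \frac{1}{\sqrt{N!}} \sum_\mu^f \sum_P \epsilon_P\, P_{\rm space}\Phi_A\; \Theta^N_{S,M,\mu}\, U_{\mu\lambda}(P) \\ &= \frac{1}{\sqrt f} \sum_\mu^f \Phi_{A,\mu\lambda}\, \Theta^N_{S,M,\mu} , \end{aligned} \tag{208} $$

where we define a set of spatial functions

$$ \Phi_{A,\mu\lambda} = \sqrt{\frac{f}{N!}} \sum_P \epsilon_P\, U_{\mu\lambda}(P)\, P\, \Phi_A . \tag{209} $$

This has the appearance of a projection operator on $\Phi_A$ for a representation with matrices $\mathbf V(P) = \epsilon_P \mathbf U(P)$. This is the conjugate representation to that supported by the spin functions, and appears in the Young theory as the reversal of the roles of rows and columns, e.g.,

[Young diagrams not reproduced] (spin) $\leftrightarrow$ (space). (210)

Figure 7. Branching diagram symbols and Young tableaux for 4 electrons. [Figure not reproduced. Tableaux, written row by row: $N = 4$, $S = 0$, $f^4_0 = 2$: (1 2 / 3 4), (1 3 / 2 4); $N = 4$, $S = 1$, $f^4_1 = 3$: (1 2 3 / 4), (1 2 4 / 3), (1 3 4 / 2); $N = 4$, $S = 2$, $f^4_2 = 1$: (1 2 3 4).]

Note that all $\Theta^N_{S,M,\mu}$, $\mu = 1, 2, \ldots, f$ are involved in each of the space-spin functions $\Psi_{A\lambda}$. For the coupling coefficients

$$ d^{A\lambda,B\mu}_{pq} = \langle \Psi_{A\lambda} | E_{pq} | \Psi_{B\mu} \rangle , \tag{211} $$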
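as with determinants, a nonzero contribution will arise only if $\Phi_A$ and $\Phi_B$ differ by the orbital excitation $\phi_q \to \phi_p$. Ignoring any complications which arise from doubly occupied orbitals, we must again have $\Phi_A = L E_{pq} \Phi_B$, where $L$ is the appropriate lineup permutation. Inserting (208) into (211) we obtain

$$ \begin{aligned} d^{A\lambda,B\mu}_{pq} &= \frac{1}{N!} \sum_{PQ} \epsilon_P \epsilon_Q \langle P\Phi_A | Q L^{-1} \Phi_A \rangle \sum_{\rho\sigma} U_{\rho\lambda}(P)\, U_{\sigma\mu}(Q)\, \langle \Theta^N_{S,M,\rho} | \Theta^N_{S,M,\sigma} \rangle \\ &= \frac{1}{N!} \sum_{PQ} \epsilon_P \epsilon_Q \langle P\Phi_A | Q L^{-1} \Phi_A \rangle \sum_{\rho} U_{\rho\lambda}(P)\, U_{\rho\mu}(Q) \qquad \text{(since the spin functions are orthonormal)} \\ &= \frac{1}{N!} \sum_{PQ} \epsilon_P \epsilon_Q \langle P\Phi_A | Q L^{-1} \Phi_A \rangle\, U_{\lambda\mu}(P^{-1}Q) , \end{aligned} \tag{212} $$

using the representation property of $\mathbf U(P)$. Orbital orthogonality then gives the requirement that $P = QL^{-1}$, and so

$$ d^{A\lambda,B\mu}_{pq} = \frac{1}{N!} \sum_Q \epsilon_Q^2\, \epsilon_L\, U_{\lambda\mu}(L Q^{-1} Q) = \epsilon_L\, U_{\lambda\mu}(L) . \tag{213} $$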
Thus knowledge of the lineup permutation and the representation matrix elements is sucient to generate any desired oneparticle coupling coecient. The above is based on the assumption that p and q are singly occupied in A , B respectively. When one or both orbitals are doubly occupied, further considerations are necessary. Firstly, many of the spin functions give rise to vanishing A because of the operation of the Pauli principle acting through the antisymmetrizer. If the orbitals are ordered such that the doubly occupied appear rst in their respective pairs, then only those spin functions which couple each pair to singlet are allowed. This of course gives a drastic reduction in the number of possible spin N functions, since it is now fS with N referring to the number of singly occupied orbitals only. Following this, there are slight complications to the above scheme for the coupling coecients; there appear four distinct cases depending on the excited orbital occupancies, of which (213) is one. How are the relevant representation matrices obtained? Equation (213) shows that one needs all of the representation matrices for all possible cyclic permutations. These matrices can be generated by writing the cycle as a sequence of elementary transpositions, C(i, j) = C(j 1, j)C(j 2, j 1) . . . C(i + 1, i + 2)C(i, i + 1) ; (214)
the representation matrices for these transpositions are very sparse, and can be obtained from the shapes of the Young tableaux$^{86}$. The matrix for the cycle is then obtained by matrix multiplication. Unfortunately, this algorithm is too slow for practical use, and it is much better to precompute and store all of the necessary matrices. The number of matrices that must be stored can be reduced considerably by using a resolution of the identity analogous to that used to factorize two-body matrix elements into sums of products of one-body elements (equation (183)). If we introduce a (fictitious) additional orbital $\phi_a$ which is defined to occur lexically always after any other orbital, the following identity holds:
$$ \hat E_{pq} = \hat E_{pa} \hat E_{aq} . \tag{215} $$
This allows one to make use of just those cycles involving the last electron, since the orbital $\phi_a$ will always be occupied by only this electron in the ordered orbital product string. This is the basis of an efficient algorithm for matrix element evaluation that is fast enough for general use in full and other CI computations$^{87}$.

5.4 Molecular Dissociation and the MCSCF method
As discussed in section 1.6, in many situations electron correlation effects are purely of the dynamic type, in the sense that Hartree-Fock is a good zero-order approximation, and under such circumstances, single-reference methods provide an efficient and accurate way of obtaining correlation energies and correlated wavefunctions. However, wherever bonds are being broken, and for many excited states, the Hartree-Fock determinant does not dominate the wavefunction, and may sometimes be just one of a number of important electronic configurations. If this is the case, single-reference methods, which often depend formally on perturbation arguments for their validity, are inappropriate, and one must seek from the outset to have a first description of the system that is better than Hartree-Fock. Only then can one go on to attempt to recover the remaining dynamic correlation effects. As in H$_2$, we can build a general qualitatively correct wavefunction by selecting a number of configurations which are meant to describe all possible dissociation pathways, etc., and then writing the wavefunction as a linear CI expansion
$$ \Psi = \sum_{I}^{M} c_I \Phi_I . \tag{216} $$
The energy is then minimized with respect to not only the $c_I$ (as in the CI method), but also to changes in the common set of orbitals $\phi_t$ which are used to construct the $\Phi_I$. This orbital optimization is analogous to what is done in the SCF method, hence the name multiconfiguration self consistent field (MCSCF), which is given to this approach. Provided all the necessary configurations are included in the set $\{\Phi_I\}$, the method should give a qualitatively correct description of the electronic structure. Nearly all molecules dissociate to valence states of their constituent atoms, in which only the valence orbitals (e.g., 2s, 2p in carbon) are occupied. So, ignoring the complications which might occur for Rydberg molecular states, a good description can be obtained by including $\Phi_I$ which have only valence orbitals of the molecule occupied. This has important computational consequences, and we distinguish in a calculation the relatively small number of internal (or valence) orbitals $\phi_t, \phi_u, \phi_v, \ldots$ from the usually much larger number of external orbitals $\phi_a, \phi_b, \ldots$,
which are unoccupied in all configurations, and so actually are not part of the wavefunction. We continue to use the notation $\phi_p, \phi_q, \phi_r, \ldots$ to denote general molecular orbitals from any set. The internal and external orbitals take the roles of the occupied and virtual orbitals in an SCF calculation; as the calculation proceeds, the internal and external orbitals are mixed amongst each other until the optimum internal orbitals are found.

Taking these ideas to the extreme suggests the use of a CI expansion consisting of all possible configurations in the valence space, i.e., a FCI type of wavefunction. This approach$^{88,89,90}$ is often termed complete active space SCF (CASSCF) and has the feature that it is to some extent a black box; the sometimes rather difficult problem of selecting suitable configurations $\Phi_I$ is replaced by the simpler identification of important orbitals. If the active orbital space coincides with the true valence space, then correct dissociation at all limits is automatically guaranteed, although there may be many configurations included which are completely unimportant.

As a simple example, consider the ground state of N$_2$. The quartet spin N atom ground state is described by the configuration $2p_x 2p_y 2p_z$. On bringing two N atoms together, one can make 20 CSFs with the correct spin (singlet) and space ($A_g$ in $D_{2h}$) symmetries, of which one is dominant near equilibrium bond length, but all of which are important at dissociation. The CASSCF wavefunction, a FCI expansion of 6 electrons in 6 orbitals, contains 32 CSFs. Although the ansatz may be wasteful in this way, we note that a complete CI expansion enables the use of special efficient techniques$^{91}$, so a CASSCF calculation may actually be easier than a smaller, more general MCSCF calculation with the same internal orbital space.

5.5 Determination of MCSCF wavefunctions
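We have considered earlier how the matrix elements $H_{IJ} = \langle \Phi_I | \hat H | \Phi_J \rangle$ are obtained in terms of one and two electron integrals $h_{tu}$, $(tu|vw)$ and coupling coefficients $d^{IJ}_{tu}$, $D^{IJ}_{tuvw}$:
$$ \langle \Phi_I | \hat H | \Phi_J \rangle = \sum_{tu} d^{IJ}_{tu} h_{tu} + \frac{1}{2} \sum_{tuvw} D^{IJ}_{tuvw} (tu|vw) . \tag{217} $$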
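Thus the expression for the energy is
$$ \begin{aligned}
E &= \Big\langle \sum_I c_I \Phi_I \Big| \hat H \Big| \sum_J c_J \Phi_J \Big\rangle \\
&= \sum_{IJ} \sum_{tu} c_I c_J d^{IJ}_{tu} h_{tu} + \frac{1}{2} \sum_{IJ} \sum_{tuvw} c_I c_J D^{IJ}_{tuvw} (tu|vw) \\
&= \sum_{tu} d_{tu} h_{tu} + \frac{1}{2} \sum_{tuvw} D_{tuvw} (tu|vw) ,
\end{aligned} \tag{218} $$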
where we see the introduction of the one and two electron density matrices $d_{tu}$, $D_{tuvw}$, which in this context can be viewed as expectation values of the coupling coefficients. This energy expression is the quantity which must be made stationary with respect to changes in the CI coefficients $c_I$ and the orbitals $\phi_t$, subject to the constraints
$$ \sum_I c_I^2 = 1 \qquad \text{(normalization)} \tag{219} $$
$$ \langle \phi_t | \phi_u \rangle = \delta_{tu} \qquad \text{(orbital orthogonality)} . \tag{220} $$
For the CI coefficients, introducing a Lagrange multiplier $E$ for the first constraint, and setting the differential with respect to $c_I$ to zero, gives the stationary conditions
$$ \sum_J \langle \Phi_I | \hat H | \Phi_J \rangle c_J - E c_I = 0 , \tag{221} $$
i.e., the usual matrix eigenvalue equations obtained in regular CI theory. For the orbitals, the most straightforward approach is to parametrize orthogonal rotations $U$ amongst the orbitals ($\phi_t \leftarrow \sum_p \phi_p u_{pt}$) by means of the matrix elements $R_{tu}$ of an antisymmetric matrix. Any orthogonal matrix with unit determinant may be represented as
$$ U = \exp(R) \qquad \text{where } R = -R^\dagger . \tag{222} $$
The advantage of this formulation is that the $\frac{1}{2} m(m+1)$ orthogonality constraints are automatically satisfied, leaving $\frac{1}{2} m(m-1)$ free parameters which are contained in the lower triangle of $R$. There is then no need for Lagrange multipliers, and numerical methods for unconstrained optimization may be used. To derive the variational conditions for orbital rotations, we note that the orbitals vary with $R$ through (222) as
$$ \left( \frac{\partial \phi_p}{\partial R_{rs}} \right)_{R=0} = \delta_{sp} \phi_r - \delta_{rp} \phi_s , \tag{223} $$
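and that the integrals $h_{tu}$, $(tu|vw)$ given by (35), (36) are quadratic and quartic, respectively, in the orbitals. Then we obtain
$$ \left( \frac{\partial h_{tu}}{\partial R_{rs}} \right)_{R=0} = (1 - \tau_{rs})(1 + \tau_{tu})\, \delta_{st} h_{ru} \tag{224} $$
$$ \left( \frac{\partial (tu|vw)}{\partial R_{rs}} \right)_{R=0} = (1 - \tau_{rs})(1 + \tau_{tu})(1 + \tau_{tu,vw})\, \delta_{st} (ru|vw) , \tag{225} $$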
where the operator $\tau_{ij}$ permutes the labels $i$, $j$ in what follows it. Thus the derivative of the energy, which is zero for the converged wavefunction, is given by
$$ 0 = \frac{\partial E}{\partial R_{rs}} = 2 (1 - \tau_{rs}) F_{rs} , \tag{226} $$
with
$$ F_{rs} = \sum_u d_{su} h_{ru} + \sum_{uvw} D_{suvw} (ru|vw) . \tag{227} $$
Equations (221) and (226) must be solved to obtain the MCSCF wavefunction. Note that for some orbital rotations $R_{rs}$, the variational condition (226) is always obeyed automatically; for example, if both $r$, $s$ are external, then the density matrix elements are all zero. The same can occur in a more subtle way for certain internal-internal orbital rotations, e.g., for a CASSCF, all internal-internal rotations show this behaviour. When an $R_{rs}$ behaves like this it is known as a redundant variable, and is best removed from the optimization altogether$^{92}$. Note also that (226) is highly nonlinear, in contrast to the linear eigenvalue problem which appears in the CI method; $E$ is 4th order in the orbitals, and infinite order in $R$, since the orbitals are in fact periodic functions of $R$ because of the orthogonality constraint.

In order to solve numerically the variational equations (221) and (226), the standard approach is to use some kind of quasi-Newton method$^{93,94}$ that utilizes the gradients of the energy expression to construct a Taylor series for the energy in powers of the parameters that express changes in the wavefunction. Truncation of this power series gives an approximate energy expression that is accurate for small displacements, and which is easier to minimize than the full energy expression. For a given approximate solution, we construct the gradient vector
$$ g_\lambda = \left( \frac{\partial E}{\partial p_\lambda} \right)_{p=0} \tag{228} $$
and hessian matrix
$$ h_{\lambda\mu} = \left( \frac{\partial^2 E}{\partial p_\lambda \partial p_\mu} \right)_{p=0} \tag{229} $$
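where the set of parameters $\{p_\lambda\}$ contains the changes in CI coefficients $\{\Delta c_I\}$ and the non-redundant orbital change generators $\{R_{rs}\}$. The approximate energy expression
$$ E_2(p) = E_2(0) + \sum_\lambda g_\lambda p_\lambda + \frac{1}{2} \sum_{\lambda\mu} h_{\lambda\mu} p_\lambda p_\mu \tag{230} $$
is then minimized by solving the linear equations
$$ 0 = g_\lambda + \sum_\mu h_{\lambda\mu} p_\mu . \tag{231} $$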
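The macroiteration loop built on (228)-(231) can be illustrated on a toy problem. The sketch below is only an illustration under stated assumptions: the two-parameter model energy E = cosh(u) + v^2 + uv/2 with u = p0 - 1 and v = p1 + 0.5 is hypothetical (chosen because, like the true energy, it is of infinite order in the parameters), and the function names are not taken from any program discussed in these notes.

```python
import math

def gradient_and_hessian(p):
    """Analytic gradient g and hessian h of a hypothetical model energy
    E = cosh(u) + v**2 + 0.5*u*v, with u = p[0] - 1 and v = p[1] + 0.5."""
    u, v = p[0] - 1.0, p[1] + 0.5
    g = [math.sinh(u) + 0.5 * v, 2.0 * v + 0.5 * u]
    h = [[math.cosh(u), 0.5], [0.5, 2.0]]
    return g, h

def newton_step(g, h):
    """Solve the 2x2 linear equations 0 = g + h p for the step p (Cramer's rule)."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    p0 = (-g[0] * h[1][1] + h[0][1] * g[1]) / det
    p1 = (-h[0][0] * g[1] + h[1][0] * g[0]) / det
    return [p0, p1]

def optimize(start, tol=1e-12, max_macro=20):
    """Repeat: build g and h at the current point, solve for the step, apply it."""
    p = list(start)
    for iteration in range(1, max_macro + 1):
        g, h = gradient_and_hessian(p)
        if math.hypot(g[0], g[1]) < tol:
            return p, iteration - 1
        step = newton_step(g, h)
        p = [p[0] + step[0], p[1] + step[1]]
    return p, max_macro

if __name__ == "__main__":
    solution, n_macro = optimize([0.0, 0.0])
    print(solution, n_macro)
```

In a real MCSCF program the linear equations are far too large for a direct solve, so the call to `newton_step` would itself be replaced by an iterative solver.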
The solution $p$ defines a step that is applied to the wavefunction to improve it. Thus the overall procedure is iterative, each iteration consisting of the construction of the energy, gradient and hessian, followed by solution of the linear Newton-Raphson equations. The Newton-Raphson equations can be very large in dimension, particularly for a large CASSCF full CI expansion; therefore, usually, they have to be solved iteratively as well, using relaxation or expansion vector techniques$^{95}$ similar to the Davidson diagonalization algorithm$^{3}$. These iterations are usually referred to as microiterations to distinguish them from the enclosing macroiterations, in each of which a new expansion point is defined.

The generic Newton-Raphson algorithm suffers in this context from two distinct problems associated with robustness and efficiency. First of all, the second-order expansion (230) is valid only for small displacements $p$, and it is often the case that the predicted step length is outside the trust region of the truncated Taylor series. Modifications that restrict the step length$^{96}$, or recast the linear equation system as an eigenvalue problem such that the step length is automatically restricted (augmented hessian method$^{97}$), are helpful in improving global convergence. Secondly, however, even with such methods, as many as 20 macroiterations may be required, and each macroiteration is expensive. For each new set of orbitals, in order to construct the gradient and hessian, a subset of the molecular-orbital electron-repulsion
integrals must be constructed, specifically those with up to two external indices ($J^{tu}$, $K^{tu}$), by a computationally demanding transformation of the atomic-orbital integrals, which themselves have to be read from disk or computed on the fly. It is therefore highly desirable to reduce the number of macroiterations. Both problems are solved by adopting an ansatz$^{98,91}$ in which the microiterations involve optimization of an approximate energy functional that is second order in the orbital changes themselves, $T = U - 1$, rather than in the generators $R$. This energy functional is periodic in the orbitals, just like the true energy, and its use gives an algorithm that is much more robust; in fact, in almost all cases, quadratic convergence is seen from the outset, and typically only three macroiterations are needed. Of course, there is the additional complication that the microiterations are solving non-linear rather than linear equations, but these can be effectively addressed using convergence accelerators such as DIIS$^{99}$.

5.6 Multireference Perturbation Theory
In order to go beyond a qualitatively correct MCSCF wavefunction $\Psi_{REF}$ and recover as much of the correlation energy as possible, as in the single-reference case, we begin by writing the exact wavefunction in a perturbation series
$$ \Psi_{Exact} = \Psi_{REF} + \lambda \Psi^{(1)} + \lambda^2 \Psi^{(2)} + \ldots , \tag{232} $$
where $\lambda$ is an ordering parameter which will eventually be set to 1. Suppose that we can find an operator $\hat H^{(0)}$ such that $\hat H^{(0)} \Psi_{REF} = E^{(0)} \Psi_{REF}$. In the particular case where $\Psi_{REF}$ is the solution of the SCF equations, an appropriate $\hat H^{(0)}$ is the many-electron Fock operator,
$$ \hat H^{(0)} = \sum_{i}^{N} \hat f(i) = \sum_{tu}^{m} f_{tu} \hat E_{tu} , \tag{233} $$
where $\hat f$ is the orbital Fock operator; in other cases it may or may not be possible to find a suitable operator, but the arguments we develop still hold. If we write $\hat H = \hat H^{(0)} + \lambda \hat H^{(1)}$, and separate terms of different order in $\lambda$ in the Schrödinger equation, at first order we obtain
$$ \left( \hat H^{(0)} - E^{(0)} \right) \Psi^{(1)} + \hat H^{(1)} \Psi_{REF} - \langle \Psi_{REF} | \hat H^{(1)} | \Psi_{REF} \rangle \Psi_{REF} = 0 . \tag{234} $$
We expand $\Psi^{(1)}$, the first order correction to the wavefunction, and also $\hat H \Psi_{REF}$, the action of the full hamiltonian on the approximate wavefunction, as linear combinations of $N$-electron configurations in the full space,
$$ \Psi^{(1)} = \sum_I \Phi_I c_I^{(1)} , \qquad \hat H \Psi_{REF} = \sum_I \Phi_I h_I , \tag{235} $$
and assume (although again this is not critical) that $\hat H^{(0)} \Phi_I = E_I \Phi_I$. This will be true for the Fock $\hat H^{(0)}$ ($E_I$ is then the sum of the Fock eigenvalues for the orbitals occupied in $\Phi_I$), and approximately true for others. The first order equation then becomes
$$ \sum_I c_I^{(1)} \Phi_I \left( E_I - E^{(0)} \right) = - \sum_I h_I \Phi_I + \langle \Psi_{REF} | \hat H | \Psi_{REF} \rangle \Psi_{REF} . \tag{236} $$
This tells us that the basis functions which are required for $\Psi^{(1)}$ are exactly those which appear in the action of $\hat H$ on $\Psi_{REF}$. This set of functions is the first order interacting space. Recall that the hamiltonian consists of single and double excitation operators; this means that in turn the first order space consists of all those configurations which are at most doubly excited with respect to the reference function $\Psi_{REF}$. In the language of second quantization, the first-order space consists of all the non-null configurations $\{\hat E_{tu,vw} \Psi_{REF}\}$. These arguments can be generalized to higher orders of perturbation theory; at second order, configurations related to the first-order wavefunction by up to double excitations will be introduced, and so the second-order interacting space consists of configurations which are singly, doubly, triply and quadruply excited relative to $\Psi_{REF}$.

One route to carry these ideas forward is to simply apply regular Rayleigh-Schrödinger perturbation theory to obtain the perturbation series for the energy. With the choice of Fock $\hat H^{(0)}$, this is the single-reference Møller-Plesset theory$^{100,101}$ (MP) or Many-Body Perturbation Theory (MBPT)$^{102}$. For multiconfigurational $\Psi_{REF}$, the choice of zero order hamiltonian is not so obviously unique, but a number of different variants have been very successfully used$^{103,104,105,106,107}$. These are generally non-diagonal in the configuration basis, and so solution of the first-order equations must be carried out iteratively; in contrast, for a Hartree-Fock reference with canonical molecular orbitals, each Slater determinant is an eigenfunction of $\hat H^{(0)}$, and so the first-order equations have an explicit analytic solution.
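For the diagonal case just described, the explicit solution can be written out. The following is a sketch under stated assumptions rather than a formula quoted from these notes: the $\Phi_I$ are taken to be orthonormal, orthogonal to $\Psi_{REF}$, and real, so that $h_I = \langle \Phi_I | \hat H | \Psi_{REF} \rangle$. Projecting (236) onto each $\Phi_I$ then gives the familiar Rayleigh-Schrödinger first-order coefficients and second-order energy:

```latex
c_I^{(1)} = -\frac{h_I}{E_I - E^{(0)}} ,
\qquad
E^{(2)} = \langle \Psi_{REF} | \hat H^{(1)} | \Psi^{(1)} \rangle
        = \sum_I h_I \, c_I^{(1)}
        = -\sum_I \frac{h_I^2}{E_I - E^{(0)}} .
```

When $\hat H^{(0)}$ is not diagonal in the configuration basis, no such closed form exists, which is why the multireference variants require an iterative solution.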
Multireference perturbation theory at second order (MRPT2 or CASPT2) is now well established as a robust and reliable technique, particularly, for example, in the computation of electronic excitation energies$^{106}$, and is computationally feasible in almost all cases where the underlying MCSCF or CASSCF calculation is possible. Third-order perturbation theory$^{103,108}$ can also be carried out for smaller systems, and the results show significant differences from second order, indicating the need for caution in the use of CASPT2.

5.7 Multireference Configuration Interaction
Although perturbation theory may be a dangerous tool to rely on, the interacting space hierarchy concept provides useful insight on how to design other methods. If we consider doing a variational CI calculation, we now know that, even though FCI may be impossible, we expect to obtain most of the correlation energy using a basis consisting of the first-order interacting space. In the case of an RHF reference wavefunction $\Psi_{REF}$ this is the singles and doubles (CISD) method, with the basis consisting of all Slater determinants which are related to $\Psi_{REF}$ by a single or double spin-orbital excitation. Strictly speaking, for RHF $\Psi_{REF}$, singles do not formally enter until second order perturbation theory, but in practice their effect can be quite
significant, and there are fewer of them than doubles, and so they are invariably included as well. The same kind of approach can be taken for an MCSCF $\Psi_{REF}$. The first-order space is certainly spanned by a wavefunction of the form
$$ \Psi = \sum_I c^I \Phi_I + \sum_{S} \sum_{a} c^S_a \Phi^a_S + \sum_{P} \sum_{ab} C^P_{ab} \Phi^{ab}_P , \tag{237} $$
where the three types of configuration $\Phi_I$, $\Phi^a_S$, $\Phi^{ab}_P$ contain respectively 0, 1, 2 occupied external orbitals, and the set of configurations is the union of the sets of CSFs obtained by making all possible single and double excitations on each reference configuration in turn. For the case that $\Psi_{REF}$ consists of a single closed shell configuration, (237) is the single-reference CISD wavefunction; when $\Psi_{REF}$ contains more than one configuration, variational treatment of (237) is usually referred to as multireference CI (MRCI)$^{109,110,111,112}$.

Since there are usually many more external orbitals than internal orbitals, the doubly external configurations $\Phi^{ab}_P$ are expected to be by far the most numerous, just as in the single-reference case, and we focus attention on these in considering what work has to be done in evaluating hamiltonian interactions. In the general multi-reference case, it is not possible to arrive at explicit matrix-oriented expressions for the hamiltonian matrix elements. However, some simplification beyond the general CI matrix element strategy presented in section 5.3 is certainly possible; just as in the single-reference case, there is special structure associated with the pairs of external orbitals $\phi_a$, $\phi_b$. In the formation of CSFs $\Phi^{ab}_P$, it is advantageous to take the occupied orbital string which is inserted into equation (207) such that the orbitals $\phi_a$ and $\phi_b$ appear as functions of the coordinates of electrons 1 and 2 respectively; this means that the function is pure singlet or triplet coupled in the two external orbitals, exactly as in the single-reference case, and allows for some simplification in matrix element evaluation.
The structure of the wavefunction in the external orbitals is then no more complicated than in the single-reference problem, and so closed formulae for those parts involving external orbitals are obtainable; for example, the contribution from all-external integrals has exactly the same form as in single-reference CISD, and can be obtained efficiently by computing the external exchange matrices for each pair $P$. However, for the internal orbitals, the CSFs are completely general in character, and ultimately one must compute one and two particle coupling coefficients using the general techniques of section 5.3. For example, that part of the hamiltonian containing the Coulomb integrals, $\sum_{tuab} J^{tu}_{ab} \hat E_{ab} \hat E_{tu}$, gives rise to matrix elements
$$ \langle \Phi^{ab}_P | \hat H | \Phi^{cd}_Q \rangle = \frac{1}{2} \delta_{pq} \sum_{tu} \alpha_{tu}(P,Q)\, (1 + p\,\tau_{ab})(1 + q\,\tau_{cd})\, \delta_{bd} J^{tu}_{ac} , \tag{238} $$
where $p = \pm 1$ according to whether $\Phi^{ab}_P$ is singlet or triplet coupled in the external space. $\alpha_{tu}(P,Q)$ is simply a one particle coupling coefficient for the operator $\hat E_{tu}$ between the functions $\Phi^{ab}_P$ and $\Phi^{ab}_Q$,
$$ \alpha_{tu}(P,Q) = \langle \Phi^{ab}_P | \hat E_{tu} | \Phi^{ab}_Q \rangle . \tag{239} $$
Although coupling coefficient evaluation is required, all the coupling coefficients are completely independent of the external orbital labels; thus many hamiltonian matrix elements share the same coupling coefficients in a regular manner. Discovery of this property$^{113,114}$ first opened the way for large scale MRCI calculations.

Although the coupling coefficient evaluation problem is dramatically reduced by exploiting these special properties, the MRCI method is still severely restricted by computational difficulties. For even quite modest numbers of reference configurations, the number of pair functions $\Phi^{ab}_P$ can be rather large; this means that the dimension of the hamiltonian matrix can easily exceed the length of vector which can be stored on the computer, and, more importantly, the number of matrix elements which must be evaluated becomes completely unmanageable. Nevertheless, benchmark calculations, in which MRCI results are compared with those from full CI in the same basis, indicate that MRCI is the ab initio method of choice for all circumstances in which single determinant descriptions do not work, and that very high accuracy may be obtained$^{115,116}$.

An alternative formulation which avoids the rapid increase in basis size with the number of reference configurations is possible$^{113}$. Instead of selecting singly and doubly excited CSFs from each reference configuration, we can construct configurations by applying excitation operators to the reference wavefunction as a single entity:
$$ \Psi = \sum_{tuvw} C^{tuvw} \hat E_{tu,vw} \Psi_{REF} + \sum_{tuva} C^{tuv}_a \hat E_{at,uv} \Psi_{REF} + \frac{1}{2} \sum_{p} \sum_{tu} \sum_{ab} C^{tup}_{ab} \left( \hat E_{at,bu} + p \hat E_{au,bt} \right) \Psi_{REF} . \tag{240} $$
This is the internally contracted MRCI (ICMRCI)$^{113,117,118}$ wavefunction, and it is obvious that the number of configurations is now independent of the number of reference functions, depending only on the numbers of internal and external orbitals. In this way, the size of the CI expansion is reduced typically by one or two orders of magnitude; the configuration set, however, still spans the first order interacting space, and although ICMRCI can be considered as only an approximation to MRCI, benchmark calculations show that in most cases the extra error introduced by the contraction is several times smaller than the error of MRCI relative to full CI$^{118}$. The price that is paid is that the configurations are now much more complicated, being in fact linear contractions of CSFs according to the values of the reference coefficients. This means that coupling coefficient evaluation is now a formidable problem; the simple CSF coupling coefficients are replaced by reduced density matrices of high order. For example, for the Coulomb integrals considered previously, the coupling coefficients are
$$ \alpha_{tu}(vwp, xyq) = \delta_{pq} (1 + p\,\tau_{xy}) \langle \Psi_{REF} | \hat E_{vx,wy,tu} | \Psi_{REF} \rangle . \tag{241} $$
This third-order density matrix is evaluated using the general resolution-of-identity techniques used in the full CI problem, i.e.,
$$ \langle \Psi_{REF} | \hat E_{vx,wy,tu} | \Psi_{REF} \rangle = \sum_K \langle \Psi_{REF} | \hat E_{vx,wy} | \Phi_K \rangle \langle \Phi_K | \hat E_{tu} | \Psi_{REF} \rangle + \text{lower order terms} \tag{242} $$
where the $\{\Phi_K\}$ are appropriate CSFs. For a given bra ($vw$) and ket ($xy$), all the matrix elements $\langle \Psi_{REF} | \hat E_{vx,wy} | \Phi_K \rangle$ are found by successively applying the operators $\hat E_{av}$, $\hat E_{xa}$, $\hat E_{aw}$, $\hat E_{ya}$ ($\phi_a$ is a fictitious unoccupied orbital) to $\Psi_{REF}$. For processing a given Coulomb matrix $J^{tu}$, these matrix elements are combined with precomputed $\langle \Phi_K | \hat E_{tu} | \Psi_{REF} \rangle$.

An additional complication in ICMRCI is that the configurations are non-orthogonal in a nontrivial way, and their orthogonalization can be a computational bottleneck$^{117}$. For this reason, the standard approach to ICMRCI is a hybrid that combines the best features of uncontracted and contracted wavefunctions$^{118}$; contraction is carried out only where it is easiest, and of most benefit, namely for the doubly external configurations, and the all-internal and singly-external configurations are left uncontracted.

An unfortunate feature of an MRCI calculation is that, just as in the single-reference CISD case, the energy is not an extensive function of the number of electrons as it should be. This undesirable feature of any truncated variational CI calculation can to some extent be avoided in MRCI by error cancellation across a potential energy surface; provided, for example, dissociation asymptotes are computed as supermolecules rather than by adding fragment energies, reasonable results can be obtained for dissociation energies. It is also true that the size-consistency errors for MRCI are usually much less than for single-reference CISD, since MRCI already contains some of the important quadruple configurations. However, the effects can never be completely avoided. One way to view the lack of size-consistency in variational CI is by considering the Rayleigh quotient correlation energy functional itself,
$$ \Delta E = \frac{\langle \Psi | \hat H - E_{REF} | \Psi \rangle}{\langle \Psi | \Psi \rangle} . \tag{243} $$
Suppose $\Psi$ is, for example, restricted to contain double excitation configurations only, and that the coefficient of the reference wavefunction is kept fixed (intermediate normalization, $\langle \Psi | \Psi_{REF} \rangle = 1$). Then the numerator of this expression can be shown to grow linearly with system size $N$; however, the denominator also grows, but as $1 + \kappa N$, where $\kappa$ is a constant. This spoils the proper linear scaling of the correlation energy. In the absence so far of problem-free multireference coupled-cluster approaches, this analysis gives rise to a number of approximate ways to correct for the effects of lack of extensivity. The simplest, the Davidson or +Q correction$^{119,26}$, involves a straightforward rescaling of the correlation energy by $\langle \Psi | \Psi \rangle$, i.e. replacing the denominator of (243) by 1 once the wavefunction has been determined. More explicitly, with $\Delta E^{CI}$ denoting the CI correlation energy, the correction added to the CI energy is
$$ E^{CI+Q} - E^{CI} = \frac{1 - c_0^2}{c_0^2}\, \Delta E^{CI} , \tag{244} $$
where $c_0^2$ is the weight of the reference wavefunction $\Psi_{REF}$ in the final normalized CI wavefunction. Alternative approaches (ACPF$^{120}$, AQCC$^{121}$) introduce at the outset a denominator in the energy functional that does not increase with system size. This modified approximate functional is then minimized to determine the wavefunction and energy.
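The +Q rescaling described above amounts to a one-line post-processing step. The sketch below assumes the renormalized form in which the CI correlation energy is divided by the reference weight $c_0^2$; the function name and the numerical values are purely illustrative and are not results from any calculation.

```python
def davidson_corrected_energy(e_ref, e_ci, c0_squared):
    """Return the CI+Q energy, assuming the correlation energy is rescaled by
    1/c0^2 (equivalently, a correction (1 - c0^2)/c0^2 times the CI correlation
    energy is added to the CI energy)."""
    if not 0.0 < c0_squared <= 1.0:
        raise ValueError("reference weight must lie in (0, 1]")
    e_corr = e_ci - e_ref                        # CI correlation energy
    correction = (1.0 - c0_squared) / c0_squared * e_corr
    return e_ci + correction

if __name__ == "__main__":
    # Hypothetical energies in hartree, for illustration only.
    print(davidson_corrected_energy(-1.00, -1.09, 0.9))
```

The correction vanishes when the reference weight is 1 and grows as the reference weight falls, which is why it becomes unreliable when $c_0^2$ is far from unity.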
6 Integral-direct methods
Since the first formulation of the LCAO finite basis scheme for molecular Hartree-Fock calculations, computer implementations of this method have traditionally been organised as a two-step process. In the first step all the two-electron repulsion integrals (ERIs) over four contracted Gaussian basis functions are calculated and stored externally on disk, while the second step comprises the iterative solution of the Hartree-Fock Roothaan equations, where in each iteration the integrals from the first step are retrieved from disk and contracted with the present density matrix to form a new Fock matrix. This subdivision of the computational process into two steps was motivated by the relatively high CPU cost necessary to generate the ERIs using rather complicated analytical recurrence relations, which clearly dominated a Hartree-Fock calculation. For post Hartree-Fock calculations, which are traditionally formulated using the canonical SCF orbitals from a preceding Hartree-Fock calculation as a basis, an integral transformation of the AO ERIs generated in the first step to the canonical MO basis is required prior to the actual correlated calculation. The computational complexity of such an integral transformation scales as $O(N^5)$, where $N$ is a measure of the molecular size or the number of correlated electrons. It is also quite memory and disk intensive. The amount of disk space required to hold the AO (and MO) ERIs scales as $O(N^4)$.

The last several decades have witnessed continuous rapid advances in computer technology, and in fact the progress in CPU technology has been much faster than the development of I/O facilities. Furthermore, much effort has been invested in improving integration techniques. Hence, with the conventional two step procedure one now faces the dilemma of being able to compute large numbers of integrals rapidly, but spending a relatively large amount of time and resources in their storage and retrieval.
In fact, the size of chemical systems one can handle today with the conventional method described above is primarily limited by the disk space required to store the AO ERIs, rather than the CPU time required to compute these. Integral-direct methods offer a solution to this problem. The philosophy is to eliminate the $O(N^4)$ bottleneck of AO ERI storage altogether by recomputing the ERIs on the fly whenever needed, thus accepting additional CPU time in exchange for reduced disk space and I/O load. Integral-direct methods were first used in Hartree-Fock (SCF) theory almost two decades ago (direct SCF approach by Almlöf et al.$^{122}$), and this constituted a break with the prevailing paradigm at that time. These days, direct SCF programs are part of virtually all ab initio program packages used by the community. Since the pioneering direct SCF work, integral-direct methods have been extended to electron correlation methods like multiconfigurational SCF$^{123,124,60}$, many-body perturbation theory [MBPT(2)]$^{125,126,127,60}$, MBPT(2) gradients$^{128}$ and coupled cluster methods$^{129,130,60}$. In contrast to the SCF method, where the ERIs over atomic orbitals (AOs) (i.e., the basis functions) are immediately contracted to the Fock matrix in AO basis, and only AO integrals are needed, correlation methods including MCSCF require an AO to MO integral transformation, as discussed above. Hence an intermediate four-indexed quantity (rather than the two-indexed Fock matrix in direct SCF procedures) arises and has to be dealt with. A full 4-index transformation, carried out as four quarter transformations,
has a flop count that scales as $O(m^5)$ with the number of basis functions $m$, and has $O(m^4)$ storage requirements. At first sight the storage requirements for such an integral transformation seem to rule out any integral-direct implementation of a correlated method, since no savings relative to the conventional method seem to be possible. Fortunately, however, most correlation methods can be reformulated in terms of AO ERIs and a reasonably small subset of MO integrals$^{20}$. Such MO integral subsets typically have two indices restricted to the occupied orbital space of dimension $m_{occ}$, which is usually much smaller than $m$. For example, the computation of the MBPT(2) energy requires only the exchange integrals $(ia|jb)$, while for direct MCSCF and all other correlation methods the Coulomb $(ij|pq)$ and exchange $(ip|jq)$ MO integrals are needed. The disk space necessary to hold such a subset of MO integrals then is $O(m^2 m_{occ}^2)$, i.e. for a ratio $m/m_{occ} \geq 10$ this means savings of a factor of 100 and larger in the storage requirements, compared to the conventional method. In the work by Schütz et al.$^{60}$ it was demonstrated that for integral-direct implementations of most electron correlation methods (MP2-4(SDQ), CCSD, QCISD, BCCD, MCSCF, MRPT2/3, MRCI) only three integral-direct kernel procedures are necessary. The only exceptions are methods involving triply or higher excited configurations. Apart from the trivial Fock matrix construction routine, these involve a generalized partial integral transformation and a module for the construction of external exchange operators which corresponds basically to a two-index contraction of AO ERIs with the doubles amplitude matrices, backtransformed to AO basis, as explained in section 2.4. Integral-direct methods are especially powerful in the context of local correlation methods$^{57,58,59,53,54}$.
Here, additional savings are possible by describing occupied and virtual correlation spaces in terms of localized MOs and projected (non-orthogonal) AOs, respectively, which in turn allows one to exploit the short range character of dynamic correlation (asymptotic distance dependence is $r^{-6}$ in insulators). In such a scheme, a hierarchical treatment of different electron pairs is possible, depending on the relative distance of the corresponding LMOs. Furthermore, the virtual space spanned by the non-orthogonal projected AOs can be partitioned into domains (cf. section 4). As a result of this, only very small subsets of (transformed) integrals are required even for methods including triply excited configurations, and the number of these integrals scales linearly with the molecular size. This, in turn, opens the path for $O(N)$ electron correlation methods and hence the treatment of very large molecular systems at a level of very high accuracy.

6.1 The direct SCF method
In the most naive implementation, writing a computer code for a direct SCF scheme comprises little more than just replacing the reading of one- and two-electron integrals in the SCF algorithm by their repeated calculation. However, in order to get an efficient program, it is clear that such a change in the paradigm calls for major restructuring of the code. Since the computation of the two-electron integrals is rather expensive, a direct algorithm should be integral driven, i.e. integral evaluation concerns should dictate the order of events. Once an integral has been computed, it should be used to the maximum extent possible, as long as no external
66
storage is invoked. Two-electron repulsion integrals (ERIs) are integrals of the following form (assuming real basis functions) (|) =
1 (1) (1)r12 (2) (2)dr1 dr2 ,
(245)
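where $\mu$, $\rho$, $\nu$, $\sigma$ denote contracted Cartesian Gaussians,

$$ \mu = \sum_\alpha c_{\alpha\mu}\, \chi_\alpha(\mathbf{r}) = \sum_\alpha c_{\alpha\mu}\, \bigl( x_\alpha(x)\, y_\alpha(y)\, z_\alpha(z) \bigr) , \tag{246} $$

with

$$ x_\alpha(x) = (x - x_\alpha)^{k_\alpha} \exp[-a_\alpha (x - x_\alpha)^2] , \tag{247} $$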
and $x_\alpha(x), \ldots$ symbolize Cartesian components of primitive Gaussians, centred at origins $\mathbf{r}_\alpha = (x_\alpha, y_\alpha, z_\alpha)$. Usually, these centres are taken to be the atoms, but sometimes basis functions are also positioned between atoms. One of the most important reasons for choosing Gaussians as basis functions is their separability into products of Cartesian components, as indicated in eq. (246). Another equally important reason for the efficacy of a Gaussian basis set is the fact that a two-centre product of Gaussians can be expressed as a short expansion of one-centre Gaussians, the Gaussian Product Theorem (GPT),
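$$ x_\alpha(x)\, x_\beta(x) = \sum_{i=0}^{k_\alpha + k_\beta} C_i^{k_\alpha + k_\beta}\, \Omega_i^P(x) , \tag{248} $$

with

$$ x_P = \frac{a_\alpha x_\alpha + a_\beta x_\beta}{a_P} , \qquad a_P = a_\alpha + a_\beta , \qquad \Omega_i^P(x) = (x - x_P)^i\, e^{-a_P (x - x_P)^2} . $$

For the case of two s-type Gaussians ($k_\alpha = k_\beta = 0$) the single expansion coefficient is

$$ C_0^0 = \exp[-(a_\alpha a_\beta / a_P)(\mathbf{r}_\alpha - \mathbf{r}_\beta)^2] . \tag{249} $$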
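The s-type case of the GPT, eqs. (248) and (249), can be checked numerically in one dimension. The sketch below is only an illustration: the function names, exponents, centres and grid are hypothetical choices, not taken from the text.

```python
import math


def primitive(x, a, x0):
    """One-dimensional s-type primitive exp[-a (x - x0)^2]."""
    return math.exp(-a * (x - x0) ** 2)


def gaussian_product(a_alpha, x_alpha, a_beta, x_beta):
    """Return (a_P, x_P, C_00) of the product Gaussian, eqs. (248)-(249)."""
    a_p = a_alpha + a_beta
    x_p = (a_alpha * x_alpha + a_beta * x_beta) / a_p
    c_00 = math.exp(-(a_alpha * a_beta / a_p) * (x_alpha - x_beta) ** 2)
    return a_p, x_p, c_00


def max_deviation(a_alpha, x_alpha, a_beta, x_beta, grid):
    """Largest |two-centre product - C_00 * one-centre Gaussian| on the grid."""
    a_p, x_p, c_00 = gaussian_product(a_alpha, x_alpha, a_beta, x_beta)
    return max(
        abs(primitive(x, a_alpha, x_alpha) * primitive(x, a_beta, x_beta)
            - c_00 * primitive(x, a_p, x_p))
        for x in grid
    )


grid = [-3.0 + 0.1 * i for i in range(61)]       # hypothetical test grid
print(gaussian_product(0.8, -0.5, 1.3, 1.0))     # (a_P, x_P, C_00)
print(max_deviation(0.8, -0.5, 1.3, 1.0, grid))  # should be at rounding level
```

The product centre $x_P$ is the exponent-weighted mean of the two original centres, so it always lies between them.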
In a geometrical interpretation, the GPT states that the product of two Gaussian functions (with arbitrary polynomial factors) can be expressed as a finite sum of new Gaussians, all centred at a single point $P$, which is located on the line connecting the two original centres $\mathbf{r}_\alpha$ and $\mathbf{r}_\beta$. The ERIs as given in eq. (245) can be evaluated analytically using various methods. At the heart of all these methods lie the GPT and some recurrence relations to shift angular momenta from one function to the other. Here, we will not go into the details; for a recent review we refer to Ref. 131. From eq. (245) it is immediately evident that the ERIs obey the permutational symmetry relations

$$ (\mu\rho|\nu\sigma) = (\rho\mu|\nu\sigma) = (\mu\rho|\sigma\nu) = (\rho\mu|\sigma\nu) = (\nu\sigma|\mu\rho) = (\sigma\nu|\mu\rho) = (\nu\sigma|\rho\mu) = (\sigma\nu|\rho\mu) . \tag{250} $$
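The effect of the symmetry relations (250) on the length of the integral list can be illustrated by counting index quadruples. The following sketch counts the quadruples with $\mu \ge \rho$, $\nu \ge \sigma$ and $(\mu\rho) \ge (\nu\sigma)$ and compares the result with $m^4$. The function name and the chosen values of $m$ are illustrative assumptions.

```python
def count_unique_eris(m):
    """Count quadruples with mu >= rho, nu >= sigma and (mu rho) >= (nu sigma)."""
    count = 0
    for mu in range(m):
        for rho in range(mu + 1):
            for nu in range(mu + 1):
                # sigma runs up to nu, but only up to rho when nu == mu
                sigma_max = rho if nu == mu else nu
                count += sigma_max + 1
    return count


for m in (4, 10, 40):
    unique = count_unique_eris(m)
    print(m, unique, m ** 4 / unique)   # the ratio approaches 8 for large m
```

The count equals $n_p(n_p+1)/2$ with $n_p = m(m+1)/2$ index pairs, so the ratio to $m^4$ tends to eight only in the limit of large $m$.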
By exploiting this permutational symmetry the number of integrals that need to be evaluated can be reduced by about a factor of eight. In modern quantum chemical codes the ERIs are usually evaluated over shell quadruplet batches. A shell typically comprises all contracted functions of a given centre and a given angular momentum. For example, an s-shell of a 3s2p1d basis set comprises three functions, a p-shell six, and a d-shell five functions. In order to exploit an integral shell quadruplet batch to its maximum extent, i.e. to make use of the permutational symmetry mentioned above, the code should drive triangularly over the shell quadruplets. In the following we will use M, R, N, S as symbols for shells of basis functions, i.e., $\mu \in M$, $\rho \in R$, etc. A direct Fock builder performs a two-index contraction of each integral batch (MR|NS) with the related piece of the density matrix. If it runs over the minimal integral list (i.e. exploits the full permutational symmetry of the ERIs), each integral batch contributes to the Fock matrix via two Coulomb and four exchange components, as indicated in the pseudocode below.

    DO M=1,NShell
      DO R=1,M
        DO N=1,M
          DO S=1,N | R (for N=M)
            compute integral shell quadruplet block (MR|NS)
            compute Coulomb component of Fock matrix:
              f(M,R)=f(M,R)+4*(MR|NS)*d(N,S)
              f(N,S)=f(N,S)+4*(MR|NS)*d(M,R)
            compute exchange component of Fock matrix:
              f(R,N)=f(R,N)-(MR|NS)*d(M,S)
              f(R,S)=f(R,S)-(MR|NS)*d(M,N)
              f(M,N)=f(M,N)-(MR|NS)*d(R,S)
              f(M,S)=f(M,S)-(MR|NS)*d(R,N)
          END DO
        END DO
      END DO
    END DO

6.2 Integral prescreening
Obviously, the ERI supermatrix is a four-index quantity. Therefore, the computational effort to evaluate the ERIs scales nominally as $N^4$, where $N$ is a measure of the size of the chemical system (e.g. the number of basis functions for a given basis set). For instance, for a system with 100-200 atoms, involving about 2000 basis functions or more, the ERI supermatrix would comprise $10^{12}$-$10^{13}$ integrals. It is clear that even though the algorithms for ERI evaluation have improved drastically over the last two decades, no code can deal with all these integrals in a routine calculation. In the integral-direct approach the storage bottleneck is removed by reevaluating ERIs on the fly whenever needed. Integral evaluation then becomes the bottleneck. The solution to the problem is not only to generate
the ERIs more efficiently, but also to search for algorithms that avoid the calculation of negligible integrals altogether. Fortunately, the ERI supermatrix is very sparse for extended chemical systems. Consider for a moment an ERI $(\mu\rho|\nu\sigma)$, as given in eq. (245). Since both $\mu$ and $\rho$ are Gaussian functions and involve the same electron coordinate $\mathbf{r}_1$, it is immediately clear from eqs. (248) and (249) that the integrand decreases exponentially with the distance between the centres, $|\mathbf{r}_\mu - \mathbf{r}_\rho|$. The same holds for $\nu$ and $\sigma$. In fact, the value of the ERI also drops exponentially with the distance between $\mu$ and $\rho$ or $\nu$ and $\sigma$. Unfortunately, the two Gaussian pairs $(\mu\rho)$ and $(\nu\sigma)$ are coupled by the Coulomb interaction $1/r_{12}$, which is long range. Hence, the ERI might still be significant even if $(\mu\rho)$ is far away from $(\nu\sigma)$. Therefore, the number of non-vanishing ERIs scales asymptotically as $N^2$ rather than $N^4$. In a direct SCF scheme the ERIs are reevaluated in each iteration and immediately contracted over two indices with the corresponding density matrix elements. Now, for an extended (but non-periodic) chemical system, the density matrix itself is also sparse (i.e. $D(M,N)$ becomes small if $M$ is distant from $N$), provided that the HOMO-LUMO gap is large enough (which is usually the case for non-metallic systems). Furthermore, the exchange component of the Fock matrix requires contractions of the ERIs in which the first index belongs to one function of the first Gaussian pair $(\mu\rho)$, while the second index belongs to one function of the second pair $(\nu\sigma)$. Hence, by virtue of the sparsity of the density matrix, the number of ERIs with non-vanishing contributions to the Fock exchange component scales asymptotically linearly (i.e. as $O(N)$) with molecular size. Unfortunately, this is not true for the Coulomb component, where the density matrix connects just functions within each pair. Thus, a straightforward scheme would lead to $O(N^2)$ scaling. However, since Coulomb repulsion is a relatively simple (i.e.
classical) form of interaction, one can employ multipole expansions [132,133,134,135] for the long-range interactions, for which linear scaling with molecular size can be achieved. If the Coulomb and exchange contributions to the Fock matrix are then evaluated separately, an overall linear scaling of the Fock matrix construction in integral-direct SCF calculations can be achieved [136]. A prerequisite for approaching quadratic or even linear scaling in a direct SCF scheme is a method to estimate the integral values as accurately as possible without actually computing them. This estimate must not be made for each integral or each integral batch individually, since the test itself would then scale as $N^4$ and become the bottleneck. A strict upper bound for the ERI $(\mu\rho|\nu\sigma)$ can be obtained from the Schwarz inequality [137]

$$ |(\mu\rho|\nu\sigma)| \le Q_{\mu\rho}\, Q_{\nu\sigma} , \quad \text{with} \quad Q_{\mu\rho} = \sqrt{(\mu\rho|\mu\rho)} . \tag{251} $$

The $Q_{\mu\rho}$ necessary to compute the Schwarz estimates for the ERIs are just two-index quantities and can easily be precomputed outside the nested loop over shell quadruplet batches. The number of non-negligible integrals $(\mu\rho|\mu\rho)$ scales linearly with molecular size, and it is possible to evaluate them in such a way that the overhead with quadratic scaling is very small. Furthermore, since the ERI prescreening takes place at the level of shell batches, only the maximum values of $Q_{\mu\rho}$ over the respective shells, i.e. the

$$ Q_{MR} = \max_{\mu \in M,\, \rho \in R} Q_{\mu\rho} , \tag{252} $$

are required. The four nested shell loops can now be replaced by two loops over the pairs $(MR)$ and $(NS)$ with non-negligible $Q_{MR}$ and $Q_{NS}$, respectively, and within these loops the product $Q_{MR} Q_{NS}$ can be tested against a threshold. Formally, this prescreening procedure scales quadratically with molecular size, but the prefactor is very small. A more powerful prescreening scheme also has to take the density matrix into account. As we have seen above, each ERI contributes two Coulomb and four exchange components to the Fock matrix, and therefore the following test is required:

$$ Q_{MR}\, Q_{NS}\, d_{\mathrm{max}} \ge \tau , \quad \text{with} \quad d_{\mathrm{max}} = \max(4|d_{MR}|, 4|d_{NS}|, |d_{MN}|, |d_{MS}|, |d_{RN}|, |d_{RS}|) , \tag{253} $$

where $\tau$ denotes the prescreening threshold. If the exchange component of the Fock matrix is constructed separately, eq. (253) reduces to

$$ Q_{MR}\, Q_{NS}\, d_{\mathrm{max}} \ge \tau , \quad \text{with} \quad d_{\mathrm{max}} = \max(|d_{MN}|, |d_{MS}|, |d_{RN}|, |d_{RS}|) \tag{254} $$
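The Schwarz bound of eq. (251) and the density-weighted test of eq. (253) can be tried out on a toy model. The sketch below is a minimal illustration under stated assumptions: every "shell" is a single unnormalized s-type Gaussian on a line, the ERIs follow the standard closed form for s primitives with the zeroth-order Boys function, and the geometry, the model density matrix, the threshold and all function names are hypothetical.

```python
import math


def boys_f0(t):
    """Zeroth-order Boys function F0(t)."""
    if t < 1e-12:
        return 1.0 - t / 3.0
    return 0.5 * math.sqrt(math.pi / t) * math.erf(math.sqrt(t))


def make_eri(centres, a=1.0):
    """Return eri(m, r, n, s) for s Gaussians exp[-a |r - R|^2] on the x axis."""
    prefactor = 2.0 * math.pi ** 2.5 / (4.0 * a * a * math.sqrt(4.0 * a))

    def eri(m, r, n, s):
        p_x = 0.5 * (centres[m] + centres[r])   # product centre of pair (m r)
        q_x = 0.5 * (centres[n] + centres[s])   # product centre of pair (n s)
        k = math.exp(-0.5 * a * ((centres[m] - centres[r]) ** 2
                                 + (centres[n] - centres[s]) ** 2))
        return prefactor * k * boys_f0(a * (p_x - q_x) ** 2)

    return eri


def schwarz_factors(eri, n):
    """Q[m][r] = sqrt((mr|mr)), eq. (251)."""
    return [[math.sqrt(eri(m, r, m, r)) for r in range(n)] for m in range(n)]


def d_max(d, m, r, n, s):
    """Density factor of eq. (253)."""
    return max(4 * abs(d[m][r]), 4 * abs(d[n][s]),
               abs(d[m][n]), abs(d[m][s]), abs(d[r][n]), abs(d[r][s]))


def count_surviving(q, d, tau):
    """Return (total, plain, weighted) counts over the minimal quadruplet list."""
    n = len(q)
    pairs = [(m, r) for m in range(n) for r in range(m + 1)]
    total = plain = weighted = 0
    for i, (m, r) in enumerate(pairs):
        for (nn, s) in pairs[: i + 1]:
            total += 1
            estimate = q[m][r] * q[nn][s]
            if estimate >= tau:
                plain += 1
            if estimate * d_max(d, m, r, nn, s) >= tau:
                weighted += 1
    return total, plain, weighted


centres = [1.5 * i for i in range(8)]           # hypothetical linear chain
eri = make_eri(centres)
q = schwarz_factors(eri, len(centres))
density = [[0.25 * math.exp(-abs(i - j)) for j in range(8)] for i in range(8)]
print(count_surviving(q, density, 1e-8))
```

In a production code the estimates would of course be used to skip the integral evaluation itself; here they are only counted.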
The test of eq. (254) leads to an overall linear scaling of the number of shell quadruplets that survive, and consequently of the number of ERIs that have to be computed. The efficiency of this prescreening scheme can be enhanced in several ways. First, since ERIs are evaluated batchwise over whole shells, it can be advantageous to split off diffuse functions (small exponents) from tight functions (large exponents), and to treat the diffuse functions in separate shells. Even though this increases the total number of shell quadruplets, the actual number of integrals to be computed can be reduced. Second, the effectiveness of the prescreening schemes in eqs. (253) and (254) can be enhanced further by constructing incremental Fock matrix updates in each new iteration, rather than the total Fock matrix. Consider the Fock matrices of two consecutive iterations $m-1$ and $m$:
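$$ f^{(m-1)}_{\mu\rho} = h_{\mu\rho} + \sum_{\nu\sigma} d^{(m-1)}_{\nu\sigma} \{ 2(\mu\rho|\nu\sigma) - (\mu\nu|\rho\sigma) \} , \qquad f^{(m)}_{\mu\rho} = h_{\mu\rho} + \sum_{\nu\sigma} d^{(m)}_{\nu\sigma} \{ 2(\mu\rho|\nu\sigma) - (\mu\nu|\rho\sigma) \} . \tag{255} $$

Obviously, the $m$th Fock matrix can also be computed via the recurrence relation

$$ f^{(m)}_{\mu\rho} = f^{(m-1)}_{\mu\rho} + \sum_{\nu\sigma} \{ d^{(m)}_{\nu\sigma} - d^{(m-1)}_{\nu\sigma} \} \{ 2(\mu\rho|\nu\sigma) - (\mu\nu|\rho\sigma) \} $$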
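Because the two-electron part of the Fock matrix is linear in the density matrix, the incremental build can be tested on a toy model. The sketch below is an illustration under stated assumptions, not the scheme of any particular program: it uses unit-exponent s-type Gaussians on a line, loops over all index quadruples instead of the minimal list, and applies a simplified per-integral variant of the density-weighted Schwarz test. The geometry, the model densities, the threshold and all names are hypothetical.

```python
import math

X = [1.5 * i for i in range(6)]        # hypothetical centres on a line
N = len(X)


def f0(t):
    """Zeroth-order Boys function."""
    if t < 1e-12:
        return 1.0 - t / 3.0
    return 0.5 * math.sqrt(math.pi / t) * math.erf(math.sqrt(t))


def eri(m, r, n, s):
    """(mr|ns) over unnormalized unit-exponent s Gaussians centred at X."""
    k = math.exp(-0.5 * ((X[m] - X[r]) ** 2 + (X[n] - X[s]) ** 2))
    t = (0.5 * (X[m] + X[r]) - 0.5 * (X[n] + X[s])) ** 2
    return 2.0 * math.pi ** 2.5 / 8.0 * k * f0(t)


Q = [[math.sqrt(eri(m, r, m, r)) for r in range(N)] for m in range(N)]


def two_electron_part(d, tau):
    """Return (G, n_eval), G[m][r] = sum_ns d[n][s] {2 (mr|ns) - (mn|rs)}."""
    g = [[0.0] * N for _ in range(N)]
    n_eval = 0
    for m in range(N):
        for r in range(N):
            for n in range(N):
                for s in range(N):
                    # (mr|ns) feeds g[m][r] (Coulomb) and g[m][n] (exchange)
                    weight = max(2.0 * abs(d[n][s]), abs(d[r][s]))
                    if Q[m][r] * Q[n][s] * weight < tau:
                        continue        # screened out, integral not evaluated
                    v = eri(m, r, n, s)
                    n_eval += 1
                    g[m][r] += 2.0 * v * d[n][s]
                    g[m][n] -= v * d[r][s]
    return g, n_eval


# Model densities of two consecutive iterations close to convergence.
d_prev = [[0.5 * math.exp(-abs(i - j)) for j in range(N)] for i in range(N)]
delta = [[1e-6 * math.cos(i + j) for j in range(N)] for i in range(N)]
d_curr = [[d_prev[i][j] + delta[i][j] for j in range(N)] for i in range(N)]

tau = 1e-10
g_prev, _ = two_electron_part(d_prev, tau)
g_full, n_full = two_electron_part(d_curr, tau)
g_delta, n_delta = two_electron_part(delta, tau)
error = max(abs(g_full[i][j] - g_prev[i][j] - g_delta[i][j])
            for i in range(N) for j in range(N))
print(n_full, n_delta, error)
```

With the small difference density, fewer integrals pass the test than in the full build, while the updated matrix agrees with the full one to within the screening error.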
The recurrence thus amounts to generating an incremental two-electron repulsion matrix, obtained by contracting the ERIs with a difference density matrix $\Delta d^{(m)} = d^{(m)} - d^{(m-1)}$. Towards convergence, $\Delta d^{(m)}$ becomes very sparse, and the prescreening thus becomes more and more effective. The advantages of this recursive construction of the Fock matrix can be further enhanced by the minimized density difference approach [137], in which, rather than simple density differences, a linear combination of a history of densities (and Fock matrices) is used that minimizes the density residual. One should note at this point, however, that the prescreening thresholds may have to be tightened towards convergence in order to avoid numerical noise and thus a deterioration of the convergence behaviour of the SCF. Changing the thresholds, on the other hand, implies the calculation of a full Fock matrix, i.e., a restart of the density difference procedure. Moreover, the DIIS (direct inversion in the iterative subspace [99]) convergence accelerator has to be restarted as well.

The philosophy of the direct SCF approach was based on the observation that the efficiency of integral processing had outgrown the storage and I/O capacities of modern computer systems. Evidently though, after eliminating the storage and I/O bottleneck at the cost of additional CPU time, the evaluation of the ERIs again becomes the bottleneck in large direct SCF calculations, despite all the ERI prescreening discussed above. Much work has therefore been dedicated to improving the efficiency of ERI evaluation and Fock matrix construction. Some of these ideas can be summarized as early contraction schemes, where the Fock matrix is built directly from the two-centre integrals in the Gaussian product basis (cf. GPT, eq. (248)), avoiding the handling of explicit four-centre ERIs over primitive or contracted basis functions as much as possible. Other ideas go in the direction of (approximately) reexpanding a product of basis functions in a new auxiliary basis (approximate three-centre expansions [138]). The approximate three-centre expansions appear in a different context (RI-DFT, RI-MP2) in other lectures of this winter school. A discussion of these methods is beyond the scope of this brief overview. Excellent overviews of these methods can be found in Refs. 139 and 140.

6.3 Integral-direct MP2
As shown in section 2.3, the MP2 contribution to the correlation energy for a closed-shell system can be written in spin-free formalism as

$$ E^{(2)} = \sum_{i,j,a,b} \frac{(ia|jb)\,[2(ia|jb) - (ib|ja)]}{\epsilon_i + \epsilon_j - \epsilon_a - \epsilon_b} \tag{256} $$
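Eq. (256) translates almost literally into code once the exchange integrals are available. The sketch below assumes that the integrals are stored as nested lists `k[i][j][a][b]` holding $(ia|jb)$; this layout, the function name and the two-level model numbers are hypothetical choices made for illustration.

```python
def mp2_energy(k, eps_occ, eps_virt):
    """Closed-shell MP2 energy of eq. (256); k[i][j][a][b] holds (ia|jb)."""
    energy = 0.0
    for i, e_i in enumerate(eps_occ):
        for j, e_j in enumerate(eps_occ):
            for a, e_a in enumerate(eps_virt):
                for b, e_b in enumerate(eps_virt):
                    k_iajb = k[i][j][a][b]
                    k_ibja = k[i][j][b][a]      # (ib|ja)
                    energy += (k_iajb * (2.0 * k_iajb - k_ibja)
                               / (e_i + e_j - e_a - e_b))
    return energy


# Hypothetical two-level model: one occupied and one virtual orbital.
k_model = [[[[0.1]]]]
print(mp2_energy(k_model, eps_occ=[-0.5], eps_virt=[0.5]))
```

Since the denominator is negative for every term, the energy contribution is negative whenever the numerator is positive, as in this model.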
In eq. (256), $\epsilon_i$, $\epsilon_j$, $\epsilon_a$, $\epsilon_b$ are the corresponding eigenvalues of the Fock matrix. The MO exchange integrals $(bj|ia)$ are computed from the AO integrals (ERIs) through a four-index transformation as shown in eq. (91). In the following, we will denote the four quarter transformation steps by Q1, Q2, Q3 and Q4, respectively. The nominal operation count (without any prescreening) of the Q1 step scales as $O(m_{\mathrm{occ}} m^4)$, while the others scale as $O(m_{\mathrm{occ}}^2 m^3)$, i.e. the cost of all steps increases as $O(N^5)$. For applications to large molecules it is therefore essential to reduce this steep scaling by prescreening techniques, similar to the direct SCF case. The memory requirements of the four individual transformation steps can be minimized by performing them over fixed shells. This is quite natural, since the ERIs are generated as individual batches over shell quadruplets anyway. In a straightforward scheme of that type, the storage requirements are $O(s^4)$ to hold an individual AO ERI batch ($s$ denotes an average shell size, which is independent of the molecular size), $O(m_{\mathrm{occ}} m s^2)$ for the ERIs after the Q1 and Q2 steps, and $O(m_{\mathrm{occ}}^2 m^2)$ after the Q3 and Q4 steps, respectively. Thus, while the computational burden is largest for the initial transformation step, the memory requirements are highest for the final step. In the canonical MP2 case the MO integrals are immediately consumed and accumulated into the MP2 correlation energy, according to eq. (256). A straightforward way to reduce the memory requirements of the critical Q3 and Q4 steps is then to segment the first MO index $i$ into individual chunks (as large as the available memory permits) and to multipass over the AO integral list for each chunk individually [126,127]. This reduces the memory requirements from $O(m_{\mathrm{occ}}^2 m^2)$ to $O(I m_{\mathrm{occ}} m^2)$ ($I$ denotes the chunk size) at the cost of repeated ERI evaluations. In order for this algorithm to work, one of the ERI permutational symmetries (i.e. the $(\mu\rho) \leftrightarrow (\nu\sigma)$ symmetry) must be abandoned; thus one integral pass involves twice as many ERIs as the minimal list. The algorithm is free of any I/O operations and can be considered fully direct. Yet the disadvantages are obvious: repeated ERI evaluation might become quite costly, and the number of passes increases quartically with increasing system size at constant memory.

A more efficient, semi-direct algorithm generates in a first step the whole set of half-transformed integrals $(\mu j|\nu i)$. The transformation of the remaining two indices $\mu$, $\nu$ to the virtual basis takes place after an intermediate bucket sort, which rearranges the ERIs into integral matrices $K^{ij}_{\mu\nu}$, and transforms the individual $K^{ij}$ matrices one after the other. If the permutational symmetry of the slow pair $(\mu\rho)$ (i.e. $\mu \leftrightarrow \rho$) is abandoned, the maximum memory requirements are only $O(s\, m_{\mathrm{occ}} m^2)$. Such an algorithm is outlined in pseudocode below (algorithm A):

    DO M=1,NShell
      DO R=1,NShell
        DO N=1,M
          DO S=1,N | R (for M=N)
            Compute integral block (MR|NS)
            Q1 step over shell block:
            (MR|Nj) = (MR|Nj) + (MR|NS) * X(S,j)
            (MR|Sj) = (MR|Sj) + (MR|NS) * X(N,j)
          END DO
        END DO
        (Mi|Nj) = (Mi|Nj) + (MR|Nj) * X(R,i)
      END DO
      write (Mi|Nj) to disk
    END DO
    perform bucket sort/(Mi|Nj)=(Mi|Nj)+(Nj|Mi)

Note that in order to keep the $(\mu\rho) \leftrightarrow (\nu\sigma)$ permutational symmetry, the triangularity in the operator indices $i$, $j$ is lost. The final operator matrices $K^{ij}$ ($i \ge j$) are formed by adding up the partial results $K^{ij} + K^{ji}$ ($i \ge j$), which is performed during the bucket sort, as indicated above.
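The four quarter transformations Q1 to Q4 can be sketched without any shell batching, symmetry or disk I/O, which makes it easy to compare them with the naive eight-index summation. The following code is such a sketch under stated assumptions: AO integrals over unit-exponent s-type Gaussians on a line, arbitrary (hypothetical) coefficient matrices `c_occ` and `c_virt`, and tensors stored as dictionaries keyed by index tuples. It does not reproduce algorithm A itself.

```python
import math
from itertools import product


def f0(t):
    """Zeroth-order Boys function."""
    if t < 1e-12:
        return 1.0 - t / 3.0
    return 0.5 * math.sqrt(math.pi / t) * math.erf(math.sqrt(t))


def ao_integrals(centres):
    """Dict of (mu rho|nu sigma) over unit-exponent s Gaussians on a line."""
    ao = {}
    for m, r, nu, s in product(range(len(centres)), repeat=4):
        k = math.exp(-0.5 * ((centres[m] - centres[r]) ** 2
                             + (centres[nu] - centres[s]) ** 2))
        t = (0.5 * (centres[m] + centres[r])
             - 0.5 * (centres[nu] + centres[s])) ** 2
        ao[m, r, nu, s] = 2.0 * math.pi ** 2.5 / 8.0 * k * f0(t)
    return ao


def transform_index(tensor, axis, c):
    """One quarter step: contract index `axis` with the matrix c[old][new]."""
    dims = [1 + max(key[ax] for key in tensor) for ax in range(4)]
    dims[axis] = len(c[0])
    out = {}
    for idx in product(*(range(d) for d in dims)):
        out[idx] = sum(
            tensor[idx[:axis] + (old,) + idx[axis + 1:]] * c[old][idx[axis]]
            for old in range(len(c))
        )
    return out


def exchange_integrals(ao, c_occ, c_virt):
    """(ai|bj) from four successive quarter transformations."""
    q1 = transform_index(ao, 3, c_occ)      # Q1: (mu rho|nu j)
    q2 = transform_index(q1, 1, c_occ)      # Q2: (mu i|nu j), half transformed
    q3 = transform_index(q2, 0, c_virt)     # Q3: (a i|nu j)
    return transform_index(q3, 2, c_virt)   # Q4: (a i|b j)


def exchange_integrals_direct(ao, c_occ, c_virt):
    """Same integrals from the naive eight-index summation, for comparison."""
    n, n_occ, n_virt = len(c_occ), len(c_occ[0]), len(c_virt[0])
    out = {}
    for a, i, b, j in product(range(n_virt), range(n_occ),
                              range(n_virt), range(n_occ)):
        out[a, i, b, j] = sum(
            c_virt[m][a] * c_occ[r][i] * c_virt[nu][b] * c_occ[s][j]
            * ao[m, r, nu, s]
            for m, r, nu, s in product(range(n), repeat=4)
        )
    return out


ao = ao_integrals([0.0, 1.2, 2.4, 3.6])                          # hypothetical
c_occ = [[0.6, 0.4], [0.5, -0.3], [0.3, -0.5], [0.2, 0.6]]       # hypothetical
c_virt = [[0.7, -0.2], [-0.6, 0.5], [0.4, 0.6], [-0.1, -0.7]]    # hypothetical
fast = exchange_integrals(ao, c_occ, c_virt)
slow = exchange_integrals_direct(ao, c_occ, c_virt)
print(max(abs(fast[key] - slow[key]) for key in fast))
```

Each quarter step touches one index only, which is the origin of the $O(N^5)$ rather than $O(N^8)$ operation count.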
By virtue of an elaborate paging algorithm, it is even possible to maintain the $\mu \leftrightarrow \rho$ permutational symmetry as well (algorithm B), i.e.

      R_End=0
      R_Pass=0
    1 R_Start=R_End+1
      R_End=MIN(NShell,R_End+R_Batch)
      R_Pass=R_Pass+1
      IF(R_Pass.GT.1) Read (Ri|Nj) for shells R_Start to R_End
      DO M=R_Start,NShell
        IF(R_Pass.GT.1.AND.M.GT.R_End) Read (Mi|Nj) for shell M
        DO R=R_Start,MIN(R_End,M)
          DO N=1,M
            DO S=1,N | R (for M=N)
              Compute integral block (MR|NS)
              Q1 step over shell block:
              (MR|Nj) = (MR|Nj) + (MR|NS) * X(S,j)
              (MR|Sj) = (MR|Sj) + (MR|NS) * X(N,j)
            END DO
          END DO
          (Mi|Nj) = (Mi|Nj) + (MR|Nj) * X(R,i)
          (Ri|Nj) = (Ri|Nj) + (MR|Nj) * X(M,i)
        END DO
        IF(M.GT.R_End) Write (Mi|Nj) for shell M
      END DO
      Write (Ri|Nj) for shells R_Start to R_End
      IF(R_End.LT.NShell) GOTO 1
      perform bucket sort/(Mi|Nj)=(Mi|Nj)+(Nj|Mi)

This means that the full permutational symmetry of the AO ERIs is exploited. This algorithm is very efficient for molecular systems of intermediate size. However, for large systems and limited memory, the paging overhead might become excessive (even though, unlike in the fully direct scheme, no multipassing over the integral list is involved), and algorithm A becomes more efficient.

The Q1 and Q2 transformation steps require matrix multiplications in which at least one of the matrix dimensions corresponds to the shell size. For small shells the vector lengths are too short to achieve good performance. Therefore, it is advantageous to merge adjacent R and S shells until an upper limit of 32-64 basis functions is reached. Significant speedups (factors of 4-6) were observed when such shell merging was invoked [60]. For applications to larger molecules, integral prescreening is of utmost importance. In order to assess the values of the AO ERIs, the Schwarz inequality (eq. (251)) is again employed. Furthermore, a test density $D^{\mathrm{max}}$ is constructed from the MO coefficient matrix $C$ as
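$$ D^{\mathrm{max}}_{\rho\sigma} = \max_{ij} |C_{\rho i}\, C_{\sigma j}| . \tag{257} $$

The prescreening criteria for the direct transformation at the level of shell quadruplets are then

$$ Q_{MR}\, Q_{NS}\, D^{\mathrm{max}}_{RS} \ge \tau_1 \tag{258} $$

before integral evaluation, and

$$ \max_{\mu \in M,\, \rho \in R,\, \nu \in N,\, \sigma \in S} |(\mu\rho|\nu\sigma)|\, D^{\mathrm{max}}_{RS} \ge \tau_2 \tag{259} $$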
before the Q1 step, respectively. Such prescreening reduces the computational cost of the dominant Q1 step from $O(N^5)$ to $O(N^3)$ [60]. The overall scaling, however, deteriorates again for larger molecules due to the subsequent transformation steps, which, because of the delocalized character of canonical orbitals, scale worse than $O(N^3)$. In particular, the Q4 step (i.e. the transformation of the $K^{ij}$ to the canonical virtuals) would still scale as $O(N^5)$; although its prefactor is small, it will ultimately constitute the bottleneck of the calculation. The remedy for this problem is provided by local correlation methods, discussed in section 4. In combination with local correlation methods, integral-direct MP2 algorithms with linear cost scaling have been implemented, which enable calculations on molecules with more than 2000 basis functions and 500 correlated electrons [54].

6.4 Integral-direct MCSCF
In MCSCF calculations the orbitals are optimized simultaneously with the CI coefficients. Thus, an integral transformation is required in each iteration, which constitutes one of the major bottlenecks in conventional MCSCF calculations. In a direct scheme, this bottleneck is even more severe, since each direct transformation also involves recomputation of all AO ERIs. It is therefore of utmost importance that the MCSCF converges in as few iterations as possible. MCSCF orbital optimization methods can be classified as first-order or second-order methods. In the former, only the first derivatives of the energy with respect to the variational parameters are computed exactly, and updates of the parameters are obtained using some approximation to the Hessian (e.g. a BFGS update scheme). In first-order methods the coupling between the orbitals and the CI coefficients is neglected. One particular advantage of first-order methods is that only a very compact set of transformed integrals is required, i.e. an integral distribution of the form $(pj|kl)$ with only a single external index. In fact, $j$, $k$, $l$ here run just over active orbitals, while the inactive orbitals (doubly occupied in all CSFs) can be accounted for by a single Fock matrix [141,142]. Thus, any storage bottleneck connected with the integral transformation is avoided. An integral-direct first-order MCSCF method has been described by Frisch et al. [124]. In second-order methods, the second energy derivatives are also computed exactly, yielding quadratic convergence near the final solution. Naturally, first-order methods require less effort per iteration, but they often converge slowly and appear to be useful only for the optimization of CASSCF wavefunctions [141]. In this case convergence is facilitated by the fact that orbital rotations among active orbitals are redundant. Even with second-order methods, convergence is often difficult to achieve for general MCSCF wavefunctions [142].
The radius of convergence and the speed of convergence can be substantially increased by taking into account certain higher-order terms, as first proposed by Werner and Meyer [143,144] and further refined by Werner and Knowles [98,91]. Using the latter method (denoted WMK in the following), convergence can often be achieved in only 2-3 iterations, in particular for CASSCF wavefunctions. Almost cubic convergence behaviour is observed near the solution. In the light of the discussion above, the WMK method is particularly useful in an integral-direct context, while the advantage of the simple and efficient
transformation of first-order methods is spoilt by their slow convergence behaviour. The integral sets required by the WMK method are identical to those used by ordinary second-order methods: in addition to the exchange integrals $(ip|jq)$, the Coulomb integrals $(pq|ij)$ are also necessary. Furthermore, the very same integral sets, generated in the last iteration, can be reused in a subsequent CASPT2 or MRCI calculation. The additional Coulomb integral set can be produced simultaneously with the exchange integrals by modifying the above MP2 transformation algorithm A in the following way (algorithm C):

    DO M=1,NShell
      DO R=1,NShell
        DO N=1,NShell
          DO S=1,R
            Compute integral block (MR|NS)
            Q1 step over shell block:
            (MR|Nj) = (MR|Nj) + (MR|NS) * X1(S,j)
            (MR|Sj) = (MR|Sj) + (MR|NS) * X1(N,j)
          END DO
        END DO
        Q2 (J) step:
        (MR|ij) = (MR|Nj) * X2(N,i) (summed over N)
        write (MR|ij) to disk
        Q2 (K) step:
        (Mi|Nj) = (Mi|Nj) + (MR|Nj) * X2(R,i)
      END DO
      write (Mi|Nj) to disk
    END DO
    perform bucket sort

Note that, compared to algorithm A, the permutational symmetry between the pairs $(\mu\rho) \leftrightarrow (\nu\sigma)$ is lost; thus the AO integral list in algorithm C is four times as long as the minimal list. As in the MP2 case (algorithm B), the permutational symmetry ($\mu \leftrightarrow \rho$) can be maintained by using an analogous paging algorithm, which might be advantageous for intermediate cases.

6.5 Integral-direct multireference correlation methods
The internally contracted MRCI and MRPT methods discussed in section 5 can be formulated in terms of matrix operations [142] involving the same Coulomb and exchange matrices $J^{ij}$ and $K^{ij}$ as needed in the preceding MCSCF. In MRCI and MRPT3, all contributions of the 4-external integrals $(ab|cd)$ can be taken into account by computing for each pair $P$ an external exchange operator (EEO), as defined in eq. (106) [117,118,108]. These operators can be computed directly from the two-electron integrals in the AO basis by first transforming the amplitude matrices into the AO basis and finally transforming the resulting operators back into the MO basis (cf. eqs. (107)). For an integral-direct implementation the internally contracted MRCI
scheme is particularly useful, since the number of pairs, and thus of external exchange operators that need to be computed, is minimized and does not depend on the number of reference configurations. In uncontracted MRCI methods the number of pairs $P$ for which the EEOs $K(T^P)$ must be computed is very much larger than in the internally contracted case. This leads not only to higher computational cost, but also to a storage bottleneck in the direct evaluation of these operators. The direct construction of the EEOs from the minimal AO integral list is accomplished by contracting two indices of the AO ERI $(\mu\rho|\nu\sigma)$ with the two AO indices of the backtransformed amplitudes (107) in all possible ways that result in exchange-type contributions; it can be regarded as a Fock build (excluding Coulomb contributions) of $n_P$ Fock matrices simultaneously ($n_P$ denotes the number of pairs $P$). A shell-driven out-of-core algorithm for such a construction of the EEOs, as implemented in MOLPRO [60], is given in pseudocode below (module DKEXT).

      R_End=0
      R_Pass=0
    1 R_Start=R_End+1
      R_End=MIN(NShell,R_End+R_Batch)
      Read amplitudes for shells R_Start to R_End
      R_Pass=R_Pass+1
      IF(R_Pass.GT.1) Read operators for shells R_Start to R_End
      DO M=R_Start,NShell
        IF(M.GT.R_End) THEN
          Read amplitudes for shell M
          IF(R_Pass.GT.1) Read operators for shell M
        END IF
        DO R=R_Start,MIN(R_End,M)
          DO N=1,M
            S_End=N
            IF(N.EQ.M) S_End=R
            DO S=1,S_End
              Compute integral block (MR|NS)
              Compute contributions to operators
            END DO
          END DO
        END DO
        IF(M.GT.R_End) Write operators for shell M
      END DO
      Write operators for shells R_Start to R_End
      IF(R_End.LT.NShell) GOTO 1
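The AO-driven construction of an external exchange operator can be illustrated for a single pair without any paging or shell batching. The sketch below assumes the definition $K(T)_{ab} = \sum_{cd} (ac|bd)\, T_{cd}$ (eq. (106) is not reproduced in this section, so this form is an assumption), toy AO integrals over unit-exponent s-type Gaussians on a line, and hypothetical virtual-orbital coefficients and amplitudes. It compares the AO-driven route with an explicit MO-basis reference and is not the DKEXT algorithm.

```python
import math
from itertools import product


def f0(t):
    """Zeroth-order Boys function."""
    if t < 1e-12:
        return 1.0 - t / 3.0
    return 0.5 * math.sqrt(math.pi / t) * math.erf(math.sqrt(t))


def ao_integrals(centres):
    """Dict of (mu rho|nu sigma) over unit-exponent s Gaussians on a line."""
    ao = {}
    for m, r, nu, s in product(range(len(centres)), repeat=4):
        k = math.exp(-0.5 * ((centres[m] - centres[r]) ** 2
                             + (centres[nu] - centres[s]) ** 2))
        t = (0.5 * (centres[m] + centres[r])
             - 0.5 * (centres[nu] + centres[s])) ** 2
        ao[m, r, nu, s] = 2.0 * math.pi ** 2.5 / 8.0 * k * f0(t)
    return ao


def eeo_ao_driven(ao, c_virt, t):
    """K(T) via backtransformed amplitudes and an exchange-type contraction."""
    n, nv = len(c_virt), len(c_virt[0])
    # amplitudes in the AO basis: T_AO = C T C^T
    t_ao = [[sum(c_virt[r][c] * t[c][d] * c_virt[s][d]
                 for c in range(nv) for d in range(nv))
             for s in range(n)] for r in range(n)]
    # K_AO[mu][nu] = sum over rho, sigma of (mu rho|nu sigma) T_AO[rho][sigma]
    k_ao = [[sum(ao[m, r, nu, s] * t_ao[r][s]
                 for r in range(n) for s in range(n))
             for nu in range(n)] for m in range(n)]
    # transform the operator back to the virtual MO basis
    return [[sum(c_virt[m][a] * k_ao[m][nu] * c_virt[nu][b]
                 for m in range(n) for nu in range(n))
             for b in range(nv)] for a in range(nv)]


def eeo_mo_reference(ao, c_virt, t):
    """K(T)_ab = sum_cd (ac|bd) T_cd with explicitly transformed integrals."""
    n, nv = len(c_virt), len(c_virt[0])

    def mo(a, c, b, d):
        return sum(c_virt[m][a] * c_virt[r][c] * c_virt[nu][b] * c_virt[s][d]
                   * ao[m, r, nu, s]
                   for m, r, nu, s in product(range(n), repeat=4))

    return [[sum(mo(a, c, b, d) * t[c][d]
                 for c in range(nv) for d in range(nv))
             for b in range(nv)] for a in range(nv)]


ao = ao_integrals([0.0, 1.2, 2.4, 3.6])                          # hypothetical
c_virt = [[0.7, -0.2], [-0.6, 0.5], [0.4, 0.6], [-0.1, -0.7]]    # hypothetical
amplitudes = [[0.05, -0.02], [0.01, 0.03]]                       # hypothetical
k_fast = eeo_ao_driven(ao, c_virt, amplitudes)
k_ref = eeo_mo_reference(ao, c_virt, amplitudes)
print(max(abs(k_fast[a][b] - k_ref[a][b]) for a in range(2) for b in range(2)))
```

The AO-driven route never forms the 4-external integrals $(ac|bd)$, which is the point of constructing the operators directly from the AO ERIs.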
The DKEXT algorithm employs a paging scheme quite similar to that used in the direct transformation scheme discussed in section 6.3. The amplitudes and EEOs are presorted according to shell blocks $T^P_{MR}$, with $M$ running slowest, and stored on disk. In this way it is possible to read/write them for a given shell $M$
and for all $P$ and $R$. All contributions arising from integrals over one occupied and three external orbitals $(ia|bc)$ can be taken into account by an additional set of EEOs $K(D^P)$, where the $D^P$ are modified coefficient matrices [117,118,22], which differ from the $T^P$ by the addition of internal-external blocks arising from contributions of single excitations. In single-reference methods (CISD, MP4(SDQ), QCISD, CCSD), as well as for evaluating the MRPT3 energy, it is sufficient to compute only the latter set of operators. In MRCI calculations this would in principle be possible as well, but since complicated correction terms are then necessary [118], it is easier to compute the operators $K(T^P)$ and $K(D^P)$ separately. Of course, the two sets can be computed together in a single integral pass. Since the EEOs depend explicitly on the amplitudes, they must be computed in each iteration. The computational complexity of EEO formation is nominally $O(m_{\mathrm{occ}}^2 m^4) = O(N^6)$. In an integral-direct context this can be reduced to $O(N^4)$ by virtue of integral prescreening [60]. In order to obtain efficient prescreening, it is important to include the amplitudes in the prescreening scheme. Nevertheless, in integral-direct calculations with large basis sets, the EEO construction often dominates the computational effort.

6.6 Integral-direct coupled cluster methods
The first integral-direct CCSD method was developed by Koch and coworkers [129,145]. In this method the transformed integrals are never stored on disk. Instead, distributions of AO integrals $(\mu\rho|\nu\sigma)$ are generated for one fixed AO index and all values of the remaining indices. One such distribution at a time is kept in memory and consumed immediately to compute all contributions to the CCSD residual (fully direct CCSD). This method, although very efficient on vector computers due to long vector lengths, suffers from some severe bottlenecks (most importantly, the $m^3$ memory requirement of the integral distributions mentioned above), which limit its range of application for larger systems. An alternative method has been proposed by Schütz, Werner and Lindh [60], which differs from the above method in that the partially transformed integrals are stored on disk ($\tfrac{3}{2} m_{\mathrm{occ}}^2 m^2$ words are required). Considering that the doubles amplitudes, as the variational parameters of the iterative CCSD procedure, and the residuals have to be stored on disk anyway in several instances (due to DIIS convergence acceleration), with a required disk space of $n_{\mathrm{DIIS}}\, m_{\mathrm{occ}}^2 m^2$, this certainly does not constitute a further bottleneck and seems to be a reasonable strategy. The immediate advantage is that the rest of the program remains entirely unchanged, and that the same integral-direct modules as for the MCSCF and MRCI programs can be used. Furthermore, in such a scheme the maximum memory requirements can be reduced to $O(m_{\mathrm{occ}} m^2)$, and to $O(N)$ for local CCSD (cf. section 4.2). The MP3, MP4(SDQ), QCISD and CCSD methods, which are all related, require the same internal operators $J^{ij}$, $K^{ij}$ and the EEOs $K(D^{ij})$ as introduced for the MRCI case in the previous section. A further complication arises in the CCSD method [22], where the additional operators $J(E^{ij})$ and $K(E^{ij})$ (cf. eqs. (132)) are needed. As discussed in section 2.5, these operators can be obtained by a generalized integral transformation (cf. eqs. (133)-(137)). This transformation can be
performed using the same integral-direct module as employed for generating the Jij and Kij matrices, but since they depend on the singles amplitudes they must be performed in each iteration. An important point to notice is that the latter operators are only needed for CCSD, but not for the QCISD (quadratic conguration interaction) method33 . While the computational eort for these two methods is not too much dierent in conventional calculations22 , in the integral-direct case the full CCSD takes signicantly more time, due to this additional transformation which must be performed in each iteration. For most applications, QCISD and CCSD results are very similar, and QCISD may often be more cost eective for integral-direct calculations of large molecules, even though from a theoretical point of view CCSD is more satisfactory. If the 3-external integrals are available though, as is usually the case for local CCSD calculations, then the construction of the J(Eij ) and K(Eij ) operators takes little time, hence there is little reason to use the QCISD model in that case. Acknowledgments Financial support from the EC as part of the TMR network Potential Energy Surfaces for Spectroscopy and Dynamics, contract No. FMRX-CT96-088 (DG 12 BIUO) and as part of the RTN network Theoretical Studies of Electronic and Dynamical Processes in Molecules and Clusters (THEONET II), contract No. RTN1-1999-00121 is gratefully acknowledged. Much of the research described in these notes has been supported by DFG, EPSRC, Fonds der Chemischen Industrie and BASF AG. References 1. R. H. Nobes, J. A. Pople, L. Radom, N. C. Handy, and P. J. Knowles, Chem. Phys. Letters 138, 481 (1987). 2. P. J. Knowles and N. C. Handy, J. Phys. Chem. 92, 3097 (1988). 3. E. R. Davidson, J. Comput. Phys. 17, 87 (1975). 4. T. Kato, Commun. Pure Appl. Math. 10, 151 (1957). 5. R. T. Pack and W. Byers Brown, J. Chem. Phys. 45, 556 (1966). 6. W. A. Bingel, Z. Naturforsch. Teil A 18, 1249 (1963). 7. V. A. Rassolov and D. M. 
Chipman, J. Chem. Phys. 104, 9908 (1996). 8. E. A. Hylleraas, Z. Phys. A 54, 347 (1929). 9. H. M. James and A. S. Coolidge, J. Chem. Phys. 1, 825 (1933). 10. W. Kutzelnigg, Theor. Chim. Acta 68, 445 (1985). 11. W. Kutzelnigg and W. Klopper, J. Chem. Phys. 94, 1985 (1991). 12. D. C. Clary and N. C. Handy, Phys. Rev. A14, 1607 (1976). 13. D. C. Clary, Mol. Phys. 34, 793 (1977). 14. P. Jrgensen and J. Simons, Second QuantizationBased Methods in Quantum Chemistry (Academic Press, New York, 1981). 15. J. Almlf and P. R. Taylor, J. Chem. Phys. 86, 4070 (1987). o 16. J. Almlf and P. R. Taylor, J. Chem. Phys. 92, 551 (1990). o 17. P.-O. Widmark, P.-. Malmqvist, and B. O. Roos, Theor. Chim. Acta 77, A
78
291 (1990).
18. J. Almlöf and P. R. Taylor, Adv. Quantum Chem. 22, 301 (1991).
19. T. H. Dunning Jr., J. Chem. Phys. 90, 1007 (1989).
20. W. Meyer, J. Chem. Phys. 64, 2901 (1976).
21. P. Pulay, S. Saebø, and W. Meyer, J. Chem. Phys. 81, 1901 (1984).
22. C. Hampel, K. A. Peterson, and H.-J. Werner, Chem. Phys. Lett. 190, 1 (1992).
23. W. Meyer, Int. J. Quantum Chem. Symp. 5, 341 (1971).
24. W. Meyer, J. Chem. Phys. 58, 1017 (1973).
25. R. Ahlrichs, P. Scharf, and C. Ehrhardt, J. Chem. Phys. 82, 890 (1985).
26. S. R. Langhoff and E. R. Davidson, Int. J. Quantum Chem. 8, 61 (1974).
27. B. O. Roos and P. E. M. Siegbahn, in Methods of Electronic Structure Theory, edited by H. F. Schaefer III (Plenum, New York, 1977).
28. J. Čížek, J. Chem. Phys. 45, 4256 (1966).
29. J. Čížek, Adv. Chem. Phys. 14, 35 (1969).
30. J. Čížek and J. Paldus, Int. J. Quantum Chem. 5, 359 (1971).
31. G. D. Purvis and R. J. Bartlett, J. Chem. Phys. 76, 1910 (1982).
32. G. E. Scuseria, C. L. Janssen, and H. F. Schaefer III, J. Chem. Phys. 89, 7382 (1988).
33. J. A. Pople, M. Head-Gordon, and K. Raghavachari, J. Chem. Phys. 87, 5968 (1987).
34. J. Čížek and J. Paldus, Phys. Scripta 21, 251 (1980).
35. R. J. Bartlett and G. D. Purvis, Phys. Scripta 21, 255 (1980).
36. R. A. Chiles and C. E. Dykstra, J. Chem. Phys. 74, 4544 (1981).
37. G. Scuseria and H. F. Schaefer III, Chem. Phys. Lett. 142, 354 (1987).
38. N. C. Handy, J. A. Pople, M. Head-Gordon, K. Raghavachari, and G. W. Trucks, Chem. Phys. Lett. 164, 185 (1989).
39. K. Raghavachari, J. A. Pople, E. S. Replogle, M. Head-Gordon, and N. C. Handy, Chem. Phys. Lett. 167, 115 (1990).
40. M. Urban, J. Noga, S. J. Cole, and R. J. Bartlett, J. Chem. Phys. 83, 4041 (1985).
41. K. Raghavachari, G. W. Trucks, J. A. Pople, and M. Head-Gordon, Chem. Phys. Lett. 157, 479 (1989).
42. S. A. Kucharski and R. J. Bartlett, Adv. Quantum Chem. 18, 281 (1986).
43. S. A. Kucharski, J. Noga, and R. J. Bartlett, J. Chem. Phys. 90, 7282 (1989).
44. K. Raghavachari, J. A. Pople, E. S. Replogle, and M. Head-Gordon, J. Phys. Chem. 94, 5579 (1990).
45. Z. He and D. Cremer, Theor. Chim. Acta 85, 305 (1993).
46. M. J. O. Deegan and P. J. Knowles, Chem. Phys. Lett. 227, 321 (1994).
47. P. J. Knowles, C. Hampel, and H.-J. Werner, J. Chem. Phys. 99, 5219 (1993).
48. C. Janssen and H. F. Schaefer III, Theor. Chim. Acta 79, 1 (1991).
49. P. J. Knowles, C. Hampel, and H.-J. Werner, J. Chem. Phys. 111, 0000 (2000).
50. P. Neogrády, M. Urban, and I. Hubač, J. Chem. Phys. 100, 3706 (1994).
51. P. G. Szalay and J. Gauss, J. Chem. Phys. 107, 9028 (1997).
52. S. Saebø and P. Pulay, Annu. Rev. Phys. Chem. 44, 213 (1993).
53. C. Hampel and H.-J. Werner, J. Chem. Phys. 104, 6286 (1996).
54. M. Schütz, G. Hetzer, and H.-J. Werner, J. Chem. Phys. 111, 5691 (1999).
55. R. A. Friesner, R. B. Murphy, M. D. Beachy, M. N. Ringnalda, W. T. Pollard, B. D. Dunietz, and Y. Cao, J. Phys. Chem. A 103, 1913 (1999).
56. G. Reynolds, T. J. Martinez, and E. A. Carter, J. Chem. Phys. 105, 6455 (1996).
57. P. Pulay, Chem. Phys. Lett. 100, 151 (1983).
58. P. Pulay and S. Saebø, Theor. Chim. Acta 69, 357 (1986).
59. S. Saebø and P. Pulay, J. Chem. Phys. 86, 914 (1987).
60. M. Schütz, R. Lindh, and H.-J. Werner, Mol. Phys. 96, 719 (1999).
61. M. Schütz and H.-J. Werner, manuscript in preparation.
62. M. Schütz and H.-J. Werner, Chem. Phys. Lett., in press.
63. S. F. Boys, in Quantum Theory of Atoms, Molecules, and the Solid State, edited by P. O. Löwdin, page 253 (Academic, New York, 1966).
64. C. Edmiston and K. Ruedenberg, J. Chem. Phys. 43, S97 (1965).
65. J. Pipek and P. G. Mezey, J. Chem. Phys. 90, 4916 (1989).
66. M. Head-Gordon, P. E. Maslen, and C. A. White, J. Chem. Phys. 108, 616 (1998).
67. G. E. Scuseria and P. Y. Ayala, J. Chem. Phys. 111, 8330 (1999).
68. J. W. Boughton and P. Pulay, J. Comput. Chem. 14, 736 (1993).
69. G. Hetzer, P. Pulay, and H.-J. Werner, Chem. Phys. Lett. 290, 143 (1998).
70. G. Rauhut, P. Pulay, and H.-J. Werner, J. Comput. Chem. 19, 1241 (1998).
71. A. ElAzhary, G. Rauhut, P. Pulay, and H.-J. Werner, J. Chem. Phys. 108, 5185 (1998).
72. M. Schütz, G. Rauhut, and H.-J. Werner, J. Phys. Chem. A 102, 5997 (1998).
73. N. Runeberg, M. Schütz, and H.-J. Werner, J. Chem. Phys. 110, 7210 (1999).
74. J. Gauss and H.-J. Werner, Mol. Phys., in press.
75. MOLPRO is a package of ab initio programs written by H.-J. Werner and P. J. Knowles, with contributions from J. Almlöf, R. D. Amos, A. Berning, P. Celani, D. L. Cooper, M. J. O. Deegan, A. J. Dobbyn, F. Eckert, S. T. Elbert, C. Hampel, G. Hetzer, T. Korona, R. Lindh, A. W. Lloyd, W. Meyer, M. E. Mura, A. Nicklass, K. Peterson, R. Pitzer, P. Pulay, G. Rauhut, M. Schütz, H. Stoll, A. J. Stone, P. R. Taylor, and T. Thorsteinsson.
76. J. Paldus, J. Chem. Phys. 61, 5321 (1974).
77. J. Hinze, editor, The Unitary Group (Springer-Verlag, Berlin, 1979).
78. E. R. Davidson, in Methods in Computational Molecular Physics, edited by G. H. F. Diercksen and S. Wilson (Reidel, Dordrecht, 1983).
79. P. E. M. Siegbahn, Chem. Phys. Lett. 109, 417 (1984).
80. J. Olsen, B. O. Roos, P. Jørgensen, and H. J. A. Jensen, J. Chem. Phys. 89, 2185 (1988).
81. P. J. Knowles and N. C. Handy, Chem. Phys. Lett. 111, 315 (1984).
82. P. J. Knowles and N. C. Handy, Comput. Phys. Commun. 54, 75 (1989).
83. S. Zarrabian, C. R. Sarma, and J. Paldus, Chem. Phys. Lett. 155, 183 (1989).
84. D. M. Brink and G. R. Satchler, Angular Momentum (Clarendon, Oxford, 2nd edition, 1968).
85. R. N. Zare, Angular Momentum (Wiley, New York, 1988).
86. R. Pauncz, Spin Eigenfunctions (Plenum, New York, 1979).
87. P. J. Knowles and H.-J. Werner, Chem. Phys. Lett. 145, 514 (1988).
88. K. Ruedenberg, L. M. Cheung, and S. T. Elbert, Int. J. Quantum Chem. 16, 1069 (1979).
89. B. O. Roos, P. Taylor, and P. E. M. Siegbahn, Chem. Phys. 48, 157 (1980).
90. P. E. M. Siegbahn, J. Almlöf, A. Heiberg, and B. O. Roos, J. Chem. Phys. 74, 2384 (1981).
91. P. J. Knowles and H.-J. Werner, Chem. Phys. Lett. 115, 259 (1985).
92. M. R. Hoffmann, D. J. Fox, J. F. Gaw, Y. Osamura, Y. Yamaguchi, R. S. Grev, G. Fitzgerald, H. F. Schaefer III, P. J. Knowles, and N. C. Handy, J. Chem. Phys. 80, 2660 (1984).
93. W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in Fortran 77: The Art of Scientific Computing (Cambridge University Press, 2nd edition, 1992).
94. P. E. Gill, W. Murray, and M. H. Wright, Practical Optimization (Academic Press, 1981).
95. J. A. Pople, R. Krishnan, H. B. Schlegel, and J. S. Binkley, Int. J. Quantum Chem. S13, 225 (1979).
96. J. Olsen, D. L. Yeager, and P. Jørgensen, Adv. Chem. Phys. 54, 1 (1983).
97. D. Yarkony, Chem. Phys. Lett. 77, 634 (1981).
98. H.-J. Werner and P. J. Knowles, J. Chem. Phys. 82, 5053 (1985).
99. P. Császár and P. Pulay, J. Mol. Struc. 114, 31 (1984).
100. C. Møller and M. S. Plesset, Phys. Rev. 46, 618 (1934).
101. J. A. Pople, R. Krishnan, H. B. Schlegel, and J. S. Binkley, Int. J. Quantum Chem. 14, 545 (1978).
102. R. J. Bartlett and D. M. Silver, J. Chem. Phys. 62, 3258 (1975).
103. R. B. Murphy and R. P. Messmer, Chem. Phys. Lett. 183, 443 (1991).
104. K. Andersson, P.-Å. Malmqvist, and B. O. Roos, J. Chem. Phys. 96, 1218 (1992).
105. K. Hirao, Chem. Phys. Lett. 196, 397 (1992).
106. B. O. Roos, K. Andersson, M. P. Fülscher, P.-Å. Malmqvist, L. Serrano-Andrés, K. Pierloot, and M. Merchán, Adv. Chem. Phys. 93, 219 (1996).
107. P. Celani and H.-J. Werner, J. Chem. Phys., in press.
108. H.-J. Werner, Mol. Phys. 89, 645 (1996).
109. R. J. Buenker and S. D. Peyerimhoff, Theor. Chim. Acta 35, 33 (1974).
110. P. E. M. Siegbahn, Int. J. Quantum Chem. 18, 1229 (1980).
111. J. Lischka, R. Shepard, F. B. Brown, and I. Shavitt, Int. J. Quantum Chem. Symp. 15, 91 (1981).
112. V. R. Saunders and J. H. van Lenthe, Mol. Phys. 48, 923 (1983).
113. W. Meyer, in Methods of Electronic Structure Theory, edited by H. F. Schaefer III (Plenum, New York, 1977).
114. P. E. M. Siegbahn, J. Chem. Phys. 72, 1647 (1980).
115. C. W. Bauschlicher, P. R. Taylor, N. C. Handy, and P. J. Knowles, J. Chem. Phys. 85, 1469 (1986).
116. C. W. Bauschlicher, Jr., S. R. Langhoff, and P. R. Taylor, Adv. Chem. Phys. 77, 103 (1990).
117. H.-J. Werner and E. A. Reinsch, J. Chem. Phys. 76, 3144 (1982).
118. H.-J. Werner and P. J. Knowles, J. Chem. Phys. 89, 5803 (1988).
119. E. R. Davidson, in The world of quantum chemistry, edited by R. Daudel and B. Pullman (Reidel, Dordrecht, 1974).
120. R. J. Gdanitz and R. Ahlrichs, Chem. Phys. Lett. 143, 413 (1988).
121. P. G. Szalay and R. J. Bartlett, Chem. Phys. Lett. 214, 481 (1993).
122. J. Almlöf, J. K. Faegri, and K. Korsell, J. Comput. Chem. 3, 385 (1982).
123. P. Taylor, Int. J. Quantum Chem. 31, 521 (1987).
124. M. Frisch, I. N. Raganzos, M. A. Robb, and H. B. Schlegel, Chem. Phys. Lett. 189, 524 (1992).
125. S. Sæbø and J. Almlöf, Chem. Phys. Lett. 154, 83 (1989).
126. M. Head-Gordon, J. Pople, and M. Frisch, Chem. Phys. Lett. 153, 503 (1988).
127. M. Schütz and R. Lindh, Theor. Chim. Acta 95, 13 (1997).
128. M. Frisch, M. Head-Gordon, and J. Pople, Chem. Phys. Lett. 166, 275 (1990).
129. H. Koch, O. Christiansen, R. Kobayashi, P. Jørgensen, and T. Helgaker, Chem. Phys. Lett. 228, 233 (1994).
130. W. Klopper and J. Noga, J. Chem. Phys. 103, 6127 (1995).
131. R. Lindh, in The Encyclopedia of Computational Chemistry Vol. 2, edited by P. v. R. Schleyer, N. L. Allinger, T. Clark, J. Gasteiger, P. A. Kollman, H. F. Schaefer III, and P. R. Schreiner, page 1337 (John Wiley & Sons, Chichester, 1998).
132. C. A. White, B. G. Johnson, P. M. W. Gill, and M. Head-Gordon, Chem. Phys. Lett. 230, 8 (1994).
133. C. A. White, B. G. Johnson, P. M. W. Gill, and M. Head-Gordon, Chem. Phys. Lett. 253, 268 (1996).
134. J. C. Burant, G. E. Scuseria, and M. J. Frisch, J. Chem. Phys. 105, 8969 (1996).
135. M. Challacombe, E. Schwegler, and J. Almlöf, J. Chem. Phys. 104, 4685 (1996).
136. C. Ochsenfeld, C. A. White, and M. Head-Gordon, J. Chem. Phys. 109, 1663 (1998).
137. M. Häser and R. Ahlrichs, J. Comput. Chem. 10, 104 (1989).
138. O. Vahtras, J. Almlöf, and M. Feyereisen, Chem. Phys. Lett. 213, 514 (1993).
139. J. Almlöf, in Modern Electronic Structure Theory, edited by D. Yarkony, number I in Advanced Series in Physical Chemistry - Vol. 2, page 110 (World Scientific Publishing Co. Pte. Ltd., 1995).
140. J. Almlöf, in Lecture Notes in Quantum Chemistry, European Summer School in Quantum Chemistry, edited by B. Roos, number 64 in Lecture Notes in Chemistry, page 1 (Springer-Verlag Berlin Heidelberg, 1994).
141. B. O. Roos, Int. J. Quantum Chem. Symp. 14, 175 (1980).
142. H.-J. Werner, Adv. Chem. Phys. 69, 1 (1987).
143. H.-J. Werner and W. Meyer, J. Chem. Phys. 73, 2342 (1980).
144. H.-J. Werner and W. Meyer, J. Chem. Phys. 74, 5794 (1981).
145. H. Koch, A. S. de Merás, T. Helgaker, and O. Christiansen, J. Chem. Phys. 104, 4157 (1996).
