Introduction to Electronic Structure Calculations: The variational principle


Introduction to Electronic Structure Calculations: The variational principle


This post is the first part in an upcoming series explaining how to perform electronic structure calculations. The focus of this series is how to set-up an electronic structure calculation from scratch. We are not going to use any of the available quantum chemical codes, but write our very own code. This way, we will obtain a better understanding of how quantum chemical calculations are being performed. The way I am going to present the topic is by first providing the underlying theory and then further explaining the theory by giving a set of examples.

The outline of this series is as follows: We start of with explaining some basics such as the (linear) variational principle, the Hartree product and the Slater determinant. From there on, we derive the Hartree-Fock equations and show how to set-up a simple calculation for the H2 molecule. Next, we explain about basis sets, Gaussian integrals and the Hartree-Fock self-consistent field procedure. Finally, we talk about atomic and molecular orbitals, the construction of MO diagrams and the visualization of orbitals.

For this series, I am going to assume that you have a basic understanding of quantum mechanics. You know what a wave function is and its statistical interpretation. Furthermore, I assume that you have passed introductory courses in calculus and linear algebra.

The variational principle

Given a normalized wave function $\lvert\phi\rangle$ that satisfies the appropriate boundary conditions, then the expectation value of the Hamiltonian is an upper bound to the exact ground state energy. In other words, the lowest energy we can find, gives us the best wave function. Thus,

if $ \langle \phi \rvert \phi \rangle=1 $, then $ \langle \phi \lvert \hat{H} \rvert \phi \rangle \geq E_{0} $.

Example 1: The energy of the 1s orbital in hydrogen

Let's exemplify this by providing an example. In the following example, we wish to obtain the best value for $ \alpha $ for the trial function $ \lvert \phi \rangle = N \exp \left( - \alpha r^{2} \right) $ (where $N$ is a normalization constant) that describes the 1s orbital in the H-atom. The Hamiltonian for the H-atom in atomic units is: $$ \hat{H} = -\frac{1}{2} \nabla^{2} - \frac{1}{r}. $$ The first step is to find $E$ as a function of $\alpha$. Therefore, we need to evaluate the following integral $$ E = \langle \phi \lvert \hat{H} \rvert \phi \rangle = \int_{0}^{\infty} \textrm{d}r \int_{0}^{2\pi} \textrm{d} \theta \int_{0}^{\pi} \textrm{d} \varphi \, r^{2} \sin \varphi N \exp \left( - \alpha r^{2} \right) \left[ -\frac{1}{2} \nabla^{2} - \frac{1}{r} \right] N \exp \left( - \alpha r^{2} \right) .$$ Note that because we are integrating in spherical coordinates, we have introduced the Jacobian $$ J = r^{2} \sin \varphi .$$ Furthermore, the radial part of the Laplacian $\nabla^{2}$ in spherical coordinates has the following form: $$ \nabla^{2} = r^{-2} \frac{\textrm{d}}{\textrm{d} r} \left( r^{2} \frac{\textrm{d}}{\textrm{d} r} \right)$$ (we can ignore the angular parts of the Laplacian, because the 1s trial wave function is spherically symmetrical)

The above integral can be divided into two seperate integrals of the following forms $$ I_1 = - \left( \frac{4 \pi N^{2}}{2} \right) \int_0^\infty r^2 \exp \left( -\alpha r^2 \right) \left[ r^{-2} \frac{\textrm{d}}{\textrm{d} r} \left( r^{2} \frac{\textrm{d}}{\textrm{d} r} \right) \exp \left( -\alpha r^2 \right) \right] \textrm{d}r $$ and $$ I_2 = - \left( \frac{4 \pi N^{2}}{2} \right) \int_0^\infty r^2 \exp \left( -\alpha r^2 \right) \frac{1}{r} \exp \left( -\alpha r^2 \right) \textrm{d}r .$$ Since the wavefunction has no angular dependencies, the azimuthal ($\theta$) and polar ($\varphi$) parts have already been integrated in the above equations. These give the factor $4\pi$. To solve the first integral ($I_{1}$), we need to apply the Laplacian. This gives $$ I_1 = - \left( \frac{4 \pi N^{2}}{2} \right) \int_0^\infty \exp \left( -\alpha r^2 \right) \left[ -6 \alpha^{2} + 4 \alpha^{2}r^{4} \right] \exp \left( -\alpha r^2 \right) \textrm{d}r .$$ Slightly rearranging this equation gives us $$ I_1 = - 2 \pi N^{2} \int_0^\infty \left( 4 \alpha^{2}r^{4} -6 \alpha^{2} \right) \exp \left( -2 \alpha r^2 \right) \textrm{d}r .$$ This integral can be solved by performing integration by parts, but here we will use just use the following general formulas: $$\int_{0}^{\infty} \textrm{d}r \, r^{2m} \exp \left(-\alpha r^{2} \right) = \frac{(2m)!\pi^{1/2}}{2^{2m+1}m!\alpha^{m+1/2}} $$ and $$\int_{0}^{\infty} \textrm{d}r \, r^{2m+1} \exp \left(-\alpha r^{2} \right) = \frac{m!}{2\alpha^{m+1}} .$$ Applying these formulas for $I_1$ yields $$I_{1} = -2 \pi N^{2} \left[ 4 \alpha^{2} \frac{4! \pi^{1/2}}{2^5 2! (2 \alpha)^{5/2}} - 6 \alpha \frac{2!\pi^{1/2}}{2^3 1! (2\alpha)^{3/2}} \right] = N^{2} \frac{3 \pi^{3/2}} {4 \sqrt{2\alpha}} $$ and for $I_{2}$: $$I_{2} = -4 \pi N^{2} \int_{0}^{\infty} r \exp \left( -2 \alpha r^{2} \right) \textrm{d}r = -4 \pi N^{2} \frac{1}{4 \alpha} = \frac{-\pi}{\alpha} N^{2} .$$ Summing these two parts gives $$\langle \phi \lvert H \rvert \phi \rangle = I_1 + I_2 = N^2 \left( \frac{3 \pi ^{3/2}}{4 \sqrt{2 \alpha}} - \frac{\pi}{\alpha} \right) .$$ Using the same procedure, we can also evaluate the integral $ \langle \phi \rvert \phi \rangle $. $$ \langle \phi \rvert \phi \rangle = 4 \pi N^{2} \int_{0}^{\infty} r^2 \exp \left( -2 \alpha r^2 \right) \textrm{d}r = N^{2} \frac{\pi^{3/2}}{2 \sqrt{2}\alpha^{3/2}} $$ In order to evaluate the expectation value for the energy $E$ for the trial wave function $\lvert \varphi \rangle$, we need to evaluate $$ E = \frac{\langle \phi \lvert \hat{H} \rvert \phi \rangle }{\langle \phi \rvert \phi \rangle} = \frac{3 \alpha}{2} - \frac{2 \sqrt{2} \alpha ^{1/2}}{\pi ^{1/2}} .$$ The reason we need to divide by $\langle \phi \rvert \phi \rangle$ is because $\lvert \phi \rangle$ is not normalized yet. If $\lvert \phi \rangle$ would be normalized, then the integral $\langle \phi \rvert \phi \rangle$ would be equal to one and thus drop out. The best value for $\alpha$ can then be found by differentiating $E$ with respect to $\alpha$ and finding that value for $\alpha$ where the differential is zero. $$ \frac{\textrm{d}E}{\textrm{d}\alpha} = 0 .$$ Thus $$ \frac{\textrm{d}E}{\textrm{d}\alpha} = \frac{3}{2} - \frac{\sqrt{2}}{\pi ^{1/2} \alpha^{1/2}} $$ and $$ \alpha = \frac{8}{9 \pi}.$$ This gives us the following upper bound for the energy $E$: $$E = \frac{3}{2} \frac{8}{9 \pi} - \frac{2 \sqrt{2} \left( \frac{8}{9 \pi} \right)^{1/2}}{\pi ^{1/2}} = -\frac{4}{3 \pi} \approx -0.4244 .$$

The exact energy for the 1s orbital in hydrogen is -0.5 HT with the corresponding wave function $\lvert \phi \rangle = \frac{1}{\sqrt{\pi}} \exp \left( - r \right)$.

The linear variational principle

Although the exact answer to the energy of the 1s orbital in hydrogen is known, let us assume that we do not know it and would like to obtain it. One way is to pick various trial wave functions with one independent variable and optimize that variable. The lowest energy we are going to find gives us then the best approximation of the ground state wave function. This approach would be quite tedious (and how would you know what trial wave function to choose in the first place anyway?).

It would be nice if we could have just a single trial wave function that has a lot of 'variational freedom'. Here, I propose to use a trial wave function that is a linear set of fixed wave function like so:

$$ \lvert \phi \rangle = \sum_{i=1}^N c_{i} \lvert \psi_{i} \rangle $$

If we can assume that the wave functions $\lvert \psi_i \rangle$ are orthonormal in the sense that $\langle \psi_i \lvert \psi_j \rangle = \delta_{ij}$, then the energy is

$$E = \sum_{ij} \langle \psi_i \lvert \hat{H} \rvert \psi_j \rangle = \sum_{ij} c_{i}c_{j} H_{ij}$$

and we are tasked with finding the best set of coefficients $c_{i}$ that minimizes $E$. It are exactly these coefficients (and accompanying wave functions) that provide us with the necessary variational freedom. (Note that $H_{ij} = \langle \psi_i \lvert \hat{H} \rvert \psi_j \rangle$)

The procedure to find this set is by using Langrange's method of undetermined multiplies. Herein

$$L \left( c_1, c_2, \cdots, E \right) = \langle \phi \lvert \hat{H} \rvert \phi \rangle - E \left(\langle \phi \lvert \phi \rangle - 1 \right) $$ and we need to find that particular set of $c_k$ where

$$ \frac{\partial L}{\partial c_{k}} = 0 $$

for each $k$.

For each $k$, this gives the following equation

$$ \sum_{j} c_j H_{kj} + \sum_i c_i H_{ik} - 2 E c_{k} = 0 $$.

Since $\langle \psi_a \lvert \hat{H} \rvert \psi_b \rangle = \langle \psi_b \lvert \hat{H} \rvert \psi_a \rangle = H_{ab} = H_{ba}$, we can simplify the above equation to

$$ \sum_{j} H_{ij} c_j - E c_{i} = 0. $$

The above expression is just the matrix expression

$$ H\vec{c} = E\vec{c} .$$

This last equation looks probably very familiar to you and relates to the matrix eigenvalue problem. It is a very typical problem in linear algebra. To solve the matrix eigenvalue problem, one needs to diagonalize the $H$ matrix. This gives a set of eigenvalues and corresponding eigenvalues. The eigenvector corresponding to the lowest eigenvalue is the best set of $c_k$ we are interested in.

Example 2: The 2x2 linear variational problem for finding the energy of the 1s orbital of hydrogen

To illustrate the above procedure, we are going to find the best solution for the 1s orbital in the hydrogen atom using a trial wave function that is a linear combination of two orthonormal wave functions. The two orthonormal wave functions are $$\psi_{1} = \frac{2}{\pi^{3/4}} \exp \left( -r^2 / 2 \right)$$ and $$\psi_{2} = \frac{2\pi^{1/4}r - 4/\pi^{1/4}}{\sqrt{2\pi(3\pi-8)}} \exp \left( -r^2 / 2 \right).$$ And because of the orthonormality condition, the following integrals hold $$\langle \psi_{1} \lvert \psi_{2} \rangle = 0$$ and $$\langle \psi_{1} \lvert \psi_{1} \rangle = \langle \psi_{2} \lvert \psi_{2} \rangle = 1.$$ Applying the Hamiltonian to these wave functions, we can obtain the following 2x2 H-matrix: $$H = \begin{bmatrix} \langle \psi_{1} \lvert \hat{H} \rvert \psi_{1} \rangle & \langle \psi_{1} \lvert \hat{H} \rvert \psi_{2} \rangle \\\langle \psi_{2} \lvert \hat{H} \rvert \psi_{1} \rangle & \langle \psi_{2} \lvert \hat{H} \rvert \psi_{2} \rangle\end{bmatrix} = \begin{bmatrix} -0.378379 & -0.0185959 \\ -0.0185959 & 1.09531 \end{bmatrix}$$ Note that the H-matrix is symmetric. This is because we can exchange $\lvert \psi_{1} \rangle$ and $\lvert \psi_{2} \rangle$ and obtain the same result. Furthermore, note that although $\langle \psi_{1} \lvert \psi_{2} \rangle = 0$, this does not mean that the integral is also zero when the Hamiltonian operator is applied to either of the wave functions.

In order to find the best energy and approximation to the ground state wave function, we need to diagonalize the H-matrix. This can be easily done by hand, or using the very handy Wolfram Alpha website. The result we obtain is $$D = \begin{bmatrix} 1.09554 & 0 \\ 0 & -0.378614 \end{bmatrix}$$ for the diagonal (eigenvalue) matrix with corresponding $$X = \begin{bmatrix} -0.0126156 & -0.99992 \\ 0.99992 & -0.0126156 \end{bmatrix}$$ for the eigenvector matrix. The first column of $X$ contains the coefficients corresponding to the first eigenvalue (energy) of the $D$-matrix and the second column of $X$ for the second eigenvalue. Since the second eigenvalue is lower ($-0.378614 < 1.09554$), the second column of $X$ contains the coefficients we are looking for. Thus, the best approximation to the ground state wave function is: $$\lvert \phi \rangle = c_{1} \lvert \psi_{1} \rangle + c_{2} \lvert \psi_{2} \rangle = -0.99992 \lvert \psi_{1} \rangle - 0.0126156 \lvert \psi_{2} \rangle.$$ The corresponding energy for this wave function is -0.378614 HT. This result is actually worse than what we obtained using the procedure of example 1 (which was -0.4244 HT). The reason for this is that the variational freedom using a linear combination of only two wave functions is actually less than one where we can vary a coefficient inside an exponent. To get better results, we would need to use a larger set of wave functions in the linear combination.

In Example 2 it was shown how we can obtain the best energy given a trial wave function which is a linear combination of a set of wave functions. Let us call these wave functions in the linear combination the basis functions. In order for the procedure in example 2 to work, the basis functions need to be orthonormal to each other. Finding a set of orthonormal functions by hand is somewhat tedious (try it for yourself if you do not believe me!). It would therefore be very handy if we could apply the above methodology (albeit somewhat changed) using a set of non-orthonormal basis functions.

I already showed you how you can use Lagrange's method of undetermined multipliers in order to derive the matrix equation $Hc = Ec$. When the basis functions are no longer orthonormal, this matrix equation changes slightly and becomes

$$ H\vec{c} = ES\vec{c}.$$

Herein, $S$ is termed the overlap matrix where each term in the matrix is

$$ S_{ij} = \langle \psi_{i} \lvert \psi_{j} \rangle.$$

The introduction of the overlap matrix $S$ makes the matrix diagonalization a tiny bit more complicated. There are several procedures for obtaining the eigenvalues and -vectors. The procedure we are going to use follows the canonical orthogonalization scheme. It works as follows

  1. Construct the overlap matrix $S$.
  2. Diagonalize the overlap matrix $S$ to obtain the eigenvalues of the overlap matrix and its corresponding eigenvector matrix $U$.
  3. Construct a transformation matrix $X = U s^{-1/2}$.
  4. Transform the Hamiltonian matrix to the canonically orthonormalized basis set $H' = X^{\dagger} H X$.
  5. Diagonalize the transformed Hamiltonian matrix to obtain $D$ and $C'$ (where C' are the transformed coefficients).
  6. Obtain the coefficients of the original basis by calculating $C = XC'$.

The above probably looks a bit complicated, but really, it is not. Let me elaborate by showing you another example.

Example 3: Finding the energy of the 1s orbital of hydrogen using a non-orthonormal set of basis functions

The two functions we are going to use are $$\psi_{1} = \exp \left( -r^{2} / 2 \right) $$ and $$\psi_{2} = r \exp \left( -r^{2} / 2 \right).$$ (Step 1)Their corresponding overlap matrix is $$S = \begin{bmatrix} \langle \psi_{1} \lvert \psi_{1} \rangle & \langle \psi_{1} \lvert \psi_{2} \rangle \\\langle \psi_{2} \lvert \psi_{1} \rangle & \langle \psi_{2} \lvert \psi_{2} \rangle\end{bmatrix} = \begin{bmatrix} 5.56833 & 6.2832 \\ 6.2832 & 8.35249 \end{bmatrix}.$$ From this matrix we can clearly see that these functions are neither orthogonal to each other nor normalized.

(Step 2)Diagonalizing the overlap matrix yields the following diagonal and eigenvector matrices $$D' = \begin{bmatrix} 13.396 & 0 \\ 0 & 0.524846 \end{bmatrix}.$$ and $$ U = \begin{bmatrix} 0.625975 & -0.779843 \\ 0.779843 & 0.625975 \end{bmatrix}.$$ (Step 3)We can obtain the transformation matrix by calculating $$X = U s^{-1/2} = \begin{bmatrix} 0.625975 & -0.779843 \\ 0.779843 & 0.625975 \end{bmatrix} \begin{bmatrix} 1/\sqrt{13.396} & 0 \\ 0 & 1/\sqrt{0.524846} \end{bmatrix} = \begin{bmatrix} 0.171029 & -1.07644 \\ 0.213069 & 0.864054 \end{bmatrix}.$$ The Hamiltonian matrix is $$ H = \begin{bmatrix} \langle \psi_{1} \lvert \hat{H} \rvert \psi_{1} \rangle & \langle \psi_{1} \lvert \hat{H} \rvert \psi_{2} \rangle \\\langle \psi_{2} \lvert \hat{H} \rvert \psi_{1} \rangle & \langle \psi_{2} \lvert \hat{H} \rvert \psi_{2} \rangle\end{bmatrix} = \begin{bmatrix} -2.10694 & -2.42674 \\ -2.42674 & -1.4109 \end{bmatrix}.$$ Because our wave function $\lvert \psi_{1} \rangle$ in this example differs only by a normalization constant to the wave function $\lvert \psi_{1} \rangle$ in example 2, the value for the Hamiltonian element $H_{00}$ should only differ by a factor equal to the square of the normalization constant. In other words, if we divide $H_{00}$ by the value for $S_{00}$ (which represents that difference), we obtain the value of $H_{00}$ in example 2. The same applies of course for $H_{11}$. We can obtain the value in example 1 by calculating $H_{11} / S_{11}$.

(Step 4)And applying the transformation matrix to the Hamiltonian matrix gives $$ H' = X^{\dagger} H X = \begin{bmatrix} 0.171029 & 0.213069 \\ -1.07644 & 0.864054 \end{bmatrix} \begin{bmatrix} -2.10694 & -2.42674 \\ -2.42674 & -1.4109 \end{bmatrix} \begin{bmatrix} 0.171029 & -1.07644 \\ 0.213069 & 0.864054 \end{bmatrix}$$ $$ H' = X^{\dagger} H X = \begin{bmatrix} -0.302548 & 0.32611 \\ 0.32611 & 1.01951 \end{bmatrix}.$$ (Step 5)Diagonalizing the transformed Hamiltonian matrix gives $$ D = \begin{bmatrix} 1.09557 & 0 \\ 0 & -0.378614 \end{bmatrix}$$ with corresponding eigenvector matrix $$C' = \begin{bmatrix} 0.227151 & -0.973859 \\ 0.973859 & 0.227151 \end{bmatrix}.$$ (Step 6)Finally, in order to get the coefficients in the original basis, we need to calculate $$ C = XC' = \begin{bmatrix} 0.171029 & -1.07644 \\ 0.213069 & 0.864054 \end{bmatrix} \begin{bmatrix} 0.227151 & -0.973859 \\ 0.973859 & 0.227151 \end{bmatrix} = \begin{bmatrix} -1.00945 & -0.411073 \\ 0.889866 & -0.0112284 \end{bmatrix}.$$ The lowest eigenvalue found from the diagonal matrix $D$ is -0.378614 HT. The corresponding wave function can be found by taking the coefficients of the corresponding column vector, thus giving $$\lvert \phi \rangle = -0.411073 \cdot \exp \left(-r^2 / 2 \right) - 0.0112284 \cdot r \exp \left(-r^2 / 2 \right).$$ This is effectively the same answer as found in example 2 as can be seen from the radial distribution plot (below)! This is of course not very surprising. The variation used in both approximations was by taking different polynomials in front of the exponent. Both attempts at finding the best wave function (example 2 as well as this example) use a zeroth and first order polynomial in front of the exponent. If we compare the radial distribution function of the wave function here and the one obtained in example 2, we can see that these functions overlap.

This does not necessarily imply that the underlying contributions of the basis functions are the same. If you would examine the coefficients and the corresponding basis functions, you would notice a particular difference. The contribution of each basis functions in each set is visualized in the graph below. We can clearly see that the contributions differ.

If we now sum up the contributions of the two basis functions for each linear combination, we note that the two wave functions in example 2 and this example are in principle the same.

Thus, we obtain the same result for each wave function.

Although we have now the tools to find the best approximation of the ground state energy and its corresponding wave function using any set of basis functions, the result so far was not very satisfactory. Using only a single trial wave function as for instance given in example 1 gave better results. In order to improve on the previous results, we need to expand our set of basis functions. A larger basis set should in principle give a better result.

Example 4: The energy of the 1s orbital of hydrogen using a basis set including polynomials up to 4th order

For our basis set, we are going to use the following set of basis functions: $$ \lvert \psi_{i} \rangle = r^{i} \exp \left( -r^2/2 \right) $$ for $i = 0,1,2,3,4$. The overlap matrix $S$ then looks like: $$ S = \begin{bmatrix} 5.56833 & 6.28319 & 8.35249 & 12.5664 & 20.8812 \\ 6.28319 & 8.35249 & 12.5664 & 20.8812 & 37.6991 \\ 8.35249 & 12.5664 & 20.8812 & 37.6991 & 73.0843 \\ 12.5664 & 20.8812 & 37.6991 & 73.0843 & 150.796 \\ 20.8812 & 37.6991 & 73.0843 & 150.796 & 328.879 \end{bmatrix} $$ and the hamiltonian matrix $H$ is $$ H = \begin{bmatrix} -2.10694 & -2.42674 & -4.19506 & -8.35249 & -17.7867 \\ -2.42674 & \ -1.4109 & -2.06931 & -5.25794 & -14.598 \\ -4.19506 & -2.06931 & -1.08169 & \ -2.03167 & -8.98742 \\ -8.35249 & -5.25794 & -2.03167 & 1.45319 & 2.31392 \\ -17.7867 & -14.598 & -8.98742 & 2.31392 & 22.7788 \end{bmatrix} $$ Using the same procedure as in example 3, we obtain for the lowest eigenvalue of the transformed hamiltonian matrix the energy of -0.487773 HT. This result is a significant improvement and comes very close to the exact energy of -0.5HT!

Further improvement by expanding the basis set to 10 basis functions gives a result of -0.498481 HT.


In this blogpost I have shown you that in order to find the best approximation to the ground state energy, the variational principle can be employed. By introducing some kind of variety in a trial wave function and then minimizing its corresponding energy, gives the best approximation to the ground state energy can be found.

In the linear variational method, the variational freedom is provided by the specific set of basis functions used. In this method, the best coefficients in the linear combination of said basis functions that minimize the energy can be found. Here, I have shown that a sufficiently large set of basis functions composed of a Gaussian part (the exponent) multiplied with a polynomial of increasing order is able to approach the true ground state energy of the hydrogen atom.


Finally, I would like to acknowledge Bart Zijlstra for thoroughly proofreading this post and pointing out the mistakes.

If you have questions or comments, feel free to drop a line! Like what you read? Share this page with your friends and colleagues.


What is the answer to Seven + Six?
Please answer with a whole number, i.e. 2, 3, 5, 8,...