Implementation¶

In the context of APDFT, we evaluate energy differences by

\[\begin{split}\Delta E &= \int_\Omega d\mathbf{r} \Delta V(\mathbf{r})\tilde\rho(\mathbf{r})\\ \Delta V(\mathbf{r}) &= \sum_I \frac{\Delta Z_I}{|\mathbf{r} - \mathbf{R_I}|} = \sum_I \Delta V_I(\mathbf{r})\\ \tilde\rho(\mathbf{r}) &= \sum_{n=1}^\infty \frac{1}{n!} \frac{\partial^{n-1}\rho_\lambda}{\partial\lambda^{n-1}}\end{split}\]

where the partial derivatives in \(\tilde\rho\) are obtained by means of finite difference evaluations. More precisely, we apply the chain rule and obtain

\[\begin{split}\frac{\partial\rho}{\partial\lambda} &= \sum_I \frac{\partial\rho}{\partial Z_I}\frac{\partial Z_I}{\partial\lambda} = \sum_I \frac{\partial\rho}{\partial Z_I}\Delta Z_I\\ \frac{\partial^2\rho}{\partial\lambda^2} &= \sum_I \sum_J \frac{\partial^2\rho}{\partial Z_I\partial Z_J}\Delta Z_I\Delta Z_J\\ \frac{\partial^q\rho}{\partial\lambda^q} &= \sum_{I_1} \sum_{I_2}\dots\sum_{I_q} \frac{\partial^q\rho}{\partial Z_{I_1}\partial Z_{I_2}\dots \partial Z_{I_q}}\Delta Z_{I_1}\Delta Z_{I_2}\dots\Delta Z_{I_q}\end{split}\]

running over all nuclei \(I\). For higher orders, the chain rule is applied multiple times, keeping in mind that \(\partial^2_\lambda Z_I=0\).

For all partial derivatives, we use the finite difference scheme but with a particular stencil to reduce the number of (costly) evaluations of \(\rho\). For non-mixed derivatives of arbitrary order, we use regular central finite differences with a symmetric stencil:

\[\begin{split}\frac{\partial \rho}{\partial Z_I} &\approx \frac{\rho(Z_1, \dots, Z_I+h, \dots,Z_N) - \rho(Z_1, \dots, Z_I-h, \dots,Z_N)}{2h}\\ \frac{\partial^q\rho}{\partial Z_I^q} &\approx \frac{1}{(2h)^q}\sum _{i=0}^{q}(-1)^{i}{\binom {q}{i}}\rho\left(Z_1, \dots, Z_I+\left(q-2i\right)h, \dots, Z_N\right)\end{split}\]

In the case of mixed partial derivatives, we use a symmetric stencil with only two additional points

\[\begin{split}\frac{\partial^2 \rho}{\partial Z_I \partial Z_J} \approx \frac{1}{2h^2}\Big[&\rho(Z_1, \dots, Z_I+h, \dots, Z_J+h,\dots,Z_N)\\ &-\rho(Z_1, \dots, Z_I+h,\dots,Z_N)-\rho(Z_1, \dots, Z_J+h,\dots,Z_N)\\ &+2\rho(Z_1, \dots, Z_N)\\ &-\rho(Z_1, \dots, Z_I-h, \dots,Z_N)-\rho(Z_1, \dots, Z_J-h,\dots,Z_N)\\ &+\rho(Z_1, \dots, Z_I-h, \dots, Z_J-h,\dots,Z_N)\Big]\end{split}\]

For a practical implementation, it is much more efficient to collect the prefactors \(\alpha_i\) of the individual self-consistent densities \(\rho_i(\{Z_I\})\) first, such that

\[\tilde\rho(\mathbf{r}) = \sum_i \alpha_i\rho_i(\mathbf{r})\]

where the coefficients \(\alpha_i\) depend on three components: the reference molecule, the target molecule, and the finite difference stencil.

Energies¶

Together with the above equations for \(\Delta V_I\), the expression for the change in energy can be re-cast

\[\begin{split}\Delta E &= \int_\Omega d\mathbf{r} \left(\sum_I \frac{\Delta Z_I}{|\mathbf{r} - \mathbf{R_I}|}\right) \left(\sum_i \alpha_i\rho_i(\mathbf{r})\right) \\ &= \sum_I\sum_i \Delta Z_I\alpha_i \int_\Omega d\mathbf{r} \frac{\rho_i(\mathbf{r})}{|\mathbf{r} - \mathbf{R_I}|}\end{split}\]

which is much more efficient to implement, since the (costly) evaluation of the integral in space only needs to be done once for every combination of density \(\rho_i\) and nucleus \(I\). The latter term relates to the electrostatic potential at the nucleus (EPN) which can be calculated analytically in many codes (e.g. psi4). The sign convention is such that the integral shall be positive.

This can be further simplified as an outer product, i.e. \(\mathbf{\Delta Z}=\{\Delta Z_I\}\):

\[\Delta E = \sum_{jk} \left(\left[\mathbf{\Delta Z}\otimes \mathbf{\alpha}\right]\circ \mathbf{Q}\right)_{jk}\]

where \(\mathbf{Q}_{Ii}\) is just the electrostatic potential of density \(i\) at nucleus \(I\). This formulation is numerically efficient to implement, since it only requires the QM codes to export \(\mathbf{Q}\) instead of an explicit density grid. Moreover, since the density matrix does not need to be exported, interfacing is more reliable because no basis function ordering is relevant to the results.

Density Moments¶

The electronic component of the charge distribution moments can be obtained from a linear combination of the same quantities evaluated for the individual single point calculations from the finite difference scheme. For example, for the dipole moment, we have

\[\mu_i = \int_\Omega d\mathbf{r} \rho(\mathbf{r})\mathbf{r}_i\]

Similarly to the energy, the target density \(\rho_t\) can be build from coefficients of the individual calculations in the context of the finite difference scheme. Note that these coefficients \(\beta_i\) are not the same as for the energy expression, but they still depend on the reference molecule, the target molecule, and the finite difference stencil.

\[\rho_t(\mathbf{r}) = \sum_i \beta_i\rho_i(\mathbf{r})\]

For the dipole moment, this results in a linear combination of the dipole moments of the individual single point calculations

\[\mu_i = \sum_i \beta_i \int_\Omega d\mathbf{r} \rho_i(\mathbf{r}) \mathbf{r}_i\]

Note that this only applies to the electronic component of the charge distribution moments.

Ionic Forces¶

Similarly to dipole moments, the expression for the ionic force allows linear combinations:

\[\begin{split}\mathbf{F}_I &= Z_I\int_\Omega d\mathbf{r} \rho_t(\mathbf{r})\frac{\mathbf{r} - \mathbf{R}_I}{|\mathbf{r} - \mathbf{R}_I|^3}\\ &=Z_I\sum_i \beta_i\int_\Omega d\mathbf{r} \rho_i(\mathbf{r})\frac{\mathbf{r} - \mathbf{R}_I}{|\mathbf{r} - \mathbf{R}_I|^3}\\\end{split}\]

Calculators¶

class apdft.calculator.Calculator(method, basisset, superimpose=False)[source]¶

Bases: apdft.calculator.CalculatorInterface

A concurrency-safe blocking interface for an external QM code.

get_density_on_grid(folder, gridpoints)[source]¶

static get_grid(nuclear_numbers, coordinates, outputfolder)[source]¶

Returns the integration grid used by this calculator for a given set of nuclei and geometry.

Grid weights and coordinates may be in internal units. Return value should be coords, weights. If return value is None, a default grid is used.

get_methods()[source]¶

class apdft.calculator.CalculatorInterface[source]¶

Bases: object

All the functions that need to be implemented for a new code to be supported in APDFT.

get_epn(coordinates, includeatoms, nuclear_charges)[source]¶: Extracts the electronic contribution to the electrostatic potential at the nuclei.

get_input(coordinates, nuclear_numbers, nuclear_charges, grid, iscomparison=False, includeonly=None)[source]¶

Generates the calculation input file for the external code.

It is crucial to remember that this may involve calculations where the basis set of a nucleus and its element to not match.

Parameters

coordinates – A (3,N) array of nuclear coordinates \(\mathbf{r_i}\). [Angstrom] nuclear_numbers: A N array of nuclear numbers \(q_i\). [e]
nuclear_charges – A N array of the effective nuclear charges \(q_i\). [e]
grid – Deprecated.
iscomparison – Boolean. If the input is meant for a comparison calculation, this might allow for shortcuts.
includeonly – A N’ array of 0-based atom indices to be included in the evaluation.

Returns

File contents as string.

get_runfile(coordinates, nuclear_numbers, nuclear_charges, grid)[source]¶

static get_total_energy(folder)[source]¶

Extracts the total energy of a calculation.

Parameters: folder – String. Path to the QM calculation from which the energy is to be extracted.
Returns: Total energy including nuclear-nuclear interaction [Hartree].

class apdft.calculator.MockCalculator(method, basisset, superimpose=False)[source]¶

Bases: apdft.calculator.Calculator

classmethod get_runfile(coordinates, nuclear_numbers, nuclear_charges, grid)[source]¶

Gaussian¶

class apdft.calculator.gaussian.GaussianCalculator(method, basisset, superimpose=False)[source]¶

Bases: apdft.calculator.Calculator

Performs the QM calculations for APDFT with the help of Gaussian.

General Idea: Gaussian supports both fractional nuclar charges (via the undocumented Massage keyword) and the evaluation of the electrostatic potential at the nucleus (via the Prop keyword). Earlier versions used to read fchk files and map out a grid, which is less accurate, but also feasible in Gaussian.

static get_epn(folder, coordinates, includeatoms, nuclear_charges)[source]¶

Extracts the EPN from a Gaussian log file.

The Gaussian convention is to include the nuclear interaction of all other sites. Signs are in the physical sense, i.e. the electronic contribution is negative, while the nuclear contribution is positive. This function also converts to the APDFT convention where \(\int \rho / |\mathbf{r}-\mathbf{R}_I|\) is positive.

Parameters

folder – String, the path to the calculation.
coordiantes – Nuclear coordinates
includeatoms – List of zero-based indices of the atoms to include in APDFT
nuclear_charges – Float, list of nuclear charges for this particular calculation.

Returns

Numpy array of EPN in Hartree.

get_input(coordinates, nuclear_numbers, nuclear_charges, grid, iscomparison=False, includeonly=None)[source]¶

Generates the calculation input file for the external code.

It is crucial to remember that this may involve calculations where the basis set of a nucleus and its element to not match.

Parameters

coordinates – A (3,N) array of nuclear coordinates \(\mathbf{r_i}\). [Angstrom] nuclear_numbers: A N array of nuclear numbers \(q_i\). [e]
nuclear_charges – A N array of the effective nuclear charges \(q_i\). [e]
grid – Deprecated.
iscomparison – Boolean. If the input is meant for a comparison calculation, this might allow for shortcuts.
includeonly – A N’ array of 0-based atom indices to be included in the evaluation.

Returns

File contents as string.

classmethod get_runfile(coordinates, nuclear_numbers, nuclear_charges, grid)[source]¶

static get_total_energy(folder)[source]¶: Returns the total energy in Hartree.

MRCC¶

class apdft.calculator.mrcc.MrccCalculator(method, basisset, superimpose=False)[source]¶

Bases: apdft.calculator.Calculator

static density_on_grid(densityfile, grid)[source]¶

get_density_on_grid(folder, grid)[source]¶

static get_electronic_dipole(folder, gridcoords, gridweights)[source]¶

static get_epn(folder, coordinates, includeatoms, nuclear_charges)[source]¶

Calculates the EPN from a MRCC density file.

MRCC has no support to evaluate the EPN directly. To avoid parsing the ordering of the density matrix in ao basis, the density on a grid is read instead. The resulting error should be small for the first few orders.

Parameters

folder – String, the path to the calculation.
coordiantes – Nuclear coordinates
includeatoms – List of zero-based indices of the atoms to include in APDFT
nuclear_charges – Float, list of nuclear charges for this particular calculation.

Returns

Numpy array of EPN in Hartree.

static get_grid(nuclear_numbers, coordinates, outputfolder)[source]¶

Obtains the integration grid from one of the MRCC output files.

Returns: Grid coordinates [Angstrom], grid weights.

get_input(coordinates, nuclear_numbers, nuclear_charges, grid, iscomparison=False, includeonly=None)[source]¶

Generates the calculation input file for the external code.

It is crucial to remember that this may involve calculations where the basis set of a nucleus and its element to not match.

Parameters

coordinates – A (3,N) array of nuclear coordinates \(\mathbf{r_i}\). [Angstrom] nuclear_numbers: A N array of nuclear numbers \(q_i\). [e]
nuclear_charges – A N array of the effective nuclear charges \(q_i\). [e]
grid – Deprecated.
iscomparison – Boolean. If the input is meant for a comparison calculation, this might allow for shortcuts.
includeonly – A N’ array of 0-based atom indices to be included in the evaluation.

Returns

File contents as string.

classmethod get_runfile(coordinates, nuclear_numbers, nuclear_charges, grid)[source]¶

static get_total_energy(folder)[source]¶: Returns the total energy in Hartree.

Physics-related functions¶

class apdft.physics.APDFT(highest_order, nuclear_numbers, coordinates, basepath, calculator, max_charge=0, max_deltaz=3, include_atoms=None, targetlist=None)[source]¶

Implementation of alchemical perturbation density functional theory.

Requires a working directory basepath which allows for storing the intermediate calculation results.

analyse(explicit_reference=False)[source]¶: Performs actual analysis and integration. Prints results

enumerate_all_targets()[source]¶

Builds a list of all possible targets.

Note that the order is not guaranteed to be stable.

Parameters: self – Class instance from which the total charge and number of sites is determined.
Returns: A list of lists with the integer nuclear charges.

estimate_cost_and_coverage()[source]¶

Estimates number of single points (cost) and number of targets (coverage).

Parameters: self – Class instance from which the total charge and number of sites is determined.
Returns: number of single points, number of targets.
Return type: Tuple of ints

get_energy_from_reference(nuclear_charges, is_reference_molecule=False)[source]¶

Retreives the total energy from a QM reference.

Parameters: nuclear_charges – Integer list of nuclear charges. [e]
Returns: The total energy. [Hartree]

get_epn_coefficients(deltaZ)[source]¶

EPN coefficients are the weighting of the electronic EPN from each of the finite difference calculations.

The weights depend on the change in nuclear charge, i.e. implicitly on reference and target molecule as well as the finite difference stencil employed.

get_epn_matrix()[source]¶: Collects \(\int_Omega rho_i(\mathbf{r}) /|\mathbf{r}-\mathbf{R}_I|\).

get_folder_order()[source]¶

Returns a static order of calculation folders to build the individual derivative entries.

To allow for a more efficient evaluation of APDFT, terms are collected and most of the evaluation is done with the combined cofficients of those terms. This requires the terms to be handled in a certain fixed order that is stable in the various parts of the code. Depending on the selected expansion order, this function builds the list of folders to be included.

Returns: List of strings, the folder names.

get_linear_density_coefficients(deltaZ)[source]¶

Obtains the finite difference coefficients for a property linear in the density.

Parameters: deltaZ – Array of integers of length N. Target system expressed in the change in nuclear charges from the reference system. [e]
Returns: Vector of coefficients.

get_linear_density_matrix(propertyname)[source]¶

Retrieves the value matrix for properties linear in density.

Valid properties are: ELECTRONIC_DIPOLE, IONIC_FORCE, ELECTRONIC_QUADRUPOLE. :param self: APDFT instance. :param propertyname: String. One of the choices above.

Returns: (N, m) array for an m-dimensional property over N QM calculations or None if the property is not implemented with this QM code.

predict_all_targets()[source]¶

prepare(explicit_reference=False)[source]¶: Builds a complete folder list of all relevant calculations.

class apdft.physics.Coulomb[source]¶

Collects functions for Coulomb interaction.

static nuclear_potential(coordinates, charges, at)[source]¶

static nuclei_nuclei(coordinates, charges)[source]¶

Calculates the nuclear-nuclear interaction energy from Coulomb interaction.

Sign convention assumes positive charges for nuclei.

Parameters

coordinates – A (3,N) array of nuclear coordinates \(\mathbf{r_i}\). [Angstrom]
charges – A N array of point charges \(q_i\). [e]

Returns

Coulombic interaction energy. [Hartree]

class apdft.physics.Dipoles[source]¶

Collects functions regarding the calculation of dipole moments. This code follows the physics convention of the sign: the dipole moment vector points from the negative charge center to the positive charge center.

static electron_density(reference_point, coordinates, electron_density)[source]¶

Calculates the dipole moment of a charge distribution.

Note that for a charge density, the resulting dipole momennt depends on the chosen reference point. A common choice in the molecular context is the center of mass. Electron density is a positive quantity.

\[\mathbf{p}(\mathbf{r}) = \int_\Omega \rho(\mathbf{r_i})(\mathbf{r_i}-\mathbf{r})\]

Parameters

reference_point – A 3 array of the reference point \(\mathbf{r}\). [Angstrom]
coordinates – A (3,N) array of grid coordinates \(\mathbf{r_i}\). [Angstrom]
electron_density – A N array of electron density values \(\rho\) at coordinates. [e/Angstrom^3]

Returns

Dipole moment \(\mathbf{p}\). [Debye]

static point_charges(reference_point, coordinates, charges)[source]¶

Calculates the dipole moment of point charges.

Note that for sets of point charges of a net charge, the resulting dipole moment depends on the chosen reference point. A common choice in the molecular context is the center of mass. Sign convention is such that nuclei should be given as positive charges.

\[\mathbf{p}(\mathbf{r}) = \sum_I q_i(\mathbf{r_i}-\mathbf{r})\]

Parameters

reference_point – A 3 array of the reference point \(\mathbf{r}\). [Angstrom]
coordinates – A (3,N) array of point charge coordinates \(\mathbf{r_i}\). [Angstrom]
charges – A N array of point charges \(q_i\). [e]

Returns

Dipole moment \(\mathbf{p}\). [Debye]

apdft.physics.angstrom = 1.8897261254578281¶: Conversion factor from Bohr in Angstrom.

apdft.physics.charge_to_label(Z)[source]¶

Converts a nuclear charge to an element label.

Uncharged (ghost) sites are assigned a dash.

Parameters: Nuclear charge. [e] (Z) –
Returns: Element label. [String]

apdft.physics.debye = 4.803204775077208¶: Conversion factor from electron charges and Angstrom to Debye

Math-related functions¶

class apdft.math.IntegerPartitions[source]¶

static partition(total, maxelements, around=None, maxdz=None)[source]¶

Builds all integer partitions of total split into maxelements parts.

Note that ordering matters, i.e. (2, 1) and (1, 2) are district partitions. Moreover, elements of zero value are allowed. In all cases, the sum of all elements is equal to total. There is no guarantee as for the ordering of elements.

If a center around is given, then a radius maxdz is required. Only those partitions are listed where the L1 norm of the distance between partition and around is less or equal to maxdz.

Parameters

total – The sum of all entries. [Integer]
maxelements – The number of elements to split into. [Integer]
around – Iterable of N entries. Center around which partitions are listed. [Integer]
maxdz – Maximum absolute difference in Z space from center around. [Integer]

Returns

A list of all partitions as lists.

APDFT logic¶

apdft.count_log_level_usage(_, __, event_dict)[source]¶

apdft.get_element_number(element)[source]¶

apdft.get_methods()[source]¶

apdft.read_xyz(fn)[source]¶

Command Line Interface¶

apdft.commandline.build_main_commandline(set_defaults=True)[source]¶: Builds an argparse object of the user-facing command line interface.

apdft.commandline.entry_cli()[source]¶

apdft.commandline.mode_energies(conf, modeshort=None)[source]¶

apdft.commandline.parse_into(parser, configuration=None, cliargs=None)[source]¶

Updates the configuration with the values specified on the command line.

Parameters

parser – An argparse parser instance.
configuration – A apdft.settings.Configuration instance. If None, a new instance will be returned.
args – List of split arguments from the command line.

Returns

Mode of operation, single optional argument, updated configuration.

apdft.commandline.parse_target_list(lines)[source]¶

Separates an explicit target list.

Accepted values: - for a missing atom, labels, nuclear charges. One target per line, comma separated.

Settings¶

Manages settings and config file parsing.

class apdft.settings.CodeEnum(value)[source]¶

Bases: enum.Enum

An enumeration.

G09 = 'G09'¶

MRCC = 'MRCC'¶

PSI4 = 'PSI4'¶

PYSCF = 'PYSCF'¶

get_calculator_class()[source]¶

class apdft.settings.Configuration[source]¶

Bases: object

A complete set of configuration values. Merges settings and default values.

Settings are referred to as category.variablename.

from_file()[source]¶

list_options(section=None)[source]¶: Gives all configurable options for a given section.

list_sections()[source]¶: Returns a list of all sections.

to_file()[source]¶

class apdft.settings.Option(category, name, validator, default, description)[source]¶

Bases: object

Represents a single configuration option.

get_attribute_name()[source]¶

get_description()[source]¶

get_validator()[source]¶

get_value()[source]¶

set_value(value)[source]¶

apdft.settings.boolean(val)[source]¶

apdft.settings.intelementrange(val)[source]¶