Genetic algorithm

Symmetry-constrained genetic algorithm for nanoalloys

Procreation operators meant to be used in symmetry-constrained genetic algorithm (SCGA).

class acat.ga.group_operators.Mutation(num_muts=1)[source]

Bases: OffspringCreator

Base class for all particle mutation type operators. Do not call this class directly.

class acat.ga.group_operators.GroupSubstitute(groups=None, elements=None, num_muts=1)[source]

Bases: Mutation

Substitute all the atoms in a group with a different element. The elemental composition cannot be fixed.

Parameters

groups (list of lists or list of list of lists, default None) – The atom indices in each user-divided group. Can be obtained by acat.build.ordering.SymmetricClusterOrderingGenerator or acat.build.ordering.SymmetricSlabOrderingGenerator. You can also mix structures with different groupings in one GA run by providing all possible groups in a list of list of lists, so that the algorithm will randomly assign a grouping to the structure, where for each grouping the atoms in each group are of the same type. If not provided here, please provide the groups in atoms.info[‘data’][‘groups’] in all intial structures.
elements (list of strs, default None) – Only take into account the elements specified in this list. Default is to take all elements into account.
num_muts (int, default 1) – The number of times to perform this operation.

substitute(atoms)[source]: Does the actual substitution

get_new_individual(parents)[source]: Function that returns a new individual. Overwrite in subclass.

class acat.ga.group_operators.GroupPermutation(groups=None, elements=None, keep_composition=False, num_muts=1)[source]

Bases: Mutation

Permutes the elements in two random groups. The elemental composition can be fixed.

Parameters

groups (list of lists or list of list of lists, default None) – The atom indices in each user-divided group. Can be obtained by acat.build.ordering.SymmetricClusterOrderingGenerator or acat.build.ordering.SymmetricSlabOrderingGenerator. You can also mix structures with different groupings in one GA run by providing all possible groups in a list of list of lists, so that the algorithm will randomly assign a grouping to the structure, where for each grouping the atoms in each group are of the same type. If not provided here, please provide the groups in atoms.info[‘data’][‘groups’] in all intial structures.
elements (list of strs, default None) – Only take into account the elements specified in this list. Default is to take all elements into account.
keep_composition (bool, defulat False) – Whether the elemental composition should be the same as in the parents.
num_muts (int, default 1) – The number of times to perform this operation.

get_new_individual(parents)[source]: Function that returns a new individual. Overwrite in subclass.

classmethod mutate(atoms, groups, elements=None, keep_composition=False)[source]: Do the actual permutation.

class acat.ga.group_operators.Crossover[source]

Bases: OffspringCreator

Base class for all particle crossovers. Do not call this class directly.

class acat.ga.group_operators.GroupCrossover(groups=None, elements=None, keep_composition=False)[source]

Bases: Crossover

Merge the elemental distributions in two half groups from different structures together. The elemental composition can be fixed.

Parameters

groups (list of lists or list of list of lists, default None) – The atom indices in each user-divided group. Can be obtained by acat.build.ordering.SymmetricClusterOrderingGenerator or acat.build.ordering.SymmetricSlabOrderingGenerator. You can also mix structures with different groupings in one GA run by providing all possible groups in a list of list of lists, so that the algorithm will randomly assign a grouping to the structure, where for each grouping the atoms in each group are of the same type. If not provided here, please provide the groups in atoms.info[‘data’][‘groups’] in all intial structures.
elements (list of strs, default None) – Only take into account the elements specified in this list. Default is to take all elements into account.
keep_composition (bool, defulat False) – Whether the elemental composition should be the same as in the parents.

get_new_individual(parents)[source]: Function that returns a new individual. Overwrite in subclass.

Comparators meant to be used in symmetry-constrained genetic algorithm (SCGA).

class acat.ga.group_comparators.GroupSizeComparator(groups=None, elements=None)[source]

Bases: object

For each given element, compares the sorted sizes of the user-divided groups that have the given element. Returns True if the sizes are the same, False otherwise. Self-symmetry is considered for particles.

Parameters

groups (list of lists, default None) – The atom indices in each user-divided group. Can be obtained by acat.build.ordering.SymmetricClusterOrderingGenerator or acat.build.ordering.SymmetricSlabOrderingGenerator or acat.build.adlayer.SymmetricPatternGenerator. If not provided here, please provide the groups in atoms.info[‘data’][‘groups’] in all intial structures. This is useful to mix structures with different group divisions in one GA.
elements (list of strs, default None) – Only take into account the elements specified in this list. Default is to take all elements into account.

looks_like(a1, a2)[source]: Return if structure a1 or a2 are similar or not.

class acat.ga.group_comparators.GroupCompositionComparator(groups=None, elements=None, tol=0)[source]

Bases: object

Compares the elemental compositions of all user-divided groups. Returns True if the numbers are the same, False otherwise. Self-symmetry is not considered for particles.

Parameters

groups (list of lists, default None) – The atom indices in each user-divided group. Can be obtained by acat.build.ordering.SymmetricClusterOrderingGenerator or acat.build.ordering.SymmetricSlabOrderingGenerator or acat.build.adlayer.SymmetricPatternGenerator. If not provided here, please provide the groups in atoms.info[‘data’][‘groups’] in all intial structures. This is useful to mix structures with different group divisions in one GA.
elements (list of strs, default None) – Only take into account the elements specified in this list. Default is to take all elements into account.
tol (int, default 0) – The maximum number of groups with different elements that two structures are still considered to be look alike.

looks_like(a1, a2)[source]: Return if structure a1 or a2 are similar or not.

Example 1

All the group operators and comparators can be easily used with other indexing-preserved operators and comparators. All operators can be used for both non-periodic nanoparticles and periodic surface slabs. To accelerate the GA, provide the symmetry-equivalent group (or all possible groups).

As an example we will find the convex hull of ternary NixPtyAu405-x-y truncated octahedral nanoalloys using the ASAP EMT calculator. Note that we must first align the symmetry axis of interest to the z direction. Here we want to study the 3-fold mirror circular symmetry around the C3 axis of the particle.

The script for a parallel symmetry-constrained genetic algorithm (SCGA) looks as follows:

from acat.build.ordering import SymmetricClusterOrderingGenerator as SCOG
from acat.ga.group_operators import (GroupSubstitute,
                                     GroupPermutation,
                                     GroupCrossover)
from acat.ga.group_comparators import (GroupSizeComparator,
                                       GroupCompositionComparator)
from ase.ga.particle_comparator import NNMatComparator
from ase.ga.standard_comparators import SequentialComparator
from ase.ga.offspring_creator import OperationSelector
from ase.ga.population import Population, RankFitnessPopulation
from ase.ga.convergence import GenerationRepetitionConvergence
from ase.ga.data import DataConnection, PrepareDB
from ase.ga.utilities import get_nnmat
from ase.io import read, write
from ase.cluster import Octahedron
from ase.optimize import BFGS
from asap3 import EMT as asapEMT
from multiprocessing import Pool
from random import randint
import numpy as np
import os

# Define population.
# Recommend to choose a number that is a multiple of the number of cpu
pop_size = 100

# Build and rotate the particle so that C3 axis is aligned to z direction
particle = Octahedron('Ni', length=9, cutoff=3)
particle.rotate(45, 'x')
particle.rotate(-35.29, 'y')
particle.center(vacuum=5.)

# Generate 100 truncated ocatahedral NixPtyAu405-x-y nanoalloys with
# mirror circular symmetry. Get the groups at the same time.
scog = SCOG(particle, elements=['Ni', 'Pt', 'Au'],
            symmetry='mirror_circular',
            trajectory='starting_generation.traj')
scog.run(max_gen=pop_size, mode='stochastic', verbose=True)
groups = scog.get_groups()
images = read('starting_generation.traj', index=':')

# Instantiate the db
db_name = 'ridge_mirror_circular_NiPtAu_TO405_C3.db'

db = PrepareDB(db_name, cell=particle.cell, population_size=pop_size)

for atoms in images:
    if 'data' not in atoms.info:
        atoms.info['data'] = {'tag': None}
    db.add_unrelaxed_candidate(atoms, data=atoms.info['data'])

# Connect to the db
db = DataConnection(db_name)

# Define operators
soclist = ([2, 2, 3],
           [GroupSubstitute(groups, elements=['Ni', 'Pt', 'Au'], num_muts=3),
            GroupPermutation(groups, elements=['Ni', 'Pt', 'Au'], num_muts=1),
            GroupCrossover(groups, elements=['Ni', 'Pt', 'Au']),])
op_selector = OperationSelector(*soclist)

# Define comparators
comp = SequentialComparator([GroupSizeComparator(groups, ['Ni', 'Pt', 'Au']),
                             NNMatComparator(0.2, ['Ni', 'Pt', 'Au'])],
                            [0.5, 0.5])

def vf(atoms):
    """Returns the descriptor that distinguishes candidates in the
    niched population."""
    return atoms.get_chemical_formula(mode='hill')

# Give fittest candidates at different compositions equal fitness.
# Use this to find global minimum at each composition
pop = RankFitnessPopulation(data_connection=db,
                            population_size=pop_size,
                            comparator=comp,
                            variable_function=vf,
                            exp_function=True,
                            logfile='log.txt')

# Normal fitness ranking irrespective of composition
#pop = Population(data_connection=db,
#                 population_size=pop_size,
#                 comparator=comp,
#                 logfile='log.txt')

# Set convergence criteria
cc = GenerationRepetitionConvergence(pop, 5)

# Calculate the relaxed energies for pure Ni405, Pt405 and Au405
pure_pots = {'Ni': 147.532, 'Pt':  86.892, 'Au': 63.566}

# Define the relax function
def relax(atoms, single_point=False):
    atoms.center(vacuum=5.)
    atoms.calc = asapEMT()
    if not single_point:
        opt = BFGS(atoms, logfile=None)
        opt.run(fmax=0.1)
    Epot = atoms.get_potential_energy()
    atoms.info['key_value_pairs']['potential_energy'] = Epot

    # There is a known issue of asapEMT in GA. You can either detach
    # the calculator or re-assign to a SinglePointCalculator
    atoms.calc = None

    # Calculate mixing energy
    syms = atoms.get_chemical_symbols()
    for a in set(syms):
        Epot -= (pure_pots[a] / len(atoms)) * syms.count(a)
    atoms.info['key_value_pairs']['raw_score'] = -Epot

    # Parallelize nnmat calculations to accelerate NNMatComparator
    atoms.info['data']['nnmat'] = get_nnmat(atoms)

    return atoms

# Relax starting generation
def relax_an_unrelaxed_candidate(atoms):
    if 'data' not in atoms.info:
        atoms.info['data'] = {'tag': None}
    nncomp = atoms.get_chemical_formula(mode='hill')
    print('Relaxing ' + nncomp)

    return relax(atoms)

# Create a multiprocessing Pool
pool = Pool(os.cpu_count())
# Perform relaxations in parallel. Especially
# useful when running GA on large nanoparticles
relaxed_candidates = pool.map(relax_an_unrelaxed_candidate,
                              db.get_all_unrelaxed_candidates())
pool.close()
pool.join()
db.add_more_relaxed_candidates(relaxed_candidates)
pop.update()

# Number of generations
num_gens = 1000

# Below is the iterative part of the algorithm
gen_num = db.get_generation_number()
for i in range(num_gens):
    # Check if converged
    if cc.converged():
        print('Converged')
        break
    print('Creating and evaluating generation {0}'.format(gen_num + i))

    # Performing procreations in parallel
    def procreation(x):
        # Select an operator and use it
        op = op_selector.get_operator()
        while True:
            # Assign rng with a random seed
            np.random.seed(randint(1, 10000))
            pop.rng = np.random
            # Select parents for a new candidate
            p1, p2 = pop.get_two_candidates()
            # Pure and binary candidates are not considered
            if len(set(p1.numbers)) < 3:
                continue
            parents = [p1, p2]
            offspring, desc = op.get_new_individual(parents)
            # An operator could return None if an offspring cannot be formed
            # by the chosen parents
            if offspring is not None:
                break
        nncomp = offspring.get_chemical_formula(mode='hill')
        print('Relaxing ' + nncomp)
        if 'data' not in offspring.info:
            offspring.info['data'] = {'tag': None}

        return relax(offspring)

    # Create a multiprocessing Pool
    pool = Pool(os.cpu_count())
    # Perform procreations in parallel. Especially
    # useful when running GA on large nanoparticles
    relaxed_candidates = pool.map(procreation, range(pop_size))
    pool.close()
    pool.join()
    db.add_more_relaxed_candidates(relaxed_candidates)

    # Update the population to allow new candidates to enter
    pop.update()

Example 2

If we want to study the same system but target 3 compositions: Ni0.5Pt0.25Au0.25, Ni0.25Pt0.5Au0.25 and Ni0.25Pt0.25Au0.5, we should not use GroupSubstitute operator and set keep_composition to True in GroupPermutation and GroupCrossover operators. The tolerance of the intitial compositions can be controlled by the eps parameter in SymmetricClusterOrderingGenerator.run.

The script for a fixed-composition parallel genetic algorithm now looks as follows:

from acat.build.ordering import SymmetricClusterOrderingGenerator as SCOG
from acat.ga.group_operators import (GroupSubstitute,
                                     GroupPermutation,
                                     GroupCrossover)
from acat.ga.group_comparators import (GroupSizeComparator,
                                       GroupCompositionComparator)
from ase.ga.particle_comparator import NNMatComparator
from ase.ga.standard_comparators import SequentialComparator
from ase.ga.offspring_creator import OperationSelector
from ase.ga.population import Population, RankFitnessPopulation
from ase.ga.convergence import GenerationRepetitionConvergence
from ase.ga.data import DataConnection, PrepareDB
from ase.ga.utilities import get_nnmat
from ase.io import read, write
from ase.cluster import Octahedron
from ase.optimize import BFGS
from asap3 import EMT as asapEMT
from multiprocessing import Pool
from random import randint
import numpy as np
import os

# Define population.
# Recommend to choose a number that is a multiple of the number of cpu
pop_size = 100

# Build and rotate the particle so that C3 axis is aligned to z direction
particle = Octahedron('Ni', length=9, cutoff=3)
particle.rotate(45, 'x')
particle.rotate(-35.29, 'y')
particle.center(vacuum=5.)

# Generate 100 truncated ocatahedral NixPtyAu405-x-y nanoalloys with
# mirror circular symmetry and concentrations of {x=0.5, y=0.25},
# {x=0.25, y=0.5} and {x=y=0.25}. The concentrations are allowed to
# vary by a range of 2*eps=0.1. Get the groups at the same time.
scog1 = SCOG(particle, elements=['Ni', 'Pt'],
             symmetry='mirror_circular',
             cutoff=1.,
             secondary_symmetry='chemical',
             secondary_cutoff=.1,
             composition={'Ni': 0.5, 'Pt': 0.25, 'Au': 0.25},
             trajectory='starting_generation.traj')
groups = scog1.get_groups()
scog1.run(max_gen=33, mode='stochastic', eps=0.05, verbose=True)

scog2 = SCOG(particle, elements=['Ni', 'Pt'],
             symmetry='mirror_circular',
             cutoff=1.,
             secondary_symmetry='chemical',
             secondary_cutoff=.1,
             composition={'Ni': 0.25, 'Pt': 0.5, 'Au': 0.25},
             trajectory='starting_generation.traj',
             append_trajectory=True)
scog2.run(max_gen=33, mode='stochastic', eps=0.05, verbose=True)

scog3 = SCOG(particle, elements=['Ni', 'Pt'],
             symmetry='mirror_circular',
             cutoff=1.,
             secondary_symmetry='chemical',
             secondary_cutoff=.1,
             composition={'Ni': 0.25, 'Pt': 0.25, 'Au': 0.5},
             trajectory='starting_generation.traj',
             append_trajectory=True)
scog3.run(max_gen=34, mode='stochastic', eps=0.05, verbose=True)

images = read('starting_generation.traj', index=':')

# Instantiate the db
db_name = 'ridge_mirror_circular_NiPtAu_TO405_C3.db'

db = PrepareDB(db_name, cell=particle.cell, population_size=pop_size)

for atoms in images:
    if 'data' not in atoms.info:
        atoms.info['data'] = {'tag': None}
    db.add_unrelaxed_candidate(atoms, data=atoms.info['data'])

# Connect to the db
db = DataConnection(db_name)

# Define operators, now set keep_composition=True
# GroupSubstitute cannot keep the composition so it's not used
soclist = ([4, 6],
           [GroupPermutation(groups, elements=['Ni', 'Pt', 'Au'],
                             keep_composition=True, num_muts=1),
            GroupCrossover(groups, elements=['Ni', 'Pt', 'Au'],
                           keep_composition=True),])
op_selector = OperationSelector(*soclist)

# Define comparators
comp = SequentialComparator([GroupSizeComparator(groups, ['Ni', 'Pt', 'Au']),
                             NNMatComparator(0.2, ['Ni', 'Pt', 'Au'])],
                            [0.5, 0.5])

def vf(atoms):
    """Returns the descriptor that distinguishes candidates in the
    niched population."""
    return atoms.get_chemical_formula(mode='hill')

# Give fittest candidates at different compositions equal fitness.
# Use this to find global minimum at each composition
pop = RankFitnessPopulation(data_connection=db,
                            population_size=pop_size,
                            comparator=comp,
                            variable_function=vf,
                            exp_function=True,
                            logfile='log.txt')

# Normal fitness ranking irrespective of composition
#pop = Population(data_connection=db,
#                 population_size=pop_size,
#                 comparator=comp,
#                 logfile='log.txt')

# Set convergence criteria
cc = GenerationRepetitionConvergence(pop, 5)

# Calculate the relaxed energies for pure Ni405, Pt405 and Au405
pure_pots = {'Ni': 147.532, 'Pt':  86.892, 'Au': 63.566}

# Define the relax function
def relax(atoms, single_point=False):
    atoms.center(vacuum=5.)
    atoms.calc = asapEMT()
    if not single_point:
        opt = BFGS(atoms, logfile=None)
        opt.run(fmax=0.1)
    Epot = atoms.get_potential_energy()
    atoms.info['key_value_pairs']['potential_energy'] = Epot

    # There is a known issue of asapEMT in GA. You can either detach
    # the calculator or re-assign to a SinglePointCalculator
    atoms.calc = None

    # Calculate mixing energy
    syms = atoms.get_chemical_symbols()
    for a in set(syms):
        Epot -= (pure_pots[a] / len(atoms)) * syms.count(a)
    atoms.info['key_value_pairs']['raw_score'] = -Epot

    # Parallelize nnmat calculations to accelerate NNMatComparator
    atoms.info['data']['nnmat'] = get_nnmat(atoms)

    return atoms

# Relax starting generation
def relax_an_unrelaxed_candidate(atoms):
    if 'data' not in atoms.info:
        atoms.info['data'] = {'tag': None}
    nncomp = atoms.get_chemical_formula(mode='hill')
    print('Relaxing ' + nncomp)

    return relax(atoms)

# Create a multiprocessing Pool
pool = Pool(os.cpu_count())
# Perform relaxations in parallel. Especially
# useful when running GA on large nanoparticles
relaxed_candidates = pool.map(relax_an_unrelaxed_candidate,
                              db.get_all_unrelaxed_candidates())
pool.close()
pool.join()
db.add_more_relaxed_candidates(relaxed_candidates)
pop.update()

# Number of generations
num_gens = 1000

# Below is the iterative part of the algorithm
gen_num = db.get_generation_number()
for i in range(num_gens):
    # Check if converged
    if cc.converged():
        print('Converged')
        break
    print('Creating and evaluating generation {0}'.format(gen_num + i))

    # Performing procreations in parallel
    def procreation(x):
        # Select an operator and use it
        op = op_selector.get_operator()
        while True:
            # Assign rng with a random seed
            np.random.seed(randint(1, 10000))
            pop.rng = np.random
            # Select parents for a new candidate
            p1, p2 = pop.get_two_candidates()
            # Pure and binary candidates are not considered
            if len(set(p1.numbers)) < 3:
                continue
            parents = [p1, p2]
            offspring, desc = op.get_new_individual(parents)
            # An operator could return None if an offspring cannot be formed
            # by the chosen parents
            if offspring is not None:
                break
        nncomp = offspring.get_chemical_formula(mode='hill')
        print('Relaxing ' + nncomp)
        if 'data' not in offspring.info:
            offspring.info['data'] = {'tag': None}

        return relax(offspring)

    # Create a multiprocessing Pool
    pool = Pool(os.cpu_count())
    # Perform procreations in parallel. Especially
    # useful when running GA on large nanoparticles
    relaxed_candidates = pool.map(procreation, range(pop_size))
    pool.close()
    pool.join()
    db.add_more_relaxed_candidates(relaxed_candidates)

    # Update the population to allow new candidates to enter
    pop.update()

Genetic algorithm for adlayer patterns

Adsorbate procreation operators that adds an adsorbate to the surface of a particle or given structure.

class acat.ga.adsorbate_operators.AdsorbateOperator(adsorbate_species, species_probabilities=None, num_muts=1)[source]

Bases: OffspringCreator

Base class for all operators that add, move or remove adsorbates.

Don’t use this operator directly!

classmethod initialize_individual(parent, indi=None)[source]: Initializes a new individual that inherits some parameters from the parent, and initializes the info dictionary. If the new individual already has more structure it can be supplied in the parameter indi.

get_new_individual(parents)[source]: Function that returns a new individual. Overwrite in subclass.

add_adsorbate(atoms, hetero_site_list, heights, adsorbate_species=None, min_adsorbate_distance=2.0, tilt_angle=0.0)[source]

Adds the adsorbate in self.adsorbate to the supplied atoms object at the first free site in the specified site_list. A site is free if no other adsorbates can be found in a sphere of radius min_adsorbate_distance around the chosen site.

Parameters

atoms (ase.Atoms object) – The atoms object that the adsorbate will be added to.
hetero_site_list (list) – A list of dictionaries, each dictionary contains site information given by acat.adsorbate_coverage module.
heights (dict) – A dictionary that contains the adsorbate height for each site type.
adsorbate_species (str or list of strs, default None) – One or a list of adsorbate species to be added to the surface. Use self.adsorbate_species if not specified.
min_adsorbate_distance (float, default 2.) – The radius of the sphere inside which no other adsorbates should be found.
tilt_angle (float, default 0.) – Tilt the adsorbate with an angle (in degrees) relative to the surface normal.

remove_adsorbate(atoms, hetero_site_list, return_site_index=False)[source]

Removes an adsorbate from the atoms object at the first occupied site in hetero_site_list. If no adsorbates can be found, one will be added instead.

Parameters

atoms (ase.Atoms object) – The atoms object that the adsorbate will be added to
hetero_site_list (list) – A list of dictionaries, each dictionary contains site information given by acat.adsorbate_coverage module.
return_site_index (bool, default False) – Whether to return the site index of the hetero_site_list instead of the site. Useful for moving or replacing adsorbate.

get_adsorbate_indices(atoms, position)[source]: Returns the indices of the adsorbate at the supplied position.

class acat.ga.adsorbate_operators.AddAdsorbate(adsorbate_species, species_probabilities=None, max_species=None, heights={'3fold': 1.3, '4fold': 1.3, '5fold': 1.5, '6fold': 0.0, 'bridge': 1.5, 'fcc': 1.3, 'hcp': 1.3, 'longbridge': 1.5, 'ontop': 1.8, 'shortbridge': 1.5}, min_adsorbate_distance=2.0, adsorption_sites=None, site_preference=None, surface_preference=None, max_coverage=None, tilt_angle=None, subtract_heights=False, num_muts=1, dmax=2.5, **kwargs)[source]

Bases: AdsorbateOperator

Use this operator to add adsorbates to the surface. The surface is allowed to change during the algorithm run.

There is no limit of adsorbate species. You can either provide one species or a list of species.

Site and surface preference can be supplied. If both are supplied site will be considered first.

Supplying a tilt angle will tilt the adsorbate with an angle relative to the standard perpendicular to the surface.

The operator is generalized for both periodic and non-periodic systems (distinguished by atoms.pbc).

Parameters

adsorbate_species (str or list of strs) – One or a list of adsorbate species to be added to the surface.
species_probabilities (dict, default None) – A dictionary that contains keys of each adsorbate species and values of their probabilities of adding onto the surface. Adding adsorbate species with equal probability if not specified.
max_species (int, default None) – The maximum allowed adsorbate species (excluding vacancies) for a single structure. Allow all adsorbate species if not specified.
heights (dict, default acat.settings.site_heights) – A dictionary that contains the adsorbate height for each site type. Use the default height settings if the height for a site type is not specified.
min_adsorbate_distance (float, default 2.) – The radius of the sphere inside which no other adsorbates should be found.
adsorption_sites (acat.adsorption_sites.ClusterAdsorptionSites object or acat.adsorption_sites.SlabAdsorptionSites object or the corresponding pickle file path, default None) – Provide the built-in adsorption sites class to accelerate the genetic algorithm. Make sure all the operators used with this operator preserve the indexing of the atoms.
site_preference (str or list of strs, defualt None) – The site type(s) that has higher priority to attach adsorbates.
surface_preference (str, default None) – The surface type that has higher priority to attach adsorbates. Please only use this for nanoparticles.
max_coverage (float, default None) – The maximum allowed coverage on the surface. Coverage is defined as (number of surface occupied sites / number of surface atoms). The maximum coverage is solely governed by min_adsorbate_distance if max_coverage is not specified.
tilt_angle (float, default 0.) – Tilt the adsorbate with an angle (in degrees) relative to the surface normal.
subtract_heights (bool, default False) – Whether to subtract the height from the bond length when allocating a site to an adsorbate. Default is to allocate the site that is closest to the adsorbate’s binding atom without subtracting height. Useful for ensuring the allocated site for each adsorbate is consistent with the site to which the adsorbate was added.
num_muts (int, default 1) – The number of times to perform this operation.
dmax (float, default 2.5) – The maximum bond length (in Angstrom) between an atom and its nearest site to be considered as the atom being bound to the site.

get_new_individual(parents)[source]: Returns the new individual as an atoms object.

class acat.ga.adsorbate_operators.RemoveAdsorbate(adsorbate_species, adsorption_sites=None, site_preference=None, surface_preference=None, min_coverage=None, subtract_heights=False, num_muts=1, dmax=2.5, **kwargs)[source]

Bases: AdsorbateOperator

This operator removes an adsorbate from the surface. It works exactly (but doing the opposite) as the AddAdsorbate operator.

The operator is generalized for both periodic and non-periodic systems (distinguished by atoms.pbc).

Parameters

adsorbate_species (str or list of strs) – One or a list of adsorbate species to be removed from the surface.
adsorption_sites (acat.adsorption_sites.ClusterAdsorptionSites object or acat.adsorption_sites.SlabAdsorptionSites object or the corresponding pickle file path, default None) – Provide the built-in adsorption sites class to accelerate the genetic algorithm. Make sure all the operators used with this operator preserve the indexing of the atoms.
site_preference (str or list of strs, defualt None) – The site type(s) that has higher priority to detach adsorbates.
surface_preference (str, default None) – The surface type that has higher priority to detach adsorbates. Please only use this for nanoparticles.
min_coverage (float, default None) – The minimum allowed coverage on the surface. Coverage is defined as (number of surface occupied sites / number of surface atoms). The minimum coverage is 0 if min_coverage is not specified.
subtract_heights (bool, default False) – Whether to subtract the height from the bond length when allocating a site to an adsorbate. Default is to allocate the site that is closest to the adsorbate’s binding atom without subtracting height. Useful for ensuring the allocated site for each adsorbate is consistent with the site to which the adsorbate was added.
num_muts (int, default 1) – The number of times to perform this operation.
dmax (float, default 2.5) – The maximum bond length (in Angstrom) between an atom and its nearest site to be considered as the atom being bound to the site.

get_new_individual(parents)[source]: Function that returns a new individual. Overwrite in subclass.

class acat.ga.adsorbate_operators.MoveAdsorbate(adsorbate_species, heights={'3fold': 1.3, '4fold': 1.3, '5fold': 1.5, '6fold': 0.0, 'bridge': 1.5, 'fcc': 1.3, 'hcp': 1.3, 'longbridge': 1.5, 'ontop': 1.8, 'shortbridge': 1.5}, min_adsorbate_distance=2.0, adsorption_sites=None, site_preference_from=None, surface_preference_from=None, site_preference_to=None, surface_preference_to=None, tilt_angle=None, subtract_heights=False, num_muts=1, dmax=2.5, **kwargs)[source]

Bases: AdsorbateOperator

This operator removes an adsorbate from the surface and adds it again to a different site, i.e. effectively moving the adsorbate.

The operator is generalized for both periodic and non-periodic systems (distinguished by atoms.pbc).

Parameters

adsorbate_species (str or list of strs) – One or a list of adsorbate species to be added to the surface.
heights (dict, default acat.settings.site_heights) – A dictionary that contains the adsorbate height for each site type. Use the default height settings if the height for a site type is not specified.
min_adsorbate_distance (float, default 2.) – The radius of the sphere inside which no other adsorbates should be found.
adsorption_sites (acat.adsorption_sites.ClusterAdsorptionSites object or acat.adsorption_sites.SlabAdsorptionSites object or the corresponding pickle file path, default None) – Provide the built-in adsorption sites class to accelerate the genetic algorithm. Make sure all the operators used with this operator preserve the indexing of the atoms.
site_preference_from (str or list of strs, defualt None) – The site type(s) that has higher priority to detach adsorbates.
surface_preference_from (str, default None) – The surface type that has higher priority to detach adsorbates. Please only use this for nanoparticles.
site_preference_to (str or list of strs, defualt None) – The site type(s) that has higher priority to attach adsorbates.
surface_preference_to (str, default None) – The surface type that has higher priority to attach adsorbates. Please only use this for nanoparticles.
tilt_angle (float, default 0.) – Tilt the adsorbate with an angle (in degrees) relative to the surface normal.
subtract_heights (bool, default False) – Whether to subtract the height from the bond length when allocating a site to an adsorbate. Default is to allocate the site that is closest to the adsorbate’s binding atom without subtracting height. Useful for ensuring the allocated site for each adsorbate is consistent with the site to which the adsorbate was added.
num_muts (int, default 1) – The number of times to perform this operation.
dmax (float, default 2.5) – The maximum bond length (in Angstrom) between an atom and its nearest site to be considered as the atom being bound to the site.

get_new_individual(parents)[source]: Function that returns a new individual. Overwrite in subclass.

class acat.ga.adsorbate_operators.ReplaceAdsorbate(adsorbate_species, species_probabilities=None, max_species=None, heights={'3fold': 1.3, '4fold': 1.3, '5fold': 1.5, '6fold': 0.0, 'bridge': 1.5, 'fcc': 1.3, 'hcp': 1.3, 'longbridge': 1.5, 'ontop': 1.8, 'shortbridge': 1.5}, min_adsorbate_distance=2.0, adsorption_sites=None, site_preference=None, surface_preference=None, tilt_angle=None, subtract_heights=False, num_muts=1, dmax=2.5, **kwargs)[source]

Bases: AdsorbateOperator

This operator removes an adsorbate from the surface and adds another species to the same site, i.e. effectively replacing the adsorbate.

The operator is generalized for both periodic and non-periodic systems (distinguished by atoms.pbc).

Parameters

adsorbate_species (str or list of strs) – One or a list of adsorbate species to be added to the surface.
species_probabilities (dict, default None) – A dictionary that contains keys of each adsorbate species and values of their probabilities of replacing an adsorbate on the surface. Choosing adsorbate species with equal probability if not specified.
max_species (int, default None) – The maximum allowed adsorbate species (excluding vacancies) for a single structure. Allow all adsorbate species if not specified.
heights (dict, default acat.settings.site_heights) – A dictionary that contains the adsorbate height for each site type. Use the default height settings if the height for a site type is not specified.
min_adsorbate_distance (float, default 2.) – The radius of the sphere inside which no other adsorbates should be found.
adsorption_sites (acat.adsorption_sites.ClusterAdsorptionSites object or acat.adsorption_sites.SlabAdsorptionSites object or the corresponding pickle file path, default None) – Provide the built-in adsorption sites class to accelerate the genetic algorithm. Make sure all the operators used with this operator preserve the indexing of the atoms.
site_preference (str or list of strs, defualt None) – The site type(s) that has higher priority to replace adsorbates.
surface_preference (str, default None) – The surface type that has higher priority to replace adsorbates. Please only use this for nanoparticles.
tilt_angle (float, default 0.) – Tilt the adsorbate with an angle (in degrees) relative to the surface normal.
subtract_heights (bool, default False) – Whether to subtract the height from the bond length when allocating a site to an adsorbate. Default is to allocate the site that is closest to the adsorbate’s binding atom without subtracting height. Useful for ensuring the allocated site for each adsorbate is consistent with the site to which the adsorbate was added.
num_muts (int, default 1) – The number of times to perform this operation.
dmax (float, default 2.5) – The maximum bond length (in Angstrom) between an atom and its nearest site to be considered as the atom being bound to the site.

get_new_individual(parents)[source]: Function that returns a new individual. Overwrite in subclass.

class acat.ga.adsorbate_operators.ReplaceAdsorbateSpecies(adsorbate_species, replace_vacancy=False, species_probabilities=None, heights={'3fold': 1.3, '4fold': 1.3, '5fold': 1.5, '6fold': 0.0, 'bridge': 1.5, 'fcc': 1.3, 'hcp': 1.3, 'longbridge': 1.5, 'ontop': 1.8, 'shortbridge': 1.5}, adsorption_sites=None, tilt_angle=None, subtract_heights=False, dmax=2.5, **kwargs)[source]

Bases: AdsorbateOperator

This operator replace all adsorbates of a certain species with another species at the same sites. Return None if there is no adsorbate present on the surface.

The operator is generalized for both periodic and non-periodic systems (distinguished by atoms.pbc).

Parameters

adsorbate_species (str or list of strs) – One or a list of adsorbate species to be added to the surface.
species_probabilities (dict, default None) – A dictionary that contains keys of each adsorbate species and values of their probabilities of replacing an adsorbate on the surface. Choosing adsorbate species with equal probability if not specified.
replace_vacancy (bool, default False) – Whether to allow replacing adsorbates with vacancies, i.e., effectively removing all adsorbates of a certain species. Note that if you want to specify species_probabilties, you then need to also provide the probability for vacancy replacement using the keyword ‘vacancy’.
heights (dict, default acat.settings.site_heights) – A dictionary that contains the adsorbate height for each site type. Use the default height settings if the height for a site type is not specified.
adsorption_sites (acat.adsorption_sites.ClusterAdsorptionSites object or acat.adsorption_sites.SlabAdsorptionSites object or the corresponding pickle file path, default None) – Provide the built-in adsorption sites class to accelerate the genetic algorithm. Make sure all the operators used with this operator preserve the indexing of the atoms.
subtract_heights (bool, default False) – Whether to subtract the height from the bond length when allocating a site to an adsorbate. Default is to allocate the site that is closest to the adsorbate’s binding atom without subtracting height. Useful for ensuring the allocated site for each adsorbate is consistent with the site to which the adsorbate was added.
tilt_angle (float, default 0.) – Tilt the adsorbate with an angle (in degrees) relative to the surface normal.
dmax (float, default 2.5) – The maximum bond length (in Angstrom) between an atom and its nearest site to be considered as the atom being bound to the site.

get_new_individual(parents)[source]: Function that returns a new individual. Overwrite in subclass.

class acat.ga.adsorbate_operators.CutSpliceCrossoverWithAdsorbates(adsorbate_species, blmin, heights={'3fold': 1.3, '4fold': 1.3, '5fold': 1.5, '6fold': 0.0, 'bridge': 1.5, 'fcc': 1.3, 'hcp': 1.3, 'longbridge': 1.5, 'ontop': 1.8, 'shortbridge': 1.5}, keep_composition=True, fix_coverage=False, min_adsorbate_distance=2.0, rotate_vectors=None, rotate_angles=None, subtract_heights=False, dmax=2.5, **kwargs)[source]

Bases: AdsorbateOperator

Crossover that cuts two particles with adsorbates through a plane in space and merges two halfes from different particles together (only returns one of them). The indexing of the atoms is not preserved. Please only use this operator if the particle is allowed to change shape.

It keeps the correct composition by randomly assigning elements in the new particle. If some of the atoms in the two particle halves are too close, the halves are moved away from each other perpendicular to the cutting plane.

The complexity of crossover with adsorbates makes this operator not robust enough. The adsorption site identification will fail once the nanoparticle shape becomes too irregular after crossover.

Parameters

adsorbate_species (str or list of strs) – One or a list of adsorbate species to be added to the surface.
blmin (dict) – Dictionary of minimum distance between atomic numbers. e.g. {(28,29): 1.5}
heights (dict, default acat.settings.site_heights) – A dictionary that contains the adsorbate height for each site type. Use the default height settings if the height for a site type is not specified.
keep_composition (bool, default True) – Should the composition be the same as in the parents.
fix_coverage (bool, default False) – Should the adsorbate coverage be the same as in the parents.
min_adsorbate_distance (float, default 2.) – The radius of the sphere inside which no other adsorbates should be found.
rotate_vectors (list, default None) – A list of vectors that the part of the structure that is cut is able to rotate around, the size of rotation is set in rotate_angles. Default None meaning no rotation is performed.
rotate_angles (list, default None) – A list of angles that the structure cut can be rotated. The vector being rotated around is set in rotate_vectors. Default None meaning no rotation is performed.
subtract_heights (bool, default False) – Whether to subtract the height from the bond length when allocating a site to an adsorbate. Default is to allocate the site that is closest to the adsorbate’s binding atom without subtracting height. Useful for ensuring the allocated site for each adsorbate is consistent with the site to which the adsorbate was added.
dmax (float, default 2.5) – The maximum bond length (in Angstrom) between an atom and its nearest site to be considered as the atom being bound to the site.

get_new_individual(parents)[source]: Function that returns a new individual. Overwrite in subclass.

get_vectors_below_min_dist(atoms)[source]: Generator function that returns each vector (between atoms) that is shorter than the minimum distance for those atom types (set during the initialization in blmin).

class acat.ga.adsorbate_operators.SimpleCutSpliceCrossoverWithAdsorbates(adsorbate_species, heights={'3fold': 1.3, '4fold': 1.3, '5fold': 1.5, '6fold': 0.0, 'bridge': 1.5, 'fcc': 1.3, 'hcp': 1.3, 'longbridge': 1.5, 'ontop': 1.8, 'shortbridge': 1.5}, keep_composition=True, fix_coverage=False, min_adsorbate_distance=2.0, adsorption_sites=None, subtract_heights=False, dmax=2.5, **kwargs)[source]

Bases: AdsorbateOperator

Crossover that divides two particles through a plane in space and merges the symbols of two halves from different particles with adosrbates together (only returns one of them). The indexing of the atoms is preserved. Please only use this operator with other operators that also preserves the indexing.

It keeps the correct composition by randomly assigning elements in the new particle.

Parameters

adsorbate_species (str or list of strs) – One or a list of adsorbate species to be added to the surface.
heights (dict, default acat.settings.site_heights) – A dictionary that contains the adsorbate height for each site type. Use the default height settings if the height for a site type is not specified.
keep_composition (bool, default True) – Boolean that signifies if the composition should be the same as in the parents.
fix_coverage (bool, default False) – Should the adsorbate coverage be the same as in the parents.
min_adsorbate_distance (float, default 2.) – The radius of the sphere inside which no other adsorbates should be found.
adsorption_sites (acat.adsorption_sites.ClusterAdsorptionSites object or acat.adsorption_sites.SlabAdsorptionSites object or the corresponding pickle file path, default None) – Provide the built-in adsorption sites class to accelerate the genetic algorithm. Make sure all the operators used with this operator preserve the indexing of the atoms.
subtract_heights (bool, default False) – Whether to subtract the height from the bond length when allocating a site to an adsorbate. Default is to allocate the site that is closest to the adsorbate’s binding atom without subtracting height. Useful for ensuring the allocated site for each adsorbate is consistent with the site to which the adsorbate was added.
dmax (float, default 2.5) – The maximum bond length (in Angstrom) between an atom and its nearest site to be considered as the atom being bound to the site.

get_new_individual(parents)[source]: Function that returns a new individual. Overwrite in subclass.

class acat.ga.adsorbate_operators.CatalystAdsorbateCrossover(group_by_sites=False, catalyst_indices=None)[source]

Bases: AdsorbateOperator

Crossover that divides two particles or two slabs by the catalyst- adsorbates interfaces and exchange all adsorbates (only returns one of them). The indexing of the atoms is preserved. Please only use this operator with other operators that also preserves the indexing.

The composition or the coverage is fixed if it is preserved by all other operators being used.

Parameters

group_by_sites (bool, default False) – If atoms.info[‘data’][‘groups’] is used, please set this to True if the groups is for adsorption sites, so that the offspring will inheritate the site groups. The default is grouping by slab atoms.
catalyst_indices (list of ints, default None) – The atomic indices of catalyst atoms. Only metal atoms are treated as part of the catalyst by default. Useful when the indexing is preserved during the run, and there are non-metal elements in the catalyst, e.g. metal oxides.

get_new_individual(parents)[source]: Function that returns a new individual. Overwrite in subclass.

Comparator objects relevant to particles with adsorbates.

acat.ga.adsorbate_comparators.count_ads(atoms, adsorbate)[source]: Very naive implementation only taking into account the symbols. atoms and adsorbate should both be supplied as Atoms objects.

class acat.ga.adsorbate_comparators.AdsorbateCountComparator(adsorbate)[source]

Bases: object

Compares the number of adsorbates on the particles and returns True if the numbers are the same, False otherwise.

Parameters:

adsorbate: list or string a supplied list of adsorbates or a string if only one adsorbate is possible

looks_like(a1, a2)[source]: Does the actual comparison.

class acat.ga.adsorbate_comparators.AdsorptionSitesComparator(min_diff_adsorption_sites=2)[source]

Bases: object

Compares the metal atoms in the adsorption sites and returns True if less than min_diff_adsorption_sites of the sites with adsorbates consist of different atoms.

Ex: a1.info[‘data’][‘adsorbates_site_atoms’] = [(‘Cu’,’Ni’),(‘Cu’,’Ni’),(‘Ni’),(‘Ni’)]

a2.info[‘data’][‘adsorbates_site_atoms’] = [(‘Cu’,’Ni’),(‘Ni’,’Ni’, ‘Ni’),(‘Ni’),(‘Ni’)]

will have a difference of 2: (2*(‘Cu’,’Ni’)-1*(‘Cu’,’Ni’)=1, 1*(‘Ni’,’Ni’,’Ni’)=1, 2*(‘Ni’)-2*(‘Ni’)=0)

looks_like(a1, a2)[source]

class acat.ga.adsorbate_comparators.AdsorptionMetalsComparator(same_adsorption_number)[source]

Bases: object

Compares the number of adsorbate-metal bonds and returns True if the number for a1 and a2 differs by less than the supplied parameter same_adsorption_number

Ex: a1.info[‘data’][‘adsorbates_bound_to’] = {‘Cu’:1, ‘Ni’:3} a2.info[‘data’][‘adsorbates_bound_to’] = {‘Cu’:.5, ‘Ni’:3.5} will have a difference of .5 in both elements:

looks_like(a1, a2)[source]

Comparator objects based on graph theory.

class acat.ga.graph_comparators.AdsorptionGraphComparator(adsorption_sites, composition_effect=True, subtract_heights=None, hmax=2, dmax=2.5, dx=0.5, **kwargs)[source]

Bases: object

Compares the graph of adsorbate overlayer + surface atoms and returns True if they are automorphic with node matches. Before checking graph automorphism by color refinement, a cheap label match is used to reject graphs that are impossible to be automorphic.

Parameters

adsorption_sites (acat.adsorption_sites.ClusterAdsorptionSites object or acat.adsorption_sites.SlabAdsorptionSites object or the corresponding pickle file path) – Provide the acat built-in adsorption sites class to accelerate the pattern generation. Make sure all the structures have the same atom indexing.
composition_effect (bool, default True) – Whether to consider sites with different elemental compositions as different sites. It is recommended to set composition_effet=False for monometallics.
subtract_heights (bool, default None) – A dictionary that contains the height to be subtracted from the bond length when allocating a type of site to an adsorbate. Default is to allocate the site that is closest to the adsorbate’s binding atom without subtracting height. Useful for ensuring the allocated site for each adsorbate is consistent with the site to which the adsorbate was added.
hmax (int, default 2) – Maximum number of iterations for color refinement.
dmax (float, default 2.5) – The maximum bond length (in Angstrom) between an atom and its nearest site to be considered as the atom being bound to the site.
dx (float, default 0.5) – Buffer to calculate nearest neighbor pairs.

looks_like(a1, a2)[source]

class acat.ga.graph_comparators.WLGraphComparator(hmax=2, dx=0.5, tol=1e-05)[source]

Bases: object

Compares two structures (or graphs) based on the Weisfeiler-Lehman subtree kernel (color refinement), as described in N. Shervashidze et al., Journal of Machine Learning Research 2011, 12, 2539–2561. This serves as a scalable solver for checking graph automorphism of two structures.

The graphs (nx.Graph objects) can be quite costly to obtain every time a graph is required (and disk intensive if saved), thus it makes sense to get the adjacency matrix along with e.g. the potential energy and save it in atoms.info[‘data’][‘adj_matrix’].

Parameters

hmax (int, default 2) – Maximum number of iterations for color refinement.
dx (float, default 0.5) – Buffer to calculate nearest neighbor pairs.
tol (float, default 1e-5) – Tolerance of the discrepancy between K12 and sqrt(K11*K22)

looks_like(a1, a2)[source]: a1, a2 can be eithr 2 ase.Atoms objects or 2 networkx graphs.

classmethod get_label_dict(G, symbols, hmax, dx)[source]

Example 1

All the adsorbate operators and comparators can be easily used with other operators and comparators. AddAdsorbate, RemoveAdsorbate, MoveAdsorbate, ReplaceAdsorbate, ReplaceAdsorbateSpecies and CatalystAdsorbateCrossover operators can be used for both non-periodic nanoparticles and periodic surface slabs. CutSpliceCrossoverWithAdsorbates and SimpleCutSpliceCrossoverWithAdsorbates operators only work for nanoparticles, and the latter is recommended. To strictly sort out duplicate structures, consider using AdsorptionGraphComparator or WLGraphComparator. To accelerate the GA, provide adsorsption sites and use indexing-preserved operators implemented in ACAT.

As an example we will simultaneously optimize both the adsorbate overlayer pattern and the catalyst chemical ordering of a Ni110Pt37 icosahedral nanoalloy with adsorbate species of H, C, O, OH, CO, CH, CH2 and CH3 using the EMT calculator.

The script for a parallel genetic algorithm looks as follows:

from acat.settings import adsorbate_elements
from acat.adsorption_sites import ClusterAdsorptionSites
from acat.adsorbate_coverage import ClusterAdsorbateCoverage
from acat.build.ordering import RandomOrderingGenerator as ROG
from acat.build.adlayer import min_dist_coverage_pattern
from acat.ga.adsorbate_operators import (AddAdsorbate, RemoveAdsorbate,
                                         MoveAdsorbate, ReplaceAdsorbate,
                                         SimpleCutSpliceCrossoverWithAdsorbates)
# Import particle_mutations from acat instead of ase to get the indexing-preserved version
from acat.ga.particle_mutations import (RandomPermutation, COM2surfPermutation,
                                        Rich2poorPermutation, Poor2richPermutation)
from ase.ga.particle_comparator import NNMatComparator
from ase.ga.standard_comparators import SequentialComparator, StringComparator
from ase.ga.offspring_creator import OperationSelector
from ase.ga.population import Population, RankFitnessPopulation
from ase.ga.convergence import GenerationRepetitionConvergence
from ase.ga.utilities import closest_distances_generator, get_nnmat
from ase.ga.data import DataConnection, PrepareDB
from ase.io import read, write
from ase.cluster import Icosahedron
from ase.calculators.emt import EMT
from ase.optimize import BFGS
from collections import defaultdict
from random import uniform, randint
from multiprocessing import Pool
import numpy as np
import time
import os

# Define population
# Recommend to choose a number that is a multiple of the number of cpu
pop_size = 50

# Generate 50 icosahedral Ni110Pt37 nanoparticles with random orderings
particle = Icosahedron('Ni', noshells=4)
particle.center(vacuum=5.)
rog = ROG(particle, elements=['Ni', 'Pt'],
          composition={'Ni': 0.75, 'Pt': 0.25},
          trajectory='starting_generation.traj')
rog.run(num_gen=pop_size)

# Generate random coverage on each nanoparticle
species = ['H', 'C', 'O', 'OH', 'CO', 'CH', 'CH2', 'CH3']
images = read('starting_generation.traj', index=':')
patterns = []
for atoms in images:
    dmin = uniform(3.5, 8.5)
    pattern = min_dist_coverage_pattern(atoms, adsorbate_species=species,
                                        min_adsorbate_distance=dmin)
    patterns.append(pattern)

# Get the adsorption sites. Composition does not affect GA operations
sas = ClusterAdsorptionSites(particle, composition_effect=False)

# Instantiate the db
db_name = 'ridge_Ni110Pt37_ads.db'

db = PrepareDB(db_name, cell=particle.cell, population_size=pop_size)

for atoms in patterns:
    if 'data' not in atoms.info:
        atoms.info['data'] = {'tag': None}
    db.add_unrelaxed_candidate(atoms, data=atoms.info['data'])

# Connect to the db
db = DataConnection(db_name)

# Define operators
soclist = ([1, 1, 2, 1, 1, 1, 1, 2],
           [Rich2poorPermutation(elements=['Ni', 'Pt'], num_muts=5),
            Poor2richPermutation(elements=['Ni', 'Pt'], num_muts=5),
            RandomPermutation(elements=['Ni', 'Pt'], num_muts=5),
            AddAdsorbate(species, adsorption_sites=sas, num_muts=5),
            RemoveAdsorbate(species, adsorption_sites=sas, num_muts=5),
            MoveAdsorbate(species, adsorption_sites=sas, num_muts=5),
            ReplaceAdsorbate(species, adsorption_sites=sas, num_muts=5),
            SimpleCutSpliceCrossoverWithAdsorbates(species, keep_composition=True,
                                                   adsorption_sites=sas),])
op_selector = OperationSelector(*soclist)

# Define comparators
comp = SequentialComparator([StringComparator('potential_energy'),
                             NNMatComparator(0.2, ['Ni', 'Pt'])],
                            [0.5, 0.5])

def get_ads(atoms):
    """Returns a list of adsorbate names and corresponding indices."""

    if 'data' not in atoms.info:
        atoms.info['data'] = {'tag': None}
    if 'adsorbates' in atoms.info['data']:
        adsorbates = atoms.info['data']['adsorbates']
    else:
        cac = ClusterAdsorbateCoverage(atoms)
        adsorbates = [t[0] for t in cac.get_adsorbates()]

    return adsorbates

def vf(atoms):
    """Returns the descriptor that distinguishes candidates in the
    niched population."""

    return len(get_ads(atoms))

# Give fittest candidates at different coverages equal fitness.
# Use this to find global minimum at each adsorbate coverage
pop = RankFitnessPopulation(data_connection=db,
                            population_size=pop_size,
                            comparator=comp,
                            variable_function=vf,
                            exp_function=True,
                            logfile='log.txt')

# Normal fitness ranking irrespective of adsorabte coverage
#pop = Population(data_connection=db,
#                 population_size=pop_size,
#                 comparator=comp,
#                 logfile='log.txt')

# Set convergence criteria
cc = GenerationRepetitionConvergence(pop, 5)

# Calculate chemical potentials
chem_pots = {'CH4': -24.039, 'H2O': -14.169, 'H2': -6.989}

# Define the relax function
def relax(atoms, single_point=False):
    atoms.center(vacuum=5.)
    atoms.calc = EMT()
    if not single_point:
        opt = BFGS(atoms, logfile=None)
        opt.run(fmax=0.1)
    time.sleep(1) # Add delay (only for testing)

    Epot = atoms.get_potential_energy()
    num_H = len([s for s in atoms.symbols if s == 'H'])
    num_C = len([s for s in atoms.symbols if s == 'C'])
    num_O = len([s for s in atoms.symbols if s == 'O'])
    mutot = num_C * chem_pots['CH4'] + num_O * chem_pots['H2O'] + (
            num_H - 4 * num_C - 2 * num_O) * chem_pots['H2'] / 2
    f = -(Epot - mutot)

    atoms.info['key_value_pairs']['raw_score'] = f
    atoms.info['key_value_pairs']['potential_energy'] = Epot

    # Parallelize nnmat calculations to accelerate NNMatComparator
    atoms.info['data']['nnmat'] = get_nnmat(atoms)

    return atoms

# Relax starting generation
def relax_an_unrelaxed_candidate(atoms):
    if 'data' not in atoms.info:
        atoms.info['data'] = {'tag': None}
    nncomp = atoms.get_chemical_formula(mode='hill')
    print('Relaxing ' + nncomp)

    return relax(atoms, single_point=True) # Single point only for testing

# Create a multiprocessing Pool
pool = Pool(os.cpu_count())
# Perform relaxations in parallel. Especially
# useful when running GA on large nanoparticles
relaxed_candidates = pool.map(relax_an_unrelaxed_candidate,
                              db.get_all_unrelaxed_candidates())
pool.close()
pool.join()
db.add_more_relaxed_candidates(relaxed_candidates)
pop.update()

# Number of generations
num_gens = 1000

# Below is the iterative part of the algorithm
gen_num = db.get_generation_number()
for i in range(num_gens):
    # Check if converged
    if cc.converged():
        print('Converged')
        break
    print('Creating and evaluating generation {0}'.format(gen_num + i))

    def procreation(x):
        # Select an operator and use it
        op = op_selector.get_operator()
        while True:
            # Assign rng with a random seed
            np.random.seed(randint(1, 10000))
            pop.rng = np.random
            # Select parents for a new candidate
            p1, p2 = pop.get_two_candidates()
            parents = [p1, p2]
            # Pure or bare nanoparticles are not considered
            if len(set(p1.numbers)) < 3:
                continue
            offspring, desc = op.get_new_individual(parents)
            # An operator could return None if an offspring cannot be formed
            # by the chosen parents
            if offspring is not None:
                break
        nncomp = offspring.get_chemical_formula(mode='hill')
        print('Relaxing ' + nncomp)
        if 'data' not in offspring.info:
            offspring.info['data'] = {'tag': None}

        return relax(offspring, single_point=True) # Single point only for testing

    # Create a multiprocessing Pool
    pool = Pool(os.cpu_count())
    # Perform procreations in parallel. Especially useful when
    # using adsorbate operators which requires site identification
    relaxed_candidates = pool.map(procreation, range(pop_size))
    pool.close()
    pool.join()
    db.add_more_relaxed_candidates(relaxed_candidates)

    # Update the population to allow new candidates to enter
    pop.update()

Example 2

If we want to study the same system but fix the adsorbate coverage to be exactly 20 adsorbates, we should only use MoveAdsorbate, ReplaceAdsorbate and SimpleCutSpliceCrossoverWithAdsorbates operators with same particle operators. Remember to set fix_coverage to True in SimpleCutSpliceCrossoverWithAdsorbates.

The script for a fixed-coverage parallel genetic algorithm now looks as follows:

from acat.settings import adsorbate_elements
from acat.adsorption_sites import ClusterAdsorptionSites
from acat.adsorbate_coverage import ClusterAdsorbateCoverage
from acat.build.ordering import RandomOrderingGenerator as ROG
from acat.build.adlayer import RandomPatternGenerator as RPG
from acat.ga.adsorbate_operators import (AddAdsorbate, RemoveAdsorbate,
                                         MoveAdsorbate, ReplaceAdsorbate,
                                         SimpleCutSpliceCrossoverWithAdsorbates)
# Import particle_mutations from acat instead of ase to get the indexing-preserved version
from acat.ga.particle_mutations import (RandomPermutation, COM2surfPermutation,
                                        Rich2poorPermutation, Poor2richPermutation)
from ase.ga.particle_comparator import NNMatComparator
from ase.ga.standard_comparators import SequentialComparator, StringComparator
from ase.ga.offspring_creator import OperationSelector
from ase.ga.population import Population, RankFitnessPopulation
from ase.ga.convergence import GenerationRepetitionConvergence
from ase.ga.utilities import closest_distances_generator, get_nnmat
from ase.ga.data import DataConnection, PrepareDB
from ase.io import read, write
from ase.cluster import Icosahedron
from ase.calculators.emt import EMT
from ase.optimize import BFGS
from collections import defaultdict
from random import randint
from multiprocessing import Pool
import numpy as np
import time
import os

# Define population
# Recommand to choose a number that is a multiple of the number of cpu
pop_size = 50

# Generate 50 icosahedral Ni110Pt37 nanoparticles with random orderings
particle = Icosahedron('Ni', noshells=4)
particle.center(vacuum=5.)
rog = ROG(particle, elements=['Ni', 'Pt'],
          composition={'Ni': 0.75, 'Pt': 0.25},
          trajectory='starting_generation.traj')
rog.run(num_gen=pop_size)

# Generate random coverage patterns of 20 adsorbates on nanoparticles
species = ['H', 'C', 'O', 'OH', 'CO', 'CH', 'CH2', 'CH3']
images = read('starting_generation.traj', index=':')
num_ads = 20 # Number of adsorbates, fix at this coverage throughout GA
for _ in range(num_ads): # This will take quite some time
    rpg = RPG(images, adsorbate_species=['CO','OH','N'],
              min_adsorbate_distance=3.,
              composition_effect=True)
    rpg.run(num_gen=pop_size, action='add', unique=False)
    images = read('patterns.traj')
patterns = read('patterns.traj', index=':')

# Get the adsorption sites. Composition does not affect GA operations
sas = ClusterAdsorptionSites(particle, composition_effect=False)

# Instantiate the db
db_name = 'ridge_Ni110Pt37_ads20.db'

db = PrepareDB(db_name, cell=particle.cell, population_size=pop_size)

for atoms in patterns:
    if 'data' not in atoms.info:
        atoms.info['data'] = {'tag': None}
    db.add_unrelaxed_candidate(atoms, data=atoms.info['data'])

# Connect to the db
db = DataConnection(db_name)

# Define operators
soclist = ([1, 1, 2, 1, 1, 2],
           [Rich2poorPermutation(elements=['Ni', 'Pt'], num_muts=5),
            Poor2richPermutation(elements=['Ni', 'Pt'], num_muts=5),
            RandomPermutation(elements=['Ni', 'Pt'], num_muts=5),
            MoveAdsorbate(species, adsorption_sites=sas, num_muts=5),
            ReplaceAdsorbate(species, adsorption_sites=sas, num_muts=5),
            SimpleCutSpliceCrossoverWithAdsorbates(species, keep_composition=True,
                                                   fix_coverage=True,
                                                   adsorption_sites=sas),])
op_selector = OperationSelector(*soclist)

# Define comparators
comp = SequentialComparator([StringComparator('potential_energy'),
                             NNMatComparator(0.2, ['Ni', 'Pt'])],
                            [0.5, 0.5])

# Normal fitness ranking irrespective of adsorbate coverage
pop = Population(data_connection=db,
                 population_size=pop_size,
                 comparator=comp,
                 logfile='log.txt')

# Set convergence criteria
cc = GenerationRepetitionConvergence(pop, 5)

# Calculate chemical potentials
chem_pots = {'CH4': -24.039, 'H2O': -14.169, 'H2': -6.989}

# Define the relax function
def relax(atoms, single_point=False):
    atoms.center(vacuum=5.)
    atoms.calc = EMT()
    if not single_point:
        opt = BFGS(atoms, logfile=None)
        opt.run(fmax=0.1)
    time.sleep(1) # Add delay (only for testing)

    Epot = atoms.get_potential_energy()
    num_H = len([s for s in atoms.symbols if s == 'H'])
    num_C = len([s for s in atoms.symbols if s == 'C'])
    num_O = len([s for s in atoms.symbols if s == 'O'])
    mutot = num_C * chem_pots['CH4'] + num_O * chem_pots['H2O'] + (
            num_H - 4 * num_C - 2 * num_O) * chem_pots['H2'] / 2
    f = -(Epot - mutot)

    atoms.info['key_value_pairs']['raw_score'] = f
    atoms.info['key_value_pairs']['potential_energy'] = Epot

    # Parallelize nnmat calculations to accelerate NNMatComparator
    atoms.info['data']['nnmat'] = get_nnmat(atoms)

    return atoms

# Relax starting generation
def relax_an_unrelaxed_candidate(atoms):
    if 'data' not in atoms.info:
        atoms.info['data'] = {'tag': None}
    nncomp = atoms.get_chemical_formula(mode='hill')
    print('Relaxing ' + nncomp)

    return relax(atoms, single_point=True) # Single point only for testing

# Create a multiprocessing Pool
pool = Pool(os.cpu_count())
# Perform relaxations in parallel. Especially
# useful when running GA on large nanoparticles
relaxed_candidates = pool.map(relax_an_unrelaxed_candidate,
                              db.get_all_unrelaxed_candidates())
pool.close()
pool.join()
db.add_more_relaxed_candidates(relaxed_candidates)
pop.update()

# Number of generations
num_gens = 1000

# Below is the iterative part of the algorithm
gen_num = db.get_generation_number()
for i in range(num_gens):
    # Check if converged
    if cc.converged():
        print('Converged')
        break
    print('Creating and evaluating generation {0}'.format(gen_num + i))

    def procreation(x):
        # Select an operator and use it
        op = op_selector.get_operator()
        while True:
            # Assign rng with a random seed
            np.random.seed(randint(1, 10000))
            pop.rng = np.random
            # Select parents for a new candidate
            p1, p2 = pop.get_two_candidates()
            parents = [p1, p2]
            # Pure or bare nanoparticles are not considered
            if len(set(p1.numbers)) < 3:
                continue
            offspring, desc = op.get_new_individual(parents)
            # An operator could return None if an offspring cannot be formed
            # by the chosen parents
            if offspring is not None:
                break
        nncomp = offspring.get_chemical_formula(mode='hill')
        print('Relaxing ' + nncomp)
        if 'data' not in offspring.info:
            offspring.info['data'] = {'tag': None}

        return relax(offspring, single_point=True) # Single point only for testing

    # Create a multiprocessing Pool
    pool = Pool(os.cpu_count())
    # Perform procreations in parallel. Especially useful when
    # using adsorbate operators which requires site identification
    relaxed_candidates = pool.map(procreation, range(pop_size))
    pool.close()
    pool.join()
    db.add_more_relaxed_candidates(relaxed_candidates)

    # Update the population to allow new candidates to enter
    pop.update()

Symmetry-constrained genetic algorithm for adlayer patterns

Procreation operators meant to be used in symmetry-constrained genetic algorithm (SCGA).

class acat.ga.group_operators.Mutation(num_muts=1)[source]

Bases: OffspringCreator

Base class for all particle mutation type operators. Do not call this class directly.

class acat.ga.group_operators.AdsorbateGroupSubstitute(adsorbate_species, species_probabilities=None, site_groups=None, max_species=None, heights={'3fold': 1.3, '4fold': 1.3, '5fold': 1.5, '6fold': 0.0, 'bridge': 1.5, 'fcc': 1.3, 'hcp': 1.3, 'longbridge': 1.5, 'ontop': 1.8, 'shortbridge': 1.5}, adsorption_sites=None, remove_site_shells=1, remove_site_radius=None, subtract_heights=False, num_muts=1, dmax=2.5, **kwargs)[source]

Bases: Mutation

Substitute all the adsorbates (or vacancies) in a site group with a different element.

Parameters

adsorbate_species (str or list of strs) – A list of possible adsorbate species to be added to the surface.
species_probabilities (dict, default None) – A dictionary that contains keys of each adsorbate species and values of their probabilities of replacing an adsorbate on the surface. Choosing adsorbate species with equal probability if not specified.
site_groups (list of lists or list of list of lists, default None) – The site indices in each user-divided group. Can be obtained by acat.build.adlayer.OrderedPatternGenerator. You can also mix structures with different groupings in one GA run by providing all possible groups in a list of list of lists, so that the algorithm will randomly assign a grouping to the structure, where for each grouping the adsorbate / vacancy occupations in each site group are of the same type. If not provided here, please provide the groups in atoms.info[‘data’][‘groups’] in all intial structures.
max_species (int, default None) – The maximum allowed adsorbate species (excluding vacancies) for a single structure. Allow all adsorbate species if not specified.
heights (dict, default acat.settings.site_heights) – A dictionary that contains the adsorbate height for each site type. Use the default height settings if the height for a site type is not specified.
adsorption_sites (acat.adsorption_sites.ClusterAdsorptionSites object or acat.adsorption_sites.SlabAdsorptionSites object or the corresponding pickle file path, default None) – Provide the built-in adsorption sites class to accelerate the pattern generation. Make sure all the structures have the same periodicity and atom indexing. If composition_effect=True, you should only provide adsorption_sites when the surface composition is fixed. If this is not provided, the arguments for identifying adsorption sites can still be passed in by **kwargs.
remove_site_shells (int, default 1) – The neighbor shell number within which the neighbor sites should be removed. Remove the 1st neighbor site shell by default. Set to 0 if no site should be removed.
remove_site_radius (float, default None) – The radius within which the neighbor sites should be removed. This serves as an alternative to remove_site_shells.
subtract_heights (bool, default False) – Whether to subtract the height from the bond length when allocating a site to an adsorbate. Default is to allocate the site that is closest to the adsorbate’s binding atom without subtracting height. Useful for ensuring the allocated site for each adsorbate is consistent with the site to which the adsorbate was added.
num_muts (int, default 1) – The number of times to perform this operation.
dmax (float, default 2.5) – The maximum bond length (in Angstrom) between an atom and its nearest site to be considered as the atom being bound to the site.

substitute(atoms)[source]: Does the actual substitution of the adsorbates

get_new_individual(parents)[source]: Function that returns a new individual. Overwrite in subclass.

class acat.ga.group_operators.AdsorbateGroupPermutation(adsorbate_species, site_groups=None, heights={'3fold': 1.3, '4fold': 1.3, '5fold': 1.5, '6fold': 0.0, 'bridge': 1.5, 'fcc': 1.3, 'hcp': 1.3, 'longbridge': 1.5, 'ontop': 1.8, 'shortbridge': 1.5}, adsorption_sites=None, remove_site_shells=1, remove_site_radius=None, subtract_heights=False, num_muts=1, dmax=2.5, **kwargs)[source]

Bases: Mutation

Permutes the elements in two random site groups.

Parameters

adsorbate_species (str or list of strs) – A list of possible adsorbate species to be added to the surface.
site_groups (list of lists or list of list of lists, default None) – The site indices in each user-divided group. Can be obtained by acat.build.adlayer.OrderedPatternGenerator. You can also mix structures with different groupings in one GA run by providing all possible groups in a list of list of lists, so that the algorithm will randomly assign a grouping to the structure, where for each grouping the adsorbate / vacancy occupations in each site group are of the same type. If not provided here, please provide the groups in atoms.info[‘data’][‘groups’] in all intial structures.
heights (dict, default acat.settings.site_heights) – A dictionary that contains the adsorbate height for each site type. Use the default height settings if the height for a site type is not specified.
adsorption_sites (acat.adsorption_sites.ClusterAdsorptionSites object or acat.adsorption_sites.SlabAdsorptionSites object or the corresponding pickle file path, default None) – Provide the built-in adsorption sites class to accelerate the pattern generation. Make sure all the structures have the same periodicity and atom indexing. If composition_effect=True, you should only provide adsorption_sites when the surface composition is fixed. If this is not provided, the arguments for identifying adsorption sites can still be passed in by **kwargs.
remove_site_shells (int, default 1) – The neighbor shell number within which the neighbor sites should be removed. Remove the 1st neighbor site shell by default. Set to 0 if no site should be removed.
remove_site_radius (float, default None) – The radius within which the neighbor sites should be removed. This serves as an alternative to remove_site_shells.
subtract_heights (bool, default False) – Whether to subtract the height from the bond length when allocating a site to an adsorbate. Default is to allocate the site that is closest to the adsorbate’s binding atom without subtracting height. Useful for ensuring the allocated site for each adsorbate is consistent with the site to which the adsorbate was added.
num_muts (int, default 1) – The number of times to perform this operation.
dmax (float, default 2.5) – The maximum bond length (in Angstrom) between an atom and its nearest site to be considered as the atom being bound to the site.

get_new_individual(parents)[source]: Function that returns a new individual. Overwrite in subclass.

classmethod mutate(atoms, groups, heights, adsorption_sites, hetero_site_list, remove_site_shells, remove_site_radius)[source]: Do the actual permutation of the adsorbates.

Example 1

All the symmetry-constrained adsorbate operators can be easily used with other indexing-preserved operators, e.g. All operators can be used for both non-periodic nanoparticles and periodic surface slabs. To accelerate the GA, provide adsorsption sites, site pairing (or all possible parings), and use indexing-preserved operators implemented in ACAT.

The script for a parallel symmetry-constrained genetic algorithm (SCGA) looks as follows:

from acat.settings import adsorbate_elements
from acat.adsorption_sites import SlabAdsorptionSites
from acat.adsorbate_coverage import SlabAdsorbateCoverage
from acat.build.ordering import RandomOrderingGenerator as ROG
from acat.build.adlayer import OrderedPatternGenerator as OPG
from acat.ga.adsorbate_operators import (ReplaceAdsorbateSpecies,
                                         CatalystAdsorbateCrossover)
from acat.ga.slab_operators import (CutSpliceSlabCrossover,
                                    RandomSlabPermutation,
                                    RandomCompositionMutation)
from acat.ga.group_operators import (AdsorbateGroupSubstitute,
                                     AdsorbateGroupPermutation)
from ase.ga.standard_comparators import SequentialComparator, StringComparator
from ase.ga.offspring_creator import OperationSelector
from ase.ga.population import Population, RankFitnessPopulation
from ase.ga.convergence import GenerationRepetitionConvergence
from ase.ga.utilities import closest_distances_generator, get_nnmat
from ase.ga.data import DataConnection, PrepareDB
from ase.io import read, write
from ase.build import fcc111
from ase.calculators.emt import EMT
from ase.optimize import BFGS
from collections import defaultdict
from multiprocessing import Pool
from random import randint
import numpy as np
import time
import os

# Define population
# Recommend to choose a number that is a multiple of the number of cpu
pop_size = 50

# Generate 50 Ni48Pt16 slabs with random orderings
slab = fcc111('Ni', (4, 4, 4), vacuum=5., orthogonal=True, periodic=True)
slab_ids = list(range(len(slab)))
rog = ROG(slab, elements=['Ni', 'Pt'],
          composition={'Ni': 0.75, 'Pt': 0.25},
          trajectory='starting_generation.traj')
rog.run(num_gen=pop_size)

# Get the adsorption sites. Composition does not affect GA operations
sas = SlabAdsorptionSites(slab, surface='fcc111', ignore_sites='bridge',
                          composition_effect=False)

# Generate random coverage on each slab and save the groupings
species = ['H', 'C', 'O', 'OH', 'CO', 'CH']
images = read('starting_generation.traj', index=':')
opg = OPG(images, adsorbate_species=species, surface='fcc111',
          adsorption_sites=sas, max_species=2, allow_odd=True,
          remove_site_shells=1, save_groups=True,
          trajectory='patterns.traj', append_trajectory=True)
opg.run(max_gen=pop_size, unique=True)
patterns = read('patterns.traj', index=':')

# Instantiate the db
db_name = 'ridge_Ni48Pt16_ads.db'

db = PrepareDB(db_name, cell=slab.cell, population_size=pop_size)

for atoms in patterns:
    if 'data' not in atoms.info:
        atoms.info['data'] = {'tag': None}
    db.add_unrelaxed_candidate(atoms, data=atoms.info['data'])

# Connect to the db
db = DataConnection(db_name)

# Define operators
soclist = ([3, 3, 2, 3, 3, 3],
           [RandomSlabPermutation(allowed_indices=slab_ids),
            RandomCompositionMutation(allowed_indices=slab_ids),
            ReplaceAdsorbateSpecies(species, replace_vacancy=True,
                                    adsorption_sites=sas),
            AdsorbateGroupSubstitute(species, max_species=2,
                                     adsorption_sites=sas,
                                     remove_site_shells=1),
            AdsorbateGroupPermutation(species, adsorption_sites=sas,
                                      remove_site_shells=1),
            CatalystAdsorbateCrossover(),])
op_selector = OperationSelector(*soclist)

# Define comparators
comp = StringComparator('potential_energy')

def get_ads(atoms):
    """Returns a list of adsorbate names and corresponding indices."""

    if 'data' not in atoms.info:
        atoms.info['data'] = {'tag': None}
    if 'adsorbates' in atoms.info['data']:
        adsorbates = atoms.info['data']['adsorbates']
    else:
        cac = SlabAdsorbateCoverage(atoms, adsorption_sites=sas)
        adsorbates = [t[0] for t in cac.get_adsorbates()]

    return adsorbates

def vf(atoms):
    """Returns the descriptor that distinguishes candidates in the
    niched population."""

    return len(get_ads(atoms))

# Give fittest candidates at different coverages equal fitness.
# Use this to find global minimum at each adsorbate coverage
pop = RankFitnessPopulation(data_connection=db,
                            population_size=pop_size,
                            comparator=comp,
                            variable_function=vf,
                            exp_function=True,
                            logfile='log.txt')

# Normal fitness ranking irrespective of adsorbate coverage
#pop = Population(data_connection=db,
#                 population_size=pop_size,
#                 comparator=comp,
#                 logfile='log.txt')

# Set convergence criteria
cc = GenerationRepetitionConvergence(pop, 5)

# Calculate chemical potentials
chem_pots = {'CH4': -24.039, 'H2O': -14.169, 'H2': -6.989}

# Define the relax function
def relax(atoms, single_point=False):
    atoms.calc = EMT()
    if not single_point:
        opt = BFGS(atoms, logfile=None)
        opt.run(fmax=0.1)
    time.sleep(1) # Add delay (only for testing)

    Epot = atoms.get_potential_energy()
    num_H = len([s for s in atoms.symbols if s == 'H'])
    num_C = len([s for s in atoms.symbols if s == 'C'])
    num_O = len([s for s in atoms.symbols if s == 'O'])
    mutot = num_C * chem_pots['CH4'] + num_O * chem_pots['H2O'] + (
            num_H - 4 * num_C - 2 * num_O) * chem_pots['H2'] / 2
    f = -(Epot - mutot)

    atoms.info['key_value_pairs']['raw_score'] = f
    atoms.info['key_value_pairs']['potential_energy'] = Epot

    return atoms

# Relax starting generation
def relax_an_unrelaxed_candidate(atoms):
    if 'data' not in atoms.info:
        atoms.info['data'] = {'tag': None}
    nncomp = atoms.get_chemical_formula(mode='hill')
    print('Relaxing ' + nncomp)

    return relax(atoms, single_point=True) # Single point only for testing

# Create a multiprocessing Pool
pool = Pool(25)#os.cpu_count())
# Perform relaxations in parallel.
relaxed_candidates = pool.map(relax_an_unrelaxed_candidate,
                              db.get_all_unrelaxed_candidates())
pool.close()
pool.join()
db.add_more_relaxed_candidates(relaxed_candidates)
pop.update()

# Number of generations
num_gens = 1000

# Below is the iterative part of the algorithm
gen_num = db.get_generation_number()
for i in range(num_gens):

    # Check if converged
    if cc.converged():
        print('Converged')
        break
    print('Creating and evaluating generation {0}'.format(gen_num + i))

    def procreation(x):
        # Select an operator and use it
        op = op_selector.get_operator()
        while True:
            # Assign rng with a random seed
            np.random.seed(randint(1, 10000))
            pop.rng = np.random
            # Select parents for a new candidate
            p1, p2 = pop.get_two_candidates()
            parents = [p1, p2]
            # Pure or bare slabs are not considered
            if len(set(p1.numbers)) < 3:
                continue
            offspring, desc = op.get_new_individual(parents)
            # An operator could return None if an offspring cannot be formed
            # by the chosen parents
            if offspring is not None:
                break
        nncomp = offspring.get_chemical_formula(mode='hill')
        print('Relaxing ' + nncomp)
        if 'data' not in offspring.info:
            offspring.info['data'] = {'tag': None}

        return relax(offspring, single_point=True) # Single point only for testing

    # Create a multiprocessing Pool
    pool = Pool(25)#os.cpu_count())
    # Perform procreations in parallel. Especially useful when
    # using adsorbate operators which requires site identification
    relaxed_candidates = pool.map(procreation, range(pop_size))
    pool.close()
    pool.join()
    db.add_more_relaxed_candidates(relaxed_candidates)

    # Update the population to allow new candidates to enter
    pop.update()

Evolutionary Multitasking

Implementation for evolutionary multitasking (EM)

class acat.ga.multitasking.MultitaskPopulation(data_connection, population_size, num_tasks, comparator=None, logfile=None, use_extinct=False, exp_function=True, exp_prefactor=0.5, rng=<module 'numpy.random' from '/home/modules/software/SciPy-bundle/2022.05-intel-2022a/lib/python3.10/site-packages/numpy/random/__init__.py'>)[source]

Bases: Population

Different tasks are assigned to different niches. The candidates are ranked according to the effective fitness, given by the shortest distance between the raw score of the marked niche and the upper envelope after adding the individual. The raw score is given by the fitness gain in the maximum-gained niche. The raw scores of each configuration for all tasks must be provided as a Numpy array in atoms.info[‘data’][‘raw_scores’]. After providing the raw scores, the effective score of each configuration is automatically calculated and stored in atoms.info[‘key_value_pairs’][‘raw_score’]. The dominating niche of each configuration is stored in atoms.info[‘key_value_pairs’][‘dominating_niche’], the best niche (i.e. the niche closest to the upper envelope) is stored in atoms.info[‘key_value_pairs’][‘best_niche’], and the niches that are dominated by the dominating niche are stored in atoms.info[‘data’][‘niches’].

Parameters

num_tasks (int) – The number of tasks.
exp_function (bool, default True) – If True use an exponential function for ranking the fitness. If False use the same as in Population.
exp_prefactor (float, default 0.5) – The prefactor used in the exponential fitness scaling function.

get_rank(candidates)[source]

set_vf_dict(candidates, **sort_arguments)[source]

update(new_cand=None)[source]: The update method in Population will add to the end of the population, that can’t be used here since the fitness will potentially change for all candidates when new are added, therefore just recalc the population every time. New candidates are required (must not be added before calling this method). The maximum gain dynamic niching (MGDN) algorithm is executed.

get_two_candidates()[source]: Returns two candidates for pairing employing the roulete wheel selection scheme described in R.L. Johnston Dalton Transactions, Vol. 22, No. 22. (2003), pp. 4193-4207

class acat.ga.multitasking.MultitaskRepetitionConvergence(population_instance, number_of_generations, max_generations=100000000)[source]

Bases: Convergence

Returns True if the latest finished population has no fitness gain in any task for number_of_generations.

Parameters

number_of_generations (int) – How many generations need to be equal before convergence.
max_generations (int, default indefinte) – The maximum number of generations the GA is allowed to run.

converged()[source]: This function is called to find out if the algorithm run has converged, it should return True or False. Overwrite this in the inherited class.

Example 1

We implement evolutionary multitasking in a Population instance and a Convergence instance in ACAT. The maximum-gain dynamic niching (MGDN) algrotim is implemented in the MultitaskPopulation.

As an example we will simultaneously optimize both the adsorbate overlayer pattern and the catalyst chemical ordering of a Ni48Pt16 fcc111 slab with adsorbate species of H, C, O, OH, CO, CH using the EMT calculator for 10 tasks: each with CH4 chemical potentials of 20, 21, 22, 23, 24, 25, 26, 27, 28 and 29 eV, respectively.

The script for a parallel multitasking symmetry-constrained genetic algorithm (SCGA) looks as follows:

from acat.settings import adsorbate_elements
from acat.adsorption_sites import SlabAdsorptionSites
from acat.adsorbate_coverage import SlabAdsorbateCoverage
from acat.build.ordering import RandomOrderingGenerator as ROG
from acat.build.adlayer import OrderedPatternGenerator as OPG
from acat.ga.adsorbate_operators import (ReplaceAdsorbateSpecies,
                                         CatalystAdsorbateCrossover)
from acat.ga.slab_operators import (CutSpliceSlabCrossover,
                                    RandomSlabPermutation,
                                    RandomCompositionMutation)
from acat.ga.group_operators import (AdsorbateGroupSubstitute,
                                     AdsorbateGroupPermutation)
from acat.ga.multitasking import (MultitaskPopulation,
                                  MultitaskRepetitionConvergence)
from ase.ga.convergence import GenerationRepetitionConvergence
from ase.ga.standard_comparators import SequentialComparator, StringComparator
from ase.ga.offspring_creator import OperationSelector
from ase.ga.utilities import closest_distances_generator, get_nnmat
from ase.ga.data import DataConnection, PrepareDB
from ase.io import read, write
from ase.build import fcc111
from ase.calculators.emt import EMT
from ase.optimize import BFGS
from collections import defaultdict
from multiprocessing import Pool
from random import randint
import numpy as np
import time
import os

# Define population
# Recommend to choose a number that is a multiple of the number of cpu
pop_size = 50

# Define the tasks. In this case we use 10 different chemical potentials of CH4
tasks = np.arange(20., 30., 1.)

# Generate 50 Ni48Pt16 slabs with random orderings
slab = fcc111('Ni', (4, 4, 4), vacuum=5., orthogonal=True, periodic=True)
slab_ids = list(range(len(slab)))
rog = ROG(slab, elements=['Ni', 'Pt'],
          composition={'Ni': 0.75, 'Pt': 0.25},
          trajectory='starting_generation.traj')
rog.run(num_gen=pop_size)

# Get the adsorption sites. Composition does not affect GA operations
sas = SlabAdsorptionSites(slab, surface='fcc111', ignore_sites='bridge',
                          composition_effect=False)

# Generate random coverage on each slab and save the groupings
species = ['H', 'C', 'O', 'OH', 'CO', 'CH']
images = read('starting_generation.traj', index=':')
opg = OPG(images, adsorbate_species=species, surface='fcc111',
          adsorption_sites=sas, max_species=2, allow_odd=True,
          remove_site_shells=1, save_groups=True,
          trajectory='patterns.traj', append_trajectory=True)
opg.run(max_gen=pop_size, unique=True)
patterns = read('patterns.traj', index=':')

# Instantiate the db
db_name = 'ridge_Ni48Pt16_ads_multitask.db'

db = PrepareDB(db_name, cell=slab.cell, population_size=pop_size)

for atoms in patterns:
    if 'data' not in atoms.info:
        atoms.info['data'] = {'tag': None}
    db.add_unrelaxed_candidate(atoms, data=atoms.info['data'])

# Connect to the db
db = DataConnection(db_name)

# Define operators
soclist = ([3, 3, 2, 3, 3, 3],
           [RandomSlabPermutation(allowed_indices=slab_ids),
            RandomCompositionMutation(allowed_indices=slab_ids),
            ReplaceAdsorbateSpecies(species, replace_vacancy=True,
                                    adsorption_sites=sas),
            AdsorbateGroupSubstitute(species, max_species=2,
                                     adsorption_sites=sas,
                                     remove_site_shells=1),
            AdsorbateGroupPermutation(species, adsorption_sites=sas,
                                      remove_site_shells=1),
            CatalystAdsorbateCrossover(),])
op_selector = OperationSelector(*soclist)

# Define comparators
comp = StringComparator('potential_energy')

# Initialize the population and specify the number of tasks
pop = MultitaskPopulation(data_connection=db,
                          population_size=pop_size,
                          num_tasks=len(tasks),
                          comparator=comp,
                          exp_function=True,
                          logfile='log.txt')

# Set convergence criteria
#cc = MultitaskRepetitionConvergence(pop, 10)
cc = GenerationRepetitionConvergence(pop, 2)

# Calculate chemical potentials
chem_pots = {'CH4': tasks, 'H2O': -14.169, 'H2': -6.989}

# Define the relax function
def relax(atoms, single_point=False):
    atoms.calc = EMT()
    if not single_point:
        opt = BFGS(atoms, logfile=None)
        opt.run(fmax=0.1)
    time.sleep(1) # Add delay (only for testing)

    Epot = atoms.get_potential_energy()
    num_H = len([s for s in atoms.symbols if s == 'H'])
    num_C = len([s for s in atoms.symbols if s == 'C'])
    num_O = len([s for s in atoms.symbols if s == 'O'])
    mutots = num_C * chem_pots['CH4'] + num_O * chem_pots['H2O'] + (
             num_H - 4 * num_C - 2 * num_O) * chem_pots['H2'] / 2
    fs = -(Epot - mutots)

    # Save the raw fitness of the configuration for all tasks
    # as a Numpy array in atoms.info['data']['raw_scores']
    atoms.info['data']['raw_scores'] = fs
    atoms.info['key_value_pairs']['potential_energy'] = Epot

    return atoms

# Relax starting generation
def relax_an_unrelaxed_candidate(atoms):
    if 'data' not in atoms.info:
        atoms.info['data'] = {'tag': None}
    nncomp = atoms.get_chemical_formula(mode='hill')
    print('Relaxing ' + nncomp)

    return relax(atoms, single_point=True) # Single point only for testing

# Create a multiprocessing Pool
pool = Pool(os.cpu_count())
# Perform relaxations in parallel.
relaxed_candidates = pool.map(relax_an_unrelaxed_candidate,
                              db.get_all_unrelaxed_candidates())
pool.close()
pool.join()
# Update the population with the newly relaxed candidates
# (DO NOT add relaxed_candidates into db before this update)
pop.update(new_cand=relaxed_candidates)

# Number of generations
num_gens = 1000

# Below is the iterative part of the algorithm
gen_num = db.get_generation_number()
for i in range(num_gens):

    # Check if converged
    if cc.converged():
        print('Converged')
        break
    print('Creating and evaluating generation {0}'.format(gen_num + i))

    def procreation(x):
        # Select an operator and use it
        op = op_selector.get_operator()
        while True:
            # Assign rng with a random seed
            np.random.seed(randint(1, 10000))
            pop.rng = np.random
            # Select parents for a new candidate
            p1, p2 = pop.get_two_candidates()
            parents = [p1, p2]
            # Pure or bare slabs are not considered
            if len(set(p1.numbers)) < 3:
                continue
            offspring, desc = op.get_new_individual(parents)
            # An operator could return None if an offspring cannot be formed
            # by the chosen parents
            if offspring is not None:
                break
        nncomp = offspring.get_chemical_formula(mode='hill')
        print('Relaxing ' + nncomp)
        if 'data' not in offspring.info:
            offspring.info['data'] = {'tag': None}

        return relax(offspring, single_point=True) # Single point only for testing

    # Create a multiprocessing Pool
    pool = Pool(os.cpu_count())
    # Perform procreations in parallel. Especially useful when
    # using adsorbate operators which requires site identification
    relaxed_candidates = pool.map(procreation, range(pop_size))
    pool.close()
    pool.join()

    # Update the population to allow new candidates to enter
    # (DO NOT add relaxed_candidates into db before this update)
    pop.update(new_cand=relaxed_candidates)

The fittest individuals covering all tasks (and which task is dominated by which individual) can be easily obtained by the following script:

from ase.db import connect
from ase.io import write

db = connect('ridge_Ni48Pt16_ads_multitask.db')
fittest_images = []
seen_dns = set()
for row in db.select('relaxed=1'):
    atoms = row.toatoms()
    f_eff = row.raw_score
    dn = row.dominating_niche
    niches = row.data['niches']
    # Get the fittest individual with an effective
    # fitness of 0 in each niche (without duplicates)
    if (f_eff == 0) and (dn not in seen_dns):
        seen_dns.add(dn)
        fittest_images.append(atoms)
        # Get the niches where this structure is the fittest
        print('Fittest niches: {}'.format(niches))
write('fittest_images.traj', fittest_images)