API Documentation

class padelpy2.calculator.Calculator(descriptors: Iterable[Descriptor | Fingerprint], config: PaDELConfig | None = None)

Bases: object

A class to compute chemical descriptors and fingerprints using PaDEL-Descriptor.

Parameters:
  • descriptors (Iterable[Union[Descriptor, Fingerprint]]) – An iterable of Descriptor or Fingerprint objects.

  • config (Union[PaDELConfig, None], optional) – Configuration for PaDEL. If None, a default configuration is used (default=None).

config

The configuration settings for PaDEL.

Type:

PaDELConfig

xml

An XML string representing the descriptor types to be computed.

Type:

str

n_2d

Number of 2D descriptors.

Type:

int

n_3d

Number of 3D descriptors.

Type:

int

n_fp

Number of fingerprints.

Type:

int

class padelpy2.config.PaDELConfig(maxruntime: int = -1, waitingjobs: int = -1, threads: int = -1, config: PathLike = None, convert3d: bool = False, detectaromaticity: bool = False, log: bool = False, maxcpdperfile: int = 0, removesalt: bool = False, retain3d: bool = False, retainorder: bool = True, standardizenitro: bool = False, standardizetautomers: bool = False, tautomerlist: PathLike = None, usefilenameasmolname: bool = False, sp_timeout: int = None)

Bases: object

Configuration settings for PaDEL-Descriptor descriptor/fingerprint generation.

Parameters:
  • maxruntime (int, default=-1) – Maximum running time per molecule (in milliseconds). Use -1 for unlimited. Default is -1.

  • waitingjobs (int, default=-1) – Maximum number of jobs to store in queue for worker threads to process. Use -1 to set it to 50*Max threads. Default is -1.

  • threads (int, default=-1) – Maximum number of threads to use. Use -1 to use as many threads as the number of CPU cores. Default is -1.

  • config (PathLike, default=None) – Configuration file. Default is None.

  • convert3d (bool, default=False) – Convert molecule to 3D. Default is False.

  • detectaromaticity (bool, default=False) – Remove existing aromaticity information and automatically detect aromaticity in the molecule before calculation of descriptors. Default is False.

  • log (bool, default=False) – Create a log file. Name of log file is the name of the descriptors file with a .log extension. Default is False.

  • maxcpdperfile (int, default=0) – Maximum number of compounds to be stored in each descriptor file. Use 0 for unlimited. Default is 0.

  • removesalt (bool, default=False) – Remove salt from molecule. Default is False.

  • retain3d (bool, default=False) – Retain 3D coordinates when standardizing structure. However, this may prevent some structures from being standardized. Default is False.

  • retainorder (bool, default=True) – Retain order of molecules in structural files for descriptor file. This may lead to large memory use if descriptor calculations are stuck at one molecule as the others will not be written to file and cleared from memory. Default is True.

  • standardizenitro (bool, default=False) – Standardize nitro groups to N(:O):O. Default is False.

  • standardizetautomers (bool, default=False) – Standardize tautomers. Default is False.

  • tautomerlist (PathLike, default=None) – SMIRKS tautomers file. Default is None.

  • usefilenameasmolname (bool, default=False) – Use filename (minus the extension) as molecule name. Default is False.

  • sp_timeout (int, default=None) – Timeout for subprocess execution in seconds. Default is None (no timeout).
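
The settings above behave like a plain record of fields with defaults. As a minimal sketch, assuming a dataclass-like layout, the stand-in below copies a handful of the documented fields and shows how to list only the values that differ from the defaults. `ConfigSketch` and `non_default_settings` are hypothetical names used for illustration and are not part of padelpy2. Note the unit difference: maxruntime is in milliseconds per molecule, while sp_timeout is in seconds for the whole subprocess.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ConfigSketch:
    """Hypothetical stand-in with a subset of the documented fields."""

    maxruntime: int = -1               # milliseconds per molecule, -1 = unlimited
    threads: int = -1                  # -1 = one thread per CPU core
    removesalt: bool = False
    retainorder: bool = True
    sp_timeout: Optional[int] = None   # seconds for the whole subprocess


def non_default_settings(cfg):
    """Return only the fields whose value differs from the default."""
    defaults = asdict(ConfigSketch())
    return {key: value for key, value in asdict(cfg).items() if value != defaults[key]}


cfg = ConfigSketch(maxruntime=30_000, removesalt=True, sp_timeout=600)
changed = non_default_settings(cfg)

# maxruntime is given in milliseconds, so convert before comparing with sp_timeout.
per_molecule_seconds = cfg.maxruntime / 1000
```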

config: PathLike = None
convert3d: bool = False
detectaromaticity: bool = False
log: bool = False
maxcpdperfile: int = 0
maxruntime: int = -1
removesalt: bool = False
retain3d: bool = False
retainorder: bool = True
sp_timeout: int = None
standardizenitro: bool = False
standardizetautomers: bool = False
tautomerlist: PathLike = None
threads: int = -1
usefilenameasmolname: bool = False
waitingjobs: int = -1

class padelpy2.descriptors.Descriptor(desc_class: str, is_3d: bool)

Bases: object

A class to represent a descriptor that can be used in molecular property calculations.

Parameters:
  • desc_class (str) – The class name of the descriptor.

  • is_3d (bool) – Indicates whether the descriptor is 3D.

desc_class

The class name of the descriptor.

Type:

str

descriptors

List of descriptor names associated with this class.

Type:

list of str

descriptions

List of descriptions for each descriptor in descriptors.

Type:

list of str

is_3d

Indicates whether the descriptor is 3D.

Type:

bool

class padelpy2.fingerprints.Fingerprint(fp_class: str)

Bases: object

A class to represent a fingerprint that can be used in molecular property calculations.

Parameters:

fp_class (str) – The class name of the fingerprint.

fp_class

The class name of the fingerprint.

Type:

str

n_bits

The number of bits for the fingerprint.

Type:

int

description

Description of the fingerprint.

Type:

str

padelpy2.utils.check_for_invalid_mols(mols: List[Mol], check_3d: bool) → None

Check for invalid molecules in terms of atom count and 3D conformers.

Parameters:
  • mols (List[Mol]) – A list of molecule objects to be checked.

  • check_3d (bool) – If True, checks whether each molecule has at least one 3D conformer.

Raises:

ValueError

  • If any molecule has more than 999 atoms.

  • If check_3d is True and a molecule does not have a 3D conformer.
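
The two checks can be sketched with duck typing. This is an illustration under stated assumptions, not the padelpy2 source: it relies only on the RDKit methods `GetNumAtoms()`, `GetConformers()`, and `Is3D()`, and it assumes that "has a 3D conformer" means at least one conformer reports `Is3D()`. `FakeMol`, `FakeConformer`, and `check_mols_sketch` are hypothetical names, included so the snippet runs without RDKit installed.

```python
MAX_ATOMS = 999  # limit quoted in the documentation above


class FakeConformer:
    """Hypothetical stand-in for an RDKit conformer."""

    def __init__(self, is_3d):
        self._is_3d = is_3d

    def Is3D(self):
        return self._is_3d


class FakeMol:
    """Hypothetical stand-in exposing the RDKit Mol methods used below."""

    def __init__(self, n_atoms, conformers=()):
        self._n_atoms = n_atoms
        self._conformers = list(conformers)

    def GetNumAtoms(self):
        return self._n_atoms

    def GetConformers(self):
        return self._conformers


def check_mols_sketch(mols, check_3d):
    """Raise ValueError for oversized molecules or missing 3D conformers."""
    for index, mol in enumerate(mols):
        if mol.GetNumAtoms() > MAX_ATOMS:
            raise ValueError(f"Molecule {index} has more than {MAX_ATOMS} atoms")
        if check_3d and not any(conf.Is3D() for conf in mol.GetConformers()):
            raise ValueError(f"Molecule {index} has no 3D conformer")


# Passes silently: small molecule, 3D check disabled.
check_mols_sketch([FakeMol(12)], check_3d=False)
```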

padelpy2.utils.count_descriptor_types(descriptors: Iterable[Descriptor | Fingerprint]) → Tuple[int, int, int]

Count the number of 2D descriptors, 3D descriptors, and fingerprints.

Parameters:

descriptors (Iterable[Union[Descriptor, Fingerprint]]) – An iterable containing Descriptor and/or Fingerprint objects.

Returns:

  • int – The number of 2D descriptors.

  • int – The number of 3D descriptors.

  • int – The number of fingerprints.
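
A minimal sketch of the counting logic, assuming that a Descriptor is classified by its is_3d attribute and that anything else in the iterable is a Fingerprint. `DescriptorStub`, `FingerprintStub`, and `count_types_sketch` are hypothetical stand-ins that carry only the attributes documented above, and the class-name strings are illustrative.

```python
from dataclasses import dataclass


@dataclass
class DescriptorStub:
    """Hypothetical stand-in for padelpy2's Descriptor."""

    desc_class: str
    is_3d: bool


@dataclass
class FingerprintStub:
    """Hypothetical stand-in for padelpy2's Fingerprint."""

    fp_class: str


def count_types_sketch(items):
    """Return (n_2d, n_3d, n_fp) for a mixed iterable of stubs."""
    n_2d = n_3d = n_fp = 0
    for item in items:
        if isinstance(item, FingerprintStub):
            n_fp += 1
        elif item.is_3d:
            n_3d += 1
        else:
            n_2d += 1
    return n_2d, n_3d, n_fp


items = [
    DescriptorStub("ALOGP", is_3d=False),
    DescriptorStub("WHIM", is_3d=True),
    FingerprintStub("PubchemFingerprinter"),
]
n_2d, n_3d, n_fp = count_types_sketch(items)
```

The three counts come back in the same order as the documented return value, so they can be unpacked directly.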

padelpy2.utils.create_descriptortypes_xml(descriptors: Iterable[Descriptor | Fingerprint]) → str

Create an XML string for descriptor types.

This function generates an XML-formatted string that specifies which 2D descriptors, 3D descriptors, and fingerprints to use in molecular descriptor/fingerprint calculations. The resulting XML can be used as input for PaDEL-Descriptor.

Parameters:

descriptors (Iterable[Union[Descriptor, Fingerprint]]) – An iterable containing instances of Descriptor or Fingerprint classes. Each instance represents a specific type of descriptor or fingerprint to include in the XML configuration.

Returns:

A string containing the XML representation of the specified descriptors and fingerprints, formatted for use with PaDEL-Descriptor.

Return type:

str
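
For orientation, the sketch below builds an XML string of this general kind with the standard library. The element layout (a `Root` element containing `Group` elements named 2D, 3D, and Fingerprint, each holding `Descriptor` entries with `name` and `value` attributes) is an assumption about the descriptor types file and may not match what padelpy2 emits. `build_descriptortypes_xml_sketch` is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def build_descriptortypes_xml_sketch(names_2d, names_3d, names_fp):
    """Serialize three name lists into an assumed descriptor-types layout."""
    root = ET.Element("Root")
    groups = (("2D", names_2d), ("3D", names_3d), ("Fingerprint", names_fp))
    for group_name, names in groups:
        group = ET.SubElement(root, "Group", name=group_name)
        for name in names:
            # Each selected descriptor class is switched on explicitly.
            ET.SubElement(group, "Descriptor", name=name, value="true")
    return ET.tostring(root, encoding="unicode")


xml_text = build_descriptortypes_xml_sketch(
    ["ALOGP"], ["WHIM"], ["PubchemFingerprinter"]
)
```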

padelpy2.utils.popen_timeout(command: str, timeout: int | None) → Tuple[str, str]

Run a subprocess command with an optional timeout.

Parameters:
  • command (str) – The command to be executed.

  • timeout (Union[int, None]) – The maximum time (in seconds) to wait for the command to complete. If None, there is no timeout.

Returns:

A tuple containing the stdout and stderr of the command.

Return type:

Tuple[str, str]
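
The same contract (a command string in, a stdout/stderr pair out, with an optional time limit) can be sketched with the standard library. This is not the padelpy2 implementation: `run_with_timeout_sketch` is a hypothetical name, the command is split with `shlex` on the assumption of a POSIX shell syntax, and the behavior on expiry (raising `subprocess.TimeoutExpired`) belongs to `subprocess.run`, not necessarily to popen_timeout.

```python
import shlex
import subprocess
import sys


def run_with_timeout_sketch(command, timeout=None):
    """Run a command string and return (stdout, stderr) as text."""
    completed = subprocess.run(
        shlex.split(command),   # split the string into an argument list
        capture_output=True,    # collect both output streams
        text=True,              # decode bytes to str
        timeout=timeout,        # None waits indefinitely
    )
    return completed.stdout, completed.stderr


# Use the current interpreter as a harmless command to run.
command = f"{shlex.quote(sys.executable)} -c \"print('done')\""
out, err = run_with_timeout_sketch(command, timeout=30)
```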

padelpy2.utils.remove_files(*paths: PathLike) → None

Remove files from specified paths.

Parameters:

*paths (PathLike) – Variable length argument list of file paths to be removed.
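
A variadic cleanup helper of this shape can be sketched as follows. `remove_files_sketch` is a hypothetical name, and silently skipping paths that no longer exist is an assumption of this sketch; the documentation above does not say how remove_files treats missing files.

```python
import tempfile
from pathlib import Path


def remove_files_sketch(*paths):
    """Delete each given file, ignoring paths that are already gone."""
    for path in paths:
        Path(path).unlink(missing_ok=True)  # requires Python 3.8+


# Create two throwaway files, then remove both in one call.
created = []
for _ in range(2):
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        created.append(handle.name)

remove_files_sketch(*created)
```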

padelpy2.utils.write_mols_to_tempfile(mols: List[Mol]) → str

Write a list of molecule objects to a temporary SDF file.

Parameters:

mols (List[Mol]) – A list of molecule objects to be written to the file. Each object should be an instance of the RDKit Mol class.

Returns:

The name (path) of the created temporary SDF file containing the molecules.

Return type:

str

padelpy2.utils.write_xml_string_to_tempfile(xml: str) → str

Write an XML string to a temporary file and return the file name.

Parameters:

xml (str) – The XML string to be written to a temporary file.

Returns:

The file path of the created temporary XML file.

Return type:

str
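
Because the function returns a path and not an open handle, the file has to outlive the call. A minimal sketch of that pattern with the standard library is shown below. `write_text_to_tempfile_sketch` is a hypothetical name, and the `.xml` suffix and UTF-8 encoding are assumptions. In this sketch the caller is responsible for deleting the file afterwards.

```python
import os
import tempfile


def write_text_to_tempfile_sketch(text, suffix=".xml"):
    """Write text to a named temporary file and return its path."""
    # delete=False keeps the file on disk after the handle is closed.
    with tempfile.NamedTemporaryFile(
        "w", suffix=suffix, delete=False, encoding="utf-8"
    ) as handle:
        handle.write(text)
        return handle.name


xml_path = write_text_to_tempfile_sketch("<Root></Root>")
with open(xml_path, encoding="utf-8") as handle:
    round_trip = handle.read()
os.remove(xml_path)  # explicit cleanup by the caller
```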

padelpy2.wrapper.padeldescriptor(maxruntime: int = -1, waitingjobs: int = -1, threads: int = -1, d_2d: bool = False, d_3d: bool = False, config: PathLike = None, convert3d: bool = False, descriptortypes: PathLike = None, detectaromaticity: bool = False, mol_dir: PathLike = None, d_file: PathLike = None, fingerprints: bool = False, log: bool = False, maxcpdperfile: int = 0, removesalt: bool = False, retain3d: bool = False, retainorder: bool = True, standardizenitro: bool = False, standardizetautomers: bool = False, tautomerlist: PathLike = None, usefilenameasmolname: bool = False, sp_timeout: int = None, headless: bool = True, use_tempfile: bool = False) → PathLike | None

Execute PaDEL-Descriptor with specified parameters.

Parameters:
  • maxruntime (int, optional) – Maximum running time per molecule (in milliseconds). Use -1 for unlimited. Default is -1.

  • waitingjobs (int, optional) – Maximum number of jobs to store in queue for worker threads to process. Use -1 to set it to 50*Max threads. Default is -1.

  • threads (int, optional) – Maximum number of threads to use. Use -1 to use as many threads as the number of CPU cores. Default is -1.

  • d_2d (bool, optional) – Calculate 1D and 2D descriptors. Default is False.

  • d_3d (bool, optional) – Calculate 3D descriptors. Default is False.

  • config (PathLike, optional) – Configuration file. Default is None.

  • convert3d (bool, optional) – Convert molecule to 3D. Default is False.

  • descriptortypes (PathLike, optional) – Descriptor types file. Default is None.

  • detectaromaticity (bool, optional) – Remove existing aromaticity information and automatically detect aromaticity in the molecule before calculation of descriptors. Default is False.

  • mol_dir (PathLike, optional) – Set directory containing structural files, or a path to a file. Default is None.

  • d_file (PathLike, optional) – Set file to save calculated descriptors. Default is None.

  • fingerprints (bool, optional) – Calculate fingerprints. Default is False.

  • log (bool, optional) – Create a log file. Name of log file is the name of the descriptors file with a .log extension. Default is False.

  • maxcpdperfile (int, optional) – Maximum number of compounds to be stored in each descriptor file. Use 0 for unlimited. Default is 0.

  • removesalt (bool, optional) – Remove salt from molecule. Default is False.

  • retain3d (bool, optional) – Retain 3D coordinates when standardizing structure. However, this may prevent some structures from being standardized. Default is False.

  • retainorder (bool, optional) – Retain order of molecules in structural files for descriptor file. This may lead to large memory use if descriptor calculations are stuck at one molecule as the others will not be written to file and cleared from memory. Default is True.

  • standardizenitro (bool, optional) – Standardize nitro groups to N(:O):O. Default is False.

  • standardizetautomers (bool, optional) – Standardize tautomers. Default is False.

  • tautomerlist (PathLike, optional) – SMIRKS tautomers file. Default is None.

  • usefilenameasmolname (bool, optional) – Use filename (minus the extension) as molecule name. Default is False.

  • sp_timeout (int, optional) – Timeout for subprocess execution in seconds. Default is None (no timeout).

  • headless (bool, optional) – Run PaDEL-Descriptor in headless mode. Default is True.

  • use_tempfile (bool, optional) – Use a temporary file for output if d_file is not specified. Default is False.

Returns:

The path to the descriptors file generated by PaDEL-Descriptor, or None if no file is generated.

Return type:

Union[PathLike, None]
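
To make the parameter list more concrete, the sketch below shows how a wrapper of this kind might translate a few of the keyword arguments into a Java command line. It is an illustration only: `build_padel_command_sketch` is a hypothetical helper, the jar file name is a placeholder, and the flag spellings (`-dir`, `-file`, `-2d`, and so on) are assumed to mirror the PaDEL-Descriptor command-line options. Check them against the tool's own help output before relying on them.

```python
def build_padel_command_sketch(
    jar_path, mol_dir, d_file, d_2d=False, d_3d=False, fingerprints=False,
    threads=-1, maxruntime=-1, removesalt=False, retainorder=True, headless=True,
):
    """Assemble an argument list from a subset of the documented options."""
    argv = ["java"]
    if headless:
        argv.append("-Djava.awt.headless=true")  # no GUI toolkit needed
    argv += ["-jar", str(jar_path)]

    # Numeric options are passed only when they differ from the -1 default.
    for flag, value in (("-maxruntime", maxruntime), ("-threads", threads)):
        if value != -1:
            argv += [flag, str(value)]

    # Input structures and output descriptor file.
    argv += ["-dir", str(mol_dir), "-file", str(d_file)]

    # Boolean options become bare switches when enabled.
    switches = (
        ("-2d", d_2d), ("-3d", d_3d), ("-fingerprints", fingerprints),
        ("-removesalt", removesalt), ("-retainorder", retainorder),
    )
    argv += [flag for flag, enabled in switches if enabled]
    return argv


argv = build_padel_command_sketch(
    "PaDEL-Descriptor.jar",  # placeholder path
    mol_dir="mols.sdf", d_file="out.csv", d_2d=True, threads=4,
)
```

Building a list instead of a single string avoids quoting problems when paths contain spaces.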