API Documentation
- class padelpy2.calculator.Calculator(descriptors: Iterable[Descriptor | Fingerprint], config: PaDELConfig | None = None)
Bases:
object
A class to compute chemical descriptors and fingerprints using PaDEL-Descriptor.
- Parameters:
descriptors (Iterable[Union[Descriptor, Fingerprint]]) – An iterable of Descriptor or Fingerprint objects.
config (Union[PaDELConfig, None], optional) – Configuration for PaDEL. If None, a default configuration is used (default=None).
- config
The configuration settings for PaDEL.
- Type:
PaDELConfig
- xml
An XML string representing the descriptor types to be computed.
- Type:
str
- n_2d
Number of 2D descriptors.
- Type:
int
- n_3d
Number of 3D descriptors.
- Type:
int
- n_fp
Number of fingerprints.
- Type:
int
- class padelpy2.config.PaDELConfig(maxruntime: int = -1, waitingjobs: int = -1, threads: int = -1, config: PathLike = None, convert3d: bool = False, detectaromaticity: bool = False, log: bool = False, maxcpdperfile: int = 0, removesalt: bool = False, retain3d: bool = False, retainorder: bool = True, standardizenitro: bool = False, standardizetautomers: bool = False, tautomerlist: PathLike = None, usefilenameasmolname: bool = False, sp_timeout: int = None)
Bases:
object
Configuration settings for PaDEL-Descriptor descriptor/fingerprint generation.
- Parameters:
maxruntime (int, default=-1) – Maximum running time per molecule (in milliseconds). Use -1 for unlimited. Default is -1.
waitingjobs (int, default=-1) – Maximum number of jobs to store in queue for worker threads to process. Use -1 to set it to 50*Max threads. Default is -1.
threads (int, default=-1) – Maximum number of threads to use. Use -1 to use as many threads as the number of CPU cores. Default is -1.
config (PathLike, default=None) – Configuration file. Default is None.
convert3d (bool, default=False) – Convert molecule to 3D. Default is False.
detectaromaticity (bool, default=False) – Remove existing aromaticity information and automatically detect aromaticity in the molecule before calculation of descriptors. Default is False.
log (bool, default=False) – Create a log file. Name of log file is the name of the descriptors file with a .log extension. Default is False.
maxcpdperfile (int, default=0) – Maximum number of compounds to be stored in each descriptor file. Use 0 for unlimited. Default is 0.
removesalt (bool, default=False) – Remove salt from molecule. Default is False.
retain3d (bool, default=False) – Retain 3D coordinates when standardizing structure. However, this may prevent some structures from being standardized. Default is False.
retainorder (bool, default=True) – Retain order of molecules in structural files for descriptor file. This may lead to large memory use if descriptor calculations are stuck at one molecule as the others will not be written to file and cleared from memory. Default is True.
standardizenitro (bool, default=False) – Standardize nitro groups to N(:O):O. Default is False.
standardizetautomers (bool, default=False) – Standardize tautomers. Default is False.
tautomerlist (PathLike, default=None) – SMIRKS tautomers file. Default is None.
usefilenameasmolname (bool, default=False) – Use filename (minus the extension) as molecule name. Default is False.
sp_timeout (int, default=None) – Timeout for subprocess execution in seconds. Default is None (no timeout).
- config: PathLike = None
- convert3d: bool = False
- detectaromaticity: bool = False
- log: bool = False
- maxcpdperfile: int = 0
- maxruntime: int = -1
- removesalt: bool = False
- retain3d: bool = False
- retainorder: bool = True
- sp_timeout: int = None
- standardizenitro: bool = False
- standardizetautomers: bool = False
- tautomerlist: PathLike = None
- threads: int = -1
- usefilenameasmolname: bool = False
- waitingjobs: int = -1
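The -1 sentinels documented for threads and waitingjobs can be read as in the sketch below. It only illustrates the documented semantics: resolve_sentinels is a hypothetical helper that is not part of padelpy2, and the actual resolution presumably happens inside PaDEL-Descriptor itself.

```python
import os
from typing import Tuple


def resolve_sentinels(threads: int = -1, waitingjobs: int = -1) -> Tuple[int, int]:
    """Hypothetical helper: expand the documented -1 sentinels into concrete values."""
    # threads=-1 is documented as "as many threads as the number of CPU cores".
    if threads == -1:
        threads = os.cpu_count() or 1  # os.cpu_count() can return None
    # waitingjobs=-1 is documented as "50 * max threads".
    if waitingjobs == -1:
        waitingjobs = 50 * threads
    return threads, waitingjobs


print(resolve_sentinels(threads=4))  # (4, 200)
```

Explicit values pass through unchanged, so resolve_sentinels(2, 10) returns (2, 10).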
- class padelpy2.descriptors.Descriptor(desc_class: str, is_3d: bool)
Bases:
object
A class to represent a descriptor that can be used in molecular property calculations.
- Parameters:
desc_class (str) – The class name of the descriptor.
is_3d (bool) – Indicates whether the descriptor is 3D.
- desc_class
The class name of the descriptor.
- Type:
str
- descriptors
List of descriptor names associated with this class.
- Type:
list of str
- descriptions
List of descriptions for each descriptor in descriptors.
- Type:
list of str
- is_3d
Indicates whether the descriptor is 3D.
- Type:
bool
- class padelpy2.fingerprints.Fingerprint(fp_class: str)
Bases:
object
A class to represent a fingerprint that can be used in molecular property calculations.
- Parameters:
fp_class (str) – The class name of the fingerprint.
- fp_class
The class name of the fingerprint.
- Type:
str
- n_bits
The number of bits for the fingerprint.
- Type:
int
- description
Description of the fingerprint.
- Type:
str
- padelpy2.utils.check_for_invalid_mols(mols: List[Mol], check_3d: bool) → None
Check for invalid molecules in terms of atom count and 3D conformers.
- Parameters:
mols (List[Mol]) – A list of molecule objects to be checked.
check_3d (bool) – If True, checks whether each molecule has at least one 3D conformer.
- Raises:
ValueError –
- If any molecule has more than 999 atoms.
- If check_3d is True and a molecule does not have a 3D conformer.
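The two documented failure conditions can be sketched as below. This is not the library's implementation: FakeMol is a hypothetical stand-in that mimics the RDKit accessors GetNumAtoms() and GetNumConformers(), and treating "has at least one conformer" as "has a 3D conformer" is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeMol:
    """Hypothetical stand-in exposing two RDKit-style accessors."""
    n_atoms: int
    n_conformers: int = 0

    def GetNumAtoms(self) -> int:
        return self.n_atoms

    def GetNumConformers(self) -> int:
        return self.n_conformers


def check_for_invalid_mols_sketch(mols: List[FakeMol], check_3d: bool) -> None:
    """Raise ValueError for the two documented invalid cases."""
    for i, mol in enumerate(mols):
        # Documented limit: more than 999 atoms is rejected.
        if mol.GetNumAtoms() > 999:
            raise ValueError(f"Molecule {i} has more than 999 atoms.")
        # Assumption: any stored conformer counts as a 3D conformer here.
        if check_3d and mol.GetNumConformers() == 0:
            raise ValueError(f"Molecule {i} does not have a 3D conformer.")


# A small molecule without conformers passes when 3D checking is off.
check_for_invalid_mols_sketch([FakeMol(n_atoms=12)], check_3d=False)
```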
- padelpy2.utils.count_descriptor_types(descriptors: Iterable[Descriptor | Fingerprint]) → Tuple[int, int, int]
Count the number of 2D descriptors, 3D descriptors, and fingerprints.
- Parameters:
descriptors (Iterable[Union[Descriptor, Fingerprint]]) – An iterable containing Descriptor and/or Fingerprint objects.
- Returns:
int – The number of 2D descriptors.
int – The number of 3D descriptors.
int – The number of fingerprints.
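The counting described above might look like the following sketch. The Descriptor and Fingerprint classes here are minimal stand-ins that copy only the documented constructor arguments, and the class-name strings are placeholders rather than real PaDEL names.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple, Union


@dataclass
class Descriptor:
    """Stand-in with the documented constructor arguments only."""
    desc_class: str
    is_3d: bool


@dataclass
class Fingerprint:
    """Stand-in with the documented constructor argument only."""
    fp_class: str


def count_descriptor_types_sketch(
    descriptors: Iterable[Union[Descriptor, Fingerprint]],
) -> Tuple[int, int, int]:
    """Return (number of 2D descriptors, number of 3D descriptors, number of fingerprints)."""
    n_2d = n_3d = n_fp = 0
    for item in descriptors:
        if isinstance(item, Fingerprint):
            n_fp += 1
        elif item.is_3d:
            n_3d += 1
        else:
            n_2d += 1
    return n_2d, n_3d, n_fp


selection = [
    Descriptor("Example2D", is_3d=False),  # placeholder names
    Descriptor("Example3D", is_3d=True),
    Fingerprint("ExampleFP"),
]
print(count_descriptor_types_sketch(selection))  # (1, 1, 1)
```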
- padelpy2.utils.create_descriptortypes_xml(descriptors: Iterable[Descriptor | Fingerprint]) → str
Create an XML string for descriptor types.
This function generates an XML-formatted string that specifies which 2D descriptors, 3D descriptors, and fingerprints to use in molecular descriptor/fingerprint calculations. The resulting XML can be used as input for PaDEL-Descriptor.
- Parameters:
descriptors (Iterable[Union[Descriptor, Fingerprint]]) – An iterable containing instances of Descriptor or Fingerprint classes. Each instance represents a specific type of descriptor or fingerprint to include in the XML configuration.
- Returns:
A string containing the XML representation of the specified descriptors and fingerprints, formatted for use with PaDEL-Descriptor.
- Return type:
str
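One way such a string could be assembled is sketched below with the standard library. The element and attribute names (Root, Group, Descriptor, name, value) are an assumption about the descriptor-types layout, not something this page confirms, and the two classes are stand-ins with placeholder names.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Iterable, Union


@dataclass
class Descriptor:
    """Stand-in with the documented constructor arguments only."""
    desc_class: str
    is_3d: bool


@dataclass
class Fingerprint:
    """Stand-in with the documented constructor argument only."""
    fp_class: str


def create_descriptortypes_xml_sketch(
    descriptors: Iterable[Union[Descriptor, Fingerprint]],
) -> str:
    """Group the selected classes into 2D, 3D and Fingerprint sections (assumed layout)."""
    root = ET.Element("Root")
    groups = {
        name: ET.SubElement(root, "Group", name=name)
        for name in ("2D", "3D", "Fingerprint")
    }
    for item in descriptors:
        if isinstance(item, Fingerprint):
            group, cls = "Fingerprint", item.fp_class
        else:
            group, cls = ("3D" if item.is_3d else "2D"), item.desc_class
        # Each selected class is switched on with value="true" (assumed convention).
        ET.SubElement(groups[group], "Descriptor", name=cls, value="true")
    return ET.tostring(root, encoding="unicode")


xml_text = create_descriptortypes_xml_sketch(
    [Descriptor("Example2D", is_3d=False), Fingerprint("ExampleFP")]
)
print(xml_text)
```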
- padelpy2.utils.popen_timeout(command: str, timeout: int | None) → Tuple[str, str]
Run a subprocess command with an optional timeout.
- Parameters:
command (str) – The command to be executed.
timeout (Union[int, None]) – The maximum time (in seconds) to wait for the command to complete. If None, there is no timeout.
- Returns:
A tuple containing the stdout and stderr of the command.
- Return type:
Tuple[str, str]
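A comparable helper can be sketched with the standard subprocess module. What the real function does when the timeout expires is not stated here; killing the process and returning whatever output was collected is an assumption of this sketch, as is splitting the command with shlex instead of using a shell.

```python
import shlex
import subprocess
import sys
from typing import Optional, Tuple


def popen_timeout_sketch(command: str, timeout: Optional[int]) -> Tuple[str, str]:
    """Run a command and return (stdout, stderr); timeout=None waits indefinitely."""
    proc = subprocess.Popen(
        shlex.split(command),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    try:
        stdout, stderr = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Assumption: on timeout, stop the child and keep the partial output.
        proc.kill()
        stdout, stderr = proc.communicate()
    return stdout, stderr


out, err = popen_timeout_sketch(
    f'{shlex.quote(sys.executable)} -c "print(42)"', timeout=30
)
print(out.strip())  # 42
```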
- padelpy2.utils.remove_files(*paths: PathLike) → None
Remove files from specified paths.
- Parameters:
*paths (PathLike) – Variable length argument list of file paths to be removed.
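A minimal sketch of a variadic cleanup helper follows. Silently skipping paths that no longer exist is an assumption; the documentation above does not say how missing files are handled.

```python
import os
import tempfile
from os import PathLike
from typing import Union


def remove_files_sketch(*paths: Union[str, PathLike]) -> None:
    """Delete each given file; paths that are already gone are skipped (assumption)."""
    for path in paths:
        try:
            os.remove(path)
        except FileNotFoundError:
            pass


# Create two scratch files, then remove both in one call.
scratch = []
for _ in range(2):
    fd, name = tempfile.mkstemp()
    os.close(fd)
    scratch.append(name)
remove_files_sketch(*scratch)
```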
- padelpy2.utils.write_mols_to_tempfile(mols: List[Mol]) → str
Write a list of molecule objects to a temporary SDF file.
- Parameters:
mols (List[Mol]) – A list of molecule objects to be written to the file. Each object should be an instance of the RDKit Mol class.
- Returns:
The name (path) of the created temporary SDF file containing the molecules.
- Return type:
str
- padelpy2.utils.write_xml_string_to_tempfile(xml: str) → str
Write an XML string to a temporary file and return the file name.
- Parameters:
xml (str) – The XML string to be written to a temporary file.
- Returns:
The file path of the created temporary XML file.
- Return type:
str
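The same pattern can be sketched with the standard tempfile module. The .xml suffix, the UTF-8 encoding, and leaving deletion to the caller (delete=False) are assumptions of this sketch.

```python
import os
import tempfile


def write_xml_string_to_tempfile_sketch(xml: str) -> str:
    """Write xml to a new temporary file and return its path."""
    # delete=False keeps the file on disk after the handle closes,
    # so another process can read it; the caller removes it later.
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".xml", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(xml)
        return handle.name


path = write_xml_string_to_tempfile_sketch("<Root></Root>")
with open(path, encoding="utf-8") as handle:
    content = handle.read()
os.remove(path)  # clean up the temporary file
print(content)  # <Root></Root>
```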
- padelpy2.wrapper.padeldescriptor(maxruntime: int = -1, waitingjobs: int = -1, threads: int = -1, d_2d: bool = False, d_3d: bool = False, config: PathLike = None, convert3d: bool = False, descriptortypes: PathLike = None, detectaromaticity: bool = False, mol_dir: PathLike = None, d_file: PathLike = None, fingerprints: bool = False, log: bool = False, maxcpdperfile: int = 0, removesalt: bool = False, retain3d: bool = False, retainorder: bool = True, standardizenitro: bool = False, standardizetautomers: bool = False, tautomerlist: PathLike = None, usefilenameasmolname: bool = False, sp_timeout: int = None, headless: bool = True, use_tempfile: bool = False) → PathLike | None
Execute PaDEL-Descriptor with specified parameters.
- Parameters:
maxruntime (int, optional) – Maximum running time per molecule (in milliseconds). Use -1 for unlimited. Default is -1.
waitingjobs (int, optional) – Maximum number of jobs to store in queue for worker threads to process. Use -1 to set it to 50*Max threads. Default is -1.
threads (int, optional) – Maximum number of threads to use. Use -1 to use as many threads as the number of CPU cores. Default is -1.
d_2d (bool, optional) – Calculate 1D and 2D descriptors. Default is False.
d_3d (bool, optional) – Calculate 3D descriptors. Default is False.
config (PathLike, optional) – Configuration file. Default is None.
convert3d (bool, optional) – Convert molecule to 3D. Default is False.
descriptortypes (PathLike, optional) – Descriptor types file. Default is None.
detectaromaticity (bool, optional) – Remove existing aromaticity information and automatically detect aromaticity in the molecule before calculation of descriptors. Default is False.
mol_dir (PathLike, optional) – Set directory containing structural files, or a path to a file. Default is None.
d_file (PathLike, optional) – Set file to save calculated descriptors. Default is None.
fingerprints (bool, optional) – Calculate fingerprints. Default is False.
log (bool, optional) – Create a log file. Name of log file is the name of the descriptors file with a .log extension. Default is False.
maxcpdperfile (int, optional) – Maximum number of compounds to be stored in each descriptor file. Use 0 for unlimited. Default is 0.
removesalt (bool, optional) – Remove salt from molecule. Default is False.
retain3d (bool, optional) – Retain 3D coordinates when standardizing structure. However, this may prevent some structures from being standardized. Default is False.
retainorder (bool, optional) – Retain order of molecules in structural files for descriptor file. This may lead to large memory use if descriptor calculations are stuck at one molecule as the others will not be written to file and cleared from memory. Default is True.
standardizenitro (bool, optional) – Standardize nitro groups to N(:O):O. Default is False.
standardizetautomers (bool, optional) – Standardize tautomers. Default is False.
tautomerlist (PathLike, optional) – SMIRKS tautomers file. Default is None.
usefilenameasmolname (bool, optional) – Use filename (minus the extension) as molecule name. Default is False.
sp_timeout (int, optional) – Timeout for subprocess execution in seconds. Default is None (no timeout).
headless (bool, optional) – Run PaDEL-Descriptor in headless mode. Default is True.
use_tempfile (bool, optional) – Use a temporary file for output if d_file is not specified. Default is False.
- Returns:
The path to the descriptors file generated by PaDEL-Descriptor, or None if no file is generated.
- Return type:
Union[PathLike, None]
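To show how keyword arguments like these could translate into a PaDEL-Descriptor command line, here is a sketch covering a handful of options. It is not the wrapper's actual code: build_padel_command_sketch and the jar path are hypothetical, and the flag spellings (-2d, -3d, -dir, -file and so on) are assumptions modeled on the parameter names.

```python
from typing import List, Optional


def build_padel_command_sketch(
    jar_path: str,
    mol_dir: Optional[str] = None,
    d_file: Optional[str] = None,
    d_2d: bool = False,
    d_3d: bool = False,
    fingerprints: bool = False,
    threads: int = -1,
    headless: bool = True,
) -> List[str]:
    """Assemble an argument list for a subset of the documented options."""
    cmd = ["java"]
    if headless:
        # Standard JVM property that disables GUI initialisation.
        cmd.append("-Djava.awt.headless=true")
    cmd += ["-jar", jar_path, "-threads", str(threads)]
    # Boolean options become bare flags only when enabled (assumed spellings).
    for flag, enabled in (("-2d", d_2d), ("-3d", d_3d), ("-fingerprints", fingerprints)):
        if enabled:
            cmd.append(flag)
    # Path options are emitted only when a value is given.
    if mol_dir is not None:
        cmd += ["-dir", str(mol_dir)]
    if d_file is not None:
        cmd += ["-file", str(d_file)]
    return cmd


cmd = build_padel_command_sketch(
    "PaDEL-Descriptor.jar",  # hypothetical location of the jar
    mol_dir="mols.sdf",
    d_file="out.csv",
    d_2d=True,
)
print(" ".join(cmd))
```

Building a list rather than a single string avoids quoting problems when paths contain spaces.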