dwca package

Package for reading Darwin Core Archive (DwCA) files.

Darwin Core Base Class

class dwca.base.darwincore.DarwinCore

Bases: ABC

Base class of this package.

Methods

from_file(path_to_archive)

Generate a Darwin Core Standard from a file.

to_file(path_to_archive[, encoding])

Generate a Darwin Core file using the information of this instance.

abstractmethod classmethod from_file(path_to_archive: str) -> DarwinCore

Generate a Darwin Core Standard from a file.

Parameters:
path_to_archive : str

Path of the archive file.

Returns:
DarwinCore

Instance of the Darwin Core Standard.

abstractmethod to_file(path_to_archive: str, encoding: str = 'utf-8') -> None

Generate a Darwin Core file using the information of this instance.

Parameters:
path_to_archive : str

Path of the archive to generate.

encoding : str, optional

Encoding of the corresponding files. Default “utf-8”.
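The contract above (an abstract classmethod constructor plus an abstract writer) can be illustrated with a minimal, self-contained sketch. The classes ArchiveLike and TextArchive are hypothetical stand-ins written for this example; they are not part of the dwca package and only mirror the documented method names and signatures.

```python
from abc import ABC, abstractmethod


class ArchiveLike(ABC):
    """Hypothetical stand-in mirroring the documented DarwinCore interface."""

    @classmethod
    @abstractmethod
    def from_file(cls, path_to_archive: str) -> "ArchiveLike":
        """Build an instance from a file on disk."""

    @abstractmethod
    def to_file(self, path_to_archive: str, encoding: str = "utf-8") -> None:
        """Write this instance to a file on disk."""


class TextArchive(ArchiveLike):
    """Toy subclass that keeps the file content as a single string."""

    def __init__(self, text: str = "") -> None:
        self.text = text

    @classmethod
    def from_file(cls, path_to_archive: str) -> "TextArchive":
        with open(path_to_archive, encoding="utf-8") as handle:
            return cls(handle.read())

    def to_file(self, path_to_archive: str, encoding: str = "utf-8") -> None:
        with open(path_to_archive, "w", encoding=encoding) as handle:
            handle.write(self.text)
```

Because both methods are abstract, the base class itself cannot be instantiated; only subclasses that implement both methods can.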

Darwin Core Archive Class

class dwca.base.darwincore_archive.DarwinCoreArchive(_id: str = None)

Bases: DarwinCore

Represent a Darwin Core Archive file with all its elements.

Parameters:
_id : str, optional

A unique id for this Darwin Core Archive.

Attributes:
core

DataFile: The file with the core of the archive.

dataset_metadata

Dict[str, EML]: Metadata instances for each dataset present in the DwC-A.

extensions

List[DataFile]: A list of the archive's extensions.

id

str: A unique identifier for this DarwinCoreArchive.

language

Language: Language of the Darwin Core Archive, as registered in the metadata.

metadata

EML: Metadata instance; currently only EML is supported.

metadata_filename

str: The filename of the metadata file.

Methods

Metadata([metadata])

Metadata class of the Darwin Core Archive storing the file name of the archive elements.

from_file(path_to_archive[, lazy, ...])

Generate a Darwin Core Archive instance from an archive file (.zip).

generate_eml([filename])

Generate an EML file in the archive.

set_eml(eml[, filename])

Set an EML file in the archive.

to_file(path_to_archive[, encoding, ...])

Generate a Darwin Core Archive file (.zip file) using the information of this instance.

merge

class Metadata(metadata: str = None)

Bases: XMLObject

Metadata class of the Darwin Core Archive storing the file name of the archive elements.

Parameters:
metadata : str, optional

Name of the metadata file (e.g., eml.xml).

Attributes:
NAMESPACE_TAG

Methods

add_namespace(prefix, uri)

Add a namespace to the XML object.

check_principal_tag(tag, nmap)

Checks whether the tag is the principal tag of the object.

from_string(text)

Generates XML Object from a string of an XML file.

from_xml(file[, encoding])

Generates an XML Object from an XML file.

get_dwc_class(element)

Extract the row type from an XML element instance.

get_principal_tag()

Returns the principal tag with namespaces if it is present.

object_to_element(tag[, prefix])

Generates an element using tag, adding namespace tag.

parse(element, nmap)

Parses an lxml.etree.Element into a Metadata instance.

to_element()

Generate an element from a Metadata instance.

to_xml()

Generates text of an XML file.

PRINCIPAL_TAG = 'archive'

str : The principal tag of the XML document.

classmethod get_dwc_class(element: Element) -> Type[DataFile]

Extract the row type from an XML element instance.

Parameters:
element : lxml.etree.Element

XML element instance.

Returns:
Type[DataFile]

The Python class representing the class term.

classmethod parse(element: Element, nmap: Dict) -> Metadata

Parses an lxml.etree.Element into a Metadata instance.

Parameters:
element : lxml.etree.Element

XML element to be parsed.

nmap : Dict

Namespace prefix:uri.

Returns:
Metadata

New Metadata instance with the data from the element.

to_element() -> Element

Generate an element from a Metadata instance.

Returns:
lxml.etree.Element

XML element generated from the Metadata instance.

xmlns = 'http://rs.tdwg.org/dwc/text/'

str : Required XML namespace (xmlns) of the metadata.
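To show how PRINCIPAL_TAG, xmlns, the metadata file name, and the row type relate to each other in an archive descriptor, here is a small sketch. It uses the standard library's xml.etree.ElementTree as a stand-in for lxml, and the descriptor content is a hypothetical minimal example; how the Metadata class parses documents internally is not shown in this reference.

```python
import xml.etree.ElementTree as ET

XMLNS = "http://rs.tdwg.org/dwc/text/"  # same value as Metadata.xmlns
PRINCIPAL_TAG = "archive"               # same value as Metadata.PRINCIPAL_TAG

# Hypothetical minimal descriptor, for illustration only.
META_XML = f"""<archive xmlns="{XMLNS}" metadata="eml.xml">
  <core rowType="http://rs.tdwg.org/dwc/terms/Occurrence">
    <files><location>occurrence.txt</location></files>
  </core>
</archive>"""

root = ET.fromstring(META_XML)

# ElementTree spells a namespaced tag as "{uri}localname".
is_principal = root.tag == f"{{{XMLNS}}}{PRINCIPAL_TAG}"

# The "metadata" attribute names the metadata file (e.g., eml.xml).
metadata_filename = root.get("metadata")

# The "rowType" attribute of the core element identifies its class term.
core = root.find(f"{{{XMLNS}}}core")
row_type = core.get("rowType")
```

In this sketch the row type is read as a plain string; the documented get_dwc_class goes a step further and returns the Python class that represents that term.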

property core: DataFile

DataFile: The file with the core of the archive.

property dataset_metadata: Dict[str, EML]

Dict[str, EML]: Metadata instances for each dataset present in the DwC-A.

property extensions: List[DataFile]

List[DataFile]: A list of the archive's extensions.

classmethod from_file(path_to_archive: str, lazy: bool = False, _no_interaction: bool = False) -> DarwinCoreArchive

Generate a Darwin Core Archive instance from an archive file (.zip).

Parameters:
path_to_archive : str

Path of the archive file.

lazy : bool, optional

Read the archive lazily. Default False.

_no_interaction : bool, optional

Do not show a progress bar, even if the tqdm library is installed. Default False.

Returns:
DarwinCoreArchive

Instance of the Darwin Core Archive.
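Once an archive has been loaded, its documented properties can be inspected. The helper below is a hypothetical sketch (summarize_archive is not part of the package); it relies only on the attribute names listed in this reference and accepts any object that exposes them, so it can be tried without a real archive. The file name in the commented usage is a placeholder.

```python
from typing import Any, Dict


def summarize_archive(archive: Any) -> Dict[str, Any]:
    """Collect a few documented properties of a DarwinCoreArchive-like object."""
    return {
        "id": archive.id,
        "metadata_filename": archive.metadata_filename,
        "has_core": archive.core is not None,
        "extension_count": len(archive.extensions),
    }


# Usage with the real library (not executed here; requires the dwca package):
# from dwca.base.darwincore_archive import DarwinCoreArchive
# archive = DarwinCoreArchive.from_file("my_archive.zip", lazy=True)
# print(summarize_archive(archive))
```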

generate_eml(filename: str = 'eml.xml') -> None

Generate an EML file in the archive.

Parameters:
filename : str, optional

Filename for the EML file to be generated. Defaults to “eml.xml”.

property id: str

str: A unique identifier for this DarwinCoreArchive.

property language: Language

Language: Language of the Darwin Core Archive, as registered in the metadata.

classmethod merge(first_archive: DarwinCoreArchive, second_archive: DarwinCoreArchive, _id: str = None, eml: EML = None, eml_filename: str = 'eml.xml') -> DarwinCoreArchive

property metadata: EML

EML: Metadata instance; currently only EML is supported.

property metadata_filename: str

str: The filename of the metadata file.

set_eml(eml: EML, filename: str = 'eml.xml') -> None

Set an EML file in the archive.

Parameters:
eml : EML

Metadata instance to set.

filename : str, optional

Filename for the EML file. Defaults to “eml.xml”.

to_file(path_to_archive: str, encoding: str = 'utf-8', compression: int = 8, compression_level: int = 6, _no_interaction: bool = False) -> None

Generate a Darwin Core Archive file (.zip file) using the information of this instance.

Parameters:
path_to_archive : str

Path of the archive to generate.

encoding : str, optional

Encoding of the corresponding files. Default “utf-8”.

compression : int, optional

The ZIP compression method to use. Default zipfile.ZIP_DEFLATED.

compression_level : int, optional

Compression level to use when writing files to the archive. Default 6.

_no_interaction : bool, optional

Do not show a progress bar, even if the tqdm library is installed. Default False.
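The compression defaults correspond to values from Python's zipfile module: the integer 8 is zipfile.ZIP_DEFLATED, and 6 is a level in the 0 to 9 range accepted by that method. The sketch below shows the same settings with the standard library alone. It assumes, without confirmation from this reference, that to_file passes these values on to zipfile; the file name and content are made up for illustration.

```python
import io
import zipfile

# Write a small archive in memory with the documented default settings.
buffer = io.BytesIO()
with zipfile.ZipFile(
    buffer,
    mode="w",
    compression=zipfile.ZIP_DEFLATED,  # integer value 8
    compresslevel=6,
) as archive:
    # Hypothetical file name and content, for illustration only.
    archive.writestr("occurrence.txt", "occurrenceID\tscientificName\n" * 100)

# Read the archive back and inspect how the entry was stored.
with zipfile.ZipFile(buffer) as archive:
    info = archive.getinfo("occurrence.txt")
    method = info.compress_type
    shrank = info.compress_size < info.file_size
```

Other methods defined by zipfile, such as ZIP_STORED (no compression), are also plain integers, which is why the compression parameter is typed as int.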

Simple Darwin Core Class

class dwca.base.simple_darwincore.SimpleDarwinCore

Bases: DarwinCore

Class representing a Simple Darwin Core standard.

Methods

from_file(path_to_archive)

Generate a Darwin Core Standard from a file.

to_file(path_to_archive[, encoding])

Generate a Darwin Core file using the information of this instance.

classmethod from_file(path_to_archive: str) -> SimpleDarwinCore

Generate a Darwin Core Standard from a file.

Parameters:
path_to_archive : str

Path of the archive file.

Returns:
DarwinCore

Instance of the Darwin Core Standard.

to_file(path_to_archive: str, encoding: str = 'utf-8') -> None

Generate a Darwin Core file using the information of this instance.

Parameters:
path_to_archive : str

Path of the archive to generate.

encoding : str, optional

Encoding of the corresponding files. Default “utf-8”.

Subpackages