dwca.classes package

This module corresponds to the “class” type terms defined in the “Classes” section of the Darwin Core term list (https://dwc.tdwg.org/list/#31-index-by-term-name), excluding the deprecated ones. Each of these Python classes represents a complete data file in a Darwin Core Archive.

These classes are the ones listed in the description of rowType in the Attributes section for a <core> or <extension> element, quoted below:

(…) For convenience the URIs for classes defined by the Darwin Core are:

dwc:Occurrence: http://rs.tdwg.org/dwc/terms/Occurrence
dwc:Organism: http://rs.tdwg.org/dwc/terms/Organism
dwc:MaterialEntity: http://rs.tdwg.org/dwc/terms/MaterialEntity
dwc:MaterialSample: http://rs.tdwg.org/dwc/terms/MaterialSample
dwc:Event: http://rs.tdwg.org/dwc/terms/Event
dcterms:Location: http://purl.org/dc/terms/Location
dwc:GeologicalContext: http://rs.tdwg.org/dwc/terms/GeologicalContext
dwc:Identification: http://rs.tdwg.org/dwc/terms/Identification
dwc:Taxon: http://rs.tdwg.org/dwc/terms/Taxon
dwc:ResourceRelationship: http://rs.tdwg.org/dwc/terms/ResourceRelationship
dwc:MeasurementOrFact: http://rs.tdwg.org/dwc/terms/MeasurementOrFact
chrono:ChronometricAge: http://rs.tdwg.org/chrono/terms/ChronometricAge
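As a quick reference, the rowType URIs quoted above can be written as a plain Python mapping. This is only an illustrative sketch: the names `ROW_TYPE_URIS` and `expand_row_type` are hypothetical and are not part of the dwca package.

```python
# Hypothetical lookup table built from the rowType URIs quoted above.
# GeologicalContext uses the rs.tdwg.org namespace, matching the URI
# attribute documented for the GeologicalContext class on this page.
ROW_TYPE_URIS = {
    "dwc:Occurrence": "http://rs.tdwg.org/dwc/terms/Occurrence",
    "dwc:Organism": "http://rs.tdwg.org/dwc/terms/Organism",
    "dwc:MaterialEntity": "http://rs.tdwg.org/dwc/terms/MaterialEntity",
    "dwc:MaterialSample": "http://rs.tdwg.org/dwc/terms/MaterialSample",
    "dwc:Event": "http://rs.tdwg.org/dwc/terms/Event",
    "dcterms:Location": "http://purl.org/dc/terms/Location",
    "dwc:GeologicalContext": "http://rs.tdwg.org/dwc/terms/GeologicalContext",
    "dwc:Identification": "http://rs.tdwg.org/dwc/terms/Identification",
    "dwc:Taxon": "http://rs.tdwg.org/dwc/terms/Taxon",
    "dwc:ResourceRelationship": "http://rs.tdwg.org/dwc/terms/ResourceRelationship",
    "dwc:MeasurementOrFact": "http://rs.tdwg.org/dwc/terms/MeasurementOrFact",
    "chrono:ChronometricAge": "http://rs.tdwg.org/chrono/terms/ChronometricAge",
}


def expand_row_type(prefixed_name: str) -> str:
    """Return the full URI for a prefixed class name such as 'dwc:Taxon'."""
    try:
        return ROW_TYPE_URIS[prefixed_name]
    except KeyError:
        raise ValueError(f"Unknown Darwin Core class: {prefixed_name!r}") from None
```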

DataFile Class

class dwca.classes.data_file.DataFile(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)

Bases: XMLObject, ABC

Abstract class representing the class of data represented by each row in a data entity.

Parameters:
_id: int

Unique identifier for the core entity.

files: str

Location of the file in the archive, that is, its path inside the zip file.

fields: List[Field]

A list of the Field objects (columns) in the Core data entity.

data_file_type: DataFileType, optional

The Data File Type in the Darwin Core Archive, default DataFileType.CORE.

encoding: str, optional

Encoding of the file at the given location (files parameter), default “utf-8”.

lines_terminated_by: str, optional

Line delimiter of the file, default “\n”.

fields_terminated_by: str, optional

Field (cell) delimiter of the file, default “,”.

fields_enclosed_by: str, optional

Character used to enclose (mark the start and end of) each field, default empty “”.

ignore_header_lines: int, optional

Number of lines to ignore at the start of the file, default 0.
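To make the role of the format parameters concrete, the sketch below parses delimited text with Python's standard csv module. It does not use the dwca package: `parse_rows` is a hypothetical helper whose arguments merely mirror the parameter names above, and it does not handle a line terminator that appears inside an enclosed field.

```python
import csv
from typing import List


def parse_rows(
    content: str,
    lines_terminated_by: str = "\n",
    fields_terminated_by: str = ",",
    fields_enclosed_by: str = "",
    ignore_header_lines: int = 0,
) -> List[List[str]]:
    """Split delimited text into rows of cells (hypothetical helper)."""
    # Split on the configured line terminator and drop empty lines.
    lines = [line for line in content.split(lines_terminated_by) if line]
    # Skip the requested number of header lines.
    lines = lines[ignore_header_lines:]
    if fields_enclosed_by:
        reader = csv.reader(
            lines,
            delimiter=fields_terminated_by,
            quotechar=fields_enclosed_by,
        )
    else:
        # No enclosing character: quote characters are ordinary text.
        reader = csv.reader(
            lines,
            delimiter=fields_terminated_by,
            quoting=csv.QUOTE_NONE,
        )
    return [row for row in reader]


sample = 'id,scientificName\n1,"Puma concolor, Linnaeus"\n2,Felis catus\n'
rows = parse_rows(sample, fields_enclosed_by='"', ignore_header_lines=1)
```

Here the enclosing character keeps the comma inside the first scientific name from being treated as a field delimiter.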

Attributes:
NAMESPACE_TAG
fields

List[str]: List of terms of this data file.

filename

str: Filename of the Data File entity.

id

int: Column to be identified as primary key

insert_sql

Generate the INSERT INTO sql statement and the values to be inserted.

name

str: The name of the field.

pandas

pandas.DataFrame: Data of this DataFile as pandas.DataFrame.

polars

polars.DataFrame: Data of this DataFile as polars.DataFrame.

sql_table

str: Data file as CREATE TABLE sql statement.

uri

str: Uniform Resource Identifier (URI) for the term.

Methods

add_field(field)

Add a field to this Data File.

add_namespace(prefix, uri)

Add a namespace to the XML object.

as_pandas([_no_interaction])

Convert the information in this DataFile into a pandas.DataFrame.

as_polars([_no_interaction])

Convert the information in this DataFile into a polars.DataFrame.

check_principal_tag(tag, nmap)

Overwrite due to different possible tags.

from_string(text)

Generates XML Object from a string of an XML file.

from_xml(file[, encoding])

Generates an XML Object from an XML file.

generate_sql_table()

Generate the CREATE TABLE statement for SQL database.

get_principal_tag()

Returns the principal tag with namespaces if it is present.

get_term_class(element)

Extract the Python class term from an XML element instance.

is_lazy()

Check whether the data file loads its data as a Lazy Frame.

object_to_element(tag[, prefix])

Generates an element using tag, adding namespace tag.

parse(element, nmap)

Parse an lxml.etree.Element into a concrete DataFile object.

parse_kwargs(element, nmap)

Parse an lxml.etree.Element into the DataFile parameters.

read_file(content[, source_file, lazy, ...])

Read the content of the file specified in files parameters (filename()).

set_core_field(field)

Set the Core field in an Extension DataFile.

set_primary_key(primary_key)

Set the primary key in an Extension DataFile to be referenced on the new SQL table.

to_element()

Generate an XML Element instance.

to_xml()

Generates text of an XML file.

write_file([_no_interaction])

Write the content as a text using format information on this object.

Entry

close

merge

class Entry(**kwargs)

Bases: object

Methods

to_dict

to_dict() -> Dict

URI = 'http://rs.tdwg.org/dwc/terms/'

str: Uniform Resource Identifier (URI) for the term identifying the class of data.

add_field(field: Field) -> None

Add a field to this Data File.

Parameters:
field: Field

A dwca.terms.field.Field object.

Returns:
None

as_pandas(_no_interaction: bool = False) -> pd.DataFrame

Convert the information in this DataFile into a pandas.DataFrame.

Returns:
DataFrame

Information as a pandas.DataFrame.

as_polars(_no_interaction: bool = False) -> pl.DataFrame

Convert the information in this DataFile into a polars.DataFrame.

Returns:
DataFrame

Information as a polars.DataFrame.

classmethod check_principal_tag(tag: str, nmap: Dict) -> None

Overwrite due to different possible tags.

Parameters:
tag: str

Actual tag.

nmap: Dict

Namespace.

close() -> None

property fields: List[str]

List[str]: List of terms of this data file.

property filename: str

str: Filename of the Data File entity.

generate_sql_table() -> str

Generate the CREATE TABLE statement for SQL database.

Returns:
str

CREATE TABLE statement.
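The exact statement produced by generate_sql_table() is not shown on this page. As a rough illustration of the idea only, a CREATE TABLE statement can be assembled from a table name and a list of column names. The helper `build_create_table` and the choice of TEXT columns are assumptions, not the package's implementation.

```python
import sqlite3
from typing import List


def build_create_table(table: str, columns: List[str], primary_key: str) -> str:
    """Assemble a CREATE TABLE statement (hypothetical helper)."""
    definitions = []
    for column in columns:
        # Every column is stored as TEXT in this sketch.
        definition = f'"{column}" TEXT'
        if column == primary_key:
            definition += " PRIMARY KEY"
        definitions.append(definition)
    return f'CREATE TABLE "{table}" (' + ", ".join(definitions) + ");"


statement = build_create_table(
    "Occurrence", ["occurrenceID", "scientificName"], "occurrenceID"
)

# Check that the statement is accepted by an in-memory SQLite database.
connection = sqlite3.connect(":memory:")
connection.execute(statement)
connection.close()
```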

classmethod get_term_class(element: Element) -> Type[Field]

Extract the Python class term from an XML element instance.

Parameters:
element: lxml.etree.Element

XML element instance.

Returns:
Type[Field]

The Python class representing the term name.

property id: int

int: Column to be identified as primary key

property insert_sql: Generator[str, Tuple[Any]]

Generate the INSERT INTO sql statement and the values to be inserted.
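The shape of what insert_sql yields is only summarized above. The sketch below shows one plausible arrangement, a parameterized INSERT INTO statement paired with an iterator of value tuples, using the standard sqlite3 module. `build_insert` is a hypothetical helper and does not reproduce the package's behavior.

```python
import sqlite3
from typing import Any, Iterator, List, Tuple


def build_insert(
    table: str, columns: List[str], rows: List[Tuple[Any, ...]]
) -> Tuple[str, Iterator[Tuple[Any, ...]]]:
    """Return a parameterized INSERT statement and the row values (hypothetical)."""
    placeholders = ", ".join("?" for _ in columns)
    column_list = ", ".join(f'"{column}"' for column in columns)
    statement = f'INSERT INTO "{table}" ({column_list}) VALUES ({placeholders});'
    return statement, iter(rows)


connection = sqlite3.connect(":memory:")
connection.execute(
    'CREATE TABLE "Taxon" ("taxonID" TEXT PRIMARY KEY, "scientificName" TEXT);'
)
statement, values = build_insert(
    "Taxon",
    ["taxonID", "scientificName"],
    [("t1", "Puma concolor"), ("t2", "Felis catus")],
)
# executemany binds each tuple to the "?" placeholders in turn.
connection.executemany(statement, values)
inserted = connection.execute('SELECT COUNT(*) FROM "Taxon";').fetchone()[0]
connection.close()
```

Keeping the values separate from the statement text lets the database driver handle quoting.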

is_lazy() -> bool

Check whether the data file loads its data as a Lazy Frame.

Returns:
bool

True when the file is read as a Lazy Frame, False otherwise.

merge(data_file: DataFile) -> DataFile

property name: str

str: The name of the field.

property pandas: pd.DataFrame

pandas.DataFrame: Data of this DataFile as pandas.DataFrame.

classmethod parse(element: Element, nmap: Dict) -> DataFile | None

Parse an lxml.etree.Element into a concrete DataFile object.

Parameters:
element: lxml.etree.Element

An XML Element.

nmap: Dict

Dictionary of prefix:uri.

Returns:
DataFile

An instance of a concrete DataFile class.

classmethod parse_kwargs(element: Element, nmap: Dict) -> Dict

Parse an lxml.etree.Element into the DataFile parameters.

Parameters:
element: lxml.etree.Element

An XML Element.

nmap: Dict

Dictionary of prefix:uri.

Returns:
Dict

The Parameters of any DataFile.
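For orientation, the sketch below extracts comparable keyword arguments from a <core> element of a meta.xml descriptor. It uses the standard library's xml.etree.ElementTree rather than lxml, and `element_to_kwargs` is a hypothetical stand-in, not the package's parse_kwargs(). The XML attribute names follow the Darwin Core text guide, and the fields are kept as bare term URIs instead of Field objects.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict

# Namespace of meta.xml descriptors in the Darwin Core text guide.
NS = {"dwc_text": "http://rs.tdwg.org/dwc/text/"}

META_XML = """<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Taxon" encoding="UTF-8"
        fieldsTerminatedBy="," ignoreHeaderLines="1">
    <files><location>taxon.txt</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
  </core>
</archive>"""


def element_to_kwargs(element: ET.Element) -> Dict[str, Any]:
    """Collect constructor-style arguments from a <core> element (hypothetical)."""
    location = element.find("dwc_text:files/dwc_text:location", NS)
    id_element = element.find("dwc_text:id", NS)
    return {
        "_id": int(id_element.get("index")),
        "files": location.text,
        "fields": [f.get("term") for f in element.findall("dwc_text:field", NS)],
        "encoding": element.get("encoding", "utf-8"),
        "fields_terminated_by": element.get("fieldsTerminatedBy", ","),
        "ignore_header_lines": int(element.get("ignoreHeaderLines", "0")),
    }


core = ET.fromstring(META_XML).find("dwc_text:core", NS)
kwargs = element_to_kwargs(core)
```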

property polars: pl.DataFrame

polars.DataFrame: Data of this DataFile as polars.DataFrame.

read_file(content: str, source_file: BinaryIO = None, lazy: bool = False, _no_interaction: bool = False) -> None

Read the content of the file specified in files parameters (filename()).

Parameters:
content: str

Content of the file.

source_file: BinaryIO, optional

File to read from when reading lazily.

lazy: bool, optional

Read the file in lazy evaluation mode. Default False.

_no_interaction: bool, optional

Do not show a progress bar, even if the tqdm library is installed. Default False.

set_core_field(field: Field) -> None

Set the Core field in an Extension DataFile.

Parameters:
field: Field

Primary Key from the Core DataFile.

Returns:
None

set_primary_key(primary_key: str) -> None

Set the primary key in an Extension DataFile to be referenced on the new SQL table.

Parameters:
primary_key: str

Name of the Core Data File.

Returns:
None
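How the extension's SQL table refers to the core is not spelled out on this page. One common way to express such a link is a foreign key from the extension table to the core table's primary key. The sketch below shows this with the standard sqlite3 module; the table and column names are illustrative assumptions, not output of the package.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
connection.execute("PRAGMA foreign_keys = ON;")

# Core table with its primary key.
connection.execute('CREATE TABLE "Occurrence" ("occurrenceID" TEXT PRIMARY KEY);')
# Extension table whose "coreid" column points back to the core table.
connection.execute(
    'CREATE TABLE "MeasurementOrFact" ('
    '"coreid" TEXT, "measurementValue" TEXT, '
    'FOREIGN KEY ("coreid") REFERENCES "Occurrence" ("occurrenceID"));'
)

connection.execute('INSERT INTO "Occurrence" VALUES (?);', ("occ-1",))
connection.execute('INSERT INTO "MeasurementOrFact" VALUES (?, ?);', ("occ-1", "12.5"))

# A row that references a missing core record is rejected.
try:
    connection.execute(
        'INSERT INTO "MeasurementOrFact" VALUES (?, ?);', ("missing", "3.0")
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
connection.close()
```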

property sql_table: str

str: Data file as CREATE TABLE sql statement.

to_element() -> Element

Generate an XML Element instance.

Returns:
lxml.etree.Element

XML Element instance.

property uri: str

str: Uniform Resource Identifier (URI) for the term.

write_file(_no_interaction: bool = False) -> str

Write the content as a text using format information on this object.

Returns:
str

Data File as plain text.
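As an illustration of writing rows as text according to the format information, the sketch below serializes rows with Python's standard csv module. `rows_to_text` is a hypothetical helper, not the package's write_file(), and its arguments only mirror the constructor parameter names.

```python
import csv
import io
from typing import List


def rows_to_text(
    rows: List[List[str]],
    lines_terminated_by: str = "\n",
    fields_terminated_by: str = ",",
    fields_enclosed_by: str = "",
) -> str:
    """Serialize rows as delimited text (hypothetical helper)."""
    buffer = io.StringIO()
    if fields_enclosed_by:
        # Enclose only the fields that need it, for example those
        # containing the delimiter.
        writer = csv.writer(
            buffer,
            delimiter=fields_terminated_by,
            quotechar=fields_enclosed_by,
            quoting=csv.QUOTE_MINIMAL,
            lineterminator=lines_terminated_by,
        )
    else:
        # Without an enclosing character, special characters are escaped.
        writer = csv.writer(
            buffer,
            delimiter=fields_terminated_by,
            quoting=csv.QUOTE_NONE,
            escapechar="\\",
            lineterminator=lines_terminated_by,
        )
    writer.writerows(rows)
    return buffer.getvalue()


text = rows_to_text(
    [["id", "name"], ["1", "Puma concolor, Linnaeus"]],
    fields_enclosed_by='"',
)
```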

class dwca.classes.data_file.DataFileType(value)

Bases: Enum

Type of data file in the Darwin Core Archive.

CORE = 0
EXTENSION = 1
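A short sketch of how an enumeration with these two members can be tied to the <core> and <extension> elements of a meta.xml descriptor. The enum below is a local stand-in mirroring the documented members, and `type_from_tag` is a hypothetical helper, not part of the dwca package.

```python
from enum import Enum


class DataFileType(Enum):
    """Stand-in mirroring the two members documented above."""

    CORE = 0
    EXTENSION = 1


def type_from_tag(tag: str) -> DataFileType:
    """Map a meta.xml element name to a data file type (hypothetical helper)."""
    # Drop an XML namespace written as '{uri}name', if present.
    local_name = tag.rsplit("}", 1)[-1]
    if local_name == "core":
        return DataFileType.CORE
    if local_name == "extension":
        return DataFileType.EXTENSION
    raise ValueError(f"Unexpected element: {tag!r}")
```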

OutsideClass Class

This special class represents any data file whose row type is not defined in the standard.
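The fallback idea can be sketched as a lookup that returns a dedicated class for known row type URIs and a generic one for everything else. All names below (`RowClass`, `OutsideRowClass`, `KNOWN_ROW_TYPES`, `class_for_row_type`) are hypothetical stand-ins for illustration and do not reflect how the package dispatches internally.

```python
from dataclasses import dataclass
from typing import Dict, Type


@dataclass
class RowClass:
    """Hypothetical base for this sketch; stands in for DataFile."""

    uri: str


class Event(RowClass):
    """Stand-in for a class defined by the standard."""


class Taxon(RowClass):
    """Stand-in for a class defined by the standard."""


class OutsideRowClass(RowClass):
    """Stand-in for a row type not defined by the standard."""


# Row types that have a dedicated class in this sketch.
KNOWN_ROW_TYPES: Dict[str, Type[RowClass]] = {
    "http://rs.tdwg.org/dwc/terms/Event": Event,
    "http://rs.tdwg.org/dwc/terms/Taxon": Taxon,
}


def class_for_row_type(row_type: str) -> RowClass:
    """Instantiate the matching class, or the fallback for unknown URIs."""
    cls = KNOWN_ROW_TYPES.get(row_type, OutsideRowClass)
    return cls(uri=row_type)
```

Because the fallback keeps the original URI, an unknown row type can still be written back unchanged.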

class dwca.classes.outside_class.OutsideClass(_id: int, uri: str, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)

Bases: DataFile

Classes defined outside the Darwin Core specifications.

Parameters:
_id: int

Unique identifier for the core entity.

uri: str

URI of the term.

files: str

Location of the file in the archive, that is, its path inside the zip file.

data_file_type: DataFileType, optional

The Data File Type in the Darwin Core Archive, default DataFileType.CORE.

fields: List[Field]

A list of the Field objects (columns) in the Core data entity.

encoding: str, optional

Encoding of the file at the given location (files parameter), default “utf-8”.

lines_terminated_by: str, optional

Line delimiter of the file, default “\n”.

fields_terminated_by: str, optional

Field (cell) delimiter of the file, default “,”.

fields_enclosed_by: str, optional

Character used to enclose (mark the start and end of) each field, default empty “”.

ignore_header_lines: int, optional

Number of header lines to ignore at the start of the file, default 0.

Attributes:
NAMESPACE_TAG
fields

List[str]: List of terms of this data file.

filename

str: Filename of the Data File entity.

id

int: Column to be identified as primary key

insert_sql

Generate the INSERT INTO sql statement and the values to be inserted.

name

str: The name of the field.

pandas

pandas.DataFrame: Data of this DataFile as pandas.DataFrame.

polars

polars.DataFrame: Data of this DataFile as polars.DataFrame.

sql_table

str: Data file as CREATE TABLE sql statement.

uri

str: Uniform Resource Identifier (URI) for the term.

Methods

add_field(field)

Add a field to this Data File.

add_namespace(prefix, uri)

Add a namespace to the XML object.

as_pandas([_no_interaction])

Convert the information in this DataFile into a pandas.DataFrame.

as_polars([_no_interaction])

Convert the information in this DataFile into a polars.DataFrame.

check_principal_tag(tag, nmap)

Overwrite due to different possible tags.

from_string(text)

Generates XML Object from a string of an XML file.

from_xml(file[, encoding])

Generates an XML Object from an XML file.

generate_sql_table()

Generate the CREATE TABLE statement for SQL database.

get_principal_tag()

Returns the principal tag with namespaces if it is present.

get_term_class(element)

Extract the Python class term from an XML element instance.

is_lazy()

Check whether the data file loads its data as a Lazy Frame.

object_to_element(tag[, prefix])

Generates an element using tag, adding namespace tag.

parse(element, nmap)

Parse an lxml.etree.Element into an OutsideClass object.

parse_kwargs(element, nmap)

Parse an lxml.etree.Element into the DataFile parameters.

read_file(content[, source_file, lazy, ...])

Read the content of the file specified in files parameters (filename()).

set_core_field(field)

Set the Core field in an Extension DataFile.

set_primary_key(primary_key)

Set the primary key in an Extension DataFile to be referenced on the new SQL table.

to_element()

Generate an XML Element instance.

to_xml()

Generates text of an XML file.

write_file([_no_interaction])

Write the content as a text using format information on this object.

Entry

close

merge

classmethod get_term_class(element: Element) -> Type[Field]

Extract the Python class term from an XML element instance.

Parameters:
element: lxml.etree.Element

XML element instance.

Returns:
Type[Field]

The Python class representing the term name.

classmethod parse(element: Element, nmap: Dict) -> OutsideClass | None

Parse an lxml.etree.Element into an OutsideClass object.

Parameters:
element: lxml.etree.Element

An XML Element.

nmap: Dict

Dictionary of prefix:uri.

Returns:
OutsideClass

An instance of OutsideClass.

ChronometricAge Class

This particular “class” term is taken from http://rs.tdwg.org/dwc/doc/chrono/ (“Classes” section of https://chrono.tdwg.org/list/#31-index-by-term-name).

class dwca.classes.chronometric_age.ChronometricAge(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)

Bases: DataFile

An approximation of a temporal position that is supported via evidence.

Parameters:
_id: int

Unique identifier for the core entity.

files: str

Location of the file in the archive, that is, its path inside the zip file.

fields: List[Field]

A list of the Field objects (columns) in the Core data entity.

data_file_type: DataFileType, optional

The Data File Type in the Darwin Core Archive, default DataFileType.CORE.

encoding: str, optional

Encoding of the file at the given location (files parameter), default “utf-8”.

lines_terminated_by: str, optional

Line delimiter of the file, default “\n”.

fields_terminated_by: str, optional

Field (cell) delimiter of the file, default “,”.

fields_enclosed_by: str, optional

Character used to enclose (mark the start and end of) each field, default empty “”.

ignore_header_lines: int, optional

Number of header lines to ignore at the start of the file, default 0.

Attributes:
NAMESPACE_TAG
fields

List[str]: List of terms of this data file.

filename

str: Filename of the Data File entity.

id

int: Column to be identified as primary key

insert_sql

Generate the INSERT INTO sql statement and the values to be inserted.

name

str: The name of the field.

pandas

pandas.DataFrame: Data of this DataFile as pandas.DataFrame.

polars

polars.DataFrame: Data of this DataFile as polars.DataFrame.

sql_table

str: Data file as CREATE TABLE sql statement.

uri

str: Uniform Resource Identifier (URI) for the term.

Methods

add_field(field)

Add a field to this Data File.

add_namespace(prefix, uri)

Add a namespace to the XML object.

as_pandas([_no_interaction])

Convert the information in this DataFile into a pandas.DataFrame.

as_polars([_no_interaction])

Convert the information in this DataFile into a polars.DataFrame.

check_principal_tag(tag, nmap)

Overwrite due to different possible tags.

from_string(text)

Generates XML Object from a string of an XML file.

from_xml(file[, encoding])

Generates an XML Object from an XML file.

generate_sql_table()

Generate the CREATE TABLE statement for SQL database.

get_principal_tag()

Returns the principal tag with namespaces if it is present.

get_term_class(element)

Extract the Python class term from an XML element instance.

is_lazy()

Check whether the data file loads its data as a Lazy Frame.

object_to_element(tag[, prefix])

Generates an element using tag, adding namespace tag.

parse(element, nmap)

Parse an lxml.etree.Element into a concrete DataFile object.

parse_kwargs(element, nmap)

Parse an lxml.etree.Element into the DataFile parameters.

read_file(content[, source_file, lazy, ...])

Read the content of the file specified in files parameters (filename()).

set_core_field(field)

Set the Core field in an Extension DataFile.

set_primary_key(primary_key)

Set the primary key in an Extension DataFile to be referenced on the new SQL table.

to_element()

Generate an XML Element instance.

to_xml()

Generates text of an XML file.

write_file([_no_interaction])

Write the content as a text using format information on this object.

Entry

close

merge

URI = 'http://rs.tdwg.org/chrono/terms/ChronometricAge'

str: Uniform Resource Identifier (URI) for the term identifying the class of data.

Event Class

class dwca.classes.event.Event(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)

Bases: DataFile

An action that occurs at some location during some time.

Parameters:
_id: int

Unique identifier for the core entity.

files: str

Location of the file in the archive, that is, its path inside the zip file.

fields: List[Field]

A list of the Field objects (columns) in the Core data entity.

data_file_type: DataFileType, optional

The Data File Type in the Darwin Core Archive, default DataFileType.CORE.

encoding: str, optional

Encoding of the file at the given location (files parameter), default “utf-8”.

lines_terminated_by: str, optional

Line delimiter of the file, default “\n”.

fields_terminated_by: str, optional

Field (cell) delimiter of the file, default “,”.

fields_enclosed_by: str, optional

Character used to enclose (mark the start and end of) each field, default empty “”.

ignore_header_lines: int, optional

Number of header lines to ignore at the start of the file, default 0.

Attributes:
NAMESPACE_TAG
fields

List[str]: List of terms of this data file.

filename

str: Filename of the Data File entity.

id

int: Column to be identified as primary key

insert_sql

Generate the INSERT INTO sql statement and the values to be inserted.

name

str: The name of the field.

pandas

pandas.DataFrame: Data of this DataFile as pandas.DataFrame.

polars

polars.DataFrame: Data of this DataFile as polars.DataFrame.

sql_table

str: Data file as CREATE TABLE sql statement.

uri

str: Uniform Resource Identifier (URI) for the term.

Methods

add_field(field)

Add a field to this Data File.

add_namespace(prefix, uri)

Add a namespace to the XML object.

as_pandas([_no_interaction])

Convert the information in this DataFile into a pandas.DataFrame.

as_polars([_no_interaction])

Convert the information in this DataFile into a polars.DataFrame.

check_principal_tag(tag, nmap)

Overwrite due to different possible tags.

from_string(text)

Generates XML Object from a string of an XML file.

from_xml(file[, encoding])

Generates an XML Object from an XML file.

generate_sql_table()

Generate the CREATE TABLE statement for SQL database.

get_principal_tag()

Returns the principal tag with namespaces if it is present.

get_term_class(element)

Extract the Python class term from an XML element instance.

is_lazy()

Check whether the data file loads its data as a Lazy Frame.

object_to_element(tag[, prefix])

Generates an element using tag, adding namespace tag.

parse(element, nmap)

Parse an lxml.etree.Element into a concrete DataFile object.

parse_kwargs(element, nmap)

Parse an lxml.etree.Element into the DataFile parameters.

read_file(content[, source_file, lazy, ...])

Read the content of the file specified in files parameters (filename()).

set_core_field(field)

Set the Core field in an Extension DataFile.

set_primary_key(primary_key)

Set the primary key in an Extension DataFile to be referenced on the new SQL table.

to_element()

Generate an XML Element instance.

to_xml()

Generates text of an XML file.

write_file([_no_interaction])

Write the content as a text using format information on this object.

Entry

close

merge

URI = 'http://rs.tdwg.org/dwc/terms/Event'

str: Uniform Resource Identifier (URI) for the term identifying the class of data.

GeologicalContext Class

class dwca.classes.geological_context.GeologicalContext(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)

Bases: DataFile

Geological information, such as stratigraphy, that qualifies a region or place.

Parameters:
_id: int

Unique identifier for the core entity.

files: str

Location of the file in the archive, that is, its path inside the zip file.

fields: List[Field]

A list of the Field objects (columns) in the Core data entity.

data_file_type: DataFileType, optional

The Data File Type in the Darwin Core Archive, default DataFileType.CORE.

encoding: str, optional

Encoding of the file at the given location (files parameter), default “utf-8”.

lines_terminated_by: str, optional

Line delimiter of the file, default “\n”.

fields_terminated_by: str, optional

Field (cell) delimiter of the file, default “,”.

fields_enclosed_by: str, optional

Character used to enclose (mark the start and end of) each field, default empty “”.

ignore_header_lines: int, optional

Number of header lines to ignore at the start of the file, default 0.

Attributes:
NAMESPACE_TAG
fields

List[str]: List of terms of this data file.

filename

str: Filename of the Data File entity.

id

int: Column to be identified as primary key

insert_sql

Generate the INSERT INTO sql statement and the values to be inserted.

name

str: The name of the field.

pandas

pandas.DataFrame: Data of this DataFile as pandas.DataFrame.

polars

polars.DataFrame: Data of this DataFile as polars.DataFrame.

sql_table

str: Data file as CREATE TABLE sql statement.

uri

str: Uniform Resource Identifier (URI) for the term.

Methods

add_field(field)

Add a field to this Data File.

add_namespace(prefix, uri)

Add a namespace to the XML object.

as_pandas([_no_interaction])

Convert the information in this DataFile into a pandas.DataFrame.

as_polars([_no_interaction])

Convert the information in this DataFile into a polars.DataFrame.

check_principal_tag(tag, nmap)

Overwrite due to different possible tags.

from_string(text)

Generates XML Object from a string of an XML file.

from_xml(file[, encoding])

Generates an XML Object from an XML file.

generate_sql_table()

Generate the CREATE TABLE statement for SQL database.

get_principal_tag()

Returns the principal tag with namespaces if it is present.

get_term_class(element)

Extract the Python class term from an XML element instance.

is_lazy()

Check whether the data file loads its data as a Lazy Frame.

object_to_element(tag[, prefix])

Generates an element using tag, adding namespace tag.

parse(element, nmap)

Parse an lxml.etree.Element into a concrete DataFile object.

parse_kwargs(element, nmap)

Parse an lxml.etree.Element into the DataFile parameters.

read_file(content[, source_file, lazy, ...])

Read the content of the file specified in files parameters (filename()).

set_core_field(field)

Set the Core field in an Extension DataFile.

set_primary_key(primary_key)

Set the primary key in an Extension DataFile to be referenced on the new SQL table.

to_element()

Generate an XML Element instance.

to_xml()

Generates text of an XML file.

write_file([_no_interaction])

Write the content as a text using format information on this object.

Entry

close

merge

URI = 'http://rs.tdwg.org/dwc/terms/GeologicalContext'

str: Uniform Resource Identifier (URI) for the term identifying the class of data.

Identification Class

class dwca.classes.identification.Identification(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)

Bases: DataFile

A taxonomic determination (e.g., the assignment to a dwc:Taxon).

Parameters:
_id: int

Unique identifier for the core entity.

files: str

Location of the file in the archive, that is, its path inside the zip file.

fields: List[Field]

A list of the Field objects (columns) in the Core data entity.

data_file_type: DataFileType, optional

The Data File Type in the Darwin Core Archive, default DataFileType.CORE.

encoding: str, optional

Encoding of the file at the given location (files parameter), default “utf-8”.

lines_terminated_by: str, optional

Line delimiter of the file, default “\n”.

fields_terminated_by: str, optional

Field (cell) delimiter of the file, default “,”.

fields_enclosed_by: str, optional

Character used to enclose (mark the start and end of) each field, default empty “”.

ignore_header_lines: int, optional

Number of header lines to ignore at the start of the file, default 0.

Attributes:
NAMESPACE_TAG
fields

List[str]: List of terms of this data file.

filename

str: Filename of the Data File entity.

id

int: Column to be identified as primary key

insert_sql

Generate the INSERT INTO sql statement and the values to be inserted.

name

str: The name of the field.

pandas

pandas.DataFrame: Data of this DataFile as pandas.DataFrame.

polars

polars.DataFrame: Data of this DataFile as polars.DataFrame.

sql_table

str: Data file as CREATE TABLE sql statement.

uri

str: Uniform Resource Identifier (URI) for the term.

Methods

add_field(field)

Add a field to this Data File.

add_namespace(prefix, uri)

Add a namespace to the XML object.

as_pandas([_no_interaction])

Convert the information in this DataFile into a pandas.DataFrame.

as_polars([_no_interaction])

Convert the information in this DataFile into a polars.DataFrame.

check_principal_tag(tag, nmap)

Overwrite due to different possible tags.

from_string(text)

Generates XML Object from a string of an XML file.

from_xml(file[, encoding])

Generates an XML Object from an XML file.

generate_sql_table()

Generate the CREATE TABLE statement for SQL database.

get_principal_tag()

Returns the principal tag with namespaces if it is present.

get_term_class(element)

Extract the Python class term from an XML element instance.

is_lazy()

Check whether the data file loads its data as a Lazy Frame.

object_to_element(tag[, prefix])

Generates an element using tag, adding namespace tag.

parse(element, nmap)

Parse an lxml.etree.Element into a concrete DataFile object.

parse_kwargs(element, nmap)

Parse an lxml.etree.Element into the DataFile parameters.

read_file(content[, source_file, lazy, ...])

Read the content of the file specified in files parameters (filename()).

set_core_field(field)

Set the Core field in an Extension DataFile.

set_primary_key(primary_key)

Set the primary key in an Extension DataFile to be referenced on the new SQL table.

to_element()

Generate an XML Element instance.

to_xml()

Generates text of an XML file.

write_file([_no_interaction])

Write the content as a text using format information on this object.

Entry

close

merge

URI = 'http://rs.tdwg.org/dwc/terms/Identification'

str: Uniform Resource Identifier (URI) for the term identifying the class of data.

Location Class

class dwca.classes.location.Location(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)

Bases: DataFile

A spatial region or named place.

Parameters:
_id: int

Unique identifier for the core entity.

files: str

Location of the file in the archive, that is, its path inside the zip file.

fields: List[Field]

A list of the Field objects (columns) in the Core data entity.

data_file_type: DataFileType, optional

The Data File Type in the Darwin Core Archive, default DataFileType.CORE.

encoding: str, optional

Encoding of the file at the given location (files parameter), default “utf-8”.

lines_terminated_by: str, optional

Line delimiter of the file, default “\n”.

fields_terminated_by: str, optional

Field (cell) delimiter of the file, default “,”.

fields_enclosed_by: str, optional

Character used to enclose (mark the start and end of) each field, default empty “”.

ignore_header_lines: int, optional

Number of header lines to ignore at the start of the file, default 0.

Attributes:
NAMESPACE_TAG
fields

List[str]: List of terms of this data file.

filename

str: Filename of the Data File entity.

id

int: Column to be identified as primary key

insert_sql

Generate the INSERT INTO sql statement and the values to be inserted.

name

str: The name of the field.

pandas

pandas.DataFrame: Data of this DataFile as pandas.DataFrame.

polars

polars.DataFrame: Data of this DataFile as polars.DataFrame.

sql_table

str: Data file as CREATE TABLE sql statement.

uri

str: Uniform Resource Identifier (URI) for the term.

Methods

add_field(field)

Add a field to this Data File.

add_namespace(prefix, uri)

Add a namespace to the XML object.

as_pandas([_no_interaction])

Convert the information in this DataFile into a pandas.DataFrame.

as_polars([_no_interaction])

Convert the information in this DataFile into a polars.DataFrame.

check_principal_tag(tag, nmap)

Overwrite due to different possible tags.

from_string(text)

Generates XML Object from a string of an XML file.

from_xml(file[, encoding])

Generates an XML Object from an XML file.

generate_sql_table()

Generate the CREATE TABLE statement for SQL database.

get_principal_tag()

Returns the principal tag with namespaces if it is present.

get_term_class(element)

Extract the Python class term from an XML element instance.

is_lazy()

Check whether the data file loads its data as a Lazy Frame.

object_to_element(tag[, prefix])

Generates an element using tag, adding namespace tag.

parse(element, nmap)

Parse an lxml.etree.Element into a concrete DataFile object.

parse_kwargs(element, nmap)

Parse an lxml.etree.Element into the DataFile parameters.

read_file(content[, source_file, lazy, ...])

Read the content of the file specified in files parameters (filename()).

set_core_field(field)

Set the Core field in an Extension DataFile.

set_primary_key(primary_key)

Set the primary key in an Extension DataFile to be referenced on the new SQL table.

to_element()

Generate an XML Element instance.

to_xml()

Generates text of an XML file.

write_file([_no_interaction])

Write the content as a text using format information on this object.

Entry

close

merge

URI = 'http://purl.org/dc/terms/Location'

str: Uniform Resource Identifier (URI) for the term identifying the class of data.

MaterialEntity Class

class dwca.classes.material_entity.MaterialEntity(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)

Bases: DataFile

An entity that can be identified, exists for some period of time, and consists of physical matter while it exists.

Parameters:
_id: int

Unique identifier for the core entity.

files: str

Location of the file in the archive, that is, its path inside the zip file.

fields: List[Field]

A list of the Field objects (columns) in the Core data entity.

data_file_type: DataFileType, optional

The Data File Type in the Darwin Core Archive, default DataFileType.CORE.

encoding: str, optional

Encoding of the file at the given location (files parameter), default “utf-8”.

lines_terminated_by: str, optional

Line delimiter of the file, default “\n”.

fields_terminated_by: str, optional

Field (cell) delimiter of the file, default “,”.

fields_enclosed_by: str, optional

Character used to enclose (mark the start and end of) each field, default empty “”.

ignore_header_lines: int, optional

Number of header lines to ignore at the start of the file, default 0.

Attributes:
NAMESPACE_TAG
fields

List[str]: List of terms of this data file.

filename

str: Filename of the Data File entity.

id

int: Column to be identified as primary key

insert_sql

Generate the INSERT INTO sql statement and the values to be inserted.

name

str: The name of the field.

pandas

pandas.DataFrame: Data of this DataFile as pandas.DataFrame.

polars

polars.DataFrame: Data of this DataFile as polars.DataFrame.

sql_table

str: Data file as CREATE TABLE sql statement.

uri

str: Uniform Resource Identifier (URI) for the term.

Methods

add_field(field)

Add a field to this Data File.

add_namespace(prefix, uri)

Add a namespace to the XML object.

as_pandas([_no_interaction])

Convert the information in this DataFile into a pandas.DataFrame.

as_polars([_no_interaction])

Convert the information in this DataFile into a polars DataFrame.

check_principal_tag(tag, nmap)

Overwrite due to different possible tags.

from_string(text)

Generates XML Object from a string of an XML file.

from_xml(file[, encoding])

Generates an XML Object from an XML file.

generate_sql_table()

Generate the CREATE TABLE statement for SQL database.

get_principal_tag()

Returns the principal tag with namespaces if it is present.

get_term_class(element)

Extract the Python class term from an XML element instance.

is_lazy()

Check whether the data file loads its data as a LazyFrame.

object_to_element(tag[, prefix])

Generates an element using tag, adding namespace tag.

parse(element, nmap)

Parse an lxml.etree.Element into a concrete DataFile object.

parse_kwargs(element, nmap)

Parse an lxml.etree.Element into the DataFile parameters.

read_file(content[, source_file, lazy, ...])

Read the content of the file specified in files parameters (filename()).

set_core_field(field)

Set the Core field in an Extension DataFile.

set_primary_key(primary_key)

Set the primary key in an Extension DataFile to be referenced on the new SQL table.

to_element()

Generate an XML Element instance.

to_xml()

Generates text of an XML file.

write_file([_no_interaction])

Write the content as a text using format information on this object.

Entry

close

merge

URI = 'http://rs.tdwg.org/dwc/terms/MaterialEntity'#

str: Uniform Resource Identifier (URI) for the term identifying the class of data.

MaterialSample Class#

class dwca.classes.material_sample.MaterialSample(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)#

Bases: DataFile

A material entity that represents an entity of interest in whole or in part.

Parameters:
_id: int

Unique identifier for the core entity.

files: str

File location inside the archive (that is, inside the zip file).

fields: List[Field]

A list of the Field objects (columns) in the Core data entity.

data_file_type: DataFileType

The Data File Type in the Darwin Core Archive.

encoding: str, optional

Encoding of the file given in the files parameter, default is “utf-8”.

lines_terminated_by: str, optional

Line delimiter of the file, default “\n”.

fields_terminated_by: str, optional

Field (cell) delimiter of the file, default “,”.

fields_enclosed_by: str, optional

Specifies the character used to enclose (mark the start and end of) each field, default empty “”.

ignore_header_lines: int, optional

Header lines to ignore at the start of the file; this can be one line or several, default 0 (first line).

Attributes:
NAMESPACE_TAG
fields

List[str]: List of terms of this data file.

filename

str: Filename of the Data File entity.

id

int: Column to be identified as primary key

insert_sql

Generate the INSERT INTO sql statement and the values to be inserted.

name

str: The name of the field.

pandas

pandas.DataFrame: Data of this DataFile as pandas.DataFrame.

polars

polars.DataFrame: Data of this DataFile as polars.DataFrame.

sql_table

str: Data file as CREATE TABLE sql statement.

uri

str: Uniform Resource Identifier (URI) for the term.

Methods

add_field(field)

Add a field to this Data File.

add_namespace(prefix, uri)

Add a namespace to the XML object.

as_pandas([_no_interaction])

Convert the information in this DataFile into a pandas.DataFrame.

as_polars([_no_interaction])

Convert the information in this DataFile into a polars DataFrame.

check_principal_tag(tag, nmap)

Overwrite due to different possible tags.

from_string(text)

Generates XML Object from a string of an XML file.

from_xml(file[, encoding])

Generates an XML Object from an XML file.

generate_sql_table()

Generate the CREATE TABLE statement for SQL database.

get_principal_tag()

Returns the principal tag with namespaces if it is present.

get_term_class(element)

Extract the Python class term from an XML element instance.

is_lazy()

Check whether the data file loads its data as a LazyFrame.

object_to_element(tag[, prefix])

Generates an element using tag, adding namespace tag.

parse(element, nmap)

Parse an lxml.etree.Element into a concrete DataFile object.

parse_kwargs(element, nmap)

Parse an lxml.etree.Element into the DataFile parameters.

read_file(content[, source_file, lazy, ...])

Read the content of the file specified in files parameters (filename()).

set_core_field(field)

Set the Core field in an Extension DataFile.

set_primary_key(primary_key)

Set the primary key in an Extension DataFile to be referenced on the new SQL table.

to_element()

Generate an XML Element instance.

to_xml()

Generates text of an XML file.

write_file([_no_interaction])

Write the content as a text using format information on this object.

Entry

close

merge

URI = 'http://rs.tdwg.org/dwc/terms/MaterialSample'#

str: Uniform Resource Identifier (URI) for the term identifying the class of data.
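As a rough illustration of what sql_table and insert_sql describe, the sketch below builds a CREATE TABLE statement and a parameterised INSERT INTO statement from a list of column names, then runs both against an in-memory SQLite database. The helper names, the table name, and the choice of TEXT for every column are assumptions made for this example; the statements the library actually generates may differ.

```python
import sqlite3
from typing import List, Sequence, Tuple


def create_table_sql(table: str, columns: List[str], primary_key: str) -> str:
    """Hypothetical helper: CREATE TABLE with one TEXT column per term."""
    definitions = [
        f'"{col}" TEXT PRIMARY KEY' if col == primary_key else f'"{col}" TEXT'
        for col in columns
    ]
    return f'CREATE TABLE "{table}" ({", ".join(definitions)})'


def build_insert(
    table: str, columns: List[str], rows: Sequence[Sequence[str]]
) -> Tuple[str, List[Tuple[str, ...]]]:
    """Hypothetical helper: INSERT INTO statement plus the values to bind."""
    names = ", ".join(f'"{col}"' for col in columns)
    placeholders = ", ".join("?" for _ in columns)
    statement = f'INSERT INTO "{table}" ({names}) VALUES ({placeholders})'
    return statement, [tuple(row) for row in rows]


# Column names are Darwin Core terms; the row values are illustrative only.
columns = ["materialSampleID", "preparations"]
connection = sqlite3.connect(":memory:")
connection.execute(create_table_sql("material_sample", columns, "materialSampleID"))
statement, values = build_insert("material_sample", columns, [["ms-1", "tissue"]])
connection.executemany(statement, values)
stored = connection.execute('SELECT * FROM "material_sample"').fetchall()
print(stored)
```

Returning the statement and the values separately keeps the data out of the SQL text, so the database driver handles quoting.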

MeasurementOrFact Class#

class dwca.classes.measurement_or_fact.MeasurementOrFact(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)#

Bases: DataFile

A measurement of or fact about a Resource.

Parameters:
_id: int

Unique identifier for the core entity.

files: str

File location inside the archive (that is, inside the zip file).

fields: List[Field]

A list of the Field objects (columns) in the Core data entity.

data_file_type: DataFileType

The Data File Type in the Darwin Core Archive.

encoding: str, optional

Encoding of the file given in the files parameter, default is “utf-8”.

lines_terminated_by: str, optional

Line delimiter of the file, default “\n”.

fields_terminated_by: str, optional

Field (cell) delimiter of the file, default “,”.

fields_enclosed_by: str, optional

Specifies the character used to enclose (mark the start and end of) each field, default empty “”.

ignore_header_lines: int, optional

Header lines to ignore at the start of the file; this can be one line or several, default 0 (first line).

Attributes:
NAMESPACE_TAG
fields

List[str]: List of terms of this data file.

filename

str: Filename of the Data File entity.

id

int: Column to be identified as primary key

insert_sql

Generate the INSERT INTO sql statement and the values to be inserted.

name

str: The name of the field.

pandas

pandas.DataFrame: Data of this DataFile as pandas.DataFrame.

polars

polars.DataFrame: Data of this DataFile as polars.DataFrame.

sql_table

str: Data file as CREATE TABLE sql statement.

uri

str: Uniform Resource Identifier (URI) for the term.

Methods

add_field(field)

Add a field to this Data File.

add_namespace(prefix, uri)

Add a namespace to the XML object.

as_pandas([_no_interaction])

Convert the information in this DataFile into a pandas.DataFrame.

as_polars([_no_interaction])

Convert the information in this DataFile into a polars DataFrame.

check_principal_tag(tag, nmap)

Overwrite due to different possible tags.

from_string(text)

Generates XML Object from a string of an XML file.

from_xml(file[, encoding])

Generates an XML Object from an XML file.

generate_sql_table()

Generate the CREATE TABLE statement for SQL database.

get_principal_tag()

Returns the principal tag with namespaces if it is present.

get_term_class(element)

Extract the Python class term from an XML element instance.

is_lazy()

Check whether the data file loads its data as a LazyFrame.

object_to_element(tag[, prefix])

Generates an element using tag, adding namespace tag.

parse(element, nmap)

Parse an lxml.etree.Element into a concrete DataFile object.

parse_kwargs(element, nmap)

Parse an lxml.etree.Element into the DataFile parameters.

read_file(content[, source_file, lazy, ...])

Read the content of the file specified in files parameters (filename()).

set_core_field(field)

Set the Core field in an Extension DataFile.

set_primary_key(primary_key)

Set the primary key in an Extension DataFile to be referenced on the new SQL table.

to_element()

Generate an XML Element instance.

to_xml()

Generates text of an XML file.

write_file([_no_interaction])

Write the content as a text using format information on this object.

Entry

close

merge

URI = 'http://rs.tdwg.org/dwc/terms/MeasurementOrFact'#

str: Uniform Resource Identifier (URI) for the term identifying the class of data.

Occurrence Class#

class dwca.classes.occurrence.Occurrence(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)#

Bases: DataFile

An existence of an organism at a particular place at a particular time.

Parameters:
_id: int

Unique identifier for the core entity.

files: str

File location inside the archive (that is, inside the zip file).

fields: List[Field]

A list of the Field objects (columns) in the Core data entity.

data_file_type: DataFileType

The Data File Type in the Darwin Core Archive.

encoding: str, optional

Encoding of the file given in the files parameter, default is “utf-8”.

lines_terminated_by: str, optional

Line delimiter of the file, default “\n”.

fields_terminated_by: str, optional

Field (cell) delimiter of the file, default “,”.

fields_enclosed_by: str, optional

Specifies the character used to enclose (mark the start and end of) each field, default empty “”.

ignore_header_lines: int, optional

Header lines to ignore at the start of the file; this can be one line or several, default 0 (first line).

Attributes:
NAMESPACE_TAG
fields

List[str]: List of terms of this data file.

filename

str: Filename of the Data File entity.

id

int: Column to be identified as primary key

insert_sql

Generate the INSERT INTO sql statement and the values to be inserted.

name

str: The name of the field.

pandas

pandas.DataFrame: Data of this DataFile as pandas.DataFrame.

polars

polars.DataFrame: Data of this DataFile as polars.DataFrame.

sql_table

str: Data file as CREATE TABLE sql statement.

uri

str: Uniform Resource Identifier (URI) for the term.

Methods

add_field(field)

Add a field to this Data File.

add_namespace(prefix, uri)

Add a namespace to the XML object.

as_pandas([_no_interaction])

Convert the information in this DataFile into a pandas.DataFrame.

as_polars([_no_interaction])

Convert the information in this DataFile into a polars DataFrame.

check_principal_tag(tag, nmap)

Overwrite due to different possible tags.

from_string(text)

Generates XML Object from a string of an XML file.

from_xml(file[, encoding])

Generates an XML Object from an XML file.

generate_sql_table()

Generate the CREATE TABLE statement for SQL database.

get_principal_tag()

Returns the principal tag with namespaces if it is present.

get_term_class(element)

Extract the Python class term from an XML element instance.

is_lazy()

Check whether the data file loads its data as a LazyFrame.

object_to_element(tag[, prefix])

Generates an element using tag, adding namespace tag.

parse(element, nmap)

Parse an lxml.etree.Element into a concrete DataFile object.

parse_kwargs(element, nmap)

Parse an lxml.etree.Element into the DataFile parameters.

read_file(content[, source_file, lazy, ...])

Read the content of the file specified in files parameters (filename()).

set_core_field(field)

Set the Core field in an Extension DataFile.

set_primary_key(primary_key)

Set the primary key in an Extension DataFile to be referenced on the new SQL table.

to_element()

Generate an XML Element instance.

to_xml()

Generates text of an XML file.

write_file([_no_interaction])

Write the content as a text using format information on this object.

Entry

close

merge

URI = 'http://rs.tdwg.org/dwc/terms/Occurrence'#

str: Uniform Resource Identifier (URI) for the term identifying the class of data.
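Because every class carries its term in the URI attribute, a rowType value can be resolved to a Python class with a plain lookup. The following sketch shows the idea with stand-in classes rather than the library's own; RowType, REGISTRY, and class_for_row_type are hypothetical names introduced only for this example.

```python
from typing import Dict, Type


class RowType:
    """Stand-in base class for this sketch (not the library's DataFile)."""

    URI = ""


class Occurrence(RowType):
    URI = "http://rs.tdwg.org/dwc/terms/Occurrence"


class MeasurementOrFact(RowType):
    URI = "http://rs.tdwg.org/dwc/terms/MeasurementOrFact"


# Map each class URI to the class that represents it.
REGISTRY: Dict[str, Type[RowType]] = {
    cls.URI: cls for cls in (Occurrence, MeasurementOrFact)
}


def class_for_row_type(row_type: str) -> Type[RowType]:
    """Return the class registered for a rowType URI."""
    try:
        return REGISTRY[row_type]
    except KeyError:
        raise ValueError(f"Unsupported rowType: {row_type}") from None


resolved = class_for_row_type("http://rs.tdwg.org/dwc/terms/Occurrence")
print(resolved.__name__)
```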

Organism Class#

class dwca.classes.organism.Organism(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)#

Bases: DataFile

A particular organism or defined group of organisms considered to be taxonomically homogeneous.

Parameters:
_id: int

Unique identifier for the core entity.

files: str

File location inside the archive (that is, inside the zip file).

fields: List[Field]

A list of the Field objects (columns) in the Core data entity.

data_file_type: DataFileType

The Data File Type in the Darwin Core Archive.

encoding: str, optional

Encoding of the file given in the files parameter, default is “utf-8”.

lines_terminated_by: str, optional

Line delimiter of the file, default “\n”.

fields_terminated_by: str, optional

Field (cell) delimiter of the file, default “,”.

fields_enclosed_by: str, optional

Specifies the character used to enclose (mark the start and end of) each field, default empty “”.

ignore_header_lines: int, optional

Header lines to ignore at the start of the file; this can be one line or several, default 0 (first line).

Attributes:
NAMESPACE_TAG
fields

List[str]: List of terms of this data file.

filename

str: Filename of the Data File entity.

id

int: Column to be identified as primary key

insert_sql

Generate the INSERT INTO sql statement and the values to be inserted.

name

str: The name of the field.

pandas

pandas.DataFrame: Data of this DataFile as pandas.DataFrame.

polars

polars.DataFrame: Data of this DataFile as polars.DataFrame.

sql_table

str: Data file as CREATE TABLE sql statement.

uri

str: Uniform Resource Identifier (URI) for the term.

Methods

add_field(field)

Add a field to this Data File.

add_namespace(prefix, uri)

Add a namespace to the XML object.

as_pandas([_no_interaction])

Convert the information in this DataFile into a pandas.DataFrame.

as_polars([_no_interaction])

Convert the information in this DataFile into a polars DataFrame.

check_principal_tag(tag, nmap)

Overwrite due to different possible tags.

from_string(text)

Generates XML Object from a string of an XML file.

from_xml(file[, encoding])

Generates an XML Object from an XML file.

generate_sql_table()

Generate the CREATE TABLE statement for SQL database.

get_principal_tag()

Returns the principal tag with namespaces if it is present.

get_term_class(element)

Extract the Python class term from an XML element instance.

is_lazy()

Check whether the data file loads its data as a LazyFrame.

object_to_element(tag[, prefix])

Generates an element using tag, adding namespace tag.

parse(element, nmap)

Parse an lxml.etree.Element into a concrete DataFile object.

parse_kwargs(element, nmap)

Parse an lxml.etree.Element into the DataFile parameters.

read_file(content[, source_file, lazy, ...])

Read the content of the file specified in files parameters (filename()).

set_core_field(field)

Set the Core field in an Extension DataFile.

set_primary_key(primary_key)

Set the primary key in an Extension DataFile to be referenced on the new SQL table.

to_element()

Generate an XML Element instance.

to_xml()

Generates text of an XML file.

write_file([_no_interaction])

Write the content as a text using format information on this object.

Entry

close

merge

URI = 'http://rs.tdwg.org/dwc/terms/Organism'#

str: Uniform Resource Identifier (URI) for the term identifying the class of data.
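To give an idea of what turning an XML element into constructor parameters involves, here is a minimal sketch that reads a core element from a small metafile and maps its attributes onto the keyword names used in the class signature. It uses the standard library's xml.etree.ElementTree instead of lxml, core_kwargs is a hypothetical helper, and the mapping of the id index onto _id is an assumption; it does not reproduce the library's parse_kwargs.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict

# Illustrative metafile; the file name and the field are made up.
META = r"""<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core encoding="UTF-8" linesTerminatedBy="\n" fieldsTerminatedBy="\t"
        fieldsEnclosedBy='"' ignoreHeaderLines="1"
        rowType="http://rs.tdwg.org/dwc/terms/Organism">
    <files><location>organism.txt</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/organismName"/>
  </core>
</archive>"""

NS = {"dwc": "http://rs.tdwg.org/dwc/text/"}


def _unescape(value: str) -> str:
    """Turn a written escape such as backslash-n into the real character."""
    return value.encode("ascii").decode("unicode_escape")


def core_kwargs(meta_xml: str) -> Dict[str, Any]:
    """Hypothetical helper: map <core> attributes to keyword arguments."""
    core = ET.fromstring(meta_xml).find("dwc:core", NS)
    if core is None:
        raise ValueError("metafile has no <core> element")
    return {
        # Assumption: _id corresponds to the index of the <id> element.
        "_id": int(core.find("dwc:id", NS).get("index")),
        "files": core.find("dwc:files/dwc:location", NS).text,
        "encoding": core.get("encoding", "utf-8"),
        "lines_terminated_by": _unescape(core.get("linesTerminatedBy", "\\n")),
        "fields_terminated_by": _unescape(core.get("fieldsTerminatedBy", ",")),
        "fields_enclosed_by": core.get("fieldsEnclosedBy", ""),
        "ignore_header_lines": int(core.get("ignoreHeaderLines", "0")),
    }


kwargs = core_kwargs(META)
print(kwargs)
```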

ResourceRelationship Class#

class dwca.classes.resource_relationship.ResourceRelationship(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)#

Bases: DataFile

A relationship of one Resource to another.

Parameters:
_id: int

Unique identifier for the core entity.

files: str

File location inside the archive (that is, inside the zip file).

fields: List[Field]

A list of the Field objects (columns) in the Core data entity.

data_file_type: DataFileType

The Data File Type in the Darwin Core Archive.

encoding: str, optional

Encoding of the file given in the files parameter, default is “utf-8”.

lines_terminated_by: str, optional

Line delimiter of the file, default “\n”.

fields_terminated_by: str, optional

Field (cell) delimiter of the file, default “,”.

fields_enclosed_by: str, optional

Specifies the character used to enclose (mark the start and end of) each field, default empty “”.

ignore_header_lines: int, optional

Header lines to ignore at the start of the file; this can be one line or several, default 0 (first line).

Attributes:
NAMESPACE_TAG
fields

List[str]: List of terms of this data file.

filename

str: Filename of the Data File entity.

id

int: Column to be identified as primary key

insert_sql

Generate the INSERT INTO sql statement and the values to be inserted.

name

str: The name of the field.

pandas

pandas.DataFrame: Data of this DataFile as pandas.DataFrame.

polars

polars.DataFrame: Data of this DataFile as polars.DataFrame.

sql_table

str: Data file as CREATE TABLE sql statement.

uri

str: Uniform Resource Identifier (URI) for the term.

Methods

add_field(field)

Add a field to this Data File.

add_namespace(prefix, uri)

Add a namespace to the XML object.

as_pandas([_no_interaction])

Convert the information in this DataFile into a pandas.DataFrame.

as_polars([_no_interaction])

Convert the information in this DataFile into a polars DataFrame.

check_principal_tag(tag, nmap)

Overwrite due to different possible tags.

from_string(text)

Generates XML Object from a string of an XML file.

from_xml(file[, encoding])

Generates an XML Object from an XML file.

generate_sql_table()

Generate the CREATE TABLE statement for SQL database.

get_principal_tag()

Returns the principal tag with namespaces if it is present.

get_term_class(element)

Extract the Python class term from an XML element instance.

is_lazy()

Check whether the data file loads its data as a LazyFrame.

object_to_element(tag[, prefix])

Generates an element using tag, adding namespace tag.

parse(element, nmap)

Parse an lxml.etree.Element into a concrete DataFile object.

parse_kwargs(element, nmap)

Parse an lxml.etree.Element into the DataFile parameters.

read_file(content[, source_file, lazy, ...])

Read the content of the file specified in files parameters (filename()).

set_core_field(field)

Set the Core field in an Extension DataFile.

set_primary_key(primary_key)

Set the primary key in an Extension DataFile to be referenced on the new SQL table.

to_element()

Generate an XML Element instance.

to_xml()

Generates text of an XML file.

write_file([_no_interaction])

Write the content as a text using format information on this object.

Entry

close

merge

URI = 'http://rs.tdwg.org/dwc/terms/ResourceRelationship'#

str: Uniform Resource Identifier (URI) for the term identifying the class of data.

Taxon Class#

class dwca.classes.taxon.Taxon(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)#

Bases: DataFile

A group of organisms considered by taxonomists to form a homogeneous unit.

Parameters:
_id: int

Unique identifier for the core entity.

files: str

File location inside the archive (that is, inside the zip file).

fields: List[Field]

A list of the Field objects (columns) in the Core data entity.

data_file_type: DataFileType

The Data File Type in the Darwin Core Archive.

encoding: str, optional

Encoding of the file given in the files parameter, default is “utf-8”.

lines_terminated_by: str, optional

Line delimiter of the file, default “\n”.

fields_terminated_by: str, optional

Field (cell) delimiter of the file, default “,”.

fields_enclosed_by: str, optional

Specifies the character used to enclose (mark the start and end of) each field, default empty “”.

ignore_header_lines: int, optional

Header lines to ignore at the start of the file; this can be one line or several, default 0 (first line).

Attributes:
NAMESPACE_TAG
fields

List[str]: List of terms of this data file.

filename

str: Filename of the Data File entity.

id

int: Column to be identified as primary key

insert_sql

Generate the INSERT INTO sql statement and the values to be inserted.

name

str: The name of the field.

pandas

pandas.DataFrame: Data of this DataFile as pandas.DataFrame.

polars

polars.DataFrame: Data of this DataFile as polars.DataFrame.

sql_table

str: Data file as CREATE TABLE sql statement.

uri

str: Uniform Resource Identifier (URI) for the term.

Methods

add_field(field)

Add a field to this Data File.

add_namespace(prefix, uri)

Add a namespace to the XML object.

all_synonyms(taxa_id[, get_names])

Get a list of all valid names of a list of taxa.

as_pandas([_no_interaction])

Convert the information in this DataFile into a pandas.DataFrame.

as_polars([_no_interaction])

Convert the information in this DataFile into a polars DataFrame.

check_principal_tag(tag, nmap)

Overwrite due to different possible tags.

filter_by_class(classes[, fuzzy_threshold])

Filter data by a valid class.

filter_by_family(families[, fuzzy_threshold])

Filter data by a valid family.

filter_by_genus(genera[, fuzzy_threshold])

Filter data by a valid genus.

filter_by_kingdom(kingdoms[, fuzzy_threshold])

Filter data by a valid kingdom.

filter_by_order(orders[, fuzzy_threshold])

Filter data by a valid order.

filter_by_phylum(phyla[, fuzzy_threshold])

Filter data by a valid phylum.

filter_by_species(species[, fuzzy_threshold])

Filter data by species or any taxonomic rank below it (subspecies, variety, form, etc.).

from_string(text)

Generates XML Object from a string of an XML file.

from_xml(file[, encoding])

Generates an XML Object from an XML file.

generate_sql_table()

Generate the CREATE TABLE statement for SQL database.

get_parents(taxa_id)

Get the taxon IDs of the parents of the given taxon IDs.

get_principal_tag()

Returns the principal tag with namespaces if it is present.

get_term_class(element)

Extract the Python class term from an XML element instance.

is_lazy()

Check whether the data file loads its data as a LazyFrame.

object_to_element(tag[, prefix])

Generates an element using tag, adding namespace tag.

parse(element, nmap)

Parse an lxml.etree.Element into a concrete DataFile object.

parse_kwargs(element, nmap)

Parse an lxml.etree.Element into the DataFile parameters.

read_file(content[, source_file, lazy, ...])

Read the content of the file specified in files parameters (filename()).

set_core_field(field)

Set the Core field in an Extension DataFile.

set_primary_key(primary_key)

Set the primary key in an Extension DataFile to be referenced on the new SQL table.

to_element()

Generate an XML Element instance.

to_xml()

Generates text of an XML file.

write_file([_no_interaction])

Write the content as a text using format information on this object.

Entry

close

merge

URI = 'http://rs.tdwg.org/dwc/terms/Taxon'#

str: Uniform Resource Identifier (URI) for the term identifying the class of data.

all_synonyms(taxa_id: Iterable[str], get_names: bool = False) → List[str]#

Get a list of all valid names of a list of taxa.

Parameters:
taxa_id: Iterable[str]

A list (or other iterable) of dwca.terms.taxon.TaxonID values.

get_names: bool

Whether to return dwca.terms.taxon.ScientificName values instead of dwca.terms.taxon.TaxonID values.

Returns:
List[str]

A list of dwca.terms.taxon.TaxonID values, or of dwca.terms.taxon.ScientificName values if get_names is set.
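One way to picture a synonym lookup is to group records by the accepted name they point to. The sketch below assumes that synonyms are linked through the Darwin Core term acceptedNameUsageID, which the text above does not confirm; synonyms_of and the sample records are hypothetical and only illustrate the effect of get_names.

```python
from typing import Dict, Iterable, List

# Illustrative records keyed by Darwin Core terms; identifiers are made up.
TAXA: List[Dict[str, str]] = [
    {"taxonID": "t1", "acceptedNameUsageID": "t1", "scientificName": "Puma concolor"},
    {"taxonID": "t2", "acceptedNameUsageID": "t1", "scientificName": "Felis concolor"},
    {"taxonID": "t3", "acceptedNameUsageID": "t3", "scientificName": "Lynx rufus"},
]


def synonyms_of(
    taxa: List[Dict[str, str]], taxa_id: Iterable[str], get_names: bool = False
) -> List[str]:
    """Hypothetical helper: every record sharing an accepted name usage."""
    wanted = set(taxa_id)
    # Accepted usages referenced by the requested taxa.
    accepted = {row["acceptedNameUsageID"] for row in taxa if row["taxonID"] in wanted}
    key = "scientificName" if get_names else "taxonID"
    return [row[key] for row in taxa if row["acceptedNameUsageID"] in accepted]


ids = synonyms_of(TAXA, ["t2"])
names = synonyms_of(TAXA, ["t2"], get_names=True)
print(ids, names)
```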

filter_by_class(classes: List[str], fuzzy_threshold: float = -1) → None#

Filter data by a valid class.

Parameters:
classes: List[str]

Class names to filter data.

fuzzy_threshold: float, optional

If greater than 0, names are matched using the Levenshtein distance with this threshold instead of an exact match.
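The difference between exact and fuzzy matching can be sketched as follows. The text does not say how the threshold is interpreted, so this example assumes it is a maximum edit distance; the library may instead use a similarity ratio. Both levenshtein and fuzzy_filter are hypothetical helpers written for this illustration.

```python
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,  # deletion
                    current[j - 1] + 1,  # insertion
                    previous[j - 1] + cost,  # substitution
                )
            )
        previous = current
    return previous[-1]


def fuzzy_filter(
    values: List[str], wanted: List[str], fuzzy_threshold: float = -1
) -> List[str]:
    """Keep values that match a wanted name exactly or within the threshold."""
    if fuzzy_threshold > 0:
        # Assumption: the threshold is the largest accepted edit distance.
        return [
            value
            for value in values
            if any(levenshtein(value, name) <= fuzzy_threshold for name in wanted)
        ]
    return [value for value in values if value in wanted]


families = ["Felidae", "Felidea", "Canidae"]
exact = fuzzy_filter(families, ["Felidae"])
fuzzy = fuzzy_filter(families, ["Felidae"], fuzzy_threshold=2)
print(exact, fuzzy)
```

With the default of -1 only the exact spelling is kept, while a threshold of 2 also accepts the misspelled "Felidea".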

filter_by_family(families: List[str], fuzzy_threshold: float = -1) → None#

Filter data by a valid family.

Parameters:
families: List[str]

Family names to filter data.

fuzzy_threshold: float, optional

If greater than 0, names are matched using the Levenshtein distance with this threshold instead of an exact match.

filter_by_genus(genera: List[str], fuzzy_threshold: float = -1) → None#

Filter data by a valid genus.

Parameters:
genera: List[str]

Genus names to filter data.

fuzzy_threshold: float, optional

If greater than 0, names are matched using the Levenshtein distance with this threshold instead of an exact match.

filter_by_kingdom(kingdoms: List[str], fuzzy_threshold: float = -1) → None#

Filter data by a valid kingdom.

Parameters:
kingdoms: List[str]

Kingdom names to filter data.

fuzzy_threshold: float, optional

If greater than 0, names are matched using the Levenshtein distance with this threshold instead of an exact match.

filter_by_order(orders: List[str], fuzzy_threshold: float = -1) → None#

Filter data by a valid order.

Parameters:
orders: List[str]

Order names to filter data.

fuzzy_threshold: float, optional

If greater than 0, names are matched using the Levenshtein distance with this threshold instead of an exact match.

filter_by_phylum(phyla: List[str], fuzzy_threshold: float = -1) → None#

Filter data by a valid phylum.

Parameters:
phyla: List[str]

Phylum names to filter data.

fuzzy_threshold: float, optional

If greater than 0, names are matched using the Levenshtein distance with this threshold instead of an exact match.

filter_by_species(species: List[str], fuzzy_threshold: float = -1) → None#

Filter data by species or any taxonomic rank below it (subspecies, variety, form, etc.).

In contrast with the other filter_by_ taxonomy methods, this one filters the taxonomic data using the scientific name field dwca.terms.taxon.ScientificName.

Warning

Because of that, use this method with caution. If the scientific name of a rank above species (genus, order, etc.) is used, it could result in unexpected behaviour.

Parameters:
species: List[str]

Scientific names of species (or ranks below) to filter data.

fuzzy_threshold: float, optional

If greater than 0, names are matched using the Levenshtein distance with this threshold instead of an exact match.

get_parents(taxa_id: List[str]) → Set[str]#

Get the taxon IDs of the parents of the given taxon IDs.

Parameters:
taxa_id: List[str]

A list of dwca.terms.taxon.TaxonID values to look up parents for.

Returns:
Set[str]

Set of taxon IDs.
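A parent lookup of this kind can be sketched as a walk over a mapping from each taxon ID to its parent ID. The example assumes the parent link comes from the Darwin Core term parentNameUsageID and that every ancestor up to the root is collected; the library may return only direct parents. collect_parents and the identifiers are hypothetical.

```python
from typing import Dict, List, Optional, Set


def collect_parents(
    parent_of: Dict[str, Optional[str]], taxa_id: List[str]
) -> Set[str]:
    """Hypothetical helper: gather the ancestors of the given taxon IDs."""
    parents: Set[str] = set()
    for taxon_id in taxa_id:
        current = parent_of.get(taxon_id)
        # Stop at the root, or at an ancestor that was already collected.
        while current is not None and current not in parents:
            parents.add(current)
            current = parent_of.get(current)
    return parents


# Illustrative taxonID -> parentNameUsageID mapping; identifiers are made up.
parent_of = {"sp1": "gen1", "sp2": "gen1", "gen1": "fam1", "fam1": None}
found = collect_parents(parent_of, ["sp1", "sp2"])
print(sorted(found))
```

Returning a set means that a parent shared by several taxa, such as the genus of two species, is reported once.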