dwca.classes package
This module corresponds to the “class” type terms defined in the “Classes” section of https://dwc.tdwg.org/list/#31-index-by-term-name, excluding the deprecated ones. Each of these Python classes represents a complete file in a Darwin Core Archive.
These classes are the ones listed in the description of rowType in the Attributes section for a <core> or <extension> element, quoted below:
(…) For convenience the URIs for classes defined by the Darwin Core are:
- dwc:Occurrence: http://rs.tdwg.org/dwc/terms/Occurrence
- dwc:Organism: http://rs.tdwg.org/dwc/terms/Organism
- dwc:MaterialEntity: http://rs.tdwg.org/dwc/terms/MaterialEntity
- dwc:MaterialSample: http://rs.tdwg.org/dwc/terms/MaterialSample
- dwc:Event: http://rs.tdwg.org/dwc/terms/Event
- dcterms:Location: http://purl.org/dc/terms/Location
- dwc:GeologicalContext: http://rs.tdwg.org/dwc/terms/GeologicalContext
- dwc:Identification: http://rs.tdwg.org/dwc/terms/Identification
- dwc:Taxon: http://rs.tdwg.org/dwc/terms/Taxon
- dwc:ResourceRelationship: http://rs.tdwg.org/dwc/terms/ResourceRelationship
- dwc:MeasurementOrFact: http://rs.tdwg.org/dwc/terms/MeasurementOrFact
- chrono:ChronometricAge: http://rs.tdwg.org/chrono/terms/ChronometricAge
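The prefixed names in the list above are shorthand for the full URIs. A minimal sketch of that expansion follows; the `PREFIXES` table and the `expand` helper are hypothetical names used only for illustration and are not part of the dwca package.

```python
# Hypothetical illustration, not part of dwca: expand a prefixed class name
# such as "dwc:Event" into the full rowType URI from the quoted list.
PREFIXES = {
    "dwc": "http://rs.tdwg.org/dwc/terms/",
    "dcterms": "http://purl.org/dc/terms/",
    "chrono": "http://rs.tdwg.org/chrono/terms/",
}


def expand(prefixed_name: str) -> str:
    """Return the full URI for a name written as prefix:LocalName."""
    prefix, local_name = prefixed_name.split(":", 1)
    return PREFIXES[prefix] + local_name


event_uri = expand("dwc:Event")
location_uri = expand("dcterms:Location")
chrono_uri = expand("chrono:ChronometricAge")
```

An unknown prefix raises `KeyError` in this sketch, which keeps typos in a prefix from silently producing a wrong URI.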
DataFile Class
- class dwca.classes.data_file.DataFile(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)
Bases: XMLObject, ABC
Abstract class representing the class of data represented by each row in a data entity.
- Parameters:
- _id: int
Unique identifier for the core entity.
- files: str
File location inside the archive (the zip file).
- fields: List[Field]
A list of the Field objects (columns) in the Core data entity.
- data_file_type: DataFileType, optional
The Data File Type in the Darwin Core Archive, default DataFileType.CORE.
- encoding: str, optional
Encoding of the file given in the files parameter, default is “utf-8”.
- lines_terminated_by: str, optional
Line delimiter of the file, default “\n”.
- fields_terminated_by: str, optional
Field (cell) delimiter of the file, default “,”.
- fields_enclosed_by: str, optional
Character used to enclose (mark the start and end of) each field, default empty “”.
- ignore_header_lines: int, optional
Number of lines to ignore at the start of the document, default 0.
- Attributes:
- NAMESPACE_TAG
- fields (List[str]): List of terms of this data file.
- filename (str): Filename of the Data File entity.
- id (int): Column to be identified as primary key.
- insert_sql: Generate the INSERT INTO SQL statement and the values to be inserted.
- name (str): The name of the field.
- pandas (pandas.DataFrame): Data of this DataFile as pandas.DataFrame.
- polars (polars.DataFrame): Data of this DataFile as polars.DataFrame.
- sql_table (str): Data file as CREATE TABLE SQL statement.
- uri (str): Uniform Resource Identifier (URI) for the term.
Methods
- add_field(field): Add a field to this Data File.
- add_namespace(prefix, uri): Add a namespace to the XML object.
- as_pandas([_no_interaction]): Convert the information in this DataFile into a pandas.DataFrame.
- as_polars([_no_interaction]): Convert the information in this DataFile into a polars.DataFrame.
- check_principal_tag(tag, nmap): Overridden because different tags are possible.
- from_string(text): Generates an XML Object from a string of an XML file.
- from_xml(file[, encoding]): Generates an XML Object from an XML file.
- generate_sql_table(): Generate the CREATE TABLE statement for an SQL database.
- get_principal_tag(): Returns the principal tag with namespaces if it is present.
- get_term_class(element): Extract the Python class of the term from an XML element instance.
- is_lazy(): Check whether the data file loads its data as a Lazy Frame.
- object_to_element(tag[, prefix]): Generates an element using tag, adding namespace tag.
- parse(element, nmap): Parse an lxml.etree.Element into a concrete DataFile object.
- parse_kwargs(element, nmap): Parse an lxml.etree.Element into the DataFile parameters.
- read_file(content[, source_file, lazy, ...]): Read the content of the file specified in the files parameter (filename()).
- set_core_field(field): Set the Core field in an Extension DataFile.
- set_primary_key(primary_key): Set the primary key in an Extension DataFile to be referenced on the new SQL table.
- to_element(): Generate an XML Element instance.
- to_xml(): Generates text of an XML file.
- write_file([_no_interaction]): Write the content as text using the format information on this object.
- Entry
- close
- merge
- URI = 'http://rs.tdwg.org/dwc/terms/'
str: Uniform Resource Identifier (URI) for the term identifying the class of data.
- add_field(field: Field) -> None
Add a field to this Data File.
- Parameters:
- field: Field
A dwca.terms.field.Field object.
- Returns:
- None
- as_pandas(_no_interaction: bool = False) -> pd.DataFrame
Convert the information in this DataFile into a pandas.DataFrame.
- Returns:
- DataFrame
Information as a pandas.DataFrame.
- as_polars(_no_interaction: bool = False) -> pl.DataFrame
Convert the information in this DataFile into a polars.DataFrame.
- Returns:
- DataFrame
Information as a polars.DataFrame.
- classmethod check_principal_tag(tag: str, nmap: Dict) -> None
Overridden because different tags are possible.
- Parameters:
- tag: str
Actual tag.
- nmap: Dict
Namespace map (dictionary of prefix:uri).
- close() -> None
- property fields: List[str]
List[str]: List of terms of this data file.
- property filename: str
str: Filename of the Data File entity.
- generate_sql_table() -> str
Generate the CREATE TABLE statement for an SQL database.
- Returns:
- str
CREATE TABLE statement.
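To make the idea of a generated CREATE TABLE statement concrete, here is a minimal sketch that builds one from a list of column names and checks it against an in-memory SQLite database. The helper `create_table_sql`, the TEXT column type, and the quoting style are assumptions for illustration; the exact statement produced by generate_sql_table() is not shown in this reference.

```python
import sqlite3


def create_table_sql(table, columns, primary_key):
    """Hypothetical helper: build a CREATE TABLE statement, all columns as TEXT."""
    parts = ['"{}" TEXT'.format(column) for column in columns]
    parts.append('PRIMARY KEY ("{}")'.format(primary_key))
    return 'CREATE TABLE "{}" ({})'.format(table, ", ".join(parts))


statement = create_table_sql(
    "Occurrence", ["occurrenceID", "scientificName"], "occurrenceID"
)

# Run the statement to confirm it is valid SQL and inspect the columns.
connection = sqlite3.connect(":memory:")
connection.execute(statement)
column_names = [
    row[1] for row in connection.execute('PRAGMA table_info("Occurrence")')
]
connection.close()
```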
- classmethod get_term_class(element: Element) -> Type[Field]
Extract the Python class of the term from an XML element instance.
- Parameters:
- element: lxml.etree.Element
XML element instance.
- Returns:
- Type[Field]
The Python class representing the term name.
- property id: int
int: Column to be identified as primary key.
- property insert_sql: Generator[str, Tuple[Any]]
Generate the INSERT INTO SQL statement and the values to be inserted.
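One way to picture a generator of INSERT statements and their values is sketched below. It assumes that each yielded item is a (statement, values) pair with `?` placeholders, which is a guess based on the type annotation above; `iter_inserts` is a hypothetical helper, not the dwca implementation.

```python
import sqlite3
from typing import Any, Iterator, List, Sequence, Tuple


def iter_inserts(
    table: str, columns: Sequence[str], rows: List[Sequence[Any]]
) -> Iterator[Tuple[str, Tuple[Any, ...]]]:
    """Hypothetical helper: yield a parameterized INSERT and the values per row."""
    column_list = ", ".join('"{}"'.format(column) for column in columns)
    placeholders = ", ".join("?" for _ in columns)
    statement = 'INSERT INTO "{}" ({}) VALUES ({})'.format(
        table, column_list, placeholders
    )
    for row in rows:
        yield statement, tuple(row)


connection = sqlite3.connect(":memory:")
connection.execute('CREATE TABLE "Event" ("eventID" TEXT, "eventDate" TEXT)')
rows = [("e1", "2020-01-01"), ("e2", "2020-01-02")]
for statement, values in iter_inserts("Event", ["eventID", "eventDate"], rows):
    connection.execute(statement, values)
inserted = connection.execute('SELECT COUNT(*) FROM "Event"').fetchone()[0]
connection.close()
```

Passing the values separately from the statement lets the database driver handle quoting, which avoids building SQL by string concatenation of data.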
- is_lazy() -> bool
Check whether the data file loads its data as a Lazy Frame.
- Returns:
- bool
True when the file is read as a Lazy Frame, False otherwise.
- property name: str
str: The name of the field.
- property pandas: pd.DataFrame
pandas.DataFrame: Data of this DataFile as pandas.DataFrame.
- classmethod parse(element: Element, nmap: Dict) -> DataFile | None
Parse an lxml.etree.Element into a concrete DataFile object.
- Parameters:
- element: lxml.etree.Element
An XML Element.
- nmap: Dict
Dictionary of prefix:uri.
- Returns:
- DataFile
An instance of a concrete DataFile class.
- classmethod parse_kwargs(element: Element, nmap: Dict) -> Dict
Parse an lxml.etree.Element into the DataFile parameters.
- Parameters:
- element: lxml.etree.Element
An XML Element.
- nmap: Dict
Dictionary of prefix:uri.
- Returns:
- Dict
The parameters of any DataFile.
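As a rough picture of what turning a <core> element into constructor parameters involves, the sketch below reads a small meta.xml fragment with the standard library's xml.etree.ElementTree (the package itself uses lxml). The `core_kwargs` helper, the sample XML, and the mapping from XML attribute names to keyword names are assumptions for illustration, including the guess that _id holds the index of the <id> element.

```python
import xml.etree.ElementTree as ET

# Sample descriptor fragment written for this sketch.
META = """<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Event" encoding="UTF-8"
        fieldsTerminatedBy="," fieldsEnclosedBy="" ignoreHeaderLines="1">
    <files><location>event.txt</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/eventDate"/>
  </core>
</archive>"""

NS = {"dwc": "http://rs.tdwg.org/dwc/text/"}


def core_kwargs(element):
    """Hypothetical helper: map descriptor attributes to keyword arguments."""
    return {
        "_id": int(element.find("dwc:id", NS).get("index")),
        "files": element.find("dwc:files/dwc:location", NS).text,
        "encoding": element.get("encoding", "utf-8"),
        "fields_terminated_by": element.get("fieldsTerminatedBy", ","),
        "fields_enclosed_by": element.get("fieldsEnclosedBy", ""),
        "ignore_header_lines": int(element.get("ignoreHeaderLines", "0")),
    }


core = ET.fromstring(META).find("dwc:core", NS)
kwargs = core_kwargs(core)
```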
- property polars: pl.DataFrame
polars.DataFrame: Data of this DataFile as polars.DataFrame.
- read_file(content: str, source_file: BinaryIO = None, lazy: bool = False, _no_interaction: bool = False) -> None
Read the content of the file specified in the files parameter (filename()).
- Parameters:
- content: str
Content of the file.
- source_file: BinaryIO, optional
File to read from when lazy is True.
- lazy: bool, optional
Read the file in lazy evaluation mode. Default False.
- _no_interaction: bool, optional
Do not show the progress bar when the tqdm library is installed. Default False.
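The format parameters of a data file (field delimiter, enclosing character, header lines to skip) determine how its text content is split into rows. The sketch below shows that relationship with the standard csv module; `read_rows` is a hypothetical helper and does not reproduce how read_file() loads data into pandas or polars.

```python
import csv
import io


def read_rows(content, fields_terminated_by=",", fields_enclosed_by="",
              ignore_header_lines=0):
    """Hypothetical helper: split text content into rows of string cells."""
    if fields_enclosed_by:
        reader = csv.reader(
            io.StringIO(content),
            delimiter=fields_terminated_by,
            quotechar=fields_enclosed_by,
        )
    else:
        # No enclosing character: treat quote characters as ordinary text.
        reader = csv.reader(
            io.StringIO(content),
            delimiter=fields_terminated_by,
            quoting=csv.QUOTE_NONE,
        )
    # Drop the header lines, keep the data rows.
    return list(reader)[ignore_header_lines:]


content = 'id\tname\n1\t"Puma concolor"\n2\t"Felis catus"\n'
rows = read_rows(
    content,
    fields_terminated_by="\t",
    fields_enclosed_by='"',
    ignore_header_lines=1,
)
```

The csv reader recognizes line endings on its own, so this sketch has no counterpart to lines_terminated_by.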
- set_core_field(field: Field) -> None
Set the Core field in an Extension DataFile.
- Parameters:
- field: Field
Primary Key from the Core DataFile.
- Returns:
- None
- set_primary_key(primary_key: str) -> None
Set the primary key in an Extension DataFile to be referenced on the new SQL table.
- Parameters:
- primary_key: str
Name of the Core Data File.
- Returns:
- None
- property sql_table: str
str: Data file as CREATE TABLE SQL statement.
- to_element() -> Element
Generate an XML Element instance.
- Returns:
- lxml.etree.Element
XML Element instance.
- property uri: str
str: Uniform Resource Identifier (URI) for the term.
- write_file(_no_interaction: bool = False) -> str
Write the content as text using the format information on this object.
- Returns:
- str
Data File as plain text.
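Writing a data file as text is the reverse direction: rows are serialized using the line delimiter, field delimiter, and enclosing character. The following sketch illustrates this with the standard csv module; `write_rows` is a hypothetical helper, and quoting every field when an enclosing character is set is an assumption rather than documented behavior of write_file().

```python
import csv
import io


def write_rows(header, rows, lines_terminated_by="\n",
               fields_terminated_by=",", fields_enclosed_by=""):
    """Hypothetical helper: serialize a header and rows to delimited text."""
    buffer = io.StringIO()
    if fields_enclosed_by:
        writer = csv.writer(
            buffer,
            delimiter=fields_terminated_by,
            quotechar=fields_enclosed_by,
            quoting=csv.QUOTE_ALL,
            lineterminator=lines_terminated_by,
        )
    else:
        # Without an enclosing character, special characters are escaped.
        writer = csv.writer(
            buffer,
            delimiter=fields_terminated_by,
            quoting=csv.QUOTE_NONE,
            escapechar="\\",
            lineterminator=lines_terminated_by,
        )
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


text = write_rows(
    ["id", "name"],
    [[1, "Puma concolor"]],
    fields_terminated_by="\t",
    fields_enclosed_by='"',
)
```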
OutsideClass Class
This special class represents any data file that is not defined in the standard.
- class dwca.classes.outside_class.OutsideClass(_id: int, uri: str, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)
Bases: DataFile
Classes defined outside the Darwin Core specifications.
- Parameters:
- _id: int
Unique identifier for the core entity.
- uri: str
URI of the term.
- files: str
File location inside the archive (the zip file).
- data_file_type: DataFileType, optional
The Data File Type in the Darwin Core Archive, default DataFileType.CORE.
- fields: List[Field]
A list of the Field objects (columns) in the Core data entity.
- encoding: str, optional
Encoding of the file given in the files parameter, default is “utf-8”.
- lines_terminated_by: str, optional
Line delimiter of the file, default “\n”.
- fields_terminated_by: str, optional
Field (cell) delimiter of the file, default “,”.
- fields_enclosed_by: str, optional
Character used to enclose (mark the start and end of) each field, default empty “”.
- ignore_header_lines: int, optional
Number of header lines to ignore at the start of the document, default 0.
- Attributes:
- NAMESPACE_TAG
- fields (List[str]): List of terms of this data file.
- filename (str): Filename of the Data File entity.
- id (int): Column to be identified as primary key.
- insert_sql: Generate the INSERT INTO SQL statement and the values to be inserted.
- name (str): The name of the field.
- pandas (pandas.DataFrame): Data of this DataFile as pandas.DataFrame.
- polars (polars.DataFrame): Data of this DataFile as polars.DataFrame.
- sql_table (str): Data file as CREATE TABLE SQL statement.
- uri (str): Uniform Resource Identifier (URI) for the term.
Methods
- add_field(field): Add a field to this Data File.
- add_namespace(prefix, uri): Add a namespace to the XML object.
- as_pandas([_no_interaction]): Convert the information in this DataFile into a pandas.DataFrame.
- as_polars([_no_interaction]): Convert the information in this DataFile into a polars.DataFrame.
- check_principal_tag(tag, nmap): Overridden because different tags are possible.
- from_string(text): Generates an XML Object from a string of an XML file.
- from_xml(file[, encoding]): Generates an XML Object from an XML file.
- generate_sql_table(): Generate the CREATE TABLE statement for an SQL database.
- get_principal_tag(): Returns the principal tag with namespaces if it is present.
- get_term_class(element): Extract the Python class of the term from an XML element instance.
- is_lazy(): Check whether the data file loads its data as a Lazy Frame.
- object_to_element(tag[, prefix]): Generates an element using tag, adding namespace tag.
- parse(element, nmap): Parse an lxml.etree.Element into an OutsideClass object.
- parse_kwargs(element, nmap): Parse an lxml.etree.Element into the DataFile parameters.
- read_file(content[, source_file, lazy, ...]): Read the content of the file specified in the files parameter (filename()).
- set_core_field(field): Set the Core field in an Extension DataFile.
- set_primary_key(primary_key): Set the primary key in an Extension DataFile to be referenced on the new SQL table.
- to_element(): Generate an XML Element instance.
- to_xml(): Generates text of an XML file.
- write_file([_no_interaction]): Write the content as text using the format information on this object.
- Entry
- close
- merge
- classmethod get_term_class(element: Element) -> Type[Field]
Extract the Python class of the term from an XML element instance.
- Parameters:
- element: lxml.etree.Element
XML element instance.
- Returns:
- Type[Field]
The Python class representing the term name.
- classmethod parse(element: Element, nmap: Dict) -> OutsideClass | None
Parse an lxml.etree.Element into an OutsideClass object.
- Parameters:
- element: lxml.etree.Element
An XML Element.
- nmap: Dict
Dictionary of prefix:uri.
- Returns:
- OutsideClass
An instance of OutsideClass.
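The role of OutsideClass as a catch-all can be pictured as a lookup with a fallback: a rowType URI that matches a known class maps to that class, and anything else maps to OutsideClass. The sketch below works on class names only; `KNOWN_ROW_TYPES` (a partial table) and `class_name_for` are hypothetical, and how dwca actually selects the class is not shown in this reference.

```python
# Hypothetical illustration, not part of dwca: pick a class name for a rowType.
KNOWN_ROW_TYPES = {
    "http://rs.tdwg.org/dwc/terms/Event": "Event",
    "http://purl.org/dc/terms/Location": "Location",
    "http://rs.tdwg.org/chrono/terms/ChronometricAge": "ChronometricAge",
}


def class_name_for(row_type: str) -> str:
    """Return the matching class name, or OutsideClass for unknown URIs."""
    return KNOWN_ROW_TYPES.get(row_type, "OutsideClass")


known = class_name_for("http://rs.tdwg.org/dwc/terms/Event")
# A rowType from outside the Darwin Core class list falls through.
outside = class_name_for("http://rs.gbif.org/terms/1.0/Multimedia")
```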
ChronometricAge Class
This particular “class” term was extracted from http://rs.tdwg.org/dwc/doc/chrono/ (“Classes” section of https://chrono.tdwg.org/list/#31-index-by-term-name).
- class dwca.classes.chronometric_age.ChronometricAge(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)
Bases: DataFile
An approximation of a temporal position that is supported via evidence.
- Parameters:
- _id: int
Unique identifier for the core entity.
- files: str
File location inside the archive (the zip file).
- fields: List[Field]
A list of the Field objects (columns) in the Core data entity.
- data_file_type: DataFileType, optional
The Data File Type in the Darwin Core Archive, default DataFileType.CORE.
- encoding: str, optional
Encoding of the file given in the files parameter, default is “utf-8”.
- lines_terminated_by: str, optional
Line delimiter of the file, default “\n”.
- fields_terminated_by: str, optional
Field (cell) delimiter of the file, default “,”.
- fields_enclosed_by: str, optional
Character used to enclose (mark the start and end of) each field, default empty “”.
- ignore_header_lines: int, optional
Number of header lines to ignore at the start of the document, default 0.
- Attributes:
- NAMESPACE_TAG
- fields (List[str]): List of terms of this data file.
- filename (str): Filename of the Data File entity.
- id (int): Column to be identified as primary key.
- insert_sql: Generate the INSERT INTO SQL statement and the values to be inserted.
- name (str): The name of the field.
- pandas (pandas.DataFrame): Data of this DataFile as pandas.DataFrame.
- polars (polars.DataFrame): Data of this DataFile as polars.DataFrame.
- sql_table (str): Data file as CREATE TABLE SQL statement.
- uri (str): Uniform Resource Identifier (URI) for the term.
Methods
- add_field(field): Add a field to this Data File.
- add_namespace(prefix, uri): Add a namespace to the XML object.
- as_pandas([_no_interaction]): Convert the information in this DataFile into a pandas.DataFrame.
- as_polars([_no_interaction]): Convert the information in this DataFile into a polars.DataFrame.
- check_principal_tag(tag, nmap): Overridden because different tags are possible.
- from_string(text): Generates an XML Object from a string of an XML file.
- from_xml(file[, encoding]): Generates an XML Object from an XML file.
- generate_sql_table(): Generate the CREATE TABLE statement for an SQL database.
- get_principal_tag(): Returns the principal tag with namespaces if it is present.
- get_term_class(element): Extract the Python class of the term from an XML element instance.
- is_lazy(): Check whether the data file loads its data as a Lazy Frame.
- object_to_element(tag[, prefix]): Generates an element using tag, adding namespace tag.
- parse(element, nmap): Parse an lxml.etree.Element into a concrete DataFile object.
- parse_kwargs(element, nmap): Parse an lxml.etree.Element into the DataFile parameters.
- read_file(content[, source_file, lazy, ...]): Read the content of the file specified in the files parameter (filename()).
- set_core_field(field): Set the Core field in an Extension DataFile.
- set_primary_key(primary_key): Set the primary key in an Extension DataFile to be referenced on the new SQL table.
- to_element(): Generate an XML Element instance.
- to_xml(): Generates text of an XML file.
- write_file([_no_interaction]): Write the content as text using the format information on this object.
- Entry
- close
- merge
- URI = 'http://rs.tdwg.org/chrono/terms/ChronometricAge'
str: Uniform Resource Identifier (URI) for the term identifying the class of data.
Event Class
- class dwca.classes.event.Event(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)
Bases: DataFile
An action that occurs at some location during some time.
- Parameters:
- _id: int
Unique identifier for the core entity.
- files: str
File location inside the archive (the zip file).
- fields: List[Field]
A list of the Field objects (columns) in the Core data entity.
- data_file_type: DataFileType, optional
The Data File Type in the Darwin Core Archive, default DataFileType.CORE.
- encoding: str, optional
Encoding of the file given in the files parameter, default is “utf-8”.
- lines_terminated_by: str, optional
Line delimiter of the file, default “\n”.
- fields_terminated_by: str, optional
Field (cell) delimiter of the file, default “,”.
- fields_enclosed_by: str, optional
Character used to enclose (mark the start and end of) each field, default empty “”.
- ignore_header_lines: int, optional
Number of header lines to ignore at the start of the document, default 0.
- Attributes:
- NAMESPACE_TAG
- fields (List[str]): List of terms of this data file.
- filename (str): Filename of the Data File entity.
- id (int): Column to be identified as primary key.
- insert_sql: Generate the INSERT INTO SQL statement and the values to be inserted.
- name (str): The name of the field.
- pandas (pandas.DataFrame): Data of this DataFile as pandas.DataFrame.
- polars (polars.DataFrame): Data of this DataFile as polars.DataFrame.
- sql_table (str): Data file as CREATE TABLE SQL statement.
- uri (str): Uniform Resource Identifier (URI) for the term.
Methods
- add_field(field): Add a field to this Data File.
- add_namespace(prefix, uri): Add a namespace to the XML object.
- as_pandas([_no_interaction]): Convert the information in this DataFile into a pandas.DataFrame.
- as_polars([_no_interaction]): Convert the information in this DataFile into a polars.DataFrame.
- check_principal_tag(tag, nmap): Overridden because different tags are possible.
- from_string(text): Generates an XML Object from a string of an XML file.
- from_xml(file[, encoding]): Generates an XML Object from an XML file.
- generate_sql_table(): Generate the CREATE TABLE statement for an SQL database.
- get_principal_tag(): Returns the principal tag with namespaces if it is present.
- get_term_class(element): Extract the Python class of the term from an XML element instance.
- is_lazy(): Check whether the data file loads its data as a Lazy Frame.
- object_to_element(tag[, prefix]): Generates an element using tag, adding namespace tag.
- parse(element, nmap): Parse an lxml.etree.Element into a concrete DataFile object.
- parse_kwargs(element, nmap): Parse an lxml.etree.Element into the DataFile parameters.
- read_file(content[, source_file, lazy, ...]): Read the content of the file specified in the files parameter (filename()).
- set_core_field(field): Set the Core field in an Extension DataFile.
- set_primary_key(primary_key): Set the primary key in an Extension DataFile to be referenced on the new SQL table.
- to_element(): Generate an XML Element instance.
- to_xml(): Generates text of an XML file.
- write_file([_no_interaction]): Write the content as text using the format information on this object.
- Entry
- close
- merge
- URI = 'http://rs.tdwg.org/dwc/terms/Event'
str: Uniform Resource Identifier (URI) for the term identifying the class of data.
GeologicalContext Class
- class dwca.classes.geological_context.GeologicalContext(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)
Bases: DataFile
Geological information, such as stratigraphy, that qualifies a region or place.
- Parameters:
- _id: int
Unique identifier for the core entity.
- files: str
File location inside the archive (the zip file).
- fields: List[Field]
A list of the Field objects (columns) in the Core data entity.
- data_file_type: DataFileType, optional
The Data File Type in the Darwin Core Archive, default DataFileType.CORE.
- encoding: str, optional
Encoding of the file given in the files parameter, default is “utf-8”.
- lines_terminated_by: str, optional
Line delimiter of the file, default “\n”.
- fields_terminated_by: str, optional
Field (cell) delimiter of the file, default “,”.
- fields_enclosed_by: str, optional
Character used to enclose (mark the start and end of) each field, default empty “”.
- ignore_header_lines: int, optional
Number of header lines to ignore at the start of the document, default 0.
- Attributes:
- NAMESPACE_TAG
- fields (List[str]): List of terms of this data file.
- filename (str): Filename of the Data File entity.
- id (int): Column to be identified as primary key.
- insert_sql: Generate the INSERT INTO SQL statement and the values to be inserted.
- name (str): The name of the field.
- pandas (pandas.DataFrame): Data of this DataFile as pandas.DataFrame.
- polars (polars.DataFrame): Data of this DataFile as polars.DataFrame.
- sql_table (str): Data file as CREATE TABLE SQL statement.
- uri (str): Uniform Resource Identifier (URI) for the term.
Methods
- add_field(field): Add a field to this Data File.
- add_namespace(prefix, uri): Add a namespace to the XML object.
- as_pandas([_no_interaction]): Convert the information in this DataFile into a pandas.DataFrame.
- as_polars([_no_interaction]): Convert the information in this DataFile into a polars.DataFrame.
- check_principal_tag(tag, nmap): Overridden because different tags are possible.
- from_string(text): Generates an XML Object from a string of an XML file.
- from_xml(file[, encoding]): Generates an XML Object from an XML file.
- generate_sql_table(): Generate the CREATE TABLE statement for an SQL database.
- get_principal_tag(): Returns the principal tag with namespaces if it is present.
- get_term_class(element): Extract the Python class of the term from an XML element instance.
- is_lazy(): Check whether the data file loads its data as a Lazy Frame.
- object_to_element(tag[, prefix]): Generates an element using tag, adding namespace tag.
- parse(element, nmap): Parse an lxml.etree.Element into a concrete DataFile object.
- parse_kwargs(element, nmap): Parse an lxml.etree.Element into the DataFile parameters.
- read_file(content[, source_file, lazy, ...]): Read the content of the file specified in the files parameter (filename()).
- set_core_field(field): Set the Core field in an Extension DataFile.
- set_primary_key(primary_key): Set the primary key in an Extension DataFile to be referenced on the new SQL table.
- to_element(): Generate an XML Element instance.
- to_xml(): Generates text of an XML file.
- write_file([_no_interaction]): Write the content as text using the format information on this object.
- Entry
- close
- merge
- URI = 'http://rs.tdwg.org/dwc/terms/GeologicalContext'
str: Uniform Resource Identifier (URI) for the term identifying the class of data.
Identification Class
- class dwca.classes.identification.Identification(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)
Bases: DataFile
A taxonomic determination (e.g., the assignment to a dwc:Taxon).
- Parameters:
- _id: int
Unique identifier for the core entity.
- files: str
File location inside the archive (the zip file).
- fields: List[Field]
A list of the Field objects (columns) in the Core data entity.
- data_file_type: DataFileType, optional
The Data File Type in the Darwin Core Archive, default DataFileType.CORE.
- encoding: str, optional
Encoding of the file given in the files parameter, default is “utf-8”.
- lines_terminated_by: str, optional
Line delimiter of the file, default “\n”.
- fields_terminated_by: str, optional
Field (cell) delimiter of the file, default “,”.
- fields_enclosed_by: str, optional
Character used to enclose (mark the start and end of) each field, default empty “”.
- ignore_header_lines: int, optional
Number of header lines to ignore at the start of the document, default 0.
- Attributes:
- NAMESPACE_TAG
- fields (List[str]): List of terms of this data file.
- filename (str): Filename of the Data File entity.
- id (int): Column to be identified as primary key.
- insert_sql: Generate the INSERT INTO SQL statement and the values to be inserted.
- name (str): The name of the field.
- pandas (pandas.DataFrame): Data of this DataFile as pandas.DataFrame.
- polars (polars.DataFrame): Data of this DataFile as polars.DataFrame.
- sql_table (str): Data file as CREATE TABLE SQL statement.
- uri (str): Uniform Resource Identifier (URI) for the term.
Methods
- add_field(field): Add a field to this Data File.
- add_namespace(prefix, uri): Add a namespace to the XML object.
- as_pandas([_no_interaction]): Convert the information in this DataFile into a pandas.DataFrame.
- as_polars([_no_interaction]): Convert the information in this DataFile into a polars.DataFrame.
- check_principal_tag(tag, nmap): Overridden because different tags are possible.
- from_string(text): Generates an XML Object from a string of an XML file.
- from_xml(file[, encoding]): Generates an XML Object from an XML file.
- generate_sql_table(): Generate the CREATE TABLE statement for an SQL database.
- get_principal_tag(): Returns the principal tag with namespaces if it is present.
- get_term_class(element): Extract the Python class of the term from an XML element instance.
- is_lazy(): Check whether the data file loads its data as a Lazy Frame.
- object_to_element(tag[, prefix]): Generates an element using tag, adding namespace tag.
- parse(element, nmap): Parse an lxml.etree.Element into a concrete DataFile object.
- parse_kwargs(element, nmap): Parse an lxml.etree.Element into the DataFile parameters.
- read_file(content[, source_file, lazy, ...]): Read the content of the file specified in the files parameter (filename()).
- set_core_field(field): Set the Core field in an Extension DataFile.
- set_primary_key(primary_key): Set the primary key in an Extension DataFile to be referenced on the new SQL table.
- to_element(): Generate an XML Element instance.
- to_xml(): Generates text of an XML file.
- write_file([_no_interaction]): Write the content as text using the format information on this object.
- Entry
- close
- merge
- URI = 'http://rs.tdwg.org/dwc/terms/Identification'
str: Uniform Resource Identifier (URI) for the term identifying the class of data.
Location Class
- class dwca.classes.location.Location(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)
Bases: DataFile
A spatial region or named place.
- Parameters:
- _id: int
Unique identifier for the core entity.
- files: str
File location inside the archive (the zip file).
- fields: List[Field]
A list of the Field objects (columns) in the Core data entity.
- data_file_type: DataFileType, optional
The Data File Type in the Darwin Core Archive, default DataFileType.CORE.
- encoding: str, optional
Encoding of the file given in the files parameter, default is “utf-8”.
- lines_terminated_by: str, optional
Line delimiter of the file, default “\n”.
- fields_terminated_by: str, optional
Field (cell) delimiter of the file, default “,”.
- fields_enclosed_by: str, optional
Character used to enclose (mark the start and end of) each field, default empty “”.
- ignore_header_lines: int, optional
Number of header lines to ignore at the start of the document, default 0.
- Attributes:
- NAMESPACE_TAG
- fields (List[str]): List of terms of this data file.
- filename (str): Filename of the Data File entity.
- id (int): Column to be identified as primary key.
- insert_sql: Generate the INSERT INTO SQL statement and the values to be inserted.
- name (str): The name of the field.
- pandas (pandas.DataFrame): Data of this DataFile as pandas.DataFrame.
- polars (polars.DataFrame): Data of this DataFile as polars.DataFrame.
- sql_table (str): Data file as CREATE TABLE SQL statement.
- uri (str): Uniform Resource Identifier (URI) for the term.
Methods
- add_field(field): Add a field to this Data File.
- add_namespace(prefix, uri): Add a namespace to the XML object.
- as_pandas([_no_interaction]): Convert the information in this DataFile into a pandas.DataFrame.
- as_polars([_no_interaction]): Convert the information in this DataFile into a polars.DataFrame.
- check_principal_tag(tag, nmap): Overridden because different tags are possible.
- from_string(text): Generates an XML Object from a string of an XML file.
- from_xml(file[, encoding]): Generates an XML Object from an XML file.
- generate_sql_table(): Generate the CREATE TABLE statement for an SQL database.
- get_principal_tag(): Returns the principal tag with namespaces if it is present.
- get_term_class(element): Extract the Python class of the term from an XML element instance.
- is_lazy(): Check whether the data file loads its data as a Lazy Frame.
- object_to_element(tag[, prefix]): Generates an element using tag, adding namespace tag.
- parse(element, nmap): Parse an lxml.etree.Element into a concrete DataFile object.
- parse_kwargs(element, nmap): Parse an lxml.etree.Element into the DataFile parameters.
- read_file(content[, source_file, lazy, ...]): Read the content of the file specified in the files parameter (filename()).
- set_core_field(field): Set the Core field in an Extension DataFile.
- set_primary_key(primary_key): Set the primary key in an Extension DataFile to be referenced on the new SQL table.
- to_element(): Generate an XML Element instance.
- to_xml(): Generates text of an XML file.
- write_file([_no_interaction]): Write the content as text using the format information on this object.
- Entry
- close
- merge
- URI = 'http://purl.org/dc/terms/Location'
str: Uniform Resource Identifier (URI) for the term identifying the class of data.
MaterialEntity Class
- class dwca.classes.material_entity.MaterialEntity(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)
Bases: DataFile
An entity that can be identified, exists for some period of time, and consists of physical matter while it exists.
- Parameters:
- _id: int
Unique identifier for the core entity.
- files: str
File location inside the archive (the zip file).
- fields: List[Field]
A list of the Field objects (columns) in the Core data entity.
- data_file_type: DataFileType, optional
The Data File Type in the Darwin Core Archive, default DataFileType.CORE.
- encoding: str, optional
Encoding of the file given in the files parameter, default is “utf-8”.
- lines_terminated_by: str, optional
Line delimiter of the file, default “\n”.
- fields_terminated_by: str, optional
Field (cell) delimiter of the file, default “,”.
- fields_enclosed_by: str, optional
Character used to enclose (mark the start and end of) each field, default empty “”.
- ignore_header_lines: int, optional
Number of header lines to ignore at the start of the document, default 0.
- Attributes:
- NAMESPACE_TAG
fieldsList[str]: List of terms of this data file.
filenamestr: Filename of the Data File entity.
idint: Column to be identified as primary key
insert_sqlGenerate the INSERT INTO sql statement and the values to be inserted.
namestr: The name of the field.
pandaspandas.DataFrame: Data of this DataFile as pandas.DataFrame.
polarspolars.DataFrame: Data of this DataFile as polars.DataFrame.
sql_tablestr: Data file as CREATE TABLE sql statement.
uristr: Unified Resource Identifier (URI) for the term.
Methods
add_field(field)Add a field to this Data File.
add_namespace(prefix, uri)Add a namespace to the XML object.
as_pandas([_no_interaction])Convert information in this DataFile in a pandas.DataFrame.
as_polars([_no_interaction])Convert information in this DataFile in a polars DataFrame.
check_principal_tag(tag, nmap)Overwrite due to different possible tags.
from_string(text)Generates XML Object from a string of an XML file.
from_xml(file[, encoding])Generates an XML Object from an XML file.
generate_sql_table()Generate the CREATE TABLE statement for SQL database.
get_principal_tag()Returns the principal tag with namespaces if it is present.
get_term_class(element)Extract the Python class term from an XML element instance.
is_lazy()Check if the data file loads its data as a Lazy Frame.
object_to_element(tag[, prefix])Generates an element using tag, adding namespace tag.
parse(element, nmap)Parse an lxml.etree.Element into a concrete DataFile object.
parse_kwargs(element, nmap)Parse an lxml.etree.Element into the DataFile parameters.
read_file(content[, source_file, lazy, ...])Read the content of the file specified in the files parameter (filename()).
set_core_field(field)Set the Core field in an Extension DataFile.
set_primary_key(primary_key)Set the primary key in an Extension DataFile to be referenced on the new SQL table.
to_element()Generate an XML Element instance.
to_xml()Generates text of an XML file.
write_file([_no_interaction])Write the content as a text using format information on this object.
Entry
close
merge
- URI = 'http://rs.tdwg.org/dwc/terms/MaterialEntity'#
str: Uniform Resource Identifier (URI) for the term identifying the class of data.
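The file-format parameters documented for this class (lines_terminated_by, fields_terminated_by, fields_enclosed_by and ignore_header_lines) mirror the usual settings of a delimited text file. The following is a minimal sketch of how such settings could be applied with the standard library csv module. It is not the library's own reader; the helper name read_rows is hypothetical, and ignore_header_lines is assumed here to be a count of lines to skip.

```python
import csv


def read_rows(text, fields_terminated_by=",", fields_enclosed_by="",
              lines_terminated_by="\n", ignore_header_lines=0):
    """Parse delimited text into rows (hypothetical helper, not part of dwca)."""
    # Split records on the configured terminator and drop empty records.
    # Note: this simple split does not support line breaks inside enclosed fields.
    lines = [line for line in text.split(lines_terminated_by) if line != ""]
    # Assumption: ignore_header_lines is the number of leading lines to skip.
    lines = lines[ignore_header_lines:]
    if fields_enclosed_by:
        reader = csv.reader(lines, delimiter=fields_terminated_by,
                            quotechar=fields_enclosed_by)
    else:
        # An empty enclosure means fields are never quoted.
        reader = csv.reader(lines, delimiter=fields_terminated_by,
                            quoting=csv.QUOTE_NONE)
    return list(reader)


sample = "id\tname\r\n1\t'Puma concolor'\r\n2\t'Felis catus'\r\n"
rows = read_rows(sample, fields_terminated_by="\t", fields_enclosed_by="'",
                 lines_terminated_by="\r\n", ignore_header_lines=1)
```

In this sketch the header line is skipped and the enclosing quotes are removed from each field, so rows holds two records of two values each.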
MaterialSample Class#
- class dwca.classes.material_sample.MaterialSample(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)#
Bases: DataFile
A material entity that represents an entity of interest in whole or in part.
- Parameters:
- _idint
Unique identifier for the core entity.
- filesstr
File location, in the archive, this is inside the zip file.
- fieldsList[Field]
A list of the Field (columns) in the Core data entity.
- data_file_type: DataFileType
The Data File Type in the Darwin Core Archive.
- encodingstr, optional
Encoding of the file location (files parameter), default is “utf-8”.
- lines_terminated_bystr, optional
Line terminator used in the file, default “\n”.
- fields_terminated_bystr, optional
Delimiter between fields (cells) in the file, default “,”.
- fields_enclosed_bystr, optional
Specifies the character used to enclose (mark the start and end of) each field, default empty “”.
- ignore_header_linesint, optional
Header lines to ignore at the start of the file, either a single line or several, default 0 (first line).
- Attributes:
- NAMESPACE_TAG
fieldsList[str]: List of terms of this data file.
filenamestr: Filename of the Data File entity.
idint: Column to be identified as primary key
insert_sqlGenerate the INSERT INTO sql statement and the values to be inserted.
namestr: The name of the field.
pandaspandas.DataFrame: Data of this DataFile as pandas.DataFrame.
polarspolars.DataFrame: Data of this DataFile as polars.DataFrame.
sql_tablestr: Data file as CREATE TABLE sql statement.
uristr: Uniform Resource Identifier (URI) for the term.
Methods
add_field(field)Add a field to this Data File.
add_namespace(prefix, uri)Add a namespace to the XML object.
as_pandas([_no_interaction])Convert the information in this DataFile into a pandas.DataFrame.
as_polars([_no_interaction])Convert the information in this DataFile into a polars DataFrame.
check_principal_tag(tag, nmap)Overwrite due to different possible tags.
from_string(text)Generates XML Object from a string of an XML file.
from_xml(file[, encoding])Generates an XML Object from an XML file.
generate_sql_table()Generate the CREATE TABLE statement for SQL database.
get_principal_tag()Returns the principal tag with namespaces if it is present.
get_term_class(element)Extract the Python class term from an XML element instance.
is_lazy()Check if the data file loads its data as a Lazy Frame.
object_to_element(tag[, prefix])Generates an element using tag, adding namespace tag.
parse(element, nmap)Parse an lxml.etree.Element into a concrete DataFile object.
parse_kwargs(element, nmap)Parse an lxml.etree.Element into the DataFile parameters.
read_file(content[, source_file, lazy, ...])Read the content of the file specified in the files parameter (filename()).
set_core_field(field)Set the Core field in an Extension DataFile.
set_primary_key(primary_key)Set the primary key in an Extension DataFile to be referenced on the new SQL table.
to_element()Generate an XML Element instance.
to_xml()Generates text of an XML file.
write_file([_no_interaction])Write the content as a text using format information on this object.
Entry
close
merge
- URI = 'http://rs.tdwg.org/dwc/terms/MaterialSample'#
str: Uniform Resource Identifier (URI) for the term identifying the class of data.
MeasurementOrFact Class#
- class dwca.classes.measurement_or_fact.MeasurementOrFact(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)#
Bases: DataFile
A measurement of or fact about a Resource.
- Parameters:
- _idint
Unique identifier for the core entity.
- filesstr
File location, in the archive, this is inside the zip file.
- fieldsList[Field]
A list of the Field (columns) in the Core data entity.
- data_file_type: DataFileType
The Data File Type in the Darwin Core Archive.
- encodingstr, optional
Encoding of the file location (files parameter), default is “utf-8”.
- lines_terminated_bystr, optional
Line terminator used in the file, default “\n”.
- fields_terminated_bystr, optional
Delimiter between fields (cells) in the file, default “,”.
- fields_enclosed_bystr, optional
Specifies the character used to enclose (mark the start and end of) each field, default empty “”.
- ignore_header_linesint, optional
Header lines to ignore at the start of the file, either a single line or several, default 0 (first line).
- Attributes:
- NAMESPACE_TAG
fieldsList[str]: List of terms of this data file.
filenamestr: Filename of the Data File entity.
idint: Column to be identified as primary key
insert_sqlGenerate the INSERT INTO sql statement and the values to be inserted.
namestr: The name of the field.
pandaspandas.DataFrame: Data of this DataFile as pandas.DataFrame.
polarspolars.DataFrame: Data of this DataFile as polars.DataFrame.
sql_tablestr: Data file as CREATE TABLE sql statement.
uristr: Uniform Resource Identifier (URI) for the term.
Methods
add_field(field)Add a field to this Data File.
add_namespace(prefix, uri)Add a namespace to the XML object.
as_pandas([_no_interaction])Convert the information in this DataFile into a pandas.DataFrame.
as_polars([_no_interaction])Convert the information in this DataFile into a polars DataFrame.
check_principal_tag(tag, nmap)Overwrite due to different possible tags.
from_string(text)Generates XML Object from a string of an XML file.
from_xml(file[, encoding])Generates an XML Object from an XML file.
generate_sql_table()Generate the CREATE TABLE statement for SQL database.
get_principal_tag()Returns the principal tag with namespaces if it is present.
get_term_class(element)Extract the Python class term from an XML element instance.
is_lazy()Check if the data file loads its data as a Lazy Frame.
object_to_element(tag[, prefix])Generates an element using tag, adding namespace tag.
parse(element, nmap)Parse an lxml.etree.Element into a concrete DataFile object.
parse_kwargs(element, nmap)Parse an lxml.etree.Element into the DataFile parameters.
read_file(content[, source_file, lazy, ...])Read the content of the file specified in the files parameter (filename()).
set_core_field(field)Set the Core field in an Extension DataFile.
set_primary_key(primary_key)Set the primary key in an Extension DataFile to be referenced on the new SQL table.
to_element()Generate an XML Element instance.
to_xml()Generates text of an XML file.
write_file([_no_interaction])Write the content as a text using format information on this object.
Entry
close
merge
- URI = 'http://rs.tdwg.org/dwc/terms/MeasurementOrFact'#
str: Uniform Resource Identifier (URI) for the term identifying the class of data.
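The set_core_field and set_primary_key methods listed above concern how an Extension DataFile refers back to the core when SQL tables are generated. The statements the library actually produces are not reproduced in this reference, so the following is only a sketch of the underlying idea using the standard library sqlite3 module. The table names, the coreid column and the sample values are all hypothetical.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Core table: one row per occurrence, identified by a primary key.
conn.execute(
    'CREATE TABLE occurrence ('
    '"occurrenceID" TEXT PRIMARY KEY, "scientificName" TEXT)'
)
# Extension table: each measurement points back to a core row (hypothetical "coreid").
conn.execute(
    'CREATE TABLE measurement_or_fact ('
    '"coreid" TEXT REFERENCES occurrence("occurrenceID"), '
    '"measurementType" TEXT, "measurementValue" TEXT)'
)

conn.execute("INSERT INTO occurrence VALUES (?, ?)", ("occ-1", "Puma concolor"))
conn.executemany(
    "INSERT INTO measurement_or_fact VALUES (?, ?, ?)",
    [("occ-1", "body mass", "62"), ("occ-1", "tail length", "75")],
)

# Join extension rows to their core row through the shared identifier.
joined = conn.execute(
    'SELECT o."scientificName", m."measurementType", m."measurementValue" '
    'FROM occurrence AS o '
    'JOIN measurement_or_fact AS m ON m."coreid" = o."occurrenceID" '
    'ORDER BY m."measurementType"'
).fetchall()
```

Several extension rows may reference the same core row, which is why the link lives on the extension side rather than on the core table.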
Occurrence Class#
- class dwca.classes.occurrence.Occurrence(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)#
Bases: DataFile
An existence of an organism at a particular place at a particular time.
- Parameters:
- _idint
Unique identifier for the core entity.
- filesstr
File location, in the archive, this is inside the zip file.
- fieldsList[Field]
A list of the Field (columns) in the Core data entity.
- data_file_type: DataFileType
The Data File Type in the Darwin Core Archive.
- encodingstr, optional
Encoding of the file location (files parameter), default is “utf-8”.
- lines_terminated_bystr, optional
Line terminator used in the file, default “\n”.
- fields_terminated_bystr, optional
Delimiter between fields (cells) in the file, default “,”.
- fields_enclosed_bystr, optional
Specifies the character used to enclose (mark the start and end of) each field, default empty “”.
- ignore_header_linesint, optional
Header lines to ignore at the start of the file, either a single line or several, default 0 (first line).
- Attributes:
- NAMESPACE_TAG
fieldsList[str]: List of terms of this data file.
filenamestr: Filename of the Data File entity.
idint: Column to be identified as primary key
insert_sqlGenerate the INSERT INTO sql statement and the values to be inserted.
namestr: The name of the field.
pandaspandas.DataFrame: Data of this DataFile as pandas.DataFrame.
polarspolars.DataFrame: Data of this DataFile as polars.DataFrame.
sql_tablestr: Data file as CREATE TABLE sql statement.
uristr: Uniform Resource Identifier (URI) for the term.
Methods
add_field(field)Add a field to this Data File.
add_namespace(prefix, uri)Add a namespace to the XML object.
as_pandas([_no_interaction])Convert the information in this DataFile into a pandas.DataFrame.
as_polars([_no_interaction])Convert the information in this DataFile into a polars DataFrame.
check_principal_tag(tag, nmap)Overwrite due to different possible tags.
from_string(text)Generates XML Object from a string of an XML file.
from_xml(file[, encoding])Generates an XML Object from an XML file.
generate_sql_table()Generate the CREATE TABLE statement for SQL database.
get_principal_tag()Returns the principal tag with namespaces if it is present.
get_term_class(element)Extract the Python class term from an XML element instance.
is_lazy()Check if the data file loads its data as a Lazy Frame.
object_to_element(tag[, prefix])Generates an element using tag, adding namespace tag.
parse(element, nmap)Parse an lxml.etree.Element into a concrete DataFile object.
parse_kwargs(element, nmap)Parse an lxml.etree.Element into the DataFile parameters.
read_file(content[, source_file, lazy, ...])Read the content of the file specified in the files parameter (filename()).
set_core_field(field)Set the Core field in an Extension DataFile.
set_primary_key(primary_key)Set the primary key in an Extension DataFile to be referenced on the new SQL table.
to_element()Generate an XML Element instance.
to_xml()Generates text of an XML file.
write_file([_no_interaction])Write the content as a text using format information on this object.
Entry
close
merge
- URI = 'http://rs.tdwg.org/dwc/terms/Occurrence'#
str: Uniform Resource Identifier (URI) for the term identifying the class of data.
Organism Class#
- class dwca.classes.organism.Organism(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)#
Bases: DataFile
A particular organism or defined group of organisms considered to be taxonomically homogeneous.
- Parameters:
- _idint
Unique identifier for the core entity.
- filesstr
File location, in the archive, this is inside the zip file.
- fieldsList[Field]
A list of the Field (columns) in the Core data entity.
- data_file_type: DataFileType
The Data File Type in the Darwin Core Archive.
- encodingstr, optional
Encoding of the file location (files parameter), default is “utf-8”.
- lines_terminated_bystr, optional
Line terminator used in the file, default “\n”.
- fields_terminated_bystr, optional
Delimiter between fields (cells) in the file, default “,”.
- fields_enclosed_bystr, optional
Specifies the character used to enclose (mark the start and end of) each field, default empty “”.
- ignore_header_linesint, optional
Header lines to ignore at the start of the file, either a single line or several, default 0 (first line).
- Attributes:
- NAMESPACE_TAG
fieldsList[str]: List of terms of this data file.
filenamestr: Filename of the Data File entity.
idint: Column to be identified as primary key
insert_sqlGenerate the INSERT INTO sql statement and the values to be inserted.
namestr: The name of the field.
pandaspandas.DataFrame: Data of this DataFile as pandas.DataFrame.
polarspolars.DataFrame: Data of this DataFile as polars.DataFrame.
sql_tablestr: Data file as CREATE TABLE sql statement.
uristr: Uniform Resource Identifier (URI) for the term.
Methods
add_field(field)Add a field to this Data File.
add_namespace(prefix, uri)Add a namespace to the XML object.
as_pandas([_no_interaction])Convert the information in this DataFile into a pandas.DataFrame.
as_polars([_no_interaction])Convert the information in this DataFile into a polars DataFrame.
check_principal_tag(tag, nmap)Overwrite due to different possible tags.
from_string(text)Generates XML Object from a string of an XML file.
from_xml(file[, encoding])Generates an XML Object from an XML file.
generate_sql_table()Generate the CREATE TABLE statement for SQL database.
get_principal_tag()Returns the principal tag with namespaces if it is present.
get_term_class(element)Extract the Python class term from an XML element instance.
is_lazy()Check if the data file loads its data as a Lazy Frame.
object_to_element(tag[, prefix])Generates an element using tag, adding namespace tag.
parse(element, nmap)Parse an lxml.etree.Element into a concrete DataFile object.
parse_kwargs(element, nmap)Parse an lxml.etree.Element into the DataFile parameters.
read_file(content[, source_file, lazy, ...])Read the content of the file specified in the files parameter (filename()).
set_core_field(field)Set the Core field in an Extension DataFile.
set_primary_key(primary_key)Set the primary key in an Extension DataFile to be referenced on the new SQL table.
to_element()Generate an XML Element instance.
to_xml()Generates text of an XML file.
write_file([_no_interaction])Write the content as a text using format information on this object.
Entry
close
merge
- URI = 'http://rs.tdwg.org/dwc/terms/Organism'#
str: Uniform Resource Identifier (URI) for the term identifying the class of data.
ResourceRelationship Class#
- class dwca.classes.resource_relationship.ResourceRelationship(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)#
Bases: DataFile
A relationship of one Resource to another.
- Parameters:
- _idint
Unique identifier for the core entity.
- filesstr
File location, in the archive, this is inside the zip file.
- fieldsList[Field]
A list of the Field (columns) in the Core data entity.
- data_file_type: DataFileType
The Data File Type in the Darwin Core Archive.
- encodingstr, optional
Encoding of the file location (files parameter), default is “utf-8”.
- lines_terminated_bystr, optional
Line terminator used in the file, default “\n”.
- fields_terminated_bystr, optional
Delimiter between fields (cells) in the file, default “,”.
- fields_enclosed_bystr, optional
Specifies the character used to enclose (mark the start and end of) each field, default empty “”.
- ignore_header_linesint, optional
Header lines to ignore at the start of the file, either a single line or several, default 0 (first line).
- Attributes:
- NAMESPACE_TAG
fieldsList[str]: List of terms of this data file.
filenamestr: Filename of the Data File entity.
idint: Column to be identified as primary key
insert_sqlGenerate the INSERT INTO sql statement and the values to be inserted.
namestr: The name of the field.
pandaspandas.DataFrame: Data of this DataFile as pandas.DataFrame.
polarspolars.DataFrame: Data of this DataFile as polars.DataFrame.
sql_tablestr: Data file as CREATE TABLE sql statement.
uristr: Uniform Resource Identifier (URI) for the term.
Methods
add_field(field)Add a field to this Data File.
add_namespace(prefix, uri)Add a namespace to the XML object.
as_pandas([_no_interaction])Convert the information in this DataFile into a pandas.DataFrame.
as_polars([_no_interaction])Convert the information in this DataFile into a polars DataFrame.
check_principal_tag(tag, nmap)Overwrite due to different possible tags.
from_string(text)Generates XML Object from a string of an XML file.
from_xml(file[, encoding])Generates an XML Object from an XML file.
generate_sql_table()Generate the CREATE TABLE statement for SQL database.
get_principal_tag()Returns the principal tag with namespaces if it is present.
get_term_class(element)Extract the Python class term from an XML element instance.
is_lazy()Check if the data file loads its data as a Lazy Frame.
object_to_element(tag[, prefix])Generates an element using tag, adding namespace tag.
parse(element, nmap)Parse an lxml.etree.Element into a concrete DataFile object.
parse_kwargs(element, nmap)Parse an lxml.etree.Element into the DataFile parameters.
read_file(content[, source_file, lazy, ...])Read the content of the file specified in the files parameter (filename()).
set_core_field(field)Set the Core field in an Extension DataFile.
set_primary_key(primary_key)Set the primary key in an Extension DataFile to be referenced on the new SQL table.
to_element()Generate an XML Element instance.
to_xml()Generates text of an XML file.
write_file([_no_interaction])Write the content as a text using format information on this object.
Entry
close
merge
- URI = 'http://rs.tdwg.org/dwc/terms/ResourceRelationship'#
str: Uniform Resource Identifier (URI) for the term identifying the class of data.
Taxon Class#
- class dwca.classes.taxon.Taxon(_id: int, files: str, fields: List[Field], data_file_type: DataFileType = DataFileType.CORE, encoding: str = 'utf-8', lines_terminated_by: str = '\n', fields_terminated_by: str = ',', fields_enclosed_by: str = '', ignore_header_lines: int = 0)#
Bases: DataFile
A group of organisms considered by taxonomists to form a homogeneous unit.
- Parameters:
- _idint
Unique identifier for the core entity.
- filesstr
File location, in the archive, this is inside the zip file.
- fieldsList[Field]
A list of the Field (columns) in the Core data entity.
- data_file_type: DataFileType
The Data File Type in the Darwin Core Archive.
- encodingstr, optional
Encoding of the file location (files parameter), default is “utf-8”.
- lines_terminated_bystr, optional
Line terminator used in the file, default “\n”.
- fields_terminated_bystr, optional
Delimiter between fields (cells) in the file, default “,”.
- fields_enclosed_bystr, optional
Specifies the character used to enclose (mark the start and end of) each field, default empty “”.
- ignore_header_linesint, optional
Header lines to ignore at the start of the file, either a single line or several, default 0 (first line).
- Attributes:
- NAMESPACE_TAG
fieldsList[str]: List of terms of this data file.
filenamestr: Filename of the Data File entity.
idint: Column to be identified as primary key
insert_sqlGenerate the INSERT INTO sql statement and the values to be inserted.
namestr: The name of the field.
pandaspandas.DataFrame: Data of this DataFile as pandas.DataFrame.
polarspolars.DataFrame: Data of this DataFile as polars.DataFrame.
sql_tablestr: Data file as CREATE TABLE sql statement.
uristr: Uniform Resource Identifier (URI) for the term.
Methods
add_field(field)Add a field to this Data File.
add_namespace(prefix, uri)Add a namespace to the XML object.
all_synonyms(taxa_id[, get_names])Get a list of all valid names of a list of taxa.
as_pandas([_no_interaction])Convert the information in this DataFile into a pandas.DataFrame.
as_polars([_no_interaction])Convert the information in this DataFile into a polars DataFrame.
check_principal_tag(tag, nmap)Overwrite due to different possible tags.
filter_by_class(classes[, fuzzy_threshold])Filter data by a valid class.
filter_by_family(families[, fuzzy_threshold])Filter data by a valid family.
filter_by_genus(genera[, fuzzy_threshold])Filter data by a valid genus.
filter_by_kingdom(kingdoms[, fuzzy_threshold])Filter data by a valid kingdom.
filter_by_order(orders[, fuzzy_threshold])Filter data by a valid order.
filter_by_phylum(phyla[, fuzzy_threshold])Filter data by a valid phylum.
filter_by_species(species[, fuzzy_threshold])Filter data by species or any taxonomic rank below it (subspecies, variety, form, etc.).
from_string(text)Generates XML Object from a string of an XML file.
from_xml(file[, encoding])Generates an XML Object from an XML file.
generate_sql_table()Generate the CREATE TABLE statement for SQL database.
get_parents(taxa_id)Get the taxa ids of the parents of the taxa ids provided.
get_principal_tag()Returns the principal tag with namespaces if it is present.
get_term_class(element)Extract the Python class term from an XML element instance.
is_lazy()Check if the data file loads its data as a Lazy Frame.
object_to_element(tag[, prefix])Generates an element using tag, adding namespace tag.
parse(element, nmap)Parse an lxml.etree.Element into a concrete DataFile object.
parse_kwargs(element, nmap)Parse an lxml.etree.Element into the DataFile parameters.
read_file(content[, source_file, lazy, ...])Read the content of the file specified in the files parameter (filename()).
set_core_field(field)Set the Core field in an Extension DataFile.
set_primary_key(primary_key)Set the primary key in an Extension DataFile to be referenced on the new SQL table.
to_element()Generate an XML Element instance.
to_xml()Generates text of an XML file.
write_file([_no_interaction])Write the content as a text using format information on this object.
Entry
close
merge
- URI = 'http://rs.tdwg.org/dwc/terms/Taxon'#
str: Uniform Resource Identifier (URI) for the term identifying the class of data.
- all_synonyms(taxa_id: Iterable[str], get_names: bool = False) List[str]#
Get a list of all valid names of a list of taxa.
- Parameters:
- taxa_idIterable[str]
A list (or iterable) of dwca.terms.taxon.TaxonID values.
- get_namesbool
Whether to get dwca.terms.taxon.ScientificName or dwca.terms.taxon.TaxonID.
- Returns:
- List[str]
A list of dwca.terms.taxon.TaxonID.
- filter_by_class(classes: List[str], fuzzy_threshold: float = -1) None#
Filter data by a valid class.
- Parameters:
- classesList[str]
Class names to filter data.
- fuzzy_thresholdfloat, optional
If given any value > 0 it will use Levenshtein Distance with that threshold instead of exact match.
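The fuzzy_threshold parameter shared by the filter_by_ methods switches from exact matching to Levenshtein distance. How the library scales the threshold is not stated here, so the sketch below assumes it is a maximum edit distance; the helper names levenshtein and fuzzy_filter are hypothetical and do not belong to the package.

```python
from typing import Iterable, List


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def fuzzy_filter(values: Iterable[str], targets: List[str],
                 threshold: float = -1) -> List[str]:
    """Keep values matching a target exactly, or within `threshold` edits if > 0."""
    if threshold <= 0:
        wanted = set(targets)
        return [value for value in values if value in wanted]
    # Assumption: the threshold is interpreted as a maximum edit distance.
    return [value for value in values
            if any(levenshtein(value, target) <= threshold for target in targets)]


names = ["Mammalia", "Mamalia", "Aves"]
exact = fuzzy_filter(names, ["Mammalia"])
fuzzy = fuzzy_filter(names, ["Mammalia"], threshold=1)
```

With the default threshold only the exact spelling is kept, while a threshold of 1 also keeps the misspelled "Mamalia".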
- filter_by_family(families: List[str], fuzzy_threshold: float = -1) None#
Filter data by a valid family.
- Parameters:
- familiesList[str]
Family names to filter data.
- fuzzy_thresholdfloat, optional
If given any value > 0 it will use Levenshtein Distance with that threshold instead of exact match.
- filter_by_genus(genera: List[str], fuzzy_threshold: float = -1) None#
Filter data by a valid genus.
- Parameters:
- generaList[str]
Genus names to filter data.
- fuzzy_thresholdfloat, optional
If given any value > 0 it will use Levenshtein Distance with that threshold instead of exact match.
- filter_by_kingdom(kingdoms: List[str], fuzzy_threshold: float = -1) None#
Filter data by a valid kingdom.
- Parameters:
- kingdomsList[str]
Kingdom names to filter data.
- fuzzy_thresholdfloat, optional
If given any value > 0 it will use Levenshtein Distance with that threshold instead of exact match.
- filter_by_order(orders: List[str], fuzzy_threshold: float = -1) None#
Filter data by a valid order.
- Parameters:
- ordersList[str]
Order names to filter data.
- fuzzy_thresholdfloat, optional
If given any value > 0 it will use Levenshtein Distance with that threshold instead of exact match.
- filter_by_phylum(phyla: List[str], fuzzy_threshold: float = -1) None#
Filter data by a valid phylum.
- Parameters:
- phylaList[str]
Phylum names to filter data.
- fuzzy_thresholdfloat, optional
If given any value > 0 it will use Levenshtein Distance with that threshold instead of exact match.
- filter_by_species(species: List[str], fuzzy_threshold: float = -1) None#
Filter data by species or any taxonomic rank below it (subspecies, variety, form, etc.).
In contrast with the other filter_by_ taxonomy methods, this one filters the taxonomic data using the scientific name field dwca.terms.taxon.ScientificName.
Warning
Because of that, use this method with caution. If a scientific name of a rank above species (genus, order, etc.) is used, it could result in unexpected behaviour.
- Parameters:
- speciesList[str]
Scientific Name of species (or rank below) to filter data.
- fuzzy_thresholdfloat, optional
If given any value > 0 it will use Levenshtein Distance with that threshold instead of exact match.
- get_parents(taxa_id: List[str]) Set[str]#
Get the taxa ids of the parents of the taxa ids provided.
- Parameters:
- taxa_idList[str]
A list of dwca.terms.taxon.TaxonID to look for parents.
- Returns:
- Set[str]
Set of taxa ids.
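This reference does not say whether get_parents returns only the direct parents or every ancestor. The sketch below illustrates the second reading on plain Python data, assuming each taxon id maps to the id of its parent (as the Darwin Core term parentNameUsageID would provide). The function name collect_ancestors and the sample ids are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set


def collect_ancestors(taxa_id: Iterable[str],
                      parent_of: Dict[str, Optional[str]]) -> Set[str]:
    """Collect the ids of all ancestors of the given taxa (hypothetical helper)."""
    ancestors: Set[str] = set()
    for taxon in taxa_id:
        current = parent_of.get(taxon)
        # Walk upward until a root is reached or a known ancestor is met,
        # which also guards against cycles in malformed data.
        while current is not None and current not in ancestors:
            ancestors.add(current)
            current = parent_of.get(current)
    return ancestors


# Assumed parent links: two species share a genus, which belongs to a family.
parent_of = {"sp1": "gen1", "sp2": "gen1", "gen1": "fam1", "fam1": None}
parents = collect_ancestors(["sp1", "sp2"], parent_of)
```

Returning a set, as the documented signature does, means a parent shared by several of the requested taxa appears only once.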