Usage

A Python package for handling and generating OBO.

class Obo(date: ~datetime.datetime | None = <factory>, data_version: str | None = None, force: bool = False)[source]

An OBO document.

ancestors(identifier: str) set[str][source]

Return a set of identifiers for parents of the given identifier.

auto_generated_by: ClassVar[str | None] = None

An annotation about how an ontology was generated

check_bioregistry_prefix: ClassVar[bool] = True

Should the prefix be validated against the Bioregistry?

classmethod cli(*args, default_rewrite: bool = False) Any[source]

Run the CLI for this class.

data_version: str | None = None

The ontology version

property date_formatted: str

Get the date as a formatted string.

descendants(identifier: str) set[str][source]

Return a set of identifiers for the children of the given identifier.
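The relationship between ancestors() and descendants() can be pictured with a small stand-alone sketch. It assumes both methods return the transitive closure over parent links rather than only direct parents or children. The PARENTS table and its identifiers are invented for illustration, and PyOBO itself is not used.

```python
# Toy child -> direct parents table; identifiers are invented for illustration.
PARENTS: dict[str, set[str]] = {
    "c": {"b"},
    "b": {"a"},
    "d": {"a"},
    "a": set(),
}


def ancestors(identifier: str) -> set[str]:
    """Collect every identifier reachable by repeatedly following parent links."""
    seen: set[str] = set()
    stack = list(PARENTS.get(identifier, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, ()))
    return seen


def descendants(identifier: str) -> set[str]:
    """Collect every identifier that has the given identifier among its ancestors."""
    return {child for child in PARENTS if identifier in ancestors(child)}


print(sorted(ancestors("c")))    # ['a', 'b']
print(sorted(descendants("a")))  # ['b', 'c', 'd']
```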

dynamic_version: ClassVar[bool] = False

Set to true for resources that are unversioned/very dynamic, like MGI

property edges_header: Sequence[str]

Header for the edges dataframe.

force: bool = False

Should this ontology be reloaded?

classmethod get_cls_cli(*, default_rewrite: bool = False) Command[source]

Get the CLI for this class.

get_edges_df(*, use_tqdm: bool = False) DataFrame[source]

Get an edges dataframe.

get_filtered_multixrefs_mapping(prefix: str, *, use_tqdm: bool = False) Mapping[str, list[str]][source]

Get filtered xrefs as a dictionary from each identifier to a list of xrefs.

get_filtered_properties_df(prop: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, *, use_tqdm: bool = False) DataFrame[source]

Get a dataframe of terms’ identifiers to the given property’s values.

get_filtered_properties_mapping(prop: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, *, use_tqdm: bool = False) Mapping[str, str][source]

Get a mapping from a term’s identifier to the property.

Warning

Assumes there’s only one version of the property for each term.

get_filtered_properties_multimapping(prop: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, *, use_tqdm: bool = False) Mapping[str, list[str]][source]

Get a mapping from a term’s identifier to the property values.
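The warning on get_filtered_properties_mapping() matters when a term carries the same property more than once. The following sketch uses plain tuples rather than PyOBO objects, with invented identifiers and values, to contrast a one-value-per-term mapping with a multimapping. That the last value wins is a property of this sketch, not a documented guarantee of the library.

```python
from collections import defaultdict

# (term identifier, property value) pairs; the values are invented.
pairs = [
    ("0001", "alpha"),
    ("0001", "beta"),
    ("0002", "gamma"),
]

# One value per term: later pairs silently overwrite earlier ones in this sketch.
mapping = dict(pairs)

# All values per term are kept, in input order.
multimapping: defaultdict[str, list[str]] = defaultdict(list)
for identifier, value in pairs:
    multimapping[identifier].append(value)

print(mapping["0001"])       # beta
print(multimapping["0001"])  # ['alpha', 'beta']
```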

get_filtered_relations_df(relation: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, *, use_tqdm: bool = False) DataFrame[source]

Get a specific relation from OBO.

get_filtered_xrefs_mapping(prefix: str, *, use_tqdm: bool = False) Mapping[str, str][source]

Get filtered xrefs as a dictionary.

get_graph()[source]

Get an OBO Graph object.

get_id_alts_mapping() Mapping[str, list[str]][source]

Get a mapping from identifiers to a list of alternative identifiers.

get_id_definition_mapping(*, use_tqdm: bool = False) Mapping[str, str][source]

Get a mapping from identifiers to definitions.

get_id_multirelations_mapping(typedef: TypeDef, *, use_tqdm: bool = False) Mapping[str, list[NormalizedNamableReference]][source]

Get a mapping from identifiers to a list of all references for the given relation.

get_id_name_mapping(*, use_tqdm: bool = False) Mapping[str, str][source]

Get a mapping from identifiers to names.

get_id_species_mapping(*, prefix: str | None = None, use_tqdm: bool = False) Mapping[str, str][source]

Get a mapping from identifiers to species.

get_id_synonyms_mapping(*, use_tqdm: bool = False) Mapping[str, list[str]][source]

Get a mapping from identifiers to a list of sorted synonym strings.

get_ids(*, use_tqdm: bool = False) set[str][source]

Get the set of identifiers.

get_literal_mappings() Iterable[LiteralMapping][source]

Get literal mappings in a standard data model.

get_literal_mappings_df() DataFrame[source]

Get a literal mappings dataframe.

get_literal_properties_df(*, use_tqdm: bool = False) DataFrame[source]

Get all literal properties as a dataframe.

get_mappings_df(*, use_tqdm: bool = False, include_subject_labels: bool = False, include_mapping_source_column: bool = False) DataFrame[source]

Get a dataframe with SSSOM extracted from the OBO document.

get_metadata() Mapping[str, Any][source]

Get metadata.

get_object_properties_df(*, use_tqdm: bool = False) DataFrame[source]

Get all object properties as a dataframe.

get_obsolete(*, use_tqdm: bool = False) set[str][source]

Get the set of obsolete identifiers.

get_properties_df(*, use_tqdm: bool = False, drop_na: bool = True) DataFrame[source]

Get all properties as a dataframe.

get_relation(source_identifier: str, relation: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, target_prefix: str, *, use_tqdm: bool = False) str | None[source]

Get the value for a bijective relation mapping between this resource and a target resource.

>>> from pyobo.sources.hgnc import HGNCGetter
>>> obo = HGNCGetter()
>>> human_mapt_hgnc_id = "6893"
>>> mouse_mapt_mgi_id = "97180"
>>> assert mouse_mapt_mgi_id == obo.get_relation(human_mapt_hgnc_id, "ro:HOM0000017", "mgi")

get_relation_mapping(relation: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, target_prefix: str, *, use_tqdm: bool = False) Mapping[str, str][source]

Get a mapping from the term’s identifier to the target’s identifier.

Warning

Assumes there’s only one version of the property for each term.

Example usage: get homology between HGNC and MGI:

>>> from pyobo.sources.hgnc import HGNCGetter
>>> obo = HGNCGetter()
>>> human_mapt_hgnc_id = "6893"
>>> mouse_mapt_mgi_id = "97180"
>>> hgnc_mgi_orthology_mapping = obo.get_relation_mapping("ro:HOM0000017", "mgi")
>>> assert mouse_mapt_mgi_id == hgnc_mgi_orthology_mapping[human_mapt_hgnc_id]

get_relation_multimapping(relation: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, target_prefix: str, *, use_tqdm: bool = False) Mapping[str, list[str]][source]

Get a mapping from the term’s identifier to the target’s identifiers.
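As a rough picture of what filtering by relation and target_prefix means, the sketch below works on hand-written (source identifier, relation CURIE, target CURIE) triples. The first triple reuses the identifiers from the HGNC/MGI example above as sample data. The other two triples, the helper name, and the "example" prefixes are hypothetical, and nothing here calls PyOBO.

```python
from collections import defaultdict

# Hand-written sample triples: (source identifier, relation CURIE, target CURIE).
# Only the first one mirrors the documented example; the rest are invented.
triples = [
    ("6893", "ro:HOM0000017", "mgi:97180"),
    ("6893", "ro:HOM0000017", "exampledb:12345"),
    ("6893", "example:other_relation", "mgi:00000"),
]


def relation_multimapping(relation: str, target_prefix: str) -> dict[str, list[str]]:
    """Group target identifiers by source, keeping one relation and one target prefix."""
    result: defaultdict[str, list[str]] = defaultdict(list)
    for source, rel, target in triples:
        prefix, _, identifier = target.partition(":")
        if rel == relation and prefix == target_prefix:
            result[source].append(identifier)
    return dict(result)


print(relation_multimapping("ro:HOM0000017", "mgi"))  # {'6893': ['97180']}
```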

get_relations_df(*, use_tqdm: bool = False) DataFrame[source]

Get all relations from the OBO.

get_typedef_df(use_tqdm: bool = False) DataFrame[source]

Get a typedef dataframe.

get_typedef_id_name_mapping() Mapping[str, str][source]

Get a mapping from typedefs’ identifiers to names.

property hierarchy: DiGraph

A graph representing the parent/child relationships between the entities.

For example, to check whether an entity falls under a given ancestor, do:

import networkx as nx
from pyobo import get_ontology

obo = get_ontology("go")

identifier = "1905571"  # interleukin-10 receptor complex
is_complex = "0032991" in nx.descendants(obo.hierarchy, identifier)  # should be true

idspaces: ClassVar[Mapping[str, str] | None] = None

The idspaces used in the document

is_descendant(descendant: str, ancestor: str) bool[source]

Return whether the given identifier is a descendant of the ancestor.

from pyobo import get_ontology

obo = get_ontology("go")

interleukin_10_complex = "1905571"  # interleukin-10 receptor complex
all_complexes = "0032991"
assert obo.is_descendant(interleukin_10_complex, all_complexes)

iter_literal_properties(*, use_tqdm: bool = False) Iterable[tuple[str, str, str, str, str]][source]

Iterate over literal properties quads.

iter_object_properties(*, use_tqdm: bool = False) Iterable[tuple[str, str, str]][source]

Iterate over object property triples.

iter_only: ClassVar[bool] = False

For super-sized datasets that shouldn’t be read into memory

iter_relation_rows(use_tqdm: bool = False) Iterable[tuple[str, str, str, str, str]][source]

Iterate the relations’ rows.

iter_terms(force: bool = False) Iterable[Term][source]

Iterate over terms in this ontology.

iter_typedef_id_name() Iterable[tuple[str, str]][source]

Iterate over typedefs’ identifiers and their respective names.

iterate_alt_rows() Iterable[tuple[str, str]][source]

Iterate over pairs of terms’ primary identifiers and alternate identifiers.

iterate_alts() Iterable[tuple[Stanza, NormalizedNamableReference]][source]

Iterate over alternative identifiers.

iterate_edge_rows(use_tqdm: bool = False) Iterable[tuple[str, str, str]][source]

Iterate the edge rows.

iterate_edges(*, use_tqdm: bool = False) Iterable[tuple[Stanza, TypeDef, NormalizedNamableReference]][source]

Iterate over triples of terms, relations, and their targets.

iterate_filtered_properties(prop: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, *, use_tqdm: bool = False) Iterable[tuple[Stanza, str]][source]

Iterate over tuples of terms and the values for the given property.

iterate_filtered_relations(relation: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, *, use_tqdm: bool = False) Iterable[tuple[Stanza, NormalizedNamableReference]][source]

Iterate over tuples of terms and their targets for the given relation.

iterate_filtered_relations_filtered_targets(relation: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, target_prefix: str, *, use_tqdm: bool = False) Iterable[tuple[Stanza, NormalizedNamableReference]][source]

Iterate over relationships between one identifier and another.

iterate_filtered_xrefs(prefix: str, *, use_tqdm: bool = False) Iterable[tuple[Stanza, NormalizedNamableReference]][source]

Iterate over xrefs to a given prefix.

iterate_id_definition(*, use_tqdm: bool = False) Iterable[tuple[str, str]][source]

Iterate over pairs of terms’ identifiers and their respective definitions.

iterate_id_name(*, use_tqdm: bool = False) Iterable[tuple[str, str]][source]

Iterate over identifier-name pairs.

iterate_id_species(*, prefix: str | None = None, use_tqdm: bool = False) Iterable[tuple[str, str]][source]

Iterate over terms’ identifiers and respective species (if available).

iterate_ids(*, use_tqdm: bool = False) Iterable[str][source]

Iterate over identifiers.

iterate_literal_mapping_rows() Iterable[LiteralMappingTuple][source]

Iterate over literal mapping rows.

iterate_mapping_rows(*, use_tqdm: bool = False) Iterable[tuple[str, str, str, str, str, float | None, str | None]][source]

Iterate over SSSOM rows for mappings.

iterate_node_rows(sep: str = ';') Iterable[Sequence[str]][source]

Get a nodes iterator appropriate for serialization.

iterate_obo_lines(emit_object_properties: bool = True, emit_annotation_properties: bool = True) Iterable[str][source]

Iterate over the lines to write in an OBO file.

Here’s the order:

  1. format-version (technically, this is the only required field)

  2. data-version

  3. date

  4. saved-by

  5. auto-generated-by

  6. import

  7. subsetdef

  8. synonymtypedef

  9. default-namespace

  10. namespace-id-rule

  11. idspace

  12. treat-xrefs-as-equivalent

  13. treat-xrefs-as-genus-differentia

  14. treat-xrefs-as-relationship

  15. treat-xrefs-as-is_a

  16. remark

  17. ontology
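The ordering above can be expressed as a sort key. This sketch is not how iterate_obo_lines() is implemented. It only shows one way to arrange already-formatted header lines in the listed order, and the sample header values are invented.

```python
# Header tags in the order listed above.
HEADER_ORDER = [
    "format-version", "data-version", "date", "saved-by", "auto-generated-by",
    "import", "subsetdef", "synonymtypedef", "default-namespace",
    "namespace-id-rule", "idspace", "treat-xrefs-as-equivalent",
    "treat-xrefs-as-genus-differentia", "treat-xrefs-as-relationship",
    "treat-xrefs-as-is_a", "remark", "ontology",
]
RANK = {tag: index for index, tag in enumerate(HEADER_ORDER)}


def sort_header_lines(lines: list[str]) -> list[str]:
    """Sort 'tag: value' lines by the rank of their tag (stable for equal tags)."""
    return sorted(lines, key=lambda line: RANK[line.split(":", 1)[0]])


unsorted_lines = ["ontology: example", "date: 01:01:2024 00:00", "format-version: 1.4"]
print(sort_header_lines(unsorted_lines))
# ['format-version: 1.4', 'date: 01:01:2024 00:00', 'ontology: example']
```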

iterate_properties(*, use_tqdm: bool = False) Iterable[tuple[Stanza, Annotation]][source]

Iterate over tuples of terms, properties, and their values.

iterate_references(*, use_tqdm: bool = False) Iterable[NormalizedNamableReference][source]

Iterate over references.

iterate_relations(*, use_tqdm: bool = False) Iterable[tuple[Stanza, TypeDef, NormalizedNamableReference]][source]

Iterate over tuples of terms, relations, and their targets.

This only yields triples from the relationship: tag, not all possible triples. For that, see iterate_edges().

iterate_synonym_rows(*, use_tqdm: bool = False) Iterable[tuple[str, str]][source]

Iterate over pairs of identifier and synonym text.

iterate_synonyms(*, use_tqdm: bool = False) Iterable[tuple[Stanza, Synonym]][source]

Iterate over pairs of term and synonym object.

iterate_xrefs(*, use_tqdm: bool = False) Iterable[tuple[Stanza, NormalizedNamableReference]][source]

Iterate over xrefs.

property literal_properties_header

Property dataframe header.

name: ClassVar[str | None] = None

The name of the ontology. If not given, it is looked up in the Bioregistry.

property nodes_header: Sequence[str]

Get the header for nodes.

property object_properties_header

Property dataframe header.

property properties_header

Property dataframe header.

property relations_header: Sequence[str]

Header for the relations dataframe.

root_terms: ClassVar[list[NormalizedNamableReference] | None] = None

Root terms to use for the ontology

static_version: ClassVar[str | None] = None

Set to a static version for the resource (i.e., the resource is not itself versioned)

synonym_typedefs: ClassVar[list[SynonymTypeDef] | None] = None

Synonym type definitions

to_obonet(*, use_tqdm: bool = False) MultiDiGraph[source]

Export as an obonet-style graph.

typedefs: ClassVar[list[TypeDef] | None] = None

Type definitions

write_cache(*, force: bool = False) None[source]

Write cache parts.

write_default(use_tqdm: bool = False, force: bool = False, write_obo: bool = False, write_obonet: bool = False, write_obograph: bool = False, write_owl: bool = False, write_ofn: bool = False, write_ttl: bool = False, write_nodes: bool = False, obograph_use_internal: bool = False, write_cache: bool = True) None[source]

Write the OBO to the default path.

write_edges(path: str | Path) None[source]

Write an edges TSV file.

write_metadata() None[source]

Write the metadata JSON file.

write_nodes(path: str | Path) None[source]

Write a nodes TSV file.

write_obo(file: None | str | TextIO | Path = None, *, use_tqdm: bool = False, emit_object_properties: bool = True, emit_annotation_properties: bool = True) None[source]

Write the OBO to a file.

write_obograph(path: str | Path) None[source]

Write OBO Graph JSON.

write_obonet_gz(path: str | Path) None[source]

Write the OBO to a gzipped dump in Obonet JSON.
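Writing a gzipped JSON dump needs only the standard library. The sketch below is an assumption about the general shape of such an export, not the code behind write_obonet_gz(). The graph_data dictionary and both helper names are hypothetical.

```python
import gzip
import json
import tempfile
from pathlib import Path


def write_json_gz(data: dict, path: Path) -> None:
    """Serialize a dictionary as JSON into a gzip-compressed text file."""
    with gzip.open(path, "wt", encoding="utf-8") as file:
        json.dump(data, file)


def read_json_gz(path: Path) -> dict:
    """Load a dictionary back from a gzip-compressed JSON file."""
    with gzip.open(path, "rt", encoding="utf-8") as file:
        return json.load(file)


# Invented stand-in for a serialized graph.
graph_data = {"nodes": [{"id": "example:1"}], "links": []}
with tempfile.TemporaryDirectory() as directory:
    target = Path(directory) / "example.json.gz"
    write_json_gz(graph_data, target)
    round_trip = read_json_gz(target)

print(round_trip == graph_data)  # True
```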

write_ofn(path: str | Path) None[source]

Write as Functional OWL (OFN).

write_prefix_map() None[source]

Write a prefix map file that includes all prefixes used in this ontology.

write_rdf(path: str | Path) None[source]

Write as Turtle RDF.

ontology: ClassVar[str]

The prefix for the ontology

date: datetime | None

The date the ontology was generated

Reference

alias of NormalizedNamableReference

class Synonym(name: str, specificity: ~typing.Literal['EXACT', 'NARROW', 'BROAD', 'RELATED'] | None = None, type: ~bioregistry.reference.NormalizedNamableReference | None = None, provenance: ~collections.abc.Sequence[~bioregistry.reference.NormalizedNamableReference | ~pyobo.struct.reference.OBOLiteral] = <factory>, annotations: list[~pyobo.struct.struct_utils.Annotation] = <factory>, language: str | None = None)[source]

A synonym with optional specificity and references.

language: str | None = None

Language tag for the synonym

property predicate: NamedReference

Get the specificity reference.

specificity: Literal['EXACT', 'NARROW', 'BROAD', 'RELATED'] | None = None

The specificity of the synonym

to_obo(ontology_prefix: str, synonym_typedefs: Mapping[ReferenceTuple, SynonymTypeDef] | None = None) str[source]

Write this synonym as an OBO line to appear in a [Term] stanza.
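For orientation, a synonym line in the OBO flat file format consists of the quoted text, a scope such as EXACT, and a bracketed list of cross-references. The function below is a simplified sketch of that shape. It is not Synonym.to_obo(), it ignores the optional synonym type, and its name, parameters, and the example:1 reference are hypothetical.

```python
def synonym_line(name: str, specificity: str = "EXACT", provenance: tuple[str, ...] = ()) -> str:
    """Format a simplified OBO synonym line with quoted text, scope, and xref list."""
    # Escape backslashes first, then double quotes, so the quoted text stays intact.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    xrefs = ", ".join(provenance)
    return f'synonym: "{escaped}" {specificity} [{xrefs}]'


print(synonym_line("tau protein", "EXACT", ("example:1",)))
# synonym: "tau protein" EXACT [example:1]
```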

type: NormalizedNamableReference | None = None

The type of synonym. Must be defined in OBO document!

name: str

The string representing the synonym

provenance: Sequence[NormalizedNamableReference | OBOLiteral]

References to articles where the synonym appears

annotations: list[Annotation]

Extra annotations

class SynonymTypeDef(reference: NormalizedNamableReference, specificity: Literal['EXACT', 'NARROW', 'BROAD', 'RELATED'] | None = None)[source]

A type definition for synonyms in OBO.

to_obo(ontology_prefix: str) str[source]

Serialize to OBO.

class Term(reference: ~bioregistry.reference.NormalizedNamableReference, definition: str | None = None, relationships: dict[~bioregistry.reference.NormalizedNamableReference, list[~bioregistry.reference.NormalizedNamableReference]] = <factory>, _axioms: dict[~pyobo.struct.struct_utils.Annotation, list[~pyobo.struct.struct_utils.Annotation]] = <factory>, properties: dict[~bioregistry.reference.NormalizedNamableReference, list[~bioregistry.reference.NormalizedNamableReference | ~pyobo.struct.reference.OBOLiteral]] = <factory>, parents: list[~bioregistry.reference.NormalizedNamableReference] = <factory>, intersection_of: list[~bioregistry.reference.NormalizedNamableReference | tuple[~bioregistry.reference.NormalizedNamableReference, ~bioregistry.reference.NormalizedNamableReference]] = <factory>, union_of: list[~bioregistry.reference.NormalizedNamableReference] = <factory>, equivalent_to: list[~bioregistry.reference.NormalizedNamableReference] = <factory>, disjoint_from: list[~bioregistry.reference.NormalizedNamableReference] = <factory>, synonyms: list[~pyobo.struct.struct.Synonym] = <factory>, xrefs: list[~bioregistry.reference.NormalizedNamableReference] = <factory>, namespace: str | None = None, is_obsolete: bool | None = None, type: ~typing.Literal['Term', 'Instance', 'TypeDef'] = 'Term', builtin: bool | None = None, is_anonymous: bool | None = None, subsets: list[~bioregistry.reference.NormalizedNamableReference] = <factory>)[source]

A term in OBO.

append_exact_match(reference: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, *, mapping_justification: NormalizedNamableReference | None = None, confidence: float | None = None, contributor: NormalizedNamableReference | None = None) Self[source]

Append an exact match, also adding an xref.

append_see_also_uri(uri: str) Self[source]

Add a see also property.

classmethod default(prefix, identifier, name=None) Self[source]

Create a default term.

definition: str | None = None

A description of the entity

extend_parents(references: Collection[NormalizedNamableReference]) None[source]

Add a collection of parents to this entity.

extend_relationship(typedef: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, references: Iterable[NormalizedNamableReference]) None[source]

Append several relationships.

classmethod from_triple(prefix: str, identifier: str, name: str | None = None, definition: str | None = None, **kwargs) Term[source]

Create a term from a reference.

get_property(prop: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str) str | None[source]

Get a single property of the given key.

get_property_literals(prop: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str) list[str][source]

Get properties from the given key.

get_species(prefix: str = 'ncbitaxon') NormalizedNamableReference | None[source]

Get the species if it exists.

Parameters:

prefix – The prefix to use in case the term has several species annotations.

is_obsolete: bool | None = None

An annotation for obsolescence. Defaults to None, which is treated as not obsolete.

iterate_obo_lines(*, ontology_prefix: str, typedefs: Mapping[ReferenceTuple, TypeDef], synonym_typedefs: Mapping[ReferenceTuple, SynonymTypeDef] | None = None, emit_object_properties: bool = True, emit_annotation_properties: bool = True) Iterable[str][source]

Iterate over the lines to write in an OBO file.

namespace: str | None = None

The sub-namespace within the ontology

set_species(identifier: str, name: str | None = None) Self[source]

Append the from_species relation.

reference: Reference

The primary reference for the entity

relationships: RelationsHint

Object properties

parents: list[Reference]

Relationships with the default “is_a”

synonyms: list[Synonym]

Synonyms of this term

xrefs: list[Reference]

Database cross-references, see get_mappings() for access to all mappings in an SSSOM-like interface

class TypeDef(reference: ~typing.Annotated[~bioregistry.reference.NormalizedNamableReference, 1], is_anonymous: ~typing.Annotated[bool | None, 2] = None, namespace: ~typing.Annotated[str | None, 4] = None, definition: ~typing.Annotated[str | None, 6] = None, comment: ~typing.Annotated[str | None, 7] = None, subsets: ~typing.Annotated[list[~bioregistry.reference.NormalizedNamableReference], 8] = <factory>, synonyms: ~typing.Annotated[list[~pyobo.struct.struct.Synonym], 9] = <factory>, xrefs: ~typing.Annotated[list[~bioregistry.reference.NormalizedNamableReference], 10] = <factory>, _axioms: dict[~pyobo.struct.struct_utils.Annotation, list[~pyobo.struct.struct_utils.Annotation]] = <factory>, properties: ~typing.Annotated[dict[~bioregistry.reference.NormalizedNamableReference, list[~bioregistry.reference.NormalizedNamableReference | ~pyobo.struct.reference.OBOLiteral]], 11] = <factory>, domain: ~typing.Annotated[~bioregistry.reference.NormalizedNamableReference | None, 12, 'typedef-only'] = None, range: ~typing.Annotated[~bioregistry.reference.NormalizedNamableReference | None, 13, 'typedef-only'] = None, builtin: ~typing.Annotated[bool | None, 14] = None, holds_over_chain: ~typing.Annotated[list[list[~bioregistry.reference.NormalizedNamableReference]], 15, 'typedef-only'] = <factory>, is_anti_symmetric: ~typing.Annotated[bool | None, 16, 'typedef-only'] = None, is_cyclic: ~typing.Annotated[bool | None, 17, 'typedef-only'] = None, is_reflexive: ~typing.Annotated[bool | None, 18, 'typedef-only'] = None, is_symmetric: ~typing.Annotated[bool | None, 19, 'typedef-only'] = None, is_transitive: ~typing.Annotated[bool | None, 20, 'typedef-only'] = None, is_functional: ~typing.Annotated[bool | None, 21, 'typedef-only'] = None, is_inverse_functional: ~typing.Annotated[bool | None, 22, 'typedef-only'] = None, parents: ~typing.Annotated[list[~bioregistry.reference.NormalizedNamableReference], 23] = <factory>, intersection_of: 
~typing.Annotated[list[~bioregistry.reference.NormalizedNamableReference | tuple[~bioregistry.reference.NormalizedNamableReference, ~bioregistry.reference.NormalizedNamableReference]], 24] = <factory>, union_of: ~typing.Annotated[list[~bioregistry.reference.NormalizedNamableReference], 25] = <factory>, equivalent_to: ~typing.Annotated[list[~bioregistry.reference.NormalizedNamableReference], 26] = <factory>, disjoint_from: ~typing.Annotated[list[~bioregistry.reference.NormalizedNamableReference], 27] = <factory>, inverse: ~typing.Annotated[~bioregistry.reference.NormalizedNamableReference | None, 28, 'typedef-only'] = None, transitive_over: ~typing.Annotated[list[~bioregistry.reference.NormalizedNamableReference], 29, 'typedef-only'] = <factory>, equivalent_to_chain: ~typing.Annotated[list[list[~bioregistry.reference.NormalizedNamableReference]], 30, 'typedef-only'] = <factory>, disjoint_over: ~typing.Annotated[list[~bioregistry.reference.NormalizedNamableReference], 31] = <factory>, relationships: ~typing.Annotated[dict[~bioregistry.reference.NormalizedNamableReference, list[~bioregistry.reference.NormalizedNamableReference]], 32] = <factory>, is_obsolete: ~typing.Annotated[bool | None, 33] = None, created_by: ~typing.Annotated[str | None, 34] = None, creation_date: ~typing.Annotated[~datetime.datetime | None, 35] = None, is_metadata_tag: ~typing.Annotated[bool | None, 40, 'typedef-only'] = None, is_class_level: ~typing.Annotated[bool | None, 41] = None, type: ~typing.Literal['Term', 'Instance', 'TypeDef'] = 'TypeDef')[source]

A type definition in OBO.

See the subsection of https://owlcollab.github.io/oboformat/doc/GO.format.obo-1_4.html#S.2.2.

classmethod default(prefix: str, identifier: str, *, name: str | None = None, is_metadata_tag: bool) Self[source]

Construct a default type definition from within the OBO namespace.

definition: Annotated[str | None, 6] = None

A description of the entity

classmethod from_triple(prefix: str, identifier: str, name: str | None = None) TypeDef[source]

Create a typedef from a reference.

is_metadata_tag: Annotated[bool | None, 40, 'typedef-only'] = None

Whether this relationship is a metadata tag. Properties that are marked as metadata tags are used to record object metadata. Object metadata is additional information about an object that is useful to track, but does not impact the definition of the object or how it should be treated by a reasoner. Metadata tags might be used to record special term synonyms or structured notes about a term, for example.

is_obsolete: Annotated[bool | None, 33] = None

An annotation for obsolescence. Defaults to None, which is treated as not obsolete.

iterate_obo_lines(ontology_prefix: str, synonym_typedefs: Mapping[ReferenceTuple, SynonymTypeDef] | None = None, typedefs: Mapping[ReferenceTuple, TypeDef] | None = None) Iterable[str][source]

Iterate over the lines to write in an OBO file.

Parameters:

ontology_prefix – The prefix of the ontology into which the type definition is being written. This is used for compressing builtin identifiers.

Yield:

The lines to write to an OBO file

S.3.5.5 of the OBO Flat File Specification v1.4 says tags should appear in the following order:

  1. id

  2. is_anonymous

  3. name

  4. namespace

  5. alt_id

  6. def

  7. comment

  8. subset

  9. synonym

  10. xref

  11. property_value

  12. domain

  13. range

  14. builtin

  15. holds_over_chain

  16. is_anti_symmetric

  17. is_cyclic

  18. is_reflexive

  19. is_symmetric

  20. is_transitive

  21. is_functional

  22. is_inverse_functional

  23. is_a

  24. intersection_of

  25. union_of

  26. equivalent_to

  27. disjoint_from

  28. inverse_of

  29. transitive_over

  30. equivalent_to_chain

  31. disjoint_over

  32. relationship

  33. is_obsolete

  34. created_by

  35. creation_date

  36. replaced_by

  37. consider

  38. expand_assertion_to

  39. expand_expression_to

  40. is_metadata_tag

  41. is_class_level
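The integers inside the Annotated[...] field types of TypeDef line up with the positions in this list (for example, definition carries 6 and def is item 6). One plausible use of such markers, shown here as an assumption rather than as PyOBO's actual mechanism, is to recover the serialization order from the type hints. The ToyTypeDef class and the helper function are hypothetical.

```python
from dataclasses import dataclass
from typing import Annotated, Optional, get_type_hints


@dataclass
class ToyTypeDef:
    # Fields are deliberately declared out of order.
    comment: Annotated[Optional[str], 7] = None
    namespace: Annotated[Optional[str], 4] = None
    definition: Annotated[Optional[str], 6] = None


def fields_in_tag_order(cls: type) -> list[str]:
    """Return field names sorted by the first integer in their Annotated metadata."""
    hints = get_type_hints(cls, include_extras=True)
    ranked = [(hint.__metadata__[0], name) for name, hint in hints.items()]
    return [name for _, name in sorted(ranked)]


print(fields_in_tag_order(ToyTypeDef))  # ['namespace', 'definition', 'comment']
```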

disjoint_over: Annotated[list[Reference], 31]

From the OBO spec:

For example: spatially_disconnected_from is disjoint_over part_of, in that two disconnected entities have no parts in common. This can be translated to OWL as: disjoint_over(R S), R(A B) ==> (S some A) disjointFrom (S some B)

default_reference(prefix: str, identifier: str, name: str | None = None) NormalizedNamableReference[source]

Create a CURIE for an “unqualified” reference.

Parameters:
  • prefix – The prefix of the ontology in which the “unqualified” reference is made

  • identifier – The “unqualified” reference, for example a bare “located_in” written where a CURIE is expected

Returns:

A CURIE for the “unqualified” reference based on the OBO semantic space

>>> default_reference("chebi", "conjugate_base_of")
Reference(prefix="obo", identifier="chebi#conjugate_base_of", name=None)
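Judging only from the doctest above, an unqualified name ends up under the obo prefix, with the ontology prefix and a # placed in front of it. A stand-alone sketch of that string construction follows. The function name is hypothetical and it returns a plain tuple instead of a Reference.

```python
def default_reference_parts(prefix: str, identifier: str) -> tuple[str, str]:
    """Return the (prefix, identifier) pair for an unqualified name."""
    return "obo", f"{prefix}#{identifier}"


new_prefix, new_identifier = default_reference_parts("chebi", "conjugate_base_of")
print(f"{new_prefix}:{new_identifier}")  # obo:chebi#conjugate_base_of
```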
ensure_path(prefix: str, *parts: str, url: str, version: None | str | Callable[[], str | None] = None, name: str | None = None, force: bool = False, backend: Literal['requests', 'urllib'] = 'urllib', verify: bool = True, **download_kwargs: Any) Path[source]

Download a file if it doesn’t exist.

from_obo_path(path: str | Path, prefix: str | None = None, *, strict: bool = False, version: str | None, upgrade: bool = True, use_tqdm: bool = False, ignore_obsolete: bool = False, _cache_path: Path | None = None) Obo[source]

Get the OBO graph from a path.

from_obonet(graph: MultiDiGraph, *, strict: bool = False, version: str | None = None, upgrade: bool = True, use_tqdm: bool = False) Obo[source]

Get all of the terms from an OBO graph.

get_alts_to_id(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) Mapping[str, str][source]

Get alternative id to primary id mapping.

get_ancestors(prefix: str | NormalizedNamableReference | ReferenceTuple, identifier: str | None = None, /, **kwargs: Unpack[HierarchyKwargs]) set[NormalizedNamableReference] | None[source]

Get all the ancestors (parents) of the term as CURIEs.

get_children(prefix: str | NormalizedNamableReference | ReferenceTuple, identifier: str | None = None, /, **kwargs: Unpack[HierarchyKwargs]) set[NormalizedNamableReference] | None[source]

Get all the descendants (children) of the term as CURIEs.

get_definition(prefix: str | Reference | ReferenceTuple, identifier: str | None = None, /, **kwargs: Unpack[GetOntologyKwargs]) str | None[source]

Get the definition for an entity.

get_descendants(prefix: str | NormalizedNamableReference | ReferenceTuple, identifier: str | None = None, /, **kwargs: Unpack[HierarchyKwargs]) set[NormalizedNamableReference] | None[source]

Get all the descendants (children) of the term as CURIEs.

get_edges(prefix, **kwargs: Unpack[GetOntologyKwargs]) list[tuple[NormalizedNamableReference, NormalizedNamableReference, NormalizedNamableReference]][source]

Get a list of edge triples.

get_edges_df(prefix, **kwargs: Unpack[GetOntologyKwargs]) DataFrame[source]

Get a dataframe of edge triples.

get_filtered_properties_df(prefix: str, prop: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, **kwargs: Unpack[GetOntologyKwargs]) DataFrame[source]

Extract a single property for each term.

Parameters:
  • prefix – the resource to load

  • prop – the property to extract

Returns:

A dataframe from identifier to property value. Columns are [<prefix>_id, value].

get_filtered_properties_mapping(prefix: str, prop: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, **kwargs: Unpack[GetOntologyKwargs]) Mapping[str, str][source]

Extract a single property for each term as a dictionary.

Parameters:
  • prefix – the resource to load

  • prop – the property to extract

Returns:

A mapping from identifier to property value

get_filtered_properties_multimapping(prefix: str, prop: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, **kwargs: Unpack[GetOntologyKwargs]) Mapping[str, list[str]][source]

Extract multiple properties for each term as a dictionary.

Parameters:
  • prefix – the resource to load

  • prop – the property to extract

Returns:

A mapping from identifier to property values

get_filtered_relations_df(prefix: str, relation: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, **kwargs: Unpack[GetOntologyKwargs]) DataFrame[source]

Get all instances of the given relation as a dataframe.

get_filtered_xrefs(prefix: str, xref_prefix: str, *, flip: bool = False, **kwargs: Unpack[GetOntologyKwargs]) Mapping[str, str][source]

Get xrefs to a given target.
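The flip flag presumably swaps the direction of the returned mapping. That reading is an assumption based on the parameter name alone. With plain dictionaries and invented identifiers, the idea looks like this:

```python
def flip_mapping(mapping: dict[str, str]) -> dict[str, str]:
    """Swap keys and values; later keys win if two keys share a value."""
    return {value: key for key, value in mapping.items()}


# Invented identifier -> xref pairs.
xrefs = {"0001": "X1", "0002": "X2"}
print(flip_mapping(xrefs))  # {'X1': '0001', 'X2': '0002'}
```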

get_graph(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) DiGraph[source]

Get the relation graph.

get_grounder(prefixes: str | Iterable[str], *, grounder_cls: type[gilda.Grounder] | None = None, versions: None | str | Iterable[str | None] | dict[str, str] = None, skip_obsolete: bool = False, **kwargs: Unpack[GetOntologyKwargs]) ssslm.Grounder[source]

Get a grounder for the given prefix(es).

get_hierarchy(prefix: str, *, extra_relations: Iterable[NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str] | None = None, properties: Iterable[NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str] | None = None, **kwargs: Unpack[HierarchyKwargs]) DiGraph[source]

Get hierarchy of parents as a directed graph.

Parameters:
  • prefix – The name of the namespace.

  • include_part_of – Add “part of” relations. Only works if the relations are properly defined using bfo:0000050 ! part of or bfo:0000051 ! has part

  • include_has_member – Add “has member” relations. These aren’t part of the BFO, but are hacked into PyOBO using pyobo.struct.typedef.has_member for relationships like from protein families to their actual proteins.

  • extra_relations – Other relations that you want to include in the hierarchy. For example, it might be useful to include the positively_regulates relation.

  • properties – Properties to include in the data part of each node. For example, you might want to include SMILES strings with the ChEBI tree.

  • force – should the resources be reloaded when extracting relations?

Returns:

A directional graph representing the hierarchy

This function thinly wraps _get_hierarchy_helper() to make it easier to work with the lru_cache mechanism.

get_id_definition_mapping(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) Mapping[str, str][source]

Get a mapping of descriptions.

get_id_multirelations_mapping(prefix: str, typedef: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, **kwargs: Unpack[GetOntologyKwargs]) Mapping[str, list[NormalizedNamableReference]][source]

Get a mapping from each identifier to the list of references it is related to via the given typedef.

get_id_name_mapping(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) Mapping[str, str][source]

Get an identifier to name mapping for the OBO file.

get_id_species_mapping(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) Mapping[str, str][source]

Get an identifier to species mapping.

get_id_synonyms_mapping(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) Mapping[str, list[str]][source]

Get the OBO file and output a synonym dictionary.

get_id_to_alts(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) Mapping[str, list[str]][source]

Get alternate identifiers.

get_ids(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) set[str][source]

Get the set of identifiers for this prefix.

get_literal_mappings(prefix: str, *, skip_obsolete: bool = False, **kwargs: Unpack[GetOntologyKwargs]) list[LiteralMapping][source]

Get literal mappings.

get_literal_mappings_df(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) DataFrame[source]

Get a literal mappings dataframe.

get_literal_mappings_subset(prefix: str, ancestors: Reference | Sequence[Reference], *, skip_obsolete: bool = False, **kwargs: Unpack[GetOntologyKwargs]) list[LiteralMapping][source]

Get a subset of literal mappings under the given ancestors.

get_literal_properties(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) list[tuple[NormalizedNamableReference, NormalizedNamableReference, OBOLiteral]][source]

Get a list of literal property triples.

get_literal_properties_df(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) DataFrame[source]

Get a dataframe of literal property quads.

get_mappings_df(prefix: str | Obo, *, names: bool = True, include_mapping_source_column: bool = False, **kwargs: Unpack[GetOntologyKwargs]) DataFrame[source]

Get semantic mappings from a source as an SSSOM dataframe.

Parameters:
  • prefix – The ontology to look in for xrefs

  • names – Add name columns (subject_label and object_label)

Returns:

An SSSOM-compliant dataframe of xrefs

For example, if you want to get UMLS as an SSSOM dataframe, you can do:

import pyobo

df = pyobo.get_mappings_df("umls")
df.to_csv("umls.sssom.tsv", sep="\t", index=False)

If you don’t want to get all of the many resources required to add names, you can pass names=False:

import pyobo

df = pyobo.get_mappings_df("umls", names=False)
df.to_csv("umls.sssom.tsv", sep="\t", index=False)

Note

This assumes the Bioregistry as the prefix map
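To make the effect of the names flag concrete, here is a small sketch that writes a toy SSSOM-style table with and without the label columns. It uses only the standard library, not PyOBO. The row is made up, the to_sssom_tsv helper is hypothetical, and the exact set of columns PyOBO emits may differ from the ones assumed here.

```python
import csv
import io

# One made-up mapping row; the column names follow common SSSOM usage.
ROWS = [
    {
        "subject_id": "ex:1",
        "subject_label": "first",
        "predicate_id": "skos:exactMatch",
        "object_id": "other:9",
        "object_label": "primero",
        "mapping_justification": "semapv:UnspecifiedMatching",
    },
]
LABEL_COLUMNS = ("subject_label", "object_label")


def to_sssom_tsv(rows: list[dict[str, str]], names: bool = True) -> str:
    """Serialize rows as TSV, dropping label columns when names is False."""
    columns = [c for c in rows[0] if names or c not in LABEL_COLUMNS]
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        delimiter="\t",
        extrasaction="ignore",  # silently skip the dropped label columns
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```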

get_metadata(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) dict[str, Any][source]

Get metadata for the ontology.

get_name(prefix: str | Reference | ReferenceTuple, identifier: str | None = None, /, **kwargs: Unpack[GetOntologyKwargs]) str | None[source]

Get the name for an entity.

get_name_by_curie(curie: str, **kwargs: Any) str | None[source]

Get the name for a CURIE, if possible.
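The "if possible" part can be pictured as two steps: split the CURIE into a prefix and a local identifier, then look the pair up. The sketch below shows that idea with a toy table. It does not use PyOBO, and the name_by_curie helper and the lowercasing of the prefix are assumptions made for illustration.

```python
from typing import Optional

# Toy name table keyed by (prefix, identifier).
NAMES = {("go", "0006915"): "apoptotic process"}


def name_by_curie(curie: str) -> Optional[str]:
    """Return the name for a CURIE, or None if it cannot be resolved."""
    prefix, sep, identifier = curie.partition(":")
    if not sep or not identifier:
        # Not a well-formed CURIE, so there is nothing to look up.
        return None
    return NAMES.get((prefix.lower(), identifier))
```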

get_name_id_mapping(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) Mapping[str, str][source]

Get a name to identifier mapping for the OBO file.

get_object_properties(prefix, **kwargs: Unpack[GetOntologyKwargs]) list[tuple[NormalizedNamableReference, NormalizedNamableReference, NormalizedNamableReference]][source]

Get a list of object property triples.

get_object_properties_df(prefix, **kwargs: Unpack[GetOntologyKwargs]) DataFrame[source]

Get a dataframe of object property triples.

get_obsolete(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) set[str][source]

Get the set of obsolete local unique identifiers.

get_ontology(prefix: str, *, force: bool = False, force_process: bool = False, strict: bool = False, version: str | None = None, robot_check: bool = True, upgrade: bool = True, cache: bool = True, use_tqdm: bool = True) Obo[source]

Get the OBO for a given graph.

Parameters:
  • prefix – The prefix of the ontology to look up

  • version – The pre-looked-up version of the ontology

  • force – Download the data again

  • force_process – Should the OBO cache be rewritten? Automatically set to true if force is true

  • strict – Should CURIEs be treated strictly? If true, raises exceptions on invalid/malformed CURIEs

  • robot_check – If set to false, will send the --check=false command to ROBOT to disregard malformed ontology components. Necessary to load some ontologies like VO.

  • upgrade – If set to true, will automatically upgrade relationships, such as obo:chebi#part_of to BFO:0000050

  • cache – Should cached objects be written? defaults to True

Returns:

An OBO object

Raises:

OnlyOWLError – If the OBO foundry only has an OWL document for this resource.

Alternative usage if you have a custom URL:

from pystow.utils import download
from pyobo import Obo, from_obo_path

url = ...
obo_path = ...
download(url=url, path=obo_path)
obo = from_obo_path(obo_path)

get_primary_curie(prefix: str | Reference | ReferenceTuple, identifier: str | None = None, /, **kwargs: Unpack[GetOntologyKwargs]) str | None[source]

Get the primary CURIE for an entity.

get_primary_identifier(prefix: str | Reference | ReferenceTuple, identifier: str | None = None, /, **kwargs: Unpack[GetOntologyKwargs]) str[source]

Get the primary identifier for an entity.

Parameters:
  • prefix – The name of the resource

  • identifier – The identifier to look up

Returns:

the canonical identifier based on alt id lookup

Returns the original identifier if there are no alts available or if there’s no mapping.
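The lookup described above can be sketched as an inverted alternate-identifier table with a fallback to the input. This is a standalone illustration with made-up identifiers, not PyOBO's implementation. Both helper names are hypothetical.

```python
# Toy table: primary identifier -> alternate identifiers.
ID_TO_ALTS = {"0001": ["0099", "0100"]}


def build_alt_to_primary(id_to_alts: dict[str, list[str]]) -> dict[str, str]:
    """Invert the table so each alternate identifier points at its primary."""
    return {
        alt: primary
        for primary, alts in id_to_alts.items()
        for alt in alts
    }


def primary_identifier(identifier: str, alt_to_primary: dict[str, str]) -> str:
    """Return the primary identifier, or the input if no mapping exists."""
    return alt_to_primary.get(identifier, identifier)


ALT_TO_PRIMARY = build_alt_to_primary(ID_TO_ALTS)
```

An identifier that is already primary, or that is unknown, comes back unchanged, which matches the fallback described in the text.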

get_properties(prefix: str, identifier: str, prop: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, **kwargs: Unpack[GetOntologyKwargs]) list[str] | None[source]

Extract a set of properties for the given entity.

Parameters:
  • prefix – the resource to load

  • identifier – the identifier within the resource

  • prop – the property to extract

Returns:

Multiple values for the property. If only one is expected, use get_property()
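The difference between the multi-valued and single-valued accessors can be sketched with a toy property table. Nothing here calls PyOBO: the table, the helper names, and the choice to raise when more than one value is present are assumptions for this example only.

```python
from typing import Optional

# Toy table keyed by (identifier, property) with a list of values.
PROPERTIES = {
    ("1", "synonym"): ["alpha", "beta"],
    ("2", "formula"): ["H2O"],
}


def properties(identifier: str, prop: str) -> Optional[list[str]]:
    """Return all values for the property, or None if there are none."""
    return PROPERTIES.get((identifier, prop))


def single_property(identifier: str, prop: str) -> Optional[str]:
    """Return the only value for the property, or None if there is none."""
    values = properties(identifier, prop)
    if not values:
        return None
    if len(values) > 1:
        # This sketch refuses to guess which value the caller wants.
        raise ValueError(f"multiple values for {prop!r} on {identifier!r}")
    return values[0]
```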

get_properties_df(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) DataFrame[source]

Extract properties.

Parameters:

prefix – the resource to load

Returns:

A dataframe with the properties

get_property(prefix: str, identifier: str, prop: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, **kwargs: Unpack[GetOntologyKwargs]) str | None[source]

Extract a single property for the given entity.

Parameters:
  • prefix – the resource to load

  • identifier – the identifier within the resource

  • prop – the property to extract

Returns:

The single value for the property. If multiple are expected, use get_properties()

>>> import pyobo
>>> pyobo.get_property("chebi", "132964", "http://purl.obolibrary.org/obo/chebi/smiles")
"C1(=CC=C(N=C1)OC2=CC=C(C=C2)O[C@@H](C(OCCCC)=O)C)C(F)(F)F"

get_references(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) set[NormalizedNamableReference][source]

Get the set of references for this prefix.

get_relation(prefix: str, source_identifier: str, relation: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, target_prefix: str, **kwargs: Unpack[GetOntologyKwargs]) str | None[source]

Get the target identifier corresponding to the given relationship from the source prefix/identifier pair.

Warning

Assumes there’s only one version of the property for each term.

Example usage: get homology between MAPT in HGNC and MGI:

>>> import pyobo
>>> human_mapt_hgnc_id = "6893"
>>> mouse_mapt_mgi_id = "97180"
>>> assert mouse_mapt_mgi_id == pyobo.get_relation(
...     "hgnc", human_mapt_hgnc_id, "ro:HOM0000017", "mgi"
... )

get_relation_mapping(prefix: str, relation: NormalizedNamableReference | Referenced | Reference | NamedReference | tuple[str, str] | str, target_prefix: str, **kwargs: Unpack[GetOntologyKwargs]) Mapping[str, str][source]

Get relations from identifiers in the source prefix to target prefix with the given relation.

Warning

Assumes there’s only one version of the property for each term.
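The consequence of that assumption is easiest to see on toy data. In the sketch below, a source identifier with two targets ends up with only one of them in the resulting mapping. The triples and the relation_mapping helper are made up, and keeping the last value seen is a choice of this sketch, not a statement about which value PyOBO keeps.

```python
# Toy (subject, predicate, object) triples written as CURIEs.
TRIPLES = [
    ("src:1", "rel:orthologous", "tgt:10"),
    ("src:2", "rel:orthologous", "tgt:20"),
    ("src:2", "rel:orthologous", "tgt:21"),  # second target for src:2
    ("src:3", "rel:other", "tgt:30"),
]


def relation_mapping(
    triples: list[tuple[str, str, str]], relation: str, target_prefix: str
) -> dict[str, str]:
    """Map source identifiers to target identifiers for one relation."""
    mapping: dict[str, str] = {}
    for subject, predicate, obj in triples:
        obj_prefix, _, obj_identifier = obj.partition(":")
        if predicate == relation and obj_prefix == target_prefix:
            subject_identifier = subject.partition(":")[2]
            # A plain dict holds one value per key, so later rows overwrite.
            mapping[subject_identifier] = obj_identifier
    return mapping
```

When a source term can have several targets, a list-valued mapping such as the one returned by get_id_multirelations_mapping() avoids this loss.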

Example usage: get homology between HGNC and MGI:

>>> import pyobo
>>> human_mapt_hgnc_id = "6893"
>>> mouse_mapt_mgi_id = "97180"
>>> hgnc_mgi_orthology_mapping = pyobo.get_relation_mapping("hgnc", "ro:HOM0000017", "mgi")
>>> assert mouse_mapt_mgi_id == hgnc_mgi_orthology_mapping[human_mapt_hgnc_id]

get_relations_df(prefix: str, *, wide: bool = False, **kwargs: Unpack[GetOntologyKwargs]) DataFrame[source]

Get all relations from the OBO.

get_species(prefix: str | Reference | ReferenceTuple, identifier: str | None = None, /, **kwargs: Unpack[GetOntologyKwargs]) str | None[source]

Get the species.

get_sssom_df(prefix: str | Obo, *, names: bool = True, **kwargs: Unpack[GetOntologyKwargs]) DataFrame[source]

Get an SSSOM dataframe, replaced by get_mappings_df().

get_subhierarchy(prefix: str | NormalizedNamableReference | ReferenceTuple, identifier: str | None = None, /, **kwargs: Unpack[HierarchyKwargs]) DiGraph[source]

Get the subhierarchy for a given node.

get_synonyms(prefix: str | Reference | ReferenceTuple, identifier: str | None = None, /, **kwargs: Unpack[GetOntologyKwargs]) list[str] | None[source]

Get the synonyms for an entity.

get_typedef_df(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) DataFrame[source]

Get an identifier to name mapping for the typedefs in an OBO file.

get_version(with_git_hash: bool = False) str[source]

Get the PyOBO version string, optionally including a git hash.

get_xref(prefix: str, identifier: str, new_prefix: str, *, flip: bool = False, **kwargs: Unpack[GetOntologyKwargs]) str | None[source]

Get the xref with the new prefix if a direct path exists.
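A direct xref lookup can be pictured as a dictionary access that returns nothing when no direct mapping exists. The sketch below also shows one plausible reading of flip, namely reversing the lookup direction. That reading is inferred from the parameter name rather than stated in the text, so treat it as an assumption. The table and the xref helper are hypothetical.

```python
from typing import Optional

# Toy direct xrefs: identifier in the source prefix -> identifier in the new prefix.
XREFS = {"100": "A1", "200": "B2"}


def xref(identifier: str, flip: bool = False) -> Optional[str]:
    """Look up a direct xref, optionally in the reverse direction."""
    if flip:
        # Assumed meaning of flip: look up from the new prefix back to the source.
        mapping = {target: source for source, target in XREFS.items()}
    else:
        mapping = XREFS
    return mapping.get(identifier)
```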

get_xrefs(prefix: str, xref_prefix: str, *, flip: bool = False, **kwargs: Unpack[GetOntologyKwargs]) Mapping[str, str]

Get xrefs to a given target.

get_xrefs_df(prefix: str, **kwargs: Unpack[GetOntologyKwargs]) DataFrame[source]

Get all xrefs.

ground(prefix: str | Iterable[str], query: str, **kwargs: Unpack[GetOntologyKwargs]) NormalizedNamableReference | None[source]

Normalize a string given the prefix’s labels and synonyms.

Parameters:
  • prefix – If a string, only grounds against that namespace. If a list, will try grounding against all in that order

  • query – The string to try grounding
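The two modes of the prefix parameter can be sketched with a toy lexicon: a single string restricts grounding to one namespace, while a list is tried in order and the first hit wins. This is a simplified stand-in that matches on case-folded exact strings. The lexicon, the helper, and the tuple return type are assumptions and do not reflect how PyOBO's grounder scores matches.

```python
from typing import Iterable, Optional, Union

# Toy lexicon: prefix -> normalized label or synonym -> identifier.
LEXICON = {
    "ex": {"apoptotic process": "0001", "apoptosis": "0001"},
    "other": {"cell death": "0002", "apoptosis": "0009"},
}


def ground(
    prefix: Union[str, Iterable[str]], query: str
) -> Optional[tuple[str, str]]:
    """Return (prefix, identifier) for the first namespace that matches."""
    prefixes = [prefix] if isinstance(prefix, str) else list(prefix)
    normalized = query.strip().casefold()
    for candidate in prefixes:
        identifier = LEXICON.get(candidate, {}).get(normalized)
        if identifier is not None:
            return candidate, identifier
    return None
```

Because the list is tried in order, the same query can resolve differently depending on which prefix comes first.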

has_ancestor(prefix: str | NormalizedNamableReference, identifier: str | NormalizedNamableReference, ancestor_prefix: str | None = None, ancestor_identifier: str | None = None, /, **kwargs: Unpack[HierarchyKwargs]) bool[source]

Check that the first identifier has the second as an ancestor.

Parameters:
  • prefix – The prefix for the descendant

  • identifier – The local unique identifier for the descendant

  • ancestor_prefix – The prefix for the ancestor

  • ancestor_identifier – The local unique identifier for the ancestor

  • kwargs – Keyword arguments for get_hierarchy()

Returns:

If the descendant has the given ancestor

Check that GO:0008219 (cell death) is an ancestor of GO:0006915 (apoptotic process):

>>> from pyobo import Reference, has_ancestor
>>> apoptosis = Reference.from_curie("GO:0006915", name="apoptotic process")
>>> cell_death = Reference.from_curie("GO:0008219", name="cell death")
>>> assert has_ancestor(apoptosis, cell_death)

The same, using the deprecated argumentation style:

>>> assert has_ancestor("go", "0006915", "go", "0008219")

has_nomenclature_plugin(prefix: str) bool[source]

Check if there’s a plugin for converting the prefix.

is_descendent(prefix: str | NormalizedNamableReference, identifier: str | NormalizedNamableReference, ancestor_prefix: str | None = None, ancestor_identifier: str | None = None, /, **kwargs: Unpack[HierarchyKwargs]) bool[source]

Check that the first identifier is a descendant of the second.

Parameters:
  • prefix – The prefix for the descendant

  • identifier – The local unique identifier for the descendant

  • ancestor_prefix – The prefix for the ancestor

  • ancestor_identifier – The local unique identifier for the ancestor

  • kwargs – Keyword arguments for get_hierarchy()

Returns:

If the descendant has the given ancestor

Check that GO:0070246 (natural killer cell apoptotic process) is a descendant of GO:0006915 (apoptotic process):

>>> from pyobo import Reference, is_descendent
>>> nk_apoptosis = Reference.from_curie(
...     "GO:0070246", name="natural killer cell apoptotic process"
... )
>>> apoptosis = Reference.from_curie("GO:0006915", name="apoptotic process")
>>> assert is_descendent(nk_apoptosis, apoptosis)

Using deprecated old-style arguments:

>>> assert is_descendent("go", "0070246", "go", "0006915")

iter_nomenclature_plugins() Iterable[Obo][source]

Get all modules in the PyOBO sources.

parse_results_from_obo(obo: Obo) ParseResults[source]

Get parse results from an OBO graph.

run_nomenclature_plugin(prefix: str, version: str | None = None) Obo[source]

Get a converted PyOBO source.