API Reference

This is a reference guide for the modules, classes, and functions in ComptoxAI’s Python interface. For a more general overview of ComptoxAI, computational toxicology, and graph databases, please refer to the User Guide.

Note

This is the API documentation for ComptoxAI’s Python package. If you are looking for documentation for the REST web API, please see REST Web API in the User Guide.

comptox_ai.db: ComptoxAI’s graph database

Tools to access, query, and export data from ComptoxAI’s Neo4j graph database.

User Guide: See Graph Databases for further details.

class comptox_ai.db.GraphDB(username=None, password=None, hostname=None, verbose=False)

A Neo4j graph database containing ComptoxAI graph data.

Parameters:
verbose: bool, default False

Sets verbosity to on or off. If True, status information will occasionally be reported to the user.
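
The constructor signature above defaults username, password, and hostname to None. As a minimal sketch of supplying them explicitly, the helper below collects the three values from environment-style variables. The helper name, the COMPTOX_AI_ prefix, and the example hostname are illustrative assumptions, not part of ComptoxAI.

```python
import os


def graphdb_kwargs_from_env(env=None, prefix="COMPTOX_AI_"):
    """Collect GraphDB constructor arguments from environment-style variables.

    The variable names (prefix + USERNAME / PASSWORD / HOSTNAME) are
    hypothetical. Missing variables map to None, which matches the defaults
    shown in the constructor signature.
    """
    env = os.environ if env is None else env
    return {
        "username": env.get(prefix + "USERNAME"),
        "password": env.get(prefix + "PASSWORD"),
        "hostname": env.get(prefix + "HOSTNAME"),
        "verbose": False,
    }


# A plain dict stands in for os.environ; the values are placeholders.
kwargs = graphdb_kwargs_from_env(
    {"COMPTOX_AI_USERNAME": "neo4j", "COMPTOX_AI_HOSTNAME": "localhost:7687"}
)
print(kwargs)
```

With ComptoxAI installed and a database reachable, the resulting dict could be passed as GraphDB(**kwargs).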

Methods

build_graph_cypher_projection(graph_name, ...)

Create a new graph in the Neo4j Graph Catalog via a Cypher projection.

build_graph_native_projection(graph_name, ...)

Create a new graph in the Neo4j Graph Catalog via a native projection.

convert_ids(node_type, from_id, to_id, ids)

Produce a mapping of IDs for a given node type from one terminology / database to another.

drop_all_existing_graphs()

Delete all graphs currently stored in the GDS graph catalog.

drop_existing_graph(graph_name)

Delete a single graph from the GDS graph catalog by graph name.

export_graph(graph_name[, to])

Export a graph stored in the GDS graph catalog to a set of CSV files.

fetch(field, operator, value[, what, ...])

Create and execute a query to retrieve nodes, edges, or both.

fetch_chemical_list(list_name)

Fetch all chemicals that are members of a chemical list.

fetch_node_type(node_label)

Fetch an entire class of nodes from the Neo4j graph database.

fetch_nodes(node_type, property, values)

Fetch nodes by node property value.

fetch_relationships(relationship_type, ...)

Fetch edges (relationships) from the Neo4j graph database.

find_node([name, properties])

Find a single node either by name or by property filter(s).

find_nodes([properties, node_types])

Find multiple nodes by node properties and/or labels.

find_relationships()

Find relationships by subject/object nodes and/or relationship type.

find_shortest_paths(node1, node2[, cleaned])

Parameters:

get_graph_statistics()

Fetch statistics for the connected graph database.

get_metagraph()

Examine the graph and construct a metagraph, which describes all of the node types and relationship types in the overall graph database.

list_existing_graphs()

Fetch a list of projected subgraphs stored in the GDS graph catalog.

run_cypher(qry_str[, verbose])

Execute a Cypher query on the Neo4j graph database.

stream_named_graph(graph_name)

Stream a named GDS graph into Python for further processing.

build_graph_cypher_projection(graph_name, node_query, relationship_query, config_dict=None)

Create a new graph in the Neo4j Graph Catalog via a Cypher projection.

Examples

>>> from comptox_ai.db import GraphDB
>>> g = GraphDB()
>>> g.build_graph_cypher_projection(...)

build_graph_native_projection(graph_name, node_types, relationship_types='all', config_dict=None)

Create a new graph in the Neo4j Graph Catalog via a native projection.

Parameters:
graph_name: str
A (string) name for identifying the new graph. If a graph already exists
with this name, a ValueError will be raised.
node_proj: str, list of str, or dict (appears as node_types in the signature above)
Node projection for the new graph. This can be either a single node
label, a list of node labels, or a node projection dict (see Notes).

Notes

ComptoxAI is meant to hide the implementation and usage details of graph databases from the user, but some advanced features do expose the syntax used in the Neo4j and MongoDB internals. This is especially true when building graph projections in the graph catalog. The components of a projection are described below.

NODE PROJECTIONS:

(corresponding argument: `node_proj`)

Node projections take the following format:

{
  <node-label-1>: {
    label: <neo4j-label>,
    properties: <node-property-mappings>
  },
  <node-label-2>: {
    label: <neo4j-label>,
    properties: <node-property-mappings>
  },
  // …
  <node-label-n>: {
    label: <neo4j-label>,
    properties: <node-property-mappings>
  }
}

where node-label-i is a name for a node label in the projected graph (it can be the same as or different from the label already in Neo4j), neo4j-label is a node label to match against in the graph database, and node-property-mappings are filters against Neo4j node properties, as defined below.
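
As a rough illustration of the node projection format described above, the following sketch builds such a dict in plain Python. The helper name is hypothetical, and because the property-mapping format is not specified here, the helper passes any mapping through unchanged and assumes an empty dict means "no property mappings".

```python
def make_node_projection(label_map, property_mappings=None):
    """Build a node projection dict in the documented shape.

    label_map maps each projected label to the Neo4j label it matches.
    property_mappings optionally maps a projected label to its
    node-property-mappings value, which is passed through unchanged.
    """
    property_mappings = property_mappings or {}
    projection = {}
    for projected_label, neo4j_label in label_map.items():
        projection[projected_label] = {
            "label": neo4j_label,
            # Assumption: an empty dict stands for "no property mappings".
            "properties": property_mappings.get(projected_label, {}),
        }
    return projection


# 'Structure' is a new name in the projected graph for Neo4j's StructuralEntity.
node_proj = make_node_projection({"Gene": "Gene", "Structure": "StructuralEntity"})
print(node_proj)
```

Keeping the projected label separate from the Neo4j label makes the renaming case explicit: the dict key is the name used in the projected graph, and the "label" value is what is matched in the database.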

NODE PROPERTY MAPPINGS:

RELATIONSHIP PROJECTIONS:

Examples

>>> from comptox_ai.db import GraphDB
>>> g = GraphDB()
>>> g.build_graph_native_projection(
...     graph_name="g1",
...     node_types=["Gene", "StructuralEntity"],
...     relationship_types="all",
... )

convert_ids(node_type, from_id, to_id, ids)

Produce a mapping of IDs for a given node type from one terminology / database to another.

Parameters:
node_type: str
Node type of the entities.
from_id: str
to_id: str
ids: list of str
drop_all_existing_graphs()

Delete all graphs currently stored in the GDS graph catalog.

Returns:
list
A list of dicts describing the graphs that were dropped as a result of
calling this method. The dicts follow the same format as one of the list
elements returned by calling list_existing_graphs().
drop_existing_graph(graph_name)

Delete a single graph from the GDS graph catalog by graph name.

Parameters:
graph_name: str
A name of a graph, corresponding to the `'graphName'` field in the
graph’s entry within the GDS graph catalog.
Returns:
dict
A dict object describing the graph that was dropped as a result of
calling this method. The dict follows the same format as one of the list
elements returned by calling list_existing_graphs().
export_graph(graph_name, to='db')

Export a graph stored in the GDS graph catalog to a set of CSV files.

Parameters:
graph_name: str
A name of a graph, corresponding to the `'graphName'` field in the
graph’s entry within the GDS graph catalog.
fetch(field, operator, value, what='both', register_graph=True, negate=False, query_type='cypher', **kwargs)

Create and execute a query to retrieve nodes, edges, or both.

Parameters:
field: str

A property label.

what: {'both', 'nodes', 'edges'}

The type of objects to fetch from the graph database. Note that this functions independently of any subgraph registered in Neo4j during query execution: if register_graph is True, an induced subgraph will be registered in the database, but the components returned by this method call may be only the nodes or edges contained in that subgraph.

filter: str

‘Cypher-like’ filter statement, equivalent to a WHERE clause used in a Neo4j Cypher query (analogous to SQL WHERE clauses).

query_type: {'cypher', 'native'}

Whether to create a graph using a Cypher projection or a native projection. The ‘standard’ approach is to use a Cypher projection, but native projections can be (a) more performant and (b) easier to use when creating very large subgraphs (e.g., all nodes of several or more types that exist in all of ComptoxAI). See “Notes”, below, for more information, as well as https://neo4j.com/docs/graph-data-science/current/management-ops/graph-catalog-ops/#catalog-graph-create.

Warning

This function is incomplete and should not be used until we can fix its behavior. Specifically, Neo4j’s GDS library does not support non-numeric node or edge properties in any of its graph catalog-related subroutines.

fetch_chemical_list(list_name)

Fetch all chemicals that are members of a chemical list.

Parameters:
list_name: str
Name (or acronym) corresponding to a Chemical List in ComptoxAI’s graph
database.
Returns:
list_data: dict
Metadata corresponding to the matched list.
chemicals: list of dict
Chemical nodes that are members of the chemical list.
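
The two entries under Returns suggest that the method returns a pair of values. A minimal sketch of unpacking them follows, under that assumption. StubDB is a stand-in for GraphDB so the example runs without a Neo4j server, and the dictionary keys and list name in it are made up for illustration.

```python
def summarize_chemical_list(db, list_name):
    """Return (list_data, number_of_member_chemicals) for a chemical list.

    Assumes fetch_chemical_list returns a (list_data, chemicals) pair,
    as the Returns section suggests.
    """
    list_data, chemicals = db.fetch_chemical_list(list_name)
    return list_data, len(chemicals)


class StubDB:
    """Stand-in for GraphDB that mimics only the documented return shape."""

    def fetch_chemical_list(self, list_name):
        list_data = {"listAcronym": list_name}  # made-up metadata key
        chemicals = [  # made-up chemical nodes
            {"commonName": "Chemical A"},
            {"commonName": "Chemical B"},
        ]
        return list_data, chemicals


meta, n_chemicals = summarize_chemical_list(StubDB(), "EXAMPLE")
print(meta, n_chemicals)
```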
fetch_node_type(node_label)

Fetch an entire class of nodes from the Neo4j graph database.

Parameters:
node_label: str
Node label corresponding to a class of entities in the database.
Returns:
generator of dict

Warning

Since many entities may be members of a single class, users are cautioned that this method may take a very long time to run and/or be very demanding on computing resources.
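
Because this method returns a generator and a class may contain many nodes, one way to limit memory use is to process the results in fixed-size batches. The sketch below shows that pattern on a stand-in generator. The in_batches helper and the node keys are illustrative and not part of ComptoxAI.

```python
from itertools import islice


def in_batches(nodes, batch_size):
    """Yield lists of at most batch_size items from any iterable of nodes."""
    iterator = iter(nodes)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Stand-in for db.fetch_node_type(...): a generator of dicts with a made-up key.
fake_nodes = ({"id": i} for i in range(5))

batch_sizes = [len(batch) for batch in in_batches(fake_nodes, 2)]
print(batch_sizes)  # [2, 2, 1]
```

Only one batch is held in memory at a time, so the same loop works whether the class has five nodes or several hundred thousand.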

fetch_nodes(node_type, property, values)

Fetch nodes by node property value.

Allows users to filter by a single node type (i.e., ontology class).

Parameters:
node_type: str
Node type on which to filter all results. Can speed up queries
significantly.
property: str
Node property to match against.
values: str or list
Value or list of values on which to match `property`.
Returns:
list of dict
Each element in the list corresponds to a single node. If no matches are
found in the database, an empty list will be returned.
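
Since unmatched values simply produce no entries in the returned list, a caller may want to know which requested values were not found. The following sketch shows one way to check, assuming each returned node dict includes the matched property as a key. StubDB, the helper name, and the property name used are hypothetical.

```python
def find_missing_values(db, node_type, prop, values):
    """Return the requested values for which no node was returned."""
    if isinstance(values, str):
        values = [values]  # the method accepts a single value or a list
    nodes = db.fetch_nodes(node_type, prop, values)
    # Assumption: each returned node dict includes the matched property.
    found = {node.get(prop) for node in nodes}
    return [value for value in values if value not in found]


class StubDB:
    """Stand-in for GraphDB backed by an in-memory list of node dicts."""

    def __init__(self, nodes):
        self._nodes = nodes

    def fetch_nodes(self, node_type, property, values):
        return [node for node in self._nodes if node.get(property) in values]


db = StubDB([{"commonName": "A"}, {"commonName": "C"}])
missing = find_missing_values(db, "Chemical", "commonName", ["A", "B", "C"])
print(missing)  # ['B']
```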
fetch_relationships(relationship_type, from_label, to_label)

Fetch edges (relationships) from the Neo4j graph database.

find_node(name=None, properties=None)

Find a single node either by name or by property filter(s).

find_nodes(properties={}, node_types=[])

Find multiple nodes by node properties and/or labels.

Parameters:
properties: dict
Dict of property values to match in the database query. Each key of
`properties` should be a (case-sensitive) node property, and each value
should be the value of that property (case- and type-sensitive).
node_types: list of str
Case-sensitive list of strings representing node labels (node types) to
include in the results. Two or more node types in a single query may
significantly increase runtime. When multiple node labels are given, the
results will be the union of all property queries when applied to each
label.
Returns:
generator of dict
A generator containing dict representations of nodes matching the given
query.

Notes

The value returned in the event of a successful query can be extremely large. To improve performance, the results are returned as a generator rather than a list.
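
To illustrate why a generator helps here, the sketch below previews the first few results without materializing the rest. The stand-in generator and the preview helper are illustrative only; a real call to find_nodes would take their place.

```python
from itertools import islice


def preview(results, n=3):
    """Return the first n items of an iterable without consuming the rest."""
    return list(islice(results, n))


def fake_find_nodes():
    """Stand-in for db.find_nodes(...): lazily yields node dicts."""
    for i in range(1_000_000):
        yield {"index": i}


results = fake_find_nodes()
first_three = preview(results, 3)
print(first_three)

# The generator is single-pass: iteration resumes after the previewed items.
next_node = next(results)
print(next_node)
```

A generator can only be traversed once, so results that need to be reused should be stored explicitly, for example with list(results), at the cost of holding them all in memory.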

find_relationships()

Find relationships by subject/object nodes and/or relationship type.

find_shortest_paths(node1, node2, cleaned=True)
Parameters:
node1comptox
get_graph_statistics()

Fetch statistics for the connected graph database.

This method essentially calls the APOC procedure apoc.meta.stats() and formats the output.

Returns:
dict
Dict of statistics describing the graph database.
Raises:
RuntimeError
If not currently connected to a graph database or the apoc.meta
procedures are not installed/available.
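
Because this method raises RuntimeError when the database or the APOC procedures are unavailable, callers may want to handle that case explicitly. A minimal sketch follows. The two stub classes stand in for GraphDB, and the statistics key is made up.

```python
def safe_graph_statistics(db):
    """Return the statistics dict, or None if the call raises RuntimeError."""
    try:
        return db.get_graph_statistics()
    except RuntimeError as err:
        print(f"Graph statistics unavailable: {err}")
        return None


class DisconnectedDB:
    """Stand-in for a GraphDB with no usable connection."""

    def get_graph_statistics(self):
        raise RuntimeError("not connected to a graph database")


class ConnectedDB:
    """Stand-in for a working GraphDB; the key below is made up."""

    def get_graph_statistics(self):
        return {"nodeCount": 10}


stats_ok = safe_graph_statistics(ConnectedDB())
stats_missing = safe_graph_statistics(DisconnectedDB())
```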
get_metagraph()

Examine the graph and construct a metagraph, which describes all of the node types and relationship types in the overall graph database.

Notes

We currently don’t run this upon GraphDB instantiation, but it may be prudent to start doing that at some point in the future. It’s not an extremely quick operation, but it’s also not prohibitively slow.

list_existing_graphs()

Fetch a list of projected subgraphs stored in the GDS graph catalog.

Returns:
list
A list of graphs in the GDS graph catalog. If no graphs exist, this will
be the empty list [].
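
Since build_graph_native_projection raises a ValueError when a graph name is already taken, it can be useful to check the catalog first. The sketch below assumes each catalog entry is a dict with a 'graphName' key, as the drop_existing_graph documentation suggests. StubDB and graph_exists are hypothetical names.

```python
def graph_exists(db, graph_name):
    """Check whether a graph with this name is already in the GDS catalog."""
    # Assumption: each catalog entry is a dict with a 'graphName' key.
    return any(
        entry.get("graphName") == graph_name
        for entry in db.list_existing_graphs()
    )


class StubDB:
    """Stand-in for GraphDB holding a fixed list of catalog entries."""

    def __init__(self, entries):
        self._entries = entries

    def list_existing_graphs(self):
        return list(self._entries)


db = StubDB([{"graphName": "g1"}])
print(graph_exists(db, "g1"))  # True
print(graph_exists(db, "g2"))  # False
```

The check also behaves sensibly on an empty catalog, where list_existing_graphs() returns [] and any() evaluates to False.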
run_cypher(qry_str, verbose=True)

Execute a Cypher query on the Neo4j graph database.

Parameters:
qry_str: str
A string containing the Cypher query to run on the graph database server.
Returns:
list
The data returned in response to the Cypher query.

Examples

>>> from comptox_ai.db import GraphDB
>>> g = GraphDB()
>>> g.run_cypher("MATCH (c:Chemical) RETURN COUNT(c) AS num_chems;")
[{'num_chems': 719599}]
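
The result above is a list with one dict per returned row. As a small sketch of working with that shape, the helper below extracts a single scalar from a one-row result. The helper name is illustrative, and the rows are copied from the example output rather than fetched from a live database.

```python
def single_value(rows, key):
    """Extract one scalar from a one-row result such as a COUNT query."""
    if len(rows) != 1:
        raise ValueError(f"expected exactly one row, got {len(rows)}")
    return rows[0][key]


# Same shape as the output shown in the example above.
rows = [{"num_chems": 719599}]
num_chems = single_value(rows, "num_chems")
print(num_chems)  # 719599
```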
stream_named_graph(graph_name)

Stream a named GDS graph into Python for further processing.

Parameters:
graph_name: str
A name of a graph in the GDS catalog.

comptox_ai.graph: Graphs

Tools for Python-based representation and manipulation of graphs (and/or subgraphs) extracted from the graph database.

User Guide: See Graphs for further details.

graph.Graph(data)

A graph representation of ComptoxAI data.

graph.io

Common format for graph data structures in ComptoxAI.

comptox_ai.ml: Machine learning models

Machine learning models designed (and tuned) with ComptoxAI’s graph database in mind. These include both shallow and deep models, and operate on both tabular and graph data.

Graph ML

Non-graph ML