API Reference
This is a reference guide for the modules, classes, and functions in ComptoxAI’s Python interface. For a more general overview of ComptoxAI, computational toxicology, and graph databases, please refer to the User Guide.
Note
This is the API documentation for ComptoxAI’s Python package. If you are looking for documentation for the REST web API, please see REST Web API in the User Guide.
comptox_ai.db: ComptoxAI’s graph database
Tools to access, query, and export data from ComptoxAI’s Neo4j graph database.
User Guide: See Graph Databases for further details.
- class comptox_ai.db.GraphDB(username=None, password=None, hostname=None, verbose=False)
A Neo4j graph database containing ComptoxAI graph data.
- Parameters:
- verbose : bool, default False
Turns verbose output on or off. If True, status information is occasionally reported to the user.
Methods
- build_graph_cypher_projection(graph_name, ...): Create a new graph in the Neo4j Graph Catalog via a Cypher projection.
- build_graph_native_projection(graph_name, ...): Create a new graph in the Neo4j Graph Catalog via a native projection.
- convert_ids(node_type, from_id, to_id, ids): Produce a mapping of IDs for a given node type from one terminology / database to another.
- drop_all_existing_graphs(): Delete all graphs currently stored in the GDS graph catalog.
- drop_existing_graph(graph_name): Delete a single graph from the GDS graph catalog by graph name.
- export_graph(graph_name[, to]): Export a graph stored in the GDS graph catalog to a set of CSV files.
- fetch(field, operator, value[, what, ...]): Create and execute a query to retrieve nodes, edges, or both.
- fetch_chemical_list(list_name): Fetch all chemicals that are members of a chemical list.
- fetch_node_type(node_label): Fetch an entire class of nodes from the Neo4j graph database.
- fetch_nodes(node_type, property, values): Fetch nodes by node property value.
- fetch_relationships(relationship_type, ...): Fetch edges (relationships) from the Neo4j graph database.
- find_node([name, properties]): Find a single node either by name or by property filter(s).
- find_nodes([properties, node_types]): Find multiple nodes by node properties and/or labels.
- find_relationships(): Find relationships by subject/object nodes and/or relationship type.
- find_shortest_paths(node1, node2[, cleaned])
- get_graph_statistics(): Fetch statistics for the connected graph database.
- get_metagraph(): Examine the graph and construct a metagraph, which describes all of the node types and relationship types in the overall graph database.
- list_existing_graphs(): Fetch a list of projected subgraphs stored in the GDS graph catalog.
- run_cypher(qry_str[, verbose]): Execute a Cypher query on the Neo4j graph database.
- stream_named_graph(graph_name): Stream a named GDS graph into Python for further processing.
- build_graph_cypher_projection(graph_name, node_query, relationship_query, config_dict=None)
Create a new graph in the Neo4j Graph Catalog via a Cypher projection.
Examples
>>> g = GraphDB()
>>> g.build_graph_cypher_projection(...)
- build_graph_native_projection(graph_name, node_types, relationship_types='all', config_dict=None)
Create a new graph in the Neo4j Graph Catalog via a native projection.
- Parameters:
- graph_name : str
A (string) name for identifying the new graph. If a graph already exists with this name, a ValueError will be raised.
- node_types : str, list of str, or dict
Node projection for the new graph. This can be either a single node label, a list of node labels, or a node projection dict in the format described in the Notes.
Notes
ComptoxAI is meant to hide the implementation and usage details of graph databases from the user, but some advanced features do expose the syntax used in the Neo4j and MongoDB internals. This is especially true when building graph projections in the graph catalog. The components of a projection are described below.
NODE PROJECTIONS:
(corresponding argument: `node_types`)
Node projections take the following format:
{
    <node-label-1>: {
        label: <neo4j-label>,
        properties: <node-property-mappings>
    },
    <node-label-2>: {
        label: <neo4j-label>,
        properties: <node-property-mappings>
    },
    // …
    <node-label-n>: {
        label: <neo4j-label>,
        properties: <node-property-mappings>
    }
}
where
- node-label-i is a name for a node label in the projected graph (it can be the same as or different from the label already in Neo4j),
- neo4j-label is a node label to match against in the graph database, and
- node-property-mappings are filters against Neo4j node properties, as defined below.
NODE PROPERTY MAPPINGS:
RELATIONSHIP PROJECTIONS:
Examples
>>> g = GraphDB()
>>> g.build_graph_native_projection(
...     graph_name="g1",
...     node_types=['Gene', 'StructuralEntity'],
...     relationship_types="*"
... )
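The node projection layout described in the Notes can be assembled as an ordinary Python dict. The following is a minimal sketch and not part of ComptoxAI: `make_node_projection` is a hypothetical helper, and the empty `properties` default is an assumption, since the property-mapping format is not specified above.

```python
def make_node_projection(labels, properties=None):
    """Build a node projection dict keyed by projected label.

    `labels` maps each projected label to the Neo4j label it matches.
    `properties` optionally maps a projected label to its property mappings.
    Hypothetical helper: it only mirrors the layout documented in the Notes.
    """
    properties = properties or {}
    return {
        projected: {
            "label": neo4j_label,
            # Assumption: an empty mapping stands in for "no property filters".
            "properties": properties.get(projected, {}),
        }
        for projected, neo4j_label in labels.items()
    }


# The projected label may differ from the label stored in Neo4j.
projection = make_node_projection({"Gene": "Gene", "Structure": "StructuralEntity"})
print(projection["Structure"]["label"])  # StructuralEntity
```

Building the dict in a helper keeps the projected names and the Neo4j labels side by side, which makes a mismatch easier to spot before the projection is sent to the database.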
- convert_ids(node_type, from_id, to_id, ids)
Produce a mapping of IDs for a given node type from one terminology / database to another.
- Parameters:
- node_type : str
Node type of the entities.
- from_id : str
- to_id : str
- ids : list of str
- drop_all_existing_graphs()
Delete all graphs currently stored in the GDS graph catalog.
- Returns:
- list
A list of dicts describing the graphs that were dropped as a result of calling this method. Each dict follows the same format as the list elements returned by list_existing_graphs().
- drop_existing_graph(graph_name)
Delete a single graph from the GDS graph catalog by graph name.
- Parameters:
- graph_name : str
The name of a graph, corresponding to the `'graphName'` field in the graph’s entry within the GDS graph catalog.
- Returns:
- dict
A dict describing the graph that was dropped as a result of calling this method. The dict follows the same format as the list elements returned by list_existing_graphs().
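A common pattern is to drop a graph only when the catalog actually lists it. The sketch below is illustrative only: `drop_if_exists` is a hypothetical helper, `FakeGraphDB` is an in-memory stand-in so the example runs without a Neo4j server, and it assumes each catalog entry is a dict carrying a `'graphName'` key.

```python
def drop_if_exists(db, graph_name):
    """Drop `graph_name` only if the catalog lists it.

    Returns the dropped catalog entry, or None when no such graph exists.
    """
    names = {entry["graphName"] for entry in db.list_existing_graphs()}
    if graph_name not in names:
        return None
    return db.drop_existing_graph(graph_name)


class FakeGraphDB:
    """In-memory stand-in for GraphDB (assumption: entries carry 'graphName')."""

    def __init__(self, names):
        self._graphs = [{"graphName": name} for name in names]

    def list_existing_graphs(self):
        return list(self._graphs)

    def drop_existing_graph(self, graph_name):
        entry = next(g for g in self._graphs if g["graphName"] == graph_name)
        self._graphs.remove(entry)
        return entry


db = FakeGraphDB(["g1", "g2"])
dropped = drop_if_exists(db, "g1")         # entry for "g1"
missing = drop_if_exists(db, "not_there")  # None, nothing to drop
```

With a real connection, the same helper would be called with a `GraphDB` instance in place of `FakeGraphDB`, provided the catalog entries have the assumed shape.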
- export_graph(graph_name, to='db')
Export a graph stored in the GDS graph catalog to a set of CSV files.
- Parameters:
- graph_name : str
The name of a graph, corresponding to the `'graphName'` field in the graph’s entry within the GDS graph catalog.
- fetch(field, operator, value, what='both', register_graph=True, negate=False, query_type='cypher', **kwargs)
Create and execute a query to retrieve nodes, edges, or both.
- Parameters:
- field : str
A property label.
- what : {‘both’, ‘nodes’, ‘edges’}
The type of objects to fetch from the graph database. Note that this is independent of any subgraph registered in Neo4j during query execution: if register_graph is True, an induced subgraph will be registered in the database, but this method may still return only the nodes or only the edges contained in that subgraph.
- filter : str
‘Cypher-like’ filter statement, equivalent to a WHERE clause used in a Neo4j Cypher query (analogous to SQL WHERE clauses).
- query_type : {‘cypher’, ‘native’}
Whether to create a graph using a Cypher projection or a native projection. The ‘standard’ approach is to use a Cypher projection, but native projections can be (a) more performant and (b) easier to use for creating very large subgraphs (e.g., all nodes of several types that exist in all of ComptoxAI). See “Notes”, below, for more information, as well as https://neo4j.com/docs/graph-data-science/current/management-ops/graph-catalog-ops/#catalog-graph-create.
Warning
This function is incomplete and should not be used until we can fix its behavior. Specifically, Neo4j’s GDS library does not support non-numeric node or edge properties in any of its graph catalog-related subroutines.
- fetch_chemical_list(list_name)
Fetch all chemicals that are members of a chemical list.
- Parameters:
- list_name : str
Name (or acronym) corresponding to a Chemical List in ComptoxAI’s graph database.
- Returns:
- list_data : dict
Metadata corresponding to the matched list.
- chemicals : list of dict
Chemical nodes that are members of the chemical list.
- fetch_node_type(node_label)
Fetch an entire class of nodes from the Neo4j graph database.
- Parameters:
- node_label : str
Node label corresponding to a class of entities in the database.
- Returns:
- generator of dict
Warning
Since many entities may be members of a single class, users are cautioned that this method may take a very long time to run and/or be very demanding on computing resources.
- fetch_nodes(node_type, property, values)
Fetch nodes by node property value.
Allows users to filter by a single node type (i.e., ontology class).
- Parameters:
- node_type : str
Node type on which to filter all results. Can speed up queries significantly.
- property : str
Node property to match against.
- values : str or list
Value or list of values on which to match `property`.
- Returns:
- list of dict
Each element in the list corresponds to a single node. If no matches are found in the database, an empty list will be returned.
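Because the result is a plain list of dicts, it can be re-indexed by any property for fast lookup, and the empty-list case needs no special handling. This is a minimal sketch under stated assumptions: `index_by` is a hypothetical helper, the `nodes` list is stand-in data rather than real query output, and the property names `commonName` and `casrn` are placeholders that may not match ComptoxAI's schema.

```python
def index_by(nodes, key):
    """Map each node's `key` value to the node dict, skipping nodes without that key."""
    return {node[key]: node for node in nodes if key in node}


# Stand-in for the list of dicts that a fetch_nodes(...) call returns.
# The property names below are placeholders, not ComptoxAI's actual schema.
nodes = [
    {"commonName": "Aspirin", "casrn": "50-78-2"},
    {"commonName": "Caffeine", "casrn": "58-08-2"},
]

by_name = index_by(nodes, "commonName")
print(by_name["Caffeine"]["casrn"])  # 58-08-2

# An empty result list simply produces an empty index.
empty = index_by([], "commonName")
```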
- fetch_relationships(relationship_type, from_label, to_label)
Fetch edges (relationships) from the Neo4j graph database.
- find_node(name=None, properties=None)
Find a single node either by name or by property filter(s).
- find_nodes(properties={}, node_types=[])
Find multiple nodes by node properties and/or labels.
- Parameters:
- properties : dict
Dict of property values to match in the database query. Each key of `properties` should be a (case-sensitive) node property, and each value should be the value of that property (case- and type-sensitive).
- node_types : list of str
Case-sensitive list of strings representing node labels (node types) to include in the results. Two or more node types in a single query may significantly increase runtime. When multiple node labels are given, the results will be the union of the property queries applied to each label.
- Returns:
- generator of dict
A generator containing dict representations of nodes matching the given query.
Notes
The value returned in the event of a successful query can be extremely large. To improve performance, the results are returned as a generator rather than a list.
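Since the results arrive as a generator, they can be consumed in fixed-size batches without ever holding the full result set in memory. The sketch below illustrates this with the standard library only; `batched` is a hypothetical helper and `fake_find_nodes` is a stand-in generator used so the example runs without a database connection.

```python
from itertools import islice


def batched(results, size):
    """Yield lists of up to `size` items from any iterable without materializing it."""
    iterator = iter(results)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def fake_find_nodes(n):
    """Stand-in generator; with a live database this would be a find_nodes(...) call."""
    for i in range(n):
        yield {"id": i}


# Seven nodes consumed three at a time.
batch_sizes = [len(batch) for batch in batched(fake_find_nodes(7), 3)]
print(batch_sizes)  # [3, 3, 1]
```

Calling `list(...)` on the generator would defeat the purpose for very large queries, so batching is usually the safer default.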
- find_relationships()
Find relationships by subject/object nodes and/or relationship type.
- find_shortest_paths(node1, node2, cleaned=True)
- Parameters:
- node1comptox
- get_graph_statistics()
Fetch statistics for the connected graph database.
This method essentially calls APOC’s `apoc.meta.stats()` procedure and formats the output.
- Returns:
- dict
Dict of statistics describing the graph database.
- Raises:
- RuntimeError
If not currently connected to a graph database or the APOC meta procedures are not installed/available.
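Callers that can continue without statistics may want to catch the documented RuntimeError rather than let it propagate. A minimal sketch follows; `safe_statistics` is a hypothetical wrapper and `DisconnectedDB` is a stand-in that mimics the failure mode described above, so the error message shown is illustrative and not ComptoxAI's actual text.

```python
def safe_statistics(db):
    """Return graph statistics, or None if the database or APOC is unavailable."""
    try:
        return db.get_graph_statistics()
    except RuntimeError as err:
        # Report the problem and fall back to "no statistics".
        print(f"Could not fetch graph statistics: {err}")
        return None


class DisconnectedDB:
    """Stand-in that raises the documented exception type."""

    def get_graph_statistics(self):
        raise RuntimeError("not connected to a graph database")


result = safe_statistics(DisconnectedDB())  # prints a message, returns None
```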
- get_metagraph()
Examine the graph and construct a metagraph, which describes all of the node types and relationship types in the overall graph database.
Notes
We currently don’t run this upon GraphDB instantiation, but it may be prudent to start doing that at some point in the future. It’s not an extremely quick operation, but it’s also not prohibitively slow.
- list_existing_graphs()
Fetch a list of projected subgraphs stored in the GDS graph catalog.
- Returns:
- list
A list of graphs in the GDS graph catalog. If no graphs exist, this will be the empty list [].
- run_cypher(qry_str, verbose=True)
Execute a Cypher query on the Neo4j graph database.
- Parameters:
- qry_str : str
A string containing the Cypher query to run on the graph database server.
- Returns:
- list
The data returned in response to the Cypher query.
Examples
>>> from comptox_ai.db import GraphDB
>>> g = GraphDB()
>>> g.run_cypher("MATCH (c:Chemical) RETURN COUNT(c) AS num_chems;")
[{'num_chems': 719599}]
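When the query string is built programmatically, it helps to validate any label before interpolating it into the Cypher text. The following sketch is not part of ComptoxAI: `count_nodes_query` is a hypothetical helper that produces a query of the same shape as the example above, and `rows` reuses the result shown there rather than calling a live database.

```python
import re

# Conservative pattern for identifiers interpolated into query text.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def count_nodes_query(label, alias="n_nodes"):
    """Return a Cypher string that counts the nodes carrying `label`."""
    for name in (label, alias):
        # Reject anything that is not a plain identifier to avoid query injection.
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return f"MATCH (n:{label}) RETURN COUNT(n) AS {alias};"


query = count_nodes_query("Chemical", alias="num_chems")
print(query)  # MATCH (n:Chemical) RETURN COUNT(n) AS num_chems;

# run_cypher returns a list of dicts; this is the result from the example above.
rows = [{"num_chems": 719599}]
num_chems = rows[0]["num_chems"]
```

With a live connection, `query` would be passed as the `qry_str` argument of `run_cypher`.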
- stream_named_graph(graph_name)
Stream a named GDS graph into Python for further processing.
- Parameters:
- graph_name : str
The name of a graph in the GDS catalog.
comptox_ai.graph: Graphs
Tools for Python-based representation and manipulation of graphs (and/or subgraphs) extracted from the graph database.
User Guide: See Graphs for further details.
- A graph representation of ComptoxAI data.
- Common format for graph data structures in ComptoxAI.
comptox_ai.ml: Machine learning models
Machine learning models designed (and tuned) with ComptoxAI’s graph database in mind. These include both shallow and deep models, and operate on both tabular and graph data.