Quick Start¶
This quick guide shows how to install CAG, create a sample graph from a datasource, and extend the graph with annotation nodes.
Installation
Graph creation: GraphCreatorBase creates or updates a graph from a datasource
Graph annotation: AnnotatorBase annotates (extends) graphs by running various algorithms over them
Installation¶
You can install the latest stable version of CAG from PyPI:
pip install cag
Or, to install the latest fixes, you can install CAG from GitHub:
pip install git+https://github.com/DLR-SC/corpus-annotation-graph-builder.git
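To confirm which version ended up in your environment, the following is a minimal sketch using only the Python standard library. It assumes the installed distribution is named `cag`, as in the pip command above. `installed_version` is a hypothetical helper written for this guide and is not part of CAG.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "cag" is assumed to be the distribution name used by `pip install cag`.
    version = installed_version("cag")
    if version is None:
        print("cag is not installed in this environment")
    else:
        print(f"cag {version} is installed")
```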
Requirement¶
Make sure you have an ArangoDB instance up and running. You can connect to it using the cag.utils.config.Config
utilities, as shown in the code block below. Your config instance will be used later when you initialize your Graph Creator.
import cag.utils.config as config

my_config = config.Config(
    url="http://127.0.0.1:8529",  # URL to your ArangoDB instance
    user="root",  # ArangoDB username
    password="root",  # ArangoDB password
    database="_system",  # The database name - DB will be created if it does not already exist
    graph="MyCagGraph",  # Your graph name - A new graph will be created if it does not already exist
)
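Hard-coded credentials are fine for a local test, but you may prefer to read them from the environment. The sketch below is one possible approach, not something CAG provides: the variable names (`CAG_ARANGO_URL` and so on) and the helper `load_cag_settings` are hypothetical. It only builds the keyword arguments used in the `Config(...)` call above.

```python
import os

# Maps each Config keyword to (environment variable, default value).
# The variable names are illustrative assumptions; CAG does not read them itself.
_DEFAULTS = {
    "url": ("CAG_ARANGO_URL", "http://127.0.0.1:8529"),
    "user": ("CAG_ARANGO_USER", "root"),
    "password": ("CAG_ARANGO_PASSWORD", "root"),
    "database": ("CAG_ARANGO_DATABASE", "_system"),
    "graph": ("CAG_GRAPH_NAME", "MyCagGraph"),
}


def load_cag_settings(env=None):
    """Build the Config keyword arguments from a mapping such as os.environ."""
    env = os.environ if env is None else env
    return {key: env.get(var, default) for key, (var, default) in _DEFAULTS.items()}


if __name__ == "__main__":
    settings = load_cag_settings()
    # With CAG installed, the dictionary can be passed on unchanged:
    #   import cag.utils.config as config
    #   my_config = config.Config(**settings)
    masked = {k: ("***" if k == "password" else v) for k, v in settings.items()}
    print(masked)
```

Passing a plain dictionary as `env` keeps the helper easy to try out without touching the real environment.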
Build your First Graph¶
In this section we show a quick example of how to create your graph from a datasource using CAG. For a more detailed guide, refer to the Graph-Creator section.
Your GraphCreator inherits from the class cag.framework.creator.base_creator.GraphCreatorBase,
which is an abstract class; it enforces the implementation of two functions: one for initializing the graph from a datasource (init_graph()) and one for updating it (update_graph()). In the Jupyter notebook here, we create a sample graph.
Each GraphCreator would look as follows:
from cag.framework.creator.base_creator import GraphCreatorBase


class MyDatasourceXGraphCreator(GraphCreatorBase):
    _name = "DatasourceX"
    _description = "This is the description of my DatasourceX"

    # Here you define a sub-ontology of your graph
    # - only the ontology related to your DatasourceX
    # 1. Define nodes not created in CAG
    # TODO
    # 2. Define relations
    # TODO

    def init_graph(self):
        # Loop over each entry of your dataset and load it into the graph.
        # Use the following to insert a node or an edge (respectively):
        # GENERIC functions: `self.upsert_node(name, attrs_dict, alt_key)`,
        #                    `self.upsert_edge(relation_name,
        #                                      from_node, to_node, attrs_dict)`
        # Specific functions: e.g. `self.create_author_node()`
        pass

    def update_graph(self, timestamp):
        # Simplest update strategy: re-run the full load.
        return self.init_graph()
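To make this contract concrete without needing a database, here is a self-contained toy that mimics the pattern. `ToyCreatorBase` is a stand-in written for this illustration and is not CAG's `GraphCreatorBase`. Its in-memory `upsert_node` and `upsert_edge` only borrow the names and argument order from the comments above, and the real methods may behave differently. The sketch shows two things: an abstract base class refuses to instantiate until both methods are implemented, and an upsert keyed on a chosen attribute does not duplicate a node when the same record is loaded twice.

```python
from abc import ABC, abstractmethod


class ToyCreatorBase(ABC):
    """Stand-in for illustration only; NOT cag's GraphCreatorBase."""

    def __init__(self):
        self.nodes = {}  # (collection name, key value) -> attribute dict
        self.edges = {}  # (relation name, from id, to id) -> attribute dict

    def upsert_node(self, name, attrs_dict, alt_key):
        # Insert the node, or merge attributes if its alt_key value is already known.
        node_id = (name, attrs_dict[alt_key])
        self.nodes.setdefault(node_id, {}).update(attrs_dict)
        return node_id

    def upsert_edge(self, relation_name, from_node, to_node, attrs_dict=None):
        # Insert the edge, or merge attributes if the same edge already exists.
        edge_id = (relation_name, from_node, to_node)
        self.edges.setdefault(edge_id, {}).update(attrs_dict or {})
        return edge_id

    @abstractmethod
    def init_graph(self):
        """Load the whole datasource into the graph."""

    @abstractmethod
    def update_graph(self, timestamp):
        """Bring an existing graph up to date."""


class ToyBookCreator(ToyCreatorBase):
    # Hypothetical datasource: a list of (title, author) records.
    RECORDS = [
        ("Dune", "Frank Herbert"),
        ("Emma", "Jane Austen"),
        ("Persuasion", "Jane Austen"),
    ]

    def init_graph(self):
        for title, author in self.RECORDS:
            book = self.upsert_node("Book", {"title": title}, alt_key="title")
            person = self.upsert_node("Author", {"name": author}, alt_key="name")
            self.upsert_edge("WrittenBy", book, person)

    def update_graph(self, timestamp):
        # Same strategy as the skeleton above: re-run the full load.
        return self.init_graph()


if __name__ == "__main__":
    creator = ToyBookCreator()
    creator.init_graph()
    creator.update_graph(timestamp=None)  # second load adds no duplicates
    print(len(creator.nodes), "nodes,", len(creator.edges), "edges")
```

Because every insert goes through an upsert, delegating `update_graph` to `init_graph` is safe in this toy: repeated loads leave the node and edge counts unchanged.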
Annotate your First Graph¶
In the annotation Jupyter notebook, we create an annotation pipeline.