What is Neo4j?
Neo4j is one of the leading graph databases. It has no strict schema (it is often described as schema-less) and uses Cypher, its own query language, which is built for graph traversal. Cypher is especially helpful for querying highly connected data such as financial transactions, network or logistics data, and relationships of all kinds.
Unlike relational databases, which store data in tables, Neo4j stores information as nodes (entities of any nature) and relationships, the connections between them. A node can have one label or several, while a relationship has exactly one type. Both nodes and relationships can have properties, so entities of the same kind can be easily filtered. For example, a node can carry a pair of labels such as:
Person:Student, Person:Teacher
Examples of relationship types: IS_A, WORKS_AT.
Relationships are directional: each one connects a source node to a target node, and the two can be the same node. This structure is beneficial and easy to use for graphs that combine data from multiple sources.
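To make the data model concrete, here is a minimal in-memory sketch in Python. It does not talk to Neo4j; the `Node` and `Relationship` classes, the sample names, and the `MENTORS` relationship type are hypothetical and only illustrate multiple labels per node, one type per relationship, direction, and property-based filtering.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                   # a node may carry several labels
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    source: Node                                  # start of the directed connection
    rel_type: str                                 # exactly one type per relationship
    target: Node                                  # end of the directed connection
    properties: dict = field(default_factory=dict)


# Hypothetical sample data.
alice = Node({"Person", "Student"}, {"name": "Alice", "age": 21})
bob = Node({"Person", "Teacher"}, {"name": "Bob", "age": 45})
school = Node({"Organization"}, {"name": "Example School"})
nodes = [alice, bob, school]

relationships = [
    Relationship(bob, "WORKS_AT", school, {"since": 2015}),
    Relationship(bob, "MENTORS", alice),
    Relationship(bob, "MENTORS", bob),            # source and target are the same node
]

# Filter nodes by label and by property value.
teachers = [n.properties["name"] for n in nodes if "Teacher" in n.labels]
over_30 = [n.properties["name"] for n in nodes if n.properties.get("age", 0) > 30]

# Direction matters: list only the relationships that start at bob.
outgoing_from_bob = [r.rel_type for r in relationships if r.source is bob]
self_loops = [r for r in relationships if r.source is r.target]
```

In Neo4j itself the same filters would be written as Cypher patterns; the sketch only shows why labels and properties make such filters cheap to express.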
What is KNIME?
KNIME is a free, open-source platform for data analytics, reporting and integration, where you can build data science projects without blockers such as paid access or limited features. It is one of the most comprehensive platforms for machine learning, analytics, ETL, statistics and more, all through no-code workflows and data visualization.
More than 1000 data analytics routines are available, and workflows can be run either through the interactive interface or in batch execution mode. Batch execution makes it easy to integrate a workflow into local job management.
With KNIME, you can create multi-step data flows and execute some or all of the analysis steps on a periodic basis. Additional plugins make it easy to export data sets to document formats such as doc, ppt, xls and pdf.
Integration between Neo4j and KNIME
Overview of the Neo4j Extension
There are three nodes in the extension.
Neo4j Connection connects to the Neo4j instance via the native Bolt protocol. Its settings are quite simple:
- URL, username and password. You can also use KNIME credentials if you created them in the workflow.
Neo4j Writer pushes data into the database. It has two modes:
- Script mode – the user provides a single Cypher query. The query can be enriched by injecting node and relationship labels, their properties, Neo4j functions and variables.
- Table mode – runs a series of queries taken from an input table. The queries can be executed either asynchronously or sequentially.
Both modes support fault-tolerant execution.
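The idea behind table mode can be sketched in Python as follows. This is not the extension's code: `build_query`, `run_sequentially` and the stub executor are hypothetical, and the sketch assumes that fault-tolerant execution means continuing past a failing query and recording the error instead of stopping. Values are inlined with `json.dumps` purely for illustration; with untrusted input, query parameters would be the safer choice.

```python
import json


def build_query(row):
    """Turn one table row into a Cypher MERGE statement (illustrative only)."""
    # Labels cannot be supplied as Cypher parameters, so they go into the query text.
    labels = ":".join(row["labels"])
    props = ", ".join(
        f"{key}: {json.dumps(value)}" for key, value in row["properties"].items()
    )
    return f"MERGE (n:{labels} {{{props}}})"


def run_sequentially(queries, execute, fault_tolerant=False):
    """Run queries one by one; in fault-tolerant mode, record failures and continue."""
    results = []
    for query in queries:
        try:
            execute(query)
            results.append((query, "ok"))
        except Exception as error:
            if not fault_tolerant:
                raise
            results.append((query, f"failed: {error}"))
    return results


def stub_execute(query):
    """Stand-in for a database call; rejects any query that mentions 'Broken'."""
    if "Broken" in query:
        raise ValueError("simulated database error")


rows = [
    {"labels": ["Person", "Student"], "properties": {"name": "Alice", "age": 21}},
    {"labels": ["Broken"], "properties": {"name": "x"}},
    {"labels": ["Person", "Teacher"], "properties": {"name": "Bob"}},
]
queries = [build_query(row) for row in rows]
report = run_sequentially(queries, stub_execute, fault_tolerant=True)
```

With `fault_tolerant=False`, the same run would stop at the second query and raise the error, which is the trade-off to weigh when loading large tables.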
Neo4j Reader extracts data from the database. Like the Writer node, it has two modes: script and query from table. The node can match data types between Neo4j and KNIME and convert primitive types (String, Integer, Double, etc.), dates and collections. You can also return the result as JSON, which is helpful when you want to extract complicated or generic data structures.
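The type matching described above can be pictured with a small Python sketch. It is an assumption-laden illustration, not the Reader node's actual logic: the function `to_table_cell` is hypothetical, and the type names it returns are labels chosen for this example (`Boolean` in particular is not named in the text above). Primitive values, dates and flat collections get a direct type, and everything else falls back to a JSON string.

```python
import json
from datetime import date


def to_table_cell(value):
    """Return an (illustrative type name, cell value) pair for one query result value."""
    if isinstance(value, bool):            # check bool first: bool is a subclass of int
        return ("Boolean", value)
    if isinstance(value, int):
        return ("Integer", value)
    if isinstance(value, float):
        return ("Double", value)
    if isinstance(value, str):
        return ("String", value)
    if isinstance(value, date):
        return ("Date", value.isoformat())
    if isinstance(value, list) and all(isinstance(v, (str, int, float)) for v in value):
        return ("Collection", value)
    # Nested or mixed structures are serialized as JSON text.
    return ("JSON", json.dumps(value, default=str, sort_keys=True))


cells = [
    to_table_cell("Alice"),
    to_table_cell(21),
    to_table_cell(date(2020, 1, 31)),
    to_table_cell(["a", "b"]),
    to_table_cell({"name": "Alice", "tags": ["a"]}),
]
```

The JSON fallback mirrors the point made above: when a result does not fit a simple column type, returning it as JSON keeps the full structure available for later parsing.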
Our team can help you integrate Neo4j and KNIME.