APOC Pearls - Michael Hunger, Developer Relations Engineering, Neo4j



slide-1
SLIDE 1

APOC Pearls

Michael Hunger Developer Relations Engineering, Neo4j Follow @mesirii

slide-2
SLIDE 2

APOC Unicorns

Michael Hunger Developer Relations Engineering, Neo4j Follow @mesirii

slide-3
SLIDE 3

All Images by TeeTurtle.com & Unstable Unicorns

slide-4
SLIDE 4

Power Up

slide-5
SLIDE 5

Backercorns:
https://unstable-unicorns.backerkit.com/hosted_preorders/project_updates?page=4
https://www.kickstarter.com/projects/ramybadie/unstable-unicorns-control-and-chaos-the-backercorn/posts/2271771

slide-6
SLIDE 6

Extending Neo4j

User Defined Procedures let you write custom code that is:

  • Written in any JVM language
  • Deployed to the Database
  • Accessed by applications via Cypher
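
As a rough illustration of the last bullet, the sketch below shows how extension code is typically reached from Cypher: procedures through CALL ... YIELD, functions inline in expressions, and aggregation functions over a set of rows. It assumes APOC is installed and only uses APOC calls named elsewhere in this deck.

```cypher
// Procedure: invoked with CALL, result columns selected with YIELD
CALL apoc.help('text') YIELD name
RETURN name LIMIT 5;

// Function: used inline like any built-in Cypher function
RETURN apoc.text.capitalize('neo4j') AS capitalized;

// Aggregation function: consumes many rows and returns one value per group
UNWIND [1, 2, 3, 4, 5] AS x
RETURN apoc.agg.median(x) AS median;
```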
slide-7
SLIDE 7

Extending Neo4j

[Diagram labels: Applications, Bolt, Neo4j Execution Engine, User Defined Procedure]

slide-8
SLIDE 8

APOC History

  • My Unicorn Moment
  • Neo4j 3.0 was about to get User Defined Procedures
  • Add the missing utilities
  • Grew quickly: 50, then 150, then 450 procedures and functions
  • Active OSS project
  • Many contributors
slide-9
SLIDE 9
slide-10
SLIDE 10
slide-11
SLIDE 11

Agenda

why and how of user defined extensions

  • procedures, functions, aggregation functions
  • history of apoc
  • 5 pearls -> come to the training if you want to see more
  • apoc.help() + doc & videos
  • 1. 3x utilities - text, map and collection functions
  • 2. aggregation functions
  • 3. data integration - load json
  • 4. handling large updates - periodic iterate
  • 5. graph refactoring
  • 6. path expanders
  • 7. triggers
  • 8. time to live
  • 9. graph grouping
  • 10. cypher functions
slide-12
SLIDE 12
Available On

  • Neo4j Sandbox
  • Neo4j Desktop
  • Neo4j Cloud

slide-14
SLIDE 14

Install

slide-15
SLIDE 15
What's in the Box?

  • Utilities & Converters
  • Data Integration
  • Import / Export
  • Graph Generation / Refactoring
  • Transactions / Jobs / TTL
  • much more ...

slide-16
SLIDE 16
Where can I learn more?

  • Videos
  • Documentation
  • Browser Guide
  • APOC Training
  • Neo4j Community Forum
  • apoc.help()

slide-17
SLIDE 17

If you learn one thing: call apoc.help("keyword")
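
A minimal sketch of what such a call can look like, assuming APOC is installed; the keyword 'json' is only an example, and the yielded columns are the ones I would expect from apoc.help in current APOC releases.

```cypher
// List procedures and functions that match the keyword "json"
CALL apoc.help('json') YIELD type, name, text, signature
RETURN type, name, text, signature
ORDER BY name;
```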

slide-18
SLIDE 18

APOC Video Series

YouTube playlist: r.neo4j.com/apoc-videos

slide-19
SLIDE 19

APOC Docs

  • installation instructions
  • videos
  • searchable overview table
  • detailed explanation
  • examples

neo4j-contrib.github.io/neo4j-apoc-procedures

slide-20
SLIDE 20

Browser Guide

:play apoc

  • live examples
slide-21
SLIDE 21

The Pearls - That give you Superpowers

slide-22
SLIDE 22

Data Integration

slide-23
SLIDE 23
Data Integration

  • Relational / Cassandra
  • MongoDB, Couchbase, Elasticsearch
  • JSON, XML, CSV, XLS
  • Cypher, GraphML
  • ...

slide-24
SLIDE 24

apoc.load.json

  • load json from web-apis and files
  • JSON Path
  • streaming JSON
  • compressed data

neo4j-contrib.github.io/neo4j-apoc-procedures/#_load_json

slide-25
SLIDE 25
slide-26
SLIDE 26

WITH "https://api.stackexchange.com/2.2/questions?pagesize=100..." AS url
CALL apoc.load.json(url) YIELD value
UNWIND value.items AS q
MERGE (question:Question {id:q.question_id})
  ON CREATE SET question.title = q.title,
                question.share_link = q.share_link,
                question.favorite_count = q.favorite_count
MERGE (owner:User {id:q.owner.user_id})
  ON CREATE SET owner.display_name = q.owner.display_name
MERGE (owner)-[:ASKED]->(question)
FOREACH (tagName IN q.tags |
  MERGE (tag:Tag {name:tagName})
  MERGE (question)-[:TAGGED]->(tag))
…

slide-27
SLIDE 27

StackOverflow data model

slide-28
SLIDE 28

Huge Transactions

slide-29
SLIDE 29

apoc.periodic.iterate

  • driving statement
  • executing statement
  • batching
  • parallel execution
  • handling retries

neo4j-contrib.github.io/neo4j-apoc-procedures/#_apoc_periodic_iterate

slide-30
SLIDE 30

Run large scale imports

CALL apoc.periodic.iterate(
  'LOAD CSV … AS row RETURN row',
  'MERGE (n:Node {id:row.id}) SET n.name = row.name',
  {batchSize:10000})

slide-31
SLIDE 31

Run large scale imports

CALL apoc.periodic.iterate(
  'UNWIND range(1,1000000) as id return id',
  'CREATE (n:Node {id:id,name:"an "+id})',
  {batchSize:10000, parallel:true})
YIELD batches, total, timeTaken;

+-------------------------------+
| batches | total   | timeTaken |
+-------------------------------+
| 100     | 1000000 | 1         |
+-------------------------------+
1 row available after 1868 ms, consumed after another 0 ms

slide-32
SLIDE 32

Run large scale updates

CALL apoc.periodic.iterate(
  'MATCH (n:Person) RETURN n',
  'SET n.name = n.firstName + " " + n.lastName',
  {batchSize:10000, parallel:true})
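
The retry handling mentioned for apoc.periodic.iterate can be sketched as follows. This is an illustration under assumptions rather than a recommended setup: the retries value is arbitrary, the WHERE filter is added only so that a re-run skips finished nodes, and the failedBatches and errorMessages columns are the result fields I would expect from current APOC releases.

```cypher
CALL apoc.periodic.iterate(
  // driving statement: streams the items to work on
  'MATCH (n:Person) WHERE n.name IS NULL RETURN n',
  // executing statement: runs for each item, committed in batches
  'SET n.name = n.firstName + " " + n.lastName',
  // retry a failed batch up to 3 times (assumed value)
  {batchSize: 10000, parallel: false, retries: 3})
YIELD batches, total, timeTaken, failedBatches, errorMessages
RETURN batches, total, timeTaken, failedBatches, errorMessages;
```

Checking failedBatches and errorMessages after the run shows whether any batch still failed once the retries were used up.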

slide-33
SLIDE 33

Utilities

slide-34
SLIDE 34

Text Functions - apoc.text.*

  • indexOf, indexesOf
  • split, replace, regexpGroups
  • format
  • capitalize, decapitalize
  • random, lpad, rpad
  • snakeCase, camelCase, upperCase
  • charAt, hexCode
  • base64, md5, sha1, …

https://neo4j-contrib.github.io/neo4j-apoc-procedures/#_text_functions
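
A few of these text functions in use, as a small sketch assuming APOC is installed. The input strings are made up for illustration.

```cypher
RETURN apoc.text.capitalize('neo4j')                      AS capitalized, // 'Neo4j'
       apoc.text.indexOf('Hello World', 'World')          AS position,    // 6
       apoc.text.lpad('42', 5, '0')                       AS padded,      // '00042'
       apoc.text.replace('Hello World!', '[^a-zA-Z]', '') AS lettersOnly, // 'HelloWorld'
       apoc.text.split('a,b;c', '[,;]')                   AS parts;       // ['a', 'b', 'c']
```

Both replace and split take a regular expression as their second argument.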

slide-35
SLIDE 35

Collection Functions - apoc.coll.*

  • sum, avg, min, max, stdev
  • zip, partition, pairs
  • sort, toSet, contains, split
  • indexOf, .different
  • occurrences, frequencies, flatten
  • disjunct, subtract, union, …
  • set, insert, remove
  • randomItem(s)

https://github.com/neo4j-contrib/neo4j-apoc-procedures/blob/3.4/docs/overview.adoc#collection-functions
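
A short sketch of some collection functions on a literal list, assuming APOC is installed. The sample values are arbitrary.

```cypher
WITH [3, 1, 2, 3, 1] AS values
RETURN apoc.coll.toSet(values)          AS distinctValues,
       apoc.coll.sort(values)           AS sorted,   // [1, 1, 2, 3, 3]
       apoc.coll.sum(values)            AS total,
       apoc.coll.contains(values, 2)    AS hasTwo,   // true
       apoc.coll.frequencies(values)    AS counts,
       apoc.coll.flatten([[1, 2], [3]]) AS flat;     // [1, 2, 3]
```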

slide-36
SLIDE 36

Map Functions - apoc.map.*

  • .fromNodes, .fromPairs, .fromLists, .fromValues
  • .merge
  • .setKey, .removeKey
  • .clean(map,[keys],[values])
  • .groupBy(Multi)

https://github.com/neo4j-contrib/neo4j-apoc-procedures/blob/3.4/docs/overview.adoc#map-functions
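
The map helpers can be tried without any data in the graph. The following is a minimal sketch assuming APOC is installed; keys and values are made up.

```cypher
WITH apoc.map.fromPairs([['name', 'Joe'], ['age', 42]]) AS person
RETURN person,                                                    // {name: 'Joe', age: 42}
       apoc.map.merge(person, {city: 'Berlin'}) AS merged,
       apoc.map.setKey(person, 'age', 43)       AS updated,
       apoc.map.removeKey(person, 'age')        AS withoutAge,    // {name: 'Joe'}
       apoc.map.fromLists(['a', 'b'], [1, 2])   AS fromLists;     // {a: 1, b: 2}
```

These functions return new maps; the original person map is not modified.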

slide-37
SLIDE 37

JSON - apoc.convert.*

  • .toJson([1,2,3])
  • .fromJsonList('[1,2,3]')
  • .fromJsonMap('{"a":42,"b":"foo","c":[1,2,3]}')
  • .toTree([paths],[lowerCaseRels=true])
  • .getJsonProperty(node,key)
  • .setJsonProperty(node,key,complexValue)

(JSON)-[:IS]->(everywhere)-[:LIKE]->(graphs)
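
As a small round-trip sketch, assuming APOC is installed: parse the JSON string shown above into a map, read values from it, and serialize part of it again.

```cypher
WITH apoc.convert.fromJsonMap('{"a":42,"b":"foo","c":[1,2,3]}') AS m
RETURN m.a                      AS a,        // 42
       m.c[0]                   AS firstC,   // 1
       apoc.convert.toJson(m.c) AS cAsJson;  // '[1,2,3]'
```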

slide-38
SLIDE 38

Graph Refactoring

slide-39
SLIDE 39
Graph Refactoring - apoc.refactor.*

  • .cloneNodes
  • .mergeNodes
  • .extractNode
  • .collapseNode
  • .categorize

Relationship Modifications

  • .to(rel, endNode)
  • .from(rel, startNode)
  • .invert(rel)
  • .setType(rel, 'NEW-TYPE')
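
One of the relationship modifications, sketched under assumptions: the Person label and the KNOWS and FRIEND_OF relationship types are hypothetical, and the input and output columns are the ones I would expect apoc.refactor.setType to yield.

```cypher
// Rename every KNOWS relationship between people to FRIEND_OF
MATCH (:Person)-[r:KNOWS]->(:Person)
CALL apoc.refactor.setType(r, 'FRIEND_OF') YIELD input, output
RETURN input, output;
```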

slide-40
SLIDE 40

apoc.refactor.mergeNodes

MATCH (n:Person)
WITH n.email AS email, collect(n) AS people
WHERE size(people) > 1
CALL apoc.refactor.mergeNodes(people) YIELD node
RETURN node

slide-41
SLIDE 41

apoc.create.addLabels

MATCH (n:Movie)
CALL apoc.create.addLabels( id(n), [ n.genre ] ) YIELD node
REMOVE node.genre
RETURN node

slide-42
SLIDE 42

Triggers

slide-43
SLIDE 43

Triggers

CALL apoc.trigger.add( name, statement,{phase:before/after})

  • apoc.trigger.pause/resume/list/remove
  • Transaction-Event-Handler calls Cypher statement
  • parameters: createdNodes, assignedNodeProperties, deletedNodes, ...
  • utility functions to extract entities/properties from update-records
  • triggers stored in graph, restored at startup
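
A minimal trigger sketch based on the points above. It assumes triggers are enabled in the configuration (apoc.trigger.enabled=true); the trigger name and the createdAt property are hypothetical.

```cypher
// Stamp every newly created node with a creation time, inside the same transaction
CALL apoc.trigger.add(
  'setCreatedAt',
  'UNWIND $createdNodes AS n SET n.createdAt = timestamp()',
  {phase: 'before'});

// Inspect the registered triggers, then remove this one again
CALL apoc.trigger.list();
CALL apoc.trigger.remove('setCreatedAt');
```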

https://medium.com/neo4j/streaming-graph-loading-with-neo4j-and-apoc-triggers-188ed4dd40d5

slide-44
SLIDE 44

Time to Live

slide-45
SLIDE 45

Time To Live TTL

  • enable in config: apoc.ttl.enabled=true
  • Label :TTL
  • apoc.date.expire(In)(node, time, unit)
  • Creates Index on :TTL(ttl)

slide-46
SLIDE 46

Time To Live TTL

background job (every 60s - configurable) that runs:

MATCH (n:TTL) WHERE n.ttl < timestamp()
WITH n LIMIT 1000
DETACH DELETE n
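
Marking a node for expiry can be sketched in two ways, assuming apoc.ttl.enabled=true. The Session label and id value are hypothetical, and the procedure name follows the slide above; it may differ in other APOC versions.

```cypher
// Manual variant: add the :TTL label and an expiry timestamp in milliseconds
MATCH (s:Session {id: 'abc'})
SET s:TTL, s.ttl = timestamp() + 30 * 60 * 1000;

// Procedure variant: expire the node 30 minutes from now
MATCH (s:Session {id: 'abc'})
CALL apoc.date.expireIn(s, 30, 'm')
RETURN s;
```

Either variant is enough on its own; the background job then removes the node once its ttl timestamp has passed.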

slide-47
SLIDE 47

Aggregation Functions

slide-48
SLIDE 48

Aggregation Function - apoc.agg.*

  • more efficient variants of collect(x)[a..b]
  • .nth, .first, .last, .slice
  • .median(x)
  • .percentiles(x,[0.5,0.9])
  • .product(x)
  • .statistics() provides a full numeric statistic
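
A small sketch of these aggregation functions over generated numbers, assuming APOC is installed. Against real data, x would be a property such as a person's birth year.

```cypher
UNWIND range(1, 100) AS x
RETURN apoc.agg.first(x)                   AS first,
       apoc.agg.last(x)                    AS last,
       apoc.agg.median(x)                  AS median,
       apoc.agg.percentiles(x, [0.5, 0.9]) AS percentiles,
       apoc.agg.statistics(x)              AS stats;
```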

slide-49
SLIDE 49

Graph Grouping

slide-50
SLIDE 50

Graph Grouping

MATCH (p:Person) SET p.decade = p.born / 10;

MATCH (p1:Person)-->()<--(p2:Person)
WITH p1, p2, count(*) AS c
MERGE (p1)-[r:INTERACTED]-(p2)
  ON CREATE SET r.count = c;

CALL apoc.nodes.group(['Person'],['decade'])
YIELD node, relationship
RETURN *;

slide-52
SLIDE 52

Cypher Procedures

slide-53
SLIDE 53

Custom Procedures (WIP)

apoc.custom.asProcedure/asFunction (name, statement, columns, params)

  • Register statements as real procedures & functions
  • 'custom' namespace prefix
  • Pass parameters, configure result columns
  • Stored in graph and distributed across cluster

slide-54
SLIDE 54

Custom Procedures (WIP)

call apoc.custom.asProcedure('neighbours',
  'MATCH (n:Person {name:$name})-->(nb) RETURN nb AS neighbour',
  [['neighbour','NODE']], [['name','STRING']]);

call custom.neighbours('Joe') YIELD neighbour;

slide-55
SLIDE 55

Report Issues
Contribute!

slide-56
SLIDE 56

Ask Questions

neo4j.com/slack
community.neo4j.com

slide-57
SLIDE 57

APOC on GitHub

slide-58
SLIDE 58

Join the Workshop tomorrow!

slide-59
SLIDE 59

Any Questions?

slide-60
SLIDE 60

Best Question gets a box!