GraphTalk Berlin - Einführung in Graphdatenbanken

Neo4j GraphTalks: Welcome! October 2015 [email protected]

Transcript of GraphTalk Berlin - Einführung in Graphdatenbanken

Page 1: GraphTalk Berlin - Einführung in Graphdatenbanken

Neo4j GraphTalks

Welcome!

October 2015 [email protected]

Page 2: GraphTalk Berlin - Einführung in Graphdatenbanken

Neo4j GraphTalks

• 09:00-09:30 Breakfast and networking

• 09:30-10:00 Introduction to graph databases and Neo4j (Bruno Ungermann, Neo4j)

• 10:00-10:30 Kantwert: Germany's first decision-maker network, built with Neo4j (Tilo Walter, Managing Director, Kantwert)

• 10:30-11:00 e-Spirit: Lessons learned from integrating Neo4j into the FirstSpirit content management system (Christoph Feddersen, Head of Module Development, e-Spirit)

• Open End (Stefan Plantikow, Alexander Erdl)

Page 3: GraphTalk Berlin - Einführung in Graphdatenbanken

Example: Logical model of a logistics process

Page 4: GraphTalk Berlin - Einführung in Graphdatenbanken

Relational schema (“squeezing the world into tables”):

Page 5: GraphTalk Berlin - Einführung in Graphdatenbanken

Graph model, no schema

Page 6: GraphTalk Berlin - Einführung in Graphdatenbanken

The Whiteboard Model Is the Physical Model
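
The idea that a whiteboard drawing can be stored more or less as drawn can be illustrated with a tiny in-memory property graph. This is only a sketch: the class names, labels, relationship types and property values below are hypothetical and chosen for the logistics example, and it says nothing about how Neo4j stores data internally.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node with labels and free-form properties (no fixed schema)."""
    node_id: int
    labels: frozenset
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed connection between two nodes."""
    start: int
    rel_type: str
    end: int
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, node_id, labels, **properties):
        self.nodes[node_id] = Node(node_id, frozenset(labels), properties)

    def relate(self, start, rel_type, end, **properties):
        self.relationships.append(Relationship(start, rel_type, end, properties))

    def neighbours(self, node_id, rel_type):
        """Follow outgoing relationships of one type from a node."""
        return [self.nodes[r.end] for r in self.relationships
                if r.start == node_id and r.rel_type == rel_type]


# Hypothetical logistics data: a parcel at a depot, addressed to a customer.
graph = PropertyGraph()
graph.add_node(1, ["Parcel"], weight_kg=2.5)
graph.add_node(2, ["Depot"], city="Berlin")
graph.add_node(3, ["Customer"], name="Example GmbH")
graph.relate(1, "LOCATED_AT", 2)
graph.relate(1, "ADDRESSED_TO", 3, priority="express")
```

Each box on the whiteboard becomes a node and each arrow a relationship, and either can carry properties without a table definition being changed first.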

Page 7: GraphTalk Berlin - Einführung in Graphdatenbanken

An intuitive approach to data problems

Page 8: GraphTalk Berlin - Einführung in Graphdatenbanken

Use the Right Database for the Right Job

Neo4j is designed for data relationships

Other NoSQL | Relational DBMS | Neo4j Graph DB

Discrete data: minimally connected data

Connected data: focused on data relationships

Development benefits: easy model maintenance, easy query

Deployment benefits: ultra-high performance, minimal resource usage

Page 9: GraphTalk Berlin - Einführung in Graphdatenbanken

Relational DBMSs Can’t Handle Relationships Well

• Cannot model or store data and relationships without complexity

• Performance degrades with number and levels of relationships, and database size

• Query complexity grows with need for JOINs

• Adding new types of data and relationships requires schema redesign, increasing time to market

… making traditional databases inappropriate when data relationships are valuable in real-time

Slow development, poor performance, low scalability, hard to maintain

Page 10: GraphTalk Berlin - Einführung in Graphdatenbanken

NoSQL Databases Don’t Handle Relationships

• No data structures to model or store relationships

• No query constructs to support data relationships

• Relating data requires “JOIN logic” in the application

• No ACID support for transactions

… making NoSQL databases inappropriate when data relationships are valuable in real-time
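
The point about “JOIN logic” in the application can be made concrete with a small sketch. It assumes a generic key-value style store, modeled here as two plain dictionaries; the collection names, keys and values are hypothetical and do not refer to any particular product.

```python
# Two hypothetical key-value "collections". The store knows nothing about how
# customers and orders relate; it only returns a value for a given key.
customers = {
    "c1": {"name": "Alice", "order_ids": ["o1", "o2"]},
    "c2": {"name": "Bob", "order_ids": []},
}
orders = {
    "o1": {"product": "router", "amount": 120},
    "o2": {"product": "switch", "amount": 80},
}


def orders_for_customer(customer_id):
    """Application-side JOIN: one extra lookup per referenced key."""
    customer = customers[customer_id]
    return [orders[oid] for oid in customer["order_ids"] if oid in orders]


def total_spent(customer_id):
    """Aggregate over the joined records, again in application code."""
    return sum(order["amount"] for order in orders_for_customer(customer_id))
```

Every additional hop (for example from orders to products to suppliers) adds another hand-written lookup loop of this kind, which is the burden the slide refers to.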

Page 11: GraphTalk Berlin - Einführung in Graphdatenbanken

High Business Value in Data Relationships

Data is increasing in volume…
• New digital processes
• More online transactions
• New social networks
• More devices

… and is getting more connected
Customers, products, processes, devices interact and relate to each other

Using data relationships unlocks value
• Real-time recommendations
• Fraud detection
• Master data management
• Network and IT operations
• Identity and access management
• Graph-based search

Early adopters became industry leaders

Page 12: GraphTalk Berlin - Einführung in Graphdatenbanken

“Forrester estimates that over 25% of enterprises will be using graph databases by 2017”

Neo4j Leads the Graph Database Revolution

“Neo4j is the current market leader in graph databases.”

“Graph analysis is possibly the single most effective competitive differentiator for organizations pursuing data-driven operations and decisions after the design of data capture.”

Sources:
• IT Market Clock for Database Management Systems, 2014: https://www.gartner.com/doc/2852717/it-market-clock-database-management
• TechRadar™: Enterprise DBMS, Q1 2014: http://www.forrester.com/TechRadar+Enterprise+DBMS+Q1+2014/fulltext/-/E-RES106801
• Graph Databases – and Their Potential to Transform How We Capture Interdependencies (Enterprise Management Associates): http://blogs.enterprisemanagement.com/dennisdrogseth/2013/11/06/graph-databasesand-potential-transform-capture-interdependencies/

Page 13: GraphTalk Berlin - Einführung in Graphdatenbanken

2012 2015

Page 14: GraphTalk Berlin - Einführung in Graphdatenbanken

Neo4j: The Graph Database Leader

Timeline: 2000, 2003, 2007, 2009, 2011, 2012, 2013, 2014, 2015

Technical Leadership
• Invented property graph model
• First native graph DB in 24/7 production
• Contributed first graph DB to open source
• Introduced first and only declarative query language for property graph
• Extended graph data model to labeled property graph
• Published O’Reilly book on Graph Databases

Commercial Leadership
• First Global 2000 customer
• GraphConnect, first conference for graph DBs
• 150+ customers
• 50K+ monthly downloads
• 500+ graph DB events worldwide

Funding
• $2.5M seed round from Sunstone and Conor
• $11M Series A from Fidelity, Sunstone and Conor
• $11M Series B from Fidelity, Sunstone and Conor
• $20M Series C led by Creandum, with Dawn and existing investors

Page 15: GraphTalk Berlin - Einführung in Graphdatenbanken

Largest Ecosystem of Graph Enthusiasts

• 1,000,000+ downloads
• 20,000+ education registrants
• 18,000+ Meetup members
• 100+ technology and service partners
• 200 enterprise subscription customers, including 50+ Global 2000 companies

Page 16: GraphTalk Berlin - Einführung in Graphdatenbanken

Neo4j Adoption by Selected Verticals

• Financial Services
• Communications
• Health & Life Sciences
• HR & Recruiting
• Media & Publishing
• Social Web
• Industry & Logistics
• Entertainment
• Consumer Retail
• Information Services
• Business Services

Page 17: GraphTalk Berlin - Einführung in Graphdatenbanken

How Customers Use Neo4j

• Network & Data Center
• Master Data Management
• Social
• Recommendations
• Identity & Access
• Search & Discovery
• GEO

Page 18: GraphTalk Berlin - Einführung in Graphdatenbanken

Background
• One of the world’s largest logistics carriers
• Projected to outgrow capacity of old system
• New parcel routing system
  • Single source of truth for entire network
  • B2C & B2B parcel tracking
  • Real-time routing: up to 8M parcels per day

Business problem
• 24x7 availability, year round
• Peak loads of 3000+ parcels per second
• Complex and diverse software stack
• Need predictable performance & linear scalability
• Daily changes to logistics network: route from any point, to any point

Solution & Benefits
• Neo4j provides the ideal domain fit: a logistics network is a graph
• Extreme availability & performance with Neo4j clustering
• Hugely simplified queries, vs. relational for complex routing
• Flexible data model can reflect real-world data variance much better than relational
• “Whiteboard friendly” model easy to understand

Industry: Logistics
Use case: Real-time Recommendations for Routing
Germany
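
To make “a logistics network is a graph” tangible, the following sketch routes a parcel over a small weighted network with Dijkstra's algorithm. The depot names and costs are invented for illustration, and the slide does not say which routing algorithm the carrier actually uses, so this is an assumption rather than a description of the real system.

```python
import heapq


def shortest_route(network, source, target):
    """Dijkstra's algorithm over a dict of the form node -> {neighbour: cost}."""
    queue = [(0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, edge_cost in network.get(node, {}).items():
            if neighbour not in settled:
                heapq.heappush(
                    queue, (cost + edge_cost, neighbour, path + [neighbour])
                )
    return None  # target is not reachable from source


# Hypothetical network: directed legs between depots with a transit cost.
network = {
    "Hamburg": {"Berlin": 3, "Hannover": 2},
    "Hannover": {"Berlin": 3, "Frankfurt": 4},
    "Berlin": {"Leipzig": 2},
    "Leipzig": {"Munich": 5},
    "Frankfurt": {"Munich": 3},
}
```

Because the network is just data, a daily change such as closing the Hannover to Frankfurt leg is a single edge removal, after which the same function returns the next-best route.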

Page 19: GraphTalk Berlin - Einführung in Graphdatenbanken

Adidas: Shared Metadata Service

Page 20: GraphTalk Berlin - Einführung in Graphdatenbanken

Lufthansa: Content/Digital Asset Management

Page 21: GraphTalk Berlin - Einführung in Graphdatenbanken

Background
• German mid-size insurance company
• Founded in 1858
• Project executed by delvin GmbH, a 100% subsidiary of die Bayerische Versicherung a.G. and an IT service specialist in the insurance business

Business problem
• Field sales unit needed easy access to policies and customer data, in an increasing variety of ways
• Needed to support a growing business
• Existing IBM DB2 system not able to meet performance requirements as the system scaled
• 24/7 available system for sales unit outside the company needed

Solution & Benefits
• Enable field sales unit to flexibly search for insurance policies and associated personal data, single source of truth
• Raising the bar with respect to insurance industry practices
• Support the business as it scales, with a high level of performance
• Easy port of existing metadata into Neo4j

Industry: Insurance
Use case: Master Data Management
Germany

Page 22: GraphTalk Berlin - Einführung in Graphdatenbanken

Neo Technology, Inc Confidential

Background
• Founded in 1962, Walmart has more than 11,000 brick and mortar stores in 27 countries
• Plus more than 2 million employees and $470 billion in annual revenues
• Needs to provide optimal online customer experience on its walmart.com site to compete

Business problem
• In the drive to provide the best customer web experience on its walmart.com site, Walmart sought to use data products that connect masses of complex buyer and product data to gain super-fast insight into customer needs and product trends
• Existing relational database couldn’t handle the complexity of the system’s queries

Solution & Benefits
• Replaced a complex batch process with Neo4j for its online real-time recommendations
• Built a simple, real-time recommendation system with low latency queries
• Serves up better and faster recommendations, by combining historical and session data

Industry: Retail
Use case: Real-Time Recommendations
Bentonville, Arkansas

Page 23: GraphTalk Berlin - Einführung in Graphdatenbanken


Background
• eBay seeks to expand global retail presence
• Quick & predictable delivery is an important competitive cornerstone
• To counter & upstage Amazon Prime, eBay acquired U.K.-based Shutl to form the core of a new delivery service, launching eBay Now (www.ebay.com/now) prior to Christmas 2013
• Founded in 2009, Shutl was the U.K. leader in same-day delivery, with 70% of the market

Business problem
• Enable customer-selected delivery within 90 minutes
• Maintain a large network of routes covering many carriers and couriers. Calculate multiple routing operations simultaneously, in real time, across all possible routes
• Scale to enable a variety of services, including same-day delivery, consumer-to-consumer shipping (www.shutl.it) and more predictable delivery times

Solution & Benefits
• Neo4j calculates all possible routes in real time for every order
• The Neo4j-based solution is thousands of times faster than the prior RDBMS-based solution
• Queries require 10-100 times less code, improving time-to-market & code quality
• Neo4j lets the team add functionality that was not previously possible

Industry: Retail
Use case: Routing Recommendations
San Francisco & London

Page 24: GraphTalk Berlin - Einführung in Graphdatenbanken

Industry: Communications
Use case: Real-Time Recommendations
San Jose, CA

Background
• Cisco.com serves consumer and business customers with Support Services
• Needed real-time recommendations, to encourage use of online knowledge base
• Cisco had been successfully using Neo4j for its internal master data management solution
• Identified a strong fit for online recommendations

Business problem
• Call center volumes needed to be lowered by improving the efficacy of online self service
• Leverage large amounts of knowledge stored in service cases, solutions, articles, forums, etc.
• Problem resolution times, as well as support costs, needed to be lowered

Solution & Benefits
• Cases, solutions, articles, etc. continuously scraped for cross-reference links, and represented in Neo4j
• Real-time reading recommendations via Neo4j
• Neo4j Enterprise with HA cluster
• The result: customers obtain help faster, with decreased reliance on customer support

[Diagram: graph of Support Case, Solution, Knowledge Base Article and Message nodes linked by cross-references]
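
A rough sketch of how cross-reference links could drive reading recommendations follows. The slide does not describe Cisco's actual scoring, so the overlap-based ranking, the function name `recommend` and all case and article identifiers are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical cross-reference links: support case -> referenced KB articles.
REFERENCES = {
    "case-1": {"kb-10", "kb-11"},
    "case-2": {"kb-10", "kb-12"},
    "case-3": {"kb-11", "kb-12", "kb-13"},
    "case-4": {"kb-99"},
}


def recommend(case_id, references, limit=3):
    """Suggest articles referenced by cases that share links with `case_id`."""
    seen = references[case_id]
    scores = Counter()
    for other_case, articles in references.items():
        if other_case == case_id:
            continue
        overlap = len(seen & articles)
        if overlap:
            # Articles from more similar cases receive a higher score.
            for article in articles - seen:
                scores[article] += overlap
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [article for article, _ in ranked][:limit]
```

In a graph database the same idea is a two-hop traversal from the case to related cases and on to their articles, evaluated at request time rather than in a batch job.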

Page 25: GraphTalk Berlin - Einführung in Graphdatenbanken

Industry: Communications
Use case: Network & IT Ops
Paris

Background
• Second largest communications company in France
• Part of Vivendi Group, partnering with Vodafone

Business problem
• Infrastructure maintenance took one full week to plan, because of the need to model network impacts
• Needed rapid, automated “what if” analysis to ensure resilience during unplanned network outages
• Identify weaknesses in the network to uncover the need for additional redundancy
• Network information spread across > 30 systems, with daily changes to network infrastructure
• Business needs sometimes changed very rapidly

Solution & Benefits
• Flexible network inventory management system, to support modeling, aggregation & troubleshooting
• Single source of truth (Neo4j) representing the entire network
• Dynamic system loads data from 30+ systems, and allows new applications to access network data
• Modeling efforts greatly reduced because of the near 1:1 mapping between the real world and the graph
• Flexible schema highly adaptable to changing business requirements

[Diagram: network dependency graph with Service, Router, Switch, Fiber Link and Oceanfloor Cable nodes connected by DEPENDS_ON and LINKED relationships]
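
The “what if” analysis can be sketched as a reverse traversal over DEPENDS_ON relationships. The component names and the edge list below are hypothetical, loosely modeled on the node types in the diagram, and the sketch treats every dependency as a hard one: it does not model redundancy, which a real impact analysis would need to take into account.

```python
from collections import defaultdict, deque

# Hypothetical (dependent, dependency) pairs: the first DEPENDS_ON the second.
DEPENDS_ON = [
    ("Service", "Router A"),
    ("Router A", "Switch 1"),
    ("Router A", "Switch 2"),
    ("Switch 1", "Fiber Link 1"),
    ("Switch 2", "Fiber Link 2"),
    ("Fiber Link 1", "Oceanfloor Cable"),
]


def impacted_by(failed, edges):
    """Return every component that directly or transitively depends on `failed`."""
    dependents = defaultdict(set)
    for dependent, dependency in edges:
        dependents[dependency].add(dependent)

    impacted = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted
```

Calling `impacted_by("Oceanfloor Cable", DEPENDS_ON)` walks up the chain to the affected service, which is the kind of question that took a week to answer when the data was spread across more than 30 systems.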

Page 26: GraphTalk Berlin - Einführung in Graphdatenbanken

Background
• One of the world’s oldest and largest banks
• More than 100 years old and includes more than 1000 predecessor institutions
• 500,000 employees and contractors
• Most processing is done on UNIX. Needed to manage & visualize the approximately 50,000 UNIX servers

Business problem
• Improve performance on company-wide network configuration
• Combine log data from Splunk into an application that plays events over a visualization of the network, detect incidents
• Leverage M&A legacy systems, with no room for error

Solution & Benefits
• Use Neo4j to store UNIX server & network configuration companywide
• Original RDBMS solution could handle only 5000 servers. Neo4j introduced for performance
• New applications also were built much more rapidly using Neo4j than possible with SQL

Industry: Financial Services
Use case: Network & IT Operations
Global

Large Investment Bank

Page 27: GraphTalk Berlin - Einführung in Graphdatenbanken

Industry: Communications
Use case: ID & Access Management
Oslo

Background
• 10th largest Telco provider in the world, leading in the Nordics
• Online self-serve system where large business admins manage employee subscriptions and plans
• Mission-critical system whose availability and responsiveness are critical to customer satisfaction

Business problem
• Degrading relational performance. User login taking minutes while system retrieved access rights
• Millions of plans, customers, admins, groups. Highly interconnected data set with massive joins
• Nightly batch workaround solved the performance problem, but led to outdated data
• Primary system was Sybase. Batch pre-compute workaround projected to reach 9 hours by 2014: longer than the nightly batch window

Solution & Benefits
• Moved authorization functionality from Sybase to Neo4j
• Modeling the resource graph in Neo4j was straightforward, as the domain is inherently a graph
• Able to retire the batch process, and move to real-time responses: measured in milliseconds
• Users able to see fresh data, not yesterday’s snapshot
• Customer retention risks fully mitigated
• Summary: performance (minutes to milliseconds), simplicity, understandable business rules, scale

[Diagram: access graph with User, Customer, Account and Subscription nodes connected by USER_ACCESS, PART_OF, CONTROLLED_BY and SUBSCRIBED_BY relationships]
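
Resolving access rights at login time, instead of in a nightly batch, can be sketched as a short traversal. The relationship type names come from the diagram, but their directions, the rule that access to a customer extends to its sub-customers, and every identifier below are assumptions made for illustration.

```python
# Each tuple is (start, RELATIONSHIP_TYPE, end). Directions are assumptions.
EDGES = [
    ("user:anna", "USER_ACCESS", "customer:acme"),
    ("customer:acme-berlin", "PART_OF", "customer:acme"),
    ("account:1", "CONTROLLED_BY", "customer:acme"),
    ("subscription:100", "SUBSCRIBED_BY", "customer:acme"),
    ("subscription:200", "SUBSCRIBED_BY", "customer:acme-berlin"),
    ("subscription:300", "SUBSCRIBED_BY", "customer:other"),
]


def outgoing(node, rel_type):
    """End nodes of relationships of `rel_type` that start at `node`."""
    return [end for start, rel, end in EDGES if start == node and rel == rel_type]


def incoming(node, rel_type):
    """Start nodes of relationships of `rel_type` that end at `node`."""
    return [start for start, rel, end in EDGES if end == node and rel == rel_type]


def accessible_subscriptions(user):
    """Subscriptions of every customer the user can reach, including sub-customers."""
    to_visit = outgoing(user, "USER_ACCESS")
    customers = set()
    while to_visit:
        customer = to_visit.pop()
        if customer in customers:
            continue
        customers.add(customer)
        # Assumed rule: access to a customer also covers its sub-customers.
        to_visit.extend(incoming(customer, "PART_OF"))
    return {
        subscription
        for customer in customers
        for subscription in incoming(customer, "SUBSCRIBED_BY")
    }
```

The traversal only touches the part of the graph reachable from one user, which is why the work per login stays small even when the overall data set holds millions of customers and plans.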

Page 28: GraphTalk Berlin - Einführung in Graphdatenbanken

Background
• Top investment bank, headquarters Switzerland
• Using a relational database coupled with Gemfire for managing employee permissions to research resources (documents and application services)

Business problem
• When a new investment manager was onboarded, permissions were manually provisioned via a complex manual process. Traders lost an average of 7 days of trading, waiting for the permissions to be granted
• Competitor had implemented a project to accelerate the onboarding process. Needed to respond quickly.
• High stakes: Regulations leave no room for error.
• High complexity: Granular permissions mean each trader needed access to hundreds of resources.

Solution & Benefits
• Organizational model, groups, and entitlements stored in Neo4j
• Meets & exceeds performance requirements.
• Significant productivity advantage due to domain fit
• Graph visualization makes it easier for the business to provision permissions themselves
• Moving to Neo4j meant “fewer compromises” than a relational data store
• Now using Neo4j for authorization behind online brokerage business

Industry: Financial Services
Use case: ID & Access Management
London

Large Investment Bank

Page 29: GraphTalk Berlin - Einführung in Graphdatenbanken

Background
• The global cost of fraud and identity theft is estimated to be over $200 billion per year
• Global financial services firm: trillions of dollars in total assets
• Varying compliance & governance considerations
• Incredibly complex transaction systems, with ever-growing opportunities for fraud

Business problem
• Needed to spot and prevent fraud in real time, especially in payments that fall within “normal” behavior metrics
• Needed more accurate and faster credit risk analysis for payment transactions
• Needed to dramatically reduce chargebacks

Solution & Benefits
• Neo4j helped them simplify both the credit risk analysis and fraud detection processes, lowering TCO
• Uniquely identify entities and connections
• Chargebacks and fraud greatly reduced, huge savings
• Empower business-unit teams to build Neo4j applications for real-time use, and easily evolve them to include non-uniform data, avoiding sparse tables and frequent schema changes

Industry: Financial Services
Use case: Fraud Detection
London & New York

Large Financial Services Co.

Page 30: GraphTalk Berlin - Einführung in Graphdatenbanken

Background
• Tre is part of Hutchison Whampoa, one of the world’s largest telecommunications conglomerates
• Operates in the Nordics and U.K.

Business problem
• New business requirement to give customers more insight into their own usage patterns
• Changing the data model was slow and painful
• New queries were difficult to write
• Very large data sets creating serious performance problems in RDBMS for connected queries (>L2)
• Tre saw value in moving towards real-time customer profiling and real-time analytics

Solution & Benefits
• A Neo4j cluster, containing a graph of customer billing information, is accessed by customer-facing applications
• Neo4j’s graph-based model enables timely & insightful profiling of customers to support customer service
• New applications & enhancements are developed faster
• Queries running much faster thanks to Neo4j

Industry: Telecommunications
Use case: Master Data Management (Customer Data)
Stockholm, Sweden

Page 31: GraphTalk Berlin - Einführung in Graphdatenbanken

Background
• One of the world’s largest communications equipment manufacturers
• #91 Global 2000. $44B in annual sales.
• Had experienced success with Neo4j in Master Data Management and Real-time Recommendations projects, so wanted to use it for this content management / Graph-based Search problem

Business problem
• Sales reps wasted days looking for appropriate materials to send prospects
• Keyword indexing system was too slow
• Deal sales cycles were suffering

Solution & Benefits
• Cisco created a new “Intelligent Query Service,” an internal document discovery system with automated keyword assignment
• Sales reps report that the time it takes to find precisely the right asset decreased from 2 weeks to 20 minutes

Industry: Communications
Use case: Graph-based Search
San Jose, CA

Page 32: GraphTalk Berlin - Einführung in Graphdatenbanken

Background
• One of the world’s largest communications equipment manufacturers
• #91 Global 2000. $44B in annual sales.
• Needed a system that could accommodate its master data hierarchies in a performant way
• HMP is a Master Data Management system at whose heart is Neo4j. Data access services available 24x7 to applications companywide

Business problem
• Sales compensation system had become unable to meet Cisco’s needs
• Existing Oracle RAC system had reached its limits:
  • Insufficient flexibility for handling complex organizational hierarchies and mappings
  • “Real-time” queries were taking > 1 minute!
• Business-critical “P1” system needs to be continually available, with zero downtime

Solution & Benefits
• Cisco created a new system: the Hierarchy Management Platform (HMP)
• Allows Cisco to manage master data centrally, and centralize data access and business rules
• Neo4j provided “Minutes to Milliseconds” performance over Oracle RAC, serving master data in real time
• The graph database model provided exactly the flexibility needed to support Cisco’s business rules
• HMP so successful that it has expanded to include product hierarchy

Industry: Communications
Use case: Master Data Management, HMP
San Jose, CA

Page 33: GraphTalk Berlin - Einführung in Graphdatenbanken


Questions?

Presentations, videos...

Collection of use cases

Example models

[email protected]