ING-DiBa - Home | Frankfurt Big Data Lab · 2 Die ING-DiBa an einem Tag 20.000 Anrufe im...
ING-DiBa
Bart Buter
ING-DiBa AG
Frankfurt • 04. November 2016
ING-DiBa in one day
• 20,000 calls in customer dialogue
• 4,300 contacts in real-estate financing
• 2,500 incoming emails per day in customer dialogue
• 5.2 million page views on the website
• 400,000 logins to internet banking & brokerage
• 50,000 cases in the document service
• 29,000 withdrawals at the 1,200 ING-DiBa ATMs
• 160,000 accesses via the mobile banking app
Successfully different
Growth from 2005 to 2015:
• Retail customers: 5.3 million (2005), 8.5 million (2015)
• Employees: 2,304 (2005), 3,749 (2015)
• Business volume: EUR 82 bn (2005), EUR 241 bn (2015)
Third-largest retail bank (customers in Germany, in millions):
1. Deutsche Bank PBC (incl. Postbank): 23
2. Commerzbank (incl. Comdirect): 11
3. ING-DiBa: 7.8
4. Santander: 6.3
5. Targobank: 4
High customer satisfaction
Basis: motivated and engaged employees
The pace of change keeps increasing, and the pressure on banks to innovate is rising
71% of Millennials would rather go to the dentist than listen to what banks are saying.
ING-DiBa strategy: the customer is at the center
Movie Time, Hackathon
• 3Q2016: https://www.youtube.com/embed/B6K_hvM052g
• Hackathon: https://www.youtube.com/watch?v=di67BJPd8I0
Income Estimation Use Case
Bart Buter / Georgios Gkekas
International Advanced Analytics: Mentors
Georgios Gkekas, Senior BigData Engineer
Expertise: scalable big data solutions, software architecture/development, distributed applications
Bart Buter, Head of Data Engineering
Expertise: building big data environments, connecting people & delegation
Center of excellence in advanced analytics
• Specialists in machine learning and big data technologies
• Support business units
• Experiment with new data
• Technology
• Methods
• Sources
• Training and knowledge transfers
• Development of exploration environment
• Fundamental research
International Advanced Analytics
From -> To
• Generalized predictions -> Individual predictions
• Rule based -> Self learning
• Millions of calculations -> Billions of calculations
• Sampled data -> Very large datasets
• Structured data -> Structured and unstructured data
• Central calculations -> Distributed computing
You are a fictitious startup
Our question to you: Can you estimate the income of a person?
Be creative
- We give you the problem
- We expect a solution from you
Be a startup that we would like to work with
- Work responsibly
  - Ethically
  - Legally
- Fit our values and vision
  - Any time, anywhere
  - Clear and easy
  - Empower us
Challenge
Can you give us an indication of someone's income based on publicly available data? Since working with personal data carries considerable responsibility, we would also like you to investigate the legal and ethical implications of your solutions. Please define a minimal set of data that you would want from the customer and that you can enrich with public data, e.g. statistics bureau data, publicly available profiles, income comparison sites, etc. We expect you to demo a working prototype together with a legal and ethical assessment of the prototype and documentation of how you validated your results.
Income Estimation
Income: The flow of cash or cash equivalents received from work (wage or salary), capital (interest or profit), or land (rent). (http://www.businessdictionary.com/definition/income.html)
Income estimation: Using data to create a model that can estimate the income. This is not the same as the sum of all incoming money, because people can receive a one-off donation, get money back for dinner with friends, use money from their savings account, or transfer money from one account to another.
Why? Income might correlate with interest in financial products.
• Higher income: investments & mortgages
• Lower income: overdraft, savings goals
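The difference between income and the sum of all incoming money can be illustrated with a small sketch. The transaction records, the counterparty names, and the rule "a credit counts as income only if its counterparty shows up in at least three months" are all assumptions made up for this illustration; they are not part of the challenge material.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (month, counterparty, amount, is_own_account)
transactions = [
    ("2016-08", "Employer GmbH", 2500.0, False),
    ("2016-09", "Employer GmbH", 2500.0, False),
    ("2016-10", "Employer GmbH", 2600.0, False),
    ("2016-09", "Aunt Erna", 1000.0, False),   # one-off donation
    ("2016-10", "Friend", 35.0, False),        # dinner paid back
    ("2016-10", "Own savings", 800.0, True),   # transfer between own accounts
]

def estimate_monthly_income(txs, min_months=3):
    """Median monthly total of credits from counterparties seen in >= min_months months."""
    months_by_party = defaultdict(set)
    for month, party, amount, own in txs:
        if not own and amount > 0:
            months_by_party[party].add(month)
    # Assumed heuristic: only recurring counterparties count as income sources.
    recurring = {p for p, months in months_by_party.items() if len(months) >= min_months}

    per_month = defaultdict(float)
    for month, party, amount, own in txs:
        if party in recurring and not own and amount > 0:
            per_month[month] += amount
    return median(per_month.values()) if per_month else 0.0

# Naive figure: every incoming amount, averaged over the three months covered.
naive_monthly_inflow = sum(amount for _, _, amount, _ in transactions) / 3
estimated_income = estimate_monthly_income(transactions)
```

In this toy data the naive inflow is higher than the estimate because the donation, the repaid dinner, and the savings transfer are counted as if they were income.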
A data challenge without a data set
Because:
• In reality, a bank would be unlikely to share its data with a young startup.
• We have the expertise to analyze the bank's internal data; asking the same from you in this challenge wouldn't add value for us.
However:
• We only have partial observability: we only have access to our internal data, so we lack a full overview when a customer uses multiple banks.
• Banking traditionally uses limited external datasets. With the rise of companies monetizing their data through data services and governments opening up datasets, we may have missed useful data sources.
• We might have internal data sets about income that we are not allowed to use; for example, more is permitted for protecting our customers and the bank from fraud than for marketing.
Therefore:
• We'd like to see some creative use of the data available outside of our bank.
Possible Data Sources
• Open data
  • Income resources (gehalt.de, stepstone.de)
  • Vacancy websites (monster.de, stepstone.de)
  • Public social profiles
  • Statistics bureau (Destatis)
  • Public data from work councils (Verdi, etc.)
  • Public statistical data on prices (immobilienscout.de)
• Closed data
  • Private social profiles (Facebook, etc.)
  • Professional networks (XING, LinkedIn, etc.)
• Questionnaires
  • Ask people to give you the information
How to get the data?
• Methods
  • Scraping
  • Public APIs
• Technologies
  • Python libraries
    • scrapy -> https://scrapy.org/
    • lxml & requests -> http://docs.python-guide.org/en/latest/scenarios/scrape/
  • Java libraries
    • Jaunt -> http://jaunt-api.com/
    • Jsoup -> https://jsoup.org/
    • jArvest -> http://sing.ei.uvigo.es/jarvest/
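As a rough idea of what the scraping step involves, here is a minimal sketch. It deliberately uses only Python's standard-library `html.parser` so that it runs without installing anything; in practice scrapy or lxml & requests from the list above would likely be more convenient. The HTML snippet, the CSS class names, and the job titles are invented for the example and do not come from any real site.

```python
from html.parser import HTMLParser

# Hypothetical HTML, standing in for a downloaded salary-listing page.
SAMPLE_HTML = """
<table>
  <tr><td class="job">Data Engineer</td><td class="pay">62.000 €</td></tr>
  <tr><td class="job">Bank Clerk</td><td class="pay">41.500 €</td></tr>
</table>
"""

class SalaryParser(HTMLParser):
    """Collects one dict per table row from cells marked class="job" / class="pay"."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None    # name of the cell currently being read, if any
        self._current = {}    # cells collected for the current row

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "td" and cls in ("job", "pay"):
            self._field = cls

    def handle_data(self, data):
        if self._field:
            self._current[self._field] = self._current.get(self._field, "") + data

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr" and self._current:
            self.rows.append(self._current)
            self._current = {}

def parse_euro(text):
    """Turn a German-formatted amount such as '62.000 €' into an int."""
    return int(text.replace("€", "").replace(".", "").strip())

parser = SalaryParser()
parser.feed(SAMPLE_HTML)
salaries = {row["job"].strip(): parse_euro(row["pay"]) for row in parser.rows}
```

Before scraping a real site, check its terms of use and robots.txt; that check is part of the legal assessment the challenge asks for.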
How to design?
Open Interfaces
Service Oriented Architecture
• Easy to consume
• Easy to understand
• Easy to integrate
• Easy to deploy
• Easy to discover
• Easy to version
• Rely on open standards
• Based on APIs
API design - Fine grained control over the detail of the result
Abstract - high degree of aggregation
https://income.de?postal=604**&age=30-50
{
  "income": [
    {"range": "55000-70000", "probability": 0.86},
    {"range": "45000-60000", "probability": 0.78}
  ]
}
API design - Fine grained control over the detail of the result
Detailed - high degree of personalization
https://income.de?street=heerstrasse&postal=60488&name=georgios-gkekas&linkedinid=0672474
{
  "income": [
    {"range": "69120-70300", "probability": 0.93},
    {"range": "61450-63900", "probability": 0.865}
  ]
}
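The two example requests differ only in which query parameters they carry, so a service could decide the level of detail from the parameters alone. The sketch below shows one way to do that with the Python standard library. The function names and the rule "any of street, name or linkedinid means a detailed request" are assumptions for illustration; the slides do not specify how the API decides.

```python
import json
from urllib.parse import urlsplit, parse_qs

# Assumption: these parameters (taken from the detailed example URL) identify one person.
PERSONAL_KEYS = {"street", "name", "linkedinid"}

def detail_level(url):
    """Classify a request as 'detailed' or 'abstract' from its query parameters."""
    params = parse_qs(urlsplit(url).query)
    return "detailed" if PERSONAL_KEYS.intersection(params) else "abstract"

def build_response(estimates):
    """Serialize (low, high, probability) tuples into the response shape shown above."""
    body = {
        "income": [
            {"range": f"{low}-{high}", "probability": prob}
            for low, high, prob in estimates
        ]
    }
    return json.dumps(body)

abstract_url = "https://income.de?postal=604**&age=30-50"
detailed_url = (
    "https://income.de?street=heerstrasse&postal=60488"
    "&name=georgios-gkekas&linkedinid=0672474"
)
```

Usage: `detail_level(abstract_url)` classifies the first request as abstract, and `build_response([(55000, 70000, 0.86)])` produces a JSON string in the same shape as the example responses. Probabilities are written as numbers between 0 and 1 so that the output is valid JSON.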
What do we want from you?
Data is more important than insights / analytics
Steps, from higher importance to lower importance:
Scrape data -> Gather unstructured data -> Structured data -> Useful information -> Understand data -> Estimate income
Implementation / Method: Sources, APIs, Scraping, Data mining, ML
API design - Raw data retrieval
You could also design an API for getting JUST the data from the various data sources
• Raw data
• Aggregated data
• Combined data
• Enriched data
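One possible reading of the four data levels is sketched below. The two sources, their field names, and the derived ratio are invented for the example; the slides do not define what each level should contain, so treat this as one interpretation rather than the expected design.

```python
from statistics import median

# Raw data: hypothetical records as they might come out of two scraped sources.
raw_salaries = [
    {"postal": "60488", "job": "engineer", "salary": 62000},
    {"postal": "60488", "job": "clerk", "salary": 41000},
    {"postal": "60311", "job": "engineer", "salary": 68000},
]
raw_rents = [
    {"postal": "60488", "rent_per_m2": 11.5},
    {"postal": "60311", "rent_per_m2": 16.0},
]

def aggregated(records):
    """Aggregated data: median salary per postal code."""
    by_postal = {}
    for record in records:
        by_postal.setdefault(record["postal"], []).append(record["salary"])
    return {postal: median(values) for postal, values in by_postal.items()}

def combined(salary_by_postal, rents):
    """Combined data: join the salary aggregate with the rent source on postal code."""
    rent_by_postal = {r["postal"]: r["rent_per_m2"] for r in rents}
    return [
        {"postal": postal, "median_salary": salary, "rent_per_m2": rent_by_postal[postal]}
        for postal, salary in sorted(salary_by_postal.items())
        if postal in rent_by_postal
    ]

def enriched(rows):
    """Enriched data: add a derived field (here an assumed salary-to-rent ratio)."""
    return [
        {**row, "salary_to_rent": round(row["median_salary"] / row["rent_per_m2"])}
        for row in rows
    ]

result = enriched(combined(aggregated(raw_salaries), raw_rents))
```

Exposing each level as its own endpoint would let a consumer choose between the untouched source records and progressively more processed views.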
Proposal for the first deliverables
• Present your data sources
• Present an evaluation of legal considerations & access methods
• Present method for structuring the data
• Define API/interface
• High-level architecture
  • Tech stack
  • Data flow
• High-level implementation plan
  • Work packages
• Make sure your proposal is feasible under the time constraints
• Supporting materials
  • Not desired but up to 2 pages if necessary to convince us
Rules
- You are not working for DiBa; your ideas, solutions and actions are yours.
- Financial matters and related data are private and sensitive; treat them as such.
- When in doubt, contact us or your professor.
Thank you!
Bart Buter – [email protected] 069 / 27 222 66776
Georgios Gkekas – [email protected] 069 / 27 222 69371
International Advanced Analytics
ING-DiBa AG, Theodor-Heuss-Allee 2, 60486 Frankfurt am Main
www.ing-diba.de
YouTube.com/ingdiba
@ING_DiBa_Presse
Instagram.com/ingdiba
Facebook.com/ingdiba