PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

description

IBM Proof of Technology: "Probeer de mogelijkheden van datamining zelf uit" (Try out the possibilities of data mining yourself), 30-10-2014, Amsterdam, IBM Client Center. Presentation by Laila Fettah & Robin van Tilburg.

Transcript of PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Page 1: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

© 2014 IBM Corporation

An IBM Proof of Technology

IBM SPSS Data Mining Workshop

Laila Fettah – Technical Sales Specialist Advanced Analytics

Robin van Tilburg – Business Analytics Specialty Architect

30 October 2014

Page 2: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

IBM Software

Welcome to the Technical Exploration Center

Introductions

Access restrictions

Restrooms

Emergency Exits

Smoking Policy

Breakfast/Lunch/Snacks – location and times

Special meal requirements?

Page 3: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Introductions

Please introduce yourself

Name and organization

Current integration technologies/tools in use

What do you want out of this Data Mining Workshop?

Page 4: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Agenda

10:00-10:10 Welcome and Introductions

10:10-11:00 Introduction to Predictive Analytics

11:00-11:30 Exercise: Navigating IBM SPSS Modeler

11:30-12:00 Exercise: Predictive in 20 Minutes

12:00-12:45 Lunch

12:45-13:30 Data Mining Methodology and Application

13:30-14:00 Exercise: Data Mining Techniques

14:00-14:30 Exercise: Deployment

14:30-14:45 Wrap-up

Page 5: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Objectives

Introduction to predictive analytics and data mining

Stimulate thinking about how data mining would benefit your organization

Demonstrate ease of use of powerful technology

Get experience in “doing” data mining

See examples of existing customers and their realized ROI/benefits

Page 6: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

“I used to think my job was all about arrests. Chasing bad guys.”

“Now, we figure out where to send patrols to stop crime before it happens.”

Page 7: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Smarter Planet

The world is changing, enabling organizations to make faster, better-informed decisions

INSTRUMENTED: Digital technologies (sensors and other monitoring instruments) are being embedded into every object, system and process.

INTERCONNECTED: In the globalized, networked world, people, systems, objects and processes are connected, and they are communicating with one another in entirely new ways.

INTELLIGENT: All the data generated by digital technology is providing intelligence to help us do things better, improving our responsiveness and our ability to predict and optimize for future events.

Page 8: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

With this change comes an explosion in information …

Variety of Information

Volume of Digital Data

Velocity of Decision Making

… Yet organizations are operating with blind spots

Inefficient Access: 1 in 2 don’t have access to the information across their organization needed to do their jobs

Lack of Insight: 1 in 3 managers frequently make critical decisions without the information they need

Inability to Predict: 3 in 4 business leaders say more predictive information would drive better decisions

Source: IBM Institute for Business Value

Page 9: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Leverage Information To Drive Smarter Business Outcomes

Increase Revenue

Increase Productivity

Reduce Costs

Reduce Risk

Page 10: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

“I used to think my job was all about arrests. Chasing bad guys.”

“Now, we figure out where to send patrols to stop crime before it happens.”

Page 11: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Using data mining, the police can rank the areas of their jurisdiction

Least likely that…

Most likely that…

Page 12: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Why is predictive analytics important to your organization?

“The median ROI for the projects that incorporated predictive technologies was 145%, compared with a median ROI of 89% for those projects that did not.”

Source: IDC, “Predictive Analytics and ROI: Lessons from IDC’s Financial Impact Study”

Page 13: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

SPSS Customers: Business Objectives

Attract the best customers: “What is the likelihood a prospect will respond?”

Retain profitable customers: “Which customers are likely to leave?”

Grow customer value: “What is the most likely next product for each customer?”

Manage Risk: “Which customers are likely to default on a loan?”

Detect and prevent Non-Compliance: “What activities are likely to be fraudulent?”

Page 14: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Enabling the Predictive Analytics Process

Connect & Capture: Data Collection delivers an accurate view of customer attitudes and opinions

Analyse & Predict: Predictive capabilities bring repeatability to ongoing decision making, and drive confidence in your results and decisions

Deliver & Act: Unique deployment technologies and methodologies maximize the impact of analytics in your operation

Page 15: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

SPSS Predictive Analytics Software -- 4 Product Families

Data Collection (surveys): Delivers an accurate view of customer attitudes & opinions

• IBM SPSS Data Collection

Statistics: Drives confidence in your results & decisions

• IBM SPSS Statistics

• IBM SPSS Text Analytics for Surveys (STAFS)

Modeling (data mining): Brings repeatability to ongoing decision making

• IBM SPSS Modeler

• IBM SPSS Text Analytics (TA)

Deployment (automation, scoring service, sharing, …): Maximizes the impact of analytics in your operation

• IBM SPSS Decision Management

• IBM SPSS Collaboration & Deployment Services

Page 16: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Predictive Modeling with Modeler

Page 17: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Predicting Customer Behavior

Marketing activities are driven by predicted customer behavior

[Diagram: enterprise data sources supply data on historic and present customer behavior; data mining turns this into predicted customer behavior]

Enterprise Data Sources: Marketing, Attitudinal, Interaction, Web, Call-center, Operational

Predicted Customer Behavior: Attrition risk, Potential value, Cross sell A, Cross sell B, Credit risk, Fraud risk

Page 18: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Definition of Data Mining

Finding patterns in your data that you can use to do your business better

Business-oriented discovery of patterns producing insight and a predictive capability which can be deployed widely

Process of autonomously retrieving useful information or knowledge (“actionable assets”) from large data stores or sets

“Predictive analysis helps connect data to effective action by drawing reliable conclusions about current conditions and future events.”

Gareth Herschel, Research Director, Gartner Group

Page 19: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Statistical vs. Data Mining Approach

Top-Down Approaches: Query, Search

Bottom-Up Approaches: Data Mining, Text Mining

A Statistical Approach can involve a user forming a theory about a possible relationship in a database and converting that to a hypothesis and testing that hypothesis using a statistical method. It is a manual, user-driven, top-down approach to data analysis. Source: DM Review

• The difference with data mining is that the interrogation of the data is done by the data mining method--rather than by the user. It is a data-driven, self-organizing, bottom-up approach to data analysis that works on large data sets.

* "Statistical Modeling: The Two Cultures," Leo Breiman, Statistical Science, 2001, Vol.16 (3), pp.199-231.

Page 20: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Data Mining: a Different Approach

Top-Down: Query, Search (OLAP, BI)

Bottom-Up: Data Mining, Text/Web Mining

[Figure: business value (vertical axis) increases from measurement (historical) to prediction (future), moving from facts through segments & trends to predictions]

Query & Reporting: How many subscribers did we lose?

OLAP: Which cities were they located in?

Data mining: Which customer types are at risk and why?

Integrated Analytical Solutions: What should we offer this customer today?

Page 21: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

IBM SPSS Modeler

High performance data mining and text analytics workbench

Used for the proactive

• Identification of revenue opportunities

• Reduction of costs

• Increase in productivity

• Forecasting

Allows analytics to be repeated and integrated within business systems

Page 22: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

IBM SPSS Modeler

Page 23: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

IBM SPSS Modeler

Page 24: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Exercise: Predictive in 20 Minutes

Goal:

Identify who has cancelled their contract

Approach:

Use a data extract from a CRM

Define which fields to use

Choose the modeling technique

Automatically generate a model to identify who has cancelled

Review results

Why?

To prevent customers cancelling, by proactively identifying those likely to cancel before they do.

Page 25: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Page 26: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Page 27: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Data mining methodology

CRoss-Industry Standard Process Model for Data Mining

Describes Components of Complete Data Mining Project Cycle

Shows Iterative Nature of Data Mining

Vendor and Industry Neutral

Page 28: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Data Mining Considerations – CRISP-DM

Business Understanding

– What is the goal, what are we trying to achieve?

Data Understanding/Preparation

– Available data (structured/unstructured)

– Relevant factors

– Subject matter expertise

Modeling

– Supervised vs. Unsupervised

– Different types of models (NN vs. Rules)

– Combining models (Meta modeling)

Deployment

– Batch vs. Real-time

Production

– Automation Scheduling

– Champion – Challenger

– Multi-step jobs, conditional logic

Governance

– Version control

– Security and auditing

Page 29: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Business Understanding

Business Problem

Telco Company has seen an increase in Customer Churn.

Problems with the Current Process

Based on analysis, it is not clear which factors drive churn. The business is in reactive rather than proactive mode.

Business Need

The executives have asked the marketing department to identify the customers that are likely to churn and create an action plan to address the problem.

Page 30: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Data Understanding

Do we have historical data that describes our customer behavior?

– Yes, the data is available in the Enterprise Data Warehouse

Do we have historical data of the customers that have churned?

– Yes, we keep that historical data in the EDW as well.

What data do we need? Where is it located?

– Billing data, call data, payment data and demographics

Page 31: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Data Preparation

Aggregate the data so that we have one row for each account

Get the relevant attributes and calculate them if necessary

Demographic data

Call behavioral data

Churn flag
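The aggregation step above can be sketched in a few lines of Python. This is only an illustration: the field names `account_id` and `minutes`, the sample records and the set of churned accounts are all hypothetical and do not come from the workshop data.

```python
from collections import defaultdict

# Hypothetical call records: several rows per account.
calls = [
    {"account_id": "A1", "minutes": 12.0},
    {"account_id": "A1", "minutes": 3.5},
    {"account_id": "A2", "minutes": 40.0},
    {"account_id": "A3", "minutes": 7.0},
    {"account_id": "A3", "minutes": 1.0},
    {"account_id": "A3", "minutes": 2.0},
]
# Hypothetical set of accounts known to have churned.
churned_accounts = {"A3"}


def aggregate_by_account(records, churned):
    """Return one row per account: call count, total and mean minutes, churn flag."""
    totals = defaultdict(lambda: {"n_calls": 0, "total_minutes": 0.0})
    for rec in records:
        row = totals[rec["account_id"]]
        row["n_calls"] += 1
        row["total_minutes"] += rec["minutes"]
    result = []
    for account_id in sorted(totals):
        row = totals[account_id]
        result.append({
            "account_id": account_id,
            "n_calls": row["n_calls"],
            "total_minutes": row["total_minutes"],
            "mean_minutes": row["total_minutes"] / row["n_calls"],
            "churn": account_id in churned,
        })
    return result


account_rows = aggregate_by_account(calls, churned_accounts)
for row in account_rows:
    print(row)
```

Each output row now describes one account, which is the shape a classification model expects: behavioral attributes as inputs and the churn flag as the target.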

Page 32: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Modeling

In this phase, various modeling techniques are selected and applied, and their parameters are calibrated to optimal values. Typically, there are several techniques for the same data mining problem type. Some techniques have specific requirements on the form of data. Therefore, going back to the data preparation phase is often necessary.

Page 33: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Evaluation

Page 34: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Questions Customers Ask That Modeler Helps Answer

Segment

– I know my customers aren’t all the same, but how?

Acquire

– What customer should I be going after?

– Where should I put my new store?

Grow

– I’ve got dozens of products to offer – how do I know the best mix to offer?

– I’m blanketing my customer base with offers, but my returns seem to be diminishing. What am I doing wrong?

Retain

– I wish I knew which customers were most likely to leave me for a competitor.

– I wish I knew which customers were the most profitable.

Fraud/Risk

– I am spending a lot of time reviewing each claim. I wish there was a way of identifying which claims I should focus on.

Page 35: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Rabobank

“After a thorough investigation of the analytical solutions in the market, we selected IBM SPSS for its ease of use for the business users and the extensive insight it provides into customer behavior and profitability. The software generates results rapidly.”

– Paul Groenland, Project manager, database marketing, Rabobank

Business challenge

Rabobank aims to strengthen its position as a market leader in financial services by further developing and expanding its relationship with its private and corporate customers.

Solution

Rabobank uses predictive analytics software from IBM SPSS to create and execute targeted direct marketing and lead generation campaigns. The quality of the leads is higher, so marketing campaigns are much more cost-efficient and effective.

Benefits

– Completion time for marketing campaigns has decreased, on average, by two to four weeks

– The quality of the leads is higher, so marketing campaigns are much more cost-efficient and effective

– Highly targeted support for local banks and advisors. By providing timely and targeted leads, they can quickly respond to changes and to individual customers’ wishes.

Page 36: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Zorg en Zekerheid Uses business analytics to target fraudulent insurance claims

The need:

Processing millions of healthcare records requires surgical precision. For this Netherlands health insurer, this level of efficiency was missing from the process of analyzing claims and invoices to catch fraudulent activity. Manually selecting the data on the basis of predefined risk indicators had proven to be both time-consuming and unreliable in catching those abusing the system.

The solution:

Zorg en Zekerheid deployed a predictive analytics software solution capable of analyzing larger quantities of data, discovering patterns automatically, and catching anomalies in the process with a sharper level of accuracy and efficiency. The software provides a simple, graphical interface to deliver robust data mining, advanced analytics and interactive visualization for business users.

What makes it smarter:

– Propels the fraud investigation process to action within days, instead of multiple weeks, using predictive analytics. Enables lost money to be recovered.

– Captures all relevant data, including hard-copy invoices, which the system scans and archives.

– Aggregates millions of digitally submitted records from multiple data sources and media formats into a central database, so data can be cross-functionally structured and automatically analyzed.

“The analytics solution has doubled our financial results each year since 2007.”

– Andor de Vries, Fraud Analyst, Zorg en Zekerheid

Solution component:

IBM® SPSS Modeler

Page 37: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Data Mining Methods

Unsupervised Learning – Inputs and outputs are unknown; finds useful patterns

Clustering

• Exploratory data analysis

• Reveals natural groups within a data set

• Distance Measure: No prior knowledge about groups or characteristics

• Not always an end in itself

Example: Customer Segmentation

Associations / Sequences

• Finds things that occur together

• Associations can exist between any of the attributes

• Discovers association rules in time-oriented data

• Find the sequence or order of the events

Example: Market Basket Analysis, Next logical purchase

Supervised Learning – Modeler specifies what to predict

Regression / Classification

• Predicts a fixed outcome based on a set of inputs.

• Modeler pre-defines inputs and outputs

Example: Fraudulent insurance claim prediction

Page 38: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Unsupervised Learning - Cluster and Associate

Clustering

– An exploratory data analysis technique

– Reveals natural groups within a data set

– Distance Measure: No prior knowledge about groups or characteristics

– Not always an end in itself
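To make the idea of grouping by a distance measure concrete, here is a minimal sketch in plain Python. It uses k-means as one illustrative clustering technique; the slide does not name an algorithm, and the customer values and the choice of starting centroids are assumptions made up for this example.

```python
import math


def euclidean(a, b):
    """Distance measure between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def k_means(points, centroids, max_iter=100):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    centroids = list(centroids)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(centroids)), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_point(members)
    return assignment, centroids


# Hypothetical customers: (monthly spend, calls per month).
customers = [(10, 2), (12, 3), (11, 2), (80, 30), (85, 28), (90, 33)]
# Starting centroids are simply the first and last customer (an arbitrary choice).
labels, centers = k_means(customers, [customers[0], customers[-1]])
print(labels)
print(centers)
```

No group labels are supplied; the two segments emerge purely from the distances between records, which is what "no prior knowledge about groups" means in practice.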

Associations

– Finds things that occur together – ex: events in a crime incident

– Associations can exist between any of the attributes (no single outcome like Decision Trees)

Sequential Associations

– Discovers association rules in time-oriented data

– Find the sequence or order of the events
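As a rough illustration of "things that occur together", the sketch below counts item pairs in a handful of baskets and keeps the rules that pass a support and confidence threshold. The baskets, product names and thresholds are invented for the example, and this brute-force pair count is a simplification, not the algorithm used by any particular tool.

```python
from itertools import combinations

# Hypothetical market baskets; the product names are illustrative only.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) for single-item rules."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    rules = []
    for a, b in combinations(items, 2):
        pair_count = count({a, b}, transactions)
        if pair_count / n < min_support:
            continue  # the pair is too rare to be interesting
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_count / count({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_count / n, confidence))
    return rules


rules = pair_rules(baskets)
for lhs, rhs, sup, conf in rules:
    print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

Support is the share of baskets containing both items, and confidence is the share of baskets with the left-hand item that also contain the right-hand one. Any attribute can appear on either side of a rule, which is the contrast with a decision tree's single outcome field.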

Page 39: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Supervised Learning - Classification

Neural Networks

– A technique for predicting outcomes based on inputs, where the inputs are weighted on hidden layers

– Behave similarly to the neurons in your brain

– Powerful general function estimators

– Require minimal statistical or mathematical knowledge

Decision Trees and Rule Induction

– Classification systems that predict or classify

– Technique that shows the ‘reasoning’, in contrast with Neural Networks

– Builds sets of easy-to-understand ‘If – Then’ Rules

– Eliminates factors that are unimportant

[Figure: decision tree for “Credit ranking (1=default)”. Root node: Bad 52.01% (n=168), Good 47.99% (n=155), total 323. The first split, on “Paid Weekly/Monthly” (Weekly pay vs. Monthly salary; P-value=0.0000, Chi-square=179.6665, df=1), produces one child node with Bad 86.67% (n=143), Good 13.33% (n=22), total 165, and one with Bad 15.82% (n=25), Good 84.18% (n=133), total 158. Lower-level splits use “Age Categorical” (Chi-square=30.1113 and 58.7255, df=1) and “Social Class” (Management;Clerical vs. Professional; P-value=0.0016, Chi-square=12.0388, df=1).]
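The split statistic on this slide can be checked by hand. The sketch below computes a likelihood-ratio chi-square from the Bad/Good counts of the two child nodes of the first split (143/22 and 25/133). That the slide prints the likelihood-ratio form rather than Pearson's chi-square is an inference from the numbers, not something the slide states.

```python
import math

# Bad/Good counts of the two child nodes of the first split, as printed on the slide.
child_nodes = [
    {"bad": 143, "good": 22},   # child node with 165 cases
    {"bad": 25, "good": 133},   # child node with 158 cases
]


def likelihood_ratio_chi_square(nodes, categories=("bad", "good")):
    """G statistic: 2 * sum(observed * ln(observed / expected)) over all cells."""
    total = sum(node[c] for node in nodes for c in categories)
    column_totals = {c: sum(node[c] for node in nodes) for c in categories}
    g = 0.0
    for node in nodes:
        row_total = sum(node[c] for c in categories)
        for c in categories:
            observed = node[c]
            expected = row_total * column_totals[c] / total
            if observed > 0:  # a zero cell contributes nothing to the sum
                g += observed * math.log(observed / expected)
    return 2.0 * g


g_stat = likelihood_ratio_chi_square(child_nodes)
print(round(g_stat, 2))
```

The result lands very close to the 179.6665 shown for the "Paid Weekly/Monthly" split. A tree builder evaluates such a statistic for each candidate field and splits on the one with the strongest association to the target.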

Page 40: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Anomaly Detection

Anomalies

– Anomaly detection is an exploratory method

– Designed for quick detection of unusual cases or records that should be candidates for further analysis

– These should be regarded as suspected anomalies, which, on closer examination, may or may not turn out to be real
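A very small stand-in for this idea is sketched below: it flags values whose modified z-score (based on the median and the median absolute deviation) exceeds a threshold. This is a simple univariate rule chosen for illustration, not the method used by any specific product, and the claim amounts and the 3.5 threshold are assumptions.

```python
import statistics


def flag_suspected_anomalies(values, threshold=3.5):
    """Return indices whose modified z-score (median/MAD based) exceeds threshold."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread to measure against
    flagged = []
    for i, v in enumerate(values):
        modified_z = 0.6745 * (v - median) / mad
        if abs(modified_z) > threshold:
            flagged.append(i)
    return flagged


# Hypothetical claim amounts; one is far larger than the rest.
claim_amounts = [120, 135, 110, 128, 140, 2500, 125, 132]
suspects = flag_suspected_anomalies(claim_amounts)
print(suspects)
```

The flagged records are only candidates for review. As the slide notes, a closer look may show that a suspected anomaly is perfectly legitimate.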

Page 41: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Disclaimer: Common Sense Check

Page 42: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Richmond Police Department Curbing crime with predictive analytics

The need: Facing a rising crime rate, the Richmond Police Department needed an efficient and cost-effective way to analyze crime data, assess public safety risks and make intelligent decisions about personnel deployment.

The solution: The Department turned to IBM SPSS to deploy a powerful predictive analytics tool that brings data from multiple sources into one data warehouse, discovers hidden relationships in the data, and automatically generates crime forecasts.

What makes it smarter:

– Analyzes extremely large datasets and predicts crime patterns, giving the Department the intelligence it needs to curb crime

– Enables the Department to be efficient about how, where and when to deploy patrol and tactical units

– Demonstrates ability to reduce violent-crime rates (homicide rates dropped 32% from 2006-2007 and an additional 40% from 2007-2008)

“The big performance boost has been for my new guys on the streets. IBM SPSS essentially does the work that is gained only from experience.”

– Stephen Hollifield, Head of Technology, Richmond Police Department

Solution components: IBM SPSS Statistics, IBM SPSS Modeler

IBM Business Partner Information Builders

IBM Business Partner RTI International

Page 43: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Association Classification Segmentation

Exercises

Page 44: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Association Classification Segmentation

Exercises

Page 45: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Association model

Goal:

Identify what products are being sold together

Approach:

Use a data extract from a transactional system

Define which fields to use

Visualize relationship between products

Generate association model

Review results

Why?

Identify next likely purchase

Create bundles to increase $ value

Page 46: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Association Classification Segmentation

Exercises

Page 47: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Segmentation model

Page 48: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Association Classification Segmentation

Hands on sessions

Page 49: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

The importance of text

Because people communicate with words, not numbers, it has become critical to be able to mine text for its meaning and to sort, analyse, and understand it in the same way that data has been tamed. In fact, the two basic types of information complement each other, with data supplying the “what” and text supplying the “why”.

Source: IDC, “Text Analytics: Software’s Missing Piece?”

Page 50: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Text data and text analytics

Around 80% of data held within a company is in the form of unstructured text documents or records:

– Insurance claim notes

– Emails

– Call center logs

– Reports

– Surveys

– Web pages

– Blogs

– …

Text Analytics connects unstructured text data to effective action by drawing reliable conclusions about current conditions and future events

Page 51: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

IBM SPSS Text Analytics

Bring repeatability to ongoing decision making

Page 52: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Sentiment Analysis

Hundreds of customer reviews at a glance…

Page 53: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Text Mining

Free-form notes entries

Linguistic Text Mining:

1. Language analysis

2. Concept extraction

3. Process types, frequencies, & patterns

Integrated structured and unstructured data ready for Predictive Text Analytics
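The flow from free-form notes to model-ready fields can be sketched as follows. This is a deliberately crude stand-in: it matches words against a small hand-made dictionary instead of performing real linguistic analysis, and the concept names, trigger words and record fields are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical concept dictionary: concept name -> trigger words.
CONCEPTS = {
    "billing": {"bill", "invoice", "charge", "charged"},
    "network": {"coverage", "signal", "dropped"},
    "cancel": {"cancel", "terminate", "switch"},
}


def extract_concepts(note):
    """Return a Counter of concept hits found in one free-form note."""
    words = re.findall(r"[a-z]+", note.lower())
    counts = Counter()
    for concept, triggers in CONCEPTS.items():
        counts[concept] = sum(1 for w in words if w in triggers)
    return counts


def add_concept_flags(record):
    """Merge structured fields with 0/1 concept flags derived from the note text."""
    counts = extract_concepts(record["note"])
    flags = {f"mentions_{c}": int(counts[c] > 0) for c in CONCEPTS}
    structured = {k: v for k, v in record.items() if k != "note"}
    return {**structured, **flags}


# Hypothetical call-center record with a free-form note.
record = {
    "customer_id": 42,
    "tenure_months": 18,
    "note": "Customer was charged twice on the last bill and wants to cancel.",
}
row = add_concept_flags(record)
print(row)
```

The resulting row combines the structured fields with flags derived from the text, so a predictive model can use both kinds of input side by side.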

Page 54: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Use Text Analytics results to Improve Predictive Models

Page 55: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

RTL Nederland / InSites Consulting - Analyzing social media buzz to increase TV viewer involvement

The need:

RTL Nederland aimed to evaluate its television programs in the Dutch market and increase viewer satisfaction by making use of online conversations. Therefore, RTL Nederland needed a way to analyze, interpret and successfully respond to audience feedback from social media sources.

The solution:

RTL Nederland worked with InSites Consulting to capture viewer opinions from user-generated comments on social media and other online buzz by using IBM predictive analytics software. This helps RTL Nederland to better understand audience needs and preferences, and hence increase viewer satisfaction and involvement. The insight obtained into viewer likes and dislikes allows RTL Nederland to optimize its product offering.

What makes it smarter:

– Analyzed the sentiment of over 71,000 online conversations about ‘X FACTOR’, providing RTL Nederland with a powerful tool to measure attitudes indirectly and quickly adapt the program accordingly

– Captures unstructured data automatically from the web with sophisticated text analytics technology

– As the final episodes of the reality competition show approached, online buzz about the program increased by about 400 percent, which provided a very rich source of information about viewer opinions

“Collecting and analyzing feedback from social media is of great importance to RTL Nederland in order to offer programmes that are fully aligned with the target audience.”

– Emilie van den Berge, senior Research & Intelligence project leader, RTL Nederland

Page 56: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Classification model

Goal:

Identify who is likely to cancel their contract

Approach:

Use a data extract from a CRM

Use open ended comments from call center

Extract concepts from the text

Define which fields to use

Choose the modeling technique

Automatically generate a model to identify who has cancelled

Review results

Why?

Identify customers at risk before they churn

Unstructured data can provide insight into customers’ actions and improve model accuracy

Page 57: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Association Classification Segmentation

Exercises

Page 58: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Deployment

Page 59: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Deployment

Goal:

Deploy a predictive model

Approach:

Use the stream generated in the earlier session

Pass new data through the stream and ‘score’ the data

Identify those likely to cancel

Export an .xls file with 50 most likely to cancel

Why?

Extend the reach of analytics in an organization

Allows analytics at the point of impact rather than being reactive
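Outside the workbench, the scoring step of this exercise could look roughly like the sketch below. The scoring function is a made-up stand-in for a trained model, the customer fields are hypothetical, and the output is written as CSV (which spreadsheet tools can open) rather than the .xls file used in the exercise.

```python
import csv
import io


def churn_score(customer):
    """Stand-in for a trained model: returns a made-up propensity between 0 and 1."""
    score = 0.1
    if customer["dropped_calls"] > 5:
        score += 0.5
    if customer["months_active"] < 12:
        score += 0.3
    return min(score, 1.0)


def top_n_likely_to_cancel(customers, n=50):
    """Score every customer and return the n highest-scoring rows, highest first."""
    scored = [dict(c, churn_score=churn_score(c)) for c in customers]
    scored.sort(key=lambda row: row["churn_score"], reverse=True)
    return scored[:n]


def write_csv(rows, handle):
    """Write the scored rows as CSV to any file-like object."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical new customers to score.
new_customers = [
    {"customer_id": i, "dropped_calls": i % 10, "months_active": (i * 7) % 36}
    for i in range(200)
]
top_50 = top_n_likely_to_cancel(new_customers, n=50)

buffer = io.StringIO()  # swap for open("top_50.csv", "w", newline="") to write a file
write_csv(top_50, buffer)
print(buffer.getvalue().splitlines()[0])
```

The same pattern applies whatever the model is: pass new records through it, rank by score, and hand the top of the list to the team that can act on it.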

Page 60: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Deployment – integrating with existing systems

A call center agent submits customer information during an interaction

Based on the predictive model, a single offer is presented to the customer

The reaction to the offer is tracked and used to refine the model

Page 61: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Customer Example: Customer Growth from Inbound Contacts

“I’m calling to get my information on my download limit”

“Certainly, Mr. Watson. I’ll just get that for you right now…”

Next Best Action: Recommend Broadband Unlimited

“Mr. Watson, you are currently close to your 10GB monthly limit. However, as a valued long-term customer, we’re able to make you an offer on unlimited broadband”

Page 62: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Deployment – integrating with Cognos BI

1) Leveraging BI, identify problem or situation needing attention

2) SPSS predictive analytics feed results back into the BI layer

3) Results widely distributed via BI for consumption by business users

Cognos BI Common Business Model

Page 63: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Modeler’s Unique Capabilities

Easy to Learn / Intuitive Visual Interface

–Visual approach - no programming

–Comprehensive range of data mining functions

–Flexible deployment options

Powerful Automated Modeling

–Automated data preparation

–Multi-model creation & evaluation

–Integrated analysis of text, web, & survey data

Open and scalable architecture

–Data mining within standard databases with SQL pushback support

–Maximized use of infrastructure with multithreading, clustering and use of embedded algorithms (in-database mining)

–Integration with IBM technologies such as IBM Cognos Business Intelligence, Netezza and IBM InfoSphere Warehouse
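The idea behind SQL pushback, doing the work inside the database instead of pulling every row to the client, can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module and an invented `calls` table; it does not reproduce how Modeler generates or pushes back SQL, it only contrasts the two places where the same aggregation can run.

```python
import sqlite3

# In-memory database with a made-up table of call records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (customer_id INTEGER, minutes REAL)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [(1, 10.0), (1, 5.0), (2, 30.0), (3, 2.5), (3, 2.5)],
)

THRESHOLD = 15.0

# Without pushback: every row travels to the client, which aggregates and filters.
totals_client = {}
for customer_id, minutes in conn.execute("SELECT customer_id, minutes FROM calls"):
    totals_client[customer_id] = totals_client.get(customer_id, 0.0) + minutes
heavy_client = {cid: total for cid, total in totals_client.items() if total >= THRESHOLD}

# With pushback: the database aggregates and filters, returning only the result rows.
heavy_db = dict(
    conn.execute(
        "SELECT customer_id, SUM(minutes) FROM calls "
        "GROUP BY customer_id HAVING SUM(minutes) >= ?",
        (THRESHOLD,),
    )
)
```

Both approaches give the same answer; the difference is how much data leaves the database, which matters once tables hold millions of rows.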

Page 64: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Modeler Editions

IBM SPSS Modeler Professional

–Modeler Professional is a data mining workbench for the analysis of structured numerical data to model outcomes and make predictions that inform business decisions with predictive intelligence.

IBM SPSS Modeler Premium

–Modeler Premium allows organizations to tap into the predictive intelligence held in all forms of data. Modeler Premium goes beyond the analysis of structured numerical data alone and includes information from unstructured data such as web activity, blog content, customer feedback, e-mails, articles, and more to create the most accurate predictive models possible.

Page 65: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

IBM SPSS Modeler Deployment Options

Client (Desktop)

–Access local files

–Connect to operational databases

–Connect to Cognos BI

–Processing performed on local installation

Client/Server

–Data operations/processing on server

–In-database data mining

–SQL pushback

–Modeler Batch

–SuSE Linux Enterprise Server 10 (zLinux)

–Inclusion in Smart Analytics System for Power (AIX)

Page 66: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Workshop Takeaways

Easy to use, visual interface

Short timeframe to be productive with actionable results

Does not require knowledge of a programming language

Business results focused

Cost-effective solution that delivers powerful results across the organization

Flexible licensing and deployment options

Full range of algorithms for your business problems

End-to-end solution

Data preparation through real-time interactions

Use structured, unstructured and survey data

Full suite of products, from data collection through deployment

Page 67: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Workshop Takeaways

Flexible architecture

Leverages the investments already made in technology

Does not require data in a proprietary format or DB

Structured and unstructured data

Open architecture (both inputs and outputs)

SQL Pushback

Page 68: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Predictive analytics customer success

“94% achieved a positive return on investment with an average payback period of 10.7 months.”

“Returns were achieved through reduced costs, increased productivity, increased employee and customer satisfaction, and greater visibility.”

“Flexibility, performance, and price were all key factors in purchase decisions.”

Nucleus Research, an independent provider of global research and advisory services.

“30 Million Euro in new revenue”

“100% increase in campaign effectiveness”

“Reduced churn from 19% to 2%”

“35% reduction in mailing cost, 2X response rate, 29% more profit”

Page 69: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

We appreciate your feedback.

Please fill out the survey form in order to improve this educational event.

Page 70: PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014

Thank You

Contact

Laila Fettah – Client Technical Professional Advanced Analytics

IBM

Johan Huizingalaan 765

1066 VH Amsterdam

Tel: +31 (0)20 513 8950

Mobile: +31 (0)6 11 87 61 55

[email protected]

Robin van Tilburg – Client Technical Professional Advanced Analytics

IBM

Johan Huizingalaan 765

1066 VH Amsterdam

Tel: +31 (0)20 513 8371

Mobile: +31 (0)6 31 04 10 74

[email protected]