Fi nf068c73aef66f694f31a049aff3f4

Shawn D’souza, Oct 2012, TORQUE IT SOLUTIONS: Empowering Automotive Finance, BTECH 451 Data Integration, Final Presentation

Transcript of Fi nf068c73aef66f694f31a049aff3f4

Page 1: Fi nf068c73aef66f694f31a049aff3f4

Shawn D’souza, Oct 2012

TORQUE IT SOLUTIONS

Empowering Automotive Finance

BTECH 451: Data Integration

Final Presentation

Page 2: Fi nf068c73aef66f694f31a049aff3f4

Overview

• Definition
• Need for DI
• Challenges for DI
• Approaches
• Previously
• Technical details
• Demo
• Solution analysis
• Future work
• Conclusion
• Experience gained
• Thank you

Page 3: Fi nf068c73aef66f694f31a049aff3f4

Data Integration

Page 4: Fi nf068c73aef66f694f31a049aff3f4

Definition

Data integration involves combining data residing in different sources and providing users with a unified view of these data.[1]

[1] Maurizio Lenzerini (2002). "Data Integration: A Theoretical Perspective". PODS 2002. pp. 233–246

Page 5: Fi nf068c73aef66f694f31a049aff3f4

Data warehouse

Pros:
• Reports run against the Data Warehouse rather than your production database, so your production database can be dedicated to transactional processing rather than reporting
• Reporting can be faster
• Static Metadata is provided in the Data Warehouse

Cons:
• Building or buying pre-built Data Warehouses is more expensive than a Live Data strategy
• “IT intensive” with heavy reliance on IT support
• Resource intensive to manage, maintain, and provide additional content on an ongoing basis
• The frequency of data being refreshed in the Data Warehouse may impact reporting
• Requires additional database software to store data and ETL software to populate your Data Warehouse

Live Reporting

Pros:
• Less costly
• Less complicated
• “IT Lite” with much less reliance on IT resources
• Reports run against live production data rather than a Data Warehouse, so all data returned in reports is guaranteed to be the most recent data in the DPMS environment
• Reports may run up to 10 to 30 times faster with Live Data reporting than with the existing DPMS

Cons:
• If POS tables are purged often, the tables will have to be copied first if you want to report historical information with a Live Data strategy
• Report processing is shared with transactional processing on the DPMS database

Page 6: Fi nf068c73aef66f694f31a049aff3f4

The need for DI

• Querying business activities for statistical analysis, online analytical processing (OLAP), and data mining, in order to enable forecasting, decision making, and enterprise-wide planning and, in the end, to gain sustainable competitive advantages
• Requirements for improved customer service or self-service

Page 7: Fi nf068c73aef66f694f31a049aff3f4

Challenges for DI

• Data quality
  • The data integration team must promote data quality to a first-class citizen.
• Transparency and auditability
  • Even high-quality results will be questioned by business consumers. Providing complete transparency into how the data results were produced will be necessary to relieve business consumers’ concerns around data quality.
• Tracking history
  • The ability to correctly report results at a particular period in time is an ongoing challenge, particularly when there are adjustments to historical data.
• Reducing processing times
  • Efficiently processing very large volumes of data within ever-shortening processing windows is an ongoing challenge for the data integration team.
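
The tracking-history challenge can be illustrated with a minimal Python sketch that is not part of the presentation. It assumes each adjustment is stored as a new row with the date it was recorded, rather than overwriting the old value, so a report can be reproduced as of any date. The table name `balances`, the column `recorded_on`, and the figures are all hypothetical.

```python
import sqlite3


def balance_as_of(conn, account, as_of):
    """Return the latest amount recorded for `account` on or before `as_of`."""
    row = conn.execute(
        "SELECT amount FROM balances "
        "WHERE account = ? AND recorded_on <= ? "
        "ORDER BY recorded_on DESC LIMIT 1",
        (account, as_of),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE balances (account TEXT, amount REAL, recorded_on TEXT)")
conn.executemany(
    "INSERT INTO balances VALUES (?, ?, ?)",
    [
        ("A-1", 100.0, "2012-01-31"),  # figure as first reported
        ("A-1", 120.0, "2012-03-15"),  # later adjustment, stored as a new row
    ],
)
```

Because ISO-formatted dates sort correctly as text, `balance_as_of(conn, "A-1", "2012-02-28")` would still see the figure as it stood before the adjustment.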

Page 8: Fi nf068c73aef66f694f31a049aff3f4

Approaches to DI

[Dittrich and Jonscher, 1999], All Together Now — Towards Integrating the World’s Information Systems

Page 9: Fi nf068c73aef66f694f31a049aff3f4

Approaches to DI

• Manual Integration
  • Users directly interact with all relevant information systems and manually integrate selected data
• Common User Interface
  • The user is supplied with a common user interface (e.g., a web browser) that provides a uniform look and feel
• Integration by Applications
  • Applications that access various data sources and return integrated results to the user
• Integration by Middleware
  • Reusable functionality that is generally used to solve dedicated aspects of the integration problem
• Uniform Data Access
  • A logical integration of data is accomplished at the data access level
• Common Data Storage
  • Physical data integration is performed by transferring data to a new data storage

[Dittrich and Jonscher, 1999], All Together Now — Towards Integrating the World’s Information Systems

Page 10: Fi nf068c73aef66f694f31a049aff3f4

DI Strategies

• Enterprise Information Integration (EII) – This pattern loosely couples multiple data stores by creating a semantic layer above the data stores and using industry-standard APIs such as ODBC, OLE-DB, and JDBC to access the data in real time.

• Enterprise Application Integration (EAI) – This pattern supports business processes and workflows that span multiple application systems. It typically works on a message-/event-based model and is not data-centric (i.e., it is parameter-based and does not pass more than one “record” at a time).

• Extract, Transform, and Load (ETL) – This pattern extracts data from sources, transforms the data in memory and then loads it into a destination.

• Extract, Load, and Transform (ELT) – This pattern first extracts data from sources and loads it into a relational database. The transformation is then performed within the relational database and not in memory.

• Replication – This is a relational database feature that detects changed records in a source and pushes the changed records to a destination or destinations. The destination is typically a mirror of the source, meaning that the data is not transformed on the way from source to destination.
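
The difference between the ETL and ELT patterns above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not code from the project: the sample rows, the table names (`vehicles_etl`, `staging`, `vehicles_elt`), and the cleaning rules are hypothetical, and an in-memory SQLite database stands in for the destination.

```python
import sqlite3

# Hypothetical raw extract: untrimmed makes, prices as text.
SOURCE_ROWS = [("  toyota ", "12000"), ("FORD", "8500")]


def etl(conn, rows):
    """ETL: transform in application memory, then load the cleaned rows."""
    cleaned = [(make.strip().upper(), int(price)) for make, price in rows]
    conn.execute("CREATE TABLE vehicles_etl (make TEXT, price INTEGER)")
    conn.executemany("INSERT INTO vehicles_etl VALUES (?, ?)", cleaned)


def elt(conn, rows):
    """ELT: load raw rows into a staging table, then transform with SQL."""
    conn.execute("CREATE TABLE staging (make TEXT, price TEXT)")
    conn.executemany("INSERT INTO staging VALUES (?, ?)", rows)
    conn.execute(
        "CREATE TABLE vehicles_elt AS "
        "SELECT UPPER(TRIM(make)) AS make, CAST(price AS INTEGER) AS price "
        "FROM staging"
    )


conn = sqlite3.connect(":memory:")
etl(conn, SOURCE_ROWS)
elt(conn, SOURCE_ROWS)
etl_rows = conn.execute("SELECT make, price FROM vehicles_etl ORDER BY make").fetchall()
elt_rows = conn.execute("SELECT make, price FROM vehicles_elt ORDER BY make").fetchall()
```

Both paths end with the same cleaned table; what differs is where the transformation runs (application memory for ETL, the database engine for ELT).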

Page 11: Fi nf068c73aef66f694f31a049aff3f4

Previously

Page 12: Fi nf068c73aef66f694f31a049aff3f4

The Company

Torque IT Solutions
• Provides I.T Solutions For Automotive Finance Companies And Car Dealerships
• Start-up
• Dealer Performance Management System
• Show Profit Potential

Page 13: Fi nf068c73aef66f694f31a049aff3f4

The Goal

Implement an interface that will allow users to import data from external databases

Page 14: Fi nf068c73aef66f694f31a049aff3f4

The Architecture

[Architecture diagram. Components: Dealer’s Vehicle Database, Export File, P.O.S, Customised import process, Standardised D.P.M.S Database, Reference Tables, D.P.M.S. Flow labels: Overnight Database Dump, Pull, Ad-hoc Pull.]

Page 15: Fi nf068c73aef66f694f31a049aff3f4

Implementation

Page 16: Fi nf068c73aef66f694f31a049aff3f4

Code structure

<< ILogImportService >>
+SetExternalLogImportSource()
+GetSearchParameters()
+GetSearchResults()

PosImportService
+SetExternalLogImportSource()
+GetSearchParameters()
+GetSearchResults()

DpmsImportService
+SetExternalLogImportSource()
+GetSearchParameters()
+GetSearchResults()

Request from Presentation Layer
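
The class diagram above can be approximated by the following Python sketch. Only the interface name, the two service names, and the three method names come from the slide; the parameters, return types, the shared `InMemoryImportService` base class, the `search_fields` values, and the `handle_request` function are assumptions made for illustration. The project itself is built on ASP.NET MVC, so the real code is presumably not Python.

```python
from abc import ABC, abstractmethod


class ILogImportService(ABC):
    """Common contract; method names follow the slide, signatures are assumed."""

    @abstractmethod
    def set_external_log_import_source(self, source): ...

    @abstractmethod
    def get_search_parameters(self): ...

    @abstractmethod
    def get_search_results(self, criteria): ...


class InMemoryImportService(ILogImportService):
    """Hypothetical shared behaviour: exact-match search over dict records."""

    search_fields = ()

    def __init__(self):
        self._records = []

    def set_external_log_import_source(self, source):
        self._records = list(source)

    def get_search_parameters(self):
        return list(self.search_fields)

    def get_search_results(self, criteria):
        return [
            record for record in self._records
            if all(record.get(key) == value for key, value in criteria.items())
        ]


class PosImportService(InMemoryImportService):
    search_fields = ("registration",)  # assumed field name


class DpmsImportService(InMemoryImportService):
    search_fields = ("dealer", "stock_number")  # assumed field names


def handle_request(service, source, criteria):
    """Stand-in for the presentation layer: it depends only on the interface."""
    service.set_external_log_import_source(source)
    return service.get_search_results(criteria)
```

Because the caller only uses `ILogImportService`, supporting a new external source would mean adding another implementation rather than changing the presentation layer.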

Page 17: Fi nf068c73aef66f694f31a049aff3f4

Demo

Page 18: Fi nf068c73aef66f694f31a049aff3f4

Search screen

Page 19: Fi nf068c73aef66f694f31a049aff3f4

Results screen

Page 20: Fi nf068c73aef66f694f31a049aff3f4

Imported Entry

Page 21: Fi nf068c73aef66f694f31a049aff3f4

Solution Analysis

Page 22: Fi nf068c73aef66f694f31a049aff3f4

Future Work

• Limitations
  • Pros
    • Flexibility – allows new external data sources to be easily configured
  • Cons
    • Exact match
• Bulk Import
• Edge server caching

Page 23: Fi nf068c73aef66f694f31a049aff3f4

Edge Server Caching

• Database caching at edge servers enables dynamic content to be replicated at the edge of the network, thereby improving the scalability and the response time of Web applications.
• Integrates data service technology and edge server data replication architecture, in order to improve Web services' data performance and address a variety of data issues in the SOA network.
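
As a rough illustration of the idea, the Python sketch below shows a read-through cache that an edge server could place in front of the origin database. It is a simplified assumption rather than the architecture the slide refers to: the `EdgeCache` class, the time-to-live policy, and the `origin_lookup` function are hypothetical, and real edge replication also has to handle invalidation and consistency.

```python
import time


class EdgeCache:
    """Read-through cache: serve locally when fresh, else ask the origin."""

    def __init__(self, fetch_from_origin, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expiry time)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit: no round trip to the origin
        value = self._fetch(key)  # cache miss or expired entry
        self._entries[key] = (value, now + self._ttl)
        return value


origin_calls = []


def origin_lookup(key):
    """Hypothetical origin database query; records each call it receives."""
    origin_calls.append(key)
    return f"row-for-{key}"


cache = EdgeCache(origin_lookup, ttl_seconds=60.0)
first = cache.get("dealer-42")   # fetched from the origin
second = cache.get("dealer-42")  # served from the edge cache
```

The second lookup never reaches the origin, which is where the reduced client-perceived response time would come from.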

Page 24: Fi nf068c73aef66f694f31a049aff3f4

Advantages of Edge Server Caching

• Provide data services with edge server data replication to clients
• Increase data service performance
• Reduce client-perceived response time
• Make data consistency easier to achieve

Page 25: Fi nf068c73aef66f694f31a049aff3f4

Conclusion

• Importance of DI
• Issues for DI
• How you can improve DI
• Scalability considerations for DI

Page 26: Fi nf068c73aef66f694f31a049aff3f4

Technologies Used & Experience gained

• NHibernate
• MVC .NET
• jQuery
• SQL Server

[Word cloud: Data Access Layer, HTML, JavaScript, jQuery, Repositories, SVN, XML, JSON, Controller, View, Razor, ASP.NET, MVC3, Client Events, Persistence, Data Objects, Interface, Session, NHibernate, Lambda expressions, N-Tier, Serialization, SOA, UI]

Page 27: Fi nf068c73aef66f694f31a049aff3f4

Thank You

Xinfeg Ye, Academic Advisor

Frederik Dinkelaker, Industrial Mentor