Post on 14-Apr-2017

Shawn D’souza, Oct 2012

TORQUE IT SOLUTIONS

Empowering Automotive Finance

BTECH 451: Data Integration

Final Presentation

Overview

• Definition
• Need for DI
• Challenges for DI
• Approaches
• Previously
• Technical details
• Demo
• Solution analysis
• Future work
• Conclusion
• Experience gained
• Thank you

Data Integration

Definition

Data integration involves combining data residing in different sources and providing users with a unified view of these data.[1]

[1] Maurizio Lenzerini (2002). "Data Integration: A Theoretical Perspective". PODS 2002, pp. 233–246.

Data warehouse

Pros:
• Reports run against the Data Warehouse rather than your production database, so your production database can be dedicated to transactional processing rather than reporting
• Reporting can be faster
• Static metadata is provided in the Data Warehouse

Cons:
• Building or buying pre-built Data Warehouses is more expensive than a Live Data strategy
• “IT intensive”, with heavy reliance on IT support
• Resource intensive to manage, maintain, and provide additional content on an ongoing basis
• The frequency of data being refreshed in the Data Warehouse may impact reporting
• Requires additional database software to store data and ETL software to populate your Data Warehouse

Live Reporting

Pros:
• Less costly
• Less complicated
• “IT Lite”, with much less reliance on IT resources
• Reports run against live production data rather than a Data Warehouse, so all data returned in reports is guaranteed to be the most recent data in the DPMS environment
• Reports may run up to 10 to 30 times faster with Live Data reporting than with the existing DPMS

Cons:
• If POS tables are purged often, the tables will have to be copied first if you want to report historical information with a Live Data strategy
• Report processing is shared with transactional processing on the DPMS database

The need for DI

• Querying on business activities for statistical analysis, online analytical processing (OLAP), and data mining, in order to enable forecasting, decision making, enterprise-wide planning and, in the end, to gain sustainable competitive advantages
• Requirements for improved customer service or self-service

Challenges for DI

• Data quality
  • The data integration team must promote data quality to a first-class citizen.
• Transparency and auditability
  • Even high-quality results will be questioned by business consumers. Providing complete transparency into how the data results were produced will be necessary to relieve business consumers’ concerns around data quality.
• Tracking history
  • The ability to correctly report results at a particular period in time is an ongoing challenge, particularly when there are adjustments to historical data.
• Reducing processing times
  • Efficiently processing very large volumes of data within ever-shortening processing windows is an ongoing challenge for the data integration team.

Approaches to DI

[Dittrich and Jonscher, 1999], All Together Now — Towards Integrating the World’s Information Systems

• Manual Integration
  • Users directly interact with all relevant information systems and manually integrate selected data
• Common User Interface
  • The user is supplied with a common user interface (e.g., a web browser) that provides a uniform look and feel
• Integration by Applications
  • Applications that access various data sources and return integrated results to the user
• Integration by Middleware
  • Reusable functionality that is generally used to solve dedicated aspects of the integration problem
• Uniform Data Access
  • A logical integration of data is accomplished at the data access level
• Common Data Storage
  • Physical data integration is performed by transferring data to a new data storage

DI Strategies

• Enterprise Information Integration (EII) – This pattern loosely couples multiple data stores by creating a semantic layer above the data stores and using industry-standard APIs such as ODBC, OLE-DB, and JDBC to access the data in real time.

• Enterprise Application Integration (EAI) – This pattern supports business processes and workflows that span multiple application systems. It typically works on a message-/event-based model and is not data-centric (i.e., it is parameter-based and does not pass more than one “record” at a time).

• Extract, Transform, and Load (ETL) – This pattern extracts data from sources, transforms the data in memory and then loads it into a destination.

• Extract, Load, and Transform (ELT) – This pattern first extracts data from sources and loads it into a relational database. The transformation is then performed within the relational database and not in memory.

• Replication – This is a relational database feature that detects changed records in a source and pushes the changed records to a destination or destinations. The destination is typically a mirror of the source, meaning that the data is not transformed on the way from source to destination.
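The difference between the ETL and ELT patterns above can be sketched in a few lines. This is only an illustration, not the project's import process: the rows, the table names (`sales_etl`, `staging`, `sales_elt`) and the cleaning rules are all hypothetical, and an in-memory SQLite database stands in for the relational database.

```python
import sqlite3

# Hypothetical source rows: (vehicle id, sale price as text, dealer code)
SOURCE_ROWS = [
    ("V001", " 12500.50 ", "akl"),
    ("V002", "8990", "wlg"),
    ("V003", " 23000 ", "akl"),
]


def run_etl(conn, rows):
    """ETL: transform in application memory, then load the cleaned rows."""
    conn.execute("CREATE TABLE sales_etl (vehicle_id TEXT, price REAL, dealer TEXT)")
    cleaned = [(vid, float(price.strip()), dealer.upper()) for vid, price, dealer in rows]
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?, ?)", cleaned)


def run_elt(conn, rows):
    """ELT: load the raw rows first, then transform inside the database."""
    conn.execute("CREATE TABLE staging (vehicle_id TEXT, price TEXT, dealer TEXT)")
    conn.executemany("INSERT INTO staging VALUES (?, ?, ?)", rows)
    # The transformation is expressed in SQL and executed by the database engine.
    conn.execute(
        "CREATE TABLE sales_elt AS "
        "SELECT vehicle_id, CAST(TRIM(price) AS REAL) AS price, UPPER(dealer) AS dealer "
        "FROM staging"
    )


def fetch_all(conn, table):
    """Return every row of one of the tables created above, ordered by vehicle id."""
    query = f"SELECT vehicle_id, price, dealer FROM {table} ORDER BY vehicle_id"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    run_etl(connection, SOURCE_ROWS)
    run_elt(connection, SOURCE_ROWS)
    print(fetch_all(connection, "sales_etl"))
    print(fetch_all(connection, "sales_elt"))
```

Both routes end with the same cleaned table. What differs is where the work happens: in the importing application for ETL, in the database engine for ELT.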

Previously

The Company

Torque IT Solutions
• Provides IT solutions for automotive finance companies and car dealerships
• Start-up
• Dealer Performance Management System
• Show profit potential

The Goal

Implement an interface that will allow users to import data from external databases

The Architecture

[Architecture diagram. Labels: Dealer’s Vehicle Database, Export File, P.O.S, Overnight Database Dump Pull, Ad-hoc Pull, Customised import process, Reference Tables, Standardised D.P.M.S Database, D.P.M.S]

Implementation

Code structure

<< ILogImportService >>
+SetExternalLogImportSource
+GetSearchParameters()
+GetSearchResults

PosImportService
+SetExternalLogImportSource
+GetSearchParameters()
+GetSearchResults

DpmsImportService
+SetExternalLogImportSource
+GetSearchParameters()
+GetSearchResults

Request from Presentation Layer
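The code structure above can be sketched as follows, assuming (as the diagram suggests) that PosImportService and DpmsImportService both implement ILogImportService. The original project used .NET; this Python version only mirrors the three method names. Everything else is hypothetical: the method arguments, the search parameter names, the in-memory record lists, the shared `_InMemoryImportService` base class and the `handle_request` helper standing in for the presentation layer.

```python
from abc import ABC, abstractmethod


class LogImportService(ABC):
    """Python counterpart of the ILogImportService interface on the slide."""

    @abstractmethod
    def set_external_log_import_source(self, source):
        """Point the service at an external data source."""

    @abstractmethod
    def get_search_parameters(self):
        """Return the field names the user may search on."""

    @abstractmethod
    def get_search_results(self, criteria):
        """Return the records matching the given criteria."""


class _InMemoryImportService(LogImportService):
    """Hypothetical shared base: the 'source' is simply a list of dicts."""

    search_parameters = ()

    def __init__(self):
        self._records = []

    def set_external_log_import_source(self, source):
        self._records = list(source)

    def get_search_parameters(self):
        return list(self.search_parameters)

    def get_search_results(self, criteria):
        unknown = set(criteria) - set(self.search_parameters)
        if unknown:
            raise ValueError(f"unsupported search parameters: {sorted(unknown)}")
        # Exact-match comparison on every supplied field.
        return [
            record
            for record in self._records
            if all(record.get(key) == value for key, value in criteria.items())
        ]


class PosImportService(_InMemoryImportService):
    search_parameters = ("registration", "customer_name")  # assumed fields


class DpmsImportService(_InMemoryImportService):
    search_parameters = ("stock_number", "vin")  # assumed fields


# The presentation layer only needs to know which kind of source was requested.
SERVICES = {"pos": PosImportService, "dpms": DpmsImportService}


def handle_request(source_kind, source, criteria):
    """Pick the service for the requested source and run the search."""
    service = SERVICES[source_kind]()
    service.set_external_log_import_source(source)
    return service.get_search_results(criteria)
```

Under these assumptions, supporting a further external source means adding one more class and one more entry in `SERVICES`; the calling code does not change.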

Demo

[Screenshots: Search screen, Results screen, Imported Entry]

Solution Analysis

• Limitations
• Pros
  • Flexibility – allows new external data sources to be easily configured
• Cons
  • Exact match

Future Work

• Bulk Import
• Edge server caching

Edge Server Caching

• Database caching at edge servers enables dynamic content to be replicated at the edge of the network, thereby improving the scalability and the response time of Web applications.
• Integrates data service technology and edge server data replication architecture, in order to improve Web services’ data performance and address a variety of data issues in the SOA network.
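As a rough illustration of why caching near the client reduces response time, the sketch below places a small read-through cache in front of an origin lookup. It is a deliberate simplification: it uses time-based expiry rather than the data replication architecture described above, and the `EdgeCache` class, its parameters and the origin function are all hypothetical.

```python
import time


class EdgeCache:
    """Read-through cache with time-based expiry, standing in for an edge server."""

    def __init__(self, fetch_from_origin, ttl_seconds=30.0, clock=time.monotonic):
        self._fetch = fetch_from_origin  # callable that queries the origin database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # served from the edge, no round trip to the origin
        value = self._fetch(key)  # miss or expired: go back to the origin
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a cached entry, for example after the origin row has changed."""
        self._entries.pop(key, None)
```

A shorter `ttl_seconds` keeps the edge copy closer to the origin data at the cost of more origin queries, which is the consistency and performance trade-off any such design has to settle.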

Advantages of Edge Server Caching

• Provide data services with edge server data replication to clients
• Increase data service performance
• Reduce client-perceived response time
• Ensure data consistency is more easily achieved

Conclusion

• Importance of DI
• Issues for DI
• How you can improve DI
• Scalability considerations for DI

Technologies Used & Experience Gained

• NHibernate
• MVC .NET
• jQuery
• SQL Server

[Word cloud: Data Access Layer, HTML, JavaScript, jQuery, Repositories, SVN, XML, JSON, Controller, View, Razor, ASP.Net, MVC3, Client Events, Persistence, Data Objects, Interface, Session, NHibernate, Lambda expressions, N-Tier, Serialization, SOA, UI]

Thank You

Xinfeg Ye, Academic Advisor
Frederik Dinkelaker, Industrial Mentor