Shawn D’souza, Oct 2012
TORQUE IT SOLUTIONS
Empowering Automotive Finance
BTECH 451 Data Integration
Final Presentation
Overview
• Definition
• Need for DI
• Challenges for DI
• Approaches
• Previously
• Technical details
• Demo
• Solution analysis
• Future work
• Conclusion
• Experience gained
• Thank you
Data Integration
Definition
Data integration involves combining data residing in different sources and providing users with a unified view of these data.[1]
[1] Maurizio Lenzerini (2002). "Data Integration: A Theoretical Perspective". PODS 2002. pp. 233–246
Data warehouse
Pros:
• Reports run against the Data Warehouse rather than your production database, so your production database can be dedicated to transactional processing rather than reporting
• Reporting can be faster
• Static metadata is provided in the Data Warehouse
Cons:
• Building or buying pre-built Data Warehouses is more expensive than a Live Data strategy
• “IT intensive” with heavy reliance on IT support
• Resource intensive to manage, maintain, and provide additional content on an ongoing basis
• The frequency of data being refreshed in the Data Warehouse may impact reporting
• Requires additional database software to store data and ETL software to populate your Data Warehouse
Live Reporting
Pros:
• Less costly
• Less complicated
• “IT Lite” with much less reliance on IT resources
• Reports run against live production data rather than a Data Warehouse, so you know all data returned in reports is guaranteed to be the most recent data in the DPMS environment
• Reports may run 10 to 30 times faster with Live Data reporting than with existing DPMS
Cons:
• If POS tables are purged often, the tables will have to be copied first if you want to report historical information with a Live Data strategy
• Report processing is shared with transactional processing on the DPMS database
The need for DI
• Querying on business activities for statistical analysis, online analytical processing (OLAP), and data mining, in order to enable forecasting, decision making, enterprise-wide planning and, in the end, to gain sustainable competitive advantages
• Requirements for improved customer service or self-service
Challenges for DI
• Data quality
  • The data integration team must promote data quality to a first-class citizen.
• Transparency and auditability
  • Even high-quality results will be questioned by business consumers. Providing complete transparency into how the data results were produced will be necessary to relieve business consumers’ concerns around data quality.
• Tracking history
  • The ability to correctly report results at a particular period in time is an ongoing challenge, particularly when there are adjustments to historical data.
• Reducing processing times
  • Efficiently processing very large volumes of data within ever-shortening processing windows is an ongoing challenge for the data integration team.
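The "tracking history" challenge can be illustrated with a small sketch. It assumes, purely for illustration, that adjustments are appended as new dated records instead of overwriting old ones, so a report can ask what was known on a given date. The data, the field layout, and the `price_as_of` helper are all hypothetical and are not taken from the project.

```python
from datetime import date

# Hypothetical history: each adjustment is appended rather than overwritten.
# Fields: (vehicle id, recorded on, sale price)
HISTORY = [
    (1, date(2012, 3, 1), 20000),
    (1, date(2012, 6, 15), 18500),  # later adjustment to the same sale
    (2, date(2012, 4, 10), 9000),
]


def price_as_of(history, vehicle_id, as_of):
    """Return the latest price recorded on or before `as_of`, or None."""
    candidates = [
        (recorded_on, price)
        for vid, recorded_on, price in history
        if vid == vehicle_id and recorded_on <= as_of
    ]
    if not candidates:
        return None
    # Tuples compare by date first, so max() picks the most recent record.
    return max(candidates)[1]
```

A report dated 1 May 2012 would use the original figure for vehicle 1, while a report dated 1 July 2012 would pick up the adjustment.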
Approaches to DI
[Dittrich and Jonscher, 1999], All Together Now — Towards Integrating the World’s Information Systems
• Manual Integration
  • Users directly interact with all relevant information systems and manually integrate selected data
• Common User Interface
  • The user is supplied with a common user interface (e.g., a web browser) that provides a uniform look and feel
• Integration by Applications
  • Applications that access various data sources and return integrated results to the user
• Integration by Middleware
  • Reusable functionality that is generally used to solve dedicated aspects of the integration problem
• Uniform Data Access
  • A logical integration of data is accomplished at the data access level
• Common Data Storage
  • Physical data integration is performed by transferring data to a new data storage
DI Strategies
• Enterprise Information Integration (EII) – This pattern loosely couples multiple data stores by creating a semantic layer above the data stores and using industry-standard APIs such as ODBC, OLE-DB, and JDBC to access the data in real time.
• Enterprise Application Integration (EAI) – This pattern supports business processes and workflows that span multiple application systems. It typically works on a message-/event-based model and is not data-centric (i.e., it is parameter-based and does not pass more than one “record” at a time).
• Extract, Transform, and Load (ETL) – This pattern extracts data from sources, transforms the data in memory and then loads it into a destination.
• Extract, Load, and Transform (ELT) – This pattern first extracts data from sources and loads it into a relational database. The transformation is then performed within the relational database and not in memory.
• Replication – This is a relational database feature that detects changed records in a source and pushes the changed records to a destination or destinations. The destination is typically a mirror of the source, meaning that the data is not transformed on the way from source to destination.
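The difference between the ETL and ELT patterns can be sketched as follows. This is a minimal illustration only: it uses Python's built-in `sqlite3` module as a stand-in for the relational database, and the sample rows, table names, and cleaning rules are assumptions made up for the example.

```python
import sqlite3

# Hypothetical source rows: (vehicle id, price as text, dealer name).
SOURCE_ROWS = [
    (1, "19,990", " north motors "),
    (2, "8,500", "City Cars"),
]


def etl(rows):
    """ETL: transform in application memory, then load the cleaned rows."""
    dest = sqlite3.connect(":memory:")
    dest.execute("CREATE TABLE vehicle (id INTEGER, price INTEGER, dealer TEXT)")
    cleaned = [
        (vid, int(price.replace(",", "")), dealer.strip().upper())
        for vid, price, dealer in rows
    ]
    dest.executemany("INSERT INTO vehicle VALUES (?, ?, ?)", cleaned)
    return dest.execute(
        "SELECT id, price, dealer FROM vehicle ORDER BY id"
    ).fetchall()


def elt(rows):
    """ELT: load raw rows into a staging table, then transform with SQL."""
    dest = sqlite3.connect(":memory:")
    dest.execute("CREATE TABLE staging (id INTEGER, price TEXT, dealer TEXT)")
    dest.executemany("INSERT INTO staging VALUES (?, ?, ?)", rows)
    # The transformation runs inside the database, not in Python.
    dest.execute(
        """
        CREATE TABLE vehicle AS
        SELECT id,
               CAST(REPLACE(price, ',', '') AS INTEGER) AS price,
               UPPER(TRIM(dealer)) AS dealer
        FROM staging
        """
    )
    return dest.execute(
        "SELECT id, price, dealer FROM vehicle ORDER BY id"
    ).fetchall()
```

Both functions end with the same destination rows. What differs is where the transformation work happens: in the importing process for ETL, and in the database engine for ELT.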
Previously
The Company
Torque IT Solutions
• Provides IT solutions for automotive finance companies and car dealerships
• Start-up
• Dealer Performance Management System
• Show profit potential
The Goal
Implement an interface that will allow users to import data from external databases
The Architecture
[Architecture diagram. Labels: Overnight Database Dump Pull; Customised import process; D.P.M.S; Dealer’s Vehicle Database; Export File; P.O.S; Standardised D.P.M.S Database; Reference Tables; Ad-hoc Pull]
Implementation
Code structure
<< ILogImportService >>
+SetExternalLogImportSource
+GetSearchParameters()
+GetSearchResults
PosImportService
+SetExternalLogImportSource
+GetSearchParameters()
+GetSearchResults
DpmsImportService
+SetExternalLogImportSource
+GetSearchParameters()
+GetSearchResults
Request from Presentation Layer
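The code structure above, with one interface and one implementation per external source, could be sketched roughly as below. This is a Python stand-in for illustration and not the project's .NET code: only the three member names and the two service names come from the slide. The method signatures, the in-memory helper class, the field names, and `handle_request` are all assumptions.

```python
from abc import ABC, abstractmethod


class LogImportService(ABC):
    """Python stand-in for the ILogImportService interface on the slide."""

    @abstractmethod
    def set_external_log_import_source(self, source): ...

    @abstractmethod
    def get_search_parameters(self): ...

    @abstractmethod
    def get_search_results(self, criteria): ...


class _InMemoryImportService(LogImportService):
    """Hypothetical helper: searches a list of dicts by exact match."""

    search_fields = ()

    def set_external_log_import_source(self, source):
        self._rows = list(source)

    def get_search_parameters(self):
        return list(self.search_fields)

    def get_search_results(self, criteria):
        return [
            row for row in self._rows
            if all(row.get(k) == v for k, v in criteria.items())
        ]


class PosImportService(_InMemoryImportService):
    search_fields = ("registration", "customer")  # assumed field names


class DpmsImportService(_InMemoryImportService):
    search_fields = ("stock_number", "dealer")  # assumed field names


def handle_request(service, source, criteria):
    """What a presentation-layer request might do with any implementation."""
    service.set_external_log_import_source(source)
    unknown = set(criteria) - set(service.get_search_parameters())
    if unknown:
        raise ValueError(f"unsupported search parameters: {sorted(unknown)}")
    return service.get_search_results(criteria)
```

Because the caller depends only on the interface, a new external source would need a new implementation class and nothing else.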
Demo
[Screenshots: Search screen; Results screen; Imported entry]
Solution Analysis
• Pros
  • Flexibility – allows new external data sources to be easily configured
• Cons
  • Exact match
Future Work
• Limitations
• Bulk Import
• Edge server caching
Edge Server Caching
• Database caching at edge servers enables dynamic content to be replicated at the edge of the network, thereby improving the scalability and the response time of Web applications.
• Integrates data service technology and edge server data replication architecture, in order to improve Web services’ data performance and address a variety of data issues in the SOA network.
Advantages of Edge Server Caching
• Provide data services with edge server data replication to clients
• Increase data service performance
• Reduce client-perceived response time
• Make data consistency easier to achieve
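The idea of serving replicated data from the edge can be sketched as a small read-through cache. This is an illustrative sketch only, since the presentation does not describe an implementation. The `EdgeCache` class, its time-to-live policy, and the `invalidate` hook are assumptions.

```python
import time


class EdgeCache:
    """Read-through cache an edge server might keep in front of an origin."""

    def __init__(self, fetch_from_origin, ttl_seconds=30.0, clock=time.monotonic):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)
        self.origin_calls = 0

    def get(self, key):
        entry = self._entries.get(key)
        now = self._clock()
        if entry is not None and entry[0] > now:
            return entry[1]  # served at the edge, no origin round trip
        self.origin_calls += 1
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key, for example when the origin reports a change."""
        self._entries.pop(key, None)
```

Repeated reads of the same key within the time-to-live never reach the origin, which is where the response-time gain comes from. The trade-off is that a cached value can be stale until it expires or is invalidated.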
Conclusion
• Importance of DI
• Issues for DI
• How you can improve DI
• Scalability considerations for DI
Technologies Used & Experience Gained
• NHibernate
• MVC .NET
• jQuery
• SQL Server
[Word cloud: Data Access Layer, HTML, JavaScript, jQuery, Repositories, SVN, XML, JSON, Controller, View, Razor, ASP.NET, MVC3, Client Events, Persistence, Data Objects, Interface, Session, NHibernate, Lambda expressions, N-Tier, Serialization, SOA, UI]
Thank You
Xinfeg Ye, Academic Advisor
Frederik Dinkelaker, Industrial Mentor