Informatica Basics Demo 8.6

  • 7/29/2019 Informatica Basics Demo 8.6

    AN INTRODUCTION

    Presented by: Narendra Reddy.B

    BUSINESS INTELLIGENCE / DATA INTEGRATION / ETL / INTEGRATION

    What is Business Intelligence

    Business Intelligence (BI) encompasses the processes, tools, and technologies required to transform enterprise data into information, and information into knowledge that can be used to enhance decision-making and to create actionable plans that drive effective business activity.

    BI can be used to acquire:

    Tactical insight to optimize business processes by identifying trends, anomalies, and behaviors that require management action.

    Strategic insight to align multiple business processes with key business objectives through integrated performance management and analysis.

    What is Business Intelligence

    Business Intelligence (BI) is about getting the right information, to the right decision makers, at the right time.

    BI is an enterprise-wide platform that supports reporting, analysis, and decision making.

    BI leads to:

    fact-based decision making

    single version of the truth

    BI includes reporting and analytics.

    Used for:

    BI is not a single computer system, but a framework for leveraging data for tactical and strategic use.

    How BI Works Together

    [Diagram: disparate data sources (the OLTP systems ATRRS, RECBASS, and AIMSPC, plus RATSS, RFMSS, and other possible data sources) provide the data input. An Extract, Transform, Load step moves that data into the TIMS DW, a single reporting repository that feeds real-time dashboards, static and ad-hoc reporting, and graphical data analysis.]

    Components of BI

    Data Integration (Informatica, DataStage)

    Data Reporting (Cognos, Business Objects)

    Data Integration

    Data integration involves combining data residing in different sources and providing users with a unified view of these data. This process becomes significant in a variety of situations, both commercial (when two similar companies need to merge their databases) and scientific (combining research results from different bioinformatics repositories, for example).

    Data integration appears with increasing frequency as the volume of existing data, and the need to share it, explodes. It has become the focus of extensive theoretical work, and numerous open problems remain unsolved. In management circles, people frequently refer to data integration as "Enterprise Information Integration" (EII).

    How to enable Data Integration

    USING ETL PROCESS

    ETL (Extract, Transform, Load)

    ETL stands for extract, transform, and load: the processes that enable companies to move data from multiple sources, reformat and cleanse it, and load it into another database, a data mart, or a data warehouse for analysis, or onto another operational system to support a business process.
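    The three ETL steps described above can be sketched in a few lines of Python. This is only an illustration, not Informatica code: the CSV data, the `dim_customer` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the target warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source extract standing in for an operational (OLTP) system.
SOURCE_CSV = """customer_id,name,country
1, alice ,us
2,BOB,US
2,BOB,US
3,carol,
"""


def extract(text):
    """Extract: read raw rows from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: reformat and cleanse rows, dropping exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        record = (
            int(row["customer_id"]),
            row["name"].strip().title(),
            (row["country"] or "").strip().upper() or "UNKNOWN",
        )
        if record not in seen:
            seen.add(record)
            cleaned.append(record)
    return cleaned


def load(rows, conn):
    """Load: write the cleansed rows into a (hypothetical) target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dim_customer "
        "(customer_id INTEGER, name TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def run_etl(conn):
    """Run the three steps in order and return the number of rows loaded."""
    return load(transform(extract(SOURCE_CSV)), conn)
```

    Calling `run_etl(sqlite3.connect(":memory:"))` loads the cleansed, de-duplicated rows into the target table. A tool such as Informatica lets you define the same steps graphically instead of writing them by hand.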

    ETL (Extract, Transform, Load)

    "A properly designed ETL system extracts data from the source systems, enforces data quality and consistency standards, conforms data so that separate sources can be used together, and finally delivers data in a presentation-ready format so that application developers can build applications and end users can make decisions. ETL makes or breaks the data warehouse."

    Ralph Kimball

    ETL (Extract, Transform, Load)

    Informatica 8.6: What Is It and How Does It Work?

    What is Informatica 8.6?

    Informatica is an ETL tool that delivers an open, scalable data integration solution addressing the complete life cycle for data warehouse and analytic application development.

    Informatica provides an environment that can extract data from multiple sources, transform the data according to the business logic built in the Informatica Client application, and load the transformed data into files or relational targets.

    Informatica 8.6 PowerCenter

    PowerCenter provides an environment that allows you to load data into a centralized location, such as a data warehouse or operational data store (ODS). You can extract data from multiple sources, transform the data according to business logic you build in the client application, and load the transformed data into file and relational targets.

    Informatica Architecture 8.6

    Integration Service Architecture

    The Integration Service moves data from sources to targets based on workflow and mapping metadata stored in a repository.

    When a workflow starts, the Integration Service retrieves mapping, workflow, and session metadata from the repository. It extracts data from the mapping sources and stores the data in memory while it applies the transformation rules configured in the mapping. The Integration Service loads the transformed data into one or more targets.

    To move data from sources to targets, the Integration Service uses the following components:

    Integration Service process

    Load Balancer

    Data Transformation Manager (DTM) process

    Integration Service Process (ISP)

    When you save a workflow assigned to an Integration Service to the repository, the Integration Service process adds the workflow to or removes the workflow from the schedule queue.

    Functions:

    Manages workflow scheduling.

    Locks and reads the workflow.

    Reads the parameter file.

    Creates the workflow log.

    Runs workflow tasks and evaluates the conditional links connecting tasks.

    Starts the DTM process or processes to run the session.

    Writes historical run information to the repository.

    Sends post-session email in the event of a DTM failure.

    Load Balancer:

    The Load Balancer is a component of the Integration Service that dispatches tasks to achieve optimal performance and scalability.

    The Load Balancer matches task requirements with resource availability to identify the best node to run a task.

    It dispatches the task to an Integration Service process running on the node. It may dispatch tasks to a single node or across nodes.
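    The idea of matching task requirements against resource availability can be sketched as follows. This is a simplified illustration and not the Load Balancer's actual algorithm: the `Node` and `Task` classes, their fields, and the "most idle CPU wins" rule are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical view of a node's currently available resources."""
    name: str
    free_cpu: float            # fraction of CPU currently idle, 0.0 to 1.0
    free_memory_mb: int
    resources: set = field(default_factory=set)   # e.g. installed DB clients


@dataclass
class Task:
    """Hypothetical description of what a task needs in order to run."""
    name: str
    memory_mb: int
    required_resources: set = field(default_factory=set)


def pick_node(task, nodes):
    """Return the best node for the task, or None if no node qualifies.

    Assumed rule: a node qualifies if it offers every required resource and
    has enough free memory; among qualifying nodes, prefer the most idle CPU.
    """
    candidates = [
        node for node in nodes
        if task.required_resources <= node.resources
        and node.free_memory_mb >= task.memory_mb
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda n: (n.free_cpu, n.free_memory_mb))
```

    With a single node in the list, every qualifying task is dispatched to that node. With several nodes, tasks are spread according to whichever node currently scores best.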

    Data Transformation Manager (DTM) Process

    The DTM process performs the following tasks:

    Retrieves and validates session information from the repository.

    Performs pushdown optimization when the session is configured for pushdown optimization.

    Adds partitions to the session when the session is configured for dynamic partitioning.

    Forms partition groups when the session is configured to run on a grid.

    Expands the service process variables, session parameters, and mapping variables and parameters.

    Creates the session log.

    Validates source and target code pages.

    Verifies connection object permissions.

    Runs pre-session shell commands, stored procedures, and SQL.

    Sends a request to start worker DTM processes on other nodes when the session is configured to run on a grid.

    Creates and runs mapping, reader, writer, and transformation threads to extract, transform, and load data.

    Runs post-session stored procedures, SQL, and shell commands.

    Sends post-session email.

    Processing Threads:

    The DTM allocates process memory for the session and divides it into buffers. This is also known as buffer memory. The default memory allocation is 12,000,000 bytes. The DTM uses multiple threads to process data in a session. The main DTM thread is called the master thread.

    Thread Types:

    The master thread creates different types of threads for a session. The types of threads the master thread creates depend on the pre- and post-session properties, as well as the types of transformations in the mapping.

    The master thread can create the following types of threads:

    Mapping Threads

    Pre- and Post-Session Threads

    Reader Threads

    Transformation Threads

    Writer Threads
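    The way reader, transformation, and writer threads hand rows to one another through buffers can be pictured with the minimal sketch below. It is not PowerCenter code: `run_pipeline`, the buffer size, and the end-of-data sentinel are hypothetical, one thread per stage corresponds to a single partition, and the mapping and pre- and post-session threads are left out.

```python
import queue
import threading

# Hypothetical end-of-data marker passed from stage to stage.
_DONE = object()


def run_pipeline(source_rows, transform, buffer_size=4):
    """Run a one-partition pipeline: reader -> transformation -> writer.

    Bounded queues stand in for the buffers that sit between the threads.
    """
    read_buffer = queue.Queue(maxsize=buffer_size)
    write_buffer = queue.Queue(maxsize=buffer_size)
    loaded = []  # stands in for the target

    def reader():
        # Extract rows from the source and place them in the read buffer.
        for row in source_rows:
            read_buffer.put(row)
        read_buffer.put(_DONE)

    def transformer():
        # Apply the transformation logic and pass rows to the write buffer.
        while True:
            row = read_buffer.get()
            if row is _DONE:
                write_buffer.put(_DONE)
                break
            write_buffer.put(transform(row))

    def writer():
        # Take transformed rows from the write buffer and load the target.
        while True:
            row = write_buffer.get()
            if row is _DONE:
                break
            loaded.append(row)

    threads = [threading.Thread(target=fn) for fn in (reader, transformer, writer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return loaded
```

    For example, `run_pipeline(range(10), lambda x: x * 2)` reads ten rows, doubles each one in the transformation stage, and returns them in source order.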

    Mapping Threads:

    The master thread creates one mapping thread for each session. The mapping thread fetches session and mapping information, compiles the mapping, and cleans up after session execution.

    Pre- and Post-Session Threads:

    The master thread creates one pre-session and one post-session thread to perform pre- and post-session operations.

    Reader Threads:

    The master thread creates reader threads to extract source data. The number of reader threads depends on the partitioning information for each pipeline: it equals the number of partitions.

    Relational sources use relational reader threads, and file sources use file reader threads.

    The Integration Service creates an SQL statement for each reader thread to extract data from a relational source.

    For file sources, the Integration Service can create multiple threads to read a single source.
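    To illustrate "one SQL statement per reader thread", the sketch below builds one query per partition. It assumes key-range partitioning on a numeric column. `build_reader_queries` and the table and column names are hypothetical, and the SQL that the Integration Service actually generates will differ.

```python
def build_reader_queries(table, key_column, boundaries):
    """Return one SELECT statement per partition.

    `boundaries` lists the numeric key-range edges in ascending order, so
    N + 1 boundaries describe N partitions, each covering [low, high).
    Illustration only: values are interpolated directly, so this is not
    suitable for untrusted input.
    """
    queries = []
    for low, high in zip(boundaries, boundaries[1:]):
        queries.append(
            f"SELECT * FROM {table} "
            f"WHERE {key_column} >= {low} AND {key_column} < {high}"
        )
    return queries
```

    Three boundaries such as `[0, 100, 200]` describe two partitions, so the function returns two statements, one for each reader thread.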

    Transformation Threads:

    The master thread creates one or more transformation threads for each partition. Transformation threads process data according to the transformation logic in the mapping.

    The master thread creates transformation threads to transform data received in buffers by the reader thread, move the data from transformation to transformation, and create memory caches when necessary. The number of transformation threads depends on the partitioning information for each pipeline.

    Transformation threads store fully transformed data in a buffer drawn from the memory pool for subsequent access by the writer thread.

    If the pipeline contains a Rank, Joiner, Aggregator, Sorter, or cached Lookup transformation, the transformation thread uses cache memory until it reaches the configured cache size limits. If the transformation thread requires more space, it pages to local cache files to hold additional data.

    When the Integration Service runs in ASCII mode, the transformation threads pass character data in single bytes. When the Integration Service runs in Unicode mode, the transformation threads use double bytes to move character data.
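    The cache paging behavior described above for cached transformations can be pictured with the following sketch. It is an assumption-laden illustration: `SpillingCache` is a hypothetical class, the limit is counted in rows rather than bytes, and a temporary file stands in for the local cache files.

```python
import os
import pickle
import tempfile


class SpillingCache:
    """Hypothetical row cache that pages to a local file when memory is full."""

    def __init__(self, max_rows_in_memory):
        self.max_rows_in_memory = max_rows_in_memory  # assumed limit, in rows
        self._memory = []
        self._spill_file = None
        self.spilled_rows = 0

    def add(self, row):
        """Keep the row in memory if there is room, otherwise page it to disk."""
        if len(self._memory) < self.max_rows_in_memory:
            self._memory.append(row)
            return
        if self._spill_file is None:
            self._spill_file = tempfile.TemporaryFile()
        pickle.dump(row, self._spill_file)
        self.spilled_rows += 1

    def rows(self):
        """Return all cached rows: in-memory rows first, then paged rows."""
        result = list(self._memory)
        if self._spill_file is not None:
            self._spill_file.seek(0)
            for _ in range(self.spilled_rows):
                result.append(pickle.load(self._spill_file))
            # Move back to the end so later add() calls append correctly.
            self._spill_file.seek(0, os.SEEK_END)
        return result
```

    With `SpillingCache(2)`, the first two rows stay in memory and every further row is written to the temporary file, while `rows()` still returns everything in insertion order.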

    Writer Threads:

    The master thread creates one writer thread for each partition if a target exists in the source pipeline. Relational targets use relational writer threads, and file targets use file writer threads.

    The master thread creates writer threads to load target data. The number of writer threads depends on the partitioning information for each pipeline. If the pipeline contains one partition, the master thread creates one writer thread. If it contains multiple partitions, the master thread creates multiple writer threads.

    Each writer thread creates connections to the target databases to load data. If the target is a file, each writer thread creates a separate file. You can configure the session to merge these files.

    If the target is relational, the writer thread takes data from buffers and commits it to session targets. When loading targets, the writer commits data based on the commit interval in the session properties. You can configure a session to commit data based on the number of source rows read, the number of rows written to the target, or the number of rows that pass through a transformation that generates transactions, such as a Transaction Control transformation.
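    A commit interval based on the number of rows written to the target can be sketched as follows. This is a simplified illustration using an in-memory SQLite database: `load_with_commit_interval` and the `target` table are hypothetical, and the real writer's commit points may not fall on exact row counts.

```python
import sqlite3


def load_with_commit_interval(conn, rows, commit_interval):
    """Insert rows and commit after every `commit_interval` rows written.

    Returns the number of commits issued, including a final commit for any
    rows left over after the last full interval.
    """
    # Hypothetical target table for the example.
    conn.execute("CREATE TABLE IF NOT EXISTS target (id INTEGER, amount REAL)")
    commits = 0
    pending = 0
    for row in rows:
        conn.execute("INSERT INTO target VALUES (?, ?)", row)
        pending += 1
        if pending == commit_interval:
            conn.commit()
            commits += 1
            pending = 0
    if pending:
        conn.commit()
        commits += 1
    return commits
```

    Loading 10 rows with a commit interval of 4 issues three commits: after rows 4 and 8, and once more for the remaining 2 rows. A smaller interval limits how much work is lost on failure at the cost of more frequent commits.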

    Informatica Architecture 8.6: Components

    PowerCenter - Domain

    PowerCenter Admin Console

    Informatica PowerCenter Repository Service

    Any Suggestions?

    PowerCenter Client Components

    The Informatica Client is used to manage users, define sources and targets, build mappings and mapplets with the transformation logic, and create sessions to run the mapping logic.

    The Informatica Client has the following main applications:

    Repository Manager

    Designer

    Workflow Manager

    Workflow Monitor

    PowerCenter Repository

    PowerCenter Client Components

    Repository Manager: This is used to create and administer the metadata repository. Repository users and groups are created through the Repository Manager. Assigning privileges and permissions, managing folders in the repository, and managing locks on mappings are also done through the Repository Manager.

    Informatica/PowerCenter Client Components

    Designer: The Designer has five tools that are used to analyze sources, design target schemas, and build source-to-target mappings. These are:

    1. Source Analyzer: This is used to import or create source definitions.

    2. Target Designer: This is used to import or create target definitions.

    3. Mapping Designer: This is used to create mappings that the Integration Service runs to extract, transform, and load data.

    4. Transformation Developer: This is used to develop reusable transformations that can be used in mappings.

    5. Mapplet Designer: This is used to create sets of transformations, referred to as mapplets, that can be used across mappings.

    Informatica/PowerCenter Client Components

    What is the Workflow Manager?

    It is a tool in which you define a set of instructions, called a workflow, to execute the mappings you build in the Designer.

    What are the Workflow Manager tools?

    The Workflow Manager consists of three tools that help you develop a workflow:

    Task Developer. Use the Task Developer to create tasks you want to execute in the workflow.

    Workflow Designer. Use the Workflow Designer to create a workflow by connecting tasks with links. You can also create tasks in the Workflow Designer as you develop the workflow.

    Worklet Designer. Use the Worklet Designer to create a worklet.
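    A workflow built from tasks connected by links can be modeled as a small graph whose links carry conditions. The sketch below is an assumption-based illustration, not the Workflow Manager's data model: `run_workflow`, the status strings, and the task names are hypothetical.

```python
def run_workflow(tasks, links, start):
    """Run tasks by following links whose condition holds.

    tasks: dict mapping a task name to a callable that returns a status string.
    links: list of (source, destination, condition) tuples, where condition
           is a callable that receives the source task's status.
    Returns a dict of the tasks that ran and their statuses.
    """
    statuses = {}
    ready = [start]
    while ready:
        name = ready.pop(0)
        if name in statuses:
            continue  # each task runs at most once in this sketch
        statuses[name] = tasks[name]()
        for source, destination, condition in links:
            if source == name and condition(statuses[name]):
                ready.append(destination)
    return statuses


def succeeded(status):
    return status == "SUCCEEDED"


def failed(status):
    return status == "FAILED"
```

    For example, linking a session task to an archive task with `succeeded` and to a notification task with `failed` means only one of the two follow-up tasks runs, depending on the session's outcome.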

    Informatica PowerCenter Integration Service