Monday 18 December 2017

Informatica Data Quality tutorial

This tutorial gives an overview of the fundamentals of Informatica Data Quality (IDQ).
  • Informatica Data Quality is a suite of applications and components that you can integrate with Informatica PowerCenter to deliver enterprise-strength data quality capability in a wide range of scenarios.

  • The core components are Data Quality Workbench and Data Quality Server.
  • Data Quality Workbench. Use it to design, test, and deploy data quality processes, called plans. Workbench lets you test and execute plans as needed, enabling rapid data investigation and testing of data quality methodologies.

  • Data Quality Server. Use it to enable plan and file sharing and to run plans in a networked environment. Data Quality Server supports networking through service domains and communicates with Workbench over TCP/IP.

  • Both Workbench and Server install with a Data Quality engine and a Data Quality repository. Users cannot create or edit plans with Server, but they can run a plan on any Data Quality engine independently of Workbench, either through runtime commands or from PowerCenter.

  • Users can apply parameter files, which modify plan operations, to runtime commands when running data quality plans on a Data Quality engine.

  • Informatica also provides a Data Quality Integration plug-in for PowerCenter. This plug-in lets PowerCenter users add data quality plan instructions to a PowerCenter transformation and run the plan on the Data Quality engine from a PowerCenter session.

  • In Data Quality, a plan is a self-contained set of data analysis or data enhancement processes. A plan is composed of one or more of the following types of components:
  1. Data sources provide the input data for the plan.
  2. Data sinks collect the data output from the plan.
  3. Operational components perform the data analysis or data enhancement actions on the data they receive.
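The three component types can be pictured as a small pipeline. The sketch below is only a conceptual illustration in Python, not Informatica's API: the `Plan` class, the `trim_whitespace` and `uppercase_country` functions, and the sample record are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]


@dataclass
class Plan:
    """Hypothetical model of a plan: a source, operational components, a sink."""
    source: Iterable[Record]                      # data source: input records
    operations: List[Callable[[Record], Record]]  # operational components
    sink: List[Record] = field(default_factory=list)  # data sink: collected output

    def run(self) -> List[Record]:
        # Pass every source record through each operation in order,
        # then collect the result in the sink.
        for record in self.source:
            for operation in self.operations:
                record = operation(record)
            self.sink.append(record)
        return self.sink


def trim_whitespace(record: Record) -> Record:
    # Data enhancement step: remove leading and trailing spaces from every field.
    return {key: value.strip() for key, value in record.items()}


def uppercase_country(record: Record) -> Record:
    # Data enhancement step: normalize the country code to upper case.
    return {**record, "country": record["country"].upper()}


plan = Plan(
    source=[{"name": "  Ada Lovelace ", "country": "uk"}],
    operations=[trim_whitespace, uppercase_country],
)
result = plan.run()
print(result)
```

In a real plan the source and sink would typically be files or database tables, and the operational components would be configured in Workbench rather than written by hand.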
Role of Dictionaries
Plans can use reference dictionaries to identify, repair, or remove inaccurate or duplicate data values. Informatica Data Quality plans can make use of three types of reference data:
  • Standard dictionary files. These files are installed with Informatica Data Quality and can be used by several types of components in Workbench. All dictionaries installed with Data Quality are text dictionaries: plain-text files saved in .DIC file format that can be created and edited manually.
  • Database dictionaries. Informatica Data Quality users with database expertise can create and specify dictionaries that are linked to database tables, so that the dictionaries update dynamically when the underlying data is updated.
  • Third-party reference data. These data files are produced by third parties and offered to Informatica customers as premium product options. The reference data provided by third-party vendors is typically in database format.
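To make the idea of a reference dictionary concrete, here is a minimal Python sketch of dictionary-based standardization. It is an illustration under stated assumptions, not the way Data Quality reads its files: the sample layout (a canonical label followed by comma-separated variants on each line) is assumed for the example rather than taken from the .DIC specification, and `load_dictionary` and `standardize` are hypothetical helper names.

```python
import io
from typing import Dict, Iterable


def load_dictionary(lines: Iterable[str]) -> Dict[str, str]:
    """Map every variant (case-insensitive) to the canonical label on its line."""
    lookup: Dict[str, str] = {}
    for line in lines:
        items = [item.strip() for item in line.split(",") if item.strip()]
        if not items:
            continue  # skip blank lines
        label = items[0]  # assumed layout: the first item is the canonical label
        for item in items:
            lookup[item.lower()] = label
    return lookup


def standardize(value: str, lookup: Dict[str, str]) -> str:
    # Return the canonical label if the value is known, else the value unchanged.
    return lookup.get(value.strip().lower(), value)


# In-memory stand-in for a plain-text dictionary file (assumed layout).
SAMPLE = io.StringIO("Street,St,St.,Str\nAvenue,Ave,Ave.,Av\n")
lookup = load_dictionary(SAMPLE)

print(standardize("ave.", lookup))
print(standardize("Boulevard", lookup))
```

A database dictionary would play the same role, with the lookup populated from a table instead of a text file, which is why it can reflect updates to the underlying data.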
