Results of an Evolved ETL Environment
Mappings
Data load mappings do not follow any consistent design standards from project to project and even within the same project. Developers start from scratch with each mapping because they lack mapping templates. This results in inconsistent mapping designs and methods within a project, redundant processes, poorly utilized developer time, and project delays.
Workflows
Workflows follow no design standards and are structurally very brittle. Every workflow is basically a custom build and is difficult to maintain. Architecture varies greatly from project to project. Governance of the ETL Process Control architecture is basically non-existent. Implementing changes to workflows is difficult and usually requires a code change because load processes and their dependencies are hard-coded rather than externally configurable.
Dependencies
SLAs are missed because complex inter-system load dependencies cannot be managed. Load dependencies are hard-coded at the batch level rather than at the individual table level. Dependencies are not dynamic and often rely on cumbersome indicator files.
Recovery
Load failures and workflow failures are painfully difficult to recover from because data-load sessions are so tightly coupled. Failures have a cascading effect and often require a complete restore and re-load due to the lack of automated recovery and restart processes. Re-running a single load is not possible without re-structuring the workflow.
Metadata
Business Analysts still spend hours analyzing data, trying to find out what really happened in a data-load due to the lack of detailed record-disposition metadata.
Environment
Without a configurable, metadata-driven Process Control Infrastructure, there are no checks and balances for the data load processes, which makes standardization and governance impossible.
When Best Practices Are Not Enough
You have invested heavily in Informatica
You have Velocity
You have implemented 'Best Practices'
But you still don't have
Architected ETL Infrastructure
You understand how to design and construct data warehouses and datamarts, and you understand how to structure databases. However, no one has ever provided you with a clear and proven strategy for implementing a consistent, flexible and configurable ETL infrastructure.
Even those who rigorously follow ‘best practices’ still suffer from the lack of a truly architected ETL infrastructure. Most ETL environments are not architected; they are evolved. Mappings and workflows are not consistent across the enterprise or even within the same development project.
Evolved -vs- Architected
As your organization’s demand for data integration escalates, the demands on your data stores and data warehouses become increasingly critical.
And no matter how you define your approach to enterprise data integration, and regardless of how you combine strategies and technologies for achieving your data integration initiatives, the fact remains that ETL is still a key foundational technology used to build your corporate data stores and data warehouse platforms.
When we look at how companies have implemented their ETL platforms, we find that most implementations are evolved rather than architected.
Most organizations that have implemented best practices still struggle with how to combine these powerful tools into a truly unified architecture to provide a real and measurable return on their ETL investment. In the absence of an architected ETL infrastructure, organizations find themselves dealing with extremely troublesome issues.
Strategic Development Framework Defined
Strategic Development Framework™ (SDF)
This packaged solution from LoganBritton delivers a true unified ETL architecture for your enterprise data movement applications. It applies both to firms that are implementing a new EDW/DM and to firms with existing ETL installations that need a robust and flexible infrastructure. The Strategic Development Framework™ is not only based on a proven methodology and best practices, but it also includes a repository of pre-built components ready to be tailored to your individual requirements and specifications.
Functional Implementation Modules
The LoganBritton SDF implementation plan is made up of a series of clearly defined tasks grouped into functional modules.
Module 1 - Development Standards and Templates
This module includes an actual mapping development template in the form of an XML import file. This template provides an example of how a mapping should be structured using standard methods for data validation, error trapping, integrity checking, business rule compliance checking, and the generation and management of standard error codes. The module also includes table definitions and scripts for the metadata tables that gather load statistics and error statistics and manage extract start and end dates, the metadata management mappings and sessions associated with these tables, a sample dynamic parameter file, and the mappings and sessions needed to manage and refresh the dynamic parameter file.
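To illustrate the idea of a dynamic parameter file that is refreshed from extract start and end dates, here is a minimal sketch in Python. It is not part of the SDF deliverables. The section header and the `$$` parameter names resemble the general shape of an Informatica parameter file, but the folder, workflow, session, and parameter names, as well as the date format, are hypothetical and chosen only for illustration.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed format, not taken from the SDF


def next_extract_window(last_extract_end, run_time):
    """Build extract-window parameters: start where the last load ended."""
    return {
        "$$EXTRACT_START_DATE": last_extract_end.strftime(DATE_FORMAT),
        "$$EXTRACT_END_DATE": run_time.strftime(DATE_FORMAT),
    }


def render_parameter_file(sections):
    """Render {section_header: {parameter: value}} as parameter-file text."""
    lines = []
    for header, params in sections.items():
        lines.append(f"[{header}]")
        for name, value in params.items():
            lines.append(f"{name}={value}")
        lines.append("")  # blank line between sections
    return "\n".join(lines)
```

In a real deployment the last extract end date would presumably be read from the metadata tables described above and the rendered text written to disk before each run; the sketch leaves both steps out.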
Module 2 - Process Control Infrastructure
This module includes all of the process control mechanisms: externally defined dependencies, automated restart functionality, process-control metadata tables, a dynamic session launcher, external scheduler integration, and error detection and exception management functionality.
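The combination of externally defined dependencies and automated restart can be sketched as follows. This is an illustrative Python example under stated assumptions, not the SDF implementation: the JSON layout, the table names, the status values, and the `launch` callback are all hypothetical. Dependencies are declared per table in external configuration, and a status record decides which loads still need to run.

```python
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical external configuration: each table lists the tables it waits on.
DEPENDENCY_CONFIG = """
{
  "dim_customer": [],
  "dim_product": [],
  "fact_sales": ["dim_customer", "dim_product"],
  "agg_sales": ["fact_sales"]
}
"""


def plan_run(dependencies, status):
    """Return loads still to run, in dependency order, skipping completed ones."""
    order = TopologicalSorter(dependencies).static_order()
    return [load for load in order if status.get(load) != "COMPLETE"]


def run_loads(dependencies, status, launch):
    """Launch each pending load; stop at the first failure and record it."""
    for load in plan_run(dependencies, status):
        try:
            launch(load)
        except Exception:
            status[load] = "FAILED"
            break  # downstream loads wait for the next restart
        status[load] = "COMPLETE"
    return status


dependencies = json.loads(DEPENDENCY_CONFIG)
```

Because completed loads are skipped, calling `run_loads` again after a failure resumes at the failed table rather than re-running the whole batch, and changing a dependency means editing the configuration rather than the code.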
Module 3 - Advanced Functions
This module includes all of the advanced metadata statistics including balance and control processes, operational and statistical reporting, and the reports and dashboards used by Business Analysts, Operations and Production Support.
SDF Component Repository
In addition to defining an architected ETL infrastructure and implementing ‘best practices’, the Strategic Development Framework comes with a repository of pre-built objects that will be tailored and configured to your requirements.
These objects include mapping development templates, process control scripts, sessions and workflows, dynamic parameter files, reusable transformation objects, process control metadata tables, SQL scripts, and sample data load sessions.
The SDF is delivered with complete technical documentation, including Operations and Production Support Guides.
Four Key Architectures
The Strategic Development Framework addresses four areas of architecture:
Data Integration Architecture
The goal of every data warehouse or operational data store is to define a consistent business context behind data definitions. This involves structured, unstructured, semi-structured and real-time data, ensuring accurate and consistent data.
ETL Architecture
Establish standardized mapping and workflow templates and eliminate redundancies. Establish global data extract parameters. Capture detailed record-disposition metadata.
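As a rough illustration of what record-disposition metadata could look like, the Python sketch below records what happened to every input record and summarizes the counts. The disposition labels, the error code, the `id` key, and the `validate` callback are hypothetical; the source does not specify how the SDF captures this metadata.

```python
from collections import Counter


def load_with_dispositions(records, existing_keys, validate):
    """Classify each record and return per-record dispositions plus a summary.

    validate(record) returns an error code string, or None if the record passes.
    """
    seen = set(existing_keys)  # copy so the caller's set is not modified
    dispositions = []
    for record in records:
        error_code = validate(record)
        if error_code is not None:
            disposition = "REJECTED"
        elif record["id"] in seen:
            disposition = "UPDATED"
        else:
            disposition = "INSERTED"
            seen.add(record["id"])
        dispositions.append(
            {"id": record["id"], "disposition": disposition, "error_code": error_code}
        )
    summary = Counter(entry["disposition"] for entry in dispositions)
    return dispositions, summary
```

With per-record entries like these stored alongside the load statistics, an analyst could answer "what happened to this row?" by querying metadata instead of re-analyzing the loaded data.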
Process Control Architecture
Process Control Architecture has probably been the most overlooked area of ETL deployment. Workflows generally follow no architecture and are custom built for every data load. Most ETL shops have no structured and unified approach or governance for managing the load process. Workflows and their dependencies are hard-coded, structurally brittle and not externally configurable.
Metadata Architecture
This provides a strategy for integration of various technologies at the process-control metadata level. This also provides detailed statistical and exception metadata and tools to help satisfy compliance and regulatory requirements.