Thursday, 29 March 2018

Can Big Data replace an EDW?

Data warehousing has been a buzzword for the past two or three decades, and big data is the newer trend in technology. A question that often arises is: are the two similar, and will big data replace the data warehouse? The question comes up because both hold data, both are used for reporting, and both are managed on electronic storage devices. There is, however, an underlying difference between the two: a big data solution is a technology, whereas data warehousing is an architectural concept in data computing.
An organization can run a big data solution only, a data warehouse only, or both together. The choice depends on four considerations: data structure, data volume, unstructured data, and schema-on-read.
This blog post brings out the similarities and differences between the two and illustrates them with a use case example.
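Schema-on-read, the last of the considerations listed above, is easier to see next to its opposite. The following is a minimal sketch, not a reference implementation: the sample events, the field names and both function names are hypothetical, and SQLite stands in for a warehouse table.

```python
import json
import sqlite3

# Hypothetical raw events as a big data store might keep them: no fixed schema.
RAW_EVENTS = [
    '{"user_id": "a1", "event_type": "click", "page": "/home"}',
    '{"user_id": "b2", "event_type": "purchase", "amount": 19.99}',
]


def schema_on_write(events):
    """Warehouse style: the table structure is fixed before any row is loaded."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (user_id TEXT NOT NULL, event_type TEXT NOT NULL)"
    )
    for line in events:
        record = json.loads(line)
        # Fields outside the declared schema (page, amount) are dropped at load time.
        conn.execute(
            "INSERT INTO events VALUES (?, ?)",
            (record["user_id"], record["event_type"]),
        )
    return conn.execute(
        "SELECT user_id, event_type FROM events ORDER BY user_id"
    ).fetchall()


def schema_on_read(events, fields):
    """Big data style: keep the raw text and apply a structure only when querying."""
    rows = []
    for line in events:
        record = json.loads(line)
        # Missing fields come back as None instead of failing the load.
        rows.append(tuple(record.get(name) for name in fields))
    return rows
```

With schema-on-read, a later question such as "which events carried an amount?" can still be answered by calling schema_on_read(RAW_EVENTS, ["user_id", "amount"]), because nothing was discarded when the data was stored.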

What is a Data Warehouse?

A data warehouse is a conceptual architecture that stores structured, subject-oriented, time-variant, non-volatile data for decision making. It typically stores historical data: a copy of transaction data specifically structured for query and analysis. Physical data consolidation has been shifting toward a more logical one that accommodates real-time data as well. Data from the sources is transformed (cleansed, subjected to business rules, enhanced) and analyzed in the ETL/ELT phase, then loaded into a structured form, which can be relational, dimensional, hybrid, and so on.
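
The transform-then-load flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: the source rows, the table name fact_orders, the "reject non-positive amounts" business rule and the derived order_month column are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical rows extracted from a transactional source system.
SOURCE_ROWS = [
    {"order_id": 1, "customer": "  alice ", "amount": "120.50", "order_date": "2018-03-01"},
    {"order_id": 2, "customer": "BOB", "amount": "-5", "order_date": "2018-03-02"},
    {"order_id": 3, "customer": "carol", "amount": "40", "order_date": "2018-03-02"},
]


def transform(row):
    """Cleanse, apply a business rule, and enhance one source row."""
    amount = float(row["amount"])
    if amount <= 0:  # hypothetical business rule: reject non-positive amounts
        return None
    return {
        "order_id": row["order_id"],
        "customer": row["customer"].strip().title(),  # cleansing
        "amount": amount,
        "order_date": row["order_date"],
        "order_month": row["order_date"][:7],  # enhancement: derived attribute
    }


def run_etl(source_rows):
    """Load the transformed rows into a structured (relational) target table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fact_orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, "
        "order_date TEXT, order_month TEXT)"
    )
    cleaned = [t for t in (transform(r) for r in source_rows) if t is not None]
    conn.executemany(
        "INSERT INTO fact_orders VALUES "
        "(:order_id, :customer, :amount, :order_date, :order_month)",
        cleaned,
    )
    conn.commit()
    return conn
```

Once loaded, the data is queried like any other table, for example run_etl(SOURCE_ROWS).execute("SELECT order_month, SUM(amount) FROM fact_orders GROUP BY order_month").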

http://www.infotrellis.com/can-big-data-replace-edw/

Tuesday, 27 March 2018

How to use Informatica Power Center as a RESTful Web Service Client?

Introduction

In today’s world data is ubiquitous and critical to the business, which increases the need for integration across different platforms such as cloud and web services. When it comes to data integration, businesses need effective communication between their software systems and their ETL tool.
This blog post explains what a REST web service is, how to create a PowerCenter workflow, and how to use REST-based methods to access web services through the HTTP transformation.

REST Overview

A web service provides a common platform that allows two applications on different platforms to communicate and exchange messages over HTTP. Web services can be accessed using different methods or styles. REpresentational State Transfer (REST) is a stateless client-server architecture in which web services are exposed as URLs. The typical way of accessing a web resource in a RESTful system is through HTTP methods such as GET, POST, PUT and DELETE.
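Because REST is an architectural style rather than a protocol, it is not tied to a particular protocol, although in practice it is almost always used over HTTP. SOAP, by contrast, is a protocol with its own message format.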
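To make the mapping between HTTP methods and operations on a resource concrete, here is a small sketch using Python's standard urllib.request module. The base URL, the customers resource and the build_request helper are assumptions made for illustration, and the requests are only constructed, not sent.

```python
import json
import urllib.request

# Hypothetical REST endpoint; the resource is exposed as a URL.
BASE_URL = "https://api.example.com/customers"


def build_request(method, resource_id=None, payload=None):
    """Build (but do not send) an HTTP request for one REST operation."""
    url = BASE_URL if resource_id is None else f"{BASE_URL}/{resource_id}"
    data = None
    headers = {"Accept": "application/json"}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# One request per HTTP method named in the text.
list_customers = build_request("GET")
create_customer = build_request("POST", payload={"name": "Alice"})
update_customer = build_request("PUT", resource_id=42, payload={"name": "Alice B."})
delete_customer = build_request("DELETE", resource_id=42)
```

Each request carries everything the server needs, which is what "stateless" means here. Against a real service, a request would be sent with urllib.request.urlopen(list_customers).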
http://www.infotrellis.com/use-informatica-power-center-restful-web-service-client/

Thursday, 22 March 2018

Connecting Informatica PowerCenter to Teradata

Teradata – Overview

PowerCenter works with many databases, and Teradata is one of a kind among them. Informatica PowerCenter can integrate a Teradata database into any business system and serves as the technology foundation for controlling data movement. In Informatica PowerCenter, ODBC is used to connect to Teradata tables and their data.
This blog post helps you create, configure, compile, and execute a PowerCenter workflow on Windows that reads data from and writes data to a Teradata database.

What’s unique about Teradata database?

Teradata is an RDBMS that uses multiple processors to support parallel processing. Because it scales linearly, performance increases as you increase the number of nodes.

Configuring and Executing a PowerCenter Workflow

Let us look at the steps for configuring a Teradata ODBC connection in Informatica PowerCenter.
http://www.infotrellis.com/connecting-informatica-powercenter-teradata/

Friday, 16 March 2018

Data Warehouse Vs Data Lake

Data Generation, Analysis, and Usage – Current Scenario

The last decade has seen an exponential increase in the data generated from traditional as well as non-traditional data sources. An International Data Corporation (IDC) report says that the data generated in the year 2020 alone will be a staggering 40 zettabytes, which would constitute a 50-fold growth from 2010. The data generated per day has increased to 2.5 quintillion bytes, and with the advent of innovations like the Internet of Things it is poised to grow even more rapidly. This increase in data generation, coupled with a growing ability to store the various types of data being generated, has resulted in a vast repository of data that is now available for scrutiny.

http://www.infotrellis.com/data-warehouse-vs-data-lake/

Tuesday, 13 March 2018

How to access Informatica PowerCenter as a Web Service

Web Services Overview

Web services are services available over the web that enable communication between applications and provide a standard protocol for it. To enable the communication, we need a medium (HTTP) and a format (XML/JSON).
There are two parties to a web service: the service provider and the service consumer. The provider develops or implements the application (the web service) and makes it available over the internet. It publishes an interface that describes all the attributes of the web service. The consumer consumes the web service; to do so, it has to know which services are available, the request and response parameters, how to call the services, and so on.
Hence we can define a web service as a standardized way of integrating web-based applications using the XML, SOAP, WSDL and UDDI open standards over an internet protocol backbone. XML is used to tag the data, SOAP to transfer the data, WSDL to describe the services available, and UDDI to list what services are available.
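As an illustration of XML tagging the data and SOAP carrying it, the sketch below builds and reads a SOAP 1.1 request envelope with Python's standard xml.etree.ElementTree module. The service namespace, the GetOrder operation and the orderId parameter are hypothetical; in a real integration they would come from the provider's WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.com/orders"  # hypothetical service namespace


def build_soap_request(operation, params):
    """Wrap an operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        # Each parameter becomes a tagged XML element inside the operation.
        child = ET.SubElement(op, f"{{{SERVICE_NS}}}{name}")
        child.text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def read_param(xml_text, operation, name):
    """Read one parameter back out of an envelope, as a receiver would."""
    root = ET.fromstring(xml_text)
    path = f"{{{SOAP_NS}}}Body/{{{SERVICE_NS}}}{operation}/{{{SERVICE_NS}}}{name}"
    return root.findtext(path)
```

A consumer would typically send the string returned by build_soap_request("GetOrder", {"orderId": 42}) in the body of an HTTP POST to the provider's endpoint.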
Read more: http://www.infotrellis.com/how-to-access-informatica-powercenter-as-a-web-service/

Wednesday, 7 March 2018

EDW Readiness Checklist for adding new Data Sources

Overview

It is common practice to make changes to underlying systems, either to correct problems or to support new features that the business needs. One such change is adding a new source system to your existing Enterprise Data Warehouse (EDW).
This blog post examines the issue of adding new source systems in an EDW environment: how to manage customizations in an existing EDW, what analysis of the impacted areas has to be done before the project commences, and the solution steps required.

Enterprise Data Warehouse

An Enterprise Data Warehouse (EDW) is a conceptual architecture that helps to store subject-oriented, integrated, time-variant and non-volatile data for decision making. It separates analysis workload from transaction workload and enables an organization to consolidate data from several sources. An EDW includes various source systems, ETL (extract, transform, and load), a staging area, the data warehouse, various data marts and BI reporting, as shown in EDW Architecture.
http://www.infotrellis.com/edw-readiness-checklist-adding-new-data-sources/