Thursday, 27 December 2018

Informatica MDM – Suspect Duplicate Process (SDP) Approach


An organization installs a master data management (MDM) system so that its core data is secure, is accessible to multiple systems as and when required, and exists as a single source of truth rather than as multiple copies floating around in the system. A solid Suspect Duplicate Process is required in order to achieve a 360-degree view of an entity.

The concept of Suspect Duplicate Processing represents the broad category of activities related to identifying entities that are likely duplicates of each other. Suspect duplicate processing is the process of searching for, matching, creating associations between and, when appropriate, merging data for existing duplicate party records in the system.

To achieve this functionality, Informatica MDM provides its own Suspect Duplicate Processing (SDP) approach. Depending on its use case, an organization can opt for either of the following two approaches:


  • Deterministic Matching Approach
  • Fuzzy Matching Approach


Deterministic Matching Approach

Deterministic Matching uses a series of rules, like nested if statements, to run a series of logical tests on the data sets. This is how we determine relationships, hierarchies, and households within a dataset. Deterministic matching seeks a clear “Yes” or “No” result on each and every attribute, based on which we define whether:


  • the two records are duplicates,
  • the pair should be resolved by a data steward, or
  • the two records are unique entities.


Deterministic matching leaves no room for ambiguity and gives the right result when the data is ideal. But most of the data in organizations is far from ideal. These are the cases where the Fuzzy Matching Approach of Informatica comes in handy.
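The nested-if style of deterministic rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Party` fields, the rule order, and the three outcome labels are hypothetical and are not Informatica MDM's actual rule configuration or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Party:
    # Hypothetical attributes chosen for illustration only.
    first_name: str
    last_name: str
    birth_date: str  # ISO date string, e.g. "1980-01-02"
    tax_id: str


def _same(a: str, b: str) -> bool:
    """Strict yes/no test: equal after trimming and case-folding.

    Empty values never count as a match.
    """
    a, b = a.strip().casefold(), b.strip().casefold()
    return bool(a) and a == b


def classify(left: Party, right: Party) -> str:
    """Return 'duplicate', 'steward_review', or 'unique' for a pair."""
    if _same(left.tax_id, right.tax_id):
        # Strong identifier agrees: confirm with the other attributes.
        if _same(left.last_name, right.last_name) and _same(
            left.birth_date, right.birth_date
        ):
            return "duplicate"
        # Identifier agrees but the details conflict: a person decides.
        return "steward_review"
    if (
        _same(left.first_name, right.first_name)
        and _same(left.last_name, right.last_name)
        and _same(left.birth_date, right.birth_date)
    ):
        # Details agree without a matching identifier: a person decides.
        return "steward_review"
    return "unique"
```

Because every test is an exact comparison, a single typo in a name is enough to turn a true duplicate into a "unique" result, which is the weakness that fuzzy matching is meant to address.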

Learn more at http://www.infotrellis.com/informatica-mdm-fuzzy-matching/

Tuesday, 25 December 2018

Mastech InfoTrellis - Experts in Big Data Analytics

Mastech InfoTrellis’ diverse expertise in the Big Data space has helped global enterprises with their Big Data initiatives.

Big Data Analytics Hub

Mastech InfoTrellis offers a managed Big Data Analytics Hub solution centered on Hadoop, which enables customers to consolidate multi-channel data of various formats into a single source. The Big Data Analytics Hub enables self-service analytics by different business functions.

AllSight – Customer Intelligence Management

The AllSight Customer Intelligence Management System delivers an Enterprise Customer 360 by ingesting structured and unstructured data from disparate data sources across the organization.

IBM Big Data Solutions

IBM Big Data Solutions combine open source Hadoop and Spark for the open enterprise to analyze and manage big data cost-effectively. With BigInsights, you spend less time creating an enterprise-ready Hadoop infrastructure and more time gaining valuable insights. IBM provides a complete solution, including Spark, SQL, Text Analytics and more, to scale analytics quickly and easily.

Learn more at http://www.infotrellis.com/big-data/

Saturday, 22 December 2018

Data Management and IBM IIS Tools

According to a study conducted by a leading market research and advisory company, the data that we have generated in the past two years is many times more than what we generated in over two decades. Data has not just multiplied; it has also become more complex and varied, and is being generated at a much faster rate than ever before. These factors present a data integration challenge to industries and businesses that want to use their data better to build strategies, provide services, and introduce policies, so that they can narrow or completely close the gap between data and analytics.
IBM has always been innovative and technology-driven, and is in fact a pioneer in data integration and management technologies. It has always provided businesses with the right set of tools, and IBM MDM (Master Data Management) is the best example of that. Besides MDM, IBM also has IIS (InfoSphere Information Server) in its quiver to target the data integration and management challenges that almost every line of business encounters today.
This blog aims to provide an overview of the IBM IIS suite and how it can meet your business data integration demands, improve resource utilization, and help you find the right set of tools to address key business challenges.

http://www.infotrellis.com/data-management-ibm-iis-tools/

Wednesday, 19 December 2018

Best Practices in Data Validation

Data quality is the buzzword of the digital age.

What is data quality and why is it so important?

“Data quality” is a term that often stays in the background, yet it plays an important role in many fields. Data plays a vital role in gaining a place in the market, especially in enterprise data management.

Data Quality Examples

The following examples emphasize the need for data quality:
  • A customer shouldn’t be able to enter his age in a field where he has to mention his marital status.
  • When a customer fills in a form at a store, there is a high possibility that he leaves out some of his details or, in a hurry, gives an incorrect phone number.
  • The billing staff might wrongly enter the store address as a default in place of the customer address, which adds bad-quality data that gets persisted in the system.
This data may be crucial, as the customer might not be just a guest customer, and bad data obscures the customer’s actual interest in the store.
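The three examples above map naturally onto field-level validation rules. The following is a minimal sketch, assuming a record held as a plain dictionary; the field names, the list of allowed marital statuses, and the phone pattern are all hypothetical choices for illustration, not rules taken from the blog post.

```python
from __future__ import annotations

import re

# Hypothetical reference values and pattern, for illustration only.
MARITAL_STATUSES = {"single", "married", "divorced", "widowed"}
PHONE_PATTERN = re.compile(r"\+?\d{7,15}")


def validate_customer(record: dict, store_address: str) -> list[str]:
    """Return a list of data quality problems found in one customer record."""
    errors = []

    # Rule 1: the marital status field must hold a known status,
    # so an age typed into it is rejected.
    status = str(record.get("marital_status", "")).strip().lower()
    if status not in MARITAL_STATUSES:
        errors.append("marital_status: not a recognised value")

    # Rule 2: the phone number must look plausible once common
    # separators (spaces, dashes, parentheses) are removed.
    phone = re.sub(r"[\s\-()]", "", str(record.get("phone", "")))
    if not PHONE_PATTERN.fullmatch(phone):
        errors.append("phone: missing or malformed")

    # Rule 3: the customer address must be present and must not be
    # the store's own address entered as a default.
    address = str(record.get("address", "")).strip()
    if not address or address.casefold() == store_address.strip().casefold():
        errors.append("address: missing or same as store address")

    return errors
```

Running such checks at the point of entry keeps the bad values from being persisted in the first place, instead of having to be cleaned up later.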
This blog post covers data quality, its significance, business impacts, best practices to follow, and Mastech InfoTrellis’ specialization in data validation.
http://www.infotrellis.com/best-practices-data-validation/

Monday, 10 December 2018

Why Big Data Is So Essential in Healthcare

“Data analytics” refers to the practice of taking masses of aggregated data and analyzing them to draw out the important insights and information they contain. This process is increasingly aided by new software and technology that help examine large volumes of data for hidden information that can help us in many areas, and healthcare is one of them.

80% of all healthcare information is unstructured data, which is so vast and complex that it needs specialized methods and tools to make meaningful use of it. New and emerging technologies such as artificial intelligence (AI), machine learning, and predictive analytics are giving healthcare technologists and thought leaders powerful tools to capture this data and process it effectively and efficiently, toward a complete transformation of the healthcare industry. Physician decisions are becoming increasingly evidence-based, meaning that they rely on large bodies of research and clinical data rather than solely on the physician’s schooling and professional opinion. This new approach to treatment means there is greater demand for big data analytics in healthcare facilities than ever before. There is little doubt that big data has emerged as a game changer for the healthcare industry, enabling it to advance to another level.

Read full article at http://www.infotrellis.com/big-data-analytics-augmented-patient-care/

Overview of Informatica PowerCenter Web Service

Web Services Overview:
Web services are services available over the web that enable communication between applications and provide a standard protocol for that communication. To enable the communication, we need a medium (HTTP) and a format (XML/JSON).

There are two parties to a web service: the Service Provider and the Service Consumer. The Service Provider develops and implements the application (the web service) and makes it available over the internet (web). It also publishes an interface that describes all the attributes of the web service. The Service Consumer consumes the web service. To do so, the consumer has to know which services are available, their request and response parameters, how to call them, and so on.

Hence we can define a web service as a standardized way of integrating web-based applications using the XML, SOAP, WSDL and UDDI open standards over an internet protocol backbone. XML is used to tag the data. SOAP is used to transfer the data. WSDL is used for describing the services available, and UDDI is used for listing what services are available.
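To make the roles of XML and SOAP concrete, here is a minimal sketch of how a consumer might build a SOAP 1.1 request message using only the Python standard library. The operation name, service namespace, and parameter used in the usage note are hypothetical; in practice they would come from the WSDL published by the provider.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_request(operation: str, service_ns: str, params: dict) -> bytes:
    """Build a SOAP 1.1 envelope whose Body carries one operation element.

    `operation`, `service_ns`, and the keys of `params` are assumptions
    supplied by the caller; a real service defines them in its WSDL.
    """
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")

    # The operation and its parameters are XML-tagged data inside the Body.
    op = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, f"{{{service_ns}}}{name}").text = str(value)

    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)
```

A consumer would call, for example, `build_soap_request("getCustomer", "http://example.com/customer", {"customerId": 42})` and send the resulting bytes in an HTTP POST to the endpoint named in the WSDL.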

Learn more at http://www.infotrellis.com/how-to-access-informatica-powercenter-as-a-web-service/