Monday, 13 August 2018

Connecting MongoDB using IBM DataStage

Introduction

MongoDB is an open-source, document-oriented, schema-less database system. It does not organize data according to the rules of the classical relational data model. Unlike relational databases, where data is stored in rows and columns, MongoDB is built on an architecture of collections and documents. A collection holds multiple documents, which need not share the same fields. Data is stored in the form of JSON-style documents. MongoDB supports dynamic queries on documents through a document-based query language that plays a role similar to SQL.
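To make the collection/document idea concrete, here is a minimal sketch in plain Java that does not use MongoDB at all. It models a collection as a list of maps and a query as a field match. The class name, the helper methods and the sample policy data are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DocumentModelSketch {

    // Hypothetical helper: builds one document from alternating key/value arguments.
    static Map<String, Object> doc(Object... keyValues) {
        Map<String, Object> d = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            d.put(String.valueOf(keyValues[i]), keyValues[i + 1]);
        }
        return d;
    }

    // A "collection" here is just a list; its documents do not share one schema.
    // The sample data is invented for this sketch.
    static List<Map<String, Object>> sampleCollection() {
        List<Map<String, Object>> policies = new ArrayList<>();
        policies.add(doc("policyId", 101, "type", "travel",
                "destinations", Arrays.asList("Rome", "Oslo")));
        policies.add(doc("policyId", 102, "type", "motor",
                "vehicle", doc("make", "Ford", "year", 2015)));
        return policies;
    }

    // Returns every document whose top-level field equals the given value.
    static List<Map<String, Object>> find(List<Map<String, Object>> collection,
                                          String field, Object value) {
        List<Map<String, Object>> matches = new ArrayList<>();
        for (Map<String, Object> d : collection) {
            if (value.equals(d.get(field))) {
                matches.add(d);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        for (Map<String, Object> d : find(sampleCollection(), "type", "travel")) {
            System.out.println(d);
        }
    }
}
```

The two sample documents carry different fields (a list in one, a nested document in the other), which is the property a fixed set of relational columns cannot express without extra tables.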

This blog post explains how MongoDB can be integrated with IBM DataStage with an illustration.

Why MongoDB?
For the past two decades, relational databases have been the default data store because they were effectively the only option available. With the introduction of NoSQL, we now have more options to choose from based on the requirement. MongoDB is used heavily in industries such as insurance and travel.

We can extract any semi-structured data and load it into MongoDB through any of the integration tools. Extracting data from MongoDB is also often easier and faster than extracting it from relational databases.

MongoDB integration with IBM DataStage
Since IBM DataStage does not have a dedicated stage for MongoDB, we use the Java Integration stage to load data into MongoDB or extract data from it.

Since MongoDB is a schema-free database, we can take structured or semi-structured data extracted through DataStage and load it into MongoDB.
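As a rough sketch of what loading semi-structured data can look like, the plain-Java example below turns one row into a JSON-style document and leaves out columns whose value is missing. It does not call the DataStage or MongoDB APIs. The class name, the column names and the rule of skipping null values are assumptions made for illustration, not the mapping the actual job uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RowToDocumentSketch {

    // Hypothetical mapping: pairs column names with values and skips nulls, so the
    // document carries only the fields that are actually present in the row.
    static Map<String, Object> toDocument(String[] columns, Object[] values) {
        Map<String, Object> doc = new LinkedHashMap<>();
        for (int i = 0; i < columns.length && i < values.length; i++) {
            if (values[i] != null) {
                doc.put(columns[i], values[i]);
            }
        }
        return doc;
    }

    // Renders a flat document as JSON text. Numbers and booleans are written bare;
    // everything else is written as a quoted string.
    static String toJson(Map<String, Object> doc) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            if (!first) {
                sb.append(",");
            }
            first = false;
            sb.append(quote(e.getKey())).append(":");
            Object v = e.getValue();
            if (v instanceof Number || v instanceof Boolean) {
                sb.append(v);
            } else {
                sb.append(quote(String.valueOf(v)));
            }
        }
        return sb.append("}").toString();
    }

    // Minimal escaping: handles backslashes and double quotes only, not control characters.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        String[] columns = {"id", "name", "email"};
        Object[] row = {1, "Ann", null}; // invented sample row; email is missing
        System.out.println(toJson(toDocument(columns, row)));
    }
}
```

Rows with different missing columns simply produce documents with different fields, and a schema-free store can accept all of them in the same collection.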

Prerequisites

  • Make sure you have Java installed on your machine.
  • Install the Eclipse IDE.
  • Java requires the MongoDB jar below to be added to the project in order to use MongoDB functions:
    • mongo-java-driver-2.11.3.jar, or a higher version if available (download it from the internet)
  • Java also requires the jar file below to be added to the project in order to extract data from or load data into DataStage:
    • jar (available on the DataStage server at /opt/IBM/InformationServer/Server/DSEngine/java/lib)


Illustration of a DataStage job
Create a job in DataStage to parse the sample XML below.




Read more steps at http://www.infotrellis.com/connecting-mongodb-using-ibm-datastage/
