Saturday 19 April 2014

59 - Explain any scenario of XML as a Source

We will take a simple example XML file, available at the link below. We will import this XML file as an XML Source and load its data into a relational target.
http://gyaankatta.blogspot.in/2014/04/xml-source-file.html

Import XML as a Source --

You can import the XML file from the menu "Source" --> "Import XML Definition".


Informatica will prompt you with a number of options when importing the XML file, as shown.

In this case, we will go with the default option "Entity Relationships".

Once you click Finish, your source XML file is imported as an Informatica Source.




Relationships Between Entities -- As we chose the "Entity Relationships" option, the XML file was imported with relationships between the entities it carries.

If we look at the data in our XML file, we find 3 tags:
1. <myComp> -- the root tag, which holds all other entities inside it. After importing the XML file, it appears as the root tag, with a relationship to the <odc> tag.
2. <odc> -- the tag that comes directly under <myComp>. As it in turn holds <employee> inside it, you can see its relationship with <employee>.
3. <employee> -- the leaf tag visible in the XML Source. The relationship between <odc> and <employee> is many to many [as shown by the arrow].
You can change this relationship by clicking the corresponding tag in the left panel and then its properties tab below [highlighted by the rectangle].

In the XML file, you can find an <Address> tag inside <employee>; however, it does not appear in the image above. That is because it does not have a many to many relationship with its parent tag <employee>.

If needed, you can create a new XML View in the current XML definition and give it a 1 to 1 relationship with its immediate parent tag <employee>. Make sure you delete the <Address> entry from the Employee view.
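To make the idea of separate views with key relationships concrete, here is a minimal Python sketch. It is not what Informatica does internally; it only mimics the result. Only the tag names <myComp>, <odc>, <employee> and <Address> come from this post. The sample data, the `name` attributes and elements, and the key column names (`PK_COMP`, `FK_ODC`, etc.) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: tag names follow the post, all values are invented.
SAMPLE_XML = """
<myComp name="ExampleComp">
  <odc name="ODC1">
    <employee><name>Asha</name><Address>Pune</Address></employee>
    <employee><name>Ravi</name><Address>Mumbai</Address></employee>
  </odc>
  <odc name="ODC2">
    <employee><name>Meera</name><Address>Nagpur</Address></employee>
  </odc>
</myComp>
"""


def flatten(xml_text):
    """Split the hierarchy into three flat 'views' linked by generated keys."""
    root = ET.fromstring(xml_text)
    comp_view, odc_view, emp_view = [], [], []

    comp_pk = 1  # single root element, so a single generated key
    comp_view.append({"PK_COMP": comp_pk, "COMP_NAME": root.get("name")})

    odc_pk = 0
    emp_pk = 0
    for odc in root.findall("odc"):
        odc_pk += 1
        odc_view.append(
            {"PK_ODC": odc_pk, "FK_COMP": comp_pk, "ODC_NAME": odc.get("name")}
        )
        for emp in odc.findall("employee"):
            emp_pk += 1
            # <Address> occurs once per employee, so it stays a column
            # of the employee view instead of becoming its own view.
            emp_view.append(
                {
                    "PK_EMP": emp_pk,
                    "FK_ODC": odc_pk,
                    "EMP_NAME": emp.findtext("name"),
                    "ADDRESS": emp.findtext("Address"),
                }
            )
    return comp_view, odc_view, emp_view


comp_view, odc_view, emp_view = flatten(SAMPLE_XML)
for row in emp_view:
    print(row)
```

Each employee row carries a foreign key pointing at its parent <odc> row, which is the kind of link the generated primary and foreign keys provide between views.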

Working with actual Mapping --
As shown above, we have 3 separate XML Views with relationships between them. To process the data from all 3 views as a whole, we need to join them using Joiner Transformations. Each view is treated as a separate data set, and because the granularity of each data set is different, we cannot combine them without a Joiner Transformation.

Also, since both inputs of our Joiner Transformation come from the same source, we need to select Sorted Input; otherwise Informatica will not allow the join.

As shown in the image, connecting each pair of XML Views (data sets) requires a separate Joiner. The join condition is the Primary Key and Foreign Key relationship between the two views, which Informatica creates by default.

As these keys are generated internally by Informatica, we do not have to worry about adding a Sorter in between to sort the data again, even though we have selected Sorted Input.

The next task is to join the 2 Joiners we created for the XML Views Comp -- ODC and ODC -- Employee. As "ODC" is common to these 2 data sets, we will use it as the join condition between them. So, our mapping structure will be as below

To complete the mapping, we needed 3 Joiners in total. Now just create a corresponding workflow and run the mapping.
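The three-Joiner structure can be sketched in Python as three inner equi-joins. This is only an illustration of the join logic, not of how the Joiner Transformation is implemented, and it does not model the Sorted Input option. The rows, column names and key names (`PK_COMP`, `FK_COMP`, `PK_ODC`, `FK_ODC`) are hypothetical.

```python
def join(master, detail, master_key, detail_key):
    """Inner equi-join of two lists of dicts (master rows indexed by key)."""
    index = {}
    for row in master:
        index.setdefault(row[master_key], []).append(row)
    joined = []
    for d in detail:
        for m in index.get(d[detail_key], []):
            joined.append({**m, **d})  # combine columns from both sides
    return joined


# Hypothetical rows for the three XML Views, with generated keys.
comp_view = [{"PK_COMP": 1, "COMP_NAME": "ExampleComp"}]
odc_view = [
    {"PK_ODC": 1, "FK_COMP": 1, "ODC_NAME": "ODC1"},
    {"PK_ODC": 2, "FK_COMP": 1, "ODC_NAME": "ODC2"},
]
emp_view = [
    {"PK_EMP": 1, "FK_ODC": 1, "EMP_NAME": "Asha"},
    {"PK_EMP": 2, "FK_ODC": 1, "EMP_NAME": "Ravi"},
    {"PK_EMP": 3, "FK_ODC": 2, "EMP_NAME": "Meera"},
]

# Joiner 1: Comp -- ODC on the primary key / foreign key pair.
comp_odc = join(comp_view, odc_view, "PK_COMP", "FK_COMP")
# Joiner 2: ODC -- Employee on the primary key / foreign key pair.
odc_emp = join(odc_view, emp_view, "PK_ODC", "FK_ODC")
# Joiner 3: combine both results on the ODC key they share.
full = join(comp_odc, odc_emp, "PK_ODC", "PK_ODC")

for row in full:
    print(row["COMP_NAME"], row["ODC_NAME"], row["EMP_NAME"])
```

The final data set has one row per employee, with the company and ODC columns repeated on each row, which is the shape a flat relational target expects.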

Note -- if Informatica is installed on a Linux machine, make sure you store the source XML file on that Linux machine. If you give a Windows [client machine] path as the source file, your mapping will fail with the error below
HIER_28039 Message: HIER_28039 Reading data from source file [filename.xml]
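A quick sanity check before running the workflow can catch this mistake. The sketch below is a hypothetical helper, not part of Informatica; it only flags a source file path that carries a Windows drive letter or UNC share. The example paths are invented.

```python
from pathlib import PureWindowsPath


def looks_like_windows_path(path):
    """Return True if the path has a Windows drive letter or UNC share."""
    return bool(PureWindowsPath(path).drive)


# Hypothetical source file locations.
for candidate in [r"C:\data\employee.xml", "/home/infa/srcfiles/employee.xml"]:
    if looks_like_windows_path(candidate):
        print(candidate, "-> Windows path, not readable from a Linux server")
    else:
        print(candidate, "-> no Windows drive, check it exists on the server")
```

This only inspects the string; it does not confirm that the file actually exists on the Linux machine.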


