Ab Initio Interview Questions and Answers

1. What information does a .dbc file provide for connecting to the database ? (Ab Initio Scenario Based Interview Questions)
Answer:
A .dbc file gives the GDE the information it needs to connect to the database:
• Name and version number of the database to which you want to connect
• Name of the computer on which the database instance or server runs, or on which the database remote access software is installed
• Name of the server, database instance or provider to which you want to connect
2. What is a data processing cycle and what is its significance ?
Answer:
Data often needs to be processed continuously while it is being used. This repeated sequence of steps is known as the data processing cycle. Depending on the type, size and nature of the data, a cycle may deliver results quickly or take extra time. This variability adds complexity, so methods that are more reliable and advanced than existing approaches are needed. The data processing cycle keeps that complexity as low as possible with little extra effort.
3. Suppose we assign you a new project. What would be your starting point and the key steps that you follow ?
Answer:
The first thing that matters is defining the objective of the task and then engaging the team in it. This gives the work a solid direction, which is especially important when working on a data set that is completely new or unfamiliar. The next thing that needs attention is effective data modeling, which includes finding missing values and validating the data. The last step is to track the results.
4. What do you mean by the term data warehousing? Is it different from Data Mining ?
Answer:
Data often needs to be retrieved for analysis, and a data warehouse makes this possible without affecting the efficiency of operational systems. It exists to support decision making and works alongside business applications such as Customer Relationship Management. Data mining is closely related but different: it is the process of finding the required information in the data held in the warehouse.
5. Have you ever encountered an error called “depth not equal” ?
Answer:
This error can occur during compilation of the graph when two components are linked together and their layouts do not match. A solution is to place a partitioning component between them wherever the layout changes.
6. What is a cursor? Within a cursor, how would you update fields on the row just fetched ?
Answer:
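A cursor is a work area that the Oracle engine uses for internal processing in order to execute an SQL statement. There are two types of cursor: implicit and explicit. An implicit cursor is opened by Oracle for its own internal processing, while an explicit cursor is declared and opened by the user to fetch the required data. To update fields on the row just fetched, declare the cursor with FOR UPDATE and issue an UPDATE ... WHERE CURRENT OF cursor_name statement.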
7. What are Cartesian joins ?
Answer:
A Cartesian join will get you a Cartesian product. A Cartesian join is when you join every row of one table to every row of another table. You can also get one by joining every row of a table to every row of itself.
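As a rough illustration of the definition above, the sketch below builds a Cartesian product in Python with `itertools.product`. The two small "tables" are invented sample data, not something taken from the original answer.

```python
from itertools import product

# Hypothetical sample tables, invented for illustration only.
customers = ["alice", "bob", "carol"]
products = ["pen", "ink"]

# A Cartesian join pairs every row of one table with every row of the other.
cartesian = list(product(customers, products))

# Joining a table to itself works the same way.
self_join = list(product(customers, customers))

print(len(cartesian))  # number of customers times number of products
print(len(self_join))  # number of customers squared
```

The row count grows as the product of the two table sizes, which is why an accidental Cartesian join on large tables is usually a performance problem.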
8. Can anyone give me an example of a real-time start script in a graph ?
Answer:
Here is a simple example of using a start script in a graph.
In the start script, set and export a variable:
export DT=`date '+%m%d%y'`
The variable DT now holds today’s date before the graph is run.
Somewhere in a graph transform we can then use this variable as:
out.process_dt :: $DT;
which provides the value from the shell.
9. What is skew and skew measurement ?
Answer:
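Skew is a measure of how evenly data flows to each partition.
Suppose the input comes from 4 partitions of 100 MB, 200 MB, 300 MB and 500 MB:
total = 100 MB + 200 MB + 300 MB + 500 MB = 1100 MB (about 1.1 GB)
average = 1100 MB / 4 = 275 MB
skew = (partition size - average partition size) / size of the largest partition
For the 100 MB partition: (100 - 275) / 500 = -175 / 500 = -0.35, which is a negative value.
Calculated the same way, the 200 MB, 300 MB and 500 MB partitions give -0.15, 0.05 and 0.45.
A skew close to zero is always desirable, because it means the data is spread evenly.
Skew is an indirect measure of the graph's performance.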
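The arithmetic above can be checked with a short Python sketch. It assumes the formula the example implies, (partition size - average partition size) / largest partition size, and it computes the average from the four quoted sizes themselves, which sum to 1100 MB. The function name `partition_skew` is made up for this illustration.

```python
def partition_skew(sizes_mb):
    """Return one skew value per partition.

    Assumed formula: (size - average size) / largest size.
    """
    average = sum(sizes_mb) / len(sizes_mb)
    largest = max(sizes_mb)
    return [(size - average) / largest for size in sizes_mb]


# Partition sizes quoted in the example, in MB.
sizes_mb = [100, 200, 300, 500]
skews = partition_skew(sizes_mb)

for size, skew in zip(sizes_mb, skews):
    print(f"{size} MB partition: skew = {skew:+.2f}")
```

Partitions smaller than the average come out negative and larger ones positive; a perfectly balanced layout would give zero for every partition.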
10. Do you think effective communication is necessary in the data processing? What is your strength in terms of same ?
Answer:
The most important ability in this domain is being able to rely on the data or information. Communication also matters a great deal for tasks such as presenting that information. An organization has many departments, and good communication makes sure the information is clear and reliable for everyone.
11. Describe lookup in detail ?
Answer:
A lookup is a group of keyed datasets. Lookup datasets can be classified into two types: static and dynamic. A dynamic lookup file is generated in a previous phase of the graph and used in the current phase. A lookup can be used to map values against the data present in a particular multifile or serial file.
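The idea of a keyed dataset used to map values can be sketched in plain Python with a dictionary. This is only an analogy, not Ab Initio code; the record layouts, the field names and the `map_country` helper are assumptions invented for the example.

```python
# Hypothetical keyed dataset: country code -> country record.
lookup_records = [
    {"code": "IN", "name": "India"},
    {"code": "US", "name": "United States"},
]

# Index the records by key once, so each later lookup is a fast in-memory access.
lookup_by_code = {rec["code"]: rec for rec in lookup_records}


def map_country(record):
    """Add the country name to a record, or a default when the key is missing."""
    match = lookup_by_code.get(record["country_code"])
    name = match["name"] if match else "UNKNOWN"
    return {**record, "country_name": name}


input_records = [
    {"id": 1, "country_code": "US"},
    {"id": 2, "country_code": "FR"},
]
mapped = [map_country(r) for r in input_records]
print(mapped)
```

Handling the missing key explicitly matters in practice, because a lookup that finds no match should not silently fail the whole record.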
12. What kind of layouts does Ab Initio support ?
Answer:
• Ab Initio supports serial and parallel layouts.
• A graph can use both serial and parallel layouts at the same time.
• The parallel layout depends on the degree of data parallelism.
• A multifile system can be, for example, a 4-way parallel system.
• A component in a graph that uses that layout then runs 4-way parallel.
13. What is a local lookup ?
Answer:
• A local lookup file holds records that are small enough to be kept in main memory.
• Transform functions can retrieve these records much faster than they could retrieve them from disk.
14. What is the role of the Co>Operating System in Ab Initio ?
Answer:
The Ab Initio Co>Operating System provides features to:
• Manage and run Ab Initio graphs and control the ETL processes
• Provide Ab Initio extensions to the operating system
• Monitor and debug ETL processes
• Manage metadata and interact with the EME
15. What is Ab Initio ?
Answer:
“Ab initio” is a Latin phrase meaning “from the beginning.” Ab Initio is a tool used to extract, transform and load data. It is also used for data analysis, data manipulation, batch processing, and graphical user interface based parallel processing.
16. What is the Rollup component ?
Answer:
The Rollup component enables users to group records on certain field values. It is a multistage function consisting of initialize, rollup and finalize stages.
17. What is the importance of the EME in Ab Initio ?
Answer:
The EME (Enterprise Meta>Environment) is the repository in Ab Initio. It is used to check graphs in and out, and it also maintains graph versions.
18. What are the steps to create a repository in Ab Initio ?
Answer:
If you have installed Ab Initio on a standalone machine, there is no need to create a separate repository, as one is created automatically during the installation process. You can view the automatically created repository under the Ab Initio folder.
19. What would be the next step after collecting the data ?
Answer:
Once the data is collected, the next important task is to enter it into the relevant machine or system. Gone are the days when storage depended on paper. Data volumes are now very large, and data entry needs to be performed in a reliable manner. The digital approach is a good option because it lets users perform this task easily and without compromise. A large set of operations then needs to be performed for meaningful analysis. In many cases conversion also matters, and users are free to choose the output that best meets their expectations.
20. Suppose you find the term Validation mentioned with a set of data, what does that simply represent ?
Answer:
It indicates that the data concerned is clean and correct and can therefore be used reliably. Data validation is widely regarded as one of the key points in a processing system.
21. How is scientific data processing different from commercial data processing ?
Answer:
Scientific data processing involves a great amount of computation, i.e. arithmetic operations. A limited amount of data is provided as input and a large amount of data is produced as output. Commercial data processing is the opposite: the output is limited compared with the input data, and the computational operations are limited.
22. Name any two stages of the data processing cycle and provide your answer in terms of a comparative study of them ?
Answer:
The first is collection and the second is preparation of data. Collection is the first stage of the data processing cycle and preparation is the second. The first stage provides the baseline for the second, and the success and simplicity of the second depends on how accurately the first has been accomplished. Preparation is mainly the manipulation of important data. Collection breaks data sets apart, while preparation joins them together.
23. What do you mean by a transaction file and how is it different from a sort file ?
Answer:
A transaction file holds input data for the period in which a transaction is being processed, and all the master files can be updated from it. Sorting, on the other hand, is done to assign a fixed location to the data in the files.
24. Do you know what a local lookup is ?
Answer:
If your lookup file is a multifile that is partitioned and sorted on a particular key, the local lookup function can be used in preference to the ordinary lookup function call. It searches only the partition that is local to the component, as determined by the key.
A lookup file consists of data records that can be held in main memory. This lets a transform function retrieve the records much faster than it could from disk, and it allows the transform component to process the data records of multiple files quickly.
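The partition-local behaviour described above can be sketched in Python. This is an analogy only: the 4-way split, the CRC32-based `partition_of` function and the sample keys are assumptions made for the illustration, and Ab Initio's own partitioning works differently in detail.

```python
import zlib

N_PARTITIONS = 4  # assumed degree of parallelism for this sketch


def partition_of(key):
    """Deterministically map a key to a partition number."""
    return zlib.crc32(key.encode("utf-8")) % N_PARTITIONS


# Hypothetical lookup rows: customer key -> tier.
lookup_rows = [("A100", "gold"), ("B200", "silver"), ("C300", "bronze")]

# Each partition holds only the slice of the lookup whose keys hash to it.
local_lookups = [dict() for _ in range(N_PARTITIONS)]
for key, tier in lookup_rows:
    local_lookups[partition_of(key)][key] = tier


def local_lookup(partition, key):
    """Search only the lookup slice that belongs to one partition."""
    return local_lookups[partition].get(key)


# A data record partitioned on the same key lands where its match lives.
p = partition_of("B200")
found = local_lookup(p, "B200")
print(found)
```

The point of the design is that the data and the lookup must be partitioned on the same key; otherwise a local search can miss a record that sits in another partition.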
25. How many components in your most complicated graph ?
Answer:
It depends on the type of components you use. As a rule, avoid using very complicated transform functions in a graph.
26. Have you worked with packages ?
Answer:
Multistage transform components use packages by default. However, a user can also create their own set of functions in a transform function and include it in other transform functions.
27. What are the different forms of output that can be obtained after processing of data ?
Answer:
These are
1. Tables
2. Plain Text files
3. Image files
4. Maps
5. Charts
6. Vectors
7. Raw files
Sometimes data needs to be produced in more than one format, so the software performing this task must have the features needed to support that.
28. What exactly do you know about the typical data analysis ?
Answer:
It generally involves collecting and organizing the important files. The main aim is to understand the exact relationship between the full data set and the portion that is analyzed. Some experts also regard it as one of the best available approaches for finding errors, because it entails the ability to spot problems and lets the operator find the root causes of the errors.
29. How to add default rules in transformer ?
Answer:
Add Default Rules opens the Add Default Rules dialog. Select one of the following:
• Match Names: generates a set of rules that copies input fields to output fields with the same name.
• Use Wildcard (.*) Rule: generates one rule that copies input fields to output fields with the same name.
To open the dialog:
1) If it is not already displayed, display the Transform Editor Grid.
2) Click the Business Rules tab if it is not already displayed.
3) Select Edit > Add Default Rules.
In a Reformat, if the destination field names are the same as, or a subset of, the source field names, there is no need to write anything in the reformat xfr, provided you only want to reduce the set of fields or split the flow into a number of flows rather than apply a real transformation.
30. How do you truncate a table ?
Answer:
From Ab Initio, either use the Run SQL component with the DDL statement “truncate table”, or use the Truncate Table component.
31. Describe the Grant/Revoke DDL facility and how it is implemented ?
Answer:
Basically, this is part of the DBA's responsibilities. GRANT gives permissions, for example GRANT CREATE TABLE, GRANT CREATE VIEW and many more.
REVOKE cancels permissions that were granted. Both the GRANT and REVOKE commands therefore depend on the DBA.
32. How would you find out whether a SQL query is using the indices you expect ?
Answer:
The explain plan can be reviewed to check the execution plan of the query. It shows whether or not the expected indexes are used.
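The same check can be tried out in a self-contained way with Python's built-in `sqlite3` module. Note the substitution: this sketch uses SQLite's `EXPLAIN QUERY PLAN` rather than Oracle's explain plan, and the `orders` table and `idx_orders_customer` index are hypothetical names invented for the example.

```python
import sqlite3

# In-memory database with a hypothetical table and index.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# Ask the database how it would run the query, without running it.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT amount FROM orders WHERE customer_id = ?", (42,)
).fetchall()

# The last column of each plan row is a human-readable description.
plan_text = " ".join(row[-1] for row in plan_rows)
uses_expected_index = "idx_orders_customer" in plan_text

print(plan_text)
print(uses_expected_index)
conn.close()
```

On other databases the command and the output format differ, but the approach is the same: read the plan and look for the index name you expected.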
33. What is the purpose of having stored procedures in a database ?
Answer:
The main purpose of stored procedures is to reduce network traffic. All the SQL statements execute inside the database, so execution is fast.
34. How do you convert 4-way MFS to 8-way mfs ?
Answer:
To convert from 4-way to 8-way partitioning, change the layout in the partitioning component. There are separate parameters for each type of partitioning, e.g. AI_MFS_HOME, AI_MFS_MEDIUM_HOME, AI_MFS_WIDE_HOME etc.
The appropriate parameter needs to be selected in the component layout for the type of partitioning.
35. What is $mpjret? Where is it used in Ab Initio ?
Answer:
$mpjret holds the return code of the graph run. You can use it in the end script like this:
if [ "$mpjret" -eq 0 ]
then
echo "success"
else
mailx -s "[graphname] failed" mailid
fi
36. What is the difference between a Scan component and a RollUp component ?
Answer:
Rollup is for group by and Scan is for successive totals. When we need cumulative (running) summaries we use Scan. Rollup is used to aggregate data into one summary record per group.
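The difference can be seen in a small Python analogy. The `rollup` and `scan` functions below are not Ab Initio components; they are stand-ins written for this sketch, and the sales data is invented. Both assume the input is already sorted by key.

```python
from itertools import accumulate, groupby
from operator import itemgetter

# Hypothetical input, already sorted by region.
sales = [("east", 10), ("east", 5), ("west", 7), ("west", 3), ("west", 1)]


def rollup(rows):
    """One summary record per key group, like a GROUP BY total."""
    return [
        (key, sum(amount for _, amount in group))
        for key, group in groupby(rows, key=itemgetter(0))
    ]


def scan(rows):
    """One output record per input record, carrying a running total per key."""
    out = []
    for key, group in groupby(rows, key=itemgetter(0)):
        amounts = [amount for _, amount in group]
        out.extend((key, total) for total in accumulate(amounts))
    return out


rolled = rollup(sales)
scanned = scan(sales)
print(rolled)
print(scanned)
```

Rollup shrinks five input records to two, while Scan keeps all five and shows the intermediate totals, with the last record of each group matching the Rollup result.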
37. What is the Difference between DML Expression and XFR Expression ?
Answer:
The main difference between DML and XFR is:
DML represents the format of the metadata.
XFR represents the transform functions, which contain the business rules.
38. How can I run the 2 GUI merge files ?
Answer:
Do you mean merging GUI map files in WR? If so, merging GUI map files in the GUI Map Editor does not create a corresponding test script, and without a test script you cannot run a file. So it is not possible to run a file just by merging 2 GUI map files.
39. What is the difference between rollup and scan ?
Answer:
With Rollup we cannot generate cumulative summary records; for that we use Scan.
40. What is common among data validity and Data Integrity ?
Answer:
Both approaches deal with errors in data and make sure that operations flow smoothly.
41. Name the different types of processing that you know about ?
Answer:
They are:
1. Real-Time processing
2. Multiprocessing
3. Time Sharing
4. Batch processing
5. Adequate Processing
42. What is the difference between a lookup file and a lookup, with a relevant example ?
Answer:
Generally, a lookup file represents one or more serial files (flat files). The amount of data is small enough to be held in memory, which allows transform functions to retrieve records much more quickly than they could from disk.
43. How to run a graph infinitely ?
Answer:
To run a graph infinitely, the end script of the graph should call the graph's own .ksh file.
If the graph name is abc.mp, then its end script should call the abc.ksh file.
44. How can you connect the EME to the Ab Initio server ?
Answer:
There are several ways to connect to the Ab Initio server:
• Set AB_AIR_ROOT
• Log in to the EME web interface: http://serverhost:[serverport]/abinitio
• Through the GDE, you can connect to the EME datastore
• Through air commands
45. How can you force the optimizer to use a particular index ?
Answer:
Use hints /*+ */. These act as directives to the optimizer.
46. What are the operations that support avoiding duplicate record ?
Answer:
Duplicate records can be avoided by using the following:
• Using the Dedup Sorted component
• Performing aggregation
• Utilizing the Rollup component
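The sort-then-deduplicate approach from the list above can be sketched in Python. This is an analogy rather than Ab Initio code: the records are invented, and keeping the first record of each key group is an assumption chosen for the example.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical records of (key, payload), with duplicate keys out of order.
records = [
    (2, "b-first"),
    (1, "a-first"),
    (2, "b-second"),
    (1, "a-second"),
    (3, "c-only"),
]

# Sort on the key first; Python's sort is stable, so arrival order
# is preserved within each key group.
sorted_records = sorted(records, key=itemgetter(0))

# Keep only the first record of every key group.
deduped = [next(group) for _, group in groupby(sorted_records, key=itemgetter(0))]

print(deduped)
```

Sorting first is what makes the duplicates adjacent; without it, a single pass over the data could not spot repeated keys by comparing neighbours.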
47. What is m_dump ?
Answer:
m_dump command prints the data in a formatted way.
m_dump
48. What is the latest version that is available in Ab-initio ?
Answer:
The latest version of the GDE is 1.15 and of the Co>Operating System is 2.14.
49. What are differences between different versions of Co-op ?
Answer:
1.10 is a non-key version and the rest are key versions.
Many components were added and revised in the later versions.
50. Explain Ab Initio’s dependency analysis ?
Answer:
Dependency analysis in Ab Initio is closely associated with data lineage. Data lineage identifies the source of the data, and dependency analysis then identifies which applications depend on that data. Dependency analysis also helps to maximize retrieval from existing data through the use of surrogate keys. New records can be generated using Scan, or next_in_sequence in a Reformat.
51. Informatica vs Ab Initio ?
Answer:
| Feature | Ab Initio | Informatica |
| --- | --- | --- |
| About tool | Code based ETL | Engine based ETL |
| Parallelism | Supports three types of parallelism | Supports one type of parallelism |
| Scheduler | No scheduler | Schedule through script available |
| Error handling | Can attach error and reject files | One file for all |
| Robust | Robustness by function comparison | Basic in terms of robustness |
| Feedback | Provides performance metrics for each component executed | Debug mode, but slow implementation |
| Delimiters while reading | Supports multiple delimiters | Only dedicated delimiter |
52. What are the benefits of data analyzing ?
Answer:
It makes sure of the following:
1. Explanation of developments related to the core tasks
2. Hypothesis testing with an integrated approach
3. Reliable pattern detection
53. What are the key elements of a data processing system ?
Answer:
These are the Converter, Aggregator, Validator, Analyzer, Summarizer and Sorter.
54. What are the facts that can compromise data integrity ?
Answer:
Several kinds of error can compromise data integrity and lead to many other problems. These are:
1. Bugs and malware
2. Human error
3. Hardware error
4. Transfer errors, which generally include data compression beyond a limit.
55. What does EDP stand for ?
Answer:
It means Electronic Data Processing
56. Give one reason why you might need to consider multiple data processing ?
Answer:
When the files produced are not yet the complete required outcome and need further processing.
57. Can sorting and storing be done through single software or you need different for these approaches ?
Answer:
It depends on the type and nature of the data. Although both tasks can be accomplished with the same software, many packages have their own specialization, and using a specialized tool for each task tends to give better-quality results. Some software also ships with predefined modules and operations; if the conditions they impose are met, users can perform multiple tasks with the same software. The output file can be provided in various formats.
Courtesy: https://svrtechnologies.com/interview-question-answers/best-57-ab-initio-scenario-based-interview-questions-and-answers
