40 Most Important Extract, Transform, Load (ETL) MCQs

Up to 40 of the most important MCQs on ETL for online examinations. These Extract, Transform, Load MCQs are also relevant to the business intelligence (BI) domain and are often asked in interviews and during training and onboarding at companies such as Capgemini, Cognizant, EPAM, and others.

23. The value of an aggregated column in the data warehouse is calculated from two or three tables in the source systems. Where can one record this transformation formula during the ETL process?
In a meeting that involves relevant team members
Data warehouse administrator has to record it
Mentioning it in the ‘transformation’ column of logical data map
By the SQL queries in the respective source systems

Mentioning it in the ‘transformation’ column of logical data map

24. One of the challenges involved in extracting data from mainframe sources is:
connecting to the mainframe itself
it uses a different way of storing numeric data, such as packed decimals
connecting to its ODBC manager is cumbersome
such data storage methods to save on disk space which were very expensive those days
its complex login process

it uses a different way of storing numeric data, such as packed decimals
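
To make the packed-decimal point concrete, here is a minimal Python sketch of decoding one such field. It assumes the common IBM COMP-3 layout (two decimal digits per byte, with the sign in the low nibble of the last byte); the function name `unpack_comp3` and the `scale` parameter are illustrative and not taken from any particular ETL tool.

```python
from decimal import Decimal


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal field (assumed COMP-3 layout).

    Each byte carries two 4-bit digits, except the last byte, whose
    low nibble is the sign (0xB or 0xD negative, anything else positive).
    `scale` is the number of implied decimal places.
    """
    if not raw:
        raise ValueError("empty packed-decimal field")

    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)

    sign_nibble = nibbles.pop()  # last nibble is the sign, not a digit
    if any(n > 9 for n in nibbles):
        raise ValueError("invalid digit nibble in packed-decimal field")

    value = int("".join(str(n) for n in nibbles))
    if sign_nibble in (0x0B, 0x0D):
        value = -value
    return Decimal(value).scaleb(-scale)
```

For example, the three bytes `12 34 5C` with two implied decimal places decode to 123.45, which shows why such a field cannot simply be read as text or as an ordinary binary integer.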

25. One of the ways to overcome the challenge of parsing XML files is,
load the full file into the main memory first
parse only one line of the XML file at a time
copy its content into a DB and use SQL query to parse it
to use ETL tools with XML parsing feature

to use ETL tools with XML parsing feature

26. It is important to keep the data warehouse (DWH) data always ‘current’. It is the responsibility of ______
DWH administrator
System administrator
End users themselves
ETL team

ETL team

27. One of the methods used during the incremental load is to use two tables, previous_load and current_load, refreshing them with data. What is done at the end of the load to get only the incremental data?
Do a row-by-row comparison
Do a column-by-column comparison
Find the difference (MINUS) between the rows of these two tables.
Load both of them into the data warehouse and delete these tables

Find the difference (MINUS) between the rows of these two tables.
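
The set difference can be sketched as follows. This example uses Python's built-in `sqlite3` module, where the operator is spelled `EXCEPT` (Oracle calls the same operation `MINUS`). The table names come from the question; the columns and sample rows are made up for illustration.

```python
import sqlite3


def incremental_rows(conn: sqlite3.Connection) -> list:
    """Return rows present in current_load but not in previous_load."""
    return conn.execute(
        "SELECT id, name, amount FROM current_load "
        "EXCEPT "
        "SELECT id, name, amount FROM previous_load "
        "ORDER BY id"
    ).fetchall()


def build_demo_connection() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE previous_load (id INTEGER, name TEXT, amount REAL);
        CREATE TABLE current_load  (id INTEGER, name TEXT, amount REAL);
        INSERT INTO previous_load VALUES (1, 'alice', 10.0), (2, 'bob', 20.0);
        INSERT INTO current_load  VALUES (1, 'alice', 10.0), (2, 'bob', 25.0),
                                         (3, 'carol', 30.0);
        """
    )
    return conn


if __name__ == "__main__":
    print(incremental_rows(build_demo_connection()))
```

Here the changed row (id 2) and the new row (id 3) are returned, while the unchanged row (id 1) is filtered out. Rows deleted at the source would need the reverse difference, which this sketch does not cover.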

28. The ‘cleaning process’ to ensure good quality data is one of the important steps after extracting it. A good quality data has to be:
i. Correct & Consistent
ii. Readable & Traceable
iii. Complete & Unambiguous
Find the odd one(s) out.
Only (i)
Only (ii)
Only (iii)
All of these are incorrect

ii. Readable & Traceable

29. Which of these is not a broad category in data quality check?
Column enforcement
Row enforcement
Structure enforcement
Value enforcement

Row enforcement

30. While loading fact tables, generating surrogate keys for them for every dimension may be slow. One of the ways to overcome this issue is:
Have a special lookup table to store the surrogate key for every new natural key for each dimension.
Create them in the source system databases.
Use the same surrogate key for different dimensions of the same natural key
No other solution. Will have to go with slow process

1. If the ETL process tries to ‘pull’ the data from the host server during the automatic load to the data warehouse, one of the challenges involved is:
the associated network may be down
userid / password combination mismatch
it may end up connecting to an incorrect host server
it may end up pulling the incomplete / incorrect file

it may end up pulling the incomplete / incorrect file

2. Exception propagation is a scenario where:
One exception triggering other exceptions
Same exception getting raised again and again.
One exception getting logged many times.
To send many messages about an exception

One exception triggering other exceptions

3. Once the data warehouse is ready to get loaded with data, it is suggested to remove all kinds of DB constraints on its tables. Why?
It helps reduce the disk space consumption.
Since it’s a DWH and not a transactional system
With those constraints, it is not possible to load data into DWH tables.
So that the loading process is fast as it does not involve checking for constraints for every row loaded.

So that the loading process is fast as it does not involve checking for constraints for every row loaded.

4. Among the operational considerations of ETL processes, historical data that is no longer required and has not been used for a long time can be purged. Correct?
No, it may be required in future.
No, it cannot be done once loaded into the data warehouse
Yes; however they have to be moved back to their original sources.
Yes, it can be done for better performance & to free some disk space

Yes, it can be done for better performance & to free some disk space

5. ETL processes working on a particular hardware configuration can be moved to another hardware environment. There should be a mechanism to avoid hard-coding environment values such as server name, DB details, etc. in the code of ETL jobs. How?
Re-coding these values in the ETL jobs every time when the h/w environment changes
Asking the ETL administrator to change these values
By specifying them as parameters to ETL jobs
Not possible. ETL jobs are permanently coded for specific h/w environment.

By specifying them as parameters to ETL jobs
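
One possible shape for such parameterisation, sketched in Python with the standard `argparse` module. The parameter names (`--server`, `--database`, `--port`), the default port, and the connection-string format are all hypothetical; a real ETL tool would have its own parameter mechanism.

```python
import argparse


def parse_job_params(argv: list) -> argparse.Namespace:
    """Read environment-specific values from job parameters."""
    parser = argparse.ArgumentParser(description="Hypothetical ETL load job")
    parser.add_argument("--server", required=True, help="target DB host")
    parser.add_argument("--database", required=True, help="target DB name")
    parser.add_argument("--port", type=int, default=1521, help="DB port")
    return parser.parse_args(argv)


def build_connection_string(params: argparse.Namespace) -> str:
    """Assemble an illustrative connection string from the parameters."""
    return f"{params.server}:{params.port}/{params.database}"


if __name__ == "__main__":
    import sys

    print(build_connection_string(parse_job_params(sys.argv[1:])))
```

Moving the job to another environment then means changing only the arguments passed by the scheduler, for example `--server dwh-test01 --database SALES_DWH`, and not the job code.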

6. While loading data from external files, using database INSERT and UPDATE statements is highly inefficient. Databases such as Oracle and DB2 provide a utility that is very efficient and fast. What is this utility?
Bulk loader utility
Mass load utility
Data pump utility
Export/Import utility

Bulk loader utility

7. In the push approach of extracting data from host systems, a special file is sent last to let the ETL system know that the data is ready for pickup. What is this special file called?
Signaling file
Last_host_file
Sentinel file
Host special file

Sentinel file
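
A minimal sketch of how an ETL job might honour a sentinel file, assuming the host drops its data files into a landing directory and writes the sentinel last. The sentinel name `_DONE` and the function name are assumptions made for illustration only.

```python
from pathlib import Path


def files_ready_for_pickup(landing_dir: str, sentinel_name: str = "_DONE") -> list:
    """Return the data files only if the sentinel file has arrived.

    Without the sentinel, the transfer may still be in progress, so
    nothing is picked up and an empty list is returned.
    """
    landing = Path(landing_dir)
    if not (landing / sentinel_name).exists():
        return []
    return sorted(
        p for p in landing.iterdir()
        if p.is_file() and p.name != sentinel_name
    )
```

Because the sentinel is written only after all data files, its presence guards against picking up a half-transferred file.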

8. Which of these may not be an important criterion while monitoring ETL job load performance:
job run duration
no. of table columns read per second
no. of rows read/written per second
no. of bytes processed per second

no. of table columns read per second

9. ETL jobs being ‘token aware’ means:
they should be aware of every record being loaded
become aware of arrival of a new external file and run the respective job automatically
becoming aware of a particular user logging into the ETL system
they should be aware of no. of jobs currently running, completed, and failed

become aware of arrival of a new external file and run the respective job automatically

10. While arranging jobs to run during ETL job scheduling, the exact time when the job runs may be insignificant. But what’s significant is:
the no. of jobs to run in unit time
how many jobs are successful and how many have failed
the relationships and dependencies among those jobs
what’s the error message being logged

the relationships and dependencies among those jobs

11. One of the ways to identify partially loaded data, that was loaded by a failed ETL job is to:
look at the timestamps of the records
look at their row numbers
look at the modified timestamp of physical file that contains the table data
ask the ETL team

look at the timestamps of the records
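
As an illustration of the timestamp approach, the sketch below finds rows written at or after the start time of the failed job. It assumes every row carries a load timestamp stored as an ISO-8601 string; the table `sales_fact` and the column `loaded_at` are hypothetical names.

```python
import sqlite3


def partially_loaded_ids(conn: sqlite3.Connection, job_start: str) -> list:
    """Return ids of rows loaded at or after the failed job's start time.

    ISO-8601 timestamps in the same format compare correctly as strings.
    """
    rows = conn.execute(
        "SELECT id FROM sales_fact WHERE loaded_at >= ? ORDER BY id",
        (job_start,),
    ).fetchall()
    return [row[0] for row in rows]


def build_demo_connection() -> sqlite3.Connection:
    """Create an in-memory table with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales_fact (id INTEGER, loaded_at TEXT)")
    conn.executemany(
        "INSERT INTO sales_fact VALUES (?, ?)",
        [
            (1, "2024-01-01T02:00:00"),  # loaded by an earlier, successful run
            (2, "2024-01-02T02:00:05"),  # loaded by the failed run
            (3, "2024-01-02T02:00:09"),  # loaded by the failed run
        ],
    )
    return conn
```

The rows identified this way can then be deleted or reprocessed before the job is run again.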

12. If a failed ETL job can resume from the point where it failed when it is rerun, such a job is referred to as a:
rerunnable job
re-entrant job
recallable job
recoverable job

re-entrant job
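
One simple way to get this resume behaviour is a checkpoint that records how far the job got. The sketch below is an assumption-laden illustration, not how any specific ETL tool does it: `run_reentrant_load`, the `load_row` callback, and the JSON checkpoint file are all hypothetical.

```python
import json
from pathlib import Path


def run_reentrant_load(rows: list, load_row, checkpoint_path: str) -> int:
    """Load rows, resuming after the last checkpoint if one exists.

    Returns the number of rows loaded in this run. If load_row raises,
    the checkpoint keeps the index of the row that failed, so the next
    run starts from that row instead of from the beginning.
    """
    checkpoint = Path(checkpoint_path)
    start = 0
    if checkpoint.exists():
        start = json.loads(checkpoint.read_text())["next_index"]

    for index in range(start, len(rows)):
        load_row(rows[index])
        # Record progress only after the row has been loaded successfully.
        checkpoint.write_text(json.dumps({"next_index": index + 1}))

    # A completed run clears the checkpoint so the next run starts fresh.
    checkpoint.unlink(missing_ok=True)
    return len(rows) - start
```

In this sketch the row load and the checkpoint write are not atomic, so a crash between the two could reload one row on restart. A database-backed job would typically commit both in the same transaction.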

13. If an ETL job wants to send automated notification to relevant users about the statuses of ETL jobs,
it can be done using integrated ETL tool
it has to use third-party messaging application
custom scripts have to be written for it
any of these options can be used for this purpose

any of these options can be used for this purpose

14. ETL support group personnel are responsible for sending alerts to every user whenever a job completes, fails, is running, or reaches any other such status.
TRUE
FALSE

FALSE

15. One of the good strategies that needs to be implemented while designing the ETL systems in handling exceptions is:
it must look at loading all source data to the data warehouse whether good or bad
it must look at loading only good data irrespective of the time it takes in loading
it must look at the business value in loading the good data while skipping the bad data
design strategy should only look at loading the data and never about handling exceptions

it must look at the business value in loading the good data while skipping the bad data

16. If an ETL job loading transactional data to the data warehouse fails halfway through:
there should be a mechanism to resume the job from that halfway
terminate the process and do not try to run it again
end up loading duplicate rows upon the next run
error should be logged and continue with loading rest of the data

there should be a mechanism to resume the job from that halfway

17. One of the methods to improve overall throughput of ETL process during incremental load is:
increase the no. of reads & writes
partitioning the tables
connecting to source systems during non-peak hours
to define more constraints to tables

partitioning the tables

18. ETL schedulers should make metadata available for business users and to the DWH system.
TRUE
FALSE

ETL mcq questions and answers

1. Certain work-related terms, such as ‘account’, may have different meanings depending on the department that uses them. Coming up with a uniform definition for such terms is very important while designing a data warehouse. What metadata category does this belong to?
Business metadata
Technical Metadata
Operational metadata
Business standards

Business metadata

2. Data from the source systems are ‘transformed’ before loading the source data into a data warehouse (DWH). Is there a mechanism to get a log of all transformations done on these source data?
Yes, DWH administrator maintains the list
Yes, ETL team manually maintains this log
Yes, it’s available as ETL process execution metadata
Yes, logical data map has it; but not the actual log of it

Yes, it’s available as ETL process execution metadata

3. How will a third-party ETL tool come to know complete details about a data warehouse (DWH) like its data structures, tables, models, etc.?
ETL team has to input all these details into the tool
DWH administrator has to put them in a flat file so that the tool can read it
Many standard DWH related inputs will be already in-built in the ETL tools
They get it from the metadata repository of the DWH

They get it from the metadata repository of the DWH

4. One of the challenges involved in managing the metadata in a data warehouse is:
metadata might get scattered in large enterprises
creating a database table for storing the metadata
the metadata table size might grow very large
None of these

metadata might get scattered in large enterprises

5. Which of these metadata are used during the ETL process?
i. ETL system parameters
ii. Descriptions of source system databases
iii. Complete ETL job details
Only (i)
Only (ii)
Only (ii) & (iii)
All of these

All of these

6. Maintaining a list of data elements of the data warehouse (DWH) and their business descriptions. Where is it stored?
One of the transaction tables in the DWH
DWH data dictionary
In respective source system databases
DWH administrator maintains it separately

DWH data dictionary

1. Which one of these is an example of an ETL tool?
Oracle Apps
JDBC connector
Ab Initio
Oracle Essbase

Ab Initio

2. One of the main strengths of ETL tools is:
they are easy to use
they make development effort simpler, faster & cheaper
they provide a nice user-interface
they are almost free

they make development effort simpler, faster & cheaper

3. One of the weaknesses of ETL tools is:
May not provide the flexibility that a hand-coded ETL tool provides
All of them are very expensive
Require a very long training period to learn their usage
Consumes a lot of disk space for installation

May not provide the flexibility that a hand-coded ETL tool provides

4. Which one of these may not be an important factor while choosing an ETL tool?
Management & administration
Performance
Maintainability
Task capability

Maintainability
