Introduced in Oracle Database 10g, Data Pump is a server-based data extraction and loading tool. It improves on the architecture and functionality of the original Export and Import utilities. Data Pump allows you to stop and restart jobs, see the status of running jobs, and restrict which data and objects are exported or imported.
NOTE: Data Pump files are incompatible with those generated from the original Export utility.
Data Pump runs as a server process, benefiting users in many ways. The client process that starts a job can disconnect and later reattach to it. Performance is enhanced (as compared to the original Export/Import) because the data no longer has to be processed by a client program, and Data Pump extractions and loads can be parallelized, further enhancing performance. In this section, you will see how to use Data Pump, along with descriptions and examples of its major options, including how Data Pump uses directory objects, how to specify options on the command line, and how to stop and restart jobs from the Data Pump command-line interface.
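As a minimal sketch of the server-side setup and a basic export (the directory name, path, user, and file names below are illustrative assumptions, not fixed values):

```sql
-- Run as a DBA: create a directory object pointing at a server-side path
-- and let a user read and write dump/log files there.
CREATE DIRECTORY dpump_dir1 AS '/u01/app/oracle/dpdump';
GRANT READ, WRITE ON DIRECTORY dpump_dir1 TO scott;
```

```
# Basic schema export run from the operating system; dump and log files land on the server
expdp scott/tiger DIRECTORY=dpump_dir1 DUMPFILE=scott.dmp LOGFILE=scott_exp.log SCHEMAS=scott
```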
Data Pump Architecture:
Certain enhancements in Data Pump cannot be achieved with the traditional exp/imp utilities; one of the most important is the ability to restart an export or import job. I will try to go through most of the useful features (with examples) that are new in Oracle 10g and above. Moving data across databases and servers is an integral part of enterprise architecture: there has always been a need to move a subset of data from one database to another, or from one server to another. In earlier versions this was achieved with Oracle's exp/imp tools. With Oracle 10g we have been given the Data Pump utility, which provides fast, parallel bulk data and metadata movement between databases. Data Pump also creates its own supporting infrastructure; its components are described below.
In Oracle 10g, Data Pump (a server-side utility) does all the work. The earlier exp/imp utilities were executed from the client, and the dump file was stored on the client; with Data Pump, the dump files are stored on the server. The Data Pump architecture consists of the following components:
- Data Pump API
- Client Tools
- Master Control Process
- Worker Process
Data Pump API: DBMS_DATAPUMP is the Data Pump API. It is the engine through which jobs are created and monitored.
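A minimal PL/SQL sketch of driving an export through DBMS_DATAPUMP directly (the job name, schema, and directory object are illustrative assumptions):

```sql
DECLARE
  h1        NUMBER;
  job_state VARCHAR2(30);
BEGIN
  -- Open a schema-mode export job; this creates the master table.
  h1 := DBMS_DATAPUMP.OPEN(operation => 'EXPORT',
                           job_mode  => 'SCHEMA',
                           job_name  => 'SCOTT_EXP_JOB');

  -- Name the dump file and the directory object it is written to.
  DBMS_DATAPUMP.ADD_FILE(handle    => h1,
                         filename  => 'scott.dmp',
                         directory => 'DPUMP_DIR1');

  -- Restrict the job to the SCOTT schema.
  DBMS_DATAPUMP.METADATA_FILTER(handle => h1,
                                name   => 'SCHEMA_EXPR',
                                value  => 'IN (''SCOTT'')');

  DBMS_DATAPUMP.START_JOB(h1);
  DBMS_DATAPUMP.WAIT_FOR_JOB(h1, job_state);   -- blocks until the job finishes
  DBMS_DATAPUMP.DETACH(h1);
END;
/
```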
Client Tools: Where earlier versions provided the exp/imp tools, Data Pump provides the expdp and impdp clients. These client-side tools use the DBMS_DATAPUMP API to create and monitor jobs.
Master Control Process: The Master Control Process (MCP) controls the execution of the Data Pump job; there is one MCP per job. It maintains control information such as the job name, job status, and file information in the master table, which has the same name as the job, and the details in the master table can be used to restart a job. The master table sits at the heart of every Data Pump operation: it is created at the start of the operation and dropped at the end of a successful run. It is also dropped if the job is removed with the interactive KILL_JOB command, but it is retained if the job is stopped with STOP_JOB or stops unexpectedly. Setting the KEEP_MASTER parameter to YES retains the master table at the end of a successful job for maintenance purposes.
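A hedged sketch of the stop/restart behaviour described above, using an explicitly named job (the job, file, and directory names are illustrative):

```
# Start a schema export with an explicit job name; the master table gets the same name.
expdp scott/tiger DIRECTORY=dpump_dir1 DUMPFILE=scott.dmp SCHEMAS=scott JOB_NAME=scott_exp_job

# Press Ctrl-C to drop into the interactive console, then:
#   Export> STOP_JOB=IMMEDIATE   (stops the job; the master table is retained)
#   Export> KILL_JOB             (aborts the job; the master table is dropped)

# Reattach to the stopped job later and resume it:
expdp scott/tiger ATTACH=scott_exp_job
#   Export> START_JOB
```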
Worker Process: Worker processes are created by the MCP according to the PARALLEL parameter. They perform the tasks requested by the MCP, namely the loading and unloading of data and metadata, and they record their progress in the master table, which can be used to restart the job.
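As a sketch of how the PARALLEL parameter drives the number of worker processes (the degree and file names are illustrative); the %U substitution lets each worker write to its own piece of the dump file set:

```
# Four worker processes, each writing its own dump file
expdp scott/tiger DIRECTORY=dpump_dir1 DUMPFILE=scott_%U.dmp PARALLEL=4 SCHEMAS=scott LOGFILE=scott_par.log
```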
The advantages of using Data Pump are (several of these are sketched in the commands after this list):
- ability to estimate job times
- ability to restart failed jobs
- perform fine-grained object selection
- monitor running jobs
- directly load a database from a remote instance via the network
- remapping capabilities
- improved performance using parallel executions
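A few hedged command sketches of the capabilities above (the directory, database link, and schema names are illustrative assumptions):

```
# Estimate the size of an export without writing a dump file
expdp scott/tiger DIRECTORY=dpump_dir1 ESTIMATE_ONLY=YES SCHEMAS=scott

# Fine-grained object selection: export only the tables in the schema
expdp scott/tiger DIRECTORY=dpump_dir1 DUMPFILE=scott_tabs.dmp SCHEMAS=scott INCLUDE=TABLE

# Load directly from a remote instance over a database link (no dump file is written)
impdp scott/tiger DIRECTORY=dpump_dir1 NETWORK_LINK=remote_db SCHEMAS=scott LOGFILE=net_imp.log

# Remap a schema during import
impdp system/password DIRECTORY=dpump_dir1 DUMPFILE=scott.dmp REMAP_SCHEMA=scott:scott_copy
```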
A couple of notes: you can export only to disk, not to a tape device, and Data Pump import works only with Oracle 10.1 or later. Also, remember that expdp and impdp are command-line tools and run from within the operating system.
Data Pump Uses: You can use Data Pump for the following:
- migrating databases
- copying databases
- transferring Oracle databases between different operating systems
- backing up important tables before you change them
- moving database objects from one tablespace to another
- transporting tablespaces between databases
- reorganizing fragmented table data
- extracting the DDL for tables and other objects such as stored procedures and packages (see the sketch after this list)
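A hedged sketch of the DDL-extraction use in the last item (file and directory names are illustrative); SQLFILE writes the DDL to a script instead of importing anything:

```
# Extract DDL from an existing dump file into a SQL script; no objects are actually created
impdp scott/tiger DIRECTORY=dpump_dir1 DUMPFILE=scott.dmp SQLFILE=scott_ddl.sql
```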
How does Data Pump work?
The Master Control Process (MCP) has the process name DMnn. There is one MCP per job; it controls the whole Data Pump job and performs the following:
- create jobs and control them
- creates and manages the worker processes
- monitors the jobs and logs the process
- maintains the job state and restart information in the master table (created in the user schema running the job)
- manages the necessary files including the dump file set.
The worker processes are named DWnn and actually perform the work; several worker processes can run on the same job (parallelism). The worker processes update the master table with the various job statuses.
The shadow process is created when the client logs in to the Oracle server. It services Data Pump API requests and creates the job, which consists of the master table and the master control process.
The client processes are the expdp and impdp commands.
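A minimal sketch of watching these processes from SQL while a job runs (the views are standard dictionary views; the column selection is illustrative):

```sql
-- Jobs known to the database, including stopped jobs whose master table remains
SELECT owner_name, job_name, operation, job_mode, state, degree
FROM   dba_datapump_jobs;

-- Sessions (client, master, worker) attached to running Data Pump jobs
SELECT owner_name, job_name, session_type
FROM   dba_datapump_sessions;
```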
Data Pump files
You will use three types of files when using Data Pump; all of them are created on the server.
Dump Files - holds the data and metadata
Log Files - the resulting output from the data pump command
SQL Files - contain the DDL statements describing the objects included in the job but do not contain any data
Master Data Pump Tables - Data Pump creates a master table within the schema running the job; it is used to control the Data Pump job and is removed when the job finishes.
Five modes of Data Pump exports are supported (command sketches follow the note below):
- Full: Export of all database data and metadata
- Schema: Export data and metadata for specific user schemas
- Tablespace: Export data and metadata for tablespaces
- Table: Export data and metadata for tables and table partitions
- Transportable Tablespace: Export metadata for specific tablespaces in preparation for transporting a tablespace from one database to another
NOTE: You must have the EXP_FULL_DATABASE system privilege to perform a Full export or a Transportable Tablespace export.
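As a hedged sketch, the five modes map to expdp parameters roughly as follows (all user, object, and file names are illustrative assumptions):

```
expdp system/password DIRECTORY=dpump_dir1 DUMPFILE=full.dmp FULL=Y                      # Full
expdp system/password DIRECTORY=dpump_dir1 DUMPFILE=hr.dmp SCHEMAS=hr                    # Schema
expdp system/password DIRECTORY=dpump_dir1 DUMPFILE=users_ts.dmp TABLESPACES=users       # Tablespace
expdp hr/hr DIRECTORY=dpump_dir1 DUMPFILE=emp.dmp TABLES=employees,departments           # Table
expdp system/password DIRECTORY=dpump_dir1 DUMPFILE=tts.dmp TRANSPORT_TABLESPACES=users  # Transportable tablespace (metadata only)
```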