How to use Data Pump to import data into the shards of a Sharded Database (SDB)?

Data Pump Import (impdp)  has a new capability starting from Oracle Database 12.2 to filter appropriate records for loading into individual shards of an SDB. It supports all methods of Sharding – namely System Managed, Composite and User-defined Sharding.

If you have a stand-alone database, then you need to export the table data using the data_options=group_partition_table_data. This unloads all table data at once instead of unloading each table partition independently.

Here is an example that walks you through the steps:

On the Source Database:

SQL>grant DATAPUMP_EXP_FULL_DATABASE to  testuser1;

SQL>create or replace directory def_dir1 as ‘/tmp’

SQL>grant read, write on directory def_dir1 to testuser1;

On the non-sharded database, export the table data using “data_options=group_partition_table_data” as shown below. This unloads all table data at once instead of doing each table partition independently. Data Pump exports all data into the dump file.

$ expdp testuser1/testuser1@src_db                   \

directory=def_dir1                                         \

schemas=testuser1                                         \

dumpfile=no_part%u                                     \

logfile=tkdpshrd_exp.log                              \

data_options=group_partition_table_data \

parallel=4

 

Pre-reqs (on the Target Sharded Database)

On the Shard Catalog

Create the Data Pump directory objects (on shardcat and all shards) and grant appropriate role to the user as shown below.

SQL>alter session enable shard ddl

SQL>create or replace directory def_dir1 as ‘/tmp

SQL>grant read, write on directory def_dir1 TO <user>;

SQL>grant DATAPUMP_IMP_FULL_DATABASE  to  <user>;

<create the sharded tables that match the structure of the source tables>

<Create other schema objects>

On each of the shards:

Execute the import specifying CONTENT=DATA_ONLY.  Because the tables already exist in the shard, attempts to recreate them will fail. The Data Pump Import detects that you are importing into a shard and only load rows that belong to that shard.  You need to run the import command on each shard.

impdp testuser1/testuser1@shard1          \

directory=def_dir1                                 \

dumpfile=no_part%u                             \

logfile=tkdpshrd_imp_01.log                \

table_exists_action=append                 \

content=data_only                                \

parallel=4

At the time of import, the Data Pump filters out rows that do not belong in the given shard. Data Pump passes key values for each row to a internal GSM function that tells  whether the row belongs in that shard.

 

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s