Tag Archives: Sharding

My takeaways from “High Availability and Sharding with Next Generation Oracle Database” – an OOW 2017 conference session

In his High Availability (HA) conference session, Wei Hu, VP of Product Development, covered the new Oracle Sharding and HA features of 18c, and how the Oracle Maximum Availability Architecture (MAA) powers the Oracle Autonomous Database Cloud.

Here are my key takeaways from the session:

  • Oracle Autonomous Database Cloud Service relies completely on Oracle Maximum Availability Architecture (MAA) for automated protection and repair for both planned and unplanned downtime:
    • System Failure – Exadata, RAC, ASM
    • Regional Outage – Active Data Guard
    • Patches – RAC Rolling Updates
    • Major Upgrades – Transient Logical Standby
    • Table Changes – Online Redefinition
    • User Error – Flashback
  • Sharding 1.0 (Oracle 12.2) is great for internet-style applications
  • Sharding 2.0 (Oracle 18c) expands the range of use cases. The following are some of the key Sharding features in 18c:
    • User-defined sharding – Partition data across shards by RANGE or LIST. By explicitly mapping the data to the shards, you achieve better control, compliance, and application performance. This sharding method supports the geo-distributed, hybrid cloud, and cloud bursting use cases (a minimal sketch follows this list).
    • Swim-lanes – Establish mid-tier affinity with shards to improve mid-tier cache locality and connection management, and, for geo-distributed shards, to eliminate chatty mid-tier-to-shard connections across regions.
    • Multishard queries – Proxy routing now supports all sharding methods and all query shapes.
    • RAC Sharding – Affinitizes table partitions to RAC instances. Requests that specify the sharding key are routed to the RAC instance that logically holds the partition. Affinity gives better cache utilization and fewer block pings across instances. Oracle RAC on steroids, I say.
  • Wei showcased the first customer implementation of Oracle Sharding. China Telecom went from setup to go-live in only 3 months. China Telecom’s rationale – “Migration cost is too high, if we go to other data stores…DBAs and Developers are familiar with Oracle Database. Since Oracle has sharding, why don’t use Oracle Sharding?” Details of China Telecom’s implementation will be covered in my Oracle Sharding conference session.
  • Some of the Active Data Guard 18c Enhancements:
    • Automated Nologging support in Active Data Guard
    • Update-like capabilities on Active Data Guard for “Mostly Read, Occasional Updates” applications
    • Automatic password file updates on Active Data Guard standbys – changing the Admin password on the primary automatically updates the standbys
    • Standby-first encryption – Encrypt tablespaces on the standby first, switch over, then encrypt on the old primary. How cool is this!
    • New Broker “Validate” command – For validation and proactive health checks to detect and fix issues before they cause downtime
  • He also covered automated backups with Zero Data Loss Recovery Appliance, online operations, online patching enhancements, the Zero Downtime Upgrade utility, and many other capabilities.
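
As a minimal sketch of the new user-defined sharding method mentioned above – the table, column, and tablespace names below are hypothetical; each LIST partition maps to a tablespace placed in a specific shardspace:

CREATE SHARDED TABLE accounts
( id             NUMBER NOT NULL
, account_number NUMBER
, customer_id    NUMBER
, state          VARCHAR2(2) NOT NULL
, CONSTRAINT accounts_pk PRIMARY KEY (id, state)
)
PARTITION BY LIST (state)
( PARTITION p_west VALUES ('OR', 'WA') TABLESPACE tbs_west
, PARTITION p_east VALUES ('NY', 'NJ') TABLESPACE tbs_east
);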

Oracle Database 18c HA features are the core building blocks for the reliability of the Oracle Autonomous Database Cloud. On-premises deployments must deliver the same availability as the Oracle Cloud. Attain this by adopting the same MAA architecture used by the Oracle Autonomous Database Cloud Service.

Are your on-premises database deployments as resilient and reliable as the Autonomous Database Cloud? Something to ponder.


Oracle Sharding – Licensing

This blog post covers how Oracle Sharding is licensed for On-premises and Oracle Cloud deployments.

Q: How is Oracle Sharding Licensed for On-premises?

A: For a sharded database (SDB) with 3 or fewer primary shards, Oracle Sharding is included with Oracle Database Enterprise Edition (EE), which includes Data Guard. There is no limit on the number of standby shards.

For an SDB with more than 3 primary shards, in addition to EE, all shards must be licensed either for Active Data Guard, Oracle GoldenGate or RAC.

So if your licensing agreement covers EE and one of the HA options – Active Data Guard, Oracle GoldenGate, or RAC – you can use Oracle Sharding at no additional cost.

Q: How is Oracle Sharding Licensed for Oracle Cloud (PaaS)?

A: With DBCS EE and DBCS EE-High Performance (HP), use is limited to three primary shards. (No limit on number of standby shards)

With DBCS EE-Extreme Performance (EP) and Exadata Cloud Service (ECS), there is no limit on the number of primary shards or standby shards.

Q: What is the Sharding BYOL model for Oracle Cloud (IaaS)?

A: For a sharded database (SDB) with 3 or fewer primary shards, Oracle Sharding is included with EE (which includes Data Guard), and there is no limit on the number of standby shards. For those 3 primary shards, if Active Data Guard, Oracle GoldenGate, or RAC is required, it must be licensed accordingly.

For an SDB with more than 3 primary shards, in addition to EE, all shards must be licensed either for Active Data Guard, Oracle GoldenGate or RAC.

Ref: Oracle Database Licensing Information User Manual, 12c Release 2 (12.2)

Can I build a scalable document store application with an Oracle Sharded Database?

Yes, indeed! Starting with Oracle Database 12c Release 1 (12.1.0.2), Oracle Database provides support for JSON. It can store, manage, and index JSON documents, and it enables rapid schemaless development. Oracle delivers the schemaless development paradigm that many developers want today, but without compromising on core enterprise capabilities. While many NoSQL systems are now recognizing the need for a tabular, structured format for accessing data, and some are even introducing SQL-like languages, Oracle Database delivers the full power of SQL to JSON document stores today, with its advanced SQL analytic capabilities and scalable parallel SQL infrastructure.

Unlike NoSQL databases, Oracle Database provides sophisticated SQL querying and reporting over JSON documents. This lets you integrate JSON and relational data, joining them in the same query. And because JSON features are integrated into Oracle Database 12c, all of its enterprise features for availability, security, scalability, and manageability are fully supported for JSON data. And now, with the advent of Oracle Sharding, which naturally supports Oracle JSON, document-oriented databases can be built very easily. The following example illustrates how simple it is to build a system-managed sharded table for a document store application.

CREATE SHARDED TABLE Customers
( CustId      VARCHAR2(60) NOT NULL,
  CustProfile CLOB,
  CONSTRAINT pk_customers PRIMARY KEY (CustId),
  CONSTRAINT cons_json CHECK (CustProfile IS JSON)
)
PARTITION BY CONSISTENT HASH (CustId)
PARTITIONS AUTO
TABLESPACE SET tbs1;
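
To illustrate, here is a minimal sketch of storing and retrieving a JSON document in this sharded table (the sample document and values are hypothetical):

INSERT INTO Customers (CustId, CustProfile)
VALUES ('cust-1001', '{"name":"Ann Lee","status":"Gold","city":"Austin"}');

SELECT JSON_VALUE(CustProfile, '$.status') AS status
FROM Customers
WHERE CustId = 'cust-1001';

Because the sharding key (CustId) appears in the predicate, the query can be routed directly to the single shard that holds this document.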

The combination of Oracle JSON and Oracle Sharding delivers data model flexibility and developer productivity, plus the linear scalability of a sharded database.

I plan to do a detailed blog on using Oracle JSON with Oracle Sharding.

Cookbooks on Oracle Sharding

In this blog, I will give you pointers on how to get started with deploying a sharded database (SDB), creating a sample sharded schema, and accessing the SDB with direct and proxy routing. I will also cover how to elastically scale an SDB.
Oracle Sharding provides the capability to automatically deploy the sharded database, which includes both the shards and the replicas. For the replication, the SDB administrator can select Data Guard, Active Data Guard, or Oracle GoldenGate 12.3 while specifying the topology.
The SDB administrator defines the topology (regions, shard hosts, replication technology etc.) and invokes the DEPLOY command with declarative specification using the GDSCTL command-line interface.
The high-level steps of the deployment of a sharded database include the following:
a) Prerequisites:
• Create a Non-CDB database that hosts the shard catalog
• Install Oracle Database software on shard nodes
• Install shard director (GSM) software on shard director nodes
Note: For production deployments, it is highly recommended to configure Data Guard for the shard catalog database. In the Oracle Database 12.2.0.1 release, all shards and the shard catalog must be non-CDB databases.
b) Specify the topology layout using:
• CREATE SHARDCATALOG
• ADD GSM
• START GSM
• ADD CREDENTIAL (IF USING ‘CREATE SHARD’)
• ADD SHARDGROUP
• ADD INVITEDNODE
• CREATE SHARD (OR ADD SHARD) (for each shard)
c) Run the “DEPLOY” command and add a global service to access any shard in the SDB:
• DEPLOY
• ADD SERVICE
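As a minimal end-to-end sketch, the flow above might look like the following GDSCTL session (host names, ports, service names, and credentials below are hypothetical):
GDSCTL> create shardcatalog -database cathost:1521/shardcat -user mysdbadmin/mysdbadmin_pwd -sdb mysdb -region region1
GDSCTL> add gsm -gsm sharddirector1 -listener 1571 -pwd gsmcatuser_pwd -catalog cathost:1521/shardcat -region region1
GDSCTL> start gsm -gsm sharddirector1
GDSCTL> add credential -credential os_cred -osaccount oracle -ospassword os_pwd
GDSCTL> add shardgroup -shardgroup primary_sg -deploy_as primary -region region1
GDSCTL> add invitednode shardhost1
GDSCTL> create shard -shardgroup primary_sg -destination shardhost1 -credential os_cred
GDSCTL> deploy
GDSCTL> add service -service oltp_rw_srvc -role primary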
Oracle Sharding supports two deployment methods.
  1. The first method is with the “CREATE SHARD” command, where the creation of shards and the configuration of the replication setup are automatically done by the Oracle Sharding management tier. The “DEPLOY” command creates the shards. This is done via the DBMS_SCHEDULER package (executed on the shard catalog), which communicates with the Scheduler agents on the remote shard hosts. The agents then invoke DBCA and NETCA to create the shards and the local listeners. Once the primary shards are created, the corresponding standby shards are built using the RMAN ‘duplicate’ command. After the primary and standby shards are built, the “DEPLOY” command configures Data Guard Broker with Fast-Start Failover (FSFO) enabled. The FSFO observers are automatically started on the regional shard director. If you plan to use the CREATE SHARD method, use this cookbook.
  2. The second method is with the “ADD SHARD” command. Many customers have their own database creation standards and may opt to deploy the SDB using their own pre-created databases. The ADD SHARD based deployment method supports this requirement by simply adding the shards that are pre-built by the user. If the “ADD SHARD” command is used for deployment, the “DEPLOY” command handles the configuration of Data Guard, the Broker, and Fast-Start Failover. It also handles the scenario where the user has pre-configured Data Guard for the shard that is being added. You may also want to consider the “ADD SHARD” method if you are using ASM, RAC, or Oracle Restart at the shard level. If you do plan to use the ADD SHARD method, use this specific cookbook.
These cookbooks walk you through the entire lifecycle of a sharded database, which includes the following steps:
  • Creating the shard catalog and shard directors for system-managed sharding
  • Specifying the metadata and deploying the sharded database
  • Creating a table family using system-managed sharding
  • Specifying the sharding key for session-based routing (using SQL*Plus)
  • Observing uniform data distribution by executing read-write and read-only workloads on the sharded database with a demo application (using UCP)
  • Executing cross-shard queries
  • Elastically scaling the sharded database
Hope this helps you get started with Oracle Sharding.

Components of an Oracle Sharded Database (SDB)

Oracle Sharding is a scalability, availability and geo-distribution feature for applications that enables distribution and replication of data across a pool of Oracle databases that share no hardware or software. Applications elastically scale (data, transactions and concurrent users) to any level, on any platform, simply by adding additional databases (shards) to the pool. Oracle Sharding allows applications to perform high velocity relational transactions. It is applicable to applications (OLTP and Data Analytics) where the primary access pattern is via the Sharding key. Sharding supports multi-shard operations as well.

Oracle Sharding provides the advantages of an enterprise DBMS, including: relational schema, SQL, and other programmatic interfaces, support for complex data types, online schema changes, multi-core scalability, advanced security, compression, high-availability, ACID properties, consistent reads, developer agility with JSON, and much more.

In this blog, I will go over the various components of a Sharded Database (SDB) – Shards, Shard Catalog, Shard Directors (GSM), Global Service and Connection Pools. Figure 1 showcases the architectural components of Oracle Sharding:


Figure 1: Oracle Sharding Architecture with no SPOF

Sharded Database and Shards

Shards are independent Oracle databases hosted on database servers that have their own local resources – CPU, memory, and disk. No shared storage is required across the shards. A sharded database is a collection of shards. Shards can all be placed in one region (one or more datacenters) or in different regions. A region in the context of Oracle Sharding represents a datacenter or multiple datacenters that are in close network proximity.

Shards are replicated for High Availability (HA) and Disaster Recovery (DR) with Oracle replication technologies such as Active Data Guard or Oracle GoldenGate. For HA, the standby shards can be placed in the same region where the primary shards are placed. For DR, the standby shards are located in another region.

Shard Catalog

The shard catalog is a special-purpose Oracle Database that is a persistent store for SDB configuration data and plays a key role in automated deployment and centralized management of a sharded database. It also hosts the gold schema of the application and the master copies of common reference data (duplicated tables). The shard catalog database also acts as a query coordinator used to process multi-shard queries and queries that do not specify a sharding key.

All configuration changes, such as adding and removing shards and global services, are initiated on the shard catalog. All DDLs in an SDB are executed by connecting to the shard catalog.

An outage of the shard catalog does not affect the availability of the SDB. A shard catalog outage only affects the ability to perform maintenance operations or multi-shard queries during the brief period required to fail over to a standby shard catalog. OLTP transactions are unaffected; they continue to be routed and executed on the shards. Oracle MAA recommends configuring a local Active Data Guard standby database in Maximum Availability protection mode with Fast-Start Failover, along with a remote physical standby database.

Shard Directors

The Shard Director is a regional network listener for clients that connect to an SDB. It maintains an up-to-date topology of the sharded database. Shard Directors route connections to the appropriate shards based on the sharding key passed during a connection request. The Shard Director is built upon the Oracle Global Service Manager (GSM). GSMs route connections based on database role, load, replication lag, and locality. For Oracle Sharding, GSMs have been enhanced to support management of SDB and routing of connections based on the location of data.

For a typical sharded database, shard directors (GSMs) are installed on dedicated low-end commodity servers in each region. Multiple shard directors should be deployed for high availability. In Oracle Database 12.2, up to 5 shard directors can be deployed in a given region. Oracle MAA recommends deploying three shard directors per region for availability and scalability.

Shard Directors provide the following set of functions:

  • Maintain runtime metadata about SDB configuration and availability of shards
  • Publish SDB topology changes to connection pools
  • Measure network latency between its own region and other regions
  • Act as regional listeners for clients to connect to an SDB
  • Manage global services
  • Perform connection load balancing
  • Publish runtime load balancing information to connection pools
  • Monitor availability of database instances and global services, and notify clients via FAN HA events upon failure incidents
  • Aid in schema propagation to all the shards

Global Service

A Global Service is a database service that can run across multiple databases. A global service allows clients to access data on any shard in the SDB. Global Services offer additional properties for sharded databases – e.g., database role, replication lag tolerance, region affinity, etc. You can create role-based global services: create a read-write global service that accesses data from primary shards, and a separate read-only global service for Active Data Guard shards.
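
As an illustration, role-based global services might be created along these lines – a minimal GDSCTL sketch with hypothetical service names:

GDSCTL> add service -service oltp_rw_srvc -role primary
GDSCTL> start service -service oltp_rw_srvc
GDSCTL> add service -service oltp_ro_srvc -role physical_standby
GDSCTL> start service -service oltp_ro_srvc

Connection strings then reference the appropriate service, so read-write work lands on primary shards and read-only work on the Active Data Guard standbys.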

Connection Pools

At runtime, connection pools act as shard directors by routing database requests across pooled connections. Key enhancements have been made to Oracle connection pools and drivers to support Sharding. Starting from 12.2, JDBC/UCP, OCI and Oracle Data Provider for .NET (ODP.NET) recognize the sharding keys as part of the connection check out. Apache Tomcat, JBoss, IBM WebSphere and Oracle WebLogic can use UCP support for sharding. PHP, Python, Perl, and Node.js can use OCI support.

I will go over the details of the data-dependent routing against an SDB in the next blog.

For more info, do visit the Oracle Sharding portal on Oracle Technology Network (OTN). Follow me on Twitter @nageshbattula

Oracle Sharding Capabilities – Part 3 of 3

In the last of the three-part series on the capabilities of Sharding, I will cover the lifecycle management aspects of a sharded database.

Full Support for the Lifecycle Management of an SDB

The Oracle Sharding platform allows shards to be added or removed online. It also supports data reorganization at the chunk level.

Online scale-out with auto-resharding

Oracle Sharding provides the capability to elastically scale out the sharded database via incremental deployment – shards are added online to expand the pool. In System-Managed sharding, the addition of new shards to the pool automatically triggers resharding, wherein chunks are automatically moved to attain a balanced distribution of data.


Figure 6. Auto-resharding when shards are added
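
As a minimal sketch (hypothetical shard group, host, and credential names), scaling out is a CREATE SHARD followed by DEPLOY; the DEPLOY step triggers the automatic chunk rebalancing described above:

GDSCTL> create shard -shardgroup primary_sg -destination shardhost5 -credential os_cred
GDSCTL> deploy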

Ability to scale-in

Sharding allows you to scale in a given sharded database by removing shards. To shrink the pool, use the “REMOVE SHARD” command. In 12.2.0.1, the chunks must be explicitly moved to the other shards before removing the shard.
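
A minimal scale-in sketch, assuming hypothetical chunk IDs and shard names – the chunks on the departing shard are relocated first:

GDSCTL> move chunk -chunk 7 -source sh4 -target sh1
GDSCTL> move chunk -chunk 8 -source sh4 -target sh2
GDSCTL> remove shard -shard sh4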

Chunk Move or Split

In addition to shard addition and removal, Oracle Sharding supports splitting a chunk and moving chunks from one shard to another. Chunks can be moved when data or workload skew occurs, without a change in the number of shards. Chunk migration can be initiated automatically or by the DBA.
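
Splitting is equally terse – a sketch with a hypothetical chunk ID (the split produces chunks that can then be moved independently):

GDSCTL> split chunk -chunk 3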

Structural modifications to the Sharded Database are done transparently to the application. Connection pools get notified (via ONS) about split, move, add/remove shards and auto-resharding.

Patching Automation

Sharding enables patching all the shards with one command via opatchauto, which supports all sharding schemes and replication methods, as well as rolling and parallel modes.

Supports Command-Line and Graphical User Interfaces

Oracle Sharded databases can be deployed, managed and monitored with two interfaces:

GDSCTL:

GDSCTL is a command-line interface that provides a simple declarative way of specifying the configuration of an SDB and automating its deployment. For example, just a few GDSCTL commands are required to create the SDB:

CREATE SHARDCATALOG

ADD GSM; START GSM (create and start shard directors)

CREATE SHARD (for each shard)

DEPLOY (for automated creation of shards and replication setup)

Oracle Enterprise Manager:

Enterprise Manager enables management of the life cycle of a sharded database with a graphical user interface. For example, with a few clicks, shard director and shard software can be provisioned. Likewise, the deployment of shard directors and the creation of the sharded database can be performed. With EM, you can also manage and monitor an SDB for availability and performance.

In summary, Oracle Sharding supports data distribution, data colocation, data replication, direct as well as proxy routing, and the complete lifecycle of a sharded database. Having studied the key capabilities of Sharding, in the next blog post let’s take a look at the components and architecture of the sharded database.

For more info, do visit the Oracle Sharding portal on Oracle Technology Network (OTN). Follow me on Twitter @nageshbattula

Oracle Sharding Capabilities – Part 2 of 3

In the first part of the three-part blog series, we reviewed the automated data distribution and centralized schema management capabilities of Oracle Sharding. In part 2, I will cover the automated deployment of a sharded database (SDB) and data-dependent routing against an SDB.

Automated creation and replication of shards

In the Oracle Database 12.2.0.1 release, Oracle Sharding supports three automatically configured replication options: Data Guard, Active Data Guard, or Oracle GoldenGate (Oracle GoldenGate 12.3 supports Oracle Sharding).


Figure 2. Shard-level replication with Active Data Guard


Figure 3. Chunkset-level replication with Oracle GoldenGate

Sharding supports two SDB deployment methods.

The first method is with the “CREATE SHARD” GDSCTL command. With this method, the shards and their respective listeners are automatically created. Once the primary shards are created, the corresponding standby shards are automatically built using the RMAN ‘duplicate’ command, after which the Data Guard Broker configuration with Fast-Start Failover (FSFO) is automatically enabled. The FSFO observers are automatically started on the regional shard director.

The second method is with the “ADD SHARD” GDSCTL command. Many organizations have their own database creation standards and they may opt to deploy the SDB using their own pre-created databases (shards). The ADD SHARD based deployment method supports this requirement by simply adding the shards, which are pre-built by the user.

These two deployment methods support both the initial and incremental deployments.

Data-dependent routing

Oracle Sharding supports Direct and Proxy routing of database requests to shards.

Direct Routing:

Key enhancements have been made to Oracle connection pools and drivers to support Sharding. Starting from 12.2, JDBC/UCP, OCI and Oracle Data Provider for .NET (ODP.NET) recognize the sharding keys as part of the connection check out. Apache Tomcat, JBoss, IBM WebSphere and Oracle WebLogic can use UCP support for sharding. PHP, Python, Perl, and Node.js can use OCI support.

The sharding key is used for routing the database connection requests at a user session level during connection checkout. Based on this information, a connection is established to the relevant shard that contains the data pertinent to the given sharding key. Once the session is established on a shard, all SQL queries and DMLs are supported, execute in the scope of the given shard, and require no modification.

Upon the first connection to a given shard, the sharding key range mapping is collected from the shards to dynamically build the shard topology cache. This routing map is cached in the client. This allows subsequent requests for sharding keys within the cached range to be routed directly to the shard, bypassing the shard director. Such data-dependent routing of database requests eliminates an extra network-hop – thereby decreasing the latency for high volume OLTP applications.

When a connection request is made with a sharding key, the connection pool looks up the corresponding shards on which this particular sharding key exists (from its topology cache). If a matching connection is available in the pool, the pool returns a connection to one of these shards by applying its internal connection selection algorithm. If a connection is not available, the connection pool forwards the request, with the sharding key, to the shard director, and a new connection is created.

As illustrated in Figure 4, a database connection request for a given sharding key that is present in the cached topology map goes directly to the shard (i.e., bypassing the shard director).


Figure 4. Logical flow of direct routing – Connection pool as a Shard Director

Note: Super_sharding_key is needed only in the case of composite sharding.

The routing map automatically refreshes when a shard becomes unavailable or changes occur to the sharding topology. This is enabled by the Fast Application Notification (FAN) published by Shard Director via Oracle Notification Server (ONS).

Proxy Routing for Multi-Shard Queries:

Proxy routing is an ancillary usage pattern targeted at developer convenience. It requires the connection to be established to the coordinator. In Oracle Database 12.2.0.1, the shard catalog database assumes the role of the coordinator database. Once the session is established on the coordinator, SQL queries and DMLs are executed without modification. Proxy routing is suitable for the following scenarios:

»  When the application cannot pass the sharding key during connect

»  When the application needs to access data from multiple shards in the same query. For example, a query to aggregate data across all shards.
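
For instance, once a session is established on the coordinator, a cross-shard aggregation over the customers table used elsewhere in this series might be as plain as the following sketch:

SELECT class, COUNT(*) AS num_customers
FROM customers
GROUP BY class;

The coordinator decomposes the query into per-shard fragments and aggregates their results, as described below.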


Figure 5. Logical flow of proxy routing

As illustrated in Figure 5, routing via the coordinator allows users to submit SQL statements without a sharding key value passed during connect. The coordinator’s SQL compiler analyzes and rewrites the query into query fragments that are sent to and executed by the participating shards. The queries are rewritten so that most of the query processing is done on the participating shards and then aggregated by the coordinator.

Applications should separate their workloads for direct vs. proxy routing. Separate connection pools must be created for these workloads.

In this second part of the three-part blog series, we looked at the automated creation of shards and the replication setup, and saw how routing works against a sharded database. In the last part, we will go over the lifecycle management aspects of a sharded database.

For more info, do visit the Oracle Sharding portal on Oracle Technology Network (OTN). Follow me on Twitter @nageshbattula


Oracle Sharding Capabilities – Part 1 of 3

In Oracle Database 12.2.0.1, Oracle Sharding supports the complete lifecycle of a Sharded Database. In this three-part blog post series, we will review the capabilities of Oracle Sharding. In the first part, we will study the automated data distribution and centralized schema management.

Automated Data Distribution

In Oracle Database 12.2.0.1, Oracle Sharding supports two methods of sharding: system-managed and composite.

System-managed sharding:

The system-managed sharding method automatically distributes data across shards using consistent hash (sharding_key). This algorithm evenly and randomly distributes data across shards to eliminate hot spots and provide uniform performance across shards. Oracle Sharding automatically rebalances data when shards are added or removed. System-managed sharding is the most commonly used form of sharding.

Oracle Sharding uses the familiar SQL syntax for table partitioning to specify how table rows are partitioned across shards.  For example, the CREATE SHARDED TABLE statement is used to create a sharded table based on cust_id as the sharding key.

CREATE SHARDED TABLE customers
( cust_id     NUMBER NOT NULL
, name        VARCHAR2(50)
, address     VARCHAR2(250)
, region      VARCHAR2(20)
, class       VARCHAR2(3)
, signup      DATE
, CONSTRAINT cust_pk PRIMARY KEY (cust_id)
)
PARTITION BY CONSISTENT HASH (cust_id)
TABLESPACE SET ts1
PARTITIONS AUTO;

Oracle Sharding automates the creation of tablespaces on all the shards as a unit called a tablespace set. The PARTITIONS AUTO clause specifies that the number of partitions should be automatically determined and mapped to the tablespaces.

The unit of data migration between shards is a chunk. A chunk is a set of tablespaces that collocate corresponding partitions of all tables in a table family. A chunk contains a single partition from each table of a set of related tables. This guarantees that related data from different sharded tables can be moved together. The number of chunks within each shard is specified when the SDB is created (default is 120 chunks per shard).

In addition to sharded tables, many applications require common reference data (e.g., Products) that needs to be accessed as part of queries in the transactions. Multi-shard queries can be avoided by replicating the small number of read-only or read-mostly “non-shardable” tables across all shards. Replication of complete tables is a good choice for relatively small tables that are often accessed together with sharded tables. A table with the same contents in each shard is called a Duplicated Table. For example, a Customers–Orders–Line Items schema may also include a Products table. This table contains data shared by all customers and cannot be sharded by the customer number. Instead, the entire table is duplicated on all databases to prevent multi-shard queries during order processing. A duplicated table such as Products is created using the CREATE DUPLICATED TABLE statement, as sketched below.
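
A minimal sketch of such a duplicated table (the columns and tablespace name are hypothetical):

CREATE DUPLICATED TABLE Products
( ProductId NUMBER PRIMARY KEY
, Name      VARCHAR2(128)
, Price     NUMBER(10,2)
) TABLESPACE products_tsp;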

The combination of sharded and duplicated tables enables all transactions associated with a sharding key to be processed by a single shard. This technique enables linear scalability and fault isolation.

Composite Sharding:

With the composite sharding method, data is first partitioned by list or range (super_sharding_key) and then further partitioned by consistent hash (sharding_key). The two levels of sharding make it possible to map data to a set of shards, and then automatically maintain a balanced distribution of data across that set of shards. Composite sharding is ideal for global data distribution, where shards are placed in each geography and, within a given geography, data is uniformly distributed – enabling linear scalability.

The composite sharding method allows you to create multiple shardspaces for different subsets of data in a table partitioned by consistent hash. A shardspace is a set of shards that store data corresponding to a range or list of key values. For example, the following CREATE SHARDED TABLE statement is used to create a table with the composite sharding method, based on geo as the super_sharding_key and cust_id as the sharding key.

CREATE SHARDED TABLE customers
( cust_id NUMBER NOT NULL
, name VARCHAR2(50)
, address VARCHAR2(250)
, geo VARCHAR2(20)
, class VARCHAR2(3)
, signup_date DATE
, CONSTRAINT cust_pk PRIMARY KEY(geo, cust_id)
)
PARTITIONSET BY LIST (geo)
PARTITION BY CONSISTENT HASH (cust_id)
PARTITIONS AUTO
(PARTITIONSET AMERICA VALUES ('AMERICA') TABLESPACE SET tbs1,
PARTITIONSET ASIA VALUES ('ASIA') TABLESPACE SET tbs2);

Oracle Sharding also supports the subpartitioning methods provided by Oracle Database and enables information lifecycle management (ILM) by placing subpartitions on separate tablespaces and moving them between storage tiers. Migration of subpartitions between storage tiers can be done without sacrificing the scalability and availability benefits of sharding or the ability to perform partition pruning and partition-wise joins on a primary key. Composite sharding with subpartitioning provides three levels of data organization, as sketched below.
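
As an illustrative sketch (hypothetical columns and date boundaries; note that the subpartition key joins the primary key), a system-managed sharded table with range subpartitions might look like this:

CREATE SHARDED TABLE customers
( cust_id     NUMBER NOT NULL
, name        VARCHAR2(50)
, signup_date DATE
, CONSTRAINT cust_pk PRIMARY KEY (cust_id, signup_date)
)
TABLESPACE SET ts1
PARTITION BY CONSISTENT HASH (cust_id)
SUBPARTITION BY RANGE (signup_date)
SUBPARTITION TEMPLATE
( SUBPARTITION per1 VALUES LESS THAN (TO_DATE('01/01/2010','DD/MM/YYYY'))
, SUBPARTITION per2 VALUES LESS THAN (MAXVALUE)
)
PARTITIONS AUTO;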

Centralized schema management

The SDB schema, including the structure of sharded and duplicated tables and the data distribution, is maintained in the Shard Catalog. The Shard Directors in conjunction with the Shard Catalog propagate the schema to all the shards.


Figure 1 . Schema management in a Sharded Database

The figure above illustrates how the schema is propagated by the shard director and the shard catalog to all the shards.

In this first part of the three-part blog series, we looked at the automated data distribution and centralized schema management. In the second part, we will go over the automated creation and replication of shards and data-dependent routing.


Examples of target customers and applications for Oracle Sharding

In the last blog post, we looked at why a given customer would be interested in Oracle Sharding. In this post, let’s see some examples of various customers and their applications that can benefit from Oracle Sharding. These target customers for Oracle Sharding can come from any industry vertical. Examples include:

  • Mass Media and Financial Information Services providers who need massive scalability with high availability for online storage and retrieval of information.
  • Airline ticketing systems whose main driver for sharding is fault isolation. They want to shard across tens of independent databases. Failure of a database only makes 1/N of the data momentarily unavailable.
  • Social Media companies who may wish to allocate different shards for different classes of users/customer profiles, at different price levels.
  • Online Payment Systems that shard for linear scalability and fault isolation, and who may need to satisfy regulatory requirements for storing user data in the country of citizenship.
  • Financial and Tax preparation companies who shard by customer id to scale users, workload and transactions. Sharding provides these companies with elasticity required when demand for service peaks during tax filing season.
  • Large billing systems where each customer can be identified by a customer ID, phone number, or user ID.

In summary, Oracle Sharding 12.2.0.1 is a good fit for applications (OLTP and data analytics) whose primary access pattern is based on a sharding key. The secondary pattern – multi-shard querying – is also supported for reporting and data aggregation use cases. In the next blog post, I will cover the salient capabilities of Oracle Sharding.

For more info, do visit the Oracle Sharding portal on Oracle Technology Network (OTN). Follow me on Twitter @nageshbattula

Why would a customer be attracted to Oracle Sharding?

Oracle Sharding is a response to customer demand for a relational database architecture that can deliver the combination of linear scalability and complete fault isolation without requiring shared storage or compromising on the enterprise qualities of the Oracle Database: strict consistency, the full power of SQL, developer agility with JSON, security, high availability, backup and recovery, life-cycle management, etc.

OLTP applications designed for Oracle Sharding can elastically scale (data, transactions, and users) to any level, on any platform, simply by deploying new shards on additional stand-alone servers. The unavailability or slowdown of a shard due to either an unplanned outage or planned maintenance affects only the users of that shard; it does not affect the availability or performance of the application for users of other shards. Each shard may run a different release of the Oracle Database as long as the application is backward compatible with the oldest running version – making it simple to maintain availability of an application while performing database maintenance.

Oracle Sharding also does more than just extend the enterprise qualities of Oracle to a sharded database architecture. It uses automation to simplify life-cycle management, advanced partitioning methods to address a wide array of use cases, and data-dependent routing for superior runtime performance. Collectively, these capabilities give customers substantial advantages over competitive sharding solutions or custom deployments.

For more info, do visit the Oracle Sharding portal on Oracle Technology Network (OTN).