What is Sharding?
Sharding is a type of
database partitioning that splits very large databases into
smaller, faster, more easily managed parts called data shards. The word shard
means a small part of a whole.
Here's how Jason Tee
explains sharding on The Server Side: "In the simplest sense,
sharding your database involves breaking up your big database into many, much
smaller databases that share nothing and can be spread across multiple
servers."
Technically, sharding is
a synonym for horizontal partitioning. In practice, the term is often used to
refer to any database partitioning that is meant to make a very large database
more manageable.
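As a rough illustration of horizontal partitioning, the sketch below splits the rows of one logical table across several shards. Every shard keeps the same columns and holds only a subset of the rows. The names NUM_SHARDS, shard_for and partition_rows are hypothetical, and hashing the key with CRC32 is just one possible routing rule, assumed here for the example.

```python
import zlib

NUM_SHARDS = 4  # hypothetical shard count, chosen only for this example


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a row key to a shard index using a stable hash (CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def partition_rows(rows, key_field, num_shards=NUM_SHARDS):
    """Split rows (dicts) into num_shards lists, routed by the key field.

    Each shard has the same columns as the original table; only the
    set of rows differs. That is what makes the split "horizontal".
    """
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        index = shard_for(str(row[key_field]), num_shards)
        shards[index].append(row)
    return shards
```

Because the hash is stable, a lookup for a given key can be sent straight to the one shard that holds it, rather than to every server.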
The governing idea
behind sharding is that as the size of a database and the
number of transactions made on it per unit of time increase linearly,
the response time for querying the database can increase exponentially.
Additionally, the costs
of creating and maintaining a very large database in one place can increase
exponentially because the database will require high-end computers. In
contrast, data shards can be distributed across a number of much less expensive
commodity servers. Data shards also impose comparatively few hardware and
software requirements.
In some cases, database
sharding can be done fairly simply. One common example is splitting a customer
database geographically. Customers located on the East Coast can be placed on
one server, while customers on the West Coast can be placed on a second server.
Assuming there are no customers with multiple locations, the split is easy to
maintain and build rules around.
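The geographic split described above might be sketched as a simple routing rule. The state lists are partial, the server names are made up, and the customer record layout (a dict with a "state" field) is an assumption for illustration only.

```python
# Hypothetical, partial state lists used only for illustration.
EAST_COAST_STATES = {"NY", "NJ", "MA", "VA", "FL"}
WEST_COAST_STATES = {"CA", "OR", "WA"}

# Hypothetical server names, one per shard.
SERVER_BY_REGION = {
    "east": "db-east.example.internal",
    "west": "db-west.example.internal",
}


def region_for_state(state: str) -> str:
    """Return "east" or "west" for a two-letter state code."""
    code = state.upper()
    if code in EAST_COAST_STATES:
        return "east"
    if code in WEST_COAST_STATES:
        return "west"
    # The rule only covers the two coasts; anything else needs a new rule.
    raise ValueError(f"no shard rule for state {state!r}")


def server_for_customer(customer: dict) -> str:
    """Pick the server that stores this customer's record."""
    return SERVER_BY_REGION[region_for_state(customer["state"])]
```

A customer record with state "NY" is routed to the east server and one with "CA" to the west server. A customer with locations on both coasts would not fit this rule, which is why the single-location assumption matters.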
Data sharding can be a
more complex process in some scenarios, however. Sharding a database that holds
less structured data, for example, can be very complicated, and the resulting
shards may be difficult to maintain.