20、分区表
MySQL数据库支持的分区类型为水平分区,并不支持垂直分区,此外MySQL数据库的分区是局部分区索引,一个分区中即存放了数据又存放了索引。
分区主要用于高可用性,利于数据库的管理。在OLTP应用中,对于分区的使用应该小心。
水平分区:指同一表中不同行的记录分配到不同的物理文件中。
查看当前数据库是否开启了分区功能:
mysql> showvariables like '%partition%';
+-------------------+-------+
|Variable_name | Value |
+-------------------+-------+
| have_partitioning| YES |
+-------------------+-------+
mysql> showplugins;
+-----------------------+----------+--------------------+---------+-------------+
| Name | Status | Type | Library | License |
+-----------------------+----------+--------------------+---------+-------------+
| binlog | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |
|mysql_native_password | ACTIVE |AUTHENTICATION | NULL | PROPRIETARY |
|mysql_old_password | ACTIVE | AUTHENTICATION | NULL | PROPRIETARY |
| MRG_MYISAM | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |
| CSV | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |
| MyISAM | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |
| MEMORY | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |
| InnoDB | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |
| INNODB_TRX | ACTIVE | INFORMATION SCHEMA | NULL | PROPRIETARY |
|INNODB_LOCKS | ACTIVE | INFORMATION SCHEMA | NULL | PROPRIETARY |
|INNODB_LOCK_WAITS | ACTIVE | INFORMATION SCHEMA | NULL | PROPRIETARY |
| INNODB_CMP | ACTIVE | INFORMATION SCHEMA | NULL | PROPRIETARY |
|INNODB_CMP_RESET | ACTIVE | INFORMATION SCHEMA | NULL | PROPRIETARY |
|INNODB_CMPMEM | ACTIVE | INFORMATION SCHEMA | NULL | PROPRIETARY |
|INNODB_CMPMEM_RESET | ACTIVE | INFORMATION SCHEMA | NULL | PROPRIETARY |
|PERFORMANCE_SCHEMA | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |
| BLACKHOLE | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |
| FEDERATED | DISABLED | STORAGE ENGINE | NULL | PROPRIETARY |
| ARCHIVE | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |
| partition | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |
+-----------------------+----------+--------------------+---------+-------------+
不论创建何种类型的分区,如果表中存放主键或者是唯一索引时,分区列表必须是唯一索引的一个组成部分。
唯一索引可以是允许NULL值的,并且分区列只能是唯一索引的一个组成部分,不需要整个唯一索引列都是分区列。普通索引可以不作为分区列。
(1)RANGE分区
mysql> createtable t (
->id int) engine=innodb
->partition by range (id)(
->partition p0 values less than (10),
->partition p1 values less than (20));
mysql> systemls -lh /home/mysql/data/test
总计 208K
-rw-rw---- 1mysql mysql 8.4K 2012-09-04 10:11 t.frm
-rw-rw---- 1mysql mysql 28 2012-09-04 10:11 t.par
-rw-rw---- 1mysql mysql 96K 2012-09-04 10:11t#P#p0.ibd
-rw-rw---- 1mysql mysql 96K 2012-09-04 10:11t#P#p1.ibd
查看partitions表来查看每个分区的具体信息:
mysql> select* from information_schema.partitions where table_schema=database() andtable_name='t'\G
***************************1. row ***************************
TABLE_CATALOG: def
TABLE_SCHEMA: test
TABLE_NAME: t
PARTITION_NAME: p0
SUBPARTITION_NAME: NULL
PARTITION_ORDINAL_POSITION: 1
SUBPARTITION_ORDINAL_POSITION:NULL
PARTITION_METHOD: RANGE
SUBPARTITION_METHOD: NULL
PARTITION_EXPRESSION: id
SUBPARTITION_EXPRESSION: NULL
PARTITION_DESCRIPTION: 10
TABLE_ROWS: 1
AVG_ROW_LENGTH: 16384
DATA_LENGTH: 16384
MAX_DATA_LENGTH: NULL
INDEX_LENGTH: 0
DATA_FREE: 0
CREATE_TIME: NULL
UPDATE_TIME: NULL
CHECK_TIME: NULL
CHECKSUM: NULL
PARTITION_COMMENT:
NODEGROUP: default
TABLESPACE_NAME: NULL
***************************2. row ***************************
TABLE_CATALOG: def
TABLE_SCHEMA: test
TABLE_NAME: t
PARTITION_NAME: p1
SUBPARTITION_NAME: NULL
PARTITION_ORDINAL_POSITION: 2
SUBPARTITION_ORDINAL_POSITION:NULL
PARTITION_METHOD: RANGE
SUBPARTITION_METHOD: NULL
PARTITION_EXPRESSION: id
SUBPARTITION_EXPRESSION: NULL
PARTITION_DESCRIPTION: 20
TABLE_ROWS: 2
AVG_ROW_LENGTH: 8192
DATA_LENGTH: 16384
MAX_DATA_LENGTH: NULL
INDEX_LENGTH: 0
DATA_FREE: 0
CREATE_TIME: NULL
UPDATE_TIME: NULL
CHECK_TIME: NULL
CHECKSUM: NULL
PARTITION_COMMENT:
NODEGROUP: default
TABLESPACE_NAME: NULL
不在区间定义的值,报错
mysql> insertinto t select 30;
ERROR 1526(HY000): Table has no partition for value 30
可以添加一个maxvalue值的区间,可以理解为正无穷。
mysql> altertable t add partition (
->partition p2 values less than maxvalue);
mysql> insertinto t select 30;
Query OK, 1 rowaffected (0.00 sec)
其实range分区主要用于日期列的分区,举例:
mysql> createtable sales (
->money int unsigned not null,
->date datetime) engine=innodb
->partition by range (YEAR(date)) (
->partition p2008 values less than(2009),
->partition p2009 values less than(2010),
->partition p2010 values less than(2011)
-> );
mysql> insertinto sales select 100,'2008-01-01';
Query OK, 1 rowaffected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> insertinto sales select 100,'2009-02-01';
Query OK, 1 rowaffected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> insertinto sales select 200,'2008-01-02';
Query OK, 1 rowaffected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> insertinto sales select 100,'2009-03-01';
Query OK, 1 rowaffected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> insertinto sales select 200,'2010-03-01';
Query OK, 1 rowaffected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
这样创建的好处是:便于对sales这张表的管理。如果我们要删除2008年的数据,就不需要执行执行delete from sales wheredate >= ‘2008-01-01’ and date <’2009-01-01’,而是只需要删除2008年的数据所在的分区即可:
mysql> altertable sales drop partition p2008;
Query OK, 0 rowsaffected (0.24 sec)
Records: 0 Duplicates: 0 Warnings: 0
这样创建的另一个好处是,可以加快某些查询的操作。
mysql> explainpartitions select * from sales where data >= '2008-01-01' and
->data<= '2008-12-31'\G
***************************1. row ***************************
id: 1
select_type: SIMPLE
table: sales
partitions: p2009
type: ALL
possible_keys:NULL
key: NULL
key_len: NULL
ref: NULL
rows: 2
Extra: Using where
可以看到SQL优化器只需要去搜索p2008这个分区,而不会去搜索其他所有的分区,因此大大地提高了查询的速度。
考虑下面的一种情况,设计者的原意是想可以按照每年每月来进行分区,
mysql> createtable sales (
->money int unsigned not null,
->date datetime
-> )
->engine=innodb
->partition by range(YEAR(date)*100+MONTH(date)) (
->partition p201001 values less than(201002),
->partition p201002 values less than(201003),
->partition p201003 values less than(201004)
-> );
但是在执行SQL语句时,开发人员发现,优化器不会根据分区进行选择:
mysql> explainpartitions select * from sales where date >= '2010-01-01' and date <='2010-01-31'\G
***************************1. row ***************************
id: 1
select_type: SIMPLE
table: sales
partitions:p201001,p201002,p201003
type: ALL
possible_keys:NULL
key: NULL
key_len: NULL
ref: NULL
rows: 3
Extra: Using where
可以看到优化对三个分区都进行了搜索。产生这个问题的主要原因就是,对于range分区的查询,优化器只能对YEAR()、TO_DAYS()、TO_SECONDS()、UNIX_TIMESTAMP()这类函数进行优化选择,因此可以将上面的函数改为TO_DAYS,如下:
(2)LIST分区
和range分区非常类似,只是分区的值是离散的,而非连续的。
mysql> createtable t (
->a int,
->b int) engine=innodb
->partition by list(b) (
->partition p0 values in (1,3,5,7,9),
->partition p1 values in (0,2,4,6,8)
-> );
不同于range分区中定义的valuesless than语句,LIST分区使用values in,所以每个分区的值是离散的,只能是定义的值。
mysql> selecttable_name,partition_name,table_rows from information_schema.partitions where
-> table_name = 't' and table_schema =database()\G
***************************1. row ***************************
table_name: t
partition_name:p0
table_rows: 2
***************************2. row ***************************
table_name: t
partition_name:p1
table_rows: 2
如果插入的值不在分区范围内,同样会报错:
mysql> insertinto t select 1,10;
ERROR 1526(HY000): Table has no partition for value 10
对于innoDB存储引擎的分区表,下面的插入操作(作为一个事务)不会插入任何数据:
mysql> insertinto t values (1,2),(2,4),(6,10),(5,3);
ERROR 1526(HY000): Table has no partition for value 10
但是对于MyISAM存储引擎的表,会将之前的行数据都插入,但之后的数据不会被插入。
mysql> createtable t (
->a int,
->b int) engine=myisam
->partition by list(b) (
->partition p0 values in (1,3,5,7,9),
->partition p1 values in (0,2,4,6,8));
Query OK, 0 rowsaffected (0.18 sec)
mysql> insertinto t values (1,2),(2,4),(6,10),(5,3);
ERROR 1526(HY000): Table has no partition for value 10
mysql> select* from t;
+------+------+
| a | b |
+------+------+
| 1 | 2 |
| 2 | 4 |
+------+------+
因此在使用分区时,也需要对不同存储引擎支持的事务特性进行考虑。
(3)HASH分区
HASH分区的目的是将数据均匀地分布到预先定义的各个分区中,保证各分区的数据数量都是一样的。
要使用HASH分区来分隔一个表,要在createtable语句上添加一个partitionby hash (expr)子句,其中expr是一个返回一个整数的表达式。
mysql> createtable t_hash (
->a int,
->b datetime)
->engine=innodb
->partition by hash (YEAR(b))
->partitions 4;
mysql> selectmod(year('2010-04-01'),4);
+---------------------------+
|mod(year('2010-04-01'),4) |
+---------------------------+
| 2 |
+---------------------------+
如果将2010-04-01这个记录插入表t_hash中,那么保存该条记录的分区确定如上,放入分区2;
如下验证:
mysql> insertinto t_hash select 1,'2010-04-01';
Query OK, 1 rowaffected (0.06 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> selecttable_name,partition_name,table_rows from information_schema.partitions where
-> table_schema = database() andtable_name = 't_hash'\G
***************************1. row ***************************
table_name: t_hash
partition_name:p0
table_rows: 0
***************************2. row ***************************
table_name: t_hash
partition_name:p1
table_rows: 0
***************************3. row ***************************
table_name: t_hash
partition_name: p2
table_rows: 1
***************************4. row ***************************
table_name: t_hash
partition_name:p3
table_rows: 0
4 rows in set(0.00 sec)
MySQL数据库还支持一种称为LINEAR HASH的分区,它使用一种更加复杂的算法来确定新行插入已经分区的表中的位置。
mysql> createtable t_linear_hash (
->a int,
->b datetime)
->engine=innodb
->partition by linear hash (year(b))
->partitions 4;
进行分区的判断算法如下:
1)取大于分区数量4的下一个2的幂值V,V=power(4,ceiling(log(2,num))) = 4;
2)所在的分区N=YEAR(‘2010-04-01’) &(V-1) = 2
虽然还是分区2,但是计算的算法和之前不一样。
LINEAR HASH分区的优点在于:增加,删除,合并和拆分分区将变得更加快捷,这有利于处理含有大量数据的表。缺点是:与使用HASH分区得到的数据分别相比,各个分区间数据的分布可能不大均衡。
mysql> insertinto t_linear_hash select 1,'2010-04-01';
mysql> selecttable_name,partition_name,table_rows from information_schema.partitions wheretable_schema = 'test' and table_name = 't_linear_hash';
+---------------+----------------+------------+
| table_name | partition_name | table_rows |
+---------------+----------------+------------+
| t_linear_hash |p0 | 0 |
| t_linear_hash |p1 | 0 |
| t_linear_hash | p2 | 1 |
| t_linear_hash |p3 | 0 |
+---------------+----------------+------------+
(4)KEY分区
KEY分区使用MySQL数据库提供的函数进行分区,而HASH使用用户定义的函数进行分区。
mysql> createtable t_key (
->a int,
->b datetime ) engine=innodb
->partition by key (b)
->partitions 4;
在KEY分区中使用关键字LINEAR,和在HASH分区中具有同样的作用,分区的编号是通过2的幂算法得到的。
21、COLUMNS分区
我单独作为一节介绍,是因为这个很重要,MySQL数据库版本5.5.0开始支持COLUMNS分区,对于之前的RANGE和LIST分区,我们应该使用RANGE COLUMNS和LIST COLUMNS分区进行很好的代替。
背景:
在前面介绍的四种分区中,分区的条件都必须是整数,如果不是整数则需要转化为整数。
发展:
COLUMNS分区可以直接使用非整数的数据进行分区,分区根据类型直接比较而得,不需要转化为整数。其次,RANGE COLUMNS分区可以对多个列的值进行分区。
COLUMNS分区支持以下的数据类型:
1)所有的整数类型,如int,smallint,tinyint,bigint。float和decimal则不予支持。
2)字符串类型,如char,varchar,binary和varbinary。blob和text类型不予支持。
3)日期类型,如date和datetime。其余的日期类型不予支持。
特别,对于日期类型的分区,我们不再需要year()和to_days()函数了,而直接可以使用COLUMNS,如:
mysql> createtable t_columns_range(
->a int,
->b datetime) engine=innodb
->partition by range columns (b) (
->partition p0 values less than('2009-01-01'),
->partition p1 values less than('2010-01-01')
-> );
推荐:Mysql学习笔记一
基础部分: 1. 创建表: create table user( id smallint primary key not null auto_increment, name varchar(20), age smallint ); 2. 常用命令: create dat
同样可以使用字符串的分区:
mysql> createtable customers (
->first_name varchar(25),
->last_name varchar(25),
->street_1 varchar(30),
->street_2 varchar(30),
->city varchar(15),
->renewal date
-> )
->partition by list columns(city) (
->partition region1 values in('oskarshamn','hosgby','monsrtea'),
->partition region2 values in('vimmerby','hults','vsafsri'),
->partition region3 values in('nossjo','ekfs','vetlanda'),
->partition region4 values in('uppviding','afsf','vafaf')
-> );
mysql> insertinto customers select 'shouzhuang','jiang','china','jiangsu','nossjo','2012-09-04';
mysql> selecttable_name,partition_name,table_rows from information_schema.partitions wheretable_schema = 'test' and table_name = 'customers';
+------------+----------------+------------+
| table_name |partition_name | table_rows |
+------------+----------------+------------+
| customers | region1 | 0 |
| customers | region2 | 0 |
| customers | region3 | 1 |
| customers | region4 | 0 |
+------------+----------------+------------+
对于RANGE COLUMNS分区,可以使用多个列进行分区,如:
mysql> createtable rcx (
->a int,
->b int,
->c char(3),
->d int
-> )
->partition by range columns (a,b,c) (
->partition p0 values less than(5,10,'ggg'),
->partition p1 values less than(10,20,'mmmm'),
->partition p2 values less than(15,30,'sss'),
->partition p3 values less than(maxvalue,maxvalue,maxvalue)
-> );
mysql> insertinto rcx select 1,2,'jia',12;
mysql> insertinto rcx select 12,24,'zzz',18;
mysql> select table_name,partition_name,table_rowsfrom information_schema.partitions where table_schema = 'test' and table_name ='rcx';
+------------+----------------+------------+
| table_name |partition_name | table_rows |
+------------+----------------+------------+
| rcx | p0 | 1 |
| rcx | p1 | 0 |
| rcx | p2 | 1 |
| rcx | p3 | 0 |
+------------+----------------+------------+
22、子分区
MySQL允许在range和list的分区上再进行hash或者是key的子分区,如:
mysql> createtable ts ( a int, b date )
->engine=innodb
->partition by range (year(b))
->subpartitionby hash(to_days(b))
->subpartitions2 (
-> partition p0 values less than (1990),
->partition p1 values less than (2000),
->partition p2 values less than maxvalue);
Query OK, 0 rowsaffected (0.36 sec)
查看物理文件:
mysql> systemls -lh /home/mysql/data/test
总计 592K
-rw-rw---- 1mysql mysql 8.4K 2012-09-04 15:07 ts.frm
-rw-rw---- 1mysql mysql 96 2012-09-04 15:07 ts.par
-rw-rw---- 1 mysql mysql 96K2012-09-04 15:07 ts#P#p0#SP#p0sp0.ibd
-rw-rw---- 1 mysql mysql 96K2012-09-04 15:07 ts#P#p0#SP#p0sp1.ibd
-rw-rw---- 1 mysqlmysql 96K 2012-09-04 15:07ts#P#p1#SP#p1sp0.ibd
-rw-rw---- 1 mysqlmysql 96K 2012-09-04 15:07ts#P#p1#SP#p1sp1.ibd
-rw-rw---- 1 mysql mysql 96K 2012-09-04 15:07 ts#P#p2#SP#p2sp0.ibd
-rw-rw---- 1 mysql mysql 96K 2012-09-04 15:07 ts#P#p2#SP#p2sp1.ibd
也可以用SUBPARTITION语法来显示指出各个子分区的名称,
mysql> create table ts (a int,b date) engine=innodb
->partition by range(year(b))
->subpartition byhash(to_days(b)) (
->partition p0 values less than (1990) (
->subpartition s0,
->subpartition s1
-> ),
->partition p1 values less than (2000) (
->subpartition s2,
->subpartitions3
-> ),
->partition p2 values less than maxvalue (
->subpartition s4,
->subpartition s5
-> )
-> );
需要注意的几点:
1)每个子分区的数量必须相同。
2)如果在一个分区表上的任何分区上使用subpartition来明确定义任何子分区,那么就必须定义所有的子分区。
3)每个subpartition子句必须包括子分区的一个名称
4)在每个子分区内,子分区的名称必须是唯一的。
子分区可以用于特别大的表,在多个磁盘间分别分配数据和索引。
假如有6个磁盘,分别为disk0,disk1,disk2等,如下:
create table ts (a int, b date) engine=myisam
partition by range (year(b))
subpartition by hash (to_days(b)) (
partition p0 values less than (2000) (
subpartition s0
data directory = '/disk0/data'
index directory = '/disk0/idx',
subpartition s1
data directory = '/disk1/data'
index directory = '/disk1/idx'
),
partition p1 values less than (2010) (
subpartition s2
data directory = '/disk2/data'
index directory = '/disk2/idx',
subpartition s3
data directory = '/disk3/data'
index directory = '/disk3/idx'
),
partition p1 values less than maxvalue (
subpartition s4
data directory = '/disk4/data'
index directory = '/disk4/idx',
subpartition s5
data directory = '/disk5/data'
index directory = '/disk5/idx'
)
);
mysql> show warnings;
+---------+------+----------------------------------+
| Level |Code | Message |
+---------+------+----------------------------------+
| Warning | 1618 | <DATA DIRECTORY> optionignored |
| Warning | 1618 | <INDEX DIRECTORY> optionignored |
| Warning | 1618 | <DATA DIRECTORY> optionignored |
| Warning | 1618 | <INDEX DIRECTORY> optionignored |
| Warning | 1618 | <DATA DIRECTORY> optionignored |
| Warning | 1618 | <INDEX DIRECTORY> optionignored |
| Warning | 1618 | <DATA DIRECTORY> optionignored |
| Warning | 1618 | <INDEX DIRECTORY> optionignored |
| Warning | 1618 | <DATA DIRECTORY> optionignored |
| Warning | 1618 | <INDEX DIRECTORY> optionignored |
| Warning | 1618 | <DATA DIRECTORY> optionignored |
| Warning | 1618 | <INDEX DIRECTORY> optionignored |
+---------+------+----------------------------------+
但是InnoDB存储引擎会忽略DATA DIRECTORY和INDEX DIRECTORY语法,因此上述分区表的数据和索引文件分开设置对其是无效的。
23、分区中的NULL值
MySQL数据库中允许对NULL值做分区,MySQL数据库总是把NULL值视为小于任何一个非NULL值。
对于range分区,如果对于分区列插入了NULL值,则MySQL数据库会将该值放入最左边的分区。
mysql> create table t_range (
->aint,
->bint) engine=innodb
->partition by range(b) (
->partition p0 values less than (10),
->partition p1 values less than (20),
->partition p2 values less than maxvalue);
Query OK, 0 rows affected (0.19 sec)
mysql> insert into t_range select 1,1;
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> insert into t_range select 1,NULL;
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> select * from t_range;
+------+------+
| a |b |
+------+------+
| 1 | 1 |
| 1 | NULL|
+------+------+
2 rows in set (0.00 sec)
mysql> select table_name,partition_name,table_rowsfrom information_schema.partitions where table_name = 't_range' andtable_schema = database()\G
*************************** 1. row***************************
table_name: t_range
partition_name: p0
table_rows: 2
*************************** 2. row***************************
table_name: t_range
partition_name: p1
table_rows: 0
*************************** 3. row***************************
table_name: t_range
partition_name: p2
table_rows: 0
3 rows in set (0.01 sec)
可以看到数据都放入了p0分区。NULL值会放入最左边的分区中。
另外需要特别注意的是:如果删除了p0这个分区,你删除的是小于10的记录,并且还有NULL值的记录,这点非常重要。
mysql> alter table t_range drop partition p0;
Query OK, 0 rows affected (0.11 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> select * from t_range;
Empty set (0.00 sec)
LIST分区下要使用NULL值,则必须显示地指出哪个分区中放入NULL值,否则会报错,如:
mysql> create table t_list (
->aint,
->bint) engine=innodb
->partition by list(b) (
->partition p0 values in (1,3,5,7,9),
->partitionp1 values in (0,2,4,6,8)
-> );
Query OK, 0 rows affected (0.13 sec)
mysql> insert into t_list select 1,NULL;
ERROR 1526 (HY000): Table has no partition for value NULL
若p0分区允许NULL值,则插入不会报错:
mysql> create table t_list (
->aint,
->b int)engine=innodb
->partition by list(b) (
->partition p0 values in (1,3,5,7,9,NULL),
->partition p1 values in (0,2,4,6,8)
-> );
mysql> insert into t_list select 1,NULL;
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> selecttable_name,partition_name,table_rows from information_schema.partitions wheretable_name = 't_list' and table_schema = database()\G
*************************** 1. row***************************
table_name: t_list
partition_name: p0
table_rows: 1
*************************** 2. row***************************
table_name: t_list
partition_name: p1
table_rows: 0
2 rows in set (0.00 sec)
HASH和KEY分区对于NULL值的处理方式,和RANGE分区,LIST分区不一样。任何分区函数都会将含有NULL值的记录返回为0。
mysql> create table t_hash (
->aint,
->bint) engine=innodb
->partition by hash(b)
->partitions 4;
mysql> insert into t_hash select 1,0;
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> insert into t_hash select 1,NULL;
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> selecttable_name,partition_name,table_rows from information_schema.partitions wheretable_name = 't_hash' and table_schema = database()\G
*************************** 1. row ***************************
table_name: t_hash
partition_name: p0
table_rows: 2
*************************** 2. row***************************
table_name: t_hash
partition_name: p1
table_rows: 0
*************************** 3. row***************************
table_name: t_hash
partition_name: p2
table_rows: 0
*************************** 4. row***************************
table_name: t_hash
partition_name: p3
table_rows: 0
4 rows in set (0.00 sec)
24、分区和性能
数据库的应用分为两类:
一类是OLTP,在线事务处理,如博客,电子商务,网络游戏等
另一类是OLAP,在线分析处理,如数据仓库,数据集市。
对于OLAP的应用,分区的确可以很好地提高查询的性能,因为OLAP应用的大多数查询需要频繁地扫描一张很大的表。
对于OLTP的应用,分区应该非常小心。在这种应用下,不可能会获取一张大表中10%的数据,大部分都是通过索引返回几条记录即可,而根据B+树索引的原理可知,对于一张大表,一般的B+树需要2~3次的磁盘IO,很少会有4层的B+树索引,因此B+树可以很好地完成操作,不需要分区的帮助,并且设计不好的分区会带来严重的性能问题。
实验:
mysql> create table profile (
->idint(11) not null auto_increment,
->nickname varchar(20) not null default '',
->password varchar(32) not null default '',
->sexchar(1) not null default '',
->rdate date not null default '0000-00-00',
->primary key (id),
->keynickname (nickname)
-> )engine=innodb
->partition by hash(id)
->partitions 10;
mysql> \d //
mysql> create procedure jiang(count intunsigned)
->begin
->declare i int unsigned default 1;
->while i < count do
->insert into profile(nickname,password,sex,rdate)values('jiang','jiang0228','M','2012-09-04');
-> SETi = i + 1;
->endwhile;
->end;
-> //
mysql>\d ;
创建1000万条数据:
mysql> call jiang(10000000);
如果进行主键的查询,可以发现分区的确是有意义的:
mysql> explain partitions select * from profilewhere id = 1\G
*************************** 1. row***************************
id: 1
select_type: SIMPLE
table: profile
partitions: p1
type: const
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: const
rows: 1
Extra:
1 row in set (0.01 sec)
可以发现只寻找了p1分区,但是对于表profile中nickname列索引的查询,explain partitions则会得到如下的结果:
mysql> explain partitions select * from profilewhere nickname = 'david'\G
*************************** 1. row***************************
id: 1
select_type: SIMPLE
table: profile
partitions: p0,p1,p2,p3,p4,p5,p6,p7,p8,p9
type: ref
possible_keys: nickname
key: nickname
key_len: 62
ref: const
rows: 10
Extra: Using where
mysql> update profile set nickname = 'david' where id = 5566;
mysql> select * from profile where nickname = 'david';
上述查询实际的IO执行了20-30次。
可以看到,MySQL数据库会搜索所有的分区,因此查询速度会慢很多。
最后提示:分区并不总是适合于OLTP应用,应该根据自己的应用好好规划自己的分区设计。