Mysql 中 exists 和 in 的区别[java教程]
最近刷到面试题:Mysql 中 exists 和 in 的区别,先说下答案。
下面将主查询的表称为外表;子查询的表称为内表。exists 与 in 的区别如下:
- 子查询使用 exists,会先进行主查询,将查询到的每行数据循环带入子查询校验是否存在,过滤出整体的返回数据;子查询使用 in,会先进行子查询获取结果集,然后主查询匹配子查询的结果集,返回数据
- 外表内表相对大小情况不一样时,查询效率不一样:两表大小相当,in 和 exists 差别不大;内表大,用 exists 效率较高;内表小,用 in 效率较高。
- 不管外表与内表的大小,not exists 的效率一般要高于 not in,跟子查询的索引访问类型有关。
建表、造数据,验证一下以上答案。
# 建表 student1
drop table if exists student1;
create table student1(
sid int primary key auto_increment,
sname varchar(40)
);
# 建存储过程给表 student1,插入1000条数据
drop procedure if exists addStudent1;
create procedure addStudent1()
BEGIN
declare idx int;
set idx = 1;
while idx <= 1000 DO
insert into student1 values(null, concat('student-', idx));
set idx = idx + 1;
end while;
end;
call addStudent1();
select * from student1;
# 建表 student2
drop table if exists student2;
create table student2(
sid int primary key auto_increment,
sname varchar(40)
);
# 建存储过程给表 student2,插入100000条数据
drop procedure if exists addStudent2;
create procedure addStudent2()
BEGIN
declare idx int;
set idx = 1;
while idx <= 100000 DO
insert into student2 values(null, concat('student-', idx));
set idx = idx + 1;
end while;
end;
call addStudent2();
select * from student2;
in 与 exists 的查询 SQL
select count(1) from student1 where sname in (select sname from student2);
select count(1) from student1 where exists (select sname from student2 where student2.sname = student1.sname);
select count(1) from student2 where sname in (select sname from student1);
select count(1) from student2 where exists (select sname from student1 where student2.sname = student1.sname);
执行时间:
[SQL] select count(1) from student1 where sname in (select sname from student2);
受影响的行: 0
时间: 0.092s
[SQL]
select count(1) from student1 where exists (select sname from student2 where student2.sname = student1.sname);
受影响的行: 0
时间: 0.076s
[SQL]
select count(1) from student2 where sname in (select sname from student1);
受影响的行: 0
时间: 14.820s
[SQL]
select count(1) from student2 where exists (select sname from student1 where student2.sname = student1.sname);
受影响的行: 0
时间: 15.144s
结论:student2 大表在内适用 exists,所以第 2 条 SQL 比第 1 条快;student1 小表在内适用 in,所以第 3 条 SQL 比第 4 条快。
not in 与 not exists 的查询 SQL
select count(1) from student1 where sname not in (select sname from student2);
select count(1) from student1 where not exists (select sname from student2 where student2.sname = student1.sname);
select count(1) from student2 where sname not in (select sname from student1);
select count(1) from student2 where not exists (select sname from student1 where student2.sname = student1.sname);
执行时间:
[SQL]
select count(1) from student1 where sname not in (select sname from student2);
受影响的行: 0
时间: 0.079s
[SQL]
select count(1) from student1 where not exists (select sname from student2 where student2.sname = student1.sname);
受影响的行: 0
时间: 0.075s
[SQL] select count(1) from student2 where sname not in (select sname from student1);
受影响的行: 0
时间: 15.797s
[SQL]
select count(1) from student2 where not exists (select sname from student1 where student2.sname = student1.sname);
受影响的行: 0
时间: 15.160s
结论:not exists 性能高于 not in
给 student1、student2 sname 字段,加上索引,上述结论仍然成立。
create index idx_1 on student1(sname);
create index idx_2 on student2(sname);
执行时间:
[SQL] select count(1) from student1 where sname in (select sname from student2);
受影响的行: 0
时间: 0.022s
[SQL]
select count(1) from student1 where exists (select sname from student2 where student2.sname = student1.sname);
受影响的行: 0
时间: 0.014s
[SQL]
select count(1) from student2 where sname in (select sname from student1);
受影响的行: 0
时间: 0.379s
[SQL]
select count(1) from student2 where exists (select sname from student1 where student2.sname = student1.sname);
受影响的行: 0
时间: 0.373s
[SQL]
select count(1) from student1 where sname not in (select sname from student2);
受影响的行: 0
时间: 0.006s
[SQL]
select count(1) from student1 where not exists (select sname from student2 where student2.sname = student1.sname);
受影响的行: 0
时间: 0.006s
[SQL]
select count(1) from student2 where sname not in (select sname from student1);
受影响的行: 0
时间: 0.455s
[SQL]
select count(1) from student2 where not exists (select sname from student1 where student2.sname = student1.sname);
受影响的行: 0
时间: 0.418s
再细看一下,not in 与 not exists 查询索引使用情况
not in,子查询使用了 index_subquery 访问类型
EXPLAIN EXTENDED select count(1) from student2 where sname not in (select sname from student1);
SHOW WARNINGS;
not exists,子查询使用了 ref 访问类型
EXPLAIN EXTENDED select count(1) from student2 where not exists (select sname from student1 where student2.sname = student1.sname);
SHOW WARNINGS;