读书人

Mysql 上 SQL 优化NOT IN (除了把NOT

发布时间: 2012-08-11 20:50:31 作者: rapoo

Mysql 下 SQL 优化NOT IN (除了把NOT IN转化为LEFT JOIN外,可优化影响的数据行数)

Mysql 下 SQL 优化NOT IN (除了把NOT IN转化为LEFT JOIN外,可优化影响的数据行数)

2010-05-31 17:38:30|??分类:?默认分类|字号?订阅

Mysql?下?SQL?优化 NOT IN:??除了把NOT IN转化为LEFT JOIN外,可通过优化业务逻辑,减少了查询涉及到的返回的数据,从而达到优化查询的目的。??====================================================================查询目的:? 最新更新的10条博文信息 :从博文表中取出200条,然后排除重复,按照发表时间顺序显示。?SELECT?*?FROM?(??SELECT?post_title,?guid?,?path,?post_author,?post_date,?tableid??FROM?wp_posts,?wp_blogs??WHERE?post_r_site?=1? /*?post_r_site?=1 表示网校下的博文?*/?AND?post_type?=?'post'? /*??type=post 为博文?*/?AND?post_status?=?'publish'? /*??博文不是私有博文 ?*/?AND?wp_blogs.blog_id=wp_posts.tableid??AND?wp_blogs.blog_id?NOT?IN?(SELECT?id?FROM?wp_blacklist?WHERE?type='0')?/*?? 博客Id 不再黑名单中?*/?AND?wp_posts.tableid?NOT?IN?(?SELECT?blog_id?FROM?wp_blogs?WHERE?registered?>?'2010-05-14'?)?? /*?用户注册日期大于05月14日??*/?AND?wp_posts.post_author?IN?(SELECT?id?FROM?wp_users)? /*??博主用户信息存在?*/?ORDER?BY?wp_posts.post_date?DESC? /*??按照发表博文的时间做降序排列?*/?LIMIT?200??)FF??GROUP?BY?tableid?/*?排除重复用户? 只显示用户的一条博文信息?*/?ORDER?BY?max(?post_date?)?DESC? /*??按照发表博文的时间做降序排列?*/?LIMIT?0,10?这条查询语句,会从满足条件的wp_posts中取出200条记录。但是满足post_r_site?=1等条件的数据达27万多条。27万远大于200条。?====================================================================测试时,清空下Mysql?的 查询缓存,SQL语句有可能被内存缓存。?mysql>?FLUSH?QUERY?CACHE;mysql>?RESET?QUERY?CACHE;mysql>?FLUSH?TABLES;?关闭mysql 并释放mysql占用的内存[root@mail?~]#?more?res.sh??killall?memcachedmemcached?-d?-m?500?-u?root?-l?192.168.190.11?-p?12000?-c?256?-P?/tmp/memcached.pid?mysqladmin?-u?root?--password=123qwe?shutdown./testmysqld_safe?--defaults-file=/home/linyang/database/my.cnf?&???分配一个很大的内存,使得mysql的内存占用被释放。[root@mail?~]#?more?test.cpp?#include?<stdio.h>#include?<string.h>int?main(){????????long?iSize?=?1024*1024*2000;????????char?*?pMem?=?new?char[iSize];????????if(pMem)????????{????????????????memset(pMem,0,iSize);????????????????printf("clear?mem?success\n");????????????????delete?pMem;????????}????????else????????????????printf("clear?mem?error\n");????????return?0;}?[root@mail?~]#?g++?-o?test test.cpp?????====================================================================explain?此sql语句,会发现wp_posts表涉及到的行?273098行。 远大于我们的需要200条。?mysql>?explain?SELECT?*?FROM?(?????->??SELECT?post_title,?guid?,?path,?post_author,?post_date,?tableid?????->??FROM?wp_posts,?wp_blogs?????->??WHERE?post_r_site?=1??/*?post_r_site?=1?表示查询会计网校下的博文?*/????->??AND?post_type?=?'post'??/*??type=post?为博文?*/????->??AND?post_status?=?'publish'??/*??博文不是私有博文??*/????->??AND?wp_blogs.blog_id=wp_posts.tableid?????->??AND?wp_blogs.blog_id?NOT?IN?(SELECT?id?FROM?wp_blacklist?WHERE?type='0')?/*???博客Id?不再黑名单中?*/月14日??*/D?wp_posts.tableid?NOT?IN?(?SELECT?blog_id?FROM?wp_blogs?WHERE?registered?>?'2010-05-14'?)???/*?用户注册日期大于05?????->??AND?wp_posts.post_author?IN?(SELECT?id?FROM?wp_users)??/*??博主用户信息存在?*/????->??ORDER?BY?wp_posts.post_date?DESC??/*??按照发表博文的时间做降序排列?*/????->??LIMIT?200?????->??)FF?????->??GROUP?BY?tableid?/*?排除重复用户??只显示用户的一条博文信息?*/????->??ORDER?BY?max(?post_date?)?DESC??/*??按照发表博文的时间做降序排列?*/????->??LIMIT?0,10;?------------+----------------------+-------|?id?|?select_type????????|?table????????|?rows???|?Extra???????????????????????????|+----+--------------------+--------------+--------+---------------------------------+|??1?|?PRIMARY????????????|?<derived2>???|????200?|?Using?temporary;?Using?filesort?|?|??2?|?DERIVED????????????|?wp_posts?????|?273098?|?Using?where?????????????????????|?|??2?|?DERIVED????????????|?wp_blogs?????|??????1?|?????????????????????????????????|?|??5?|?DEPENDENT?SUBQUERY?|?wp_users?????|??????1?|?Using?index?????????????????????|?|??4?|?DEPENDENT?SUBQUERY?|?wp_blogs?????|??????1?|?Using?where?????????????????????|?|??3?|?DEPENDENT?SUBQUERY?|?wp_blacklist?|??????1?|?Using?index;?Using?where????????|?+----+--------------------+--------------+--------+---------------------------------+6?rows?in?set?(0.03?sec)???====================================================================?NOT?IN的(除了转化为LEFT JOIN外,可以通过业务逻辑)优化:??缩小了博文的数据范围,定义在10天内发表的博文,取出200条,然后排除重复,按照发表时间顺序显示。?如果不加?博文的时间范围限制,返回满足条件的全部的博文数据列表多达27万条记录,mysql会分配很大内存来存放数据。?加上博文的时间范围,则返回满足条件的全部的博文数据列减少了很多。?再次explain?此sql语句,会发现wp_posts表涉及到的行为248行。 执行sql语句 发现速度提升不少。??====================================================================mysql>?explain?SELECT?*?FROM?(??SELECT?post_title,?guid?,?path,?post_author,?post_date,?tableid??FROM?wp_posts,?wp_blogs??WHERE?post_r_site?=1? /*?post_r_site?=1 表示查询会计网校下的博文?*/?AND?post_type?=?'post'? /*??type=post 为博文?*/?AND?post_status?=?'publish'? /*??博文不是私有博文 ?*/?AND?post_date?>?'2010-05-04'? /*??定义在10天内发表的博文??*/?AND?wp_blogs.blog_id=wp_posts.tableid??AND?wp_blogs.blog_id?NOT?IN?(SELECT?id?FROM?wp_blacklist?WHERE?type='0')?/*?? 博客Id 不再黑名单中?*/?AND?wp_posts.tableid?NOT?IN?(?SELECT?blog_id?FROM?wp_blogs?WHERE?registered?>?'2010-05-14'?)?? /*?用户注册日期大于05月14日??*/?AND?wp_posts.post_author?IN?(SELECT?id?FROM?wp_users)? /*??博主用户信息存在?*/?ORDER?BY?wp_posts.post_date?DESC? /*??按照发表博文的时间做降序排列?*/?LIMIT?200??)FF??GROUP?BY?tableid?/*?排除重复用户? 只显示用户的一条博文信息?*/?ORDER?BY?max(?post_date?)?DESC? /*??按照发表博文的时间做降序排列?*/?LIMIT?0,10?+----+--------------------+--------------+------+---------------------------------+|?id?|?select_type????????|?table????????|?rows?|?Extra???????????????????????????|+----+--------------------+--------------+------+---------------------------------+|??1?|?PRIMARY????????????|?<derived2>???|???62?|?Using?temporary;?Using?filesort?|?|??2?|?DERIVED????????????|?wp_posts?????|???248?|?Using?where?????????????????????|?|??2?|?DERIVED????????????|?wp_blogs?????|????1?|?????????????????????????????????|?|??5?|?DEPENDENT?SUBQUERY?|?wp_users?????|????1?|?Using?index?????????????????????|?|??4?|?DEPENDENT?SUBQUERY?|?wp_blogs?????|????1?|?Using?where?????????????????????|?|??3?|?DEPENDENT?SUBQUERY?|?wp_blacklist?|????1?|?Using?index;?Using?where????????|?+----+--------------------+--------------+------+---------------------------------+6?rows?in?set?(0.00?sec)???整理: ?

读书人网 >Mysql

热点推荐