处理mysql cup飙升问题
紧急处理: 运行平稳的数据库,如果遇到CPU狂飙,到80%左右,那一定是开发写的烂SQL导致的,DBA首先要保证的是,数据库别跑挂了,所以我们要把那些运行慢的SQL杀死并记录到文件里,以便后面的排查。
pt-kill --match-info "^(select|SELECT)"--busy-time 3--victim all --interval 1--kill --print --daemonize > /root/kill.txt
这条命令是只把select耗时3秒以上的SQL全部杀死,并打印出来
模拟测试: 也可以通过show processlist;命令详细排查: 1.通过top命令查询服务器cup占用情况 我们看到cup占用率已经达到199.7%(服务器是双核) 2.执行 mysql -u root -p 进入mysql,输入show full processlist; 查看正在连接mysql的线程 我们看到这个sql是导致mysql数据库飙升的原因 3.杀死这些占用资源的烂sql kill 504; kill 509;
通过mysql慢日志分析: 1、查看mysql日志记录是否开启:show variables like ‘slow_query_log’; 没开启的话先开启日志记录: set global slow_query_log = on;
2、查看记录慢sql的时间:show global variables like ‘long_query_time’; 默认是10秒,可以根据需要修改,这里修改为执行时间超过3秒的sql被记录:
set global long_query_time =3;
3、查看日志存放位置:show variables like ‘slow_query_log_file’; 4、查看日志: cat /data/mysql/XXX/long.log
# Time: 170104 10:41:04 # User@Host: root[root] @ [127.0.0.1] Id: 785264 # Query_time: 2.082531 Lock_time: 0.000132 Rows_sent: 0 Rows_examined: 437963 SET timestamp=1483497664; SELECT id,update_time,update_end_time,ctime FROM `sg_point_log` WHERE pid = 105 AND date(ctime) = curdate() AND (update_time != 0 OR update_end_time != 0) ORDER BY id DESC limit 1; # Time: 170104 10:42:06 # User@Host: root[root] @ [127.0.0.1] Id: 785461 # Query_time: 2.037191 Lock_time: 0.000162 Rows_sent: 0 Rows_examined: 438013 SET timestamp=1483497726; SELECT id,update_time,update_end_time,ctime FROM `sg_point_log` WHERE pid = 203 AND date(ctime) = curdate() AND (update_time != 0 OR update_end_time != 0) ORDER BY id DESC limit 1; # Time: 170104 10:42:10 # User@Host: root[root] @ [127.0.0.1] Id: 785549 # Query_time: 2.003415 Lock_time: 0.000071 Rows_sent: 0 Rows_examined: 438016 SET timestamp=1483497730; SELECT id,update_time,update_end_time,ctime FROM `sg_point_log` WHERE pid = 281 AND date(ctime) = curdate() AND (update_time != 0 OR update_end_time != 0) ORDER BY id DESC limit 1; # Time: 170104 10:42:14 # User@Host: root[root] @ [127.0.0.1] Id: 785264 # Query_time: 2.152803 Lock_time: 0.000127 Rows_sent: 0 Rows_examined: 438018 SET timestamp=1483497734; SELECT id,update_time,update_end_time,ctime FROM `sg_point_log` WHERE pid = 8 AND date(ctime) = curdate() AND (update_time != 0 OR update_end_time != 0) ORDER BY id DESC limit 1; # Time: 170104 10:42:16 # User@Host: root[root] @ [127.0.0.1] Id: 785264 # Query_time: 2.040670 Lock_time: 0.000077 Rows_sent: 0 Rows_examined: 438021 SET timestamp=1483497736; SELECT id,upload_img_time,ctime FROM `sg_point_log` WHERE pid = 7 AND date(ctime) = curdate() AND upload_img_time != 0 ORDER BY id DESC limit 1; # Time: 170104 10:42:18 # User@Host: root[root] @ [127.0.0.1] Id: 785264 # Query_time: 2.139948 Lock_time: 0.000124 Rows_sent: 0 Rows_examined: 438025 SET timestamp=1483497738; SELECT id,update_time,update_end_time,ctime FROM `sg_point_log` WHERE pid = 7 AND date(ctime) = curdate() AND (update_time != 0 OR update_end_time != 0) ORDER BY id DESC limit 1;
可以看出罪魁祸首就是这个sql