复现MySQL的索引选择失误以及通过OPTIMIZER_TRACE分析过程

验证环境:MySQL 5.7.39 windows-pc

一、构造数据(生成150万数据)

构建一张账户表,带有一级部门id和二级部门id,并且建立有索引。比较典型的业务场景,根据部门id进行各类查询。

CREATE TABLE TM_ACCOUNT (
  account_id bigint(20)  NOT null ,
  name varchar(32) DEFAULT '',
  address varchar(32) DEFAULT '',
  org_first_id int(10) DEFAULT 0,
  org_second_id int(10) DEFAULT 0,
  biz_date date DEFAULT null,
  last_modify_dt datetime DEFAULT null,
  PRIMARY KEY (account_id),
  KEY IDX_org_id_combine(org_first_id,org_second_id),
  KEY IDX_last_modify_dt_org_first_id_name(last_modify_dt,org_first_id,org_second_id)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8mb4;

此处直接通过jdbc批量插入数据。

数据分布,保证数据无倾斜,索引数据均匀:

代码可以直接使用,详情见附件3

二、通过explain验证语句的索引使用

show index from TM_ACCOUNT ;  -- 看索引

执行结果,可以看到org_first_id/org_second_id的区分度,都很不错。

Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment tm_account 0 PRIMARY 1 account_id A 1408599 BTREE tm_account 1 IDX_org_id_combine 1 org_first_id A 101 YES BTREE tm_account 1 IDX_org_id_combine 2 org_second_id A 10611 YES BTREE tm_account 1 IDX_last_modify_dt_org_first_id_name 1 last_modify_dt A 24 YES BTREE tm_account 1 IDX_last_modify_dt_org_first_id_name 2 org_first_id A 2497 YES BTREE tm_account 1 IDX_last_modify_dt_org_first_id_name 3 org_second_id A 251724 YES BTREE

show table status like '%TM_ACCOUNT%'; -- 看表状态,有数据大小、索引大小、大概行数

可看到使用了InnoDB引擎,大概行数是1408599,实际行数是1500000整。

Name Engine Version Row_format Rows Avg_row_length Data_length Max_data_length Index_length Data_free Auto_increment Create_time Update_time Check_time Collation Checksum Create_options Comment tm_account InnoDB 10 Dynamic 1408599 83 118128640 0 128253952 7340032 2022-09-13 10:49:36 utf8mb4_general_ci

explain SELECT * from TM_ACCOUNT where ACCOUNT_ID = '10'; -- 典型的主键字段查询,非常快,type=const

explain SELECT * from TM_ACCOUNT where ACCOUNT_ID = '10'; -- 典型的主键字段查询,非常快,type=const

explain SELECT * from TM_ACCOUNT where NAME = 'name-11'; -- 典型的非索引字段查询,全表扫描

explain SELECT * from TM_ACCOUNT where ADDRESS = 'QR3xHEOpaLAVNFCtAKXY'; -- 典型的非索引字段查询,全表扫描

explain SELECT * from TM_ACCOUNT where LAST_MODIFY_DT = '2100-09-13 00:00:00' and ACCOUNT_ID > 100 LIMIT 2; -- 典型的范围查询,扫描索引。单速度也很快

下面通过3个SQL查询的结果对比,来复现MySQL优化器如何选错优化场景。(这里不讨论为何不换种写法,直接规避劣化SQL。往往出现这类SQL时,一是业务场景复杂,二是开发时数据量少并未发现,在生产环境才能出现)

-- SQL-1
explain
SELECT * from TM_ACCOUNT where org_first_id >= 99 and org_second_id in (1,2,3,60) and BIZ_DATE  in ('2100-09-01','2100-09-02')and ACCOUNT_ID > '120306' order by ACCOUNT_ID  desc LIMIT 5000;

查询结果:可见使用了IDX_org_id_combine索引,并用到索引范围扫描、回表查询、临时文件排序。不算是一个很好的查询语句,但实际业务中的查询条件,只会更复杂。直接查询耗时140ms。

id select_type table partitions type possible_keys key key_len ref rows filtered Extra 1 SIMPLE TM_ACCOUNT range PRIMARY,IDX_org_id_combine IDX_org_id_combine 18 33942 4.0 Using index condition; Using where; Using filesort

-- SQL-2 坏案例-全表扫描;
explain
SELECT * from TM_ACCOUNT where org_first_id >= 90 and org_second_id in (1,2,3,60) and BIZ_DATE  in ('2100-09-01','2100-09-02') and ACCOUNT_ID > '120306' order by ACCOUNT_ID desc   LIMIT 5000;

查询结果:改变org_first_id条件,扩大查询范围,结果变成了主键索引的大范围扫描,预估扫描行数70万行,几乎是表总数的一半。直接查询耗时3900ms。

id select_type table partitions type possible_keys key key_len ref rows filtered Extra 1 SIMPLE TM_ACCOUNT range PRIMARY,IDX_org_id_combine PRIMARY 8 704299 1.68 Using where

-- SQL-3 与SQL-1基本相同,但limit数量减少。
explain
SELECT * from TM_ACCOUNT where org_first_id >= 99 and org_second_id in (1,2,3,60) and BIZ_DATE  in ('2100-09-01','2100-09-02') and ACCOUNT_ID > '120306' order by ACCOUNT_ID desc   LIMIT 500;

查询结果:与SQL-1基本相同,但limit数量减少,即查询条件范围缩小,劣化成主键大范围扫描。 直接查询耗时1210ms。

id select_type table partitions type possible_keys key key_len ref rows filtered Extra 1 SIMPLE TM_ACCOUNT range PRIMARY,IDX_org_id_combine PRIMARY 8 704299 0.19 Using where

三、复现索引选择劣化、并尝试分析OPTIMIZER_TRACE

执行相关命令,获取OPTIMIZER_TRACE过程。

/* 打开optimizer_trace,只对本线程有效 */
SET optimizer_trace='enabled=on';
#你的sql
-- select ......;
SELECT * from TM_ACCOUNT where org_first_id >= 90 and org_second_id in (1,2,3,60) and BIZ_DATE  in ('2100-09-01','2100-09-02') and ACCOUNT_ID > '120306' order by ACCOUNT_ID desc   LIMIT 5000;
#查看优化器追踪链
select * from INFORMATION_SCHEMA.OPTIMIZER_TRACE;
#关闭优化器追踪
SET optimizer_trace='enabled=off';

关键过程:通过对潜在查询方式的预估,分别对PRIMARY/IDX_org_id_combine的开销进行评估,这里开销并不仅看扫描行数,还会看排序等情况。可以看到虽然走主键索引的行数更多,但总开销更小。由此可知在【预估】过程,误导了整个优化器。

共有2个潜在选项,分别标出了rowid是否排序、行数rows、预估开销cost

截取部分OPTIMIZER_TRACE结果,完整json参考附录1

    // 分析可供选择的范围条件
    "analyzing_range_alternatives": {
        "range_scan_alternatives": [
        {
            "index": "PRIMARY",
            "ranges": [
            "120306 < account_id"
            ],
            "index_dives_for_eq_ranges": true,
            "rowid_ordered": true,
            "using_mrr": false,
            "index_only": false,
            "rows": 704299,
            "cost": 141880,
            "chosen": true
        },
        {
            "index": "IDX_org_id_combine",
            "ranges": [
            "90

这里怀疑是 order by ACCOUNT_ID影响了优化器选择,但通测试发现,即使移除了’order by ACCOUNT_ID desc LIMIT 5000’,explain结果仍然是走PRIMARY索引。由此可见,还有些隐藏的信息,OPTIMIZER_TRACE没有展示全。这里暂不深入讨论。

explain
SELECT * from TM_ACCOUNT where org_first_id >= 90 and org_second_id in (1,2,3,60) and BIZ_DATE  in ('2100-09-01','2100-09-02') and ACCOUNT_ID > '120306' ;

id select_type table partitions type possible_keys key key_len ref rows filtered Extra 1 SIMPLE TM_ACCOUNT range PRIMARY,IDX_org_id_combine PRIMARY 8 704299 1.68 Using where

结果:实际查询耗时912ms。在【## 附录2 OPTIMIZER_TRACE原始信息2】中也能看到选择实际索引,仍然是PRIMARY,与explain结果一致。

四、如何优化?

改写SQL:

下面来看效果

explain
SELECT * from TM_ACCOUNT where org_first_id in ('90','91','92','93','94','95','96','97','98','99') and org_second_id in (1,2,3,60) and BIZ_DATE  in ('2100-09-01','2100-09-02') and ACCOUNT_ID > '120306' order by ACCOUNT_ID desc   LIMIT 5000;

id select_type table partitions type possible_keys key key_len ref rows filtered Extra 1 SIMPLE TM_ACCOUNT range PRIMARY,IDX_org_id_combine IDX_org_id_combine 18 5543 20.0 Using index condition; Using where; Using filesort

结果:实际查询耗时59ms。explain结果可看到虽然也用了IDX_org_id_combine索引,但仍然是range查询、回表、filesort,好在扫描行数较少,最终耗时很小。

随着数据量的增大,无论多么简单的SQL,最终仍然会变慢。

其他方式:

附录1 OPTIMIZER_TRACE原始信息1

以下语句的执行优化过程
SELECT * from TM_ACCOUNT where org_first_id >= 90 and org_second_id in (1,2,3,60) and BIZ_DATE in ('2100-09-01','2100-09-02') and ACCOUNT_ID > '120306' order by ACCOUNT_ID desc LIMIT 5000;

{
    "steps": [
      {
        "join_preparation": {
          "select#": 1,
          "steps": [
            {
              "IN_uses_bisection": true
            },
            {
              "IN_uses_bisection": true
            },
            {
              "expanded_query": "/* select#1 */ select tm_account.account_id AS account_id,tm_account.name AS name,tm_account.address AS address,tm_account.org_first_id AS org_first_id,tm_account.org_second_id AS org_second_id,tm_account.biz_date AS biz_date,tm_account.last_modify_dt AS last_modify_dt from tm_account where ((tm_account.org_first_id >= 90) and (tm_account.org_second_id in (1,2,3,60)) and (tm_account.biz_date in ('2100-09-01','2100-09-02')) and (tm_account.account_id > 120306)) order by tm_account.account_id desc limit 5000"
            }
          ]
        }
      },
      {
        "join_optimization": {
          "select#": 1,
          "steps": [
            {
              "condition_processing": {
                "condition": "WHERE",
                "original_condition": "((tm_account.org_first_id >= 90) and (tm_account.org_second_id in (1,2,3,60)) and (tm_account.biz_date in ('2100-09-01','2100-09-02')) and (tm_account.account_id > 120306))",
                "steps": [
                  {
                    "transformation": "equality_propagation",
                    "resulting_condition": "((tm_account.org_first_id >= 90) and (tm_account.org_second_id in (1,2,3,60)) and (tm_account.biz_date in ('2100-09-01','2100-09-02')) and (tm_account.account_id > 120306))"
                  },
                  {
                    "transformation": "constant_propagation",
                    "resulting_condition": "((tm_account.org_first_id >= 90) and (tm_account.org_second_id in (1,2,3,60)) and (tm_account.biz_date in ('2100-09-01','2100-09-02')) and (tm_account.account_id > 120306))"
                  },
                  {
                    "transformation": "trivial_condition_removal",
                    "resulting_condition": "((tm_account.org_first_id >= 90) and (tm_account.org_second_id in (1,2,3,60)) and (tm_account.biz_date in ('2100-09-01','2100-09-02')) and (tm_account.account_id > 120306))"
                  }
                ]
              }
            },
            {
              "substitute_generated_columns": {}
            },
            {
              "table_dependencies": [
                {
                  "table": "tm_account",
                  "row_may_be_null": false,
                  "map_bit": 0,
                  "depends_on_map_bits": []
                }
              ]
            },
            {
              "ref_optimizer_key_uses": []
            },
            {
                // 行数预估
              "rows_estimation": [
                {
                  "table": "tm_account",
                  "range_analysis": {
                    "table_scan": {
                      "rows": 1408599,
                      "cost": 288932
                    },
                    "potential_range_indexes": [
                      {
                        "index": "PRIMARY",
                        "usable": true,
                        "key_parts": [
                          "account_id"
                        ]
                      },
                      {
                        "index": "IDX_org_id_combine",
                        "usable": true,
                        "key_parts": [
                          "org_first_id",
                          "org_second_id",
                          "account_id"
                        ]
                      },
                      {
                        "index": "IDX_last_modify_dt_org_first_id_name",
                        "usable": false,
                        "cause": "not_applicable" // 直接标明不适用
                      }
                    ],
                    "setup_range_conditions": [],
                    "group_index_range": {
                      "chosen": false,
                      "cause": "not_group_by_or_distinct"
                    },
                    // 分析可供选择的范围条件
                    "analyzing_range_alternatives": {
                      "range_scan_alternatives": [
                        {
                          "index": "PRIMARY",
                          "ranges": [
                            "120306 < account_id"
                          ],
                          "index_dives_for_eq_ranges": true,
                          "rowid_ordered": true,
                          "using_mrr": false,
                          "index_only": false,
                          "rows": 704299,
                          "cost": 141880,
                          "chosen": true
                        },
                        {
                          "index": "IDX_org_id_combine",
                          "ranges": [
                            "90 = 90) and (tm_account.org_second_id in (1,2,3,60)) and (tm_account.biz_date in ('2100-09-01','2100-09-02')) and (tm_account.account_id > 120306))",
                "attached_conditions_computation": [
                  {
                    "table": "tm_account",
                    "rechecking_index_usage": {
                      "recheck_reason": "low_limit",
                      "limit": 5000,
                      "row_estimate": 11806,
                      "range_analysis": {
                        "table_scan": {
                          "rows": 1408599,
                          "cost": 1690000
                        },
                        "potential_range_indexes": [
                          {
                            "index": "PRIMARY",
                            "usable": true,
                            "key_parts": [
                              "account_id"
                            ]
                          },
                          {
                            "index": "IDX_org_id_combine",
                            "usable": false,
                            "cause": "not_applicable"
                          },
                          {
                            "index": "IDX_last_modify_dt_org_first_id_name",
                            "usable": false,
                            "cause": "not_applicable"
                          }
                        ],
                        "setup_range_conditions": [],
                        "group_index_range": {
                          "chosen": false,
                          "cause": "cannot_do_reverse_ordering"
                        },
                        "analyzing_range_alternatives": {
                          "range_scan_alternatives": [
                            {
                              "index": "PRIMARY",
                              "ranges": [
                                "120306 < account_id"
                              ],
                              "index_dives_for_eq_ranges": true,
                              "rowid_ordered": true,
                              "using_mrr": false,
                              "index_only": false,
                              "rows": 704299,
                              "cost": 141880,
                              "chosen": true
                            }
                          ]
                        },
                        "chosen_range_access_summary": {
                          "range_access_plan": {
                            "type": "range_scan",
                            "index": "PRIMARY",
                            "rows": 704299,
                            "ranges": [
                              "120306 < account_id"
                            ]
                          },
                          "rows_for_plan": 704299,
                          "cost_for_plan": 141880,
                          "chosen": true
                        }
                      }
                    }
                  }
                ],
                "attached_conditions_summary": [
                  {
                    "table": "tm_account",
                    "attached": "((tm_account.org_first_id >= 90) and (tm_account.org_second_id in (1,2,3,60)) and (tm_account.biz_date in ('2100-09-01','2100-09-02')) and (tm_account.account_id > 120306))"
                  }
                ]
              }
            },
            {
              "clause_processing": {
                "clause": "ORDER BY",
                "original_clause": "tm_account.account_id desc",
                "items": [
                  {
                    "item": "tm_account.account_id"
                  }
                ],
                "resulting_clause_is_simple": true,
                "resulting_clause": "tm_account.account_id desc"
              }
            },
            {
              "reconsidering_access_paths_for_index_ordering": {
                "clause": "ORDER BY",
                "steps": [],
                "index_order_summary": {
                  "table": "tm_account",
                  "index_provides_order": true,
                  "order_direction": "desc",
                  "index": "PRIMARY",
                  "plan_changed": false
                }
              }
            },
            {
              "refine_plan": [
                {
                  "table": "tm_account"
                }
              ]
            }
          ]
        }
      },
      {
        "join_execution": {
          "select#": 1,
          "steps": []
        }
      }
    ]
  }

附录2 OPTIMIZER_TRACE原始信息2

SELECT * from TM_ACCOUNT where org_first_id >= 90 and org_second_id in (1,2,3,60) and BIZ_DATE in ('2100-09-01','2100-09-02') and ACCOUNT_ID > '120306' ;


{
  "steps": [
    {
      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "IN_uses_bisection": true
          },
          {
            "IN_uses_bisection": true
          },
          {
            "expanded_query": "/* select#1 */ select tm_account.account_id AS account_id,tm_account.name AS name,tm_account.address AS address,tm_account.org_first_id AS org_first_id,tm_account.org_second_id AS org_second_id,tm_account.biz_date AS biz_date,tm_account.last_modify_dt AS last_modify_dt from tm_account where ((tm_account.org_first_id >= 90) and (tm_account.org_second_id in (1,2,3,60)) and (tm_account.biz_date in ('2100-09-01','2100-09-02')) and (tm_account.account_id > 120306))"
          }
        ]
      }
    },
    {
      "join_optimization": {
        "select#": 1,
        "steps": [
          {
            "condition_processing": {
              "condition": "WHERE",
              "original_condition": "((tm_account.org_first_id >= 90) and (tm_account.org_second_id in (1,2,3,60)) and (tm_account.biz_date in ('2100-09-01','2100-09-02')) and (tm_account.account_id > 120306))",
              "steps": [
                {
                  "transformation": "equality_propagation",
                  "resulting_condition": "((tm_account.org_first_id >= 90) and (tm_account.org_second_id in (1,2,3,60)) and (tm_account.biz_date in ('2100-09-01','2100-09-02')) and (tm_account.account_id > 120306))"
                },
                {
                  "transformation": "constant_propagation",
                  "resulting_condition": "((tm_account.org_first_id >= 90) and (tm_account.org_second_id in (1,2,3,60)) and (tm_account.biz_date in ('2100-09-01','2100-09-02')) and (tm_account.account_id > 120306))"
                },
                {
                  "transformation": "trivial_condition_removal",
                  "resulting_condition": "((tm_account.org_first_id >= 90) and (tm_account.org_second_id in (1,2,3,60)) and (tm_account.biz_date in ('2100-09-01','2100-09-02')) and (tm_account.account_id > 120306))"
                }
              ]
            }
          },
          {
            "substitute_generated_columns": {}
          },
          {
            "table_dependencies": [
              {
                "table": "tm_account",
                "row_may_be_null": false,
                "map_bit": 0,
                "depends_on_map_bits": []
              }
            ]
          },
          {
            "ref_optimizer_key_uses": []
          },
          {
            "rows_estimation": [
              {
                "table": "tm_account",
                "range_analysis": {
                  "table_scan": {
                    "rows": 1408599,
                    "cost": 288932
                  },
                  "potential_range_indexes": [
                    {
                      "index": "PRIMARY",
                      "usable": true,
                      "key_parts": [
                        "account_id"
                      ]
                    },
                    {
                      "index": "IDX_org_id_combine",
                      "usable": true,
                      "key_parts": [
                        "org_first_id",
                        "org_second_id",
                        "account_id"
                      ]
                    },
                    {
                      "index": "IDX_last_modify_dt_org_first_id_name",
                      "usable": false,
                      "cause": "not_applicable"
                    }
                  ],
                  "setup_range_conditions": [],
                  "group_index_range": {
                    "chosen": false,
                    "cause": "not_group_by_or_distinct"
                  },
                  "analyzing_range_alternatives": {
                    "range_scan_alternatives": [
                      {
                        "index": "PRIMARY",
                        "ranges": [
                          "120306 < account_id"
                        ],
                        "index_dives_for_eq_ranges": true,
                        "rowid_ordered": true,
                        "using_mrr": false,
                        "index_only": false,
                        "rows": 704299,
                        "cost": 141880,
                        "chosen": true
                      },
                      {
                        "index": "IDX_org_id_combine",
                        "ranges": [
                          "90 = 90) and (tm_account.org_second_id in (1,2,3,60)) and (tm_account.biz_date in ('2100-09-01','2100-09-02')) and (tm_account.account_id > 120306))",
              "attached_conditions_computation": [],
              "attached_conditions_summary": [
                {
                  "table": "tm_account",
                  "attached": "((tm_account.org_first_id >= 90) and (tm_account.org_second_id in (1,2,3,60)) and (tm_account.biz_date in ('2100-09-01','2100-09-02')) and (tm_account.account_id > 120306))"
                }
              ]
            }
          },
          {
            "refine_plan": [
              {
                "table": "tm_account"
              }
            ]
          }
        ]
      }
    },
    {
      "join_execution": {
        "select#": 1,
        "steps": []
      }
    }
  ]
}

附录3 java构造数据


public final class JdbcUtils {

    private static String url = "jdbc:mysql://localhost:3306/xxxx?useUnicode=true&characterEncoding=utf8&zeroDateTimeBehavior=convertToNull&serverTimezone=GMT%2B8";
    private static String user = "root";
    private static String password = "123";

    private JdbcUtils() {

    }

    static {
        try {
            Class.forName("com.mysql.jdbc.Driver");
        } catch (ClassNotFoundException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    public static void main(String args[]) {
        insertBatch();
    }

    public static void insertBatch() {

        Connection conn = null;
        PreparedStatement pst = null;
        ResultSet rs = null;
        try {
            String sql = "INSERT into TM_ACCOUNT values(?,?,?,?,?,?,?);";

            // 1. 获取链接,预处理语句
            conn = getConnection();
            conn.setAutoCommit(false);
            pst = conn.prepareStatement(sql);

            // 2. 开始插入,总插入150万
            Random random = new Random();
            int a_id_start = 1;
            for (int i = 0; i < 5 * 150; i++) {
                // 每2000条执行一次批量插入
                for (int loop = 0; loop < 2000; loop++) {

                    a_id_start++;
                    pst.setInt(1, a_id_start);
                    pst.setString(2, "name-" + a_id_start);
                    pst.setString(3, RandomString.make(20));
                    pst.setInt(4, random.nextInt(100));
                    pst.setInt(5, random.nextInt(100));
                    pst.setDate(6, new Date(200, 8, random.nextInt(25) + 1));
                    pst.setDate(7, new Date(200, 8, random.nextInt(25) + 1));
                    pst.addBatch();
                }
                pst.executeBatch();
                conn.commit();
                System.out.println(" done !!!!!!" + i);
            }

        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            free(rs, pst, conn);
        }
    }

    public static Connection getConnection() throws SQLException {
        return DriverManager.getConnection(url, user, password);
    }

    public static void free(ResultSet rs, Statement st, Connection conn) {
        try {
            if (rs != null)
                rs.close();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (st != null)
                    st.close();
            } catch (Exception e2) {
                e2.printStackTrace();
            } finally {
                try {
                    conn.close();
                } catch (Exception e3) {
                    e3.printStackTrace();
                }
            }
        }
    }
}

Original: https://www.cnblogs.com/starmoon1994/p/16942181.html
Author: starmoon1900
Title: 复现MySQL的索引选择失误以及通过OPTIMIZER_TRACE分析过程

原创文章受到原创版权保护。转载请注明出处:https://www.johngo689.com/797454/

转载文章受原作者版权保护。转载请注明原作者出处!

(0)

大家都在看

亲爱的 Coder【最近整理,可免费获取】👉 最新必读书单  | 👏 面试题下载  | 🌎 免费的AI知识星球