Skip to content

Commit 72331a7

Browse files
authored
Merge pull request #142 from macbre/suggest_innodb
Suggest InnoDB over MyISAM
2 parents d2c1ba7 + f008e4e commit 72331a7

File tree

6 files changed

+158
-76
lines changed

6 files changed

+158
-76
lines changed

README.md

Lines changed: 88 additions & 75 deletions
Original file line numberDiff line numberDiff line change
@@ -76,9 +76,84 @@ Examples:
7676
Visit <https://github.com/macbre/index-digest>
7777
```
7878

79-
## An example
79+
## SQL query log
80+
81+
It's a text file with a single SQL query in each line (no line breaks are allowed). Lines that do start with `--` (SQL comment) are ignored. The file can be [generated using `query-digest` when `--sql-log` output mode is selected](https://github.com/macbre/query-digest#output-modes).
82+
83+
An example:
84+
85+
```sql
86+
-- A comment
87+
select * from 0002_not_used_indices order by id
88+
select * from 0002_not_used_indices where foo = 'foo' and id = 2
89+
select count(*) from 0002_not_used_indices where foo = 'foo'
90+
/* foo bar */ select * from 0002_not_used_indices where bar = 'foo'
91+
INSERT IGNORE INTO `0070_insert_ignore` VALUES ('123', 9, '2017-01-01');
92+
```
93+
94+
## Formatters
95+
96+
`index-digest` can return results in various formats (use `--format` to choose one).
97+
98+
### plain
99+
100+
Emits human-readable report to a console. You can disable colored and bold text by setting env variable `ANSI_COLORS_DISABLED=1`.
101+
102+
### syslog
103+
104+
Pushes JSON-formatted messages via syslog, so they can be aggregated using ELK stack.
105+
Use `SYSLOG_IDENT` env variable to customize syslog's `ident` messages are sent with (defaults to `index-digest`).
80106

81107
```
108+
Dec 28 15:59:58 debian index-digest[17485]: {"meta": {"version": "index-digest v0.1.0", "database_name": "index_digest", "database_host": "debian", "database_version": "MySQL v5.7.20"}, "report": {"type": "redundant_indices", "table": "0004_id_foo", "message": "\"idx\" index can be removed as redundant (covered by \"PRIMARY\")", "context": {"redundant": "UNIQUE KEY idx (id, foo)", "covered_by": "PRIMARY KEY (id, foo)", "schema": "CREATE TABLE `0004_id_foo` (\n `id` int(9) NOT NULL AUTO_INCREMENT,\n `foo` varbinary(16) NOT NULL DEFAULT '',\n PRIMARY KEY (`id`,`foo`),\n UNIQUE KEY `idx` (`id`,`foo`)\n) ENGINE=InnoDB DEFAULT CHARSET=latin1", "table_data_size_mb": 0.015625, "table_index_size_mb": 0.015625}}}
109+
```
110+
111+
### yaml
112+
113+
Outputs YML file with results and metadata.
114+
115+
## Checks
116+
117+
You can select which checks should be reported by the tool by using `--checks` command line option. Certain checks can also be skipped via `--skip-checks` option. Refer to `index_digest --help` for examples.
118+
119+
> **Number of checks**: 22
120+
121+
* `redundant_indices`: reports indices that are redundant and covered by other
122+
* `non_utf_columns`: reports text columns that have characters encoding set to `latin1` (utf is the way to go)
123+
* `missing_primary_index`: reports tables with no primary or unique key (see [MySQL bug #76252](https://bugs.mysql.com/bug.php?id=76252) and [Wikia/app#9863](https://github.com/Wikia/app/pull/9863))
124+
* `test_tables`: reports tables that seem to be test leftovers (e.g. `some_guy_test_table`)
125+
* `single_column`: reports tables with just a single column
126+
* `empty_tables`: reports tables with no rows
127+
* `generic_primary_key`: reports tables with [a primary key on `id` column](https://github.com/jarulraj/sqlcheck/blob/master/docs/logical/1004.md) (a more meaningful name should be used)
128+
* `use_innodb`: reports table using storage engines different than `InnoDB` (a default for MySQL 5.5+ and MariaDB 10.2+)
129+
130+
### Additional checks performed on SQL log
131+
132+
> You need to provide SQL log file via `--sql-log` option
133+
134+
* `not_used_columns`: checks which columns were not used by SELECT queries
135+
* `not_used_indices`: checks which indices are not used by SELECT queries
136+
* `not_used_tables`: checks which tables are not used by SELECT queries
137+
* `queries_not_using_index`: reports SELECT queries that do not use any index
138+
* `queries_using_filesort`: reports SELECT queries that require filesort ([a sort can’t be performed from an index and quicksort is used](https://www.percona.com/blog/2009/03/05/what-does-using-filesort-mean-in-mysql/))
139+
* `queries_using_temporary`: reports SELECT queries that require a temporary table to hold the result
140+
* `queries_using_full_table_scan`: reports SELECT queries that require a [full table scan](https://dev.mysql.com/doc/refman/5.7/en/table-scan-avoidance.html)
141+
* `selects_with_like`: reports SELECT queries that use `LIKE '%foo'` conditions (they can not use an index)
142+
* `insert_ignore`: reports [queries using `INSERT IGNORE`](https://medium.com/legacy-systems-diary/things-to-avoid-episode-1-insert-ignore-535b4c24406b)
143+
* `select_star`: reports [queries using `SELECT *`](https://github.com/jarulraj/sqlcheck/blob/master/docs/query/3001.md)
144+
* `having_clause`: reports [queries using `HAVING` clause](https://github.com/jarulraj/sqlcheck/blob/master/docs/query/3012.md)
145+
* `high_offset_selects`: report [SELECT queries using high OFFSET](https://www.percona.com/blog/2008/09/24/four-ways-to-optimize-paginated-displays/)
146+
147+
### Additional checks performed on tables data
148+
149+
> You need to use `--analyze-data` command line switch. Please note that these checks will query your tables. **These checks can take a while if queried columns are not indexed**.
150+
151+
* `data_too_old`: reports tables that have really old data, maybe it's worth checking if such long data retention is actually needed (**defaults to three months threshold**, can be customized via `INDEX_DIGEST_DATA_TOO_OLD_THRESHOLD_DAYS` env variable)
152+
* `data_not_updated_recently`: reports tables that were not updated recently, check if it should be up-to-date (**defaults a month threshold**, can be customized via `INDEX_DIGEST_DATA_NOT_UPDATED_RECENTLY_THRESHOLD_DAYS` env variable)
153+
154+
## An example report
155+
156+
```sql
82157
$ index_digest mysql://index_digest:qwerty@localhost/index_digest --sql-log sql/0002-not-used-indices-log
83158
------------------------------------------------------------
84159
Found 85 issue(s) to report for "index_digest" database
@@ -175,6 +250,18 @@ generic_primary_key → table affected: 0094_generic_primary_key
175250
PRIMARY KEY (`id`)
176251
) ENGINE=InnoDB DEFAULT CHARSET=latin1
177252

253+
------------------------------------------------------------
254+
use_innodb → table affected: 0036_use_innodb_myisam
255+
256+
"0036_use_innodb_myisam" uses MyISAM storage engine
257+
258+
- schema: CREATE TABLE `0036_use_innodb_myisam` (
259+
`item_id` int(9) NOT NULL AUTO_INCREMENT,
260+
`foo` int(8) DEFAULT NULL,
261+
PRIMARY KEY (`item_id`)
262+
) ENGINE=MyISAM DEFAULT CHARSET=latin1
263+
- engine: MyISAM
264+
178265
------------------------------------------------------------
179266
not_used_indices → table affected: 0002_not_used_indices
180267

@@ -327,80 +414,6 @@ high_offset_selects → table affected: page
327414
Queries performed: 100
328415
```
329416

330-
## SQL query log
331-
332-
It's a text file with a single SQL query in each line (no line breaks are allowed). Lines that do start with `--` (SQL comment) are ignored. The file can be [generated using `query-digest` when `--sql-log` output mode is selected](https://github.com/macbre/query-digest#output-modes).
333-
334-
An example:
335-
336-
```sql
337-
-- A comment
338-
select * from 0002_not_used_indices order by id
339-
select * from 0002_not_used_indices where foo = 'foo' and id = 2
340-
select count(*) from 0002_not_used_indices where foo = 'foo'
341-
select * from 0002_not_used_indices where bar = 'foo'
342-
INSERT IGNORE INTO `0070_insert_ignore` VALUES ('123', 9, '2017-01-01');
343-
```
344-
345-
## Formatters
346-
347-
`index-digest` can return results in various formats (use `--format` to choose one).
348-
349-
### plain
350-
351-
Emits human-readable report to a console. You can disable colored and bold text by setting env variable `ANSI_COLORS_DISABLED=1`.
352-
353-
### syslog
354-
355-
Pushes JSON-formatted messages via syslog, so they can be aggregated using ELK stack.
356-
Use `SYSLOG_IDENT` env variable to customize syslog's `ident` messages are sent with (defaults to `index-digest`).
357-
358-
```
359-
Dec 28 15:59:58 debian index-digest[17485]: {"meta": {"version": "index-digest v0.1.0", "database_name": "index_digest", "database_host": "debian", "database_version": "MySQL v5.7.20"}, "report": {"type": "redundant_indices", "table": "0004_id_foo", "message": "\"idx\" index can be removed as redundant (covered by \"PRIMARY\")", "context": {"redundant": "UNIQUE KEY idx (id, foo)", "covered_by": "PRIMARY KEY (id, foo)", "schema": "CREATE TABLE `0004_id_foo` (\n `id` int(9) NOT NULL AUTO_INCREMENT,\n `foo` varbinary(16) NOT NULL DEFAULT '',\n PRIMARY KEY (`id`,`foo`),\n UNIQUE KEY `idx` (`id`,`foo`)\n) ENGINE=InnoDB DEFAULT CHARSET=latin1", "table_data_size_mb": 0.015625, "table_index_size_mb": 0.015625}}}
360-
```
361-
362-
### yaml
363-
364-
Outputs YML file with results and metadata.
365-
366-
## Checks
367-
368-
You can select which checks should be reported by the tool by using `--checks` command line option. Certain checks can also be skipped via `--skip-checks` option. Refer to `index_digest --help` for examples.
369-
370-
> **Number of checks**: 21
371-
372-
* `redundant_indices`: reports indices that are redundant and covered by other
373-
* `non_utf_columns`: reports text columns that have characters encoding set to `latin1` (utf is the way to go)
374-
* `missing_primary_index`: reports tables with no primary or unique key (see [MySQL bug #76252](https://bugs.mysql.com/bug.php?id=76252) and [Wikia/app#9863](https://github.com/Wikia/app/pull/9863))
375-
* `test_tables`: reports tables that seem to be test leftovers (e.g. `some_guy_test_table`)
376-
* `single_column`: reports tables with just a single column
377-
* `empty_tables`: reports tables with no rows
378-
* `generic_primary_key`: reports tables with [a primary key on `id` column](https://github.com/jarulraj/sqlcheck/blob/master/docs/logical/1004.md) (a more meaningful name should be used)
379-
380-
### Additional checks performed on SQL log
381-
382-
> You need to provide SQL log file via `--sql-log` option
383-
384-
* `not_used_columns`: checks which columns were not used by SELECT queries
385-
* `not_used_indices`: checks which indices are not used by SELECT queries
386-
* `not_used_tables`: checks which tables are not used by SELECT queries
387-
* `queries_not_using_index`: reports SELECT queries that do not use any index
388-
* `queries_using_filesort`: reports SELECT queries that require filesort ([a sort can’t be performed from an index and quicksort is used](https://www.percona.com/blog/2009/03/05/what-does-using-filesort-mean-in-mysql/))
389-
* `queries_using_temporary`: reports SELECT queries that require a temporary table to hold the result
390-
* `queries_using_full_table_scan`: reports SELECT queries that require a [full table scan](https://dev.mysql.com/doc/refman/5.7/en/table-scan-avoidance.html)
391-
* `selects_with_like`: reports SELECT queries that use `LIKE '%foo'` conditions (they can not use an index)
392-
* `insert_ignore`: reports [queries using `INSERT IGNORE`](https://medium.com/legacy-systems-diary/things-to-avoid-episode-1-insert-ignore-535b4c24406b)
393-
* `select_star`: reports [queries using `SELECT *`](https://github.com/jarulraj/sqlcheck/blob/master/docs/query/3001.md)
394-
* `having_clause`: reports [queries using `HAVING` clause](https://github.com/jarulraj/sqlcheck/blob/master/docs/query/3012.md)
395-
* `high_offset_selects`: report [SELECT queries using high OFFSET](https://www.percona.com/blog/2008/09/24/four-ways-to-optimize-paginated-displays/)
396-
397-
### Additional checks performed on tables data
398-
399-
> You need to use `--analyze-data` command line switch. Please note that these checks will query your tables. **These checks can take a while if queried columns are not indexed**.
400-
401-
* `data_too_old`: reports tables that have really old data, maybe it's worth checking if such long data retention is actually needed (**defaults to three months threshold**, can be customized via `INDEX_DIGEST_DATA_TOO_OLD_THRESHOLD_DAYS` env variable)
402-
* `data_not_updated_recently`: reports tables that were not updated recently, check if it should be up-to-date (**defaults a month threshold**, can be customized via `INDEX_DIGEST_DATA_NOT_UPDATED_RECENTLY_THRESHOLD_DAYS` env variable)
403-
404417
## Success stories
405418

406419
> Want to add your entry here? Submit a pull request

indexdigest/cli/script.py

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -62,7 +62,8 @@
6262
check_data_too_old, \
6363
check_data_not_updated_recently, \
6464
check_generic_primary_key, \
65-
check_high_offset_selects
65+
check_high_offset_selects, \
66+
check_use_innodb
6667

6768

6869
def get_reports(database, sql_log=None, analyze_data=False):
@@ -94,6 +95,7 @@ def get_reports(database, sql_log=None, analyze_data=False):
9495
check_single_column(database),
9596
check_empty_tables(database),
9697
check_generic_primary_key(database),
98+
check_use_innodb(database),
9799
)
98100

99101
# checks that use SQL log

indexdigest/linters/__init__.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,7 @@
1414
from .linter_0028_data_not_updated_recently import check_data_not_updated_recently
1515
from .linter_0032_utf_latin_columns import check_latin_columns
1616
from .linter_0034_missing_primary_index import check_missing_primary_index
17+
from .linter_0036_use_innodb import check_use_innodb
1718
from .linter_0070_insert_ignore import check_insert_ignore_queries
1819
from .linter_0074_single_column import check_single_column
1920
from .linter_0075_test_tables import check_test_tables
Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,27 @@
1+
"""
2+
This linter checks for ...
3+
"""
4+
from collections import OrderedDict
5+
6+
from indexdigest.utils import LinterEntry
7+
8+
9+
def check_use_innodb(database):
10+
"""
11+
:type database indexdigest.database.Database
12+
:rtype: list[LinterEntry]
13+
"""
14+
# in MySQL 8.0 information_schema tables columns are uppercase
15+
res = database.query_dict_rows("SELECT TABLE_NAME, ENGINE FROM information_schema.tables "
16+
"WHERE ENGINE <> 'InnoDB' and TABLE_SCHEMA = '{}'".
17+
format(database.db_name))
18+
19+
for row in res:
20+
context = OrderedDict()
21+
context['schema'] = database.get_table_schema(row['TABLE_NAME'])
22+
context['engine'] = row['ENGINE']
23+
24+
yield LinterEntry(linter_type='use_innodb', table_name=row['TABLE_NAME'],
25+
message='"{TABLE_NAME}" uses {ENGINE} storage engine'.
26+
format(**row),
27+
context=context)
Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,23 @@
1+
from __future__ import print_function
2+
3+
from unittest import TestCase
4+
5+
from indexdigest.linters.linter_0036_use_innodb import check_use_innodb
6+
from indexdigest.test import DatabaseTestMixin
7+
8+
9+
class TestLinter(TestCase, DatabaseTestMixin):
10+
11+
def test_use_innodb(self):
12+
reports = list(check_use_innodb(self.connection))
13+
14+
print(reports, reports[0].context)
15+
16+
self.assertEqual(len(reports), 1)
17+
18+
self.assertEqual(str(reports[0]),
19+
'0036_use_innodb_myisam: "0036_use_innodb_myisam" uses MyISAM storage engine')
20+
self.assertEqual(reports[0].table_name, '0036_use_innodb_myisam')
21+
self.assertEqual(str(reports[0].context['engine']), "MyISAM")
22+
23+
# assert False

sql/0036-use-innodb.sql

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
-- Report MyISAM tables and suggest to use InndDB
2+
--
3+
-- https://github.com/macbre/index-digest/issues/36
4+
DROP TABLE IF EXISTS `0036_use_innodb_myisam`;
5+
CREATE TABLE `0036_use_innodb_myisam` (
6+
`item_id` int(9) NOT NULL AUTO_INCREMENT,
7+
`foo` int(8),
8+
PRIMARY KEY (`item_id`)
9+
) ENGINE=MyISAM;
10+
11+
DROP TABLE IF EXISTS `0036_use_innodb`;
12+
CREATE TABLE `0036_use_innodb` (
13+
`item_id` int(9) NOT NULL AUTO_INCREMENT,
14+
`foo` int(8),
15+
PRIMARY KEY (`item_id`)
16+
);

0 commit comments

Comments
 (0)