MariaDB-server

mirror of https://github.com/MariaDB/server.git synced 2025-12-28 08:10:14 +00:00

Author	SHA1	Message	Date
Sergei Petrunia	72c0ba43b2	Code cleanup part #1	2022-01-19 18:10:09 +03:00
Sergei Petrunia	f76e310ace	Rename histogram_type=JSON to JSON_HB	2022-01-19 18:10:09 +03:00
Michael Okoko	bff65a813e	Implement point selectivity for JSON histograms * Also merges tests relating to JSON statistics into one file Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:08 +03:00
Michael Okoko	547f805311	Refactor histogram point selectivity Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:08 +03:00
Michael Okoko	63cbd0748b	replace range_selectivity methods for Histograms and add tests Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:08 +03:00
Michael Okoko	c129689ddc	Use binary search to compute range selectivity * it also adds an "explain select" statement to the test so that the fprintf calls can print the computed intervals to mysqld.1.err Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:08 +03:00
Michael Okoko	69f24c238e	Use generic Histogram_base class for Histogram_builders This fixes the wrong calculation for avg_frequency in json histograms by replacing the specific histogram objects with the generic Histogram_base class. It also restores get/set size functions as they were useful in calculating fields for binary histogram. Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:08 +03:00
Sergei Petrunia	21e0f5487f	MDEV-21130: Histograms: use JSON as on-disk format A demo of how to use in-memory data structure for histogram. The patch shows how to * convert string form of data to binary form * compare two values in binary form * compute a fraction for val in [X, Y] range. grep for GSOC-TODO for notes.	2022-01-19 18:10:08 +03:00
Michael Okoko	fe2e516a50	inform test result of zero hist_size for json histogram Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:08 +03:00
Michael Okoko	bf4d0dcfe2	implement parse and serialize for histogram json	2022-01-19 18:10:08 +03:00
Michael Okoko	9bba595528	remove unneeded shared methods Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:08 +03:00
Michael Okoko	1fa7af749e	Split histogram classes and into JSON and binary classes Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:08 +03:00
Sergei Petrunia	1998b787ac	MDEV-21130: Histograms: use JSON as on-disk format Preparation for handling different kinds of histograms: - In Column_statistics, change "Histogram histogram" into "Histogram *histogram_". This allows for different kinds of Histogram classes with virtual functions. - [Almost] remove the usage of Histogram->set_values and Histogram->set_size. The code outside the histogram should not make any assumptions about what/how is stored in the Histogram. - Introduce drafts of methods to read/save histograms to/from disk.	2022-01-19 18:10:08 +03:00
Michael Okoko	9954aecc2b	Store bucket bounds and extend test cases for JSON histogram This fixes the memory allocation for json histogram builder and add more column types for testing. Some challenges at the moment include: * Garbage value at the end of JSON array still persists. * Garbage value also gets appended to bucket values if the column is a primary key. * There's a memory leak resulting in a "Warning: Memory not freed" message at the end of tests. Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:07 +03:00
Michael Okoko	2aca7b0c33	Prepare JSON as valid histogram_type Signed-off-by: Michael Okoko <okokomichaels@outlook.com>	2022-01-19 18:10:07 +03:00
Sergei Golubchik	e841957416	Merge branch '10.3' into 10.4	2021-02-23 09:25:57 +01:00
Sergei Golubchik	0ab1e3914c	Merge branch '10.2' into 10.3	2021-02-22 22:42:27 +01:00
Varun Gupta	a461e4d306	MDEV-19474: Histogram statistics are used even with optimizer_use_condition_selectivity=3 The issue here was histogram statistics were being used even when the level of optimizer_use_condition_selectivity doesn't allow usage of statistics from histogram. The histogram statistics are read for a table only when optimizer_use_condition_selectivity > 3. But the TABLE structure can be stored in the internal table cache and be reused for the next query. So in this case the histogram statistics will be available for the next query. The fix would be to make sure to use the histogram statistics only when optimizer_use_condition_selectivity > 3.	2021-02-16 11:53:13 +05:30
Marko Mäkelä	4b959bd8df	Merge 10.3 into 10.4	2020-07-20 15:34:59 +03:00
Marko Mäkelä	acc58fd835	Merge 10.2 into 10.3	2020-07-20 15:11:59 +03:00
Marko Mäkelä	ca9276e37e	Merge 10.1 into 10.2	2020-07-20 14:53:24 +03:00
Varun Gupta	dfdfeecb03	MDEV-22851: Engine independent index statistics are incorrect for large tables on Windows An oveflow was happening on windows because on Windows sizeof(ulong) is 4 bytes while it is 8 bytes on Linux. Switched avg_frequency and avg length for column statistics to ulonglong. Switched avg_frequency for index statistics to ulonglong.	2020-07-15 11:27:32 +05:30
Marko Mäkelä	8059148154	Merge 10.3 into 10.4	2020-06-03 07:32:09 +03:00
Marko Mäkelä	8300f639a1	Merge 10.2 into 10.3	2020-06-02 10:25:11 +03:00
Marko Mäkelä	d72eebaa3d	Merge 10.1 into 10.2	2020-06-01 09:33:03 +03:00
Sergey Vojtovich	c279878493	Thread safe histograms loading Previously multiple threads were allowed to load histograms concurrently. There were no known problems caused by this. But given amount of data races in this code, it'd happen sooner or later. To avoid scalability bottleneck, histograms loading is protected by per-TABLE_SHARE atomic variable. Whenever histograms were loaded by preceding statement (hot-path), a scalable load-acquire check is performed. Whenever histograms have to be loaded anew, mutual exclusion for loaders is established by atomic variable. If histograms are being loaded concurrently, statement waits until load is completed. - Table_statistics::total_hist_size moved to TABLE_STATISTICS_CB: only meaningful within TABLE_SHARE (not used for collected stats). - TABLE_STATISTICS_CB::histograms_can_be_read and TABLE_STATISTICS_CB::histograms_are_read are replaced with a tri state atomic variable. - Simplified away alloc_histograms_for_table_share(). Note: there's still likely a data race if a thread attempts accessing histograms data after it failed to load it (because of concurrent load). It was there previously and goes out of the scope of this effort. One way of fixing it could be reviving TABLE::histograms_are_read and adding appropriate checks whenever it is needed. Part of MDEV-19061 - table_share used for reading statistical tables is not protected	2020-05-29 21:53:54 +04:00
Marko Mäkelä	c11e5cdd12	Merge 10.3 into 10.4	2019-10-10 11:19:25 +03:00
Marko Mäkelä	892378fb9d	Merge 10.2 into 10.3	2019-10-09 13:25:11 +03:00
Marko Mäkelä	24232ec12c	Merge 10.1 into 10.2	2019-10-09 08:30:23 +03:00
Sergey Vojtovich	adefaeffcc	MDEV-19536 - Server crash or ASAN heap-use-after-free in is_temporary_table / read_statistics_for_tables_if_needed Regression after `279a907`, read_statistics_for_tables_if_needed() was called after open_normal_and_derived_tables() failure. Fixed by moving read_statistics_for_tables() call to a branch of get_schema_stat_record() where result of open_normal_and_derived_tables() is checked. Removed THD::force_read_stats, added read_statistics_for_tables() instead. Simplified away statistics_for_command_is_needed().	2019-10-07 13:30:22 +04:00
Sergey Vojtovich	e43791d4dc	Cleanup EITS Moved EITS allocation inside read_statistics_for_tables_if_needed(). Removed redundant is_safe argument.	2019-10-02 15:23:59 +04:00
Oleksandr Byelkin	c07325f932	Merge branch '10.3' into 10.4	2019-05-19 20:55:37 +02:00
Marko Mäkelä	be85d3e61b	Merge 10.2 into 10.3	2019-05-14 17:18:46 +03:00
Marko Mäkelä	26a14ee130	Merge 10.1 into 10.2	2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru	cb248f8806	Merge branch '5.5' into 10.1	2019-05-11 22:19:05 +03:00
Marko Mäkelä	c64265f3f9	Merge 10.3 into 10.4	2018-12-12 14:09:48 +02:00
Marko Mäkelä	839cf16bb2	Merge 10.2 into 10.3	2018-12-12 13:46:06 +02:00
Marko Mäkelä	db1210f939	Merge 10.1 into 10.2	2018-12-12 12:13:43 +02:00
Marko Mäkelä	f77f8f6d1a	Merge 10.0 into 10.1	2018-12-12 10:48:53 +02:00
Varun Gupta	9207a838ed	MDEV-17255: New optimizer defaults and ANALYZE TABLE Added to new values to the server variable use_stat_tables. The values are COMPLEMENTARY_FOR_QUERIES and PREFERABLY_FOR_QUERIES. Both these values don't allow to collect EITS for queries like analyze table t1; To collect EITS we would need to use the syntax with persistent like analyze table t1 persistent for columns (col1,col2...) index (idx1, idx2...) / ALL Changing the default value from NEVER to PREFERABLY_FOR_QUERIES.	2018-12-09 13:25:27 +05:30
Varun Gupta	4886d14827	MDEV-17032: Estimates are higher for partitions of a table with @@use_stat_tables= PREFERABLY The problem here is EITS statistics does not calculate statistics for the partitions of the table. So a temporary solution would be to not read EITS statistics for partitioned tables. Also disabling reading of EITS for columns that participate in the partition list of a table.	2018-12-07 19:59:45 +05:30
Sergei Golubchik	57e0da50bb	Merge branch '10.2' into 10.3	2018-09-28 16:37:06 +02:00
Oleksandr Byelkin	28f08d3753	Merge branch '10.1' into 10.2	2018-09-14 08:47:22 +02:00
Oleksandr Byelkin	31081593aa	Merge branch '11.0' into 10.1	2018-09-06 22:45:19 +02:00
Varun Gupta	a9c09c95bd	MDEV-15306: Wrong/Unexpected result with the value optimizer_use_condition_selectivity set to 4 Currently for selectivity calculation we perform range analysis for a column even when we don't have any statistics(EITS). This makes less sense but is used to catch contradiction for WHERE condition. So the solution is to not perform range analysis for selectivity calculation for columns that do not have statistics.	2018-08-29 02:17:37 +05:30
Marko Mäkelä	7830fb7f45	Merge 10.2 into 10.3	2018-08-28 12:22:56 +03:00
Igor Babaev	4eac5df3fc	MDEV-16934 Query with very large IN clause lists runs slowly This patch introduces support for the system variable eq_range_index_dive_limit that existed in MySQL starting from 5.6. The variable sets a limit for index dives into equality ranges. Index dives are performed by optimizer to estimate the number of rows in range scans. Index dives usually provide good estimate but they are pretty expensive. To estimate the number of rows in equality ranges statistical data on indexes can be employed. Its usage gives not so good estimates but it's cheap. So if the number of equality dives required by an index scan exceeds the set limit no dives for equality ranges are performed by the optimizer for this index. As the new system variable is introduced in a stable version the default value for it is set to a special value meaning there is no limit for the number of index dives performed by the optimizer. The patch partially uses the MySQL code for WL 5957 'Statistics-based Range optimization for many ranges'.	2018-08-17 14:28:39 -07:00
Marko Mäkelä	05459706f2	Merge 10.2 into 10.3	2018-08-03 15:57:23 +03:00
Igor Babaev	095dc81158	MDEV-16757 Memory leak after adding manually min/max statistical data for blob column ANALYZE TABLE <table> does not collect statistical data on min/max values for BLOB columns of <table>. However these values can be added into mysql.column_stats manually by executing proper statements. Unfortunately this led to a memory leak because the memory allocated for these values was never freed. This patch provides the server with a function to free memory allocated for min/max statistical values of BLOB types. Temporarily changed the test case until MDEV-16711 is fixed as without this fix the test case for MDEV-16757 did not fail only for 10.0.	2018-07-15 16:24:24 -07:00
Igor Babaev	c89bb15c31	MDEV-16757 Memory leak after adding manually min/max statistical data for blob column ANALYZE TABLE <table> does not collect statistical data on min/max values for BLOB columns of <table>. However these values can be added into mysql.column_stats manually by executing proper statements. Unfortunately this led to a memory leak because the memory allocated for these values was never freed. This patch provides the server with a function to free memory allocated for min/max statistical values of BLOB types.	2018-07-13 17:48:45 -07:00

1 2

77 Commits