Wednesday, November 03, 2010

Halloween 2010


"Trick or treat" and Vote.


Dogs in Costumes


Pimpkin Craving


High Tech Costume


Tuesday, October 26, 2010

MYSQL SPEEPUP INSERT INTO A MYISAM TABLE

Accordint to the MySQL Manual, there are a few ways to speedup inserts while adding data to a MyISAM table. Following are the original document from MySQL manual plus the notes I added while I was experiencing with real data loads.

1. Background Information
 
The time required for inserting a row, according to the MySQL manual, is determined by the following factors, where the numbers indicate approximate proportions:
· Cnnecting: (3)
· There are many server variables that can be used to help optimize the user connections as stated in the post:
· Sending query to server: (2)
· Parsing query: (2)
· Inserting row: (1 × size of row)
· Inserting indexes: (1 × number of indexes)
· Closing: (1)

There are two additional things which need to be considered regarding overheard for inserts:
  • This does not take into consideration the initial overhead to open tables, which is done once for each concurrently running query. To try to reduce the times need to open table, try to cache them. To cache the table, set the table_cache variable. Please note that the table_cache, max_connections, and max_tmp_tables system variables affect the maximum number of files the server keeps open. If table_cache is set too high, MySQL may run out of file descriptors and refuse connections, fail to perform queries, and be very unreliable. You also have to take into account that the MyISAM storage engine needs two file descriptors for each unique open table. Please refer to this note for connection related variable setting.
  • The size of the table slows down the insertion of indexes by log N, assuming B-tree indexes.

2. Methods Which Can Be Used to speed up Inserts 
These are the methods which can be used to speed up inserts. You will notice that most of the methods mentions here are related to MySQL MyISAM tables.

(1.) If you are inserting many rows from the same client at the same time, use INSERT statements with multiple VALUES lists to insert several rows at a time. This is considerably faster (many times faster in some cases) than using separate single-row INSERT statements.
   Insert into t1 (a,b,c,d) values (1,2,3,’2010-01-01’), (3,5,6,’2010-02-01’), (7,8,9,’2010-03-01’);

Instead of:

   Insert into t1 (a,b,c,d) values (1,2,3,’2010-01-01’);
   Insert into t1 (a,b,c,d) values (3,5,6,’2010-02-01’);
   Insert into t1 (a,b,c,d) values (7,8,9,’2010-03-01’);

(2.) If you are adding data to a nonempty table, you can tune the bulk_insert_buffer_size variable to make data insertion even faster.

(3.) If multiple clients are inserting a lot of rows, you can get higher speed by using the INSERT DELAYED statement. “Insert Delay” hint delays the inserts by storing the insert statements in the memory queue utile the target table is free from reads. Hence, the insert statement will return quickly and all the insert statements that are stored in the queue can be written in one block, which is more optimized that written one statement at a time.



Hear is the syntax for insert delay:

Insert delay into t1 (a,b,c,d) values (1,2,3,’2010-01-01’), (3,5,6,’2010-02-01’), (7,8,9,’2010-03-01’);
See Section 12.2.5.2, INSERT
Please note that delay insert only works with MyISAM, MEMORY, ARCHIVE and BLACKHOLE(v. 5.1.19) table. And, there is downside of using “Insert Delayed”. There will be negative performance gain if the table is empty or not being used and add overhead to the server to hold additional thread for each table for which there is delayed rows, etc. Please read the manual for details on the negative impacts.
Theses are system status variables can be used to monitor delay insert uses. Here are the Status Variable meaning:
  • Delayed_insert_threads -- Number of handler threads
  • Delayed_writes -- Number of rows written with INSERT DELAYED
  • Not_flushed_delayed_rows -- Number of rows waiting to be written

 (4.) For a MyISAM table, you can use concurrent inserts to add rows at the same time that SELECT statements are running, this works well for tables that have more reads than writes because it just keeps adding data at the end of the data file.
To enable concurrent inserts, we need to set the concurrent_insert system variable. The values for the variable are:

· 0 -- Disables concurrent inserts
· 1 -- Enables concurrent insert for MyISAM tables that do not have holes. A hole means that there is free space in the data file, i.e., rows were deleted from the table. This is the default value.
· 2 -- Enables concurrent inserts for all MyISAM tables, even those that have holes. For a table with a hole,  new rows are inserted at the end of the table if it is in use by another thread. Otherwise, MySQL acquires a normal write lock and inserts the row into the hole.

To set the variable with other values (0 or 2) rather than the default value, for example concurrent_insert = 2, enter the parameter in the my.cnf file and restart the server. Or you can also dynamically set it in MySQL, for example:
   MySQL> set global concurrent_insert = 2;

Here are a few things that need to keep in mind.

1. ) Under circumstances where concurrent inserts can be used, there is seldom any need to use the DELAYED modifier for INSERT statements.
2. ) If you are using the binary log, concurrent inserts are converted to normal inserts for CREATE ... SELECT or INSERT ... SELECT statements. This is done to ensure that you can re-create an exact copy of your tables by applying the log during a backup operation.
3. ) With LOAD DATA INFILE, if you specify CONCURRENT with a MyISAM table that satisfies the condition for concurrent inserts (that is, it contains no free blocks in the middle), other sessions can retrieve data from the table while LOAD DATA is executing.
4)Use of the CONCURRENT option affects the performance of LOAD DATA a bit, even if no other session is using the table at the same time.
MySQL document has more information on using concurrent inserts.
(5.) Using bulk load command, LOAD DATA INFILE, to load data into a table from a text file can be 20 times faster than using INSERT statements.

Updating indexes takes time and resources during inserts. When data are loaded into a table that has many indexes, we can follow the steps bellow to speed up the loads.

1.) Disable the indexes
Non unique indexes of a MyISAM table can be disabled using the following statement:
MySQL> ALTER TABLE t1 DISABLE KEYS;
The alter table disable keys statement also dose a flush table.
2.) Load data into the table using LOAD DATA INFILE statement. This does not update any indexes and therefore is very fast.
3.) Enable the indexes
Indexes can be enabled using the following statement:

MySQL> ALTER TABLE tbl_name ENABLE KEYS;

Here is a note on MySQL Disable and Enable Indexes Command.

http://databobjr.blogspot.com/2010/05/mysql-alter-table-disable-keys.html

You can also use myisamchk --keys-used=0 -rq /path/to/db/tbl_name to disable index keys and use myisamchk -rq /path/to/db/tbl_name to enable index keys. However, if you used myisamchk to disable or enable index keys, you must issue Flush Tables statement or a mysqladmin flush-tables command. More information can be found here: http://dev.mysql.com/doc/refman/5.0/en/myisamchk.html
.
(6.) To increase performance for MyISAM tables, for both LOAD DATA INFILE and INSERT, enlarge the key cache by increasing the key_buffer_size system variable.

2. SUMMARY

The methodologies mentioned above were not THE only methods that could speed up inserts nor should they be used exclusively in that orders. It requires a lot of trails and errors to have one that works for a particular environment.

References:



Thursday, October 21, 2010

MySQL Thread States Repair By Sorting Vs. Repair With Key Cache

When perform any of the REPAIR, ALTER or LOAD or ENABLE KEYS command on a MyISAM table, it also causes the rebuilding of the indexes. Indexes are rebuilt (repaired) in two modes:
  •  Repair by sorting -- The repair code is using a sort to create indexes.
  •  Repair using keycache -- The repair code is using creating keys one by one through the key cache. This is much slower than Repair by sorting.
To start repairing, first it tests if the table can be repaired by sorting provided it meets the following requirements:
  •   Table at least has one key 
  •   Total size needed for individual key is less than myisam_max_sort_file_size
If it meets the above requirements, then it uses either regular sorting if myisam_repair_threads = 1 (default) by building each key at a time or in parallel if myisam_repair_threads > 1 by using ‘n’ threads in parallel (n = total keys in the table).
 
If it fails to satisfy the above conditions, then it falls to expensive keycache repair mode.

Therefore, to avoid keycache repair, set the myisam_max_sort_file_size to the max index size.

 Related Note:
References: