Kayako logo
SupportSuite, eSupport and LiveResponse Discussion, troubleshooting and feedback related to Kayako's flagship support desk products SupportSuite, eSupport and LiveResponse.

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  (#1) Old
nickg Offline
New Member
 
Posts: 13
Join Date: Sep 2006
Scalability - 16-06-2008, 05:42 PM

I am having issues with the scalability of Kayako's database design and query choices. My database is currently only 6GB in size and I have many issues where queries will take large amounts of time, due to poor database design and query choices.


The biggest problem I am currently running into is searching. When one searches, the query that uses the ticketposts table does not use an index. This causes the entire ticketposts table to be read off the hard drive, 1GB, every time someone does a search is not very optimal. There's a few ways around this, the easiest would be to create an index. A better solution would not only be to create an index on the appropriate columns, but also move the contents field to it's own table. This way when ever the ticketposts table is manipulated, it is not extracted from the hard disk unless it is absolutely needed.


Code:
# Query_time: 161  Lock_time: 1  Rows_sent: 100  Rows_examined: 879161
SELECT ticketposts.ticketid, COUNT(*) AS rank FROM `swticketpostindex` AS ticketpostindex LEFT JOIN `swticketposts` AS ticketposts ON (tick
etposts.ticketpostid = ticketpostindex.ticketpostid) LEFT JOIN `swtickets` AS tickets ON (tickets.ticketid = ticketposts.ticketid) WHERE ti
cketpostindex.ticketwordid IN ('102','203') AND tickets.departmentid IN ('18','19','20','21') GROUP BY ticketposts.ticketid ORDER BY rank D
ESC LIMIT 100;


Another query I am having scalability issues with is the following query:

Code:
# Query_time: 11  Lock_time: 0  Rows_sent: 7  Rows_examined: 444436
SELECT `departmentid`, `ticketstatusid`, COUNT(*) AS totalitems, MAX(`lastactivity`) AS lastactivity FROM `swtickets` GROUP BY `departmenti
d`, `ticketstatusid` HAVING `departmentid` IN ('3');

I've yet to locate it's origin, however because the departmentID logic is placed into the HAVING clause, MySQL cannot use an index, but it also has to read every ticket off of the disk. 444,000 rows, or 111MB. This is an easy fix, move the HAVING clause into the WHERE clause when possible.

It looks like the development staff is swamped, if I develop the fixes I'll paste the patches. Either way, please keep scalability in mind when developing new features/functionality.
   
Reply With Quote
  (#2) Old
nickg Offline
New Member
 
Posts: 13
Join Date: Sep 2006
16-06-2008, 05:57 PM

I believe I found, and fixed the bad query:


Code:
diff modules/tickets/functions_ticketcore.php.orig modules/tickets/functions_ticketcore.php
878c878
<               $dbCore->query("SELECT `departmentid`, `ticketstatusid`, COUNT(*) AS totalitems, MAX(`lastactivity`) AS lastactivity FROM `". TABLE_PREFIX ."tickets` GROUP BY `departmentid`, `ticketstatusid` HAVING `departmentid` IN (". buildIN($departmentidlist) .");");
---
>               $dbCore->query("SELECT `departmentid`, `ticketstatusid`, COUNT(*) AS totalitems, MAX(`lastactivity`) AS lastactivity FROM `". TABLE_PREFIX ."tickets` WHERE `departmentid` IN (". buildIN($departmentidlist) .") GROUP BY `departmentid`, `ticketstatusid`;");

Last edited by nickg; 16-06-2008 at 06:55 PM.
   
Reply With Quote
  (#3) Old
nickg Offline
New Member
 
Posts: 13
Join Date: Sep 2006
16-06-2008, 11:08 PM

Within the ticket section, the action ACTION_MERGE is duplicated in the definitions. See as follows:

Code:
fgrep -n ACTION_MERGE modules/tickets/tickets.php 
36:define("ACTION_MERGE", 5);
58:define("ACTION_MERGE", 27);
One should remove row 58:

Code:
 diff modules/tickets/tickets.php modules/tickets/tickets.php.orig                        
57a58
> define("ACTION_MERGE", 27);
   
Reply With Quote
  (#4) Old
Jamie Edwards Offline
Operations Manager
 
Jamie Edwards's Avatar
 
Posts: 5,419
Join Date: Jan 2006
Location: United Kingdom
17-06-2008, 01:59 PM

Hi Nick,

Your change to the query assumes we are using indices. If we were, your query would indeed speed things up. However, indices are not used on this table.

We'll investigate the slow searching problem and if it can be duplicated, see what we can do about it.


Jamie Edwards (jamie.edwards ]at[ kayako.com)
----------------------------------------------------------------
---
  • Submit bug reports here.
  • Submit support tickets via the members area.
  • Submit sales queries either via live chat or via e-mail.
  • There is no official ETA on Version 4.
  • This is not an official support forum - submit a support ticket.
   
Reply With Quote
  (#5) Old
nickg Offline
New Member
 
Posts: 13
Join Date: Sep 2006
18-06-2008, 04:23 AM

Quote:
Originally Posted by nickg View Post
I believe I found, and fixed the bad query:


Code:
diff modules/tickets/functions_ticketcore.php.orig modules/tickets/functions_ticketcore.php
878c878
<               $dbCore->query("SELECT `departmentid`, `ticketstatusid`, COUNT(*) AS totalitems, MAX(`lastactivity`) AS lastactivity FROM `". TABLE_PREFIX ."tickets` GROUP BY `departmentid`, `ticketstatusid` HAVING `departmentid` IN (". buildIN($departmentidlist) .");");
---
>               $dbCore->query("SELECT `departmentid`, `ticketstatusid`, COUNT(*) AS totalitems, MAX(`lastactivity`) AS lastactivity FROM `". TABLE_PREFIX ."tickets` WHERE `departmentid` IN (". buildIN($departmentidlist) .") GROUP BY `departmentid`, `ticketstatusid`;");

This query will use an index. I currently have the following indices on swtickets:

KEY `ticketcount` (`departmentid`,`ticketstatusid`,`ownerstaffid`),
KEY `tickets3` (`departmentid`,`ticketstatusid`,`lastactivity`),
KEY `tickets5` (`departmentid`,`ticketstatusid`,`userid`),
KEY `tickets6` (`departmentid`,`ticketstatusid`,`duetime`),
KEY `tickets8` (`departmentid`,`ticketstatusid`,`lastuserreplytim e`),


Any of them could be used to serve the departmentid and ticketstatusid portion of this query. However, MySQL will choose to use the `tickets3` index because every column needed to satisfy this query is in that index. The query would never hit the data file. Did I add these myself? They are very useful to speed up the pages. Trying to rebuild the data on menu tree for tickets without this index would be impossible with hundreds of thousands of tickets. (As it was impossible when I converted the data from MyISAM to InnoDB)
   
Reply With Quote
  (#6) Old
nickg Offline
New Member
 
Posts: 13
Join Date: Sep 2006
18-06-2008, 04:26 AM

I think you should also implement the following change:

Code:
diff modules/tickets/functions_search.php.orig modules/tickets/functions_search.php
123c123
<               $dbCore->query("REPLACE INTO `". TABLE_PREFIX ."ticketpostindex` (`ticketwordid`, `ticketpostid`, `insubject`) VALUES ".implode(",",$postindexsql).";");
---
>               $dbCore->query("INSERT IGNORE INTO `". TABLE_PREFIX ."ticketpostindex` (`ticketwordid`, `ticketpostid`, `insubject`) VALUES ".implode(",",$postindexsql).";");

There are only 3 columns in this table; ticketwordid, ticketpostid, and insubject. All 3 of these columns are also in a UNIQUE KEY constraint. A REPLACE INTO will delete every row from the table that matches the query, then reinsert them. All values needed for an INSERT are being supplied by the application, thereforth doing the DELETE first has no purpose and wastes disk IO.
   
Reply With Quote
  (#7) Old
nickg Offline
New Member
 
Posts: 13
Join Date: Sep 2006
18-06-2008, 04:31 AM

I'm also running into speed issues when you search for a Creator/Replier on the quick search. Currently the query does a %% search, which requires each row to be read off the disk. Again, slow if you have large amounts of data.

Code:
# Time: 080617 15:44:53
# User@Host: kayako[kayako] @ localhost []
# Query_time: 21  Lock_time: 0  Rows_sent: 23  Rows_examined: 446207
SELECT `ticketid` FROM `swtickets` WHERE email LIKE '%user@example.com%' ORDER BY `lastactivity` DESC LIMIT 50;
# Time: 080617 15:45:13
# User@Host: kayako[kayako] @ localhost []
# Query_time: 20  Lock_time: 0  Rows_sent: 30  Rows_examined: 255180
SELECT `ticketid` FROM `swticketposts` WHERE email LIKE '%user@example.com%' LIMIT 50;

Which takes just as long.

I have the following index on swtickets that could be used if there wasn't a leading %:

KEY `tickets4` (`email`,`subject`),

As for swticketposts, I have the following index which could be used if there wasn't a leading %:

KEY `ticketposts2` (`email`,`subjecthash`),



EDIT:
Eh, I striped out the leading % but then I run into the issue where the system tries to do a look up on fullname:

SELECT `ticketid` FROM `swtickets` WHERE fullname LIKE '%user@example.com%' LIMIT 50

Last edited by nickg; 18-06-2008 at 04:43 AM.
   
Reply With Quote
  (#8) Old
John Haugeland Offline
Developer
 
John Haugeland's Avatar
 
Posts: 619
Join Date: Dec 2007
Location: Idaho
Unhappy Not really. - 18-06-2008, 10:22 PM

Quote:
Originally Posted by nickg View Post
due to poor database design and query choices.
A number of these criticisms show a lack of understanding of the MySQL optimization engine.





Quote:
Originally Posted by nickg View Post
The biggest problem I am currently running into is searching. When one searches, the query that uses the ticketposts table does not use an index.
There are four indices on that table, sir, and they're in use in that query. I'm not entirely sure why you believe otherwise. You can see the indices on that table in /modules/tickets/setup.php starting on line 146, and you can see which of them are in use in that query by running an EXPLAIN on the query.




Quote:
Originally Posted by nickg View Post
This causes the entire ticketposts table to be read off the hard drive, 1GB, every time someone does a search is not very optimal.
Even if the table were unindexed, MySQL would not read the entire table. It's not necessary with BTREE databases, which already have a partial sort in their datastructure. MySQL won't read any columns not involved in the query, furthermore.




Quote:
Originally Posted by nickg View Post
There's a few ways around this, the easiest would be to create an index.
If your database is showing no indices on that table, then your database is damaged. There should be four indices on that table, sir.



Quote:
Originally Posted by nickg View Post
but also move the contents field to it's own table. This way when ever the ticketposts table is manipulated, it is not extracted from the hard disk unless it is absolutely needed.
Tables in a given database are typically stored in the same file, sir. Moving certain columns to seperate tables would actually slow things down significantly, because not only would this have zero effect on the work already being done, it would also introduce another relational scan pass while the database tried to figure out which blob belonged to which row (which costs o(lg n), which on a large ticket database is quite expensive).

I recommend you attempt this modification on a test database that you've filled with lots of data, then profile the difference.

Besides, data from all varbinary descended types (the blobs, the texts and the binaries) is not stored in the table, unless you explicitly request a prefix, which you'll note we have not, by looking at line 142 of setup.php.

To quote MySQL :: MySQL 5.0 Reference Manual :: 10.4.3 The BLOB and TEXT Types :
"Each BLOB or TEXT value is represented internally by a separately allocated object. This is in contrast to all other data types, for which storage is allocated once per column when the table is opened."





Quote:
Originally Posted by nickg View Post
Another query I am having scalability issues with is the following query:
...
I've yet to locate it's origin, however because the departmentID logic is placed into the HAVING clause, MySQL cannot use an index,
Code:
mysql> explain SELECT `departmentid`, `ticketstatusid`, COUNT(*) AS totalitems, MAX(`lastactivity`) AS lastactivity FROM `swtickets` GROUP BY `departmentid`, `ticketstatusid` HAVING `departmentid` IN ('3');

+----+-------------+-----------+-------+---------------+----------+---------+------+------+-------------+
| id | select_type | table     | type  | possible_keys | key      | key_len | ref  | rows | Extra       |
+----+-------------+-----------+-------+---------------+----------+---------+------+------+-------------+
|  1 | SIMPLE      | swtickets | index | NULL          | tickets3 | 12      | NULL |    5 | Using index |
+----+-------------+-----------+-------+---------------+----------+---------+------+------+-------------+
As you can see, sir, MySQL has no difficulty using an index in that query.





Quote:
Originally Posted by nickg View Post
444,000 rows, or 111MB. This is an easy fix, move the HAVING clause into the WHERE clause when possible.
If that were in a where clause instead of a having clause, it would not affect the group by. Making that change would introduce a significant bug that, under some configurations, would lose tickets. Also, if you profile it, you'll notice that it's actually a significant speed loss.




Quote:
Originally Posted by nickg View Post
I believe I found, and fixed the bad query:
That actually slows down a standard helpdesk, and introduces a ticket loss bug under some configurations.





Quote:
Originally Posted by nickg View Post
Within the ticket section, the action ACTION_MERGE is duplicated in the definitions. See as follows:

Code:
fgrep -n ACTION_MERGE modules/tickets/tickets.php 
36:define("ACTION_MERGE", 5);
58:define("ACTION_MERGE", 27);
One should remove row 58:
You're correct, that's an error. That said, it's the second one that should be removed, not the first. If you change that value from 5 to 27, you risk losing the ability to correctly interpret data that's in the database.

PHP Code:
C:\swift3>php
<?php
define
("foo",5);
define("foo",10);
echo 
foo;
?>
^Z
PHP Notice:  Constant foo already defined in C:\swift3\- on line 3
5
Quote:
Originally Posted by nickg View Post
This query will use an index. I currently have the following indices on swtickets:

KEY `ticketcount` (`departmentid`,`ticketstatusid`,`ownerstaffid`),
KEY `tickets3` (`departmentid`,`ticketstatusid`,`lastactivity`),
KEY `tickets5` (`departmentid`,`ticketstatusid`,`userid`),
KEY `tickets6` (`departmentid`,`ticketstatusid`,`duetime`),
KEY `tickets8` (`departmentid`,`ticketstatusid`,`lastuserreplytim e`),
Your desk appears to be severely corrupt. There are ten indices on that table, not five. Please see /modules/tickets/setup.php starting on line 112.





Quote:
Originally Posted by nickg View Post
Did I add these myself?
No. However, if you've been altering your database, it looks like you've done some fairly significant damage.




Quote:
Originally Posted by nickg View Post
Trying to rebuild the data on menu tree for tickets without this index would be impossible with hundreds of thousands of tickets. (As it was impossible when I converted the data from MyISAM to InnoDB)
Well, there's a big part of your problem right there: many (though not all) of the indices you're asking for are fulltext indices, and InnoDB doesn't support fulltext indices. That change alone accounts for three of the missing five indices on your tickets table.




Quote:
Originally Posted by nickg View Post
I think you should also implement the following change:

Code:
diff modules/tickets/functions_search.php.orig modules/tickets/functions_search.php
123c123
<               $dbCore->query("REPLACE INTO `". TABLE_PREFIX ."ticketpostindex` (`ticketwordid`, `ticketpostid`, `insubject`) VALUES ".implode(",",$postindexsql).";");
---
>               $dbCore->query("INSERT IGNORE INTO `". TABLE_PREFIX ."ticketpostindex` (`ticketwordid`, `ticketpostid`, `insubject`) VALUES ".implode(",",$postindexsql).";");
Insert ignore and replace into are not exchangeable. That change would cause massive, unacceptable data loss.

Code:
mysql> create table foo(id integer key, other integer);
Query OK, 0 rows affected (0.03 sec)

mysql> insert into foo values(1,1);
Query OK, 1 row affected (0.00 sec)

mysql> insert ignore into foo values(1,2);
Query OK, 0 rows affected (0.00 sec)

mysql> select * from foo;
+----+-------+
| id | other |
+----+-------+
|  1 |     1 |
+----+-------+
1 row in set (0.00 sec)

mysql> replace into foo values(1,2);
Query OK, 2 rows affected (0.00 sec)

mysql> select * from foo;
+----+-------+
| id | other |
+----+-------+
|  1 |     2 |
+----+-------+
1 row in set (0.00 sec)
Notice how insert ignore doesn't change things, and replace into does. The replacement query you have provided is completely incorrect.





Quote:
Originally Posted by nickg View Post
There are only 3 columns in this table; ticketwordid, ticketpostid, and insubject. All 3 of these columns are also in a UNIQUE KEY constraint. A REPLACE INTO will delete every row from the table that matches the query, then reinsert them.
Nothing is ever deleted in MySQL. That's why if you drop a ton of rows, the files won't shrink until you optimize table. They're just marked deleted, then overwritten. There pretty much isn't a database on earth which deletes then inserts. I mean, Paradox 2 for Dos 3.1 even knew about dirty records. That's one of the first optimizations made to a new database engine.

Insert ignore is significantly slower than replace into. Replace into can just add the data to known-usable space and update the index. Insert ignore has to go check whether that data exists first, then remove it from the transaction set.

Not only is the replacement query incorrect, it's another massive speed loss. You really should start profiling these things.





Quote:
Originally Posted by nickg View Post
I'm also running into speed issues when you search for a Creator/Replier on the quick search.
Yeah, because you removed the key tickets1.




Quote:
Originally Posted by nickg View Post
Currently the query does a %% search, which requires each row to be read off the disk.
That's not changeable. That search is not bound to the beginning of the string. Removing that percent would cause certain customers extreme distress, and would not have a significant speed impact.





Quote:
Originally Posted by nickg View Post
Again, slow if you have large amounts of data.
The official desk has several terabytes of data in it, and we get a sender search in about a second and a half on a normal commodity blade with a standard raid 1+0 disk setup. It is not unlikely that the damage done to your desk has also corrupted your master BTREE index; consider doing some mysql repair work.

MySQL :: MySQL 5.0 Reference Manual :: 6.4.3 How to Repair Tables
MySQL :: MySQL 5.0 Reference Manual :: 12.5.2.6 REPAIR TABLE Syntax


John Haugeland (john.haugeland ]at[ kayako.com)
----------------------------------------------------------------
---

Last edited by John Haugeland; 18-06-2008 at 10:57 PM.
   
Reply With Quote
  (#9) Old
nickg Offline
New Member
 
Posts: 13
Join Date: Sep 2006
19-06-2008, 07:57 PM

Quote:
Originally Posted by John Haugeland View Post
A number of these criticisms show a lack of understanding of the MySQL optimization engine.

Wow, actually I am a recognized MySQL Expert by MySQL AB. As you will see when I point your the issues below. I have been working directly with MySQL for 10 years an MySQL AB employees for at least 6 years.




Quote:
There are four indices on that table, sir, and they're in use in that query. I'm not entirely sure why you believe otherwise. You can see the indices on that table in /modules/tickets/setup.php starting on line 146, and you can see which of them are in use in that query by running an EXPLAIN on the query.
Although those indexes are being used, they are not being used to the fullest. The HAVING clause is processed after the GROUP BY clause. By placing DepartmentID = X in the HAVING clause, MySQL will fetch every row off of the disk/index, regardless of the DepartmentID, perform the GROUP BY, then narrow down the data based on the HAVING clause. If you move the DepartmentID = X to the WHERE clause, MySQL will use the index to exclude the rows that do not match the DepartmentID, and it will not GROUP BY together. This will save large amounts of disk/ram cycles by not fetching rows, it will also save on RAM/CPU by not having to parse all of this extra data. Yes, the index is used to obtain the data and provide to the GROUP BY, but it is not used for the HAVING clause. Reversing makes an inefficient query, efficient. Simply using an index does not justify a query as being efficient.




Quote:
Even if the table were unindexed, MySQL would not read the entire table. It's not necessary with BTREE databases, which already have a partial sort in their datastructure. MySQL won't read any columns not involved in the query, furthermore.
The indexes are stored in a BTREE by default in all major disk based table handlers (MyISAM, InnoDB). If the query does not use the index, MyISAM will have to work its way through the entire data file, manually comparing the rows from the data side of things, which will take a large amount of seeks and reads, since they are not stored in a BTREE by each column.




Quote:
If your database is showing no indices on that table, then your database is damaged. There should be four indices on that table, sir.
I have made many alterations to the structure and code (and have submitted patches to Kayako through forums/tickets). I unfortunately do not keep a patch on the database side, so I do not recall which indexes I have added myself to improve the functionality.



Quote:
Tables in a given database are typically stored in the same file, sir. Moving certain columns to seperate tables would actually slow things down significantly, because not only would this have zero effect on the work already being done, it would also introduce another relational scan pass while the database tried to figure out which blob belonged to which row (which costs o(lg n), which on a large ticket database is quite expensive).
Actually, that is incorrect as well. In the MyISAM table handler, the data is separated to three separate files; a .frm file for the format of the table, a .MYD for the data and a .MYI for the index. MySQL :: MySQL 5.0 Reference Manual :: 13.1 The MyISAM Storage Engine.

If you use the InnoDB table handler, there are two different methods: per-table tablespace and raw devices. MySQL :: MySQL 5.0 Reference Manual :: 13.2.3 InnoDB Configuration. Both store the format in a .frm. The first stores all content in ibdata files, the second stores the index and data for each table in an .ibd file per table.


Separating large TEXT based columns into their own table is a practice used by many around the world. I have used it for years on applications I have that contain millions upon millions of rows. LiveJournal, one of the largest open source applications out there, and a frequent speaker/presenter at MySQL User Conferences specifically does this for their blogs. I doubt any ticket system ever used by Kayako will get close to the level of data stored by LiveJournal.

Quote:
I recommend you attempt this modification on a test database that you've filled with lots of data, then profile the difference.
I am currently running these modifications on my highly active support system. I have 45 active staff members who responded to tickets yesterday, with over a thousand responses a day from staff / clients. Every modification I make causes queries that take large amounts of time to disappear from my slow log and my current show processlist when I view pages.


Quote:
Besides, data from all varbinary descended types (the blobs, the texts and the binaries) is not stored in the table, unless you explicitly request a prefix, which you'll note we have not, by looking at line 142 of setup.php.

To quote MySQL :: MySQL 5.0 Reference Manual :: 10.4.3 The BLOB and TEXT Types :
"Each BLOB or TEXT value is represented internally by a separately allocated object. This is in contrast to all other data types, for which storage is allocated once per column when the table is opened."
Is is stored in a separate object, however in MyISAM, any time the column is pulled off of the disk the entire BLOB/TEXT is pulled off of the disk. This is not the case with InnoDB.




Quote:
Code:
mysql> explain SELECT `departmentid`, `ticketstatusid`, COUNT(*) AS totalitems, MAX(`lastactivity`) AS lastactivity FROM `swtickets` GROUP BY `departmentid`, `ticketstatusid` HAVING `departmentid` IN ('3');

+----+-------------+-----------+-------+---------------+----------+---------+------+------+-------------+
| id | select_type | table     | type  | possible_keys | key      | key_len | ref  | rows | Extra       |
+----+-------------+-----------+-------+---------------+----------+---------+------+------+-------------+
|  1 | SIMPLE      | swtickets | index | NULL          | tickets3 | 12      | NULL |    5 | Using index |
+----+-------------+-----------+-------+---------------+----------+---------+------+------+-------------+
As you can see, sir, MySQL has no difficulty using an index in that query.




If that were in a where clause instead of a having clause, it would not affect the group by. Making that change would introduce a significant bug that, under some configurations, would lose tickets. Also, if you profile it, you'll notice that it's actually a significant speed loss.




That actually slows down a standard helpdesk, and introduces a ticket loss bug under some configurations.
I covered this above and have shown it's more efficient limiting the amount of data being provided to the GROUP BY. It should not introduce a ticket loss bug under any configurations. That's the most absurd thing I've heard.






Quote:
You're correct, that's an error. That said, it's the second one that should be removed, not the first. If you change that value from 5 to 27, you risk losing the ability to correctly interpret data that's in the database.

PHP Code:
C:\swift3>php
<?php
define
("foo",5);
define("foo",10);
echo 
foo;
?>
^Z
PHP Notice:  Constant foo already defined in C:\swift3\- on line 3
5
I specifically said remove row 58, which is the entry for 27, are you reading things carefully? And I've already checked my 3,255,433 audit logs and have found no entries for 27.


Quote:
Your desk appears to be severely corrupt. There are ten indices on that table, not five. Please see /modules/tickets/setup.php starting on line 112.



No. However, if you've been altering your database, it looks like you've done some fairly significant damage.
No damage has been done, I assure you all changes have only improved an unscalable product not designed to support medium sized businesses. I've already had to spend development time on my end to improve the speed, small functionality, create an API for my user interfaces to integrate, build in LDAP support, etc. Listen, I knew Kayako was a stop gap solution when I purchased it. I am simply providing you advice and modifications that I think your customer base and company will benefit from.




Quote:
Well, there's a big part of your problem right there: many (though not all) of the indices you're asking for are fulltext indices, and InnoDB doesn't support fulltext indices. That change alone accounts for three of the missing five indices on your tickets table.

Um, there isn't a single query currently in the ticketing module that uses FULLTEXT indexing. Both tickets and ticketposts do not use FULLTEXT() as an index type, nor is there a query that uses MATCH ... AGAINST() which is the only way to do a full text search in the modules/tickets/ directory.




Quote:
Insert ignore and replace into are not exchangeable. That change would cause massive, unacceptable data loss.

Code:
mysql> create table foo(id integer key, other integer);
Query OK, 0 rows affected (0.03 sec)

mysql> insert into foo values(1,1);
Query OK, 1 row affected (0.00 sec)

mysql> insert ignore into foo values(1,2);
Query OK, 0 rows affected (0.00 sec)

mysql> select * from foo;
+----+-------+
| id | other |
+----+-------+
|  1 |     1 |
+----+-------+
1 row in set (0.00 sec)

mysql> replace into foo values(1,2);
Query OK, 2 rows affected (0.00 sec)

mysql> select * from foo;
+----+-------+
| id | other |
+----+-------+
|  1 |     2 |
+----+-------+
1 row in set (0.00 sec)
Notice how insert ignore doesn't change things, and replace into does. The replacement query you have provided is completely incorrect.
I know they are not interchangeable in all situations, as I explained the table has all columns in the UNIQUE index which means in this case both will produce the same exact results. Your simple example does not contain this.




Quote:
Nothing is ever deleted in MySQL. That's why if you drop a ton of rows, the files won't shrink until you optimize table. They're just marked deleted, then overwritten. There pretty much isn't a database on earth which deletes then inserts. I mean, Paradox 2 for Dos 3.1 even knew about dirty records. That's one of the first optimizations made to a new database engine.

Insert ignore is significantly slower than replace into. Replace into can just add the data to known-usable space and update the index. Insert ignore has to go check whether that data exists first, then remove it from the transaction set.

Not only is the replacement query incorrect, it's another massive speed loss. You really should start profiling these things.
Yes, I am well of aware of this, however, when you perform a REPLACE INTO MySQL will mark the item as deleted, then it will have to go back and rewrite the data. It will not mark as Undeleted. The documentation clearly states REPLACE INTO maps to a DELETE and an INSERT.










Quote:
The official desk has several terabytes of data in it, and we get a sender search in about a second and a half on a normal commodity blade with a standard raid 1+0 disk setup. It is not unlikely that the damage done to your desk has also corrupted your master BTREE index; consider doing some mysql repair work.
If you do a LIKE '%SOMETHING%' an index will not be used. Plain and simple. And I've actually just had to convert some tables from MyISAM to InnoDB due to the inefficiencies of the application and having queries that run for multiple minutes. When an ALTER TABLE is done the entire table is rebuilt from scratch.

Last edited by nickg; 19-06-2008 at 10:15 PM.
   
Reply With Quote
  (#10) Old
nickg Offline
New Member
 
Posts: 13
Join Date: Sep 2006
19-06-2008, 08:19 PM

I thought of something additional to prove why moving a HAVING clause to the WHERE clause is more efficient.

Please analyze the two following EXPLAIN syntaxes carefully:

Code:
mysql> explain SELECT `departmentid`, `ticketstatusid`, COUNT(*) AS totalitems, MAX(`lastactivity`) AS lastactivity FROM `swtickets` GROUP BY `departmentid`, `ticketstatusid` HAVING `departmentid` IN ('3');
+----+-------------+-----------+-------+---------------+-------------+---------+------+--------+-------+
| id | select_type | table     | type  | possible_keys | key         | key_len | ref  | rows   | Extra |
+----+-------------+-----------+-------+---------------+-------------+---------+------+--------+-------+
|  1 | SIMPLE      | swtickets | index | NULL          | ticketcount | 12      | NULL | 454787 |       | 
+----+-------------+-----------+-------+---------------+-------------+---------+------+--------+-------+
1 row in set (0.00 sec)


mysql>  SELECT `departmentid`, `ticketstatusid`, COUNT(*) AS totalitems, MAX(`lastactivity`) AS lastactivity FROM `swtickets` GROUP BY `departmentid`, `ticketstatusid` HAVING `departmentid` IN ('3');       
+--------------+----------------+------------+--------------+
| departmentid | ticketstatusid | totalitems | lastactivity |
+--------------+----------------+------------+--------------+
|            3 |              0 |        159 |   1200986822 | 
|            3 |              1 |          4 |   1213928185 | 
|            3 |              2 |        123 |   1213928079 | 
|            3 |              3 |      30123 |   1213927070 | 
|            3 |              4 |       9931 |   1213904199 | 
|            3 |              5 |       2422 |   1213908731 | 
|            3 |              6 |          1 |   1213922581 | 
+--------------+----------------+------------+--------------+
7 rows in set (17.13 sec)

Code:
mysql> explain SELECT `departmentid`, `ticketstatusid`, COUNT(*) AS totalitems, MAX(`lastactivity`) AS lastactivity FROM `swtickets` WHERE DepartmentID IN ('3') GROUP BY `departmentid`, `ticketstatusid` ;
+----+-------------+-----------+------+-------------------------------------------------+----------+---------+-------+--------+-----------------------------------------------------------+
| id | select_type | table     | type | possible_keys                                   | key      | key_len | ref   | rows   | Extra                                                     |
+----+-------------+-----------+------+-------------------------------------------------+----------+---------+-------+--------+-----------------------------------------------------------+
|  1 | SIMPLE      | swtickets | ref  | ticketcount,tickets3,tickets5,tickets6,tickets8 | tickets3 | 4       | const | 119210 | Using where; Using index; Using temporary; Using filesort | 
+----+-------------+-----------+------+-------------------------------------------------+----------+---------+-------+--------+-----------------------------------------------------------+

mysql> SELECT `departmentid`, `ticketstatusid`, COUNT(*) AS totalitems, MAX(`lastactivity`) AS lastactivity FROM `swtickets` WHERE DepartmentID IN ('3') GROUP BY `departmentid`, `ticketstatusid` ;        
+--------------+----------------+------------+--------------+
| departmentid | ticketstatusid | totalitems | lastactivity |
+--------------+----------------+------------+--------------+
|            3 |              0 |        159 |   1200986822 | 
|            3 |              1 |          4 |   1213928185 | 
|            3 |              2 |        123 |   1213928079 | 
|            3 |              3 |      30123 |   1213927070 | 
|            3 |              4 |       9931 |   1213904199 | 
|            3 |              5 |       2422 |   1213908731 | 
|            3 |              6 |          1 |   1213922581 | 
+--------------+----------------+------------+--------------+
7 rows in set (0.08 sec)
You will notice how the EXPLAIN result states it will only analyze 119210 rows, versus 454787. You will also notice all of the Extra's which did not exist before. They can be described on the following page: MySQL :: MySQL 5.0 Reference Manual :: 7.2.1 Optimizing Queries with EXPLAIN
   
Reply With Quote
  (#11) Old
nickg Offline
New Member
 
Posts: 13
Join Date: Sep 2006
15-07-2008, 11:26 PM

I forgot about this after my fixes solved most of the problems in Kayako. Have you had any chance to review the comments and posts?
   
Reply With Quote
  (#12) Old
Dewak Offline
Member
 
Dewak's Avatar
 
Posts: 148
Join Date: Feb 2008
16-07-2008, 04:15 AM

Hi nick,

I'm not sure but I assume, due to the size of the database, that you are currently storing all attachments on the database, am I correct?

I recommend you to store all attachments with the filesystem method, and migrate all current database attachments using Craigbrass free attachment migration tool Attachment Migration Tool [FREE - EXTERNAL LINK].

This tool will help you release a little bit of size from the database, it's not the core solution but it will help you gain a little bit more of performance.


Andres Berdugo
Email & MSN: andres[at]dewak.com
Skype: andres.dewak
Dewak: Kayako Development, Consultancy and Support
http://www.dewak.com
   
Reply With Quote
Reply

Tags
scalability

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On



Powered by vBulletin® Version 3.7.2
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
vBulletin Skin developed by: vBStyles.com


1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46