Thursday, October 23, 2014

When not to use Thread pool in MariaDB?

When not to use Thread pool in MariaDB?
very bursty workload (long periods of inactivity mixed with short periods of high activity by many users), and also you cannot tolerate delays coming from thread creation throttling. Even in this situation, performance can be improved by tweaking the way threads are retired, e.g. thread_pool_idle_timeout on Unix, or thread_pool_min_threads on Windows.

many concurrent, long, non-yielding queries. Non-yielding here means that a thread never waits while executing a query, or does not indicate waits to threadpool. Such workloads are mostly used in data warehouse scenarios. In this case long running queries will delay execution of other queries, and the delay is fixed only with stall detection/preemption mechanism (s. description thread_pool_stall_limit). You can still connect trough the extra-port TCP/IP port.

you rely on the fact that simple queries always finish quickly, no matter how loaded you database server is. With loaded threadpool, queries might be queued for later execution, such that even a simple SELECT 1, might take more time than with thread-per-connection.

PostgresSQL Monitoring Locks and Duplicate Indexes

SELECT AS blocked_pid,
a.usename AS blocked_user,
ka.current_query AS blocking_statement,
now() - ka.query_start AS blocking_duration, AS blocking_pid,
ka.usename AS blocking_user,
a.current_query AS blocked_statement,
now() - a.query_start AS blocked_duration
FROM pg_catalog.pg_locks bl
JOIN pg_catalog.pg_stat_activity a ON a.procpid =
JOIN pg_catalog.pg_locks kl ON kl.transactionid = bl.transactionid AND !=
JOIN pg_catalog.pg_stat_activity ka ON ka.procpid =
WHERE NOT bl.granted;

Setting log_lock_waits and related deadlock_timeout parameters help to even analyze intermittent locks that are causing problems only sometimes. The slow lock acquisition will appear in the database logs for later analysis.

PostgreSQL Duplicate Indexes

Duplicate Indexes are serious pain in PostgreSQL Infrastruture. It is always cool to 
drop them once found.... The script below helps you to find duplicate indexes in PostgreSQL

SELECT pg_size_pretty(sum(pg_relation_size(idx))::bigint) AS size,
       (array_agg(idx))[1] AS idx1, (array_agg(idx))[2] AS idx2,
       (array_agg(idx))[3] AS idx3, (array_agg(idx))[4] AS idx4
    SELECT indexrelid::regclass AS idx, (indrelid::text ||E'\n'|| indclass::text ||E'\n'|| indkey::text ||E'\n'||
                                         coalesce(indexprs::text,'')||E'\n' || coalesce(indpred::text,'')) AS KEY
    FROM pg_index) sub
ORDER BY sum(pg_relation_size(idx)) DESC;

MySQL - Lock wait timeout exceeded - try restarting transaction

MySQL - Lock wait timeout exceeded - try restarting transaction

Make sure the database tables are using InnoDB storage engine and READ-COMMITTED transaction isolation level.
If the the above configuration is correct then please try to increase the database server innodb_lock_wait_timeout variable 
Restart the MySQL database service for the configuration to take place.
If the steps above don't help, please run these queries below to double-check the configuration:
show variables like '%wait_timeout%';
show variables like '%tx_isolation%';
SELECT @@GLOBAL.tx_isolation,

Troubleshooting blocking in MariaDB

Even with thread_pool_max_threads set high (recall the default for it is 500), if global locks are involved it is possible to block the entire pool. Imagine a situation where one client performs a FLUSH TABLES WITH READ LOCK and then pauses. If 500 other clients start a write operation, the maximum number of threads allowed in the pool is reached and then it is impossible to UNLOCK TABLES anymore. One way to solve the problem would be to set thread_pool_max_threads to a very high value, but this might be less than ideal, and negatively affect performance.

To workaround such a situation, MariaDB permits you to use a dedicated admin connection. To allow a dedicated admin connection, set the extra_port server variable to the TCP port for this connection (this must be different from the usual port), and connect using this port as a superuser. Once connected, you can either increase thread_pool_max_threads or kill the offending connection – in the example above that would be the connection that holds the global lock (such a connection would be in the 'sleep' state).

PostgreSQL Performance Tuning params

max_connections =  This option sets the maximum number of database backend to have at any one time. Use this feature to ensure that you do not launch so many backends that you begin swapping to disk and kill the performance of all the children. Depending on your application it may be better to deny the connection entirely rather than degrade the performance of all of the other children.

shared_buffers =  Editing this option is the simplest way to improve the performance of your database server. The default is pretty low for most modern hardware. General wisdom says that this should be set to roughly 25% of available RAM on the system. Like most of the options I will outline here you will simply need to try them at different levels (both up and down ) and see how well it works on your particular system. Most people find that setting it larger than a third starts to degrade performance.

effective_cache_size =  This value tells PostgreSQL's optimizer how much memory PostgreSQL has available for caching data and helps in determing whether or not it use an index or not. The larger the value increases the likely hood of using an index. This should be set to the amount of memory allocated to shared_buffers plus the amount of OS cache available. Often this is more than 50% of the total system memory.

work_mem =  This option is used to control the amount of memory using in sort operations and hash tables. While you may need to increase the amount of memory if you do a ton of sorting in your application, care needs to be taken. This isn't a system wide parameter, but a per operation one. So if a complex query has several sort operations in it it will use multiple work_mem units of memory. Not to mention that multiple backends could be doing this at once. This query can often lead your database server to swap if the value is too large. This option was previously called sort_mem in older versions of PostgreSQL.

max_fsm_pages = This option helps to control the free space map. When something is deleted from a table it isn't removed from the disk immediately, it is simply marked as "free" in the free space map. The space can then be reused for any new INSERTs that you do on the table. If your setup has a high rate of DELETEs and INSERTs it may be necessary increase this value to avoid table bloat.
fsync = This option determines if all your WAL pages are fsync()'ed to disk before a transactions is committed. Having this on is safer, but can reduce write performance. If fsync is not enabled there is the chance of unrecoverable data corruption. Turn this off at your own risk.

commit_delay  and commit_siblings = These options are used in concert to help improve performance by writing out multiple transactions that are committing at once. If there are commit_siblings number of backends active at the instant your transaction is committing then the server waiting commit_delay microseconds to try and commit multiple transactions at once.

random_page_cost = random_page_cost controls the way PostgreSQL views non-sequential disk reads. A higher value makes it more likely that a sequential scan will be used over an index scan indicating that your server has very fast disks.
Increase, in postgresql.conf, the value of shared_buffers.
Increase, in postgresql.conf, the value of effective_cache_size.
Run VACUUM on database tables frequently.
Run ANALYZE on the database periodically.
Don't run VACUUM FULL too often.
Move the WAL files to a separate physical disk.
Increase, in postgresql.conf, the value of sort_mem.
Reduce, in postgresql.conf, the value of random_page_cost.
Increase, in postgresql.conf, the value of max_fsm_pages and max_fsm_relations