Some Drupal observations

I had the opportunity to review a client’s production Drupal installation recently. This is a new site and traffic is just starting to pick up. Drupal is a popular LAMP stack open source CMS system using the MySQL Database.

Unfortunately I don’t always have the chance to focus on one product when consulting , sometimes the time can be minutes to a few hours. Some observations from looking at Drupal.

Disk footprint

Presently, volume and content is of a low volume, but expecting to ramp up. I do however find 90% of disk volume in one table called ‘watchdog';

+--------------+--------------+--------------+-------------+--------+
| table_schema | total_mb     | data_mb      | index_mb    | tables |
+--------------+--------------+--------------+-------------+--------+
| xxxxx        | 812.95555878 | 745.34520721 | 67.61035156 |    191 |
+--------------+--------------+--------------+-------------+--------+

+-------------------------------------------+--------+------------+------------+----------------+--------------+--------------+-------------+
| table_name                                | engine | row_format | table_rows | avg_row_length | total_mb     | data_mb      | index_mb    |
+-------------------------------------------+--------+------------+------------+----------------+--------------+--------------+-------------+
| watchdog                                  | MyISAM | Dynamic    |      63058 |            210 | 636.42242813 | 607.72516251 | 28.69726563 |
| cache_menu                                | MyISAM | Dynamic    |        145 |         124892 |  25.33553696 |  25.32577133 |  0.00976563 |
| search_index                              | MyISAM | Dynamic    |     472087 |             36 |  23.40134048 |  16.30759048 |  7.09375000 |
| comments                                  | MyISAM | Dynamic    |      98272 |            208 |  21.83272934 |  19.58272934 |  2.25000000 |

Investigating the content of the ‘watchdog’ table shows detailed logging. Drilling down just on the key ‘type’ records shows the following.

mysql> select message,count(*) from watchdog where type='page not found' group by message order by 2 desc limit 10;
+--------------------------------------+----------+
| message                              | count(*) |
+--------------------------------------+----------+
| content/images/loadingAnimation.gif  |    17198 |
| see/images/loadingAnimation.gif      |     6659 |
| images/loadingAnimation.gif          |     6068 |
| node/images/loadingAnimation.gif     |     2774 |
| favicon.ico                          |     1772 |
| sites/all/modules/coppa/coppa.js     |      564 |
| users/images/loadingAnimation.gif    |      365 |
| syndicate/google-analytics.com/ga.js |      295 |
| content/img_pos_funny_lowsrc.gif     |      230 |
| content/google-analytics.com/ga.js   |      208 |
+--------------------------------------+----------+
10 rows in set (2.42 sec)

Some 25% of rows is just the reporting one missing file. Correcting this one file cuts down a pile of unnecessary logging.

Repeating Queries

Looking at just 1 random second of SQL logging shows 1200+ SELECT statements.
355 are SELECT changed FROM node

$ grep would_you_rather drupal.1second.log
              7 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              5 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              3 Query       SELECT field_image_textarea_value AS value FROM content_type_would_you_rather WHERE vid = 24303 LIMIT 0, 1
              4 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              6 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
             10 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              9 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              8 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              9 Query       SELECT field_image_textarea_value AS value FROM content_type_would_you_rather WHERE vid = 24303 LIMIT 0, 1

There is plenty of information regarding monitoring the Slow Queries in MySQL, but I have also promoted that’s it not the slow queries that ultimately slow a system down, but the 1000’s of repeating fast queries.

MySQL of course has the Query Cache to assist, but this is a course grade solution, and a high volume read/write environment this is meaningless.

There is a clear need for either a application level caching, or a database redesign to pull rather then poll this information, however without more in depth review of Drupal I can not make any judgment calls.

Tagged with: Databases General MySQL Open Source Web Web Development

Related Posts

More CPUs or Newer CPUs

In a CPU-bound database workload, regardless of price, would you scale-up or scale-new? What if price was the driving factor, would you scale-up or scale-new? I am using as a baseline the first available AWS Graviton2 processor for RDS (r6g).

Read more

An Interesting Artifact with AWS RDS Aurora Storage

As part of using public datasets with my own Benchmarking Suite I wanted upsize a dataset for larger volume testing. I have always used the INFORMATION_SCHEMA.TABLES data_length and index_length columns as a sufficiently accurate measurement for actual disk space used.

Read more

How long does it take the ReadySet cache to warm up?

During my setup of benchmarking I run a quick test-sysbench script to ensure my configuration is right before running an hour+ duration test. When pointing to a Readyset cache where I have cached the 5 queries used in the sysbench test, but I have not run any execution of the SQL, throughput went up 10x in 5 seconds.

Read more