Open Source Database Schemas

I am seeking the help of the community. I am working on an evaluation project about schema design in open source applications. While I can download and install many popular applications and then run mysqldump myself, that takes time. Quite often there is no simple schema.sql file, only an installation process that creates the schema. If you are using an open source project, would you take a moment to run the following?

$ mysqldump -u[user] -p --skip-lock-tables --no-data --databases [schema] > [schema].sql

This dumps only the table definitions, so it should contain nothing company-specific. So far I have schemas for:

  • WordPress
  • Drupal
  • MediaWiki
  • osCommerce
  • Joomla
  • eZ Publish
  • PhpWiki

I am open to any project, and it doesn’t matter if the version is not the most current; what I am seeking is to understand trends. There is a huge list of possibilities, including Bugzilla, phpBB, SugarCRM, and Magento, just to name a few.

Your help would be greatly appreciated. You are welcome to add a comment, or email me at [this domain] with your results.

I am also happy to accept sanitized schemas of any projects; however, please ensure that no company-specific or proprietary information is included.
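Before sending a dump, a quick sanity check along these lines may help. This is only a sketch: the function name check_schema_dump and the .clean.sql output name are made up for illustration, and it assumes the file came from a mysqldump command like the one above. It refuses files that contain INSERT statements and strips the AUTO_INCREMENT=N counters that mysqldump keeps in the table options even with --no-data, since those hint at row counts.

```shell
#!/bin/sh
# Hypothetical helper (not part of mysqldump): sanity-check a schema dump
# before sharing it. Usage: check_schema_dump myschema.sql
check_schema_dump() {
    dump_file="$1"

    # A --no-data dump should not contain any INSERT statements.
    if grep -qi '^INSERT INTO' "$dump_file"; then
        echo "WARNING: $dump_file appears to contain row data" >&2
        return 1
    fi

    # Remove "AUTO_INCREMENT=<number>" from the table options only.
    # Column-level AUTO_INCREMENT attributes have no "=<number>" and are kept.
    clean_file="${dump_file%.sql}.clean.sql"
    sed 's/ AUTO_INCREMENT=[0-9][0-9]*//' "$dump_file" > "$clean_file"
    echo "OK: wrote $clean_file"
}
```

For example, after sourcing the function, running check_schema_dump wordpress.sql would write wordpress.clean.sql next to the original. It is still worth reading through the result by eye, as table or column names themselves can be company-specific.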

Tagged with: Databases MySQL

Producing Skewness statistics with SQL

Skewness measures the asymmetry of a distribution. A perfectly symmetric distribution has a skewness of zero. A positive skew (right-skewed) means the tail extends to the right — a small number of high values pull the mean above the median.

Exploring the vsql-ai extension

The vsql-ai extension adds AI prompt capabilities and text embeddings directly in SQL queries, with support for Anthropic Claude, Google Gemini, OpenAI ChatGPT, or a local LLM such as Ollama.

Producing Chi-Squared statistics with SQL

The Chi-Squared test is one of the most widely used statistical tests for categorical data. It comes in two flavors: the goodness-of-fit test asks whether an observed frequency distribution matches an expected one, while the test of independence asks whether two categorical variables are associated with each other.