Infobright Community Edition(ICE) – It’s Free

The March NY MySQL Meetup featured a presentation from Infobright , a data warehousing solution built on the MySQL Product.

With a pitch of “Simplicity, Scalability and low TCO” I became more impressed with the capability to delivery on these as the presentation proceeded. Here are some highlights.

  • The company and product has been around for a few years. Infobright started as a compression engine to sit beside Teradata, providing a significant cost saving to clients, and allowing a two way data transfer between Teradata.
  • In September 2008, a open source community edition was released, called ICE. (Which I didn’t know)
  • The technology is based on a Rough Set theory, a mathematical approach
  • Using a column oriented approach, compression generally starts at 10:1, different applications can get 30:1 or better
  • There is basically no tuning, there are no indexes. Knowledge is gleaned at data loading and each data pack node holds key information per column, such as range of values (min,max).
  • Some interesting results are, there is a constant load time, it doesn’t degrade over time as the size of your data increases. Also, Query performance scales with data volume.
  • Depending on queries, the knowledge grid can retrieve results without having to uncompress the data, i.e. introspection of the meta data is all that is needed
  • Infobright is not a pluggable storage engine, rather a custom binary of MySQL. This is due to the restrictions of the API and the lack of optimizer push down conditions for example.

The product is not without some limitations, but you have to realize the product is for a data warehousing implementation, not an OLTP web app. It’s not great with SELECT *, and large text strings for example.

Functionality continues to be added, with a recent release adding many more MySQL Functions, but again, Infobright does not claim to be a solution to everybody, there is not UDF support or SP support at this time, however I’d warrant this is really not needed.

While the presentation went into some detail regarding the knowledge grid, data packs, data pack nodes, and pack to pack integration from a slide perspective, the presentation lacked the technical here is how you use the loader to get data out of MySQL and into Infobright. Here is the throughput, etc. As a marketing presentation it had the right content, but I’d like to now see the companion technical presentation.

Having previously been part of the MySQL Consulting team, and having worked also in the Storage Engine API with the Nitro Storage engine I have a distinct advantage of knowing the complexities of integration with MySQL. We can only hope this continues to improve with future releases of MySQL enabling Infobright and other products to integrate better and keep up to date with the MySQL Release cycle.

Tagged with: Databases General Infobright MySQL Open Source Storage Engines

Producing Alternative Means statistics with SQL

MySQL’s built-in AVG() computes the arithmetic mean — the sum divided by the count. That is the right default for many questions, but it is not always the right measure of central tendency.

Extending MySQL Capabilities with UDFs, Plugins and Components

MySQL offers three different approaches to extending the SQL capabilities with the default product you download and install. These are: User Defined Function (UDF) MySQL Manual MySQL Plugin MySQL Manual MySQL Component MySQL Manual For the purposes of this post I will be using the current LTS version MySQL 8.

Producing One-Sample Z-Test statistics with SQL

The one-sample Z-test determines whether a sample mean differs significantly from a known population mean when the population standard deviation is also known. It is the appropriate test when the population parameters are established — quality control benchmarks, national averages, long-run process measurements — and you want to evaluate whether a new sample is consistent with them.