Memcached

The NoSQL event in New York had a number of presentations on non-relational technologies, including Hadoop, MongoDB and CouchDB.

Coming from a relational background of 20 years with Ingres, Oracle and MySQL, I have been moving my focus towards non-relational data stores. The most obvious and widely used today is memcached, a non-persistent distributed key/value store. There are also a number of persistent key/value stores in the marketplace: Tokyo Cabinet, Project Voldemort and Redis, to name a few.
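To make "non-persistent distributed key/value store" concrete, here is a minimal sketch in Python. It assumes the usual memcached arrangement in which servers do not talk to each other and the client picks a server by hashing the key. The names TinyCache and ShardedClient are hypothetical, and plain dictionaries stand in for real servers, so this is an illustration rather than a real client.

```python
import hashlib


class TinyCache:
    """In-memory, non-persistent key/value node (stand-in for one cache server)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        # A miss returns None; nothing is ever written to disk.
        return self._data.get(key)

    def __len__(self):
        return len(self._data)


class ShardedClient:
    """Hypothetical client that spreads keys over several nodes by hashing."""

    def __init__(self, nodes):
        self.nodes = nodes

    def _node_for(self, key):
        # Hash the key and map it onto one node. Simple modulo hashing is an
        # assumption here; real clients often use consistent hashing instead.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % len(self.nodes)
        return self.nodes[index]

    def set(self, key, value):
        self._node_for(key).set(key, value)

    def get(self, key):
        return self._node_for(key).get(key)


nodes = [TinyCache(), TinyCache(), TinyCache()]
client = ShardedClient(nodes)
client.set("user:42", {"name": "alice"})
print(client.get("user:42"))
```

Because the same key always hashes to the same node, a later get finds the value without any coordination between nodes. If a node restarts, its share of the data is simply gone, which is what non-persistent means in practice.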

My list of data store products helps to map the complex namespace of products that now exist. One trend is towards schemaless solutions, which are better able to manage dynamically typed or formatted information. The frequent releases of an Agile methodology are simply not achievable with the statically typed table/column structure of a relational database. The impact of constant ALTER TABLE commands on a MySQL database can make your production system unusable.
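A small Python sketch of what schemaless means for a release: a new attribute appears in newly written documents, and readers supply a default for older ones, with no ALTER TABLE step. The store dictionary, the save_user and load_user helpers, and the timezone field are all hypothetical stand-ins, not part of any particular product.

```python
import json

# Stand-in for any key/value or document store (an assumption for this sketch).
store = {}


def save_user(user_id, doc):
    # Documents are serialized whole; the store knows nothing about their fields.
    store[f"user:{user_id}"] = json.dumps(doc)


def load_user(user_id):
    doc = json.loads(store[f"user:{user_id}"])
    # Documents written before the "timezone" field existed get a default on read.
    doc.setdefault("timezone", "UTC")
    return doc


# Written by an older release, before the new field was introduced.
save_user(1, {"name": "alice"})
# Written by the new release; no schema change was needed in between.
save_user(2, {"name": "bob", "timezone": "America/New_York"})

print(load_user(1))
print(load_user(2))
```

The cost moves from the database to the application: every reader has to tolerate documents of more than one shape.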

In a highly distributed online, and increasingly offline, operation, fault tolerance, data synchronization and eventual consistency are required features in complex topologies such as multi-master.

I advise and promote a technology-agnostic solution when possible. With the use of an API this is actually achievable; however, in order to use a variety of backend data store products, one must consider the design patterns for optimal management. Two factors that support a highly distributed data set are no joins and minimal transactional semantics. The Facebook API is a great example, where there are no joins against their MySQL relational backend. The movement back to a logical, non-normalized schema, or the move towards a totally schemaless solution, does require great thought about the architectural concepts of your application.
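As a rough illustration of both points, the Python sketch below hides the backend behind a small API and replaces a join with key lookups done in the application. The KeyValueStore interface, the InMemoryStore backend, the friends_of function and the key layout are all hypothetical; this is not how Facebook or any specific product implements it.

```python
from typing import Dict, List, Optional, Protocol


class KeyValueStore(Protocol):
    """The only interface the application sees; any backend can sit behind it."""

    def get(self, key: str) -> Optional[dict]: ...

    def put(self, key: str, value: dict) -> None: ...


class InMemoryStore:
    """One possible backend. A memcached, MySQL or CouchDB adapter would
    expose the same two methods (an assumption for this sketch)."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def get(self, key: str) -> Optional[dict]:
        return self._rows.get(key)

    def put(self, key: str, value: dict) -> None:
        self._rows[key] = value


def friends_of(store: KeyValueStore, user_id: str) -> List[str]:
    """Resolve a relationship with key lookups instead of a SQL join."""
    user = store.get(f"user:{user_id}")
    if user is None:
        return []
    names = []
    for friend_id in user.get("friend_ids", []):
        friend = store.get(f"user:{friend_id}")
        if friend is not None:
            names.append(friend["name"])
    return names


store = InMemoryStore()
store.put("user:1", {"name": "alice", "friend_ids": ["2", "3"]})
store.put("user:2", {"name": "bob", "friend_ids": []})
store.put("user:3", {"name": "carol", "friend_ids": ["1"]})
print(friends_of(store, "1"))
```

Since friends_of only uses get, the records can live on different servers or in different products without changing the application code.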

Ultimately, feature requirements will dictate the relative strengths and weaknesses of products. Full-text search is a good example: CouchDB provides support via Lucene. Another feature I like in CouchDB is its append-only data mode. This makes durability easy and automatic recovery after a crash a non-issue, another thing a transactional relational database cannot achieve.
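Why append-only storage makes crash recovery simple can be shown with a toy log in Python. This is a sketch of the general idea only, not CouchDB's actual file format: the AppendOnlyStore class and its one-JSON-record-per-line layout are assumptions made for illustration.

```python
import json
import os


class AppendOnlyStore:
    """Toy key/value store that only ever appends records to a log file."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        self._recover()
        self._log = open(self.path, "ab")

    def _recover(self):
        # Rebuild state by replaying the log from the start. Existing records
        # are never overwritten, so a crash can only damage the final record.
        if not os.path.exists(self.path):
            return
        good_end = 0
        with open(self.path, "rb") as log:
            for raw in log:
                if not raw.endswith(b"\n"):
                    break  # incomplete final write
                try:
                    record = json.loads(raw)
                except ValueError:
                    break  # corrupt final write
                self.data[record["key"]] = record["value"]
                good_end += len(raw)
        # Drop the damaged tail so new records start on a clean boundary.
        with open(self.path, "r+b") as log:
            log.truncate(good_end)

    def put(self, key, value):
        line = json.dumps({"key": key, "value": value}) + "\n"
        self._log.write(line.encode("utf-8"))
        self._log.flush()
        os.fsync(self._log.fileno())  # the record is durable once this returns
        self.data[key] = value

    def close(self):
        self._log.close()
```

Recovery is just "read until the first incomplete record". An update is a new record for the same key, so the latest value wins on replay; the price is a file that keeps growing until it is compacted.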

With the 2-day no:sql(east) conference this month, there is definitely greater interest in this space.

    $ memslap -s localhost
    Threads connecting to servers 1
    Took 1.633 seconds to load data

    $ memstat -s localhost
    Listing 1 Server

    Server: localhost (11211)
        pid: 23868
        uptime: 54
        time: 1244575816
        version: 1.2.2
        pointer_size: 32
        rusage_user: 0.90000
        rusage_system: 0.120000
        curr_items: 10000
        total_items: 10000
        bytes: 5430000
        curr_connections: 1
        total_connections: 3
        connection_structures: 2
        cmd_get: 0
        cmd_set: 10000
        get_hits: 0
        get_misses: 0
        evictions: 0
        bytes_read: 5430000
        bytes_written: 5430000
        limit_maxbytes: 0
        threads: 1

    $ memslap -s localhost
    Threads connecting to servers 1
    Took 0.866 seconds to load data

    $ memstat -s localhost
    Listing 1 Server

    Server: localhost (11211)
        pid: 8651
        uptime: 375
        time: 1244577237
        version: 1.4.0-rc1
        pointer_size: 32
        rusage_user: 0.110000
        rusage_system: 0.130000
        curr_items: 10000
        total_items: 10000
        bytes: 5510000
        curr_connections: 5
        total_connections: 8
        connection_structures: 6
        cmd_get: 0
        cmd_set: 10000
        get_hits: 0
        get_misses: 0
        evictions: 0
        bytes_read: 5510000
        bytes_written: 5510000
        limit_maxbytes: 0
        threads: 5
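The two runs above can be compared with a little arithmetic. The Python sketch below uses only figures from the output (load time, cmd_set and bytes); it assumes the reported load time covers exactly the 10000 sets and that both runs were otherwise comparable, which a single run of each cannot really establish.

```python
# Figures copied from the memslap/memstat output for each version.
runs = {
    "1.2.2 (threads: 1)": {"seconds": 1.633, "cmd_set": 10000, "bytes": 5430000},
    "1.4.0-rc1 (threads: 5)": {"seconds": 0.866, "cmd_set": 10000, "bytes": 5510000},
}


def sets_per_second(run):
    """Sets completed per second of load time."""
    return run["cmd_set"] / run["seconds"]


def bytes_per_item(run):
    """Average stored size per item as reported by the server."""
    return run["bytes"] / run["cmd_set"]


for name, run in runs.items():
    print(f"{name}: {sets_per_second(run):.0f} sets/s, "
          f"{bytes_per_item(run):.0f} bytes/item")

old = runs["1.2.2 (threads: 1)"]
new = runs["1.4.0-rc1 (threads: 5)"]
speedup = old["seconds"] / new["seconds"]
print(f"load time ratio: {speedup:.2f}x")
```

On these numbers the 1.4.0-rc1 run loads the same 10000 items in a little over half the time of the 1.2.2 run.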

Ronald Bradford | Enterprise Data Architect | MySQL Subject Matter Expert | Author | Speaker

NoSQL options

multi-threaded memcached