WeSQL Introduction – MySQL running on S3

November 21, 2024 by ronald

I recently became aware of WeSQL. A MySQL-compatible database that separates compute and storage, using S3 as the storage layer. The product uses a columnar format by default which is significantly more space-efficient than InnoDB.

WeSQL introduces a new storage engine called SmartEngine using a LSM-tree-based structure that is ideal for a storage bucket implementation, and documentation shows the implementation of raft replication to combat latency concerns. There is a lot more information to review, the serverless architecture and WeScale, a database proxy and resource manager.

It was very easy to take it for an initial spin using a docker container and an AWS S3 bucket. I would really like to try CloudFlare R2 which implements the S3 API.

Under the covers there are over 180 new variables comprising 83 for the smartengine, 57 for raft, and 22 for objectstore and more. This implies a lot of tunable options and a lot of complexity to optimize for a variety of workloads using the 79 new status variables.

I was able to launch a demo and confirm

mysql> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 8.0.35    |
+-----------+
1 row in set (0.01 sec)

mysql> SELECT @@wesql_version;
+-----------------+
| @@wesql_version |
+-----------------+
| 0.1.0           |
+-----------------+
1 row in set (0.00 sec)

One of my early tests showed that it does not support FOREIGN KEYS, which is not a major concern.

ERROR 1235 (42000) at line 10: SE currently doesn't support foreign key constraints

I did have some subsequent issues with the current docs version 8.0.35-0.1.0_beta1.37 and I did revert to a prior version from docs earlier this week 8.0.35-0.1.0_beta1.gedaf338.36. Given it’s a very new product I am sure there is a lot of ongoing development.

This is just a quick introduction but it’s a definitely a different architecture in the RDBMS landscape for MySQL compatibility. I hope to run some more tests using the provided sysbench use cases and my own workloads to delve under the covers more.

New Variables

branch_objectstore_id
clone_autotune_concurrency
clone_block_ddl
clone_buffer_size
clone_ddl_timeout
clone_delay_after_data_drop
clone_donor_timeout_after_network_failure
clone_enable_compression
clone_max_concurrency
clone_max_data_bandwidth
clone_max_network_bandwidth
clone_ssl_ca
clone_ssl_cert
clone_ssl_key
clone_valid_donor_list
initialize_branch_objectstore_id
initialize_from_objectstore
initialize_objectstore_bucket
initialize_objectstore_endpoint
initialize_objectstore_provider
initialize_objectstore_region
initialize_objectstore_use_https
initialize_repo_objectstore_id
initialize_smartengine_objectstore_data
objectstore_bucket
objectstore_endpoint
objectstore_mtr_test_bucket_dir
objectstore_provider
objectstore_region
objectstore_use_https
raft_replication_allow_no_valid_entry
raft_replication_appliedindex_force_delay
raft_replication_archive_log_bin_index
raft_replication_archive_recovery
raft_replication_archive_recovery_stop_datetime
raft_replication_auto_leader_transfer
raft_replication_auto_leader_transfer_check_seconds
raft_replication_auto_reset_match_index
raft_replication_check_commit_index_interval
raft_replication_checksum
raft_replication_cluseter_info_on_objectstore
raft_replication_cluster_id
raft_replication_cluster_info
raft_replication_configure_change_timeout
raft_replication_current_term
raft_replication_disable_election
raft_replication_disable_fifo_cache
raft_replication_dynamic_easyindex
raft_replication_election_timeout
raft_replication_flow_control
raft_replication_force_change_meta
raft_replication_force_recover_index
raft_replication_force_reset_meta
raft_replication_force_single_mode
raft_replication_force_sync_epoch_diff
raft_replication_heartbeat_thread_cnt
raft_replication_io_thread_cnt
raft_replication_large_batch_ratio
raft_replication_large_event_split_size
raft_replication_large_trx
raft_replication_learner_heartbeat
raft_replication_learner_node
raft_replication_learner_pipelining
raft_replication_learner_timeout
raft_replication_log_cache_size
raft_replication_log_level
raft_replication_log_type_node
raft_replication_max_delay_index
raft_replication_max_log_size
raft_replication_max_packet_size
raft_replication_min_delay_index
raft_replication_mts_recover_use_index
raft_replication_new_follower_threshold
raft_replication_optimistic_heartbeat
raft_replication_pipelining_timeout
raft_replication_prefetch_cache_size
raft_replication_prefetch_wakeup_ratio
raft_replication_prefetch_window_size
raft_replication_purged_gtid
raft_replication_recover_backup
raft_replication_recover_new_cluster
raft_replication_reset_prefetch_cache
raft_replication_send_timeout
raft_replication_start_index
raft_replication_sync_follower_meta_interva
raft_replication_with_cache_log
raft_replication_worker_thread_cnt
recovery_snapshot_from_objectstore
recovery_snapshot_only
recovery_snapshot_timestamp
recovery_snapshot_tmpdir
repo_objectstore_id
server_id_on_objectstore
serverless
smartengine_auto_shrink_enabled
smartengine_auto_shrink_schedule_interval
smartengine_batch_group_max_group_size
smartengine_batch_group_max_leader_wait_time_us
smartengine_batch_group_slot_array_size
smartengine_block_cache_size
smartengine_block_size
smartengine_bottommost_level
smartengine_bulk_load_size
smartengine_compact
smartengine_compaction_delete_percent
smartengine_compaction_task_extents_limit
smartengine_compaction_threads
smartengine_compression_options
smartengine_compression_per_level
smartengine_concurrent_writable_file_buffer_num
smartengine_concurrent_writable_file_buffer_switch_limit
smartengine_concurrent_writable_file_single_buffer_size
smartengine_data_dir
smartengine_deadlock_detect
smartengine_disable_auto_compactions
smartengine_disable_instant_ddl
smartengine_disable_online_ddl
smartengine_disable_parallel_ddl
smartengine_dump_memtable_limit_size
smartengine_enable_2pc
smartengine_estimate_cost_depth
smartengine_flush_delete_percent
smartengine_flush_delete_percent_trigger
smartengine_flush_delete_record_trigger
smartengine_flush_log_at_trx_commit
smartengine_flush_memtable
smartengine_flush_threads
smartengine_hotbackup
smartengine_idle_tasks_schedule_time
smartengine_level0_file_num_compaction_trigger
smartengine_level0_layer_num_compaction_trigger
smartengine_level1_extents_major_compaction_trigger
smartengine_level2_usage_percent
smartengine_level_compaction_dynamic_level_bytes
smartengine_lock_scanned_rows
smartengine_lock_wait_timeout
smartengine_master_thread_compaction_enabled
smartengine_master_thread_monitor_interval_ms
smartengine_max_background_dumps
smartengine_max_free_extent_percent
smartengine_max_row_locks
smartengine_max_shrink_extent_count
smartengine_max_write_buffer_number_to_maintain
smartengine_memtable_size
smartengine_min_write_buffer_number_to_merge
smartengine_mutex_backtrace_threshold_ns
smartengine_parallel_flush_log
smartengine_parallel_read_threads
smartengine_parallel_recovery_thread_num
smartengine_parallel_wal_recovery
smartengine_pause_background_work
smartengine_persistent_cache_dir
smartengine_persistent_cache_mode
smartengine_persistent_cache_size
smartengine_purge_invalid_subtable_bg
smartengine_query_trace_print_slow
smartengine_query_trace_sum
smartengine_query_trace_threshold_time
smartengine_rate_limiter_bytes_per_sec
smartengine_reset_pending_shrink
smartengine_row_cache_size
smartengine_scan_add_blocks_limit
smartengine_shrink_allocate_interval
smartengine_shrink_table_space
smartengine_sort_buffer_size
smartengine_stats_dump_period_sec
smartengine_strict_collation_check
smartengine_strict_collation_exceptions
smartengine_table_cache_numshardbits
smartengine_table_cache_size
smartengine_total_max_shrink_extent_count
smartengine_total_memtable_size
smartengine_total_wal_size
smartengine_unsafe_for_binlog
smartengine_wal_dir
smartengine_wal_recovery_mode
smartengine_write_disable_wal
snapshot_archive
snapshot_archive_dir
snapshot_archive_expire_auto_purge
snapshot_archive_expire_seconds
snapshot_archive_innodb_tar_mode
snapshot_archive_on_objectstore
snapshot_archive_period
snapshot_archive_smartengine_backup_checkpoint
snapshot_archive_smartengine_tar_mode
table_on_objectstore
wesql_version

New Status

Com_show_consensuslogs
Com_raft_replication_start
Com_raft_replication_stop
Com_native_admin_proc
Com_native_trans_proc
Com_show_consensuslog_events
Smartengine_block_cache_miss
Smartengine_block_cache_hit
Smartengine_block_cache_add
Smartengine_block_cache_index_miss
Smartengine_block_cache_index_hit
Smartengine_block_cache_filter_miss
Smartengine_block_cache_filter_hit
Smartengine_block_cache_data_miss
Smartengine_block_cache_data_hit
Smartengine_row_cache_add
Smartengine_row_cache_hit
Smartengine_row_cache_miss
Smartengine_memtable_hit
Smartengine_memtable_miss
Smartengine_number_keys_written
Smartengine_number_keys_read
Smartengine_number_keys_updated
Smartengine_bytes_written
Smartengine_bytes_read
Smartengine_block_cachecompressed_miss
Smartengine_block_cachecompressed_hit
Smartengine_wal_synced
Smartengine_wal_bytes
Smartengine_write_self
Smartengine_write_other
Smartengine_write_wal
Smartengine_number_superversion_acquires
Smartengine_number_superversion_releases
Smartengine_number_superversion_cleanups
Smartengine_number_block_not_compressed
Smartengine_snapshot_conflict_errors
Smartengine_wal_group_syncs
Smartengine_rows_deleted
Smartengine_rows_inserted
Smartengine_rows_updated
Smartengine_rows_read
Smartengine_system_rows_deleted
Smartengine_system_rows_inserted
Smartengine_system_rows_updated
Smartengine_system_rows_read
Smartengine_max_level0_layers
Smartengine_max_imm_numbers
Smartengine_max_level0_fragmentation_rate
Smartengine_max_level1_fragmentation_rate
Smartengine_max_level2_fragmentation_rate
Smartengine_max_level0_delete_percent
Smartengine_max_level1_delete_percent
Smartengine_max_level2_delete_percent
Smartengine_all_flush_megabytes
Smartengine_all_compaction_megabytes
Smartengine_top1_subtable_size
Smartengine_top2_subtable_size
Smartengine_top3_subtable_size
Smartengine_top1_mod_mem_info
Smartengine_top2_mod_mem_info
Smartengine_top3_mod_mem_info
Smartengine_global_external_fragmentation_rate
Smartengine_write_transaction_count
Smartengine_pipeline_group_count
Smartengine_pipeline_group_wait_timeout_count
Smartengine_pipeline_copy_log_size
Smartengine_pipeline_copy_log_count
Smartengine_pipeline_flush_log_size
Smartengine_pipeline_flush_log_count
Smartengine_pipeline_flush_log_sync_count
Smartengine_pipeline_flush_log_not_sync_count

Managing SQL Drift: Ensuring Stability in Database Transitions

November 9, 2024 by ronald

SQL drift is a significant challenge that occurs when SQL statements from an existing system produce unexpected results after migration to a new environment or system. These issues manifest in several critical ways: SQL statements may generate new execution errors, experience significant performance degradation, or yield differences in data integrity. Such challenges extend beyond simple compatibility issues, stemming from variations in database engines, optimization strategies, and SQL implementations. SQL drift represents a fundamental shift in how SQL behaves across different platforms and versions. Whether during on-premises to cloud migrations, transitions to managed services, database vendor switches, or even routine version upgrades, SQL drift presents a critical consideration for data-driven applications.
Database Migration from on-prem to cloud infrastructure representation.

Database Migration from on-prem to cloud infrastructure representation.

SQL drift frequently occurs during:

On-premises to cloud migrations
Cloud to managed service transitions
Cross-product migrations (e.g., switching database vendors)
Database version upgrades
Platform modernization efforts

The implications of SQL drift can be significant, leading to application instability, increased operational costs, and delayed migration timelines. The impact often extends to compromised data quality and results in a degraded user experience as systems become less reliable and responsive. Successfully managing SQL drift involves four key stages:

Identification
Prioritization
Correction
Validation

Identification is the critical first step in managing SQL drift, focusing on systematically discovering potential issues. This phase involves detecting SQL statements that may behave differently in the new environment, analyzing syntax compatibility across platforms, establishing performance baselines, and validating data outputs to ensure consistency.

Prioritization involves evaluating SQL drift issues based on business impact, risk assessment, resource allocation, and migration scheduling to determine the optimal order for resolution.

Correction addresses SQL drift through code remediation, performance optimization, syntax updates, and developing alternative solutions when necessary.

Validation confirms SQL drift corrections through comprehensive testing, performance verification against established baselines, and data integrity checks to ensure the corrected SQL maintains its intended functionality.

An effective way to demonstrate the impact of SQL drift is by using a sample collection of SQL statements executed across different versions of MySQL. The End of Life (EOL) for MySQL 5.7, coupled with AWS RDS and AWS RDS Aurora beginning extended support in 2024, has increased costs for organizations that are not proactive in managing database migrations. This situation is particularly common in development-focused teams that lack dedicated architecture and operations resources.

A MySQL demonstration of SQL Drift

Using a subset of SQL statements executed in MySQL 5.7 and subsequent MySQL versions 8.0, 8.4, 9.0, and 9.1,Next BaseLine can examine the impact of SQL drift. This output shows the changing state of errors, deprecations, warnings and notices for the 42 example SQL statements.

Next BaseLine MySQL Demo output

Example Output from Next BaseLine

In MySQL 5.7, the use of the keyword SQL_NO_CACHE in an SQL statement presents as a deprecated warning.

17 Deprecations
ID: 5, Hash: f31f2e99b2
  SQL: "SELECT SQL_NO_CACHE 1;"
  Deprecation: (1681) 'SQL_NO_CACHE' is deprecated and will be removed in a future release.

In MySQL 8.0, the MySQL Query Cache is removed, however the use of SQL_NO_CACHE in SQL statements is still valid. Even in the next GA version, 8.4, this SQL keyword is still on the deprecated list, and it continues to deprecated in the current 9.1 innovation release.

A different example of deprecated functions are ENCRYPT and DES_ENCRYPT.

ID: 17, Hash: 947fcef53a
  SQL: "SELECT ENCRYPT('BaseLine',1);"
  Deprecation: (1287) 'ENCRYPT' is deprecated and will be removed in a future release. Please use AES_ENCRYPT instead
ID: 18, Hash: 364c0ffbf4
  SQL: "SELECT DES_ENCRYPT('BaseLine');"
  Deprecation: (1287) 'DES_ENCRYPT' is deprecated and will be removed in a future release. Please use AES_ENCRYPT instead

In MySQL 8.0, these SQL statements produce a hard error. These actually present as internal functions that are not present in the schema used rather than a “FUNCTION does not exist”. (More on this later).

ID: 17, Hash: 947fcef53a
  SQL: "SELECT ENCRYPT('BaseLine',1);"
  Error 1370 (42000): execute command denied to user 'nextbaseline'@'%' for routine 'airport.ENCRYPT'
ID: 18, Hash: 364c0ffbf4
  SQL: "SELECT DES_ENCRYPT('BaseLine');"
  Error 1370 (42000): execute command denied to user 'nextbaseline'@'%' for routine 'airport.DES_ENCRYPT'

Some example GIS SQL statements that in MySQL 5.7 present as deprecated, however they each are a different error number.

ID: 19, Hash: f319748e0c
  SQL: "SELECT CONTAINS(ST_GeomFromText('POLYGON((0 0, 0 10, 10 10, 10 0, 0 0))'), ST_GeomFromText('POINT(5 5)'));"
  Deprecation: (1287) 'CONTAINS' is deprecated and will be removed in a future release. Please use MBRCONTAINS instead
ID: 20, Hash: d686267b19
  SQL: "SELECT ST_GeomFromWKB(Point(0, 0));"
  Deprecation: (3195) st_geometryfromwkb(geometry) is deprecated and will be replaced by st_srid(geometry, 0) in a future version. Use st_geometryfromwkb(st_aswkb(geometry), 0) instead.

In MySQL 8.0+, these two deprecated statements produce different error messages.

ID: 19, Hash: f319748e0c
  SQL: "SELECT CONTAINS(ST_GeomFromText('POLYGON((0 0, 0 10, 10 10, 10 0, 0 0))'), ST_GeomFromText('POINT(5 5)'));"
  Error 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '(ST_GeomFromText('POLYGON((0 0, 0 10, 10 10, 10 0, 0 0))'), ST_GeomFromText('POI' at line 1
ID: 20, Hash: d686267b19
  SQL: "SELECT ST_GeomFromWKB(Point(0, 0));"
  Error 3037 (22023): Invalid GIS data provided to function st_geomfromwkb.

Migrating a WordPress site

A more realistic example would involve taking production workload, such as WordPress running on a self-hosted MySQL 5.7 server, and assessing the potential impact of switching to MySQL 8.0 without upgrading the application (not a recommended approach). We have collected representative Production SQL statements for this WordPress setup, referred to as a BaseLine

Next BaseLine MySQL WP Baseline

After collecting SQL traffic and testing this workload against a MySQL 5.7 environment, previously unnoticed SQL warnings were highlighted for the team.

Next BaseLine MySQL WP MySQL 5.7

When executed against an upgraded MySQL 8.0 instance, problematic SQL statements were immediately identified. For a larger, more complex product, this process would help prioritize where resources are most needed.

Next BaseLine MySQL WP MySQL 8.0

A modern cloud database implementation

Finally, let’s consider TiDB from PingCap as an example of validating your application with a cloud implementation. Using the same small set of 42 SQL statements, TiDB has taken a proactive approach by entirely eliminating warnings in their MySQL protocol. In TiDB, SQL statements are now either valid SQL syntax or produce a hard error.

Next BaseLine MySQL TiDB output

What was a deprecation for ENCRYPT is now a hard error. Also, a more correct error message is provided ‘FUNCTION does not exist’.

ID: 17, Hash: 947fcef53a
  SQL: "SELECT ENCRYPT('BaseLine',1);"
  Error 1305 (42000): FUNCTION ENCRYPT does not exist
ID: 18, Hash: 364c0ffbf4
  SQL: "SELECT DES_ENCRYPT('BaseLine');"
  Error 1305 (42000): FUNCTION DES_ENCRYPT does not exist

In MySQL 5.7, ENCODE was deprecated and in MySQL 8.0+ it was removed. In TiDB, it is a valid function.

TiDB also produces some interesting artifacts in error messages for SQL statements not seen with MySQL. An example is Error 1235 ... has only noop implementation in tidb now .... This syntax however shows that a setting can change the status of these SQL statements.

ID: 14, Hash: dbcb4b05a2
  SQL: "SELECT table_name, count(*) FROM information_schema.tables GROUP BY table_name ASC;"
  Error 1235 (42000): function GROUP BY expr ASC|DESC has only noop implementation in tidb now, use tidb_enable_noop_functions to enable these functions
...
ID: 26, Hash: 7369c77d51
  SQL: "SELECT SQL_CALC_FOUND_ROWS * FROM information_schema.schemata;"
  Error 1235 (42000): function SQL_CALC_FOUND_ROWS has only noop implementation in tidb now, use tidb_enable_noop_functions to enable these functions

Even during development, an interesting and unintended bug in early testing, resulted in an interesting error using TiDB.

ID: 31, Hash: 9cae50cbfc
  SQL: "SELECT DATE('2024-01-01   10:00:00'); /* Example of bad data causing warning */SELECT 'abc' AS full;"
  Error 8130 (HY000): client has multi-statement capability disabled. Run SET GLOBAL tidb_multi_statement_mode='ON' after you understand the security risk

Conclusion

Next BaseLine is now available in limited beta. Eliminate the uncertainty around “Will the migration work?” by performing an independent risk assessment of your product in a migrated database environment before committing to ad-hoc engineering efforts. If you’re interested in seeing a demo with your own SQL workload, you can register here.

Next BaseLine currently supports MySQL, PostgreSQL, Oracle, and SQL Server RDBMS products, covering both self-hosted and cloud-managed implementations across AWS, GCP, Azure, and Alibaba. It supports multiple MySQL- and PostgreSQL-compatible databases, including TiDB, SingleStore, Neon Serverless, Nile, ElephantSQL, TimeScale, and more. Additional compatibility is available for Snowflake, ClickHouse, and DuckDB.

RDS MySQL Aurora 3.07.0 is unusable for upgrades

June 21, 2024 by ronald

Yesterday I detailed an incompatible breakage with RDS MySQL Aurora 3.06.0, and one option stated is to upgrade to the just released 3.07.0.

Turns out that does not work. It is not possible to upgrade any version of AWS RDS MySQL Aurora 3.x to 3.07.0, making this release effectively useless.

3.06.0 to 3.07.0 fails

$ aws rds modify-db-cluster --db-cluster-identifier $CLUSTER_ID --engine-version 8.0.mysql_aurora.3.07.0 --apply-immediately

An error occurred (InvalidParameterCombination) when calling the ModifyDBCluster operation: Cannot upgrade aurora-mysql from 8.0.mysql_aurora.3.06.0 to 8.0.mysql_aurora.3.07.0

3.06.0 to 3.06.1 succeeds

Sometimes you need to be on the current point release of a prior version.

3.06.0 to 3.07.0 fails

$ aws rds modify-db-cluster --db-cluster-identifier $CLUSTER_ID --engine-version 8.0.mysql_aurora.3.07.0 --apply-immediately

An error occurred (InvalidParameterCombination) when calling the ModifyDBCluster operation: Cannot upgrade aurora-mysql from 8.0.mysql_aurora.3.06.1 to 8.0.mysql_aurora.3.07.0

There is no upgrade path

You can look at all valid ValidUpgradeTarget for all versions. There is in-fact no version that can upgrade to AWS RDS Aurora MySQL 3.07.0.
Seems like a common test pattern overlooked.

$ aws rds describe-db-engine-versions --engine aurora-mysql

...

{
            "Engine": "aurora-mysql",
            "Status": "available",
            "DBParameterGroupFamily": "aurora-mysql8.0",
            "SupportsLogExportsToCloudwatchLogs": true,
            "SupportsReadReplica": false,
            "DBEngineDescription": "Aurora MySQL",
            "SupportedFeatureNames": [],
            "SupportedEngineModes": [
                "provisioned"
            ],
            "SupportsGlobalDatabases": true,
            "SupportsParallelQuery": true,
            "EngineVersion": "8.0.mysql_aurora.3.04.1",
            "DBEngineVersionDescription": "Aurora MySQL 3.04.1 (compatible with MySQL 8.0.28)",
            "ExportableLogTypes": [
                "audit",
                "error",
                "general",
                "slowquery"
            ],
            "ValidUpgradeTarget": [
                {
                    "Engine": "aurora-mysql",
                    "IsMajorVersionUpgrade": false,
                    "AutoUpgrade": false,
                    "Description": "Aurora MySQL 3.04.2 (compatible with MySQL 8.0.28)",
                    "EngineVersion": "8.0.mysql_aurora.3.04.2"
                },
                {
                    "Engine": "aurora-mysql",
                    "IsMajorVersionUpgrade": false,
                    "AutoUpgrade": false,
                    "Description": "Aurora MySQL 3.05.2 (compatible with MySQL 8.0.32)",
                    "EngineVersion": "8.0.mysql_aurora.3.05.2"
                },
                {
                    "Engine": "aurora-mysql",
                    "IsMajorVersionUpgrade": false,
                    "AutoUpgrade": false,
                    "Description": "Aurora MySQL 3.06.0 (compatible with MySQL 8.0.34)",
                    "EngineVersion": "8.0.mysql_aurora.3.06.0"
                },
                {
                    "Engine": "aurora-mysql",
                    "IsMajorVersionUpgrade": false,
                    "AutoUpgrade": false,
                    "Description": "Aurora MySQL 3.06.1 (compatible with MySQL 8.0.34)",
                    "EngineVersion": "8.0.mysql_aurora.3.06.1"
                }
            ]
        },

Database testing for all version changes (including minor versions)

June 19, 2024 by ronald

We know that SQL statement compatibility can change with major database version upgrades and that you should adequately test for them. But what about minor version upgrades?

It is dangerous to assume that your existing SQL statements work with a minor update, especially when using an augmented version of an open-source database such as a cloud provider that may not be as transparent about all changes.

While I have always found reading the release notes an important step in architectural principles over the decades, many organizations skip over this principle and get caught off guard when there are no dedicated DBAs and architects in the engineering workforce.

Real-world examples of minor version upgrade issues

Here are two real-world situations common in the AWS RDS ecosystem using MySQL.

You are an organization that uses RDS Aurora MySQL for its production systems, and you upgrade one minor version at a time. A diligent approach is to be one minor version behind unless a known bug is fixed in a newer version you depend on.
You are an organization that, to save costs with a comprehensive engineering team, uses AWS RDS MySQL (not Aurora) for developers and some testing environments.

I’ve simplified a real-world example to a simple SQL statement and combined these two separate use cases into one simulated situation for demonstration purposes.

mysql> SELECT content_type FROM reserved2;
Empty set (0.00 sec)

mysql> SELECT VERSION(), @@aurora_version;
+-----------+------------------+
| VERSION() | @@aurora_version |
+-----------+------------------+
| 8.0.28    | 3.04.2           |
+-----------+------------------+

mysql> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 8.0.34    |
+-----------+
1 row in set (0.00 sec)

This is a simple enough query, this runs in AWS RDS Aurora MySQL 3.04.02 (which is the present Aurora MySQL long-term support (LTS) release). This is based on MySQL 8.0.28 which is FWIW not a supported AWS RDS MySQL version anymore, the minimum is now 8.0.32 (Supported MySQL minor versions on Amazon RDS).

It runs in AWS RDS MySQL 8.0.34 which is for example what version your developer setup is.

An AWS RDS MySQL Aurora minor version upgrade

You decide to upgrade from Aurora 3.04.x/3.05.x to 3.06.x. This Aurora version is actually based on MySQL 8.0.34 (the version you just tested in RDS). Without adequate due diligence you roll out to production only to find after the fact that this SQL statement (realize this is one simplified example for demonstrate purposes) now breaks for no apparent reason.

mysql> select content_type from reserved2;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'content_type from reserved2' at line 1

mysql> SELECT VERSION(), @@aurora_version;
+-----------+------------------+
| VERSION() | @@aurora_version |
+-----------+------------------+
| 8.0.34    | 3.06.0           |
+-----------+------------------+

Now, you need to investigate the problem, which can take hours, even days of resource time, and a lot of shaking heads to realize it has nothing to do with your application code but to do with the minor version upgrade. Which you simply cannot roll back. See Risks from auto upgrades with managed database services for some interesting facts.

Wait, what just happened?

If you performed this upgrade to the latest AWS RDS Aurora MySQL 3.06.0 version sometime after the release on 3/7/24 and before 6/4/24, a 3-month period, you are left with one choice. You have to make application code changes to address the breakage.

How many man-hours/man-days does this take? If you upgraded to this version in the past two weeks, technically you have a second choice. You can go to the most current version, 3.07.0, but you have already spent time in testing and deploying 3.06.0, which you need to re-test, then rollout in non-production accounts and then rollout to production. How many man-days of work is this?

It may be hard to justify the cost of automated testing until you uncover a situation like this one; however, it can easily be avoided in the future.

So why did this happen?

Lets look deeper are the fine-print

RDS Aurora MySQL 3.06.0

Aurora MySQL version 3.06.0 supports Amazon Bedrock integration and introduces the new reserved keywords accept, aws_bedrock_invoke_model, aws_sagemaker_invoke_endpoint, content_type, and timeout_ms. Check the object definitions for the usage of the new reserved keywords before upgrading to version 3.06.0. To mitigate the conflict with the new reserved keywords, quote the reserved keywords used in the object definitions. For more information on the Amazon Bedrock integration and handling the reserved keywords, see What is Amazon Bedrock? in the Amazon Aurora User Guide. For additional information, see Keywords and Reserved Words, The INFORMATION_SCHEMA KEYWORDS Table, and Schema Object Names in the MySQL documentation.

From AWS RDS Aurora MySQL 3.06.0 release notes (3/7/24).

While less likely you would name a column aws_bedrock_invoke_model, column names of content_type and timeout_ms are common words.

RDS Aurora MySQL 3.07.0

Aurora MySQL version 3.06.0 added support for Amazon Bedrock integration. As part of this, new reserved keywords (accept, aws_bedrock_invoke_model, aws_sagemaker_invoke_endpoint, content_type, and timeout_ms) were added. In Aurora MySQL version 3.07.0, these keywords have been changed to nonreserved keywords, which are permitted as identifiers without quoting. For more information on how MySQL handles reserved and nonreserved keywords, see Keywords and reserved words in the MySQL documentation.

From AWS RDS Aurora MySQL 3.07.0 release notes (6/4/24). Clearly someone at AWS saw the breaking changes and it was reverted. While it’s possible many customers may not need to catch this situation, this is one specific use case.

Conclusion

The moral of the database story here is Be Prepared.

You should always be prepared for future breaking compatibility. You should test with a regular software upgrade cadence and leverage automation as much as possible.

Next BaseLine is a software product that automates testing for many use cases, including this simple SQL compatibility issue. By adding to your CI/CD pipeline can help identify and risk in all SQL database access, including new engineering software releases or infrastructure updates. This product can be implemented in a few hours, and cost significantly less than the large amount of time lost with this one realistic situation.

Footnote

This example was not uncovered from a customer situation. It was uncovered and used as a demonstration because I read the release notes.

Test Case

SELECT VERSION(); SELECT VERSION(), @@aurora_version; /* No way to comment out the !Aurora example */ CREATE SCHEMA IF NOT EXISTS test; USE test; CREATE TABLE reserved1(id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, accept CHAR(1) NOT NULL DEFAULT 'N'); CREATE TABLE reserved2(id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, content_type VARCHAR(10) NULL DEFAULT 'text/plain'); CREATE TABLE reserved3(id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, timeout_ms INT UNSIGNED NOT NULL); SELECT accept FROM reserved1; SELECT content_type FROM reserved2; SELECT timeout_ms FROM reserved3;

Are you patching your AWS RDS MySQL 5.7 EOL databases?

June 18, 2024 by ronald

Recently, I noticed a second AWS RDS MySQL 5.7 version available 5.7.44-rds.20240408. Curious what this was as 5.7.44 is the only RDS 5.7.x EOL version available, I launched an instance to discount this as errant metadata.

Today I noticed a second version 5.7.44-rds.20240529. I do not run a MySQL 5.7 AWS RDS instance or pay the AWS Extended Support tax, so I would not receive any notices or recommendations that customers may be receiving.

Image generated by ChatGPT. Mistakes left as a reminder genAI is not there yet for text.

I needed to do some searching before I found a reference here and then this announcement that mentions 5.7.44-RDS.20240408 as a vulnerability fix. This document does not mention the second version however, based on the dates, this was 18 days ago? There is also no whats-new announcement of this second version. With more searching I also came across Extended Support Version Standards which is not linked from the extended support page that describes this new format.

Are AWS customers being informed they need to continue with a minor version upgrade cadence as you would normally perform? Is it now more important because only more severe vulnerabilities will get backported? The CVE-2024-20963 does only mention MySQL 8.0 and above, but as Oracle has officially marked 5.7 as EOL. This does align with the AWS Extended Support commitment to keep MySQL 5.7.

If you are a customer that has auto-minor upgrades enabled for MySQL 5.7 be aware of the risks from auto upgrades with managed database services.

If you are running AWS RDS PostgreSQL 11 (11.22), the same need applies. There are also version updates. These
Postgres docs also show extension fixes for 20240529 but no mention of a vulnerability. This may be the trigger for the same named RDS MySQL version and there are no actual modifications?

Why are you still running MySQL 5.7?

Next BaseLine identifies and categorizes the risks for SQL migration from MySQL 5.7 to MySQL 8.0 and can accelerate moving off this EOL software version. Stop paying the AWS Extended Support Tax. Get started at https://app.kanangra.io/.

Current AWS RDS MySQL versions

$ aws rds describe-db-engine-versions --engine mysql
{
    "DBEngineVersions": [
        {
            "Engine": "mysql",
            "Status": "available",
            "DBParameterGroupFamily": "mysql5.7",
            "SupportsLogExportsToCloudwatchLogs": true,
            "SupportsReadReplica": true,
            "DBEngineDescription": "MySQL Community Edition",
            "SupportedFeatureNames": [],
            "SupportsGlobalDatabases": false,
            "SupportsParallelQuery": false,
            "EngineVersion": "5.7.44",
            "DBEngineVersionDescription": "MySQL 5.7.44",
            "ExportableLogTypes": [
                "audit",
                "error",
                "general",
                "slowquery"
            ],
            "ValidUpgradeTarget": [
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": false,
                    "AutoUpgrade": false,
                    "Description": "MySQL 5.7.44-rds.20240408",
                    "EngineVersion": "5.7.44-rds.20240408"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": false,
                    "AutoUpgrade": false,
                    "Description": "MySQL 5.7.44-rds.20240529",
                    "EngineVersion": "5.7.44-rds.20240529"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.28",
                    "EngineVersion": "8.0.28"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.32",
                    "EngineVersion": "8.0.32"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.33",
                    "EngineVersion": "8.0.33"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.34",
                    "EngineVersion": "8.0.34"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.35",
                    "EngineVersion": "8.0.35"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.36",
                    "EngineVersion": "8.0.36"
                }
            ]
        },
        {
            "Engine": "mysql",
            "Status": "available",
            "DBParameterGroupFamily": "mysql5.7",
            "SupportsLogExportsToCloudwatchLogs": true,
            "SupportsReadReplica": true,
            "DBEngineDescription": "MySQL Community Edition",
            "SupportedFeatureNames": [],
            "SupportsGlobalDatabases": false,
            "SupportsParallelQuery": false,
            "EngineVersion": "5.7.44-rds.20240408",
            "DBEngineVersionDescription": "MySQL 5.7.44-rds.20240408",
            "ExportableLogTypes": [
                "audit",
                "error",
                "general",
                "slowquery"
            ],
            "ValidUpgradeTarget": [
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": false,
                    "AutoUpgrade": false,
                    "Description": "MySQL 5.7.44-rds.20240529",
                    "EngineVersion": "5.7.44-rds.20240529"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.28",
                    "EngineVersion": "8.0.28"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.32",
                    "EngineVersion": "8.0.32"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.33",
                    "EngineVersion": "8.0.33"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.34",
                    "EngineVersion": "8.0.34"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.35",
                    "EngineVersion": "8.0.35"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.36",
                    "EngineVersion": "8.0.36"
                }
            ]
        },
        {
            "Engine": "mysql",
            "Status": "available",
            "DBParameterGroupFamily": "mysql5.7",
            "SupportsLogExportsToCloudwatchLogs": true,
            "SupportsReadReplica": true,
            "DBEngineDescription": "MySQL Community Edition",
            "SupportedFeatureNames": [],
            "SupportsGlobalDatabases": false,
            "SupportsParallelQuery": false,
            "EngineVersion": "5.7.44-rds.20240529",
            "DBEngineVersionDescription": "MySQL 5.7.44-rds.20240529",
            "ExportableLogTypes": [
                "audit",
                "error",
                "general",
                "slowquery"
            ],
            "ValidUpgradeTarget": [
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.28",
                    "EngineVersion": "8.0.28"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.32",
                    "EngineVersion": "8.0.32"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.33",
                    "EngineVersion": "8.0.33"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.34",
                    "EngineVersion": "8.0.34"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.35",
                    "EngineVersion": "8.0.35"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": true,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.36",
                    "EngineVersion": "8.0.36"
                }
            ]
        },
        {
            "Engine": "mysql",
            "Status": "available",
            "DBParameterGroupFamily": "mysql8.0",
            "SupportsLogExportsToCloudwatchLogs": true,
            "SupportsReadReplica": true,
            "DBEngineDescription": "MySQL Community Edition",
            "SupportedFeatureNames": [],
            "SupportsGlobalDatabases": false,
            "SupportsParallelQuery": false,
            "EngineVersion": "8.0.32",
            "DBEngineVersionDescription": "MySQL 8.0.32",
            "ExportableLogTypes": [
                "audit",
                "error",
                "general",
                "slowquery"
            ],
            "ValidUpgradeTarget": [
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": false,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.33",
                    "EngineVersion": "8.0.33"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": false,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.34",
                    "EngineVersion": "8.0.34"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": false,
                    "AutoUpgrade": true,
                    "Description": "MySQL 8.0.35",
                    "EngineVersion": "8.0.35"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": false,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.36",
                    "EngineVersion": "8.0.36"
                }
            ]
        },
        {
            "Engine": "mysql",
            "Status": "available",
            "DBParameterGroupFamily": "mysql8.0",
            "SupportsLogExportsToCloudwatchLogs": true,
            "SupportsReadReplica": true,
            "DBEngineDescription": "MySQL Community Edition",
            "SupportedFeatureNames": [],
            "SupportsGlobalDatabases": false,
            "SupportsParallelQuery": false,
            "EngineVersion": "8.0.33",
            "DBEngineVersionDescription": "MySQL 8.0.33",
            "ExportableLogTypes": [
                "audit",
                "error",
                "general",
                "slowquery"
            ],
            "ValidUpgradeTarget": [
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": false,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.34",
                    "EngineVersion": "8.0.34"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": false,
                    "AutoUpgrade": true,
                    "Description": "MySQL 8.0.35",
                    "EngineVersion": "8.0.35"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": false,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.36",
                    "EngineVersion": "8.0.36"
                }
            ]
        },
        {
            "Engine": "mysql",
            "Status": "available",
            "DBParameterGroupFamily": "mysql8.0",
            "SupportsLogExportsToCloudwatchLogs": true,
            "SupportsReadReplica": true,
            "DBEngineDescription": "MySQL Community Edition",
            "SupportedFeatureNames": [],
            "SupportsGlobalDatabases": false,
            "SupportsParallelQuery": false,
            "EngineVersion": "8.0.34",
            "DBEngineVersionDescription": "MySQL 8.0.34",
            "ExportableLogTypes": [
                "audit",
                "error",
                "general",
                "slowquery"
            ],
            "ValidUpgradeTarget": [
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": false,
                    "AutoUpgrade": true,
                    "Description": "MySQL 8.0.35",
                    "EngineVersion": "8.0.35"
                },
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": false,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.36",
                    "EngineVersion": "8.0.36"
                }
            ]
        },
        {
            "Engine": "mysql",
            "Status": "available",
            "DBParameterGroupFamily": "mysql8.0",
            "SupportsLogExportsToCloudwatchLogs": true,
            "SupportsReadReplica": true,
            "DBEngineDescription": "MySQL Community Edition",
            "SupportedFeatureNames": [],
            "SupportsGlobalDatabases": false,
            "SupportsParallelQuery": false,
            "EngineVersion": "8.0.35",
            "DBEngineVersionDescription": "MySQL 8.0.35",
            "ExportableLogTypes": [
                "audit",
                "error",
                "general",
                "slowquery"
            ],
            "ValidUpgradeTarget": [
                {
                    "Engine": "mysql",
                    "IsMajorVersionUpgrade": false,
                    "AutoUpgrade": false,
                    "Description": "MySQL 8.0.36",
                    "EngineVersion": "8.0.36"
                }
            ]
        },
        {
            "Engine": "mysql",
            "Status": "available",
            "DBParameterGroupFamily": "mysql8.0",
            "SupportsLogExportsToCloudwatchLogs": true,
            "SupportsReadReplica": true,
            "DBEngineDescription": "MySQL Community Edition",
            "SupportedFeatureNames": [],
            "SupportsGlobalDatabases": false,
            "SupportsParallelQuery": false,
            "EngineVersion": "8.0.36",
            "DBEngineVersionDescription": "MySQL 8.0.36",
            "ExportableLogTypes": [
                "audit",
                "error",
                "general",
                "slowquery"
            ],
            "ValidUpgradeTarget": []
        }
    ]
}

The curse of MySQL warnings

June 10, 2024 by ronald

MySQL warnings are an anti-pattern when it comes to maintaining data integrity. When the information retrieved from a database does not match what was entered, and this is not identified immediately, this can be permanently lost.

MySQL by default for several decades until the most recent versions enabled you to insert incorrect data, or insert data that was then truncated, or other patterns that resulted in failed data integrity. Very few applications considered handling warnings as errors, and there is a generation of software products that have never informed the developers that warnings were occurring.

The most simplest example is:

CREATE SCHEMA IF NOT EXISTS warnings;
USE warnings;

CREATE TABLE short_name(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  name VARCHAR(20) NOT NULL,
  PRIMARY KEY(id)
);

INSERT INTO short_name (name) VALUES ('This name is too long and will get truncated');
ERROR 1406 (22001): Data too long for column 'name' at row 1

This is what you expect would happen. In many, many applications IT DOES NOT.

For almost 20 years the default setting was to support possible data corruption

If you used an older version without setting up a more strict SQL_MODE from the default you end up with.

INSERT INTO short_name (name) VALUES ('This Name is too long and will get truncated');
Query OK, 1 row affected, 1 warning (0.00 sec)

SELECT * FROM short_name;
+----+----------------------+
| id | name                 |
+----+----------------------+
|  1 | This name is too lon |
+----+----------------------+
1 row in set (0.00 sec)

Only if you run SHOW WARNINGS and after the actual SQL statement would you know? There is no other way to find this information in any logs. There is no way to

mysql> SHOW WARNINGS;
+---------+------+-------------------------------------------+
| Level   | Code | Message                                   |
+---------+------+-------------------------------------------+
| Warning | 1265 | Data truncated for column 'name' at row 1 |
+---------+------+-------------------------------------------+
1 row in set (0.00 sec)

Numerous other examples can shock a customer when, after some time, expected data in a production is lost and unretrievable.

If you came from a more strict RDBMS background, or you tuned your MySQL installation or uncovered this and many other poor defaults, you would have improved your data integrity with and improved SQL_MODE.

So MySQL warnings are bad? No, they are ideal when used appropriately. However, the next critical dilemma occurs.

Warnings are valuable when used to identify important characteristics of an SQL statement that a developer or database administrator should be aware of. However, the only way to retrieve these warnings is from the application making the connection to the database at each statement, and generally, these warnings are just lost.

Here are some examples of warnings that are important for the engineering team that define criteria such as deprecation notices, which are important for production database upgrades.

SELECT JSON_MERGE('["a"]','["b"]'); 
Warning (Code 1287): 'JSON_MERGE' is deprecated and will be removed in a future release. Please use JSON_MERGE_PRESERVE/JSON_MERGE_PATCH instead

SELECT ST_GeomFromWKB(Point(0, 0));
Warning: (3195) st_geometryfromwkb(geometry) is deprecated and will be replaced by st_srid(geometry, 0) in a future version. Use st_geometryfromwkb(st_aswkb(geometry), 0) instead.

SELECT DATE('2024-01-01 10:00:00') 
Warning (Code 4096): Delimiter ' ' in position 11 in datetime value '2024-01-01 10:00:00' at row 1 is superfluous and is deprecated. Please remove.

SELECT BINARY 'a' = 'A' 
Warning (Code 1287): 'BINARY expr' is deprecated and will be removed in a future release. Please use CAST instead

You definitely want to know about these, collect them (hard), add them to your backlog, and don’t leave it until its too late in the I can’t upgrade my database to have to address.

If you want to know about these, collect them (hard), add them to your backlog, and don’t leave it until it’s too late for a critical last-minute upgrade to my database to have to address.

There are also warnings that should be collected and used for performance verification, which apply to running systems. I wanted to show one specific example uncovered during testing of a MySQL upgrade to version 8.0.

Warning (Code 3170): Memory capacity of 8388608 bytes for 'range_optimizer_max_mem_size' exceeded. Range optimization was not done for this query.

In fact, this warning occurs in MySQL 5.7, but the customer never knew because they did not look at the warnings. How many other SQL statements in your application produce warnings now? How can you find this out?

It was rather easy to create a reproducible test case but what now?

Do you set range_optimizer_max_mem_size=0
Do you set to the value you need, which you can identify with SELECT * FROM performance_schema.memory_summary_by_thread_by_event_name WHERE thread_id=PS_CURRENT_THREAD_ID() AND event_name='memory/sql/test_quick_select'\G
Do you need to modify your optimizer_switch settings?
Do you try something else?
Do you refactor your application?
Do you just leave it as is?

When you want to consider several different options, which one works best for this query? What about the impact on your entire production workload? Knowing statistically which is the best choice for your full workload and under various conditions is the optimal output, but how?

Next BaseLine was built to perform experiments comparing changes to your data, configuration, and infrastructure to validate the next version of your product statistically performs better than your current version across all of your application at different workloads.

Next BaseLine also provides numerous benefits for a major database upgrade, so I’ve focussed on getting these capabilities to customers quicker to save money. It provides the benefit of detecting SQL statements that produce errors in the next MySQL version, enabling you to categorize and prioritize areas of your application that must be corrected. It also captures important information about the performance and quality of the data from your MySQL queries; this also can help in identifying the most critical aspects of your application to invest engineering time and mitigate risk in your database upgrade plan. It can also collect warning messages such as these discussed when considering migrating from MySQL 5.7 to MySQL 8, or it can just find them with your current application.

What is your pain point with MySQL database upgrades? What are you doing right now to help reduce this additional budget spend? Join our private beta program now to find out more.

Next BaseLine

Helping to create a better and faster next version of your data-driven product

What happened to Digital Tech Trek Digest?

May 30, 2024 by ronald

I started 2024 with several goals. The first goal was to iterate over some weekend project ideas and actually deploy them. These were never designed to make money or have widespread value; they were an exercise in iterating over an idea in preparation for a larger project. This led to InstanceHunt, which turned out to be very useful, lead to a few interesting leads including MicroConf. I also have a number of future features on my todo list. I have always wanted to keep product configs so I dropped Configs Info the next month as a weekend project. Likewise, there are plenty of additional feature ideas.

A second goal was to read more outside my comfort zone. As part of keeping notes, I started publishing weekly the Digital Tech Trek Digest for the first three months of 2024. It was not part of the original goal to share the details; it just worked out that way. I still read many of my sources regularly including TLDR, Forbes Daily, ThoughWorks Podcasts, Daily Dose of Data Science and BoringCashCow. Also Scientific American Technology, Fareed’s Global Briefing, Software Design: Tidy First? by Kent Beck, Last Week in AWS, Micro Newsletter and more. I guess my draft of lucky #13 never made it to publication. However, my time of writing was in preparation toward my next goal of 2024.

My third goal for 2024 was to pursue taking an idea into an actual product and to look at making founder-led sales and have a goal of producing a product. I expected this would be most of 2024 before I had a deliverable. Of the 5-7 ideas I considered, I decided on one and started creating a POC, creating some slides, talking with people, and spending a lot of time on how I could market and sell my startup idea. On March 26, I decided to take my large startup idea and narrow the focus to a specific and immediate problem that could be solved; however, this is not the goal of my startup. That problem was helping customers stop paying the AWS tax of RDS Extended Support.

For the past two months, I’ve been striving to accelerate creating a SaaS solution, learning how to rewrite all my code on Go, thinking about how to identify my Ideal Customer Profile (ICP), determine a Monthly Recurring Revenue (MRR) fixture, and talk with prospects and potential partners. All this also happening in May, which is surprisingly a month of way too many end-of-school events.

Stay tuned. News is coming soon.

Digital Tech Trek Digest [#Issue 2024.12]

March 20, 2024 by ronald

Falsehoods programmers believe about time zones

If I told you there was a timezone 30 minutes past the hour, would you believe me? In a small section of Western Australia, there is. However, time zones (TZ) are way more complicated. If you have not missed a meeting due to a TZ mixup, you haven’t worked in a multi-national company or had a meeting in a country with >1 timezone.

I shared this article with several friends, and this response sums it up. “I think documenting failure is as important as celebrating success. Always share what you’ve learned. And adding in humor makes it human.”

PostgreSQL Schema Changes with pg_osc

While Online Schema Change (OSC) has been part of the MySQL ecosystem for decades, evolving from pt-osc (previously MaaKit) to gh-ost and now spirit this is the first time I’ve head of OSC for PostgreSQL. It likely has existed in some form for some time.

Source: LinkedIn

Pricing your product

One key takeaway from the MicroConf Founder-lead Sales event was pricing. “Be transparent and provide a number.” is basically what I noted as the essence. Well, I’m going to track sites that do not offer pricing to evaluate why. My first two entrants are Vanta and CultureAmp. Why is a price not offered? Is it a high-entry point, too complex to articulate easily, a way to charge the customer what they are willing to pay, or perhaps it’s not credit-card worthy but a detailed commitment? As a child, I remember being told that if I had to ask the price when it was not visible, I could not afford it. Does this apply to SaaS?

Postgres is eating the database world

The Extension is a fundamental growth mindset for PostgreSQL. Combined with the protocol for TCP communications, a robust and growing ecosystem can be seen with PostgreSQL. Add the wisdom of moving to an annual release cadence, and these factors would seem to highlight that PostgreSQL is quickly outpacing MySQL in the RDBMS open-source ecosystem. This article written by the creators of Pigsty, and other players including Tembo and Aiven are re-enforcing the narrative.

Source: LinkedIn

Why should I not upload images of code/data/errors?

This post is a comment by Bill Karwin to Stop with the Video Documentation by Jon Sustar. This post is just the recipe that should be enshrined in the ticket support system of any company when any user tries to upload an image. When a text command is provided, and the response is not provided back in the text, it’s hugely inefficient.

Source: LinkedIn

About “Digital Tech Trek Digest”

Most days, I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new about professional and personal topics of interest. I turn what I read into actionable notes in a short, committed time window, summarizing what I learned, what I should learn and use, or what is of random interest. And thus my Digital Tech Trek.

Some of my regular sources include TLDR, Forbes Daily, ThoughWorks Podcasts, Daily Dose of Data Science and BoringCashCow. Also Scientific American Technology, Fareed’s Global Briefing, Software Design: Tidy First? by Kent Beck, Last Week in AWS, Micro Newsletter to name a few.

New Additions

I have added Building a boring, but wildly profitable, online business portfolio as a new source to review.

Digital Tech Trek Digest [#Issue 2024.11]

March 13, 2024 by ronald

In his Newsletter Solopreneur Ian Nuttal writes, “I sold my startup (again).”

In 4 months URL Monitor scaled far beyond what I expected:

550+ customers
2 million indexed pages
17 million pages monitored
$100k+ ARR

You can follow Ian on Twitter/X. (Will the word every drop the word Twitter, I would say no).

New AWS Console functionality

If you have ever tried to keep up with AWS News and Product Announcements even for a subset of products, let me know how. While I try to keep monitoring, sometimes you accidentally see a new feature. I am not a fan of using GUI interfaces. I’m all about the CLI and APIs. However, one must always spin up the AWS Console to look at what new blurb is being presented in the 10+ database products to

AWS has started offering more in-depth recommendations, but you need to Install or update to the latest version of the AWS CLI to 2.15+ to see them programmatically.

What is the doc format to use

I have moved to use MarkDown (.md) for all of my repo documentation, but there are different Markdown variants (city). I was struck by the above AWS documentation using `.rst,` known as reStructuredText, for its documentation.

MicroConf Remote 8.0: Early Stage SaaS Sales!

I recently this event as I am an entrepreneur looking at how to price

Founder-Led Sales Best Practices: Getting the 80/20 out of your sales efforts but Craig Hewitt

Selling with words: what early-stage startup founders tend to get wrong by Sam Howard. Several attendees, including myself, installed Hotjar following this presentation.

How to Build Scalable Founder-led Sales by Rachel Liaw
How to Build Your First Sales Process as a Technical Founder by Daniel Herbert.

I also got to speak to several founders 1:1, everything from I have an idea, to executing successful startups, including Sponsy (impressive logos) and PlaybookWriter.

About “Digital Tech Trek Digest”

Today’s interesting websites experience

Nice 500 gitHub

Digital Tech Trek Digest [#Issue 2024.10]

March 6, 2024 by ronald

Google advances with vector search in MySQL, leapfrogging Oracle in LLM support

As the title states, GCP is the first MySQL-managed service to offer “vector” support. Clearly the buzz-word of 2024 along with RAG, genAI and LLM is so 2023.

IMO, Oracle should just rename MySQL Heatwave to Heatwave. It would distinguish the product as unique, which is is,

4 Signs You Belong In a Startup Accelerator for SaaS Founders

YouTube

Accelerators help early stage startup and enable businesses to grow faster, get mentorship, obtain funding/equitya and are generally industry specific. In this example Tinyseed focusses on B2B.

Some tips for considering an acclerator.

1. You know your numbers and your focussing on the right ones
– MRR
– Churn
-LTV or ACV – Lifetime value, Annual Contract Value
Vanity metrics/candy metrics. Email Subscribers, Free Trial Users, Unique Visitors. Without context on how they are growing paid users

2. You experimenting, especially in marketing and sales
– You should always be tweaking and finding the right market.

3. Your coachable
– You are going to use the resources provided and you are vulnerable (receiving a lot of feedback, lower your defenses)
– Your accountable to groups, e.g. masterminds
– Be open to criticism

4. You are fast and furious
– Your drive and need to be moving forward, never satisfied with status quo
– you try new things
– Nothing is perfect at the beginning, experimentation is a key

The Founder’s Guide to Stealth Startups

EverydaySpy

In this postcast “The Diary of a CEO“, Steven Bartlett interviews Andrew Bustamante. Andrew is a former covert CIA intelligence officer and US Air Force combat veteran. He is the founder of EverydaySpy, an online education platform that teaches real-world international espionage techniques that can be used in everyday life.

Messaging builds narrative. Don’t use mass marketing via social media, believe in your brand.
Marketing, present a message, crafted with an emotion, responsing showing motivation.

Competition is for Losers with Peter Thiel (How to Start a Startup 2014: 5)

This is a very old presentation that was recently re-shared with me.

In this presentation, Thiel starts off his presentation with the concept of “Avoid competition”

Creating value is a very simple formula of two things. Create X$ for the world, and you capture Y% of X where X and Y are independent variables.

A big piece of a small pie can drastically affect profit margin. All United State airline carriers combined compare with Google. Much smaller, much higher value.

He goes on to talk about effectively two types of businesses, a competitive business or a monopoly, there is no in between.

Are you talking about data the WRONG WAY?

Scott Taylor, a colleague I discovered lived in a neighboring town and whom I could meet in person after attending a virtual conference event, asks a very valid question about the importance of data management. He re-iterates the “3Vs” of effective storytelling, Vocabulary, Voice, and Vision. You can discover a lot more information in his book Telling Your Data Story: Data Storytelling for Data Management. The art of Effective Data Storytelling is something I practice daily. It is easy for a data specialist to have the data facts and visualize the data, but the art is being able to drive change by combining data, visualizations, and what I consider the most important component, narrative. I highly recommend Brent Dykes book of the same name, and when combined with Be Data Driven by Jordan Morrow you have a cradle of strategy when discussing data management to organizations that want to become a data-driven organization. It is way more difficult to implement than any plan and strategy you may read and prepare for.

About “Digital Tech Trek Digest”

New Additions to my reading

SaaS Developer Community

Digital Tech Trek Digest [#Issue 2024.09]

February 28, 2024 by ronald

As an entrepreneur, pricing is an important consideration in any evaluation, development, and customer testing. In How To Price A SaaS Product, we see different pricing strategies, cost-based pricing, competitor-based pricing, penetration pricing, value-based pricing, freemium pricing. None of these match what I am ultimately considering: consumption-based pricing. Pricing is critical to define the value proposition statement and determine the range of the total lifetime value (TLV). It can vary greatly for B2C, B2B, and B2B enterprise offerings. If we look at YCombinator https://www.ycombinator.com/library/6h-startup-pricing-101 a basic principle is determining the gap between price and cost. That is your margin and your incentive to sell, and you work with either cost-plus or value-based pricing. Starting with founder-led sales is difficult as you do not have the luxury of a dedicated and experienced head of sales to work on different models and guide a technical founder, even before you enter the minefield of enterprise sales with applicable bids, contract, and compliance complexities. I am drawn back to “Consumption-based pricing is a pricing model that charges customers based on their product or service usage. Consumption-based pricing calculates pricing based on usage volume rather than the number of users and is a popular pricing model for IT services, SaaS, and cloud computing and storage” Cite: Consumption-Based Pricing.

Moving a Billion Postgres Rows on a $100 Budget

I wrote recently about the 1 Billion Row Challenge (1BR). This week, I found this article on the same number with a different title. The objective was not performance; it was cost. PeerDB enables the efficient extraction of data from PostgreSQL into a data warehouse, such as Big Query, ClickHouse or Snowflake. It was interesting to see Arvo as a format used over, for example, Parquet. The product also offers different streaming modes, including log-based (CDC), cursor-based (timestamp or integer), and XMIN-based. I will need to do further research on this new term XMIN-Based.

Test queries against your production database (responsibly)

This post links off to a YouTube video of The Safest Way to Test Postgres Destructive Queries, which provides a basic introduction to branching of the Neon PostgreSQL DBaaS. While the title originally interested me, the example showing the mechanics is like many other product examples in which it is extremely simplistic and not a true representation of “production” size or workload. I see this as a similar concept to AWS RDS Aurora cloning. However, any example should modify the structure of a table, measure the impact of that structure against production queries (note plural), and provide additional metadata rather than just a response time. These are important considerations in my own evaluation of test coverage of data access and the gathering of configuration, data, and infrastructure when running experiments to determine a more optimal data access path or a new functionality requirement. More documentation can be found here on Neon Branching

About “Digital Tech Trek Digest”

Random Wisdom

This week, I was reminded via a very interesting statement that work-life balance and joy in what you do are critically important. You will not find on a tombstone the statement:

“I never worked enough hours.”

Digital Tech Trek Digest [#Issue 2024.08]

February 21, 2024 by ronald

The One Billion Row Challenge Shows That Java Can Process a One Billion Rows File in Two Seconds

Well, it’s way under 2 seconds for the 1brc. The published results are in, and if you’re good, you can read 1 billion data points of weather data and analyze it. The final best number, as per the article release, is “00:00.323″. Yes, that answer is in milliseconds “Result (m:s.ms)”. Mind-blowing.

ScyllaDB Summit 2024

Last week, I attended this virtual event. All the presentations can be found online. I had never used the product before, so while some new features like Tablets were not as applicable in understanding the full impact, the DynamoDB performance and cost comparisons were very applicable.

So what is ScyllaDB? It is a distributed NoSQL DBaaS that speaks Cassandra protocol (do large companies still use this?), and it speaks AWS DynamoDB protocol. That is really interesting to me. You can choose a Cloud Hosted offering, or if you’re into managing your setup, you can use the Open Source ScyllaDB version available from GitHub. I started at ScyllaDB University to get a grip on the basics. I have yet to try the local Docker Compose setup.

Thanks also to the team for the swag which I received.

Playing a game with your CI/CD pipeline

My friend Sergey has created a game in GitLab called GitTerra. Drop a few lines into your .gitlab-ci.yml, and each build will give you a generated 3D map of a city based on your commit. I look forward to some of his next steps, leveraging potentially different colors for languages or different building structures for artifacts found in your commit.

We raised 11.6M to build Serverless Postgres for Modern SaaS

Congrats to Gwen and her co-founder for getting seed funding for Nile Serverless Postgres for Modern SaaS. Awesome news for an entrepreneur, and I’m very hopeful for the success of Nile.

The Safest Way to Test Postgres Destructive Queries

While I am a user of ElephantSQL serverless PostgreSQL and Neon, Nile and Xata are just a few that are competing in the space. With multiple other products that also speak PostgreSQL protocol, you can easily trial a small product in an RDBMS in the cloud at no cost. PostgreSQL is definitely outdoing MySQL in this space. You have the extensive set of NoSQL Cloud offerings, SycllaDB I just mentioned, and D1 by CloudFlare I have yet to try this branching feature for your database, sounds interesting and I’ve added to my just as long list of products to try, as books to read. Nit: It’s PostgreSQL, not Postgres.

About “Digital Tech Trek Digest”

I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new covering professional and personal topics of interest. Turning what I read into some actionable notes in a short, committed time window is a summary of what I learned, what I should learn and use, or what is of random interest. And thus my Digital Tech Trek.

Digital Tech Trek Digest [#Issue 2024.07]

February 14, 2024 by ronald

Everything you need to know about seed funding for startups

A recent call with a startup founder funded by TinySeed led me to learn about MicroConf and Rob Walling. (Thanks Tony for the info). This has led to a lot of great info in several new newsletters and videos including this video. A few very valuable tips I learned included the answer to Why should you raise funds at all? The 1-9-90 rule, and different types of funding including Indie funding. It was interesting to find out that the TinySeed accelerator is 1 year, and not 13 weeks, which is common in NY. Rather than sharing my notes, go watch the video.

5 Books That Paved My Path to Entrepreneurial Success

I have not heard of any of these books, and I have such a long list, perhaps I need to publish my list and elicit feedback on prioritizing. The list from this article is as follows:

2. Embracing risk with skin in the game
3. Building habit-forming products with Hook
4. The roadmap to a billion dollar app in How to Build a Billion Dollar App

Visualization

Last week I was at two events in Brussels. I chose to head to London to fly home. I found this map present in many tube stations (The tube is the London Metro Subway). It’s been a decade since I was in London, and over two decades since I lived in the UK. I found the new map great. When I mentioned it as a good visualization, I was surprised that locals of the London area thought it was horrible. I saw the value in the visualization, but perhaps others see it like art, “in the eyes of the beholder”. It could also be “habit”.
London Tube Map - 2024 presentation
Typical London Tube map

Cats and Dogs

How many *NIX `cat` memes can there be? Well, a lot cat is the most misused thing by programmers new to Linux. I cringe every time someone uses it wrong in a bash script. Thread below with proper uses of cat only.

Hey, dogs, you are in the count also with HTTP STATUS DOGS. My picks are 300 Multiple Choices, and 429 Too Many Requests for me.

Upcoming Events on my radar

Starting today Feb 14 2024 is the ScyllaDB Summit 2024, more on that next week.
Aiven Live in New York City on Feb 28, 2024
Open Data Science Conference. I had not heard of this before and I’ll not be attending in Boston in April.

About “Digital Tech Trek Digest”

Random

I shared with a colleague on Feb 9. “3 SQL databases walked into a NoSQL bar. A little while later, they walked out because they couldn’t find a table.”

Digital Tech Trek Digest [#Issue 2024.06]

February 7, 2024 by ronald

MySQL Belgian Days 2024 and FOSDEM 2024

In this past week, I’ve been able not just to read or watch digital content online but to meet people in person. In Brussels, first at the MySQL Belgian Days 2024 event, followed by FOSDEM 2024.

There was a wide array of presentations covering many different topics; this is just a summary. Fred talked history of Command Line Monitoring and an intro to the new player Dolphie. Dave Stokes talked security, Sunny Bains gave us a brain dump of TiDB scalable architecture. We got an update on PMM and MySQL on k8s from Peter Zaitsev as well as a chat about his new product coroot. And then a great intro to a new generation of online schema change at scale with Sprit by Morgan Tocker . Alex Rubin shows us how not how to hack MySQL, but how MySQL can hack you. We have all crossed paths as MySQL Inc. employees or MySQL community members since 2006.

Marcelo Altmann gave us a detailed intro of a new era of caching with ReadySet. We also heard updates on Vitess. And that was just Day 1 presentations. The evening event was at the incredibly wall-to-wall packed Delirium Café, sponsored by ReadySet, which we offer great thanks and cheers.

Day 2 was packed with great content about MySQL Shell, MySQL Heatwave ML and Vector, MySQL Router, and the MySQL optimizer from many well-known Oracle MySQLers before amazing awards, Belgian beer, and black vodka, of course.

Congratulations Giuseppe Maxia on your MySQL Legends award at MySQL Belgium Days 2024. It is well deserved for all of your community contributions over the decades.

Check out the details at Unveiling the Highlights: A Look Back at MySQL Belgian Days 2024.

Saturday and Sunday were FOSDEM 24 and its usual location. So many people crossing the university, tunnels, and weird transit paths between all the university lecture halls it can feel like a blur. For the first time, I had no fixed agenda so I could check out random talks on random topics.

A shout-out to many people I know and some new people I met. Colin Charles, Alkin Tezuysal, Walter Heck, Charly Batista, Robert Hodges, Jens Bollmann, Monty Widenius, Matthias Crauwels, Michael Pope, Marcelo Altmann, Emerson Gaudencio, Aldo Junior and tons more I have forgotten to mention by name. There were many conversations also with random community people I didn’t even get names, for example, the team at Canonical.

About “Digital Tech Trek Digest”

Digital Tech Trek Digest [#Issue 2024.05]

January 31, 2024 by ronald

Because the world needs better dashboards

While my professional interests in Building Better Data Insights Faster rely on using visuals and narratives to show data-driven results, “Starting from first principles” is the question you have to ask. Identifying the quality of data sources, the time to delivery, and the confidence of accuracy are critical aspects of any dashboard.

Source: WrapText by Equals

This is the second article I’ve read about Equals in a week, and while I’m not ready to go back to a spreadsheet, this company has some great previous posts with excellent content, such as the 2023 summary and How to ship fast. An appropriate statement would be.

What a year. We embraced AI. We reimagined BI. We waved freemium goodbye. And as the cliché goes, we’re only just getting started.

[Last Week in AWS] Issue #352: New Year, New You, Here’s December in Review

Damm right, I think you are giving too much created by saying “a year”. More than once I had to rewrite code because AWS was years behind standard Python releases. AWS Lambda adds support for Python 3.12.

Whatever was going on with the delays in getting new language runtimes out a year or more after the language version itself was released seems to have been resolved. I wonder how long it’ll take that unpleasant chapter to fade from the collective awareness around Lambda.

Source: Last Week in AWS

Latency is the new outage

While technically a video that I listened to with Getting Started with ElastiCache for Redis Performance & Cost Optimization, this needs to be a slogan used more frequently. It is so true. The speaker in the opening minutes also describes some compelling reasons why our proliferation of data can contribute to a negative impact.

Source: Random AWS reading.

About “Digital Tech Trek Digest”

Most days I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new covering professional and personal topics of interest. Turning what I read into some actionable notes in a short, committed time window is a summary of what I learned, what I should learn and use, or what is of random interest. And thus my Digital Tech Trek.

Digital Tech Trek Digest [#Issue 2024.04]

January 24, 2024 by ronald

NoOps and Serverless solutions

I was reminded of an upcoming expiry of a test website that I have on PythonAnywhere. This site enables you to host, run, and code Python in the cloud without any infrastructure and starts with a free account and then a $5 account. Striving towards NoOps and serverless is an important consideration for any small and simple application, I’d forgotten completely about this service.

5 IT services industry trends on tap for 2024

As major companies either want to use a service provider or maintain a relationship with one, knowing the trends lets you consider what SaaS providers of all the services you use like authentication, security, chatbots, support systems, and more are thinking about.

This article considers these trends:

Cloud cost optimization
Focused transformation, innovation
Investment in generative AI skills
Vertical market focus
Partner programs, reconsidered

Rapid developments in AI will also shape business prospects for consulting firms, MSPs, and systems integrators. AI could potentially provide a way to deliver new capabilities in shorter timeframes that satisfy the C-level demand for a quick ROI.

Source:https://www.techtarget.com/searchitchannel/feature/IT-services-industry-trends-on-tap

Context switching is killing your productivity

I believe the title says it all. The article provides several ways to combat this productivity killer.
Source: https://asana.com/resources/context-switching

Exploding Topics

A colleague pointed me to Exploding Topics. An interesting look at the growth of certain topics over recent years. I’m not sure if they are measuring, articles, products, websites, or just conversations on the topic in question.

Thoughtworks Technology Radar

I spent a lot of time reviewing the recent Thoughtworks Technology Radar. I was hoping that 2024 would issue a current version however Sep 2023 is still recent. My thoughts on the tools, techniques platforms, and frameworks in vogue I’ll leave for a separate post.

Why I’m excited about profit-sharing startups

Every year there is a list of the startups that failed and 2023 failures was no different. There is also the list of likely IPOs for the year. Is it going to be Space X, DataBricks, and Reddit for example.

This article along with a host of links reaching out to sites such as Creator Fund, Humanism and Weekend Fund and other interesting stories re-iterate that it is great people and not great ideas that are the right way of being an entrepreneur. The concept of investing that asks for a return of 1-5% of future earnings is an interesting movement from going down the VC slog.

The article lists these points:

There’s a culture shift in tech toward profit-generating businesses.
There’s a tech shift that enables talent to build more with less.
There’s a regulatory shift that makes exits challenging.

… believe a few big shifts will drive more founders and investors to pursue profit-sharing models in 2024 and beyond.

This tweet talks about Gumroad issuing dividends back to our investors. I always understood that investors wanted to see a return, or a positive change in the return capabilities within a 5 year horizon. Also interesting is this Challenging your assumptions about starutps video.

Combined with Why the Future of Startups are Studios really helps me consider what I started back in 2011 with a number of technology leaders in New York as a viable alternative to what we know about funding a startup. We were always able to get through the first 3 steps easily.

Generate an idea
Flesh out the idea
Launch and experiment
Create a project
Create a big company

I believe Graham was ahead of his time with Ultra Light Startups some 15 years ago.

Source: TLDR

About ‘Digital Tech Trek Digest’

Most days I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new covering the professional and personal topics of interest. Turning what I read into some actionable notes in a short committed time window is a summary of what I learned today, what I should learn and use, or what is of random interest. And thus my Digital Tech Trek.

Digital Tech Trek Digest [#Issue 2024.03]

January 17, 2024 by ronald

Lessons from going freemium: a decision that broke our business

As an entrepreneur always considering how to produce a sustaining passive revenue, what licensing model to use, and how to acquire and retain customers, the allure of a freemium model is ever present in so many offerings. You may wish to read this article and look at the visualizations provided with the narrative. I found this a useful data storytelling example.

The allure of seeing a new product is the strongest motivator new users have to complete setup. If you make onboarding too easy, they’ll never come back to do the hard task you let them skip.

Newsletters and online content creators

Lenny’s Newsletter from the prior article, listed with over 574,000 subscribers is one of several Substack newsletters I subscribe to. Substack is described as “The subscription network for independent writers and creators”. I have been collecting the number of subscribers from several newsletters I follow, however, there is no way to see that growth over time. Also missing are the price rates over time and the ratio of free to paying subscribers. Random Idea: What is missing is a history of this information. Other stats I’ve noted previously include 66,000 subscribers for Kent Becks Newsletter with 3 subscription plan offerings, 1,250,000 subscribers for the free TLDR (I can remember this years ago being much less) and 65,000 subscribers for the Seattle Data Guy newsletter.

FWIW this post from Lenny’s Newsletter This newsletter is growing up is from 2020.

Golden Kitty Awards 2022

I came across the Golden Kitty Awards, which unfortunately are only current to 2022 (fail on being current). It was interesting to scan the list for innovative ideas. I’ve yet to visit any sites, but I’m always encouraged by what people think of and commit to building, regardless of the motivation or incentive. What counts is an entrepreneur takes an idea and releases a product.

Source: Random

Streamer JS – Video stream layout manager for OBS Studio and other streaming applications.

I am a new user of Twitch streaming for personal projects. My good friend Sergey Chernyshev organizer of the large New York Web Performance Group has created Streamer JS as a means to drive more dynamic content in the browser and with common languages of HTML/CSS/Javascript and using OBS more as the streaming only component. One objective is better version control management of assets/scenes/sources/filters/etc. It’s interesting that PouchDB is an eventually consistent distributed datastore in Javascript. Yet another simple data store to review for suitability.

Source: Word of mouth

The 37th Chaos Communication Congress (37C3) by the Chaos Computer Club

Last month I was introduced to the Chaos Computer Club. This large German-based annual tech conference focuses on security & infrastructure/hacking. Over 100 talks from the most recent event last week have been posted here.

Source: Word of mouth

About ‘Digital Tech Trek Digest’

Most days I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new covering professional and personal topics of interest. Turning what I read into actionable notes in a short, committed time window summarizes what I learned today, what I should learn and use, or what is of random interest. And thus my Digital Tech Trek.

Digital Tech Trek Digest [#Issue 2024.02]

January 10, 2024 by ronald

Indie Newsletter Tool Generates $15,000 a Month

There are so many different email newsletter sites you could wonder if there is market saturation. MailChimp, Mailgun, ConvertKit, Sendgrid (now part of Twilio it seems), Moosend and Mailersend come to mind.

It seems the space still has plenty of revenue-producing options including buttondown.email reportedly a side gig generating $15k per month. Source: BoringCashCow

When I asked a good friend and author of the Technical SEO Weekly his use of ConvertKit directed me to this Baremetrics Dashboard which is another product to look at sometime.

LLMs and Programming in the first days of 2024

How do use an LLM? If you are still on the fence start getting into the habit of using it more frequently then start. I now use ChatGPT and Claude AI daily, and with a crowded market there are many other emerging technologies to also consider.

I use ChatGPT for coding and image generation with DALL.E. I use Claude more for reviewing large documents that seem to be ideal for producing a summary, or to generate a fictitious movie script from those documents.

I do not like Javascript nor do I wish to actually learn this language however I write it daily via ChatGPT. Javascript is the ever-changing technology of web development and it’s impossible to keep up with the next product, or version of a product you may know. ChatGPT helps me navigate this combined with asking for HTML and TailwindCSS. However, it’s not perfect, you need to be an experienced engineer who has learned how to write code for many years to ask the right questions and to correct the LLM when it does not produce what you expect. Let’s look at CSS. Now there is flex and grid and it’s hard to keep up with changing features that browsers support. This is where ChatGPT has helped me. I have been using Tailwindcss but it still took an expert friend 30 minutes to help me debug a CSS formatting issue of a future OBS Twitch streaming project to correctly size the content all in a 1920×1080 box. I learned a lot of new features of Google Chrome Developer Tools Inspector I did not know and are probably just the start of expert debugging features.

Until a few months ago I never knew it’s now much easier to read JSON in Javascript.

async function fetchData() {
  try {
    const response = await fetch('data.json');
    const data = await response.json();
    console.log(data);
    return data
  } catch (error) {
    console.error('Error fetching data:', error);
  }
}

let data = await fetchData();

I’d like to remind users that ChatGPT can make mistakes. Consider checking important information.. Source: TLDR

ParadeDB (GitHub Repo)

Every day there is another PostgreSQL product to review. I am a current user of ElephantSQL which I didn’t know existed two months ago. Neon and Tembo are two more PostgreSQL serverless-related products on my product review list. Now adding ParadeDB as well as reading Thoughts on PostgreSQL in 2024.

About ‘Digital Tech Trek Digest’

Most days I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new covering professional and personal topics of interest. Turning what I read into some actionable notes in a short committed time window is a summary of what I learned today, what I should learn and use, or what is of random interest. And thus my Digital Tech Trek.

Some of my regular sources include TLDR, Forbes Daily, ThoughWorks Podcasts, Daily Dose of Data Science and BoringCashCow to name a few.

Digital Tech Trek Digest [#Issue 2024.01]

January 3, 2024 by ronald

The Tiny Stack (Astro, SQLite, Litestream)

I spent many years in the LAMP stack, and there are often many more acronyms of technology stacks in our evolving programming ecosystem. New today is “The Tiny Stack”, consisting of Astro, a modern meta-framework for javascript (not my words), and Lightstream Continuously stream SQLite changes. I’ve never been a fan of Javascript, a necessary evil in modern stacks, but it changes so rapidly it’s a constant stream of new products with never the time to learn any. Lightstream is interesting. Replication of SQL operations to a database is nothing new, the Change Data Capture (CDC) of your data, however, I’d not thought of SQLite which is embedded everywhere offered this type of capability.

Amazon Aurora Introduces Long-Awaited RDS Data API to Simplify Serverless Workloads

AWS Aurora Serverless version 2 has been out for at least a year (actually 20 months – Apr 21, 2022), but a feature of version 1 that was not available in version 2 is the Data API. This is for developers without SQL skills to have a RESTful interface to the database, however, it only works in AppSync and only for recent versions of PostgreSQL and only in certain regions. I’ve never used it myself, but it is news.

Speaking of what is available in what AWS region, recently released InstanceHunt allows you to identify the instance families/types available in different regions across various AWS Database services. I developed this in just a few days and released it only last week as a working MVP. Future goals are to include other clouds and other categories of services such as Compute. The prior announcement may facilitate a future version that supports the features of services in regions.

Stop Stalling And Start Your Dream Side Business In 2024

The title kinda says it all. As an inspiring entrepreneur, my pursuits have only offered limited minimal success over the decades and never a passive revenue stream. While the article did not provide valuable nuggets, the title did. One of my goals for 2024 is to elevate my creation and release of side projects, regardless of each project being a source of revenue. I consider refining my design, development, testing, and implementation skills and providing information of value, are all resources of a soft income that showcase some of my diverse skills.

About ‘Digital Tech Trek Digest ‘

Most days I take some time early in the morning to scan my inbox newsletters, the news, LinkedIn, or other sources to read something new covering professional and personal topics of interest. Turning what I read into some actionable notes in a short committed time window is a summary of what I learned today, what I should learn and use, or what is of random interest. And thus my Digital Tech Trek.

Announcing InstanceHunt

January 2, 2024 by ronald

InstanceHunt identifies the instance (families/types/classes) available for a cloud service across all the regions of that cloud.

The initial version is a working example of several AWS database services. Future releases will enable advanced filtering and will cover other service categories (e.g. compute) as well as GCP and Azure cloud platforms, as well as providing the full list of instance types within families within the service matrix.

For a few days investment this MVP is a usable service, complete with adding new regions the same day, for example ca-central-1 data was available the day of release. It is interesting and can answer questions like what regions the new generation 7 instance families are available? What consistent instance types can use use across Europe regions? Where is MemoryDB not available?

Feature requests are welcome. From today’s reading, being able to show a feature of a service may be also a useful future matrix, e.g. AWS Aurora Serverless Data API now available in Serverless v2, but only one of two engines and only in a few regions.

China regions and AWS Gov Cloud regions are coming soon.

InstanceHunt - Find what instances you can use for your cloud services