What is the right length of a blog post?

A question without a definitive answer. Opinions from authoritative sources can also be easily obscured by search engine optimization, or even by the choice of words used while searching.

I initially used the following search terms in Google and DuckDuckGo.

  • what is the right size of a blog post
  • what is the ideal length of a blog post

I then started typing the term “ideal blog post”, and here are the type-ahead responses. Clearly “length” is the definitive winner in word association. My first thought was “size”; is that a technical difference?

DuckDuckGo

  • ideal blog post length
  • ideal blog post length for seo
  • ideal blog post size
  • ideal blog post length 2021
  • ideal word count for a blog post
  • ideal length for a blog post
  • ideal length of a blog post
  • ideal length for a blog post

NOTE: Size mentioned only once.

Google

  • ideal blog post length
  • ideal blog post length for seo
  • ideal blog post title length
  • ideal blog post length 2022
  • ideal blog post length for seo 2022
  • ideal blog post length 2021
  • ideal blog post length 2020
  • ideal blog post length for seo 2020
  • ideal blog post frequency
  • ideal blog posts

NOTE: Size is not mentioned once. As a result, the original title of my post was changed from “size” to “length”.

Search Outcomes

I started with Google, which now often provides a summarized result (known as a featured snippet) before “People also ask” suggestions, and ad results that appear even before the actual ranked results.

what is the right size of a blog post – Google

2,100-2,400 words
For SEO, the ideal blog post length should be 2,100-2,400 words, according to HubSpot data. We averaged the length of our 50 most-read blog posts in 2019, which yielded an average word count of 2,330. Individual blog post lengths ranged from 333 to 5,581 words, with a median length of 2,164 words. Mar 2, 2020

ideal blog post length – Google

about 1,500 to 2,000 words
Although your blog post length may vary depending on your topic and audience, it is often best to aim for about 1,500 to 2,000 words for articles or posts. Longer pieces seem to do better when it comes to ranking on SERPs.

DuckDuckGo

I have not yet seen DuckDuckGo create a single-answer summary, nor does it in these examples. Probably, IMO, a good thing.

what is the right size of a blog post – Bing

Branching out, I was curious what other engines provided.

1,600 words – According to 2 sources

And then a non-copy/paste answer that I had to extract from the developer tools:

In the infographic “ The Internet is a Zoo: The Ideal Length of Everything Online ” from Buffer, they find that the ideal blog post length is 1,600 words. But some sources think a good blog post should be even longer than that. In a Medium article, the writer says that posts with an average read time of 7 minutes captured the most attention.

According to research done by popular blogging platform, Medium, the ideal length for blog posts is 1,600 words (or seven minutes of reading). This number is based on an analysis of the “average total seconds spent on each post and compared this to the post length.”

ideal blog post length – Bing

To sum up, here’s a list of common blog posts lengths to help you find your own ideal length:

Micro content: 75–300 words. Super-short posts are best for generating discussion. They rarely get many shares on social…
Short-form content: 300–600 words. This is the standard blogging length, recommended by many “expert” bloggers. Shorter…

More …

what is the right size of a blog post – Yahoo

Above the fold, after ads and before “People also ask” and the actual results, was:

For SEO , the ideal blog post length should be 2,100-2,400 words, according to HubSpot data. We averaged the length of our 50 most-read blog posts in 2019, which yielded an average word count of 2,330. Individual blog post lengths ranged from 333 to 5,581 words, with a median length of 2,164 words.

ideal blog post length – Baidu

The homepage was all Chinese and I wasn’t sure if I should continue, but I pasted the English term, hit the button, and got results in English.

The text of the first search response was something I’d not seen on any other page, so for reference, apparently there are blog styles :)

Ideal Blog Post Length for SEO Blog posts vary in length from a few short paragraphs (Seth Godin style) to 40,000 words (Neil Patel style).

What an SEO SME says

So I reached out to my most knowledgeable friend in SEO and asked them the question directly: without googling or searching online, based on your subject-matter expertise.

Q: What is the right size of a blog post?
A: You mean content length? 1500 to start, ideally more towards the 5,000 or 10,000

Q: What is the best reading time for a blog post?
A: depends – long form vs short – some times a simple paragraph is all you need. Other times you want a book.

Summary

The following uses what the engines provide as a single recommendation, not the top organic search result.

Source          Response
Google          1,500-2,000 or 2,100-2,400, depending on the question
DuckDuckGo      -
Bing            1,600 (the only one to mention a reading time: 7 minutes)
Yahoo           2,100-2,400
Human SEO SME   1,500

Additional Helpers

A recent addition to my short-read email summaries of useful articles is TLDR. While this is not new information, the inclusion of “1 minute read”, “2 minute read”, “11 minute read” is useful data to me in making an informed decision based on the factors at the moment. Other information that helps with this example, which is a newsletter, is 300,000 Subscribers and a 43% Open Rate. There are also other data points that help, and these could narrow your audience and determine what you may consider an ideal size.

Returning to the summarized results of the various search engines, only one, Bing, provided this additional measurement of time, and the answer was an “average read time of 7 minutes captured the most attention”, which translated into 1,600 words.

I cannot offer any personal validation of either of these data points, but I should perhaps start collecting it.

Conclusion

What is the answer? Well, only your target audience can inform you of this. The questions then are: who is your target audience, and is your target audience who you think they are?

For the record, my last blog post was 1,973 words long, and this one is 1,216 words long, therefore averaging 1,594 words. NOTE: These numbers were the original versions’ lengths, both of which have changed/evolved over time with additional feedback.

This leads to a more important question. How are you measuring the impact of your blog posts and how does size/length/time play a role in that?

Sidebar: Is a blog post actually the best way for people to read your content, or at least gain insights into what may be useful for your readers? Is a newsletter a better option?

Going back to the TLDR newsletter for a moment, this information can be found on the website.

  1. Highly technical audience, primarily software engineers and other tech workers
  2. 30% United States, 10% United Kingdom, 10% Canada, 25% other EU, 25% other non-EU
  3. 50% ages 25 to 34, 20% ages 18 to 24, 20% ages 35 to 44, 10% other
  4. Primary sponsors get between 1000 to 1250 clicks
  5. Developer sponsors get between 750 to 1000 clicks
  6. Subscribers from companies like: Google, Amazon, Facebook, Apple, … (it’s interesting this is a list of logos, and what order they are in, FWIW)

I do not have access to the data, so I am unable to gain more insights as to which articles are most read based on time. Hint: that would be an interesting infographic for TLDR to publish.

I would ask how they know points 1 and 3 about me without additional data mining providing this detail. I provided an @gmail email address, and my location can be determined via IP.

Getting a clearer picture of HTTP response time breakdown via the CLI

I came across this handy Python script https://github.com/reorx/httpstat that provides an HTTP response time breakdown in text. This saves you having to open up a browser and look at a visual network response waterfall.

For example, using my website homepage and blog for comparison.

$ python httpstat.py http://ronaldbradford.com

HTTP/1.1 200 OK
Date: Fri, 23 Sep 2016 16:52:09 GMT
Server: Apache/2.4.7 (Ubuntu)
X-Powered-By: PHP/5.5.9-1ubuntu4.17
Vary: Accept-Encoding,User-Agent
Cache-Control: max-age=1
Expires: Fri, 23 Sep 2016 16:52:10 GMT
Transfer-Encoding: chunked
Content-Type: text/html

Body stored in: /var/folders/mk/0v6thtzd7mb9sb9r4fhv4bcc0000gn/T/tmpK_foIX

  DNS Lookup   TCP Connection   Server Processing   Content Transfer
[    72ms    |      27ms      |       35ms        |       39ms       ]
             |                |                   |                  |
    namelookup:72ms           |                   |                  |
                        connect:99ms              |                  |
                                      starttransfer:134ms            |
                                                                 total:173ms
$ python httpstat.py http://ronaldbradford.com/blog/

HTTP/1.1 200 OK
Date: Fri, 23 Sep 2016 16:52:39 GMT
Server: Apache/2.4.7 (Ubuntu)
X-Powered-By: PHP/5.5.9-1ubuntu4.17
X-Pingback: http://ronaldbradford.com/blog/xmlrpc.php
Vary: Accept-Encoding,User-Agent
Cache-Control: max-age=1
Expires: Fri, 23 Sep 2016 16:52:40 GMT
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8

Body stored in: /var/folders/mk/0v6thtzd7mb9sb9r4fhv4bcc0000gn/T/tmpn5R1f2

  DNS Lookup   TCP Connection   Server Processing   Content Transfer
[     5ms    |      34ms      |       129ms       |       790ms      ]
             |                |                   |                  |
    namelookup:5ms            |                   |                  |
                        connect:39ms              |                  |
                                      starttransfer:168ms            |
                                                                 total:958ms

Note that 301 redirects are not handled, so be sure you are getting the full content you expect in a request.

$ python httpstat.py http://ronaldbradford.com/blog

HTTP/1.1 301 Moved Permanently
Date: Fri, 23 Sep 2016 16:52:22 GMT
Server: Apache/2.4.7 (Ubuntu)
Location: http://ronaldbradford.com/blog/
Cache-Control: max-age=1
Expires: Fri, 23 Sep 2016 16:52:23 GMT
Content-Length: 322
Content-Type: text/html; charset=iso-8859-1

Body stored in: /var/folders/mk/0v6thtzd7mb9sb9r4fhv4bcc0000gn/T/tmptLSJTv

  DNS Lookup   TCP Connection   Server Processing   Content Transfer
[     5ms    |      61ms      |       39ms        |        0ms       ]
             |                |                   |                  |
    namelookup:5ms            |                   |                  |
                        connect:66ms              |                  |
                                      starttransfer:105ms            |
                                                                 total:105ms
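
If you would rather stay in PHP than Python, curl exposes the same timing phases. The following is a minimal sketch of the same breakdown (assuming the curl extension is enabled); printing the status code also guards against the 301 caveat above.

<?php
// Minimal sketch: the same four phases using PHP's curl extension.
$ch = curl_init('http://ronaldbradford.com/blog/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_exec($ch);
$i = curl_getinfo($ch);
curl_close($ch);
printf("HTTP %d\n", $i['http_code']);   // a 301 here means you timed the redirect, not the page
printf("DNS Lookup:        %4.0f ms\n", $i['namelookup_time'] * 1000);
printf("TCP Connection:    %4.0f ms\n", ($i['connect_time'] - $i['namelookup_time']) * 1000);
printf("Server Processing: %4.0f ms\n", ($i['starttransfer_time'] - $i['connect_time']) * 1000);
printf("Content Transfer:  %4.0f ms\n", ($i['total_time'] - $i['starttransfer_time']) * 1000);
printf("Total:             %4.0f ms\n", $i['total_time'] * 1000);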

The agile software development lifecycle responsibility

The eXtreme Programming (XP) methodology places emphasis on a number of core principles for agile software development. These include (and are not limited to) the planning game, short and frequent iterations, testing, frequent refactoring, continuous integration, ownership and standards.

Identifying the problem

These core principles, however, do not cover the full lifecycle of software development; they are really only a portion of it. What is lacking is a definition of the ongoing responsibility and ownership by the creators of software for the sustainability of said software over its lifetime of use and benefit to an organization.

An agile methodology approach (of which XP is just one) fails to expose and describe the full operational cost within development, testing and deployment. Just as a single line of code is viewed a hundred times more often than the time it was written, the usage of that code over the full lifecycle of an organization is potentially an order of magnitude more investment of time and resources.

Software development is not just about new feature creation. It is also about consistently ensuring full product ownership and responsibility, and about ensuring that in a larger organization, compatibility and consistency with other products can occur. In other words, it is thinking of software for the whole organization, rather than as the sum of individual parts.

Scheduling lifecycle management time

Development and engineering resources already apportion time between planning, development and unit testing. There needs to be a second, more important consideration: an apportionment of time between product features, product stability and product maintainability.

A good assignment of time to cover the full lifecycle adequately is:

  • 60% of time to feature design, development and product support (i.e. bugs)
  • 20% of time in stability and sustainability management of the existing technology stack (i.e. refactoring and testing)
  • 20% of time in overall lifecycle management of delivered functionality (i.e. ongoing ownership)

Conveniently, this Pareto allocation can be seen as 80% for development time and 20% for time generally considered operations.

Sustainability Management

Remember the core principles of XP that included frequent refactoring and standards. How much time is spent on refactoring code to provide a better, more consistent, more testable codebase for an application after code is initially deployed? What about across multiple applications in your organization? Engineering resources rarely invest any time, let alone actively schedule time, for code maintenance across the entire engineering organization, yet there are immediate benefits. It can be amazing how much more performant a system is when unnecessary code is simply deleted from software that has gradually evolved over time. The compounding benefits can mean less code for developers to view, adding incremental efficiency. Less code to deploy also means a smaller installation and application footprint, particularly when the code is unnecessarily executed in the common usage path.

Engineering teams in general are more focussed on delivering new functionality, or fixing issues with newer functionality, rather than reviewing existing functionality for optimization, consolidation, replacement or removal. What about applying an improvement to not just one application, but to multiple applications across an organization whenever possible?

There is generally at least one individual at each organization with the attitude of “Do I write the line of code, or is there a better way?”, and “What code can be deleted as it is no longer (or was never) used?”. If all engineers considered, evaluated and implemented these concepts as a daily process, code would be more stable and more lean. Does your organization have a recognition for the developer who has deleted the most lines of code from your production system?

The following is an example of a single developer’s improvement to a production system via deletions.
[Image: GitHub contribution summary showing the developer’s deletions]

Are there better ways of implementing functionality with the version of the technology stack already in use? Many times a newer version of software is used for one feature, but what other new or improved features also exist? This is a proactive measure: look at the features of the technology in use. This is a different type of refactoring, but the same concept of code reduction. A great example here is the use of an iterator design pattern rather than a loop. In the initial deployment of an application, memory optimization may not have been obvious, however over time and with increasing datasets this simple proactive action has a larger benefit for the application.
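
As a minimal sketch of that iterator point (the mysqli connection, table and query are hypothetical, and generators require PHP 5.5+), compare building the full array against yielding one row at a time:

<?php
// Loop approach: the entire result set is accumulated into an array in memory.
function all_rows_array(mysqli $db) {
    $rows = array();
    $result = $db->query("SELECT id, payload FROM large_table");
    while ($row = $result->fetch_assoc()) {
        $rows[] = $row;
    }
    return $rows;
}

// Iterator approach: a generator yields one row at a time, so memory use
// stays flat no matter how large the dataset grows over time.
function all_rows(mysqli $db) {
    $result = $db->query("SELECT id, payload FROM large_table", MYSQLI_USE_RESULT);
    while ($row = $result->fetch_assoc()) {
        yield $row;
    }
}

foreach (all_rows($db) as $row) {
    // process one row at a time
}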

A final step in improving the sustainability of the software is testing. An agile approach introduces unit testing, but testing does not stop with the validation of a single line of code. Testing encompasses how that code functions within the entire system, often known as functional testing. Systems often require load testing to know the capacity before failure, not after it occurs. If as much time were spent in these two additional areas of testing as is spent in unit testing, more robust systems would exist, and the unseen benefit is the productivity to spend more time developing.

Here are a few customer examples of refactoring. Unfortunately this is an all too common occurrence.

Module bloat

An assessment of the technology stack for a newly deployed application (i.e. just a few weeks old) showed a long list of PHP and Apache modules. Without any justification as to why these modules were used, and without a willing engineering sponsor, it took quite some time to first produce automated deployment duplicating this custom environment, then applicable testing to strip out what was ultimately unnecessary. The overall outcome had multiple effects. What was needed to operate the system was actually documented. What was needed was actually automated to assist in future deployments. The resulting software was more performant as it had less baggage. The resulting deployed VM image was over 1GB smaller after all bloat was removed, which improved the time to deploy new application servers. As this system scaled up and down on a very large scale weekly (we are talking 1000% at peak times), a leaner stack had a huge impact on the true deployment times of the application. This is an attribute that can be difficult for developers to appreciate when comparing a development environment to a production system.

This entire process and the large investment of work would have been almost non-existent if it had been part of the engineering methodology used during initial development (which took over one year before initial deployment), and if more (or all) individual developers had stopped to ask why we are adding additional modules. This is part of the infrastructure planning that should have a feedback loop within each iteration. This also requires both solid experience in engineering and architectural oversight to be able to estimate the impact over a much larger time period than the development cycle.

Framework bloat

An education-based client faced a huge problem. The existing system had grown over a number of years, and the engineering department had grown from one developer to over a dozen developers, yet the approach towards software development had not changed from the original single-developer, module-based Drupal approach for a small application. With sales for the next annual education cycle already 4x more than the current user base, which was having regular outages, the system could not (and would not) sustain known future sales.

Often the first question asked by clients in this situation when offering performance services is “How can I scale my system 10x?” I generally counter this question with “How did you scale from when your system was 10x smaller to now?”. Aside from the interesting conversations around these responses, I often need to explain that performance is about efficiency, and this often requires a cultural change. I also generally quote one of my popular lines: “When reviewing the performance of a piece of code (or SQL statement), the first objective should not be to make it better; the first objective should be to eliminate it.” This is generally received with blank stares and silence. Efficiency, it seems, is no longer taught or practiced.

As with most simple yet profound assessments, an example from the client’s production system can best demonstrate what inefficiency is. An analysis of the user registration process unveiled an alarming result, and this analysis can happen in a very short period, e.g. an hour. In summary, 50 SQL statements were executed to register a new user to the system. A physical desk check (again foreign when, as a visiting consultant, you have to ask multiple people how to print something out) of just the database access showed that with the present inefficient Drupal ‘node’ schema design, just 11 SQL statements were actually needed to complete the required task. That is, the code could be roughly 500% more efficient, and nothing has been tuned or scaled. The client needed at least a 400% immediate improvement.

However, just explaining this did not convince the organization’s C-level executives to reset poor development practices and address immediate and ongoing scalability (i.e. the success of your startup). They wanted a more abstract approach; they wanted a magically sharded solution where simply throwing H/W (and $) at the problem made it go away without changing the engineering mindset. If you go back to the answer to my counter-question, you find this is often the solution used to get to the current point: add more servers, add caching, add read-only data access. This is not actually a solution; it adds complexity to the problem and makes it more expensive to correct. In the startup ecosystem this is also known as a successful catastrophe: you reached all of your marketing and sales pitch goals, and your software crumpled under your unplanned success.

Was this problem just in user registration, or was it throughout the entire application? If in one common and frequent code path a 500% improvement can be made with 0% feature impact, would that not indicate the problem exists elsewhere in the codebase? In fact, this example was not even the classic RAT v CAT (row-at-a-time versus chunk-at-a-time processing) that is often a more compounding performance issue.

Further assessment of this one code path demonstrated that with an optimal schema design architected for the purpose of the application, the number of SQL statements would be reduced to 5 (i.e. a 900% improvement). This is a significant performance and scalability benefit when using applicable architectural design and strategic planning. Performing regular architectural reviews by skilled resources as part of your business strategy can help to address development productivity regressions long before they occur. A great architect never sees the true benefits of their work. It is a silent reward that their given experience, knowledge and expertise has an unknown financial value to an organization.

Lifecycle Management

It can be difficult to understand the impact of code in the full lifecycle of a software product in the 21st century. Until individuals have seen the birth, growth, support, longevity and death of a system, it can be impossible to understand the impact some lines of code have within one application, and the interoperability requirements with other applications. When the waterfall approach to the SDLC was still in active use, this was possible with large-scale projects over time. In the post-tech-boom age, and with the use of agile methodologies, the incremental development lifecycle hides a lot of important context for better assessment of true cost savings.

The introduction and increasing popularity of devops and site reliability roles also attempts to hide what many large organizations and successful websites have: a dedicated operations team. Tools have done so much to enable engineers to be more productive. Automated provisioning, PaaS and CI/CD tools seamlessly enable more (abstract) code to be written to provide that essential functionality to the end user. Automated testing has replaced design documents. Organizations develop systems without so much as a data model. All of these tools and techniques, however, do not replace the intelligence needed to operate a system over time, particularly for tasks including upgrades and integrations.

One simple concept can be implemented to assist all contributors in owning lifecycle management.

That concept is making a developer responsible for being paged when a production problem occurs due to the line of code they wrote. Accepting that in the early morning or on a weekend you may be needed to address a problem attributed to your individual work, and to a failure within an entire system, may make the decision to consider the larger impact more prevalent. This takes the XP principle of ownership and extends the time dimension to a period infinitely greater than the present iteration.

The following is a great tweet that shows this developer has heard of commenting their code, but has not considered lifecycle management:

// When I wrote this, only God and I understood what I was doing
// Now, God only knows

Justifying the reallocation of time

In the 1990s, the concept of adding a quality step to software development by means of code reviews and automated testing was seen as an impediment to productivity. This potential cost in lost productivity could not be justified. Why would developers write tests when there is an entire QA team to test new features each time the software is released? Today it is seen as an essential component of continuous integration and delivery, and the testing is designed to test all functionality repeatedly, not just new functionality.

Assigning 40% of present development time elsewhere could be viewed as a loss of productivity, because today projects do not have a start date, an end date and deliverables against which a total cost of ownership could be clearly calculated. Today, projects are a continual, ongoing evolution; even the concept of cost projection simply does not exist, and could therefore be stated as impossible to validate against. After more than a decade of working with startups at many stages of evolution, as an outside observer I see that the cost of not undertaking stability and lifecycle management is a far greater long-term cost to an organization. Look no further than the much larger turnover of technology staff in today’s organizations. These resources have institutional knowledge that is lost to the organization. This information is rarely documented as a historical artifact, and the reason why steps were taken cannot be inferred from the present state of the code (or even from reviewing the code revision history). This cost is rarely calculated within the software development lifecycle.

Adopting ownership

Many organizations suffer from the clash of traditional infrastructure principles with the pace of accelerated innovation. This approach helps to better balance responsibility, particularly between engineering and operations departments, and improves the workflow towards producing better products for the business in the longer term, and ultimately for those who matter: the customer.

When developers value the total impact of a line of code in the full lifecycle of the product or service, a different mindset leads to actually writing better code. That code is more efficient, and the carryover effect is that the developer is actually more effective at writing subsequent code.

What is testing?

In software development this is a simple question. What is [the purpose of] testing? If asked to give a one-sentence answer, what would you say? I have asked this simple question of attendees at many presentations, and also of software developers I have worked with or consulted to.

The most common answer is: “Testing is about making sure the software works, that the function you’re testing does what it should, for example saves the information you entered.”

Unfortunately this is not the purpose of testing, and this attitude leads to what I generally term poor quality software. “Testing is about trying to break your product any way possible, all the time.”

With this clarification in understanding of a basic and necessary software engineering principle, the attitude towards software development and the entire focus and mindset of engineering and quality assurance can change for the better.

Another very simple example which I often ask about when consulting: what does your website look like when it’s down? Again, the general answer is often vague and/or incomplete. How do you know when your website is down? I have heard the response “The users will let you know”. You may laugh, but it is certainly not funny. Show me your website in a down state. Show me your website in a degraded state. When the answer is unclear, or a recent hire gives the same response, there has simply been little thought into producing a quality product via a testing process that is intent on breaking your software.

What procedures do you follow when receiving alerts about errors? What procedures do you put in place to ensure they do not happen again? Again, one has to be disappointed when the response is “I will set up an email alert to the team for this type of error.” This reactive response is not addressing the problem, only acknowledging the existence of a problem. What is needed is to be proactive. Was a bug raised? Can the problem be easily reproduced? How was the problem fixed the first time? Can this be corrected in the code? Can the interim resolution be automated?

When there is a negative user experience from any type of failure or error another important feedback loop is the post-mortem to review the when, why, how and who of the situation and to create a plan to ensure this does not happen again.

Testing needs to be baked in to everything that is done, and practice makes for a more perfect outcome. In a high-volume environment it is critical to have a simulated environment where you can benchmark the performance of any new release for regressions. A well-defined load testing environment can be used to review experimental branches of possible performance improvements. It is also where you can determine the bottleneck and breaking point as you increase load 2X, 5X, 10X. It is impossible to be proactive when your system can fail at 2X load, and the engineering resources needed to implement a solution will not happen in time.
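
As a rough sketch of that kind of stepped load sampling (the URL and load steps are placeholders; real load testing needs proper tooling and production-like data), PHP’s curl_multi API can fire batches of parallel requests and report the average response time:

<?php
// Fire $concurrency parallel requests and report the average response time.
// Increase the step until the response time (or error rate) breaks down.
function sample_load($url, $concurrency) {
    $mh = curl_multi_init();
    $handles = array();
    for ($i = 0; $i < $concurrency; $i++) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($mh, $ch);
        $handles[] = $ch;
    }
    do {
        curl_multi_exec($mh, $running);
        curl_multi_select($mh);
    } while ($running > 0);
    $total = 0.0;
    foreach ($handles as $ch) {
        $total += curl_getinfo($ch, CURLINFO_TOTAL_TIME);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);
    return $total / count($handles);
}

foreach (array(1, 2, 5, 10) as $x) {   // 1X, 2X, 5X, 10X of a baseline
    printf("%3dX load: avg %.0f ms\n", $x, sample_load('http://example.com/', $x) * 1000);
}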

Disaster is inevitable. It will happen, whether small or large. Hardware and software inherently fails. How it fails and what is done to mitigate this to ensure the best possible consistent and rewarding consumer experience is only possible by consistently practicing to break your software at all stages in the development and deployment lifecycle.

Improving performance – A full stack problem

Improving the performance of a web system involves knowledge of how the entire technology stack operates and interacts. There are many simple and common tips that can provide immediate improvements for a website. Some examples include:

  • Using a CDN for assets
  • Compressing content
  • Making fewer requests (web, cache, database)
  • Asynchronous management
  • Optimizing your SQL statements
  • Having more memory
  • Using SSDs for database servers
  • Updating your software versions
  • Adding more servers
  • Configuring your software correctly
  • … And the general checklist goes on

Understanding where to invest your energy first, knowing what the return on investment can be, and most importantly the measurement and verification of every change made, is the difference between blind trial and error and a solid plan and process. Here is a great example of the varied range of outcomes for the point about “Updating your software versions”.

On one project the MySQL database was reaching saturation, on both the maximum number of database connections and the maximum number of concurrent InnoDB transactions. The first is a configurable limit; the second was a hard limit of the very old version of the software. Changing the first configurable limit can have dire consequences (there is a tipping point), however that is a different discussion. A simple software upgrade of MySQL, which had many possible improvement benefits, combined with corrected configuration specific to this new version, made an immediate improvement. The result moved a production system from crashing consistently under load to at least barely surviving under load. This is an important first step in improving the customer experience.
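
As a minimal sketch of watching that first limit (the host and monitoring credentials are hypothetical), comparing max_connections against the observed high-water mark shows how close a server is to saturation:

<?php
// Compare the configured connection limit against observed usage.
$db = new mysqli('127.0.0.1', 'monitor', 'secret');
$limit = $db->query("SHOW GLOBAL VARIABLES LIKE 'max_connections'")->fetch_row();
$peak  = $db->query("SHOW GLOBAL STATUS LIKE 'Max_used_connections'")->fetch_row();
$now   = $db->query("SHOW GLOBAL STATUS LIKE 'Threads_connected'")->fetch_row();
printf("%s = %s\n", $limit[0], $limit[1]);
printf("%s = %s (high-water mark)\n", $peak[0], $peak[1]);
printf("%s = %s (right now)\n", $now[0], $now[1]);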

In the PHP application stack for the same project, upgrading several commonly used frameworks, including Slim and Twig, seemed like a good idea to the engineering department. However, applicable load testing and profiling (performed after it was deployed, yet another discussion point) found the impact was a 30-40% increase in response time for the application layer. This made the system worse and cancelled out prior work to improve the system.

Tuning a system to support a 100x load increase with no impact on performance takes knowledge, experience, planning, testing and verification.

The following summarized graphs, using New Relic monitoring as a means of representative comparison, show three snapshots of the average response time during various stages of full stack tuning and optimization. This is a very simplified graphical view that is supported by more detailed instrumentation using different products, specifically with much finer granularity across hundreds of metrics.

These graphs represent the work undertaken to take a system showing an average 2,000ms response time under peak load to a 50ms average response time for the same workload. That is a 40x improvement!

If your organization can benefit from these types of improvements feel free to Contact Me.

There are numerous steps to achieving this. A few highlights to show the scope of work you need to consider includes:

  • Knowing server CPU saturation versus single core CPU saturation.
  • Network latency detection and mitigation.
  • What are the virtualization mode options of virtual cloud instances?
  • Knowing the network stack benefits of different host operating systems.
  • Simulating production load is much harder than it sounds.
  • Profiling, Profiling, Profiling.
  • Instrumentation can be misleading. Knowing how different monitoring works with sampling and averaging.
  • Tuning the stack is an iterative process.
  • The simple greatest knowledge is to know your code, your libraries, your dependencies and how to optimize each specific area of your technology stack.
  • Not everything works, some expected wins provided no overall or observed benefits.
  • There is always more that can be done. Knowing when to pause and prioritize process optimizations over system optimizations.

These graphs show the improvement work in the application tier (1500ms to 35ms to 25ms) and the database tier (500ms to 125ms to 10ms) at various stages. These graphs do not show, for example, improvements made in DNS resolution, different CDNs, managing static content, different types and ways of compression, removing unwanted software components and configuration, standardized and consistent stack deployments using Chef, and even a reduction in overall servers. All of these successes contributed to a better and more consistent user experience.

[Image: 40x performance improvements in LAMP stack]

Poor programming practices

When will it stop? These amateur programmers who simply cut/paste code really affect the good programmers in the ecosystem trying to make a decent living. I was reviewing a developed (but incomplete) PHP/MySQL system using a common framework (which in itself is irrelevant for this post).

In one source file there were 12 repetitions of the following code:

//permissions
$this->security_model->setUserPermissions($id);
if (!array_key_exists($id, $this->session->userdata['permissions']) OR
    !array_key_exists('id', $this->session->userdata['permissions'][$id]) OR
    !array_key_exists('scope', $this->session->userdata['permissions'][$id]['name'])) {
  $this->session->set_flashdata('alert', 'You are not authorized to go there.');
  redirect($this->agent->referrer());
}

It’s bad enough when code is repeated and not put in a simple refactored function. When it’s repeated 12 times in one file, and OMG over 100 times in the product, that is a recipe for bugs and high-maintenance code due to extremely poor coding practice.
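
For illustration, here is one possible refactor (the helper name is mine, and it preserves the original logic verbatim, odd ['name'] lookup included): the repeated guard collapses into a single method, and each former copy/paste site becomes one line.

//permissions, defined once
private function require_permission($id)
{
    $this->security_model->setUserPermissions($id);
    $perms = $this->session->userdata['permissions'];
    if (!array_key_exists($id, $perms) OR
        !array_key_exists('id', $perms[$id]) OR
        !array_key_exists('scope', $perms[$id]['name'])) {
        $this->session->set_flashdata('alert', 'You are not authorized to go there.');
        redirect($this->agent->referrer());
    }
}

// at each of the 12 (or 100+) call sites:
$this->require_permission($id);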

Why is my database slow?

Not part of my Don’t Assume series, but when a client states “Why is my database slow?”, you need to determine if indeed the database is slow.

Some simple tools come to the rescue here; one is Firebug. If a web page takes 5 seconds to load, but the .htm file takes 400ms, and 100+ assets are being downloaded from one base URL, then is the database actually slow? Tuning the database will only improve the 400ms portion of the 5,000ms download; that is, at best 8% of the total page time.

There are some very simple tips here. MySQL is my domain expertise and I will not profess to improving the entire stack, however perception is everything to a user and you can often do a lot. Some simple points include:

  • Know about blocking assets in your <head> element, e.g. .js files.
  • Streamline .js, .css and images to what’s needed; e.g. don’t download a 100k image only to resize it to a thumbnail via style elements.
  • Sprites. As with many simple but efficient SQL statements, network overhead is your greatest expense.
  • Splitting images to a different domain.
  • Splitting images to multiple domains (e.g. only 3 via CNAME needed). Hint: learn about the protocol.
  • Cookieless domains for static assets
  • Lighter web container for static assets (e.g. nginx, lighttpd)
  • Know about caching, expires and etags
  • Stripping out http://www.domain.com from all your internal links (that one alone saved 12% of HTML page size for a client; see the sketch after this list). You may ask if that is really a big deal; well, in a high-volume site, the sooner you can release the socket on your webserver, the sooner you can start serving a different request.
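
As a minimal sketch of that last point (the domain is hypothetical, and how you buffer output depends on your application):

<?php
// Rewrite absolute internal links to root-relative links before output,
// trimming bytes from every page served.
$html = str_replace('http://www.example.com/', '/', $html);
echo $html;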

Like tuning a database, some things work better than others, some require more testing than others, and consultants never tell you all the tricks.

References

As with everything in tuning, do your research and also determine what works in your environment and what doesn’t. Two excellent resources to start with are Steve Souders and Best Practices for Speeding Up Your Web Site by Yahoo.

Upcoming book – Expert PHP and MySQL

This month will see the release of the book Expert PHP and MySQL, of which I was a co-author. Initially this will be available for purchase in PDF format from the Wrox website, and I am hopeful it will be available in print format for the MySQL Users Conference.

More than just your standard PHP and MySQL, there is detailed content on technologies including Memcached, Sphinx, Gearman, MySQL UDFs and PHP extensions. We will be posting more information at www.ExpertPhpandMySQL.com. You can download a PDF version of Chapter 1, Techniques Every Expert Programmer Needs to Know.

The book includes the following content:

  1. Techniques Every Expert Programmer Needs to Know
  2. Advanced PHP Constructs
  3. MySQL Drivers and Storage Engines
  4. Improved Performance through Caching
  5. Memcached and MySQL
  6. Advanced MySQL
  7. Extending MySQL with User-defined Functions
  8. Writing PHP Extensions
  9. Full Text Search using SPHINX
  10. Multi-tasking in PHP and MySQL
  11. Rewrite Rules
  12. User Authentication with PHP and MySQL
  13. Understanding the INFORMATION_SCHEMA
  14. Security
  15. Service and Command Lines
  16. Optimization and Debugging

Monitoring MySQL options

My recent poll What alert monitoring do you use? showed that 25% of the 58 respondents bravely stated they had no MySQL monitoring. I see about 1 in 3 (~33%) in my consulting, so this is consistent.


There is no excuse not to have some MySQL monitoring on your production system. At the worst case, you should be logging important MySQL information for later analysis. I use my own logging and analyzing scripts on every client for an immediate assessment regardless of what’s available. I combine that with my modified statpack to give me immediate text-based analysis, broken down by hour chunks for quick reference. These help me in troubleshooting, but they are not a complete solution.
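
As a minimal sketch of that worst-case logging (credentials and log path are hypothetical; my actual scripts differ), appending periodic status snapshots to a file gives you something to analyze after an incident:

<?php
// Append a timestamped snapshot of MySQL status counters every minute.
$db = new mysqli('127.0.0.1', 'monitor', 'secret');
while (true) {
    $result = $db->query('SHOW GLOBAL STATUS');
    $snapshot = array();
    while ($row = $result->fetch_row()) {
        $snapshot[$row[0]] = $row[1];
    }
    file_put_contents('/var/log/mysql-status.log',
        date('c') . ' ' . json_encode($snapshot) . "\n", FILE_APPEND);
    sleep(60);
}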

The most popular options I see are also reflected in the poll results.

There is a good list, including some products I did not know. My goal is to get this information included in the Monitoring-MySQL information site.

I have some additional information on Cacti and MONyog, and I’ll be sharing this information in upcoming posts.

HiTCHO Top tech tips

A recent visit with old Brisbane friend HiTCHO, whom I met at the Brisbane MySQL Users Group in 2005, has led to this cool list of some hardware and software technologies he uses, which I am now considering or have already implemented or purchased.

Software

  1. xmarks.com – Bookmark-Powered Web Discovery
  2. Pulse – Smart Pen
  3. Quicksilver – Mac OS X application launcher
  4. MailPlane – Brings Gmail to your Mac desktop
  5. Evernote – Remember Everything, with Firefox plugin and iPhone App
  6. TextMate – The missing editor for Mac OS X
  7. ScreenFlow – Professional screencasting studio
  8. Snoop – A GNU/Linux file descriptor monitoring tool inspired by FreeBSD’s ‘watch’.

Hardware

  1. Drobo – Storage that manages itself
  2. Canon PowerShot SX1. True HD in a Canon compact digital camera.
  3. LiveScribe – Never miss a word

Twitter Tips

I have in the past questioned the value of Twitter as an effective business tool, but it continues to defy the trend of social media failing to bridge the business gap.

Even with continual growth problems (at least it’s not down as much), Twitter is everywhere I go, see or do. You see it at business events, on business cards, at meetups, even on CNN Headline News. There are so many different Twitter sites, applications, widgets etc, I’m surprised there isn’t a Twitter index just of the Twitter-related sites.

I have now incorporated Twitter into my professional site, and I’m using this micro-blogging approach more to share my professional skills and interests with my growing band of followers. I don’t expect to make the Twitter top list, which is headed by CNN Breaking News with 667,353 followers.

Even Lance Armstrong (who rates 9th) used Twitter this week for press releases about his injuries.

For more reading check out How Twitter Makes You A Better Writer and 27 Twitter Applications Your Small Business Can Use Today.

I was surprised to see How to get a job by blogging: Tips for setting up the kind of professional blog that will get you hired barely mention Twitter.

Now be sure to add an appropriate background to your Twitter profile. This one is wicked.

Your Code, Your Community, Your Cloud… Project Kenai

Following the opening keynote announcement about Kenai, I ventured into a talk on Project Kenai.

With today’s economy, the drive towards efficiency is certainly a key consideration; it was quoted that dedicated hosting servers run at only 30% efficiency.

An overview again of Cloud Computing

  • Economics – pay as you go
  • Developer Centric – rapid self-provisioning, API-driven, faster deployment
  • Flexibility – standard services, elastic, on demand, multi-tenant

Types of Clouds

  • Public – pay as you go, multi-tenant application and services
  • Private – Cloud computing model run within a company’s own data center
  • Mixed – Mixed use of public and private clouds according to applications

SmugMug was referenced as a Mixed Cloud example.

Cloud Layers

  • Infrastructure as a Service – basic storage and compute capabilities offered as a service (e.g. AWS)
  • Platform as a Service – developer platform with built-in services, e.g. Google App Engine
  • Software as a Service – applications offered on demand over the network, e.g. salesforce.com

Some issues raised about these layers included:

  • IaaS issues include service level, privacy, security and cost of exit
  • PaaS – an interesting point, one that is the bane of MySQL performance tuning: instrumentation
  • SaaS – nothing you need to download; you take the pieces you need and interact with the cloud. More services made simple, like doing your tax online.

Sun offers Project Kenai as well as Zembly.

Project Kenai

  • A platform and ecosystem for developers.
  • Freely host open source projects and code.
  • Connect, Community, Collaborate and Code with peers
  • Eventually easily deploy application/services to “clouds”

Kenai Features

  • Code repository with SVN, Mercurial, or an external repository
  • Issue tracking with Bugzilla or JIRA
  • Collaboration tools such as wikis, forums and mailing lists
  • Document hosting
  • Your profile
  • Administrative roles

Within Kenai you can host up to 5 open source projects and view various metrics of the repositories, issue trackers, wikis, etc.

The benefits were given as: the features are integrated into your project, not distributed across different sites. Agile development within the project sees a release every 2 weeks. Integration with NetBeans and Eclipse is underway.

Kenai is targeted as being the core of the next generation of Sun’s collaboration tools. However, when I asked for more details about uptake within Sun, it’s only a request, not a requirement, for internal teams.

The APIs for the Sun Cloud are at http://kenai.com/projects/suncloudapis.

Event: CommunityOne East in New York, NY.
Presenter: Tori Wieldt, Sun Microsystems
Article Author: Ronald Bradford

Everybody is talking About Clouds

From the opening keynote at CommunityOne East we begin with Everybody is talking About Clouds.

It’s difficult to get a good definition; the opening cloud definition today was Software/Platform/Storage/Database/Infrastructure as a Service, Grid Computing, Virtualization, Utility Computing, Application Hosting. Basically all the buzz words we currently know.

Cloud computing has the ideal of truly bringing freedom of choice. Inside or outside of an enterprise, the lower the barrier, time and cost to freedom of choice, the more opportunities there are, including:

  • Self-service provisioning
  • Scale up, Scale down.
  • Pay for only what you use.

Sun’s Vision has existed since 1984 with “The NETWORK is the Computer”.

Today, Sun’s view includes many clouds, public and private, tuned for different application needs (geographical, political), with a goal of being open and compatible.

How do we think into the future when developing and deploying into the cloud? The answer given today was the Sun Open Cloud Platform, which includes the set of core technologies, APIs and protocols that Sun hopes to see taken up by many different providers.

The Sun Cloud Platform

  • Products and Technologies – VirtualBox, Sun xVM, Q-layer, MySQL
  • Expertise and Services
  • Partners – Zmanda, Rightscale, Kickapps
  • Open Communities – GlassFish, Java, OpenOffice, ZFS, NetBeans, Eucalyptus

The Sun Cloud includes:

  • Compute Service
  • Storage Service
  • Virtual Data Center
  • Open API – Public, RESTful, Java, Python, Ruby

The public API has been released today and is available under Kenai. It includes two key points:

  • Everything is a resource: HTTP GET, POST, PUT, etc.
  • A single starting point; other URIs are discoverable.

What was initially shown were CLI examples (great to see this is still common); a demonstration using drag and drop via a web interface was also given, showing a load-balanced, multi-tiered, multi-server environment. This was started and tested during the presentation.

Then, using Cyberduck (a WebDAV client on Mac OS X), the storage component could be accessed at storage.network.com directly, and from Open Office you now get options to Get/Save to Cloud (using the TwoGuys.com Virtual Data Center example document).

Seamless integration between the tools, and the service. That was impressive.

More information at sun.com/cloud. You can also get more details in the Sun Microsystems Unveils Open Cloud Platform official press release.

Event: CommunityOne East in New York, NY.
Article Author: Ronald Bradford

CommunityOne East – An open developer conference

With an opening video from thru-you.com (an individual taking random YouTube videos and producing video mashups), the CommunityOne East conference in New York, NY begins.

The opening introduction was by Chief Sustainability Officer Dave Douglas. Interesting job title.

His initial discussion was around the relationship between technology and society, with a plug for his upcoming book “Citizen Engineer” – the responsibilities of a 21st century engineer. He quotes Jonathan Schwartz with “Crisis loves an Innovation”, and extends it with “Crisis loves a Community”.

He asks us to consider the wider community ecosystem, such as schools, towns, governments, NGOs etc, with our usage and knowledge of technology.

Event: CommunityOne East in New York, NY.
Author: Ronald Bradford

Hurting the little guy?

Today I came back from the dentist; if that wasn’t bad enough news, I got an email from Google AdWords titled Your Google AdWords Approval Status.

In the email, all my AdWords campaigns are now disapproved, because of:

SUGGESTIONS:
-> Ad Content: Please remove the following trademark from your ad:
mysql.

Yeah right. I can’t put the word ‘MySQL’ in my ads. How are people to find me now? It would appear that many ads have been pulled, not just mine. Is this a proactive measure by Google? Is this a complaint from the MySQL trademark holder, Sun Microsystems?

I’d like any comment, feedback or suggestions on how one can proceed here.

It reminds me of the days when CentOS advertised itself as an “Open source provider of a popular North American Operating System”, or something of that nature.

Some Drupal observations

I had the opportunity to review a client’s production Drupal installation recently. This is a new site and traffic is just starting to pick up. Drupal is a popular LAMP stack open source CMS system using the MySQL Database.

Unfortunately I don’t always have the chance to focus on one product when consulting; sometimes the time can be minutes to a few hours. Some observations from looking at Drupal:

Disk footprint

Presently, content volume is low, but expected to ramp up. I do however find 90% of disk volume in one table called ‘watchdog’.


+--------------+--------------+--------------+-------------+--------+
| table_schema | total_mb     | data_mb      | index_mb    | tables |
+--------------+--------------+--------------+-------------+--------+
| xxxxx        | 812.95555878 | 745.34520721 | 67.61035156 |    191 |
+--------------+--------------+--------------+-------------+--------+

+-------------------------------------------+--------+------------+------------+----------------+--------------+--------------+-------------+
| table_name                                | engine | row_format | table_rows | avg_row_length | total_mb     | data_mb      | index_mb    |
+-------------------------------------------+--------+------------+------------+----------------+--------------+--------------+-------------+
| watchdog                                  | MyISAM | Dynamic    |      63058 |            210 | 636.42242813 | 607.72516251 | 28.69726563 |
| cache_menu                                | MyISAM | Dynamic    |        145 |         124892 |  25.33553696 |  25.32577133 |  0.00976563 |
| search_index                              | MyISAM | Dynamic    |     472087 |             36 |  23.40134048 |  16.30759048 |  7.09375000 |
| comments                                  | MyISAM | Dynamic    |      98272 |            208 |  21.83272934 |  19.58272934 |  2.25000000 |
+-------------------------------------------+--------+------------+------------+----------------+--------------+--------------+-------------+

Investigating the content of the ‘watchdog’ table shows detailed logging. Drilling down just on the key ‘type’ records shows the following.

mysql> select message,count(*) from watchdog where type='page not found' group by message order by 2 desc limit 10;
+--------------------------------------+----------+
| message                              | count(*) |
+--------------------------------------+----------+
| content/images/loadingAnimation.gif  |    17198 |
| see/images/loadingAnimation.gif      |     6659 |
| images/loadingAnimation.gif          |     6068 |
| node/images/loadingAnimation.gif     |     2774 |
| favicon.ico                          |     1772 |
| sites/all/modules/coppa/coppa.js     |      564 |
| users/images/loadingAnimation.gif    |      365 |
| syndicate/google-analytics.com/ga.js |      295 |
| content/img_pos_funny_lowsrc.gif     |      230 |
| content/google-analytics.com/ga.js   |      208 |
+--------------------------------------+----------+
10 rows in set (2.42 sec)

Some 25% of the rows are just reporting one missing file. Correcting this one file cuts down a pile of unnecessary logging.

Repeating Queries

Looking at just 1 random second of SQL logging shows 1,200+ SELECT statements; 355 of them are SELECT changed FROM node queries.

$ grep would_you_rather drupal.1second.log
              7 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              5 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              3 Query       SELECT field_image_textarea_value AS value FROM content_type_would_you_rather WHERE vid = 24303 LIMIT 0, 1
              4 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              6 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
             10 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              9 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              8 Query       SELECT changed FROM node WHERE type='would_you_rather' AND STATUS=1 ORDER BY created DESC LIMIT 1
              9 Query       SELECT field_image_textarea_value AS value FROM content_type_would_you_rather WHERE vid = 24303 LIMIT 0, 1

There is plenty of information regarding monitoring slow queries in MySQL, but I have also promoted that it’s not the slow queries that ultimately slow a system down, but the thousands of repeating fast queries.

MySQL of course has the Query Cache to assist, but this is a coarse-grained solution, and in a high-volume read/write environment it is meaningless.

There is a clear need for either application-level caching, or a database redesign to pull rather than poll this information; however, without a more in-depth review of Drupal I cannot make any judgment calls.
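
As a minimal sketch of the application-level caching option (a hypothetical Drupal 5/6-style helper, not a drop-in patch), a static per-request cache means an identical query executes once instead of hundreds of times per second:

<?php
// Hypothetical per-request cache: the identical query runs once,
// and every later call in the same request is a memory lookup.
function latest_changed($type) {
  static $cache = array();
  if (!isset($cache[$type])) {
    $cache[$type] = db_result(db_query(
      "SELECT changed FROM {node} WHERE type = '%s' AND status = 1
       ORDER BY created DESC LIMIT 1", $type));
  }
  return $cache[$type];
}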

Where is the innovation?

The 2009 MySQL Conference has closed its submissions for papers. This year the motto is “Innovation Everywhere”.

At last weekend’s Open SQL Camp in Charlottesville, Virginia, we had the chance to talk about the movements in the MySQL ecosystem. I was impressed to get the details of the Percona MySQL patches, but the focus is still on 5.0. (Welcome to the Percona team, Tom Basil.) OurDelta is now attempting to integrate patches into various MySQL branches. There was an opening keynote by Brian Aker from Drizzle, with Drizzle team members Jay Pipes and Stewart Smith on hand. It was also announced that MySQL 5.1.30 will be GA, available in early December.

But these are not innovations that are groundbreaking. Last year, it was the announcement of KickFire that I found most intriguing regarding innovation.

What is there this year? The most interesting thing I read last week was Memcached as an L2 Cache for InnoDB – the Waffle Grid Project. This is my kind of innovation. It’s sufficiently MySQL, but just adds another dimension with a companion technology. The patch seems relatively simple in concept and code size, and I’m almost prepared to fire up a few EC2 instances to take this one for a spin. I’m doubly impressed because the creators are two friends and colleagues who are not hard-core kernel hackers, but professionals on the front line dealing with clients daily. Will it be successful, or viable? That is the question about innovation.

Unfortunately these days I spend more time seeing innovation not in MySQL, but in alternative database solutions in general: projects like Clustrix, Inc., LucidDB, and Mongo in the 10gen stack.

Improving your web site compatibility with browsers

Every web page’s content uses two basic elements: HyperText Markup Language (HTML) and Cascading Style Sheets (CSS). Each of these has various standards. HTML has versions such as 3.2, 4.0, 4.01, and the newer XHTML 1.0, 1.1, 2.0, along with various version flavors known as strict, transitional and frameset. CSS also has various versions, including 1, 2 and 3.

Each browser renders your combined HTML & CSS differently. The look and feel can vary between Firefox, Safari, Chrome, Internet Explorer and the less common browsers. Indeed, each version of a product also renders differently. With IE 8 just released, joining the still-common 5.0, 5.5, 6.0 and 7.0, this product alone now has 5 versions that UI designers must test and verify.

To minimize presentation and rendering problems, adhering to the standards can only assist, and it greatly benefits the majority of entrepreneurs, designers and developers who are not dedicated resources. There are two excellent online tools from the standards body, the World Wide Web Consortium (W3C), that can easily assist you.

You can also link directly to these sites, so it’s easier to validate your HTML and CSS directly from your relevant webpage. For example, HTML validation via http://validator.w3.org and CSS via http://jigsaw.w3.org/css-validator.
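
You can even script the check. As a small sketch, assuming the Markup Validator still reports its verdict in the X-W3C-Validator response headers (which it has historically done):

$ # Validate a page and show only the validator's verdict headers
$ curl -s -o /dev/null -D - "http://validator.w3.org/check?uri=http://example.com/" | grep -i "X-W3C-Validator"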

It’s not always possible to meet the standards, and when you are not the full-time developer of your site, it can be time-consuming if you don’t check early and regularly.

Brand identity with undesirable domain names

Choosing a domain name for your brand identity is only the start. Protecting your domain name by also registering, for example, the .net, .org and the many other extensions is one step in brand identity.

However, a recent very unpleasant experience in New York made me realize that some companies also register undesirable domain names. I was one of many unhappy people, mainly tourists, as I was showing an Australian friend the sights of New York. We had chosen to use the City Sights NY bus line, but we were caught with some 100+ people in a clear “screw the paying customers” experience.

I was really annoyed that my friend, only in New York for 2 days, had to experience this and missed out on a night tour. I commented that I was going to register citysightsnysucks.com and share the full story of our experience, directing people to use Gray Line New York, which by observation was clearly providing the service we did not get.

To my surprise, the domain name was already taken. To my utter surprise, the owner of the domain is the same as citysightsny.com. Did they do this by choice, or did another unhappy person (at least in 2006) register this, only to be perhaps legally threatened into giving up the domain?

For brand identity I would generally recommend this approach, even though registering selected common misspellings and hyphenated versions, where applicable, can easily lead to a lot of domain names for your brand.

$ whois citysightsny.com

Whois Server Version 2.0

Domain names in the .com and .net domains can now be registered
with many different competing registrars. Go to http://www.internic.net
for detailed information.

   Domain Name: CITYSIGHTSNY.COM
   Registrar: INTERCOSMOS MEDIA GROUP, INC. D/B/A DIRECTNIC.COM
   Whois Server: whois.directnic.com
   Referral URL: http://www.directnic.com
   Name Server: NS0.DIRECTNIC.COM
   Name Server: NS1.DIRECTNIC.COM
   Status: clientDeleteProhibited
   Status: clientTransferProhibited
   Status: clientUpdateProhibited
   Updated Date: 31-dec-2006
   Creation Date: 28-nov-2004
   Expiration Date: 28-nov-2011

Registrant:
 CitySights New York LLC
 15 Second Ave
 Brooklyn, NY 11215
 US
 718-875-8200x103
Fax:718-875-7056


Domain Name: CITYSIGHTSNY.COM


$ whois citysightsnysucks.com

Whois Server Version 2.0

Domain names in the .com and .net domains can now be registered
with many different competing registrars. Go to http://www.internic.net
for detailed information.

   Domain Name: CITYSIGHTSNYSUCKS.COM
   Registrar: INTERCOSMOS MEDIA GROUP, INC. D/B/A DIRECTNIC.COM
   Whois Server: whois.directnic.com
   Referral URL: http://www.directnic.com
   Name Server: NS.PUSHONLINE.NET
   Name Server: NS2.PUSHONLINE.NET
   Status: clientDeleteProhibited
   Status: clientTransferProhibited
   Status: clientUpdateProhibited
   Updated Date: 26-jun-2007
   Creation Date: 11-aug-2006
   Expiration Date: 11-aug-

Registrant:
 CitySights New York LLC
 15 Second Ave
 Brooklyn, NY 11215
 US
 718-875-8200x103
Fax:718-875-7056


Domain Name: CITYSIGHTSNYSUCKS.COM

To www or not www

Domain names have historically been written as www.example.com, or with the protocol prefix as http://www.example.com, but in reality the www. is optional; only example.com is actually needed.

www. is technically a sub-domain, and sub-domains can incur a small penalty in search engine optimization.

There is no right or wrong. What is important is that you choose one, and the other needs to be a 301 Permanent Redirect to the one you have chosen.

You also need to know that creating a server alias in your web server configuration (for example Apache or Tomcat) is not a permanent redirect; in fact it is technically duplicate content, and two identical web sites also incur a penalty in search engine ranking.
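
As a minimal sketch of doing this in Apache (assuming Apache 2.x, example.com as the chosen canonical name, and an illustrative DocumentRoot), a dedicated VirtualHost with a mod_alias Redirect issues the 301, where a mere ServerAlias would not:

# Redirect the www name permanently (301) to the bare domain
<VirtualHost *:80>
    ServerName www.example.com
    Redirect permanent / http://example.com/
</VirtualHost>

# Serve the site only on the canonical name
<VirtualHost *:80>
    ServerName example.com
    DocumentRoot /var/www/example
</VirtualHost>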

So what do the big players do? Here are a few.

Use www

  • www.google.com
  • www.facebook.com
  • www.cnn.com
  • www.yahoo.com
  • www.myspace.com
  • www.ebay.com
  • www.plurk.com
  • www.amazon.com
  • www.fotolog.com
  • www.linkedin.com

Do not use www

  • digg.com
  • wordpress.com
  • identi.ca

Show duplicate content

  • flickr.com
  • chi.mp
  • corkd.com
  • vimeo.com
  • garysguide.org
  • engineyard.com

Curiously, youtube.com uses a 303 redirect, while microsoft.com, stumbleupon.com and craigslist.org use a 302 redirect.

How do you check? Use a CLI tool such as wget.

$ wget www.google.com
--2008-09-22 19:56:48--  http://www.google.com/
Resolving www.google.com... 72.14.205.99, 72.14.205.103, 72.14.205.104, ...
Connecting to www.google.com|72.14.205.99|:80... connected.
HTTP request sent, awaiting response... 200 OK

$ wget google.com
--2008-09-22 19:57:56--  http://google.com/
Resolving google.com... 64.233.167.99, 64.233.187.99, 72.14.207.99
Connecting to google.com|64.233.167.99|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.google.com/ [following]

$ wget www.facebook.com
--2008-09-22 20:07:59--  http://www.facebook.com/
Resolving www.facebook.com... 69.63.178.12
Connecting to www.facebook.com|69.63.178.12|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]

$ wget facebook.com
--2008-09-22 19:59:43--  http://facebook.com/
Resolving facebook.com... 69.63.176.140, 69.63.178.11
Connecting to facebook.com|69.63.176.140|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.facebook.com/ [following]


$ wget digg.com
--2008-09-22 20:10:47--  http://digg.com/
Resolving digg.com... 64.191.203.30
Connecting to digg.com|64.191.203.30|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 15322 (15K) [text/html]


$ wget www.digg.com
--2008-09-22 20:14:06--  http://www.digg.com/
Resolving www.digg.com... 64.191.203.30
Connecting to www.digg.com|64.191.203.30|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://digg.com/ [following]


$ wget twitter.com
--2008-09-22 20:26:18--  http://twitter.com/
Resolving twitter.com... 128.121.146.100
Connecting to twitter.com|128.121.146.100|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2655 (2.6K) [text/html]

$ wget www.twitter.com
--2008-09-22 20:26:41--  http://www.twitter.com/
Resolving www.twitter.com... 128.121.146.100
Connecting to www.twitter.com|128.121.146.100|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://twitter.com/ [following]
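
If you only want the verdict and not the full transfer details, curl can reduce each check to the interesting headers (-s silences progress output; -I sends a HEAD request, which most servers answer with the same status and Location as a GET):

$ curl -sI http://google.com/ | grep -E "^HTTP/|^Location:"
$ curl -sI http://www.digg.com/ | grep -E "^HTTP/|^Location:"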

Professionally, I prefer the shorter and simpler form without www.

Patience and Passion at Web 2.0 NY

Gary Vaynerchuk spoke next at Web 2.0 NY on Building Personal Brand Within the Social Media Landscape.

He was hilarious. His video presentation is available online to share with others. He is inspirational for new young entrepreneurs, and I’d love to see him talk at Ultra Light Startups.

His talk was simply “Patience and Passion”.

Words of wisdom included:

  • There is no reason to do stuff you hate.
  • You can lose just as much money doing stuff you love.
  • Decide what you want to do for the rest of your life, and do it.
  • Hustle is the most important word. We are building businesses here.
  • You can monetize anything; you need to work hard, and be patient and passionate about your business.
  • You have to have a business model that makes some cash along the way – Freemium.
  • Your goal should be to leave a legacy.
  • You need to build brand equity in yourself. There is never a bad time when you believe, when you work hard, when you know what you are doing.
  • People are the ones who are going to help you; get out there and network, be transparent, be exposed.
  • Previously you had to work hard to build a brand; now you can use any network to become known and more successful.
  • 9-5 is for your job, a few hours with your family, then 7pm-2am is plenty of time to focus on your dreams.

In closing, you have to do what you love.

Integrity, clarity and responsibility – Web 2.0 NY Keynote

Next on the Web 2.0 keynote speaker list was Maria Thomas of www.etsy.com with her talk The DIY Guide to Growing a Company.

If you’ve never heard of Etsy before – “Your place to buy & sell all things handmade” – it is an interesting site. Most companies start small, and stay small. Only 0.1 of 1% grow to any size (e.g. > $250 million). Etsy this year has $100M in revenue; Amazon is about $2 billion.

The opening lines included the message “Don’t lose the essence of who you are, and what you want to achieve.” and the term Filotimo.

Filotimo (Greek)

  • Operating with integrity
  • a clarity of purpose
  • a sense of social responsibility

It was an interesting point about qualifications: “I got my Internet degree at Amazon.com, then built a digital media business at NPR – National Public Radio.”

Some more quotes from this discussion.

  • Listen to the end user, but be clear about where you want to go.
  • Set your goals, communicate them, measure them.
  • Practice Filotimo – keep it human.
  • Go behind the resume, talk to people, be direct, be honest. Believe in employees.
  • Understand that small decisions can impact bigger decisions.
  • Just because you’re a DIY company doesn’t mean you have to build everything.
  • Get products out the door, get them out fast.
  • It takes an effort of will, and a very good process, to do this well.
  • The perfect is the enemy of good enough.
  • Launch products off a platform and build more quickly.
  • The marketplace is social, it’s personal, it’s playful.
  • Visits to Etsy become habit-forming; it makes connections to real people with unique products.

What if software was a physical object – NY Web 2.0 Third Keynote

Some points of reference from the next Web 2.0 keynote by Jason Fried of 37signals:

  • The software business is a great place to be.
  • You can build anything you want. All you really have to do is type; it’s not easy to do, but it’s just not that hard to do.
  • Change is easy, and cost is cheap in relation to physical objects.
  • You can build it anywhere.
  • Software doesn’t have the same kind of feedback as physical objects.
  • Visually we can determine good versus poor design (e.g. a bottle of water or a remote control).
  • Software has no edges, size or weight. It just expands, continues to expand, and this is bad.
  • What would your software be like if it was physical?
  • When you say yes to too many features you end up with Homer’s car.
  • The goal should be simple, clean, elegant and streamlined.
  • Once you hit bloat, it’s too hard to go back.
  • Listen to customers, but don’t do everything they say. Think of yourself as a curator.
  • Make your software a collection, not a warehouse.
  • You don’t need to have everything in the world.
  • You need a few solid features.
  • Real work is hard, imaginary work is easy.
  • Tell the people who want new features to build them. Attach real costs to any requests.

The DNA of your company has to be able to say no.

Technology changes, humans don't. – Web 2.0 NY Second keynote

I needed a rest after my opening keynote review, NY Tech 1995-2008. Opening Web 2.0 Expo NY Keynote, but here are a few significant points from The Death of the Grand Gesture by Deb Schultz.

  • An interesting site is Visual Complexity showing graphical representations of many social networks.
  • All the binary communication becomes white noise — Information Overload.
  • “Technology changes, humans don’t” – Deb Schultz

NY Tech 1995-2008. Opening Web 2.0 Expo NY Keynote

Web 2.0 Expo NY keynotes are happening today. Technology in use included CrowdVine, which I’d not heard of, and plenty of Twitter feeds such as w2e_NY08.

The opening keynote was Fred Wilson from Union Square Ventures with his presentation New York’s Web Industry From 1995 to 2008: From Nascent to Ascendent.

Some stats on seed and early-stage deals:

  • 1995: 230 in the SF Bay Area, 30 in NY
  • 2008: 360 in the SF Bay Area, 116 in NY

Fred first remarked: “New York is not an alley. Call it Broadway, or just New York.”

Here is a summary of his history of the New York web industry.

  • 1991 – ZDNet
  • 1993 – New York Online Dialup services
  • 1993 – Jupiter Communications online conference
  • 1993 – Prodigy
  • 1994 – Startups such as Pseudo, Total New York, Razorfish.
  • 1994 – Time Warner Pathfinder
  • 1995 – NYIC, 55 Broad St. – a technology-oriented building
  • 1995 – Seth Godin – Yoyodyne – Permission Marketing
  • 1995 – itraffic, agency.com, NY Times online
  • 1995 – Softbank, DoubleClick, 24×7, Real Media
  • 1996 – Silicon Alley Reporter
  • 1996 – ivillage, the knot
  • 1996 – Flatiron Partners – got sued for that
  • 1997 – The Silicon Alley Report Radio Show
  • 1997 – The Mining Co.
  • 1997 – Total NY sold to AOL
  • 1997 – Agency rollups razorfish buying 4 companies
  • 1997 – DoubleClick IPO
  • 1998 – Seth Godin moves to Yahoo
  • 1998 – Burn Rate
  • 1998 – Kozmo – We’ll be right over
  • 1998 – The last year of sanity in the Internet wave
  • 1999 – The start of the boom
  • 1999 – The big players came online, and all hell breaks loose; 200 startups were funded in 1999, 300 in 2000.
  • 2000 – The Crash & Burn
  • 2000 – f**kedcompany
  • 2000 – Google came to New York – 86th St Starbucks
  • 2001 – Layoffs, Landlords and bankruptcies
  • 2002 – Rock bottom
  • 2003 – Renewal
  • 2003 – Blogging started gizmodo
  • 2003 – Web 2.0 coined
  • 2003 – del.icio.us was launched from a computer in an apartment
  • 2004 – NY Tech Meetup
  • 2004 – Union Square Ventures – $120 million raised
  • 2005 – about.com acquired by NY Times
  • 2005 – Etsy
  • 2006 – Google took over port authority building, now with 750 engineers in NY
  • 2008 – Web 2.0 comes to New York City

New York now has 1/3 the number of funded Internet companies of Silicon Valley, compared to 1/8 back in 1995.

One thing mentioned was a documentary called “We Live in Public”. Some of the footage from 1999 is so early Big Brother.

Web 2.0 in NY

I will be attending next week’s Web 2.0 Expo 2008 in New York.

Garys Guide has a schedule of the key events and off site associated event parties.

It will be a bit of a change from the typical MySQL Conferences and recent OSCON Conference I have attended this year.

The keynote titles give you an indication of the variety of talks expected.

  • Organizing Chaos: The Growth of Collaborative Filters
  • (Re)making the Internet: Accounting for the Future of Information, Communication and Entertainment Technologies
  • Next Generation of Video Games
  • 10 Things We’ve Learned at 37signals
  • High Order Bit
  • What ManyEyes Knows
  • Arianna Huffington in Conversation with Tim O’Reilly
  • Because We Make You Happy
  • The Real Future of Technology
  • Enterprise Radar
  • The Death of the Grand Gesture
  • It’s Not Information Overload. It’s Filter Failure.
  • Building Personal Brand Within the Social Media Landscape

Domain name trends

It started with del.icio.us (which now ironically redirects to http://delicious.com), and now it’s becoming more of a trend to create a domain name with the extension included for effect.

With unique .com domains harder to come by, and dropping vowels like flickr.com so last generation, some countries must be trying to cash in on the success, such as Tuvalu, which earns something like 10% of its GDP from domain name sales of .tv.

Some recent names I’ve noticed are http://identi.ca, http://chi.mp and http://cyclo.ps.

I have even considered this new trend for some recent projects, but with the combination of either the two-letter extension not existing (for example .ld) or it not being possible to get domains from a registrar (for example .er), it will take some time.

What is Google's direction?

Tonight’s discussion was about Android and Google’s ultimate direction. Have they lost their way, or are they just planning to explode with so many new things that will revolutionize what and how we do things? With a $475,000 first prize for Android, they certainly have the money available to invest in new directions.

I arrive home, and find email discussion on The Google Browser – Chrome.

Inquisitive, I take a look, to find the great teaser: nothing but a comic, and come back tomorrow for the download link. Is it clever to leak information, have everybody write about it, and check back tomorrow?