Content Trumps SEO

#1 Google ranked www.small-investor.com

How likely do you think it is that I could create a web site that ranks #1 when you search for very common terms like SMALL INVESTOR IRA and SMALL INVESTOR 401k? This is the holy grail of what is called Search Engine Optimization, or SEO. Just so we don’t get our expectations too high, keep in mind I’m competing with the likes of Fidelity, T. Rowe Price, Scottrade, Morgan Stanley, USA Today, NY Times, CNNmoney, and thousands of other billion-dollar companies.

So let’s make it even tougher. Since we’re all indoctrinated into the magic of PageRank, we all believe that you need lots of LINKS IN to your Web site for it to gain ranking, right? After all, that’s the key technology that the Google founders claim makes Google unique. Unique even to the point of $165 billion in capitalization. (That’s about the size of IBM and just a little smaller than Microsoft and Apple.) So let’s add that we would like the #1 ranking WITHOUT ANY LINKS IN.

Nope. In fact, if you enter SMALL INVESTOR IRA (no quotes needed), you’ll see the top site is www.small-investor.com.
And it’s a site I created for the sole purpose of understanding SEO.

So, you imagine I’m making millions on click-throughs, right?
Actually, almost no one ever visits the site. And far fewer ever click on the (highly relevant) ads on it.
The reason…there is very little content on the site.
Surprise! People still visit web sites because they want some kind of information. And, quite frankly, I haven’t put much up there of value.

But I certainly have learned a few things about SEO!

Paul Firth


The OCR Curve

by Paul B. Firth 2010

Though Optical Character Recognition (OCR) has been around for a couple of decades now, for the first time we are about to see a dramatic shift in its application. Watch for a very rapid (relatively) increase in OCR usage, followed by its demise. Companies in several markets would be wise to prepare for both.

The sudden change in OCR application will be caused by the following paradigm shifts:

1. It has become painfully obvious just how expensive it is to capture and distribute information on paper. The following major developments illustrate this change:

a. Usage of FAX for document transmission is decreasing rapidly. Most businesses now avoid this method of information transmission as much as possible. It has been largely replaced by E-mail.

b. Data is now captured on web sites more often than on paper. The number of places where data forms are needed is decreasing rapidly. People apply for retail jobs at a terminal in the store. Government forms such as registration renewals are normally completed online. The list is endless.

c. The next generation has no appreciation for any aspect of paper, from printing to storage and retrieval. They have no fear of a missing physical record.

d. Standards for digital information storage have become stable. For example, PDF/A documents are likely to be readable for a very long time. And data stored in legacy systems will always be exported before the system reaches end-of-life.

e. The loss of life due to a lack of information sharing will soon be considered intolerable. In the U.S., the Nationwide Health Information Network will avoid the weaknesses of paper records.

f. Data is far more reliable than paper. Your personal photos and documents are actually much safer online than in your home. Businesses have also come to this realization.

g. Personal information-viewing devices like the iPad, netbook, and eReader are finally gaining widespread acceptance.

2. There is a lot of existing paper. As its weaknesses become obvious, most of it will need to be digitized. This rush to make nearly all legacy papers more useful, less expensive, easier to share, and safer will require many trillions of documents to be OCR’d.

3. Since paper is such a restricted medium, the paper-based capture of data will drop dramatically. At some point, almost nothing will ever exist on paper. It has extreme cost with limited utility. The newspaper industry is among the first to notice this. If you are expected to apply for a job at a terminal, why can’t your doctor ask you to provide your symptoms and family medical history on an iPad?

The observations above lead directly to the following OCR Curve:

The OCR Curve - by Paul Firth

OCR usage will change dramatically over the next few years.

There are five distinct sections to this curve, which has the familiar shape of any product adoption curve. What may be unusual is the suddenness of the inflections we should expect.

A – Interestingly, until today we have only seen section A of the curve…gradual adoption of OCR technology. As accuracy has improved, there has been a steady increase in OCR usage over the past 20 years. Nothing dramatic has happened…yet.

B – Because of the confluence of factors described above, we are beginning to see a very dramatic increase in the number of pages recognized.

C – It will take some time to OCR trillions of legacy documents. The width of this portion of the curve will depend on the number of OCR engines installed worldwide. The more engines deployed, the narrower the curve, since the number of documents is roughly the same.

D – At some point, two things occur: we run out of old documents to OCR, and we collect and distribute less and less new data on paper. The market for OCR (and notably printing, then scanning) plummets at that point.

E – There will always remain some documents that need to be OCR’d. By then the numbers will have dropped dramatically, even lower than in the A section, but they will not go to zero for quite some time.
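
The claim under C, that more engines mean a narrower curve for the same pile of documents, comes down to simple division. Here is a minimal sketch of it; the backlog size, engine counts, and per-engine throughput are made-up numbers for illustration, not estimates, and the function name is my own.

```python
def years_to_clear_backlog(documents, engines, pages_per_engine_per_year):
    """Years needed to OCR a fixed backlog.

    Assumes one page per document and a constant per-engine throughput,
    both of which are simplifications for illustration.
    """
    if engines <= 0 or pages_per_engine_per_year <= 0:
        raise ValueError("engines and throughput must be positive")
    return documents / (engines * pages_per_engine_per_year)


# Hypothetical inputs, chosen only to show the shape of the relationship.
backlog = 2_000_000_000_000   # legacy documents waiting to be OCR'd
throughput = 100_000          # pages one engine handles per year

for engines in (1_000_000, 2_000_000, 4_000_000):
    years = years_to_clear_backlog(backlog, engines, throughput)
    print(f"{engines:>9,} engines -> {years:.1f} years")
```

Doubling the number of engines halves the width of section C, because the backlog itself stays roughly the same.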

The coming OCR revolution will greatly increase the revenue of companies involved with scanning, OCRing, transmitting, and storing documents. Millions of new OCR engines will be deployed, as every household and business scans and converts everything they have, either locally or remotely.

Following the “gold rush,” those companies that have not diversified into digital information management will find themselves unsustainable. Though clearly foreseeable, the suddenness with which OCR becomes unnecessary will catch many off guard.


Elements of a Paperless Reality

by Paul B. Firth July 20, 2010


We have waited so long for the paperless office that today, it is popular not only to pronounce that it remains many years off, but to point out that we are actually using more paper. Like the stock market, it’s exactly when we reach the point of widest doubt that a reversal of fortunes begins.

Although more dollars are going into manipulation of paper documents than ever before in history, we are close to the inflection point. Only now we can envision no longer just a paperless office, but a paperless life. The speed of technology transitions is increasing rapidly. Just a few years after this inflection, paper-based workflows may seem like absurdly inefficient relics. Reference points include Facebook, which went from its first open use in late 2006 to over 100 million users today, and the iPhone, which went from introduction in mid-2007 to 43 million units by the end of 2009.

Why now?

For the first time in history:

1. We have moved to a single-platform world… the web browser. The browser is the unifying technology that enables delivery of virtually all required information. It has taken about 15 years. The last few times a change of this significance in information-sharing occurred were television (introduced 1929 – adoption 1960s) and radio (introduced 1897 – adoption 1940s). Before that was Gutenberg’s press (1455). The one just prior to that was the invention of paper (ca. 105 CE in China, with papyrus going back to roughly 3000 BCE).

2. Most people will soon have web browsers in their possession at all times. Reliable sources indicate that over 50% of all web pages will be viewed on portable devices (not PCs) within 5 years…today it is under 2%. Most new television sets, and all gaming systems, include browsers. Bankers and brokers are fully aware of this trend, already offering most services via iPhone, BlackBerry, and Android. The fallout of that is just now beginning.

3. We are rapidly becoming spoiled. If I’m suddenly curious how tall Bob Costas is, I can find that out in 5 seconds. But if I am asked for my mother’s Social Security number, I have to wait until I get home, and then dig through a file cabinet.

4. The next generation of doctors will presume information availability, and it will be provided via the Nationwide Health Information Network (NHIN). Most other professions are ahead of the medical community, and are just waiting for wider enablement.

Operational Requirements of the Near Future

Many of the needs implied by a paperless life can be illustrated via your annual personal income tax debacle. I’ll assume that I implemented my paperless reality on January 1st, last year. Note that all of these actions are already possible, just not streamlined.

1. Online Storage – I splurged and purchased 20 GB of online storage from Google for $5 per year. All my documents are maintained and visible, no matter where in the world I happen to be.

2. Receipt Scan – All year long, every time I purchased something I thought might be deductible, I dropped it into (or onto) my scanner, or took a photo with my phone. Through a fabulous user interface, perhaps some OCR and form processing, and the Google API, all of my receipts are organized into meaningful folders, by year and category. In any event, I can easily reorganize these receipts online at any time later.

3. Bank and Credit Card Statements – My bank says they keep my electronic statements for 7 years, but as they are e-mailed, I collect them (rules engine?) and copy them to my Google folder just in case they go out of business.

4. Letter from Sis – She still likes writing letters. Yes…with a pen. She usually has some tax advice. I scan them to my Family folder.

5. Junk Mail – Unbelievable! A coupon I actually wanted! I scan it to my Coupons folder. The checkout lady is nice enough to scan the image off my cell phone at the store. Soon, USPS won’t have to deliver this junk anymore, as most coupons already arrive by e-mail, or I look them up as I wander the aisles.

6. Expense Reports – Luckily, my company has online entry, so I take pix of my receipts and classify them while I’m in my hotel room. I don’t lose receipts anymore.

7. Tax Time – As I complete my TurboTax return, I move each item to the Archive 2009 folder as it is processed. I have access to all of my bank documents and receipts. I have scanned all my 1099s, so they are easy to find. I eFile the return, and create one large PDF/A file containing every relevant document for the year, including a meaningful, auto-generated table-of-contents.

8. Audit Time – I’m part of the random audit. I bring my laptop to the IRS, and we go over all of my receipts online. I give them a CD with my 2009 PDF/A file on it, which they immediately print out for filing and later review.
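
The "by year and category" folder scheme from the Receipt Scan step is easy to picture in code. This is only a sketch under my own assumptions: the folder names, the `receipt_folder` helper, and the "Uncategorized" fallback are hypothetical, and nothing here talks to any real storage service.

```python
from datetime import date
from pathlib import PurePosixPath


def receipt_folder(purchase_date, category, root="Receipts"):
    """Return the folder for a scanned receipt as root/year/category.

    The layout and the default root name are illustrative assumptions.
    """
    # Normalize the category so "medical" and " Medical " land together.
    clean = category.strip().title() or "Uncategorized"
    return PurePosixPath(root) / str(purchase_date.year) / clean


print(receipt_folder(date(2009, 3, 14), "medical"))
print(receipt_folder(date(2009, 11, 2), "home office"))
```

Because the folder is computed from the receipt's own date and category, reorganizing later is just a matter of changing the rule and re-filing.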


It’s unlikely that any one company could drive the paperless life. None has both the reach and the experience in user interface design today. Absent that, many companies will lead the charge. Those with scanning, storage, Web, and UI design expertise have unique advantages. Some of the required technology gaps are as follows:

1. Intelligent Scanners – There is no reason that a scanner should require a user interface. Scanner OEMs could use our software to determine if a document requires color preservation, resize to the appropriate resolution, binarize, OCR, convert to PDF, and auto-index. New scanners must also be able to accept irregular-shaped objects, perform duplex scanning, and include sheet-feeders. Network scanners should scan directly to a destination Folder system across the Web (without a PC) for under $500.

2. Back-end Processing – Numerous operations should take place on the back end, allowing little or no software to run on mobile devices outside the browser. This includes OCR, document conversion, intelligent indexing and archiving via forms processing, document management, barcode reading, image cleanup, and presentment of barcode images to the phone for scanning. The ECM companies have a unique opportunity to dominate the consumer market by extending what they have developed for the business markets.

3. Web-based Presentment – There are numerous image-manipulation technologies that are currently being applied to the distribution of images to browsers, such as Silverlight, ASP.NET, and many others.

4. TurboTax Types – Virtually all companies that build A/P, A/R, inventory, and a host of other applications, are going to find a suddenly increasing need for OCR and related technologies. The TurboTax user, for example, will want to draw a box around an item’s price and description on a credit card bill, or select a 1099 box, and have it automatically entered into their return.

Proprietary and Confidential ©2010 by Paul B. Firth


Why we can’t recover

It’s difficult to ignore the doom and gloom. We would all like to believe that we are in the midst of an astonishingly rapid recovery. But there are several major outstanding issues that are destined to prevent that. Some people talk of a Square-root shaped recovery…a “V” followed by a stable period. But rarely has the market remained stable in recent times. Much more likely, even during a stable period, is a market fluctuating up and down 10 percent, with no long-term direction.

Interest rates
They are essentially zero now. They can go in only one direction. As they increase, the stock market will suffer because money markets will start to look better, without risk. This will pull billions out of the stock market, as investors have seen a lost decade and are more risk-averse than ever. More are also closer to retirement, reducing risk appetites. Bond prices also naturally drop as interest rates increase.
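
The last point, that bond prices fall as rates rise, follows from how a bond is valued: its price is the present value of its future payments, discounted at the going market rate. A minimal sketch, assuming a plain annual-coupon bond; the face value, coupon, maturity, and rates below are hypothetical numbers picked for illustration.

```python
def bond_price(face_value, coupon_rate, market_rate, years):
    """Present value of an annual-coupon bond discounted at market_rate."""
    coupon = face_value * coupon_rate
    # Discount each yearly coupon back to today.
    pv_coupons = sum(coupon / (1 + market_rate) ** t for t in range(1, years + 1))
    # Discount the repayment of face value at maturity.
    pv_face = face_value / (1 + market_rate) ** years
    return pv_coupons + pv_face


# A hypothetical 10-year bond paying a 3% coupon, priced at rising market rates.
for rate in (0.03, 0.04, 0.05):
    price = bond_price(1000, 0.03, rate, 10)
    print(f"market rate {rate:.0%}: price {price:.2f}")
```

When the market rate equals the coupon rate the bond is worth its face value; every increase after that pushes the price lower.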

Housing glut
The trillions lost in home equity translate into years of lost purchasing power for millions. As people lose money on forced sales, there will be more money coming out of the market.

Baby boomers have just begun retiring. This is a 20-year population surge of new retirees. Many will be moving to smaller homes, further extending the housing glut. Incomes (and therefore expenditures) of this huge group will be dropping. Many will be living off of their IRAs and 401k plans, which means pulling equity from their mutual funds, draining the market. There is no replacement for this wealth except a smaller number of replacement hires, at far lower compensation rates…good for business expense, but bad for the economy.

Figures don’t lie…
…but liars figure. All past recessions in recent memory were under 15 months long. Thus, when year-over-year results improved, we were comparing to mostly “good” times. This recession, at closer to two years long, has us comparing corporate earnings with completely dismal values from the depths of the recession. Earnings up 20% over last year? That might be meaningful if last year was healthy. But if earnings were close to zero, percentages are completely meaningless. When EPS goes from a penny per share to ten cents, that’s an increase of 900%, but it may be immaterial for evaluation purposes. And percentages are totally meaningless when earnings are negative, as they were for a significant part of the market. Our entire metric for understanding improvement is biased, and we have no valid mathematical representation for tiny changes in revenue that push earnings from slightly negative to slightly positive.
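
To see how the percentage breaks down, here is a small sketch of the year-over-year calculation. The three EPS pairs are hypothetical, and returning `None` for a zero or negative base is my own convention for "no meaningful percentage," not a standard accounting rule.

```python
def percent_change(old, new):
    """Year-over-year change in percent.

    Returns None when the base is zero or negative, because the ratio
    then has no sensible interpretation.
    """
    if old <= 0:
        return None
    return (new - old) / old * 100


# Hypothetical EPS pairs: healthy base, near-zero base, negative base.
cases = [(1.00, 1.20), (0.01, 0.10), (-0.05, 0.02)]

for old, new in cases:
    change = percent_change(old, new)
    label = "not meaningful" if change is None else f"{change:.0f}%"
    print(f"EPS {old:+.2f} -> {new:+.2f}: {label}")
```

The healthy company and the penny-a-share company both "improved," but only the first percentage tells you anything, and the third case has no percentage at all.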

Enjoy your “low” taxes
Income taxes are at or near their lowest point historically, even as the government deficit is at its highest point in history. Conclusion? Taxes will have to increase or the government goes out of business. Every dollar that goes to the government to cover debt comes directly out of the economy. And the cost to service that government debt is going to skyrocket when interest rates increase. Once again, the box is closed on all sides, with no room to move.

Too many missing jobs
The New York Times reported that, even at the fastest job-adding rate in history, adding over 300,000 jobs per month, it will take almost 5 years to return to pre-recession employment.

The federal stimulus programs are now winding down. Cash for clunkers, home-buyer incentives, and major government subsidies are beginning to spend out. There’s no oomph in the recovery to absorb those losses, and we haven’t climbed back up the hill. “Just one more time” stimulus will worsen most of the problems above.

Financial Recovery?

Should we be optimistic about the economic recovery? IS there actually a recovery? Well, the market has certainly reacted as if there has been. But how far can it take us? In following posts, I’ll discuss the risks to recovery, the roadblocks ahead, and the reasons that the way out is far from easy. Stay tuned!
