Thursday, 18 November 2010

Google Instant: how I *wish* it worked

There's something very grating about a product that could be really useful but just isn't. It's like the really promising kid on X-Factor / American Idol who keeps falling apart and forgetting their song. The first couple of times you're rooting for them, but after they blow it repeatedly you start to wish they'd just give up. For me, this is a perfect metaphor for the current iteration of Google Instant.

Google Instant, how do I hate thee? Let me count the ways.
>> Too damn fast
I'm on a fast connection and if anything GI is just too damn fast. Results are constantly flickering just below my field of focus. I tend to use very exact (i.e. long) search phrases and this gets old quick. I find myself pausing while typing to look at GI results that are irrelevant to what I actually need. The cynic in me wonders whether this is what Google want. Are they trying to be 'sticky' now, despite a decade of saying this isn't their goal?
>> Very generic phrases
It's frustrating how useless the GI suggestions are. GI only gives you very generic phrases and they seem to be based on average global searches. The trouble is that I'm not an average global searcher. I've been using Google forever; they have a huge trove of data about what I've searched for and which results I've clicked on. They know which subjects I'm interested in and which ones I'm not. They put this info to use in 'normal' search in a variety of ways but apparently not in GI.

Technologically impressive
GI is clearly a very impressive bit of technology. The number of elements that have to work in harmony for it to return results that fast is honestly a little mind boggling. I take my hat off to the clever clogs who made this happen. Nevertheless, I'd prefer to wait (like a whole *second*) for even slightly better results.

An incomplete puzzle
Having said that, I'm quite sure that GI can be made a lot better and, as it's bad form to complain without offering a solution, I have some suggestions about how it could be better. None of my suggestions are particularly original (or insightful?); mostly I'm just suggesting that existing elements be combined in better ways.

Instant Example
Here's a sample of Google Instant in action. 5 suggestions and a ton of dead space.  It's not clear how the  suggestions are ordered or whether the order has some hidden meaning. (I should probably check out whether more costly PPC terms appear first…)

Related Searches
Here's a sample of the Related Searches option. This is buried under "More Search Tools" on the left. Obviously there are more suggestions here but they are also different from Instant and in a different (also non-obvious) order.

Wonder Wheel (of Doom!)
Here's a sample of the Wonder Wheel option, also buried under "More Search Tools". Again there is no context around any of the terms, and the underlines suggest links but actually trigger a new 'wheel'. The results are displayed on the left, but you need a very wide screen, otherwise they only get ~200 pixels of width.

A mockup example
Here's my mockup for your amusement. Points to note:
1) Search suggestions appear in columns; each column depends on the column to its left. If you've used Finder on the Mac you know the score here. The user can navigate with the mouse or arrow keys.
2) Suggested terms are greyscale to indicate some hidden metric that may help the user choose between terms. Possible metrics: number of results, popularity of the term, previous visits, etc. Previously used terms could appear in purple. There are lots of possibilities here.
3) When a term is selected (using space or arrow right) it's added to the search box. Terms can be removed the same way (arrow left or backspace). I strongly feel that users should be encouraged to build long and specific search terms. Long terms are far more likely to result in quality responses in my experience.
4) Note that all aspects of Google's offering can be integrated in the Instant Search experience. I've noticed that the video, image and social aspects have dropped to the bottom of the results. My mockup allows them to become much more front and center.
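
For the programmers in the audience, here's a rough sketch of the column idea from the points above. It's only an illustration: the suggestion tree, the terms in it and the function names are all invented, and it says nothing about how Google actually stores suggestions. Each column is simply the list of terms that can follow whatever was picked in the column to its left.

```python
# Hypothetical suggestion tree: each key is a term, each value holds the
# terms that may follow it. All of this data is invented for illustration.
SUGGESTIONS = {
    "python": {
        "tutorial": {"beginners": {}, "pdf": {}},
        "download": {"windows": {}, "mac": {}},
    },
    "pythagoras": {"theorem": {"proof": {}}},
}


def columns_for(selected, tree=SUGGESTIONS):
    """Return one list of suggestions per column for the selected terms.

    Column 0 lists the top-level terms; column n lists the terms that can
    follow selected[n - 1]. Walking stops at an unknown term.
    """
    columns = [sorted(tree)]
    node = tree
    for term in selected:
        if term not in node:
            break
        node = node[term]
        if node:  # only add a column when there is something to show
            columns.append(sorted(node))
    return columns


def select(selected, term):
    """Arrow right / space: add a term to the query being built."""
    return selected + [term]


def deselect(selected):
    """Arrow left / backspace: drop the most recent term."""
    return selected[:-1]


query = select(select([], "python"), "tutorial")
print(" ".join(query))
for index, column in enumerate(columns_for(query)):
    print(index, column)
```

The greyscale metric from point 2 could hang off the same tree as a number per term; I've left it out to keep the sketch short.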

*Do* be dense
Ultimately the Instant Search experience needs to become much more information dense. Sure "your mom" might not appreciate the color coding of the suggestions but it doesn't detract from her experience. Google needs to think much more holistically about Instant. Just getting any old crap faster is not an improvement regardless of how impressive it is, but getting the exact right result faster would be invaluable.

Tuesday, 2 November 2010

Thoughts on 37signals requirements for a "Business Analyst"

Jason Fried posted a job requirement on Friday evening for a new "Business Analyst" role at 37signals, although in reality the role is more of a Business Intelligence Analyst than a typical BA as I have experienced it. The role presents something of a conundrum for me and I thought it would be interesting to pick it apart in writing for your enjoyment.

UPDATE: I've made a follow-up comment on the 37s blog that is a pretty good summary of this post - "I was reflecting on the requirement from years of experience doing this kind of thing (sifting meaning from piles of data). I actually said that they’ve described 2 roles not often combined in a single person.

So let me give an actual suggestion:
> First, do the basics, make sure the data is organised and reliable.
> Second, establish your metrics, build a performance baseline.
> Third, outsource the hard analytics on a “pay for performance” basis.
> Finally, if that works, then think about bringing analytic talent in-house.

It’s hard to describe the indignity of hiring a genius and then forcing them to spend 95% of their time just pushing the data around."

I call myself a Business Intelligence professional and my last title was "Solution Architect". You can review my claim to that title on my LinkedIn profile. This should be a great opportunity for me. I'm a fan of 37signals products; I love the Rework book and I pretty much agree with all of it; and I really like their take on business practices like global teams and firing the workaholics.

However, it doesn't seem like this role is for me. Why not? It seems like they've left out the step where I do my best work: gather, integrate, prepare and verify the data. In the Business Intelligence industry we usually call the results of this phase the "data warehouse". A data warehouse, in practice, can't actually be defined in more detail than that. It's simply the place where we make the prepared data available for use. Nevertheless, the way that you choose to prepare the data inherently defines the outcomes that you get. It's the garbage in, garbage out axiom.

Jason tells us a little bit about where their data comes from: "[their] own databases, raw usage logs, Google Analytics, and occasional qualitative surveys." We're looking at very raw data sources here. Making good use of these data sources will take a lot of preparation (GA excepted) and will require a serious investment of time (and therefore money). The key is to create structures and processes that are automated and repeatable. This may seem obvious to you, but there's a sizeable number of white collar workers whose sole job is wrangling data between spreadsheets [ e.g. accountants ;) ].

The content and structure of the data is largely defined by the questions that you want to answer. Jason has at least given us an indication of their questions: "How many customers that joined 6 months ago are still active?"; "What’s the average lifetime value of a Basecamp customer?"; "Which upgrade paths generate the most revenue?". That's a good start. I can easily imagine where I'd get that data from and how I'd organise it. This is 'meat & potatoes' BI and it's where 80% of the business value is found. These are the things you put on your "dashboard" and track closely over time.
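
To show how little magic there is in the first two questions once the data is prepared, here's a minimal sketch. Everything in it is an assumption on my part: the customer records are invented, "active" is taken to mean "seen in the last 30 days", and revenue to date is used as a crude stand-in for lifetime value (a real figure would need to account for customers who haven't left yet).

```python
from collections import namedtuple
from datetime import date

# Invented record layout and sample rows, purely for illustration.
Customer = namedtuple("Customer", "id signup last_active monthly_fee months_paid")

CUSTOMERS = [
    Customer("a", date(2010, 5, 3), date(2010, 11, 1), 49, 6),
    Customer("b", date(2010, 5, 17), date(2010, 7, 2), 24, 2),
    Customer("c", date(2010, 5, 20), date(2010, 10, 28), 99, 5),
    Customer("d", date(2010, 6, 1), date(2010, 11, 2), 24, 5),
]


def retention(customers, year, month, as_of, active_window_days=30):
    """Share of the cohort that signed up in (year, month) and is still active.

    'Active' is assumed to mean: last seen within active_window_days of as_of.
    Returns None when the cohort is empty.
    """
    cohort = [c for c in customers if (c.signup.year, c.signup.month) == (year, month)]
    if not cohort:
        return None
    active = [c for c in cohort if (as_of - c.last_active).days <= active_window_days]
    return len(active) / len(cohort)


def average_revenue_to_date(customers):
    """Crude lifetime value proxy: mean revenue collected per customer so far."""
    return sum(c.monthly_fee * c.months_paid for c in customers) / len(customers)


as_of = date(2010, 11, 2)
print("May cohort still active:", retention(CUSTOMERS, 2010, 5, as_of))
print("Average revenue to date:", average_revenue_to_date(CUSTOMERS))
```

The arithmetic is trivial; the hard (and expensive) part is getting reliable signup, activity and billing data into one place to begin with.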

Another question is trickier: "In the long term would it be worth picking up 20% more free customers at the expense of 5% pay customers?" There's a lot of implied data packed into that question: what's the long term?; what does 'worth' mean?; can you easily change that mix?; are those variables even related?; etc. This is more of a classical business analysis situation where we'd build a model of the business (usually in a spreadsheet) and then flex the various parameters to see what happens. If you want to get fancy you then run a Monte Carlo simulation where you (effectively) jitter the variables at random to see the 'shape' of all possible outcomes. This type of analysis requires a lot of experience with the business. It's also high risk because you have to decide on the allowed range of many variables and guessing wrong invalidates the model. It can reveal very interesting structural limits to growth and revenue if done correctly. Often the credibility of these models is defined by the credibility of the person who produced them, for better or worse. Would that be the same in 37signals?
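
Here's a toy version of that Monte Carlo idea, so you can see the moving parts. Every number in it is made up: the signup counts, the ranges for the upgrade rate, customer lifetime and fee, and the revenue model itself are my assumptions, not 37signals figures. I've also read the question as "20% more free signups in exchange for 5% fewer paid signups".

```python
import random


def cohort_value(free, paid, upgrade_rate, fee, lifetime_months):
    """Toy revenue model: paid signups plus free signups that later upgrade."""
    return (paid + free * upgrade_rate) * fee * lifetime_months


def simulate(n_runs=10_000, seed=1):
    """Jitter the uncertain inputs and compare the scenario with the baseline."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    base_free, base_paid = 1000, 200  # invented monthly signups
    deltas = []
    for _ in range(n_runs):
        # Allowed ranges are guesses; choosing them badly invalidates the model.
        upgrade_rate = rng.uniform(0.01, 0.10)  # free -> paid upgrade rate
        lifetime = rng.uniform(6, 36)           # months a paying customer stays
        fee = rng.uniform(24, 99)               # monthly fee
        baseline = cohort_value(base_free, base_paid, upgrade_rate, fee, lifetime)
        scenario = cohort_value(base_free * 1.20, base_paid * 0.95,
                                upgrade_rate, fee, lifetime)
        deltas.append(scenario - baseline)
    deltas.sort()
    return {
        "share_better": sum(d > 0 for d in deltas) / n_runs,
        "p5": deltas[int(0.05 * n_runs)],
        "median": deltas[n_runs // 2],
        "p95": deltas[int(0.95 * n_runs)],
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(name, round(value, 3))
```

In this toy model the answer hinges entirely on whether the upgrade rate is above or below 5%, which is exactly the kind of structural limit this sort of exercise tends to expose. It also shows why the allowed ranges matter so much: move the upgrade range and the answer flips.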

Now we move on to slippery territory: "What are the key drivers that encourage people to upgrade?"; "What usage patterns lead to long-term customers?". We're basically moving into operational research here. We want to split customers into various cohorts and analyse the differences in behaviour. The primary success factor in this kind of analysis is experimental design and this is a specialist skill. Think briefly about the factors involved and they make the business model seem tame. How do we define usage patterns? Will we discover them via very clever statistics or just create them _a priori_? What are the implications of both approaches? The people who can do this correctly from a base of zero are pretty rare, in my experience. However, this is an area where you can outsource the work very effectively if you have already put the effort in to capture and organise the underlying information.

And the coup de grace: "Which customers are likely to cancel their account in the next 7 days?". It sounds reasonable on its face. However, consider your own actions: Do you subscribe to any services you no longer need or use? The odds are good that you do. Why haven't you cancelled them? Have you said to yourself: "I really need to cancel that when I get a second."? And yet didn't do it. When you finally did cancel it, was it because you started paying for a substitute service? Or were you reminded of it at just the right time when you could take action? Predicting the behaviour of a single human is a fool's game. The best you can do is group similar people together and treat them as a whole ("we expect to lose 10% of this group"). Anyone who tells you they can do better than that is probably pulling your leg, in my humble opinion.
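
The group-level approach is easy to sketch. Assume (and this is purely my invention) that customers have already been sorted into segments and that we have a history of who cancelled in past periods. The historical rate per segment then gives an expected number of losses for the current membership, without pretending to know which individuals will go.

```python
from collections import defaultdict

# Invented history: one (segment, cancelled_in_period) pair per customer-period.
HISTORY = (
    [("dormant", True)] * 1 + [("dormant", False)] * 9
    + [("active", True)] * 1 + [("active", False)] * 49
)


def churn_rates(history):
    """Historical share of each segment that cancelled within a period."""
    totals = defaultdict(int)
    cancelled = defaultdict(int)
    for segment, did_cancel in history:
        totals[segment] += 1
        cancelled[segment] += int(did_cancel)
    return {segment: cancelled[segment] / totals[segment] for segment in totals}


def expected_losses(rates, current_sizes):
    """Expected cancellations per segment: historical rate times current size."""
    return {
        segment: rates[segment] * size
        for segment, size in current_sizes.items()
        if segment in rates  # skip segments with no history
    }


rates = churn_rates(HISTORY)
losses = expected_losses(rates, {"dormant": 300, "active": 2000})
for segment, count in losses.items():
    print(f"{segment}: expect to lose about {count:.0f} ({rates[segment]:.0%})")
```

Note what this doesn't do: it never names a single customer. It only says how many to expect from each group.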

Finally, let's have a look at their questions for the applicant's cover letter:
1. Explain the process of determining the value of a visitor to the home page.
>> Very open ended. How do we define value in this context? The cost per visit / cost per click? The cost per conversion (visitors needed to deliver a certain number of new paid/free signups)? Or perhaps the acquisition cost (usually marketing expenses as a % of year one revenue)? I'd say we need to track all of those metrics but this is pretty much baseline stuff.
2. How would you figure out which industry to target for a Highrise marketing campaign?
>> This isn't particularly analytical; I'm pretty sure the standard dogma is to sell to the people who already love your stuff. It would be interesting to know how much demographic data 37signals has about their customers' industries. This is an area where you typically need to spend money to get good data.
3. How would you segment our customer base and what can we do with that information?
>> This is a classical analytic piece of work. I've seen some amazing stuff done with segmentation (self organising maps spring to mind). However, in my experience models based on simple demographics (for individuals) or industry & company size (for businesses) perform nearly as well and are much easier to update and maintain.

As far as I can tell they want to hire a split personality. Someone who'll A) create a reliable infrastructure for common analysis requirements, B) build high quality models of business processes and C) do deep-diving 'hard stats' analytics that can throw up unexpected insights. Good luck to them. Such people do exist; simple probability essentially dictates that this is the case. And 37signals seems to have a magnetic attraction for talent so I wouldn't bet against it. On the other hand, one of their koans is not hiring rockstars. This sounds like a rockstar to me.

Full disclosure: I'm currently working on a web service (soon to be at that synchronises  various web apps and also backs them up. It's going to launch with support for Basecamp, Highrise and Freshbooks (i.e. 2/3rds 37signals products). Make of that what you will.
