Tuesday, September 28, 2010

Android Voice Actions

Tim Bray posted a note about a new "listen to" Voice Action which is intended to permit users to issue commands in a fairly flexible (and not strongly defined) manner such as "listen to Birthday by the Beatles".

It is a neat concept for integrating technologies, but I'd urge him to reconsider the wisdom of the command's voicing, as it is not an imperative addressed to the device: the user does the listening, and the phone does the playing. You don't go to a concert and scream out "Listen to Freebird"; the correct form is "Play Freebird!"

This is not a small issue. The more such mechanisms grow and blossom, the more vital it will be to choose and adhere to the right form of expression, so that it scales without inconsistencies and without a descent into mealy-mouthed passivity.

They should fix that.

Friday, June 18, 2010

Gmail/Google/Android contacts

Google's contact management system for Gmail has a wide impact on how I communicate these days. It is a poorly organized affair, whether you look at it from the perspective of an end-user or a third-party app coder who must make use of the data.

This entry covers the shortfalls I see from the end user side.

Impact on End Users

In a manner typical of Google's strength in collecting information to serve the user, its various platforms (Gmail, Android, etc.) keep track of any contact-related information, whether or not it is directly tied to a "contact" in the sense of the word consistent with common practice (a record for a person explicitly added to an address book by reference or authored from scratch). To wit, if you send or receive an email, the email address of the other correspondent will be filed helpfully away, whether or not this person is a "contact" the user has explicitly created a record for. Similar steps are taken with other communications, such as explicitly dialed phone numbers or caller ID information encountered while using your Android phone. It scales out to just about any type of media you could come up with ... tweets, IMs, activity on YouTube, etc.

This is a helpful sort of initiative, serving the user in future by allowing Google to offer auto-completion should the user start entering a familiar phone number or email address in another communication addressed literally (in the sense that I typed an address rather than a contact name). However, it falls flat when Google takes the entire venture too far and fails to recognize and honor a distinction between people the user truly cares about in a persistent manner and those who are merely "passing ships in the night".

For instance, my Gmail page has a presence-tracking chat list in the left column, peopled largely by random bozos spiced with a few truly chat-worthy people. This is substantially due to customers who sent me a single email inquiry regarding my Android app having been automatically tossed in there alongside friends I have known for 15 years and see regularly in real-world social situations. It turns out there is an option to disable this automatic function, but one has to wonder whether the better default would have been to differentiate those people for whom I have explicitly created a contact record from those with whom Google senses I have had some interaction.

Worse things happen when I visit Gmail's (or Android's) contact apps, where we see that the line between who IS a contact and who is NOT a contact is blurred to the point of erasure. Google generically calls all these people "contacts", and attempts to placate my concerns about this lumping-in by placing the true contacts into a group called "My Contacts" and throwing all the others into a superset of this called "All Contacts". This is an unhelpful organization, and one that I believe replaced a superior model Google had previously employed.

In the old model, rather than have a superset of My Contacts which added in the fragmentary or rich hints of other people who might be good candidates for promotion to the fold of people I consider my true contacts, Google kept these aspiring contacts in a disjoint set called "Suggested contacts" (or similar). This offered me an easier means of managing those who were "My Contacts" and those who were not, as I could visit the set of Suggested Contacts and tag and move over any who seemed to want to come over to the rarefied pantheon of power.

In the new model, there is no way I can review in one place the people and addresses Google has sensed might be important to me -- if I am seeing these people at all, they are in "All Contacts", heterogeneously mixed in with the people who are already "My Contacts".
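To make the difference between the two models concrete, here is a minimal sketch in Python. It is not Google's data model or any real API; the function names and the example addresses are hypothetical, chosen only to show that the old "Suggested contacts" view is the set difference the new model leaves the user to work out by hand.

```python
def suggested_contacts(all_contacts, my_contacts):
    """Old-model view: everyone Google has noticed who is NOT yet a true contact.

    Both arguments are plain collections of identifiers (hypothetical email
    addresses here). The result is disjoint from my_contacts by construction.
    """
    return set(all_contacts) - set(my_contacts)


def promote(address, my_contacts):
    """Return a new 'My Contacts' set with one suggested address moved over."""
    return set(my_contacts) | {address}


# Hypothetical data: two real friends and two one-off correspondents.
my_contacts = {"old.friend@example.com", "sister@example.com"}
all_contacts = my_contacts | {"customer1@example.com", "customer2@example.com"}

# The one-place review list the new model no longer offers directly.
to_review = suggested_contacts(all_contacts, my_contacts)
```

Promoting a customer who turned into a friend would then be `promote("customer1@example.com", my_contacts)`, after which that address drops out of the review list on the next call.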

Why is this important? Well, other than wanting a chat bar of people I find relevant to my day, I am particularly dogged by the need for a manageable contact list for use in voice dialing on my Android phone. This is not only important for my own purposes, but for the purposes of everyone who uses my Android app, as it offers voice dialing and other services where users must speak the name of one of their contacts, and Google is not making it easy for users to manage who is a contact and who is not. This is crucial for me, as the speech recognizer is limited in how many names it can store, and the utility of the app plunges when the precious name space is clogged with transient losers who are difficult to keep from naturally trying to swim into the contact list.

Impact on Developers

It is also next to impossible for me to determine who is a true contact myself in my Java code, but that is to be the subject of a second chapter touching on Android's lack of an object-oriented Java API for accessing contact data. The upshot of this deficiency is a vastly increased need for every third-party coder to write the same arduous URI-based code on sands that shift with each revision of the Android platform. It creates instability, versioning nightmares and exploding test case scenarios that no small third-party developer will ever actually be able to identify or enumerate, let alone support properly.


Friday, May 21, 2010

Apple: how text search should work

My own sense of how the simple text search functions built into word processing, text editing and a variety of other common apps should work was probably forged by Apple's early usage patterns for this field (or was this a case of Microsoft doing something right?). It's simple, really: you invoke a key combination or menu function, and a dialog pops up to let you type out the pattern you'd like to find within the document you were last looking at. There are some other functions that are nice to have:
  • Do you want to search forward or backward from the present spot?
  • Must the case of the letters match?
  • (in a fancy app such as a code editor) will there be regular expressions in the search criterion?
  • (almost always) if the application can modify the text, other fields to permit search-and-replace
And, lastly, there are some contextual factors to consider, most notably: what should be in the text area where you enter the search string when the dialog box appears?

It seemed that there was a consensus of how this should be done, and things were spiffy: if there was text selected in the document when you pressed cmd+f, the selected text would be copied into the search box, affording you an easy means of double-clicking a word and hitting cmd+f and then enter to find the next occurrence.

In a thoughtful embodiment, successive presses of enter on the search box (or cmd+g from any context) would find successive occurrences of the search string. If there were no text selected in the document when you pressed cmd+f, the search box would come up blank or with the previous search string entered and selected, which offered a convenient alternate means to repeat a recent search. The fact that the text would be selected, however, meant that typing any new search string at all would replace the old one. This kept the fresh-start case from being even a tiny hassle. Though you still can find apps that do this precisely right (e.g., the "search within page" function in Firefox), life was good when apps abided by this pattern or a near variation.
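The pattern described above is small enough to write down. What follows is a minimal sketch in Python, not the code of any actual app; the function names are my own invention, and the case-insensitive default is an assumption. It shows the two decisions that matter: what goes into the search box when cmd+f is pressed, and where the next occurrence is.

```python
from typing import Optional


def initial_search_text(selection: str, previous: str) -> str:
    """Text to place (pre-selected) in the search box when cmd+f is pressed.

    A selection in the document wins; otherwise the previous search string
    is offered, which is empty if there has been no earlier search.
    """
    return selection if selection else previous


def find_next(document: str, pattern: str, start: int,
              match_case: bool = False) -> Optional[int]:
    """Index of the next occurrence at or after `start`, or None.

    Simplification: case-insensitive matching lowercases both strings,
    which is adequate for plain ASCII text.
    """
    if not pattern:
        return None
    haystack = document if match_case else document.lower()
    needle = pattern if match_case else pattern.lower()
    index = haystack.find(needle, start)
    return None if index == -1 else index


# Double-click a word, cmd+f, enter: the selection becomes the search string.
doc = "Play Freebird, then play it again"
query = initial_search_text(selection="play", previous="Freebird")
first = find_next(doc, query, 0)
# cmd+g: search again, starting just past the previous hit.
second = find_next(doc, query, first + 1)
```

Because the pre-filled text comes up selected, typing anything replaces it, so the fresh-start case costs nothing. Pressing cmd+f with no selection never needs to "bonk": `initial_search_text` simply falls back to the previous string or to an empty box.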

But suddenly, the apps that seem most vital to me on the Mac do it pretty darn wrong. They offer no advantages, only shortfalls from the pattern described above. TextEdit (v1.6) and Dashcode (v3.0.1) do not take any text selection from the document when their cmd+f search box is invoked -- they present the previously searched-for text. If you want to search for a word you've selected, rather than the two steps of cmd+f/enter, you must perform 4 steps: cmd+c/cmd+f/cmd+v/enter to achieve this very common desideratum. However, they add a second search function ("Use selection for find" ... cmd+e) which behaves totally differently. It does not actually perform a find at all, but simply copies the selected text into an unseen buffer of what WOULD be in the search box. Actual searching is achieved by subsequent use of cmd+g or shift+cmd+g. Pressing cmd+e when there is no text selected greets the user with a "bonk" error tone... as if he made a mistake of some kind ... stupid user.

Proving that more thought can only be more toxic, Xcode (v3.2.2) takes the cake. It starts with the same behavior as the smaller apps and adds a third function ("Find selected text"... no cmd key shortcut) which almost works as you'd like, but it does so without a search box being present, which denies you access to any of the other options this box affords (it could be summoned via cmd+f). Here again, using this function when there is no selected text is rewarded by a "bonk" tone.

Why are so many "behaviorful" functions created when the problem and its sub-cases are so simple? If there is a plausible rationale for why anyone would prefer the approaches taken in these apps to the simpler and more flexible pattern defined above, I'd like to hear it. I would like to feel this is an embarrassing situation of the right way being patented, as I'd tend to think only the exigencies of our patent system could prompt Apple to produce such an almost Soviet-style solution to text search.

Take Away: Apple... make your text searches as easy as cmd+f, cmd+g, shift+cmd+g -- with additional options for text replacement and such being available on a find panel that can be dismissed by pressing escape. Your present approach is elephantine and requires the user to mull a variety of tools when only one is needed.

Wednesday, May 19, 2010

Future topics

  • OS X's icons and labels for modifier keys
  • Creating folders in the OS X Finder
  • Rudder pedals on aircraft
  • power windows on an Aston Martin Vantage (sorry, Blade)
  • Auto-correct and word completion on iPhones and Android phones
  • Eclipse, Visual Studio, Xcode and IDEA
  • completeness of an interface