Ambient News

As some people know, it’s possible to get the latest news about our favorite sites on a single page through a fairly ubiquitous technology called web syndication. The advantage of this is that we can look at all the news we want in a single place, instead of having to visit dozens of websites per day.

Unfortunately, actually setting up web syndication can be a chore—and often, a confusing one at that. For instance, the way Firefox lets the user know if syndication is available for a page they’re looking at is by using an icon on the URL bar:

It’s that funky thing to the left of the star that looks like some concentric quarter-circles on a blue background. As Aza has explained in his post The End of an Icon, using a cryptic graphic can make it difficult for an end-user to know what the icon means unless someone tells them. So that’s the first barrier.

There’s more, though. On many pages, clicking on the aforementioned icon gives you a pop-up menu that looks like this:

RSS 2.0, RSS 0.92, and Atom 0.3 are all different formats for conveying essentially the same information. I personally have no idea what the differences between them are, and I imagine that most people don’t either. So presenting end-users with a fairly meaningless and intimidating question is yet another barrier to taking advantage of this technology.

But there’s even more. At this point, the user is presented with a page that requires them to choose a program to actually read their news with. After doing some research and picking a reader and learning how to use it, they need to manually subscribe to all the sites that they visit often.

All in all, this process is such a hassle that most people I know don’t bother using web syndication. I’ve only been an infrequent user of it myself; my newsreader tends to fall into disuse when my subscription list inevitably becomes out-of-sync with the sites that I actually visit.

So, in an attempt to solve this problem and explore the possibility of ambient information in the browser, I’ve started a little experiment. It’s a Firefox Extension called “Ambient News”, and its goal is to provide the user with zero-cost news about the sites that they visit frequently. The extension requires no configuration; you just install it and see if it helps you out.

One of the many great things about Firefox 3 is its Places subsystem—this isn’t so much a user-facing feature as it is an underlying engine that makes it really easy to create functionality that takes the user’s web-browsing history into account. So Ambient News leverages this to automatically figure out what sites you visit most frequently. When you visit them, it sees if they have news associated with them. And whenever you open a new browser tab, the blank page that shows up doesn’t stay blank. News about the sites you visit gently fades in, and you can click on any of it to view the new content.
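
As a rough illustration of the "figure out what sites you visit most frequently" step, here is a minimal Python sketch. It is not the Places API and not the extension's actual code; the function name and the sample history are assumptions made purely for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def rank_sites_by_visits(visited_urls):
    """Return hostnames ordered from most visited to least visited."""
    # Count one visit per URL, grouped by the site's hostname.
    counts = Counter(urlparse(url).netloc for url in visited_urls)
    # most_common() sorts by count, highest first.
    return [host for host, _ in counts.most_common()]


# Hypothetical browsing history, standing in for what Places would provide.
history = [
    "http://planet.mozilla.org/",
    "http://www.joelonsoftware.com/",
    "http://planet.mozilla.org/",
    "http://planet.mozilla.org/",
    "http://www.joelonsoftware.com/items/1",
]
ranked = rank_sites_by_visits(history)
```

With this sample history, the Planet Mozilla hostname ranks ahead of Joel on Software, so its news would fade in first.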

For instance, shortly after installing the extension, I visit Planet Mozilla and Joel on Software. When I create a new tab, first news about Planet fades in, and then news about Joel-on fades in, which results in the following:

The Planet Mozilla news shows up before the Joel-on news because Ambient News has used the Places subsystem to figure out that I visit Planet more often than Joel-on. It can automatically access protected information like LiveJournal friends-only posts and intranet forums as long as I’m logged in to the relevant sites. And it all perfectly preserves my privacy, because the information that Ambient News mines is on my computer and stays there—it never goes to some company’s server for analysis and indexing.

Right now the extension is pretty primitive, and doesn’t do a lot of things that I’d like it to. But it’s good enough to start dogfooding and experimenting with, so if you’re brave and would like to try it out, feel free to install version 0.0.5 alpha. And if you’re a developer, you can check out the HG repository.

EDIT: The original version posted was 0.0.3 alpha, but bugfixes have been made since then.

Herdict: The Verdict of the Herd

I’m still in the middle of reading The Future of the Internet and How to Stop It, but one of the major “take-aways” from the book is a software suite that Zittrain has been working on at Harvard University’s Berkman Center for Internet and Society called Herdict, which is a portmanteau of “herd” and “verdict”.

From what I understand, one component of the suite, Herdict for Network Health, is a Firefox/IE plug-in that allows an end-user’s computer to tell “the herd”—that is, the other users of the software as a single anonymous entity—what sites it can access. If a user can’t access a particular site, they can ask the herd for more information; this “verdict” can help determine whether you can’t access a site because the site is down (in which case the entire herd can’t access it), or because a firewall is in the way (in which case only some of the herd can access it). This information can then be used to generate a snapshot of Internet health by geography, and empowers users to figure out the true cause behind cryptic messages like “The Connection Has Been Reset”.
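
The reasoning behind the "verdict" can be sketched in a few lines of Python. This is only my reading of the idea, not Herdict's actual logic; the function name, the report format, and the wording of the verdicts are all assumptions.

```python
def herd_verdict(herd_reports):
    """Guess why a site is unreachable for me, given the herd's reports.

    herd_reports is a list of booleans, one per herd member:
    True if that member could reach the site, False otherwise.
    """
    if not herd_reports:
        return "unknown"
    if not any(herd_reports):
        # Nobody in the herd can reach it: the site itself is likely down.
        return "site appears to be down"
    # Some of the herd can reach it, so something between me and the
    # site (a firewall, for instance) is the more likely culprit.
    return "site may be blocked for you"
```

For example, `herd_verdict([False, False, False])` points at the site being down, while `herd_verdict([True, False, True])` points at something blocking only part of the herd.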

The other component of the suite, Herdict for PC Health, is analogous in that it uses the same “herd verdict” concept to figure out how your PC is doing. Anonymized configuration data is sent from your computer to the herd, and if your computer is running slowly or abnormally, the herd can be queried for advice. Assuming that the Herdict software itself isn’t compromised, this can help identify malware, as well as pinpointing the causes of less malevolently-intentioned computer malfunction. For instance, if your computer keeps crashing, consultation with the herd could result in the discovery that everyone with your graphics card has been having the same problem, implying that you may need to change your graphics card drivers.

It looks like the network health component is still under development, but an initial version of the PC health software has been released and is available for download. It’s only available for Windows, so I installed it on my Mac’s VMware virtual machine running Windows XP; there isn’t much to say about it, because it doesn’t currently appear to have any usable features. Right-clicking the program’s tray icon and selecting a “View Data Sent” option from the popup menu just results in a dialog box with the text “There are no logs to be displayed”, despite the fact that the software has been running for a few hours. Selecting the “Herdict Online” option takes me to a Herdometer web page where all the data is aggregated for public use.

It’s a pretty interesting idea, and one that reminds me of Mitchell Baker’s desire to see Mozilla address the issue of data. Herdict is an example of software that uses data about the sites you visit and the programs you install on your computer for honorable ends, publishing it in an anonymized and aggregate form that is useful as a public asset.

The only thing I’m really curious about right now is: why isn’t Herdict open-source software? It seems like the ideal kind of project to open-source for a variety of reasons, and the non-profit, public benefit goals of the Berkman Center certainly seem to agree with the philosophy of community-based development. In any case, I’m looking forward to seeing this project evolve; it’s wonderful to see experiments that try to make the Internet and PCs safer places without sacrificing freedom and generativity.

Tab Navigation: Tradeoffs

One of the latest features to land on the trunk of the mozilla-central source code repository—what will eventually become Firefox 3.1—is a new mechanism for switching between tabs in Firefox when using the Ctrl+Tab and Ctrl+Shift+Tab shortcut gestures.

In Firefox 3.0 and earlier, pressing Ctrl+Tab brings the tab to the right of the currently visible tab into focus, and pressing Ctrl+Shift+Tab brings the tab to the left into focus.

One major problem with this interface is that it’s usually modal: the user’s locus of attention is often focused on the page they want to see, rather than the location of the desired page relative to the current page in the tab order. As a result, switching to another tab with the keyboard usually just involves repeatedly pressing Ctrl+Tab until the content the user wants is in front of them. Sometimes the user may overshoot and then have to press Ctrl+Shift+Tab to backtrack.

Another downside of this approach, as Jenny Boriss has noted, is that the user has very little information about what’s actually contained in an unfocused tab; all they really know is the name of the page and its icon.

The new Ctrl+Tab interface in the Firefox trunk tries to solve some of these shortcomings.

Assume that you arrive at a computer that has Firefox open with three tabs loaded in it named “Wikipedia”, “Google”, and “About Ubiquity”, in that order. The Wikipedia tab is the one that’s currently selected, and you want to go to the next tab, which is Google. Holding down Ctrl and tapping the Tab key results in the following:

The first thing you’ll notice is that the current tab you’re on is still Wikipedia; the overlay in the center of the screen indicates that the tab you’ll go to if you release the Ctrl key is the “About Ubiquity” page, which is two tabs to the right of the Wikipedia tab, and that tapping Tab once more before releasing Ctrl will bring you to the “Google” page that you want to go to. You may be puzzled as to why the overlay in the center of the screen doesn’t reflect the same ordering as the tabs at the top of the browser.

Here, the user interface is following in the footsteps of a similar feature from windowed operating systems like Mac OS X and Windows to navigate between active applications. The order of the thumbnails is based on how recently you’ve visited them, which makes it easy to quickly switch between two places. In the above example, this means that whoever used this computer before you was on the “About Ubiquity” page before moving to the Wikipedia page.
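
To make the difference between the two orderings concrete, here is a small Python sketch of both. It is a model of the behavior described above, not Firefox's implementation; the function names and the assumption that the viewing history is a simple list are mine.

```python
def old_ctrl_tab_target(current, open_tabs):
    """Firefox 3.0 behavior: Ctrl+Tab selects the tab to the right, wrapping."""
    i = open_tabs.index(current)
    return open_tabs[(i + 1) % len(open_tabs)]


def mru_targets(visit_history, open_tabs):
    """New behavior: tabs in the order successive Tab presses select them.

    visit_history lists tab names from oldest visit to newest and must be
    non-empty; its last entry is the currently selected tab.
    """
    current = visit_history[-1]
    order = []
    # Walk the history backwards so the most recently viewed tab comes first.
    for tab in reversed(visit_history):
        if tab in open_tabs and tab != current and tab not in order:
            order.append(tab)
    # Tabs that were never viewed come last, in tab-bar order.
    order += [t for t in open_tabs if t != current and t not in order]
    return order


tabs = ["Wikipedia", "Google", "About Ubiquity"]
# Whoever used the computer before viewed Google, then About Ubiquity,
# then Wikipedia.
history = ["Google", "About Ubiquity", "Wikipedia"]
```

Here `old_ctrl_tab_target("Wikipedia", tabs)` is "Google", whereas `mru_targets(history, tabs)` puts "About Ubiquity" first and "Google" second, which matches the overlay in the example.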

This change is probably based on the premise that the last-visited tab is more frequently the user’s locus of attention than the tab to the left or right of the current one, which is probably true. But the same core problem remains: the last page that the user is on isn’t always their locus of attention. Indeed, unless someone is rapidly switching between two places, most people don’t even remember the last web page they were on; even less relevant is the second-to-last web page they were on, and the ordering of anything older than that looks like randomness. What this means is that for the particular use case of quickly switching between two tabs, this new mechanism is non-modal, and indeed quite efficient. But for all other cases, this interface is modal, because the same gesture is resulting in a different response based on a part of the browser’s state (i.e., tab-viewing history) that not only isn’t the user’s current locus of attention, but also isn’t even knowable to the user until they press Ctrl+Tab—and even then, the user can only see three tabs at a time in the overlay, even if more are open in the browser. At least with the previous mechanism, one could look at the tab bar to infer what the results of pressing Ctrl+Tab would be.

So, in the case where the user isn’t going back to the tab that they were just at very recently, they’re stuck in the same kind of situation that they were in with the old Ctrl+Tab mechanism: just keep pressing Ctrl+Tab until the page you want shows up. But this brings up another problem with the new interface: the end-user is only presented with thumbnails. In the old Ctrl+Tab mechanism, at least I was presented with each full-sized page every time I pressed Ctrl+Tab so that I could see if it was the one I wanted; here, I’m relegated to looking at a tiny thumbnail of the page, which only serves to make the task that much more difficult.

All of this is to say that I think the new Ctrl+Tab interface is really more of a trade-off than a decisive improvement: in the case where I want to go back to the tab I was just at, it’s great—although the thumbnails are completely unnecessary—but in most other cases, it’s actually harder to use than the old interface.

I haven’t thought much about possible solutions; one band-aid is to create a hybrid of both mechanisms, for instance by bringing the currently-selected tab to the foreground in the new Ctrl+Tab interface but still providing thumbnails for the next and previous tabs in an overlay. If anyone else has ideas, I’m sure the Mozilla community would love to hear them.

Parchment on the iPhone

I recently spent time making Parchment work properly on my new iPhone 3G.

The iPhone has been my first foray into the world of the mobile web, and getting Parchment to work well on it was an interesting experience. Some of the challenges I faced involved getting the iPhone’s on-screen keyboard to display properly—Parchment doesn’t actually have any text input fields on it, so by default the iPhone didn’t think that users had to enter text—and modifying some processor-intensive JavaScript code so that the iPhone didn’t think that Parchment had gone into an infinite loop.

Another interesting challenge was figuring out how best to display content so that it would be viewable in a way that didn’t require the end-user to perform unnecessary panning and zooming. By default, the iPhone assumes that pages it views were made to look good on a page that’s about 980 pixels wide, so that’s how big the “virtual viewport” is; this default seems to make sense, since most pages were made with desktop screens in mind, and the ability to zoom and pan makes it relatively usable. However, when a developer is creating a rendering of a site specifically for smaller screens, they can force a particular viewport size—thus minimizing or entirely obviating the need to pan and zoom—by inserting a viewport meta tag in the HTML and using some iPhone-specific CSS properties like -webkit-text-size-adjust. All of these sorts of issues were documented nicely in Apple’s Safari Web Content Guide for iPhone. Though they’re not “standards” per se (at least as far as I know), nonetheless I think that tailoring pages for an incredibly small, zoomable screen could be viewed as little different from tailoring them for print—something CSS already supports—so I hope that these extensions can eventually be turned into standards and supported by other mobile browsers like Fennec.

Right now my biggest criticism of the iPhone, despite all its enormous benefits, is the fact that its native platform is quite sterile, or at best contingently generative. I was hoping that the open web would be one way to get around this, and getting Parchment to work on the device was my way of testing these waters.

Parchment’s ultimate goal, as outlined on its Google Code project page, is to serve as a compelling replacement for desktop-based Z-machine interpreters. Among other things, this implies that it should work fine when disconnected from the internet, and it does: once you’ve started a game, it’s entirely loaded into the browser and no further network connections are made. In Firefox 2 or above, selecting “Work Offline” from the “File” menu will allow you to still access the game and play it even if you don’t currently have it loaded in a tab (getting Firefox to detect automatically whether you’re offline is being worked on). You can even save and load games while offline because the game state is appended to the URL hash (and put in DOM Storage too, if support is detected), and Damien Neil has a patch that uses PersistJS to provide even better support.
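
The idea of keeping a saved game in the URL hash can be sketched in Python as follows. Parchment itself is JavaScript and I'm not describing its actual encoding here; the URL-safe base64 encoding, the function names, and the example.com address are assumptions for illustration only.

```python
import base64
from urllib.parse import urlsplit, urlunsplit


def save_to_url(page_url, game_state):
    """Return page_url with the game state (bytes) stored in its fragment."""
    fragment = base64.urlsafe_b64encode(game_state).decode("ascii")
    return urlunsplit(urlsplit(page_url)._replace(fragment=fragment))


def load_from_url(saved_url):
    """Recover the game state bytes from the fragment of a saved URL."""
    return base64.urlsafe_b64decode(urlsplit(saved_url).fragment)


# Hypothetical page address and a few bytes of made-up game state.
saved = save_to_url("http://example.com/parchment.html", b"\x00\x01\x02\xff")
restored = load_from_url(saved)
```

Because the fragment is never sent to the server, a state stored this way can be saved and restored without any network connection.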

The thing is, Safari on the iPhone doesn’t seem to have any kind of “offline mode” like Firefox does. Further, it only keeps one web page loaded at a time, even though it has something resembling tabbed browsing; so if you switch from Parchment to another website and back to Parchment, the entire page is reloaded—something that not only consumes network bandwidth but also processor time. This hugely limits Parchment from becoming a compelling replacement for a native iPhone app that does the same thing. And thanks to Apple’s gate-keeping, there’s serious doubt as to whether the latter is even a possibility.

I also have no idea if Apple is even likely to introduce an offline mode for the iPhone, because there appears to be a clear conflict of interest between their closed platform and the open web: it makes sense for Apple to “gimp” the web as much as it can so that application developers are forced to develop for the platform that Apple controls. So I’m really looking forward to both Fennec and Android supporting generativity in the mobile space.

In any case, if you have an iPhone, feel free to play a story on Parchment and let me know what you think.

Towards Inter-Community Trust

In my recent post on Trusting Functionality I alluded to a socially-based framework for trust that would allow software to be generative and safe at the same time.

When trying to figure out a solution to this problem, I realized that there are already communities on the internet that have built-in social mechanisms for trust. Python, for example, is a language notorious for its lack of protection against untrusted code. Yet we don’t see much concern that a Python script may contain malicious code, even though it has the ability to do whatever it wants to our computer. Why is this?

One obvious answer has to do with the users of Python code: because there are relatively few of them and they tend to be quite technically skilled, there’s a high risk involved in creating malicious Python code, and little economic gain to be made from it.

But another answer, I think, has something to do with community. An open-source community is a lot like a corporation with a few key differences: its processes are largely transparent, and the software itself is almost guaranteed to be developed by its users. The former allows for peer review and accountability, while the latter helps ensure that the software’s features are always in the best interests of the users. These are pre-existing social mechanisms that help create trust between an open-source community and its stakeholders.

So ultimately, a project’s community can be at least as reliable as a corporate entity: the economic incentive structure of the latter is replaced by a social incentive structure in the former. I trust that the Python SVN repository won’t contain malicious code because I trust the community to only grant commit privileges to those it trusts, and I trust the community’s server administrators to ensure that the python.org domain won’t be hacked.

Another advantage of the transparency of open communities is the fact that—since well before the advent of Facebook and MySpace—they have meaningful social networks embedded in them. One merely has to take a look at the Subversion commit logs for the Python code repository to see who its most highly-respected members are, for instance, and public newsgroups and forums can be mined to discover other relationships. What this means is that it’s possible for us to infer—or be told explicitly—what individuals a community trusts.

So, individuals trust certain open communities and vice versa. This information can be leveraged to create a relatively low-cost web of trust which can be used in a variety of ways.

For instance, let’s assume for a moment that Mozilla’s source control system at hg.mozilla.org supported OpenID and had a simple web API that allowed a remote service to query whether or not a particular user has commit privileges. This would allow Ubiquity’s source control system—which is hosted here on Toolness—to instantly inherit the permission system from Mozilla: anyone trusted enough to have commit privileges to Mozilla’s code repository would instantly be able to commit to Ubiquity. (This is, by the way, the reason that Ubiquity isn’t currently hosted on hg.mozilla.org: it forces us to think of ways to decentralize the Mozilla community.)
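
Here is a minimal Python sketch of what inheriting permissions might look like. No such web API exists as far as I know, so the remote query is replaced by an in-memory stand-in; every name and OpenID below is hypothetical.

```python
def make_membership_check(committers):
    """Stand-in for a remote 'does this OpenID have commit privileges?' query."""
    known = frozenset(committers)
    return lambda openid: openid in known


def can_commit(openid, local_committers, trusted_communities):
    """Allow a commit if we trust the user directly, or if any community
    whose judgment we have chosen to inherit trusts them."""
    if openid in local_committers:
        return True
    return any(check(openid) for check in trusted_communities)


# Hypothetical data: one local committer, one community we trust.
mozilla_check = make_membership_check(["https://alice.example.org/"])
ubiquity_committers = {"https://bob.example.org/"}
```

With this setup, `can_commit("https://alice.example.org/", ubiquity_committers, [mozilla_check])` is true even though Alice was never added to the local list, which is the whole point: the trust decision was made once, by the other community.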

There’s at least one existing web of trust that we can draw from, too: the open-source social networking site Ohloh uses a ranking system called Kudos, which could be used as a rough measure of trust, to make inferences about whether an arbitrary piece of code from a known programmer can be trusted.

Such webs of trust would be useful for things other than code, too; they could be used as an alternative to spam-filtering to determine whether content can be trusted. Imagine a workflow in which blog software draws from publicly-available social history to see if a comment posted by an OpenID-logged-in user is spam. This means, for instance, that my Wikipedia history and my Yelp standing could be used to infer that a comment I leave on a blog isn’t spam, obviating the need for frustrating and error-prone captchas.

Inter-community trust doesn’t have to be the only form of trust we use to make our decisions, either; it can easily be used in conjunction with the hierarchical trust system that the web currently uses, for instance; or individual webs of trust can be leveraged from existing social network sites like LinkedIn, as long as the final solution doesn’t inconvenience the end-user.

This is just one potential social solution to the problem of participating in a digital ecosystem with bad actors who have economic incentives to hurt others. If you have any ideas for other solutions to the trust problem, or know of any existing ones, I’d love to hear them.

Mercurial Woes

Over the past few days my friends Ben Collins-Sussman and Jim Blandy and I have been having an interesting conversation about the use of Mercurial for development collaboration. Eventually one of my email responses got so long-winded that I figured it’d be best to make the conversation public.

So, here’s my take on Mercurial, and some reasons why an HG birds-of-a-feather session at the Mozilla Summit coming up next week would be very useful for me.

To begin with, from a purely social standpoint, the concept of distributed revision control is amazing to me because of how it removes many technological barriers to collaboration, providing software projects with an enormous amount of freedom on how their development process is structured. For any readers who aren’t familiar with it, check out Chapter 1 of the HG Book.

But with all this additional freedom comes additional responsibilities. To quote the MDC Mercurial basics: this gun is loaded.

I’m used to working on small-to-medium sized projects with relatively small teams. Subversion was great for this, because when we were working on things simultaneously, we rarely ran into situations where we were editing the same file, much less the same part of the same file. There were rarely reasons to need to create SVN branches—though we all knew how to do it, and did it when necessary. But as a result, merges were very rare, and when we had to merge, we were extremely careful and diligent about it.

I’m still working on relatively small-to-medium-sized projects (e.g. Weave and Ubiquity) and the forced merging that HG makes us do almost every time we push is a world of pain, relatively speaking. With SVN, I’d just svn commit and see if SVN rejected my commit because someone else had committed a change to the same file while I was editing it. This happened rarely, and when it did, we were careful about ensuring that our changes gelled. In this sense, SVN was really humane; 90% of the time things “just worked”, and when things didn’t just work, it was for very good reasons.

But HG almost never “just works”. If I edit a.py and my friend edits b.py and pushes it before I’ve pushed my changes, I have to make a merge commit and manually ensure that nothing bad happened. The end result of this is a huge burden on each programmer compared to SVN, as they have to do a separate merge commit for nearly every push they make, which essentially encourages people to either (A) not push often or (B) ignore their merge commits (a practice which is encouraged by the use of hg fetch). The disadvantages of the first approach are nicely explained by Ben’s post on Programmer Insecurity; the latter approach is bad for obvious reasons.
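
The difference in when each tool makes me stop and reconcile can be captured in a toy Python model. This is a deliberate simplification of both tools (it ignores things like forced pushes and rebasing), and the function names are mine, not commands from either tool.

```python
def svn_commit_is_rejected(my_files, files_changed_upstream):
    """SVN refuses a commit only when a file I'm committing is out of date,
    i.e. someone else changed that same file since my last update."""
    return bool(set(my_files) & set(files_changed_upstream))


def hg_push_needs_merge(files_changed_upstream):
    """A plain HG push is refused whenever it would create a new head on
    the remote, i.e. whenever anyone pushed anything at all since my last
    pull, regardless of which files they touched."""
    return bool(files_changed_upstream)


# I edit a.py; my friend edits b.py and pushes first.
mine = ["a.py"]
theirs = ["b.py"]
```

In this scenario `svn_commit_is_rejected(mine, theirs)` is false, so the commit just goes through, while `hg_push_needs_merge(theirs)` is true, so I have to pull and merge before I can push.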

This is basically the axis around which all my woes with HG revolve. With SVN, it’s really easy to see how code has changed, but because of the constant merging of tiny branches in HG, the whole code history becomes obfuscated and it’s hard to tell what’s happened to it. In fact, several weeks ago a friend somehow mis-merged his commits to Weave, which undid a major refactoring I did, and the really scary thing is that it was somehow impossible for me to tell this had happened from looking at the diff logs alone. I looked at them for a good half-hour or so and was still scratching my head. Needless to say, my inability to understand what had happened to the code by looking at the logs drastically reduced my faith in the tool.

While everyone I know understands the basics of HG and the philosophy behind distributed VCS, it’s the particulars of actually “working in the wild” that many are finding very confusing. So an HG BoF at the Mozilla Summit would be extremely useful.

Trusting Functionality

One of the major challenges we face with the design of our new linguistic command-line project is that of trust. As Zittrain mentions in The Future of the Internet, this is really the fundamental problem of generative systems, and also their most valuable asset: the ability for a user to run arbitrary code is simultaneously what gives the personal computer its revolutionary power and its greatest vulnerability.

At present, because our project is still in the prototyping stage, we’re opting for freedom of expressiveness and experimentation over security. That means that all the various verbs we write, while written in JavaScript, are always executed with Chrome privileges, meaning that they’re capable of doing whatever they want to the end-user’s computer.

So the particular dilemma that needs to be solved here is: how can an end-user trust that a verb won’t do anything harmful to their data or privacy—be it intentional or accidental—while still providing a low barrier of entry for aspiring authors to write and distribute their own verbs?

We’ve considered some technical options so far. One is the idea of “security manifests” that come with verbs, specifying what a verb is capable of doing. For example, the “email” verb mentioned in my last post could specify in a manifest that it needed access to the user’s email service and their current selection. This information could then be presented to a user when they choose to install a verb. At an implementation level, the code could run in a specially-constructed sandbox to ensure that the verb code never steps outside the bounds prescribed by its manifest. Alternatively, or in addition to this, an object-capabilities subset of JavaScript like Caja can be used. Such mechanisms ensure that untrusted code can only go as far as the end-user lets it—which, unfortunately, also puts a burden on said end-user. While I don’t personally mind having such a burden myself, I know I wouldn’t want to put it on my friends and family.
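
To give a feel for the manifest idea, here is a minimal Python sketch of a sandbox that only grants what a verb declared up front. Our verbs are JavaScript and nothing like this exists in the prototype yet; the class names and capability strings are hypothetical.

```python
class ManifestViolation(Exception):
    """Raised when a verb asks for something its manifest didn't declare."""


class VerbSandbox:
    def __init__(self, manifest):
        # The manifest is the set of capabilities the user saw and
        # approved when installing the verb.
        self.allowed = frozenset(manifest)

    def request(self, capability):
        """Grant a capability only if the verb's manifest lists it."""
        if capability not in self.allowed:
            raise ManifestViolation(
                "verb did not declare %r in its manifest" % capability
            )
        return True


# Hypothetical manifest for the "email" verb.
email_sandbox = VerbSandbox(["email-service", "current-selection"])
```

Here `email_sandbox.request("current-selection")` succeeds, while `email_sandbox.request("filesystem")` raises `ManifestViolation`, so the verb can never step outside the bounds the user agreed to.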

Digital certificates are another component of potential solutions, but they too have their own problems. While they’re easy for centralized corporations to deal with, they’re problematic for more distributed operations, and the monetary cost involved in obtaining one significantly increases the barrier to entry for individual software authors. And even signed code doesn’t prevent the more privacy-invasive—but not outright malicious—classes of software like spyware.

As I’ve indicated earlier, this general issue of trust isn’t a new problem, or even just an important issue for an experimental Firefox addon. It’s what Zittrain believes is at the core of the future of the internet and the PC, and the solutions we create (or don’t create) will determine whether their future is one that is sterile or generative. Windows Vista, being the most frequently exploited operating system as a natural result of its widespread use, is the harbinger of a future that relies entirely on technology and corporate trust hierarchies without taking any kind of social mechanisms into account; the result is a notoriously hard-to-use user interface that places an enormous burden on the end-user to constantly make informed security decisions about their computing experience. I don’t think that the answer is merely a “less extreme” version of Vista, which is the model that most other operating systems and extensible applications seem to be following, but rather that a more effective solution is primarily a social one that is supported by technological tools.

I have a particular solution in mind that I’ll be writing about soon. That said, I’d still love to hear any thoughts that anyone has on this topic.

My First Ambulate-For-a-Cause

Yesterday I participated in the San Francisco AIDS Walk with two Mozilla interns.

I’ve always been a bit puzzled by the concept of walks/runs-for-a-cause because at a surface level, the energy an individual spends running or walking doesn’t directly contribute to the actual cause they’re ambulating for. Ultimately, it seems like it’s a transaction for one’s time and energy in exchange for a cause’s publicity: rather than simply donating a few dollars to a cause, ambulating for the cause is indicative of the sacrifice of one’s time in the name of a cause (which can be more valuable than money, depending on the individual). On the micro level, this can result in additional revenue for the cause as friends and family of the individual pledge money in recognition of that sacrifice. On the macro level, the collective behavior of so many people doing this at once creates significant publicity for the cause, which raises awareness for it and consequently leads to more revenue.

It’s no surprise that corporations have an incentive to ride this wave by “donating” money and human resources in exchange for publicity and associating their brand with the cause in the minds of consumers. For the vast majority of companies that I don’t care about, I tend to perceive this as the coldly calculated move I just described it as: The Gap, Blockbuster, McDonald’s, Williams-Sonoma, Wachovia, and a number of other corporations helped sponsor the event, their employees wearing t-shirts saying things like “[Company Name] Cares!”. Such things elicited a gag response from me. I don’t think that this actually damaged my perception of the companies in question—their behavior is entirely rational and has good consequences—but it doesn’t particularly improve my perception of the companies, either. I’d like to say that this is because I only want to associate qualities like compassion and benevolence with actual human beings that I know personally and trust.

Things aren’t that simple, of course. Pixar was there, and I couldn’t help but cheer them on, as they’re one of the few corporations that I have a particularly positive impression of, and consequently mentally anthropomorphize into an awesome person rather than the usual faceless machination. My two companions were wearing Firefox swag and got a few cheers as well, which was nice.

Still, there was something odd about the whole event. Some American Idol people sang a song at the opening ceremony and some famous people emcee’d it as though they were hosting the Oscars, which added to the uneasy feeling that this was a publicity stunt rather than an authentic experience. In the end, I suppose it was a bit of both, but what made it really worthwhile was exploring Golden Gate Park and hanging out with my fellow Mozillians.

Ubiquitous Interfaces, Ubiquitous Functionality

Lately some of us at Mozilla Labs have been experimenting with graphical keyboard user interfaces in Firefox. Our current work-in-progress is something that we’re calling Ubiquity for the time being, though the name is by no means set in stone.

This project is heavily informed by Enso, a software product developed by me and my colleagues at Humanized from 2005-07. Aside from the benefits outlined in Alex Faaborg’s blog post entitled The Graphical Keyboard User Interface, this experiment is intended to solve a few other problems, one of which I’ll address in this post.

Web applications, much the same as desktop applications, are a bit like isolated cities: it’s difficult for an end-user to arbitrarily share data and functionality between them. This is alleviated to some extent by creations like Firefox Add-ons that add toolbars or sidebars to Firefox’s UI, Bookmarklets, and Greasemonkey, but while all of these solutions are powerful, each comes with its own set of problems. The buttons and bars of many Firefox add-ons don’t scale well because of the valuable screen real-estate they consume; Bookmarklets are restricted in scope because they only have the access privileges of the website they’re running on; and Greasemonkey doesn’t prescribe any kind of interaction model, which makes it difficult to reuse the functionality of a script in a context other than the ones it was expressly designed for.

Our new project attempts to alleviate all of these problems by allowing end-users to apply textual commands, or verbs, to whatever they’re looking at. For instance, let’s assume that I’ve found a typo on a friend’s blog, and I want to let him know about it. I can first select the typo on the page by dragging my mouse over it:

Then, I can activate what we’re calling the “command entry mode”. At the moment, this is done via a hotkey that displays a popup window, though this particular interface is only temporary (more on that later):

Now I type “highlight”, because I want to emphasize the location of the typo for my friend:

I then press enter, which highlights the text, like so:

Now I’d like to email part of the sentence containing the highlighted typo to my friend, so that he has some context for where the typo is located. I do this by selecting part of the sentence and issuing the “email” command. This causes Firefox to switch to my Gmail tab and compose an email for me with some pre-filled content:

I can now modify the email as necessary to tell my friend about his typo, and then send it.

Our current code is capable of performing this workflow, though Gmail is the only supported mail application at the moment. The hotkey interface is also something of a placeholder, because it was the most expedient to implement while making the project dogfoodable. Ultimately, it would be ideal for verbs to be exposed to end-users through interfaces that already exist in Firefox, such as the AwesomeBar or contextual menus.
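
To make the idea of verbs a little more concrete, here is a minimal sketch in Python of how typed commands might be looked up and applied to a selection. This is not our actual code, and none of these names come from it: Context, verb, complete, and run are all hypothetical, and the verbs only return a description of what they would do rather than touching a real page or mail application.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Context:
    """What the user is currently looking at (hypothetical model)."""
    page_url: str
    selection: str


# A verb takes the current context and returns a description of its effect.
Verb = Callable[[Context], str]

_verbs: Dict[str, Verb] = {}


def verb(name: str) -> Callable[[Verb], Verb]:
    """Register a function under the name the user would type."""
    def register(func: Verb) -> Verb:
        _verbs[name] = func
        return func
    return register


@verb("highlight")
def highlight(ctx: Context) -> str:
    # Assumption: a real verb would modify the page; this one only reports.
    return f"highlighted '{ctx.selection}' on {ctx.page_url}"


@verb("email")
def email(ctx: Context) -> str:
    # Assumption: a real verb would open a mail composer with this content.
    return f"drafted email quoting '{ctx.selection}' from {ctx.page_url}"


def complete(typed: str) -> List[str]:
    """Return the verb names that start with what has been typed so far."""
    prefix = typed.lower()
    return sorted(name for name in _verbs if name.startswith(prefix))


def run(typed: str, ctx: Context) -> str:
    """Run the verb named by an exact name or an unambiguous prefix."""
    name = typed.lower()
    if name in _verbs:
        return _verbs[name](ctx)
    matches = complete(name)
    if len(matches) != 1:
        raise ValueError(f"'{typed}' matches {len(matches)} verbs: {matches}")
    return _verbs[matches[0]](ctx)


if __name__ == "__main__":
    ctx = Context(page_url="http://example.com/blog", selection="teh")
    print(run("high", ctx))
    print(run("email", ctx))
```

The point of the sketch is that a verb only needs to know about the current selection, not about which interface invoked it, so the same registry could in principle sit behind a hotkey popup, the AwesomeBar, or a contextual menu.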

Exactly what the user needs to type in order to invoke a command is also under contention, and Jono has been doing some really interesting work on this front. There are a number of additional aspects of the experiment that I’d like to write about, so keep reading Toolness for more updates.

We’re currently sprinting to have something more presentable by the Firefox Summit—hopefully a 0.1 release, if all goes well. In the meantime, if you’re interested, feel free to join the Google Group, and even check out the source code.

The Future of the Internet, How to Stop It, and Me

A few weeks ago, Oxford University professor Jonathan Zittrain recommended Firefox 3 on The Colbert Report. While getting the bump for the browser from this great show was certainly a win for Mozilla, more provocative was the new book Zittrain was discussing with Colbert, entitled The Future of the Internet and How to Stop It.

Contrary to the Luddism I’d assumed the book’s title implied—e.g., that the Internet is a terrible thing and needs to be dismantled—Zittrain instead celebrates how open and positive the Internet and personal computing have been for us, but warns that this liberating period is in danger of coming to a close.

I can certainly agree with the first part: as far as technology goes, it seems to me like we’re living in something of a golden age of computing. Participatory, user-generated content is ubiquitous. An organic, community-authored browser has been declared the best way to experience the web, and a vast number of people rely on a surprisingly balanced community-maintained encyclopedia as a major source of information about their world. Individuals who want to learn about programming and create solutions to help or entertain others can do so on an open web, using free tools that they can take apart, modify, and remix as they please, without needing much more than an Internet connection and a computer. Extremely low-cost computers and network infrastructures are being created for those with lower incomes, thereby helping create a more level playing field for children.

This is an improved version of the world that I grew up in, where my first computer, an Atari 400, was like a blank slate waiting to be experimented with. But what drew me most to this particular kind of tinkering—as opposed to, say, electronics, a field which I still know next to nothing about—is that both experimentation and the sharing of one’s creations had negligible costs associated with them. After one had a computer and perhaps some instructional materials, inventing things with it involved spending no money on supplies, since the computer came with its own version of BASIC. There was no “gatekeeper” preventing you from distributing your software to as many people as you wanted, either by printing out your source code, or putting your program on a disk, or uploading it to a BBS.

Growing up in a freedom-infused computing environment like this, it was no surprise that I was a bit puzzled when I ran into things that had computers inside but didn’t let me tinker with them and share my solutions with others. Cell phones are the most obvious example of this; when I bought my first in 2001, while I understood some of the motivations behind my phone carrier and manufacturer “locking me out” from being able to tinker with my phone, I was nonetheless disappointed.

It’s the distinction between these two philosophies of information devices that Zittrain is concerned with. The personal computer and the Internet are generative tools that invite their users to discover new uses for them, providing the freedom to create and share at the cost of security and reliability: costs like tremendous quantities of unsolicited spam, phishing websites, botnets, viruses, and a diverse software ecosystem that inevitably results in bugs and crashes. The cell phone that we know today, on the other hand, is sterile: if it even allows any kind of invention and sharing, it does so in a very restrictive way, but with understandable motives. For instance, not only are we socially conditioned to expect a phone to not crash while making a call, but it’s actually life-critical that this be true, e.g. when calling 911 during an emergency.

In short, Zittrain is concerned about these kinds of choices we make, between freedom and safety, in the world of information. He believes that while the history of the Internet has been one of enormous generativity, it’s eminently possible for its future to be one of sterility as we’re continually faced with new challenges that personal computers and the Internet were never built for.

I’ve only read Part I of his book so far, and I’ve found his analysis to be quite thought-provoking, as well as a little frightening. What impresses me most, however, is one of the more mundane aspects of his work: it’s published under a Creative Commons license, which means that, among other things, you can read it online and collaboratively annotate it with others, download a PDF and print it out (or send it to your sterile Kindle like I did), or do almost anything else you want with it. That Zittrain has actively chosen to allow his own work to be freely used for generative ends suggests to me that he practices what he preaches.