An augmented reality interface requires a holistic, high-level design approach.
On my desk there is an instance of a pen. Because it is a "pen", I know I can do certain things with it, as you probably do as well. If "pen" in general, or this specific pen, carried additional "augmented" contexts, I would have more options.
Seeing a pen brings to mind the pen and all of its most pressing related contexts, but I can also call those contexts up simply by thinking "pen".
If I wanted to communicate "pen" to my computer, I would probably type it as a search query in an internet browser. I would then expect my search to be categorized and localized, with corrections offered in case I entered a partial pattern. My browser might track my history with "pen", and when I inspect the categories, localizations, and possible corrections for my query, I may also encounter everyone's (all users') history.
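The kind of correction and categorization described above can be sketched minimally with Python's standard-library `difflib`; the vocabulary and category labels below are invented purely for illustration, not a real search API.

```python
import difflib

# Hypothetical local vocabulary with category labels -- illustrative only.
CATEGORIES = {
    "pen": "writing instruments",
    "pencil": "writing instruments",
    "printer": "office hardware",
}

def interpret_query(query: str) -> dict:
    """Offer spelling corrections and a category for a (possibly partial) query."""
    corrections = difflib.get_close_matches(query, CATEGORIES, n=3, cutoff=0.6)
    best = corrections[0] if corrections else query
    return {"corrections": corrections, "category": CATEGORIES.get(best)}

print(interpret_query("pne"))   # a mistyped "pen"
```

A real browser would of course fold in history, location, and aggregate user behavior; this sketch only shows the fuzzy-matching step.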
Maybe I have already handled "pen" before. If so, it should probably be stored in some fashion on my local hard drive, or at least on a nearby proxy server. One problem is that it is hard to find: if I take a picture of myself sitting at my desk (I sometimes use pens when I am at my desk), the computer will have no way of finding the pens unless I "tag" them. An "intelligent" computer would be able to tag objects with no human intervention, if allowed to do so.
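The tagging point can be made concrete with a toy local tag index: a photo is findable only through the tags attached to it, whether by a person or by an "intelligent" auto-tagger. All names here are illustrative.

```python
# A toy local tag index: photos are findable only through their tags.
tag_index: dict[str, set[str]] = {}

def tag_photo(photo_id: str, tags: list[str]) -> None:
    """Attach tags to a photo, manually or via a hypothetical auto-tagger."""
    for tag in tags:
        tag_index.setdefault(tag.lower(), set()).add(photo_id)

def find_photos(obj: str) -> set[str]:
    # Untagged objects are simply invisible to this search.
    return tag_index.get(obj.lower(), set())

tag_photo("desk_2024.jpg", ["desk", "pen", "me"])
print(find_photos("pen"))        # {'desk_2024.jpg'}
print(find_photos("stapler"))    # set() -- perhaps in the photo, but untagged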
If my computer were "intelligent", it could alert me when I leave my pen's cap off… it is known that pens left uncapped for X amount of time in certain environmental conditions dry out and become useless. I would not want to be alerted about other people's pens; "My Computer" should take care of me in a personal fashion.
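A personal, rule-based alert like this pen-cap reminder could look roughly as follows; the threshold and the ownership check are assumptions chosen for illustration, not measurements.

```python
from dataclasses import dataclass

# Illustrative threshold: how long a pen may stay uncapped before it risks drying out.
MAX_UNCAPPED_MINUTES = 30

@dataclass
class Pen:
    owner: str
    uncapped_minutes: int

def cap_alerts(pens: list[Pen], user: str) -> list[str]:
    """Alert only about the user's own pens -- personal care, not global nagging."""
    return [
        f"{user}: your pen has been uncapped for {p.uncapped_minutes} minutes"
        for p in pens
        if p.owner == user and p.uncapped_minutes > MAX_UNCAPPED_MINUTES
    ]

pens = [Pen("me", 45), Pen("me", 5), Pen("someone_else", 90)]
print(cap_alerts(pens, "me"))   # one alert, about my 45-minute pen only
```

The ownership filter is the interesting part: the same sensor data exists for everyone's pens, but "My Computer" only speaks up about mine.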
I think that, on a very basic level, if we are to build advanced augmented reality interfaces, our computers will have to perceive and understand the real world much the way we do. Then we, too, could communicate intuitively with our computers in the real world.