You Know What Really Grinds My Gears? Augmented Reality! September 3, 2009
Posted by Mok Oh in augmented reality. Tags: augmented reality
21 comments

AR really grinds my gears!
My last post was more about being rational, but this post is more about being raw and emotional. I wanna make two points that really grind my gears about the so-called “Augmented Reality” apps we’re seeing these days.
1. What really grinds my gears about “Augmented Reality”: It really fucking SUCKS!
There, I said it. By “Augmented Reality,” I mean those iPhone and Android apps I mentioned in the previous blog post, in their current incarnation. And if you really think they’re useful, I will respectfully argue you’re full of shit. Get a REAL use case, and try a comparison with Google Maps. If you still think those “Augmented Reality” apps are more useful, then I’d again respectfully argue that you are a delusional fuck.
Please don’t get me wrong: I’d *love* to see those guys succeed. I’m just saying there’s a lot of work to be done given their current state, and they really need to differentiate their functionality from all other local search and mapping applications beyond the video-overlay eye candy.
2. What really grinds my gears about “Augmented Reality”: HYPE will KILL the industry.
Just like “Artificial Intelligence” and “Virtual Reality,” or any other technical buzzwords that were hyped waaaaaaay beyond their technological capabilities, the current AR hype will kill the future of the AR industry.
Let’s take a look at what happened to Artificial Intelligence (AI). From the late ’50s on, the buzzword “AI” was coined and researched by amazingly brilliant minds. For one reason or another, too much hype followed too much funding, and eventually too many promises and too much vision could not be realized, even today. It’s cuz the vision was great, but the technology didn’t exist yet!
Nowadays, we see AI-inspired applications everywhere, e.g. Pandora’s music recommendations, Amazon’s “you might also like,” and even facial recognition algorithms; these are in one form or another inspired by AI. But the problem is no one uses the word “AI” anymore. In fact, some avoid it like the plague.
AI is still not realized today according to Isaac Asimov’s definition. But this does not mean AI-inspired technologies aren’t useful. In fact, they are. I would even venture to say that if the hype had been minimized and expectations set properly, perhaps there would be a larger overall stream of funding to advance these technologies well beyond what we have today. (Rule of thumb: If you hear, “but we should be able to do that in 10 years,” then, shit, you ain’t got no solution.)
Similarly, AR is not realized today as defined by William Gibson or Bruce Sterling. But we should be able to do this in 10 years, right? I wouldn’t bet on it. Gibson and Sterling are futurists: they can beautifully write scenarios and use cases that are really quite useful and believable for the future. And these use cases really should drive technology to change our lives for the better. BUT that doesn’t mean that these technologies CAN be realized.
I would argue that the forefathers of AR did and do have the right idea (pls read the last blog post). I still think we need to continue to expand/expound on vision algorithms (e.g. image tracking, image detection/recognition, etc.) and couple that with other sensors (e.g. Wi-Fi, RFID, Bluetooth, accelerometers, gyros, GPS, compasses, etc.) to more precisely tell people what they’re seeing in an interactive and augmented sense.
The level of precision provided by current apps is good from a mapping perspective (i.e. the 2D “aerial” view), but not good enough from a first-person ground perspective. (I will definitely write another post on the technical shortcomings.)
I think that AR was overhyped many years ago, and I don’t want to see any more overhyping today or in the future. Perhaps we need to reset people’s expectations somehow, or rebrand the words to something else. Because I really do think that there’s plenty of use for AR-inspired technologies as defined by the Layars and Wikitudes of the world.
Let’s not throw the baby out with the bath water.
Shit, I believe in AR. Just don’t fucking kill it… (Sorry about my fucks and shits. I told you it was going to be emotional..)
Is That *Really* Augmented Reality? August 23, 2009
Posted by Mok Oh in augmented reality. Tags: augmented reality, Layar, Wikitude
21 comments

Augmented Reality?
(This is a series in Augmented Reality and you can find Part 1 here.)
Historically, when someone used the words “Augmented Reality” (AR), it typically involved HUDs or HMDs, and a bunch of very expensive hardware that produced accurate results. Oh, and everyone owned one of these… NOT.
Recently, AR has been getting quite a bit of buzz thanks to mobile devices, such as iPhones and Android devices. “But why?” you might ask. Many use cases of AR need mobility: from a simple question like “What is that I’m seeing?” to “Where is X?”, these questions in most cases require the user to be on the road. Smartphones these days meet many of the needs of these use cases in the palm of your hand. And, getting to my point, applications such as Layar and Wikitude are being touted as “augmented reality” browsers.
But, is that reeeeeeally AR?
Many of the folks who’ve researched and invented AR might say this: “THAT IS NOT AUGMENTED REALITY!!!”
By definition, AR integrates or “blends” virtual computer graphics objects with the real world on your visual device, and it displays them in real time. But perhaps more importantly:
“Augmented reality does not simply mean the superimposition of a graphic object over a real world scene. This is technically an easy task. One difficulty in augmenting reality, as defined here, is the need to maintain accurate registration of the virtual objects with the real world image.”
(From Jim Vallino’s webpage.)
So, what exactly is the issue with Layar’s and Wikitude’s apps that keeps them from being AR? By using the smartphone’s camera, they can superimpose crude but virtual overlays in real time. Isn’t that AR?
The problem lies in the accurate registration of the virtual objects with the real-world image.
Without going into too much detail (and being nerdy), neither Layar nor Wikitude uses the visual information (the video stream) to accurately register and display virtual objects with the real-world image. In fact, the video streams are not analyzed at all to determine anything. They don’t track or use image recognition or vision algorithms to tell you what you are seeing. They are simply using the device’s GPS and compass to determine where you are and which way you’re facing. Once they know that, they find out what’s nearby and display a “dot” on top of the video stream to tell you the general direction, distance, and description of that place. In fact, if you turn the video stream off and leave the AR “layer” on, it would essentially still work. Functionally, there’s not much difference between this and Google Maps on your mobile phone. (Well, I think Google Maps is much more useful.)
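To make that concrete, here is a rough sketch of what a GPS-plus-compass overlay boils down to. It is an illustration only, not Layar’s or Wikitude’s actual code: the function names, the 60-degree field of view, and the 320-pixel screen width are all assumptions, and mapping angle to pixels linearly is a simplification. It uses the haversine formula for distance, the standard initial-bearing formula for direction, and then places the “dot” horizontally based on the difference between that bearing and the compass heading.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def distance_and_bearing(lat1, lon1, lat2, lon2):
    """Great-circle distance (m) and initial bearing (degrees clockwise from north)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    # Haversine formula for the distance between two lat/lon points.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
    # Initial bearing from point 1 toward point 2.
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    bearing = math.degrees(math.atan2(y, x)) % 360
    return distance, bearing


def screen_x(bearing_deg, heading_deg, fov_deg=60.0, screen_width_px=320):
    """Horizontal pixel for the POI dot, or None if the POI is outside the view.

    fov_deg and screen_width_px are assumed values, not real device specs.
    """
    # Signed angle between where the phone points and where the POI is, in [-180, 180).
    offset = (bearing_deg - heading_deg + 180) % 360 - 180
    if abs(offset) > fov_deg / 2:
        return None
    return round((offset / fov_deg + 0.5) * screen_width_px)
```

Notice that no camera frame appears anywhere in the sketch: the inputs are two latitude/longitude pairs and a compass heading, which is exactly why the overlay would keep working with the video turned off.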
So, when you are a block away from a POI (point of interest), they will generally give you a good-“ish” estimate. But otherwise, they are subject to GPS errors, magnetic interference on the compass, and a database that might be wrong or old.
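Some back-of-the-envelope arithmetic shows how those errors add up. The sketch below is a simplified geometric model with made-up inputs, not measurements from any real device: it assumes the GPS fix can be off by up to some radius in any direction, that the compass can be off by a fixed number of degrees, and that the screen covers a 60-degree field of view across 320 pixels.

```python
import math


def worst_case_bearing_error_deg(gps_error_m, distance_m, compass_error_deg):
    """Largest angle (degrees) by which the dot can point away from the real POI."""
    if gps_error_m >= distance_m:
        # The POI could be anywhere around you, including behind you.
        return 180.0
    # Worst case: your true position is displaced so that the line of sight
    # to the POI is tangent to the GPS error circle.
    position_term = math.degrees(math.asin(gps_error_m / distance_m))
    return min(180.0, position_term + compass_error_deg)


def error_in_pixels(error_deg, fov_deg=60.0, screen_width_px=320):
    """Convert an angular error to screen pixels (assumed linear mapping)."""
    return error_deg / fov_deg * screen_width_px
```

With an assumed 10 m GPS error and 5 degrees of compass error, a POI 100 m away can be drawn roughly 10.7 degrees off, or about 57 pixels on the assumed screen. At 20 m the same errors give 35 degrees, more than half the assumed field of view. The closer you get, the worse the dot lines up with the thing you are actually looking at.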
So, I ask you, is this really AR? I think not — not in the “classical” sense, at least.
Now, let’s flip the coin and see the other side.
The term AR has been around for a couple of decades (and the idea, even longer). But it really hasn’t made that much progress over the decades, in the sense that it really hasn’t influenced our lives. Do you experience AR technology in your day to day? Most of us don’t (unless you are a Top Gun flying multi-million dollar airplanes).
So, where did it go wrong?
Here’s my guess: It focused too much on accuracy of registering virtual objects to the real world.
The same freaking problem.
Let’s take a simple example. I want to place my can of virtual diet coke on any surface that I see (it could be curved, like on top of my car). So, I take my smartphone, point it at some surface, and click on the “place can” button. And my virtual coke can is placed accurately as I’ve indicated. Should be simple, right? Wrong.
This scenario is quite difficult technically. In fact, I’d say it’s not possible yet in general. For this to happen, your smartphone (or even a supercomputer) needs to first understand the geometry of the surface. Just from a single video stream, it’s hard to robustly compute this in general. You either need multiple cameras to determine the geometry via stereo (or move around the (static, diffuse, and simple) surface sufficiently), or laser scan it, or have some image understanding. None of these are robust enough for our everyday use yet, and they are potentially decades away (unless it’s a very specific use case: you are a Top Gun looking for enemy airplanes, flying multi-million dollar hardware with top-of-the-line everything). There are applications out there that use markers or fiducials to tell you the orientation and scale of a flat surface, but you can’t have these markers everywhere. (This topic really is another blog post. I will talk more about this, since it actually is quite interesting to see these apps popping up more and more.)
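For a feel of why camera-based geometry is fragile, here is the textbook depth-from-disparity relation for an idealized, rectified two-camera rig: depth = focal length × baseline / disparity. This is a sketch under stated assumptions (a perfect pinhole model, and hypothetical function names), not a description of any particular system.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in meters for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_shift_for_one_pixel(focal_px, baseline_m, disparity_px):
    """How far the depth estimate moves if the match is one pixel too small."""
    true_depth = depth_from_disparity(focal_px, baseline_m, disparity_px)
    slipped_depth = depth_from_disparity(focal_px, baseline_m, disparity_px - 1)
    return slipped_depth - true_depth
```

With an assumed 700-pixel focal length and a 10 cm baseline, a one-pixel matching mistake moves a surface that is 1 m away by about 1.4 cm, but moves a surface that is 10 m away by about 1.7 m. Add glossy or textureless surfaces, where matching pixels is unreliable to begin with, and placing that virtual can gets hard fast.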
At this rate, AR technology may never come to the consumer market if accuracy is the gating issue.
So, going back to my original question, I ask again: Are these mobile “AR” applications really AR? I would still say no, with a caveat that they have the right idea.
I think the right idea is not to place too much focus on accuracy but on use cases that influence our daily lives. Don’t worry too much about the brain surgery use case, where 1mm accuracy matters. When the constraints are relaxed, more solutions arise. Using other sensors combined with the visual sensor, such as GPS, compass, accelerometers, gyros, RFID, Wi-Fi, markers, etc., I think AR can actually start tackling the consumer markets (i.e. the long tail), and has the potential to come up with a killer app for this tech (which really is sorely lacking).
What if every POI had a unique sensor that broadcast what and where it was? E.g. every Starbucks had short-, medium-, or long-range sensors to tell you where they were. It could even be applicable to dynamic things, such as your car, luggage, pets, and every inventory out there. It doesn’t have to be exact. Just needs to work. (And these are different!)
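One cheap way to combine sensors is a complementary filter, sketched here for heading only. The technique is swapped in purely as an illustration of “good enough” fusion; the function names and the 0.98 blend weight are assumptions. The gyro is smooth but drifts over time, while the compass is noisy but doesn’t drift, so each update trusts the gyro for the short-term change and nudges the result toward the compass.

```python
def wrap_deg(angle):
    """Wrap an angle difference into [-180, 180) degrees."""
    return (angle + 180.0) % 360.0 - 180.0


def fuse_heading(prev_heading_deg, gyro_rate_dps, compass_deg, dt_s, alpha=0.98):
    """One complementary-filter update; returns the new heading in [0, 360).

    alpha close to 1 means "mostly trust the gyro"; 0.98 is an assumed value.
    """
    # Short term: integrate the gyro's turn rate over the time step.
    predicted = prev_heading_deg + gyro_rate_dps * dt_s
    # Long term: pull toward the compass, taking the shortest way around the circle.
    correction = wrap_deg(compass_deg - predicted)
    return (predicted + (1.0 - alpha) * correction) % 360.0
```

Called once per sensor sample, e.g. `heading = fuse_heading(heading, gyro_rate, compass_reading, 0.02)`, this won’t give brain-surgery accuracy, but it is the kind of relaxed-constraint solution that just needs to work.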
So, I say to researchers and inventors — SCREW ACCURACY (for now)! Focus on what will make a difference in our lives! And make something that works!
And to the Layars and Wikitudes of the world: keep going, and don’t forget to push innovation.
Much Ado About Augmented Reality? July 26, 2009
Posted by Mok Oh in 3D, Mobile, augmented reality. Tags: 3D, augmented reality, iPhone, Mobile, real time
9 comments
My question to you is, do you believe in Augmented Reality?
In many ways, the notion of augmented reality is similar to 3D in the sense that expectations have been set high, but it hasn’t quite delivered. We expected HUDs in cars that help us with directions and more; we expected augmented reality glasses that help us with what we are seeing; and we expected this to happen a decade ago. Hm…
Recently, there’s been a lot of talk about iPhone OS 3.1, which will enable real-time overlays for applications like augmented reality stuff. Definitely looking forward to what will happen there, and what type of innovations will be pushed from this community.
Personally, I think there’s a lot more innovation needed in technology and UI/UX for AR to succeed. Image recognition algorithms are not robust enough for general scenes (like in our everyday lives), but when supplemented with GPS, compass, accelerometers, and other sensors, they just might work and be useful.
What do you think?
Mobile Reality Demo June 30, 2009
Posted by Mok Oh in 3D, Mobile, Photography, panorama, post processing. Tags: 3D, augmented reality, Mobile, panorama, Photography
2 comments
I was privileged enough to be part of a panel on Mobile Reality at the Where 2.0 Conference this year, chaired by Brady Forrest. Here’s the short description of the panel:
“An emerging class of smartphones including location-based services and persistent data connections are lenses by which we can effectively view data layers atop physical space. What was once only available from tethered desktop computers is now possible from pocket-sized companion devices that travel with us. We are seeing examples of this in their earliest incarnations – social networking, gaming, reference and commerce.
Opposed to looking far into the future, this panel looks at examples of this technology in use and available today to consumers on a variety of smartphone platforms, including the Apple iPhone and Google Android. Panelists will provide short demonstrations of this technology, followed by a topic discussion and Q&A.”
The reason for sharing this is to show you EveryScape’s initiatives towards mobility. I believe EveryScape has one of the coolest and most useful visual platforms around (in my unbiased opinion), and you can see a glimpse of what’s being worked on in the video below (starting at the 2:30 mark).


