Context & Permutations


In the pursuit of Artificial General Intelligence, one of the challenges that comes up again and again is how to deal with context.  To illustrate: telling a robot to cross the street would seem simple enough.  But suppose that five minutes ago, somebody else told this robot not to cross the street because there was construction work happening on the other side.  What does the robot decide to do?  Whose instruction does it consider more important?

A robot whose ‘brain’ did not account for context properly would naively cross the street as soon as you told it to, ignoring whatever had come before.  This example is simple enough, but you can easily imagine situations in which the consequences would be catastrophic.

The difficulty in modeling context mathematically is that the state space can quickly explode: the number of ways things can occur, and the sequences they can occur in, is essentially infinite.  Reducing these effective infinities down to a manageable size is where the magic occurs.  The holy grail in this case is to have the cost of the main algorithm remain constant (or at least linear) even as the number of possible permutations of contextual state explodes.

How is this done?  Conceptually, one needs to represent things sparsely, and have the algorithm that traverses this representation only take into account a small subset of possibilities at a time.  In practice, this means representing the state space as transitions in a large graph, and only traversing small walks through the graph at any given time.  In this space-time tradeoff, space is favored heavily.
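
To make that concrete, here’s a minimal sketch in Python of the kind of structure I have in mind.  The names (ContextGraph, local_walks) and the toy transitions are my own illustrative inventions, not a reference implementation: memory grows only with the transitions actually observed, and each decision examines only short walks out of the current node, so per-decision cost depends on walk depth and local branching rather than on the total size of the state space.

    from collections import defaultdict

    class ContextGraph:
        """Sparse graph of contextual states; only observed transitions are stored."""

        def __init__(self):
            self.edges = defaultdict(dict)   # node -> {neighbor: weight}

        def add_transition(self, src, dst, weight=1.0):
            self.edges[src][dst] = self.edges[src].get(dst, 0.0) + weight

        def local_walks(self, start, depth=3):
            """Enumerate walks of at most `depth` steps out of `start`.

            Cost is bounded by (local branching factor) ** depth, independent of
            how many states exist in the graph as a whole.
            """
            frontier = [[start]]
            for _ in range(depth):
                next_frontier = []
                for walk in frontier:
                    for neighbor in self.edges[walk[-1]]:
                        next_frontier.append(walk + [neighbor])
                yield from next_frontier
                frontier = next_frontier

    # Toy version of the street-crossing example above.
    g = ContextGraph()
    g.add_transition("told: don't cross (construction)", "told: cross street")
    g.add_transition("told: cross street", "action: cross")
    for walk in g.local_walks("told: don't cross (construction)", depth=2):
        print(" -> ".join(walk))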

The ability to adeptly handle context is of utmost importance for current and future AIs, especially as they take on more responsibility in our world.  I hope that AI developers can form a common set of idioms for dealing with context in intelligent systems, so that they can be collaboratively improved upon.

We’ve had it all wrong.


All this time, we’ve had it all wrong.

Artificial Intelligence (AI) has been a science for over 50 years now, and in that time has accomplished some amazing things – computers that beat human players at chess and Jeopardy, find the best routes for delivery trucks, and optimize drug delivery, among many other feats.  Yet the elusive holy grail of “true AI”, or “sentient AI”, or “artificial general intelligence” – by whatever name, the big problem – has remained out of our grasp.

Look at what the words actually say though – artificial intelligence.  Are we sure that intelligence is really the crucial aspect to creating a sentient machine?

I claim that we’ve had it wrong.  Think about it: intelligence is a mere mechanical form, a set of axioms that yield observations and outcomes.  Hypothesis, action, adjustment – ad infinitum.  The theory has been if we could just create the recursively self-optimizing intelligence kernel, BOOM! – instant singularity.  And we’d have our AGI to run our robots, our homes, our shipping lanes, and everything imaginable.

The problem with this picture is that it assumes intelligence is the key underlying factor.  It is not.

I claim the key factor is…

…wait for it…

Consciousness.

Consciousness might be defined as how ‘aware’ an entity is of itself and its environment.  It might be measured by how well the entity can distinguish where it ends and its environment begins, by its sense of agency over the actions it has performed, and by a unified experience of its surroundings that gives it a constantly evolving sense of ‘now’.  This may overlap with intelligence, but it is a different goal: looking in the mirror and thinking “that’s me” is different from being able to beat humans at chess.  A robot understanding “I broke the vase” is different from an intelligence calculating the Voronoi diagram of the broken pottery pieces lying on the floor.

Giulio Tononi’s work rings a note in harmony with these ideas.  Best of all, he and others discuss practically useful metrics of consciousness.  Whether Integrated Information Theory is the root of all consciousness or not is immaterial; the point is that this is solid work in a distinctly new direction, and approaches the fundamental problems of AI in a completely new way.

Tononi’s work may be a viable (if perhaps only approximate) solution to the binding problem, and in that way could be immensely useful in designing systems that have a persisting sense of their evolving environment, leading us toward sentience.  It is believable that intelligence may be an emergent property of consciousness, but it seems unlikely that intelligence alone is the ingredient for consciousness itself, or that some critical ‘amount’ of intelligence will suddenly yield sentience.  One necessarily takes precedence over the other.

Given this, from now on I’ll be focusing my work on Artificial Consciousness, which differs from Artificial Intelligence chiefly in its goals and performance metrics: instead of asking how effectively an agent solved a problem, ask how aware it was of its position in the problem space; instead of asking how little error it can achieve, ask how little ambiguity it can achieve in understanding its own boundaries of existence (where the program ends and the OS begins, where the robot’s body ends and the environment begins).

I would urge you to read Tononi’s work and Adam Barrett’s work here.  My Information Theory Toolkit (https://github.com/MaxwellRebo/ittk) has several of the functions you’ll need to start experimenting on systems of your own with a few more lines of code (in particular, the Kullback-Leibler divergence).
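
I won’t reproduce the toolkit’s exact function signatures here; as a sketch of the kind of experiment I mean, here is a from-scratch Kullback-Leibler computation in Python with numpy, using made-up numbers for the distributions.

    import numpy as np

    def kl_divergence(p, q):
        """Kullback-Leibler divergence D(P || Q) in bits, for discrete distributions."""
        p = np.asarray(p, dtype=float)
        q = np.asarray(q, dtype=float)
        mask = p > 0
        return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

    # How far a system's observed state distribution sits from a uniform
    # "maximum ignorance" prior: roughly 0.64 bits for these toy numbers.
    p_observed = np.array([0.7, 0.1, 0.1, 0.1])
    q_uniform = np.full(4, 0.25)
    print(kl_divergence(p_observed, q_uniform))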

In the coming months, I’ll be adding ways to calculate the Information Integration of abstracted systems, or their Phi value.  Computing this exactly is NP-hard, so it will have to remain in the domain of small systems for now.  Nonetheless, I believe that if we start designing systems with the intent of maximizing their integration, it will yield system topologies with more beneficial properties than our usual ‘flat’ system designs.
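
To be clear about what’s coming, the sketch below is not Tononi’s full Phi.  It is a crude stand-in of my own for illustration (the minimum mutual information across any bipartition of a small system), and it mostly shows why the computation stays confined to small systems: the exhaustive search over bipartitions grows exponentially with the number of units.

    import numpy as np
    from itertools import combinations

    def mutual_information(joint_2d):
        """Mutual information (bits) between the two axes of a 2-D joint distribution."""
        px = joint_2d.sum(axis=1, keepdims=True)
        py = joint_2d.sum(axis=0, keepdims=True)
        mask = joint_2d > 0
        return float(np.sum(joint_2d[mask] * np.log2(joint_2d[mask] / (px @ py)[mask])))

    def min_bipartition_information(joint):
        """Crude integration proxy: the weakest informational link across any bipartition.

        `joint` has shape (2,) * n, the joint distribution of n binary units.
        The search is exhaustive over bipartitions, hence exponential in n.
        """
        n = joint.ndim
        units = list(range(n))
        best = float("inf")
        for k in range(1, n // 2 + 1):
            for part_a in combinations(units, k):
                part_b = [u for u in units if u not in part_a]
                # Flatten to a 2-D joint over (part_a, part_b) and measure the link.
                reordered = np.transpose(joint, list(part_a) + part_b)
                two_d = reordered.reshape(2 ** len(part_a), 2 ** len(part_b))
                best = min(best, mutual_information(two_d))
        return best

    # Three binary units with a random joint distribution, as a toy example.
    rng = np.random.default_rng(0)
    joint = rng.dirichlet(np.ones(8)).reshape(2, 2, 2)
    print(min_bipartition_information(joint))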

Artificial Intelligence will no doubt continue to give us great advances in many areas, but I for one am embarking on a quest for something subtly but powerfully different: Artificial Consciousness.

Note: If you have some programming skill and would like to contribute to the Information Theory Toolkit, please fork the repository and send me an email so we can discuss possibilities.  I’ll continue to work on this as I can.

The Best & Worst of Tech in 2013


Keeping with tradition, I’ll review some of the trends I noticed this past year, and remark on what they might mean for those of us working in technology.

THE BEST

JavaScript becomes a real language!

To some this might seem trivial, but there’s a lot to be said here.  With the massive growth of Node.js and many associated libraries, Google’s V8 engine has been stirring up the web world.  Write a Node.js program, and I guarantee you’ll never think of a web server the same way again.

Why is this good?  This isn’t an advertisement for Node.js, but I would posit that these developments are good because they open up entire new worlds of productivity – rapid prototyping, readable code, and entirely new ways of thinking about web servers.  Some folks are even running JavaScript on microcontrollers now, a la Arduino.  JavaScript has been unleashed from the confines of the browser, and is maturing into a powerful tool for creating production-quality systems with high scalability and developer productivity.  Exciting!

Cognitive Computing Begins to Take Form

Earlier this year I stated a belief that 2013 would be the year of cognitive systems.  That hasn’t been fulfilled completely, but we’ve nonetheless seen some intriguing developments in that direction.  IBM continues to chug away at their cognitive platforms, and Watson is now working full time as an AI M.D. of sorts.  Siri has notably improved from earlier versions.  Vicarious used their algorithms to crack CAPTCHAs.  Two rats communicated techepathically (I just made that word up) with each other across huge distances, and people have been controlling robots with their minds.  It’s been an amazing year.

The cognitive computing/cybernetics duo is going to change, well, everything.  I would argue that cybernetics may just top the list of most transformative technologies, but it has a ways to go before we go full Borg.

Wearables Start to Become a Thing

Ah, wearables.  We’ve waited for nifty sci-fi watches for so long – and lo!  They have come.  Sort of.  They’re on their way, and we’re starting to catch glimpses of what this will actually mean for technology.  I agree with Sergey Brin here: it’ll get the technology out of our hands and integrated into our environment.  Personally I envision tech becoming completely seamless and unnoticeable, nature-friendly and powerful, much like our own biological systems, but that’s another article entirely.

Wearable technology will combine with the “Internet of Things” in ways we can’t yet imagine, and will make life a little easier for some and much, much better for others.

Internet of Things

The long-awaited Internet of Things is finally starting to coalesce into something real.  Apple is filing patents left and right for connected home gear, General Electric is making their way into the space with new research, and plenty of startups are sprouting to address the challenges in the space (and presumably be acquired by one of the big players).

This development is so huge it’s almost difficult to say what it will bring.  One thing is for sure: the possibilities are limited only by one’s imagination.

21st Century Medicine is Shaping up to be AWESOME

Aside from the fact that we now have an artificial intelligence assisting in medical diagnosis, there have been myriad amazing developments in medicine.  From numerous prospects for cures to cancer, HIV, and many other diseases, to the advances in regenerative medicine and bionanotechnology, we’re on the fast track to a future wherein medical issues can be resolved quickly and with relatively little pain.  There’s also a shift in perspective: solve the issue at its deepest root, instead of treating symptoms with drugs.

THE WORST

Every Strategy is a Sell Strategy

This year, tech giants went acquisition-mad.  It seems like every day one of them has blown another few billion dollars on some startup somewhere.

Why is this bad?  It may be good for the little guy (startup) in the short term – they walk away with loads of cash – but in the long term I suspect it will have a curious effect.  It’s almost like business one-night-stand-ism.  You build a company knowing full well that you’re just going to sell it to Google or Facebook.  If not, you fold.

You can see where this goes.  People are often saying they look forward to ‘the next Google’, or ‘the next Facebook’, or whatever.  Well there might not be any.  That is, all the big fish are eating the little fish before they have the chance to become big fish.  Result?  Insanely huge fish.

It’s great that a couple of smart kids can run off, Macbook Pros in hand, and [potentially] make a few billion bucks in a few years, with or without revenue.  But who is going to outlast the barrage of acquisition offers and become the next generation of companies?

Big Data is Still not Clearly Defined

Big Data.  Big data.  BIG.  DATA.

What does it mean?

The buzzword and its ilk have been floating around for a couple of years now, and still nobody can really define what it means in practice.  Most seem to agree it goes something like: stand up a Hadoop cluster, mine a bunch of stale SQL records in some massive company or organization, cast the MapReduce spell and – Hadoopra cadabra!  Sparkling magical insights of pure profit glory appear, fundamentally altering life and the universe forever – and sending you home with bigger paychecks.

I’m all for data analysis.  In fact I believe that a society that makes decisions based on hard evidence and good data-crunching is a smart society indeed.  But the ‘Big Data’ hype has yet to form into anything definitive, and remains a source of noise.  (Big data fanboys, go ahead and flame in the comments.)

America’s Innovation Edge Dulls

It’s true.  I hate to admit it, but it is, undeniably, absolutely true.  America has dropped the ball when it comes to innovation.  That’s not to say we’re not innovating cool things, generating economic value and all of that – we are.  But that shine has started to tarnish.  Specifically, America has a problem with denying talented people the right to be here and work.

It could be our hyper-paranoid foreign policy in the wake of 9/11, it could be the flawed immigration system, it could be Washington gridlock or a million other things.  It’s not particularly fruitful to assign blame now.  We’re turning away the best and the brightest from around the world, and simultaneously continuing to outsource some of what used to be our core competencies.  The bright spot in all of this is that high-tech manufacturing seems to be making a comeback, perhaps in part thanks to 3D printing, but it’s not quite enough.  We need more engineers, more inventors, and more people from outside our borders.  This has always been the place people come to plant the seeds of great ideas.  Let’s stay true to that.

IITK Released


I’ve just released the first version of IITK, or Information Theory Toolkit.  This is a handful of extremely useful functions from information theory – entropy, mutual information, Kullback-Leibler divergence, and variation of information – in a concise, simple package.

You can grab it at: https://github.com/MaxwellRebo/ittk
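
As a quick taste of two of those quantities, here’s a from-scratch version in Python with numpy.  The label sequences are invented for the example, and the functions inside the package itself may be named and parameterized differently.

    import numpy as np
    from collections import Counter

    def entropy(labels):
        """Shannon entropy (bits) of a sequence of discrete labels."""
        counts = np.array(list(Counter(labels).values()), dtype=float)
        p = counts / counts.sum()
        return float(-np.sum(p * np.log2(p)))

    def variation_of_information(x, y):
        """VI(X, Y) = H(X) + H(Y) - 2 * I(X; Y), a metric between discrete variables."""
        h_joint = entropy(list(zip(x, y)))       # H(X, Y)
        mi = entropy(x) + entropy(y) - h_joint   # I(X; Y)
        return entropy(x) + entropy(y) - 2 * mi

    x = [0, 0, 1, 1, 2, 2]
    y = [0, 0, 1, 1, 1, 1]
    print(entropy(x), variation_of_information(x, y))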

I look forward to hearing about what projects it might find its way into!

2013: Cognitive Systems


Between IBM’s Cognitive Computing initiative, GE’s Industrial Internet, and Cisco’s Internet of Everything, one point is becoming perfectly clear: the push to digitally integrate every object in our modern existence is stronger than ever.

Problem: If it’s programmable, it’s hackable.

Corollary: If it’s connected, it’s pwnable.

No doubt these major companies have sophisticated schemes for keeping device security under control, but a paradigm shift may be in order.  IBM is nearing the mark on this: by drawing a direct analogy between digital systems and sensory systems, they’re unlocking a world of infinite potential.  It’s clear how this method applies to generating immediate commercial value, but how does it apply to security?

The security case is difficult.  Secure embedded Linux OSes are abundant, yet how many of them employ this kind of robust sensory analogy?  How many of them can boast a sentient operating system?  None yet, but it will come around.

This year, and several years following, marks the rise of cognitive systems.  As an entrepreneur, I too will play my part.  A lot of folks think we live in the Information Age.  I’m saying the Information Age has not even yet begun.

An Account of Seemingly Disparate Endeavors


Some people know me as a software designer, others know me as a mathematician-in-training.  I’ve heard from some that I should stop wasting my time on one and just commit 100% to the other.  I claim that this would be a huge mistake, and here’s why.

In fact, these two things are more deeply entangled than many would think.  Here’s the crux of it: to write good programs and ultimately design good software, you have to have an intuition for and understanding of deep, complex structures and the relationships between them.  To do good mathematics, you have to have an intuition for and understanding of deep, complex structures and the relationships between them.  Skill in one area can easily overlap with skill in the other.

Based on this, designing software helps me understand deeper mathematical structures, and studying mathematics (especially matrix groups and complex analysis) helps me understand software designs.  Quitting one or the other would ruin the synergy I have going between them, and besides, at present I earn my bread through software anyway.

 

A slight misunderstanding


Often in discussions of artificial intelligence I see and hear the quote, “The brain does around X calculations per second.”  Usually this number is around 100 trillion.  Why?

This is presumed because the brain is said to have about 100 trillion synapses between all of its neurons.  By treating each synapse as a computational element capable of performing an action based on a stimulus, the brain is then modeled as “something doing 100 trillion calculations per second.”

There are several problems with this:

1. What is the nature of these “calculations” we’re talking about?  Is this simple addition, probability tables, or differential equations?

2. The language of “per second” could wrongly imply that the brain somehow runs on a constant master clock.

3. Are there additional layers of important information being exchanged beyond merely the synapses?

In more detail,

#1: Given the tendency to want to measure things in FLOPS (Floating Point Operations Per Second), I can see why this approach would be appealing.  It’s as simple as counting how many “computational elements” are in the system and then saying it can do that many calculations per second, right?  I’m led to think, well, no.  A FLOP is an extremely simple operation, such as a single addition or multiplication.  Something more complex, such as a linear algebra routine or a probability function, requires sophisticated code and hence many instructions/FLOPs to execute.  The argument that “the brain does 100 trillion calculations per second and therefore we will have true AI when computers can do 100 trillion CPS” is then about as useful as saying “a human is made of 150 lbs of matter, so when you have 150 pounds of matter you’ll have a human”, or something equally ridiculous.  The number of calculations is not completely unimportant, but it is secondary to what kind of calculations are being done.  In the case of classical computers, as stated, complex routines consume many simple operations to do their work, and so a raw count of simple calculations isn’t very informative of the system’s overall capability.
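
For a rough sense of scale, here is that arithmetic in a few lines of Python; the matrix size is an arbitrary choice for illustration.

    # One "calculation" at the level of a routine hides a huge number of FLOPs.
    # A naive n x n matrix multiply costs about 2 * n**3 floating point operations
    # (one multiply and one add per inner step).
    n = 1000
    flops_per_matmul = 2 * n ** 3        # ~2 billion FLOPs for a single multiply
    machine_flops_per_sec = 100e12       # the oft-quoted "100 trillion per second"
    print(machine_flops_per_sec / flops_per_matmul, "matrix multiplies per second")

In other words, the oft-quoted hundred-trillion figure buys you only about fifty thousand routine-level “calculations” of this kind per second; which unit you count in changes the picture by many orders of magnitude.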

#2: We’re used to thinking of calculations happening in a uniform, clock-like way because of the way our chips are designed.  The problem is, the brain, as far as we can tell, processes everything asynchronously.  Each node operates more or less independently of the others.  That’s not to say that classical computers won’t be useful in emulating brain-like mechanics, but modeling an asynchronous system with a highly synchronized one comes with complications that should not be ignored.
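
One way to see the complication is to sketch what emulating asynchronous units on a clocked machine involves.  The snippet below is an invented toy (the firing rates and unit count mean nothing in themselves); it only illustrates event-driven scheduling, where each unit picks its own next firing time instead of waiting on a global tick.

    import heapq
    import random

    def simulate_async(num_units=5, t_end=1.0, rate=50.0):
        """Each unit schedules its own next firing time; there is no global tick."""
        events = [(random.expovariate(rate), unit) for unit in range(num_units)]
        heapq.heapify(events)
        while events:
            t, unit = heapq.heappop(events)
            if t > t_end:
                break
            # The unit "fires" at time t, wherever t happens to fall; its effect
            # lands between other units' updates, not on a shared clock boundary.
            print(f"t={t:.4f}  unit {unit} fires")
            heapq.heappush(events, (t + random.expovariate(rate), unit))

    simulate_async(num_units=3, t_end=0.1)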

#3: This also ties in with #1.  Namely, because information passes through electrochemical channels, there are additional layers of computation that have not yet been completely modeled or understood.  The actual communication mechanisms of the brain could be simpler than we think, or they could turn out to be vastly complex.  It’s anybody’s guess right now.  But on any guess, a direct conversion from calculations per second (as in the brain) to something like FLOPS (as in classical computers) is like saying “a machine that can add 10 numbers together in a second can also solve 10 high-order partial differential equations per second”.  With extremely clever software something like this may eventually be possible (perhaps a mapping from addition operators to matrix solvers, or something like that), but for now this kind of crude conversion is wildly inaccurate.

I worry that a lot of people are buying the idea that once we get 100 TeraFLOPS machines we’ll somehow have an uber-AI.  Unless software comes a long way, those who are counting on this may be very disappointed when that emergence doesn’t happen.

It is worth noting that a quantum computer used for AI would be a completely different picture – different from both classical computers and from the brain.  A behemoth of the quantum variety would be capable of things that neither an Intel i7 nor a human brain can do, but that is another discussion entirely.

Until next time, then.