Mind Hash Tables: An Estimate of Reality

Science and Philosophy are Models

Anyone who’s taken a philosophy course knows that there exists an endless supply of propositions trying to explain anything and everything, from existence to reasoning. Furthermore, anyone who’s taken a science or math course knows that everything science can explain has exceptions. Exceptions will always exist because any time humans come up with a theory, whether it be for further understanding or prediction, it’s simply a model. A model, by my definition, is a simplified version of reality which tries to represent, as accurately as possible, the aspects and relationships which are most significant to what we are attempting to model. Therefore, by definition, a model has some degree of information loss. If a model didn’t have information loss, that ideal model would be reality.

“Philosophy is the discipline concerned with questions of how one should live (ethics); what sorts of things exist and what are their essential natures (metaphysics); what counts as genuine knowledge (epistemology); and what are the correct principles of reasoning (logic)”
Wikipedia

Models are Hash Functions

An analogy for a computer scientist is that the mind is a hash table, or a collection of hash tables. Following from this model, a scientific or philosophical model is a hash function which is subject to collisions. The sciences are specialized hash functions that have been highly optimized (and generally agreed upon) for explaining certain areas (math, biology, physics, chemistry, etc.). The divisions between these areas are themselves models. Scientific hash functions are small enough for undergrads to conceptualize, and they minimize the number of collisions within the slice of real life each one attempts to represent.
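As a rough sketch of the analogy (the function names and the bucket count are my own illustrative choices, not taken from any library), a model can be written as a function that squeezes many distinct observations into a handful of buckets. Whenever two different observations land in the same bucket, the model can no longer tell them apart, which is the information loss described above.

```python
from collections import defaultdict


def model(observation: str, buckets: int = 4) -> int:
    """A toy 'model': map any observation to one of a few buckets.

    Summing character codes is a deliberately crude hash, chosen only
    because the result is easy to check by hand.
    """
    return sum(ord(ch) for ch in observation) % buckets


def find_collisions(observations, buckets=4):
    """Group observations by bucket and keep the buckets holding more than one."""
    table = defaultdict(list)
    for obs in observations:
        table[model(obs, buckets)].append(obs)
    return {b: items for b, items in table.items() if len(items) > 1}


observations = ["ab", "ba", "abc", "d"]
collisions = find_collisions(observations)
print(collisions)
```

Here "ab" and "ba" collide because this particular model ignores letter order. A different hash would separate them, but would collide somewhere else instead.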

When starting a mind hash table, just as in computer science, the potential dataset size is unknown. Children start with smaller hash tables because they don’t know how much they will be modeling. They start with much simpler models than most adults use. The brain accounts for a large share of human calorie consumption, so a smaller hash table, which consumes fewer resources, is desirable when basic survival is critical. Resizing a hash table is expensive because everything must be rehashed. This can be thought of as a paradigm shift or accepting a new model as our own.
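To make the cost of resizing concrete, here is a minimal sketch of a chained hash table that doubles when it fills up and counts how many entries each resize has to move. The class name `MindTable` and the grow-when-full policy are assumptions for illustration; real hash tables use a variety of growth rules.

```python
class MindTable:
    """A toy chained hash table that counts how many entries resizes rehash."""

    def __init__(self, size=2):
        self.buckets = [[] for _ in range(size)]
        self.count = 0
        self.rehashed = 0  # total entries moved by all resizes so far

    def _index(self, key, size):
        return hash(key) % size

    def put(self, key, value):
        bucket = self.buckets[self._index(key, len(self.buckets))]
        for i, (k, _) in enumerate(bucket):
            if k == key:  # existing key: overwrite in place
                bucket[i] = (key, value)
                return
        bucket.append((key, value))
        self.count += 1
        if self.count > len(self.buckets):  # load factor above 1: grow
            self._resize(2 * len(self.buckets))

    def get(self, key):
        for k, v in self.buckets[self._index(key, len(self.buckets))]:
            if k == key:
                return v
        raise KeyError(key)

    def _resize(self, new_size):
        # Every stored entry must be placed again under the new table size.
        old_entries = [pair for bucket in self.buckets for pair in bucket]
        self.buckets = [[] for _ in range(new_size)]
        for k, v in old_entries:
            self.buckets[self._index(k, new_size)].append((k, v))
        self.rehashed += len(old_entries)


table = MindTable()
for n in range(5):
    table.put(n, f"idea {n}")
print(len(table.buckets), table.rehashed)
```

Storing five ideas in a table that started with two buckets forces two resizes, and between them every existing idea gets moved at least once. That is the "paradigm shift" cost in miniature.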

Why Deliberate and Create Models?

We build hash tables for more efficient lookup and understanding. Hash tables are easier to wrap our minds around, and it’s arguably impossible to wrap our minds around all of reality at once. It’s also faster to think about a model than to try to figure out all of science every time. Proving a theory, for example, simply means that for a given set of critical criteria, the theory (model/hash function) minimizes collisions in the mind hash table. Determinists, realists, idealists, fundamentalists, and everybody in between will always be able to find collisions in everyone else’s models. That doesn’t mean anyone is right or wrong; it simply means their hash has information loss, just like everyone else’s hash.

Deliberation and argument are important because they can be thought of as a catalyst for the process of growing and rehashing our minds. A Grand Unified Theory, by my model of models, seems impossible to find because it means finding a hash function that is small enough to conceptualize, even for the most intelligent minds, and that also has minimal collisions on all input to satisfy the unified constraint. Don’t give up on postulating, though. Try to find as many different models as possible. Though it may be impossible to completely understand reality, through the overlap of models we gain the best possible sense of reality. This can be thought of as finding the area under the curve with Riemann sums.

If each box inside the curve is a theory about what the reality-curve’s real shape is, then the more theories we have while still holding true to sensible reasoning, the more accurate one’s picture of reality is. A more accurate model may be to have boxes of different sizes, representing the fact that some of our theories are more accurate than others. This leaves the potential to replace a big box with more little boxes that “hash” with fewer collisions, and therefore, fit more tightly to the curve we call reality. This replacement can be achieved through deliberation.
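The box picture can be checked numerically. The sketch below uses a stand-in "reality curve" of my own choosing, f(x) = x² on [0, 1], whose true area is 1/3, and fills the space under it with equal-width boxes. Because the curve is increasing, a left Riemann sum keeps every box under the curve.

```python
def inner_boxes_area(f, a, b, n):
    """Left Riemann sum with n equal-width boxes.

    For a function that increases on [a, b], every box sits under the
    curve, so the sum never overestimates the true area.
    """
    width = (b - a) / n
    return sum(f(a + i * width) * width for i in range(n))


def reality(x):
    # Stand-in "reality curve"; its exact area on [0, 1] is 1/3.
    return x * x


true_area = 1 / 3
errors = [true_area - inner_boxes_area(reality, 0.0, 1.0, n) for n in (1, 10, 100)]
print(errors)
```

The leftover error shrinks as the number of boxes grows but never reaches zero for any finite count, which matches the claim that more (and smaller) theories fit reality more tightly without ever becoming reality.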

With the box model it may be tempting to say that no one is ever truly right or wrong. This may be true in the general sense, but if we think of right and wrong as smaller and larger boxes, respectively, then in a certain context, someone’s boxes are always smaller and are therefore more accurate.

To Understand Reality, Create More Models

Minimize input information loss by learning a common set of arguments known as philosophy. Minimize model information loss by learning as many models as possible, known as science.

Of course, this analogy of a mind hash table is a model itself and therefore is ultimately missing at least some key aspects of reality. The idea of reality itself contains information loss, but the more models we try to use, the more they overlap, and that overlap gives a pretty good idea of what reality truly is. That’s probably the best we can do.

“It comes down ultimately to a question of philosophy. Does the world make sense or do we make sense of the world? If you believe the world makes sense, then anyone who tries to make sense of the world differently than you is presenting you with a situation that needs to be reconciled formally, because if you get it wrong, you’re getting it wrong about the real world.

“If, on the other hand, you believe that we make sense of the world, if we are, from a bunch of different points of view, applying some kind of sense to the world, then you don’t privilege one top level of sense-making over the other. What you do instead is you try to find ways that the individual sense-making can roll up to something which is of value in aggregate, but you do it without an ontological goal. You do it without a goal of explicitly getting to or even closely matching some theoretically perfect view of the world.”
Clay Shirky

Technorati Authority Boosting Script

“Technorati is an Internet search engine for searching blogs, competing with Google [and] Yahoo.” (Wikipedia) Not only does Technorati allow for blog searching, it also has its own rating system called “authority,” which is analogous to Google’s PageRank. Authority is a function of how many incoming links a blog has. High authority is critical to finding new readers and making new connections in the blogging community.

The problem is that, unlike Google, Technorati does not regularly crawl sites to update the link information. Technorati’s link information depends on blogs to notify, or “ping,” Technorati when they’ve been updated. This is undesirable because, for example, my blog could have many other blogs linking to it but if the linking blogs never notify Technorati of their existence, my blog’s authority will not benefit.

TechnoPing

TechnoPing is a program I wrote in Python to ping Technorati with the sites that link to me, essentially doing the pinging for them. As input, the script requires a Google Webmaster Tools external links CSV file so it knows which sites link to me. To get the CSV file, in Webmaster Tools, browse to “Dashboard → Links → Pages with external links.” At the very bottom of the external links page, the link to the CSV file with all of the site’s links is labeled “Download all external links.” TechnoPing can be run from the command line with python technoping.py <link file>.

Download TechnoPing (Remove the ‘.txt’ extension. I know what you’re thinking: The name is awesome.)
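For readers who just want the shape of the approach, here is a minimal sketch. It is not the TechnoPing source. The CSV layout (any cell that looks like an http or https URL is treated as a linking page), the endpoint URL, and the helper names `linking_sites` and `build_ping` are all assumptions. The sketch only builds the XML-RPC request bodies for the common weblogUpdates.ping call and does not send anything.

```python
import csv
import io
import xmlrpc.client
from urllib.parse import urlsplit

# Assumed endpoint; check the service's own documentation before sending anything.
PING_ENDPOINT = "http://rpc.technorati.com/rpc/ping"


def linking_sites(csv_text):
    """Return the unique site roots found in any cell that looks like a URL."""
    sites = []
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            parts = urlsplit(cell.strip())
            if parts.scheme in ("http", "https") and parts.netloc:
                root = f"{parts.scheme}://{parts.netloc}/"
                if root not in sites:
                    sites.append(root)
    return sites


def build_ping(site_url):
    """Build a weblogUpdates.ping request body without sending it.

    The call takes a blog name and a blog URL; the CSV has no names,
    so this sketch passes the URL for both.
    """
    return xmlrpc.client.dumps((site_url, site_url), methodname="weblogUpdates.ping")


# Hypothetical sample data standing in for the downloaded link file.
sample = (
    "Page,External link\n"
    "/post-1,http://example.com/a\n"
    "/post-2,http://example.com/b\n"
    "/post-2,https://example.org/x\n"
)
sites = linking_sites(sample)
payloads = [build_ping(site) for site in sites]
print(sites)
```

Actually sending a ping would be a call like `xmlrpc.client.ServerProxy(PING_ENDPOINT).weblogUpdates.ping(name, url)`, ideally with a pause between sites so as not to over-ping.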

Technical Notes

This idea and program are based on Joost de Valk’s idea to automate this process. As far as I can tell, he never completed or released his script. TechnoPing, unlike Joost’s script, does not scrape search engines to find links and therefore requires the extra annoying step of having to download the CSV file. Joost also has a bookmarklet, intended for individual pages, to ping Technorati. On his site, Joost points out that IP addresses will be banned for over-pinging, so watch out.

Shortcomings of Mercurial

Mercurial is, by far, the best revision control application I’ve ever used, with Git a close second. Fundamentally, Mercurial does revision control correctly: distributed, clean CLI, and good documentation. I’ve never had any qualms with it, even in team settings. On the other hand, I have noted some complaints teammates have had with Mercurial, some of them serious enough that they stopped using Mercurial or DRCSs altogether.

  1. Mercurial doesn’t maintain file and folder permissions. This becomes a problem when hosting a shared Hg repository on a machine without root access. Other group members add files to the shared repo via ssh and it’s up to each user’s umask to set file permissions. This means I may have files in my home directory which I can’t access. [1]
  2. Merging tools and file servers on Windows are lacking. Windows users expect the revision control to supply these tools. Unix users (including Mac OS X) are used to having file servers and merge tools supplied by the distribution of the operating system, not the revision control application. Some Windows users want GUIs like TortoiseSVN. [2]
  3. Binaries are handled terribly. Mercurial currently doesn’t have a good way to handle binaries. For instance, if a binary (such as an image) is moved, it’s considered a new binary. Keeping many binaries in revision control and moving them a lot will make the repo huge.

Aside from the binary handling issue, I think the reason these “shortcomings” have never bothered me before is that I consider file permissions to be a part of the operating system and the merge and server tools to be separate applications. Mercurial, though easier to use with said external tools, is just as functional without them, and I will continue to use it as my main development tool.
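The umask problem from the first item is easy to reproduce outside of Mercurial. The sketch below (POSIX only, with a helper name of my own invention) creates a file under two different umasks and reports the permission bits the file ends up with. It illustrates how umask works in general and makes no claim about Mercurial’s internals.

```python
import os
import stat
import tempfile


def mode_of_new_file(umask):
    """Create a file under the given umask and return its permission bits."""
    previous = os.umask(umask)
    try:
        with tempfile.TemporaryDirectory() as folder:
            path = os.path.join(folder, "committed.txt")
            # Request rw for everyone; the umask strips bits from this request.
            fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
            os.close(fd)
            return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        os.umask(previous)  # always restore the caller's umask


private = mode_of_new_file(0o077)  # typical "owner only" umask
shared = mode_of_new_file(0o002)   # typical "group friendly" umask
print(oct(private), oct(shared))
```

A teammate pushing with the first umask leaves behind files that only they can read, even though the files live in someone else’s repository directory. The second umask keeps the files readable and writable by the group.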

  1. When I asked about the file permission problems on the Mercurial mailing list, the Mercurial maintainer responded with this:

    Patches to inherit permissions from “.hg” are now in the crew repo which
    will get merged into mainline shortly and show up in the 1.0 release
    very soon.

    In the meantime, you’ll have to beat people into setting their umask
    properly.
    — Matt Mackall

  2. TortoiseHg is now available for Mercurial.

Beginning Blogging

There are three key aspects to think about when starting to blog: Finding interesting blogs to read, reading blogs and becoming inspired and reactive, and then writing blog posts.

Reading

Comparing blogs to physical newspapers, blogs are analogous to individual columns of a newspaper. Physical newspapers have editors who choose the columns and articles to feature each time the newspaper is published. The beauty of blogs is that they allow readers to pick and choose their own preferred columns and essentially take on the editor role themselves: the reader is the editor.

Newspapers have a rigid submission deadline for columnists’ articles. Bloggers, on the other hand, are publishing articles all the time in a constant stream of content. Feed readers, like Google Reader, allow readers to subscribe to blog “feeds” (streams) and be notified of updates. Feed readers are the tools that allow readers to be editors and automatically build their personalized newspapers in one place.
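As a toy illustration of what a feed reader does behind the scenes, the sketch below merges two tiny, made-up RSS 2.0 documents into a single "front page" sorted newest first. The feed contents and function names are hypothetical. A real reader would also fetch the feeds over HTTP, handle Atom, and cope with missing dates.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def read_items(rss_text):
    """Yield (published, feed title, item title) for each item in an RSS 2.0 document."""
    channel = ET.fromstring(rss_text).find("channel")
    feed_title = channel.findtext("title")
    for item in channel.iter("item"):
        published = parsedate_to_datetime(item.findtext("pubDate"))
        yield published, feed_title, item.findtext("title")


def front_page(feeds):
    """Merge several feeds into one list, newest first."""
    items = [entry for rss_text in feeds for entry in read_items(rss_text)]
    return sorted(items, reverse=True)


# Made-up feeds standing in for two subscriptions.
FEED_A = """<rss version="2.0"><channel><title>Blog A</title>
<item><title>Older post</title><pubDate>Mon, 07 Jan 2008 10:00:00 GMT</pubDate></item>
</channel></rss>"""
FEED_B = """<rss version="2.0"><channel><title>Blog B</title>
<item><title>Newer post</title><pubDate>Tue, 08 Jan 2008 09:00:00 GMT</pubDate></item>
</channel></rss>"""

page = front_page([FEED_A, FEED_B])
headlines = [(feed, title) for _, feed, title in page]
print(headlines)
```

The subscription list plays the part of the editor’s column choices, and the merge step is the layout of the paper.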

“There’s an analogy here with every journalist who has ever looked at the Web and said ‘Well, it needs an editor.’ The Web has an editor, it’s everybody. In a world where publishing is expensive, the act of publishing is also a statement of quality — the filter comes before the publication. In a world where publishing is cheap, putting something out there says nothing about its quality. It’s what happens after it gets published that matters. If people don’t point to it, other people won’t read it. But the idea that the filtering is after the publishing is incredibly foreign to journalists.”
Clay Shirky

Finding

Feed Logo

I find blogs by reading sites where people vote on blog posts (and actually anything on the internet, but blogs pop up more often than not) such as Digg and Reddit. To add a site to a feed reader (specifically Google Reader), a subscription must be available. Almost all sites have subscriptions available and they are usually denoted by the pictured orange feed logo. Copying and pasting the site’s address into Google Reader’s “Add subscription” box will add the subscription. Google Reader can also look at the current subscriptions and make suggestions for more feeds.
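Pasting a plain site address works because many blogs advertise their feed in the page’s HTML with a link tag of the form rel="alternate" and a feed MIME type. The sketch below looks for those tags in a made-up page. Whether Google Reader does exactly this is an assumption on my part, and the class and function names are illustrative.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect href values from <link rel="alternate"> tags that advertise a feed."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        kind = (attr.get("type") or "").lower()
        if "alternate" in rel and kind in FEED_TYPES and attr.get("href"):
            # Resolve relative hrefs against the page address.
            self.feeds.append(urljoin(self.base_url, attr["href"]))


def discover_feeds(html, base_url):
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds


# Hypothetical blog page with one stylesheet link and one feed link.
PAGE = """<html><head>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="Posts" href="/feed/">
</head><body></body></html>"""

found = discover_feeds(PAGE, "http://blog.example.com/")
print(found)
```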

Writing

Sign up at a place like WordPress for free and just start writing. I definitely use a few tricks to get going, though:

  • I write about things that I find myself “preaching” to many different people. For instance, lately I’ve been telling many people how to get started with blogging and now I will be able to point them to this post.
  • Read blog feeds to help inspire unique content. “Blog reactions” are blog posts about other people’s posts. Building on other people’s ideas is definitely encouraged in the blogging community.
  • Get involved with similar bloggers. Competitor blogs tend to share a lot of the same readership. Comment on peer blogs and don’t be afraid to start some controversy!
  • Talk to real-life friends about blog ideas. Blogging doesn’t just have to be an online activity. I actively talk to my friends about what would appeal to them and what they think about certain topics.
  • Think about the target audience. Most of the time, I have no idea who my target audience is but I have an ideal in mind. My peers in school and at work are definitely the closest I know to this ideal reader. I try and gauge how much I should explain, for instance, technical details based on what my target audience would already know.

Duty Calls

Evolution

When first starting out it’s hard to know exactly what to blog about. It takes time and practice to find a niche. I’m constantly starting over as I change what I want in my blog. I’ve had three blogs previously and have two blogs now, and each is different and has helped me evolve my writing.

Many blog posts that I write never actually get posted. I think of Humanist as a magazine where I publish a select subset of my ideas to a select subset of the internet population. Even if I have a good idea, I won’t post it if it’s not exactly relevant to my readership. Of course, posting anything and everything I can think of would probably still find some subset of the population to appeal to, but part of the blogging fun is finding a binding theme. I’m not writing about myself and that’s why I don’t blog at LukeHoersten.com. Jim Whimpey describes it well when explaining why he’s not blogging at his name anymore: “… I feel like I want to write for a publication rather than run what felt like a public consciousness. Valhalla Island by Jim Whimpey feels like there’s an extra layer of editorial integrity and quality.”

Try to think “what unique perspective do I have to write about? What can I add to the internet?”

Motivation & Rhythm

It’s critical to stay motivated to keep the flow of posts coming. Things like reader feedback, money from ads, and even site design keep me interested enough to keep posting. Finding a rhythm is also important. I post about once a week and my post lengths tend to be medium range. Readers come to expect a rhythm. Some people post less often but their posts are huge articles with many ideas. Microblogging is another form of blogging where the writer simply does little one-line updates very often. Either way, I think most authors probably blog about 500 words per three days, give or take.

If you have trouble coming up with ideas, let me know and I’ll try and help tap your brain.

Unified Web Persona: Custom OpenID & G-Talk URLs

OpenID

An online web persona is essentially a collection of online personal effects: web pages, profiles, and accounts. It’s important to have a well maintained web persona for a few reasons: to be easily discovered by potential employers, to build connections, and to be easily contacted. I try to saturate search results for my name with sites that I control to ensure that the online image of myself is well formed and professional. To aid in that professionalism, it’s important to unify all personal effects under one domain name (in my case, luke.hoersten.org).

The four most important units of my web persona are my personal website and unified login (luke.hoersten.org), and my email and instant message handle (luke@hoersten.org). The email address and website are trivial to set up with a custom domain name but the unified login and instant message handle can be harder. OpenID and Google Talk can aid in the customization.

Unified Login: Custom OpenID URL

“OpenID is a decentralized single sign-on system. Using OpenID-enabled sites, web users do not need to remember traditional authentication tokens such as username and password.”
Wikipedia

OpenID is finally starting to get widespread use! For me, though, the biggest turnoff to OpenID is the horrible choice of OpenID provider domain names. If I’m going to be picking a one-size-fits-all login, I want it to look good, right? While setting up my own OpenID provider, I realized that OpenID is designed so that any site I own can serve as my OpenID login. That means I can get an OpenID provider anywhere, as long as the URL I’m using as my OpenID URL is linked to my OpenID provider. Here’s how it’s done:

  1. Sign up for an OpenID provider. This is what will handle the authentication.
  2. Link a personal website to the OpenID provider by adding the following tags to the head of the site’s home page. The site’s URL is what will be used to log in to OpenID-enabled websites.
    <link rel="openid.server" href="http://www.myopenid.com/server" />
    <link rel="openid.delegate" href="http://youraccount.myopenid.com/" />

Now I am able to log in to OpenID-enabled websites with “luke.hoersten.org” even though my OpenID provider may be myOpenID or some other provider. This is because “luke.hoersten.org” delegates to my real OpenID.
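To see what a site I log in to gets out of those two tags, here is a minimal sketch of the discovery step: fetch the page at the login URL, then read the provider endpoint and the delegated identity out of the link tags. The page content below is a stand-in that uses the same placeholder account as above, the parser class is illustrative, and a real consumer library does considerably more (fetching, validation, and the actual authentication exchange).

```python
from html.parser import HTMLParser


class OpenIDLinkParser(HTMLParser):
    """Collect the href of the first openid.server and openid.delegate link tags."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        for rel in (attr.get("rel") or "").split():
            if rel in ("openid.server", "openid.delegate") and attr.get("href"):
                self.links.setdefault(rel, attr["href"])


def discover(html):
    """Return (server, delegate) as found in the page, or None for a missing tag."""
    parser = OpenIDLinkParser()
    parser.feed(html)
    return parser.links.get("openid.server"), parser.links.get("openid.delegate")


# Stand-in for the HTML served at the personal URL.
HOME_PAGE = """<html><head>
<title>Personal site</title>
<link rel="openid.server" href="http://www.myopenid.com/server" />
<link rel="openid.delegate" href="http://youraccount.myopenid.com/" />
</head><body></body></html>"""

server, delegate = discover(HOME_PAGE)
print(server, delegate)
```

The personal URL stays the same if I ever switch providers. Only the two href values change.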

IM Handle: Custom Jabber

Getting a personalized instant message handle is not easy. There are many different IM networks and matching all of those to a personal brand can be tricky. Luckily, Google is really pushing Google Talk. They’ve added support for the largest IM network, AOL Instant Messenger (AIM), so now a Google Talk handle has one of the widest exposures. Google also has something called Google Apps which allows users to bind parts of their domain name to certain functions like Jabber IM and email. This is what I use for both email and IM, and it’s free. Sign up for Google Apps and set up both email and a custom Jabber account all in one place.

The new Google Talk account can be accessed with any Jabber client. Here is the Pidgin setup:

Pidgin Basic Google Talk Setup

Pidgin Advanced Google Talk Setup

Completing a Web Persona

Of course, the above steps are only a few in creating a complete online persona. Web Worker Daily has some great posts about why someone would want an online persona and how to build a complete persona. Both Google and Yahoo! allow unrelated email addresses to be added to your respective accounts. Google has a special secret page for this and Yahoo! can be controlled right from the profile screen. Hopefully both will allow OpenIDs to be attached in the near future.