Friday, April 19, 2013

Node Data Evolution

While working on Thinker I searched for a data syntax that could gracefully handle the kind of challenges that Thinker would present. I wanted a human-readable, minimal syntax that handled node-based data. These considerations were mostly for the sake of the developer (in this case me), though I was also thinking of the user. If the user ever needed to export or convert Thinker data into text for backup or transport, I wanted it to be something they could understand and pick up easily.
In the beginning, Thinker data was based in XML. XML certainly satisfied the node-based requirement. You can stick many nodes of the same kind under one parent, and XML even has explicit metadata syntax for each node in the form of attributes. But XML, despite being human readable, is still a chore to work with manually. Markup really is best used for marking text and not complex data. And despite having explicit metadata, the metadata doesn't really have any structure. If you attempt to store structured metadata in an XML attribute (such as styles in HTML) the results are hardly readable.
Then Thinker used JSON. JSON was nice because it was easy to work with in JavaScript, to stringify and transport objects. The data was easier to manually work with than XML; there were no lengthy closing tags to deal with. But JSON is not inherently node-based. You can make node structures in JSON, since child nodes are just nested in an array, but nesting objects in an array gets very messy when you're stacking all those brackets, braces, and commas together.
Then I got into CoffeeScript. I've always liked the beauty of languages with whitespace syntax, though I hardly use Ruby or Python. So I thought I would just use CoffeeScript syntax to work with my data, in the form of CSON if you will. (That's actually a thing.) But despite the beautiful whitespace syntax of objects in CoffeeScript, there's one glaring problem: you still need brackets for arrays. And putting indented objects into an array over multiple lines wasn't much better than JSON with node-based structures... So close, yet so far.
But then I had a brilliant idea! Why not just use colons without keys to represent indexed items in an array? Since "arrays" in JavaScript are generally just objects (sparse hashes) anyway, colons without keys would just be shorthand for assigning indices. So I posted a ticket on GitHub...
Alas, they get people complaining about the array syntax all the time, and they are probably tired of hearing about it, especially since there are alternatives like Coco and LiveScript. I actually agree with jashkenas that stars and hyphens are not the best solution to the problem. One problem with my solution, though, would be the risk of accidentally mixing keyed and unkeyed properties. Arrays with keyed properties work at runtime, but not in a precompiled language. It would have to be a compiler error.
So ultimately, I had no choice but to create my own data syntax, which really means I had to create my own parser. The reason I didn't just go with YAML at this point was that 1) I don't actually prefer its hyphen syntax, and 2) I thought I would go all the way and create a syntax that satisfied my node-based requirements as well. Also, YAML doesn't accept tabs for indentation. Yes, I see that as a flaw, and I don't care who knows.
So to start, I just created a parser that parses objects like CoffeeScript (or actually like YAML with unquoted strings). Then I added the ability to convert objects with unkeyed properties to arrays. Since this data will be parsed at runtime, it allows for arrays with some keyed properties. Of course, those properties don't output to JSON, but they could be set in the result.
So this:
obj:
 foo: bar
 key: val

arr:
 : 2
 : -1
 :
  foo: bar
  zip: zap
 :
  : 1
  : 2
  : 3
Translates to this in JSON:
{
        "obj": {
                "foo": "bar",
                "key": "val"
        },
        "arr": [
                2,
                -1,
                {
                        "foo": "bar",
                        "zip": "zap"
                },
                [
                        1,
                        2,
                        3
                ]
        ]
}
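To make that translation concrete, here is a minimal sketch in Python. It is not the Thinker parser (that one isn't shared here); the names parse and parse_scalar are made up for illustration, and it only covers the basic object syntax shown above. One simplification to note: it rejects a block that mixes keyed and unkeyed entries, which is stricter than the runtime behavior described earlier.

```python
import json

def parse_scalar(text):
    """Read an unquoted value as an int, then a float, else keep it as a string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text

def parse(source):
    """Parse indented `key: value` lines; keyless `: value` lines become array items."""
    # Keep (indent width, stripped text) for every non-blank line.
    lines = [(len(raw) - len(raw.lstrip()), raw.strip())
             for raw in source.splitlines() if raw.strip()]

    def block(pos, indent):
        keyed, items = {}, []
        while pos < len(lines) and lines[pos][0] == indent:
            key, _, rest = lines[pos][1].partition(":")
            key, rest = key.strip(), rest.strip()
            pos += 1
            if rest:
                value = parse_scalar(rest)
            elif pos < len(lines) and lines[pos][0] > indent:
                # A deeper-indented run of lines is the nested value.
                value, pos = block(pos, lines[pos][0])
            else:
                value = None
            if key:
                keyed[key] = value
            else:
                items.append(value)
        if keyed and items:
            # Assumption of this sketch: mixing is an error rather than allowed.
            raise ValueError("keyed and unkeyed entries mixed in one block")
        return (items if items else keyed), pos

    result, _ = block(0, 0)
    return result

SAMPLE = """
obj:
 foo: bar
 key: val

arr:
 : 2
 : -1
 :
  foo: bar
  zip: zap
 :
  : 1
  : 2
  : 3
"""

print(json.dumps(parse(SAMPLE), indent=2))
```

Running it on the sample should produce the same structure as the JSON above. The recursion does all the work: each block is either a run of keyed lines (an object) or a run of keyless lines (an array), decided only after the block has been read.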
Awesome! It's the most straightforward and simple hierarchical data syntax ever. But while I'm at it with a custom parser, I could go ahead and create an expanded syntax with support for node structures. So I added some extra key symbols for Thinker node data shorthand.
So this:
root node
 some node prop: 0
 position:
  x: 0
  y: 0
 - link
  link prop: 0
  > a child node
   more node props:
    :1
    :2
    :3
   -
    > grandchild node
 -
  > another child node
...is equivalent to:
_value: root node
some node prop: 0
position:
 x: 0
 y: 0
_links:
 :
  _label: link
  link prop: 0
  _target:
   _value: a child node
   more node props:
    :1
    :2
    :3
   _links:
    :
     _target:
      _value: grandchild node
 :
  _target:
   _value: another child node
...which in JSON looks like this:
{
        "_value": "root node",
        "some node prop": 0,
        "position": {
                "x": 0,
                "y": 0
        },
        "_links": [
                {
                        "_label": "link",
                        "link prop": 0,
                        "_target": {
                                "_value": "a child node",
                                "more node props": [
                                        1,
                                        2,
                                        3
                                ],
                                "_links": [
                                        {
                                                "_target": {
                                                        "_value": "grandchild node"
                                                }
                                        }
                                ]
                        }
                },
                {
                        "_target": {
                                "_value": "another child node"
                        }
                }
        ]
}
...which would be a real chore to work with manually. This node structure uses the new link paradigm. Note that the links themselves are a kind of node and can contain their own properties. The hyphens in this case don't just designate array items, but link nodes that are added to a links array. I also added "=" as shorthand for a meta link. The ">" symbols point to link "targets", which are actual nodes and can themselves contain links to other nodes. I even added some support for node IDs and references to serve as the basis for associative links in Thinker. Normal node properties use the same basic object structure from before.
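As a rough illustration of how that link structure can be followed once it is parsed, here is a small Python sketch. The keys _value, _links, _label, and _target are taken from the expanded form above; the walk function is a hypothetical helper, not part of Thinker.

```python
def walk(node, depth=0, label=None):
    """Yield (depth, link label, node value) for a node and every node it links to."""
    yield depth, label, node.get("_value")
    for link in node.get("_links", []):
        target = link.get("_target")
        if target is not None:
            # The link's own label travels with the node it points to.
            yield from walk(target, depth + 1, link.get("_label"))

# The same structure as the JSON above, minus the ordinary properties.
root = {
    "_value": "root node",
    "_links": [
        {
            "_label": "link",
            "link prop": 0,
            "_target": {
                "_value": "a child node",
                "_links": [{"_target": {"_value": "grandchild node"}}],
            },
        },
        {"_target": {"_value": "another child node"}},
    ],
}

for depth, label, value in walk(root):
    print(" " * depth + (f"[{label}] " if label else "") + str(value))
```

Because a link is its own object sitting between parent and child, it has room for a label and properties of its own, which a plain array of children would not.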
This expanded node data syntax is obviously not quite as straightforward as the basic object structure, but it is much better than the alternative for node-based data. I'm not even going to share the parser code here because I know nothing about writing parsers, and it's probably the most clumsy, inefficient, fragile parser ever written. It just works... most of the time. So for now I'll just use it internally.

Edit: mixed up the terms compiler and parser...

Tuesday, April 9, 2013

Interface Improvements

The Thinker interface is currently all HTML with some SVG. The future Thinker interface will be WebGL. WebGL is hardware accelerated, which should vastly increase graphics performance. Since WebGL is essentially OpenGL ES for the browser, it should also make Thinker more portable to other platforms. And of course, it will also allow for cool things like 3D mind mapping and data visualizations. Panning and zooming around a world will be trivial (currently Thinker only has panning, unless you count browser zooming).


There is another interesting aspect of moving to WebGL that will help improve the interface. I've noticed that the beta testers like to think visually from the center out. Currently, Thinker places the origin and title of the world in the upper left corner with upper and left limits. Nodes can only extend down and to the right. This was a natural consequence of using an HTML paradigm to display the worlds. With the WebGL paradigm, the origin and title of the worlds will be centered and nodes will be free to extend outward in all directions (and dimensions). This will make it easier to rearrange node trees manually and automatically, as they will have more space as they grow outward.
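The arithmetic behind a centered origin with panning and zooming is small enough to sketch in Python. This is only the 2D math, not Thinker's rendering code; the function name, the parameter names, and the choice of a y axis that grows downward on screen are all assumptions made for the example.

```python
def world_to_screen(x, y, width, height, pan=(0.0, 0.0), zoom=1.0):
    """Map world coordinates (origin at the world's center) to pixel coordinates.

    pan is the world point currently shown at the middle of the viewport,
    and zoom is the number of pixels per world unit.
    """
    sx = width / 2 + (x - pan[0]) * zoom
    sy = height / 2 + (y - pan[1]) * zoom
    return sx, sy

# The world origin lands in the middle of an 800x600 viewport.
print(world_to_screen(0, 0, 800, 600))
# A node up and to the left of the origin is still on screen.
print(world_to_screen(-100, -50, 800, 600))
```

With the origin in the middle, negative coordinates are just as valid as positive ones, which is what lets nodes grow outward in every direction.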

I might even be able to recreate the Metaphysical Index entirely in Thinker...

Thursday, April 4, 2013

Protometa

Protometa - my username on GitHub and the beta invite code for Thinker. It's short for prototype metadata, the virtual metadata nodes that store default values for properties that are otherwise undefined. If a node does not have a certain property defined in its metadata nodes, it inherits this default prototype metadata. Whenever a property of a node is initially set, a metadata child node is added that overrides the protometa.
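The fallback described above can be sketched in a few lines of Python. This is an illustration of the idea only: the PROTOMETA values, the Node class, and its get and set methods are hypothetical, and a plain dictionary stands in for the metadata child nodes.

```python
# Hypothetical default values shared by every node.
PROTOMETA = {"radius": 20, "color": "gray"}

class Node:
    def __init__(self, value):
        self.value = value
        self.metadata = {}  # stands in for this node's metadata child nodes

    def get(self, prop):
        """Use the node's own metadata if present, else fall back to the protometa."""
        if prop in self.metadata:
            return self.metadata[prop]
        return PROTOMETA[prop]

    def set(self, prop, val):
        """Setting a property adds metadata that overrides the protometa."""
        self.metadata[prop] = val

a, b = Node("a"), Node("b")
a.set("radius", 35)
print(a.get("radius"), b.get("radius"))
```

Only node a carries its own radius; node b stores nothing and still resolves one through the shared defaults.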

Metadata nodes within metadata nodes...

Why would you want the properties of nodes to be stored as nodes within nodes? Ideally, this fractal, recursive structure makes nodes more editable and extensible.

Metadata can be as complex as any other data, and since Thinker nodes are meant to be a way to navigate and edit such data, it only makes sense to use such nodes to explore the metadata as well. In this way, nested node metadata allows for manual editing of properties. E.g. you can edit the radius of a node by dragging the edge and resizing it, or you can open the radius metadata and set it to some specific value.

Also, it potentially allows for the use of data to drive the appearance of nodes, which would be a way to create custom data visualizations. E.g. the radius of a node could be driven by a value, which would allow you to visualize it relative to others. Creating formulas to drive values is itself done with metadata nodes. It's sort of like visual scripting.

Working out formula/function syntax in nodes

But it has some major downsides. It overcomplicates many things that would otherwise be very simple, and the nested metadata nodes are difficult to query and update in the database.

In considering the new linked data structure for Thinker, specifically the idea of link node metadata, I had a happy realization. Nested metadata nodes should not be created by default every time a property is adjusted. According to the new paradigm, metadata links could be an optional addition to allow for plugging node values into node properties. This would let nodes store their own properties in a compact fashion by default while still allowing for advanced functionality such as manual override of properties, custom data visualization, and function nodes.

Furthermore, adding these metadata links could simply be a matter of adding a child node and changing the type of link that connects it. So there would be normal links with any arbitrary labels and meta links with special keyword labels that could alter various properties or provide other advanced functionality based on the values to which they linked.
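One way this could work is sketched below in Python, reusing the _links, _label, _target, and _value keys from the node data syntax post. Everything else is an assumption made for the example: the _meta flag used to mark a meta link, the resolve function, and the sample property names.

```python
def resolve(node, prop, default=None):
    """Return a property, letting a meta link labeled with its name override the stored value."""
    for link in node.get("_links", []):
        # Assumption: meta links are marked with a `_meta` flag.
        if link.get("_meta") and link.get("_label") == prop:
            return link["_target"]["_value"]
    # No meta link: use the compactly stored property, if any.
    return node.get(prop, default)

plain = {"_value": "plain node", "radius": 10}

driven = {
    "_value": "population",
    "radius": 10,
    "_links": [
        # A normal link may carry any label without affecting properties.
        {"_label": "radius", "_target": {"_value": "just a note"}},
        # A meta link plugs the target's value into the property.
        {"_label": "radius", "_meta": True, "_target": {"_value": 42}},
    ],
}

print(resolve(plain, "radius"), resolve(driven, "radius"))
```

The common case stays a plain stored property, and the link lookup only matters for nodes that opt in to being driven by another node's value.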

Bottom line: things will be much simpler for the developer and the user without sacrificing any potential functionality.




Tuesday, April 2, 2013

App vs Web

In my first post, I touched on the topic of monetization and the struggle to display contextual ads in a web app. (Since then I've discovered Google Interactive Media Ads and Google In-Game Advertising. Unfortunately the former seems to be specifically for video and the latter is specifically for Flash games...)

I thought it might be good to review my thoughts on the web vs app issue. Thinker is currently being developed on the web.

Thinker would make a great app. It's probably easier to develop as an app with existing libraries, SDKs, and standards. It might be easier to monetize as an app. Thinker is designed around a touch screen interface paradigm, which would naturally work well as an app on a tablet device. It also relies on some advanced web technologies and standards that make it almost browser specific, so the cross-platform advantages of the web are more limited.

Despite these points in favor of apps, Thinker is on the web for now. One big advantage of the web is its inherent interconnectivity. A web app can easily link and connect to any other content on the web. I want Thinker to have that kind of interconnectivity. Also, the web is a good place to start. When you want something that goes everywhere, you can always start with the web and then make optimized apps and platform-specific versions later.