The Tagtriples scheme is working pretty well at work, which is turning out to be a good testbed for the open-source tagtriples code I write in my spare time. We've got a deployment of the aggregator with 1.5M triples in it.

I think I've got the semantics pretty much sorted:

  • Symbols used multiple times in the same graph are assumed to mean the same thing each time.
  • However, the same symbol used in different graphs may or may not denote the same thing. (It's up to the reader to decide for themselves how likely this is when they interpret the graphs within a certain context.)

This sounds pretty woolly, but I've found it to be reasonably workable - it's easy to use descriptive statements to whittle down the set of meanings for a symbol within a certain context.

The remaining hurdle to this scheme occurs when you want to handle two meanings/senses of the same symbol in a single document.

E.g. maybe you want to say

"TradeBroker (the application) has project team TradeBroker (the team)."

This seems to arise most frequently when you are making statements to link information from two unconnected sources that use the same symbol to mean different things. E.g. in the above example the first 'TradeBroker' might come from an application database, and the second from a people/groups LDAP directory.

After much thought (mainly about whether this could be encoded using statement proximity), I've decided that the most workable way I can think of is to add an extra graph-specific field for each value in a statement.

In practice this means adding three new fields to the triples table: in addition to the s, p, o fields for each statement, there's also a graph-specific (g)s, (g)p, (g)o that allows differentiation between different 'uses' of the same symbol within the same graph. This sounds heavyweight, but I'm hoping the new fields won't need indexing, since they'll be used to filter already-small result sets.
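
To make the table change concrete, here's a rough sketch using Python's built-in sqlite3 module. The table layout, the column names (graph_id, g_s, g_p, g_o), the use of small integers to number the 'uses', and the index on s are all my assumptions for illustration, not the actual aggregator schema. The idea is that lookups go through the indexed symbol column as before, and the unindexed g_s field only filters the handful of rows that come back.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory triples table with three extra per-value 'use' fields."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE triples (
            graph_id TEXT NOT NULL,           -- hypothetical graph identifier
            s TEXT NOT NULL,
            p TEXT NOT NULL,
            o TEXT NOT NULL,
            g_s INTEGER NOT NULL DEFAULT 1,   -- which 'use' of the subject symbol
            g_p INTEGER NOT NULL DEFAULT 1,   -- which 'use' of the predicate symbol
            g_o INTEGER NOT NULL DEFAULT 1    -- which 'use' of the object symbol
        )""")
    # Only the symbol column is indexed; the 'use' fields are left unindexed.
    conn.execute("CREATE INDEX triples_s ON triples (s)")
    rows = [
        ("g1", "TradeBroker", "type", "Application", 1, 1, 1),
        ("g1", "TradeBroker", "hasProjectTeam", "TradeBroker", 1, 1, 2),
        ("g1", "TradeBroker", "type", "ProjectTeam", 2, 1, 1),
    ]
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    return conn

def types_of(conn, graph_id, symbol, use):
    """Return the 'type' objects for one particular use of a symbol in a graph."""
    cur = conn.execute(
        "SELECT o FROM triples "
        "WHERE graph_id = ? AND s = ? AND p = 'type' AND g_s = ? "
        "ORDER BY o",
        (graph_id, symbol, use),
    )
    return [row[0] for row in cur]

conn = build_demo_db()
print(types_of(conn, "g1", "TradeBroker", 1))
print(types_of(conn, "g1", "TradeBroker", 2))
```

Defaulting the three new fields to 1 means statements that never need to distinguish senses can be stored exactly as they are today.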

In serialized form, I'd imagine the above example would be expressed:

  TradeBroker type Application
  TradeBroker hasProjectTeam TradeBroker[2]
  TradeBroker[2] type ProjectTeam
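
Reading that notation back in could look something like the Python sketch below. It assumes things I haven't decided yet: one statement per line, exactly three whitespace-separated values, symbols that contain no spaces or square brackets, and an unmarked symbol meaning use number 1. The function names are made up for the example.

```python
import re

# A symbol optionally followed by a bracketed use number, e.g. TradeBroker[2].
_VALUE = re.compile(r"^(?P<symbol>[^\[\]\s]+)(?:\[(?P<use>\d+)\])?$")

def parse_value(token):
    """Split one serialized value into (symbol, use); use defaults to 1."""
    match = _VALUE.match(token)
    if match is None:
        raise ValueError(f"not a valid value: {token!r}")
    return match.group("symbol"), int(match.group("use") or 1)

def parse_statement(line):
    """Parse 's p o' into three (symbol, use) pairs."""
    tokens = line.split()
    if len(tokens) != 3:
        raise ValueError(f"expected 3 values, got {len(tokens)}: {line!r}")
    return tuple(parse_value(token) for token in tokens)

print(parse_statement("TradeBroker hasProjectTeam TradeBroker[2]"))
```

Each (symbol, use) pair would map onto a value field plus its graph-specific companion field in the triples table.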

Hopefully I'll get a chance to implement this and try it out some time later this week.