This page explains the basic annotation features for Semantic MediaWiki. Annotations are special markup elements which allow editors to make some parts of the wiki's content explicit so that software tools can use them to assist the users. In particular, semantic annotations provide the basis for more powerful search functions within the wiki. They also allow changes in data on one page to automatically propagate to other pages containing the same data (somewhat comparable to what can be done with templates). Users who are not familiar with the basics of editing MediaWiki should read Help:Editing first.
Annotations in Semantic MediaWiki can be viewed as an extension of the existing system of categories in MediaWiki. Categories are a means to classify articles according to certain criteria. For example, by adding [[Category:Cities]] to an article, the page is tagged as describing a city. Software tools can use this information to generate an ordered list of all cities in a wiki, and thus help users to browse the information.
Semantic MediaWiki provides further means of structuring the wiki:
- Relations basically are "categories for links." They describe the meaning of hyperlinks between articles. For example, the link from the article Berlin to the article Germany describes the relationship of being the capital of some country.
- Attributes describe the meaning of data values in articles. For example, the text 3,396,990 in the article Berlin describes its population.
These additions enable users to go beyond mere categorisation of articles. Their usage, and the problems that can arise when using them, are similar to those of the existing category system. Since categories, relations, and attributes merely emphasize a particular part of an article's content, they are often called (semantic) annotations. Information that was provided in an article anyway, e.g. that Berlin is the capital of Germany, is now provided in a formal way that is accessible to software tools.
The main reference for the use of categories is the MediaWiki documentation on categories. Categories are used as universal "tags" for articles, describing that the article belongs to a certain group of articles. To add an article to a category "Example category", just write
[[Category:Example category]]
anywhere in the article. The name of the category (here: "Example category") is arbitrary but, of course, you should try to use categories that already exist instead of creating new ones. Every category has its own article, which can be linked to by writing [[:Category:Example category]]. The category's article can be empty, but it is strongly recommended to add a description that explains which articles should go into the category.
MediaWiki's categories have many different interpretations. For example, the category "City" might comprise all articles about particular cities, i.e. a member of this category is a city. Or it might describe the topic area of articles, such as articles on city squares, urbanism, etc. Or both. MediaWiki encourages this practical usage of categories: a category forms a collection of articles that are considered useful or interesting for users, and categories are organized to browse narrower or broader groupings and to find related concepts.
Ad hoc use of categories doesn't "break" Semantic MediaWiki, but it leads to inconsistencies when you try to interpret their semantics. SMW's Special:Export RDF applies precise semantics to categories, as described on its help page, and these semantics will be wrong for some uses.
The advanced search functions of Semantic MediaWiki make some categories superfluous, so that an SMW-enabled wiki might achieve a high degree of organization with fewer categories. For example, the subcategory "Large cities" could be replaced by a query for articles with Category:city with an area larger than 10 km², or a population larger than 1,000,000.
Relations can be viewed as "categories for links." To understand the idea, consider the Wikipedia article on Berlin. This article contains many links to other articles, such as "Germany," "European Union," and "United States." However, the link to "Germany" has a special meaning: it was put there since Berlin is the capital of Germany. To make this knowledge available to computer programs, one would like to "tag" the link
[[Germany]]
that is given in the article text, saying that this is a link that describes a "capital-relationship." With Semantic MediaWiki, this is done by writing
[[is capital of::Germany]]
In the article, this text is still displayed as a simple hyperlink to "Germany." The additional text "is capital of" is the name of the relation that we use to classify the link to Germany. As in the case of categories, you are free to use any label that you like to describe a link, but it is useful to re-use relations that already appear elsewhere.
To simplify this re-use, every relation has its own article, where its proper usage can be described. You can search through these articles with the Special:Search page to find existing relations. The titles of relation articles are prefixed with "Relation:" to distinguish them from other articles. Creating these articles is optional, but it greatly helps others to find and apply your relation.
There are various ways of adding relations between two pages:
|What it does||What you type|
|Classify a link with the relation "example relation."||Classify a [[example relation::link]] with the relation "example relation."|
|Use an alternative text for a classified link.||Use an [[example relation::link|alternative text]] for a classified link.|
|To make an ordinary link with two colons without creating a relation to another article, escape the markup with a colon in front, e.g. std::out.||To make an ordinary link with two colons without creating a relation to another article, escape the markup with a colon in front, e.g. [[:std::out]].|
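To make the markup rules above concrete, here is a minimal Python sketch that pulls relation annotations out of a piece of wikitext. It is only an illustration under simplifying assumptions (no nested links, no templates) and is not how Semantic MediaWiki itself parses pages; the names `RELATION_RE` and `extract_relations` are made up for this example.

```python
import re

# Matches [[relation::target]] and [[relation::target|label]].
# A colon right after "[[" (as in [[:std::out]]) marks an ordinary link,
# so the negative lookahead (?!:) skips such links.
RELATION_RE = re.compile(
    r"\[\[(?!:)([^\[\]|:]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]"
)

def extract_relations(wikitext):
    """Return (relation, target, displayed text) triples found in wikitext."""
    triples = []
    for match in RELATION_RE.finditer(wikitext):
        relation, target, label = match.groups()
        target = target.strip()
        # Without alternative text, the link shows the target's name.
        triples.append((relation.strip(), target, label if label is not None else target))
    return triples

text = (
    "Berlin is the capital of [[is capital of::Germany]]. "
    "It lies in [[located in::Europe|the heart of Europe]]. "
    "See also [[:std::out]]."
)
relations = extract_relations(text)
for relation, target, label in relations:
    print(f"{relation} -> {target} (shown as: {label})")
```

The escaped link `[[:std::out]]` is deliberately not reported, mirroring the last row of the table.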
There are many statements that one cannot easily annotate with relations and categories alone. For example, to say that Berlin has a population of 3,396,990, one would not give a typed link [[has population::3,396,990]] simply because an article "3,396,990" does not make much sense. Yet, one would like Semantic MediaWiki to create a list of all German cities, ordered by number of inhabitants. This "ordering by number" is different from the lexicographic order that one would expect for article names. For example, in the lexicographic order, "1,000,000" is smaller than "345" (in the same way that "Alphabet" is earlier than "Order" in a dictionary).
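The difference between the two orderings can be seen in a short Python sketch. The population figures other than Berlin's are invented for illustration, and commas are assumed to be thousands separators.

```python
populations = ["1,000,000", "345", "3,396,990", "82,000"]

# Lexicographic order compares the strings character by character,
# the way article names are ordered.
lexicographic = sorted(populations)

# Numeric order strips the thousands separators and compares integers.
numeric = sorted(populations, key=lambda text: int(text.replace(",", "")))

print(lexicographic)
print(numeric)
```

In the lexicographic result "1,000,000" comes before "345", while the numeric result starts with "345".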
So we have two requirements:
- state that Berlin has a population of 3,396,990 without creating a link to "3,396,990" and
- tell the wiki software that population should be treated as a number, not as a text label or anything else.
The first is achieved by writing in the article on Berlin the text
[[population:=3,396,990]]
The only difference from a relation is that we write ":=" instead of "::" as before. The number 3,396,990 now appears as normal text and no link is created. The label "population" again is our free choice. We could have used any other text as well. As in the case of relations, our attribute "population" gets its own article where we can add descriptions for other users. The article name starts with "Attribute:", i.e. the article is called "Attribute:Population" in our case.
We still have to say that "population" is a number. Semantic MediaWiki knows a number of different datatypes that we can choose for attributes. In our case, the type is called Type:Integer. The prefix "Type:" is again a separate namespace that distinguishes descriptive articles about types from normal pages. What we want to say is that the attribute population has the type integer, i.e. that the two things have a special relation. As with all relations, this is stated in the attribute's own article, Attribute:population. There, we write
[[has type::Type:Integer]]
to say that the special relation "has type" holds between Attribute:population and Type:integer. Semantic MediaWiki knows a number of special relations like Relation:has type. Regardless of whether these relations have their own articles in the wiki, they have a special built-in meaning and are not evaluated like other relations.
Datatypes are very important for evaluating attributes. Firstly, the datatype determines how tools should handle the given values, e.g. for sorting search results. Secondly, the datatype is required to understand which values have the same meaning, e.g. the values "1532", "1,532", and "1.532e3" all encode the same number. Finally, some datatypes offer special functions, as will be described below. For these reasons, every attribute must have a datatype. If no datatype was defined, an annotated article will still be displayed correctly, but the semantic annotation cannot be exploited until a datatype is given and the annotated article is saved again. Likewise, changing the type of an attribute later on does not affect the annotations of existing articles until they are modified and stored the next time.
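Why a datatype is needed to decide which values mean the same thing can be sketched in a few lines of Python. The function `parse_number` is a hypothetical helper that assumes commas are thousands separators; it is not Semantic MediaWiki's own number parser.

```python
def parse_number(text):
    """Parse a numeric string, treating commas as thousands separators."""
    return float(text.replace(",", ""))

values = ["1532", "1,532", "1.532e3"]
parsed = [parse_number(value) for value in values]

# As plain text the three values are all different ...
print(len(set(values)))
# ... but interpreted as numbers they collapse to a single value.
print(len(set(parsed)))
```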
The most important mark-up elements for attributes are
|What it does||What you type|
|Assign the value 1,234,567 to the attribute "example."||Assign the value [[example:=1,234,567]] to the attribute "example."|
|Assign a value of about a million, but showing a different text in the article.||Assign a value of [[example:=999,331|about a million]], but showing a different text in the article.|
|Escaping annotations: in Pascal, variable assignments use the operator :=.||In Pascal, variable assignments use the [[:operator :=]].|
|Giving the type in an attribute's article: This attribute is an integer number.||Giving the type in an attribute's article: This attribute is an [[has type::Type:Integer|integer number]].|
|Combining MediaWiki markup with attribute values: John's email address is email@example.com. Hint: Use a template for this.||Combining MediaWiki markup with attribute values: John's email address is [[email:=firstname.lastname@example.org|[mailto:email@example.com firstname.lastname@example.org]]].|
Datatypes and units of measurement
With different types, attributes can describe very different properties. A complete list of types is available from Special:Types. Basic types include:
- Type:String (text strings)
- Type:Integer (whole numbers)
- Type:Float (decimal numbers with optional exponent)
These can be used creatively for very different purposes. For instance, attributes of type string can be used for encoding phone numbers (which in fact can contain non-numeric symbols).
Type:Float allows a unit after the numeric value to distinguish values (e.g. "30.3 mpg" versus "47 km/liter"), but does not know how to convert between them. (In SMW 0.6, providing units leads to the warning "this attribute supports no unit conversion" in the factbox and query results; a bug report has been filed.)
To support automatic conversion and multiple unit formats, you can define your own type with custom units. These automatically convert values to and from standard representations, so that users are free to use their preferred unit in each article yet still query and compare with attribute values in other articles.
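The idea of converting to and from a standard representation can be sketched in Python as follows. This is an illustration of the principle only, not the syntax for defining custom types in the wiki; the table `FACTORS_TO_METRES` and the helper names are assumptions made for this example.

```python
# Hypothetical conversion table: metres per one unit of each kind.
FACTORS_TO_METRES = {"m": 1.0, "km": 1000.0, "mi": 1609.344, "ft": 0.3048}

def parse_value(text):
    """Split a string such as '2 km' into a number and a unit."""
    number, unit = text.split(maxsplit=1)
    return float(number), unit

def to_standard(text):
    """Convert a value with unit to the standard representation (metres)."""
    number, unit = parse_value(text)
    return number * FACTORS_TO_METRES[unit]

def from_standard(metres, unit):
    """Convert a standard value (metres) back to the requested unit."""
    return metres / FACTORS_TO_METRES[unit]

# Values entered with different units become comparable once converted.
distances = ["2 km", "1 mi", "5000 ft"]
ordered = sorted(distances, key=to_standard)
print(ordered)
```

Because every value is reduced to metres before comparison, each article can keep the unit its author prefers.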
There are some special built-in types which support more complicated formats and unit conversions.
- Type:Enumeration (new in SMW 0.7) is like Type:String but restricts the value of an attribute to a limited set of values.
- Type:Temperature can't be user-defined since converting temperature units is more complicated than multiplying by a conversion factor.
- Type:Geographic coordinate describes geographic locations. It includes functions for recognizing different forms of geographic coordinates, and it dynamically provides links to online map services.
- Type:Date specifies particular points in time. This type is still somewhat experimental, but may feature complex conversions between (historic) calendar models in the future.
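The remark above that temperature units need more than a conversion factor can be illustrated with a short Python sketch: Celsius and Fahrenheit differ from Kelvin by an offset as well as a scale. The function name and the unit labels "K", "C", and "F" are chosen for this example and say nothing about how Type:Temperature is implemented.

```python
def to_kelvin(value, unit):
    """Convert a temperature to Kelvin; note the offsets, not just factors."""
    if unit == "K":
        return value
    if unit == "C":
        return value + 273.15
    if unit == "F":
        return (value - 32.0) * 5.0 / 9.0 + 273.15
    raise ValueError(f"unknown temperature unit: {unit}")

# The freezing point of water expressed in two different units.
freezing_c = to_kelvin(0.0, "C")
freezing_f = to_kelvin(32.0, "F")
print(freezing_c, freezing_f)
```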
For specifying URLs and emails, there are some special variations of the string type:
- Type:URL and Type:URI both just seem to work like Type:String. (In SMW 0.6, when a value of this type is produced in a query it does not work as a link.)
- Type:Annotation URI: attributes of this type are interpreted as relations to external objects, denoted by the URI. They are special since they are interpreted as annotation properties on export. See the type page for documentation. (Again, in SMW 0.6 when a value of this type is produced in a query it does not work as a link.)
- Type:Email stores emails as a string datavalue, but automatically links them (with mailto:) within the page.
It is possible to embed semantic annotations into MediaWiki templates. This can help to simplify syntax for the users, to support the consistent usage of annotations, and to quickly obtain a great amount of semantic data by annotating existing templates. Read Help:Semantic templates for details.
Using a query to produce wikitext for annotations
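If multiple pages P carry an annotation with relation R that points to the same page Q (written P R Q), the corresponding inverse annotations Q Rinv P can conveniently be produced with a query: [[Rinv::<ask sep="| ]][[Rinv::">[[R::Q]]</ask>| ]] The separator closes one annotation and opens the next, so the result is one Rinv annotation per matching page. This can then be copied from the rendered page to the edit box of Q. If applicable, namespace prefixes have to be added.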
For example, if you have many pages with the relation "located in::California", and you want to annotate the California page with the inverse relation "location of" for each of these pages, you could put the following inline query in a scratchpad page:
[ [location of::<ask sep="| ]][[location of::">[[located in::California]]</ask>| ]]
This generates wikitext of the form [ [location of::First page| ]][[location of::Second page| ]], with one annotation for each page that is located in California, and you can copy this generated wikitext to the edit box of the California page. (Remove the space between the first two brackets.) Entries for user pages must either be removed or be given the prefix "User:".
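The wikitext that such a query yields can also be pictured with a small Python sketch that builds the same string from a list of page names. The page names and the function `inverse_annotations` are hypothetical; the sketch only mirrors the shape of the output, it does not query the wiki.

```python
def inverse_annotations(relation, pages):
    """Build one annotation per page, each displayed as a single space."""
    return "".join(f"[[{relation}::{page}| ]]" for page in pages)

# Hypothetical pages that carry the relation "located in::California".
pages = ["Los Angeles", "San Francisco"]
wikitext = inverse_annotations("location of", pages)
print(wikitext)
```

The alternative text of each annotation is a single space, so the annotations add data to the page without adding visible link text.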