User talk:GerardM
Welcome
Hello there and welcome, I only just realised that you are a seasoned Wikipedian. Nice that you are showing interest in this ambitious project. And thanks for the good suggestion :-) --Joris Gillis 22:22, 6 February 2006 (CET)
Semantic Wikipedia not localisable
"This means that all the work that is done in one wikipedia needs to be done all over again in another wikipedia..." I didn't think of it before, but you're absolutely right; this is a major concern. SemanticWikipedia should be extended. I propose the following:
Whatever language an article is written in, its relation to other articles is universal and not dependent on any specific language. Why not just put interwiki links inside the relations themselves? The abstract relation "is a male sibling of" would take the shape of "is brother of" on en.wikipedia, "est le frère de" on fr.wikipedia, "is een broer van" on nl.wikipedia, etc. The semantic information could then be harvested from all wikipedias and be combined by the semantic reasoner. As long as a particular relation has a verbal equivalent in a specific language, the triple search of the local wikipedia could list that relation, even if it is not explicitly defined on that wiki. --Joris Gillis 22:15, 6 February 2006 (CET)
- When a filly is a juvenile horse, you can indeed do all kinds of clever things. However, when you have this information in a database that specialises in words and more :) , there are three DefinedMeanings involved: "filly", "is a juvenile" and "horse". This is all it would take. The translations of the terms are all that is needed to make it work when the same data is needed in other languages.
- When an article is disambiguated, the name of the article is not obvious. This results in problems when associating article names with article names in other languages, and it would truly complicate a localisation effort done the semantic wiki way. In order to truly solve the interwiki issue, there is a need for database functionality. I would like to link both encyclopedic and dictionary information. This in turn will enable a richly integrated environment. Again, WiktionaryZ intends to include thesaurus-like structures. GerardM 23:05, 6 February 2006 (CET)
- There is already a very powerful built-in technique for internationalisation in MediaWiki. It looks like:
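A minimal sketch of the idea above, assuming a DefinedMeaning is just a language-independent ID with a table of translations. The class name `DefinedMeaning`, its fields, the English fallback and the `render` helper are hypothetical illustrations, not the WiktionaryZ data model; the Dutch terms are only example data.

```python
from dataclasses import dataclass, field


@dataclass
class DefinedMeaning:
    """Hypothetical stand-in: one concept, many expressions."""
    meaning_id: int                                   # language-independent key
    expressions: dict = field(default_factory=dict)   # language code -> term

    def label(self, lang):
        # Assumption: fall back to English when a translation is missing.
        return self.expressions.get(lang, self.expressions.get("en", ""))


# The three DefinedMeanings from the comment above (illustrative translations).
filly = DefinedMeaning(1, {"en": "filly", "nl": "merrieveulen"})
is_a_juvenile = DefinedMeaning(2, {"en": "is a juvenile", "nl": "is een jong"})
horse = DefinedMeaning(3, {"en": "horse", "nl": "paard"})

# The relation is stored once, as a triple of meanings rather than of words.
triple = (filly, is_a_juvenile, horse)


def render(triple, lang):
    """Express a stored triple in the requested language."""
    return " ".join(dm.label(lang) for dm in triple)


print(render(triple, "en"))
print(render(triple, "nl"))
```

The point of the sketch is that the triple itself never changes; only the translation tables grow when a new language is added.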
[[en:House]] [[de:Haus]] [[fr:Maison]] [[it:Casa]]
- The only thing we need to do now is transform this into (or interpret it as):
[[is a::en:House]] [[is a::de:Haus]] [[is a::fr:Maison]] [[is a::it:Casa]]
- And repeating that for every language. This can be done because the prefixes (en, de, fr, etc.) map to namespaces defined by W3C URLs (en: expands to http://en.wikipedia.com/wiki/) and therefore work as globally unique identifiers.
- Anyway, the disambiguation problem is another task, which could be solved by specialized bots that may need to remove unclear statements. Another, but dirty, solution might be to let the page X, where the meaning is more specific, derive from the disambiguation page Y (in the other language) with the "is a" statement (X, is_a, Y), without stating the reverse (Y, is_a, X).
- MovGP0 23:07, 10 February 2006 (CET)
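The transformation proposed above can be sketched roughly as follows. This is only an illustration, not how MediaWiki or Semantic MediaWiki actually process links: the function names `to_semantic` and `to_uri` are hypothetical, the `en` URL is the one quoted in the comment, and the other prefix entries are assumed to follow the same pattern.

```python
import re

# Prefix -> base URL. Only "en" comes from the discussion; the rest are
# assumptions that follow the same pattern.
PREFIX_URLS = {
    "en": "http://en.wikipedia.com/wiki/",
    "de": "http://de.wikipedia.com/wiki/",
    "fr": "http://fr.wikipedia.com/wiki/",
    "it": "http://it.wikipedia.com/wiki/",
}

# Matches plain interwiki links such as [[en:House]], but not links that
# already carry a relation (e.g. [[is a::en:House]]) or a further namespace.
INTERWIKI = re.compile(r"\[\[([a-z]{2,3}):([^\]|:]+)\]\]")


def to_semantic(wikitext, relation="is a"):
    """Rewrite [[xx:Title]] as [[relation::xx:Title]] for known prefixes."""
    def repl(match):
        prefix, title = match.group(1), match.group(2)
        if prefix not in PREFIX_URLS:
            return match.group(0)  # leave unknown prefixes untouched
        return f"[[{relation}::{prefix}:{title}]]"
    return INTERWIKI.sub(repl, wikitext)


def to_uri(prefix, title):
    """Expand a prefixed title into a globally unique identifier."""
    return PREFIX_URLS[prefix] + title.replace(" ", "_")


print(to_semantic("[[en:House]] [[de:Haus]] [[fr:Maison]] [[it:Casa]]"))
print(to_uri("en", "House"))
```

Links that already contain a relation are skipped, so running the rewrite twice does not nest relations.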
- The interwiki link, e.g. [[en:House]], is an absolutely horrible construct. It is one of my pet hates because it does not scale, it does not properly translate, and it does not properly disambiguate. When I get into a position to change things in MediaWiki, this is what I would change.
- For your information, in February alone I have done 14983 edits on the French wiktionary. A bigger number of edits is needed in Wikipedia due to its size and number of projects.
- Another argument against this construct is that it limits the functionality of this software to the Wikipedia environment. If the development of what you are doing is limited to Wikipedia I think you are betting on the wrong [[nl:paard]] GerardM 09:25, 11 February 2006 (CET)
- Well, for the current implementation this statement might be right, because users tend not to translate completely and/or correctly, so it might be better to use global translation articles where the translation takes place, to reduce the redundancy. Therefore, the metadata for translations and categories should have a separate edit button. This will also force categories to be the same in all languages.
[[en:Category:EnglishCategoryName]] [[de:Kategorie:GermanCategoryName]] [[...]]
- Seen from the point of view of semantic statements, a central store might also reduce the flexibility in expressing semantic dependencies. But I think this might be a better solution than the current approach. I'm also not truly a fan of categories; they should be replaced by semantics.
- MovGP0 11:24, 12 February 2006 (CET)
When a central store is designed for terminological information as well as thesaurus information, I wonder why flexibility of expression would be reduced. The current semantic approach is English-centred, while the understanding of English is not universal anyway. GerardM 12:12, 12 February 2006 (CET)
- Not every word can be translated one-to-one, and neither can phrases. For example, take a word W which has two possible meanings WA and WB. These meanings may have different translations TA and TB:
- W = {WA; WB}
- WA -> TA
- WB -> TB
- The problem comes into play when a user creates a link TA -> W. That means that we may also need a technique to avoid creating direct translations of disambiguation-only pages. A technique which is missing today is proper translation and other semantic markup of redirect pages. A central datastore may solve this.
- Semantic search is not based on English. It also works in most other languages by doing something like:
[[en:is a::en:is a]]
- This line goes in the translated version of the article Relation:is a in each language. Another possibility may be to include this line within the "translation-metastore" which I mentioned before. Because (translation, is a, is same as), the translations will also generate the proper semantic meaning. Even if most users can't read most of the non-English versions of the semantic markup, the users who can will (hopefully) correct it. The name of the article which does the translation has no meaning, so we can also use a unique CLSID. It would also be a requirement to have merging of such translation articles, i.e. when there are two translations T(A,B) and T(C,D), and then a translation T(A,C) comes along, the two translations need to be merged into T(A,B,C,D). MovGP0 23:03, 16 February 2006 (CET)
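The merging requirement above, T(A,B) + T(C,D) + T(A,C) -> T(A,B,C,D), is a connected-components problem. A minimal sketch using a union-find structure follows; the function name `merge_translations` and the representation of a translation as a tuple of article names are assumptions made for illustration only.

```python
def merge_translations(groups):
    """Merge translation groups that share at least one member."""
    parent = {}

    def find(x):
        # Return the representative of x's group, creating it if needed.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for group in groups:
        members = list(group)
        for member in members:
            find(member)                   # register single-member groups too
        for other in members[1:]:
            union(members[0], other)

    merged = {}
    for item in list(parent):
        merged.setdefault(find(item), set()).add(item)
    return sorted(sorted(members) for members in merged.values())


# T(A,B) and T(C,D) stay separate until T(A,C) links them.
print(merge_translations([("A", "B"), ("C", "D")]))
print(merge_translations([("A", "B"), ("C", "D"), ("A", "C")]))
```

Note that such a merge is only safe if none of the linked pages is a disambiguation page; otherwise unrelated meanings would end up in one group.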
WiktionaryZ
This is the name of the project I am working to realise. It intends to be a "translation-metastore" among other things. Actually it intends to be of a lexicological, terminological and thesaurus nature (you could say everything and the kitchen sink). It would allow for the localisation of things like Commons; we intend to write the software to enable this functionality.
With these "DefinedMeanings" linked to Commons and to Wikipedia, and with thesaurus-like structures where all three elements are based on the "eat your own dogfood" principle, I can envision that we could include much if not all of the information that you would include in a semantic web :) GerardM 00:24, 17 February 2006 (CET)
- Indeed, I thought of that while writing the statements above. The Wikidata idea was great, at least for substituting infoboxes with a database-driven model. But Wikidata is an outdated project now. Also, WiktionaryZ is in an early development state at the moment. The WiktionaryZ prototype database does not seem to be editable at the moment, even though it looks impressive.
- see also: WiktionaryZ Blog, WiktionaryZ on Meta
- MovGP0 11:16, 17 February 2006 (CET)
- Wikidata is the name for the technology that will drive WiktionaryZ. The documentation on Meta does indeed not do it justice. The Namespace manager, part of the Wikidata development, is being integrated in the MediaWiki software. Multilingual MediaWiki is another part of this puzzle; it intends to make MediaWiki language aware.
- The tentative date for editability of the alpha software is March 15. A limited number of people will be given access to start off with, because it will not be vandal-proof from the start. Emphasis will be on people who are willing to help in working on the terminology that will be part of the user interface. This will be names of languages and lexicological / terminological terminology. Also of interest will be the phrases that express relations. Here we want to start with the relations that are defined in existing standards. The point is that when you start with these and translate them into other languages, it is likely that the adoption of these standards will be helped a lot and that it will prevent many unnecessary iterations, because a good initial start will be provided. GerardM 11:34, 17 February 2006 (CET)
- I've tried to explain my ideas for interlanguage semantics a bit more graphically (picture upload is disabled on this wiki, so I used my German user page). But I think Multilingual MediaWiki comes very close to this, and in most cases goes far further.
- http://de.wikipedia.org/wiki/Benutzer:MovGP0/Interwiki_Semantics
- MovGP0 00:26, 20 February 2006 (CET)